exhaustnote.engineer

INTELLIGENCE

The computational layer. The agent that learns. The system that decides.

Every decision happens inside a light cone. Everything else is noise.

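[Figure: spacetime diagram with axes x and ct. An event at the origin, its past light cone and future light cone, and the space-like regions on either side. Annotation: GPS: +38μs/day]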
Minkowski spacetime · 1+1 dimensions

REASONING INSIDE THE CONE

Special relativity tells you that nothing outside your past light cone can have caused the present moment. An intelligent system operates under the same constraint: it can only act on information that has arrived. Everything else is unmeasured state.
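The constraint can be written as a one-line test. The sketch below is illustrative only: the function name `in_past_light_cone` and the Sun example are assumptions added here, not part of the original page. An event that happened `dt` seconds ago at distance `dx` metres can have influenced the present only if light had time to cover that distance.

```python
C = 299_792_458.0  # speed of light in m/s (exact by definition)

def in_past_light_cone(dt, dx, c=C):
    """Return True if an event dt seconds in the past and dx metres away
    lies inside (or on) the past light cone of here-and-now.

    Hypothetical helper for illustration: 1+1 dimensions, flat spacetime.
    """
    return dt >= 0 and c * dt >= abs(dx)

# Example: the Sun is about 1.496e11 m away, roughly 499 light-seconds.
SUN_DISTANCE_M = 1.496e11

# Something that happened on the Sun 400 s ago cannot have reached us yet.
too_recent = in_past_light_cone(400.0, SUN_DISTANCE_M)

# Something that happened there 600 s ago can have.
old_enough = in_past_light_cone(600.0, SUN_DISTANCE_M)

print(too_recent, old_enough)
```

An agent filtering its inputs this way is only restating the obvious, that it cannot use data it has not received, but the same test is what separates measured state from unmeasured state.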

Without relativistic correction, the clocks on GPS satellites run fast by about 38 microseconds per day relative to clocks on the ground. That is not philosophy. That is a navigation error that compounds until the system fails. Intelligence has to know where it is in time.
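The 38 microsecond figure can be checked with a back-of-the-envelope calculation. The sketch below makes simplifying assumptions that are not in the original text: a circular orbit of radius 26,560 km, a spherical Earth of radius 6,371 km, and a ground clock treated as stationary. Under those assumptions, velocity time dilation slows the satellite clock by about 7 μs/day, weaker gravity speeds it up by about 45 μs/day, and the net is close to +38 μs/day.

```python
import math

C = 299_792_458.0        # speed of light, m/s
GM = 3.986004418e14      # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6        # mean Earth radius, m (assumed spherical)
R_GPS = 2.656e7          # GPS orbital radius, m (assumed circular orbit)
SECONDS_PER_DAY = 86_400.0

def gps_clock_offsets_us_per_day():
    """Return (velocity_term, gravity_term, net) in microseconds per day.

    Positive means the satellite clock runs fast relative to the ground.
    First-order approximation only.
    """
    v_squared = GM / R_GPS  # circular orbital speed squared
    # Special relativity: a moving clock runs slow by v^2 / (2 c^2).
    velocity = -v_squared / (2 * C**2)
    # General relativity: a clock higher in the potential well runs fast.
    gravity = (GM / C**2) * (1.0 / R_EARTH - 1.0 / R_GPS)
    to_us_per_day = SECONDS_PER_DAY * 1e6
    return (velocity * to_us_per_day,
            gravity * to_us_per_day,
            (velocity + gravity) * to_us_per_day)

velocity_us, gravity_us, net_us = gps_clock_offsets_us_per_day()
# A timing error converts to a ranging error at the speed of light.
range_error_km_per_day = net_us * 1e-6 * C / 1000.0

print(f"velocity: {velocity_us:+.1f} us/day")
print(f"gravity:  {gravity_us:+.1f} us/day")
print(f"net:      {net_us:+.1f} us/day")
print(f"ranging error: about {range_error_km_per_day:.0f} km/day")
```

The last line is why the error compounds: tens of microseconds of clock offset correspond to kilometres of range.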

Reinforcement learning, control theory, and orbital mechanics are all versions of the same problem: an agent, bounded by what it can observe, deciding what to do next inside a universe that does not pause to explain itself.

The Stack: From Model to Metal
[Figure: three layers joined by a feedback path. MODEL: Simulink (plant model), Python (control logic), RL Agent (policy network), Dynamics (state equations). MIDDLEWARE: ROS2 (Topics · Services · Actions · DDS). HARDWARE: Sensors (IMU · DVL · Sonar), Compute (onboard CPU), Actuators (thrusters · fins).]
Full stack: model layer to hardware · gold = physical world boundary

THE AGENT THAT WASN'T PROGRAMMED

Classical control tells the system what to do. Reinforcement learning lets the system figure it out. Given a state, take an action, observe a reward, update the policy. Repeat until the behaviour emerges.

The policy was never written. It was discovered through interaction.
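The state, action, reward, update loop can be shown in a few lines. The sketch below is a toy, not a description of any real system: the two-state environment, the reward rule, and the names `softmax` and `train` are all assumptions made for illustration. It uses a REINFORCE-style policy-gradient update on a softmax policy. Episodes are one step long, so the discount factor plays no role here.

```python
import math
import random

def softmax(logits):
    """Convert a list of logits into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=2000, lr=0.1, seed=0):
    """Learn a policy for a hypothetical two-state, two-action task.

    The environment pays reward 1 when the action index equals the
    state index and 0 otherwise. That rule is never given to the agent.
    """
    rng = random.Random(seed)
    theta = [[0.0, 0.0], [0.0, 0.0]]  # theta[state][action] logits
    for _ in range(steps):
        s = rng.randrange(2)                      # observe a state
        probs = softmax(theta[s])
        a = 0 if rng.random() < probs[0] else 1   # sample an action
        r = 1.0 if a == s else 0.0                # observe a reward
        # Policy-gradient step: d/dtheta log pi(a|s) = onehot(a) - probs
        for k in range(2):
            grad_log = (1.0 if k == a else 0.0) - probs[k]
            theta[s][k] += lr * r * grad_log
    return theta

theta = train()
policy = [softmax(row) for row in theta]
for s, probs in enumerate(policy):
    print(f"state {s}: P(a=0)={probs[0]:.3f}  P(a=1)={probs[1]:.3f}")
```

Nothing in `train` says which action is correct. The preference for the matching action appears only because rewarded actions have their probability pushed up.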

[Figure: agent-environment loop. The environment sends state sₜ and reward rₜ to the agent; the agent's policy π(sₜ) returns action aₜ to the environment. Policy update: ∇θ E[Σ γᵗ rₜ]]
State · Action · Reward · Update
Verification: From Simulation to Reality
[Figure: spectrum from pure simulation to real hardware, ending at reality. SIL (Software in the Loop): Simulink · Python · ROS2 sim; model validation, parameter tuning. HIL (Hardware in the Loop): real hardware, simulated plant.]
SIL verifies logic · HIL verifies the system · gold marks the physical boundary
ROS2
The nervous system connecting every layer.
Topics. Services. Actions. DDS transport. The middleware that lets a Python controller talk to a thruster without knowing what a thruster is.
Reinforcement Learning
The policy was never written. It was discovered.
An agent, a state space, a reward signal. The behaviour that emerges was never explicitly programmed: it was found through interaction with the environment.
HIL & SIL
Where simulation ends and physics begins.
Software in the loop validates the logic. Hardware in the loop validates the system. The gap between them is where assumptions are tested.
Control & Dynamics
Simulink. Python. State equations. The maths underneath.
Feedback control, system identification, nonlinear dynamics. The foundation that RL and ROS2 sit on top of.
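As a minimal sketch of what "state equations plus feedback" means, the example below simulates a hypothetical one-dimensional vehicle with linear drag, driven by a single thrust input and steered to a target position by a proportional-derivative controller. The mass, drag coefficient, gains, and time step are illustrative assumptions, not values from any real platform.

```python
def simulate(ref=1.0, duration=20.0, dt=0.01,
             mass=1.0, drag=0.5, kp=4.0, kd=3.0):
    """Integrate the closed loop with forward Euler and return final (x, v).

    State equations (assumed for illustration):
        dx/dt = v
        dv/dt = (u - drag * v) / mass
    Control law (PD feedback on position error and velocity):
        u = kp * (ref - x) - kd * v
    """
    x, v = 0.0, 0.0
    for _ in range(int(duration / dt)):
        u = kp * (ref - x) - kd * v      # controller reads the state
        ax = (u - drag * v) / mass       # plant responds to the thrust
        x += v * dt                      # Euler step for position
        v += ax * dt                     # Euler step for velocity
    return x, v

final_x, final_v = simulate()
print(f"final position: {final_x:.4f}  final velocity: {final_v:.4f}")
```

Running the controller against a simulated plant like this is the software-in-the-loop case in miniature: the control law is real, the physics is a model, and the result is only as trustworthy as the assumed mass and drag.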