A robot that has learned to do something dangerous is more dangerous than one that never learned at all, because you trusted it.
That sentence is the whole reason this post exists. Most conversations about "AI safety in robotics" stay at the level of philosophy — alignment, intent, the long-term questions. Those matter. But there is a much more immediate, much more boring problem sitting in front of anyone actually putting an autonomous robot into the world right now, and it has nothing to do with philosophy. It's this: an autonomous robot makes decisions, and those decisions become physical motion, and physical motion has consequences you cannot take back. Between the decision and the motion, something has to be allowed to say no.
I've spent the last while building exactly that "something" — a governance layer — and wiring it into two of the standard open-source robotics frameworks rather than any single manufacturer's proprietary SDK, precisely because the problem is universal and the solution should be too. I'm not going to show you how the governance core works; that's ours, and a blog post is the wrong place for it anyway. What I can do is show you where it plugs in, and why governing a robot turns out to be the same problem in two completely different places. If you work with ROS 2 or with HuggingFace's LeRobot, the integration points will be immediately recognizable, and that recognizability is the point: the gate goes exactly where you'd expect it to, if you were being honest about where danger enters.
Problem one: the command, in real time (ROS 2)
If you've touched ROS 2, you know /cmd_vel. It's the topic that carries velocity commands — a Twist message with linear and angular components — from whatever is doing the thinking to whatever is doing the moving. A planner, a teleop node, or a learned policy publishes to /cmd_vel, and the robot's base controller turns that into wheels and motors actually moving.
Here is the thing nobody emphasizes enough: /cmd_vel is a firehose pointed at the physical world, and in a default setup, nothing stands between the thing producing commands and the wheels. Whatever publishes, moves. If the policy has a bad moment — and learned policies have bad moments — the bad command goes straight to the motors. The architecture trusts the producer completely.
The governance approach is almost embarrassingly simple to describe, which is part of why it works: don't let the planner publish to /cmd_vel directly. Insert a relay. The planner publishes its intended command to a different topic — call it /cmd_vel_proposed — and a governor node is the only thing that publishes to the real /cmd_vel. The governor subscribes to the proposals, evaluates each one, and republishes to the live topic only the commands that pass. A command that violates policy — exceeds a linear or angular velocity limit, arrives while an emergency stop is latched — does not get forwarded. It gets replaced with a zero Twist, and the decision gets logged.
The shape of that node is exactly what a ROS 2 developer would sketch on a whiteboard:
pythonclass CommandRelayGate(Node):
def __init__(self):
super().__init__('thinkneo_governor')
self.sub = self.create_subscription(
Twist, '/cmd_vel_proposed', self.on_proposed, 10)
self.pub = self.create_publisher(Twist, '/cmd_vel', 10)
self.governor = ActionGovernor(...) # the part I'm not detailing
def on_proposed(self, cmd: Twist):
decision = self.governor.evaluate(cmd) # PASS / BLOCK + reason
if decision.allowed:
self.pub.publish(cmd)
else:
self.pub.publish(Twist()) # zero — full stop
# decision is recorded out-of-band, with its reason
That's the whole topology. Subscribe to proposals, evaluate, forward or zero-out. What I'm deliberately leaving as ActionGovernor(...) — how it evaluates, what the policy model is, how limits and e-stop state are represented and checked — is the part that's ours. But notice what's not hidden, because it can't be and shouldn't be: the relay pattern itself is just good engineering. The governor sits at the one chokepoint where intent has been expressed but motion has not yet happened. That's the only place a real-time guarantee can exist. Govern upstream of the planner and you're trusting the planner. Govern downstream of the motors and you're cleaning up an accident.
One design decision worth calling out, because it's a tell for whether someone has actually run this: the evaluation core is kept completely free of ROS itself. It has no rclpy dependency. That sounds like a small thing and it is not. It means the governance logic can be unit-tested without a running robot, without a simulator, without ROS installed at all — you feed it commands and assert on decisions. We have a mock test suite that does exactly this. If your safety-critical logic can only be tested by standing up the whole robot stack, you will not test it enough, and "not enough" is not an acceptable amount of testing for the code that decides whether a machine moves.
Problem two: the command, before it ever exists (LeRobot)
Now the same problem, somewhere completely different.
HuggingFace's LeRobot is one of the cleanest open frameworks for training robot policies — imitation learning, reinforcement learning, the whole loop of teaching a robot to do a task from data. And here's the uncomfortable insight that connects it to the /cmd_vel gate above: every dangerous real-time command was, at some point, a checkpoint that got promoted. The bad moment a policy has in production was trained into it. The governor at /cmd_vel catches the symptom. But the disease entered the system earlier, when a model that learned the wrong thing was allowed to graduate from "a checkpoint" to "the policy we deploy."
So governance can't only live at runtime. It has to live in the training pipeline too, at the moment a checkpoint is about to be trusted.
LeRobot, like most modern training frameworks, gives you a callback interface — hooks that fire during training and evaluation. That's the integration point, and again it's exactly where you'd put it if you were thinking clearly. A governance callback does two things during a training run. First, it streams the train and eval metrics out to a trace — a durable record of how this model actually behaved as it learned, not a number someone wrote down once and lost. Second, and this is the one that matters, it applies a promotion gate: when a checkpoint is a candidate to be promoted, the gate evaluates it and returns PASS or BLOCK. A checkpoint that doesn't clear the bar does not get promoted, regardless of how good its headline metric looks.
The shape, again, is the framework's own callback surface, nothing exotic:
pythonclass GovernancePromotionCallback:
def on_eval_end(self, metrics: dict):
trace.record(metrics) # streamed to the trace
verdict = promotion_gate.check(metrics) # PASS / BLOCK
if verdict.blocked:
raise PromotionBlocked(verdict.reason)
What promotion_gate.check actually weighs — that's the part I'm not handing over. But the principle is the same one as the ROS 2 gate, rotated ninety degrees. There is a moment of transition — proposed command becomes real command; candidate checkpoint becomes deployed policy — and governance is the thing that sits at that transition and can refuse it. I've watched this gate do its job on a live run: a real trace, a checkpoint that looked acceptable on its primary metric, and the gate correctly blocking the promotion. That "correctly" is doing a lot of work, and it's the part that took the real engineering — but the architecture of catching it at promotion time is the transferable idea.
Why these are the same problem
Step back and the two integrations rhyme exactly.
In ROS 2, the dangerous thing is a command that's about to reach the motors, and the gate sits between /cmd_vel_proposed and /cmd_vel. In LeRobot, the dangerous thing is a policy that's about to be deployed, and the gate sits between "candidate checkpoint" and "promoted." Different timescale — milliseconds versus a training run — but identical structure: a transition from intent to consequence, with an enforceable checkpoint placed precisely at the transition.
That structural sameness is not a coincidence, and noticing it is the entire thesis of how I think about governing autonomous systems. It doesn't matter whether the agent is a velocity command, a model checkpoint, a tool call in a software agent, or a humanoid deciding to take a step. Wherever a decision is about to become an effect you can't undo, that's where the gate goes. Everything else — the policies, the limits, the scoring — is implementation. The architecture is just: find the transition, put an enforceable, logged checkpoint on it, and make sure that checkpoint can say no.
The two frameworks share a common governance core underneath, with a control-plane client that also works offline — because a robot that loses its network connection should not lose its ability to be governed; if anything it needs it more. Both integrations ship with mock test suites that run without the framework even installed, for the reason I gave above: safety logic you can't easily test is safety logic you won't test.
I'm not publishing the core, and I won't pretend the hard part is the plumbing I've shown you — the hard part is what's inside the boxes I left as .... But I wanted to show that it's real, running, and wired into the actual tools the robotics community uses, rather than gesturing at a future where someone, someday, governs their robots. The integration points are the honest part. The gate goes where danger enters. In ROS 2 that's one topic upstream of the motors. In LeRobot that's one callback before a checkpoint is trusted. Same problem, twice.
Thinkneo AI builds an AI governance layer for autonomous systems, with integrations for ROS 2 and HuggingFace LeRobot among others. We write about what it actually takes to make AI safe in production — physical or otherwise.