There is a tendency in AI development to treat constraints as problems. A constraint is something that slows you down, something to be engineered around, something that exists because the previous generation of tooling wasn't capable enough. Once the models get better, the constraints go away.
This framing is wrong, and it's wrong in a way that produces fragile systems and, eventually, expensive incidents.
Constraints in well-designed systems aren't failures of capability. They're features of architecture. The append-only ledger cannot be modified — not because we haven't built the edit functionality yet, but because mutability in an audit trail is a bug, not a feature. The proposer cannot approve their own changes — not because we haven't implemented self-approval yet, but because the separation is the point.
The organisations that build durable AI systems are the ones that have learned to reason about constraints as design inputs rather than temporary obstacles.
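As a minimal sketch of the two examples above, the following shows a ledger whose interface offers append and read but no edit or delete, and an approval step that rejects self-approval. All names here (`Entry`, `AppendOnlyLedger`, `approve`, `SelfApprovalError`) are hypothetical and chosen for illustration. Python cannot enforce true immutability, so a real ledger would rely on storage-level guarantees; the point is only that the constraint lives in the shape of the interface.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Entry:
    """One immutable record in the audit trail (hypothetical shape)."""
    actor: str
    action: str


class AppendOnlyLedger:
    """Offers append and read only; there is deliberately no edit or delete."""

    def __init__(self) -> None:
        self._entries: list[Entry] = []

    def append(self, entry: Entry) -> int:
        """Add an entry and return its position in the ledger."""
        self._entries.append(entry)
        return len(self._entries) - 1

    def __iter__(self) -> Iterator[Entry]:
        # Iterate over a snapshot so callers never hold the internal list.
        return iter(tuple(self._entries))

    def __len__(self) -> int:
        return len(self._entries)


class SelfApprovalError(Exception):
    """Raised when the proposer and the approver are the same actor."""


def approve(ledger: AppendOnlyLedger, proposer: str, approver: str, change: str) -> None:
    """Record a proposal and its approval, refusing self-approval."""
    if proposer == approver:
        raise SelfApprovalError(f"{proposer} cannot approve their own change")
    ledger.append(Entry(actor=proposer, action=f"proposed:{change}"))
    ledger.append(Entry(actor=approver, action=f"approved:{change}"))
```

Calling `approve(ledger, "alice", "bob", "deploy")` records two entries, while `approve(ledger, "alice", "alice", "deploy")` raises before anything is written.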
The reversibility gradient
Not all constraints are equal. The most useful mental model I've found is a reversibility gradient: actions exist on a spectrum from easily reversible (close a modal) to irreversible (send a mass email, delete a database, publish a client-facing document).
The appropriate constraint level is a function of position on this gradient. Easily reversible actions can proceed without friction — no approval, no pause, no confirmation. Irreversible or high-consequence actions warrant a pause, a confirmation, a second set of eyes.
The failure mode in most systems is applying the same level of constraint, or none at all, to every action regardless of reversibility. This produces one of two pathologies: either everything requires approval, which makes the system useless, or nothing does, which makes it reckless.
The correct architecture is graduated: the system is designed to know where on the reversibility gradient each action falls, and to apply appropriate friction accordingly. This isn't hard to implement. It is surprisingly hard to get teams to design for — because it requires thinking carefully about failure modes before any failure has occurred.
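One way to sketch a graduated design is a lookup from each action to its position on the gradient, and from that position to a level of friction. The three tiers, the action names, and the policy table below are assumptions made for illustration; the text above only fixes the two ends of the spectrum. Unclassified actions are treated as irreversible, on the assumption that an unknown action should get the most friction rather than the least.

```python
from enum import IntEnum


class Reversibility(IntEnum):
    TRIVIAL = 0        # e.g. close a modal
    RECOVERABLE = 1    # hypothetical middle tier: undoable with some effort
    IRREVERSIBLE = 2   # e.g. send a mass email, delete a database


class Friction(IntEnum):
    NONE = 0
    CONFIRM = 1
    SECOND_APPROVER = 2


# Hypothetical classification of actions; a real system would maintain its own.
ACTION_REVERSIBILITY = {
    "close_modal": Reversibility.TRIVIAL,
    "edit_draft": Reversibility.RECOVERABLE,
    "send_mass_email": Reversibility.IRREVERSIBLE,
    "delete_database": Reversibility.IRREVERSIBLE,
    "publish_client_document": Reversibility.IRREVERSIBLE,
}

# Hypothetical policy mapping gradient position to required friction.
POLICY = {
    Reversibility.TRIVIAL: Friction.NONE,
    Reversibility.RECOVERABLE: Friction.CONFIRM,
    Reversibility.IRREVERSIBLE: Friction.SECOND_APPROVER,
}


def required_friction(action: str) -> Friction:
    """Return the friction an action needs; unknown actions get the strictest."""
    level = ACTION_REVERSIBILITY.get(action, Reversibility.IRREVERSIBLE)
    return POLICY[level]
```

The classification table is where the design work happens: filling it in forces the team to decide, before any incident, which actions can be undone.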
What cannot evolve
There is a related design principle that applies specifically to self-modifying systems — systems that can change their own behaviour, update their own code, or expand their own capabilities.
For these systems, the constraint question becomes: what is the kernel? What is the component that cannot be modified by the system itself, that holds the invariants the rest of the system operates within?
If the answer is "nothing — the system can modify anything", then you don't have a kernel, you have a trust problem. The system's behaviour at time T+1 depends entirely on whether the modifications made at time T were correct. There's no stable ground.
The kernel doesn't need to be large. In most systems, it's small — the governance logic, the ledger, the authentication surface. But it needs to exist, and it needs to be genuinely immutable from the perspective of the rest of the system.
This is constitutional thinking applied to software architecture. A constitution exists precisely because some things should be hard to change, even when the current authorities would prefer to change them.
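A rough sketch of a kernel boundary is a check that rejects any self-proposed change touching protected paths. The directory names (`governance`, `ledger`, `auth`) and the function `check_self_modification` are hypothetical. For the kernel to be genuinely immutable, a check like this would have to run somewhere the system itself cannot write to, for example behind file permissions or in a separate process; the code only illustrates the rule being enforced.

```python
from pathlib import PurePosixPath

# Hypothetical kernel directories, given relative to the repository root.
KERNEL_PATHS = (
    PurePosixPath("governance"),
    PurePosixPath("ledger"),
    PurePosixPath("auth"),
)


class KernelViolation(Exception):
    """Raised when a self-modification would touch the kernel."""


def check_self_modification(changed_files: list[str]) -> None:
    """Reject a proposed change set if any file falls inside the kernel."""
    for name in changed_files:
        path = PurePosixPath(name)
        # Refuse paths that could escape the simple prefix check below.
        if path.is_absolute() or ".." in path.parts:
            raise KernelViolation(f"unsafe path rejected: {name}")
        for root in KERNEL_PATHS:
            if path == root or root in path.parents:
                raise KernelViolation(f"{name} is inside kernel path {root}")
```

A change set such as `["plugins/summarise.py"]` passes silently, while `["governance/policy.py"]` raises `KernelViolation`.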
Constraints as trust infrastructure
The reason all of this matters practically is that constraints are the substrate of trust.
An organisation that wants to give an AI system broad authority needs to be able to trust that authority won't be misused — not through hope, but through architecture. The audit trail means that every action can be reviewed. The proposer/approver separation means that no single entity can unilaterally commit to an irreversible action. The leased capability model means that authority expires and has to be explicitly renewed.
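The leased capability idea can be sketched as a grant that carries its own expiry and has to be reissued to stay valid. The names `Lease`, `grant`, and `renew`, and the choice to pass the current time in explicitly, are assumptions for illustration rather than a description of any particular system.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    """A time-limited grant of one capability to one holder (hypothetical)."""
    holder: str
    capability: str
    expires_at: float  # seconds on the same clock the caller passes as `now`

    def is_valid(self, now: float) -> bool:
        """A lease is valid strictly before its expiry time."""
        return now < self.expires_at


def grant(holder: str, capability: str, ttl_seconds: float, now: float) -> Lease:
    """Issue a new lease that expires ttl_seconds after `now`."""
    return Lease(holder, capability, now + ttl_seconds)


def renew(lease: Lease, ttl_seconds: float, now: float) -> Lease:
    """Renewal is explicit: it returns a new lease and leaves the old one as is."""
    return Lease(lease.holder, lease.capability, now + ttl_seconds)


if __name__ == "__main__":
    # Example usage with a monotonic clock, which cannot jump backwards.
    lease = grant("agent-7", "send_email", ttl_seconds=300, now=time.monotonic())
    print(lease.is_valid(time.monotonic()))
```

Because `Lease` is frozen, authority cannot be extended by editing an existing lease; the only path is a fresh call to `renew`, which is itself an action that can be logged and reviewed.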
These constraints do restrict individual actions, but their larger effect is to define the conditions under which it's safe to expand what the system can do. An agent with a solid governance layer can be given far more authority than one without, precisely because the governance layer makes overreach detectable and correctable.
The organisations that will deploy the most capable AI systems aren't the ones with the most permissive architectures. They're the ones with the most thoughtful constraints.
That distinction is worth building for.
