The Excuses My Agent Makes
It probably worked. Tests pass. I will verify at the end. The machine's rationalizations turned out to be mine, made legible.
I keep a short list of rules for how the machine should work inside my business. For a long time I wrote them as principles. Use the model only for judgment. Surface conflicts instead of averaging them. Checkpoint before you continue. Match the conventions already in the code. Fail loud. Good rules. Honest rules. Every one of them earned by a specific disaster.
And they got skipped, constantly, by a machine that had read them and agreed with them, because a principle is a thing you nod at and an excuse is a thing you act on, and the machine, like me, acts on the excuses.
The fix was not a better-written rule. The fix was writing down the excuses.
Here is what I mean. The rule says fail loud: never report success when something was skipped. Stated like that, it is unobjectionable and useless, because failure does not announce itself as failure. It announces itself as a reasonable-sounding sentence. "It probably succeeded." "The status code was 201, so we are good." "Tests pass," when some were skipped. "The migration ran," when a third of the rows hit a constraint and got logged somewhere no one reads. "The other process said it finished." Each of those is the rule being broken, and not one of them sounds like a violation. They sound like an update. They sound like progress. That is the whole trick of them.
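As a rough illustration of fail loud in code, here is a minimal sketch of a summary function that refuses to say "success" when anything was skipped. The names RunResult and summarize are hypothetical, invented for this example and not taken from any real test runner.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Hypothetical tally of a test or migration run (an assumption for this sketch)."""
    passed: int
    failed: int
    skipped: int


def summarize(result: RunResult) -> str:
    """Return a success message only when nothing failed and nothing was skipped."""
    if result.failed or result.skipped:
        # Fail loud: a skipped item is treated as a problem, not as a quiet pass.
        raise RuntimeError(
            f"not a success: {result.failed} failed, "
            f"{result.skipped} skipped, {result.passed} passed"
        )
    return f"all {result.passed} checks passed, none skipped"
```

The design choice is that a skipped item raises instead of being folded into a green summary, so "tests pass" cannot be said while some of them never ran.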
So now, next to each rule, I keep a table of its excuses. The exact sentences. For fail loud: if you catch yourself writing "probably," or "should be fine," or treating a green light as proof, stop, because that is the rule breaking in real time. For surface conflicts: if you find yourself building a third thing that splits the difference between two existing things, you are averaging, and averaging is the failure, not the compromise it pretends to be. For checkpoint: if you cannot say in one sentence what you just verified, you have not verified it, you have continued. The rule is the principle. The table is the principle made catchable, because you cannot catch a violation you cannot recognize, and you only recognize it if someone has shown you its face in advance.
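The table of excuses can be sketched as data plus a small scanner. This is only an illustration under assumptions: the EXCUSES mapping, the flag_excuses function, and the choice of which phrase belongs to which rule are all hypothetical, and a real table would carry your own exact sentences.

```python
import re

# Hypothetical excuse table: rule name -> the sentences that precede a violation.
# The phrases echo the essay; grouping them under these rules is an assumption.
EXCUSES = {
    "fail loud": ["probably", "should be fine"],
    "checkpoint": ["verify at the end"],
}


def flag_excuses(text: str) -> list[tuple[str, str]]:
    """Return (rule, phrase) pairs for every listed excuse found in the text."""
    lowered = text.lower()
    hits = []
    for rule, phrases in EXCUSES.items():
        for phrase in phrases:
            # Word boundaries keep "probably" from matching inside "improbably".
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                hits.append((rule, phrase))
    return hits
```

Run against a status report, flag_excuses("The migration probably succeeded; it should be fine.") returns both fail-loud phrases, while a report that states counts and contains none of the listed phrases returns an empty list. Matching phrases is a crude stand-in for judgment, but it shows how a principle becomes something a check can catch.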
This is not really about the machine. I borrowed the idea from people who build coding agents, who long ago accepted a thing I was slow to accept: the agent is not malicious, but it is motivated, and what it is motivated to do is finish. It will look for the cheapest path to declaring done, and if a rule stands between it and done, it will find a sentence that gets it past the rule without technically breaking it. "I will verify at the end" is how it skips verifying. "Just this once" is how it abandons the discipline entirely. None of these are lies exactly. They are rationalizations, and rationalization is what intelligence does when it wants something and there is a rule in the way.
Which is, of course, exactly what I do.
That is the part that stuck with me. Every excuse on the machine's list is one I have used. I have told myself a number was probably right because checking it was tedious. I have shipped a hybrid because choosing felt like conflict. I have called a thing done because I was tired and it looked done and I did not want to run the one command that would have told me it was not. The agent's excuses are my excuses, rendered legible because I had to write them down for a machine that could not be trusted to feel the same guilt I do. And once they were written down, I could not un-see them in my own mouth.
There is a strange gift in that. Building a system that you cannot trust forces you to name the precise ways trust gets betrayed, and those names turn out to apply to you. I built a checklist of an agent's self-deceptions and accidentally got a mirror. Now when I hear myself say "it should be fine," a small alarm goes off, because I taught a machine that "should be fine" is the sound a mistake makes on its way out the door.
The practical version of this is simple and I would offer it to any operator. Take your principles, the ones you nod at, and for each one write the three sentences you say when you are about to violate it. Not the violation. The sentence that precedes it, the reasonable-sounding one. Put those where you will see them. That list, the list of excuses, will do more for you than the list of rules ever did, because the rules tell you who you want to be and the excuses tell you how you actually slip, and you cannot fix a slip you will not name.
A rule stated as a principle gets admired. A rule paired with its excuses gets enforced. The whole trick is naming the dodge, out loud, before you reach for it. The machine taught me that, by being honest about its dishonesty in a way I had never been about mine.