From the build log.
Picture a busy front desk. A note slides across it. There is no name on the note, so the receptionist has to guess who left it. If the rule is "when in doubt, assume it came from the boss," then anyone can walk up, drop an unsigned note, and have it carried out with the boss's full authority. Nobody broke in. The desk was just confused about who was asking.
That is the bug at the heart of this post. It has an old name in our trade — the confused deputy — and it bites hardest the moment you have more than one person sending notes.
Why one assistant hides the problem
When you run a single AI assistant, this almost never trips you up. There is only one person it could be working for, so when a message arrives without a clear sender, the obvious guess is right. The gap in the system's reasoning gets quietly filled by the only sender it could have been.
Now put several assistants on the same shared inbox, with messages arriving from a handful of different places that all look roughly alike. Suddenly every unsigned message is a real question the system has to answer. And the convenient answer, "it must be the owner," can no longer be trusted: with several senders in play, it is only a guess.
The default that does the damage
The trap hides inside something that looks perfectly sensible. A message comes in. The system checks it for a sender. If a sender is named, it uses that. If no sender is named, it falls back to a default — and the natural default is the owner, because the owner is the one who set everything up.
Say that out loud and the flaw is plain. The "nobody told me who sent this" case lands straight on the most powerful identity in the system. An anonymous message does not get turned away. It gets promoted. Anything that can reach the inbox without naming a sender now speaks with the owner's voice.
This is not some elaborate attack. It is a single line of "if we don't know, assume it's the boss" written on a quiet afternoon. It works perfectly right up until the day a second sender exists.
The fix is not clever. It is a refusal. The sender's identity has to ride on the message itself, stamped on the way in, by the part of the system that actually knows who sent it. If a message cannot show who sent it, the honest answer is "unknown" — and "unknown" is a real state you can handle, not a polite word for "the boss." Not knowing should never quietly round up to the person with the most power.
An anonymous message should be turned away, not promoted. "We don't know who sent this" must never quietly become "so it must be the owner."
The quieter mistake: trusting your own notes
There is a second, sneakier version of the same problem, and it is the one I would press on if you take only one thing from this.
We are all taught to be careful with what outsiders type — the form field, the box on the web page, the thing a user submits. We scrub that. Where teams get caught is the information the system writes about itself: the sender's name, a record of what an assistant just did, the list of who got mentioned. That all feels internal. It feels safe. So it gets dropped into the message untouched.
But "internal" is a story you tell yourself, not a fact about the text. A name is just a piece of text, and it can carry anything that was in the message it came from. If your system stitches those pieces of text together into a structured message without treating each one as possibly hostile, you have handed an attacker a way to forge the structure itself: to slam one section shut early, prise open a new one, and rewrite what the message means from inside a field you assumed was harmless.
The rule that falls out of this is simple. A piece of text is suspect from the moment it is written down, not from the moment it is shown on a screen. Clean it on the way in. Do not wait for the display to make it safe, because by then it has already passed through everything that trusted it.
Why this is worth writing down
You can coast on the convenient defaults for a long time and feel fine, because in a one-owner, one-user world they are genuinely correct. The bill arrives later, when the system grows the second sender it was always going to grow.
The discipline is to build the front door as if that second sender already exists. Make the sender's identity something the message carries and the system checks. And treat every piece of text the system writes down as something a stranger might have authored.
Two true rules beat a long checklist. Stamp identity on the message, never on the surroundings. Treat the text you write with the same suspicion as the text you read. Everything else is detail.
Under the hood
Two concrete findings sat behind this, on a shared agent message bus where several agents post to the same surface.
The confused-deputy hole was a one-line fallback when resolving the sender of an inbound message: actor = header || 'brynn'. If the identity header was present it was used; if it was absent, the code fell back to 'brynn', the owner. That meant any message that could reach the bus without a header, which is to say any unauthenticated message, resolved to the most privileged identity in the system. The fix is to resolve absence to an explicit unknown state and reject or quarantine it, never to fall through to a default principal.
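A minimal sketch of that fix, in TypeScript and under stated assumptions: the Actor and Decision types and the resolveActor and admit functions are hypothetical names invented for illustration, not the bus's actual code. The only point it tries to show is that a missing header resolves to an explicit unknown state the caller has to handle, rather than to 'brynn'.

```typescript
// Hypothetical sketch: types and function names are illustrative, not the real bus code.

// A sender is either a named principal or explicitly unknown. There is no default.
type Actor =
  | { kind: "named"; id: string }
  | { kind: "unknown" };

// Resolve the identity header. Absence maps to "unknown", never to the owner.
function resolveActor(header: string | null | undefined): Actor {
  const id = (header ?? "").trim();
  return id === "" ? { kind: "unknown" } : { kind: "named", id };
}

type Decision =
  | { action: "deliver"; actor: string }
  | { action: "quarantine"; reason: string };

// The caller must handle the unknown case; in this sketch it is quarantined.
function admit(header: string | null | undefined): Decision {
  const actor = resolveActor(header);
  if (actor.kind === "unknown") {
    return { action: "quarantine", reason: "no sender identity on message" };
  }
  return { action: "deliver", actor: actor.id };
}
```

Under these assumptions, admit(undefined) and admit("") both come back as a quarantine decision, while admit("brynn") is delivered as 'brynn'. Note that this only closes the fallback. A header that is present is still only a claim, so whatever stamps it has to be the part of the system that actually knows who sent the message.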
The second finding was a structured-injection hole. Messages were serialised into an XML-shaped envelope, and two system-written fields — mention_targets (who was @-mentioned) and trace (a record of what an agent did) — were written into that envelope without XML-escaping. Because the values can echo content from the originating message, a crafted value such as </mention_targets><evil/> could close the field early and open attacker-controlled structure. The point: these are fields the system generates about itself, so they slipped past the "sanitise user input at the boundary" instinct. The fix is to escape on serialisation — treat a string field as adversarial from the moment it is written, not from the moment it is rendered — so the display layer is never the thing standing between you and forged structure.
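Escaping on serialisation might look something like the following TypeScript sketch. The field names mention_targets and trace come from the finding above; the Envelope shape, the message and body elements, and the escapeXml and serialise helpers are assumptions made for illustration. It replaces the five characters XML treats as markup with their predefined entities, and applies that to every string field regardless of who wrote it.

```typescript
// Hypothetical sketch: envelope shape and helper names are assumed, not the real serialiser.

// Replace the five XML-significant characters with their predefined entities.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;") // must run first so the entities added below are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

interface Envelope {
  body: string;
  mentionTargets: string; // system-written, but may echo message content
  trace: string;          // system-written, but may echo message content
}

// Every string field is escaped as it is written into the envelope.
function serialise(msg: Envelope): string {
  return [
    "<message>",
    `<mention_targets>${escapeXml(msg.mentionTargets)}</mention_targets>`,
    `<trace>${escapeXml(msg.trace)}</trace>`,
    `<body>${escapeXml(msg.body)}</body>`,
    "</message>",
  ].join("");
}
```

With this in place, the crafted value </mention_targets><evil/> is written out as &lt;/mention_targets&gt;&lt;evil/&gt;, so it stays inside the field as inert text and cannot close it early. In a real system a maintained XML library would be a safer choice than hand-rolled string building; the sketch only shows where the escaping has to happen.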
Both are the same lesson at two layers: identity and trust must be asserted by the producer of a value, not inferred by whatever consumes it later.