The website belonged to a South African accounting firm — a tool we'd built for them earlier in the year to manage their client records. In May, one of their staff hit a small bug. A dropdown for choosing the record type was misbehaving, and she couldn't save. An easy fix. We made the change, a teammate reviewed it, all the automatic checks passed, and we merged it in.
Then she still couldn't save.
Five minutes of the old version
Here is the part that's worth slowing down on. The live website kept handing out the old, broken version of the page. Not for a second or two while things settled — for over five minutes, with a person actively stuck on the other end, clicking a button that did nothing.
And everything we'd normally trust was telling us the fix was done. The change had been approved. The build had succeeded. Every status light was green.
The one thing that was actually wrong was the one thing none of those lights were watching.
Think of it like a print shop. You email the corrected poster, you get back a cheerful "received, looks great," and you assume the new poster is now hanging in the shop window. But "we received your file" and "the new poster is in the window" are two completely different claims. Nobody had actually swapped the poster. The window still showed the old one.
The thing nobody had connected
The cause was dull, which is the worst kind. When this website was first set up, one connection was never made — the link that tells the hosting service "whenever the team approves a change, automatically put the new version live." That link simply wasn't there.
Without it, approving a change triggers nothing. The new version gets built and set aside, and someone has to carry it to the live website by hand, every single time. Nobody had been told that, so nobody did it.
A green checkmark tells you the new version was built. It does not tell you the live website changed. Those are two different things, and only one of them is the one you actually care about.
Putting the fix live took about ten seconds, once we did it by hand. The page updated, and she saved her record. Ten seconds of cure on top of five minutes of someone stuck staring at a button that wouldn't work.
Why this trap stays hidden
It hides because the broken path and the working path look identical until the second change.
The first time, someone puts the website live by hand to get it up and running. It works. The site is live. From then on it looks live, behaves live, and genuinely is live — right up until the next change, when "we approved it" and "it's actually out there" turn out to be two separate steps and only the first one happened.
The sneaky part is that the green checkmark isn't lying. The change really was built. It's not a faulty result — it's a false sense of being finished. The checkmark answers "did the new version build?" and we read it as "is the fix live?" Those two questions only come apart when a step is quietly missing, and a missing step doesn't throw an error you'd trip over. It just sits there.
What we changed
Two things, and the order matters.
- Check the window, not the receipt. After any change meant to update the live website, we now look at the live site itself and confirm it's serving the new version — not the dashboards that claim it should be. The site itself is the only witness that counts.
- Make the connection once. We wired up the missing link so that approving a change now puts it live automatically. Doing it by hand is fine as a backup and a lifesaver in an emergency, but leaving it as the only way means every future change carries the same "approve it and forget" trap.
The honest opinion underneath all this: doing a step by hand is perfectly acceptable infrastructure right up until the moment a person is waiting on the other end of it. The cost here wasn't the ten seconds of fixing it. It was five minutes of someone unable to do her job, trusting a fix we'd told her was live. Check the live site, not the green tick — the tick is answering a question you didn't ask.
Under the hood
The site was hosted on Cloudflare Pages. The bug was a misbehaving entity_type / record-type dropdown that blocked a save; the fix was a one-line change that went through pull request review, passed CI, and merged to main.
The diagnosis came from the live asset hash. The production site was still serving the old hashed JavaScript bundle filename — the same one from before the merge. If a deploy had run, that hash would have changed. It hadn't, so no deploy had happened.
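That hash comparison can be sketched in a few lines of Python. This is an illustration, not the script we ran: the filename pattern (a name, a dash or dot, a hash of eight or more characters, then .js) is an assumption about the bundler's output, and bundle_names and deploy_landed are hypothetical helper names.

```python
import re

# Assumed naming scheme: "index-4f9a1c2b.js" (name, dash or dot, 8+ hash chars, .js).
# Adjust the pattern to whatever your bundler actually emits.
BUNDLE_PATTERN = re.compile(r'src="([^"]*?[-.][0-9A-Za-z_]{8,}\.js)"')


def bundle_names(html: str) -> set[str]:
    """Return the hashed script filenames referenced by an HTML page."""
    return {src.rsplit("/", 1)[-1] for src in BUNDLE_PATTERN.findall(html)}


def deploy_landed(live_html: str, built_html: str) -> bool:
    """True when the live page references exactly the bundles of the fresh build."""
    built = bundle_names(built_html)
    # An empty set means the pattern found nothing, which proves nothing either way.
    return bool(built) and bundle_names(live_html) == built
```

Feed it the HTML served by the live site and the index.html from the fresh build output. If deploy_landed returns False after a merge, the old bundle is still the one in the window.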
Root cause: the Cloudflare Pages project had been created without a Git source connected. We confirmed it against the Cloudflare API — the project's source field was null. With no repository wired to the project, a merge to main triggers no build hook and no deploy; the only thing that ever put a bundle live was a human running the deploy by hand.
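A check along those lines might look like the sketch below. It assumes the Cloudflare v4 endpoint for a single Pages project and the usual response envelope with the project under "result"; the account ID, project name, and API token are placeholders you would supply, and fetch_project and has_git_source are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def fetch_project(account_id: str, project_name: str, api_token: str) -> dict:
    """Fetch one Pages project as parsed JSON (assumed endpoint shape)."""
    url = f"{API_BASE}/accounts/{account_id}/pages/projects/{project_name}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_token}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def has_git_source(project_response: dict) -> bool:
    """True when the project has a repository attached (source is present and not null)."""
    result = project_response.get("result") or {}
    return bool(result.get("source"))
```

A project in the state described above would make has_git_source return False, which is the cue that merges will not deploy anything on their own.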
The fix was running wrangler pages deploy on the built output, which took under ten seconds, flipped the live asset hash, and updated the site. The durable follow-up was a one-time connection of the Pages project to the GitHub repository so future merges to main auto-deploy.
The two standing rules that came out of it:
- Verify the artefact, not the process. After a deploy, poll the live asset hash and compare it to the bundle you just built. If the live site serves the new hash, the deploy landed; if it still serves the old hash, it didn't, regardless of any upstream green check.
- Close the gap at the source. Connect the Pages project to Git so merges trigger deploys. Keep the manual wrangler path as a documented fallback, not the sole deploy mechanism.
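The first rule lends itself to a small polling loop. The sketch below is one way to write it, under assumptions: the site's HTML references the bundle by filename, the attempt count and delay are arbitrary defaults, and fetch_html and wait_for_bundle are hypothetical names. The fetch function is passed in so the loop can be exercised without touching the network.

```python
import time
import urllib.request
from typing import Callable


def fetch_html(url: str) -> str:
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def wait_for_bundle(
    site_url: str,
    expected_bundle: str,
    fetch: Callable[[str], str] = fetch_html,
    attempts: int = 12,
    delay_seconds: float = 5.0,
) -> bool:
    """Poll the live site until it references expected_bundle or attempts run out."""
    for attempt in range(attempts):
        try:
            if expected_bundle in fetch(site_url):
                return True  # the live site is serving the new build
        except OSError:
            pass  # network hiccup: treat it as "not live yet" and retry
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False  # still the old bundle, whatever the dashboards say
```

Called with the production URL and the filename of the bundle from the fresh build, a False result is the signal to go and look at the deploy itself. If a CDN or browser cache sits in front of the HTML, the answer can lag behind reality, so a retry window is safer than a single check.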