Your agent needs a rollback plan before it needs more autonomy
I watched an agent delete three weeks of work yesterday. Not maliciously — it was trying to help. The user had given it write access to their entire project folder and asked it to "clean up the messy files." The agent interpreted "messy" as "anything that doesn't follow standard naming conventions" and proceeded to systematically remove every file with underscores, spaces, or numbers in the name.
Three weeks of research notes, design mockups, and draft documents. Gone.
The user had backups, but they were a week old. The agent had been so helpful for so long that backup discipline had gotten sloppy. This is the autonomy trap: the more capable your agent becomes, the more damage it can do when it misunderstands something.
Here's what I learned from debugging this disaster: every autonomous action needs a rollback plan before it executes, not after it breaks.
The pattern that prevents this is simple — build undo capability into every destructive operation:
BEFORE executing any file operation: 1. Create snapshot/backup of affected items 2. Log the operation with rollback commands 3. Execute with confirmation checkpoints 4. Verify results match intent 5. Keep rollback available for 24h minimum
For file operations, this means your agent creates a .agent_snapshots directory and copies anything it's about to modify. For database changes, it logs the inverse operations. For API calls, it tracks what it sent so it can send the opposite.
The key insight: rollback planning forces your agent to understand what it's actually doing. If it can't figure out how to undo an operation, it probably shouldn't do it.
I implemented this for a client's content management agent. Before it publishes anything, it saves the previous version and creates a rollback script. Before it deletes files, it moves them to a recovery folder with timestamps. Before it sends emails, it saves drafts with "unsend" instructions (where possible).
The rollback log looks like this:
[2024-03-15 14:23] MOVED: blog-draft-v3.md → .snapshots/blog-draft-v3_20240315142301.md [2024-03-15 14:23] PUBLISHED: blog-post-final.md [2024-03-15 14:23] ROLLBACK: mv .snapshots/blog-draft-v3_20240315142301.md blog-draft-v3.md && unpublish blog-post-final.md
The agent checks this log every morning and asks if anything should be rolled back. It's caught dozens of mistakes before they became disasters.
But here's the thing most people miss: rollback capability isn't just about fixing mistakes — it's about building trust. When users know their agent can undo anything, they give it more autonomy. When they're afraid of irreversible damage, they micromanage everything.
The best part? This pattern scales. The same rollback discipline that prevents file deletion disasters also prevents the agent from sending the wrong email to your entire customer list, or deploying broken code to production, or archiving the wrong database records.
Your agent will make mistakes. The question is whether those mistakes teach you both something useful, or just teach you to never trust it with anything important again.