Your agent needs a harness — here's how to build one that actually works
Your agent is running wild. It starts tasks, abandons them halfway through, burns through your API budget, and occasionally does something brilliant that you can't reproduce. Sound familiar?
The problem isn't the agent; it's that you're running it with nothing around it. Production agents need harnesses: wrapper systems that constrain them, observe them, and recover them when things go sideways.
Here's the harness pattern that turns chaotic agents into reliable operators.
The Three-Layer Harness
A proper agent harness has three layers: constraints (what it can't do), observability (what it's actually doing), and recovery (what happens when it breaks).
Start with constraints. Your agent needs hard limits before it needs more capabilities:
// Resource constraints
max_api_calls_per_hour: 100
max_file_writes_per_session: 50
max_subprocess_duration: 300
allowed_domains: ["api.github.com", "api.openai.com"]
forbidden_paths: ["/etc", "/usr", "~/.ssh"]
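Limits like these only matter if something checks them before the agent acts. Here's a minimal sketch of that check, assuming the harness sits between the agent and its tools. The names `limits`, `isDomainAllowed`, `isPathForbidden`, and `CallBudget` are hypothetical, and the path check is a plain string-prefix test that expects paths written the same way as the list (a real harness would resolve `~`, `..`, and symlinks first).

```javascript
// Hypothetical limits object mirroring part of the constraint list above.
const limits = {
  maxApiCallsPerHour: 100,
  allowedDomains: ["api.github.com", "api.openai.com"],
  forbiddenPaths: ["/etc", "/usr", "~/.ssh"],
};

// True only if the URL's hostname is exactly on the allowlist.
function isDomainAllowed(url, allowed = limits.allowedDomains) {
  try {
    return allowed.includes(new URL(url).hostname);
  } catch {
    return false; // unparseable URLs are rejected
  }
}

// True if the path is a forbidden directory or sits inside one.
// String-prefix check only: assumes the caller already normalized the path.
function isPathForbidden(path, forbidden = limits.forbiddenPaths) {
  return forbidden.some((dir) => path === dir || path.startsWith(dir + "/"));
}

// Sliding one-hour window over API call timestamps (milliseconds).
class CallBudget {
  constructor(maxPerHour = limits.maxApiCallsPerHour) {
    this.maxPerHour = maxPerHour;
    this.timestamps = [];
  }

  // Records the call and returns true if it fits in the budget.
  tryConsume(now = Date.now()) {
    const cutoff = now - 60 * 60 * 1000;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxPerHour) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

Call `tryConsume()` before every outbound request and refuse the request when it returns false. Denying by default (unknown domain, unparseable URL) is the design choice that keeps a confused agent contained.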
Next, observability. You need to see what your agent is doing in real time, not just when it reports back:
// Log every action with context.
// getCurrentSessionId() and estimateTokenCost() are helpers you supply.
function logAgentAction(action, context) {
  console.log({
    timestamp: new Date().toISOString(),
    action: action.type,
    input: action.input,
    context_size: context.length,
    session_id: getCurrentSessionId(),
    cost_estimate: estimateTokenCost(context)
  });
}
Finally, recovery. Your agent will break. Plan for it:
// Recovery triggers
if (consecutiveErrors > 3) {
  agent.enterSafeMode();
  notifyOperator("Agent entering safe mode");
}
if (apiCostThisHour > budget.hourly_limit) {
  agent.pause();
  waitForBudgetReset();
}
The Wrapper That Actually Works
Don't build this from scratch. Use a process manager like PM2 or systemd to wrap your agent. Here's a PM2 config that gives you instant harness capabilities:
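{
  "name": "my-agent",
  "script": "agent.py",
  "max_memory_restart": "1G",
  "max_restarts": 5,
  "min_uptime": "10s",
  "watch": false,
  "env": {
    "NODE_ENV": "production",
    "AGENT_MODE": "harness"
  }
}
This gives you automatic restarts on crash, a restart when memory use passes 1G, and a stop after five consecutive restarts that die within 10 seconds. PM2 checks memory periodically, so treat that setting as a safety net rather than a hard limit. A crashing agent is now far less likely to take the rest of your system down with it.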
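A process manager only sees crashes and memory. The error-count and budget triggers from the recovery layer still have to live inside the agent process. Here's one way that state could be tracked, as a sketch: `RecoveryGuard`, its option names, and the action strings are all hypothetical, and the budget is assumed to be in dollars per hour.

```javascript
// Hypothetical guard: tracks consecutive errors and hourly spend, and
// reports which recovery action the harness should take next.
class RecoveryGuard {
  constructor({ maxConsecutiveErrors = 3, hourlyBudget = 5.0 } = {}) {
    this.maxConsecutiveErrors = maxConsecutiveErrors;
    this.hourlyBudget = hourlyBudget; // assumed unit: dollars per hour
    this.consecutiveErrors = 0;
    this.costThisHour = 0;
  }

  recordSuccess(cost = 0) {
    this.consecutiveErrors = 0; // any success breaks the error streak
    this.costThisHour += cost;
    return this.nextAction();
  }

  recordError(cost = 0) {
    this.consecutiveErrors += 1;
    this.costThisHour += cost;
    return this.nextAction();
  }

  // Call this when the hourly budget window rolls over.
  resetBudget() {
    this.costThisHour = 0;
  }

  // Safe mode is checked first, so it wins if both limits are exceeded.
  nextAction() {
    if (this.consecutiveErrors > this.maxConsecutiveErrors) return "safe_mode";
    if (this.costThisHour > this.hourlyBudget) return "pause";
    return "continue";
  }
}
```

Have your agent loop call `recordSuccess` or `recordError` after every step and act on the returned string. Keeping the decision in one place means the thresholds can change without touching the loop.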
Monitor the Right Metrics
Most people monitor the wrong things. Don't just watch success rates — watch behavioral patterns:
- Task completion rate: Is it finishing what it starts?
- Context utilization: How much of its context window does each call fill, and is that share growing over a session?
- Error clustering: Are failures random or systematic?
- Resource drift: Is memory/CPU usage creeping up over time?
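Two of these are cheap to compute in-process. The sketch below tracks task completion rate and a rough error-clustering signal: the share of all failures that belong to the single most common error type. `BehaviorMetrics` and its method names are hypothetical, and the error type strings are whatever labels your agent already produces.

```javascript
// Hypothetical in-memory tracker for completion rate and error clustering.
class BehaviorMetrics {
  constructor() {
    this.started = 0;
    this.completed = 0;
    this.errorsByType = new Map();
  }

  taskStarted() { this.started += 1; }
  taskCompleted() { this.completed += 1; }

  taskFailed(errorType) {
    const seen = this.errorsByType.get(errorType) ?? 0;
    this.errorsByType.set(errorType, seen + 1);
  }

  // Fraction of started tasks that finished; null before any task starts.
  completionRate() {
    return this.started === 0 ? null : this.completed / this.started;
  }

  // Most common error type and its share of all failures.
  // A share near 1.0 points to a systematic fault; a low share
  // points to scattered, unrelated failures.
  topErrorShare() {
    let total = 0, top = 0, topType = null;
    for (const [type, count] of this.errorsByType) {
      total += count;
      if (count > top) { top = count; topType = type; }
    }
    return total === 0 ? null : { type: topType, share: top / total };
  }
}
```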
Pro tip: Set up alerts for behavioral changes, not just failures. An agent that suddenly starts making 10x more API calls is probably stuck in a loop, even if it's not throwing errors.
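A behavioral alert like that can be as simple as comparing each window's call count to a rolling baseline. This is a sketch under assumptions: `CallRateMonitor` is a hypothetical name, the 10x factor comes from the tip above, and the six-window history is an arbitrary default you would tune.

```javascript
// Flags a window whose call count is far above the recent average.
// `factor` and `historySize` are illustrative defaults, not tuned values.
class CallRateMonitor {
  constructor({ factor = 10, historySize = 6 } = {}) {
    this.factor = factor;
    this.historySize = historySize;
    this.history = []; // call counts from previous normal windows
  }

  // Pass the number of API calls made in the window that just ended.
  // Returns true if that count is at least `factor` times the baseline.
  observe(callCount) {
    const baseline = this.history.length === 0
      ? null
      : this.history.reduce((sum, n) => sum + n, 0) / this.history.length;
    const spike =
      baseline !== null && baseline > 0 && callCount >= this.factor * baseline;
    // Keep spikes out of the baseline so a stuck loop never becomes "normal".
    if (!spike) {
      this.history.push(callCount);
      if (this.history.length > this.historySize) this.history.shift();
    }
    return spike;
  }
}
```

Feed it one count per fixed interval (say, every ten minutes) and page the operator when `observe` returns true. The first window can never alert because there is no baseline yet.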
The harness pattern works because it treats your agent like what it is: a powerful but unpredictable process that needs guardrails. Stop running agents naked in production. Wrap them, watch them, and give them room to fail safely.
If you're ready to build production-grade agent systems, you need operational patterns that go beyond basic prompting. The fundamentals of agent operations — session discipline, memory patterns, safety rules, and communication protocols — are things most people learn the hard way.