MCP servers are everywhere now — here's how to vet them before they break your agent
The Model Context Protocol exploded. There are 10,000+ MCP servers now, and everyone's telling you to plug them into your agents. "Turn your $20 Claude into 100x power!" they say.
Here's what they don't tell you: 90% of these servers are demos that will break your agent in production.
I learned this the hard way last week. Plugged a popular "database query" MCP server into my content agent. Looked great in testing — clean responses, fast queries. Then I deployed it.
Within 2 hours: 47 timeout errors, 12 malformed responses, and one query that somehow tried to DROP a table. My agent spent more tokens retrying failed calls than actually working.
The MCP problem: GitHub stars don't predict runtime reliability. A server can have perfect docs and 500+ stars but still fail 30% of calls in production.
Here's the vetting process I use now before connecting any MCP server:
1. Check the error handling
Most MCP servers assume perfect conditions. Look for:
- Timeout configuration (should be under 10 seconds)
- Rate limit handling
- Graceful degradation when APIs are down
- Retry logic with exponential backoff
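If a server doesn't ship retry logic, you can wrap its calls yourself. A minimal sketch in Python, where `call` is a hypothetical zero-argument function standing in for whatever your MCP client SDK exposes:

```python
import random
import time

def call_with_backoff(call, max_retries=4, base_delay=0.5):
    """Retry a flaky MCP call with exponential backoff plus jitter.

    `call` is any zero-argument function that raises on failure --
    a stand-in for your actual client method.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the real error
            # 0.5s, 1s, 2s, ... plus jitter so retries don't stampede
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)
```

The jitter matters: without it, fifty agents retrying in lockstep hammer a recovering server at the exact same moments.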
2. Test the failure modes
Don't just test happy paths. I run these scenarios:
```
# Kill the network mid-request
sudo pfctl -f /dev/stdin <<< "block drop all"

# Simulate rate limits
curl -X POST -H "X-RateLimit-Remaining: 0" ...

# Send malformed inputs
echo '{"invalid": json}' | your-mcp-server
```

If the server crashes or hangs on any of these, don't use it.
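The hang/crash check is easy to automate. A rough Python harness, assuming the server reads requests on stdin (the command list is a placeholder; substitute your server's real launch command):

```python
import subprocess

def survives_bad_input(cmd, payload, timeout=5.0):
    """Pipe a malformed payload to a server process and check that it
    neither hangs past `timeout` nor dies from a crash-style signal.

    `cmd` is the server's launch command, e.g. ["your-mcp-server"]
    (hypothetical -- use the real binary). Exiting with a nonzero
    status is fine; that's a graceful rejection, not a crash.
    """
    try:
        proc = subprocess.run(
            cmd, input=payload, capture_output=True,
            text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False  # hung on bad input
    return proc.returncode >= 0  # negative = killed by signal
```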
3. Monitor token consumption
Bad MCP servers are token vampires. They return verbose error messages, redundant data, or trigger retry loops.
I track tokens per successful operation. If an MCP server uses more than 200 tokens per successful call, it's probably wasteful.
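One way to track this: log each call's response text and whether it succeeded, then divide total tokens by successful calls. A sketch using the rough four-characters-per-token heuristic (swap in your model's real tokenizer for accurate numbers); note that failed calls still count toward the total, which is exactly the point:

```python
def tokens_per_success(calls, budget=200):
    """Rough cost-per-useful-call metric for an MCP server.

    `calls` is a list of (succeeded, text) pairs. Token counts are
    approximated as len(text) // 4 -- a common rule of thumb, not a
    real tokenizer. Retries and verbose error messages inflate the
    per-success cost even though they produce nothing useful.
    """
    total = sum(len(text) // 4 for _, text in calls)
    successes = sum(1 for ok, _ in calls if ok)
    if successes == 0:
        return float("inf"), False
    avg = total / successes
    return avg, avg <= budget
```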
4. Check the permissions
This is the big one. Some MCP servers ask for:
- Database write access (when they only need read)
- File system access outside their scope
- Network access to arbitrary domains
- API keys with admin permissions
If you can't run it with minimal permissions, find another server.
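MCP itself doesn't define a permission manifest, so this is a hypothetical scheme: keep an explicit allowlist of grants you're willing to make, and refuse any server whose requested scopes exceed it.

```python
# Hypothetical permission-vetting helper. The scope strings and the
# allowlist contents are illustrative, not part of any MCP spec.
ALLOWED = {"db:read", "fs:read:/app/data", "net:api.example.com"}

def excess_permissions(requested, allowed=ALLOWED):
    """Return the permissions a server asks for beyond the allowlist.

    Empty result: the server runs under least privilege. Anything
    else: find another server (or strip the grant and see if it
    still works).
    """
    return sorted(set(requested) - allowed)
```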
5. Test with production load
Spin up the MCP server locally and hit it with concurrent requests:
```
for i in {1..50}; do
  curl -X POST localhost:3000/query \
    -d '{"query": "SELECT COUNT(*) FROM users"}' &
done
wait
```

Watch memory usage, response times, and error rates. Production MCP servers should handle 50+ concurrent requests without breaking.
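If you want numbers instead of eyeballing, the same load test is easy to script. A sketch where `call` is any zero-argument function that raises on failure (a stand-in for the real HTTP request to your server):

```python
import concurrent.futures
import time

def load_test(call, n=50):
    """Fire `n` concurrent requests at `call` and report the error
    rate and worst-case latency. `call` is a zero-arg function that
    raises on failure -- wrap your actual query in one."""
    def timed(_):
        start = time.monotonic()
        try:
            call()
            return time.monotonic() - start, None
        except Exception as exc:
            return time.monotonic() - start, exc

    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(timed, range(n)))
    errors = sum(1 for _, exc in results if exc is not None)
    worst = max(latency for latency, _ in results)
    return {"error_rate": errors / n, "max_latency": worst}
```

Threads are enough here because the work is I/O-bound waiting on the server; the GIL isn't the bottleneck.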
The pattern that works:
Start with 2-3 vetted MCP servers. Test them for a week. Add monitoring. Only then consider adding more.
I keep an "MCP graveyard": a list of servers that looked promising but failed in production. It's longer than my approved list.
The MCP ecosystem is powerful, but it's also the Wild West. Vet before you deploy, or your agent will spend more time recovering from failures than doing actual work.