
Phantom -- Autonomous Agent Builder
Your agent builder that designs self-healing autonomous systems with perception-action loops -- agents that run themselves.
About
name: phantom
description: >
  Build production AI agent architectures with self-healing loops, tool use,
  and multi-step reasoning.
  USE WHEN: User needs to design autonomous AI agents, implement ReAct loops,
  add error recovery, manage context windows, or build tool-using agents.
  DON'T USE WHEN: User needs multi-agent coordination (use Hivemind),
  persistent memory across sessions (use Recall), or MCP server tooling
  (use Switchblade).
  OUTPUTS: Agent architectures, ReAct loop implementations, tool registration
  patterns, error recovery flows, guardrail configurations, context management
  strategies.
version: 1.0.0
author: SpookyJuice
tags: [ai-agents, autonomous, self-healing, tool-use, reasoning-loops, orchestration]
price: 14
author_url: "https://www.shopclawmart.com"
support: "brian@gorzelic.net"
license: proprietary
osps_version: "0.1"
Phantom
Version: 1.0.0 Price: $14 Type: Skill
Description
Most AI agent tutorials show you a happy-path demo where the LLM calls a tool once and returns a perfect answer. Production agents live in a different universe -- they hit rate limits, receive malformed tool outputs, lose track of their goal mid-chain, hallucinate tool names that don't exist, and burn through token budgets on circular reasoning. Phantom gives you the architecture patterns that survive contact with reality.
This skill covers the full lifecycle of autonomous agent design: from selecting the right loop architecture (ReAct, Plan-and-Execute, LLM Compiler) to implementing self-healing error recovery that catches failures before they cascade. You get concrete patterns for tool registration, output validation, context window management, and the guardrails that keep agents from going off the rails in production.
Phantom is opinionated about what matters. Every pattern is designed for observability -- you can trace exactly why your agent made each decision, what tools it considered, and where it recovered from errors. If your agent fails silently, you have a logging problem, not an AI problem.
Prerequisites
- LLM API access (Anthropic, OpenAI, or compatible provider)
- Python 3.11+ or Node.js 18+ runtime
- Basic understanding of prompt engineering and function calling
- For production: structured logging (e.g., structlog, pino) and a tracing backend (e.g., LangSmith, Braintrust, or custom)
Setup
- Copy `SKILL.md` into your OpenClaw skills directory
- Set your LLM provider credentials:
  `export ANTHROPIC_API_KEY="sk-ant-..."` or `export OPENAI_API_KEY="sk-..."`
- Reload OpenClaw
Commands
- "Design an agent architecture for [task domain]"
- "Implement a ReAct loop for [use case]"
- "Add self-healing error recovery to my agent"
- "Build a tool registration system for [tool set]"
- "Manage context window for a [long-running/multi-step] agent"
- "Add guardrails to prevent [specific failure mode]"
- "Trace and debug my agent's reasoning chain"
- "Implement a Plan-and-Execute agent for [complex task]"
- "Add output validation to my agent's tool calls"
Workflow
ReAct Agent Implementation
- Loop architecture -- choose between pure ReAct (thought-action-observation per step), Plan-and-Execute (plan upfront, execute sequentially), or LLM Compiler (parallelize independent actions). ReAct is best for exploratory tasks; Plan-and-Execute for deterministic workflows; LLM Compiler when you can identify independent subtasks.
- Tool registry -- define tools with typed schemas (JSON Schema or Pydantic models), clear descriptions, and example invocations. The LLM chooses tools based on descriptions -- vague descriptions produce vague tool selection. Include parameter constraints and expected output shapes.
- Reasoning loop -- implement the core loop: (a) format current context into a prompt, (b) call the LLM with available tools, (c) parse the response for tool calls or final answer, (d) execute tool calls with timeout and error handling, (e) append observations to context, (f) repeat or terminate. Set a hard maximum iteration count -- unbounded loops are production incidents waiting to happen.
- Output parsing -- validate every LLM output against expected schema before acting on it. Handle: missing required fields, hallucinated tool names, malformed JSON, and the LLM deciding to "explain" instead of returning structured output. Use retry with corrective prompting when validation fails.
- Termination conditions -- define explicit exit criteria: task completion (LLM returns a final answer), max iterations reached, budget exhausted (token count or API cost), timeout, or unrecoverable error. Log the termination reason for every run.
- State serialization -- serialize agent state after each step so you can resume from any checkpoint. This enables debugging (replay from step N), recovery (restart from last good state), and auditing (full execution trace).
Self-Healing Error Recovery
- Error taxonomy -- classify errors into: transient (rate limits, timeouts, network failures -- retry with backoff), correctable (malformed output, wrong tool -- retry with corrective prompt), and fatal (missing credentials, invalid configuration -- abort with clear message). Each category gets a different recovery strategy.
- Retry with reflection -- when a tool call fails, don't just retry blindly. Inject the error into the agent's context and ask it to reflect: "The previous action failed with [error]. Analyze why and choose a different approach." This turns errors into learning signals within the same run.
- Circuit breakers -- track failure rates per tool and per error type. If a tool fails 3 times consecutively, disable it for the remainder of the run and inform the agent. This prevents infinite retry loops on broken integrations.
- Fallback chains -- define fallback tools for critical operations. If the primary search API is down, fall back to a cached index. If the code execution sandbox fails, fall back to static analysis. The agent should know about fallbacks and select them explicitly.
- Health checks -- before starting an agent run, validate that all required tools are reachable and credentials are valid. Fail fast with actionable error messages rather than discovering issues mid-execution.
- Recovery logging -- log every error, every retry, and every recovery action with structured metadata: step number, tool name, error type, recovery strategy chosen, and outcome. This data is essential for improving your error handling over time.
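The taxonomy and circuit-breaker ideas above can be sketched as follows. This is a minimal illustration under assumptions: `RateLimitError` and `ValidationError` are placeholder exception types standing in for whatever your LLM client and validator actually raise.

```python
class RateLimitError(Exception):
    """Placeholder for a provider rate-limit exception."""

class ValidationError(Exception):
    """Placeholder for a schema-validation failure."""

def classify(exc: Exception) -> str:
    """Map an exception to the three-way taxonomy: transient / correctable / fatal."""
    if isinstance(exc, (RateLimitError, TimeoutError, ConnectionError)):
        return "transient"            # retry with backoff
    if isinstance(exc, ValidationError):
        return "correctable"          # retry with a corrective prompt
    return "fatal"                    # abort with a clear message

class CircuitBreaker:
    """Disable a tool after N consecutive failures for the rest of the run."""
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures: dict[str, int] = {}

    def record(self, tool: str, ok: bool) -> None:
        self.failures[tool] = 0 if ok else self.failures.get(tool, 0) + 1

    def is_open(self, tool: str) -> bool:
        return self.failures.get(tool, 0) >= self.threshold
```

In the loop, check `is_open(tool)` before dispatch; when the breaker trips, remove the tool from the set offered to the LLM and say so in the next prompt.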
Context Window Management
- Token budgeting -- calculate your available context window and divide it: system prompt (fixed), tool definitions (fixed), conversation history (variable), and working memory (variable). Reserve at least 25% for the LLM's response. Monitor token usage per step and trigger compression before you hit the limit.
- Progressive summarization -- when conversation history exceeds your budget, summarize older steps into a compact narrative. Keep the most recent 3-5 steps in full detail, summarize everything before that. Use a separate LLM call (cheaper model is fine) for summarization.
- Working memory -- maintain a structured "scratchpad" that the agent can read and write. This holds extracted facts, intermediate results, and task progress. It's more token-efficient than keeping the full conversation history because it only stores conclusions, not reasoning chains.
- Tool output truncation -- large tool outputs (API responses, file contents, search results) can blow your context budget in a single step. Implement automatic truncation with a size limit per tool output. Include a note like "[truncated -- 4,200 tokens omitted]" so the agent knows data was cut.
- Relevance filtering -- not all history is relevant to the current step. Implement a relevance scorer that marks older steps as "keep" or "drop" based on semantic similarity to the current subgoal. This is more sophisticated than simple truncation and preserves critical information.
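The budget split and tool-output truncation above can be sketched like this. It is an illustration only: `count_tokens` uses a crude ~4-characters-per-token heuristic, and in practice you would use your provider's tokenizer.

```python
def count_tokens(text: str) -> int:
    """Rough stand-in tokenizer: ~4 characters per token."""
    return max(1, len(text) // 4)

def make_budget(window: int, system: int, tools: int, reserve_pct: float = 0.25) -> dict:
    """Split the context window: fixed system prompt and tool definitions,
    a minimum 25% response reserve, and the rest for history plus working memory."""
    reserve = int(window * reserve_pct)
    variable = window - system - tools - reserve
    return {"system": system, "tools": tools,
            "history_and_memory": variable, "response_reserve": reserve}

def truncate_tool_output(text: str, limit_tokens: int) -> str:
    """Cap a single tool output and tell the agent data was cut."""
    tokens = count_tokens(text)
    if tokens <= limit_tokens:
        return text
    omitted = tokens - limit_tokens
    return text[: limit_tokens * 4] + f"\n[truncated -- ~{omitted} tokens omitted]"
```

Per step, sum the tokens of everything you are about to send; if it exceeds `history_and_memory`, trigger progressive summarization before calling the LLM.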
Output Format
PHANTOM -- AGENT ARCHITECTURE
Agent: [Agent Name]
Pattern: [ReAct / Plan-and-Execute / LLM Compiler]
Date: [YYYY-MM-DD]
=== ARCHITECTURE ===
[Loop diagram: prompt -> LLM -> parse -> tool -> observe -> repeat]
Max Iterations: [N]
Token Budget: [N tokens]
Tools: [N registered]
=== TOOL REGISTRY ===
| Tool | Description | Input Schema | Error Strategy |
|------|-------------|-------------|----------------|
| [name] | [what it does] | [schema ref] | [retry/fallback/abort] |
=== ERROR RECOVERY ===
| Error Type | Classification | Strategy | Max Retries |
|------------|---------------|----------|-------------|
| [error] | [transient/correctable/fatal] | [strategy] | [N] |
=== CONTEXT BUDGET ===
| Component | Token Allocation | Strategy |
|-----------|-----------------|----------|
| System prompt | [N] | Fixed |
| Tool definitions | [N] | Fixed |
| History | [N] | Summarize at threshold |
| Working memory | [N] | Structured scratchpad |
| Response reserve | [N] | Minimum 25% |
=== GUARDRAILS ===
[ ] [Safety check with expected behavior]
=== OBSERVABILITY ===
- Trace format: [structured log schema]
- Metrics: [latency, token usage, tool success rate, recovery rate]
Common Pitfalls
- Unbounded loops -- without a hard iteration limit, agents will loop forever on tasks they can't solve. Always set a max_iterations and a total token budget. When either is exceeded, the agent must terminate with a partial result, not spin.
- Vague tool descriptions -- the LLM selects tools based on their descriptions. If your search tool's description says "searches for things," the LLM will use it for everything. Be specific: "Searches the company knowledge base by keyword query. Returns top 5 matching documents with titles and snippets."
- Silent tool failures -- swallowing tool errors and returning empty results makes the agent think the tool worked but found nothing. Always propagate errors back to the agent so it can reason about what went wrong.
- Context window overflow -- a single large tool response can consume your entire context budget. Implement per-tool output size limits and truncation before you discover this in production at 3 AM.
- No termination reasoning -- agents that just stop without explaining why are impossible to debug. Force the agent to emit a termination reason (task_complete, max_iterations, error, budget_exhausted) with every run.
- Retry without reflection -- blindly retrying a failed action usually produces the same failure. Inject the error into context and ask the agent to choose a different approach.
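The termination-reason pitfall above is cheap to fix with a small structured emitter; a sketch (the event shape and field names are illustrative):

```python
import enum
import json

class TerminationReason(enum.Enum):
    TASK_COMPLETE = "task_complete"
    MAX_ITERATIONS = "max_iterations"
    BUDGET_EXHAUSTED = "budget_exhausted"
    ERROR = "error"

def emit_termination(reason: TerminationReason, step: int, tokens_used: int) -> str:
    """Emit one structured log line explaining why the run ended."""
    return json.dumps({"event": "agent_terminated",
                       "reason": reason.value,
                       "step": step,
                       "tokens_used": tokens_used})
```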
Guardrails
- Hard iteration limits. Every agent run has a maximum iteration count. When reached, the agent terminates with a partial result and a clear explanation of progress made. No exceptions.
- Token budget enforcement. Track cumulative token usage across all LLM calls in a run. When the budget is 80% consumed, trigger context compression. When 100% consumed, terminate gracefully.
- Tool output validation. Every tool response is validated against its expected output schema before being injected into context. Malformed outputs are caught and handled, never silently passed through.
- No credential exposure. Agent reasoning traces, logs, and outputs are scrubbed of API keys, tokens, and secrets before storage or display. Tool inputs containing credentials are redacted in logs.
- Sandboxed execution. Code execution tools run in isolated environments (containers, sandboxes) with resource limits. The agent cannot access the host filesystem, network, or other processes unless explicitly configured.
- Cost tracking. Every LLM call logs its token usage and estimated cost. The agent emits a total cost summary at termination. This prevents runaway API bills during development and production.
- Human-in-the-loop hooks. Critical actions (sending emails, modifying databases, deploying code) require explicit human approval. The agent pauses and presents the proposed action for review before executing.
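The token-budget guardrail above (compress at 80%, terminate at 100%) can be sketched as a small tracker; a minimal illustration, with the action strings as assumed names:

```python
class TokenBudget:
    """Track cumulative token usage across all LLM calls in one run."""
    def __init__(self, limit: int, compress_at: float = 0.8):
        self.limit = limit
        self.compress_at = compress_at
        self.used = 0

    def record(self, tokens: int) -> str:
        """Record one call's usage and return the action the run should take."""
        self.used += tokens
        if self.used >= self.limit:
            return "terminate"        # graceful shutdown with a partial result
        if self.used >= self.limit * self.compress_at:
            return "compress"         # summarize older history before continuing
        return "continue"
```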
Support
Questions or issues with this skill? Contact brian@gorzelic.net Published by SpookyJuice -- https://www.shopclawmart.com
Core Capabilities
- ReAct Loops
- Agent Tool Registration
- Agent Error Recovery
- Agent Context Management
- Agent Guardrails
Version History
This skill is actively maintained.
March 8, 2026
v1.0.0 — Wave 4 launch
Creator
SpookyJuice.ai
An AI platform that builds, monitors, and evolves itself
Multiple AI agents and one human collaborate around the clock — writing code, deploying infrastructure, and growing a shared knowledge graph. This page is a live dashboard of the running system. Everything you see is real data, updated in real time.
Details
- Type
- Skill
- Category
- Engineering
- Price
- $14
- Version
- 1.0.0
- License
- One-time purchase