How to Add Custom Tools to Your OpenClaw Agent in 10 Minutes

Let's cut straight to it: adding custom tools to an AI agent should not be a multi-day research project. But if you've spent any time in LangChain Discord servers or scrolled through Reddit threads about agent frameworks, you know that's exactly what it becomes. People burn entire weekends wrestling with Pydantic schema errors, debugging why the model hallucinates a response instead of calling their carefully defined function, and untangling state management spaghetti.
OpenClaw fixes most of this. Not all of it: you still need to think about what your tool does and write a decent description. But the framework gets out of your way in a way most alternatives simply don't.
I'm going to walk you through adding a custom tool to an OpenClaw agent from scratch. By the end, you'll have a working tool integrated into your agent, and you'll understand the pattern well enough to add five more before lunch.
Why Custom Tools Matter (and Why They're Usually Painful)
An AI agent without tools is just a chatbot with delusions of grandeur. Tools are what let your agent actually do things: query a database, send a Slack message, check inventory, hit an API, run a calculation, whatever your use case demands.
The problem is that every major framework has made this harder than it needs to be:
- LangChain requires you to navigate a maze of `StructuredTool`, `BaseTool`, `@tool` decorators, and Pydantic schemas that break if you look at them sideways.
- CrewAI ties tools to agents in rigid ways that fall apart when you need shared state or dynamic tool selection.
- AutoGen makes registering tools across multiple agents verbose to the point of absurdity.
- OpenAI's Assistants API works for simple cases but crumbles when you need error recovery or want custom tools alongside file search.
The core issues are always the same: schema definition is fragile, debugging is opaque, the model ignores your tool or calls it with garbage arguments, and error handling is an afterthought.
OpenClaw takes a different approach. Tools are defined with a simple, consistent interface. The schema is inferred from your code. Descriptions are first-class citizens. And when something goes wrong, you can actually see why.
The Setup: What You Need
Before we get into code, make sure you have:
- OpenClaw installed: `pip install openclaw` (requires Python 3.9+)
- An OpenClaw project initialized: `openclaw init my-agent`
- Your API keys configured: OpenClaw supports multiple LLM providers; set your preferred one in `.openclaw/config.yaml`
If you don't have a project yet, the init command scaffolds everything:
```shell
pip install openclaw
openclaw init my-agent
cd my-agent
```
This gives you a project structure like:
```
my-agent/
├── .openclaw/
│   └── config.yaml
├── agent.py
├── tools/
│   └── __init__.py
└── skills/
    └── __init__.py
```
The `tools/` directory is where your custom tools live. The `skills/` directory is for higher-level compositions of tools (more on that later). Let's focus on tools.
Step 1: Define Your Tool
Let's say you're building an agent that helps manage an e-commerce store, and you need a tool that checks current inventory for a product. Here's the entire tool definition:
```python
# tools/check_inventory.py
import requests

from openclaw import Tool, ToolResult


class CheckInventory(Tool):
    name = "check_inventory"
    description = (
        "Checks the current inventory count for a product. "
        "Use this when the user asks about stock levels, availability, "
        "or how many units remain for a specific product. "
        "Requires a product_id (string) and optionally a warehouse location."
    )

    def run(self, product_id: str, warehouse: str = "all") -> ToolResult:
        # Your actual inventory logic here
        inventory_data = self._fetch_inventory(product_id, warehouse)
        return ToolResult(
            success=True,
            data=inventory_data,
            message=f"Found inventory for product {product_id}"
        )

    def _fetch_inventory(self, product_id: str, warehouse: str) -> dict:
        # Replace with your real API call, database query, etc.
        # This is a stub for demonstration
        response = requests.get(
            f"https://api.yourstore.com/inventory/{product_id}",
            params={"warehouse": warehouse},
            headers={"Authorization": f"Bearer {self.context.get('api_key')}"}
        )
        response.raise_for_status()
        return response.json()
```
That's it. No Pydantic model to define separately. No JSON schema to hand-craft. No decorator gymnastics. OpenClaw reads the type hints on run() (plus its docstring, if you write one) and generates the tool schema automatically.
A few things to notice:
- `name` is what the LLM sees when deciding which tool to call. Keep it clear and action-oriented.
- `description` is the single most important field. This is your tool's resume. More on this in a minute.
- `run()` accepts typed parameters and returns a `ToolResult`. OpenClaw uses the type hints to build the argument schema that gets sent to the LLM.
- `self.context` gives you access to shared agent context: API keys, user session data, whatever you've injected. No weird global state hacks.
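One nice consequence of `run()` being a plain method: you can sanity-check a tool in an ordinary unit test before involving any LLM. Here's a standalone sketch of that idea; the `ToolResult` stand-in and the stubbed fetch are illustrative only, not OpenClaw's actual classes:

```python
from dataclasses import dataclass
from typing import Optional

# Minimal stand-in for a ToolResult so this sketch runs without the framework
@dataclass
class ToolResult:
    success: bool
    data: Optional[dict] = None
    message: str = ""

class CheckInventory:
    name = "check_inventory"

    def run(self, product_id: str, warehouse: str = "all") -> ToolResult:
        inventory_data = self._fetch_inventory(product_id, warehouse)
        return ToolResult(
            success=True,
            data=inventory_data,
            message=f"Found inventory for product {product_id}",
        )

    def _fetch_inventory(self, product_id: str, warehouse: str) -> dict:
        # Stubbed response in place of the real HTTP call
        return {"product_id": product_id, "warehouse": warehouse, "count": 47}

result = CheckInventory().run("SKU-12345")
print(result.message)  # Found inventory for product SKU-12345
```

Testing the business logic in isolation like this means that when something breaks later, you know it's the model's tool selection, not your code.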
Step 2: Register the Tool with Your Agent
Open your `agent.py` and register the tool:
```python
# agent.py
from openclaw import Agent
from tools.check_inventory import CheckInventory

agent = Agent(
    name="Store Assistant",
    instructions=(
        "You are a helpful e-commerce assistant. "
        "Always check inventory before making claims about product availability. "
        "If a tool call fails, tell the user you're having trouble and suggest they try again."
    ),
    tools=[
        CheckInventory(),
    ],
    context={
        "api_key": "your-inventory-api-key-here"
    }
)
```
Run it:
```shell
openclaw run agent.py
```
That's two files. Maybe 40 lines of code total. Your agent now has a working inventory tool.
Step 3: Write a Description That Actually Works
I said the description was the most important field, and I meant it. This is the number one reason custom tools fail across every framework, not just OpenClaw. The LLM decides whether to use your tool based almost entirely on the description. A bad description means your tool gets ignored, or worse, called with wrong arguments.
Here's what bad looks like:
```python
description = "Checks inventory"
```
The model has almost no information. When should it use this vs. just answering from memory? What arguments does it need? What does "inventory" even mean in this context?
Here's what good looks like:
```python
description = (
    "Checks the current inventory count for a product in the store's warehouse system. "
    "Use this tool whenever the user asks about stock levels, product availability, "
    "whether an item is in stock, or how many units are left. "
    "Requires product_id as a string (e.g., 'SKU-12345'). "
    "Optionally accepts a warehouse parameter to check a specific location "
    "('east', 'west', 'central') or defaults to 'all' for aggregate counts. "
    "Returns the current unit count and last restock date."
)
```
This tells the model:
- What the tool does (checks inventory counts)
- When to use it (stock levels, availability, units remaining)
- What arguments it needs (product_id, with an example format)
- What optional parameters exist (warehouse, with valid values)
- What it returns (unit count and restock date)
The difference between these two descriptions is the difference between a tool that works reliably and a tool you'll spend three days debugging.
OpenClaw also supports an `examples` field on tools for few-shot prompting, which can help with trickier tools:
```python
class CheckInventory(Tool):
    name = "check_inventory"
    description = "..."
    examples = [
        {
            "user_says": "Do you have any blue widgets left?",
            "tool_call": {"product_id": "WIDGET-BLUE-001", "warehouse": "all"}
        },
        {
            "user_says": "How many units in the east warehouse for SKU-999?",
            "tool_call": {"product_id": "SKU-999", "warehouse": "east"}
        }
    ]
```
This is subtle but powerful. For tools where the model consistently misformats arguments, a couple of examples in this format solve the problem almost immediately. Most frameworks make you jam few-shot examples into the system prompt. OpenClaw attaches them directly to the tool, which is where they belong.
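Conceptually, attaching examples to a tool just means each pair gets rendered into a few-shot exchange near the tool's schema. How OpenClaw does this internally is its own business; the following is a rough, standalone sketch of the idea with entirely illustrative names and message format:

```python
import json

def render_tool_examples(tool_name, examples):
    """Turn a tool's example pairs into few-shot chat messages (illustrative only)."""
    messages = []
    for ex in examples:
        # The user utterance the model should learn to recognize
        messages.append({"role": "user", "content": ex["user_says"]})
        # The well-formed tool call the model should learn to produce
        messages.append({
            "role": "assistant",
            "content": f"<tool_call name={tool_name!r}>{json.dumps(ex['tool_call'])}</tool_call>",
        })
    return messages

examples = [
    {"user_says": "Do you have any blue widgets left?",
     "tool_call": {"product_id": "WIDGET-BLUE-001", "warehouse": "all"}},
]
msgs = render_tool_examples("check_inventory", examples)
```

The payoff of tool-attached examples is locality: the demonstrations travel with the tool definition, so pruning or swapping a tool automatically prunes its examples too.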
Step 4: Handle Errors Like a Grown-Up
Here's where most frameworks completely fall over. Your tool will fail at some point: the API is down, the product ID is invalid, the network times out. What happens then?
In LangChain, by default: the agent often enters a retry loop, hallucinates an answer, or just dies. In OpenClaw, you handle it explicitly:
```python
# run() in tools/check_inventory.py (requires `import requests` at module top)
def run(self, product_id: str, warehouse: str = "all") -> ToolResult:
    try:
        inventory_data = self._fetch_inventory(product_id, warehouse)
        return ToolResult(
            success=True,
            data=inventory_data,
            message=f"Found inventory for product {product_id}"
        )
    except requests.exceptions.HTTPError as e:
        if e.response.status_code == 404:
            return ToolResult(
                success=False,
                error=f"Product {product_id} not found in the system. "
                      "Ask the user to double-check the product ID.",
                recoverable=True
            )
        return ToolResult(
            success=False,
            error="Inventory system is currently unavailable. "
                  "Suggest the user try again in a few minutes.",
            recoverable=False
        )
    except requests.exceptions.Timeout:
        return ToolResult(
            success=False,
            error="Inventory check timed out. Suggest trying again.",
            recoverable=True
        )
```
The `ToolResult` has a `recoverable` flag. When `recoverable=True`, OpenClaw lets the agent try a different approach or ask the user for clarification. When `recoverable=False`, it signals the agent to gracefully acknowledge the failure rather than spinning its wheels.
This is the kind of thing you don't think about until your agent is in production and a user reports that it confidently told them a product was in stock when the API was actually down. Error handling isn't glamorous, but it's the difference between a demo and a product.
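To make the recoverable distinction concrete, here's a standalone sketch of the kind of branching an agent loop can do with that flag. The `ToolResult` stand-in and `next_action` helper are illustrative, not OpenClaw internals:

```python
from dataclasses import dataclass

# Stand-in result type so the sketch runs without any framework
@dataclass
class ToolResult:
    success: bool
    data: dict = None
    error: str = ""
    recoverable: bool = False

def next_action(result: ToolResult) -> str:
    """Decide what the agent loop should do with a tool result."""
    if result.success:
        return "answer"            # feed result.data back to the model
    if result.recoverable:
        return "retry_or_clarify"  # let the model try new args or ask the user
    return "acknowledge_failure"   # admit defeat plainly; do not retry-loop
```

The key property is that a hard outage can never be mistaken for a fixable input problem, which is exactly the failure mode that produces confident wrong answers.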
Step 5: Add Multiple Tools and Watch Them Compose
One tool is useful. Multiple tools working together is where agents actually become powerful. Let's add a price-checking tool and a tool that creates discount codes:
```python
# tools/check_price.py
from openclaw import Tool, ToolResult


class CheckPrice(Tool):
    name = "check_price"
    description = (
        "Retrieves the current price for a product. "
        "Use when the user asks about pricing, cost, or how much something costs. "
        "Requires product_id (string). Returns price in USD and any active promotions."
    )

    def run(self, product_id: str) -> ToolResult:
        # Your pricing logic
        price_data = {"product_id": product_id, "price": 49.99, "currency": "USD"}
        return ToolResult(success=True, data=price_data)
```
```python
# tools/create_discount.py
from openclaw import Tool, ToolResult


class CreateDiscount(Tool):
    name = "create_discount_code"
    description = (
        "Creates a one-time discount code for a customer. "
        "Use ONLY when the user explicitly asks for a discount or when instructed to "
        "offer a retention discount. Requires discount_percent (integer, 5-25) and "
        "customer_email (string). Returns the generated discount code."
    )
    requires_approval = True  # Human-in-the-loop for sensitive actions

    def run(self, discount_percent: int, customer_email: str) -> ToolResult:
        if not 5 <= discount_percent <= 25:
            return ToolResult(
                success=False,
                error="Discount must be between 5% and 25%.",
                recoverable=True
            )
        # Create the code in your system
        code = f"SAVE{discount_percent}-{customer_email[:4].upper()}"
        return ToolResult(
            success=True,
            data={"code": code, "percent": discount_percent}
        )
```
Register them all:
```python
agent = Agent(
    name="Store Assistant",
    instructions="...",
    tools=[
        CheckInventory(),
        CheckPrice(),
        CreateDiscount(),
    ],
    context={"api_key": "..."}
)
```
Notice `requires_approval = True` on the discount tool. This is OpenClaw's built-in human-in-the-loop mechanism. When the agent tries to create a discount code, the execution pauses and waits for approval before the tool actually runs. No bolted-on middleware. No custom callback chains. One line.
This matters. Security-sensitive tools (sending emails, making purchases, modifying data, executing code) should never run unattended without at least the option for human review. The fact that most frameworks treat this as an advanced topic instead of a first-class feature is, frankly, irresponsible.
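The mechanics of an approval gate are simple enough to sketch standalone. This is not OpenClaw's implementation, just the shape of the idea: execution stops before `run()` for flagged tools until someone signs off:

```python
class ApprovalRequired(Exception):
    """Raised when a flagged tool is invoked without a human sign-off."""

def execute_tool(tool, approved=False, **args):
    # Gate: flagged tools do not run until a human has approved this call
    if getattr(tool, "requires_approval", False) and not approved:
        raise ApprovalRequired(f"'{tool.name}' needs approval before running")
    return tool.run(**args)

# Illustrative sensitive tool for the sketch
class DeleteOrder:
    name = "delete_order"
    requires_approval = True

    def run(self, order_id: str) -> str:
        return f"deleted {order_id}"
```

The important design choice is that the gate sits in the executor, not in the tool: tool authors only declare intent with a flag, and the framework guarantees the pause happens.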
Debugging: Seeing What Your Agent Is Thinking
When your tool isn't getting called, or it's getting called with bad arguments, you need visibility. Run your agent with the trace flag:
```shell
openclaw run agent.py --trace
```
This outputs the full decision trace: what the model considered, why it selected (or didn't select) a tool, what arguments it generated, and what the tool returned. It looks something like:
```
[TRACE] User: "Do you have blue widgets in stock?"
[TRACE] Model reasoning: User asking about stock → check_inventory tool matches
[TRACE] Tool call: check_inventory(product_id="WIDGET-BLUE", warehouse="all")
[TRACE] Tool result: success=True, data={"count": 47, "warehouse": "all"}
[TRACE] Model response: "Yes! We currently have 47 blue widgets in stock."
```
Compare this to most frameworks where you stare at a final output and have absolutely no idea what happened in between. Observability isn't optional when you're building agents that interact with real systems.
The Fast Path: Skip the Boilerplate
Everything above is straightforward, but I'll be honest: if you're building anything beyond a toy project, you're going to want pre-built tools for common integrations (Slack, email, databases, search, file handling) plus well-tested patterns for error handling and state management.
If you don't want to set all this up manually, Felix's OpenClaw Starter Pack on Claw Mart is genuinely the fastest way to get moving. For $29, you get a bundle of pre-configured skills and tool templates that cover the most common use cases: inventory systems, notification tools, database connectors, and more. The descriptions are already battle-tested (which, if you've read this far, you know is half the battle), and the error handling patterns are production-grade out of the box.
I'm not saying you can't build all of this yourself. You obviously can; I just walked you through it. But the Starter Pack saves you the iteration cycles of learning which description phrasing works best, how to structure error recovery for different failure modes, and how to compose tools into skills that don't trip over each other. It's the kind of thing I wish existed when I was starting out, and it would have saved me a solid week of trial and error.
Quick Reference: The Tool Checklist
Before you ship any custom tool, run through this:
- Name is a clear, action-oriented verb phrase (`check_inventory`, not `inventory` or `inv_tool`)
- Description explains what, when, arguments, and return value
- Type hints on all `run()` parameters (OpenClaw uses these for schema generation)
- Default values for optional parameters
- Error handling returns `ToolResult` with meaningful error messages and the `recoverable` flag
- `requires_approval` is set for any tool that modifies data or triggers external actions
- Examples provided for tools with complex or ambiguous argument patterns
- Tested with `--trace` to verify the model calls it correctly
What to Build Next
Now that you know the pattern, here's where to go:
- Build a tool that connects to your actual system. Not a stub: a real API call, a real database query. The gap between demo and production is where all the learning happens.
- Add 3-5 tools and test them together. Multi-tool composition is where agents shine and where description quality becomes critical. You'll find tools that "compete" for the same queries; this is where you refine descriptions.
- Explore OpenClaw skills. Skills are compositions of tools with defined workflows. Once your individual tools work, skills let you chain them into reliable sequences: "check inventory, then check price, then offer a discount if the item is low stock."
- Set up traces in your staging environment. Don't wait until production to figure out why your agent is making weird decisions.
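As a taste of the skills idea, here's a standalone sketch of the "low stock discount" sequence expressed as plain function composition. Everything here is illustrative (the fake tool callables, the threshold, the return shape); OpenClaw's actual skill API will look different:

```python
def low_stock_discount(check_inventory, check_price, create_discount,
                       product_id, customer_email, threshold=10):
    """Chain tools: inventory, then price, then a discount if stock is low."""
    inv = check_inventory(product_id)
    if not inv["success"]:
        return {"offer": None, "reason": "inventory check failed"}
    price = check_price(product_id)
    if inv["count"] < threshold:
        # Only offer the retention discount when the item is nearly gone
        code = create_discount(10, customer_email)
        return {"offer": code, "price": price, "stock": inv["count"]}
    return {"offer": None, "price": price, "stock": inv["count"]}

# Fake tool callables standing in for real Tool instances
result = low_stock_discount(
    check_inventory=lambda pid: {"success": True, "count": 3},
    check_price=lambda pid: 49.99,
    create_discount=lambda pct, email: f"SAVE{pct}-{email[:4].upper()}",
    product_id="SKU-12345",
    customer_email="jane@example.com",
)
print(result["offer"])  # SAVE10-JANE
```

The point of a skill is exactly this: the ordering and the branching live in one place, instead of being re-derived by the model on every conversation.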
Custom tools are the foundation of useful agents. OpenClaw makes the mechanical parts easy so you can focus on what actually matters: defining the right tools for your use case and writing descriptions that make the model use them correctly.
Stop overthinking it. Go build something.