Task Manager Agent: Turn Messages into Todo Lists

Every week, you get buried. Slack messages from your team. Emails from clients. Voice notes from your co-founder at 11pm that start with "okay so I had a thought." Feature requests in Discord. Bug reports in a Google Doc that somehow has three different comment threads arguing about the same thing.
And somewhere in that chaos, actual tasks are hiding. Real things that need to get done, assigned, tracked, and finished.
You know what most people do? They skim everything, half-remember the important stuff, forget to write down the rest, and then spend Friday afternoon wondering why nothing got shipped.
Here's the thing: this problem is solved now. Not with another project management app you'll abandon in two weeks. With an AI agent that actually reads your messages, extracts the tasks, organizes them, and hands you a clean todo list — automatically.
Today we're building a Task Manager Agent using OpenClaw. It takes unstructured input (messages, notes, emails, whatever) and turns it into structured, prioritized, actionable task lists. And once you see how straightforward this is, you'll wonder why you've been doing it manually.
Why This Agent, Why Now
Let me be blunt: the "AI agent" hype cycle has produced a lot of garbage. Demos that look impressive in a Twitter video but break the second you try to do anything real.
But task extraction from natural language? This is one of the genuinely useful applications. It's bounded, practical, and solves a real daily problem. You're not asking an agent to "research the entire internet and write a business plan." You're asking it to read a paragraph and pull out the action items. That's a task LLMs are extremely good at.
The reason most people haven't built this yet is that stitching together the pieces — parsing input, maintaining state, handling different formats, keeping a persistent task list that survives a restart — is annoying. That's exactly where OpenClaw comes in.
OpenClaw was built by people who actually ran complex agent workflows and got tired of watching them die halfway through. It's persistence-first, lightweight, and has native concepts for the things agents actually need: task state management, tool result caching, dynamic task creation, and observability that doesn't require a PhD to set up.
For our task manager agent, we're going to lean heavily on OpenClaw's DAG-based task orchestration and its built-in persistence layer. Let's get into it.
The Architecture (Keep It Simple)
Here's what our agent does:
- Ingests unstructured text (a Slack dump, an email, a meeting transcript, a rambling voice note transcription)
- Extracts discrete tasks from that text
- Enriches each task with metadata — priority, assignee (if mentioned), deadline (if mentioned), category
- Stores the tasks persistently
- Outputs a clean, structured todo list
That's it. No overengineering. No seventeen-microservice architecture diagram. Five steps.
In OpenClaw terms, this maps to a simple DAG with four nodes:
[Ingest] → [Extract Tasks] → [Enrich & Prioritize] → [Store & Output]
Each node is an OpenClaw task with its own checkpoint, so if something fails at step 3, you don't re-run steps 1 and 2. This is the kind of thing that sounds minor until your agent is processing a 4,000-word meeting transcript and the LLM call in the enrichment step times out. With OpenClaw's persistence, you just retry from the checkpoint. Without it, you start over, burn more tokens, and hate your life.
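OpenClaw handles checkpointing for you, but the pattern is worth seeing in miniature. Here's a framework-agnostic sketch of checkpoint-resume, assuming JSON-serializable step outputs; the function and step names are illustrative, not OpenClaw's API:

```python
import json
from pathlib import Path

def run_with_checkpoints(steps, state_file):
    """Run (name, fn) steps in order, saving each result to disk so a
    rerun resumes after the last completed step instead of starting over."""
    path = Path(state_file)
    done = json.loads(path.read_text()) if path.exists() else {}
    data = None
    for name, fn in steps:
        if name in done:
            data = done[name]  # restored from checkpoint, not re-executed
            continue
        data = fn(data)        # run the step...
        done[name] = data
        path.write_text(json.dumps(done))  # ...and checkpoint its output
    return data
```

If the "enrich" step throws on the first run, the next invocation skips straight past "ingest" and "extract" because their results are already on disk. That's the whole survivability trick, minus the ergonomics.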
Setting Up the Project
If you're new to OpenClaw, the fastest way to get from zero to a working agent is Felix's OpenClaw Starter Pack. It includes pre-configured templates, the environment setup, and example flows that you can modify rather than building from scratch. I genuinely recommend it as a starting point — not because it's the only way, but because it cuts out about two hours of "why isn't my Redis connection working" debugging that nobody needs in their life.
Once you've got your OpenClaw environment running, let's scaffold the agent.
from openclaw import Agent, Task, DAG, Checkpoint
from openclaw.tools import LLMTool, StorageTool
from openclaw.persistence import RedisBackend

# Initialize persistence
backend = RedisBackend(host="localhost", port=6379, db=0)

# Create the agent
task_manager = Agent(
    name="task_extractor",
    description="Extracts and organizes tasks from unstructured text",
    persistence=backend,
    max_retries=3,
    token_budget=50000  # Safety net — don't let it run wild
)
A few things to notice here. The token_budget parameter is something OpenClaw gives you natively. This is a direct answer to one of the biggest pain points in the agent community right now: runaway cost. Without a budget, a poorly written extraction loop could burn through your API credits in minutes. OpenClaw lets you set a ceiling and the agent will gracefully stop when it approaches the limit.
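Conceptually, the budget works like a running meter that trips before the hard ceiling. Here's an illustrative sketch of that idea; the class name and safety-margin behavior are my assumptions, not OpenClaw's actual internals:

```python
class TokenBudget:
    """Track cumulative token usage and signal when a ceiling is near.
    A sketch of the token_budget concept, not OpenClaw's implementation."""

    def __init__(self, limit, safety_margin=0.9):
        self.limit = limit
        self.used = 0
        # Trip *before* the hard limit so an in-flight call can finish
        self.cutoff = int(limit * safety_margin)

    def record(self, tokens):
        self.used += tokens

    def exhausted(self):
        return self.used >= self.cutoff
```

A processing loop would call `record()` after each LLM response and check `exhausted()` before issuing the next call, stopping gracefully with partial results instead of an overdrawn API bill.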
The max_retries=3 combined with the Redis persistence means if an LLM call fails (rate limit, timeout, whatever), OpenClaw retries from the last checkpoint rather than restarting the entire pipeline. This is the "survivability" feature that people in the community rave about.
Step 1: The Ingestion Task
ingest_task = Task(
    name="ingest_raw_input",
    description="Accept and normalize raw text input",
    checkpoint=True
)

@ingest_task.run
def ingest(context):
    raw_input = context.get("raw_input")

    # Normalize: handle common formats
    # Strip excessive whitespace, normalize line breaks
    cleaned = " ".join(raw_input.split())

    # If input is very long, chunk it
    max_chunk_size = 3000  # measured in words, a rough proxy for tokens
    if len(cleaned.split()) > max_chunk_size:
        words = cleaned.split()
        chunks = []
        for i in range(0, len(words), max_chunk_size):
            chunks.append(" ".join(words[i:i + max_chunk_size]))
        context.set("input_chunks", chunks)
    else:
        context.set("input_chunks", [cleaned])

    context.set("chunk_count", len(context.get("input_chunks")))
    return {"status": "ingested", "chunks": len(context.get("input_chunks"))}
Nothing fancy here. We're cleaning the input and chunking it if it's long. The checkpoint=True flag means OpenClaw saves the state after this task completes. If the next task fails, we don't re-run ingestion.
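One refinement worth considering: fixed-size chunks can split a task right across a boundary. Overlapping windows reduce that risk at the cost of a little duplicate extraction, which the dedup step downstream already handles. A minimal sketch:

```python
def chunk_with_overlap(text, max_words=3000, overlap=200):
    """Split text into word-count chunks that overlap, so a task
    mentioned near a boundary appears whole in at least one chunk."""
    words = text.split()
    if len(words) <= max_words:
        return [" ".join(words)]
    step = max_words - overlap  # advance by less than the window size
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), step)]
```

Swapping this into the ingest step is a one-line change; tune `overlap` to roughly the length of your longest expected task mention.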
Step 2: Task Extraction
This is where the LLM does the heavy lifting.
extract_task = Task(
    name="extract_tasks",
    description="Use LLM to extract discrete tasks from text chunks",
    checkpoint=True,
    tool=LLMTool(
        model="your-preferred-model",
        temperature=0.1  # Low temp = more consistent extraction
    )
)

EXTRACTION_PROMPT = """
You are a task extraction assistant. Read the following text and extract
every actionable task mentioned. For each task, provide:

- task_description: A clear, concise description of what needs to be done
- mentioned_assignee: Who should do it (if mentioned, otherwise null)
- mentioned_deadline: Any deadline or timeframe mentioned (if any, otherwise null)
- source_quote: The exact phrase from the text that indicates this task

Return your response as a JSON array. If no tasks are found, return an empty array.
Be thorough, but don't invent tasks that aren't actually mentioned or implied.

TEXT:
{text}
"""

@extract_task.run
def extract(context):
    chunks = context.get("input_chunks")
    all_tasks = []

    for i, chunk in enumerate(chunks):
        prompt = EXTRACTION_PROMPT.format(text=chunk)
        response = context.tool.call(prompt)

        # Parse the JSON response
        extracted = context.tool.parse_json(response)

        # Tag each task with its source chunk
        for task in extracted:
            task["source_chunk"] = i
        all_tasks.extend(extracted)

    # Deduplicate — the same task might span chunk boundaries
    deduped = deduplicate_tasks(all_tasks)
    context.set("raw_tasks", deduped)
    return {"status": "extracted", "task_count": len(deduped)}

def deduplicate_tasks(tasks):
    """Simple deduplication based on description similarity."""
    seen = []
    unique = []
    for task in tasks:
        desc = task["task_description"].lower().strip()
        is_duplicate = False
        for s in seen:
            # Basic overlap check — you could use embeddings for better accuracy
            if len(set(desc.split()) & set(s.split())) / max(len(desc.split()), 1) > 0.7:
                is_duplicate = True
                break
        if not is_duplicate:
            seen.append(desc)
            unique.append(task)
    return unique
Key decisions here:
Low temperature (0.1): We want consistent, reliable extraction. Not creative writing. Every time I see someone using temperature 0.7+ for structured data extraction, I die a little inside.
Source quotes: The agent includes the exact phrase that triggered each task extraction. This is crucial for trust. When your team asks "why is this on the list?", you can point to the exact message.
Deduplication: When you chunk long inputs, the same task might get mentioned in overlapping context. The simple word-overlap check catches most duplicates. If you need higher accuracy, swap in an embedding-based similarity check.
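If you want something between raw word overlap and a full embedding pipeline, the standard library's `difflib` gives you character-level similarity with zero extra dependencies. A sketch of a fuzzier drop-in replacement for `deduplicate_tasks`:

```python
from difflib import SequenceMatcher

def deduplicate_fuzzy(tasks, threshold=0.8):
    """Dedupe by character-level similarity (difflib.SequenceMatcher).
    A middle ground between word overlap and embeddings: it catches
    light rephrasings like 'Fix the checkout bug' vs 'fix checkout bug'."""
    unique = []
    for task in tasks:
        desc = task["task_description"].lower().strip()
        is_duplicate = any(
            SequenceMatcher(None, desc,
                            u["task_description"].lower().strip()).ratio() >= threshold
            for u in unique
        )
        if not is_duplicate:
            unique.append(task)
    return unique
```

This won't catch true paraphrases ("fix the payment screen" vs "repair mobile checkout"); for those, an embedding-based check is still the right upgrade.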
Step 3: Enrichment and Prioritization
enrich_task = Task(
    name="enrich_and_prioritize",
    description="Add priority, category, and estimated effort to each task",
    checkpoint=True,
    tool=LLMTool(model="your-preferred-model", temperature=0.1)
)

ENRICHMENT_PROMPT = """
Given the following extracted task, add:

- priority: "critical", "high", "medium", or "low"
- category: one of ["bug", "feature", "ops", "communication", "research", "other"]
- estimated_effort: "small" (< 1hr), "medium" (1-4hrs), "large" (4hrs+)
- clean_description: Rewrite the task as a clear, actionable item starting with a verb

Base priority on urgency signals in the description and source quote.
If someone said "ASAP" or "blocking", that's critical.
If there's a deadline this week, that's high.
Default to medium if no signals.

Task: {task_json}

Return as JSON with the original fields plus the new ones.
"""

@enrich_task.run
def enrich(context):
    raw_tasks = context.get("raw_tasks")
    enriched = []

    for task in raw_tasks:
        prompt = ENRICHMENT_PROMPT.format(task_json=str(task))
        response = context.tool.call(prompt)
        enriched_task = context.tool.parse_json(response)
        enriched.append(enriched_task)

    # Sort by priority
    priority_order = {"critical": 0, "high": 1, "medium": 2, "low": 3}
    enriched.sort(key=lambda t: priority_order.get(t.get("priority", "medium"), 2))

    context.set("enriched_tasks", enriched)
    return {"status": "enriched", "task_count": len(enriched)}
This is where the agent starts earning its keep. Raw extraction gives you a list of tasks. Enrichment gives you a useful list. The priority categorization alone saves you significant mental overhead — instead of scanning 23 tasks and figuring out what matters, you get them pre-sorted.
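One caveat: LLMs occasionally improvise labels outside your vocabulary ("urgent" instead of "critical", "Small" with a capital S), which silently breaks the priority sort. A small validation pass after enrichment keeps those tasks from being misfiled. A sketch, with allowed sets mirroring the enrichment prompt:

```python
ALLOWED_PRIORITIES = {"critical", "high", "medium", "low"}
ALLOWED_CATEGORIES = {"bug", "feature", "ops", "communication", "research", "other"}
ALLOWED_EFFORTS = {"small", "medium", "large"}

def _clamp(value, allowed, default):
    """Normalize a label and fall back to a default if it's off-vocabulary."""
    v = str(value).strip().lower() if value else ""
    return v if v in allowed else default

def validate_enriched(task):
    """Clamp LLM-produced fields to the allowed vocabularies, defaulting
    rather than crashing when the model invents a new label."""
    out = dict(task)
    out["priority"] = _clamp(out.get("priority"), ALLOWED_PRIORITIES, "medium")
    out["category"] = _clamp(out.get("category"), ALLOWED_CATEGORIES, "other")
    out["estimated_effort"] = _clamp(out.get("estimated_effort"), ALLOWED_EFFORTS, "medium")
    return out
```

Run each enriched task through `validate_enriched` before the sort and the pipeline degrades gracefully instead of mysteriously.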
Step 4: Storage and Output
output_task = Task(
    name="store_and_output",
    description="Persist tasks and generate formatted output",
    checkpoint=True,
    tool=StorageTool(backend=backend)
)

@output_task.run
def store_and_output(context):
    tasks = context.get("enriched_tasks")

    # Store each task persistently
    task_ids = []
    for task in tasks:
        task_id = context.tool.store(
            collection="todo_tasks",
            data=task,
            metadata={
                "created_by": "task_extractor_agent",
                "status": "pending"
            }
        )
        task_ids.append(task_id)

    # Generate formatted output
    output = format_task_list(tasks)
    context.set("formatted_output", output)
    context.set("task_ids", task_ids)
    return {"status": "complete", "task_ids": task_ids}

def format_task_list(tasks):
    lines = ["## 📋 Extracted Task List\n"]
    emoji = {"critical": "🔴", "high": "🟠", "medium": "🟡", "low": "🟢"}
    current_priority = None

    for task in tasks:
        if task.get("priority") != current_priority:
            current_priority = task.get("priority")
            lines.append(f"\n### {emoji.get(current_priority, '⚪')} {current_priority.upper()}\n")

        assignee = task.get("mentioned_assignee") or "Unassigned"
        effort = task.get("estimated_effort", "?")
        lines.append(f"- [ ] **{task['clean_description']}**")
        lines.append(f"  - Assignee: {assignee} | Effort: {effort} | Category: {task.get('category', 'other')}")
        if task.get("mentioned_deadline"):
            lines.append(f"  - ⏰ Deadline: {task['mentioned_deadline']}")

    return "\n".join(lines)
The persistent storage here is critical. These tasks don't vanish when your script ends. They're in Redis (or Postgres, if you configured that backend). You can query them later, update statuses, build a simple API on top — whatever you need.
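To give a feel for what "query them later" looks like, here's the shape of a filter over stored tasks. It's shown over an in-memory list for clarity; in practice you'd apply the same predicate to rows fetched from your Redis or Postgres backend:

```python
def query_tasks(tasks, status=None, priority=None, assignee=None):
    """Filter stored task dicts by any combination of fields.
    Passing None for a field means 'match anything'."""
    def match(t):
        return ((status is None or t.get("status") == status)
                and (priority is None or t.get("priority") == priority)
                and (assignee is None or t.get("mentioned_assignee") == assignee))
    return [t for t in tasks if match(t)]
```

`query_tasks(store, status="pending", priority="critical")` is your Monday-morning view; wrap it in a tiny HTTP endpoint and you have a team dashboard.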
Wiring It All Together
# Build the DAG
pipeline = DAG(name="task_extraction_pipeline")
pipeline.add_edge(ingest_task, extract_task)
pipeline.add_edge(extract_task, enrich_task)
pipeline.add_edge(enrich_task, output_task)

# Register with the agent
task_manager.register_dag(pipeline)

# Run it
result = task_manager.run(
    dag="task_extraction_pipeline",
    inputs={
        "raw_input": """
        Hey team, couple things from today's standup:

        Sarah mentioned the checkout flow is broken on mobile - customers are getting
        a white screen after entering payment info. This is blocking revenue so we need
        it fixed ASAP. Jake, can you look into this today?

        Also, we need to update the onboarding email sequence. The current one references
        features we deprecated last month. Maria, can you draft new copy by end of week?

        Oh and someone needs to research GDPR compliance requirements for the new
        analytics tracking we added. Not urgent but let's not forget.

        The design team wants feedback on the new dashboard mockups by Wednesday.
        Tom uploaded them to Figma.
        """
    }
)

print(result.get("formatted_output"))
When you run this, you get output like:
## 📋 Extracted Task List
### 🔴 CRITICAL
- [ ] **Fix white screen bug in mobile checkout flow after payment info entry**
- Assignee: Jake | Effort: medium | Category: bug
- ⏰ Deadline: Today
### 🟠 HIGH
- [ ] **Draft new onboarding email sequence copy to replace deprecated feature references**
- Assignee: Maria | Effort: medium | Category: communication
- ⏰ Deadline: End of week
- [ ] **Review and provide feedback on new dashboard mockups in Figma**
- Assignee: Unassigned | Effort: small | Category: communication
- ⏰ Deadline: Wednesday
### 🟡 MEDIUM
- [ ] **Research GDPR compliance requirements for new analytics tracking implementation**
- Assignee: Unassigned | Effort: large | Category: research
From a rambling standup recap to a prioritized, assigned, deadline-tracked task list. In seconds. Automatically.
Making It Actually Useful Day-to-Day
The basic pipeline above works great for one-off processing. But the real power comes when you integrate it into your actual workflow. Here are three patterns that work well:
Pattern 1: Slack Integration
Set up a webhook that pipes messages from specific Slack channels into your agent. Every message in #team-updates gets processed. Tasks accumulate in your persistent store. You check the list once a day instead of reading every message.
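The glue code is mostly payload filtering. Here's a sketch of the extraction side, assuming the standard Slack Events API payload shape; the field names (`event.text`, `bot_id`, `channel`) come from Slack's documentation, but verify them against your app's actual event subscriptions:

```python
def slack_event_to_input(payload, watched_channels):
    """Pull raw text out of a Slack Events API payload for channels we
    care about. Returns None for events the agent should ignore
    (non-message events, bot echoes, unwatched channels)."""
    event = payload.get("event", {})
    if event.get("type") != "message":
        return None
    if event.get("bot_id"):  # skip the bot's own messages to avoid loops
        return None
    if event.get("channel") not in watched_channels:
        return None
    return event.get("text")
```

Anything this returns as non-None goes straight into the pipeline's `raw_input`. Filtering out bot messages is not optional: without it, any reply your agent posts back to the channel gets re-ingested forever.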
Pattern 2: Email Processing
Forward emails to a dedicated inbox. A simple script reads new emails, feeds them to the agent, and stores the extracted tasks. Works especially well for client communications where action items get buried in three paragraphs of pleasantries.
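The parsing side of this is straightforward with Python's standard library. A sketch that turns one raw forwarded email into agent input (the actual inbox polling via IMAP is left out):

```python
from email import message_from_string
from email.policy import default

def email_to_input(raw_message):
    """Extract subject plus plain-text body from a raw email string,
    producing the text a forwarding script would hand to ingestion."""
    msg = message_from_string(raw_message, policy=default)
    body = msg.get_body(preferencelist=("plain",))  # prefer text/plain parts
    text = body.get_content() if body else ""
    return f"Subject: {msg['Subject']}\n\n{text}"
```

Prepending the subject line matters: it's often the only place the actual deadline or client name appears, and the extraction prompt benefits from seeing it.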
Pattern 3: Meeting Transcript Processing
Record your meetings (with consent, obviously). Run the transcript through the agent. Never miss an action item from a meeting again. This is the highest-ROI pattern I've seen — in my experience, action items that nobody writes down in real time are forgotten far more often than not.
The Observability Advantage
One thing I want to highlight about doing this in OpenClaw versus hacking together a raw script: you get observability for free. OpenClaw's built-in dashboard shows you:
- How many tokens each step consumed
- Where failures occurred and why
- The full task graph with completion states
- Historical runs so you can debug issues
This matters more than you think. When your agent extracts 15 tasks from a meeting and someone says "you missed the part where Dave said he'd handle the API migration," you can go back, look at the extraction step, see the source text, and figure out why it was missed. Maybe the chunking split it awkwardly. Maybe the prompt needs adjustment. You can actually diagnose and fix it instead of shrugging.
Extending the Agent
Once you have the basic pipeline working, the natural extensions are:
Dynamic task spawning: When the agent detects a complex task, it can automatically break it into subtasks. OpenClaw's DAG supports dynamic node addition at runtime, so your enrichment step could spawn a "decompose complex task" sub-pipeline when estimated effort is "large."
Approval gates: For shared team task lists, add an OpenClaw approval gate before the storage step. A human reviews the extracted tasks, confirms or modifies them, and only then do they hit the persistent store. OpenClaw has this as a native primitive — no hacky polling loops needed.
Status tracking: Build a simple query interface on top of the persistent store. Mark tasks as in-progress, completed, or blocked. The agent created the tasks; now you manage them like any other todo list, but one that was populated intelligently.
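A tiny state machine keeps those status updates honest. Here's a sketch; the transition table is my assumption about a sensible task lifecycle, not anything OpenClaw prescribes:

```python
VALID_TRANSITIONS = {
    "pending": {"in-progress", "blocked"},
    "in-progress": {"completed", "blocked"},
    "blocked": {"in-progress"},
    "completed": set(),  # terminal: completed tasks can't silently reopen
}

def update_status(task, new_status):
    """Move a task to a new status, enforcing the lifecycle above."""
    current = task.get("status", "pending")
    if new_status not in VALID_TRANSITIONS.get(current, set()):
        raise ValueError(f"Illegal transition: {current} -> {new_status}")
    task["status"] = new_status
    return task
```

Reopening a completed task then becomes a deliberate act (create a new task that references the old one) instead of a silent mutation nobody notices.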
Recurring processing: Schedule the agent to run daily against your communication channels. Start each morning with a fresh, prioritized task list extracted from everything that came in overnight.
Getting Started for Real
If you've read this far and you're thinking "okay, I actually want to build this," here's the honest fastest path:
- Grab Felix's OpenClaw Starter Pack. It handles the environment setup, gives you working templates for common agent patterns (including task processing), and comes with a pre-configured Redis backend. The time you save on initial setup is worth it — I've watched too many people spend an entire Saturday wrestling with configuration when they could've been building their actual agent.
- Get the basic four-step pipeline running with hardcoded test input (like the standup example above).
- Tune your extraction and enrichment prompts for your specific use case. The prompts I showed above are solid starting points, but you'll get better results when you customize them for your team's communication style.
- Add one integration — Slack, email, or meeting transcripts. Pick the one where you lose the most tasks today.
- Run it for a week. Compare your task completion rate to the week before.
The Bigger Picture
We're in a weird transitional period with AI agents. There's so much hype that people either dismiss them entirely or expect them to run their whole company autonomously.
The reality is somewhere in between, and it's actually more useful than either extreme. An agent that reliably extracts tasks from messy human communication and presents them cleanly isn't going to make the front page of TechCrunch. But it will make your Mondays dramatically less chaotic.
That's the kind of agent worth building. Boring, practical, solves a real problem you have every single day. OpenClaw makes it straightforward because it was designed for exactly this kind of workflow — persistent, reliable, observable, and lightweight enough that you're not fighting the framework.
Stop losing tasks in the noise. Build the agent. Ship the work.