
Autoresearch - Turn Prompt Tuning Into Measured Experiments
Run eval loops on any prompt or skill, mutate one variable at a time, and keep only what measurably improves output.
About
Most AI skills and prompts work... sometimes. They nail it on one run, drift on the next, and you end up rewriting by gut feel until something sticks. That's not optimization. That's guessing with extra steps.
Autoresearch turns prompt improvement into an actual experiment. It takes any skill or prompt you've built, runs it repeatedly against real test inputs, scores every output with binary pass/fail evals (not vibes), and then mutates one variable at a time - keeping only changes that measurably improve results.
Think of it as A/B testing for your AI workflows, except the AI runs the tests, tracks the scores, and makes the edits for you.
What makes this different:
- Binary evals, not taste. You define what good means as yes/no checks.
- One change at a time. Every mutation is isolated.
- Live dashboard. Watch the optimization happen in real time.
- Full research log. Every experiment is logged.
- It actually stops. Built-in convergence detection.
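The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual interface: `run_prompt`, the eval list, and the mutation functions are all hypothetical stand-ins.

```python
# Sketch of an autoresearch-style optimization loop (illustrative only).

def run_prompt(prompt: str, test_input: str) -> str:
    # Stand-in for a real model call; here it just echoes deterministically.
    return f"{prompt}::{test_input}"

# Binary evals: each check returns strictly True or False, never a vibe score.
EVALS = [
    lambda out: len(out) > 0,   # produced anything at all
    lambda out: "::" in out,    # followed the expected output format
]

def score(prompt: str, inputs: list[str]) -> float:
    """Pass rate across all (test input, eval) pairs."""
    results = [ev(run_prompt(prompt, x)) for x in inputs for ev in EVALS]
    return sum(results) / len(results)

def optimize(prompt: str, inputs: list[str], mutations, patience: int = 3):
    """Apply one mutation at a time; keep a change only if the score improves."""
    best, best_score, stale = prompt, score(prompt, inputs), 0
    for mutate in mutations:
        candidate = mutate(best)      # exactly one variable changed per round
        s = score(candidate, inputs)
        if s > best_score:
            best, best_score, stale = candidate, s, 0
        else:
            stale += 1
        if stale >= patience:         # convergence: stop when improvement stalls
            break
    return best, best_score
```

Every candidate is scored against the same inputs and the same yes/no checks, so a kept mutation reflects a measured improvement in pass rate rather than a subjective preference.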
Core Capabilities
- prompt optimization
- binary eval design
- autonomous experiment loops
- A/B-style mutation logging
- live dashboard artifact generation
Customer ratings
5.0 average · 1 review
- 5.0 — Verified customer, Mar 30, 2026
Version History
This skill is actively maintained.
March 30, 2026
Initial release. Full autoresearch methodology with binary evals, mutation loops, live dashboard, and complete worked example.
One-time purchase
$9
Details
- Type: Skill
- Category: Productivity
- Price: $9
- Version: 1
- License: One-time purchase
Works With
Works with OpenClaw, Claude Projects, Custom GPTs, Cursor and other instruction-friendly AI tools.