
OpenAI -- GPT Integration Expert
Skill
Your OpenAI expert that builds GPT integrations, fine-tunes models, and manages API costs.
About
name: openai
description: >
  Implement OpenAI structured outputs, function calling, Assistants API, fine-tuning, and RAG.
  USE WHEN: User needs GPT integration, structured output parsing, tool use orchestration, Assistants API setup, fine-tuning, or embeddings-based search.
  DON'T USE WHEN: User needs general AI architecture design. Use Architect for agent system design.
  OUTPUTS: Prompt templates, function schemas, assistant configs, fine-tuning pipelines, embedding workflows, RAG architectures, cost optimization strategies.
version: 1.1.0
author: SpookyJuice
tags: [openai, gpt, assistants, embeddings, fine-tuning, rag]
price: 19
author_url: "https://www.shopclawmart.com"
support: "brian@gorzelic.net"
license: proprietary
osps_version: "0.1"
content_hash: "sha256:8554d0d19f56513c3d97ad6d3211c12b96205795afede71b236583d47ef69139"
OpenAI
Version: 1.1.0 Price: $19 Type: Skill
Description
Production OpenAI API integration for structured outputs, multi-tool agents, and RAG systems. The API surface is evolving fast: structured outputs, vision inputs, function calling, and the Assistants API each have subtle constraints around token limits, tool choice behavior, and response formats that the changelog buries and the docs underspecify. The real complexity isn't making a single API call; it's building the infrastructure around it: prompt versioning that doesn't break when you update instructions, function calling loops that handle parallel tool calls without hallucinating results, and embeddings pipelines that cut costs 50-70% through caching and deduplication while maintaining search quality.
Prerequisites
- OpenAI account with API access
- API key: `OPENAI_API_KEY`
- Python 3.10+ or Node.js 18+ (official SDK support)
- For fine-tuning: training data in JSONL format
- For embeddings: vector database (Pinecone, pgvector, Qdrant, or Chroma)
Setup
- Copy `SKILL.md` into your OpenClaw skills directory
- Set environment variables: `export OPENAI_API_KEY="sk-..."`
- Install the SDK: `pip install openai` or `npm install openai`
- Reload OpenClaw
Commands
- "Build a structured output pipeline for [data type]"
- "Implement function calling for [tool set]"
- "Set up an Assistant with [capabilities]"
- "Create a fine-tuning pipeline for [task]"
- "Build RAG with embeddings for [content type]"
- "Optimize my OpenAI API costs"
- "Debug this API error: [error]"
Workflow
Structured Outputs and Prompt Engineering
- Model selection: choose based on task complexity and cost. `gpt-4o` (best quality, vision support, $2.50/$10 per 1M tokens), `gpt-4o-mini` (fast and cheap, $0.15/$0.60 per 1M tokens), `o1` (reasoning-heavy tasks, $15/$60 per 1M tokens). Start with `gpt-4o-mini` and upgrade only if quality is insufficient.
- System prompt design: structure system prompts with role definition, task boundaries, output format specification, and few-shot examples. Keep instructions specific and testable. Version system prompts in code alongside your application; never edit production prompts without a review process.
- Structured outputs: use `response_format: { type: "json_schema", json_schema: { ... } }` for guaranteed JSON conformance. Define the schema with required fields, property types, and descriptions. The model will always produce valid JSON matching your schema: no parsing errors, no retry loops.
- Context window management: track token usage with `tiktoken` (Python) or the tokenizer library. Budget for the system prompt (fixed cost per request), conversation history (grows over time), user input (variable), and the response (set `max_tokens`). Implement context pruning: summarize old messages, drop irrelevant context, keep the system prompt and recent turns.
- Temperature and parameters: `temperature: 0` for deterministic, factual outputs; `temperature: 0.7-1.0` for creative generation. `top_p` is an alternative to temperature (don't use both). `frequency_penalty` and `presence_penalty` reduce repetition. Document your parameter choices and rationale.
- Prompt testing: build an evaluation suite. Collect input/output pairs, define quality metrics (accuracy, format compliance, latency), and run them against prompt changes. A/B test prompt variations with a holdout set. Never deploy prompt changes without regression testing.
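The structured-outputs bullet above can be sketched as a minimal Chat Completions call. The invoice-extraction schema and its field names are illustrative assumptions, not templates shipped with this skill:

```python
# Sketch: structured output via json_schema response_format.
# Schema name and fields are hypothetical examples.
import json

INVOICE_SCHEMA = {
    "name": "invoice_extraction",
    "strict": True,  # strict mode enforces exact schema conformance
    "schema": {
        "type": "object",
        "properties": {
            "vendor": {"type": "string", "description": "Vendor name as printed"},
            "total": {"type": "number", "description": "Invoice total in USD"},
            "due_date": {"type": "string", "description": "ISO 8601 due date"},
        },
        "required": ["vendor", "total", "due_date"],
        "additionalProperties": False,  # required when strict is true
    },
}

def extract_invoice(text: str) -> dict:
    """Call the Chat Completions API; the response is guaranteed-valid JSON."""
    from openai import OpenAI  # needs `pip install openai` and OPENAI_API_KEY
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # start cheap, upgrade only if quality falls short
        messages=[
            {"role": "system", "content": "Extract invoice fields from the text."},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_schema", "json_schema": INVOICE_SCHEMA},
    )
    return json.loads(resp.choices[0].message.content)
```

Because the schema is strict, `json.loads` here never needs a retry loop; validation failures surface as API errors at request time instead.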
Function Calling and Tool Use
- Tool schema definition: define tools with JSON Schema: `name`, `description` (be specific; the model uses this to decide when to call), and `parameters` (typed, with descriptions per parameter). Keep schemas tight; unnecessary optional parameters cause the model to hallucinate values.
- Tool call handling loop: the standard loop is send message → check if the response contains `tool_calls` → execute each tool call → send results back → repeat until the model responds with text. Handle zero tool calls (model answered directly), one tool call, and multiple parallel tool calls in a single response.
- Parallel tool calls: the model may return multiple tool calls in one response (e.g., "look up the weather in NYC and London" yields two function calls). Execute them concurrently for performance. Return all results in the same message, matching each result to its `tool_call_id`.
- Error propagation: when a tool call fails, return the error message as the tool result (not an exception). The model can often recover: retry with different parameters, try an alternative approach, or inform the user. Never silently drop tool call results.
- Confirmation flows: for destructive actions (delete, purchase, send), implement a two-step pattern: the first tool call returns a preview, the model asks for confirmation, the user confirms, and the second tool call executes. Don't auto-execute destructive tools.
- Tool choice control: `tool_choice: "auto"` (model decides), `"required"` (must use a tool), `"none"` (no tools), or `{ "type": "function", "function": { "name": "specific_tool" } }` (force a specific tool). Use `"required"` when you know the model should always call a tool. Force a specific tool for structured extraction.
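The handling loop above can be sketched as follows. The `get_weather` tool, its schema, and the stub dispatcher are illustrative assumptions; only the loop structure itself reflects the workflow described:

```python
# Sketch of the tool-call loop: send -> check tool_calls -> execute -> repeat.
import json

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

def run_tool(name: str, args: dict) -> str:
    # Dispatch to real implementations; return errors as strings so the
    # model can recover instead of the loop crashing (error propagation).
    if name == "get_weather":
        return json.dumps({"city": args["city"], "temp_c": 18})  # stubbed result
    return json.dumps({"error": f"unknown tool {name}"})

def chat_with_tools(client, messages: list) -> str:
    """Loop until the model answers with text instead of tool calls."""
    while True:
        resp = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages,
            tools=TOOLS, tool_choice="auto",
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:       # zero tool calls: model answered directly
            return msg.content
        messages.append(msg)         # keep the assistant turn with its tool_calls
        for call in msg.tool_calls:  # may be several parallel calls
            result = run_tool(call.function.name,
                              json.loads(call.function.arguments))
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,  # match each result to its call
                "content": result,
            })
```

For parallel calls, the inner `for` loop could be swapped for concurrent execution; the only requirement is that every `tool_call_id` gets a matching tool message before the next API request.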
Embeddings and RAG
- Document chunking: split documents at semantic boundaries: paragraphs, sections, or sliding windows with overlap. Target 200-500 tokens per chunk for `text-embedding-3-small`. Include metadata with each chunk: source document, section title, page number. Overlapping windows (50-100 token overlap) prevent information loss at chunk boundaries.
- Embedding generation: use `text-embedding-3-small` (1536 dims, $0.02/1M tokens) for most use cases, `text-embedding-3-large` (3072 dims, $0.13/1M tokens) for maximum quality. Batch embeddings (up to 2048 inputs per request) for efficiency. Cache embeddings; the same text always produces the same vector.
- Vector storage: choose based on scale: pgvector (simple, works with existing Postgres), Pinecone (managed, serverless), Qdrant (self-hosted, fast), Chroma (lightweight, local development). Index with the appropriate distance metric: cosine similarity for normalized embeddings (most common), dot product for un-normalized.
- Retrieval pipeline: query flow: embed the user question → search the top-K similar chunks (K=5-20) → re-rank results by relevance → inject the top chunks into the prompt as context. Use metadata filters (date range, source, category) to narrow the search before vector similarity.
- Hybrid search: combine vector similarity with keyword search (BM25) for better recall. Vector search finds semantically similar content; keyword search catches exact matches the embeddings miss. Weight both scores and merge results. Most vector databases support hybrid search natively.
- Cost optimization: cache frequently queried embeddings. Deduplicate identical chunks before embedding. Use `text-embedding-3-small` with reduced dimensions (`dimensions: 256`) for development/testing, full dimensions for production. Batch embed during off-peak hours. Track embedding costs separately from completion costs.
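The dedup-and-cache step from the cost bullet above can be sketched like this. The cache here is a plain dict keyed by content hash; in production it would be a persistent store, and the helper name is an assumption:

```python
# Sketch: embed only unseen chunks, reuse cached vectors for duplicates.
import hashlib

def embed_chunks(client, chunks: list[str], cache: dict) -> list[list[float]]:
    """Return one vector per chunk, calling the API only for new text."""
    keys = [hashlib.sha256(c.encode()).hexdigest() for c in chunks]
    # Deduplicate: one (key, text) pair per unique uncached chunk.
    missing = sorted({k: c for k, c in zip(keys, chunks)
                      if k not in cache}.items())
    for start in range(0, len(missing), 2048):  # API batch limit per request
        batch = missing[start:start + 2048]
        resp = client.embeddings.create(
            model="text-embedding-3-small",
            input=[text for _, text in batch],
        )
        for (key, _), item in zip(batch, resp.data):
            cache[key] = item.embedding  # same text always yields the same vector
    return [cache[k] for k in keys]
```

Hashing the chunk text (rather than caching by chunk index) means re-chunking a document only re-embeds the chunks whose content actually changed.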
Output Format
🤖 OPENAI: [IMPLEMENTATION TYPE]
Project: [Name]
Model: [gpt-4o / gpt-4o-mini / o1]
Date: [YYYY-MM-DD]
─── MODEL CONFIG ───
| Parameter | Value | Rationale |
|-----------|-------|-----------|
| Model | [model] | [why this model] |
| Temperature | [0-1] | [why] |
| Max Tokens | [n] | [why] |
| Response Format | [text/json_schema] | [why] |
─── TOOLS ───
| Tool | Parameters | Description | Destructive |
|------|-----------|-------------|-------------|
| [name] | [params] | [purpose] | [yes/no] |
─── TOKEN BUDGET ───
| Component | Tokens | Cost/Request |
|-----------|--------|-------------|
| System Prompt | [n] | $[x] |
| Context (RAG) | [n] | $[x] |
| User Input | [n] | $[x] |
| Response | [n] | $[x] |
| Total | [n] | $[x] |
─── COST ESTIMATE ───
| Usage | Volume | Model | Monthly Cost |
|-------|--------|-------|-------------|
| [use case] | [requests/mo] | [model] | $[x] |
Total: $[x]/month
─── EVALUATION ───
| Metric | Target | Current | Status |
|--------|--------|---------|--------|
| Accuracy | [%] | [%] | 🟢/🟡/🔴 |
| Latency P50 | [ms] | [ms] | 🟢/🟡/🔴 |
| Cost/Request | $[x] | $[x] | 🟢/🟡/🔴 |
Common Pitfalls
- Token limit overflow: stuffing too much context into the prompt causes silent truncation or errors. Always count tokens before sending. Budget for system prompt + context + user input + response, leaving headroom for the model's response.
- Function calling hallucination: the model may "invent" tool call parameters that look plausible but are wrong (e.g., a user ID it guessed). Validate all tool call parameters against your data before executing. Never trust model-generated IDs without verification.
- Structured output schema drift: changing your JSON schema without updating the model's understanding leads to validation errors. When you change the schema, update the system prompt's examples to match. Test schema changes against your evaluation suite.
- Embedding dimension mismatch: mixing embeddings from different models or dimension settings in the same vector index produces garbage similarity scores. Tag stored vectors with model version and dimensions. Never mix embedding sources.
- Fine-tuning data quality: fine-tuning amplifies patterns in your training data, including mistakes. Low-quality training data produces a confidently wrong model. Curate training data carefully: diverse examples, consistent format, verified outputs.
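The token-overflow check from the first pitfall can be sketched as a pre-flight budget test. The `encode` callable is expected to come from tiktoken (e.g. `tiktoken.encoding_for_model("gpt-4o-mini").encode`), and the ~4-token per-message overhead is an approximation, not an exact accounting:

```python
# Sketch: count prompt tokens before sending, leaving response headroom.
# `encode` is injected so the budget logic is independent of the tokenizer.
def count_tokens(messages: list[dict], encode) -> int:
    """Tokens used by the prompt, with rough per-message framing overhead."""
    return sum(len(encode(m["content"])) + 4 for m in messages)

def fits_budget(messages: list[dict], encode,
                max_context: int = 128_000,
                reserve_for_response: int = 4_096) -> bool:
    """Reject before the API call if the response headroom would be squeezed out."""
    return count_tokens(messages, encode) + reserve_for_response <= max_context
```

When `fits_budget` returns False, prune context (summarize old turns, drop stale RAG chunks) rather than silently truncating.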
Guardrails
- Never exposes API keys. The OpenAI API key is server-only. Client-side code calls your backend, which proxies to OpenAI. No API keys in frontend bundles or client-side code.
- Cost estimation before execution. Every pipeline includes token count estimates and cost projections BEFORE making API calls. No surprise bills from runaway loops or unexpectedly large contexts.
- Rate limits respected. All implementations include rate-limit-aware queuing with exponential backoff. No hammering the API with retries that compound the problem.
- Prompt versions are tracked. System prompts are versioned in code with changelogs. No ad-hoc prompt edits in production without review and regression testing.
- Tool calls are validated. Every function call parameter is validated against your data before execution. Destructive actions require explicit confirmation. No blind execution of model-generated parameters.
- Content filtering enforced. Implement input and output content filtering appropriate to your use case. Flag and handle: prompt injection attempts, policy violations, and unexpected model behavior.
- Content policy compliance verified. All prompts and outputs are checked against OpenAI's usage policies. Inputs that request disallowed content are rejected before reaching the API, and outputs are filtered for policy violations before being served to end users.
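The rate-limit guardrail above can be sketched as a retry wrapper with exponential backoff and jitter. Catching a bare `Exception` is a simplification; in practice you would catch the SDK's rate-limit error specifically:

```python
# Sketch: exponential backoff with jitter, so retries don't compound a
# rate-limit problem or stampede in lockstep across workers.
import random
import time

def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Retry `call` on failure, doubling the wait each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:  # in practice: except openai.RateLimitError
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)  # jitter spreads retries across workers
```

Pair this with a queue that caps concurrent in-flight requests, so backoff handles transient spikes while the queue prevents sustained overload.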
Support
Questions or issues with this skill? Contact brian@gorzelic.net. Published by SpookyJuice: https://www.shopclawmart.com
Core Capabilities
- openai
- gpt
- assistants
- embeddings
- fine-tuning
- rag
Customer ratings
0 reviews
No ratings yet
- 5 star0
- 4 star0
- 3 star0
- 2 star0
- 1 star0
No reviews yet. Be the first buyer to share feedback.
Version History
This skill is actively maintained.
March 8, 2026
v2.1.0: improved frontmatter descriptions for better OpenClaw display
March 1, 2026
v2.1.0: improved frontmatter descriptions for better OpenClaw display
February 27, 2026
v1.1.0: expanded from stub to full skill with structured outputs, function calling, Assistants API, fine-tuning, and RAG
One-time purchase
$19
By continuing, you agree to the Buyer Terms of Service.
Creator
SpookyJuice.ai
An AI platform that builds, monitors, and evolves itself
Multiple AI agents and one human collaborate around the clock: writing code, deploying infrastructure, and growing a shared knowledge graph. This page is a live dashboard of the running system. Everything you see is real data, updated in real time.
View creator profile →
Details
- Type
- Skill
- Category
- Engineering
- Price
- $19
- Version
- 3
- License
- One-time purchase
Works With
Works with OpenClaw, Claude Projects, Custom GPTs, Cursor and other instruction-friendly AI tools.
Works great with
Personas that pair well with this skill.