Advanced
1. Custom LLM Provider Integration
OpenClaw's LLM provider interface is an extension point: any API that accepts a prompt and returns a completion can be integrated by implementing a small TypeScript adapter. This enables three scenarios not covered by the built-in providers: OpenAI-compatible third-party APIs, locally-running models via Ollama, and organization-specific model deployments with custom authentication schemes.
The provider interface has four methods: complete (single completion), stream
(streaming completion), embedText (embedding for memory), and countTokens (context
budget management). Only complete is required; the others fall back to reasonable
defaults if not implemented.
// ~/.openclaw/workspace/providers/azure-openai.ts
// Custom provider: Azure OpenAI with org-specific auth headers
import { LLMProvider, CompletionRequest, CompletionResponse } from "@openclaw/llm";
export class AzureOpenAIProvider implements LLMProvider {
readonly name = "azure-openai";
constructor(
private config: {
endpoint: string; // => e.g. "https://myorg.openai.azure.com"
deploymentId: string; // => Azure model deployment name (e.g. "gpt-4o-prod")
apiKey: string; // => Azure API key (different from OpenAI API key)
apiVersion: string; // => e.g. "2024-12-01-preview"
},
) {}
async complete(request: CompletionRequest): Promise<CompletionResponse> {
const url =
`${this.config.endpoint}/openai/deployments/${this.config.deploymentId}/chat/completions` +
`?api-version=${this.config.apiVersion}`;
// => Azure uses deployment-specific URLs, not model-name URLs like OpenAI direct
const body = {
messages: request.messages, // => [{role, content}, ...] — standard format
max_tokens: request.maxTokens ?? 4096,
temperature: request.temperature ?? 0.3,
tools: request.tools, // => Tool declarations for function calling
tool_choice: request.toolChoice ?? "auto",
};
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
"api-key": this.config.apiKey, // => Azure uses "api-key" header, not "Authorization: Bearer"
// => This is the primary auth difference from OpenAI direct
},
body: JSON.stringify(body),
});
if (!response.ok) {
const error = await response.json();
throw new Error(`Azure OpenAI error ${response.status}: ${error.error?.message}`);
// => => e.g. "Azure OpenAI error 429: Rate limit exceeded for deployment gpt-4o-prod"
}
const data = await response.json();
const choice = data.choices[0]; // => Azure returns same structure as OpenAI
return {
text: choice.message.content ?? "", // => Final text response (null if tool call)
toolCalls:
choice.message.tool_calls?.map((tc) => ({
id: tc.id,
name: tc.function.name,
parameters: JSON.parse(tc.function.arguments), // => JSON string → object
})) ?? [],
usage: {
inputTokens: data.usage.prompt_tokens, // => For cost tracking
outputTokens: data.usage.completion_tokens,
},
finishReason: choice.finish_reason, // => "stop" | "tool_calls" | "length"
};
}
async embedText(text: string): Promise<number[]> {
// Azure also hosts embedding models — use a separate deployment for embeddings
const url =
`${this.config.endpoint}/openai/deployments/text-embedding-3-small/embeddings` +
`?api-version=${this.config.apiVersion}`;
const response = await fetch(url, {
method: "POST",
headers: { "Content-Type": "application/json", "api-key": this.config.apiKey },
body: JSON.stringify({ input: text }),
});
const data = await response.json();
return data.data[0].embedding; // => 1536-dimensional float array
// => Used by memory system for semantic similarity computation
}
}
// Register the custom provider in openclaw.config.ts
import { AzureOpenAIProvider } from "./providers/azure-openai";
export default {
llm: {
provider: new AzureOpenAIProvider({
endpoint: process.env.AZURE_OPENAI_ENDPOINT!,
deploymentId: process.env.AZURE_DEPLOYMENT_ID!,
apiKey: process.env.AZURE_OPENAI_API_KEY!,
apiVersion: "2024-12-01-preview",
}),
},
};
// => On start: LLM provider: azure-openai (custom)
// => On start: Test call successful — deployment responds correctlyFor locally-running Ollama models, the adapter is simpler because Ollama exposes an
OpenAI-compatible API — you use the built-in openai-compatible provider type with
baseUrl: "http://localhost:11434/v1". The full custom adapter pattern is only needed
when the target API differs from the OpenAI request/response format.
Key Takeaway: Custom LLM providers implement a four-method interface that normalizes
any completion API into the runtime's internal format — the complete method is the
only required implementation, handling authentication and response mapping for any target API.
Why It Matters: Many enterprise AI deployments use internal model endpoints rather than direct Anthropic or OpenAI API access — Azure OpenAI, AWS Bedrock, or self-hosted vLLM clusters behind an API gateway. Without a custom provider interface, OpenClaw would be unusable in these environments. The adapter pattern means OpenClaw works in any deployment environment where an HTTP-accessible LLM endpoint exists, regardless of the API format.
2. Gateway Customization
The Gateway is extensible through a middleware and event handler system that lets you add custom logic at every stage of message processing: before routing, after routing, on tool execution, on response delivery, and on session events. Gateway customization is the right tool for cross-cutting concerns like audit logging, custom authentication, request transformation, and integration with external monitoring systems.
Middleware functions intercept the message pipeline and can read, modify, or block messages at each stage. They execute in registered order, and any middleware can short- circuit the pipeline by returning an early response.
// ~/.openclaw/workspace/gateway-extensions.ts
// Custom Gateway middleware and event handlers
import { GatewayMiddleware, GatewayEvent, MessageContext, MiddlewareNext } from "@openclaw/gateway";
// Middleware 1: Audit logging — logs every inbound message to an audit trail
export const auditLoggingMiddleware: GatewayMiddleware = {
name: "audit-logger",
stage: "pre-route", // => Runs before route selection
async handle(ctx: MessageContext, next: MiddlewareNext) {
const auditEntry = {
timestamp: new Date().toISOString(),
sessionId: ctx.message.sessionId, // => "telegram:123456789"
channelId: ctx.message.channelId,
userId: ctx.message.userId,
messageLength: ctx.message.text.length,
// => Do NOT log message.text in the audit trail — preserves user privacy
// => Log only metadata: who sent, when, from where, how long
};
await appendToAuditLog(auditEntry); // => Write to append-only audit log file
return next(ctx); // => Call next() to continue the pipeline
// => If next() is not called, the message is dropped silently
},
};
// Middleware 2: Custom authentication — restrict access to specific user IDs per channel
export const authMiddleware: GatewayMiddleware = {
name: "custom-auth",
stage: "pre-route",
async handle(ctx: MessageContext, next: MiddlewareNext) {
const allowedUsers = await loadAllowedUsers(ctx.message.channelId);
// => allowedUsers: Set<string> loaded from DB or config file
// => Allows dynamic updates without restarting the Gateway
if (!allowedUsers.has(ctx.message.userId)) {
return {
// => Return early response — pipeline stops here
text: "Sorry, you are not authorized to use this agent.",
// => This response is sent back through the channel adapter to the user
};
}
return next(ctx); // => Authorized user — continue to agent runtime
},
};
// Middleware 3: Request transformation — translate formal commands to natural language
export const commandTranslationMiddleware: GatewayMiddleware = {
name: "command-translator",
stage: "pre-route",
async handle(ctx: MessageContext, next: MiddlewareNext) {
// Transform Slack slash commands to natural language for the agent
if (ctx.message.channelId === "slack" && ctx.message.text.startsWith("/research ")) {
const company = ctx.message.text.replace("/research ", "");
ctx.message.text = `Research ${company} for a CRM entry`;
// => "/research Acme Corp" → "Research Acme Corp for a CRM entry"
// => The agent never sees the slash command format — it sees natural language
}
return next(ctx);
},
};
// Event handler: react to tool execution events for monitoring
export const toolExecutionMonitor = {
event: "tool.executed" as GatewayEvent,
async handle(event: { toolName: string; params: unknown; result: unknown; durationMs: number; sessionId: string }) {
if (event.durationMs > 5000) {
// => Alert on slow tool executions
await sendSlackAlert(`Slow tool: ${event.toolName} took ${event.durationMs}ms`);
// => Sends to a monitoring Slack channel — not the user's channel
}
await metricsClient.increment("tool.executed", { tool: event.toolName });
// => Increment Prometheus/Datadog counter for observability
},
};// Register middleware and handlers in openclaw.config.ts
import {
auditLoggingMiddleware,
authMiddleware,
commandTranslationMiddleware,
toolExecutionMonitor,
} from "./gateway-extensions";
export default {
gateway: {
middleware: [
auditLoggingMiddleware, // => Runs first (pre-route stage)
authMiddleware, // => Runs second (pre-route stage)
commandTranslationMiddleware, // => Runs third (pre-route stage)
],
eventHandlers: [toolExecutionMonitor],
},
};
// => On start: Gateway middleware loaded: audit-logger, custom-auth, command-translator
// => On start: Event handlers registered: tool.executedThe stage field on each middleware specifies when it runs: pre-route (before route
selection), post-route (after route selected, before runtime), post-response (after
runtime produces response, before delivery), or on-error (when any stage throws).
Middleware stages let you intercept the pipeline at the exact point relevant to your
concern without affecting other stages.
Key Takeaway: Gateway middleware intercepts the message pipeline at configurable stages, enabling audit logging, authentication, request transformation, and monitoring without modifying the core runtime or skill implementations.
Why It Matters: In production deployments, the Gateway customization layer is where compliance requirements land: audit trails, access control, PII redaction before LLM calls, and integration with external monitoring. Without a middleware system, these concerns would require forking the Gateway source. With it, they are configuration-level additions that survive framework upgrades.
3. Security Hardening
Security hardening for a production OpenClaw deployment goes beyond the minimal-permission principle covered in Beginner. At the advanced level, the concerns are: defending against prompt injection in untrusted content, sandboxing tool execution to prevent privilege escalation, implementing comprehensive audit logging, and scoping credentials so a compromise of one workspace does not compromise all workspaces.
%% Color Palette: Blue #0173B2, Orange #DE8F05, Teal #029E73, Purple #CC78BC, Brown #CA9161
graph TD
MSG["Incoming message"]:::blue
PIX["Prompt injection<br/>scanner"]:::orange
GW["Gateway auth<br/>middleware"]:::orange
CTX["Context assembly<br/>PII redaction"]:::orange
RT["Agent Runtime<br/>permission enforcement"]:::teal
SBX["Tool sandbox<br/>process isolation"]:::teal
AUD["Audit logger<br/>append-only log"]:::brown
LLM["LLM Provider<br/>min-context policy"]:::blue
MSG --> PIX
PIX -->|"clean"| GW
PIX -->|"injection detected"| AUD
GW -->|"authorized"| CTX
CTX --> RT
RT --> SBX
SBX -->|"tool result"| RT
RT --> LLM
RT --> AUD
SBX --> AUD
classDef blue fill:#0173B2,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef orange fill:#DE8F05,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef teal fill:#029E73,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef brown fill:#CA9161,stroke:#000000,color:#FFFFFF,stroke-width:2px
// ~/.openclaw/workspace/security.config.ts — production hardening configuration
export const securityConfig = {
// --- Prompt Injection Defenses ---
promptInjection: {
scanIncomingMessages: true,
scanToolResults: true, // => Also scan content returned by tools
// => Critical: web_fetch can return injections
// => embedded in scraped web pages
patterns: [
// Built-in patterns (always active)
"ignore.*previous.*instructions",
"you are now",
"disregard.*system.*prompt",
// Custom patterns for your domain
"forward.*to.*email", // => Catch injection attempts targeting email tool
"delete.*all.*files", // => Catch filesystem destruction attempts
],
onDetection: "block-and-alert", // => "warn" | "block" | "block-and-alert"
// => block-and-alert: rejects message + notifies admin
alertChannel: "slack:C01ADMIN123", // => Slack channel ID for security alerts
},
// --- Tool Execution Sandboxing ---
toolSandbox: {
enabled: true,
mode: "subprocess", // => Execute tools in isolated child processes
// => Each tool call spawns a new process
// => Process inherits only explicitly granted env vars
env: {
allowList: ["HOME", "PATH", "NODE_ENV"], // => Only these env vars visible to tool processes
// => CRM_API_KEY not in allowList → not visible
// => Tools that need it declare it explicitly
toolSpecificEnv: {
crm_get_contact: ["CRM_API_KEY"], // => Only crm_get_contact sees CRM_API_KEY
send_email: ["SMTP_PASSWORD"], // => Only send_email sees SMTP_PASSWORD
// => Other tools see neither — credential compartmentalization
},
},
resourceLimits: {
maxMemoryMB: 256, // => Per-tool memory limit
maxCpuPercent: 50, // => Per-tool CPU limit
maxExecutionMs: 15000, // => Hard timeout (overrides TOOLS.md timeout)
maxNetworkBandwidthKBs: 1024, // => Limit tool network usage
},
filesystem: {
allowedReadPaths: ["~/Documents", "~/Downloads"],
allowedWritePaths: ["~/Downloads/agent-output/"], // => Restrict writes to one dir
denyPaths: ["~/.ssh", "~/.aws", "~/.gnupg"], // => Explicit deny — highest priority
// => Even if allowedReadPaths were /
// => these paths remain blocked
},
},
// --- Credential Scoping ---
credentials: {
store: "system-keychain", // => macOS Keychain / Linux Secret Service
// => Never store credentials in config files
scopeByWorkspace: true, // => Each workspace has isolated keychain namespace
// => personal/ workspace cannot read devops/ credentials
rotationReminder: {
enabled: true,
maxAgeDays: 90, // => Alert when any credential is older than 90 days
alertChannel: "telegram:personal",
},
},
// --- Audit Logging ---
audit: {
enabled: true,
logPath: "~/.openclaw/audit/",
format: "jsonl", // => One JSON object per line — grep-friendly
retentionDays: 365,
events: [
"message.received", // => Every inbound message (metadata only, no text)
"tool.executed", // => Every tool call (name + params, not results)
"tool.blocked", // => Tool calls blocked by permission system
"injection.detected", // => Prompt injection scan results
"permission.denied", // => Permission check failures
"session.created",
"session.expired",
],
immutable: true, // => Use append-only file mode
// => Agent runtime cannot delete or modify audit logs
// => Requires separate privileged process to rotate
},
};Sandboxing tool execution via subprocess isolation is the most impactful hardening
measure for deployments with shell.execute permissions. Without sandboxing, a successful
prompt injection that reaches the shell tool runs in the same process and user account
as the OpenClaw runtime — with access to all environment variables, including credentials.
With subprocess isolation and a credential allowList per tool, the blast radius of a
shell injection is bounded by the subprocess's permitted environment.
Note that subprocess sandboxing does not prevent a compromised shell tool from exfiltrating files within the allowed filesystem paths. For true filesystem isolation, combine subprocess sandboxing with a dedicated OS user account for tool processes (macOS sandbox profiles or Linux seccomp filters provide the strongest guarantees but require OS-level configuration beyond OpenClaw's config system).
Key Takeaway: Production security hardening requires defense in depth across four layers: prompt injection scanning, tool execution sandboxing, credential compartmentalization, and immutable audit logging — no single layer is sufficient on its own.
Why It Matters: An agent with email, calendar, and shell access running in a production environment without hardening is a significant security liability. Prompt injections embedded in documents the agent reads or web pages it fetches can trigger unintended actions across all those tools simultaneously. Defense in depth means an attacker who succeeds at prompt injection still cannot escalate to arbitrary code execution if the sandbox is properly configured.
4. Building a Domain-Specific Agent
A domain-specific agent is an OpenClaw deployment optimized for one business domain: every configuration decision — AGENTS.md, skills, tools, memory, channels — is made to serve that domain's workflows as effectively as possible. Building one well requires treating it as a product design problem, not a configuration exercise.
This section walks through building a CRM-integrated sales research agent end to end. The same methodology applies to any domain: legal, medical, DevOps, customer support, financial analysis.
// Design phase: Define the agent's scope, users, channels, and capability boundaries
// (This is architecture documentation, not code — captured in AGENTS.md)
/*
* CRM Research Agent — design decisions:
*
* Users: Sales team (5 people), on Slack + occasionally Telegram
* Primary workflows:
* 1. Research a prospect company before a meeting
* 2. Add research notes to CRM after a meeting
* 3. Check CRM status of a deal by name
* 4. Draft follow-up emails based on meeting notes
*
* Out of scope (explicitly excluded from AGENTS.md):
* - Sending emails directly (too much risk; draft + confirm only)
* - Accessing deal financial data (not in CRM tool permissions)
* - Non-sales queries (route to general assistant if needed)
*
* Channels: slack (primary), telegram (personal use by individual reps)
* Memory: long-term enabled; knowledge base with company profiles, ICP definition
* Tools: web_search, web_fetch, crm_get_contact, crm_create_note, email_draft
* Skills: lead-research, crm-formatter, email-drafter, meeting-summarizer
*/<!-- ~/.openclaw/workspaces/crm-agent/AGENTS.md — domain-specific system instructions -->
# CRM Research Agent — System Instructions
You are a sales research assistant for a 5-person sales team. Your job is to support
deal research, CRM note-taking, and follow-up email drafting. You do not close deals,
send emails autonomously, or handle non-sales queries.
## What You Do
1. **Prospect research**: When asked to research a company, use the lead-research skill.
Always end with a CRM-formatted note the rep can review before adding.
2. **CRM lookups**: When asked about a deal or contact, call crm_get_contact first.
If the contact is not in CRM, say so clearly and offer to research them.
3. **Meeting notes to CRM**: When given meeting notes, summarize using meeting-summarizer
skill and produce a CRM-formatted note. Always ask for confirmation before calling
crm_create_note.
4. **Email drafting**: When asked to draft a follow-up, use email-drafter skill. Produce
the draft as text for review — never call send_email directly. Wait for explicit
approval.
## What You Do Not Do
- Send emails without explicit user confirmation
- Access financial or pipeline value data
- Handle IT, HR, or non-sales queries (say "that's outside my scope" and stop)
- Make commitments or promises on behalf of the sales team
## Response Format
Keep responses concise. If producing a CRM note or email draft, use the structured
format from the relevant skill. For research summaries, use the lead-research format.// Knowledge base for domain-specific grounding
// ~/.openclaw/workspaces/crm-agent/knowledge/
// Files indexed:
// ideal-customer-profile.md — ICP definition for qualification scoring
// competitor-comparison.md — Competitor feature comparison table
// product-positioning.md — Product messaging guidelines
// sales-playbook.md — Stage-by-stage sales process
// How the knowledge base improves research quality:
// When agent researches "Acme Corp", it retrieves:
// - ICP doc chunk: "Target companies: 50-500 employees, B2B SaaS, series A-C"
// - Retrieved: Acme Corp has 200 employees, is B2B SaaS, series B → ICP match: HIGH
// - Agent includes ICP qualification score in research summary without being prompted
// Memory configuration for the CRM agent
export const memoryConfig = {
longTermMemory: {
autoStoreThreshold: 0.8, // => Higher threshold — only store notable interactions
// => Avoids cluttering memory with routine CRM lookups
namespacePrefix: "crm-agent:", // => Isolated from personal-assistant memories
},
sessionHistory: { maxTurns: 10 }, // => Shorter than default — CRM queries are stateless
};The design discipline most commonly skipped: defining what the agent does not do, as explicitly as what it does. An AGENTS.md that says "I am a sales assistant, help with anything sales-related" will drift into handling IT tickets, expense reports, and general research because the LLM interprets "anything sales-related" broadly. An AGENTS.md that lists five specific "What You Do Not Do" items produces dramatically more consistent scope enforcement.
Key Takeaway: A domain-specific agent is defined as much by its explicit exclusions in AGENTS.md as by its included capabilities — scope drift is the primary failure mode for single-domain agents.
Why It Matters: Domain-specific agents earn trust from their users precisely because they are predictable within their scope. A sales rep who knows the CRM agent always asks for confirmation before writing to CRM, never sends emails autonomously, and always produces ICP-qualified research notes will use it daily. An agent that sometimes writes to CRM without asking, occasionally handles unrelated queries, and produces inconsistent research formats will be abandoned. Predictable scope is a trust prerequisite.
5. Multi-Agent Patterns
At the advanced level, multi-agent patterns extend beyond the simple orchestrator-specialist delegation covered in Intermediate. Production multi-agent systems use three primary patterns: fan-out (parallel specialist execution), sequential pipeline (output of one agent is input to the next), and hierarchical delegation (orchestrator recursively delegates to sub-orchestrators for complex tasks).
Understanding when each pattern applies prevents over-engineering simple workflows and under-engineering complex ones.
%% Color Palette: Blue #0173B2, Orange #DE8F05, Teal #029E73, Purple #CC78BC, Brown #CA9161
graph TD
subgraph "Pattern 1: Fan-out (parallel)"
U1["User request"]:::blue
O1["Orchestrator"]:::orange
S1A["Specialist A<br/>web research"]:::teal
S1B["Specialist B<br/>CRM lookup"]:::teal
S1C["Specialist C<br/>news search"]:::teal
AGG["Aggregator<br/>combine results"]:::brown
U1 --> O1
O1 --> S1A
O1 --> S1B
O1 --> S1C
S1A --> AGG
S1B --> AGG
S1C --> AGG
end
subgraph "Pattern 2: Sequential pipeline"
U2["User request"]:::blue
P1["Stage 1<br/>research"]:::orange
P2["Stage 2<br/>summarize"]:::teal
P3["Stage 3<br/>format for CRM"]:::purple
U2 --> P1
P1 -->|"raw research"| P2
P2 -->|"summary"| P3
end
classDef blue fill:#0173B2,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef orange fill:#DE8F05,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef teal fill:#029E73,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef purple fill:#CC78BC,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef brown fill:#CA9161,stroke:#000000,color:#FFFFFF,stroke-width:2px
// Fan-out pattern — orchestrator calls multiple specialists in parallel
// Useful when subtasks are independent and total time matters
export const parallelResearchTool = defineTool({
name: "parallel_research",
async execute(params: { company: string }) {
// Launch all three specialist calls simultaneously
const [webResearch, crmData, newsResults] = await Promise.allSettled([
callAgent("research", `Research ${params.company} company background`),
// => Launches research agent — takes ~8s
callAgent("crm", `Look up ${params.company} in CRM`),
// => Launches CRM agent — takes ~2s
callAgent("news", `Find recent news about ${params.company} last 90 days`),
// => Launches news agent — takes ~5s
]);
// => All three start at t=0, total wall time = max(8s, 2s, 5s) = 8s
// => Sequential would be: 8s + 2s + 5s = 15s — fan-out saves 7s
return {
web: webResearch.status === "fulfilled" ? webResearch.value : "Research failed",
crm: crmData.status === "fulfilled" ? crmData.value : "CRM unavailable",
news: newsResults.status === "fulfilled" ? newsResults.value : "News search failed",
// => Promise.allSettled (not Promise.all) — partial results on specialist failure
// => Orchestrator gets whatever completed; LLM synthesizes with available data
};
},
});
// Sequential pipeline — when each stage requires the previous stage's output
export const researchToCRMPipeline = defineTool({
name: "research_to_crm_pipeline",
async execute(params: { company: string; contactName: string }) {
// Stage 1: Raw research
const rawResearch = await callAgent(
"research",
`Deep research on ${params.company}: background, recent news, key contacts`,
);
// => => Long string of raw research findings (~2000 words)
// Stage 2: Summarize and structure (cannot parallelize — needs Stage 1 output)
const summary = await callAgent(
"summarizer",
`Summarize this research for a sales rep meeting ${params.contactName}:\n\n${rawResearch}`,
);
// => => Structured 300-word summary with key points highlighted
// Stage 3: Format for CRM (cannot parallelize — needs Stage 2 output)
const crmNote = await callAgent("crm-formatter", `Format this summary as a CRM note:\n\n${summary}`);
// => => Formatted CRM note with delimiters ready to paste
return { rawResearch, summary, crmNote };
// => Orchestrator delivers all three to user; user picks which level of detail they want
},
});Result aggregation in fan-out patterns requires the orchestrator LLM to synthesize outputs from multiple specialists that may have different formats, confidence levels, and even contradictions. Include explicit aggregation instructions in the orchestrator's AGENTS.md: "When combining results from multiple research specialists, note any contradictions between sources and rate each source's confidence explicitly."
Hierarchical delegation adds a third tier: an orchestrator delegates to a sub-orchestrator that itself coordinates specialists. This pattern is only justified for genuinely complex workflows (10+ distinct subtasks, dynamic task decomposition based on intermediate results) — the added latency and operational complexity of a three-tier system is not worth it for simpler workflows.
Key Takeaway: Fan-out (parallel execution) reduces wall-clock time for independent subtasks; sequential pipeline handles data dependencies between stages — choose based on whether subtasks are independent, not based on task complexity.
Why It Matters: The difference between a 30-second agent response and an 8-second one is the difference between a workflow that disrupts the user's rhythm and one that feels instant. For tasks requiring multiple independent data sources — sales research, incident analysis, content compilation — fan-out parallelism is the primary tool for reaching response times that do not create friction in professional workflows.
6. Memory Persistence Architecture
Long-term memory persistence at production scale requires architectural decisions that are not surfaced in the default configuration: which embedding model to use, how to manage store growth, how to implement retention policies, and whether to integrate a knowledge graph for structured relationship memory alongside the vector store for semantic memory.
The default SQLite-VSS backend works well for a single user with up to ~100,000 memory entries. Beyond that, or for multi-user deployments, a dedicated vector database is the right choice.
// ~/.openclaw/workspace/memory.config.ts — production memory architecture
export const memoryConfig = {
backend: {
type: "qdrant", // => Dedicated vector database
// => Alternatives: "weaviate", "pinecone" (cloud),
// => "milvus", "chromadb"
connection: {
host: "localhost",
port: 6333, // => Qdrant default port
// => Run: docker run -p 6333:6333 qdrant/qdrant
// => Or: brew install qdrant && qdrant &
},
collections: {
conversations: "openclaw_conversations", // => Conversation memory collection
knowledge: "openclaw_knowledge", // => Knowledge base chunks collection
},
},
embedding: {
model: "text-embedding-3-large", // => 3072 dimensions (vs. 1536 for small)
// => Higher dimensions = better semantic precision
// => Cost: ~$0.00013 per 1K tokens (small: ~$0.00002)
// => Use large for high-value knowledge bases,
// => small for high-volume conversation memory
batchSize: 100, // => Embed up to 100 texts per API call
// => Reduces embedding cost and latency for bulk indexing
},
retention: {
policies: [
{
namespace: "*", // => Applies to all namespaces
maxEntries: 50000, // => Delete oldest when limit exceeded
maxAgeDays: 730, // => Delete entries older than 2 years
importanceThreshold: 0.3, // => Delete low-importance entries first
// => (importance scored 0-1 at store time)
},
{
namespace: "crm-agent:", // => CRM agent gets longer retention
maxAgeDays: 1825, // => 5 years — sales relationships are long-lived
maxEntries: 200000,
},
],
runSchedule: "0 2 * * *", // => Cron: run retention job at 2 AM daily
},
// Knowledge graph integration — for structured relationship memory
knowledgeGraph: {
enabled: true,
backend: "neo4j", // => Neo4j graph database
connection: {
uri: "bolt://localhost:7687",
auth: { username: "neo4j", password: process.env.NEO4J_PASSWORD },
},
// => Graph complements vector store: vector store answers "what is similar to X?"
// => Graph answers "who is connected to X?" and "what is the relationship between X and Y?"
entityExtraction: {
enabled: true, // => After each turn, extract entities and relationships
types: ["Person", "Company", "Deal", "Product", "Event"],
// => "Alice Smith from Acme Corp mentioned the Omega Deal closing in Q3"
// => Extracts: Person(Alice Smith) -[WORKS_AT]-> Company(Acme Corp)
// => Person(Alice Smith) -[MENTIONED]-> Deal(Omega Deal)
},
},
};// Querying both vector store and knowledge graph in context assembly
async function assembleRichMemoryContext(message: ChannelMessage, session: Session) {
const [vectorMemories, graphContext] = await Promise.all([
// Vector store: semantic similarity
memory.retrieve(message.text, session.id, { limit: 5, minSimilarity: 0.65 }),
// => => ["Acme Corp deal closes June 15", "Alice Smith prefers morning calls", ...]
// Knowledge graph: relationship traversal
knowledgeGraph.query(
`
MATCH (p:Person)-[:WORKS_AT]->(c:Company {name: $company})
OPTIONAL MATCH (c)-[:HAS_DEAL]->(d:Deal)
RETURN p.name, p.title, d.name, d.stage
LIMIT 5
`,
{ company: extractCompanyName(message.text) },
),
// => => [{name:"Alice Smith", title:"VP Sales", dealName:"Omega Deal", stage:"Proposal"}]
]);
return { vectorMemories, graphContext };
// => LLM receives both: semantic memories (what was said) and
// => graph context (structured relationships between entities)
// => Combined: richer context than either source provides alone
}The knowledge graph integration is an advanced capability worth evaluating only for deployments where relationship memory is central to the use case: CRM workflows (who knows whom, which companies are linked, which deals involve which people), legal workflows (which clauses appear in which contracts, which parties are in which agreements), or research workflows (which papers cite which others, which authors collaborate on which topics). For simpler personal productivity use cases, the vector store alone is sufficient.
Key Takeaway: Production memory architecture separates the embedding model choice (precision vs. cost), the vector backend (SQLite-VSS for personal use, Qdrant/Weaviate for scale), and retention policy (namespace-specific rules) — the knowledge graph adds structured relationship memory orthogonal to semantic similarity.
Why It Matters: Memory quality is the primary differentiator between an agent that improves with use and one that plateaus. Poor embedding model choice leads to low-quality retrieval (semantically unrelated memories returned, relevant ones missed). Absence of retention policies leads to store pollution over time. Knowledge graph integration enables relationship-aware queries that vector search cannot answer. These architectural decisions compound: a well-configured memory system makes every subsequent interaction better.
7. ClawHub: Publishing Skills at Scale
Publishing skills at scale means treating skill packages as software artifacts: semantic versioning, automated testing, dependency management, and a publishing pipeline that validates quality before release. Teams with multiple OpenClaw deployments benefit from maintaining an internal skill registry (a private ClawHub mirror) rather than publishing sensitive domain skills publicly.
// Skill package structure for a publishable, well-tested skill
// ~/.openclaw/workspace/skills/contract-review/
// ├── SKILL.md (main skill definition)
// ├── tests/
// │ ├── trigger-tests.yaml (trigger phrase test cases)
// │ └── behavior-tests.yaml (behavior test cases using the skill testing framework)
// ├── examples/
// │ └── sample-contracts/ (test fixtures for behavior tests)
// └── package.json (version, dependencies, metadata)
// trigger-tests.yaml — verify trigger phrase matching
// Each test verifies that a message does or does not trigger the skill# ~/.openclaw/workspace/skills/contract-review/tests/trigger-tests.yaml
triggers:
should_activate:
- "can you review this contract"
- "review the agreement at ~/Documents/vendor.pdf"
- "check this contract for red flags"
- "look at this service agreement"
- "I need a contract review"
should_not_activate:
- "what is the weather today" # => Completely unrelated
- "summarize my emails" # => Different domain skill
- "how do I write a contract" # => Question about contracts, not a review request
# => "how do I write" should NOT activate contract-review
# => It should activate a different skill (or no skill) — legal advice is out of scope// behavior-tests.yaml — verify the skill produces correct output format
// Uses openclaw's skill testing framework to replay interactions
// openclaw skill test contract-review
// => Running trigger tests... 9/9 passed
// => Running behavior tests...
// => Test 1: standard-nda.pdf → checking output format... PASS
// => Contains: "Contract Review Summary" header ✓
// => Contains: clause checklist table ✓
// => Contains: "This review is AI-assisted" footer ✓
// => Test 2: missing-termination-clause.pdf → checking FLAGS output... PASS
// => Contains: "[FLAG]" marker ✓
// => Contains: "termination" in flagged section ✓
// => Test 3: large-contract-100-pages.pdf → checking truncation behavior... PASS
// => Contains: "Only the first 50,000 bytes were reviewed" notice ✓
// => All tests passed (3/3)
// Private ClawHub mirror — for sensitive domain skills
// openclaw.config.ts registry configuration:
export const registryConfig = {
registries: [
{
name: "internal",
url: "https://clawhub.internal.yourcompany.com", // => Internal Qdrant or self-hosted registry
priority: 1, // => Try internal first
auth: { type: "bearer", token: process.env.INTERNAL_REGISTRY_TOKEN },
},
{
name: "clawhub",
url: "https://clawhub.dev",
priority: 2, // => Fall back to public registry
// => Skills found in internal registry shadow public ClawHub skills of the same name
},
],
};
// => openclaw skill search "contract-review"
// => Results from internal registry (1): contract-review-enterprise v3.2.0 (private)
// => Results from clawhub (1): contract-review v1.1.0 (public)
// => Internal result takes precedence — contract-review-enterprise installs by default# Automated publishing pipeline (CI integration example)
# .github/workflows/publish-skills.yml (or equivalent CI config)
# On merge to main:
# 1. Run skill tests
openclaw skill test contract-review # => Must pass all trigger + behavior tests
# 2. Validate for registry
openclaw skill validate --registry internal --strict
# => Quality score: 94/100 (strict mode requires >= 85)
# => All metadata fields present ✓
# => License: MIT ✓
# => Minimum version compatibility verified ✓
# 3. Publish to internal registry
openclaw skill publish --registry internal --changelog "$(git log -1 --pretty=%B)"
# => Publishing contract-review-enterprise v3.3.0 to internal registry...
# => Published. 12 internal deployments will see update available.
# => Deployments can update with: openclaw skill update contract-review-enterpriseThe --strict flag on openclaw skill validate enforces additional requirements beyond
the default: minimum example count of 3 (vs. 2), mandatory edge case handling documented
in the instructions block, mandatory minOpenClawVersion metadata, and a quality score
of 85+ (vs. 70+ for the public registry). For internal skills that teams rely on in
production, these stricter standards reduce the rate of behavioral regressions between
skill versions.
Key Takeaway: Treat skill packages as versioned software artifacts with automated trigger tests, behavior tests, and a CI publishing pipeline — an internal skill registry isolates sensitive domain skills from the public ClawHub while maintaining version control.
Why It Matters: As an organization's OpenClaw deployment grows from one agent to dozens, skill maintenance becomes a shared engineering concern. Skills that lack tests produce silent behavior regressions when updated. Skills published without version discipline break downstream deployments that depend on them. Applying the same software engineering practices to skill packages that you apply to library code prevents the technical debt accumulation that makes large skill libraries brittle over time.
8. OpenClaw and Pi Architecture
Understanding how OpenClaw's design was influenced by Pi (the minimal coding agent harness) provides architectural insight that is directly useful when extending OpenClaw. Pi was created as a four-tool minimal agent (Read, Write, Edit, Bash); OpenClaw's architecture retains this minimal core and layers the channel abstraction, skills system, and memory on top of it.
The conceptual debt between Pi and OpenClaw surfaces in several specific design decisions worth understanding for contributors and advanced customizers.
%% Color Palette: Blue #0173B2, Orange #DE8F05, Teal #029E73, Purple #CC78BC, Brown #CA9161
graph TD
subgraph "Pi: minimal harness"
P_LOOP["Agentic loop<br/>LLM → tool → LLM"]:::blue
P_TOOLS["4 primitive tools<br/>Read Write Edit Bash"]:::orange
P_CTX["Context files<br/>AGENTS.md SYSTEM.md"]:::teal
end
subgraph "OpenClaw: full framework"
OC_LOOP["Agentic loop<br/>same core pattern"]:::blue
OC_TOOLS["Tool registry<br/>TOOLS.md + custom"]:::orange
OC_CTX["Context files<br/>AGENTS.md SOUL.md TOOLS.md"]:::teal
OC_CHAN["Channel System<br/>Pi has no concept of channels"]:::purple
OC_SK["Skills System<br/>Pi: SKILL.md per session<br/>OpenClaw: selective injection"]:::purple
OC_MEM["Memory System<br/>Pi: session only<br/>OpenClaw: semantic long-term"]:::purple
OC_GW["Gateway<br/>Pi has no equivalent"]:::purple
end
P_LOOP -.->|"same pattern"| OC_LOOP
P_CTX -.->|"extended"| OC_CTX
P_TOOLS -.->|"formalized"| OC_TOOLS
classDef blue fill:#0173B2,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef orange fill:#DE8F05,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef teal fill:#029E73,stroke:#000000,color:#FFFFFF,stroke-width:2px
classDef purple fill:#CC78BC,stroke:#000000,color:#FFFFFF,stroke-width:2px
// The agentic loop in Pi (conceptual — Pi's actual code is simpler)
// vs. the same pattern in OpenClaw:
// Pi loop (conceptual):
async function piLoop(userMessage: string, context: string[]): Promise<string> {
context.push({ role: "user", content: userMessage });
while (true) {
const completion = await llm.complete(context);
if (!completion.toolCall) return completion.text; // => Done
const result = await tools[completion.toolCall.name](completion.toolCall.args);
context.push({ role: "tool", content: result });
// => Loop: no iteration limit in Pi's minimal implementation
}
}
// OpenClaw loop — same core pattern, production additions:
async function openClawLoop(message: ChannelMessage, agentCtx: AgentContext, session: Session): Promise<string> {
// Context assembly is richer: AGENTS.md + SOUL.md + skills + memories + history
const context = await assembleContext(message, agentCtx, session);
// => Pi assembles context manually per session; OpenClaw automates this
let iterations = 0;
while (iterations < agentCtx.maxIterations) {
// => Pi: no limit; OpenClaw: configurable
iterations++;
const completion = await agentCtx.llmProvider.complete(context);
// => Pi: hardcoded provider; OpenClaw: pluggable provider interface
if (!completion.toolCalls?.length) return completion.text;
for (const toolCall of completion.toolCalls) {
if (agentCtx.toolRegistry.requiresConfirmation(toolCall.name)) {
const approved = await awaitUserConfirmation(toolCall, session);
// => Pi: no confirmation system; OpenClaw: per-tool confirmation
if (!approved) {
context.append({ role: "tool", content: "User declined this action." });
continue;
}
}
const result = await agentCtx.toolRegistry.execute(toolCall.name, toolCall.parameters);
// => Pi: direct function call; OpenClaw: registry with permission enforcement + sandboxing
context.append({ role: "tool", content: result });
}
}
return "Maximum iterations reached.";
// => Pi has no safety limit; OpenClaw adds this for production reliability
}The three most significant differences between Pi and OpenClaw at the architecture level:
-
Channel abstraction — Pi has no concept of channels; it runs in a terminal. OpenClaw adds the Channel System and Gateway as a presentation layer that Pi simply does not need.
-
Selective skill injection — Pi loads SKILL.md files into context manually, per session. OpenClaw automates injection with trigger matching and token budgeting. The SKILL.md format is shared; the injection mechanism is OpenClaw's contribution.
-
Production guardrails — Pi's loop has no iteration limit, no confirmation system, no permission enforcement. OpenClaw adds all three as first-class features, trading some flexibility for production reliability.
Key Takeaway: OpenClaw is Pi's agentic loop plus a channel presentation layer, a skill injection automation system, and production guardrails — understanding Pi's minimal design helps identify which parts of OpenClaw are fundamental versus accidental complexity.
Why It Matters: When extending OpenClaw, knowing which components derive from the minimal agent harness pattern (loop, tools, context files) versus which are OpenClaw- specific additions (channels, Gateway, selective injection, memory) tells you where changes are safe versus where they could break fundamental behavior. The loop and tool execution are load-bearing; the Gateway and Channel System are replaceable presentation layers.
9. Production Deployment
Production deployment of OpenClaw means running the agent reliably under real load, with monitoring, automatic recovery from failures, controlled cost, and a deployment process that does not require manual intervention for routine restarts or updates.
The primary deployment target for most OpenClaw users is a dedicated machine (Mac Mini, NUC, or Linux workstation) running continuously. Container deployment is possible but adds complexity that is only justified for multi-user or high-availability requirements.
# macOS production setup — launchd service for automatic startup and restart
# Create a launchd plist (macOS service definition)
cat > ~/Library/LaunchAgents/com.openclaw.agent.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.openclaw.agent</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/openclaw</string>
<string>start</string>
<string>--config</string>
<string>/Users/yourname/.openclaw/workspace/openclaw.config.ts</string>
</array>
<key>EnvironmentVariables</key>
<dict>
<key>ANTHROPIC_API_KEY</key>
<string>use-keychain</string> <!-- openclaw reads from keychain, not env -->
</dict>
<key>RunAtLoad</key>
<true/> <!-- Start on login -->
<key>KeepAlive</key>
<true/> <!-- Restart if crashed -->
<key>StandardOutPath</key>
<string>/Users/yourname/.openclaw/logs/stdout.log</string>
<key>StandardErrorPath</key>
<string>/Users/yourname/.openclaw/logs/stderr.log</string>
<key>ThrottleInterval</key>
<integer>30</integer> <!-- Wait 30s between restart attempts -->
</dict>
</plist>
EOF
# Load the service
launchctl load ~/Library/LaunchAgents/com.openclaw.agent.plist
# => Service registered. OpenClaw will start now and on every login.
launchctl list | grep openclaw
# => 12345 0 com.openclaw.agent
# => PID 12345, exit code 0 — running normally// Cost management configuration — critical for production cost control
export const costConfig = {
budgets: {
daily: {
usd: 10.0, // => Hard limit: $10/day across all LLM calls
onExceeded: "warn-and-throttle", // => "warn-and-throttle" | "warn-only" | "hard-stop"
// => "hard-stop" stops accepting messages until next day reset
// => "warn-and-throttle" reduces token budget and warns user, keeps running
},
perUser: {
usd: 2.0, // => $2/day per user across all channels
onExceeded: "warn-and-throttle",
},
},
tokenOptimization: {
compressionThreshold: 0.8, // => When context is >80% of max tokens,
compressionStrategy: "summarize-oldest", // => summarize oldest history turns to save space
// => "summarize-oldest": LLM summarizes oldest 10 turns into 1 summary turn (~70% compression)
// => This loses some detail but keeps the agent responsive within budget
skillBudget: 3000, // => Max tokens for skill injection per turn (default: 4000)
// => Reducing this from 4000 to 3000 saves ~$0.003 per turn with Claude Sonnet
// => At 1000 turns/day: $3/day savings — meaningful at scale
},
monitoring: {
reportingInterval: "daily", // => Send daily cost report
alertThreshold: 0.8, // => Alert at 80% of daily budget
alertChannel: "telegram:personal", // => Where to send alerts
},
};# Monitoring — key metrics to track in production
openclaw metrics
# => === OpenClaw Production Metrics (last 24h) ===
# => Messages received: 847
# => Messages processed: 841 (6 blocked by rate limiter)
# => Average response time: 4.2s (median: 3.1s, p95: 12.4s)
# => Tool calls: 2,341 (2.78 per message average)
# => Most called tools: web_search (891), crm_get_contact (445), read_file (312)
# => Skill injections: lead-research (312), crm-formatter (298), contract-review (87)
# => LLM cost: $3.42 (Claude: $2.14, DeepSeek: $1.28)
# => Memory writes: 234 (0.28 per message)
# => Errors: 3 (tool timeout: 2, LLM rate limit: 1)
# => Uptime: 99.7% (1 restart due to memory limit)Key Takeaway: Production deployment requires a process supervisor for automatic restart, daily cost budgets with throttling rather than hard stops, and metrics collection focused on response time distribution, tool call frequency, and per-day LLM cost.
Why It Matters: An OpenClaw agent that runs unmonitored will accumulate costs silently (especially if a skill triggers unexpectedly frequent web searches), fail silently when a tool times out, and restart without context if the process crashes. A production deployment with launchd supervision, daily cost budgets, and metrics export makes the agent's behavior observable and controllable — turning a personal tool into a reliable operational system.
10. Performance Optimization
Performance optimization in OpenClaw targets three independent cost drivers: LLM token usage (affects both latency and cost), tool execution latency (affects response time), and memory retrieval time (affects context assembly latency). Each has distinct optimization strategies.
The most impactful optimizations are context window management (reducing unnecessary tokens in every call) and skill pruning (removing installed skills that are no longer used but consume injection budget capacity).
// Context window optimization — the highest-leverage performance improvement
// Tool 1: Measure actual token distribution per turn
// openclaw trace send "research Acme Corp" --token-breakdown
// => === Token breakdown ===
// => AGENTS.md: 287 tokens (4.3%)
// => SOUL.md: 143 tokens (2.2%)
// => Injected skills: 1847 tokens (27.9%) ← largest component
// => lead-research: 1247 tokens
// => crm-formatter: 600 tokens
// => Long-term memories: 421 tokens (6.4%)
// => Session history: 3891 tokens (58.9%) ← second largest
// => Current message: 12 tokens (0.2%)
// => Total: 6601 tokens (100%)
// Tool 2: Trim session history — biggest wins often here
export const memoryConfig = {
sessionHistory: {
maxTurns: 20, // => Default; try 12 for routine task workflows
// => Routine CRM lookups don't need 20 turns of history
// => 12 turns: saves ~2000 tokens per call = $0.006/call with Claude Sonnet
// => At 500 calls/day: $3/day savings with no meaningful quality loss
summarizeAfter: 10, // => After 10 turns, summarize oldest 5 turns
// => into 1 summary turn (saves ~300 tokens/turn)
summaryModel: "claude-haiku-3-5", // => Use cheap model for summarization
// => => $0.0002 for summarization vs $0.003 saved
// => => Net positive at every call after turn 10
},
};
// Tool 3: Skill size audit — identify bloated skills
// openclaw skill stats --token-count
// => Skill token counts (injected size):
// => lead-research: 1247 tokens ← review for redundancy
// => crm-formatter: 600 tokens
// => contract-review: 1847 tokens ← largest; worth trimming
// => core-memory: 312 tokens (built-in)
// => core-search: 287 tokens (built-in)
// Tool 4: Trim large skills — remove redundant examples
// contract-review has 5 examples; reducing to 2 saves ~600 tokens per injection
// => 87 injections/day × 600 tokens × $0.000003/token = $0.16/day savings
// => Minimal quality impact if the 2 retained examples are the most representative// Tool execution latency — identify slow tools
// openclaw tool benchmark --sample 20
// => Tool performance (20 samples each):
// => web_search: p50: 1.2s p95: 3.4s p99: 8.1s ← high variance (network)
// => web_fetch: p50: 0.8s p95: 2.1s p99: 5.2s ← high variance (target server)
// => crm_get_contact: p50: 0.3s p95: 0.8s ← consistent (local API on LAN)
// => read_file: p50: 0.02s p95: 0.1s ← fastest (local disk)
// Optimization: add caching for frequently-called tools
export const toolCacheConfig = {
cache: {
crm_get_contact: {
ttlSeconds: 300, // => Cache CRM lookups for 5 minutes
// => Same contact looked up multiple times in a session
// => First call: ~0.3s; subsequent: ~0.001s
keyFn: (params: { query: string }) => `crm:${params.query.toLowerCase()}`,
},
web_search: {
ttlSeconds: 3600, // => Cache search results for 1 hour
// => Same query across multiple turns (common in research)
keyFn: (params: { query: string }) => `search:${params.query}`,
},
},
};
// => Tool cache enabled: crm_get_contact (5m TTL), web_search (1h TTL)
// => Cache hit rate after 1 day: crm_get_contact 34%, web_search 18%
// => Average response time improvement: -0.8s per messageMemory retrieval optimization is the third axis. The default sqlite-vss backend performs retrieval in 10–50ms for stores up to ~50,000 entries. Beyond that, retrieval degrades to 100–500ms. The fix is either migrating to Qdrant (which maintains retrieval speed to millions of entries) or applying aggressive retention policies to keep the sqlite store below 50,000 entries. Qdrant migration takes approximately 30 minutes and requires no changes to application code — only the memory backend configuration.
Key Takeaway: The three performance levers are token reduction (session history trimming, skill size reduction), tool caching (for frequently-repeated external calls), and memory backend scaling (sqlite-vss to Qdrant when store exceeds 50,000 entries) — measure before optimizing to find which lever has the highest return.
Why It Matters: A 30% reduction in tokens per call translates directly to 30% lower LLM cost and typically 20–25% lower response latency (since LLM latency scales with input token count). At production scale with hundreds of daily turns, these optimizations determine whether operating an OpenClaw deployment is economically sustainable. The measurement-first principle applies here more strongly than anywhere else in the framework: the actual distribution of token usage varies significantly between use cases, and optimization without measurement produces diminishing returns.
11. Contributing to OpenClaw
Contributing to OpenClaw requires understanding the two-language codebase: the TypeScript core (agent runtime, Gateway, Skills System, memory, CLI, provider adapters) and the Swift companion apps (macOS menu bar app, iOS app, A2UI rendering). Most contributors work exclusively in the TypeScript core; Swift is only necessary for companion app features.
The architecture of the TypeScript core follows a clear package structure that maps directly to the seven components.
# Clone and set up the development environment
git clone https://github.com/openclaw/openclaw.git
cd openclaw
# Install dependencies (Node.js 20+ required)
npm install # => Installs monorepo dependencies (pnpm workspace)
# => openclaw uses a pnpm monorepo with these packages:
# => packages/core — agent runtime, agentic loop, context assembly
# => packages/gateway — Gateway server, middleware system, routing
# => packages/skills — Skills System, injection algorithm, ClawHub client
# => packages/memory — vector store, embedding, knowledge base, retention
# => packages/channels — channel adapters (telegram, slack, discord, ...)
# => packages/llm — LLM provider interface and built-in adapters
# => packages/cli — openclaw CLI commands
# => packages/tools — built-in tool implementations
# => apps/companion-macos — Swift macOS app (separate Xcode project)
# => apps/companion-ios — Swift iOS app (separate Xcode project)
# Run the test suite
npm test # => Runs unit tests across all packages
# => packages/core: 214 tests passed
# => packages/gateway: 98 tests passed
# => packages/skills: 87 tests passed
# => ...
# Run a specific package's tests in watch mode during development
cd packages/skills && npm test -- --watch// Adding a new channel adapter — the most common contribution type
// packages/channels/src/adapters/matrix.ts (example: Matrix protocol adapter)
import { ChannelAdapter, ChannelMessage, AdapterConfig } from "@openclaw/core";
export class MatrixAdapter implements ChannelAdapter {
readonly channelId = "matrix"; // => Must be unique across all adapters
constructor(
private config: {
homeserver: string; // => e.g. "https://matrix.org"
accessToken: string;
userId: string; // => e.g. "@bot:matrix.org"
},
) {}
async connect(): Promise<void> {
// Initialize Matrix client, join configured rooms
this.client = createMatrixClient(this.config);
await this.client.startClient(); // => Begins sync loop with homeserver
// => Emits "Room.timeline" events for new messages
}
normalize(matrixEvent: MatrixTimelineEvent): ChannelMessage {
return {
id: matrixEvent.getId(), // => Matrix event ID string
channelId: "matrix",
userId: matrixEvent.getSender(), // => "@alice:matrix.org" format
displayName: matrixEvent.getSender().split(":")[0].replace("@", ""),
// => "@alice:matrix.org" → "alice"
text: matrixEvent.getContent().body ?? "", // => m.text event body
attachments: [], // => Matrix file attachments: left as future work
timestamp: new Date(matrixEvent.getTs()), // => Matrix timestamp in ms
sessionId: `matrix:${matrixEvent.getSender()}`, // => Standard sessionId format
metadata: { roomId: matrixEvent.getRoomId() }, // => Needed to send reply to correct room
};
}
async send(chatId: string, text: string): Promise<void> {
await this.client.sendTextMessage(chatId, text);
// => chatId is roomId from metadata.roomId
}
async disconnect(): Promise<void> {
await this.client.stopClient();
}
}
// Register in packages/channels/src/index.ts and add to default channel map
// Submit PR with: adapter + normalize tests + send tests + README update# PR workflow for the OpenClaw project
# Following the project's contributing guide (CONTRIBUTING.md in repo root)
# 1. Create feature branch
git checkout -b feat/matrix-channel-adapter
# 2. Implement the feature with tests
# ... (adapter code + tests)
# 3. Run quality gates
npm run typecheck # => TypeScript strict mode — no errors allowed
npm run lint # => ESLint with openclaw config
npm test # => All tests must pass
# 4. Update documentation
# => packages/channels/README.md: add Matrix to supported channels table
# => apps/companion-macos/README.md: if companion app changes needed
# 5. Submit PR
# => PR title format: "feat(channels): add Matrix protocol adapter"
# => PR body: describe the adapter, link to Matrix SDK used, include example config
# => Required reviewers: at least one @openclaw/core-team member
# => CI runs: typecheck, lint, test, integration tests against Matrix homeserver
# Check CI status
gh pr checks # => View CI check results for your PR
# => TypeScript: passed ✓
# => Lint: passed ✓
# => Unit tests: passed ✓
# => Integration tests: passed ✓ (runs against matrix.org test homeserver)The TypeScript core uses strict TypeScript configuration ("strict": true in tsconfig).
Contributions that introduce any types, suppress TypeScript errors with @ts-ignore,
or omit unit tests for new code paths are rejected in review. The project enforces 80%
line coverage across packages/core and packages/skills — the most critical packages
for agent reliability.
Key Takeaway: Contributing to OpenClaw is TypeScript-first with strict type checking
and coverage requirements — channel adapters are the most accessible contribution type,
requiring only the four-method ChannelAdapter interface and corresponding normalize/send
tests.
Why It Matters: OpenClaw's 247,000+ GitHub stars reflect a community that depends on the framework for production workflows. Every new channel adapter expands the framework's reach to users locked into specific messaging ecosystems; every memory backend adapter enables new deployment environments. Contributing well-tested, properly typed code to the framework compounds the investment of the entire user community — and the contribution process is designed to make that standard achievable by any competent TypeScript engineer.
Last updated May 20, 2026