
OpenClaw Performance Optimization: Real-World Methods to Cut Costs by 80%

Last month I got my Anthropic bill and nearly fell off my chair—$347 for what? I’d only used OpenClaw to write a few articles and do some code reviews. Worse still, OpenClaw had been getting slower and slower, sometimes taking over 20 seconds just to respond.

I stared at that bill for a while. Then I started digging through the logs.

Line by line, the rolling logs revealed the problem: every conversation was carrying the complete history. After 10 rounds of dialogue, the context had ballooned to 150K tokens. It’s like having to repeat every previous conversation word-for-word every time you say something new—no wonder the bill was so high.

I spent two weeks testing different approaches. From session resets to model switching, from caching strategies to context limits. Eventually, I got my monthly costs down from $347 to $68, and response times from 23 seconds to 4 seconds.

Honestly, I didn’t understand at first why OpenClaw was burning through so many tokens. Later I realized the performance issue wasn’t OpenClaw’s fault—I was just using it wrong. This article shares all the pitfalls I encountered and the solutions I found—from log analysis to specific optimization strategies, everything you can copy and paste directly into production.

Root Causes of Performance Issues—Why OpenClaw Slows Down

The 5 Token Consumption Killers

You might think high token consumption just means heavy usage, right? It’s not that simple.

Killer #1: Unlimited Conversation Context Accumulation

OpenClaw retains all conversation history by default. The first round might be just 5K tokens, but by the tenth round it’s grown to 150K. Every time you ask a question, OpenClaw resends all the previous content to the Claude API. It’s like having to recap last week’s and last month’s conversations before every chat with a friend—exhausting and wasteful.

I tested this: once the context had accumulated to 100K tokens, even a simple “check this code for issues” request was billed for the full 100K of input, despite the answer being only 200 words.
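
To make the math concrete, here is a small sketch of how billing compounds when every round resends the full history. The 5K starting size and ~16K-per-round growth are illustrative figures chosen to match the 150K-by-round-10 example above:

```shell
# Illustrative only: cumulative input tokens billed over 10 rounds when
# the whole history is resent each time (starts at 5K, grows ~16K/round).
awk 'BEGIN {
  ctx = 5000; total = 0
  for (r = 1; r <= 10; r++) {
    total += ctx          # the full context is billed as input every round
    ctx += 16000          # each round appends roughly 16K tokens of history
  }
  printf "context at round 10: %dK tokens\n", (ctx - 16000) / 1000
  printf "total input billed:  %dK tokens\n", total / 1000
}'
```

Ten short questions end up billed as 770K input tokens under these assumptions, not the 50K you might naively expect.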

Killer #2: Unlimited Tool Output Storage

OpenClaw can read files, check logs, run commands. The problem? Every tool output gets stored completely in the context. You ask it to check a 500-line log file? Those 500 lines stay in memory. Then check a config file? Add several hundred more lines.

I once had OpenClaw analyze a 10MB error log. That log snippet stayed in the context, and I paid for that 10MB in every subsequent conversation.
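
One cheap habit that avoids this: pre-filter big files yourself and hand OpenClaw only the slice that matters. A minimal sketch (the log content and paths here are fabricated for the demo; in real use you would grep your actual log file):

```shell
# Simulate a large log, then keep only the error lines.
seq 1 1000 | sed 's/^/INFO: routine line /' > /tmp/app.log
echo "ERROR: null pointer at line 127" >> /tmp/app.log

# Hand the agent this short slice, not the full 1001-line file.
grep -E "ERROR|FATAL" /tmp/app.log | tail -n 30
```

A 10MB log usually compresses to a few dozen relevant lines this way.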

Killer #3: System Prompts Resent Every Time

OpenClaw’s system prompt (defining its capabilities and behavior) is typically 5K-10K tokens. This content gets sent every single round. If you have 50 conversations with OpenClaw in a day, the system prompt alone consumes 250K-500K tokens.

Killer #4: Wrong Model Selection

Claude has different tier models: Haiku (cheap and fast), Sonnet (balanced), Opus (powerful and expensive). The price difference is roughly 15x.

Many people default to Opus for convenience, using the most expensive model even for simple format conversions and information queries. It’s like driving a tank to the corner store for a bottle of water—it’ll work, but why?

Killer #5: Poorly Configured Heartbeat Mechanism

OpenClaw has a heartbeat mechanism that periodically pings the API to maintain the connection. Some people set the interval to 30 seconds, resulting in 120 API calls per hour. Each call is small, but they add up fast.

I saw one case where someone’s heartbeat interval was too short—they burned through 3,600 heartbeat calls in a month without doing any actual work, spending a fortune for nothing.
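
You can sanity-check your own interval with a one-liner; the 30-second figure below is the misconfiguration from the example above:

```shell
# How many API calls a heartbeat alone generates at a given interval (seconds).
awk -v interval=30 'BEGIN {
  per_hour = 3600 / interval
  printf "%d calls/hour, %d calls/day\n", per_hour, per_hour * 24
}'
```

At a saner 5-minute interval, the same math gives just 12 calls per hour.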

The Memory Killer Truth

Sometimes OpenClaw is slow not because of network issues, but because it’s running out of memory.

Many people think OpenClaw is just a chat tool—2GB of memory should be enough, right? Actually run it and you’ll find out that’s not even close.

OpenClaw is memory-intensive. It needs to run Node.js processes, maintain WebSocket connections, store session memory, and render the Web UI. Add it all up and basic operation consumes around 1.5GB. If you allocate 2GB, it’s like making someone run a marathon carrying a 50kg backpack—theoretically possible, but they could collapse at any moment.

My experience:

  • 2GB memory: Can start, but lags after a while and crashes frequently
  • 4GB memory: Usable for normal personal development
  • 8GB memory: Smooth and stable, suitable for teams or high-frequency use
  • 16GB memory: Production standard, long-term operation without pressure

There’s also the memory leak issue. After running for a long time, OpenClaw’s memory usage gradually grows. You’ll notice it starts at 1.8GB, after a week it’s 2.5GB, a few more days and it’s 4GB+. The system starts swapping constantly (using disk as memory), and response speed plummets.

I experienced this: my VPS had 4GB of memory and initially worked fine. Two weeks later OpenClaw suddenly became really slow. I checked docker stats—memory usage had spiked to 3.8GB with less than 200MB remaining. Restart the container and it’s immediately back to normal—memory back to 1.8GB.

So slow responses aren’t necessarily OpenClaw’s fault—your VPS configuration might just be insufficient.

Log Analysis and Monitoring—Making OpenClaw Transparent

The Right Way to View Docker Logs

Here’s the question: how do you know what OpenClaw is actually doing?

Simple answer: check the logs. But many people don’t know how, or can’t understand them.

View Real-Time Logs

docker logs -f openclaw-gateway

This command continuously outputs OpenClaw’s logs like watching a dashboard. You send a message and can see how OpenClaw processes it, which tools it calls, how many tokens it consumes.

View Recent Errors

docker logs openclaw-gateway 2>&1 | tail -50

This shows the last 50 lines of logs. I usually use this for quick problem diagnosis—container won’t start? Check the last 50 lines first, 90% of the time the answer is right there.

Key Error Identifiers

Some errors are particularly common in the logs. Remember these keywords to quickly diagnose issues:

  • WebSocket Error 1008: Auth failure, need to clear browser cache
  • No API key found for anthropic: API key configuration problem
  • Error: Address already in use: Port is occupied
  • FATAL ERROR: Reached heap limit: Out of memory, time to upgrade

I once encountered OpenClaw suddenly refusing to connect, logs kept showing WebSocket Error 1008. After digging through documentation, I learned this was because I’d adjusted the system time, invalidating the auth token. Delete the browser’s localStorage and problem solved.

openclaw-telemetry Monitoring Tool

If you want more professional OpenClaw monitoring, try the openclaw-telemetry tool.

It records all commands, prompts, and tool calls executed by OpenClaw. Data is collected via syslog and can be forwarded to SIEM systems for security auditing. Most importantly, it automatically redacts sensitive information and protects log integrity with tamper-proof hash chains.

Honestly, it might be overkill for personal users. But if you’re using OpenClaw in a corporate environment or need audit records (like processing customer data with OpenClaw), this tool becomes valuable.

Installation and configuration aren’t complicated—there’s detailed documentation on GitHub. I haven’t used it myself (don’t need it for personal dev), but I’ve seen teams using it with good feedback.

Resource Monitoring Commands

Want to know how much resources OpenClaw is using? One command does it:

docker stats openclaw-gateway

This command shows real-time container resource usage:

CONTAINER ID   NAME               CPU %   MEM USAGE / LIMIT   MEM %   NET I/O
a1b2c3d4e5f6   openclaw-gateway   12.5%   1.85GB / 4GB       46.25%  15.2MB / 8.3MB

Focus on these metrics:

  • CPU %: Occasional spikes to 80-100% are fine (OpenClaw is processing a task), but sustained high usage is a problem
  • MEM USAGE: Watch the absolute number for steady growth across days; that pattern points to a leak
  • MEM %: The most critical metric. Be alert above 80%; above 90% means crash risk, so restart or upgrade

My experience: check this command regularly. If you see memory usage continuously growing (like 1.8GB yesterday, 2.5GB today, 3.2GB tomorrow), that’s a sign of memory leak—either restart the container or reset the session.

Once I noticed memory usage had reached 3.6GB (out of 4GB total), but OpenClaw still worked. I figured I’d let it run a bit longer. Next morning, the container was dead. Logs full of Out of memory errors. Since then I’ve made it a habit to proactively restart when memory exceeds 3GB.

7 Performance Optimization Strategies—40% to 80% Cost Savings

Strategy #1: Regular Session Resets (Save 40-60%)

This is the simplest and most effective method.

Why does it work?

Each time you reset a session, OpenClaw clears accumulated context. All previous conversation history, tool outputs, intermediate results—everything zeroes out. The next conversation starts fresh, without carrying dozens or hundreds of K tokens of historical baggage.

How to do it?

Three methods, pick whichever is convenient:

# Method 1: Command line reset
openclaw "reset session"

# Method 2: Directly delete session files
rm -rf ~/.openclaw/agents.main/sessions/*.jsonl

# Method 3: Use built-in command
# In OpenClaw chat box, type
/compact

Best Practice

My habit: reset after completing each independent task. Finished writing an article? Reset. Finished reviewing a PR? Reset. Finished debugging an issue? Reset.

The benefit is you’re always using a “lightweight” OpenClaw—fast response, low cost.

Some might worry: won’t resetting lose context? Yes, but most of the time, the previous task’s context is useless for the next one. Does the research OpenClaw did while you were writing a blog post help you debug code next? No. So why let it keep occupying memory and consuming tokens?

Actual data: I was burning $347 per month, started regularly resetting sessions, and the next month it dropped to $195. Just this one action saved over 40%.

Strategy #2: Isolate Large Output Operations (Save 20-30%)

Some operations generate massive output—viewing complete logs, exporting config files, analyzing large datasets. Once these outputs enter the main session, they stick like gum, and you pay for them in every subsequent conversation.

Solution: Use Independent Sessions

# View large config in an independent debug session
openclaw --session debug "show full system config"

# Copy the key info you need
# Then return to main session to continue work

It’s like sorting your trash—large items go separately, don’t stuff them in the regular bin.

Real Scenario

I once needed OpenClaw to analyze a 300-line error log. If I pasted it directly in the main session, those 300 lines would permanently occupy context. My approach:

  1. Open temporary session with --session analyze-log
  2. Paste log, let OpenClaw analyze
  3. It gives conclusion: line 127 has a null pointer exception
  4. I copy this conclusion to main session
  5. Close debug session, the 300 lines don’t pollute main session

This strategy is a bit more cumbersome, but it really saves money. Especially when you frequently process large files and long logs.

Strategy #3: Smart Model Switching (Save 50-80%)

This strategy has the most obvious effect, but many people don’t realize it’s possible.

Claude has three main models:

  • Haiku: Cheap and fast, suitable for simple tasks
  • Sonnet: Balanced performance and cost
  • Opus: Most powerful but most expensive, 15x the price of Haiku

The problem? Many people default to Opus for everything. It’s like taking a plane everywhere—fly to the supermarket, fly to work. It works, but why?

Task Grading Principles

My classification method:

Use Haiku for:

  • Format conversion (JSON to YAML, Markdown to HTML)
  • Information queries (“What language is this code?”)
  • Simple Q&A (“What does this error mean?”)
  • Text extraction (“List all function names in this code”)

Use Sonnet for:

  • Code reviews (checking logic bugs, performance issues)
  • Content creation (writing articles, documentation)
  • Technical analysis (analyzing architecture, evaluating solutions)

Use Opus for:

  • Architecture design (designing entire systems)
  • Complex refactoring (large-scale code transformation)
  • Critical decisions (technology selection, risk assessment)

Configuration Example

{
  "defaultModel": "claude-3-haiku",
  "complexTaskModel": "claude-3-5-sonnet",
  "triggerKeywords": ["analyze", "refactor", "architecture", "design"]
}

My approach is to set the default model to Haiku, only manually switching for complex tasks. This way, 80% of operations use the cheap model, drastically reducing costs.

Actual results: combined with Strategy #1 (session resets), my monthly cost dropped from $195 to $68. Model switching alone saved nearly 65%.

Strategy #4: Cache Optimization (Save 30-50%)

Claude API has a caching mechanism: if you send the same or similar prompts consecutively, the API caches results and subsequent requests are much cheaper.

How to leverage caching?

  1. Enable prompt caching (most OpenClaw versions enable this by default)
  2. Lower temperature: Set to around 0.2 for more stable output, easier to hit cache
  3. Configure heartbeat to keep cache warm: But not too frequently (recommended 5-10 minutes)
{
  "temperature": 0.2,
  "enablePromptCaching": true,
  "heartbeatInterval": 300000  // 5 minutes, in milliseconds
}
  4. Use relay services that support caching: Some third-party API relays do additional cache optimization

Actual Results

The biggest benefit of caching is that system prompts (5K-10K tokens) only get charged once during the cache validity period. If you make 10 requests in an hour, you’d normally pay for the system prompt 10 times, now only once.

However, this optimization’s effectiveness varies by person. If you don’t use OpenClaw frequently (just a few times a day), caches expire often and the effect is minimal. If you’re a high-frequency user (dozens of conversations daily), caching can save you 30-50%.
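
A rough back-of-the-envelope check, assuming an 8K-token system prompt and 10 requests inside one cache window (simplified: in practice cached reads are typically discounted rather than completely free):

```shell
# Prompt tokens billed with vs. without caching, under the assumptions above.
awk 'BEGIN {
  prompt = 8000; requests = 10
  printf "without caching: %dK prompt tokens billed\n", prompt * requests / 1000
  printf "with caching:    %dK prompt tokens billed\n", prompt / 1000
}'
```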

Strategy #5: Limit Context Window (Save 20-40%)

OpenClaw’s default context window supports 400K tokens. Sounds huge, right? The problem is, the bigger the window, the easier it is to unconsciously fill it up.

It’s like giving you a big backpack—you’ll unconsciously pack more stuff. Give you a small bag and you’ll naturally be more selective.

Configuration Method

{
  "maxContextTokens": 100000  // Limit from 400K to 100K
}

Why does it work?

After limiting the context window, OpenClaw forces you to clean up context more frequently. When context is nearly full, it prompts you to reset or summarize. This way you won’t keep dragging around dozens of rounds of conversation history.

Plus, 100K is enough for most tasks. Unless you’re doing massive refactoring or analyzing thousands of lines of code, you really don’t need 400K.

Considerations

Limiting too much has side effects: if your task genuinely needs large context (like analyzing entire project architecture), too small a window will make OpenClaw “forget” and lose the big picture.

So my recommendation:

  • Daily use: 50K-100K
  • Complex tasks: 200K
  • Massive projects: Keep default 400K

For me, 100K is the sweet spot—saves money without affecting user experience.

Strategy #6: Use Local Models (Save 60-80%)

If you’re willing to tinker, this method can make certain tasks cost zero.

Basic Idea

Configure local models via Ollama (like Llama, Mistral), have OpenClaw handle simple tasks with the local model. This way you don’t call Claude API, so naturally you don’t pay.

Applicable Scenarios

  • Format conversion (JSON, YAML, Markdown interconversion)
  • Simple queries (“List all TODO comments”)
  • Information extraction (extracting specific info from text)

Not Applicable

  • Code reviews (local model quality inferior to Claude)
  • Creative content (writing articles/docs, Claude is more reliable)
  • Complex reasoning (architecture design, technical analysis)

Configuration Example

# Install Ollama and pull model
ollama pull llama3.2

# Configure OpenClaw to use local model for simple tasks
{
  "localModel": "llama3.2",
  "localModelTasks": ["format", "extract", "simple-query"]
}

Real Experience

Honestly, configuring local models is pretty tedious, and output quality really isn’t as good as Claude. I tried using local models for format conversion—2-3 out of 10 times had errors requiring manual correction.

But if you’re really on a tight budget or have lots of repetitive simple tasks, local models are worth trying. At least you can cut those costs by 60-80%.

Strategy #7: Disable Unnecessary Skills and Tools

OpenClaw supports various Skills—browser automation, file operations, code execution, etc. The problem is, each enabled Skill occupies context (sending tool usage instructions to the API), especially noticeable with smaller models.

Check Currently Enabled Skills

Look at your OpenClaw config—have you enabled a bunch of tools you never use? Browser automation? When was the last time you used it? Gmail integration? Do you really need OpenClaw sending emails?

Optimization Strategy

Only enable tools you actually use. My config only keeps:

  • File read/write (essential)
  • Git operations (frequent use)
  • Bash command execution (needed for debugging)

As for browser automation, schedule management, email integration—all disabled. This reduces system prompt by several thousand tokens per request.

Configuration Example

{
  "enabledSkills": [
    "file-operations",
    "git",
    "bash"
  ],
  "disabledSkills": [
    "browser-automation",
    "gmail",
    "calendar"
  ]
}

This optimization alone doesn’t have huge impact (maybe 10-15% savings), but combined with other strategies, it adds up.

Common Issues Quick Reference—Solve 90% of Problems in 5 Minutes

Don’t panic when you encounter issues. This quick reference can help you quickly diagnose and resolve most common problems.

| Symptom | Likely Cause | 5-Second Diagnosis | Quick Fix |
| --- | --- | --- | --- |
| WebSocket Error 1008 | Auth data expired | Console error message | Clear browser localStorage (F12 → Application → Local Storage → delete openclaw-auth-token) |
| Container stops immediately after start | API key not configured / port conflict | docker compose ps shows Exited | 1. Check docker compose logs 2. Verify environment variables 3. Check port usage |
| No API key found | API key misconfigured | Explicit log error | Check the API key in the config file; ensure the environment variable is correctly passed to the container |
| Memory usage continuously growing | Memory leak / session accumulation | Check MEM % with docker stats | Short-term: restart container. Mid-term: reset session. Long-term: upgrade VPS memory to at least 4GB |
| Skills installation timeout | Go dependencies downloading | Occurs on first install | Click “Install” again; cached dependencies will complete quickly |
| Browser automation timeout | Page loads slowly / element ID changed | snapshot or click command timeout | 1. Increase --timeout-ms 2. Rerun snapshot --labels 3. Check Chromium path |
| control ui requires HTTPS | HTTP access restriction | Cannot open Web UI | Add token to URL: http://YOUR-IP:18789/?token=YOUR_TOKEN |

Several Practical Tips:

Tip #1: Ultimate WebSocket Solution

If clearing localStorage doesn’t work, try this:

# Disable device pairing in Docker environment (internal network only)
docker run -e OPENCLAW_DISABLE_DEVICE_PAIRING=true openclaw/openclaw

Tip #2: Quickly Determine Memory vs Network Issue

# Monitor memory and API latency side by side
watch -n 1 'docker stats openclaw-gateway --no-stream; curl -s -o /dev/null -w "API latency: %{time_total}s\n" https://api.anthropic.com'

If memory is stable but response is slow, it’s likely network; if memory spikes, it’s resource shortage.

Tip #3: Too Many Logs, Can’t Find Key Info?

# Only show errors and warnings
docker logs openclaw-gateway 2>&1 | grep -E "ERROR|WARN|FATAL"

I was debugging an issue once with thousands of log lines. Used this filter command and immediately found the critical ERROR—turned out the API key had expired.

Advanced Tuning—Supercharging OpenClaw

If you’ve completed the basic optimizations and want to squeeze out more performance, this section is for you.

Advanced Context Management

Context Triangulation

Don’t feed entire files to OpenClaw, only inject task-relevant snippets. For example, if you’re modifying a function, only provide:

  • That function’s code
  • Signatures of functions it calls
  • Related type definitions

This can reduce context by 70-80%.
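
A crude way to do that extraction before pasting anything into a session. This sketch assumes Python-style top-level `def` blocks; the file and function names are examples:

```shell
# Build a demo file, then print only target() by starting at its def
# and stopping at the next top-level def.
cat > /tmp/demo.py <<'EOF'
def helper():
    return 1

def target():
    return 2

def other():
    return 3
EOF

awk '/^def target\(/ {show=1; print; next}
     show && /^def / {exit}
     show' /tmp/demo.py
```

For real codebases a language-aware tool (ctags, an LSP, or your editor) is more robust, but even this level of filtering keeps whole files out of the context.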

Tiered Global Anchor Architecture (TGAA)

Maintain an ARCHITECTURE.md in project root documenting high-level system design. When you need global perspective, have OpenClaw read this file instead of scanning the entire codebase.

My approach is to put a README.md in each module directory briefly explaining what the module does. When OpenClaw needs to understand project structure, reading these READMEs is enough—no need to load all source code.

Dynamic Tool Loading

Don’t load all tool definitions at the start. Inject on demand—load file tools only when you need file operations, Git tools only when you need Git.

This requires modifying OpenClaw config, has a slight technical barrier, but can reduce system prompt overhead by about 30%.
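
OpenClaw doesn’t ship a documented switch for this, so treat the following purely as a hypothetical shape such a config could take (both keys are invented for illustration):

```json
{
  "toolLoading": "on-demand",
  "preloadSkills": ["file-operations"]
}
```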

Pre-Compressed Memory Refresh

Before session summary or reset, have OpenClaw write important info to MEMORY.md. After reset, the new session only needs to read this streamlined memory file to continue previous context.

# Save key info before summarizing
openclaw "Write key decisions and todos to MEMORY.md"

# Then reset session
openclaw "reset session"

# Read when new session starts
openclaw "Read MEMORY.md and continue previous work"

Docker Security Hardening

In January 2026, security company Bitdefender disclosed CVE-2026-25253, finding hundreds of OpenClaw instances had leaked API keys and sensitive data due to misconfiguration. While the vulnerability is fixed, it reminds us: security configuration matters.

Run as Non-Root

services:
  openclaw-gateway:
    user: "1000:1000"  # Use unprivileged user

Principle of Least Privilege

docker run \
  --cap-drop=ALL \
  --security-opt=no-new-privileges \
  --read-only \
  --tmpfs /tmp \
  openclaw/openclaw

Network Isolation

If you don’t need OpenClaw accessing external networks (like only using it to process local files), completely isolate network:

services:
  openclaw-gateway:
    network_mode: none

Need network but want to limit access scope? Configure whitelist proxy.
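
One way that could look in Compose; a minimal sketch assuming a Squid-based egress proxy (service and network names are examples, and the allowlist itself lives in the proxy’s own config):

```yaml
# Sketch: openclaw-gateway sits on an internal-only network and reaches
# the internet solely through egress-proxy, whose allowlist you control.
services:
  openclaw-gateway:
    networks: [internal]
    environment:
      - HTTPS_PROXY=http://egress-proxy:3128
  egress-proxy:
    image: ubuntu/squid
    networks:
      - internal
      - default

networks:
  internal:
    internal: true
```

Whether the gateway honors HTTPS_PROXY depends on its HTTP client, so verify that outbound traffic actually goes through the proxy before relying on this.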

Environment Variable Protection

Don’t write API key in plain text in docker-compose.yml, use environment variable files:

services:
  openclaw-gateway:
    env_file:
      - .env  # File contents: ANTHROPIC_API_KEY=sk-xxx

Remember to add .env to .gitignore so you don’t accidentally commit it to GitHub.

Production Environment Best Practices

Log Rotation to Prevent Disk Full

services:
  openclaw-gateway:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

This way logs max out at 30MB, won’t grow infinitely.

Regular Updates

OpenClaw updates frequently with performance improvements and security fixes. Check version monthly:

docker pull openclaw/openclaw:latest
docker compose up -d

Configure Monitoring and Alerts

Use simple scripts to monitor resources:

#!/bin/bash
MEM_PERCENT=$(docker stats openclaw-gateway --no-stream --format "{{.MemPerc}}" | sed 's/%//')
if (( $(echo "$MEM_PERCENT > 85" | bc -l) )); then
    # Send alert (email, webhook, etc.)
    echo "OpenClaw memory usage exceeds 85%!" | mail -s "OpenClaw Alert" your@email.com
fi

Put it in cron to run hourly.

Use VPN or Tailscale Instead of Public Exposure

If you’re remotely accessing OpenClaw, don’t expose ports directly to the public internet. Use Tailscale to establish private network—secure and convenient.

# Install Tailscale
curl -fsSL https://tailscale.com/install.sh | sh

# Start and authenticate
sudo tailscale up

# Now access OpenClaw via Tailscale network, no public IP needed

This is exactly how I configure it. At coffee shops, airports, I can still securely access my home OpenClaw instance.

Conclusion

From $347 to $68, from 23-second response to 4 seconds—these numbers aren’t made up, they’re real optimization results.

Looking back, OpenClaw’s performance issues really aren’t complicated. It’s just three things:

  1. Control context: Don’t let it accumulate infinitely
  2. Choose the right model: Use cheap ones for simple tasks
  3. Monitor resources: Detect memory shortage promptly

You don’t need to use all 7 strategies. My recommendation:

  • Execute immediately: Check current memory usage (docker stats), if under 4GB upgrade ASAP
  • Complete this week: Configure smart model switching (default Haiku) + enable caching (temperature 0.2)
  • Build the habit: Reset session after completing each task (/compact)

Do these three steps and you can cut costs by at least 50%.

One last thing: OpenClaw is a great tool, but it’s just a tool. How useful a tool is largely depends on how you use it. Spend time understanding how it works, configure reasonable resources, build good usage habits—these investments will pay off.

Don’t panic when you encounter issues. Check the logs, consult this article’s quick reference guide, and 90% of problems can be solved in 5 minutes. If you really can’t figure it out, OpenClaw’s GitHub Issues has many helpful community members.

May your OpenClaw run fast and cheap.

Complete OpenClaw Performance Optimization Process

Complete operational steps from memory check to cost optimization

⏱️ Estimated time: 2 hr

  1. Step 1: Diagnose Current Performance Bottlenecks

    Use Docker monitoring commands to diagnose resource usage:

    • docker stats openclaw-gateway (view real-time resource usage)
    • docker logs -f openclaw-gateway (view real-time logs, observe token consumption)
    • docker logs openclaw-gateway 2>&1 | grep -E "ERROR|WARN|FATAL" (filter error messages)

    Key metrics to check:
    • Memory usage > 80%: Immediately upgrade VPS to at least 4GB
    • CPU sustained > 90%: Check for infinite loops or abnormal processes
    • Logs show "Reached heap limit": Out of memory, need upgrade

    Diagnostic tip: If memory is stable but response slow, likely network issue; if memory spikes, resource shortage.
  2. Step 2: Configure Smart Model Switching (Most Significant Effect)

    Modify OpenClaw config file to implement task grading:

    Configuration example (JSON format):
    {
      "defaultModel": "claude-3-haiku",
      "complexTaskModel": "claude-3-5-sonnet",
      "triggerKeywords": ["analyze", "refactor", "architecture", "design"]
    }

    Task grading principles:
    • Haiku scenarios: format conversion, information queries, simple Q&A, text extraction
    • Sonnet scenarios: code reviews, content creation, technical analysis
    • Opus scenarios: architecture design, complex refactoring, critical decisions

    Measured results: with Haiku as the default, 80% of operations run on the cheap model; combined with other strategies this can save 50-80%.
  3. Step 3: Enable Cache Optimization and Context Limits

    Configure caching mechanism and context window limits:

    Cache configuration (JSON format):
    {
      "temperature": 0.2,
      "enablePromptCaching": true,
      "heartbeatInterval": 300000,
      "maxContextTokens": 100000
    }

    Parameter explanations:
    • temperature: 0.2 (reduce randomness, improve cache hit rate)
    • enablePromptCaching: true (enable prompt caching)
    • heartbeatInterval: 300000 (5-minute heartbeat, in milliseconds)
    • maxContextTokens: 100000 (limit context window, from 400K to 100K)

    Optimization effects:
    • Cache optimization can save 30-50% (high-frequency usage scenarios)
    • Context limits can save 20-40% (force periodic cleanup)
  4. Step 4: Disable Unnecessary Skills

    Check and disable infrequently used Skills and tools:

    Configuration example (JSON format):
    {
      "enabledSkills": ["file-operations", "git", "bash"],
      "disabledSkills": ["browser-automation", "gmail", "calendar"]
    }

    Optimization strategy:
    • Keep only essential tools: file read/write (essential), Git operations (frequent), Bash commands (debugging)
    • Disable niche tools: browser automation, email integration, schedule management

    Actual effect: Each disabled Skill reduces thousands of tokens in system prompts, cumulatively can save 10-15%.
  5. Step 5: Establish Regular Session Reset Habit

    Develop habit of resetting session after task completion:

    Three reset methods:
    • Method 1: openclaw "reset session" (command line)
    • Method 2: rm -rf ~/.openclaw/agents.main/sessions/*.jsonl (direct deletion)
    • Method 3: Type /compact in chat box (built-in command)

    Best practices:
    • Finished writing article → reset
    • Finished reviewing PR → reset
    • Finished debugging → reset
    • Completed independent task → reset

    Advanced technique (pre-compressed memory refresh):
    1. openclaw "Write key decisions and todos to MEMORY.md"
    2. openclaw "reset session"
    3. openclaw "Read MEMORY.md and continue work"

    Measured results: regular resets can save 40-60%; this is the simplest and most effective optimization.
  6. Step 6: Isolate Large Output Operations

    Use independent sessions to handle large files and long logs:

    Operation process:
    1. openclaw --session debug "show full system config" (view in independent session)
    2. Copy needed key information
    3. Return to main session to continue work
    4. Close debug session, large output won't pollute main session

    Applicable scenarios:
    • View complete logs (hundreds of lines)
    • Export config files (large amounts of content)
    • Analyze large datasets (over 100K tokens)

    Real case: When analyzing 300-line error log, use temporary session to get conclusion (like "line 127 null pointer exception"), then handle in main session, avoiding 300 lines permanently occupying context.

    Savings effect: 20-30% (scenarios frequently handling large files).
  7. Step 7: Configure Monitoring and Alerts

    Set up automated monitoring scripts to prevent performance issues:

    Monitoring script (Bash):
    #!/bin/bash
    MEM_PERCENT=$(docker stats openclaw-gateway --no-stream --format "{{.MemPerc}}" | sed 's/%//')
    if (( $(echo "$MEM_PERCENT > 85" | bc -l) )); then
        echo "OpenClaw memory usage exceeds 85%!" | mail -s "OpenClaw Alert" your@email.com
    fi

    Deployment method:
    • Save as /usr/local/bin/openclaw-monitor.sh
    • chmod +x /usr/local/bin/openclaw-monitor.sh
    • Add to crontab: 0 * * * * /usr/local/bin/openclaw-monitor.sh

    Alert thresholds:
    • Memory > 85%: Send alert
    • Memory > 90%: Immediately restart container
    • Disk logs > 100MB: Configure log rotation

    Log rotation config (docker-compose.yml):
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

FAQ

Why does OpenClaw consume so many tokens, with monthly costs reaching hundreds of dollars?
There are 5 main reasons for excessive token consumption:

• Unlimited conversation context accumulation: 150K tokens after 10 rounds, every request carries full history
• Unlimited tool output storage: Log viewing and file reading outputs permanently saved in context
• System prompts resent every time: 5K-10K tokens of system prompts sent every round
• Wrong model selection: Using Opus for simple tasks, 15x the price of Haiku
• Poorly configured heartbeat mechanism: Interval too short causes hundreds of useless API calls per hour

Solution: Regular session resets (/compact) + smart model switching (default Haiku) + cache optimization can save 50-80% costs.
OpenClaw responses are slow, with waits of 20+ seconds. How do I optimize?
Slow response is usually caused by insufficient memory, not network issues:

Diagnosis method:
• Run docker stats openclaw-gateway to check memory usage
• If memory usage > 80%, configuration is insufficient
• If continuously growing (1.8GB → 2.5GB → 3.2GB), memory leak exists

Memory configuration recommendations:
• 2GB: Can start but will lag and crash frequently (not recommended)
• 4GB: Minimum configuration for personal daily development
• 8GB: Recommended for teams or high-frequency use
• 16GB: Production standard

Immediate fixes:
• Short-term: docker restart openclaw-gateway (restart container)
• Mid-term: /compact (reset session to free memory)
• Long-term: Upgrade VPS memory to at least 4GB

Complementary strategy: Limit context window to 100K tokens, force periodic cleanup.
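The leak pattern above (memory samples that only ever grow) can be checked mechanically. A sketch that takes samples in MB, e.g. collected from periodic `docker stats` runs; the sampling itself is left out and the function name is made up:

```shell
# detect_leak: report "leak" if every successive memory sample is
# strictly larger than the previous one, "no-leak" otherwise.
detect_leak() {
    local prev=$1; shift
    for cur in "$@"; do
        if [ "$cur" -le "$prev" ]; then
            echo "no-leak"     # memory went down at least once
            return
        fi
        prev=$cur
    done
    echo "leak"                # monotonic growth across all samples
}

detect_leak 1800 2500 3200   # the 1.8GB → 2.5GB → 3.2GB case
detect_leak 1800 1700 1900   # memory dropped in between
```

Three monotonic samples are only a hint, not proof; take readings over a longer window before concluding there is a leak.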
How do I decide between the Haiku, Sonnet, and Opus models?
Choose a model by task complexity; the price difference is up to 15x:

Haiku scenarios (cheap and fast):
• Format conversion: JSON to YAML, Markdown to HTML
• Information queries: "What language is this code?"
• Simple Q&A: "What does this error mean?"
• Text extraction: "List all function names"

Sonnet scenarios (balanced performance):
• Code reviews: Check logic bugs, performance issues
• Content creation: Write articles, documentation
• Technical analysis: Analyze architecture, evaluate solutions

Opus scenarios (powerful and expensive):
• Architecture design: Design entire system
• Complex refactoring: Large-scale code transformation
• Critical decisions: Technology selection, risk assessment

Practical config: set the default model to Haiku so ~80% of operations use the cheap model; combined with session resets this can save 65% of costs.
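One way to enforce the "default Haiku" rule is a small dispatch table keyed on a coarse task keyword. A sketch where both the keywords and the model names are placeholders for whatever your own config uses, not an OpenClaw API:

```shell
# pick_model: choose a model tier from a coarse task keyword.
# Keywords and model names are illustrative placeholders.
pick_model() {
    case "$1" in
        review|write|analyze)      echo "sonnet" ;;  # balanced tasks
        architect|refactor|decide) echo "opus"   ;;  # heavy tasks
        *)                         echo "haiku"  ;;  # cheap default
    esac
}

pick_model convert    # haiku (falls through to the default)
pick_model review     # sonnet
pick_model architect  # opus
```

The key design choice is the fall-through default: anything you did not explicitly classify as expensive runs on the cheap model, which is what pushes 80% of operations onto Haiku.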
How do I fix WebSocket Error 1008?
This common error is caused by expired auth data; there are usually 3 solutions:

Solution 1: Clear browser localStorage (most common)
1. Press F12 to open developer tools
2. Application → Local Storage
3. Delete openclaw-auth-token
4. Refresh page and re-login

Solution 2: Disable device pairing (for internal-network use in Docker environments)
docker run -e OPENCLAW_DISABLE_DEVICE_PAIRING=true openclaw/openclaw

Solution 3: Check system time
• If system time was adjusted, auth token becomes invalid
• Correct system time then clear localStorage and re-authenticate

Other related errors:
• No API key found: Check environment variable configuration
• Address already in use: Port occupied, change port or kill occupying process
• FATAL ERROR: Reached heap limit: Out of memory, upgrade configuration
Do regular session resets lose context? How do I preserve important information?
Regular resets do clear context, but you can use the "pre-compressed memory refresh" technique to preserve key info:

Operation process:
1. openclaw "Write key decisions and todos to MEMORY.md"
2. openclaw "reset session"
3. openclaw "Read MEMORY.md and continue work"

When context preservation needed:
• Multi-day large projects
• Architecture decisions to remember
• Todo lists

When preservation not needed:
• Blog writing research materials (useless for later debugging)
• Temporary log query outputs
• Format conversion and other one-time tasks

Best practices:
• Reset after completing each independent task (writing articles, reviewing PRs, debugging)
• Most of the time previous task's context is useless for next task
• Keep OpenClaw "lightweight" at all times: fast responses and low cost

Measured data: Regular resets dropped from $347 to $195, saving 40%+.
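The three-step refresh above can be wrapped in one helper. A sketch that defaults to a dry run (it only echoes the commands, since I cannot assume the CLI is installed); point the hypothetical `OPENCLAW` variable at the real binary to execute them:

```shell
# refresh_memory: the "pre-compressed memory refresh" flow as one call.
# OPENCLAW defaults to a dry-run echo so the sketch is safe to run as-is;
# set OPENCLAW=openclaw to run the real commands.
OPENCLAW=${OPENCLAW:-"echo openclaw"}

refresh_memory() {
    $OPENCLAW "Write key decisions and todos to MEMORY.md"
    $OPENCLAW "reset session"
    $OPENCLAW "Read MEMORY.md and continue work"
}

refresh_memory
```

Running the steps as one command removes the main failure mode of manual resets: forgetting to write MEMORY.md before clearing the session.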
How effective is cache optimization, and which usage scenarios suit it?
Cache optimization effectiveness varies with usage frequency; high-frequency users benefit most:

Configuration method:
{
  "temperature": 0.2,
  "enablePromptCaching": true,
  "heartbeatInterval": 300000
}

How it works:
• System prompts (5K-10K tokens) are only charged once during the cache validity period
• With 10 requests in one hour, you originally paid 10 times; now you pay once
• Lowering temperature to 0.2 improves the cache hit rate

Applicable scenarios:
• High-frequency use: Dozens of conversations daily, can save 30-50%
• Continuous work: Multiple requests for similar tasks in short time

Not applicable:
• Low-frequency use: Just a few uses per day, cache expires often
• Random tasks: Prompts differ greatly each time, low cache hit rate

Additional optimization: configure a heartbeat interval of 5-10 minutes to keep the cache warm, but not too frequently (a 30-second interval actually increases costs).
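The saving is easy to estimate under the article's simplified model, where the system prompt is billed once per cache window. A sketch assuming an 8K-token system prompt and 10 requests per hour (both assumed numbers); real pricing also bills cache reads, so treat this as an upper bound:

```shell
# Simplified prompt-caching arithmetic (upper bound on the saving;
# real billing adds a small charge per cache read).
prompt=8000      # assumed system-prompt size in tokens
requests=10      # requests within one cache-validity window
uncached=$((prompt * requests))
cached=$prompt   # billed once per window under the simplified model
echo "prompt tokens without caching: $uncached"   # 80000
echo "prompt tokens with caching:    $cached"     # 8000
```

This is also why low-frequency use barely benefits: with one request per window, `uncached` and `cached` are the same number.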
The container stops immediately after starting. How do I diagnose it quickly?
Container startup failure is usually a configuration error; follow this troubleshooting process:

Step 1: Check container status
docker compose ps
# If shows Exited, startup failed

Step 2: View last 50 log lines to locate error
docker logs openclaw-gateway 2>&1 | tail -50

Step 3: Fix based on key error messages
• "No API key found": Check ANTHROPIC_API_KEY environment variable in docker-compose.yml
• "Address already in use": Port occupied, change port or kill occupying process
• "Permission denied": Check file permissions or use non-root user

Step 4: Verify environment variable passing
docker exec openclaw-gateway env | grep ANTHROPIC
# Confirm API key correctly passed to container

Step 5: Check port occupation
netstat -tuln | grep 18789
# If port occupied, modify port mapping in docker-compose.yml

View complete logs:
docker compose logs -f
# Real-time view of all service logs, convenient for discovering dependency issues

18 min read · Published on: Feb 5, 2026 · Modified on: Feb 5, 2026
