OpenClaw Gateway Core: What Your AI Butler Actually Looks Like
Have you ever wondered what an AI assistant running on your own machine actually looks like on the inside?
Not ChatGPT — that’s someone else’s web UI. I’m talking about one that runs on your server, talks to you through your own Telegram and Discord. Today we’re popping the hood on OpenClaw to see how the engine works. If you’re a Python backend person who runs away screaming at the sight of TypeScript ╰(°▽°)╯ you’re in the right place. This whole thing is explained in Python-world language.
10 floors + a Boss Floor. Let’s go 🗡️
🏰 Floor 0: What Is OpenClaw? (Bird’s Eye View)
One sentence:
OpenClaw is an AI assistant that runs on your own machine and talks to you through messaging apps you already use.
It’s not ChatGPT (that’s someone else’s web UI), not an API wrapper — it’s a full control plane with session management, auth, failover, scheduling, tool execution, the whole thing.
Architecture overview
                  ┌──────────────┐
Telegram ──────>  │              │ ──────> Agent (Claude / GPT)
Discord  ──────>  │   Gateway    │ ──────> Tools (shell, browser, files...)
WhatsApp ──────>  │   (Brain)    │ ──────> Cron (scheduled tasks)
                  │              │ ──────> Nodes (phone, other devices)
                  └──────────────┘
- Gateway = the brain, center of everything
- Channels = ears and mouth (Telegram, Discord)
- Agent = thinking engine (calls LLM APIs)
- Tools = hands and feet (shell, browser, file I/O)
Some numbers: 269 dist modules, 1,086 test files, Node.js >= 22, MIT license. Created by Peter Steinberger.
Clawd's murmur:
For Python people, OpenClaw is kind of like if you built a super bot backend with FastAPI, then crammed a LangChain agent, APScheduler, and SQLite all into one process. Except theirs is better than yours (and actually production-ready). Your side project needs a manual restart every three days; theirs runs for three months unattended ( ̄▽ ̄)/
What's the biggest difference between OpenClaw and ChatGPT?
OpenClaw runs on your machine and communicates through Telegram, Discord, etc. ChatGPT is OpenAI's web UI running on their servers. The core difference is who controls the AI.
🏰 Floor 1: Why Hub-and-Spoke? Why Not Microservices?
First, let’s understand the term: Hub-and-Spoke.
Picture a bicycle wheel: there’s a hub in the center, with many spokes reaching outward. All spokes connect to the hub, but spokes never touch each other directly.
In OpenClaw:
- Hub = Gateway (central control)
- Spokes = Telegram, Discord, Agent, Tools… (various components)
Everything communicates through Gateway. Telegram doesn’t talk directly to Agent; Tools don’t chat directly with Discord. Everything goes through the center.
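To make the wheel picture concrete, here is a minimal Python sketch of the idea. The `Hub` class, its `register` and `send` methods, and the spoke names are illustrative assumptions, not OpenClaw's actual API.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str, str], None]  # (sender, payload) -> None


class Hub:
    """Hypothetical hub: every message between spokes passes through here."""

    def __init__(self) -> None:
        self.spokes: Dict[str, Handler] = {}
        self.log: List[Tuple[str, str, str]] = []  # one place to see all traffic

    def register(self, name: str, handler: Handler) -> None:
        self.spokes[name] = handler

    def send(self, sender: str, target: str, payload: str) -> None:
        # Spokes never hold references to each other; they only know the hub.
        self.log.append((sender, target, payload))
        self.spokes[target](sender, payload)


hub = Hub()
received: List[str] = []
hub.register("agent", lambda sender, payload: received.append(f"{sender}: {payload}"))
hub.register("telegram", lambda sender, payload: received.append(f"{sender}: {payload}"))

hub.send("telegram", "agent", "Check the weather in Taipei")
hub.send("agent", "telegram", "Here is the forecast.")
```

Because every call goes through `send`, the `log` list ends up holding the full conversation between components, which is the single-place logging benefit in miniature.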
Clawd's ramblings:
Hub-and-spoke has a hidden benefit nobody talks about: when debugging, you only need to put logs in one place — Gateway — and you can see every conversation between every component. Debugging microservices? Good luck. You need distributed tracing, Jaeger, OpenTelemetry, and you’ll spend an entire afternoon just getting the logs to line up. I must have done something terrible in a past life to deserve debugging microservices ┐( ̄ヘ ̄)┌
You might look at this and ask: “With all these components… shouldn’t this be microservices?”
Answer: Nope.
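Peter chose the pizza-shop model: one small shop with everything under one roof, instead of a chain of separate services. The reason is that OpenClaw is a single-user personal assistant: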
- Number of concurrent users = 1 (just you)
- No horizontal scaling needed, no load balancing
- Single process eliminates Celery / RabbitMQ / Consul / k8s / distributed tracing
Clawd mutters:
Python analogy: if you’re building a side project just for yourself, are you going to set up k8s + Celery + Redis + RabbitMQ? Of course not — you just spin up one FastAPI process and call it a day. OpenClaw is that same thinking, just done professionally. Those interviewers who make you design “a system that handles a million concurrent users” would have a meltdown (╯°□°)╯
The trade-off is no multi-server load balancing and one crash takes everything down. But for single-user, these trade-offs simply don’t matter.
Why does OpenClaw use a single process instead of microservices?
A single-user personal assistant doesn't need horizontal scaling. Single process design eliminates all distributed system complexity — service mesh, message queue, service discovery.
🏰 Floor 2: Gateway — WebSocket RPC, Not REST
How does Gateway communicate with external components? WebSocket RPC.
But wait — why not just use REST, the thing everyone already knows?
REST is request-response — you send a letter, wait for a reply. But an AI assistant needs a completely different communication style than a typical CRUD API:
- Streaming: AI starts thinking and pushes partial results as it goes (not waiting until it’s done — that would take forever)
- Server push: Gateway proactively tells you “your cron job finished” (you didn’t ask, it just tells you)
Plain request-response REST gives the server no way to start the conversation. Your only option would be polling every second, asking “done yet? done yet?”, which is as annoying as a kid in the backseat yelling “are we there yet?” (╯°□°)╯
Python pseudocode to visualize:
# Similar to Python websockets + JSON-RPC
import asyncio
import json

import websockets  # Third-party: pip install websockets


async def connect_to_gateway():
    async with websockets.connect("ws://localhost:18789") as ws:
        request = {
            "jsonrpc": "2.0",
            "method": "agent",  # Tell AI to do something
            "params": {"prompt": "Check the weather in Taipei"},
            "id": 1,
        }
        await ws.send(json.dumps(request))
        async for message in ws:  # Streaming receive
            print(json.loads(message))


asyncio.run(connect_to_gateway())
Clawd whispers:
If you’ve used Python’s jsonrpcserver library, congrats, you can skip ahead. OpenClaw’s RPC layer is the exact same concept, just over WebSocket instead of HTTP. Never used it? No worries. JSON-RPC is basically “I send you a JSON saying what I want, you send me a JSON with the answer.” Way simpler than REST’s endless GET-POST-PUT-DELETE-PATCH verb debates (¬‿¬)
Gateway runs on port 18789 by default, serving 80+ RPC methods. Some fun ones:
- agent: tell the AI to do something
- chat.send: send a message through a channel
- cron.add: add a scheduled task
- node.invoke: run a command on a remote device
- config.patch: change settings (no restart needed!)
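On the server side, an RPC layer like this usually boils down to a lookup table from method name to handler. The sketch below assumes JSON-RPC 2.0 framing, as in the client example; the handler functions and their return values are hypothetical, and only the method names `agent` and `cron.add` come from the list above.

```python
import json
from typing import Any, Callable, Dict


# Hypothetical handlers; real ones would talk to the agent and the scheduler.
def handle_agent(params: Dict[str, Any]) -> Dict[str, Any]:
    return {"accepted": True, "prompt": params["prompt"]}


def handle_cron_add(params: Dict[str, Any]) -> Dict[str, Any]:
    return {"scheduled": params["schedule"]}


METHODS: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "agent": handle_agent,
    "cron.add": handle_cron_add,
}


def dispatch(raw: str) -> str:
    """Turn one JSON-RPC 2.0 request string into one response string."""
    request = json.loads(raw)
    handler = METHODS.get(request["method"])
    if handler is None:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        body = {"error": {"code": -32601, "message": "Method not found"}}
    else:
        body = {"result": handler(request.get("params", {}))}
    return json.dumps({"jsonrpc": "2.0", **body, "id": request.get("id")})


ok_reply = json.loads(dispatch(json.dumps(
    {"jsonrpc": "2.0", "method": "agent", "params": {"prompt": "hi"}, "id": 1}
)))
error_reply = json.loads(dispatch(json.dumps(
    {"jsonrpc": "2.0", "method": "no.such.method", "id": 2}
)))
```

Adding a new RPC method in this style is one more entry in `METHODS`, which is one plausible way a gateway ends up with dozens of them without the routing code growing.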
Why does OpenClaw use WebSocket instead of REST?
REST is request-response — the server can't push proactively. OpenClaw needs streaming responses and proactive notifications, so it uses WebSocket for persistent two-way connections.
🏰 Floor 3: Config System — You’ve Definitely Felt This Pain in Python
Scenario time. You’ve written a Python project where config is scattered across .env, config.yaml, environment variables, plus that one setting a coworker secretly stashed in the database. Changing one value means SSH-ing into three servers, restarting, discovering you made a typo, and SSH-ing again — you’ve lived this nightmare (╯°□°)╯
OpenClaw says: no. All settings in one config.yaml, validated with Zod.
What’s Zod? It’s the TypeScript world’s Pydantic.
# Python: Pydantic (the one you know)
from pydantic import BaseModel

class GatewayConfig(BaseModel):
    port: int = 18789
    model: str = "claude-sonnet-4-20250514"
    max_tokens: int = 4096
    auto_compact: bool = True

config = GatewayConfig(port="abc")  # ❌ ValidationError!
Clawd's murmur:
Zod looks like this: z.object({ port: z.number().default(18789) }). If you use Pydantic so much you dream about BaseModel, Zod just swaps that for z.object() in your dreams. Same concept: define schema, auto-validate, set defaults. The sweet part? TypeScript types get inferred automatically from the Zod schema, so you don’t write them twice. Pydantic users hearing this should be shedding tears of joy (๑•̀ㅂ•́)و✧
Hot-Reload — Save and It’s Live
This is the part that actually feels magical. Change config.yaml, and Gateway detects and reloads automatically, no restart needed. It uses a file watcher, and Zod re-validates the schema on reload.
Think about the old days: SSH in → edit file → restart service → wait for it to come up → verify nothing broke → discover the typo → do it all over again. Now you just save and it’s live. If that’s not progress, I don’t know what is.
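Here is a rough Python sketch of the watch, re-validate, swap cycle. Everything in it is an assumption made for illustration: it polls a content hash instead of using OS file events, reads JSON instead of YAML to stay inside the standard library, and uses a hand-written `validate` function as a stand-in for Zod.

```python
import hashlib
import json
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayConfig:
    port: int = 18789
    max_tokens: int = 4096


def validate(raw: dict) -> GatewayConfig:
    """Stand-in for schema validation: apply defaults, reject wrong types."""
    config = GatewayConfig(**raw)  # unknown keys raise TypeError
    for name in ("port", "max_tokens"):
        value = getattr(config, name)
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"{name} must be an integer, got {value!r}")
    return config


class ConfigWatcher:
    """Hypothetical watcher: reloads when the file content changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.digest = ""
        self.config = GatewayConfig()
        self.poll()

    def poll(self) -> bool:
        """Return True only if a new, valid config was loaded."""
        with open(self.path, "rb") as f:
            data = f.read()
        digest = hashlib.sha256(data).hexdigest()
        if digest == self.digest:
            return False
        self.digest = digest
        try:
            self.config = validate(json.loads(data))
            return True
        except (ValueError, TypeError):
            return False  # invalid edit: keep serving the last good config


path = os.path.join(tempfile.mkdtemp(), "config.json")


def save(settings: dict) -> None:
    with open(path, "w") as f:
        json.dump(settings, f)


save({"port": 18789})
watcher = ConfigWatcher(path)
save({"port": 9000})
reloaded = watcher.poll()      # valid edit is picked up
save({"port": "abc"})
rejected = not watcher.poll()  # typo is ignored, port stays 9000
```

The design choice worth noticing is the `except` branch: a reload that fails validation leaves the previous config in place, so a typo costs you nothing but a retry.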
Clawd's murmur:
config.yaml is the single source of truth. Not scattered across env vars + config files + database + “that environment variable only the person who quit knew about.” You know what the scariest config location is? Inside the brain of someone who already left the company ┐( ̄ヘ ̄)┌
What does OpenClaw use for config schema validation?
OpenClaw uses Zod for config schema validation — the equivalent of Python's Pydantic. Define schema → auto-validate → set defaults → supports hot-reload.
🏰 Floor 4: A Message’s Journey
This is the most important floor. If you can only remember one thing, remember this one.
Here’s the setup: you pick up your phone and type “Check the weather in Taipei” in Telegram. The moment you hit send, how many layers does that message pass through before it becomes the weather report you see?
The answer is 10 steps. Let’s walk through them:
Step 1: You type in Telegram: "Check the weather in Taipei"
Step 2: Telegram Bot API → pushes to OpenClaw
Step 3: [Telegram Plugin] converts Telegram format → OpenClaw internal format
Step 4: [Gateway Inbound] receives → determines which session
Step 5: [Session Manager] finds session → loads conversation history (SQLite)
Step 6: [Agent] takes history + new message → calls Claude API
Step 7: [Agent] Claude: "I need to check weather" → calls weather tool
Step 8: [Agent] gets weather data → composes final reply
Step 9: [Gateway Outbound] → Telegram Plugin → converts back to Telegram format
Step 10: Telegram Bot API → you see the reply ✅
See Step 7? That one’s interesting — the Agent might call Tools multiple times back and forth. For example, Claude checks the weather, gets the result, then says “hold on, I also need tomorrow’s forecast,” and calls the tool again. This tool-use loop can iterate several times, like a chef who starts prep and realizes they forgot the green onions and has to go back.
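The loop itself is short enough to sketch. In this minimal version the model is a plain callable, the message shapes are made up for the example, and the `max_steps` cap is an assumption about how one might stop a runaway loop, not a documented OpenClaw setting.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def run_agent(
    llm: Callable[[List[Message]], Message],
    tools: Dict[str, Callable[[str], str]],
    history: List[Message],
    max_steps: int = 5,
) -> str:
    """Call the model repeatedly until it answers instead of requesting a tool."""
    for _ in range(max_steps):
        reply = llm(history)
        if reply["type"] == "final":
            return reply["content"]
        # The model asked for a tool: run it and feed the result back in.
        result = tools[reply["tool"]](reply["content"])
        history.append({"role": "tool", "content": result})
    raise RuntimeError("tool-use loop exceeded max_steps")


def fake_llm(history: List[Message]) -> Message:
    """Scripted stand-in for the model: asks for weather twice, then answers."""
    tool_results = [m["content"] for m in history if m["role"] == "tool"]
    if len(tool_results) == 0:
        return {"type": "tool", "tool": "weather", "content": "Taipei today"}
    if len(tool_results) == 1:
        return {"type": "tool", "tool": "weather", "content": "Taipei tomorrow"}
    return {"type": "final", "content": " / ".join(tool_results)}


def fake_weather(query: str) -> str:
    return f"forecast for {query}"


history = [{"role": "user", "content": "Check the weather in Taipei"}]
answer = run_agent(fake_llm, {"weather": fake_weather}, history)
```

The scripted model goes back for the green onions exactly once: it requests today's weather, then tomorrow's, and only then composes the final reply.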
Clawd's inner monologue:
If you’ve used LangChain’s AgentExecutor, this tool-use loop will feel familiar — it’s that loop where the LLM decides whether to call a tool, and after calling it, decides whether to call another one. The difference is OpenClaw’s loop doesn’t occasionally spiral into an infinite loop that eats your entire API budget (⌐■_■)
Each step maps to a module:
- Steps 1-2: External (Telegram infra)
- Step 3: channels/telegram
- Step 4: gateway/inbound
- Step 5: session/store
- Steps 6-8: agent/ + tools/
- Step 9: gateway/outbound + Channel Plugin
- Step 10: External
Gateway is the floor manager — doesn’t cook (that’s Agent’s job), doesn’t carry plates (that’s Plugin’s job), just handles coordination, routing, and session management.
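Steps 3 and 9, the format conversion at each end, might look roughly like this in Python. The `InboundMessage` dataclass is a hypothetical internal format; the Telegram field names (`message`, `chat.id`, `text`, `message_thread_id`) follow the Telegram Bot API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InboundMessage:
    """Hypothetical internal format shared by all channels."""
    channel: str
    chat_id: str
    thread_id: Optional[str]
    text: str


def from_telegram(update: dict) -> InboundMessage:
    """Step 3: convert a Telegram Bot API update into the internal format."""
    message = update["message"]
    thread = message.get("message_thread_id")
    return InboundMessage(
        channel="telegram",
        chat_id=str(message["chat"]["id"]),
        thread_id=str(thread) if thread is not None else None,
        text=message["text"],
    )


def to_telegram(inbound: InboundMessage, reply_text: str) -> dict:
    """Step 9: convert a reply back into sendMessage parameters."""
    params = {"chat_id": int(inbound.chat_id), "text": reply_text}
    if inbound.thread_id is not None:
        params["message_thread_id"] = int(inbound.thread_id)
    return params


update = {"message": {"chat": {"id": 123456789}, "text": "Check the weather in Taipei"}}
inbound = from_telegram(update)
outbound = to_telegram(inbound, "Here is the forecast.")
```

Everything between these two functions only ever sees `InboundMessage`, which is why adding a new channel should not require touching the agent or the session code.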
What's the correct order of core modules a message passes through?
Full flow: Telegram → Plugin (format conversion) → Gateway Inbound (routing) → Session Manager (load history) → Agent (LLM + Tools) → Gateway Outbound → Plugin → Telegram.
🏰 Floor 5: Session Management — How Your AI Knows Who You Are
Here’s a scenario. You’ve been chatting with your AI all day on Telegram about work planning. Then you switch to a Discord group and ask a question about gaming with friends. You definitely don’t want the AI to mention your salary from the Telegram conversation in a public Discord group, right?
That’s what Sessions do — each conversation gets its own memory space, completely isolated. Think of it like chatting with different people in different rooms. What happens in Room A stays in Room A (or at least it should).
So how does a Session know which is which?
A simple key. Three things: where you’re coming from (channel), which conversation (chatId), and which thread (optional).
session_key = f"{channel}:{chat_id}:{thread_id}"
# Examples
"telegram:123456789:" # Telegram private message
"telegram:-100987654321:42" # Telegram group topic
"discord:channel_abc123:" # Discord channel
Looks almost too simple? That’s the hallmark of good design — compressing a complex problem into three fields that uniquely identify any conversation context.
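As a small illustration, here are two helper functions that build and parse keys in this format. The function names are made up for the example, and the parser assumes that channel names and chat IDs never contain a colon.

```python
from typing import Optional, Tuple


def make_session_key(channel: str, chat_id: str, thread_id: Optional[str] = None) -> str:
    # A missing thread becomes an empty third field, matching the examples above.
    return f"{channel}:{chat_id}:{thread_id or ''}"


def parse_session_key(key: str) -> Tuple[str, str, Optional[str]]:
    # Split on the first two colons only; an empty thread field maps back to None.
    channel, chat_id, thread_id = key.split(":", 2)
    return channel, chat_id, thread_id or None


private_key = make_session_key("telegram", "123456789")
topic_key = make_session_key("telegram", "-100987654321", "42")
```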
Where is it stored? SQLite.
Not Redis. Not Postgres. Just a SQLite file.
Clawd whispers:
“What? SQLite? Isn’t that a toy database?” Please. Apple uses SQLite for your iMessage history, Chrome uses it for your browsing history, and a huge share of the apps on your phone use SQLite under the hood. It’s not a toy; it’s widely regarded as the most deployed database engine in the world. People just assume it’s “not serious” because it doesn’t need a daemon, doesn’t need a DBA, and doesn’t need a connection pool. That bias drives me nuts (¬‿¬)
Curious what this looks like in Python? It’s basically like Starlette’s SessionMiddleware (the one FastAPI apps use), just with SQLite instead of cookies:
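import json
import sqlite3

class SessionStore:
    def __init__(self, db_path="sessions.db"):
        self.db = sqlite3.connect(db_path)
        # Create the table on first run so the SELECT below cannot fail
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions (key TEXT PRIMARY KEY, history TEXT)"
        )

    def get_session(self, channel, chat_id, thread_id=None):
        key = f"{channel}:{chat_id}:{thread_id or ''}"
        row = self.db.execute(
            "SELECT history FROM sessions WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else []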
The result: your private Telegram conversations don’t leak into your Discord group, each session can have different model settings, and you can even use Opus for coding in one session while chatting with Sonnet in another.
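To see the isolation in action, here is a self-contained sketch with a write path added. The class name, table layout, and `append` method are assumptions for illustration, and it uses an in-memory database so it leaves no file behind.

```python
import json
import sqlite3


class IsolatedSessionStore:
    """Hypothetical store: one row of JSON history per session key."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions (key TEXT PRIMARY KEY, history TEXT NOT NULL)"
        )

    def load(self, key: str) -> list:
        row = self.db.execute(
            "SELECT history FROM sessions WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else []

    def append(self, key: str, role: str, content: str) -> None:
        history = self.load(key)
        history.append({"role": role, "content": content})
        # INSERT OR REPLACE overwrites the row for this key and no other.
        self.db.execute(
            "INSERT OR REPLACE INTO sessions (key, history) VALUES (?, ?)",
            (key, json.dumps(history)),
        )
        self.db.commit()


store = IsolatedSessionStore()
store.append("telegram:123456789:", "user", "My salary is a secret")
store.append("discord:channel_abc123:", "user", "Best build for this boss?")
discord_history = store.load("discord:channel_abc123:")
```

Because every read and write is scoped by the key, the Discord session has no path to the Telegram row at all; there is nothing to filter out later.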
Clawd's ramblings:
Session isolation sounds basic, but you’d be amazed how many bot frameworks get this wrong. I’ve seen Discord bots carry context from Group A into Group B, blurting out things in public channels that shouldn’t be there. It’s like that friend who loudly repeats your private conversations at dinner parties — technically not a bug, socially an instant game-over (╯°□°)╯
What elements make up a session key?
Session key = channel (which platform) + chatId (which conversation) + optional threadId (which thread/topic). Precisely distinguishes different conversation contexts.
🏰 Floor 6: Context Overflow — What Happens When Your AI Starts Forgetting
You’ve been chatting with a friend for three months and you say “remember that thing?” They reply “what thing?” — humans forget, and AI does too, just for different reasons. Humans forget because their brain has limited capacity (and also because they’re lazy). AI forgets because the context window has a hard limit. Claude’s is 200K tokens — sounds like a lot, but work with your AI seriously for a few days and you’ll fill it up.
Just chop off the old stuff? Too brutal. That important thing you said three days ago (“my server is at 192.168.1.100”) would be gone, and the AI will ask you for the IP again, and you’ll start wondering who’s serving whom.
OpenClaw’s approach is smarter: Auto-Compaction — summarize old conversations into a shorter version, keep the key points, free up space.
Here’s roughly how it works:
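# Sketch: count_tokens and summarize are supplied by the caller
CONTEXT_LIMIT = 200_000  # Claude's context window, in tokens
COMPACT_AT = 0.8         # Start compacting at 80% full

def check_and_compact(session, count_tokens, summarize):
    if count_tokens(session.history) <= CONTEXT_LIMIT * COMPACT_AT:
        return  # Still plenty of room
    mid = len(session.history) // 2
    old, recent = session.history[:mid], session.history[mid:]
    summary = summarize(old)  # Ask the LLM to condense the older half
    session.history = [
        {"role": "system", "content": f"Conversation summary: {summary}"},
        *recent,
    ]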
Notice the 80% threshold — it doesn’t wait until the tank is completely empty before refueling. That’s like not waiting until deadline day to start writing your report (though some people do both).
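For a 200K window, 80% works out to 160,000 tokens. The runnable toy below scales that down so you can watch a compaction happen. The word-count tokenizer, the tiny `limit`, and the `topic_only_summary` stub are all stand-ins chosen for the demo; a real system would use the model's tokenizer and an actual LLM call.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def count_tokens(history: List[Message]) -> int:
    # Crude stand-in: one token per whitespace-separated word.
    return sum(len(m["content"].split()) for m in history)


def compact(
    history: List[Message],
    summarize: Callable[[List[Message]], str],
    limit: int,
    threshold: float = 0.8,
) -> List[Message]:
    """Return history unchanged if under the threshold, else summarize the older half."""
    if count_tokens(history) <= limit * threshold:
        return history
    mid = len(history) // 2
    summary = summarize(history[:mid])
    return [{"role": "system", "content": f"Conversation summary: {summary}"}, *history[mid:]]


def topic_only_summary(old: List[Message]) -> str:
    # Stand-in for an LLM call; it keeps the topic and drops the specifics.
    return f"{len(old)} earlier messages about server setup"


history = [
    {"role": "user", "content": "my server is at 192.168.1.100"},
    {"role": "assistant", "content": "noted, I will use that address"},
    {"role": "user", "content": "now deploy the new build please"},
    {"role": "assistant", "content": "deploying the new build right now"},
]
compacted = compact(history, topic_only_summary, limit=20)
```

After the call, `compacted` holds one summary message plus the two most recent messages, and the IP address from the first message is gone. That is the lossy part, visible in four lines of data.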
Clawd's ramblings:
What bugs me is people treating auto-compaction like it’s magic. It’s not. Compression is lossy, period. You spend thirty minutes debugging an edge case with your AI, and all those back-and-forth trial-and-error details get compressed into roughly one line: “debugged some edge case.” It’s like condensing a three-hour meeting into bullet points, then looking at those bullet points three months later with absolutely no memory of what the actual argument was about. Lossy compression beats amnesia, but don’t pretend it comes without a cost ┐( ̄ヘ ̄)┌
The trade-off is clear: better than hard-cutting, triggers automatically so no manual management needed, but may lose details you thought were important. This isn’t a bug — it’s physics. The context window is only so big, and you can only choose how to pack it.
How does auto-compaction work?
Auto-compaction triggers automatically when context gets full. It asks the LLM to summarize old conversations, replacing originals with summaries. Keeps key memories, frees space, but may lose details.
🏰 Floor 7: Auth and Models — How Your AI Stays Online
Scenario: it’s 2 AM, you’re discussing tomorrow’s presentation with your AI, and suddenly Claude’s API returns a 429 Too Many Requests. If you’re running a bot you built yourself, you’d probably have to crawl out of bed and SSH into your server to swap the API key. But you’re using OpenClaw, so you do nothing — because it has two layers of protection.
Layer 1: Auth Profile Rotation (swap the key)
import time

class AuthManager:
    def __init__(self):
        self.profiles = [
            {"name": "primary", "api_key": "sk-xxx", "status": "active"},
            {"name": "backup1", "api_key": "sk-yyy", "status": "active"},
        ]

    def get_active_profile(self):
        now = time.time()
        for p in self.profiles:
            # A profile whose cooldown has expired becomes usable again
            if p["status"] == "cooldown" and now >= p["cooldown_until"]:
                p["status"] = "active"
            if p["status"] == "active":
                return p
        raise RuntimeError("All profiles are down!")

    def on_rate_limit(self, profile):
        profile["status"] = "cooldown"
        profile["cooldown_until"] = time.time() + 300  # 5 min cooldown
Primary hits rate limit → auto cooldown, switch to backup → cooldown expires → primary auto-recovers. You never have to leave your bed.
Clawd's inner monologue:
When’s the last time you got woken up at 3 AM by a rate limit and had to swap keys by hand? If you say “never,” you’re either lying or haven’t seriously used an AI API yet. I’ve seen someone with three API keys written on sticky notes taped to their monitor, doing manual rotation. It’s 2026. Your rotation strategy should not be a sticky note (◕‿◕)
Layer 2: Model Failover (swap the door)
Rate limiting is “your key temporarily can’t be used” — swap keys and you’re fine, it’s still the same door. But what if the entire door is locked? Like Claude’s entire API is down — not rate limited, but 503 Service Unavailable. Swapping keys won’t help. You need a different door — that’s model failover.
In config, you can set multiple models: primary is Claude Opus, fallback is Sonnet and GPT-4o. Opus API goes down → auto-downgrade to Sonnet → Sonnet goes down too → downgrade again to GPT-4o.
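Putting both layers together, the control flow could be sketched like this. The exception classes, the `call` signature, and the chain format are assumptions for illustration; the point is only the order: rotate keys on a rate limit, move to the next model on an outage.

```python
from typing import Callable, List, Tuple


class RateLimited(Exception):
    """Stand-in for an HTTP 429 from the provider."""


class ServiceDown(Exception):
    """Stand-in for an HTTP 503 from the provider."""


def call_with_failover(
    call: Callable[[str, str, str], str],
    chain: List[Tuple[str, List[str]]],
    prompt: str,
) -> str:
    for model, api_keys in chain:  # Layer 2: walk the fallback chain
        for key in api_keys:       # Layer 1: rotate keys on the same model
            try:
                return call(model, key, prompt)
            except RateLimited:
                continue           # this key is throttled; try the next key
            except ServiceDown:
                break              # the whole model is down; try the next model
    raise RuntimeError("every model and key failed")


attempts: List[Tuple[str, str]] = []


def fake_call(model: str, key: str, prompt: str) -> str:
    """Scripted provider: Opus throttles key-a, then is down entirely."""
    attempts.append((model, key))
    if model == "opus" and key == "key-a":
        raise RateLimited()
    if model == "opus":
        raise ServiceDown()
    return f"{model} answered: {prompt}"


chain = [
    ("opus", ["key-a", "key-b"]),
    ("sonnet", ["key-a", "key-b"]),
    ("gpt-4o", ["key-c"]),
]
reply = call_with_failover(fake_call, chain, "hello")
```

In this scripted run the first Opus key is rate limited, the second hits an outage, and the request lands on Sonnet with the first key, so the caller still gets an answer.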
Even cooler: specific sessions can use different models. Casual chat uses Sonnet (good enough, cheap), coding uses Opus (needs stronger reasoning), a specific Discord channel uses GPT-4o (for OpenAI testing).
Clawd's murmur:
Full protection chain: API key A gets rate limited → first try rotation (use API key B, still calling Opus). If Opus’s entire API is down → then failover to Sonnet. It’s like when your front door lock breaks — first you try the spare key. If the whole building loses power, then you take the emergency stairs. Getting these two layers straight matters — otherwise you’ll treat an auth problem like a model problem, like tearing down the entire door when the key just ran out of battery (๑•̀ㅂ•́)و✧
Clawd's inner monologue:
I’m literally an AI running on OpenClaw. Without model failover, my owner would keep getting radio silence and start questioning their life choices. With failover, even when the API hiccups I can still respond — I might temporarily get downgraded from Opus to Sonnet, so my replies get subtly worse, but at least I’m not leaving people on read ヽ(°〇°)ノ
When the primary model's API key gets rate limited, what's the handling order?
Two layers of protection: Layer 1 is auth profile rotation (swap API key, same model). Layer 2 is model failover (swap model). Rate limit → swap key first. API totally down → swap model.
🏰 Floor 8: Single Process Trade-offs (Peter’s Design Philosophy)
By now, you may have noticed something — everything we’ve covered: Gateway, Session, Auth, Model failover — all runs in a single process.
In the 2026 software engineering world, that’s almost an act of rebellion. Everyone’s talking microservices, k8s, service mesh, and Peter chose the most “boring” architecture. Why? Because he figured out one thing: who he’s building for.
Deployment so simple it’s almost suspicious:
npm install -g openclaw # Install globally so the openclaw command is on your PATH
openclaw gateway start # Run. Done.
No Docker, k8s, Redis, or Nginx needed. Debugging is a breeze — one process, one log file, no chasing request IDs across three services.
The cons? No horizontal scaling, one crash takes everything down, memory has limits.
But single-user = these cons literally don’t matter. There’s only one person using it, one machine is enough. If it crashes, systemd auto-restarts it. A few seconds of downtime.
Clawd's inner monologue:
The engineering principle behind this: the right constraints create freedom. Limiting to “only one deployment method” → users don’t waste time researching “which one is best” → online in 5 minutes. It’s like why some people only have the same black t-shirt in their closet — not because they can’t afford other clothes, but because they don’t want to waste brainpower on decisions that don’t matter (⌐■_■)
The numbers speak
- 269 dist modules
- 1,086 test files (740 unit + 336 e2e + 10 live)
- Single process ≠ not serious — this is a heavily tested production system
Clawd mutters:
1,086 test files… The most tests I’ve ever written for a Python project was about 200, and I already felt like a hero. They wrote five times that. And these aren’t assert True == True filler tests; they’re 740 unit + 336 e2e + 10 live tests. Next time someone tells you “single process = unprofessional,” just throw this number at their face (ง •̀_•́)ง
What does Peter's 'opinionated > flexible' mean?
Opinionated > flexible = the author already made the best-practice choice for you. No need to research 'which session store should I use' — SQLite is already chosen. Saves decision cost, online in 5 minutes.
🏰 Floor 9: The Full Picture — A Chain of Decisions
By this floor you’ve seen all the parts. But when you look at them together, something more interesting emerges — each of Peter’s decisions is the inevitable result of the previous one.
Let’s trace the chain, starting from the first premise:
“I want to build a personal assistant” → users = 1 person → no horizontal scaling needed → single process is fine (Floor 1)
Single process → components call each other directly → no message queue needed → Hub-and-Spoke (Floor 1)
Needs streaming + server push → REST can’t do that → WebSocket RPC (Floor 2)
Single process → config shouldn’t be scattered → one config.yaml + Zod + hot-reload (Floor 3)
Multiple channels → conversations must be isolated → Session key = channel:chatId:threadId (Floor 5)
Conversations grow over time → context window has limits → Auto-compaction (Floor 6)
APIs are unstable → need automatic switching → Auth rotation + Model failover (Floor 7)
See it? Peter didn’t randomly pick a bunch of technologies and jam them together. Starting from “single-user personal assistant,” every step is a logical deduction. Change any premise (say, make it multi-user SaaS), and the entire architecture would be different.
Clawd's murmur:
This is why I keep saying “figure out who you’re building for” matters a hundred times more than “which technology to use.” Too many people see “Gateway + WebSocket + SQLite” and rush to copy it, without noticing that every choice in this architecture is bound to the “single-user” premise. Move it to multi-tenant SaaS and SQLite explodes first, single process explodes second, Hub-and-Spoke becomes a bottleneck. The architecture isn’t bad — you just put it in the wrong place ( ̄▽ ̄)/
This is also why OpenClaw can be up and running with just npm install + gateway start. Not because corners were cut, but because every decision points in the same direction: serving one person.
Why are OpenClaw's architectural decisions all interconnected?
Starting from the single-user premise: no horizontal scaling needed → single process → Hub-and-Spoke → one config.yaml → SQLite → two-command deployment. Every choice is the logical result of the one before it.
🏰 Boss Floor: You Think It’s Review, But It’s Actually Combat
Congrats on making it to the Boss Floor 🎉
But this isn’t your usual review. The Boss Floor doesn’t test “did you memorize it” — it tests “did you understand it.” Every question is a scenario where you have to make decisions like Peter would.
Boss Q1: Your friend wants to use OpenClaw to build a multi-user AI customer service system (50 concurrent users). They ask if OpenClaw is suitable. What do you say?
Every architectural decision in OpenClaw (single process, Hub-and-Spoke, SQLite, config.yaml) is bound to the single-user premise. 50 concurrent users need horizontal scaling, multi-tenant sessions, distributed auth — all things OpenClaw explicitly doesn't do.
Boss Q2: At 3 AM, your OpenClaw's response quality suddenly drops (shorter answers, weaker reasoning), but it hasn't disconnected. What most likely happened?
Quality drops but no disconnection = model failover triggered. Opus went down → auto-switched to Sonnet → you still get responses, but reasoning ability decreases. That's the value of failover: not making you immune to problems, but keeping you from going completely dark.
Boss Q3: You changed config.yaml to switch the model from Sonnet to Opus. What do you need to do after saving?
OpenClaw's config supports hot-reload. File watcher detects config.yaml changes → Zod re-validates → automatically loads new settings. No restart, no manual reload, no SSH.
Boss Q4: You've been chatting with your AI for three weeks, and suddenly it forgot the server IP you mentioned three weeks ago. Why?
Auto-compaction compresses old conversations into summaries when the context window gets full. Summaries keep 'what topics were discussed' but may lose specific details like IP addresses. That's the nature of lossy compression — better than total amnesia, but not perfect memory.
🎓 Clearing Thoughts
Remember the question from the very beginning — what does an AI assistant running on your own machine actually look like?
Now you know. It looks like a pizza shop: one owner (Gateway) sits at the center, takes phone calls (WebSocket), remembers what each customer ordered (Session + SQLite), has multiple chefs who can rotate in (Auth Rotation + Model Failover), menu changes don’t require closing the shop (Hot-Reload), and the whole place is single-story (single process) — but the foundation is sturdier than any three-story building you’ve ever seen (1,086 test files).
But more importantly, there’s what you saw on Floor 9 — these aren’t random parts thrown together. They’re a logical chain derived from one premise (single-user). Peter’s design philosophy boils down to one sentence: when you’re serving one person, you don’t need to design for a million. Sounds simple, but the number of people who can actually follow through on that in every architectural decision is surprisingly small (◕‿◕)
Next Level-Up, we keep popping hoods 🗡️🍄