Gateway (daemon) runbook
Last updated: 2025-12-09What it is
- The always-on process that owns the single Baileys/Telegram connection and the control/event plane.
- Replaces the legacy
gatewaycommand. CLI entry point:clawdbot gateway. - Runs until stopped; exits non-zero on fatal errors so the supervisor restarts it.
How to run (local)
- Config hot reload watches
~/.clawdbot/clawdbot.json(orCLAWDBOT_CONFIG_PATH).- Default mode:
gateway.reload.mode="hybrid"(hot-apply safe changes, restart on critical). - Hot reload uses in-process restart via SIGUSR1 when needed.
- Disable with
gateway.reload.mode="off".
- Default mode:
- Binds WebSocket control plane to
127.0.0.1:<port>(default 18789). - The same port also serves HTTP (control UI, hooks, A2UI). Single-port multiplex.
- Starts a Canvas file server by default on
canvasHost.port(default18793), servinghttp://<gateway-host>:18793/__clawdbot__/canvas/from~/clawd/canvas. Disable withcanvasHost.enabled=falseorCLAWDBOT_SKIP_CANVAS_HOST=1. - Logs to stdout; use launchd/systemd to keep it alive and rotate logs.
- Pass
--verboseto mirror debug logging (handshakes, req/res, events) from the log file into stdio when troubleshooting. --forceuseslsofto find listeners on the chosen port, sends SIGTERM, logs what it killed, then starts the gateway (fails fast iflsofis missing).- If you run under a supervisor (launchd/systemd/mac app child-process mode), a stop/restart typically sends SIGTERM; older builds may surface this as
pnpmELIFECYCLEexit code 143 (SIGTERM), which is a normal shutdown, not a crash. - SIGUSR1 triggers an in-process restart (no external supervisor required). This is what the
gatewayagent tool uses. - Optional shared secret: pass
--token <value>or setCLAWDBOT_GATEWAY_TOKENto require clients to sendconnect.params.auth.token. - Port precedence:
--port>CLAWDBOT_GATEWAY_PORT>gateway.port> default18789.
Remote access
- Tailscale/VPN preferred; otherwise SSH tunnel:
- Clients then connect to
ws://127.0.0.1:18789through the tunnel. - If a token is configured, clients must include it in
connect.params.auth.tokeneven over the tunnel.
Multiple gateways (same host)
Supported if you isolate state + config and use unique ports.Dev profile (--dev)
Fast path: run a fully-isolated dev instance (config/state/workspace) without touching your primary setup.
CLAWDBOT_STATE_DIR=~/.clawdbot-devCLAWDBOT_CONFIG_PATH=~/.clawdbot-dev/clawdbot.jsonCLAWDBOT_GATEWAY_PORT=19001(Gateway WS + HTTP)bridge.port=19002(derived:gateway.port+1)browser.controlUrl=http://127.0.0.1:19003(derived:gateway.port+2)canvasHost.port=19005(derived:gateway.port+4)agent.workspacedefault becomes~/clawd-devwhen you runsetup/onboardunder--dev.
- Base port =
gateway.port(orCLAWDBOT_GATEWAY_PORT/--port) bridge.port = base + 1(orCLAWDBOT_BRIDGE_PORT/ config override)browser.controlUrl port = base + 2(orCLAWDBOT_BROWSER_CONTROL_URL/ config override)canvasHost.port = base + 4(orCLAWDBOT_CANVAS_HOST_PORT/ config override)- Browser profile CDP ports auto-allocate from
browser.controlPort + 9 .. + 108(persisted per profile).
- unique
gateway.port - unique
CLAWDBOT_CONFIG_PATH - unique
CLAWDBOT_STATE_DIR - unique
agent.workspace - separate WhatsApp numbers (if using WA)
Protocol (operator view)
- Mandatory first frame from client:
req {type:"req", id, method:"connect", params:{minProtocol,maxProtocol,client:{name,version,platform,deviceFamily?,modelIdentifier?,mode,instanceId}, caps, auth?, locale?, userAgent? } }. - Gateway replies
res {type:"res", id, ok:true, payload:hello-ok }(orok:falsewith an error, then closes). - After handshake:
- Requests:
{type:"req", id, method, params}→{type:"res", id, ok, payload|error} - Events:
{type:"event", event, payload, seq?, stateVersion?}
- Requests:
- Structured presence entries:
{host, ip, version, platform?, deviceFamily?, modelIdentifier?, mode, lastInputSeconds?, ts, reason?, tags?[], instanceId? }. agentresponses are two-stage: firstresack{runId,status:"accepted"}, then a finalres{runId,status:"ok"|"error",summary}after the run finishes; streamed output arrives asevent:"agent".
Methods (initial set)
health— full health snapshot (same shape asclawdbot health --json).status— short summary.system-presence— current presence list.system-event— post a presence/system note (structured).send— send a message via the active provider(s).agent— run an agent turn (streams events back on same connection).node.list— list paired + currently-connected bridge nodes (includescaps,deviceFamily,modelIdentifier,paired,connected, and advertisedcommands).node.describe— describe a node (capabilities + supportednode.invokecommands; works for paired nodes and for currently-connected unpaired nodes).node.invoke— invoke a command on a node (e.g.canvas.*,camera.*).node.pair.*— pairing lifecycle (request,list,approve,reject,verify).
docs/presence.md for how presence is produced/deduped and why instanceId matters.
Events
agent— streamed tool/output events from the agent run (seq-tagged).presence— presence updates (deltas with stateVersion) pushed to all connected clients.tick— periodic keepalive/no-op to confirm liveness.shutdown— Gateway is exiting; payload includesreasonand optionalrestartExpectedMs. Clients should reconnect.
WebChat integration
- WebChat is a native SwiftUI UI that talks directly to the Gateway WebSocket for history, sends, abort, and events.
- Remote use goes through the same SSH/Tailscale tunnel; if a gateway token is configured, the client includes it during
connect. - macOS app connects via a single WS (shared connection); it hydrates presence from the initial snapshot and listens for
presenceevents to update the UI.
Typing and validation
- Server validates every inbound frame with AJV against JSON Schema emitted from the protocol definitions.
- Clients (TS/Swift) consume generated types (TS directly; Swift via the repo’s generator).
- Protocol definitions are the source of truth; regenerate schema/models with:
pnpm protocol:genpnpm protocol:gen:swift
Connection snapshot
hello-okincludes asnapshotwithpresence,health,stateVersion, anduptimeMspluspolicy {maxPayload,maxBufferedBytes,tickIntervalMs}so clients can render immediately without extra requests.health/system-presenceremain available for manual refresh, but are not required at connect time.
Error codes (res.error shape)
- Errors use
{ code, message, details?, retryable?, retryAfterMs? }. - Standard codes:
NOT_LINKED— WhatsApp not authenticated.AGENT_TIMEOUT— agent did not respond within the configured deadline.INVALID_REQUEST— schema/param validation failed.UNAVAILABLE— Gateway is shutting down or a dependency is unavailable.
Keepalive behavior
tickevents (or WS ping/pong) are emitted periodically so clients know the Gateway is alive even when no traffic occurs.- Send/agent acknowledgements remain separate responses; do not overload ticks for sends.
Replay / gaps
- Events are not replayed. Clients detect seq gaps and should refresh (
health+system-presence) before continuing. WebChat and macOS clients now auto-refresh on gap.
Supervision (macOS example)
- Use launchd to keep the daemon alive:
- Program: path to
clawdbot - Arguments:
gateway - KeepAlive: true
- StandardOut/Err: file paths or
syslog
- Program: path to
- On failure, launchd restarts; fatal misconfig should keep exiting so the operator notices.
- LaunchAgents are per-user and require a logged-in session; for headless setups use a custom LaunchDaemon (not shipped).
clawdbot daemon installwrites~/Library/LaunchAgents/com.clawdbot.gateway.plist.clawdbot doctoraudits the LaunchAgent config and can update it to current defaults.
Daemon management (CLI)
Use the CLI daemon manager for install/start/stop/restart/status:daemon statusprobes the Gateway RPC by default using the daemon’s resolved port/config (override with--url).daemon status --deepadds system-level scans (LaunchDaemons/system units).daemon status --no-probeskips the RPC probe (useful when networking is down).daemon status --jsonis stable for scripts.daemon statusreports supervisor runtime (launchd/systemd running) separately from RPC reachability (WS connect + status RPC).daemon statusprints config path + probe target to avoid “localhost vs LAN bind” confusion and profile mismatches.daemon statusincludes the last gateway error line when the service looks running but the port is closed.logstails the Gateway file log via RPC (no manualtail/grepneeded).- If other gateway-like services are detected, the CLI warns. We recommend one gateway per machine; one gateway can host multiple agents.
- Cleanup:
clawdbot daemon uninstall(current service) andclawdbot doctor(legacy migrations).
- Cleanup:
daemon installis a no-op when already installed; useclawdbot daemon install --forceto reinstall (profile/env/path changes).
- Clawdbot.app can bundle a bun-compiled gateway binary and install a per-user LaunchAgent labeled
com.clawdbot.gateway. - To stop it cleanly, use
clawdbot daemon stop(orlaunchctl bootout gui/$UID/com.clawdbot.gateway). - To restart, use
clawdbot daemon restart(orlaunchctl kickstart -k gui/$UID/com.clawdbot.gateway).launchctlonly works if the LaunchAgent is installed; otherwise useclawdbot daemon installfirst.
Supervision (systemd user unit)
Clawdbot installs a systemd user service by default on Linux/WSL2. We recommend user services for single-user machines (simpler env, per-user config). Use a system service for multi-user or always-on servers (no lingering required, shared supervision).clawdbot daemon install writes the user unit. clawdbot doctor audits the
unit and can update it to match the current recommended defaults.
Create ~/.config/systemd/user/clawdbot-gateway.service:
/var/lib/systemd/linger).
Then enable the service:
/etc/systemd/system/clawdbot-gateway.service (copy the unit above,
switch WantedBy=multi-user.target, set User= + WorkingDirectory=), then:
Windows (WSL2)
Windows installs should use WSL2 and follow the Linux systemd section above.Operational checks
- Liveness: open WS and send
req:connect→ expectreswithpayload.type="hello-ok"(with snapshot). - Readiness: call
health→ expectok: trueandweb.linked=true. - Debug: subscribe to
tickandpresenceevents; ensurestatusshows linked/auth age; presence entries show Gateway host and connected clients.
Safety guarantees
- Only one Gateway per host; all sends/agent calls must go through it.
- No fallback to direct Baileys connections; if the Gateway is down, sends fail fast.
- Non-connect first frames or malformed JSON are rejected and the socket is closed.
- Graceful shutdown: emit
shutdownevent before closing; clients must handle close + reconnect.
CLI helpers
clawdbot gateway health|status— request health/status over the Gateway WS.clawdbot send --to <num> --message "hi" [--media ...]— send via Gateway (idempotent for WhatsApp).clawdbot agent --message "hi" --to <num>— run an agent turn (waits for final by default).clawdbot gateway call <method> --params '{"k":"v"}'— raw method invoker for debugging.clawdbot daemon stop|restart— stop/restart the supervised gateway service (launchd/systemd).- Gateway helper subcommands assume a running gateway on
--url; they no longer auto-spawn one.
Migration guidance
- Retire uses of
clawdbot gatewayand the legacy TCP control port. - Update clients to speak the WS protocol with mandatory connect and structured presence.