Exploring Clawdbot: Your Self-Hosted AI Assistant
Building your own multi-channel AI assistant with local-first control
The era of personal AI assistants has arrived, but most solutions require surrendering your data to cloud providers. Clawdbot challenges this paradigm by offering a self-hosted, open-source personal AI assistant that runs entirely on your infrastructure while seamlessly integrating with 12+ messaging platforms. With over 12,000 GitHub stars and 7,900+ commits, this project represents one of the most comprehensive local-first AI assistant implementations available today.
In this deep technical analysis, we’ll dissect Clawdbot’s architecture, explore its WebSocket-based protocol, and understand how it orchestrates multi-channel communication while maintaining security through innovative sandboxing techniques.
Architecture Overview: The Gateway-Centric Design
At its core, Clawdbot employs a Gateway-centric architecture where a single long-lived daemon serves as the control plane for all messaging surfaces, tool executions, and client connections.
WhatsApp / Telegram / Slack / Discord / Signal / iMessage / Microsoft Teams
│
▼
┌─────────────────────────────────┐
│ Gateway │
│ (Control Plane) │
│ ws://127.0.0.1:18789 │
└──────────────┬──────────────────┘
│
┌─────────────┼─────────────────┐
│ │ │
▼ ▼ ▼
Pi Agent CLI/macOS App Nodes
(RPC) (WebSocket) (iOS/Android)
The Gateway maintains provider connections across all supported channels while exposing a typed WebSocket API for control-plane clients.
Key Architectural Invariants
The architecture enforces several critical invariants:
- Single Gateway per Host: Exactly one Gateway controls a single messaging session (e.g., Baileys for WhatsApp) per host
- Mandatory Handshake: The first WebSocket frame must be a connect request; non-JSON or non-connect frames result in immediate connection termination
- Event Non-Replay: Events are fire-and-forget; clients must refresh on gaps rather than expecting replay
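The handshake rule can be illustrated with a small validation function. This is a minimal sketch, not Clawdbot's actual code: the `checkFirstFrame` name and the `HandshakeDecision` type are hypothetical, and it assumes the request frame carries `type`, `id`, and `method` fields as in the protocol's request frames.

```typescript
// Hypothetical gateway-side check applied to the first frame on a new socket.
type HandshakeDecision =
  | { accept: true; requestId: string }
  | { accept: false; reason: string };

function checkFirstFrame(text: string): HandshakeDecision {
  let frame: unknown;
  try {
    frame = JSON.parse(text);
  } catch {
    // Non-JSON input: the connection would be terminated.
    return { accept: false, reason: "non-JSON frame" };
  }
  if (typeof frame !== "object" || frame === null) {
    return { accept: false, reason: "frame is not a JSON object" };
  }
  const f = frame as { type?: unknown; method?: unknown; id?: unknown };
  // Only a request frame whose method is "connect" may open the session.
  if (f.type !== "req" || f.method !== "connect" || typeof f.id !== "string") {
    return { accept: false, reason: "first frame must be a connect request" };
  }
  return { accept: true, requestId: f.id };
}
```

A server built this way would close the socket whenever `accept` is false and only then start processing further frames.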
The WebSocket Protocol: Typed Communication at Scale
Clawdbot’s wire protocol is built on WebSocket text frames with JSON payloads, implementing a request-response pattern with server-push events.
Protocol Frame Types
// Request Frame
{
type: "req",
id: string, // Unique request identifier
method: string, // RPC method name
params: object // Method parameters
}
// Response Frame
{
type: "res",
id: string, // Matching request ID
ok: boolean,
payload?: object, // Success payload
error?: object // Error details
}
// Event Frame
{
type: "event",
event: string, // Event type: agent, chat, presence, health, heartbeat, cron
payload: object,
seq?: number, // Sequence number
stateVersion?: string
}
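Because events are not replayed, a client can use the optional `seq` field to notice missed events and trigger a refresh. The sketch below is illustrative only: the `GapDetector` class is hypothetical, and it assumes `seq` increases by exactly one per event, which the source does not state.

```typescript
// Hypothetical client-side helper. Assumption: seq increments by 1 per event.
class GapDetector {
  private lastSeq: number | null = null;

  // Returns true when events were missed and the client should refresh its state.
  observe(seq?: number): boolean {
    if (seq === undefined) return false; // event carries no sequence number
    if (this.lastSeq === null) {
      this.lastSeq = seq; // first sequenced event sets the baseline
      return false;
    }
    if (seq <= this.lastSeq) return false; // duplicate or stale event, ignore
    const gap = seq !== this.lastSeq + 1;
    this.lastSeq = seq;
    return gap;
  }
}
```

On a detected gap the client would re-fetch a fresh snapshot rather than wait for the Gateway to resend anything.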
Connection Lifecycle
The connection lifecycle follows a strict handshake protocol:
Client Gateway
│ │
│──── req:connect ────────▶│
│◀────── res (ok) ─────────│ (payload includes presence + health snapshot)
│ │
│◀────── event:presence ───│
│◀────── event:tick ───────│
│ │
│──── req:agent ──────────▶│
│◀────── res:agent ────────│ (ack: {runId, status:"accepted"})
│◀────── event:agent ──────│ (streaming response)
│◀────── res:agent ────────│ (final: {runId, status, summary})
Idempotency and Request Deduplication
For side-effecting methods (send, agent), idempotency keys are mandatory:
{
"type": "req",
"id": "uuid-v4",
"method": "send",
"params": {
"idempotencyKey": "unique-operation-key",
"channel": "whatsapp",
"to": "+1234567890",
"message": "Hello from Clawdbot"
}
}
The Gateway maintains a short-lived dedupe cache, enabling safe retries without duplicate message delivery.
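One way to picture such a dedupe cache is a map from idempotency key to cached result with a time-to-live. This is a sketch under stated assumptions: the `DedupeCache` class, its TTL parameter, and the injectable clock are all hypothetical and not taken from the Clawdbot source.

```typescript
// Hypothetical TTL-based dedupe cache keyed by idempotencyKey.
class DedupeCache<T> {
  private entries = new Map<string, { result: T; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Runs `operation` at most once per key within the TTL; retries get the cached result.
  run(key: string, operation: () => T): T {
    const t = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > t) return hit.result;
    const result = operation();
    this.entries.set(key, { result, expiresAt: t + this.ttlMs });
    return result;
  }

  // Drops expired entries so the cache stays short-lived.
  sweep(): void {
    const t = this.now();
    this.entries.forEach((entry, key) => {
      if (entry.expiresAt <= t) this.entries.delete(key);
    });
  }
}
```

With this shape, a client that times out and resends the same `send` request with the same key gets the original result back instead of delivering the message twice.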
Multi-Channel Integration: Protocol Adapters Deep Dive
Clawdbot supports 12+ messaging platforms through dedicated protocol adapters:
| Channel | Library/Protocol | Authentication |
|---|---|---|
| WhatsApp | Baileys | QR-based device linking |
| Telegram | grammY | Bot Token |
| Slack | Bolt | Bot Token + App Token |
| Discord | discord.js | Bot Token |
| Signal | signal-cli | Phone number registration |
| iMessage | imsg (macOS only) | Messages.app session |
| Microsoft Teams | Bot Framework | Azure AD OAuth |
| Matrix | matrix-js-sdk | Homeserver auth |
| Google Chat | Chat API | Service Account |
Channel Routing Architecture
The routing system maps inbound messages to sessions through a multi-level resolution:
interface ChannelRoute {
channel: ChannelType;
accountId?: string; // Optional: specific account
peerId?: string; // Optional: specific contact
groupId?: string; // Optional: specific group
agentId: string; // Target agent/workspace
sessionIsolation: 'per-peer' | 'per-group' | 'shared';
}
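The source does not spell out the resolution order, so the following is only a sketch of one plausible reading: the most specific matching route wins (peer over group over account over channel). The `Inbound` type, the `specificity` scoring, and `resolveRoute` are hypothetical names introduced for illustration.

```typescript
type ChannelType = string; // e.g. "whatsapp", "telegram"

interface ChannelRoute {
  channel: ChannelType;
  accountId?: string;
  peerId?: string;
  groupId?: string;
  agentId: string;
  sessionIsolation: "per-peer" | "per-group" | "shared";
}

// Hypothetical shape of an inbound message's routing keys.
interface Inbound {
  channel: ChannelType;
  accountId: string;
  peerId: string;
  groupId?: string;
}

// Returns -1 when the route does not match; otherwise a higher score for more specific routes.
function specificity(route: ChannelRoute, msg: Inbound): number {
  if (route.channel !== msg.channel) return -1;
  let score = 0;
  if (route.accountId !== undefined) {
    if (route.accountId !== msg.accountId) return -1;
    score += 1;
  }
  if (route.groupId !== undefined) {
    if (route.groupId !== msg.groupId) return -1;
    score += 2;
  }
  if (route.peerId !== undefined) {
    if (route.peerId !== msg.peerId) return -1;
    score += 4;
  }
  return score;
}

// Picks the best-scoring route, or undefined when nothing matches.
function resolveRoute(routes: ChannelRoute[], msg: Inbound): ChannelRoute | undefined {
  let best: ChannelRoute | undefined;
  let bestScore = -1;
  for (const route of routes) {
    const score = specificity(route, msg);
    if (score > bestScore) {
      best = route;
      bestScore = score;
    }
  }
  return best;
}
```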
Group messages support configurable activation modes:
- Mention Gating: Agent responds only when explicitly @mentioned
- Always Active: Agent participates in all group messages
- Reply Tags: Respond only to direct replies to agent messages
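The three activation modes reduce to a small decision function. This is a hedged sketch: the mode identifiers, the `GroupMessage` shape, and `shouldRespond` are assumptions made for illustration, not Clawdbot's real types.

```typescript
// Hypothetical identifiers for the three modes described above.
type GroupActivation = "mention" | "always" | "reply";

// Hypothetical, simplified view of a group message.
interface GroupMessage {
  text: string;
  mentions: string[];        // ids that were @mentioned
  replyToAuthorId?: string;  // author of the message being replied to, if any
}

function shouldRespond(mode: GroupActivation, msg: GroupMessage, agentId: string): boolean {
  switch (mode) {
    case "always":
      return true;
    case "mention":
      return msg.mentions.indexOf(agentId) !== -1;
    case "reply":
      return msg.replyToAuthorId === agentId;
  }
}
```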
Session Management and Context Pruning
Clawdbot tracks context per session and prunes it automatically as a conversation approaches its token limit.
Session Model
interface Session {
id: string;
type: 'main' | 'group' | 'channel';
agentId: string;
context: Message[];
tokenCount: number;
metadata: {
thinkingLevel: 'off' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';
verboseLevel: boolean;
model: string;
sendPolicy: 'immediate' | 'batch';
groupActivation: 'mention' | 'always';
};
}
Compaction Strategy
When a session's context exceeds its token threshold, Clawdbot compacts it using the following mechanisms:
- Summary Generation: Recent context is summarized by the LLM
- Semantic Preservation: Key facts and ongoing task state are preserved
- Sliding Window: Oldest non-essential messages are pruned
- User Notification: Optional /compact command for manual triggering
# Manual compaction via slash command
/compact
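The sliding-window and summary ideas can be combined in a few lines. This is a simplified sketch under several assumptions: the `pinned` flag, the `summaryBudget` parameter, and the choice to summarize exactly the pruned messages are hypothetical, and `summarize` is a synchronous stand-in for what would be an LLM call in practice.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  text: string;
  tokens: number;
  pinned?: boolean; // hypothetical marker for messages that must survive pruning
}

function totalTokens(messages: Message[]): number {
  return messages.reduce((sum, m) => sum + m.tokens, 0);
}

// Prunes the oldest non-pinned messages until the context fits, then prepends a summary of them.
function compact(
  context: Message[],
  maxTokens: number,
  summaryBudget: number,
  summarize: (pruned: Message[]) => Message
): Message[] {
  if (totalTokens(context) <= maxTokens) return context;
  const kept = context.slice();
  const pruned: Message[] = [];
  let i = 0;
  // Leave room for the summary message itself.
  while (i < kept.length && totalTokens(kept) > maxTokens - summaryBudget) {
    if (kept[i].pinned) {
      i += 1; // skip essential messages
    } else {
      pruned.push(kept.splice(i, 1)[0]);
    }
  }
  if (pruned.length === 0) return kept;
  return [summarize(pruned), ...kept];
}
```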
Security Architecture: DM Pairing and Sandboxing
DM Pairing Protocol
By default, Clawdbot implements a pairing protocol for unknown senders:
Unknown Sender → Message → Gateway
│
▼
Generate 6-digit code
│
▼
Send "Pair with code: XXXXXX"
│
▼
Admin: clawdbot pairing approve <channel> <code>
│
▼
Add to local allowlist
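The pairing flow above can be sketched as a small in-memory store. Everything here is illustrative: `PairingStore` and its methods are hypothetical names, the code generator is injected by the caller, and a real implementation would need a cryptographically secure source (for example Node's `crypto.randomInt`) plus code expiry, which this sketch omits.

```typescript
// Caller supplies the generator; it should be cryptographically random in real use.
type CodeGenerator = () => string;

class PairingStore {
  private pending = new Map<string, string>(); // "channel:code" -> senderId
  private allowlist = new Set<string>();       // "channel:senderId"
  private generateCode: CodeGenerator;

  constructor(generateCode: CodeGenerator) {
    this.generateCode = generateCode;
  }

  // Returns the pairing code to send back, or null when the sender is already allowed.
  handleInbound(channel: string, senderId: string): string | null {
    if (this.isAllowed(channel, senderId)) return null;
    const code = this.generateCode();
    this.pending.set(channel + ":" + code, senderId);
    return code;
  }

  // Mirrors the admin step: clawdbot pairing approve <channel> <code>
  approve(channel: string, code: string): boolean {
    const key = channel + ":" + code;
    const senderId = this.pending.get(key);
    if (senderId === undefined) return false;
    this.pending.delete(key); // a code can only be used once
    this.allowlist.add(channel + ":" + senderId);
    return true;
  }

  isAllowed(channel: string, senderId: string): boolean {
    return this.allowlist.has(channel + ":" + senderId);
  }
}
```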
Configuration options:
{
"channels": {
"discord": {
"dm": {
"policy": "pairing", // or "open" for unrestricted
"allowFrom": ["*"] // Explicit opt-in for public access
}
}
}
}
Docker Sandboxing for Non-Main Sessions
For group and channel sessions, Clawdbot supports per-session Docker sandboxes:
{
"agents": {
"defaults": {
"sandbox": {
"mode": "non-main", // Sandbox non-main sessions
"allowlist": [
"bash", "process", "read", "write", "edit",
"sessions_list", "sessions_history", "sessions_send"
],
"denylist": [
"browser", "canvas", "nodes", "cron", "discord", "gateway"
]
}
}
}
}
Each sandboxed session runs in an isolated Docker container with:
- Restricted filesystem access
- Network isolation
- Resource limits (CPU, memory)
- Tool-level permissions
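Tool-level permissions can be pictured as a check against the allowlist and denylist from the configuration above. This is a sketch only: `isToolAllowed` is a hypothetical helper, and it assumes that main sessions are unrestricted, that the denylist takes precedence, and that tools on neither list are denied inside a sandbox. None of these rules is confirmed by the source.

```typescript
interface SandboxPolicy {
  mode: string; // "non-main" is the only value shown in the config above
  allowlist: string[];
  denylist: string[];
}

type SessionType = "main" | "group" | "channel";

function isToolAllowed(policy: SandboxPolicy, session: SessionType, tool: string): boolean {
  const sandboxed = policy.mode === "non-main" && session !== "main";
  if (!sandboxed) return true;                               // assumption: main sessions are unrestricted
  if (policy.denylist.indexOf(tool) !== -1) return false;    // assumption: deny wins over allow
  return policy.allowlist.indexOf(tool) !== -1;              // assumption: default-deny inside the sandbox
}
```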
Node Architecture: Distributed Device Capabilities
Nodes (macOS/iOS/Android) connect to the Gateway as capability providers:
interface NodeConnection {
role: 'node';
caps: {
canvas: boolean;
camera: boolean;
screenRecording: boolean;
notifications: boolean;
location: boolean;
};
commands: string[];
permissions: PermissionMap;
}
Node Command Routing
Agent requests camera.snap
│
▼
Gateway checks node registry
│
▼
node.invoke { command: "camera.snap", params: {...} }
│
▼
iOS/Android Node executes, returns image
│
▼
Agent receives base64 image data
Available node commands:
- canvas.* – Visual workspace operations
- camera.snap / camera.clip – Photo/video capture
- screen.record – Screen recording
- location.get – GPS coordinates
- system.run (macOS only) – Local command execution
- system.notify – Native notifications
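The registry lookup in the routing flow might look like the following. This is a hedged sketch: `NodeEntry`, `commandMatches`, and `findNodeForCommand` are hypothetical, and treating a trailing `.*` as a prefix wildcard is an assumption based on the `canvas.*` notation above.

```typescript
// Hypothetical registry entry for a connected node.
interface NodeEntry {
  id: string;
  platform: "macos" | "ios" | "android";
  commands: string[]; // advertised commands, e.g. "camera.snap" or "canvas.*"
}

// Assumption: a pattern ending in ".*" matches every command under that namespace.
function commandMatches(pattern: string, command: string): boolean {
  if (pattern.slice(-2) === ".*") {
    return command.indexOf(pattern.slice(0, -1)) === 0;
  }
  return pattern === command;
}

// Returns the first connected node that advertises the command, if any.
function findNodeForCommand(registry: NodeEntry[], command: string): NodeEntry | undefined {
  for (const node of registry) {
    if (node.commands.some((pattern) => commandMatches(pattern, command))) {
      return node;
    }
  }
  return undefined;
}
```

If no node matches, the Gateway would presumably report the capability as unavailable to the agent instead of invoking anything.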
Agent Runtime: The Pi Agent Loop
The core agent runtime follows a reactive loop pattern:
Input Message
│
▼
┌─────────────────┐
│ Parse Intent │
│ & Load Context │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Tool Selection │◀──────────────┐
│ (if needed) │ │
└────────┬────────┘ │
│ │
▼ │
┌─────────────────┐ │
│ Execute Tool │ │
│ (sandboxed) │ │
└────────┬────────┘ │
│ │
▼ │
┌─────────────────┐ │
│ LLM Processing │───(needs more │
│ with Results │ tools?)────┘
└────────┬────────┘
│
▼
┌─────────────────┐
│ Stream Response│
│ to Channel │
└─────────────────┘
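The loop in the diagram can be reduced to a few lines of control flow. This is an illustrative sketch rather than the Pi agent's real code: the `LlmStep`, `Llm`, and `Tools` types and the `runAgent` function are hypothetical, the model is a stub, calls are synchronous for brevity, and the step limit is an added safeguard that the source does not mention.

```typescript
// What the (stubbed) model decides at each turn: call a tool or finish.
type LlmStep =
  | { kind: "tool"; name: string; input: string }
  | { kind: "final"; text: string };

type Llm = (transcript: string[]) => LlmStep;
type Tools = Record<string, (input: string) => string>;

function runAgent(message: string, llm: Llm, tools: Tools, maxSteps: number = 8): string {
  const transcript: string[] = ["user: " + message];
  for (let step = 0; step < maxSteps; step++) {
    const next = llm(transcript);
    if (next.kind === "final") return next.text; // stream this back to the channel
    const tool = tools[next.name];
    const output = tool ? tool(next.input) : "error: unknown tool " + next.name;
    // Feed the tool result back so the model can decide whether it needs more tools.
    transcript.push("tool " + next.name + ": " + output);
  }
  return "error: step limit reached";
}
```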
Thinking Levels
Clawdbot supports configurable reasoning depth (for compatible models):
| Level | Description | Token Overhead |
|---|---|---|
| off | No explicit reasoning | Minimal |
| minimal | Brief consideration | ~100 tokens |
| low | Quick analysis | ~300 tokens |
| medium | Thorough reasoning | ~800 tokens |
| high | Deep analysis | ~2000 tokens |
| xhigh | Exhaustive reasoning | ~5000+ tokens |
# Set thinking level via slash command
/think high
Browser Control: CDP Integration
Clawdbot includes a dedicated browser control system using Chrome DevTools Protocol:
{
"browser": {
"enabled": true,
"controlUrl": "http://127.0.0.1:18791",
"color": "#FF4500",
"profile": "clawd-browser"
}
}
Capabilities:
- Full page snapshots (screenshot + DOM)
- Click, type, scroll actions
- Form filling
- File uploads
- Multi-tab management
- Profile persistence
Voice Integration: Wake Word and Talk Mode
Voice Wake Architecture
Audio Input Stream
│
▼
┌───────────────────┐
│ VAD Detection │
│ (Voice Activity) │
└────────┬──────────┘
│
▼
┌───────────────────┐
│ Wake Word Model │
│ (Local inference)│
└────────┬──────────┘
│
▼
┌───────────────────┐
│ Speech-to-Text │
│ (Whisper/Cloud) │
└────────┬──────────┘
│
▼
Agent
Talk Mode: Continuous Conversation
Talk Mode enables always-on voice interaction with:
- Push-to-talk or voice-activated modes
- ElevenLabs TTS integration for responses
- Interrupt handling for natural conversation
- Silence detection for turn-taking
Inter-Agent Communication: Sessions Tools
Clawdbot supports multi-agent coordination through session tools:
// List active sessions
sessions_list() → Session[]
// Fetch another session's history
sessions_history({ sessionId: string }) → Message[]
// Send to another session
sessions_send({
sessionId: string,
message: string,
replyBack?: boolean, // Enable ping-pong
announceStep?: boolean // Notify source session
})
Use cases:
- Task delegation across specialized agents
- Cross-channel message forwarding
- Supervisor/worker patterns
Deployment Topologies
Local Single-User (Recommended)
┌─────────────────────────────────────┐
│ Local Machine │
│ │
│ ┌─────────┐ ┌─────────────┐ │
│ │ Gateway │◀───▶│ macOS App │ │
│ └────┬────┘ └─────────────┘ │
│ │ │
│ ├───▶ WhatsApp (Baileys) │
│ ├───▶ Telegram │
│ └───▶ Other Channels │
└─────────────────────────────────────┘
Remote Gateway (Linux Server)
┌─────────────────┐ ┌──────────────────────┐
│ Client Mac │ │ Linux Server │
│ │ │ │
│ ┌───────────┐ │ SSH/ │ ┌─────────────────┐│
│ │ macOS App │◀┼─Tailscale─▶│ Gateway ││
│ └───────────┘ │ │ └────────┬────────┘│
│ │ │ │ │
│ ┌───────────┐ │ │ ▼ │
│ │ iOS Node │◀┼─────────│──▶ Channel Adapters │
│ └───────────┘ │ │ │
└─────────────────┘ └──────────────────────┘
Tailscale Integration
Native Tailscale Serve/Funnel support:
{
"gateway": {
"tailscale": {
"mode": "serve", // "off" | "serve" | "funnel"
"resetOnExit": true
},
"auth": {
"mode": "password", // Required for funnel
"allowTailscale": true // Trust Tailscale headers
}
}
}
Installation and Quick Start
Prerequisites
- Node.js ≥22
- pnpm (recommended) or npm
Installation
# Install globally
npm install -g clawdbot@latest
# Run onboarding wizard
clawdbot onboard --install-daemon
# Start gateway
clawdbot gateway --port 18789 --verbose
Basic Agent Interaction
# Send a message
clawdbot message send --to +1234567890 --message "Hello from Clawdbot"
# Talk to the agent
clawdbot agent --message "Ship checklist" --thinking high
Health Check
clawdbot doctor
Model Configuration
Clawdbot supports multiple LLM providers with failover:
{
"agent": {
"model": "anthropic/claude-opus-4-5",
"fallback": [
"anthropic/claude-sonnet-4-5",
"openai/gpt-5"
]
}
}
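Failover across the configured models amounts to trying each one in order until a call succeeds. The sketch below is an assumption about how that could work, not Clawdbot's implementation: `completeWithFailover` and `ModelCall` are hypothetical, and the call is synchronous for brevity where a real provider request would be asynchronous.

```typescript
// Stand-in for a provider request; throws on failure.
type ModelCall = (model: string, prompt: string) => string;

function completeWithFailover(
  primary: string,
  fallback: string[],
  prompt: string,
  call: ModelCall
): { model: string; text: string } {
  const errors: string[] = [];
  for (const model of [primary, ...fallback]) {
    try {
      return { model, text: call(model, prompt) };
    } catch (err) {
      // Record the failure and move on to the next configured model.
      errors.push(model + ": " + (err instanceof Error ? err.message : String(err)));
    }
  }
  throw new Error("all models failed: " + errors.join("; "));
}
```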
OAuth authentication is supported for:
- Anthropic (Claude Pro/Max subscriptions)
- OpenAI (ChatGPT/Codex)
Conclusion
Clawdbot represents a significant engineering achievement in the personal AI assistant space. Its Gateway-centric architecture provides a solid foundation for multi-channel communication while maintaining local control. The typed WebSocket protocol ensures reliable client-server communication, and the sandboxing system enables safe execution of AI-driven tools.
Key technical strengths:
- True local-first design with optional remote access
- 12+ channel integrations through well-abstracted adapters
- Production-grade security with DM pairing and Docker sandboxing
- Cross-platform node support for distributed device capabilities
- Extensible skills system for custom functionality
For developers building personal AI assistants or exploring multi-channel AI integration, Clawdbot provides an excellent reference architecture and a genuinely useful day-to-day tool.
Resources:
- GitHub: https://github.com/clawdbot/clawdbot
- Documentation: https://docs.clawd.bot
- Discord: https://discord.gg/clawd
- License: MIT
Written for the Collabnix community – bridging the gap between AI innovation and practical implementation.