Collabnix Team The Collabnix Team is a diverse collective of Docker, Kubernetes, and IoT experts united by a passion for cloud-native technologies. With backgrounds spanning across DevOps, platform engineering, cloud architecture, and container orchestration, our contributors bring together decades of combined experience from various industries and technical domains.

What is Clawdbot and why is it getting popular?


Exploring Clawdbot: Your Self-Hosted AI Assistant

Building your own multi-channel AI assistant with local-first control


The era of personal AI assistants has arrived, but most solutions require surrendering your data to cloud providers. Clawdbot challenges this paradigm by offering a self-hosted, open-source personal AI assistant that runs entirely on your infrastructure while seamlessly integrating with 12+ messaging platforms. With over 12,000 GitHub stars and 7,900+ commits, this project represents one of the most comprehensive local-first AI assistant implementations available today.

In this deep technical analysis, we’ll dissect Clawdbot’s architecture, explore its WebSocket-based protocol, and understand how it orchestrates multi-channel communication while maintaining security through innovative sandboxing techniques.


Architecture Overview: The Gateway-Centric Design

At its core, Clawdbot employs a Gateway-centric architecture where a single long-lived daemon serves as the control plane for all messaging surfaces, tool executions, and client connections.

WhatsApp / Telegram / Slack / Discord / Signal / iMessage / Microsoft Teams
                          │
                          ▼
          ┌─────────────────────────────────┐
          │            Gateway              │
          │       (Control Plane)           │
          │      ws://127.0.0.1:18789       │
          └──────────────┬──────────────────┘
                         │
           ┌─────────────┼─────────────────┐
           │             │                 │
           ▼             ▼                 ▼
     Pi Agent       CLI/macOS App      Nodes
      (RPC)         (WebSocket)    (iOS/Android)

The Gateway maintains provider connections across all supported channels while exposing a typed WebSocket API for control-plane clients.

Key Architectural Invariants

The architecture enforces several critical invariants:

  1. Single Gateway per Host: Exactly one Gateway runs per host, and it alone owns each messaging session (e.g., the Baileys session for WhatsApp)
  2. Mandatory Handshake: The first WebSocket frame must be a connect request; non-JSON or non-connect frames result in immediate connection termination
  3. Event Non-Replay: Events are fire-and-forget; clients must refresh on gaps rather than expecting replay
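
The mandatory-handshake invariant can be illustrated with a small validator. This is a minimal sketch, not Clawdbot's actual implementation: `checkFirstFrame` and `FirstFrameVerdict` are hypothetical names, and the sketch assumes the connect request is a JSON object with `type: "req"` and `method: "connect"`, as the protocol section describes.

```typescript
// Hypothetical verdict type: either accept the socket or say why it is closed.
type FirstFrameVerdict =
  | { accept: true; requestId: string }
  | { accept: false; reason: string };

// Decide whether the first text frame on a new socket is a valid connect request.
function checkFirstFrame(raw: string): FirstFrameVerdict {
  let frame: unknown;
  try {
    frame = JSON.parse(raw);
  } catch {
    return { accept: false, reason: "not JSON" };
  }
  if (typeof frame !== "object" || frame === null) {
    return { accept: false, reason: "not a JSON object" };
  }
  const f = frame as Record<string, unknown>;
  if (f.type !== "req" || f.method !== "connect" || typeof f.id !== "string") {
    return { accept: false, reason: "first frame is not a connect request" };
  }
  return { accept: true, requestId: f.id };
}
```

A server built this way would terminate the connection whenever `accept` is false, before any other method is dispatched.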

The WebSocket Protocol: Typed Communication at Scale

Clawdbot’s wire protocol is built on WebSocket text frames with JSON payloads, implementing a request-response pattern with server-push events.

Protocol Frame Types

// Request frame
interface RequestFrame {
  type: "req";
  id: string;          // Unique request identifier
  method: string;      // RPC method name
  params: object;      // Method parameters
}

// Response frame
interface ResponseFrame {
  type: "res";
  id: string;          // Matches the request ID
  ok: boolean;
  payload?: object;    // Success payload
  error?: object;      // Error details
}

// Event frame
interface EventFrame {
  type: "event";
  event: string;       // Event type: agent, chat, presence, health, heartbeat, cron
  payload: object;
  seq?: number;        // Sequence number
  stateVersion?: string;
}
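
Because events are not replayed, a client needs some way to notice that it missed one. The following sketch uses the optional `seq` field for that. It assumes that `seq` increases by exactly 1 per event on a connection, which the article does not state; `EventGapDetector` and `SequencedEvent` are hypothetical names.

```typescript
// Only the fields this sketch needs from an event frame.
interface SequencedEvent {
  event: string;
  seq?: number;
}

class EventGapDetector {
  private lastSeq: number | undefined;

  // Returns true when a gap is detected and the client should refresh its state.
  observe(ev: SequencedEvent): boolean {
    if (ev.seq === undefined) {
      return false; // unsequenced events carry no ordering information
    }
    const gap = this.lastSeq !== undefined && ev.seq !== this.lastSeq + 1;
    this.lastSeq = ev.seq;
    return gap;
  }
}
```

On a detected gap the client would re-request a fresh snapshot rather than wait for the missing events.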

Connection Lifecycle

The connection lifecycle follows a strict handshake protocol:

Client                    Gateway
  │                          │
  │──── req:connect ────────▶│
  │◀────── res (ok) ─────────│  (payload includes presence + health snapshot)
  │                          │
  │◀────── event:presence ───│
  │◀────── event:tick ───────│
  │                          │
  │──── req:agent ──────────▶│
  │◀────── res:agent ────────│  (ack: {runId, status:"accepted"})
  │◀────── event:agent ──────│  (streaming response)
  │◀────── res:agent ────────│  (final: {runId, status, summary})
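
The lifecycle above shows two responses for a single `agent` request: an acknowledgement, then a final result. A client can correlate both with the original request by `id`. The sketch below is one possible way to do this; `PendingRequests` and `isAck` are hypothetical names, and treating `status: "accepted"` as the marker of a non-final response is an assumption drawn from the diagram.

```typescript
type ResFrame = { type: "res"; id: string; ok: boolean; payload?: unknown; error?: unknown };

// Assumption: an ack is a successful response whose payload has status "accepted".
function isAck(res: ResFrame): boolean {
  const p = res.payload;
  return res.ok && typeof p === "object" && p !== null
    && (p as { status?: unknown }).status === "accepted";
}

class PendingRequests {
  private readonly pending = new Map<string, (res: ResFrame) => void>();

  // Register a handler for every response that carries this request id.
  track(id: string, onResponse: (res: ResFrame) => void): void {
    this.pending.set(id, onResponse);
  }

  // Route an incoming response to its handler; returns false if nothing is waiting.
  dispatch(res: ResFrame): boolean {
    const handler = this.pending.get(res.id);
    if (handler === undefined) return false;
    if (!isAck(res)) this.pending.delete(res.id); // final response: stop tracking
    handler(res);
    return true;
  }
}
```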

Idempotency and Request Deduplication

For side-effecting methods (send, agent), idempotency keys are mandatory:

{
  "type": "req",
  "id": "uuid-v4",
  "method": "send",
  "params": {
    "idempotencyKey": "unique-operation-key",
    "channel": "whatsapp",
    "to": "+1234567890",
    "message": "Hello from Clawdbot"
  }
}

The Gateway maintains a short-lived dedupe cache, enabling safe retries without duplicate message delivery.
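
To make the retry behaviour concrete, here is a minimal sketch of a time-bounded dedupe cache keyed by idempotency key. It is not Clawdbot's code: `DedupeCache`, its `run` method, and the explicit `now` parameter (passed in so the example stays deterministic) are assumptions for illustration.

```typescript
class DedupeCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Runs `operation` at most once per key within the TTL window.
  // A retry with the same key inside the window gets the first result back.
  run(key: string, now: number, operation: () => T): T {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > now) {
      return hit.value;
    }
    const value = operation();
    // An expired entry is simply overwritten; this sketch does no background purging.
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }
}
```

With a cache like this, a client that times out and resends the same `send` request causes one delivery rather than two, as long as the retry arrives before the entry expires.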


Multi-Channel Integration: Protocol Adapters Deep Dive

Clawdbot supports 12+ messaging platforms through dedicated protocol adapters:

Channel         | Library/Protocol  | Authentication
----------------|-------------------|---------------------------
WhatsApp        | Baileys           | QR-based device linking
Telegram        | grammY            | Bot Token
Slack           | Bolt              | Bot Token + App Token
Discord         | discord.js        | Bot Token
Signal          | signal-cli        | Phone number registration
iMessage        | imsg (macOS only) | Messages.app session
Microsoft Teams | Bot Framework     | Azure AD OAuth
Matrix          | matrix-js-sdk     | Homeserver auth
Google Chat     | Chat API          | Service Account

Channel Routing Architecture

The routing system maps inbound messages to sessions through a multi-level resolution:

interface ChannelRoute {
  channel: ChannelType;
  accountId?: string;      // Optional: specific account
  peerId?: string;         // Optional: specific contact
  groupId?: string;        // Optional: specific group
  agentId: string;         // Target agent/workspace
  sessionIsolation: 'per-peer' | 'per-group' | 'shared';
}
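
The article does not spell out the resolution order, so the following is only a sketch of what "multi-level resolution" could look like: among routes that match an inbound message, the most specific one wins. The precedence (peer over group over account over channel-wide) and the names `RouteSketch`, `InboundMessage`, and `resolveRoute` are assumptions.

```typescript
type RouteSketch = { channel: string; accountId?: string; peerId?: string; groupId?: string; agentId: string };
type InboundMessage = { channel: string; accountId?: string; peerId?: string; groupId?: string };

// Higher score means a more specific route (assumed precedence).
function specificity(r: RouteSketch): number {
  return (r.peerId !== undefined ? 4 : 0)
    + (r.groupId !== undefined ? 2 : 0)
    + (r.accountId !== undefined ? 1 : 0);
}

// A route matches when every field it specifies equals the message's field.
function routeMatches(r: RouteSketch, m: InboundMessage): boolean {
  return r.channel === m.channel
    && (r.accountId === undefined || r.accountId === m.accountId)
    && (r.peerId === undefined || r.peerId === m.peerId)
    && (r.groupId === undefined || r.groupId === m.groupId);
}

function resolveRoute(routes: RouteSketch[], m: InboundMessage): RouteSketch | undefined {
  return routes
    .filter((r) => routeMatches(r, m))
    .sort((a, b) => specificity(b) - specificity(a))[0];
}
```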

Group messages support configurable activation modes:

  • Mention Gating: Agent responds only when explicitly @mentioned
  • Always Active: Agent participates in all group messages
  • Reply Tags: Respond only to direct replies to agent messages
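
The three activation modes reduce to a single decision per group message. The sketch below is illustrative only: the mode labels, the `GroupMessage` shape, and `shouldRespond` are hypothetical, and real adapters would derive mentions and reply targets from each platform's own message metadata.

```typescript
type ActivationMode = "mention" | "always" | "reply";

interface GroupMessage {
  text: string;
  mentions: string[];       // handles @mentioned in the message
  replyToAuthor?: string;   // author of the message being replied to, if any
}

function shouldRespond(mode: ActivationMode, agentHandle: string, msg: GroupMessage): boolean {
  switch (mode) {
    case "always":
      return true;
    case "mention":
      return msg.mentions.includes(agentHandle);
    case "reply":
      return msg.replyToAuthor === agentHandle;
  }
}
```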

Session Management and Context Pruning

Clawdbot tracks each session's context and token count, and prunes that context automatically once it exceeds a token threshold.

Session Model

interface Session {
  id: string;
  type: 'main' | 'group' | 'channel';
  agentId: string;
  context: Message[];
  tokenCount: number;
  metadata: {
    thinkingLevel: 'off' | 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';
    verboseLevel: boolean;
    model: string;
    sendPolicy: 'immediate' | 'batch';
    groupActivation: 'mention' | 'always';
  };
}

Compaction Strategy

When context exceeds token thresholds, the system employs intelligent compaction:

  1. Summary Generation: Recent context is summarized by the LLM
  2. Semantic Preservation: Key facts and ongoing task state are preserved
  3. Sliding Window: Oldest non-essential messages are pruned
  4. Manual Trigger: Users can also force compaction at any time with the optional /compact command

# Manual compaction via slash command
/compact
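
A rough sketch of the compaction steps, under stated assumptions: messages carry a precomputed token count, "essential" messages are marked with a hypothetical `pinned` flag, and the summary is produced by a caller-supplied `summarize` function standing in for the LLM call. This version summarizes the messages it prunes, which is one plausible reading of the steps above rather than a description of Clawdbot's internals.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  text: string;
  tokens: number;
  pinned?: boolean; // hypothetical marker for key facts and task state
}

function totalTokens(ms: ChatMessage[]): number {
  return ms.reduce((n, m) => n + m.tokens, 0);
}

// Prunes the oldest unpinned messages until the context, plus room reserved
// for a summary, fits the budget; then prepends a summary of what was pruned.
function compactContext(
  context: ChatMessage[],
  maxTokens: number,
  summaryBudget: number,
  summarize: (dropped: ChatMessage[]) => ChatMessage,
): ChatMessage[] {
  if (totalTokens(context) <= maxTokens) return context;

  const kept = [...context];
  const dropped: ChatMessage[] = [];
  let i = 0;
  while (i < kept.length && totalTokens(kept) + summaryBudget > maxTokens) {
    if (kept[i].pinned) {
      i++; // essential message: skip it
    } else {
      dropped.push(kept.splice(i, 1)[0]);
    }
  }
  return dropped.length === 0 ? kept : [summarize(dropped), ...kept];
}
```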

Security Architecture: DM Pairing and Sandboxing

DM Pairing Protocol

By default, Clawdbot implements a pairing protocol for unknown senders:

Unknown Sender → Message → Gateway
                              │
                              ▼
                    Generate 6-digit code
                              │
                              ▼
              Send "Pair with code: XXXXXX"
                              │
                              ▼
            Admin: clawdbot pairing approve <channel> <code>
                              │
                              ▼
                   Add to local allowlist
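
The pairing flow can be modelled as a small state machine: issue a code, hold it as pending, and move the sender to the allowlist on approval. The sketch below is an assumption-laden illustration, not Clawdbot's implementation: `PairingStore` and its methods are hypothetical, codes are single-use, and the injected random source defaults to `Math.random`, which is not cryptographically secure and would need replacing in real use.

```typescript
class PairingStore {
  private readonly pending = new Map<string, string>(); // "channel:code" -> sender
  private readonly allowlist = new Set<string>();       // "channel:sender"
  private readonly random: () => number;

  constructor(random: () => number = Math.random) {
    this.random = random;
  }

  isAllowed(channel: string, sender: string): boolean {
    return this.allowlist.has(`${channel}:${sender}`);
  }

  // Called when an unknown sender writes in; returns the 6-digit code to send back.
  requestPairing(channel: string, sender: string): string {
    const code = Math.floor(this.random() * 1_000_000).toString().padStart(6, "0");
    this.pending.set(`${channel}:${code}`, sender);
    return code;
  }

  // Models the admin step `clawdbot pairing approve <channel> <code>`.
  approve(channel: string, code: string): boolean {
    const key = `${channel}:${code}`;
    const sender = this.pending.get(key);
    if (sender === undefined) return false;
    this.pending.delete(key); // codes are single-use in this sketch
    this.allowlist.add(`${channel}:${sender}`);
    return true;
  }
}
```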

Configuration options:

{
  "channels": {
    "discord": {
      "dm": {
        "policy": "pairing",     // or "open" for unrestricted
        "allowFrom": ["*"]       // Explicit opt-in for public access
      }
    }
  }
}

Docker Sandboxing for Non-Main Sessions

For group and channel sessions, Clawdbot supports per-session Docker sandboxes:

{
  "agents": {
    "defaults": {
      "sandbox": {
        "mode": "non-main",      // Sandbox non-main sessions
        "allowlist": [
          "bash", "process", "read", "write", "edit",
          "sessions_list", "sessions_history", "sessions_send"
        ],
        "denylist": [
          "browser", "canvas", "nodes", "cron", "discord", "gateway"
        ]
      }
    }
  }
}

Each sandboxed session runs in an isolated Docker container with:

  • Restricted filesystem access
  • Network isolation
  • Resource limits (CPU, memory)
  • Tool-level permissions
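
Tool-level permissions can be sketched as a check against the allowlist and denylist from the configuration above. The evaluation rules here are assumptions, since the article does not define them: the denylist takes precedence, tools on neither list are refused, and main sessions are left unfiltered. `SandboxPolicy` and `isToolAllowed` are hypothetical names.

```typescript
interface SandboxPolicy {
  mode: string;        // "non-main" sandboxes every session except main
  allowlist: string[];
  denylist: string[];
}

type SessionKind = "main" | "group" | "channel";

function isToolAllowed(policy: SandboxPolicy, session: SessionKind, tool: string): boolean {
  const sandboxed = policy.mode === "non-main" && session !== "main";
  if (!sandboxed) return true;                      // not sandboxed: no filtering in this sketch
  if (policy.denylist.includes(tool)) return false; // deny wins over allow
  return policy.allowlist.includes(tool);           // default-deny for unlisted tools
}
```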

Node Architecture: Distributed Device Capabilities

Nodes (macOS/iOS/Android) connect to the Gateway as capability providers:

interface NodeConnection {
  role: 'node';
  caps: {
    canvas: boolean;
    camera: boolean;
    screenRecording: boolean;
    notifications: boolean;
    location: boolean;
  };
  commands: string[];
  permissions: PermissionMap;
}

Node Command Routing

Agent requests camera.snap
           │
           ▼
    Gateway checks node registry
           │
           ▼
    node.invoke { command: "camera.snap", params: {...} }
           │
           ▼
    iOS/Android Node executes, returns image
           │
           ▼
    Agent receives base64 image data

Available node commands:

  • canvas.* – Visual workspace operations
  • camera.snap / camera.clip – Photo/video capture
  • screen.record – Screen recording
  • location.get – GPS coordinates
  • system.run (macOS only) – Local command execution
  • system.notify – Native notifications
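
The "Gateway checks node registry" step amounts to finding a connected node that advertises the requested command. A minimal sketch follows, assuming nodes advertise either exact command names or a `prefix.*` wildcard; `RegisteredNode` and `pickNode` are hypothetical names, not Clawdbot identifiers.

```typescript
interface RegisteredNode {
  id: string;
  commands: string[]; // e.g. "camera.snap" or "canvas.*"
}

// True when an advertised pattern covers the requested command.
function commandMatches(pattern: string, command: string): boolean {
  if (pattern.endsWith(".*")) {
    return command.startsWith(pattern.slice(0, -1)); // "canvas.*" -> prefix "canvas."
  }
  return pattern === command;
}

// Returns the first registered node able to run the command, if any.
function pickNode(registry: RegisteredNode[], command: string): RegisteredNode | undefined {
  return registry.find((n) => n.commands.some((p) => commandMatches(p, command)));
}
```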

Agent Runtime: The Pi Agent Loop

The core agent runtime follows a reactive loop pattern:

Input Message
      │
      ▼
┌─────────────────┐
│  Parse Intent   │
│  & Load Context │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Tool Selection │◀──────────────┐
│  (if needed)    │               │
└────────┬────────┘               │
         │                        │
         ▼                        │
┌─────────────────┐               │
│  Execute Tool   │               │
│  (sandboxed)    │               │
└────────┬────────┘               │
         │                        │
         ▼                        │
┌─────────────────┐               │
│  LLM Processing │───(needs more │
│  with Results   │    tools?)────┘
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Stream Response│
│  to Channel     │
└─────────────────┘
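
The loop in the diagram can be expressed in a few lines. This is a simplified, synchronous sketch under stated assumptions: the model is a caller-supplied function that returns either a tool call or a final answer, tools are plain functions, and a step cap guards against endless tool use. A real runtime would be asynchronous and stream its output; `runAgentLoop` and `ModelStep` are hypothetical names.

```typescript
type ModelStep =
  | { kind: "tool"; tool: string; input: string }
  | { kind: "final"; text: string };

function runAgentLoop(
  input: string,
  model: (transcript: string[]) => ModelStep,
  tools: Record<string, (input: string) => string>,
  maxSteps = 8,
): string {
  const transcript = [`user: ${input}`];
  for (let step = 0; step < maxSteps; step++) {
    const next = model(transcript);
    if (next.kind === "final") return next.text; // no more tools needed

    // Execute the requested tool and feed the result back to the model.
    const tool = tools[next.tool];
    const result = tool ? tool(next.input) : `error: unknown tool ${next.tool}`;
    transcript.push(`tool ${next.tool}: ${result}`);
  }
  throw new Error("agent loop exceeded maxSteps");
}
```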

Thinking Levels

Clawdbot supports configurable reasoning depth (for compatible models):

Level   | Description           | Token Overhead
--------|-----------------------|---------------
off     | No explicit reasoning | Minimal
minimal | Brief consideration   | ~100 tokens
low     | Quick analysis        | ~300 tokens
medium  | Thorough reasoning    | ~800 tokens
high    | Deep analysis         | ~2000 tokens
xhigh   | Exhaustive reasoning  | ~5000+ tokens

# Set thinking level via slash command
/think high

Browser Control: CDP Integration

Clawdbot includes a dedicated browser control system using Chrome DevTools Protocol:

{
  "browser": {
    "enabled": true,
    "controlUrl": "http://127.0.0.1:18791",
    "color": "#FF4500",
    "profile": "clawd-browser"
  }
}

Capabilities:

  • Full page snapshots (screenshot + DOM)
  • Click, type, scroll actions
  • Form filling
  • File uploads
  • Multi-tab management
  • Profile persistence

Voice Integration: Wake Word and Talk Mode

Voice Wake Architecture

Audio Input Stream
        │
        ▼
┌───────────────────┐
│  VAD Detection    │
│  (Voice Activity) │
└────────┬──────────┘
         │
         ▼
┌───────────────────┐
│  Wake Word Model  │
│  (Local inference)│
└────────┬──────────┘
         │
         ▼
┌───────────────────┐
│  Speech-to-Text   │
│  (Whisper/Cloud)  │
└────────┬──────────┘
         │
         ▼
      Agent

Talk Mode: Continuous Conversation

Talk Mode enables always-on voice interaction with:

  • Push-to-talk or voice-activated modes
  • ElevenLabs TTS integration for responses
  • Interrupt handling for natural conversation
  • Silence detection for turn-taking

Inter-Agent Communication: Sessions Tools

Clawdbot supports multi-agent coordination through session tools:

// List active sessions
sessions_list() → Session[]

// Fetch another session's history
sessions_history({ sessionId: string }) → Message[]

// Send to another session
sessions_send({
  sessionId: string,
  message: string,
  replyBack?: boolean,    // Enable ping-pong
  announceStep?: boolean  // Notify source session
})

Use cases:

  • Task delegation across specialized agents
  • Cross-channel message forwarding
  • Supervisor/worker patterns

Deployment Topologies

Local Single-User (Recommended)

┌─────────────────────────────────────┐
│           Local Machine             │
│                                     │
│  ┌─────────┐     ┌─────────────┐    │
│  │ Gateway │◀───▶│ macOS App   │    │
│  └────┬────┘     └─────────────┘    │
│       │                             │
│       ├───▶ WhatsApp (Baileys)      │
│       ├───▶ Telegram                │
│       └───▶ Other Channels          │
└─────────────────────────────────────┘

Remote Gateway (Linux Server)

┌─────────────────┐               ┌──────────────────────┐
│   Client Mac    │               │     Linux Server     │
│                 │     SSH /     │                      │
│  ┌───────────┐  │   Tailscale   │  ┌────────────────┐  │
│  │ macOS App │◀─┼───────────────┼─▶│    Gateway     │  │
│  └───────────┘  │               │  └───────┬────────┘  │
│                 │               │          │           │
│  ┌───────────┐  │               │          ▼           │
│  │ iOS Node  │◀─┼───────────────┼─▶ Channel Adapters   │
│  └───────────┘  │               │                      │
└─────────────────┘               └──────────────────────┘

Tailscale Integration

Native Tailscale Serve/Funnel support:

{
  "gateway": {
    "tailscale": {
      "mode": "serve",        // "off" | "serve" | "funnel"
      "resetOnExit": true
    },
    "auth": {
      "mode": "password",     // Required for funnel
      "allowTailscale": true  // Trust Tailscale headers
    }
  }
}

Installation and Quick Start

Prerequisites

  • Node.js ≥22
  • pnpm (recommended) or npm

Installation

# Install globally
npm install -g clawdbot@latest

# Run onboarding wizard
clawdbot onboard --install-daemon

# Start gateway
clawdbot gateway --port 18789 --verbose

Basic Agent Interaction

# Send a message
clawdbot message send --to +1234567890 --message "Hello from Clawdbot"

# Talk to the agent
clawdbot agent --message "Ship checklist" --thinking high

Health Check

clawdbot doctor

Model Configuration

Clawdbot supports multiple LLM providers with failover:

{
  "agent": {
    "model": "anthropic/claude-opus-4-5",
    "fallback": [
      "anthropic/claude-sonnet-4-5",
      "openai/gpt-5"
    ]
  }
}
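
Failover over a list like this usually means trying each model in order and moving on when a call fails. The sketch below shows that pattern in isolation; it is not Clawdbot's implementation. `callWithFallback` is a hypothetical helper, and the `call` parameter stands in for whatever provider client performs the request.

```typescript
// Tries each model in order and returns the first successful result,
// along with the name of the model that produced it.
async function callWithFallback<T>(
  models: string[],
  call: (model: string) => Promise<T>,
): Promise<{ model: string; result: T }> {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      return { model, result: await call(model) };
    } catch (err) {
      lastError = err; // remember the failure and try the next model
    }
  }
  throw lastError; // every model failed: surface the last error
}
```

The model list passed in would be the primary model followed by the `fallback` entries, in the order they appear in the configuration.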

OAuth authentication is supported for:

  • Anthropic (Claude Pro/Max subscriptions)
  • OpenAI (ChatGPT/Codex)

Conclusion

Clawdbot represents a significant engineering achievement in the personal AI assistant space. Its Gateway-centric architecture provides a solid foundation for multi-channel communication while maintaining local control. The typed WebSocket protocol ensures reliable client-server communication, and the sandboxing system enables safe execution of AI-driven tools.

Key technical strengths:

  • True local-first design with optional remote access
  • 12+ channel integrations through well-abstracted adapters
  • Production-grade security with DM pairing and Docker sandboxing
  • Cross-platform node support for distributed device capabilities
  • Extensible skills system for custom functionality

For developers building personal AI assistants or exploring multi-channel AI integration, Clawdbot provides an excellent reference architecture and a genuinely useful day-to-day tool.



Written for the Collabnix community – bridging the gap between AI innovation and practical implementation.

