Getting Started with Cagent

cagent is Docker's command-line tool for defining AI agents in YAML files and running them.

Installation

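The commands below install the v1.0.3 release binary on an Apple Silicon Mac (darwin-arm64). On another platform, download the release asset that matches your operating system and architecture instead.

# Download the binary (pick the asset that matches your architecture)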
curl -L -o cagent https://github.com/docker/cagent/releases/download/v1.0.3/cagent-darwin-arm64

# Remove the macOS quarantine attribute so Gatekeeper does not block the binary
/usr/bin/xattr -rd com.apple.quarantine cagent

# Make executable
chmod +x cagent

# Move to PATH
sudo mv cagent /usr/local/bin/

# Verify the installation
cagent --help

If the help text prints, cagent is installed and on your PATH.
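
The download command above hard-codes the darwin-arm64 asset. As a rough sketch, the helper below derives an asset name from `uname` output. It assumes that release assets follow the cagent-<os>-<arch> naming seen in that URL, which is worth confirming on the releases page before relying on it. The function name cagent_asset is hypothetical and not part of cagent.

```shell
# Build a release asset name from an OS name and a machine architecture.
# Assumption: assets are named cagent-<os>-<arch>, like cagent-darwin-arm64.
cagent_asset() {
  local os arch
  # uname -s prints "Darwin" or "Linux"; the asset name uses lowercase.
  os=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$2" in
    x86_64|amd64)  arch=amd64 ;;
    arm64|aarch64) arch=arm64 ;;
    *) echo "unsupported architecture: $2" >&2; return 1 ;;
  esac
  printf 'cagent-%s-%s\n' "$os" "$arch"
}
```

Calling cagent_asset "$(uname -s)" "$(uname -m)" prints the name for the current machine, which can then replace the last path segment of the download URL.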

Set an API key for each model provider you plan to use:

# For OpenAI models
export OPENAI_API_KEY=your_api_key_here
# For Anthropic models
export ANTHROPIC_API_KEY=your_api_key_here
# For Gemini models
export GOOGLE_API_KEY=your_api_key_here
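
A missing key usually shows up only when an agent first calls its model, so it can help to check the environment beforehand. The sketch below lists which of the three variables above are unset or empty. The function name missing_api_keys is hypothetical, and you only need the keys for the providers your agents actually use.

```shell
# Print the name of each provider API key that is unset or empty.
missing_api_keys() {
  local name value
  for name in OPENAI_API_KEY ANTHROPIC_API_KEY GOOGLE_API_KEY; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}
```

Running missing_api_keys with no arguments prints one variable name per line and prints nothing when all three are set.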

Creating Your First Agent

Save the following minimal configuration as basic_agent.yaml. It defines a single agent named “root” that uses OpenAI’s GPT-4o mini model, a smaller, faster, and cheaper variant of GPT-4o. The description field summarizes what the agent does, and the instruction field holds the system prompt that sets the agent’s behavior; here it asks for a helpful, accurate, and concise assistant.

agents:
  root:
    model: openai/gpt-4o-mini
    description: A helpful AI assistant
    instruction: |
      You are a knowledgeable assistant that helps users with various tasks.
      Be helpful, accurate, and concise in your responses.

This is a plain agent with no tools. Because the configuration does not declare any, the agent cannot search the web, read files, or call external services; it answers only from its training data and the conversation so far. That is enough for general Q&A, explanations, writing help, and simple problem-solving that needs no real-time information.

Run it with:

cagent run basic_agent.yaml

That’s it! You now have a functioning AI agent.

Multi-Agent Coordination System

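The next configuration defines two agents: a coordinator named “root” and a “helper” that it delegates to. The sub_agents list on root names the agents it may hand tasks to, and each name must match an agent defined in the same file: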

agents:
  root:
    model: anthropic/claude-sonnet-4-20250514  # Claude Sonnet 4
    description: "Main coordinator agent that delegates tasks and manages workflow"
    instruction: |
      You are the root coordinator agent. Your job is to:
      1. Understand user requests and break them down into manageable tasks
      2. Delegate appropriate tasks to your helper agent
      3. Coordinate responses and ensure tasks are completed properly
      4. Provide final responses to the user
    sub_agents: ["helper"]
    
  helper:
    model: anthropic/claude-opus-4-1-20250805  # Claude Opus 4.1
    description: "Assistant agent that helps with various tasks as directed by the root agent"
    instruction: |
      You are a helpful assistant agent. Your role is to:
      1. Complete specific tasks assigned by the root agent
      2. Provide detailed and accurate responses
      3. Ask for clarification if tasks are unclear
      4. Report back to the root agent with your results
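
A typo in sub_agents leaves the coordinator pointing at an agent that does not exist. The sketch below is a quick text-based check for that, not a YAML parser: it assumes the layout used above, with agent names indented by two spaces and sub_agents written as an inline list. The function name undefined_sub_agents is hypothetical, and cagent may well report the same problem itself when it loads the file.

```shell
# Print each sub-agent name that has no matching agent entry in the file.
# Assumes two-space-indented agent names and an inline list such as
# sub_agents: ["helper"].
undefined_sub_agents() {
  local file=$1 name
  grep -E '^ +sub_agents:' "$file" |
    sed -E 's/.*\[(.*)\].*/\1/' |   # keep only the text inside [ ... ]
    tr -d "\"' " |                  # drop quotes and spaces
    tr ',' '\n' |                   # one name per line
    while read -r name; do
      [ -n "$name" ] || continue
      grep -Eq "^  ${name}:" "$file" || echo "$name"
    done
}
```

For the configuration above, undefined_sub_agents prints nothing, because helper is defined as an agent.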

Research Agent with Web Search Capabilities

This configuration defines a research agent named “root” that uses OpenAI’s GPT-4o model. The description summarizes the agent, and the instruction field holds the system prompt: act as an expert research analyst that searches for current information, verifies facts, and returns structured summaries with source attribution. The | after instruction: starts a YAML literal block, which keeps the multi-line text and its line breaks as written.

agents:
  root:
    model: gpt4o  # named model defined in the models section below
    description: Advanced research agent with multiple tools
    instruction: |
      You are an expert research analyst with access to web search tools.
      Your capabilities include:
      - Real-time web searching and information gathering
      - Fact verification across multiple sources
      - Trend analysis and competitive intelligence
      - Academic and scientific research
      - Market research and business intelligence

      Always:
      1. Search for the most current information available
      2. Cross-reference multiple sources for accuracy
      3. Provide clear source attribution
      4. Distinguish between verified facts and speculation
      5. Offer analysis and insights based on findings
      6. Structure your responses clearly with key takeaways
    toolsets:
      - type: mcp
        command: docker
        args: ["mcp", "gateway", "run", "--servers=duckduckgo"]
      # Add more tools as needed
      # - type: mcp
      #   command: docker
      #   args: ["mcp", "gateway", "run", "--servers=brave"]

models:
  gpt4o:
    provider: openai
    model: gpt-4o
    max_tokens: 4000
    temperature: 0.1  # Lower temperature for more factual responses

The tool configuration is what lets the agent search the web: it launches Docker’s MCP Gateway with the DuckDuckGo server, so the agent can run live searches instead of relying only on its training data. The models section at the bottom defines a named model configuration, gpt4o, with a 4000-token cap on responses and a temperature of 0.1 for more factual, less creative output. These settings apply only to an agent whose model field refers to gpt4o by name; an inline reference such as openai/gpt-4o does not pick them up.

Development Assistant with File Operations

This configuration defines a coding assistant that runs on Anthropic’s Claude Sonnet 4 and pairs the model with two development tools.

The agent can read and write files in the current directory through the rust-mcp-filesystem MCP server, which must be installed and on your PATH, and it can search the web for documentation and solutions through DuckDuckGo via Docker’s MCP Gateway.

Together, these let the agent read your existing code, research approaches online, and write changes directly to your files.

agents:
  root:
    model: anthropic/claude-sonnet-4-20250514  # Claude Sonnet 4
    description: A development assistant with file system access and web search
    instruction: |
      You are an expert coding assistant with access to file operations and web search.
      Your capabilities include:
      - Reading and writing files in the current directory
      - Searching the web for documentation, solutions, and current information
      - Code review, refactoring, and development assistance
      - Debugging and troubleshooting
      
      Always:
      1. Read existing files to understand the codebase structure
      2. Search for best practices and current solutions when needed
      3. Write clean, well-documented code
      4. Explain your changes and reasoning
      5. Follow the existing code style and conventions
    toolsets:
      - type: mcp
        command: docker
        args: ["mcp", "gateway", "run", "--servers=duckduckgo"]
      - type: mcp
        command: rust-mcp-filesystem
        args: ["--allow-write", "."]
        tools: ["read_file", "write_file"]

The instruction field sets out a methodical workflow: the agent first reads existing code to understand the project structure, researches current practices when needed, and then writes clean, documented code while explaining its reasoning.

The tool configuration is what separates this agent from the basic one. The MCP (Model Context Protocol) tools let it act rather than only generate text: it can read your code files, search for current documentation, and write changes back to your filesystem. The --allow-write argument grants write access to the current directory, and the tools list limits the filesystem server to read_file and write_file.
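
Both tools are launched as external commands (docker and rust-mcp-filesystem), so the agent cannot use them if either is missing. As a minimal sketch, the helper below checks that a list of commands is on your PATH before you start the agent. The function name check_commands is hypothetical and not part of cagent.

```shell
# Report every listed command that is not on PATH.
# Returns non-zero if at least one command is missing.
check_commands() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_commands docker rust-mcp-filesystem && cagent run dev_assistant.yaml starts the agent only when both commands are found; dev_assistant.yaml is a placeholder for whatever filename you saved the configuration under.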
