For the past two years, the most common question I’ve received at every Collabnix Meetup — and we run a lot of them, with three to four hundred DevOps engineers and developers showing up each time — has been the same:
“Ajeet, where do I start with Docker and AI?”
Until today, my honest answer was a bit lame: “Follow my blogs.” I was writing one almost every day, because the AI landscape was shifting under our feet that fast.
That changed in December 2024. MCP (Model Context Protocol) was introduced, and within a few weeks the ecosystem had over 3,000 MCP servers. That’s when it became clear that scattered blog posts weren’t going to cut it anymore. Developers needed a single, structured, end-to-end resource for running AI workloads operationally — not as research experiments, but as production systems.
So Harsh Manvar and I wrote one.
Operational AI with Docker is officially live today, published by Packt. This post explains what’s in it, who it’s for, and the four problems it solves that no other Docker book on the market addresses head-on.
What is “Operational AI” — and Why Does It Need Its Own Book?
The phrase Operational AI is deliberate. Most AI content today falls into two camps:
- Research and capability content — what models can do, benchmarks, prompt engineering.
- Application content — how to build a chatbot, a RAG pipeline, an agent demo on a laptop.
Neither tells you how to run the thing. How do you containerize an LLM workload? How do you give an AI agent tools without handing it your AWS credentials? How do you isolate code that an agent generates and decides to execute? How do you observe a system whose control flow lives inside a non-deterministic model?
These are operational concerns — and they’re exactly where Docker has spent the last two years building. Operational AI with Docker brings that work together in one place.
The thesis is simple: the same “build, ship, run” philosophy Docker brought to software in 2013 now applies to AI workloads in 2026.
- Build AI agents using Docker MCP Toolkit, the MCP Catalog, and Docker Model Runner.
- Ship them with Agentic Compose and Docker Agents.
- Run them inside Docker Sandboxes.
Docker isn’t trying to be an AI framework. It’s the runtime and packaging layer underneath so the AI parts can actually be portable, reproducible, and shippable. That’s the entire premise.
The Four Problems This Book Solves
When teams come to me asking about Docker and AI, they think their problem is technical. Most of the time, it isn’t. Here are the four real challenges I see over and over — and the book tackles each one.
1. How Do I Choose the Right AI Model?
Hugging Face today has more than two million AI models. So the first wall a team hits isn’t a Docker problem — it’s a decision problem.
Chapter 6 breaks this down with a clear framework, splitting models into three buckets:
- Small Language Models (SLMs) — 0 to 7 billion parameters
- Medium Language Models (MLMs) — the middle range
- Large Language Models (LLMs) — 70 billion+ parameters
Most teams don’t need an LLM. They need an SLM that runs cheaply and close to the user. The book walks you through how to make that call before you write a single line of code.
2. Why Most Developers Wrongly Assume “AI = Cloud-Only”
When developers hear “AI model,” they immediately think OpenAI, Gemini, Claude — something behind an API key, costing money per token. That assumption shapes the entire architecture before anyone has thought it through.
The book breaks that assumption open. With Docker Model Runner (DMR), you can pull a model the same way you pull a container image, run it natively on your hardware, and expose it through an OpenAI- and Anthropic-compatible API endpoint. Local-first, GPU-aware, production-ready.
3. The “GPU Fear” Problem
The moment you say “run a model locally,” developers panic about hardware: Do I need a beefy GPU? Will my MacBook handle this?
There are great open-source tools — llmfit literally scans your hardware and tells you which models will run well on your machine, with Docker Model Runner support built in. But most developers don’t know these tools exist, so they default to the cloud out of fear, not necessity. The book walks through hardware decisions with data, not anxiety.
4. “Should I Even Run My Model Inside a Container?”
This is the question I get more than any other — and the answer is more nuanced than most people realize.
With Docker Model Runner, you don’t actually have to wrap the model in a container at all. DMR runs the model natively on the host, uses the GPU directly, and exposes it through an OpenAI-compatible endpoint — but it gives you Docker’s packaging, versioning, and pull experience on top. You get the best of both worlds. That distinction is missed by a lot of teams, and the book unpacks it early.
What You’ll Learn
The book is structured around the real lifecycle of a production AI workload:
- ✅ Run LLMs locally with Docker Model Runner — Pull models like container images. OpenAI- and Anthropic-compatible APIs. Hardware-aware backends.
- ✅ Build and secure AI agents with Docker MCP Gateway — Dynamic tool discovery, policy enforcement, secrets isolation, audit logs. The most underrated piece of the current AI ecosystem.
- ✅ Orchestrate multi-agent workflows declaratively — Define agents, sub-agents, and tools in YAML using Docker Agent. Versioned, reproducible, reviewable.
- ✅ Isolate agent execution with Docker Sandboxes — Run untrusted, agent-generated code inside microVMs so a hallucinated
rm -rfnever reaches your host. - ✅ Build, run, and share multi-agent systems — Orchestrator-worker patterns, agent-to-agent communication via MCP, shared state, and when not to reach for multi-agent.
- ✅ Deploy and scale GenAI services on Kubernetes — The observability, cost-routing, and supply-chain patterns production demands.
Built for developers. Grounded in real tools. No fluff.
Who Should Read This Book?
Back in 2023, Harsh and I co-wrote a Docker blog called “LLM Everywhere: Docker for Local and Hugging Face Hosting.” It went deep into model quantization formats — GPTQ, GGML — and how to run these models locally in containers. From an SEO standpoint, it became one of Docker’s biggest hits in the AI space.
But what really shaped the audience for this book was the flood of follow-up questions. People wanted to understand training, quantization, format trade-offs. And one question kept coming up: “Should I even run a model inside a container?”
That tells you exactly who the readers are. They’re not AI researchers. They’re not pure ML engineers. They’re developers, DevOps folks, and platform teams who suddenly need to make AI work in their existing world.
So we wrote for:
- Developers who built an agent demo and are now being asked to “productionize it.”
- Platform engineers whose teams are shipping LLM-powered services and need a runtime story.
- Architects mapping out an agentic AI strategy who need a concrete reference for the operational layer.
We didn’t assume a deep ML background. If you know containers and you’re curious about AI, you’re in the right place. If you know AI but containers feel like a black box, the early chapters meet you there.
Frequently Asked Questions
What is Operational AI with Docker about?
It’s a practical, end-to-end guide to running production AI workloads using Docker — covering local model serving with Docker Model Runner, secure AI agents with MCP Gateway, isolated execution with Docker Sandboxes, multi-agent orchestration with Docker Agents, and deployment at scale on Kubernetes.
Who are the authors?
Ajeet Singh Raina is a Developer Advocate at Docker, founder of Collabnix (17,000+ members), and a former Docker Captain. Harsh Manvar is a Senior Software Engineer, Docker Captain, Google Developer Expert, and CNCF Ambassador.
How is this different from other Docker books?
Most Docker books cover AI as a chapter or appendix. Operational AI with Docker is the first book dedicated entirely to the operational AI stack — Model Runner, MCP Gateway, Sandboxes, Docker Agents, and the Kubernetes patterns to run them all in production.
Do I need a GPU to follow along?
No. The book covers hardware-aware decisions and includes guidance on running smaller models on CPUs and consumer-grade machines. Tools like llmfit are introduced to help you match models to your available hardware.
Where can I buy it?
📘 Packt: Operational AI with Docker
📦 Amazon: Available now (paperback + Kindle)
🔖 ISBN: 9781807301095
Final Word
Operational AI is not a finished problem. This book is a snapshot of where the practice stands in 2026 — and a foundation you can build on as the stack continues to evolve.
If you’ve ever stood up an agent demo on your laptop and wondered how to get it past the demo phase, this book is for you.
If you pick it up, I’d genuinely love to hear what you think. Tag me on LinkedIn or X (@ajeetsraina), or drop into the Collabnix Slack. The companion code is open source and we’ll be maintaining it as the Docker AI ecosystem evolves.
Build, ship, run — extended to AI workloads. Let’s get to work.