🚀 100K+ Downloads | Latest Google AI Model | Docker Hub: ai/gemma3
What is Gemma3?

Gemma3 is Google DeepMind’s latest generation of versatile AI models designed for tasks like question answering, summarization, reasoning, and multimodal understanding. Released with open weights under terms that permit responsible commercial use, Gemma3 accepts both text and image inputs, offers context windows of up to 131K tokens, understands over 140 languages, and includes built-in tool calling capabilities.
Key Technical Specifications
Architecture Overview
- Total Models: 7 variants (270M → 4B parameters)
- Languages: 140+ languages
- Context Window: 33K-131K tokens (model dependent)
- License: Gemma Terms (open weights)
- Provider: Google DeepMind
- Architecture: Gemma3 (advanced transformer)
- Input Modalities: Text, Image
- Output Modalities: Text, Code
- Tool Calling: ✅ Supported
What Makes Gemma3 Different
# Traditional AI Model
model_size = "4B parameters"
context_window = "8K tokens"
multimodal = False
tool_calling = False
languages = "Limited"
# Gemma3 Model
model_size = "4B parameters"
context_window = "131K tokens" # 16x larger context
multimodal = True # Text + Image input
tool_calling = True # Function calling support
languages = "140 languages" # Massive multilingual support
performance = "Superior reasoning & code generation"
quantization = "Efficient Q4_K_M" # Optimized memory usage
What is Docker Model Runner?
Docker Model Runner (DMR) is a tool that makes it easy to manage, run, and deploy AI models using Docker. It lets developers pull and serve large language models (LLMs) and other AI models directly from Docker Hub or any OCI-compliant registry. DMR integrates with Docker Desktop and Docker Engine, exposing models through OpenAI-compatible APIs, a command-line interface, and a graphical management UI. Models are cached locally after the first pull and loaded into memory only at runtime to conserve resources. This combination of CLI and API access makes DMR suitable for building generative AI applications, experimenting with ML workflows, or integrating AI into software development pipelines.
Quick Start
# Pull latest model (4B-Q4_K_M)
docker model pull ai/gemma3
All Available Models
# Ultra-lightweight models (< 1GB)
docker model pull ai/gemma3:270M-F16 # 511MB
docker model pull ai/gemma3:270M-UD-IQ2_XXS # Ultra-compressed
docker model pull ai/gemma3:270M-UD-Q4_K_XL # Balanced compression
# Compact models (< 2GB)
docker model pull ai/gemma3:1B-Q4_K_M # 763MB
docker model pull ai/gemma3:1B-F16 # 1.86GB
# Standard models (recommended)
docker model pull ai/gemma3:4B-Q4_0 # 2.19GB
docker model pull ai/gemma3:4B-Q4_K_M # 2.31GB (recommended)
docker model pull ai/gemma3:4B-F16 # 7.23GB (highest quality)
Basic Usage Examples
1. Simple Chat Interface
docker model run ai/gemma3:4B-Q4_K_M
> Hello, can you help me analyze this image and write Python code?
Assistant: Absolutely! I can help you with both image analysis and Python code generation. Please share the image you'd like me to analyze, and let me know what specific Python functionality you need.
> Create a function to process CSV data and generate visualizations
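A function of the kind requested above might look like the following. This is an illustrative sketch (function and column names are hypothetical), not output captured from the model, and it sticks to the standard library rather than a plotting package:

```python
import csv
import io
import statistics

def summarize_csv(csv_text: str) -> dict:
    """Return {column: {"mean", "min", "max"}} for every fully numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for name in (rows[0].keys() if rows else []):
        try:
            values = [float(row[name]) for row in rows]
        except ValueError:
            continue  # skip non-numeric columns such as labels
        summary[name] = {
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        }
    return summary

# Quarterly revenue figures (in $M), echoing the chart example below
sample = "quarter,revenue\nQ1,1.2\nQ2,1.8\nQ3,2.3\nQ4,2.0\n"
print(summarize_csv(sample))
```

The summary dict can then be handed to any charting library for visualization.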
2. Multimodal Image Analysis
docker model run ai/gemma3:4B-Q4_K_M
> [Upload an image] What do you see in this chart?
Assistant: I can see a bar chart showing quarterly sales data. The chart displays revenue figures across Q1-Q4, with Q3 showing the highest performance at approximately $2.3M. Would you like me to help you create similar visualizations or analyze this data further?
Docker Model Runner REST API
Once Docker Model Runner is enabled, you can interact with your models programmatically using OpenAI-compatible REST API endpoints:
curl http://localhost:12434/engines/llama.cpp/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "ai/gemma3",
"messages": [
{
"role": "system",
"content": "You are a helpful coding assistant with multimodal capabilities."
},
{
"role": "user",
"content": "Generate a Python script to analyze customer sentiment from reviews."
}
]
}'
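The same request can be issued from Python. A minimal sketch using only the standard library (the endpoint and model name are those shown in the curl example above; running `chat()` assumes Docker Model Runner is active locally):

```python
import json
import urllib.request

DMR_URL = "http://localhost:12434/engines/llama.cpp/v1/chat/completions"

def build_chat_request(model: str, system: str, user: str) -> dict:
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }

def chat(payload: dict) -> str:
    """POST the payload to the local DMR endpoint and return the reply text."""
    req = urllib.request.Request(
        DMR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (with Docker Model Runner running):
#   payload = build_chat_request(
#       "ai/gemma3",
#       "You are a helpful coding assistant with multimodal capabilities.",
#       "Generate a Python script to analyze customer sentiment from reviews.")
#   print(chat(payload))
```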
Thinking Mode Examples
Enable Thinking Mode
docker model run ai/gemma3:4B-Q4_K_M
> /think Solve this step by step: What is the best approach to implement a machine learning pipeline?
<thinking>
To design an effective ML pipeline, I need to consider the key stages:
1. Data collection and validation
2. Data preprocessing and feature engineering
3. Model training and validation
4. Model evaluation and testing
5. Model deployment and monitoring
Let me break this down systematically...
</thinking>
For implementing a robust machine learning pipeline, I recommend following these systematic steps:
1. **Data Collection & Validation**:
- Establish data sources and collection mechanisms
- Implement data quality checks and validation rules
- Set up automated data ingestion pipelines
2. **Data Preprocessing**:
- Handle missing values and outliers
- Perform feature scaling and normalization
- Engineer relevant features based on domain knowledge
3. **Model Development**:
- Split data into train/validation/test sets
- Implement cross-validation strategies
- Compare multiple algorithms and hyperparameters
4. **Deployment & Monitoring**:
- Set up automated model deployment
- Implement performance monitoring
- Establish retraining triggers and schedules
This approach ensures reliability, scalability, and maintainability of your ML system.
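The four stages above can be sketched as a minimal, dependency-free pipeline. This is a hypothetical illustration of the structure only (the stage functions are toy stand-ins, not a production implementation):

```python
from typing import Callable

class Pipeline:
    """Chain named stages, each a function from data to data."""
    def __init__(self) -> None:
        self.stages: list[tuple[str, Callable]] = []

    def add(self, name: str, fn: Callable) -> "Pipeline":
        self.stages.append((name, fn))
        return self

    def run(self, data):
        for name, fn in self.stages:
            data = fn(data)  # each stage transforms the data it receives
        return data

# Toy stages mirroring the steps above
def validate(rows):
    return [r for r in rows if r is not None]    # drop missing values

def preprocess(rows):
    lo, hi = min(rows), max(rows)
    return [(r - lo) / (hi - lo) for r in rows]  # min-max scaling

def train(rows):
    return sum(rows) / len(rows)                 # stand-in "model": the mean

pipeline = Pipeline().add("validate", validate).add("preprocess", preprocess).add("train", train)
print(pipeline.run([3, None, 1, 2]))  # → 0.5
```

Deployment and monitoring would wrap `pipeline.run` in serving and logging infrastructure rather than living inside the pipeline itself.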
> /no_think What's 15 + 27?
42
Tool Calling Examples
docker model run ai/gemma3:4B-Q4_K_M
> I need to calculate the area of a circle with radius 5 and then search for information about geometry formulas
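A tool-calling round trip for the request above could be sketched as follows. The tool schema follows the OpenAI function-calling format that DMR's API accepts, while the tool name, arguments, and dispatch logic are hypothetical:

```python
import json
import math

# Hypothetical tool definition in OpenAI function-calling format
CIRCLE_AREA_TOOL = {
    "type": "function",
    "function": {
        "name": "circle_area",
        "description": "Compute the area of a circle from its radius.",
        "parameters": {
            "type": "object",
            "properties": {"radius": {"type": "number"}},
            "required": ["radius"],
        },
    },
}

def circle_area(radius: float) -> float:
    return math.pi * radius ** 2

def dispatch_tool_call(call: dict) -> str:
    """Execute a tool call from the model's response and return a JSON result."""
    args = json.loads(call["function"]["arguments"])
    if call["function"]["name"] == "circle_area":
        return json.dumps({"area": circle_area(args["radius"])})
    raise ValueError(f"unknown tool: {call['function']['name']}")

# Simulated tool call, shaped like what the model would emit for the prompt above
result = dispatch_tool_call({
    "function": {"name": "circle_area", "arguments": '{"radius": 5}'}
})
print(result)  # area ≈ 78.54
```

In a real exchange, `CIRCLE_AREA_TOOL` would be passed in the request's `tools` field, and the JSON result would be sent back to the model as a tool message so it can compose its final answer.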