Ajeet Raina

Ajeet Singh Raina is a former Docker Captain, Community Leader, and Distinguished Arm Ambassador. He is the founder of the Collabnix blogging site and has authored more than 700 blog posts on Docker, Kubernetes, and cloud-native technology. He runs a community Slack of 9,800+ members and a Discord server of close to 2,600 members. You can follow him on Twitter (@ajeetsraina).

Ollama Cheatsheet 2025


Ollama is an open-source framework that lets you run large language models (LLMs) locally on your own computer instead of using cloud-based AI services. It’s designed to make running these powerful AI models simple and accessible to individual users and developers.

Key features of Ollama include:

  • Local execution – All processing happens on your own hardware, providing privacy and eliminating the need for internet connectivity after model download
  • Model library – Supports various open-source LLMs like Llama, Mistral, Vicuna, and many others
  • Simple interface – Provides an easy command-line interface and API for interacting with models
  • Resource optimization – Includes tools to manage memory usage and optimize performance based on your hardware capabilities
  • Customization – Allows you to create and modify models with custom system prompts using Modelfiles
  • Cross-platform – Works on macOS, Linux, and Windows

Ollama Cheatsheet

Ollama is a lightweight, open-source framework for running large language models (LLMs) locally on your machine. This cheatsheet provides a quick reference for common Ollama commands and configurations to help you get started and make the most of your local AI models.

Installation

| Platform | Command |
|----------|---------|
| macOS/Linux | `curl -fsSL https://ollama.com/install.sh \| sh` |
| Windows | Download from https://ollama.com/download/windows |

Basic Commands

| Action | Command |
|--------|---------|
| Run a model | `ollama run llama3` |
| List models | `ollama list` |
| Pull a model | `ollama pull mistral` |
| Remove a model | `ollama rm codellama` |
| Show model info | `ollama show llama3` |

Running Models

| Action | Command |
|--------|---------|
| Start chat session | `ollama run llama3` |
| Set parameters (inside a session) | `/set parameter temperature 0.7`, `/set parameter top_p 0.9` |
| One-shot generation | `echo "Write a poem about coding" \| ollama run llama3` |
| Save output to file | `echo "Explain Docker" \| ollama run llama3 > output.txt` |
| Multiline input | Use a heredoc (see below) |

```
ollama run llama3 << EOF
Please write a short story
about artificial intelligence.
EOF
```
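The one-shot pattern above also lends itself to scripting. Here is a minimal Python sketch that feeds a prompt to `ollama run` over stdin via `subprocess`; it assumes the `ollama` binary is on your PATH and the model has already been pulled, and `ollama_generate`/`build_run_command` are illustrative helper names, not part of any Ollama library.

```python
import subprocess


def build_run_command(model: str) -> list[str]:
    """Assemble the non-interactive `ollama run <model>` command."""
    return ["ollama", "run", model]


def ollama_generate(model: str, prompt: str) -> str:
    """Send a prompt on stdin and return the model's stdout reply."""
    result = subprocess.run(
        build_run_command(model),
        input=prompt,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Requires a running Ollama install with llama3 pulled.
    print(ollama_generate("llama3", "Explain Docker in one sentence."))
```

When given input on stdin like this, `ollama run` exits after producing its answer instead of opening an interactive session.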

API Usage

Generate a response:

```
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "What is Docker?"
}'
```

Chat with history:

```
curl -X POST http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "Hello, who are you?" },
    { "role": "assistant", "content": "I am an AI assistant." },
    { "role": "user", "content": "What can you do?" }
  ]
}'
```
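The same endpoints can be called from code with nothing but the standard library. The sketch below assumes Ollama is serving on its default port and relies on `/api/generate`'s default streaming behavior, where each line of the reply is a JSON object carrying a `response` fragment and a `done` flag; `collect_stream` and `generate` are illustrative names, not part of any client library.

```python
import json
import urllib.request


def collect_stream(ndjson_lines) -> str:
    """Join the `response` fragments from Ollama's newline-delimited JSON stream."""
    parts = []
    for line in ndjson_lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):  # final chunk signals completion
            break
    return "".join(parts)


def generate(model: str, prompt: str, host: str = "http://localhost:11434") -> str:
    """POST to /api/generate and reassemble the streamed answer."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return collect_stream(line.decode() for line in resp)


if __name__ == "__main__":
    # Requires `ollama serve` running locally with llama3 pulled.
    print(generate("llama3", "What is Docker?"))
```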

Advanced Usage

| Action | Command |
|--------|---------|
| Create custom model | `ollama create custom-llama -f Modelfile` |
| Limit GPU offload | add `PARAMETER num_gpu 35` (layers offloaded to GPU) to the Modelfile |
| Multi-modal with an image | `ollama run llava "Describe this image: ./image.jpg"` |

Example Modelfile:

```
FROM llama3
SYSTEM "You are a helpful programming assistant focused on Python."
```
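If you generate Modelfiles programmatically, for example to stamp out several variants with different system prompts, a small helper keeps the format straight. This is a sketch with a hypothetical `make_modelfile` function, covering only the `FROM` and `SYSTEM` directives shown above.

```python
def make_modelfile(base: str, system_prompt: str) -> str:
    """Render a minimal Modelfile with a FROM line and a SYSTEM prompt."""
    return f'FROM {base}\nSYSTEM "{system_prompt}"\n'


if __name__ == "__main__":
    text = make_modelfile(
        "llama3", "You are a helpful programming assistant focused on Python."
    )
    with open("Modelfile", "w") as f:
        f.write(text)
    # Then register it: ollama create custom-llama -f Modelfile
```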

Performance Tips

| Tip | Command |
|-----|---------|
| Free memory from idle models | `OLLAMA_KEEP_ALIVE=5m ollama serve` (unloads models after 5 minutes of inactivity) |
| Run quantized models | `ollama run llama3:8b-q4_0` |
| Disable memory mapping | pass `"use_mmap": false` in the API request's `options` field |

Environment Variables

| Variable | Description |
|----------|-------------|
| OLLAMA_HOST | Bind address and port. Default: `127.0.0.1:11434`; use `0.0.0.0` for network access |
| OLLAMA_MODELS | Custom path where models are stored |
| OLLAMA_KEEP_ALIVE | How long to keep models loaded in memory (e.g., `5m`, `1h`) |

Note that the listen port is set as part of `OLLAMA_HOST` (e.g., `OLLAMA_HOST=0.0.0.0:8080`), and GPU layer offload is controlled by the per-model `num_gpu` parameter rather than an environment variable.

Ollama essentially bridges the gap between powerful AI capabilities and local computing, making it possible to have conversations with AI, generate text, answer questions, and create content without sending your data to third-party services. It’s particularly useful for developers who want to integrate AI into their applications while maintaining data privacy or for users who want to experiment with AI without recurring subscription costs.

Have Queries? Join https://launchpass.com/collabnix
