Ollama is an open-source tool for running, creating, and customizing large language models (LLMs) locally. This guide walks through installation on each supported platform and covers configuration and best practices for reliable performance.
Table of Contents
- System Requirements
- Installation Methods
- Docker Installation
- Post-Installation Steps
- Environment Configuration
- Advanced Configuration
- Troubleshooting
- Best Practices
System Requirements
Minimum Hardware Requirements:
- CPU: 64-bit processor
- RAM: 8GB (16GB recommended)
- Storage: 10GB free space for the installation, plus several GB per downloaded model
- GPU: Optional, but recommended for better performance
Supported Platforms:
- macOS 12+ (Intel & Apple Silicon)
- Windows 10/11 with WSL2
- Linux (Ubuntu 20.04+, Debian 11+, Fedora 37+)
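Before installing, a quick sanity check against the minimums above can save time. The snippet below is a minimal sketch for a Linux shell with standard GNU utilities (on macOS, `sysctl hw.memsize` and `system_profiler` give the equivalent information):

```bash
# Check CPU architecture (should report x86_64 or arm64/aarch64)
uname -m

# Check total RAM in GB (Linux)
free -g | awk '/^Mem:/ {print $2 " GB RAM"}'

# Check free disk space on the root filesystem
df -h /

# Check for an NVIDIA GPU (only reports if drivers are installed)
nvidia-smi --query-gpu=name,memory.total --format=csv 2>/dev/null || echo "No NVIDIA GPU detected"
```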
Installation Methods
Method 1: Direct Installation (macOS)
# Install using Homebrew
brew install ollama
# Or download the macOS app from https://ollama.com/download
# (the install.sh script at https://ollama.com/install.sh targets Linux, not macOS)
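With the Homebrew formula, the server can also be run as a background service. This is a sketch assuming the formula's standard Homebrew services integration:

```bash
# Start the Ollama server as a background service (Homebrew installs)
brew services start ollama

# Or run the server in the foreground in a separate terminal
ollama serve

# Confirm the CLI is on your PATH
ollama --version
```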
Method 2: Linux Installation
# Ubuntu/Debian
curl -fsSL https://ollama.com/install.sh | sh
# Verify Installation
ollama --version
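On Linux, the install script typically registers Ollama as a systemd service. A quick check that the server is up, sketched under that assumption:

```bash
# Check whether the systemd service was created and is running
systemctl status ollama

# Start and enable it manually if needed
sudo systemctl enable --now ollama

# The server listens on 127.0.0.1:11434 by default; this should return "Ollama is running"
curl http://127.0.0.1:11434
```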
Method 3: Windows Installation (WSL2)
- Enable WSL2 (run from an administrator PowerShell prompt; this installs Ubuntu by default):
wsl --install
- Or install a specific distribution explicitly:
wsl --install -d Ubuntu
- Inside the WSL2 shell, run the Linux installer:
curl -fsSL https://ollama.com/install.sh | sh
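Older WSL2 setups do not run systemd by default, so the service created by the install script may not start automatically. A sketch of enabling systemd inside WSL2 (assumes a recent WSL release that supports the [boot] setting):

```bash
# Inside the WSL2 Ubuntu shell: enable systemd so the ollama service can run
sudo tee -a /etc/wsl.conf > /dev/null <<'EOF'
[boot]
systemd=true
EOF

# Then, from Windows PowerShell, restart WSL:
#   wsl --shutdown

# Back inside WSL2, verify the service (or fall back to running the server manually)
systemctl status ollama || ollama serve
```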
Docker Installation
Basic Docker Setup
# Pull the Ollama image
docker pull ollama/ollama
# Run Ollama container
docker run -d --gpus=all \
-v ollama:/root/.ollama \
-p 11434:11434 \
--name ollama \
ollama/ollama
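Once the container is running, the Ollama CLI inside it can be driven through docker exec. Note that the --gpus=all flag above requires the NVIDIA Container Toolkit; drop it for a CPU-only setup. For example:

```bash
# Run a model through the containerized server
docker exec -it ollama ollama run llama2

# CPU-only variant of the container (no NVIDIA Container Toolkit required)
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```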
Docker Compose Setup
```yaml
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama:
```
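Assuming the file is saved as docker-compose.yml (or compose.yaml), bringing the stack up and checking it looks like this:

```bash
# Start the service in the background
docker compose up -d

# Follow the server logs
docker compose logs -f ollama

# Pull and run a model inside the container
docker compose exec ollama ollama run tinyllama
```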
Post-Installation Steps
- Verify Installation
ollama --version
- Pull Your First Model
# Pull a lightweight model
ollama pull tinyllama
# Or a more capable model
ollama pull llama2
- Test the Installation
ollama run llama2 "What is the meaning of life?"
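Beyond the CLI, the installation can also be verified against the REST API that Ollama exposes on port 11434, using the documented /api/tags and /api/generate endpoints:

```bash
# List locally available models
curl http://localhost:11434/api/tags

# Request a single, non-streamed completion
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```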
Environment Configuration
GPU Setup (Linux)
- Install NVIDIA Drivers:
sudo ubuntu-drivers autoinstall
- Install NVIDIA Container Toolkit (the commands below use NVIDIA's older apt-key based repository setup; apt-key is deprecated on newer Ubuntu/Debian releases, where NVIDIA's current Container Toolkit instructions should be preferred):
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
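After the drivers and toolkit are in place, it is worth confirming that Ollama is actually offloading to the GPU. A sketch using standard tools (recent Ollama versions report processor placement via ollama ps):

```bash
# Load a model, then check where it is running
ollama run llama2 "hello" > /dev/null
ollama ps   # the PROCESSOR column should report GPU usage, e.g. "100% GPU"

# Watch GPU memory while the model is loaded
watch -n 1 nvidia-smi
```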
Advanced Configuration
Custom Model Directory
# Set custom model directory
export OLLAMA_MODELS=/path/to/models
# Or add to .bashrc/.zshrc
echo 'export OLLAMA_MODELS=/path/to/models' >> ~/.bashrc
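Note that OLLAMA_MODELS must be visible to the server process, not just your interactive shell. On Linux installs that run Ollama under systemd, one way to make the setting persistent is a service override (a sketch; the path is a placeholder):

```bash
# Create a systemd override for the ollama service
sudo systemctl edit ollama
# In the editor, add:
#   [Service]
#   Environment="OLLAMA_MODELS=/path/to/models"

# Reload and restart so the server picks up the new location
sudo systemctl daemon-reload
sudo systemctl restart ollama
```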
API Configuration
# Set custom API host
export OLLAMA_HOST=0.0.0.0:11434
# Set custom origins
export OLLAMA_ORIGINS=http://localhost:3000
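Setting OLLAMA_HOST to 0.0.0.0 makes the API reachable from other machines on the network, and OLLAMA_ORIGINS controls which browser origins may call it. A quick check from another host (the IP below is a placeholder for your server's address):

```bash
# From another machine: confirm the server is reachable
curl http://192.168.1.50:11434/api/tags

# Multiple origins can be supplied as a comma-separated list
export OLLAMA_ORIGINS="http://localhost:3000,https://app.example.com"
```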
Troubleshooting
Common Issues and Solutions
- Port Conflict
# Check if port 11434 is in use
sudo lsof -i :11434
# Use an alternative port (both the server and the CLI read OLLAMA_HOST; restart ollama serve after changing it)
export OLLAMA_HOST=127.0.0.1:11435
- Memory Issues
# If a model exhausts RAM or VRAM, switch to a smaller or more heavily
# quantized model, or reduce the layers offloaded to the GPU with the
# num_gpu parameter (see the Modelfile sketch after this list)
- GPU Not Detected
# Verify NVIDIA installation
nvidia-smi
# Check Docker GPU support (substitute a CUDA base image tag currently published on Docker Hub)
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
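Where GPU memory is the bottleneck, one option (referenced in the memory issues item above) is to derive a variant of a model with GPU offload reduced or disabled, since num_gpu controls how many layers are sent to the GPU. A minimal sketch using a hypothetical Modelfile:

```bash
# Create a CPU-only variant of an existing model
cat > Modelfile.cpu <<'EOF'
FROM llama2
PARAMETER num_gpu 0
EOF

ollama create llama2-cpu -f Modelfile.cpu
ollama run llama2-cpu "Hello"
```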
Best Practices
- Resource Management
- Monitor system resources during model runs
- Use appropriate model sizes for your hardware
- Consider quantized models for resource-constrained environments
- Security
- Keep Ollama updated
- Use firewall rules to restrict access (see the ufw sketch after this list)
- Run as non-root user when possible
- Performance Optimization
- Use GPU acceleration when available
- Implement proper cooling for extended runs
- Monitor memory usage with large models
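For the firewall point above, here is a sketch using ufw on Ubuntu that restricts the API to a local subnet (the subnet and port are examples; adjust them to your network):

```bash
# Allow the Ollama API only from the local subnet, deny everything else on that port
sudo ufw allow from 192.168.1.0/24 to any port 11434 proto tcp
sudo ufw deny 11434/tcp
sudo ufw enable
sudo ufw status verbose
```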
Conclusion
Ollama provides a flexible and powerful way to run LLMs locally. This installation guide should help you get started with your local AI deployment. Remember to check the official documentation for the latest updates and features.