Want to run powerful AI models locally and access them remotely through a user-friendly interface? This guide walks through a Docker Compose setup that combines Ollama, Open WebUI, and a Cloudflare Tunnel to make that possible.
Prerequisites:
- Supported NVIDIA GPU (for efficient model inference)
- NVIDIA Container Toolkit (to manage GPU resources)
- Docker Compose (to orchestrate containerized services)
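Before going further, it can help to confirm that the command-line tools behind these prerequisites are present. The following is a minimal sketch, not an official check: the helper name missing_tools is hypothetical, and it assumes docker, nvidia-smi (installed with the NVIDIA driver), and nvidia-ctk (the NVIDIA Container Toolkit CLI) are the binaries worth looking for on your host.

```shell
#!/bin/sh
# Print each named command that is NOT found on PATH, one per line.
# (missing_tools is a hypothetical helper name used for illustration.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (run on the Docker host):
#   missing_tools docker nvidia-smi nvidia-ctk
#   docker compose version   # confirms the Compose plugin is installed
```

If the function prints nothing, all of the listed tools were found. It only checks that the binaries exist, not that the GPU is actually usable from inside a container.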
Understanding the Services:
- webui (ghcr.io/open-webui/open-webui:main):
This acts as the web interface, allowing you to interact with your Ollama AI models visually.
- ollama (Optional – ollama/ollama):
This is the AI model server itself. It can leverage your NVIDIA GPU for faster inference tasks.
- tunnel (cloudflare/cloudflared:latest):
This service establishes a secure tunnel to your web UI via Cloudflare, enabling safe remote access.
Volumes and Environment Variables:
- Two volumes, ollama and open-webui, are defined to store data persistently across container restarts. This ensures your models and configurations remain intact.
- The crucial environment variable is OLLAMA_BASE_URL. Make sure it points to the internal network URL of the ollama service (http://ollama:11434 when ollama runs in the same Compose project). If ollama runs directly on your Docker host instead, you can use host.docker.internal as the address.
Deployment and Access:
- Deployment: Execute docker compose up -d to start all services in detached mode, running them in the background.
- Local Access: If you only need to access the web UI locally, navigate to http://localhost:8080 in your web browser.
- Remote Access: To access your AI models remotely, locate the Cloudflare Tunnel URL printed in the Docker logs. Use docker compose logs tunnel to retrieve this URL. You can then access your models from anywhere with an internet connection, provided you have the URL.
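The tunnel logs can be long, so picking the URL out by eye is tedious. Here is a small sketch that filters it out, assuming the tunnel runs as a Cloudflare quick tunnel whose URL ends in trycloudflare.com. The function name extract_tunnel_url is hypothetical, and the exact log layout may differ between cloudflared versions.

```shell
#!/bin/sh
# Read log text on stdin and print the first trycloudflare.com URL found.
# Assumption: the tunnel is a quick tunnel with a *.trycloudflare.com address.
extract_tunnel_url() {
    grep -Eo 'https://[a-zA-Z0-9-]+\.trycloudflare\.com' | head -n 1
}

# Example (assumes the stack is already running):
#   docker compose logs tunnel 2>&1 | extract_tunnel_url
```

If nothing is printed, the tunnel may not have finished connecting yet; rerunning the command after a few seconds is usually enough.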
Benefits:
- Simplified AI Model Management: Interact with your AI models through the Open WebUI browser interface.
- Remote Accessibility: Securely access your models from any location with a web browser thanks to Cloudflare’s tunneling capabilities.
- GPU Acceleration (Optional): Leverage your NVIDIA GPU for faster model inference, speeding up tasks.
Getting Started
- Install Docker
curl -sSL https://get.docker.com/ | sh
Writing a Docker Compose file
services:
  webui:
    image: ghcr.io/open-webui/open-webui:main
    expose:
      - 8080/tcp
    ports:
      - 8080:8080/tcp
    environment:
      # Reach ollama by its Compose service name on the internal network.
      # Use http://host.docker.internal:11434 only if Ollama runs on the host instead.
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama
  ollama:
    image: ollama/ollama
    expose:
      - 11434/tcp
    ports:
      - 11434:11434/tcp
    healthcheck:
      test: ollama --version || exit 1
    command: serve
    volumes:
      - ollama:/root/.ollama
    # Reserve the NVIDIA GPU(s) for this container
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['all']
              capabilities: [gpu]
  tunnel:
    image: cloudflare/cloudflared:latest
    restart: unless-stopped
    environment:
      # The tunnel forwards traffic to the web UI over the internal network
      - TUNNEL_URL=http://webui:8080
    command: tunnel --no-autoupdate
    depends_on:
      - webui
volumes:
  ollama:
  open-webui:
The Compose file defines the individual services that make up the entire application. Here, we have three services:
- webui
- ollama
- tunnel
The webui service acts as your user interface for interacting with Ollama AI models. It fetches data from the optional ollama service (the AI model server) running on the same network, and lets you manage and use your models visually. You can access the web interface at http://localhost:8080 if running locally. The ollama service itself (optional) handles running your models, and can leverage your NVIDIA GPU for faster computations. Finally, the tunnel service provides a secure way to access the web interface remotely through Cloudflare.
Bringing up the Stack
docker compose up -d
You will see the following services:
docker compose ps
NAME                  IMAGE                                COMMAND                  SERVICE   CREATED              STATUS                        PORTS
cloudflare-ollama-1   ollama/ollama                        "/bin/ollama serve"      ollama    About a minute ago   Up About a minute (healthy)   0.0.0.0:11434->11434/tcp
cloudflare-tunnel-1   cloudflare/cloudflared:latest        "cloudflared --no-au…"   tunnel    About a minute ago   Up About a minute
cloudflare-webui-1    ghcr.io/open-webui/open-webui:main   "bash start.sh"          webui     About a minute ago   Up About a minute             0.0.0.0:8080->8080/tcp
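A container listed as Up may still need a few seconds before its HTTP endpoint answers. One way to wait for that is a small retry loop. This is only a sketch: the retry helper is hypothetical, and the example usage assumes curl is installed and that the ports match the Compose file above.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS tries have been used.
# (retry is a hypothetical helper written for this example.)
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Example (assumes curl is available on the host):
#   retry 30 2 curl -fsS -o /dev/null http://localhost:8080  && echo "web UI is up"
#   retry 30 2 curl -fsS -o /dev/null http://localhost:11434 && echo "ollama is up"
```

The helper returns a non-zero status once the attempts run out, so it can also gate later steps in a setup script.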
Conclusion
With Ollama, Open WebUI, and Cloudflare working together, you can run AI models on your own GPU and reach them through a web interface, both locally and remotely.