Avinash Bendigeri Avinash is a developer-turned Technical writer skilled in core content creation. He has an excellent track record of blogging in areas like Docker, Kubernetes, IoT and AI.

The Ollama Docker Compose Setup with WebUI and Remote Access via Cloudflare


Want to run AI models locally and reach them remotely through a browser-based interface? This guide walks through a Docker Compose setup that combines Ollama, Open WebUI, and a Cloudflare Tunnel for secure remote access.

Prerequisites:

  • Supported NVIDIA GPU (for efficient model inference)
  • NVIDIA Container Toolkit (to manage GPU resources)
  • Docker Compose (to orchestrate containerized services)
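A quick way to confirm these prerequisites is to check that the relevant commands are on your PATH. The sketch below is only illustrative: missing_tools is a hypothetical helper name, and it assumes nvidia-smi is provided by the NVIDIA driver and nvidia-ctk by the NVIDIA Container Toolkit.

```shell
# Hypothetical helper: print each named command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# docker      -> Docker Engine
# nvidia-smi  -> NVIDIA driver (assumed to ship this tool)
# nvidia-ctk  -> NVIDIA Container Toolkit CLI (assumed to ship this tool)
missing=$(missing_tools docker nvidia-smi nvidia-ctk)

if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing prerequisites:" $missing
else
  # Confirms that the Compose plugin is installed alongside Docker.
  docker compose version
fi
```

If you plan to run without GPU acceleration, only the docker check matters.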

Understanding the Services:

  • webui (ghcr.io/open-webui/open-webui:main):
    This acts as the web interface, allowing you to interact with your Ollama AI models visually.
  • ollama (Optional – ollama/ollama):
    This is the AI model server itself. It can leverage your NVIDIA GPU for faster inference tasks.
  • tunnel (cloudflare/cloudflared:latest):
    This service establishes a secure tunnel to your web UI via Cloudflare, enabling safe remote access.

Volumes and Environment Variables:

  • Two volumes, ollama and open-webui, are defined to store data persistently across container restarts. This ensures your models and configurations remain intact.
  • The crucial environment variable is OLLAMA_BASE_URL. Make sure it points to a URL at which the webui container can reach Ollama. When ollama runs as a service in the same Compose file, that is http://ollama:11434. If Ollama runs directly on your Docker host instead, you can use host.docker.internal as the address.
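Once the stack is running, a wrong base URL usually shows up as an empty model list in the web UI. One way to rule out the server side is to check that Ollama answers at all. The sketch below probes it from the Docker host through the published port 11434; ollama_reachable is a hypothetical helper name, and the check relies on Ollama replying to a plain GET on its root URL with the text "Ollama is running".

```shell
# Hypothetical helper: succeed only if an Ollama server answers at the
# given base URL with its plain-text banner.
ollama_reachable() {
  curl -fsS --max-time 5 "$1" 2>/dev/null | grep -q "Ollama is running"
}

# Probe from the Docker host, using the port published in the Compose file.
if ollama_reachable "http://localhost:11434"; then
  echo "ollama is reachable"
else
  echo "ollama is not reachable; check the URL and that the container is up"
fi
```

To see the value the web UI container actually received, docker compose exec webui printenv OLLAMA_BASE_URL should print it, assuming the image ships printenv.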

Deployment and Access:

  • Deployment: Execute docker compose up -d to start all services in detached mode, running them in the background.
  • Local Access: If you just need to access the web UI locally, simply navigate to http://localhost:8080 in your web browser.
  • Remote Access: To access your AI models remotely, retrieve the Cloudflare Tunnel URL from the tunnel service's logs with docker compose logs tunnel. You can then open the web UI from anywhere with an internet connection. Anyone who has the URL can reach it too, so share it carefully.
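The tunnel logs are noisy, so it can help to filter out just the URL. The sketch below assumes the tunnel runs as a Cloudflare quick tunnel, which is assigned a random subdomain under trycloudflare.com; extract_tunnel_url is a hypothetical helper name. It keeps the last match because each restart of the tunnel gets a new URL and older ones in the logs are stale.

```shell
# Hypothetical helper: read log text on stdin and print the last
# https://<random-name>.trycloudflare.com URL found in it.
extract_tunnel_url() {
  grep -oE 'https://[a-z0-9-]+\.trycloudflare\.com' | tail -n 1
}

# Only query the logs when Docker is available on this machine.
if command -v docker >/dev/null 2>&1; then
  docker compose logs tunnel 2>&1 | extract_tunnel_url
fi
```

If nothing is printed, the tunnel may still be starting; wait a few seconds and run it again.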

Benefits:

  • Simplified AI Model Management: Interact with your AI models through the Open WebUI browser interface.
  • Remote Accessibility: Securely access your models from any location with a web browser thanks to Cloudflare’s tunneling capabilities.
  • GPU Acceleration (Optional): Leverage your NVIDIA GPU for faster model inference, speeding up tasks.

Getting Started

  • Install Docker
curl -sSL https://get.docker.com/ | sh

Writing a Docker Compose file

services:

  webui:
    image: ghcr.io/open-webui/open-webui:main
    expose:
     - 8080/tcp
    ports:
     - 8080:8080/tcp
    environment:
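      # Reach the ollama service by its Compose service name. Use
      # http://host.docker.internal:11434 only if Ollama runs on the Docker host.
      - OLLAMA_BASE_URL=http://ollama:11434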
    volumes:
      - open-webui:/app/backend/data
    depends_on:
     - ollama

  ollama:
    image: ollama/ollama
    expose:
     - 11434/tcp
    ports:
     - 11434:11434/tcp
    healthcheck:
      test: ollama --version || exit 1
    command: serve
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['all']
              capabilities: [gpu]

  tunnel:
    image: cloudflare/cloudflared:latest
    restart: unless-stopped
    environment:
      - TUNNEL_URL=http://webui:8080
    command: tunnel --no-autoupdate
    depends_on:
      - webui

volumes:
  ollama:
  open-webui:

The Compose file defines the three services that make up the application: webui, ollama, and tunnel.

The webui service acts as your user interface for interacting with Ollama AI models. It fetches data from the optional ollama service (the AI model server) running on the same network, and lets you manage and use your models visually. You can access the web interface at http://localhost:8080 if running locally. The ollama service itself (optional) handles running your models, and can leverage your NVIDIA GPU for faster computations. Finally, the tunnel service provides a secure way to access the web interface remotely through Cloudflare.
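Before starting anything, it can be worth validating the file and confirming that it defines the services described above. The sketch below uses docker compose config for both steps; check_compose is a hypothetical helper name, and docker-compose.yml is an assumed filename, since this guide does not name the file.

```shell
# Hypothetical helper: validate a Compose file, then list its services.
check_compose() {
  file="${1:-docker-compose.yml}"   # assumed default filename
  # --quiet validates the file without printing the resolved configuration.
  docker compose -f "$file" config --quiet || return 1
  # --services prints one service name per line.
  docker compose -f "$file" config --services
}

# Run the check only when Docker and the file are both present.
if command -v docker >/dev/null 2>&1 && [ -f docker-compose.yml ]; then
  check_compose docker-compose.yml
fi
```

For the file in this guide, the output should list webui, ollama, and tunnel. A YAML or schema error makes the first command fail with a message pointing at the problem.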

Bringing up the Stack

docker compose up -d

Then list the running services to confirm that all three are up:

docker compose ps
NAME                  IMAGE                                COMMAND                  SERVICE   CREATED              STATUS                        PORTS
cloudflare-ollama-1   ollama/ollama                        "/bin/ollama serve"      ollama    About a minute ago   Up About a minute (healthy)   0.0.0.0:11434->11434/tcp
cloudflare-tunnel-1   cloudflare/cloudflared:latest        "cloudflared --no-au…"   tunnel    About a minute ago   Up About a minute
cloudflare-webui-1    ghcr.io/open-webui/open-webui:main   "bash start.sh"          webui     About a minute ago   Up About a minute             0.0.0.0:8080->8080/tcp
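A container that shows as "Up" may still be initializing, and the web UI in particular can take a while on first start. A simple way to wait for it is to poll the published port until it answers. This is a minimal sketch: wait_for_url is a hypothetical helper name, and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: poll a URL until it answers with a successful HTTP
# status, or give up after a number of attempts.
#   $1 = URL, $2 = max attempts (default 30), $3 = seconds between tries (default 2)
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; the response body is discarded.
    if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
      echo "up after $i attempt(s): $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "still down after $attempts attempt(s): $url"
  return 1
}
```

With the ports published in this guide's Compose file, wait_for_url http://localhost:8080 waits for the web UI and wait_for_url http://localhost:11434 waits for Ollama.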

Conclusion

This setup lets you use your AI models both locally and remotely. With Ollama, Open WebUI, and Cloudflare working in tandem, you get an accessible platform for exploring and running AI models.

Have Queries? Join https://launchpass.com/collabnix
