Join our Discord Server
Avinash Bendigeri Avinash is a developer-turned Technical writer skilled in core content creation. He has an excellent track record of blogging in areas like Docker, Kubernetes, IoT and AI.

Ollama: A Lightweight, Extensible Framework for Building Language Models


With over 1,000,000 Docker pulls, Ollama is a highly popular, lightweight, and extensible framework for building and running language models on your local machine. With Ollama, all your interactions with large language models happen locally, without sending private data to third-party services. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be used in a variety of applications.

Key Features

Some of the key features of Ollama include:

  • Ease of use: Ollama is easy to install and use, even for users with no prior experience with language models.
  • Extensible: Ollama is highly extensible, allowing users to create their own custom models or import models from other sources.
  • Powerful: Ollama models can generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.
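
As a rough illustration of the "create their own custom models" point, the sketch below writes a small Modelfile that layers a system prompt and a sampling parameter on top of a library model. The helper name write_modelfile, the model name docker-helper, and the persona text are made up for this example; llama2 is the model used later in this post.

```shell
#!/bin/sh
# Write a minimal Modelfile to the path given as $1.
# FROM picks the base model, PARAMETER sets a sampling option,
# SYSTEM sets the system prompt. The persona is a hypothetical example.
write_modelfile() {
  cat > "$1" <<'EOF'
FROM llama2
PARAMETER temperature 0.7
SYSTEM You are a concise assistant that answers questions about Docker.
EOF
}

# Typical use on a machine with Ollama installed (not executed here):
#   write_modelfile ./Modelfile
#   ollama create docker-helper -f ./Modelfile
#   ollama run docker-helper
```

The custom model reuses the base model's weights, so only the small Modelfile needs to be shared or version-controlled.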

Ollama is a valuable tool for researchers, developers, and anyone who wants to experiment with language models. It is also a great platform for building language-based applications, such as chatbots, summarization tools, and creative writing assistants.

Benefits of using Ollama

Here are some of the benefits of using Ollama:

  • Reduced development time: Ollama’s simple API and library of pre-built models can save developers a significant amount of time and effort.
  • Improved model performance: Ollama’s extensible architecture lets developers adapt models to specific tasks or applications, for example by customizing a library model or importing one that was fine-tuned elsewhere.
  • Increased flexibility: Ollama can be used to build a wide variety of language-based applications, limited only by the developer’s imagination.

If you are looking for a powerful, easy-to-use, and extensible framework for building language models, Ollama is a great option. With its wide range of features and benefits, Ollama can help you create innovative and effective language-based applications.

Ollama is now available as an official Docker image

Ollama is now available as an official Docker-sponsored open-source image, making it simpler to get up and running with large language models using Docker containers.

Getting Started


Ollama handles running the model with GPU acceleration. It provides both a simple CLI and a REST API for interacting with your applications.

To get started, simply download from this link and install Ollama.

Note: The Ollama team recommends running Ollama alongside Docker Desktop for macOS so that Ollama can enable GPU acceleration for models.


Ollama can run with GPU acceleration inside Docker containers for Nvidia GPUs.

To get started using the Docker image, please use the commands below.

CPU only

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Nvidia GPU

  1. Install the Nvidia container toolkit.
  2. Run Ollama inside a Docker container:

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
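
The CPU-only and Nvidia GPU commands above differ only in the --gpus=all flag. The sketch below prints the matching command for a given mode. The function names ollama_docker_cmd and detect_mode are hypothetical, and treating "nvidia-smi is on the PATH" as a sign of a usable GPU is an assumption of this example, not an Ollama rule; the Nvidia container toolkit still has to be installed.

```shell
#!/bin/sh
# Print the docker run command for the given mode.
# "gpu" adds --gpus=all; any other value yields the CPU-only command.
ollama_docker_cmd() {
  if [ "$1" = "gpu" ]; then
    gpu_flag="--gpus=all "
  else
    gpu_flag=""
  fi
  printf 'docker run -d %s-v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama\n' "$gpu_flag"
}

# Heuristic (assumption): call the host GPU-capable when nvidia-smi is found.
detect_mode() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    echo gpu
  else
    echo cpu
  fi
}
```

Running ollama_docker_cmd "$(detect_mode)" prints the command so it can be reviewed before it is executed.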

Run a model

Now you can run a model like Llama 2 inside the container.

docker exec -it ollama ollama run llama2
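
The REST API mentioned under Getting Started can also be called from the shell once the container is running. This is a minimal sketch, assuming the server is reachable at localhost:11434 (the port published by the docker run commands above) and that the llama2 model has already been pulled. The function names and the OLLAMA_URL variable are hypothetical, and the payload builder does no JSON escaping, so the prompt should not contain double quotes or backslashes.

```shell
#!/bin/sh
# Assumption: Ollama listens on the port published by the docker run commands.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Build the JSON body for the /api/generate endpoint.
# $1 = model name, $2 = prompt. "stream": false asks for a single JSON reply.
# No escaping is done: keep double quotes and backslashes out of the prompt.
build_generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# POST the payload to the server and print the raw JSON reply.
ollama_generate() {
  build_generate_payload "$1" "$2" |
    curl -s "$OLLAMA_URL/api/generate" -d @-
}
```

For example, ollama_generate llama2 "Why is the sky blue?" should return a JSON object whose response field holds the generated text.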


curl | sh

Get Connected

Ollama has an active Discord community with more than 9,000 members.

Model library

As published in their GitHub repo, Ollama supports a list of open-source models available on

Here are some example open-source models that can be downloaded:

Model                 Parameters   Size     Download
Mistral               7B           4.1GB    ollama run mistral
Llama 2               7B           3.8GB    ollama run llama2
Code Llama            7B           3.8GB    ollama run codellama
Llama 2 Uncensored    7B           3.8GB    ollama run llama2-uncensored
Llama 2 13B           13B          7.3GB    ollama run llama2:13b
Llama 2 70B           70B          39GB     ollama run llama2:70b
Orca Mini             3B           1.9GB    ollama run orca-mini
Vicuna                7B           3.8GB    ollama run vicuna

Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.
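
The RAM guidance in the note can be turned into a quick pre-flight check. The sketch below encodes only the three figures stated above; the function names min_ram_gb and fits_in_ram are hypothetical, and sizes the note does not cover (such as 70B) are reported as unknown rather than guessed.

```shell
#!/bin/sh
# Map a parameter-count label to the minimum RAM (in GB) given in the note.
# Prints "unknown" and returns 1 for sizes the note does not cover.
min_ram_gb() {
  case "$1" in
    3B|3b)   echo 8 ;;
    7B|7b)   echo 16 ;;
    13B|13b) echo 32 ;;
    *)       echo "unknown"; return 1 ;;
  esac
}

# Succeed when the available RAM ($2, whole GB) meets the minimum for size $1.
# Returns 2 when the size is not covered by the note.
fits_in_ram() {
  need=$(min_ram_gb "$1") || return 2
  [ "$2" -ge "$need" ]
}
```

For instance, fits_in_ram 7B 16 succeeds, while fits_in_ram 13B 16 fails because the note asks for 32 GB.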

Community Integrations


Web & Desktop


Package managers



  • Maid (Mobile Artificial Intelligence Distribution)

Extensions & Plugins
