Ajeet Raina — Ajeet Singh Raina is a former Docker Captain, Community Leader and Distinguished Arm Ambassador. He is the founder of the Collabnix blogging site and has authored more than 700 blogs on Docker, Kubernetes and cloud-native technology. He runs a community Slack of 9,800+ members and a Discord server of close to 2,600 members. You can follow him on Twitter (@ajeetsraina).

Running Llama 2 on NVIDIA Jetson Nano with GPU using Docker


Ollama is a rapidly growing developer tool for running large language models (LLMs) locally, with 10,000 Docker Hub pulls in a short period of time. In this post we use it to run Llama 2, an LLM from Meta AI trained on a massive dataset of text and code. Llama 2 can generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.

To run Ollama on a Jetson Nano, you will need the following:

Hardware

  • Jetson Nano (the 4GB model)
  • A 5V 4A power supply
  • 64GB SD card

Software

  • Docker Engine
  • The Ollama Docker image

Preparing Your Jetson Nano

1. Flashing the Jetson SD Card Image

  • Download and unzip the Jetson Nano SD card image from NVIDIA.
  • Insert the SD card into your system.
  • Bring up the Etcher tool, select the image, and select the target SD card to flash it to.

2. Verifying that it ships with Docker binaries

ajeetraina@ajeetraina-desktop:~$ sudo docker version

3. Checking Docker runtime

Starting with JetPack 4.2, NVIDIA has introduced a container runtime with Docker integration. This custom runtime enables Docker containers to access the underlying GPUs available in the Jetson family.

pico@pico1:/tmp/docker-build$ sudo nvidia-docker version
NVIDIA Docker: 2.0.3
Client:
 Version:           19.03.6
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        369ce74a3c
 Built:             Fri Feb 28 23:47:53 2020
 OS/Arch:           linux/arm64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.6
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       369ce74a3c
  Built:            Wed Feb 19 01:06:16 2020
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.3.3-0ubuntu1~18.04.2
  GitCommit:        
 runc:
  Version:          spec: 1.0.1-dev
  GitCommit:        
 docker-init:
  Version:          0.18.0
  GitCommit:

Setting up Docker

Jetson Nano comes with Docker installed by default. To upgrade to the latest version of Docker, follow these steps:

Update the package list:

sudo apt update

Install Docker:

sudo curl -sSL https://get.docker.com/ | sh

Add your user to the Docker group:

sudo groupadd docker
sudo usermod -aG docker $USER

Log out and back in for the changes to take effect.
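After logging back in, you can confirm that Docker is usable without sudo. A quick smoke test (assuming the group change has taken effect):

```shell
# Confirm the daemon is reachable without sudo
docker info --format '{{.ServerVersion}}'

# Run a throwaway container end to end (removed on exit)
docker run --rm hello-world
```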

Install the NVIDIA Container Toolkit with Apt

Configure the repository

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
    | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
    | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
    | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update

Install the NVIDIA Container Toolkit packages

sudo apt-get install -y nvidia-container-toolkit

Configure Docker to use the NVIDIA runtime

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
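To verify that the nvidia runtime was registered with Docker, you can inspect the daemon's runtime list (the exact output varies by system):

```shell
# "nvidia" should appear alongside the default "runc" runtime
sudo docker info --format '{{json .Runtimes}}'
```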

Start the container

sudo docker run -d --gpus=all --runtime=nvidia -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
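Since the container publishes Ollama's HTTP API on port 11434 (as mapped above), a quick way to confirm it is up and serving:

```shell
# The container should be listed as running
sudo docker ps --filter name=ollama

# The Ollama API answers on the mapped port; /api/tags lists local models
curl http://localhost:11434/api/tags
```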

Run a model locally

Now you can run a model:

sudo docker exec -it ollama ollama run llama2
pulling manifest
pulling 8daa9615cce3...   7% |█                     | (280 MB/3.8 GB, 4.4 MB/s) [1m15s:13m13s]

The command sudo docker exec -it ollama ollama run llama2 starts the Llama 2 model inside the ollama container. This allows you to interact with the model directly from the command line.

To use the Llama 2 model, you can send it text prompts and it will generate text in response. For example, to generate a poem about a cat, you would run the following command:

docker exec -it ollama ollama run llama2 "Write a poem about a cat."

This will generate a poem about a cat and print it to the console. You can also use the Llama 2 model to translate languages, write different kinds of creative content, and answer questions in an informative way.

Experiment with different prompts to test the capabilities of the Llama 2 model.

Here are some examples of prompts you can try:

  • Translate the sentence "Hello, world!" into Spanish.
  • Write a short story about a robot who falls in love with a human.
  • Generate a list of ideas for new products.
  • Answer the question "What is the meaning of life?"
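The same prompts can also be sent over Ollama's HTTP API instead of docker exec. A sketch, assuming the port mapping used above:

```shell
# Non-streaming generation request against the llama2 model
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Translate the sentence \"Hello, world!\" into Spanish.",
  "stream": false
}'
```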

Models from the Ollama library can be customized with a prompt. The following example customizes the llama2 model. First, pull it:

ollama pull llama2

Create a Modelfile

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Create and run the model

ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
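If you are running Ollama inside the container as above, prefix the commands with docker exec; the new model then shows up in the local model list:

```shell
sudo docker exec -it ollama ollama list
```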

Conclusion

The Llama 2 model is still under active development, but it has the potential to be a powerful tool for a variety of tasks, and Ollama makes it easy to run on devices like the Jetson Nano.

Have Queries? Join https://launchpass.com/collabnix
