Ajeet Raina
Ajeet Singh Raina is a former Docker Captain, Community Leader and Distinguished Arm Ambassador. He is the founder of the Collabnix blogging site and has authored more than 700 blog posts on Docker, Kubernetes and cloud-native technology. He runs a community Slack of 9,800+ members and a Discord server of close to 2,600 members. You can follow him on Twitter (@ajeetsraina).

Running Llama 3 Locally with Ollama and Ollama WebUI

3 min read

Meta (formerly Facebook) has just released Llama 3, a groundbreaking large language model (LLM) that promises to push the boundaries of what AI can achieve. The exciting news? It’s available now through Ollama, an open-source platform!

Get Started with Llama 3

Ready to experience the power of Llama 3? Here’s all you need to do:

  • Download Ollama from https://ollama.com/ and get ready to explore!
  • Open your terminal and type the following command:
ollama run llama3:70b
pulling manifest
pulling 00e1317cbf74... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████▏ 39 GB
pulling 4fa551d4f938... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████▏  12 KB
pulling 8ab4849b038c... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████▏  254 B
pulling 577073ffcc6c... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████▏  110 B
pulling ad1518640c43... 100% ▕██████████████████████████████████████████████████████████████████████████████████████████████████████▏  483 B
verifying sha256 digest
writing manifest
removing any unused layers
success
>>> Send a message (/? for help)
>>> Write a Python script for Sentiment Analysis
Here is a simple Python script that uses the Natural Language Toolkit (NLTK) and its VADER sentiment analyzer to score the sentiment of a piece of text:


import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download('vader_lexicon')  # VADER sentiment lexicon
nltk.download('punkt')          # tokenizer models required by sent_tokenize

text = "I love this product, it's amazing! The customer service is also great."
sia = SentimentIntensityAnalyzer()

# Score each sentence and collect the compound (overall) scores
sentences = nltk.sent_tokenize(text)
compound_scores = []

for sentence in sentences:
    sentiment_scores = sia.polarity_scores(sentence)
    compound_scores.append(sentiment_scores['compound'])

# Average the per-sentence scores and map the result to a label
overall = sum(compound_scores) / len(compound_scores)
label = 'positive' if overall >= 0.05 else 'negative' if overall <= -0.05 else 'neutral'
print(f"Overall sentiment: {label} (compound score: {overall:.3f})")

Congratulations, you’re now connected to the power of Llama 3!
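Behind the scenes, `ollama run` talks to a local REST server on port 11434. If you would rather script your prompts than type them interactively, here is a minimal sketch against Ollama's `/api/generate` endpoint; it assumes Ollama is running locally on the default port with a `llama3` model already pulled, and the `ask`/`build_payload` helper names are mine:

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint


def build_payload(model: str, prompt: str) -> dict:
    """Build the request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    try:
        print(ask("llama3", "Explain containers in one sentence."))
    except urllib.error.URLError:
        print("Could not reach Ollama -- is it running on port 11434?")
```

With `"stream": False`, the server returns one JSON object whose `response` field holds the full completion; drop that flag if you want token-by-token streaming instead.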

Setting up Ollama WebUI

Clone the official repository of Ollama WebUI

git clone https://github.com/ollama-webui/ollama-webui
cd ollama-webui

Open the Docker Compose file to inspect the services it defines:

version: '3.6'

services:
  ollama:
    volumes:
      - ollama:/root/.ollama
    # Uncomment below to expose Ollama API outside the container stack
    # ports:
    #   - 11434:11434
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:latest

  ollama-webui:
    build:
      context: .
      args:
        OLLAMA_API_BASE_URL: '/ollama/api'
      dockerfile: Dockerfile
    image: ollama-webui:latest
    container_name: ollama-webui
    depends_on:
      - ollama
    ports:
      - 3000:8080
    environment:
      - "OLLAMA_API_BASE_URL=http://ollama:11434/api"
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

volumes:
  ollama: {}

Ensure that you stop the Ollama Docker container before you run the following command:

docker compose up -d

Access the Ollama WebUI

Open Docker Dashboard > Containers and click on the published port (3000) of the ollama-webui container to open Ollama WebUI in your browser. In the model selector, start typing llama3:70b to download this latest model.
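Model downloads can also be triggered without the UI: the WebUI talks to Ollama's `/api/pull` endpoint, which you can call yourself. A rough sketch follows (the `pull_model` helper name is mine, and the Ollama port must be reachable from the host, which in the Compose file above means uncommenting its ports section):

```python
import json
import urllib.error
import urllib.request


def build_pull_request(name: str) -> dict:
    """Request body for Ollama's /api/pull endpoint."""
    return {"name": name, "stream": False}


def pull_model(name: str, base_url: str = "http://localhost:11434") -> str:
    """Ask the Ollama server to download a model; returns the final status."""
    data = json.dumps(build_pull_request(name)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/pull", data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read()).get("status", "")


if __name__ == "__main__":
    try:
        print(pull_model("llama3:8b"))  # reports "success" once layers are downloaded
    except urllib.error.URLError:
        print("Ollama API not reachable on port 11434")
```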

Using Llama 3 with the Docker GenAI Stack

Have you tried Llama 3 with the Docker GenAI Stack? It’s easy. Just change the following entries in your .env file and you’re all set to leverage the latest Meta Llama 3:

cat .env
OPENAI_API_KEY=sk-EsNJzI5uXXXXXXXXdsJ7gr0Htnig8KIil4x
OLLAMA_BASE_URL=http://host.docker.internal:11434
NEO4J_URI=neo4j://database:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=XXXX
LLM=llama3:8b #or any Ollama model tag, or gpt-4 or gpt-3.5
EMBEDDING_MODEL=sentence_transformer #or openai or ollama

LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_TRACING_V2=true # false
LANGCHAIN_PROJECT=default
LANGCHAIN_API_KEY=ls__cbabXXXXXXXXd6106dd

That’s it. Follow this guide for the complete instructions.
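For context, an application in the stack typically reads these values at startup and falls back to sensible defaults. A small illustrative helper (the `config_from_env` name and its defaults are mine, not part of the GenAI Stack):

```python
import os


def config_from_env(env=None):
    """Collect the LLM-related settings, mirroring the .env entries above."""
    env = os.environ if env is None else env
    return {
        "llm": env.get("LLM", "llama3:8b"),
        "embedding_model": env.get("EMBEDDING_MODEL", "sentence_transformer"),
        "ollama_base_url": env.get(
            "OLLAMA_BASE_URL", "http://host.docker.internal:11434"
        ),
    }


if __name__ == "__main__":
    print(config_from_env({}))  # defaults used when nothing is set
```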

Llama 2 vs Llama 3

Llama 3 represents a large improvement over Llama 2 and other openly available models:

  • Trained on a dataset seven times larger than Llama 2
  • Doubles the context length from Llama 2, to 8K tokens
  • Encodes language much more efficiently using a larger token vocabulary with 128K tokens
  • Produces less than 1/3 of the false “refusals” of Llama 2

Availability of Llama 3

Meta is planning to make Llama 3 models available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, and with support from hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.

Why is Llama 3 Special?

Compared to its predecessors, Llama 3 boasts some impressive upgrades:

  • Massive Knowledge Base: Llama 3 has been trained on a dataset seven times larger than Llama 2, giving it a vast pool of knowledge to draw from.
  • Deeper Context: Understanding context is key to meaningful conversations. Llama 3 doubles the context length of its predecessor, allowing it to analyze and respond to information within a larger window.
  • Efficient Communication: Llama 3 utilizes a larger token vocabulary (128K tokens!), enabling it to encode language more effectively and communicate with greater precision.
  • Reduced Frustration: Have you ever asked a question and gotten a weird response from an AI? Llama 3 suffers from less than a third of the “false refusals” compared to Llama 2, meaning you’re more likely to get a clear and helpful response to your queries.
  • Choose Your Power: Llama 3 comes in two sizes, 8B and 70B parameters. Think of parameters as the building blocks of an LLM’s abilities. The larger 70B version offers more power for complex tasks, while the 8B version is ideal for situations that don’t require as much computational muscle.

Unleashing Llama 3’s Potential

The possibilities with Llama 3 are truly exciting. Here are a few examples of how you can leverage its power:

  • LangChain Integration: For developers, LangChain provides a seamless way to interact with Llama 3. Imagine asking complex questions and receiving informative answers, all within your development workflow!
  • LlamaIndex at Your Service: The LlamaIndex library allows you to leverage Llama 3 for text completion tasks. Stuck on a sentence? Let Llama 3 suggest some creative continuations!
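As a concrete sketch of the LangChain route, the `langchain-community` package ships an `Ollama` LLM wrapper that talks to a local Ollama server. The snippet below guards the import and the call so it degrades gracefully when the package is missing or the server is down; the `answer` helper name is mine:

```python
try:
    # pip install langchain-community; requires a running local Ollama server
    from langchain_community.llms import Ollama
except ImportError:
    Ollama = None


def answer(prompt: str) -> str:
    """Ask llama3 via LangChain if available, else explain what's missing."""
    if Ollama is None:
        return "langchain-community is not installed"
    llm = Ollama(model="llama3")  # assumes `ollama serve` on localhost:11434
    try:
        return llm.invoke(prompt)
    except Exception as exc:  # e.g. Ollama server not reachable
        return f"Ollama call failed: {exc}"


if __name__ == "__main__":
    print(answer("Summarize what a Docker volume is in one sentence."))
```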

The Future of Llama 3

This is just the beginning! Meta has ambitious plans for Llama 3, including:

  • A Gigantic Leap: Get ready for a 400B parameter version of Llama 3, offering even more power and capabilities.
  • Multimodality on the Horizon: Imagine an LLM that can not only understand text but also process images and other formats. That’s what Meta is working on with future iterations of Llama 3!
  • Global Conversations: Breaking down language barriers! Future versions of Llama 3 might be able to converse fluently across multiple languages.
  • Even More Context: The ability to analyze even longer stretches of text will allow Llama 3 to grasp complex topics with even greater depth.

The Takeaway

Llama 3 marks a significant step forward in LLM technology. Its open-source nature and impressive capabilities make it a valuable tool for researchers, developers, and anyone curious about the future of AI. So, download Ollama, fire up Llama 3, and get ready to explore the exciting possibilities of this powerful language model!

Have Queries? Join https://launchpass.com/collabnix

