Adesoji Alu
Adesoji brings a proven ability to apply machine learning (ML) and data science techniques to solve real-world problems. He has experience working with a variety of cloud platforms, including AWS, Azure, and Google Cloud Platform. He has strong skills in software engineering, data science, and machine learning. He is passionate about using technology to make a positive impact on the world.

Setting Up Ollama & Running DeepSeek R1 Locally for a Powerful RAG System

Discover how to create a private AI-powered document analysis system using cutting-edge open-source tools.

System Requirements

  • 16GB RAM minimum
  • 10th Gen Intel Core i5 or equivalent
  • 10GB free storage space
  • Windows 10+/macOS 12+/Linux Ubuntu 20.04+

🛠️ Step 1: Installing Ollama

Download the Ollama installer for macOS or Windows from the official website. On Linux, run the install script:

# For Linux
curl -fsSL https://ollama.ai/install.sh | sh

🤖 What is Ollama?

Ollama is a framework designed for running large language models (LLMs) directly on your local machine. It allows users to download, execute, and interact with AI models without relying on cloud-based APIs.

  • Example: ollama run deepseek-r1:1.5b – Executes DeepSeek R1 locally.
  • Why use it? It offers a free, private, and offline AI experience with low latency.
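Besides the command line, a running Ollama server can also be queried from code over its local HTTP API. The following is a minimal sketch using only the Python standard library. It assumes the server listens on its default port 11434 and that deepseek-r1:1.5b has already been pulled; the helper names build_generate_request and ask_local_model are illustrative and not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "deepseek-r1:1.5b") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "deepseek-r1:1.5b") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Usage (needs a running Ollama server with the model pulled):
#   print(ask_local_model("Explain RAG in one sentence."))
```

Because the request never leaves localhost, the prompt and the answer stay on your machine, which is the privacy benefit described above.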

🔗 What is LangChain?

LangChain is a Python/JavaScript framework that enables the seamless integration of LLMs with various data sources, APIs, and memory systems.

  • Why use it? It helps connect LLMs to applications like chatbots, document processing, and Retrieval-Augmented Generation (RAG) systems.

📄 What is Retrieval-Augmented Generation (RAG)?

RAG is an AI technique that improves the accuracy of LLM responses by incorporating information retrieved from external sources like PDFs and databases.

  • Why use it? It enhances factual correctness and reduces hallucinations by referencing actual documents.
  • Example: An AI-powered Q&A system that fetches relevant document excerpts before generating responses.
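The retrieve-then-generate idea can be sketched in a few lines of plain Python. This is a simplified illustration, not the pipeline used later in this post: it swaps neural embeddings for simple word-count vectors with cosine similarity, and the sample chunks and function names are made up for the example.

```python
import math
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Lowercase the text and count its words (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = word_counts(question)
    return sorted(chunks, key=lambda c: cosine(q, word_counts(c)), reverse=True)[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Place the retrieved excerpts ahead of the question before calling the LLM."""
    context = "\n".join(context_chunks)
    return f"Context: {context}\nQuestion: {question}\nAnswer:"


# Hypothetical document excerpts
chunks = [
    "The warranty covers battery defects for two years.",
    "Shipping takes five business days within Europe.",
    "Returns are accepted within thirty days of delivery.",
]
question = "How long does the warranty cover the battery?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

The resulting prompt is what gets sent to the model. Since the answer can be read directly from the retrieved excerpt, the model has less room to invent one.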

⚡ DeepSeek R1: A Powerful Open-Source AI Model

DeepSeek R1 is an open-source AI model optimized for logical reasoning and problem-solving.

  • Why use it? Its smaller variants, such as deepseek-r1:1.5b, run on ordinary local machines with Ollama, and its reasoning ability suits RAG tasks where answers must be grounded in retrieved text.

🚀 How Do These Technologies Work Together?

  • Ollama runs DeepSeek R1 locally.
  • LangChain connects the AI model to external data.
  • RAG retrieves relevant information for accurate responses.
  • DeepSeek R1 generates high-quality, context-aware answers.

📈 Use Case Example: AI-Powered PDF Q&A System

This system allows users to upload a PDF and ask questions about its content. The AI, powered by DeepSeek R1, retrieves relevant sections and generates precise answers.

🎯 Why Run DeepSeek R1 Locally?

| Feature       | Cloud-Based Models            | Local DeepSeek R1           |
|---------------|-------------------------------|-----------------------------|
| Privacy       | Data sent to external servers | 100% Local & Secure         |
| Speed         | API latency & network delays  | Instant inference           |
| Cost          | Pay per API request           | Free after setup            |
| Customization | Limited fine-tuning           | Full model control          |
| Deployment    | Cloud-dependent               | Works offline & on-premises |

🛠️ Step 2: Running DeepSeek R1

ollama pull deepseek-r1:1.5b
ollama run deepseek-r1:1.5b
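To confirm from code that the pull succeeded, you can ask the local Ollama server which models it has stored. The sketch below assumes the default port 11434 and the /api/tags endpoint; the helper names and the sample payload are illustrative only.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
TAGS_URL = "http://localhost:11434/api/tags"


def installed_models(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON returned by /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def is_model_installed(name: str, tags_payload: dict) -> bool:
    """Check whether a model tag such as 'deepseek-r1:1.5b' is present."""
    return name in installed_models(tags_payload)


def fetch_tags(url: str = TAGS_URL) -> dict:
    """Query the local server for its model list (needs Ollama running)."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())


# Illustrative payload shaped like an /api/tags response
sample = {"models": [{"name": "deepseek-r1:1.5b"}, {"name": "llama3:latest"}]}
found = is_model_installed("deepseek-r1:1.5b", sample)
```

With the server running, is_model_installed("deepseek-r1:1.5b", fetch_tags()) performs the same check against your actual installation.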

🛠️ Step 3: Setting Up a RAG System with Streamlit in a Virtual Environment

pip install -U langchain langchain-community langchain-experimental streamlit pdfplumber semantic-chunkers open-text-embeddings ollama prompt-template sentence-transformers faiss-cpu

🛠️ Step 4: Creating and Running the App

mkdir rag-system && cd rag-system

Create a Python script app.py and insert the following code:

import streamlit as st
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_experimental.text_splitter import SemanticChunker
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate
from langchain.chains.llm import LLMChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains import RetrievalQA

st.title("📄 RAG System with DeepSeek R1 & Ollama")

uploaded_file = st.file_uploader("Upload your PDF", type="pdf")

if uploaded_file:
    # Save the upload to disk so PDFPlumberLoader can read it by path
    with open("temp.pdf", "wb") as f:
        f.write(uploaded_file.getvalue())

    loader = PDFPlumberLoader("temp.pdf")
    docs = loader.load()

    # One embedding model serves both semantic chunking and indexing
    embedder = HuggingFaceEmbeddings()
    text_splitter = SemanticChunker(embedder)
    documents = text_splitter.split_documents(docs)

    # Index the chunks and return the 3 most similar ones per question
    vector = FAISS.from_documents(documents, embedder)
    retriever = vector.as_retriever(search_type="similarity", search_kwargs={"k": 3})

    llm = Ollama(model="deepseek-r1:1.5b")
    QA_PROMPT = PromptTemplate.from_template("Context: {context}\nQuestion: {question}\nAnswer:")

    # Stuff the retrieved chunks into the {context} slot of the prompt
    llm_chain = LLMChain(llm=llm, prompt=QA_PROMPT)
    combine_documents_chain = StuffDocumentsChain(llm_chain=llm_chain, document_variable_name="context")
    qa = RetrievalQA(combine_documents_chain=combine_documents_chain, retriever=retriever)

    user_input = st.text_input("Ask a question:")
    if user_input:
        response = qa.invoke({"query": user_input})["result"]
        st.write("**Response:**", response)

Run the app:

streamlit run app.py

Streamlit prints a Local URL in the terminal (http://localhost:8501 by default). Open it in your browser to use the app.
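One practical detail when displaying answers: DeepSeek R1 models typically write their reasoning inside <think>...</think> tags before the final answer, so the raw response may contain both. The sketch below shows one possible way to separate the two before showing the result; the helper name split_reasoning and the sample output are illustrative, and whether the tags appear may depend on your Ollama version and settings.

```python
import re

# Matches each reasoning block, including newlines inside it
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", flags=re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) from raw model output containing <think> tags."""
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(raw))
    answer = THINK_BLOCK.sub("", raw).strip()
    return reasoning, answer


# Illustrative raw output
raw_output = "<think>\nThe context mentions two years.\n</think>\n\nThe warranty lasts two years."
reasoning, answer = split_reasoning(raw_output)
```

In app.py, the answer part could be passed to st.write in place of the raw response, with the reasoning shown separately or hidden.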

👌 Final Thoughts

Congratulations! You have successfully set up a local RAG system with DeepSeek R1 and Ollama. Enjoy building AI-powered applications with privacy, speed, and full control!

The full code of this blog can be found here.

Have Queries? Join https://launchpass.com/collabnix
