Ollama, a powerful framework for running and managing large language models (LLMs) locally, is now available as a native Windows application. This means you no longer need to rely on Windows Subsystem for Linux (WSL) to run Ollama. Whether you’re a software developer, AI engineer, or DevOps professional, this guide will walk you through setting up Ollama on Windows, optimizing it for your workflow, and leveraging its capabilities for AI-driven applications.
Why Ollama on Windows?
Ollama simplifies the process of running LLMs locally, making it an excellent choice for developers and engineers who need to work with AI models without relying on cloud-based solutions. With native Windows support, Ollama now offers:
- Native Performance: No more WSL overhead—Ollama runs directly on Windows.
- GPU Support: Full support for NVIDIA and AMD Radeon GPUs, enabling faster model inference.
- Ease of Use: The ollama command-line interface (CLI) is available in cmd, powershell, or any terminal application.
- Local API: The Ollama API is served on http://localhost:11434, making it easy to integrate into your applications.
System Requirements
Before installing Ollama, ensure your system meets the following requirements:
- Operating System: Windows 10 22H2 or newer (Home or Pro).
- GPU Drivers:
  - NVIDIA: 452.39 or newer.
  - AMD Radeon: Latest drivers from AMD’s support page.
- Storage: At least 4GB of space for the binary install, plus additional space for model files (which can range from tens to hundreds of GB).
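If you have an NVIDIA GPU, a quick way to confirm your driver meets this requirement is the nvidia-smi tool that ships with the driver. Run it from any terminal:
nvidia-smi
The driver version is printed in the header of the output; it should be 452.39 or newer for Ollama to use GPU acceleration.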
Installing Ollama on Windows
Ollama is designed to be easy to install and use. Follow these steps to get started:
Step 1: Download and Install Ollama

Download the Ollama installer from the official website and run it. The installer does not require administrator privileges and will install Ollama in your home directory by default.
OllamaSetup.exe
If you want to install Ollama in a custom directory, use the following command:
OllamaSetup.exe /DIR="d:\some\location"
Step 2: Set Up Model Storage
By default, Ollama stores models in your home directory. If you want to change this location, set the OLLAMA_MODELS environment variable:
- Open Settings (Windows 11) or Control Panel (Windows 10) and search for environment variables.
- Click on Edit environment variables for your account.
- Create or edit the OLLAMA_MODELS variable and set it to your desired directory.
- Restart Ollama or open a new terminal for the changes to take effect.
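If you prefer the command line, the same user-level environment variable can be set from PowerShell. This is a minimal sketch; D:\ollama\models is just an example path, so substitute your own directory:
# Set OLLAMA_MODELS for the current user (example path).
[Environment]::SetEnvironmentVariable("OLLAMA_MODELS", "D:\ollama\models", "User")
As with the Settings approach, restart Ollama or open a new terminal so the new value takes effect.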
Step 3: Verify Installation
After installation, open a terminal and run the following command to verify that Ollama is installed correctly:
ollama --version
You should see the installed version of Ollama displayed.
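Because the installer also starts Ollama’s background server, you can additionally confirm the API endpoint is reachable. A minimal check from PowerShell, assuming the default port 11434:
# Should return the text "Ollama is running".
(Invoke-WebRequest -Uri http://localhost:11434).Content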
Running Ollama on Windows
Once installed, Ollama runs in the background, and you can interact with it using the CLI or API. Here’s how to get started:
Step 1: Pull a Model
To use Ollama, you first need to download a model. For example, to pull the llama3.2 model, run:
ollama pull llama3.2
This will download the model and make it available for use.
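You can list the models you have downloaded at any time with ollama list, which is a convenient way to confirm the pull succeeded and to see how much disk space each model uses:
ollama list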
Step 2: Run the Model
You can now run the model and interact with it via the CLI or API. For example, to generate text using the llama3.2 model, run:
ollama run llama3.2
This will start an interactive session where you can input prompts and receive responses from the model.
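For scripting or quick one-off questions, ollama run also accepts a prompt directly on the command line and prints the response without opening an interactive session. For example:
ollama run llama3.2 "Why is the sky blue?"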
Step 3: Use the Ollama API
Ollama provides a REST API that you can use to integrate it into your applications. The API is available at http://localhost:11434. Here’s an example of using the API in PowerShell:
(Invoke-WebRequest -Method POST -Body '{"model":"llama3.2", "prompt":"Why is the sky blue?", "stream": false}' -Uri http://localhost:11434/api/generate).Content | ConvertFrom-Json
This will send a prompt to the model and return the generated response.
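For multi-turn conversations, the API also exposes a chat endpoint at /api/chat that takes a list of messages instead of a single prompt. The sketch below uses Invoke-RestMethod, which parses the JSON response for you; the message content here is purely illustrative:
# Build a chat request with a single user message.
$body = @{
    model    = "llama3.2"
    messages = @(@{ role = "user"; content = "Why is the sky blue?" })
    stream   = $false
} | ConvertTo-Json -Depth 5
# The assistant's reply is returned in the "message" field of the response.
(Invoke-RestMethod -Method POST -Uri http://localhost:11434/api/chat -Body $body -ContentType "application/json").message.content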
Optimizing Ollama for Windows
To get the most out of Ollama on Windows, consider the following optimizations:
1. GPU Acceleration
Ollama supports both NVIDIA and AMD GPUs. Ensure your GPU drivers are up to date to take full advantage of hardware acceleration. This will significantly improve the performance of model inference.
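Once a model is loaded, you can check whether it is actually running on the GPU with the ollama ps command, which lists loaded models along with the processor (GPU or CPU) each one is using:
ollama ps
If a model unexpectedly shows as running on the CPU, check your driver version and the logs described in the Troubleshooting section below.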
2. Model Quantization
Quantization reduces the size of a model, trading a small amount of accuracy for lower memory use and faster inference. Models in the Ollama library are typically published at several quantization levels, such as q4_K_M and q5_K_M, and you can pull a specific quantized variant by including it in the model tag (check the model’s page in the library for the exact tags available):
ollama pull llama3.2:q4_K_M
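You can inspect which quantization a downloaded model actually uses with ollama show, which prints the model’s metadata, including its quantization level:
ollama show llama3.2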
3. Custom Model Storage
If your home directory is on a small drive, consider changing the OLLAMA_MODELS environment variable to point to a directory on a larger drive. This will prevent storage issues when working with large models.
Troubleshooting
If you encounter issues, Ollama stores logs and other files in the following locations:
- Logs: %LOCALAPPDATA%\Ollama
- Binaries: %LOCALAPPDATA%\Programs\Ollama
- Models: %HOMEPATH%\.ollama
- Temporary Files: %TEMP%
Check the logs for detailed error messages if something goes wrong.
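From PowerShell, a quick way to review recent server activity is to tail the log file. This is a sketch assuming the default log location and the server.log file name used by recent releases:
# Show the last 50 lines of the Ollama server log.
Get-Content "$env:LOCALAPPDATA\Ollama\server.log" -Tail 50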
Uninstalling Ollama
To uninstall Ollama, go to Add or remove programs in Windows Settings and select Ollama. Note that if you changed the OLLAMA_MODELS location, the installer will not remove your downloaded models.
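If you want to reclaim that disk space after uninstalling, you can delete the model directory yourself. This is a hedged sketch assuming models were left in the default location; double-check the path before running it, since the deletion is not reversible:
# Remove leftover models and configuration from the default location.
Remove-Item -Recurse -Force "$env:USERPROFILE\.ollama"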
Conclusion
Ollama on Windows provides a seamless experience for running and managing large language models locally. With native support for NVIDIA and AMD GPUs, easy installation, and a powerful API, Ollama is an excellent tool for developers, AI engineers, and DevOps professionals. Whether you’re building AI applications or experimenting with LLMs, Ollama makes it easy to get started.
For more information, check out the Ollama documentation and start exploring the possibilities today!