Ollama, the versatile platform for running large language models (LLMs) locally, is now available on Windows in preview as of February 15, 2024. This update empowers Windows users to pull, run, and create LLMs with a seamless native experience. Packed with features like GPU acceleration, access to an extensive model library, and OpenAI-compatible APIs, Ollama on Windows is designed to deliver a robust and efficient AI development environment.
What’s New in Ollama on Windows?
1. Native Windows Experience
The Windows preview brings Ollama’s capabilities to a new audience, offering:
- GPU Acceleration: Built-in support for NVIDIA GPUs and modern CPU instruction sets like AVX and AVX2 ensures faster model performance. No configuration or virtualization is required!
- Full Model Library Access: From language models like Llama 2 to vision models like LLaVA 1.6, the entire Ollama library is now accessible on Windows. Vision models even allow drag-and-drop image inputs in the terminal during runtime.
- Always-On API: Ollama’s API runs automatically in the background on http://localhost:11434, allowing tools and applications to connect seamlessly (a quick check is sketched after this list).
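As a quick check (a minimal sketch, assuming the default port and that at least one model has been pulled), you can list your local models from PowerShell via the /api/tags endpoint:
# List the models available locally on the default Ollama port.
Invoke-RestMethod -Uri http://localhost:11434/api/tags |
    Select-Object -ExpandProperty models |
    Select-Object name, size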
2. OpenAI Compatibility
Ollama on Windows supports the same OpenAI-compatible API as its macOS counterpart. This means you can integrate Ollama with existing OpenAI-compatible tooling and workflows for local model execution.
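For instance, here is a minimal PowerShell sketch against the OpenAI-compatible chat completions endpoint (it assumes the llama2 model has already been pulled):
# Build an OpenAI-style chat request and send it to the local Ollama server.
$body = @{
    model    = "llama2"
    messages = @(@{ role = "user"; content = "Why is the sky blue?" })
} | ConvertTo-Json -Depth 4
$response = Invoke-RestMethod -Method POST -Uri http://localhost:11434/v1/chat/completions `
    -ContentType "application/json" -Body $body
# The reply text lives in the first choice, as in the OpenAI API schema.
$response.choices[0].message.content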
Step-by-Step Guide to Running Ollama on Windows
1. Get Started
- Download Ollama on Windows: Visit Ollama’s website and download the Windows preview installer.
- Install Ollama: Double-click OllamaSetup.exe and follow the installation prompts.
- Verify Installation: Open a terminal (Command Prompt, PowerShell, or your preferred CLI) and type:
ollama --version
If successful, you’ll see the installed version number.
2. Run Your First Model
- Open your terminal and run:
ollama run llama2
This command pulls the Llama 2 model (if it isn’t already downloaded) and starts an interactive session.
- For vision models like LLaVA 1.6, drag and drop an image into the terminal window during runtime; an example of an image prompt follows this list.
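To illustrate the vision workflow (a sketch; the image path here is hypothetical, and dragging a file into the terminal inserts its path for you):
# Start the LLaVA vision model (pulled automatically on first run).
ollama run llava
# At the interactive prompt, reference an image by its path, e.g.:
# >>> What is in this image? C:\Users\you\Pictures\example.png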
3. Using the Ollama API
The always-on API lets you integrate Ollama into your projects with ease. It serves requests on http://localhost:11434.
For example, use PowerShell to query the API:
(Invoke-WebRequest -Method POST -Body '{"model":"llama2", "prompt":"Why is the sky blue?", "stream": false}' -Uri http://localhost:11434/api/generate).Content | ConvertFrom-Json
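The JSON reply carries the generated text in its response field; a small follow-up (same request as above) extracts just that field:
# Capture the parsed JSON and print only the generated text.
$reply = (Invoke-WebRequest -Method POST `
    -Body '{"model":"llama2", "prompt":"Why is the sky blue?", "stream": false}' `
    -Uri http://localhost:11434/api/generate).Content | ConvertFrom-Json
$reply.response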
Features Overview
Hardware Acceleration
Ollama takes full advantage of NVIDIA GPUs and modern CPU instruction sets like AVX/AVX2 for faster model execution. This feature eliminates the need for additional configuration or virtualization, making the setup effortless.
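If you want to confirm that Windows can see your NVIDIA GPU before running a model (an optional check, assuming the NVIDIA driver is installed, which ships with the nvidia-smi utility):
# Show driver version, GPU name, and current memory usage.
nvidia-smi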
Extensive Model Library
Whether you need language models or vision models, the full Ollama library is available on Windows. Vision models enable dynamic workflows by supporting image-based prompts.
Background API Service
The API running on http://localhost:11434 is always available for applications and tools to connect without extra setup.
Updating and Feedback
Ollama on Windows will notify you about updates as they become available. The team actively welcomes feedback through their GitHub issues page or on their Discord server.
Conclusion
The arrival of Ollama on Windows opens up a world of possibilities for developers, researchers, and businesses. Whether you’re exploring local AI models for enhanced privacy or integrating them into larger workflows, Ollama’s preview release makes it simple and powerful.
Start building with Ollama today and redefine your local AI experience!