Ollama Provider
Run open-source models locally with Ollama for privacy and cost savings.
Prerequisites
- Install Ollama: ollama.ai
- Pull a model: `ollama pull llama3.2`
- Verify: `ollama list`
Configuration
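Ollama serves its HTTP API on http://localhost:11434 by default, and the OLLAMA_HOST environment variable overrides the host and port. A minimal sketch of resolving the base URL (the helper name is illustrative, not part of any API):

```python
import os

# Ollama listens on http://localhost:11434 by default;
# the OLLAMA_HOST environment variable overrides host and port.
def ollama_base_url() -> str:
    return os.environ.get("OLLAMA_HOST", "http://localhost:11434").rstrip("/")

# Ollama also exposes an OpenAI-compatible API under /v1, so any
# OpenAI-style client can be pointed at this URL:
openai_compatible = ollama_base_url() + "/v1"
print(openai_compatible)
```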
Available Models
Pull models:
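For example (model names beyond `llama3.2` are illustrative; any model from the Ollama library works):

```shell
# Download model weights from the Ollama library
ollama pull llama3.2
ollama pull mistral
ollama pull qwen2.5-coder

# Confirm what is installed locally
ollama list
```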
CLI Usage
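Typical commands once a model is pulled (the prompt text is illustrative):

```shell
# One-shot generation from the command line
ollama run llama3.2 "Summarize the benefits of local inference."

# Interactive chat session (/bye or Ctrl+D to exit)
ollama run llama3.2

# Show loaded models and whether they run on GPU or CPU
ollama ps

# Remove a model you no longer need
ollama rm llama3.2
```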
Configuration Options
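Per-request options such as the context length travel in the `options` map of Ollama's `/api/generate` (and `/api/chat`) request body. A sketch of such a payload (the prompt and values are illustrative):

```python
import json

# Request body for Ollama's /api/generate endpoint; the "options" map
# carries per-request overrides such as context length (num_ctx).
payload = {
    "model": "llama3.2",
    "prompt": "Explain GPU offloading in one sentence.",  # illustrative
    "stream": False,
    "options": {
        "num_ctx": 4096,     # context window size in tokens
        "temperature": 0.7,  # sampling temperature
    },
}
print(json.dumps(payload, indent=2))
```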
GPU Acceleration
Ollama automatically uses GPU when available:
- macOS: Metal (Apple Silicon)
- Linux: CUDA (NVIDIA) or ROCm (AMD)
- Windows: CUDA (NVIDIA) or ROCm (AMD)
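To confirm acceleration is actually in use, `ollama ps` reports the processor for each loaded model:

```shell
# The PROCESSOR column reads e.g. "100% GPU" when acceleration is active
ollama ps
```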
Memory Requirements
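Ollama's README recommends at least 8 GB of RAM for 7B models, 16 GB for 13B, and 32 GB for 33B. That rule of thumb as a small helper (the function name and the 64 GB figure for larger models are my assumptions, not official guidance):

```python
# Rule of thumb from Ollama's README: at least 8 GB of RAM for
# 7B models, 16 GB for 13B, and 32 GB for 33B.
def min_ram_gb(params_billion: float) -> int:
    if params_billion <= 7:
        return 8
    if params_billion <= 13:
        return 16
    if params_billion <= 33:
        return 32
    return 64  # assumption for larger models (e.g. 70B), not official guidance

print(min_ram_gb(7))   # 8
print(min_ram_gb(70))  # 64
```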
Troubleshooting
Connection refused
- Ensure Ollama is running: `ollama serve`
- Check the base URL
- Verify the port (default: 11434)
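The first check can be scripted against Ollama's `/api/version` endpoint; a minimal standard-library sketch (the function name is illustrative):

```python
import json
import urllib.error
import urllib.request

def ollama_reachable(base_url: str = "http://localhost:11434",
                     timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers /api/version at base_url."""
    try:
        with urllib.request.urlopen(base_url + "/api/version",
                                    timeout=timeout) as resp:
            return "version" in json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return False
```

If this returns False, start the server with `ollama serve` or fix the base URL and port.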
Out of memory
- Use a smaller model
- Reduce `num_ctx`
- Close other applications
Slow responses
- Enable GPU acceleration
- Use a smaller model
- Reduce context length