Groq Provider

Groq provides fast inference on custom LPU (Language Processing Unit) hardware, delivering low-latency responses.

Authentication

Using API Key

[auth]
provider = "groq"
api_key = "${GROQ_API_KEY}"

Or set the environment variable:

export GROQ_API_KEY=gsk_...

Get your API key at console.groq.com.
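Under the hood, requests to Groq are authenticated with a bearer token. As a minimal sketch (assuming Groq's OpenAI-compatible REST API; verify the base URL and header format against the current Groq docs), building the request headers from the environment variable looks like:

```python
import os

# Base URL for Groq's OpenAI-compatible API (assumption -- check Groq's docs).
GROQ_BASE_URL = "https://api.groq.com/openai/v1"

def groq_headers() -> dict:
    """Build authorization headers from the GROQ_API_KEY environment variable."""
    api_key = os.environ.get("GROQ_API_KEY")
    if not api_key:
        raise RuntimeError("GROQ_API_KEY is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

If the variable is unset, failing early with a clear error is friendlier than letting the API return a 401.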

Available Models

Model           Description       Best For
llama-3.3-70b   Llama 3.3 70B     General tasks
llama-3.1-8b    Llama 3.1 8B      Fast, simple tasks
mixtral-8x7b    Mixtral MoE       Balanced quality
gemma2-9b       Google Gemma 2    Quick tasks

Configuration

[model]
provider_id = "groq"
model = "llama-3.3-70b"

[model.groq]
temperature = 0.7
max_tokens = 4096

CLI Usage

# Use Llama 3.3 70B on Groq
savfox -m groq:llama-3.3-70b exec "Refactor this code"

# Fast small model
savfox -m groq:llama-3.1-8b exec "Quick explanation"

Rate Limits

Groq has rate limits based on your plan:

  • Free tier: Limited requests per minute and tokens per day
  • Paid plans: Higher limits available

Savfox handles rate limiting automatically with retries.
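A common way to handle rate limiting is exponential backoff with jitter. This is an illustrative sketch of the general pattern, not savfox's actual implementation (`call_with_retries` and the error type are assumptions):

```python
import random
import time

def call_with_retries(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry fn() with exponential backoff plus jitter, as a client might do
    when the API returns a rate-limit error (HTTP 429). Illustrative only."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RuntimeError:  # stand-in for a rate-limit error type
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

Doubling the delay each attempt (1s, 2s, 4s, ...) spreads retries out, and the jitter prevents many clients from retrying in lockstep.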

Troubleshooting

Authentication errors

  1. Verify your API key starts with gsk_
  2. Check if the key is active
  3. Ensure you haven't exceeded your quota
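Step 1 above can be automated with a quick format check before making any request (a minimal sketch; the helper name is hypothetical):

```python
def looks_like_groq_key(key: str) -> bool:
    """Sanity-check the key format locally: Groq keys start with 'gsk_'.
    This only catches obvious mistakes; it does not verify the key is active."""
    return key.startswith("gsk_") and len(key) > len("gsk_")
```

A key that fails this check was likely copied from the wrong provider (e.g. an OpenAI `sk-...` key) or truncated during paste.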

Model not available

  1. Check the Groq docs for current model list
  2. Some models may be deprecated or renamed
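Because models get deprecated or renamed, a client can validate the requested ID against the provider's current model list and suggest close matches. A sketch under the assumption that the list of available IDs has already been fetched (the helper and its matching heuristic are illustrative, not savfox's behavior):

```python
def resolve_model(requested: str, available: list[str]) -> str:
    """Return the requested model ID if available; otherwise raise an error
    that hints at similarly named models (e.g. renamed or versioned variants)."""
    if requested in available:
        return requested
    # Crude similarity heuristic: share the model-family prefix before the first dash.
    family = requested.split("-")[0]
    near = [m for m in available if family in m]
    hint = f" Did you mean one of {near}?" if near else ""
    raise ValueError(f"Model '{requested}' is not available.{hint}")
```

For example, if the provider has renamed `llama-3.3-70b` to a versioned variant, the error message surfaces the new name instead of failing silently.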