Jarvis now uses langchaingo for unified access to multiple LLM providers. This allows you to use OpenAI, Anthropic Claude, Google Gemini, or Ollama (local) models interchangeably.
| Provider | Models | API Key Required | Local/Cloud |
|---|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo | Yes | Cloud |
| Anthropic | claude-3-5-sonnet, claude-3-opus, claude-3-sonnet, claude-3-haiku | Yes | Cloud |
| Google | gemini-1.5-pro, gemini-1.5-flash, gemini-pro | Yes | Cloud |
| Ollama | llama3.2, codellama, mistral, phi, etc. | No | Local |
Jarvis automatically detects which provider to use based on available API keys. Priority order:
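That detection order can be sketched as a small shell function (illustrative only; Jarvis implements the equivalent check internally):

```bash
# Mirror of the auto-detection priority: OpenAI first, then Anthropic,
# then Google, falling back to local Ollama when no API key is set.
detect_provider() {
  if [ -n "$OPENAI_API_KEY" ]; then
    echo "openai"
  elif [ -n "$ANTHROPIC_API_KEY" ]; then
    echo "anthropic"
  elif [ -n "$GOOGLE_API_KEY" ]; then
    echo "google"
  else
    echo "ollama"
  fi
}
```

`LLM_PROVIDER` (described below) overrides this detection entirely.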
```bash
# Option 1: OpenAI (recommended for production)
export OPENAI_API_KEY="sk-..."
export LLM_MODEL="gpt-4o-mini"  # Optional, defaults to gpt-4o-mini

# Option 2: Anthropic Claude (recommended for complex reasoning)
export ANTHROPIC_API_KEY="sk-ant-..."
export LLM_MODEL="claude-3-5-sonnet-20241022"  # Optional

# Option 3: Google Gemini (recommended for cost-effectiveness)
export GOOGLE_API_KEY="AIza..."
export LLM_MODEL="gemini-1.5-flash"  # Optional

# Option 4: Ollama (recommended for privacy/offline use)
export OLLAMA_HOST="http://localhost:11434"  # Optional, default
export LLM_MODEL="llama3.2"  # Optional
```

```bash
# Force a specific provider
export LLM_PROVIDER="openai"  # or "anthropic", "google", "ollama"
export LLM_MODEL="gpt-4o"
export LLM_API_KEY="sk-..."  # Only for cloud providers
```

```bash
# Fine-tune generation parameters
export LLM_TEMPERATURE="0.7"  # Creativity (0.0 - 1.0)
export LLM_MAX_TOKENS="2048"  # Maximum response length
export LLM_TOP_P="0.9"        # Nucleus sampling
export LLM_TOP_K="40"         # Top-K sampling

# Custom endpoints
export LLM_BASE_URL="https://api.openai.com/v1"  # For OpenAI-compatible APIs
```

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2

# Use Jarvis (automatically detects Ollama)
jarvis gen generate-testdata --spec api.yaml

# Or explicitly set Ollama
export LLM_PROVIDER=ollama
jarvis analyze analyze-failures
```

```bash
# Set API key
export OPENAI_API_KEY="sk-proj-..."

# Use default model (gpt-4o-mini)
jarvis gen generate-testdata --spec api.yaml

# Or specify a different model
export LLM_MODEL="gpt-4o"
jarvis analyze analyze-failures
```

```bash
# Set API key
export ANTHROPIC_API_KEY="sk-ant-..."

# Use default model (claude-3-5-sonnet)
jarvis gen generate-testdata --spec api.yaml

# Or specify a different model
export LLM_MODEL="claude-3-opus-20240229"
jarvis analyze analyze-failures
```

```bash
# Set API key (get from https://aistudio.google.com/app/apikey)
export GOOGLE_API_KEY="AIza..."

# Use default model (gemini-1.5-flash)
jarvis gen generate-testdata --spec api.yaml

# Or specify pro model
export LLM_MODEL="gemini-1.5-pro"
jarvis analyze analyze-failures
```

All providers support test data generation. Recommended models:
- **Best Quality:** `claude-3-5-sonnet-20241022` (Anthropic)
- **Best Speed:** `gemini-1.5-flash` (Google)
- **Best Cost:** `gpt-4o-mini` (OpenAI)
- **Offline/Privacy:** `llama3.2` (Ollama)
```bash
# High-quality test data with Claude
export ANTHROPIC_API_KEY="sk-ant-..."
jarvis gen generate-testdata --spec api.yaml --count 10

# Fast generation with Gemini
export GOOGLE_API_KEY="AIza..."
export LLM_MODEL="gemini-1.5-flash"
jarvis gen generate-testdata --spec api.yaml --count 10
```

All providers support failure analysis. Recommended models:
- **Best Reasoning:** `claude-3-5-sonnet-20241022` (Anthropic)
- **Best Code:** `gpt-4o` (OpenAI)
- **Best Cost:** `gpt-4o-mini` (OpenAI)
- **Offline:** `llama3.2` (Ollama)
```bash
# Deep analysis with Claude
export ANTHROPIC_API_KEY="sk-ant-..."
jarvis analyze analyze-failures --limit 5

# Code-focused analysis with GPT-4
export OPENAI_API_KEY="sk-proj-..."
export LLM_MODEL="gpt-4o"
jarvis analyze analyze-failures
```

| Provider | Model | Cost per 1M Input Tokens | Cost per 1M Output Tokens |
|---|---|---|---|
| OpenAI | gpt-4o-mini | $0.15 | $0.60 |
| OpenAI | gpt-4o | $2.50 | $10.00 |
| Anthropic | claude-3-haiku | $0.25 | $1.25 |
| Anthropic | claude-3-5-sonnet | $3.00 | $15.00 |
| Google | gemini-1.5-flash | $0.075 | $0.30 |
| Google | gemini-1.5-pro | $1.25 | $5.00 |
| Ollama | Any model | FREE | FREE |
Prices as of January 2025. Check provider websites for current pricing.
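As a quick sanity check on these numbers, a run's cost can be estimated from its token counts. The rates below are hardcoded from the gpt-4o-mini row of the table; the token counts are made-up example values:

```bash
# Estimate the cost of one run on gpt-4o-mini:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens.
input_tokens=500000
output_tokens=100000
awk -v in_t="$input_tokens" -v out_t="$output_tokens" \
  'BEGIN { printf "$%.3f\n", in_t / 1e6 * 0.15 + out_t / 1e6 * 0.60 }'
# 500K input + 100K output tokens -> $0.135
```

Swap in another row's rates to compare providers for the same workload.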
- **Development/Testing:** Ollama (`llama3.2`)
  - Free, fast, runs locally
  - Good enough for most scenarios
- **Production/CI:** Google Gemini (`gemini-1.5-flash`)
  - Very low cost ($0.075/1M tokens)
  - Fast response times
  - Good quality
- **High Quality Needs:** Anthropic (`claude-3-5-sonnet`)
  - Best understanding of complex schemas
  - Most accurate edge cases
  - Higher cost justified for critical tests
- **Quick Debugging:** OpenAI (`gpt-4o-mini`)
  - Good balance of speed and quality
  - Affordable for frequent use
- **Complex Issues:** Anthropic (`claude-3-5-sonnet`)
  - Best reasoning capabilities
  - Most detailed root cause analysis
- **Code-Heavy Errors:** OpenAI (`gpt-4o`)
  - Excellent code understanding
  - Great for stack traces
**OpenAI**

- Requires API key from https://platform.openai.com/api-keys
- Best general-purpose choice
- Fast and reliable
- Good code understanding

**Anthropic**

- Requires API key from https://console.anthropic.com
- Excellent reasoning and analysis
- Longer context windows (200K+ tokens)
- Best for complex scenarios

**Google Gemini**

- Requires API key from https://aistudio.google.com/app/apikey
- Most cost-effective cloud option
- Fast inference
- Good multimodal capabilities

**Ollama**

- No API key required
- Runs completely locally
- Privacy-preserving
- Requires local resources (GPU recommended)
- Model quality varies
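Before relying on Ollama, it helps to verify that the local server is actually reachable. A minimal probe against its HTTP API (`/api/tags`, the endpoint that lists pulled models):

```bash
# Probe the Ollama HTTP API; prints "up" if the server responds, "down" otherwise.
ollama_status() {
  if curl -fsS --max-time 2 "${OLLAMA_HOST:-http://localhost:11434}/api/tags" > /dev/null 2>&1; then
    echo "up"
  else
    echo "down"
  fi
}
ollama_status
```

Running this before `export LLM_PROVIDER=ollama` in CI lets a job fail fast when no local model server is available.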
```bash
# Check environment variables
env | grep -E '(OPENAI|ANTHROPIC|GOOGLE|OLLAMA|LLM)'

# Force provider selection
export LLM_PROVIDER=ollama
```

```bash
# Verify API key is valid
echo $OPENAI_API_KEY
echo $ANTHROPIC_API_KEY
echo $GOOGLE_API_KEY

# Test API key
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/models
```

```bash
# List available models for provider
# OpenAI
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"

# Ollama
ollama list
```

If you hit rate limits:

```bash
# Use a different provider
export LLM_PROVIDER=google

# Or reduce concurrent requests
jarvis gen generate-testdata --spec api.yaml --count 3  # Instead of 10
```

Previous Jarvis versions only supported Ollama. To migrate:
If you were using Ollama, everything continues to work:

```bash
# This still works exactly the same
ollama pull llama3.2
jarvis gen generate-testdata --spec api.yaml
```

```bash
# Just add an API key
export OPENAI_API_KEY="sk-..."

# Jarvis automatically switches
jarvis gen generate-testdata --spec api.yaml
```

```bash
# To keep using Ollama even when API keys are present
export LLM_PROVIDER=ollama
export LLM_MODEL=llama3.2
```

- Development: Use Ollama for fast iteration
- CI/CD: Use Gemini Flash for cost-effectiveness
- Production: Use Claude or GPT-4 for quality
- Testing: Mix providers to ensure robustness
```bash
# Generate test data with Ollama (free, fast)
export LLM_PROVIDER=ollama
jarvis gen generate-testdata --spec api.yaml --count 20

# Analyze failures with Claude (best quality)
export LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
jarvis analyze analyze-failures --limit 5

# Generate scenarios with Gemini (cost-effective)
export LLM_PROVIDER=google
export GOOGLE_API_KEY="AIza..."
jarvis gen generate-scenarios --path specs/
```

```bash
#!/bin/bash
# Try providers in order of preference
if [ -n "$ANTHROPIC_API_KEY" ]; then
  export LLM_PROVIDER=anthropic
elif [ -n "$OPENAI_API_KEY" ]; then
  export LLM_PROVIDER=openai
elif [ -n "$GOOGLE_API_KEY" ]; then
  export LLM_PROVIDER=google
else
  export LLM_PROVIDER=ollama
fi

jarvis gen generate-testdata --spec api.yaml
```

```bash
# More deterministic (good for test generation)
export LLM_TEMPERATURE=0.1

# More creative (good for scenario generation)
export LLM_TEMPERATURE=0.9

# Balanced (default)
export LLM_TEMPERATURE=0.7
```

```bash
# Shorter responses (faster, cheaper)
export LLM_MAX_TOKENS=1024

# Longer responses (more detailed)
export LLM_MAX_TOKENS=4096

# Maximum (for complex analysis)
export LLM_MAX_TOKENS=8192
```

Same as Jarvis (MIT)