LLM Provider Overview

EdgeCrab supports 15 LLM providers out of the box (13 cloud, 2 local). More than 200 models are compiled in, with user overrides read from ~/.edgecrab/models.yaml. Auto-detection picks a default provider from your environment variables, and you can switch at any time with --model on the command line or /model inside the TUI.


Provider Quick Reference

| Priority | Provider | Env var | Notable models |
|---|---|---|---|
| 1 | copilot | GITHUB_TOKEN | GPT-4.1-mini, GPT-4.1 (free with GitHub Copilot) |
| 2 | openai | OPENAI_API_KEY | GPT-4.1, GPT-5, o3, o4-mini |
| 3 | anthropic | ANTHROPIC_API_KEY | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 |
| 4 | google | GOOGLE_API_KEY | Gemini 2.5 Pro, Gemini 2.5 Flash |
| 5 | vertexai | GOOGLE_APPLICATION_CREDENTIALS | Gemini via Google Cloud |
| 6 | bedrock | AWS credentials chain | Claude, Nova, and Bedrock-hosted models |
| 7 | xai | XAI_API_KEY | Grok 3, Grok 4 |
| 8 | deepseek | DEEPSEEK_API_KEY | DeepSeek V3, DeepSeek R1 |
| 9 | mistral | MISTRAL_API_KEY | Mistral Large, Mistral Small |
| 10 | groq | GROQ_API_KEY | Llama 3.3 70B, Gemma2 (blazing fast inference) |
| 11 | huggingface | HUGGING_FACE_HUB_TOKEN | Any HF Inference API model |
| 12 | zai | ZAI_API_KEY | Z.AI / GLM series |
| 13 | openrouter | OPENROUTER_API_KEY | 600+ models via one endpoint |
| - | ollama | (none) | Any model; ollama serve on port 11434 |
| - | lmstudio | (none) | Any model; LM Studio on port 1234 |

Auto-detection order: EdgeCrab checks env vars in priority order (1–13). The first matching key sets the default provider. Local providers (ollama, lmstudio) are available regardless.
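
The priority rule above can be illustrated with a short Python sketch. This is not EdgeCrab's actual implementation: the function name detect_default_provider is hypothetical, bedrock is left out because it relies on the AWS credentials chain rather than a single variable, and treating an empty value as "not set" is an assumption.

```python
import os

# Priority order copied from the quick reference table above.
# bedrock is omitted: it uses the AWS credentials chain, not one env var.
PROVIDER_ENV_VARS = [
    ("copilot", "GITHUB_TOKEN"),
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("google", "GOOGLE_API_KEY"),
    ("vertexai", "GOOGLE_APPLICATION_CREDENTIALS"),
    ("xai", "XAI_API_KEY"),
    ("deepseek", "DEEPSEEK_API_KEY"),
    ("mistral", "MISTRAL_API_KEY"),
    ("groq", "GROQ_API_KEY"),
    ("huggingface", "HUGGING_FACE_HUB_TOKEN"),
    ("zai", "ZAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
]


def detect_default_provider(env=None):
    """Return the first provider whose env var is set and non-empty, else None.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    for provider, var in PROVIDER_ENV_VARS:
        if env.get(var):  # assumption: an empty string counts as unset
            return provider
    return None


# Both keys are present; openai wins because it ranks higher than groq.
print(detect_default_provider({"GROQ_API_KEY": "gsk_x", "OPENAI_API_KEY": "sk-x"}))
```

Under this model, having several keys set is harmless: the highest-ranked one becomes the default, and the others are used only when you select them explicitly.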


Add your API keys to your shell profile or to ~/.edgecrab/.env:

# ~/.edgecrab/.env <- edgecrab loads this automatically
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

Note for Gemini: The env var is GOOGLE_API_KEY, not GEMINI_API_KEY.
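
To catch the misnamed Gemini key before launching EdgeCrab, a small check along these lines may help. It is a sketch under assumptions: parse_env_file and gemini_key_warnings are hypothetical helpers, and the parser handles only plain KEY=VALUE lines, since the source does not describe how EdgeCrab itself parses .env (quoting, export prefixes, and so on).

```python
def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        values[key.strip()] = value.strip()
    return values


def gemini_key_warnings(values):
    """Flag GEMINI_API_KEY being set while GOOGLE_API_KEY is missing."""
    if "GEMINI_API_KEY" in values and "GOOGLE_API_KEY" not in values:
        return ["GEMINI_API_KEY is set but GOOGLE_API_KEY is not; "
                "rename it to GOOGLE_API_KEY"]
    return []


sample = "# example .env\nOPENAI_API_KEY=sk-...\nGEMINI_API_KEY=AIza...\n"
for warning in gemini_key_warnings(parse_env_file(sample)):
    print(warning)
```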

Choose a default provider and model, either with the setup wizard (recommended for first run):

edgecrab setup

Or directly in ~/.edgecrab/config.yaml:

provider: openai
model: gpt-4o

Verify the configuration with edgecrab doctor:

edgecrab doctor
# OK OpenAI OPENAI_API_KEY set
# OK Provider ping openai/gpt-4o -> OK (421 ms)

GitHub Copilot

Uses your existing GitHub Copilot subscription, with no additional billing. Requires a valid GITHUB_TOKEN with Copilot access.

GITHUB_TOKEN=ghp_...

Models:

  • gpt-4.1-mini (default, fast)
  • gpt-4o (more capable)
  • claude-sonnet-4-5 (when available in Copilot)

OpenAI

OPENAI_API_KEY=sk-...

Recommended models:

openai/gpt-4o # Best general-purpose
openai/gpt-4.1-mini # Fast, cost-effective
openai/o3 # Advanced reasoning
openai/o4-mini # Fast reasoning

Anthropic

ANTHROPIC_API_KEY=sk-ant-...

Recommended models:

anthropic/claude-opus-4-5 # Most capable
anthropic/claude-sonnet-4-5 # Balanced
anthropic/claude-haiku-3-5 # Fast, lightweight

Google Gemini

GOOGLE_API_KEY=AIza...

Important: The env var is GOOGLE_API_KEY — not GEMINI_API_KEY.

Models:

google/gemini-2.5-flash # Fast, capable
google/gemini-2.5-pro # Long context, advanced reasoning

Vertex AI

Access Gemini models via Google Cloud with enterprise billing and data residency.

GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
# or: use Application Default Credentials (gcloud auth application-default login)

Models: the same as the google provider, routed through the Vertex AI API endpoint.

xAI

XAI_API_KEY=...

Models:

xai/grok-3 # Most capable
xai/grok-3-mini # Fast, cost-effective

DeepSeek

Excellent for code tasks and highly cost-effective.

DEEPSEEK_API_KEY=...

Models:

deepseek/deepseek-chat # V3 — general purpose
deepseek/deepseek-reasoner # R1 — advanced reasoning

Mistral

European-headquartered provider with strong multilingual support and GDPR data residency options.

MISTRAL_API_KEY=...

Models:

mistral/mistral-large-latest # Most capable
mistral/mistral-small-latest # Fast, cost-effective
mistral/codestral-latest # Code-focused

Groq

Ultra-fast inference on Groq's custom LPU chips, with the lowest latency of the cloud providers listed here.

GROQ_API_KEY=...

Models:

groq/llama-3.3-70b-versatile # Best balance of speed + quality
groq/llama-3.1-8b-instant # Extremely fast, lightweight
groq/gemma2-9b-it # Google Gemma2 via Groq

Hugging Face

Access open models via the Hugging Face Inference API.

HUGGING_FACE_HUB_TOKEN=hf_...

edgecrab --model huggingface/meta-llama/Llama-3.3-70B-Instruct "..."

Z.AI

Z.AI provides access to the GLM model series.

ZAI_API_KEY=...

Models:

zai/glm-4.5 # Latest GLM
zai/glm-5 # Most capable GLM

Ollama

Run any model locally. Requires Ollama to be installed and running:

# Start Ollama (keep this running)
ollama serve
# Pull a model
ollama pull llama3.3
ollama pull codestral

No API key needed. EdgeCrab connects to http://localhost:11434 automatically:

edgecrab --model ollama/llama3.3 "explain this code"
edgecrab --model ollama/codestral "write a Rust async function"
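
If a local model does not respond, it can help to first confirm that something is listening on the expected port. The following is a minimal sketch using only Python's standard library. The helper name local_server_reachable is hypothetical. A successful TCP connection shows only that the port is open, not that the listener is actually Ollama.

```python
import socket


def local_server_reachable(host="localhost", port=11434, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


# 11434 is the Ollama port quoted above.
print("ollama port open:", local_server_reachable(port=11434))
```

The same check works for LM Studio by passing port=1234.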

See the Local Models guide for full setup and model recommendations.

LM Studio

Download a model in LM Studio and start its local server (default port 1234):

edgecrab --model lmstudio/your-loaded-model "..."

See the Local Models guide.

OpenRouter

Access 600+ models from a single API endpoint and API key.

OPENROUTER_API_KEY=...

edgecrab --model openrouter/anthropic/claude-opus-4-5 "..."
edgecrab --model openrouter/google/gemini-2.5-flash "..."
edgecrab --model openrouter/meta-llama/llama-3.3-70b-instruct "..."
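
The model identifiers in these examples contain more than one slash, which suggests that only the first segment names the EdgeCrab provider and the rest is passed through as the upstream model name. That reading is inferred from the examples rather than stated in the source, and split_model_id is a hypothetical helper. A sketch of the split:

```python
def split_model_id(model_id):
    """Split 'provider/model' on the first slash only.

    Anything after the first slash is kept intact, so nested names such as
    'anthropic/claude-opus-4-5' survive as the model part.
    """
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model


print(split_model_id("openrouter/anthropic/claude-opus-4-5"))
print(split_model_id("ollama/llama3.3"))
```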

To switch models, pass --model on the command line or use /model inside the TUI:

edgecrab --model anthropic/claude-opus-4-5 "refactor this module"

/model groq/llama-3.3-70b-versatile

A /model switch takes effect immediately. The conversation history carries over, so the new model sees all previous messages.


Configure automatic failover in config.yaml:

provider: openai
model: gpt-4o
fallback_providers:
- anthropic/claude-sonnet-4-5
- ollama/llama3.3

If the primary provider returns an error (rate limit, outage), EdgeCrab retries with the next provider in the chain.
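
The failover behaviour can be pictured as a loop over the chain. The sketch below is illustrative only and is not EdgeCrab's code: ProviderError, complete_with_fallback, and fake_call are hypothetical names, and the real retry policy (which errors trigger failover, whether there is backoff) is not specified in the source.

```python
class ProviderError(Exception):
    """Hypothetical error standing in for a rate limit or outage."""


def complete_with_fallback(prompt, chain, call):
    """Try each 'provider/model' in order and return the first success.

    `call(model_id, prompt)` is supplied by the caller and is expected to
    raise ProviderError when the provider fails.
    """
    errors = []
    for model_id in chain:
        try:
            return model_id, call(model_id, prompt)
        except ProviderError as exc:
            errors.append(f"{model_id}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def fake_call(model_id, prompt):
    """Stand-in for a real API call: the primary is rate limited."""
    if model_id.startswith("openai/"):
        raise ProviderError("429 rate limited")
    return f"[{model_id}] ok"


# Primary first, then the fallback_providers entries from the config above.
chain = ["openai/gpt-4o", "anthropic/claude-sonnet-4-5", "ollama/llama3.3"]
print(complete_with_fallback("hello", chain, fake_call))
```

Putting a local model last in the chain, as the config above does with ollama/llama3.3, means a run can continue even when every cloud provider is unavailable.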


| Task | Recommended |
|---|---|
| Large refactor (100+ files) | anthropic/claude-opus-4-6 |
| Quick one-file fix | groq/llama-3.3-70b-versatile or openai/gpt-4.1-mini |
| Reasoning / complex logic | deepseek/deepseek-reasoner or openai/o3 |
| Offline / air-gapped | ollama/llama3.3 or ollama/codestral |
| Maximum model variety | openrouter/... (600+ models) |
| Budget-conscious | deepseek/deepseek-chat or groq/llama-3.1-8b-instant |
| Lowest latency | groq/llama-3.3-70b-versatile (LPU hardware) |
| European data residency | mistral/mistral-large-latest |
| Code generation | deepseek/deepseek-chat or mistral/codestral-latest |

  • Use /model in the TUI to experiment: type /model groq/llama-3.3-70b-versatile mid-session to switch models without losing conversation history.
  • Groq for speed-sensitive tasks: Groq’s LPU chips deliver 300+ tokens/second — ideal for quick iterations and interactive use where waiting 5 seconds per response breaks flow.
  • OpenRouter for prototyping: a single OPENROUTER_API_KEY unlocks 600+ models. Iterate fast across different providers before committing to one API key.
  • DeepSeek R1 for hard reasoning: deepseek/deepseek-reasoner matches o3-class reasoning at a fraction of the cost. Ideal for algorithm design and complex debugging.
  • Mistral for European compliance: Mistral is headquartered in Europe and offers GDPR data residency options. Use mistral/codestral-latest for code tasks with GDPR requirements.
  • edgecrab doctor shows which providers are configured and their latency. Run it after adding a new key to verify the key works.
  • Fallback chain protects long runs: configure fallback_providers in config.yaml so that a rate-limit spike doesn’t kill a multi-hour refactor.

Why is my GOOGLE_API_KEY not working with Gemini?
Make sure you’re using GOOGLE_API_KEY (not GEMINI_API_KEY). Also verify the key has the Generative Language API enabled in Google Cloud Console.

Can I use two providers in the same session?
Not simultaneously, but you can switch mid-session with /model provider/model-name. Each turn after the switch uses the new model.

Does EdgeCrab send conversation history to every provider I’ve configured?
No. Only the active provider receives messages. Other API keys are only used if you explicitly switch to that provider.

How does auto-detection priority work?
EdgeCrab checks env vars in the order listed in the Provider Quick Reference table. The first key found sets the default provider. If you have multiple keys set, add provider: <name> to config.yaml to pin the preference.

Can I use a fine-tuned model?
Yes. Any OpenAI-compatible endpoint accepts a custom model name. Set base_url and model.default in config.yaml.