LLM Providers
Suitcase works with any OpenAI-compatible or Anthropic-compatible API. You bring the model and the key — Suitcase handles the rest.
Quick Reference
| Provider | Base URL | Mode |
|---|---|---|
| Relay with llama.cpp (recommended local, free) | http://127.0.0.1:8080/v1 | OpenAI-compatible |
| LM Studio | http://127.0.0.1:1234/v1 | OpenAI-compatible |
| Ollama | http://127.0.0.1:11434/v1 | OpenAI-compatible |
| OpenAI | https://api.openai.com/v1 | OpenAI-compatible |
| Anthropic | https://api.anthropic.com | Anthropic-compatible |
| DeepSeek | https://api.deepseek.com/v1 | OpenAI-compatible |
| Groq | https://api.groq.com/openai/v1 | OpenAI-compatible |
| Together AI | https://api.together.xyz/v1 | OpenAI-compatible |
| Custom / Self-hosted | https://your-server/v1 | Either mode |
Relay with llama.cpp (Recommended Local, Free)
Relay with llama.cpp is the recommended free local server path for Suitcase. It gives you a local OpenAI-compatible endpoint backed by GGUF models, without needing a cloud API key.
- Follow the Relay docs to install and run Relay with
llama.cpp - Start the local server with your chosen GGUF model
- In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL:
http://127.0.0.1:8080/v1 - Model: the model name exposed by Relay /
llama.cpp - API key: leave blank unless you configured one
Local recommendation
Use Relay with llama.cpp first if you want the most direct local-server setup for Suitcase. LM Studio and Ollama still work, but Relay with llama.cpp is the preferred local path.
LM Studio (Local, Free)
LM Studio runs models locally on your GPU. No API key needed unless you set one.
- Download LM Studio and install
- Download a model (recommended: Qwen 3 Coder, Gemma 3, Llama 3)
- Go to the Developer tab → start the local server
- In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL:
http://127.0.0.1:1234/v1 - Model: the model filename (e.g.,
qwen3-coder-30b) - API key: leave blank (LM Studio doesn't require one by default, or use the
lm-studiokey if configured)
Performance
For a good Suitcase experience, use a model with at least 8B parameters. 20B+ is ideal for nuanced career strategy conversations. If you have an Apple Silicon Mac, look for MLX-format models in LM Studio.
Ollama (Local, Free)
- Install Ollama and pull a model:bash
ollama pull qwen3:30b - In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL:
http://127.0.0.1:11434/v1 - Model:
qwen3:30b - API key:
ollama(any value works)
OpenAI
- Get an API key from platform.openai.com/api-keys
- In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL:
https://api.openai.com/v1 - Model:
gpt-4oorgpt-4o-mini - API key:
sk-...
Anthropic
- Get an API key from console.anthropic.com
- In Suitcase setup:
- Mode: Anthropic-compatible
- Base URL:
https://api.anthropic.com - Model:
claude-sonnet-4-20250514orclaude-3-5-haiku-latest - API key:
sk-ant-...
The Anthropic-compatible mode sends requests in Anthropic's Messages API format rather than OpenAI's Chat Completions format. This is handled transparently by the inference layer.
DeepSeek
- Get an API key from platform.deepseek.com
- In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL:
https://api.deepseek.com/v1 - Model:
deepseek-chat - API key:
sk-...
Custom / Self-Hosted Endpoint
Any endpoint that speaks the OpenAI Chat Completions API or Anthropic Messages API works:
- In Suitcase setup:
- Mode: OpenAI-compatible or Anthropic-compatible (match your server)
- Base URL: your server's URL (e.g.,
https://ai.yourdomain.com/v1) - Model: the model name your server expects
- API key: your server's API key (if required)
Cloudflare Access / Zero Trust
If your endpoint sits behind Cloudflare Access:
- In Suitcase setup, expand Optional headers
- Add:json
{ "CF-Access-Client-Id": "your-client-id.apps.cloudflareaccess.com", "CF-Access-Client-Secret": "your-client-secret" } - These headers are attached to every LLM request
Verification
When you click Save and verify in the setup wizard, Suitcase:
- Saves your configuration server-side (API key is never sent to the browser after save)
- Calls the LLM endpoint to confirm it's reachable
- Primes the model with Suitcase's identity, world rules, and your profile facts — so the first conversation starts warm
If verification fails, check:
- Your base URL is correct and reachable
- The model name matches what your endpoint expects
- Your API key is valid
- Cloudflare Access headers are correct (if applicable)
Switching Providers
You can change your LLM provider at any time from the Admin Console → re-enter the API key and endpoint, and Suitcase will re-verify and re-prime the context.