LLM Providers

Suitcase works with any OpenAI-compatible or Anthropic-compatible API. You bring the model and the key — Suitcase handles the rest.

Quick Reference

Provider	Base URL	Mode
Relay with llama.cpp (recommended local, free)	`http://127.0.0.1:8080/v1`	OpenAI-compatible
LM Studio	`http://127.0.0.1:1234/v1`	OpenAI-compatible
Ollama	`http://127.0.0.1:11434/v1`	OpenAI-compatible
OpenAI	`https://api.openai.com/v1`	OpenAI-compatible
Anthropic	`https://api.anthropic.com`	Anthropic-compatible
DeepSeek	`https://api.deepseek.com/v1`	OpenAI-compatible
Groq	`https://api.groq.com/openai/v1`	OpenAI-compatible
Together AI	`https://api.together.xyz/v1`	OpenAI-compatible
Custom / Self-hosted	`https://your-server/v1`	Either mode

Relay with llama.cpp (Recommended Local, Free)

Relay with llama.cpp is the recommended free local server path for Suitcase. It gives you a local OpenAI-compatible endpoint backed by GGUF models, without needing a cloud API key.

Follow the Relay docs to install and run Relay with llama.cpp
Start the local server with your chosen GGUF model
In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL: http://127.0.0.1:8080/v1
- Model: the model name exposed by Relay / llama.cpp
- API key: leave blank unless you configured one

Local recommendation

Use Relay with llama.cpp first if you want the most direct local-server setup for Suitcase. LM Studio and Ollama still work, but Relay with llama.cpp is the preferred local path.

LM Studio (Local, Free)

LM Studio runs models locally on your GPU. No API key needed unless you set one.

Download LM Studio and install
Download a model (recommended: Qwen 3 Coder, Gemma 3, Llama 3)
Go to the Developer tab → start the local server
In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL: http://127.0.0.1:1234/v1
- Model: the model filename (e.g., qwen3-coder-30b)
- API key: leave blank (LM Studio doesn't require one by default, or use the lm-studio key if configured)

Performance

For a good Suitcase experience, use a model with at least 8B parameters. 20B+ is ideal for nuanced career strategy conversations. If you have an Apple Silicon Mac, look for MLX-format models in LM Studio.

Ollama (Local, Free)

Install Ollama and pull a model:
bash
```
ollama pull qwen3:30b
```
In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL: http://127.0.0.1:11434/v1
- Model: qwen3:30b
- API key: ollama (any value works)

OpenAI

Get an API key from platform.openai.com/api-keys
In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL: https://api.openai.com/v1
- Model: gpt-4o or gpt-4o-mini
- API key: sk-...

Anthropic

Get an API key from console.anthropic.com
In Suitcase setup:
- Mode: Anthropic-compatible
- Base URL: https://api.anthropic.com
- Model: claude-sonnet-4-20250514 or claude-3-5-haiku-latest
- API key: sk-ant-...

The Anthropic-compatible mode sends requests in Anthropic's Messages API format rather than OpenAI's Chat Completions format. This is handled transparently by the inference layer.

DeepSeek

Get an API key from platform.deepseek.com
In Suitcase setup:
- Mode: OpenAI-compatible
- Base URL: https://api.deepseek.com/v1
- Model: deepseek-chat
- API key: sk-...

Custom / Self-Hosted Endpoint

Any endpoint that speaks the OpenAI Chat Completions API or Anthropic Messages API works:

In Suitcase setup:
- Mode: OpenAI-compatible or Anthropic-compatible (match your server)
- Base URL: your server's URL (e.g., https://ai.yourdomain.com/v1)
- Model: the model name your server expects
- API key: your server's API key (if required)

Cloudflare Access / Zero Trust

If your endpoint sits behind Cloudflare Access:

In Suitcase setup, expand Optional headers

Add:

json

{
  "CF-Access-Client-Id": "your-client-id.apps.cloudflareaccess.com",
  "CF-Access-Client-Secret": "your-client-secret"
}

These headers are attached to every LLM request

Verification

When you click Save and verify in the setup wizard, Suitcase:

Saves your configuration server-side (API key is never sent to the browser after save)
Calls the LLM endpoint to confirm it's reachable
Primes the model with Suitcase's identity, world rules, and your profile facts — so the first conversation starts warm

If verification fails, check:

Your base URL is correct and reachable
The model name matches what your endpoint expects
Your API key is valid
Cloudflare Access headers are correct (if applicable)

Switching Providers

You can change your LLM provider at any time from the Admin Console → re-enter the API key and endpoint, and Suitcase will re-verify and re-prime the context.

LLM Providers ​

Quick Reference ​

Relay with llama.cpp (Recommended Local, Free) ​

LM Studio (Local, Free) ​

Ollama (Local, Free) ​

OpenAI ​

Anthropic ​

DeepSeek ​

Custom / Self-Hosted Endpoint ​

Cloudflare Access / Zero Trust ​

Verification ​

Switching Providers ​