BYOK LLM configuration
Hivemind is bring-your-own-key: it never proxies LLM calls through Seglamater. You point it at your own provider — Anthropic, OpenAI, Ollama (local), or any OpenAI-compatible endpoint (LM Studio, vLLM, LocalAI, Azure OpenAI, LiteLLM, …). Hivemind boots and runs without a provider configured; LLM-dependent surfaces (public chat, agent runtime tool-call loop, content studio drafting) stay degraded until you set one. The boot log makes the degraded state loud:
WARN LLM startup ping FAILED — LLM-dependent features degraded; fix
HIVEMIND_LLM_* env vars and restart provider=anthropic
error="missing config: HIVEMIND_LLM_API_KEY (anthropic)"
The envelope
Every provider uses the same five environment variables; the values differ.
Set them in your install dir's environment file and then
docker compose up -d server (the server picks them up on restart):
| Variable | What | Default if unset |
|---|---|---|
HIVEMIND_LLM_PROVIDER |
anthropic | openai | ollama | openai-compat |
anthropic |
HIVEMIND_LLM_ENDPOINT |
Base URL (no trailing /) |
provider-specific (see below) |
HIVEMIND_LLM_API_KEY |
Provider API token | unset → ping fails |
HIVEMIND_LLM_MODEL |
Provider-specific model id | provider-specific (see below) |
HIVEMIND_LLM_TIMEOUT_SECS |
Per-request timeout in seconds | 60 |
HIVEMIND_LLM_PACE_MS |
Inter-call gap (rate-limit politeness) | 400 |
HIVEMIND_LLM_PACE_MS is a Tidal-style respectful inter-call gap, not a
rate limit. It enforces a minimum of N milliseconds between successive
calls to the upstream so a burst of agent tool-calls doesn't hammer the
provider. The default 400 ms is conservative; tighten for local providers.
Provider snippets
Pick one. Copy the four lines into the environment file under
<install-dir>/, then restart the server.
Anthropic (Claude)
HIVEMIND_LLM_PROVIDER=anthropic
HIVEMIND_LLM_ENDPOINT=https://api.anthropic.com
HIVEMIND_LLM_API_KEY=sk-ant-… # console.anthropic.com → Settings → API Keys
HIVEMIND_LLM_MODEL=claude-haiku-4-5-20251001
Default model is claude-haiku-4-5-20251001 if you leave _MODEL unset.
For agent-runtime / content-studio work the haiku tier is the right
default; bump to a sonnet tier (claude-sonnet-4-5-…) if you want richer
reasoning at higher per-call cost.
OpenAI
HIVEMIND_LLM_PROVIDER=openai
HIVEMIND_LLM_ENDPOINT=https://api.openai.com
HIVEMIND_LLM_API_KEY=sk-… # platform.openai.com → API keys
HIVEMIND_LLM_MODEL=gpt-4o-mini
gpt-4o-mini is a reasonable default; pick whichever chat model your
account has access to.
Ollama (local, no key)
HIVEMIND_LLM_PROVIDER=ollama
HIVEMIND_LLM_ENDPOINT=http://host.docker.internal:11434
HIVEMIND_LLM_MODEL=llama3.2:3b
No API key. Note the host.docker.internal hostname: the
hivemind-server container needs to reach Ollama on the host. The shipped
docker-compose.yml adds extra_hosts: - "host.docker.internal:host-gateway"
for the server service so this resolves correctly on Linux Docker.
Pull the model on the host first:
For larger models (8B+) tighten HIVEMIND_LLM_PACE_MS if you find the
agent loop is artificially slow:
OpenAI-compatible (LM Studio / vLLM / LocalAI / Azure / LiteLLM)
HIVEMIND_LLM_PROVIDER=openai-compat
HIVEMIND_LLM_ENDPOINT=http://host.docker.internal:1234 # LM Studio default
HIVEMIND_LLM_API_KEY= # leave blank for unauth local
HIVEMIND_LLM_MODEL=local-model # whatever your server reports
This path speaks the OpenAI Chat Completions API shape; it works against anything that does too. Common targets:
- LM Studio — default
http://host.docker.internal:1234, no key. - vLLM —
http://<host>:8000/v1, no key by default. - LocalAI —
http://<host>:8080/v1, key optional. - Azure OpenAI —
https://<resource>.openai.azure.com/openai/deployments/<dep>, key required. SetHIVEMIND_LLM_MODELto the deployment id. - LiteLLM proxy —
http://<host>:4000, key per your LiteLLM config.
Verify
After you set the four vars and restart the server, look for these
INFO lines in the boot log (no WARN LLM startup ping FAILED):
docker compose logs --tail=30 hivemind-server | grep -E "LLM|listening"
# INFO hivemind_server: BYOK LLM provider initialised provider="anthropic"
# endpoint="https://api.anthropic.com" model="claude-haiku-4-5-20251001"
# INFO hivemind_server: listening addr=0.0.0.0:3000
If you see the WARN LLM startup ping FAILED line instead, the error
message names what's wrong (most often a missing key or a wrong endpoint).
Re-edit the environment file and docker compose up -d server — provider
init runs at every server start, so you'll see the new attempt on the
next restart.
Switching providers
Change the four variables in the environment file and restart the server:
There is no on-disk LLM-provider state to migrate. Agent profiles + the chat history stay; the model that backs them just changes. Re-run any agent-quality checks you have after the switch — output quality differs across providers and especially across model tiers.
Common pitfalls
- Wrong endpoint path.
HIVEMIND_LLM_ENDPOINTis the BASE URL — no trailing/v1, no path. Hivemind appends the right path per provider shape. Settinghttps://api.openai.com/v1makes the ping fail. - Model name doesn't exist on the account. The startup ping is a real call against the model; if your account doesn't have access, the ping errors. Use a model id you've confirmed in the provider's console.
- Local provider unreachable from the container. Ollama / LM Studio on
the host requires
host.docker.internal(Linux Docker honors this via thehost-gatewayextra_hosts entry the shipped compose includes).localhostfrom inside the container points at the container itself, not the host. - Aggressive
HIVEMIND_LLM_PACE_MS=0. You can set it to zero — but most cloud providers will rate-limit you. The default 400 ms is the polite floor; lower it deliberately and watch for 429s. - Timeout too tight for local large models. Local 70B+ inference on
modest hardware easily exceeds 60 s for the first ping. Bump
HIVEMIND_LLM_TIMEOUT_SECS=180if you're seeing timeout errors on a local provider that you know is working.
If the ping keeps failing after the obvious fixes, troubleshooting.md
has the "missing HIVEMIND_LLM_API_KEY" entry with the exact log line +
fix pattern.