Managed-update sidecar
The chatalot-updater service and how to operate it.
What it is
The chatalot-updater container runs alongside chatalot-server in
the same Compose stack. It owns the managed-update lifecycle:
- Fetches release manifests from updates.seglamater.app over HTTPS.
- Takes a pre-upgrade Postgres snapshot.
- Pulls the new image (with digest pinning), runs pre-flight and migration scripts as throwaway containers, and recreates the server container on the new image.
- Polls /health on the new container; on failure, restores the snapshot and redeploys the previous image.
- Persists every apply's state + event log to a local SQLite database (/var/lib/chatalot-updater/state.db, mode 0600).
An apply is kicked off by an authenticated call from chatalot-server
to the updater's HTTP API — operators never call the updater directly
in normal use. The admin UI on the server surfaces apply status + event
history by proxying through signed requests.
Container layout
Enabling the sidecar adds two services (both opt-in via the updater
Compose profile):
| Container | Image | Purpose |
|---|---|---|
| chatalot-updater | built from Dockerfile.updater | Orchestrator + HTTP API on port 8081 |
| chatalot-socket-proxy | registry.seglamater.app/seglamater/docker-socket-proxy:0.3.0 | Filtered gateway to the host Docker socket |
The socket-proxy image is mirrored from docker.io/tecnativa/docker-socket-proxy:0.3.0
into the Seglamater Forgejo registry so the managed-update system does not
depend on Docker Hub availability. If you're self-hosting chatalot
against a different registry, repeat the mirror steps from the section
"Mirroring the socket-proxy image" below and update the image: reference
in docker-compose.yml.
Both containers live on an isolated internal Docker network
(updater_internal). Neither container publishes a port to the host. The updater
also joins the existing internal network so it can reach Postgres
for pg_dump.
Socket-proxy ACL
The proxy permits exactly the Docker API surface BollardDockerClient
uses (see crates/chatalot-updater/src/update/docker_client.rs):
CONTAINERS=1 IMAGES=1 NETWORKS=1 POST=1
EXEC=0 VOLUMES=0 SYSTEM=0 SECRETS=0 CONFIGS=0
SERVICES=0 TASKS=0 SWARM=0 PLUGINS=0 NODES=0
BUILD=0 COMMIT=0 DISTRIBUTION=0
EXEC=0 is the critical one — it means a compromised updater cannot
run arbitrary commands inside any existing container. Extending the
orchestrator with a new Docker call means revisiting this ACL; the
module docstring flags that obligation.
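A quick way to audit the ACL is to check the proxy's environment against the deny list above. The following is a minimal sketch, not part of chatalot: `acl_violations` is a hypothetical helper name, it assumes one KEY=VALUE pair per line, and it is deliberately strict in that a deny-listed key that is missing is reported as well as one set to a non-zero value.

```shell
# Hypothetical helper (not part of chatalot): reads KEY=VALUE lines on stdin
# and prints every deny-listed key that is not explicitly set to 0.
acl_violations() {
    deny="EXEC VOLUMES SYSTEM SECRETS CONFIGS SERVICES TASKS SWARM PLUGINS NODES BUILD COMMIT DISTRIBUTION"
    input=$(cat)
    for key in $deny; do
        # Use the last assignment of the key, if there is one.
        value=$(printf '%s\n' "$input" | sed -n "s/^${key}=//p" | tail -n 1)
        if [ "$value" != "0" ]; then
            printf '%s\n' "$key"
        fi
    done
}

# Example (container name assumed to match the service name):
#   docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' \
#       chatalot-socket-proxy | acl_violations
```

Empty output means every deny-listed capability is switched off. Any printed key is worth reviewing before extending the orchestrator.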
Image size
The runtime image is debian:bookworm-slim plus postgresql-client-17
(for pg_dump / psql) and curl (for the healthcheck probe).
Expect ~110–130 MB pulled. The ticket originally aimed for <30 MB, but
the postgres client alone is ~50 MB. Keeping that client is mandatory —
the updater can't orchestrate DB snapshots without it.
Operator workflow
Enabling
- Copy the secret placeholders:
cp secrets/updater_token.example secrets/updater_token
openssl rand -hex 32 > secrets/updater_token
chmod 600 secrets/updater_token
cp secrets/cosign_pub.example secrets/cosign_pub
# Paste the cosign public key (see the .example file or the
# canonical chatalot.pub at https://updates.seglamater.app/.well-known/keys/chatalot.pub
# — or generate your own if self-publishing).
chmod 600 secrets/cosign_pub
- Bring up the sidecar:
- Verify health from inside the stack:
Expect {"status":"ok","version":"0.24.6"} (or the current updater
crate version).
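The commands for the last two steps are not reproduced above, so the following is only a sketch under stated assumptions. The `updater` profile name is taken from the mirroring section. The `/health` path on the updater's port 8081 is an assumption, not confirmed by this document. `health_ok` is a hypothetical helper that checks a response body for the expected status.

```shell
# Hypothetical check: succeeds when a health response body reports status "ok".
health_ok() {
    printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Sketch of the two steps (the /health path on the updater is an assumption):
#   docker compose --profile updater up -d
#   body=$(docker compose exec chatalot-updater \
#       curl -fsS http://localhost:8081/health)
#   health_ok "$body" && echo "updater healthy"
```

Running curl inside the container relies on the fact that the runtime image ships curl for its healthcheck probe, and it avoids publishing the port to the host.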
Inspecting apply state
The quickest view is the SQLite database inside the container:
docker compose exec chatalot-updater \
sqlite3 -readonly /var/lib/chatalot-updater/state.db \
'SELECT id, target_version, current_state, outcome FROM applies ORDER BY started_at DESC LIMIT 10;'
For event-level detail on a single apply:
docker compose exec chatalot-updater \
sqlite3 -readonly /var/lib/chatalot-updater/state.db \
"SELECT ts, event_code, detail FROM apply_events WHERE apply_id='<uuid>' ORDER BY ts;"
Restart safety
If the updater container is restarted (operator-triggered or OOM-killed)
mid-apply, the next startup's orphan scan transitions any in-flight
apply to FrozenMaintenanceRequired. Operators should treat a frozen
state as an intervention point — see the rollback recipe below.
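When investigating a frozen apply, the event-level query from the previous section is usually the first stop. Because the apply id is interpolated into the SQL string, a small guard can prevent a mistyped or malicious value from reaching sqlite3. This is a sketch that assumes apply ids are canonical UUIDs, as the `<uuid>` placeholder suggests. `is_uuid` and `apply_events` are hypothetical names.

```shell
# Succeeds only for a canonical 8-4-4-4-12 hexadecimal UUID.
is_uuid() {
    case "$1" in
        *[!0-9a-fA-F-]*) return 1 ;;   # reject anything outside hex digits and '-'
    esac
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Prints the event log for one apply, refusing ids that are not UUIDs.
apply_events() {
    is_uuid "$1" || { echo "not a UUID: $1" >&2; return 1; }
    docker compose exec chatalot-updater \
        sqlite3 -readonly /var/lib/chatalot-updater/state.db \
        "SELECT ts, event_code, detail FROM apply_events WHERE apply_id='$1' ORDER BY ts;"
}
```

The `-readonly` flag is kept from the original queries so that inspection cannot modify the state database.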
Rollback recipe
When an apply lands in FrozenMaintenanceRequired, rollback requires
manual review. The snapshot produced during the failed apply is named
pre-apply-<version>-<timestamp>.sql.gz under /var/backups/chatalot.
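Given that naming scheme, locating the most recent snapshot for a version can be sketched as below. `latest_snapshot` is a hypothetical helper. It orders files by modification time rather than parsing the timestamp in the name, since the timestamp format is not specified here, and it has to run wherever /var/backups/chatalot is visible.

```shell
# Hypothetical helper: prints the newest snapshot for a given version,
# or nothing if no snapshot matches.
latest_snapshot() {
    dir=$1
    version=$2
    # ls -t sorts newest first; stderr is dropped so "no match" stays silent.
    ls -t "$dir"/pre-apply-"$version"-*.sql.gz 2>/dev/null | head -n 1
}

# Example:
#   latest_snapshot /var/backups/chatalot 0.24.6
```

Sorting by modification time is a simplification. If snapshot files have been copied or touched since the apply, cross-check against the event log as described in the steps that follow.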
- Identify the snapshot:
- Confirm the snapshot matches the failed apply (cross-check the snapshot_started event's detail blob for the snapshot id).
- Restore via the CHAT-17 CLI:
- Recreate the server container on the previous image. The updater records the previous image reference in the rollback_started event's detail:
docker compose exec chatalot-updater \
sqlite3 -readonly /var/lib/chatalot-updater/state.db \
"SELECT detail FROM apply_events WHERE apply_id='<uuid>' AND event_code='rollback_started';"
- After chatalot-server is healthy, transition the frozen row to a terminal outcome so it no longer trips the orphan-scan alert:
docker compose exec chatalot-updater \
sqlite3 /var/lib/chatalot-updater/state.db \
"UPDATE applies SET outcome='rolled_back', error='manual-recovery' WHERE id='<uuid>';"
Wave-1 limitations
Called out here so operators don't assume they're bugs:
- Cosign verification is stubbed. StubCosignVerifier validates the pubkey file is readable at startup and rejects missing/empty signatures on verify calls, but does not cryptographically verify anything. Every verify call emits a loud WARN target="cosign_stub" log; grep for that string to confirm you're still on wave-1.
- Maintenance broadcast is a log-only stub. The pre-disruption broadcast that tells users "maintenance in N seconds" isn't wired to the server-side WebSocket API yet; the phase just logs and sleeps for CHATALOT_UPDATER_BROADCAST_GRACE_SECS.
- No WebSocket progress stream. Admin UI polls /v1/apply/:id. Sub-second live progress lands in wave-2.
All three are tracked as separate wave-2 tickets and will slot in without Compose changes.
Configuration reference
All env vars the updater reads, with their defaults:
| Variable | Default | Purpose |
|---|---|---|
| CHATALOT_UPDATER_LISTEN_ADDR | 0.0.0.0:8081 | HTTP listen |
| CHATALOT_UPDATER_API_TOKEN_FILE | /run/secrets/updater_token | HMAC shared secret |
| CHATALOT_UPDATER_SQLITE_PATH | /var/lib/chatalot-updater/state.db | Apply state + event log |
| DOCKER_HOST | tcp://chatalot-socket-proxy:2375 | Daemon endpoint |
| CHATALOT_UPDATER_SERVER_CONTAINER | chatalot-server | Container the orchestrator manipulates |
| CHATALOT_UPDATER_COSIGN_PUBKEY_PATH | /run/secrets/cosign_pub | Cosign public key |
| CHATALOT_UPDATER_PRE_FLIGHT_TIMEOUT_SECS | 30 | Pre-flight phase timeout |
| CHATALOT_UPDATER_COSIGN_TIMEOUT_SECS | 30 | Cosign verify timeout |
| CHATALOT_UPDATER_SNAPSHOT_TIMEOUT_SECS | 300 | pg_dump timeout |
| CHATALOT_UPDATER_PULL_TIMEOUT_SECS | 600 | docker pull timeout |
| CHATALOT_UPDATER_MIGRATE_TIMEOUT_SECS | 600 | Migrate one-shot timeout |
| CHATALOT_UPDATER_HEALTH_CHECK_TIMEOUT_SECS | 60 | Post-start health wait |
| CHATALOT_UPDATER_BROADCAST_GRACE_SECS | 30 | Maintenance broadcast grace |
All timeouts are clamped to [1, 3600] seconds with a warning log on
out-of-range values.
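The clamping rule can be illustrated with a small shell function. This is only a sketch of the behaviour described above, not the updater's own code: `clamp_timeout` is a hypothetical name, the warning text is invented, and rejecting non-numeric input is an assumption, since how the updater treats such values is not stated here.

```shell
# Illustration of the [1, 3600] clamping rule; prints the effective value
# on stdout and any warning on stderr.
clamp_timeout() {
    value=$1
    case "$value" in
        ''|*[!0-9]*)
            # Assumption: non-numeric input is rejected in this sketch.
            echo "clamp_timeout: not a non-negative integer: '$value'" >&2
            return 1 ;;
    esac
    if [ "$value" -lt 1 ]; then
        echo "warning: timeout $value below range, using 1" >&2
        echo 1
    elif [ "$value" -gt 3600 ]; then
        echo "warning: timeout $value above range, using 3600" >&2
        echo 3600
    else
        echo "$value"
    fi
}
```

For example, `clamp_timeout 7200` prints 3600 and `clamp_timeout 0` prints 1, each with a warning on stderr, while `clamp_timeout 300` passes through unchanged.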
Mirroring the socket-proxy image
The stock docker-compose.yml references
registry.seglamater.app/seglamater/docker-socket-proxy:0.3.0 — a mirror
of the upstream tecnativa/docker-socket-proxy:0.3.0 image published
via the Seglamater Forgejo registry. This keeps the managed-update
install path Docker-Hub-independent: once chatalot is running, pulling
updates and their required sidecar images hits only Seglamater
infrastructure.
If you're self-hosting chatalot against your own registry, replicate the mirror:
# 1. Pull the upstream image from Docker Hub.
docker pull tecnativa/docker-socket-proxy:0.3.0
# 2. Verify the content digest matches what upstream published at
# mirror time. sha256:9e4b9e7517a6b660f2cc903a19b257b1852d5b3344794e3ea334ff00ae677ac2
docker inspect --format '{{index .RepoDigests 0}}' \
tecnativa/docker-socket-proxy:0.3.0
# 3. Re-tag for your registry (example: my-registry.example.com, org `acme`).
docker tag tecnativa/docker-socket-proxy:0.3.0 \
my-registry.example.com/acme/docker-socket-proxy:0.3.0
# 4. Push.
echo "$REGISTRY_TOKEN" | docker login my-registry.example.com -u acme --password-stdin
docker push my-registry.example.com/acme/docker-socket-proxy:0.3.0
# 5. Update docker-compose.yml so chatalot-socket-proxy uses the
# mirrored path, then restart the sidecar:
# docker compose --profile updater up -d chatalot-socket-proxy
After the mirror you can (and should) cosign sign the image in your
registry with the same key that signs your chatalot release manifests,
so the supply-chain story is symmetric end-to-end. Cosign is stubbed
in wave-1 so this isn't wired to block the pull yet; the signature
lands in place for wave-2 to verify.
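Step 2 of the mirror procedure only prints the digest, which leaves the comparison to the operator's eye. A scripted mirror could fail fast on a mismatch instead. The sketch below assumes the RepoDigests entry has the usual `repo@sha256:...` shape. `digest_matches` is a hypothetical helper.

```shell
# Succeeds when the digest part of a RepoDigests entry equals the expected digest.
#   $1: entry such as tecnativa/docker-socket-proxy@sha256:...
#   $2: expected digest, including the sha256: prefix
digest_matches() {
    [ "${1##*@}" = "$2" ]
}

# Example, using the digest recorded in step 2:
#   actual=$(docker inspect --format '{{index .RepoDigests 0}}' \
#       tecnativa/docker-socket-proxy:0.3.0)
#   digest_matches "$actual" \
#       "sha256:9e4b9e7517a6b660f2cc903a19b257b1852d5b3344794e3ea334ff00ae677ac2" \
#       || { echo "digest mismatch, aborting mirror" >&2; exit 1; }
```

Checking before the re-tag and push keeps an unexpected upstream image out of your registry.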
Related source
- Crate: crates/chatalot-updater/
- Orchestrator: crates/chatalot-updater/src/update/orchestrator.rs
- HTTP API: crates/chatalot-updater/src/update/http_api.rs
- Dockerfile: Dockerfile.updater
- Compose services: docker-compose.yml (chatalot-updater + chatalot-socket-proxy)