WebSocket Protocol
Status: Complete
The WebSocket protocol is the real-time communication layer for Chatalot. All live events -- messages, presence, typing, voice signaling -- flow through a single WebSocket connection per client.
Connection
- Endpoint:
wss://your-instance/ws (or ws:// for unencrypted dev)
- The connection uses standard WebSocket upgrade
- All messages are JSON-encoded text frames
- Max frame size: 1 MB
Authentication
Authentication happens via the first message, not headers. The browser WebSocket API does not reliably support custom headers, and this approach also works cleanly with the Tauri desktop client.
Flow
Client Server
| |
|-- GET /ws (Upgrade) ---------------->|
|<-- 101 Switching Protocols ----------|
| |
|-- {"type":"authenticate", |
| "token":"<JWT | cb_…>"} ------->|
| | Validate JWT (EdDSA/Ed25519),
| | else bot token (cb_…) via shared bot-auth
|<-- {"type":"authenticated"} ---------| (success)
| |
| OR |
|<-- {"type":"error", |
| "code":401, |
| "message":"..."} -------------| (failure)
| |
- Client opens WebSocket connection
- Client sends:
{"type": "authenticate", "token": "<credential>"}
- Server validates the credential (see Accepted credentials below)
- Server responds with
{"type": "authenticated"} on success or {"type": "error", "code": 401, "message": "..."} on failure
- If no auth message arrives within 10 seconds, the connection is dropped
Accepted credentials
The token field accepts either:
- A JWT access token (human/desktop clients) — validated as EdDSA/Ed25519 with the same audience + 60s leeway as the HTTP API.
- A bot token
cb_<64 hex> (CHAT-162) — the same X-Bot-Token credential the HTTP API accepts. The server tries JWT validation first; if that fails and the token is cb_-prefixed, it runs the shared bot-auth path (authenticate_bot_token), enforcing identical checks to the HTTP middleware: token active/unrevoked/unexpired, is_bot, not suspended, and the per-token rate-limit budget (one unit charged per WS auth). Suspended users are rejected on both paths.
This is authentication only — it does not change authorization. A bot that authenticates to the WS but is not a channel member still cannot send to or read that channel (send_message to a non-member channel returns {"type":"error","code":"forbidden"}). Bots participate in E2E channels only once they run a crypto client and hold their own keys (see ADR-002, bot-as-client).
Connection Limits
| Limit |
Value |
| Max concurrent connections per user |
8 (multi-device support) |
| Rate limiting (burst) |
10 messages/second |
| Rate limiting (sustained) |
5 messages/second (token bucket refill) |
| Heartbeat interval |
Server sends Ping every 30 seconds; client must respond with Pong |
| Broadcast channel buffer |
256 messages per channel subscription |
Client Messages (Client -> Server)
Messages sent from the client to the server.
Messaging
| Type |
Fields |
Description |
send_message |
channel_id, ciphertext, nonce, reply_to_id?, sender_key_id?, thread_id? |
Send an encrypted message to a channel |
edit_message |
message_id, ciphertext, nonce |
Edit a previously sent message |
delete_message |
message_id |
Delete a message |
Presence
| Type |
Fields |
Description |
update_presence |
status: string |
Set status: online, idle, dnd, invisible, offline |
typing |
channel_id |
Start typing indicator in a channel |
stop_typing |
channel_id |
Stop typing indicator in a channel |
Channel Subscriptions
| Type |
Fields |
Description |
subscribe |
channel_id |
Subscribe to real-time events for a channel |
unsubscribe |
channel_id |
Unsubscribe from a channel's events |
WebRTC Signaling
| Type |
Fields |
Description |
rtc_offer |
channel_id, target_user_id, sdp |
Send a WebRTC offer for voice/video |
rtc_answer |
channel_id, target_user_id, sdp |
Send a WebRTC answer |
rtc_ice_candidate |
channel_id, target_user_id, candidate |
Exchange an ICE candidate |
Voice
| Type |
Fields |
Description |
join_voice |
channel_id |
Join a voice channel |
leave_voice |
channel_id |
Leave a voice channel |
kick_from_voice |
channel_id, user_id |
Kick a user from voice (mod+) |
Reactions
| Type |
Fields |
Description |
add_reaction |
message_id, emoji |
Add a reaction to a message |
remove_reaction |
message_id, emoji |
Remove a reaction from a message |
Unread Tracking
| Type |
Fields |
Description |
mark_read |
channel_id, message_id |
Mark a channel as read up to a specific message |
mark_all_read |
(none) |
Mark all channels as read |
Keepalive
| Type |
Fields |
Description |
authenticate |
token: string |
First message -- JWT or bot token (cb_…, CHAT-162) authentication |
ping |
timestamp: i64 |
Client keepalive; server responds with pong |
Server Messages (Server -> Client)
Messages sent from the server to the client.
Auth
| Type |
Fields |
Description |
authenticated |
(none) |
Authentication succeeded |
error |
code: u16, message: string |
Error response |
Messaging
| Type |
Fields |
Description |
new_message |
message object |
New message in a subscribed channel |
message_sent |
message_id, created_at, thread_id? |
Confirmation of the sender's own message |
message_edited |
message_id, ciphertext, nonce, edited_at |
A message was edited |
message_deleted |
message_id, channel_id |
A message was deleted |
mentioned |
message_id, channel_id, channel_name, sender_name |
The recipient was @mentioned. Sent directly to the mentioned user's sessions regardless of channel subscription, so the client can surface a live alert even when not viewing the channel. Metadata only -- never message content. Currently emitted for webhook posts only (the server can read webhook plaintext but not normal E2E ciphertext). |
Presence
| Type |
Fields |
Description |
presence_update |
user_id, status |
A user changed their presence status |
user_typing |
channel_id, user_id |
A user started typing |
user_stopped_typing |
channel_id, user_id |
A user stopped typing |
WebRTC Signaling
| Type |
Fields |
Description |
rtc_offer |
channel_id, from_user_id, sdp |
Incoming WebRTC offer |
rtc_answer |
channel_id, from_user_id, sdp |
Incoming WebRTC answer |
rtc_ice_candidate |
channel_id, from_user_id, candidate |
Incoming ICE candidate |
Voice
| Type |
Fields |
Description |
voice_state_update |
channel_id, user_id, state |
Voice state changed |
user_joined_voice |
channel_id, user_id |
A user joined voice |
user_left_voice |
channel_id, user_id |
A user left voice |
Reactions
| Type |
Fields |
Description |
reaction_added |
message_id, user_id, emoji |
A reaction was added |
reaction_removed |
message_id, user_id, emoji |
A reaction was removed |
Read Receipts
| Type |
Fields |
Description |
read_receipt |
channel_id, user_id, message_id, timestamp |
A user read up to a message |
Pinned Messages
| Type |
Fields |
Description |
message_pinned |
message_id, channel_id, pinned_by |
A message was pinned |
message_unpinned |
message_id, channel_id |
A message was unpinned |
Channel Moderation
| Type |
Fields |
Description |
member_kicked |
channel_id, user_id |
A member was kicked |
member_banned |
channel_id, user_id |
A member was banned |
member_role_updated |
channel_id, user_id, role |
A member's role changed |
DM Notifications
| Type |
Fields |
Description |
new_dm_channel |
channel object |
A new DM channel was created with you |
Sender Keys (Group E2E)
| Type |
Fields |
Description |
sender_key_updated |
channel_id, user_id, distribution |
A member uploaded a new sender key |
sender_key_rotation_required |
channel_id |
All members must rotate their sender keys |
keys_low |
remaining: u32 |
One-time prekey count is low; upload more |
System
| Type |
Fields |
Description |
pong |
timestamp: i64 |
Heartbeat response (echoes the client's timestamp) |
Message Flow Examples
Sending a Message
Client: {"type": "send_message", "channel_id": "...", "ciphertext": "base64...", "nonce": "base64..."}
Server -> Self: {"type": "message_sent", "message_id": "...", "created_at": "2024-..."}
Server -> Others: {"type": "new_message", "message": {...}}
Typing Indicator
- Typing events are deduplicated: only one
typing event per user per channel per 3 seconds
stop_typing is sent when the user clears the input or sends a message
Voice Call Signaling
1. Client sends `join_voice`
-> Server broadcasts `user_joined_voice` to others in the channel
2. Existing participants send `rtc_offer` to the new participant
3. New participant responds with `rtc_answer`
4. ICE candidates are exchanged via `rtc_ice_candidate`
5. On leave: Client sends `leave_voice`
-> Server broadcasts `user_left_voice`
Error Codes
| Code |
Meaning |
400 |
Bad request / malformed message |
401 |
Authentication failed |
403 |
Permission denied |
404 |
Resource not found |
429 |
Rate limited |
Subscription Model
Clients must explicitly subscribe to channels to receive events. On connection, the client typically subscribes to all channels the user is a member of. Channel events are broadcast via tokio::sync::broadcast channels with a 256-message buffer. If a subscriber falls behind, it receives a Lagged error and may miss messages.
Delivery, Reconnect, and Reconciliation (CHAT-191)
Direct messages are delivered with deliver_to_user, which sends to each of the recipient's live sessions, prunes any session whose receiver has been dropped (a dead/closing socket), and returns the live-delivery count. The offline web-push fallback fires when that count is 0 — i.e. based on real delivery rather than is_online(), which lags a dead socket and could otherwise let a DM vanish into a stale handle.
Because a frame can still be lost in the disconnect→reconnect gap (sent while the recipient had no live session), the client reconciles on reconnect rather than relying on live delivery alone:
- On reconnect the client re-subscribes to all channels, reloads the active channel, re-syncs unread counts, and (CHAT-191) re-fetches the latest message page for already-loaded non-active DM channels plus refreshes the DM list (surfacing a DM started during the gap). Re-fetched messages are merged idempotently (dedup by id), so a reconnect recovers missed messages the way a full page reload does — without one. The per-reconnect DM re-fetch is bounded.
Next Step