Message Types
The Diminuendo wire protocol defines a structured, bidirectional communication layer between frontend clients and the gateway: a channel designed for real-time streaming of AI agent events (thinking blocks, tool invocations, terminal output, file mutations) while maintaining the strict ordering and persistence guarantees required for reliable session replay. This page documents both the protocol’s transport semantics and the complete catalog of 21 client-to-server message types: the upstream vocabulary through which clients command the gateway.
Transport and Encoding
The protocol operates exclusively over WebSocket connections (RFC 6455). Every frame is a UTF-8-encoded JSON object. Binary frames are not used. Compression (perMessageDeflate) is deliberately disabled to minimize latency on the hot path — text deltas arrive at sub-millisecond intervals during active turns, and decompression overhead is unacceptable at that cadence.
The protocol version is currently 1. The welcome message includes a protocolVersion field that clients must validate. A ProtocolVersionMismatch error is raised if the client and gateway disagree on the version.
Protocol Version
All messages belong to protocol version 1. This version number is transmitted in the initial welcome event and is exposed as a constant in the TypeScript and Rust SDKs:
| SDK | Constant |
|---|---|
| TypeScript | PROTOCOL_VERSION (= 1) |
| Rust | PROTOCOL_VERSION: u32 (= 1) |
| Python | Implicit in wire format |
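Because the Python SDK carries no explicit constant, a client can perform the version check itself. The following is a minimal sketch rather than SDK code: the function name `check_welcome`, the local exception class, and the `"welcome"` value of `type` are assumptions; only `protocolVersion`, `requiresAuth`, and the `ProtocolVersionMismatch` name come from this page.

```python
import json

PROTOCOL_VERSION = 1  # mirrors the constant exposed by the TypeScript and Rust SDKs


class ProtocolVersionMismatch(Exception):
    """Local stand-in for the gateway's ProtocolVersionMismatch error."""


def check_welcome(frame: str) -> bool:
    """Validate a welcome frame; return True if authentication is required."""
    event = json.loads(frame)
    if event.get("protocolVersion") != PROTOCOL_VERSION:
        raise ProtocolVersionMismatch(
            f"gateway reports {event.get('protocolVersion')!r}, "
            f"client expects {PROTOCOL_VERSION}"
        )
    return bool(event.get("requiresAuth"))
```

A client would call this on the first frame it receives and close the connection if the exception is raised.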
Connection Lifecycle
Every WebSocket connection progresses through a deterministic sequence of phases. There are no shortcuts, no optional steps, no negotiation: the sequence is invariant.
Connect
The client opens a WebSocket connection to ws(s)://host:port/ws. The gateway validates the Origin header against its allowlist (bypassed in dev mode) and performs a CSRF check for browser-origin connections.
Welcome
The gateway immediately sends a welcome event containing the protocol version and whether authentication is required. A connected event follows with the assigned clientId and the heartbeat interval.
Authenticate
If requiresAuth is true, the client must send an authenticate message with a valid JWT or API key. The gateway verifies the token (via Auth0 JWKS in production) and responds with authenticated containing the user’s identity. In dev mode, authentication is automatic: the gateway assigns a synthetic identity (developer@example.com) and sends authenticated without requiring a token.
Session Interaction
After authentication, the client may send any of the 21 message types: listing sessions, creating sessions, joining sessions, running turns, and so on. Messages sent before authentication (except authenticate itself) are rejected with a NOT_AUTHENTICATED error.
Join Session
To receive streaming events for a session, the client sends join_session. The gateway responds with a state_snapshot (a complete portrait of the session’s current state) and subscribes the client to all future events on that session’s topic.
Message Format
Every message, both client-to-server and server-to-client, is a JSON object with a type field that serves as the discriminator. This is the single key by which the gateway’s Effect Schema parser dispatches incoming frames against the full union of 21 client message schemas.
A frame that does not match any schema in the union is rejected with an INVALID_MESSAGE error. The rejection is immediate and deterministic: there is no fallback parsing, no fuzzy matching, no “best effort” interpretation.
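The dispatch rule can be illustrated with a small sketch. This is not the gateway's Effect Schema parser: it checks only the `type` discriminator, not the per-message fields, and `KNOWN_TYPES` is a deliberately partial, illustrative subset of the 21 types.

```python
import json

# Illustrative subset of the 21 client message types.
KNOWN_TYPES = {"authenticate", "join_session", "run_turn", "ping"}


def classify_frame(frame: str) -> str:
    """Return the message type, or "INVALID_MESSAGE" if the frame is rejected."""
    try:
        message = json.loads(frame)
    except json.JSONDecodeError:
        return "INVALID_MESSAGE"
    if not isinstance(message, dict):
        return "INVALID_MESSAGE"  # every frame must be a JSON object
    msg_type = message.get("type")
    if not isinstance(msg_type, str) or msg_type not in KNOWN_TYPES:
        return "INVALID_MESSAGE"  # exact match only: no fuzzy matching
    return msg_type
```

Note that `"Ping"` is rejected just like malformed JSON: the match on `type` is exact and case-sensitive in this sketch.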
Sequence Numbers
Events that belong to a session carry a seq field: a per-session, monotonically increasing integer. Sequence numbers serve three distinct purposes:
- Ordering: clients sort events by seq to reconstruct the correct temporal order, even if WebSocket frames arrive out of order due to network conditions or buffering
- Deduplication: replayed events carry the same seq as the original emission; clients skip events they have already processed
- Resumption: when reconnecting, clients pass afterSeq in the join_session message to receive only events they missed
Within each session, sequence numbers start at seq: 1.
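The three uses of seq can be combined in one small client-side helper. This is a sketch under stated assumptions: the class `SeqTracker` and its method names are hypothetical, and `0` is used locally to mean "no events seen yet".

```python
class SeqTracker:
    """Tracks the highest seq applied for a single session."""

    def __init__(self) -> None:
        self.last_seq = 0  # assumption: 0 means nothing has been seen yet

    def accept(self, events: list) -> list:
        """Return the unseen events in seq order, skipping duplicates."""
        fresh = []
        for event in sorted(events, key=lambda e: e["seq"]):  # ordering
            if event["seq"] <= self.last_seq:
                continue  # deduplication: already processed
            fresh.append(event)
            self.last_seq = event["seq"]
        return fresh

    def resume_from(self) -> int:
        """Value to send as afterSeq when rejoining the session."""
        return self.last_seq
```

The tracker treats `last_seq` as a high-water mark, so an event that arrives in a later batch with a lower seq than one already applied is dropped; a client that needs to tolerate late arrivals would buffer instead.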
Timestamps
Events include a ts field containing the Unix epoch time in milliseconds at which the gateway generated or relayed the event. Timestamps are server-authoritative: clients should not rely on their own wall clock for ordering. The server’s clock is the single source of temporal truth.
The ping/pong mechanism exposes both clientTs (echoed from the client’s ping) and serverTs (the gateway’s timestamp at pong emission), enabling clients to compute round-trip latency and approximate clock skew.
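The arithmetic behind that estimate is short. The sketch below assumes the pong was emitted halfway through the round trip (symmetric latency), which is an approximation; `received_at` is a hypothetical parameter holding the client's own clock reading when the pong arrived.

```python
def estimate_clock(client_ts: int, server_ts: int, received_at: int) -> tuple:
    """Return (round-trip ms, estimated server-minus-client clock offset ms)."""
    rtt = received_at - client_ts
    # Assumption: the gateway stamped serverTs at the midpoint of the round trip.
    offset = server_ts - (client_ts + rtt / 2)
    return rtt, offset
```

For example, a ping sent at client time 1000 and answered with serverTs 1530, received at client time 1060, gives a 60 ms round trip and an estimated offset of 500 ms.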
Event Classification
Server events fall into two persistence categories, which determine whether they survive a gateway restart and remain available for replay:
Persistent Events
Stored in the session’s SQLite database. Available for replay via get_events and join_session with afterSeq. Examples: turn_started, turn_complete, tool_call, tool_result, question_requested, session_state, sandbox_ready, sandbox_removed.
Ephemeral Events
Broadcast to currently-connected subscribers only. Not stored. Permanently lost if no client is listening. Examples: text_delta, thinking.progress, terminal.stream, heartbeat, usage_update.
The persistent events (turn_started, turn_complete) capture the final, authoritative state. The ephemeral stream is the live performance; the persistent log is the published recording.
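A client that caches events locally can apply the same split. The sets below contain only the example types named in this section, not the complete classification, and the helper name `replayable` is hypothetical.

```python
# Example types listed on this page; the full classification is larger.
PERSISTENT = frozenset({
    "turn_started", "turn_complete", "tool_call", "tool_result",
    "question_requested", "session_state", "sandbox_ready", "sandbox_removed",
})
EPHEMERAL = frozenset({
    "text_delta", "thinking.progress", "terminal.stream",
    "heartbeat", "usage_update",
})


def replayable(events: list) -> list:
    """Keep only the events a later replay could be expected to return."""
    return [e for e in events if e["type"] in PERSISTENT]
```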
Heartbeat
The gateway emits a heartbeat event every 30 seconds to all session topics with active subscribers.
The interval is communicated in the connected event via heartbeatIntervalMs, so clients need not hardcode the value.
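A client can turn heartbeatIntervalMs into a liveness check. In this sketch the class name and the `grace` multiplier (two missed intervals) are assumptions, not protocol values; timestamps are passed in explicitly so the logic does not depend on a particular clock.

```python
class HeartbeatWatchdog:
    """Flags a connection as stale when heartbeats stop arriving."""

    def __init__(self, interval_ms: int, now_ms: int, grace: float = 2.0) -> None:
        # grace is an assumed tolerance, expressed in heartbeat intervals
        self.timeout_ms = interval_ms * grace
        self.last_seen_ms = now_ms

    def beat(self, now_ms: int) -> None:
        """Record that a heartbeat (or any other event) just arrived."""
        self.last_seen_ms = now_ms

    def expired(self, now_ms: int) -> bool:
        """True once more than timeout_ms has passed since the last beat."""
        return now_ms - self.last_seen_ms > self.timeout_ms
```

The watchdog would be constructed with the value from the connected event and polled on a timer; an expired watchdog is one of the two disconnection signals a client can act on.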
Reconnection
The protocol is engineered for graceful recovery from disconnections. The reconnection procedure is deterministic and lossless for persistent events:
Detect Disconnection
Either the WebSocket close event fires, or the heartbeat timeout expires. SDKs with autoReconnect enabled handle detection automatically.
Re-establish Connection
Open a new WebSocket and complete the welcome / connected / authenticated sequence as normal, authenticating if required. The connection lifecycle is invariant: reconnection is indistinguishable from first connection.
Rejoin with afterSeq
Send join_session with the afterSeq field set to the last seq received before disconnection. The gateway replays all persistent events after that sequence number.
Handle Gap Events
If the gateway detects that some events between the client’s afterSeq and the current head were ephemeral (not persisted), it sends a gap event indicating the range of missing sequence numbers. The client must tolerate these gaps gracefully.
Rate Limiting
The gateway enforces rate limits at multiple layers to protect against abuse and resource exhaustion:
- Per-connection message rate: 60 messages per 10-second sliding window.
Message Reference
The gateway accepts 21 distinct client-to-server message types, organized into six functional categories. Every inbound message is validated against an Effect Schema union; anything that does not match is rejected with INVALID_MESSAGE.
| Category | Messages | Count |
|---|---|---|
| Authentication | authenticate | 1 |
| Session Management | list_sessions, create_session, rename_session, archive_session, unarchive_session, delete_session | 6 |
| Session Interaction | join_session, leave_session, run_turn, stop_turn, steer, answer_question | 6 |
| Data Retrieval | get_history, get_events, ping | 3 |
| File Access | list_files, read_file, file_history, file_at_iteration | 4 |
| Administration | manage_members | 1 |
Authentication
authenticate
The threshold crossing. This must be the first message sent when the welcome event indicates requiresAuth: true. The gateway validates the token — a JWT verified against Auth0 JWKS in production, or any non-empty string in dev mode — and responds with either authenticated (the doors open) or error with code AUTH_FAILED (they do not).
- type: Must be "authenticate".
- A valid JWT or API key. In production, this is an Auth0-issued access token with the appropriate audience and scope claims. In dev mode, any non-empty string is accepted, or authentication is bypassed entirely.
Response: authenticated on success; error (code AUTH_FAILED or AUTH_RATE_LIMITED) on failure.
Session Management
list_sessions
Queries the tenant’s session registry. Returns the full catalog of sessions belonging to the authenticated user’s tenant, subject to the archive filter.
- type: Must be "list_sessions".
- includeArchived: When true, includes archived sessions in the response. Defaults to false.
Response: session_list containing an array of SessionMeta objects.
create_session
Brings a new session into existence. The session is created in inactive status — a row in the tenant’s SQLite registry, inert until activated. It must be joined before events will flow.
- type: Must be "create_session".
- The type of agent to associate with this session (e.g., "coding-agent", "echo"). Determines which Podium agent template is instantiated when a turn is initiated.
- An optional human-readable name. If omitted, the session’s name field will be null.
- An optional key-value map of arbitrary metadata. Stored by the gateway but not interpreted: opaque cargo for the client’s own bookkeeping.
Response: session_created containing the full SessionMeta of the new session.
rename_session
Amends the display name of an existing session. A cosmetic operation — it mutates metadata without affecting the session’s state, agent, or event history.
- type: Must be "rename_session".
- The UUID of the session to rename.
- The new display name.
Response: session_updated containing the updated SessionMeta.
archive_session
Consigns a session to the archive. Archived sessions are hidden from list_sessions by default (unless includeArchived: true is specified) but retain all their data — conversation history, event log, workspace files. This is soft deletion: reversible, non-destructive, and immediate.
- type: Must be "archive_session".
- The UUID of the session to archive.
Response: session_archived containing the updated SessionMeta with archived: true.
unarchive_session
Recalls a session from dormancy. Restores it to active visibility in the session list without altering its state or data.
- type: Must be "unarchive_session".
- The UUID of the session to restore.
Response: session_unarchived (or session_updated with archived: false).
delete_session
The irrevocable act. Permanently destroys a session and all associated data — conversation history, event log, workspace files. The gateway flushes pending writes, closes all database handles, tears down any active Podium agent connection, stops the agent instance, and removes the session’s data directory from disk. There is no retrieval, no undo, no grace period.
- type: Must be "delete_session".
- The UUID of the session to destroy.
Response: session_deleted confirming the obliteration.
Session Interaction
join_session
Subscribes the client to a session’s event stream and retrieves its current state. The gateway responds with a state_snapshot — a comprehensive portrait of the session including metadata, any in-progress turn, recent conversation history, and sandbox status. If afterSeq is provided, the gateway additionally replays all persistent events after that sequence number, enabling seamless reconnection.
- type: Must be "join_session".
- The UUID of the session to join.
- afterSeq: If provided, the gateway replays all persistent events with seq > afterSeq before switching to real-time streaming. Used for reconnection. Omit for a fresh join.
Response: state_snapshot on success; error with code SessionNotFound if the session does not exist.
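The difference between a fresh join and a rejoin is only the presence of afterSeq. A minimal sketch of building the frame follows; the `sessionId` key and the function name are assumptions for illustration, while `type` and `afterSeq` come from this page.

```python
import json
from typing import Optional


def build_join_session(session_id: str, after_seq: Optional[int] = None) -> str:
    """Serialize a join_session frame; pass after_seq only when reconnecting."""
    message = {"type": "join_session", "sessionId": session_id}  # key name assumed
    if after_seq is not None:
        message["afterSeq"] = after_seq  # replay persistent events with seq > afterSeq
    return json.dumps(message)
```

On reconnection a client would pass the last seq it processed, so the replay begins exactly where the previous connection left off.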
leave_session
Unsubscribes from a session’s event stream. The client will no longer receive events for this session. This is a fire-and-forget message — the gateway processes it silently, with no acknowledgment response.
- type: Must be "leave_session".
- The UUID of the session to leave.
run_turn
The principal act of the protocol: initiating agent execution. The gateway reserves billing credits, persists the user’s message to the conversation history, establishes a Podium agent connection (if not already active), and dispatches the text to the agent. The turn produces a stream of events broadcast to all subscribed clients — turn_started, followed by text_delta, tool_call, thinking.start, terminal.stream, and their kin, concluding with turn_complete or turn_error.
- type: Must be "run_turn".
- The UUID of the session in which to execute the turn.
- The user’s message text: the instruction, question, or directive to send to the agent.
- An optional client-generated turn ID for idempotency and correlation. If omitted, the gateway generates a UUID.
Response sequence: turn_started -> text_delta / tool_call / thinking.start -> turn_complete | turn_error.
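On the receiving side, a client typically folds this stream into a single result. The sketch below is illustrative only: the `text` key on text_delta events is an assumed field name, and the function name is hypothetical.

```python
def collect_turn(events: list) -> dict:
    """Fold one turn's events into its accumulated text and terminal outcome."""
    chunks = []
    outcome = None
    for event in events:
        kind = event["type"]
        if kind == "text_delta":
            chunks.append(event.get("text", ""))  # "text" is an assumed field name
        elif kind in ("turn_complete", "turn_error"):
            outcome = kind
            break  # the terminal event closes the turn
    return {"text": "".join(chunks), "outcome": outcome}
```

An outcome of `None` means the stream ended before a terminal event arrived, for example because the connection dropped mid-turn.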
The gateway checks billing credits before starting the turn. If the tenant has insufficient credits, an INSUFFICIENT_CREDITS error is returned immediately and no agent interaction occurs. A credit reservation is held for the estimated cost of the turn and settled when the turn completes or errors.
stop_turn
Requests immediate termination of the currently running turn. The gateway forwards the stop signal to the Podium agent, transitions the session state to ready, and broadcasts a session_state event with reason user_stopped. The agent may emit a small amount of residual output before fully halting.
- type: Must be "stop_turn".
- The UUID of the session whose turn should be terminated.
Response: a session_state event with reason "user_stopped" is broadcast to the session. A stop_acknowledged event may also be emitted.
steer
Injects a steering message into an active turn. Steering allows the user to redirect the agent’s behavior mid-execution without stopping and restarting — a course correction rather than an about-face. The gateway assigns a steerId and broadcasts a steer_sent event to confirm delivery.
- type: Must be "steer".
- The UUID of the session.
- The steering instruction text, injected into the agent’s context.
Response: a steer_sent event is broadcast to the session, confirming that the steering message was delivered to the agent.
answer_question
Responds to a question_requested event from the agent. During a turn, the agent may pose interactive questions — seeking clarification, requesting a decision, asking for preferences. This message provides the answers and resumes execution, transitioning the session from waiting back to running.
- type: Must be "answer_question".
- The UUID of the session.
- The requestId from the corresponding question_requested event. Correlates the answer to the specific question set.
- A map of question IDs to answer strings. Each key corresponds to a question id from the questions array in the question_requested event.
- When true, indicates the user dismissed the question without answering. The agent receives “Question dismissed” and proceeds with its own judgment. Defaults to false.
Response: the session returns to the running state as the agent resumes processing with the provided answers.
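Since each answer key must match a question id from the originating event, a client can check the map before sending. This is a sketch with assumed names: the keys `sessionId`, `requestId`, `answers`, and `dismissed` in the outgoing frame, and the function name, are illustrative and not confirmed by this page.

```python
import json


def build_answer(session_id, request_id, questions, answers, dismissed=False) -> str:
    """Serialize an answer_question frame after checking the answer keys."""
    known_ids = {q["id"] for q in questions}
    unknown = sorted(set(answers) - known_ids)
    if unknown:
        raise ValueError(f"answers for unknown question ids: {unknown}")
    return json.dumps({
        "type": "answer_question",
        "sessionId": session_id,   # assumed key name
        "requestId": request_id,   # assumed to mirror the event's requestId
        "answers": answers,        # assumed key name
        "dismissed": dismissed,    # assumed key name
    })
```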
Data Retrieval
get_history
Retrieves the conversation transcript for a session. Returns messages (both user and assistant) ordered by creation time, with optional cursor-based pagination via afterSeq.
- type: Must be "get_history".
- The UUID of the session.
- afterSeq: Return only messages after this sequence number. Defaults to 0 (from the beginning).
- Maximum number of messages to return. Defaults to 50.
Response: history containing an array of message items.
get_events
Retrieves the raw event log for a session. Returns persisted events — turn lifecycle markers, tool calls and results, interactive requests — with their sequence numbers, enabling full event replay and audit.
- type: Must be "get_events".
- The UUID of the session.
- Return only events after this sequence number. Defaults to 0.
- Maximum number of events to return. Defaults to 200.
Response: events containing an array of event objects with seq, type, data, and createdAt fields.
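Reading a long event log means paging with the sequence cursor. The generator below is a sketch: `fetch_page` is a hypothetical callback that sends get_events with the given cursor and limit and returns the resulting array of events; how it talks to the gateway is left out.

```python
from typing import Callable, Iterator


def iter_events(fetch_page: Callable[[int, int], list], limit: int = 200) -> Iterator[dict]:
    """Yield every persisted event by paging with the sequence cursor."""
    after_seq = 0  # the documented default: start from the beginning
    while True:
        page = fetch_page(after_seq, limit)
        if not page:
            return
        yield from page
        after_seq = page[-1]["seq"]  # advance the cursor to the last seq seen
        if len(page) < limit:
            return  # a short page means the log is exhausted
```

The same loop shape applies to get_history, which uses the same cursor with a default limit of 50.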
ping
Measures round-trip latency between client and gateway. The gateway echoes back the client’s timestamp alongside its own, providing the two data points necessary for latency calculation and clock skew estimation. This is the protocol’s metronome.
- type: Must be "ping".
- clientTs: The client’s current Unix timestamp in milliseconds.
Response: pong containing both clientTs (echoed) and serverTs (gateway’s timestamp).
File Access
list_files
Enumerates files and directories in the agent’s workspace. Requires an active Podium agent connection for the session — the gateway proxies this request to the agent’s filesystem.
- type: Must be "list_files".
- The UUID of the session.
- The directory path to list. Defaults to the workspace root if omitted.
- Maximum directory traversal depth. Omit for single-level listing.
Response: file_list containing an array of FileEntry objects.
read_file
Reads the contents of a file from the agent’s workspace at its current state.
- type: Must be "read_file".
- The UUID of the session.
- The file path relative to the workspace root.
Response: file_content containing the file’s content, encoding, and size.
file_history
Retrieves the version lineage of a file in the agent’s workspace. Each iteration represents a snapshot of the file at a point in the session’s timeline — an immutable record of how the file evolved.
- type: Must be "file_history".
- The UUID of the session.
- The file path relative to the workspace root.
Response: file_history_result containing an array of IterationMeta objects with iteration number, timestamp, size, and optional content hash.
file_at_iteration
Reads the contents of a file at a specific historical iteration, enabling diff views and rollback examination. This is time travel within the session’s filesystem.
- type: Must be "file_at_iteration".
- The UUID of the session.
- The file path relative to the workspace root.
- The iteration number to retrieve (from file_history results).
Response: file_content containing the file’s content at the specified iteration.
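Once two iterations of a file have been fetched, a diff view can be produced entirely on the client. The sketch below uses Python's standard difflib; the function name and the `path@iteration` labels are illustrative choices, and the two content strings are assumed to have been retrieved already.

```python
import difflib


def diff_iterations(path: str, old: str, new: str, old_iter: int, new_iter: int) -> str:
    """Return a unified diff between two iterations of the same file."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"{path}@{old_iter}",  # label format is an arbitrary choice
        tofile=f"{path}@{new_iter}",
    )
    return "".join(lines)
```

An empty string means the two iterations are identical.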
Administration
manage_members
Administers tenant membership: listing members, assigning roles, and removing members. The available actions are gated by the caller’s role — listing requires member:read, role assignment requires member:write, and removal requires member:delete. This is the RBAC control surface.
- type: Must be "manage_members".
- One of "list", "set_role", or "remove".
- Required for set_role and remove actions. The user ID of the member to modify.
- Required for set_role. The new role to assign. Valid roles: "owner", "admin", "member".
Response:
- "list" returns member_list with an array of MemberRecord objects.
- "set_role" returns member_updated with the user ID and new role.
- "remove" returns member_removed with the user ID.
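The permission gating and field requirements described for manage_members can be summarized as a small validation routine. This is a sketch of the rules as stated on this page, not the gateway's implementation: the keys `action`, `userId`, and `role`, the function name, and the returned reason strings are all assumptions.

```python
from typing import Optional

REQUIRED_PERMISSION = {
    "list": "member:read",
    "set_role": "member:write",
    "remove": "member:delete",
}
VALID_ROLES = {"owner", "admin", "member"}


def check_manage_members(message: dict, permissions: set) -> Optional[str]:
    """Return None if the request is acceptable, else a short reason string."""
    action = message.get("action")  # assumed key name
    if not isinstance(action, str) or action not in REQUIRED_PERMISSION:
        return "unknown action"
    if REQUIRED_PERMISSION[action] not in permissions:
        return "missing permission " + REQUIRED_PERMISSION[action]
    if action in ("set_role", "remove") and not message.get("userId"):
        return "userId is required"
    role = message.get("role")
    if action == "set_role" and not (isinstance(role, str) and role in VALID_ROLES):
        return "invalid role"
    return None
```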