Message Types

The Diminuendo wire protocol defines a structured, bidirectional communication layer between frontend clients and the gateway — a channel designed for real-time streaming of AI agent events (thinking blocks, tool invocations, terminal output, file mutations) while maintaining the strict ordering and persistence guarantees required for reliable session replay. This page documents both the protocol’s transport semantics and the complete catalog of 21 client-to-server message types: the upstream vocabulary through which clients command the gateway.

Transport and Encoding

The protocol operates exclusively over WebSocket connections (RFC 6455). Every frame is a UTF-8-encoded JSON object. Binary frames are not used. Compression (perMessageDeflate) is deliberately disabled to minimize latency on the hot path: text deltas can arrive hundreds of times per second during active turns, and per-message compression and decompression overhead is unacceptable at that cadence.
Client                          Gateway
  |                                |
  |  -- WebSocket upgrade -------> |  GET /ws -> 101 Switching Protocols
  |  <---- welcome --------------- |  {"type":"welcome","protocolVersion":1,"requiresAuth":true}
  |  <---- connected ------------- |  {"type":"connected","clientId":"...","heartbeatIntervalMs":30000,"ts":...}
  |  -- authenticate ------------> |  {"type":"authenticate","token":"ey..."}
  |  <---- authenticated --------- |  {"type":"authenticated","identity":{...}}
  |                                |
  |  (connection is now live)      |
The protocol version is currently 1. The welcome message includes a protocolVersion field that clients must validate. A ProtocolVersionMismatch error is raised if the client and gateway disagree on the version.

Protocol Version

All messages belong to protocol version 1. This version number is transmitted in the initial welcome event and is available as a constant in every SDK:
  SDK         Constant
  TypeScript  PROTOCOL_VERSION (= 1)
  Rust        PROTOCOL_VERSION: u32 (= 1)
  Python      Implicit in wire format
The version will be incremented only when breaking changes to the wire format are introduced. Additive changes — new event types, new optional fields on existing events — do not constitute a version increment.
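As an illustration of the version check described above, the following is a minimal sketch of how a client might validate the welcome frame. The names SUPPORTED_PROTOCOL_VERSION and checkWelcome are hypothetical and are not part of any SDK; only the welcome fields (type, protocolVersion, requiresAuth) come from this page.

```typescript
// Hypothetical local mirror of the SDK constant; the wire value arrives in the welcome event.
const SUPPORTED_PROTOCOL_VERSION = 1;

interface WelcomeFrame {
  type: "welcome";
  protocolVersion: number;
  requiresAuth: boolean;
}

// Parses a raw frame and throws if it is not a welcome frame or the versions disagree.
function checkWelcome(raw: string): WelcomeFrame {
  const frame = JSON.parse(raw) as Partial<WelcomeFrame>;
  if (frame.type !== "welcome" || typeof frame.protocolVersion !== "number") {
    throw new Error("expected a welcome frame");
  }
  if (frame.protocolVersion !== SUPPORTED_PROTOCOL_VERSION) {
    throw new Error(
      `ProtocolVersionMismatch: gateway=${frame.protocolVersion}, client=${SUPPORTED_PROTOCOL_VERSION}`
    );
  }
  return {
    type: "welcome",
    protocolVersion: frame.protocolVersion,
    requiresAuth: frame.requiresAuth === true,
  };
}
```

A client would typically call this on the first frame it receives and close the socket if it throws.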

Connection Lifecycle

Every WebSocket connection progresses through a deterministic sequence of phases. There are no shortcuts, no optional steps, no negotiation — the sequence is invariant.
  1. Connect: The client opens a WebSocket connection to ws(s)://host:port/ws. The gateway validates the Origin header against its allowlist (bypassed in dev mode) and performs a CSRF check for browser-origin connections.
  2. Welcome: The gateway immediately sends a welcome event containing the protocol version and whether authentication is required. A connected event follows with the assigned clientId and the heartbeat interval.
  3. Authenticate: If requiresAuth is true, the client must send an authenticate message with a valid JWT or API key. The gateway verifies the token (via Auth0 JWKS in production) and responds with authenticated containing the user’s identity. In dev mode, authentication is automatic: the gateway assigns a synthetic identity (developer@example.com) and sends authenticated without requiring a token.
  4. Session Interaction: After authentication, the client may send any of the 21 message types: listing sessions, creating sessions, joining sessions, running turns, and so on. Messages sent before authentication (except authenticate itself) are rejected with a NOT_AUTHENTICATED error.
  5. Join Session: To receive streaming events for a session, the client sends join_session. The gateway responds with a state_snapshot (a complete portrait of the session’s current state) and subscribes the client to all future events on that session’s topic.
  6. Event Stream: While joined, the client receives all events broadcast to the session topic: text_delta, tool_call, thinking.progress, terminal.stream, and the full downstream vocabulary. Events carry monotonically increasing sequence numbers for ordering and replay.
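The handshake portion of this lifecycle can be modeled as a small state machine. The sketch below is one possible client-side shape, not the SDK's implementation; the phase names and the stepHandshake function are hypothetical, while the frame types (welcome, connected, authenticate, authenticated) are taken from this page.

```typescript
type HandshakePhase = "awaitingWelcome" | "awaitingConnected" | "awaitingAuthenticated" | "live";

interface HandshakeState {
  phase: HandshakePhase;
  requiresAuth: boolean;
}

interface InboundFrame {
  type: string;
  requiresAuth?: boolean;
}

interface HandshakeResult {
  state: HandshakeState;
  // Frame the client should send next, if any.
  outbound?: { type: "authenticate"; token: string };
}

// Advances the handshake by one inbound frame; throws on an out-of-sequence frame.
function stepHandshake(state: HandshakeState, frame: InboundFrame, token: string): HandshakeResult {
  switch (state.phase) {
    case "awaitingWelcome":
      if (frame.type === "welcome") {
        return { state: { phase: "awaitingConnected", requiresAuth: frame.requiresAuth === true } };
      }
      break;
    case "awaitingConnected":
      if (frame.type === "connected") {
        const next: HandshakeState = { ...state, phase: "awaitingAuthenticated" };
        // In dev mode (requiresAuth false) the gateway sends authenticated on its own.
        return state.requiresAuth
          ? { state: next, outbound: { type: "authenticate", token } }
          : { state: next };
      }
      break;
    case "awaitingAuthenticated":
      if (frame.type === "authenticated") {
        return { state: { ...state, phase: "live" } };
      }
      break;
    case "live":
      return { state };
  }
  throw new Error(`unexpected ${frame.type} frame in phase ${state.phase}`);
}
```

In this sketch an error frame received during the handshake (for example AUTH_FAILED) surfaces as a thrown exception, which the caller can treat as a failed connection attempt.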

Message Format

Every message — both client-to-server and server-to-client — is a JSON object with a type field that serves as the discriminator. This is the single key by which the gateway’s Effect Schema parser dispatches incoming frames against the full union of 21 client message schemas.
{
  "type": "text_delta",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "turnId": "turn-001",
  "text": "Let me analyze ",
  "seq": 42,
  "ts": 1709312400000
}
Messages that do not match any schema are rejected with an INVALID_MESSAGE error. The rejection is immediate and deterministic — there is no fallback parsing, no fuzzy matching, no “best effort” interpretation.
All field names use camelCase on the wire, regardless of the SDK’s native naming convention. The Rust SDK maps snake_case struct fields to camelCase via #[serde(rename)]; the Python SDK converts in its from_dict class methods.
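To make the discriminator idea concrete, here is a minimal client-side sketch that dispatches on the type field and rejects anything that does not match a known shape. It uses plain TypeScript type checks rather than Effect Schema, covers only three event shapes shown on this page, and the names ParsedFrame and parseFrame are hypothetical.

```typescript
type ParsedFrame =
  | { type: "text_delta"; sessionId: string; turnId: string; text: string; seq: number; ts: number }
  | { type: "heartbeat"; ts: number }
  | { type: "error"; code: string; message: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a typed frame, or null when the input matches no known schema.
function parseFrame(raw: string): ParsedFrame | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch (_err) {
    return null; // not valid JSON
  }
  if (!isRecord(value)) return null;
  const frame: Record<string, unknown> = value;
  const ts = frame["ts"];
  switch (frame["type"]) {
    case "text_delta": {
      const sessionId = frame["sessionId"];
      const turnId = frame["turnId"];
      const text = frame["text"];
      const seq = frame["seq"];
      if (
        typeof sessionId === "string" && typeof turnId === "string" &&
        typeof text === "string" && typeof seq === "number" && typeof ts === "number"
      ) {
        return { type: "text_delta", sessionId, turnId, text, seq, ts };
      }
      return null;
    }
    case "heartbeat":
      return typeof ts === "number" ? { type: "heartbeat", ts } : null;
    case "error": {
      const code = frame["code"];
      const message = frame["message"];
      return typeof code === "string" && typeof message === "string"
        ? { type: "error", code, message }
        : null;
    }
    default:
      return null; // unknown discriminator: no fallback parsing
  }
}
```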

Sequence Numbers

Events that belong to a session carry a seq field — a per-session, monotonically increasing integer. Sequence numbers serve three distinct purposes:
  1. Ordering: clients sort events by seq to reconstruct the correct temporal order, for example when replayed and live events are merged after a reconnect. (A single WebSocket connection delivers frames in order, so any reordering comes from buffering or replay rather than from the transport.)
  2. Deduplication: replayed events carry the same seq as the original emission; clients skip events they have already processed
  3. Resumption: when reconnecting, clients pass afterSeq in the join_session message to receive only events they missed
Sequence numbers are scoped to a session. Different sessions maintain independent counters. The first event in a session has seq: 1.
Not all events carry a seq field. Connection-level events (welcome, authenticated, connected, heartbeat, pong, error) are not session-scoped and therefore have no sequence number. Session-level metadata events (session_list, session_created, session_state) also lack sequence numbers — they are not part of the replayable event stream.
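The three uses of seq can be sketched with a small tracker. This is an illustrative client-side helper under the assumption that events are fed to it in ascending order (sort a batch first if it may be unordered); SeqTracker and sortBySeq are hypothetical names.

```typescript
interface SequencedEvent {
  type: string;
  seq?: number; // absent on connection-level and session-metadata events
}

// Ordering: returns a copy of the batch sorted by ascending seq.
function sortBySeq<T extends { seq: number }>(events: readonly T[]): T[] {
  return [...events].sort((a, b) => a.seq - b.seq);
}

class SeqTracker {
  // 0 means "nothing seen yet"; the first event in a session has seq 1.
  private lastSeq = 0;

  // Deduplication: returns false for an event whose seq was already processed.
  accept(event: SequencedEvent): boolean {
    if (event.seq === undefined) return true; // unsequenced events always pass
    if (event.seq <= this.lastSeq) return false;
    this.lastSeq = event.seq;
    return true;
  }

  // Resumption: the value to send as afterSeq when rejoining.
  get resumeFrom(): number {
    return this.lastSeq;
  }
}
```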

Timestamps

Events include a ts field containing the Unix epoch time in milliseconds at which the gateway generated or relayed the event. Timestamps are server-authoritative — clients should not rely on their own wall clock for ordering. The server’s clock is the single source of temporal truth. The ping/pong mechanism exposes both clientTs (echoed from the client’s ping) and serverTs (the gateway’s timestamp at pong emission), enabling clients to compute round-trip latency and approximate clock skew.
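One common way to derive latency and skew from a pong is sketched below. It assumes the network path is roughly symmetric, so the gateway stamped serverTs about halfway through the round trip; that assumption and the estimateClock name are not from the protocol itself.

```typescript
interface PongFrame {
  type: "pong";
  clientTs: number; // echoed from the client's ping
  serverTs: number; // gateway clock at pong emission
}

interface ClockEstimate {
  rttMs: number;  // round-trip time measured on the client clock
  skewMs: number; // positive when the server clock is ahead of the client clock
}

// receivedAtMs is the client's clock reading when the pong arrived.
function estimateClock(pong: PongFrame, receivedAtMs: number): ClockEstimate {
  const rttMs = receivedAtMs - pong.clientTs;
  // Assumption: symmetric path, so the pong was stamped at clientTs + rtt / 2 in client time.
  const skewMs = pong.serverTs - (pong.clientTs + rttMs / 2);
  return { rttMs, skewMs };
}
```

For example, a ping sent at client time 1000 and answered with serverTs 1530, received at client time 1060, gives a 60 ms round trip and an estimated skew of 500 ms.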

Event Classification

Server events fall into two persistence categories, which determine whether they survive a gateway restart and remain available for replay:

Persistent Events

Stored in the session’s SQLite database. Available for replay via get_events and join_session with afterSeq. Examples: turn_started, turn_complete, tool_call, tool_result, question_requested, session_state, sandbox_ready, sandbox_removed.

Ephemeral Events

Broadcast to currently-connected subscribers only. Not stored. Permanently lost if no client is listening. Examples: text_delta, thinking.progress, terminal.stream, heartbeat, usage_update.
The distinction is deliberate and economic. Ephemeral events represent intermediate streaming state — individual text tokens, thinking fragments, terminal bytes — that would be prohibitively expensive to persist at the rate they are produced (hundreds per second during active generation). The persistent events that bookend them (turn_started, turn_complete) capture the final, authoritative state. The ephemeral stream is the live performance; the persistent log is the published recording.
When building a client that supports reconnection, persist the last received seq locally. On reconnect, pass it as afterSeq when re-joining the session. The gateway replays all persistent events after that sequence number, followed by a replay_complete event. Ephemeral events from before the reconnection are permanently lost — the state_snapshot delivered on join provides the accumulated textSoFar to compensate.

Heartbeat

The gateway emits a heartbeat event every 30 seconds to all session topics with active subscribers:
{ "type": "heartbeat", "ts": 1709312430000 }
Clients should treat a silence exceeding 35 seconds (the heartbeat interval plus a 5-second tolerance) as evidence of a stale connection and initiate reconnection. The heartbeat interval is also communicated in the initial connected event via heartbeatIntervalMs, so clients need not hardcode the value.
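A staleness check along these lines might look as follows. This sketch treats any inbound frame as a sign of life (an assumption; a stricter client could count only heartbeat frames), and isConnectionStale is a hypothetical helper name.

```typescript
const HEARTBEAT_TOLERANCE_MS = 5_000;

// heartbeatIntervalMs should come from the connected event rather than being hardcoded.
function isConnectionStale(
  lastFrameAtMs: number,
  nowMs: number,
  heartbeatIntervalMs: number,
  toleranceMs: number = HEARTBEAT_TOLERANCE_MS
): boolean {
  return nowMs - lastFrameAtMs > heartbeatIntervalMs + toleranceMs;
}
```

A client could call this from a periodic timer and start the reconnection procedure when it returns true.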

Reconnection

The protocol is engineered for graceful recovery from disconnections. The reconnection procedure is deterministic and lossless for persistent events:
  1. Detect Disconnection: Either the WebSocket close event fires, or the heartbeat timeout expires. SDKs with autoReconnect enabled handle detection automatically.
  2. Re-establish Connection: Open a new WebSocket, authenticate, and receive the welcome / connected / authenticated sequence as normal. The connection lifecycle is invariant, so reconnection is indistinguishable from first connection.
  3. Rejoin with afterSeq: Send join_session with the afterSeq field set to the last seq received before disconnection. The gateway replays all persistent events after that sequence number.
  4. Handle Gap Events: If the gateway detects that some events between the client’s afterSeq and the current head were ephemeral (not persisted), it sends a gap event indicating the range of missing sequence numbers. The client must tolerate these gaps gracefully.
  5. Replay Complete: After replaying all available events, the gateway sends replay_complete with the lastSeq value. The client transitions from replay mode to real-time event processing.
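The rejoin and replay steps can be sketched as a reducer over inbound frames. The field names fromSeq and toSeq on the gap event are hypothetical, since this page states only that the event indicates a range; lastSeq on replay_complete and afterSeq on join_session are taken from the text. The helper names are illustrative.

```typescript
interface InboundSessionFrame {
  type: string;
  seq?: number;
  lastSeq?: number; // on replay_complete
  fromSeq?: number; // hypothetical gap field
  toSeq?: number;   // hypothetical gap field
}

interface ReplayState {
  mode: "replaying" | "live";
  lastSeq: number;
  gaps: Array<[number, number]>;
}

// Builds the join_session message used when rejoining after a disconnect.
function buildRejoin(sessionId: string, lastSeq: number) {
  return { type: "join_session" as const, sessionId, afterSeq: lastSeq };
}

function reduceReplay(state: ReplayState, frame: InboundSessionFrame): ReplayState {
  if (frame.type === "gap" && frame.fromSeq !== undefined && frame.toSeq !== undefined) {
    // Record the missing range and carry on; the events in it were never persisted.
    return { ...state, gaps: [...state.gaps, [frame.fromSeq, frame.toSeq]] };
  }
  if (frame.type === "replay_complete") {
    const head = frame.lastSeq ?? state.lastSeq;
    return { ...state, mode: "live", lastSeq: Math.max(state.lastSeq, head) };
  }
  if (frame.seq !== undefined && frame.seq > state.lastSeq) {
    return { ...state, lastSeq: frame.seq };
  }
  return state; // duplicate or unsequenced frame
}
```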

Rate Limiting

The gateway enforces rate limits at multiple layers to protect against abuse and resource exhaustion:
  • Per-connection message rate: 60 messages per 10-second sliding window.
{ "type": "error", "code": "RATE_LIMITED", "message": "Too many messages -- slow down" }
  • Authentication rate: Tracked by IP address, with a dedicated rate limiter separate from the per-connection counter.
{ "type": "error", "code": "AUTH_RATE_LIMITED", "message": "Too many auth attempts. Retry after 30s" }
  • Message size: Individual messages are capped at 1 MB. Messages exceeding this threshold are rejected before parsing:
{ "type": "error", "code": "MESSAGE_TOO_LARGE", "message": "Message exceeds maximum allowed size (1MB)" }
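A client can avoid tripping these limits by throttling itself. The sketch below mirrors the documented 60-per-10-seconds window on the client side; it is not the gateway's limiter, the class and function names are hypothetical, and reading "1 MB" as 1,048,576 bytes is an assumption.

```typescript
class SlidingWindowLimiter {
  private sentAt: number[] = [];
  private readonly maxMessages: number;
  private readonly windowMs: number;

  constructor(maxMessages = 60, windowMs = 10_000) {
    this.maxMessages = maxMessages;
    this.windowMs = windowMs;
  }

  // Returns true and records the send if the window still has room.
  tryAcquire(nowMs: number): boolean {
    // Keep only sends that are still inside the sliding window.
    this.sentAt = this.sentAt.filter((t) => nowMs - t < this.windowMs);
    if (this.sentAt.length >= this.maxMessages) return false;
    this.sentAt.push(nowMs);
    return true;
  }
}

// Assumption: the 1 MB cap is interpreted as 1 MiB of UTF-8 bytes.
const MAX_MESSAGE_BYTES = 1024 * 1024;

function fitsSizeLimit(json: string, maxBytes: number = MAX_MESSAGE_BYTES): boolean {
  return new TextEncoder().encode(json).length <= maxBytes;
}
```

Measuring the encoded byte length rather than the string length matters because multi-byte characters make the UTF-8 frame larger than its character count.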

Message Reference

The gateway accepts 21 distinct client-to-server message types, organized into six functional categories. Every inbound message is validated against an Effect Schema union; anything that does not match is rejected with INVALID_MESSAGE.
  Category             Messages                                                                                        Count
  Authentication       authenticate                                                                                    1
  Session Management   list_sessions, create_session, rename_session, archive_session, unarchive_session, delete_session  6
  Session Interaction  join_session, leave_session, run_turn, stop_turn, steer, answer_question                        6
  Data Retrieval       get_history, get_events, ping                                                                   3
  File Access          list_files, read_file, file_history, file_at_iteration                                          4
  Administration       manage_members                                                                                  1
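The catalog above can be captured as a typed constant, which gives client code a compile-time union of the 21 type names. This is an illustrative sketch; the constant, type, and function names are hypothetical, while the message names and groupings are copied from the table.

```typescript
const CLIENT_MESSAGE_CATEGORIES = {
  authentication: ["authenticate"],
  sessionManagement: [
    "list_sessions", "create_session", "rename_session",
    "archive_session", "unarchive_session", "delete_session",
  ],
  sessionInteraction: [
    "join_session", "leave_session", "run_turn", "stop_turn", "steer", "answer_question",
  ],
  dataRetrieval: ["get_history", "get_events", "ping"],
  fileAccess: ["list_files", "read_file", "file_history", "file_at_iteration"],
  administration: ["manage_members"],
} as const;

type MessageCategory = keyof typeof CLIENT_MESSAGE_CATEGORIES;
// Union of all 21 client-to-server type names.
type ClientMessageType = (typeof CLIENT_MESSAGE_CATEGORIES)[MessageCategory][number];

// Looks up the category for a type name, or null if the name is not in the catalog.
function categoryOf(type: string): MessageCategory | null {
  for (const [category, types] of Object.entries(CLIENT_MESSAGE_CATEGORIES)) {
    if ((types as readonly string[]).includes(type)) return category as MessageCategory;
  }
  return null;
}

const totalClientMessageTypes = Object.values(CLIENT_MESSAGE_CATEGORIES).reduce(
  (count, types) => count + types.length,
  0
);
```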

Authentication

authenticate

The threshold crossing. This must be the first message sent when the welcome event indicates requiresAuth: true. The gateway validates the token — a JWT verified against Auth0 JWKS in production, or any non-empty string in dev mode — and responds with either authenticated (the doors open) or error with code AUTH_FAILED (they do not).
  • type (string, required): Must be "authenticate".
  • token (string, required): A valid JWT or API key. In production, this is an Auth0-issued access token with the appropriate audience and scope claims. In dev mode, any non-empty string is accepted, or authentication is bypassed entirely.
Response: authenticated on success; error (code AUTH_FAILED or AUTH_RATE_LIMITED) on failure.
{
  "type": "authenticate",
  "token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJhdXRoMHwxMjM0NTY..."
}
Authentication attempts are rate-limited per IP address. Exceeding the limit returns an AUTH_RATE_LIMITED error with a retryAfterMs hint embedded in the message text. Implement exponential backoff in retry logic.
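One way to implement the suggested backoff is sketched below. It combines exponential backoff with the hint in the error text; the regular expression assumes the "Retry after 30s" phrasing shown in the example error on this page, which may not be a stable format. The function names and default delays are hypothetical.

```typescript
// Extracts a retry hint in milliseconds from text such as "Retry after 30s", or null if absent.
function parseRetryAfterMs(message: string): number | null {
  const match = /retry after (\d+)\s*s/i.exec(message);
  return match ? Number(match[1]) * 1000 : null;
}

// attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
function authRetryDelayMs(
  attempt: number,
  errorMessage: string,
  baseMs = 1_000,
  capMs = 60_000
): number {
  const backoff = Math.min(capMs, baseMs * 2 ** attempt);
  const hinted = parseRetryAfterMs(errorMessage) ?? 0;
  // Never retry sooner than the gateway asked for.
  return Math.max(backoff, hinted);
}
```

Adding random jitter to the returned delay is a common refinement when many clients share an IP address.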

Session Management

list_sessions

Queries the tenant’s session registry. Returns the full catalog of sessions belonging to the authenticated user’s tenant, subject to the archive filter.
  • type (string, required): Must be "list_sessions".
  • includeArchived (boolean, optional): When true, includes archived sessions in the response. Defaults to false.
Response: session_list containing an array of SessionMeta objects.
{
  "type": "list_sessions",
  "includeArchived": true
}

create_session

Brings a new session into existence. The session is created in inactive status — a row in the tenant’s SQLite registry, inert until activated. It must be joined before events will flow.
  • type (string, required): Must be "create_session".
  • agentType (string, required): The type of agent to associate with this session (e.g., "coding-agent", "echo"). Determines which Podium agent template is instantiated when a turn is initiated.
  • name (string, optional): A human-readable name. If omitted, the session’s name field will be null.
  • metadata (object, optional): A key-value map of arbitrary metadata. Stored by the gateway but not interpreted; it is opaque cargo for the client’s own bookkeeping.
Response: session_created containing the full SessionMeta of the new session.
{
  "type": "create_session",
  "agentType": "coding-agent",
  "name": "Refactoring Sprint",
  "metadata": { "project": "diminuendo", "priority": "high" }
}

rename_session

Amends the display name of an existing session. A cosmetic operation — it mutates metadata without affecting the session’s state, agent, or event history.
  • type (string, required): Must be "rename_session".
  • sessionId (string, required): The UUID of the session to rename.
  • name (string, required): The new display name.
Response: session_updated containing the updated SessionMeta.
{
  "type": "rename_session",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "name": "Auth Module Refactor"
}

archive_session

Consigns a session to the archive. Archived sessions are hidden from list_sessions by default (unless includeArchived: true is specified) but retain all their data — conversation history, event log, workspace files. This is soft deletion: reversible, non-destructive, and immediate.
  • type (string, required): Must be "archive_session".
  • sessionId (string, required): The UUID of the session to archive.
Response: session_archived containing the updated SessionMeta with archived: true.
{
  "type": "archive_session",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
}

unarchive_session

Recalls a session from dormancy. Restores it to active visibility in the session list without altering its state or data.
  • type (string, required): Must be "unarchive_session".
  • sessionId (string, required): The UUID of the session to restore.
Response: session_unarchived (or session_updated with archived: false).
{
  "type": "unarchive_session",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
}

delete_session

The irrevocable act. Permanently destroys a session and all associated data — conversation history, event log, workspace files. The gateway flushes pending writes, closes all database handles, tears down any active Podium agent connection, stops the agent instance, and removes the session’s data directory from disk. There is no retrieval, no undo, no grace period.
  • type (string, required): Must be "delete_session".
  • sessionId (string, required): The UUID of the session to destroy.
Response: session_deleted confirming the obliteration.
{
  "type": "delete_session",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
}
Deletion is destructive and immediate. There is no soft-delete path for this operation — use archive_session if reversibility is desired. Once deleted, a session’s UUID may not be reused, and its event history is permanently lost.

Session Interaction

join_session

Subscribes the client to a session’s event stream and retrieves its current state. The gateway responds with a state_snapshot — a comprehensive portrait of the session including metadata, any in-progress turn, recent conversation history, and sandbox status. If afterSeq is provided, the gateway additionally replays all persistent events after that sequence number, enabling seamless reconnection.
  • type (string, required): Must be "join_session".
  • sessionId (string, required): The UUID of the session to join.
  • afterSeq (number, optional): If provided, the gateway replays all persistent events with seq > afterSeq before switching to real-time streaming. Used for reconnection. Omit for a fresh join.
Response: state_snapshot on success; error with code SessionNotFound if the session does not exist.
{
  "type": "join_session",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "afterSeq": 142
}
When reconnecting after a disconnection, always pass the last seq you received as afterSeq. This ensures you receive any events that occurred during the disconnection window, without duplicating events already in your local state.

leave_session

Unsubscribes from a session’s event stream. The client will no longer receive events for this session. This is a fire-and-forget message — the gateway processes it silently, with no acknowledgment response.
  • type (string, required): Must be "leave_session".
  • sessionId (string, required): The UUID of the session to leave.
Response: None. The gateway unsubscribes the client silently.
{
  "type": "leave_session",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
}

run_turn

The principal act of the protocol: initiating agent execution. The gateway reserves billing credits, persists the user’s message to the conversation history, establishes a Podium agent connection (if not already active), and dispatches the text to the agent. The turn produces a stream of events broadcast to all subscribed clients — turn_started, followed by text_delta, tool_call, thinking.start, terminal.stream, and their kin, concluding with turn_complete or turn_error.
  • type (string, required): Must be "run_turn".
  • sessionId (string, required): The UUID of the session in which to execute the turn.
  • text (string, required): The user’s message text: the instruction, question, or directive to send to the agent.
  • clientTurnId (string, optional): A client-generated turn ID for idempotency and correlation. If omitted, the gateway generates a UUID.
Response: No direct response. The turn manifests as an event stream: turn_started -> text_delta / tool_call / thinking.start -> turn_complete | turn_error.
{
  "type": "run_turn",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "text": "Refactor the authentication module to use dependency injection",
  "clientTurnId": "turn-20240301-001"
}
The gateway checks billing credits before starting the turn. If the tenant has insufficient credits, an INSUFFICIENT_CREDITS error is returned immediately and no agent interaction occurs. A credit reservation is held for the estimated cost of the turn and settled when the turn completes or errors.
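Because run_turn has no direct response, a client typically folds the resulting event stream into a view of the turn. The reducer below is a minimal sketch of that idea. It assumes that every turn event carries a turnId and that it equals the clientTurnId sent with run_turn; this page shows turnId only on text_delta, so treat the correlation as an assumption. TurnView and reduceTurn are hypothetical names.

```typescript
interface TurnEvent {
  type: string;
  turnId?: string;
  text?: string;
}

type TurnStatus = "pending" | "running" | "complete" | "error";

interface TurnView {
  turnId: string;
  status: TurnStatus;
  text: string; // accumulated text_delta content
}

// Folds one event into the view; events for other turns are ignored.
function reduceTurn(view: TurnView, event: TurnEvent): TurnView {
  if (event.turnId !== view.turnId) return view;
  switch (event.type) {
    case "turn_started":
      return { ...view, status: "running" };
    case "text_delta":
      return { ...view, text: view.text + (event.text ?? "") };
    case "turn_complete":
      return { ...view, status: "complete" };
    case "turn_error":
      return { ...view, status: "error" };
    default:
      return view; // tool_call, thinking.start, and others are not rendered here
  }
}
```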

stop_turn

Requests immediate termination of the currently running turn. The gateway forwards the stop signal to the Podium agent, transitions the session state to ready, and broadcasts a session_state event with reason user_stopped. The agent may emit a small amount of residual output before fully halting.
  • type (string, required): Must be "stop_turn".
  • sessionId (string, required): The UUID of the session whose turn should be terminated.
Response: No direct response. A session_state event with reason "user_stopped" is broadcast to the session. A stop_acknowledged event may also be emitted.
{
  "type": "stop_turn",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
}

steer

Injects a steering message into an active turn. Steering allows the user to redirect the agent’s behavior mid-execution without stopping and restarting — a course correction rather than an about-face. The gateway assigns a steerId and broadcasts a steer_sent event to confirm delivery.
  • type (string, required): Must be "steer".
  • sessionId (string, required): The UUID of the session.
  • content (string, required): The steering instruction text, injected into the agent’s context.
Response: A steer_sent event is broadcast to the session, confirming that the steering message was delivered to the agent.
{
  "type": "steer",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "content": "Focus on the database layer first, skip the API routes for now"
}

answer_question

Responds to a question_requested event from the agent. During a turn, the agent may pose interactive questions — seeking clarification, requesting a decision, asking for preferences. This message provides the answers and resumes execution, transitioning the session from waiting back to running.
  • type (string, required): Must be "answer_question".
  • sessionId (string, required): The UUID of the session.
  • requestId (string, required): The requestId from the corresponding question_requested event. Correlates the answer to the specific question set.
  • answers (Record<string, string>, required): A map of question IDs to answer strings. Each key corresponds to a question id from the questions array in the question_requested event.
  • dismissed (boolean, optional): When true, indicates the user dismissed the question without answering. The agent receives “Question dismissed” and proceeds with its own judgment. Defaults to false.
Response: No direct response. The session transitions back to running state as the agent resumes processing with the provided answers.
{
  "type": "answer_question",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "requestId": "q-abc123",
  "answers": {
    "migration-strategy": "incremental",
    "backup-first": "yes"
  },
  "dismissed": false
}
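The sketch below shows one way a client might build answer_question from a question_requested event. Several details are assumptions rather than documented behavior: that the event carries sessionId, that every question must be answered unless dismissed, and that an empty answers map is acceptable alongside dismissed: true. The interface and function names are hypothetical.

```typescript
interface QuestionRequested {
  type: "question_requested";
  sessionId: string; // assumed to be present on the event
  requestId: string;
  questions: { id: string }[];
}

interface AnswerQuestion {
  type: "answer_question";
  sessionId: string;
  requestId: string;
  answers: Record<string, string>;
  dismissed: boolean;
}

// Pass null for answers to dismiss the question set without answering.
function buildAnswer(request: QuestionRequested, answers: Record<string, string> | null): AnswerQuestion {
  const base = {
    type: "answer_question" as const,
    sessionId: request.sessionId,
    requestId: request.requestId,
  };
  if (answers === null) {
    return { ...base, answers: {}, dismissed: true };
  }
  // Client-side check (an assumption): refuse to send a partial answer set.
  const missing = request.questions.map((q) => q.id).filter((id) => answers[id] === undefined);
  if (missing.length > 0) {
    throw new Error(`missing answers for: ${missing.join(", ")}`);
  }
  return { ...base, answers, dismissed: false };
}
```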

Data Retrieval

get_history

Retrieves the conversation transcript for a session. Returns messages (both user and assistant) ordered by creation time, with optional cursor-based pagination via afterSeq.
  • type (string, required): Must be "get_history".
  • sessionId (string, required): The UUID of the session.
  • afterSeq (number, optional): Return only messages after this sequence number. Defaults to 0 (from the beginning).
  • limit (number, optional): Maximum number of messages to return. Defaults to 50.
Response: history containing an array of message items.
{
  "type": "get_history",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "afterSeq": 0,
  "limit": 100
}

get_events

Retrieves the raw event log for a session. Returns persisted events — turn lifecycle markers, tool calls and results, interactive requests — with their sequence numbers, enabling full event replay and audit.
  • type (string, required): Must be "get_events".
  • sessionId (string, required): The UUID of the session.
  • afterSeq (number, optional): Return only events after this sequence number. Defaults to 0.
  • limit (number, optional): Maximum number of events to return. Defaults to 200.
Response: events containing an array of event objects with seq, type, data, and createdAt fields.
{
  "type": "get_events",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "afterSeq": 50,
  "limit": 200
}
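Paging through a long event log with afterSeq and limit can be sketched as a pure function that derives the next request from the previous page. It assumes the response exposes its array under an events property and treats a page shorter than limit as the end of the log; both are assumptions, and nextEventsRequest is a hypothetical helper.

```typescript
interface GetEventsRequest {
  type: "get_events";
  sessionId: string;
  afterSeq: number;
  limit: number;
}

interface EventsPage {
  events: { seq: number }[]; // assumed property name for the returned array
}

// Returns the request for the next page, or null when the log is exhausted.
function nextEventsRequest(previous: GetEventsRequest, page: EventsPage): GetEventsRequest | null {
  if (page.events.length === 0 || page.events.length < previous.limit) {
    return null; // empty or short page: nothing further to fetch
  }
  const lastSeq = Math.max(...page.events.map((e) => e.seq));
  return { ...previous, afterSeq: lastSeq };
}
```

The same cursor pattern would apply to get_history, which takes the same afterSeq and limit fields.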

ping

Measures round-trip latency between client and gateway. The gateway echoes back the client’s timestamp alongside its own, providing the two data points necessary for latency calculation and clock skew estimation. This is the protocol’s metronome.
  • type (string, required): Must be "ping".
  • ts (number, required): The client’s current Unix timestamp in milliseconds.
Response: pong containing both clientTs (echoed) and serverTs (gateway’s timestamp).
{
  "type": "ping",
  "ts": 1709312400000
}
The TypeScript and Python SDKs automatically send periodic pings (default every 30 seconds) to detect stale connections. Round-trip latency is available via client.ping() in both SDKs.

File Access

list_files

Enumerates files and directories in the agent’s workspace. Requires an active Podium agent connection for the session — the gateway proxies this request to the agent’s filesystem.
  • type (string, required): Must be "list_files".
  • sessionId (string, required): The UUID of the session.
  • path (string, optional): The directory path to list. Defaults to the workspace root if omitted.
  • depth (number, optional): Maximum directory traversal depth. Omit for single-level listing.
Response: file_list containing an array of FileEntry objects.
{
  "type": "list_files",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "path": "src/",
  "depth": 2
}

read_file

Reads the contents of a file from the agent’s workspace at its current state.
  • type (string, required): Must be "read_file".
  • sessionId (string, required): The UUID of the session.
  • path (string, required): The file path relative to the workspace root.
Response: file_content containing the file’s content, encoding, and size.
{
  "type": "read_file",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "path": "src/main.ts"
}

file_history

Retrieves the version lineage of a file in the agent’s workspace. Each iteration represents a snapshot of the file at a point in the session’s timeline — an immutable record of how the file evolved.
  • type (string, required): Must be "file_history".
  • sessionId (string, required): The UUID of the session.
  • path (string, required): The file path relative to the workspace root.
Response: file_history_result containing an array of IterationMeta objects with iteration number, timestamp, size, and optional content hash.
{
  "type": "file_history",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "path": "src/auth/AuthService.ts"
}

file_at_iteration

Reads the contents of a file at a specific historical iteration, enabling diff views and rollback examination. This is time travel within the session’s filesystem.
  • type (string, required): Must be "file_at_iteration".
  • sessionId (string, required): The UUID of the session.
  • path (string, required): The file path relative to the workspace root.
  • iteration (number, required): The iteration number to retrieve (from file_history results).
Response: file_content containing the file’s content at the specified iteration.
{
  "type": "file_at_iteration",
  "sessionId": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "path": "src/auth/AuthService.ts",
  "iteration": 3
}
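A diff view needs two versions of a file. The sketch below picks the two most recent iterations from a file_history result and builds the corresponding file_at_iteration requests. The IterationMeta property names (iteration, timestamp, size, hash) are guesses based on the description on this page, and latestPairRequests is a hypothetical helper.

```typescript
interface IterationMeta {
  iteration: number;
  timestamp: number;
  size: number;
  hash?: string; // optional content hash
}

interface FileAtIterationRequest {
  type: "file_at_iteration";
  sessionId: string;
  path: string;
  iteration: number;
}

// Returns [older, newer] requests for the two latest iterations, or null if fewer than two exist.
function latestPairRequests(
  sessionId: string,
  path: string,
  iterations: readonly IterationMeta[]
): [FileAtIterationRequest, FileAtIterationRequest] | null {
  const sorted = [...iterations].sort((a, b) => a.iteration - b.iteration);
  const older = sorted[sorted.length - 2];
  const newer = sorted[sorted.length - 1];
  if (older === undefined || newer === undefined) return null;
  const build = (iteration: number): FileAtIterationRequest => ({
    type: "file_at_iteration",
    sessionId,
    path,
    iteration,
  });
  return [build(older.iteration), build(newer.iteration)];
}
```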

Administration

manage_members

Administers tenant membership: listing members, assigning roles, and removing members. The available actions are gated by the caller’s role — listing requires member:read, role assignment requires member:write, and removal requires member:delete. This is the RBAC control surface.
  • type (string, required): Must be "manage_members".
  • action (string, required): One of "list", "set_role", or "remove".
  • userId (string, conditional): Required for set_role and remove actions. The user ID of the member to modify.
  • role (string, conditional): Required for set_role. The new role to assign. Valid roles: "owner", "admin", "member".
Response: Depends on the action:
  • "list" returns member_list with an array of MemberRecord objects.
  • "set_role" returns member_updated with the user ID and new role.
  • "remove" returns member_removed with the user ID.
{
  "type": "manage_members",
  "action": "list"
}
The gateway protects the last owner of a tenant from demotion or removal. Attempting to change the role of the sole owner returns a LAST_OWNER_PROTECTED error. Promote another member to owner first — the invariant is that every tenant must have at least one owner.
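A client can anticipate the LAST_OWNER_PROTECTED rejection before sending a request. The following sketch encodes the per-action field requirements as a discriminated union and checks the last-owner invariant against a member list. The MemberRecord property names (userId, role) are assumptions, the function name is hypothetical, and the gateway remains the authority on the actual rule.

```typescript
type MemberRole = "owner" | "admin" | "member";

// userId is required for set_role and remove; role is required for set_role only.
type ManageMembersRequest =
  | { type: "manage_members"; action: "list" }
  | { type: "manage_members"; action: "set_role"; userId: string; role: MemberRole }
  | { type: "manage_members"; action: "remove"; userId: string };

interface MemberRecord {
  userId: string;
  role: MemberRole;
}

// Predicts whether the request would demote or remove the tenant's only owner.
function wouldViolateLastOwner(members: readonly MemberRecord[], request: ManageMembersRequest): boolean {
  if (request.action === "list") return false;
  const targetId = request.userId;
  const target = members.find((m) => m.userId === targetId);
  if (target === undefined || target.role !== "owner") return false;
  const ownerCount = members.filter((m) => m.role === "owner").length;
  if (ownerCount > 1) return false; // another owner remains
  return request.action === "remove" || request.role !== "owner";
}
```

When the check returns true, the client can prompt the user to promote another member to owner first.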