Session State Machine
A session without a formal lifecycle model is a session governed by accident. Simple status strings such as "idle" and "running" offer no algebraic guarantees: there is no compile-time proof that a crashed session will not silently resume, no contract preventing a client from displaying a state the server has already abandoned, no deterministic path from failure back to operational readiness. When an agent’s upstream connection collapses mid-turn, a naive status field offers no vocabulary for describing what happened, let alone for prescribing what must happen next.
Diminuendo addresses this with a formal finite-state machine: seven named states, an explicit transition guard map encoded as a Record<SessionState, ReadonlySet<SessionState>>, and a family of pure functions that compute the next state from the current state and an incoming agent signal. The model was ported from the Crescendo desktop client’s connection-state.ts and elevated to server-side enforcement — a migration from local observation to authoritative arbitration.
The Seven States
inactive
The ground state. No Podium connection exists. The session is metadata alone: a row in the tenant’s SQLite registry, inert and weightless. This is the resting state for sessions that have been created but never activated, sessions whose agents have been gracefully torn down, and sessions recovered from stale state after a gateway restart. It is both genesis and terminus: the state from which every new session is first activated (only error can also lead to activating, as a recovery path), and the state to which all paths ultimately return.
activating
The gateway is materializing an agent. A Podium instance is being created, a WebSocket connection is being established, and the system is in a liminal state between potential and readiness. This is inherently transient: it resolves to ready on success, error on failure, or back to inactive if the activation is cancelled before the handshake completes. Like Schrödinger’s connection, it exists in superposition until the upstream settles.
ready
The Podium connection is established and the agent stands idle, awaiting instruction. This is the quiescent state of a live session: the system has committed resources (a WebSocket, a Podium instance, a credit reservation pipeline) and is prepared to execute. From here, the session may begin processing a turn (running), be torn down (deactivating), return to inactive, or encounter failure (error). It is the fulcrum on which all productive work pivots.
running
The agent is actively processing a turn: streaming text fragments, invoking tools, executing shell commands, performing multi-step reasoning within its context window. This is the state of maximum activity and maximum exposure: tokens are being consumed, credits are being drawn, and the Podium connection is carrying live traffic. The state persists until the turn completes (ready), the agent requests human intervention (waiting), a failure occurs (error), or a tear-down is initiated (deactivating).
waiting
The agent has yielded control to the user. A question_requested or permission_requested event has been emitted, and execution is suspended until the human responds. The session holds its breath: the Podium connection remains open and the credit reservation remains active, but no tokens are flowing. The wait resolves when the user answers (transitioning back to running), when a failure occurs (error), or when the session is torn down (deactivating).
deactivating
Dissolution is underway. The Podium instance is being stopped, the WebSocket connection is being closed, and the session is transitioning from a live entity back to inert metadata. This resolves to inactive on success, or to error if the tear-down itself encounters a failure, a rare but possible condition when the upstream refuses to release cleanly.
error
An unrecoverable failure has occurred. The Podium connection may be in an unknown state; the agent may be unreachable; the credit reservation may be orphaned. The critical invariant is that there is no direct path from error back to ready or running. Recovery always requires passing through inactive (a full reset) or activating (a fresh connection attempt). This prevents the gateway from silently resuming a session whose underlying substrate may be corrupted.
Transition Guard Map
The VALID_TRANSITIONS constant encodes the complete set of legal state transitions as a Record<SessionState, ReadonlySet<SessionState>>. It is the single source of truth for what the state machine will accept. Any transition not present in this map is rejected: silently from the perspective of the upstream agent, but loudly in the gateway’s structured logs.
| From | Allowed Targets |
|---|---|
| inactive | activating |
| activating | ready, error, inactive |
| ready | running, deactivating, inactive, error |
| running | ready, waiting, error, deactivating |
| waiting | running, error, deactivating |
| deactivating | inactive, error |
| error | inactive, activating |
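The table above can be transcribed directly into the Record shape the text describes. The following is a minimal sketch, not the gateway's actual source: the SessionState union is inferred from the seven state names, and canTransition is a hypothetical helper added only for illustration.

```typescript
// The seven states named in this document.
type SessionState =
  | "inactive"
  | "activating"
  | "ready"
  | "running"
  | "waiting"
  | "deactivating"
  | "error";

// Guard map transcribed row by row from the table above.
const VALID_TRANSITIONS: Record<SessionState, ReadonlySet<SessionState>> = {
  inactive: new Set<SessionState>(["activating"]),
  activating: new Set<SessionState>(["ready", "error", "inactive"]),
  ready: new Set<SessionState>(["running", "deactivating", "inactive", "error"]),
  running: new Set<SessionState>(["ready", "waiting", "error", "deactivating"]),
  waiting: new Set<SessionState>(["running", "error", "deactivating"]),
  deactivating: new Set<SessionState>(["inactive", "error"]),
  error: new Set<SessionState>(["inactive", "activating"]),
};

// Hypothetical helper (not named in the source): true only when the
// transition from `from` to `to` is listed in the guard map.
function canTransition(from: SessionState, to: SessionState): boolean {
  return VALID_TRANSITIONS[from].has(to);
}
```

Because the key type is the full SessionState union, omitting a state from the object literal is a compile-time error, which is one way the map stays complete as states are added. With this map, canTransition("error", "ready") is false, matching the invariant that error never resumes directly.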
Agent Status Mapping
The Podium agent speaks a different vocabulary than the session state machine. Where the state machine traffics in seven states, the agent reports ten distinct status values. The applySessionTransition function bridges these two worlds: a pure function that accepts the current session state and an agent-reported status, and returns either the next valid state or null if the transition would violate the guard map.
| Agent Status | Target State | Notes |
|---|---|---|
| created | activating | Podium instance materialized, WebSocket connecting |
| connected | ready | Handshake complete, agent idle |
| turn_started | running | Agent has begun processing |
| turn_complete | ready | Turn concluded successfully |
| turn_error | ready or error | Context-dependent: ready if currently running or waiting (recoverable); error otherwise (unrecoverable) |
| question_requested | waiting | Agent yields control, awaiting human input |
| approval_resolved | running | User responded to interactive prompt |
| terminating | deactivating | Graceful shutdown initiated |
| terminated | inactive | Shutdown complete, resources released |
| error | error | Unrecoverable upstream failure |
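The mapping in the table can be sketched as a pure function over the two vocabularies. This is an illustration under assumptions rather than the real applySessionTransition: the AgentStatus union is inferred from the ten table rows, and targetFor is a hypothetical helper. One caveat is worth noting: the guard map as tabulated does not list waiting to ready, so this sketch returns null for a turn_error received while waiting. The actual implementation may resolve that case differently.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error";

// The ten agent-reported statuses listed in the table above.
type AgentStatus =
  | "created" | "connected" | "turn_started" | "turn_complete" | "turn_error"
  | "question_requested" | "approval_resolved" | "terminating" | "terminated" | "error";

const VALID_TRANSITIONS: Record<SessionState, ReadonlySet<SessionState>> = {
  inactive: new Set<SessionState>(["activating"]),
  activating: new Set<SessionState>(["ready", "error", "inactive"]),
  ready: new Set<SessionState>(["running", "deactivating", "inactive", "error"]),
  running: new Set<SessionState>(["ready", "waiting", "error", "deactivating"]),
  waiting: new Set<SessionState>(["running", "error", "deactivating"]),
  deactivating: new Set<SessionState>(["inactive", "error"]),
  error: new Set<SessionState>(["inactive", "activating"]),
};

// Hypothetical helper: the state an agent status asks for, before any guard check.
function targetFor(current: SessionState, status: AgentStatus): SessionState {
  switch (status) {
    case "created": return "activating";
    case "connected": return "ready";
    case "turn_started": return "running";
    case "turn_complete": return "ready";
    case "turn_error":
      // Context-dependent: a failed turn is recoverable only mid-turn.
      return current === "running" || current === "waiting" ? "ready" : "error";
    case "question_requested": return "waiting";
    case "approval_resolved": return "running";
    case "terminating": return "deactivating";
    case "terminated": return "inactive";
    case "error": return "error";
  }
}

// Pure: returns the next state, or null when the guard map forbids the move.
function applySessionTransition(
  current: SessionState,
  status: AgentStatus,
): SessionState | null {
  const target = targetFor(current, status);
  return VALID_TRANSITIONS[current].has(target) ? target : null;
}
```

Splitting the status lookup from the guard check keeps both halves trivially testable: the switch answers "what is the agent asking for", and the map answers "is that allowed from here".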
The turn_error status exhibits context-dependent polymorphism, a deliberate design choice. When a turn fails while the session is running or waiting, the failure is scoped to the turn itself: the Podium connection is still viable, and the session can return to ready to accept another turn. In any other state, a turn error signals a deeper structural problem, and the session must transition to error for full recovery.
Enforcement: transitionSessionState
The transitionSessionState helper in MessageRouterLive.ts is the chokepoint through which all state transitions must pass. It is the enforcer: the function that stands between the agent’s reported status and the gateway’s authoritative state. It validates the proposed transition against the guard map, updates the session’s ConnectionState ref, persists the new status to the tenant’s SQLite registry, and broadcasts the change to all subscribers.
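The four steps named above (validate, update, persist, broadcast) can be sketched as follows. This is a simplified, synchronous illustration and not the code in MessageRouterLive.ts: the SessionRuntime interface and its callbacks are hypothetical stand-ins for the Effect refs, the SQLite write, and the pub/sub publish that the real helper uses.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error";

const VALID_TRANSITIONS: Record<SessionState, ReadonlySet<SessionState>> = {
  inactive: new Set<SessionState>(["activating"]),
  activating: new Set<SessionState>(["ready", "error", "inactive"]),
  ready: new Set<SessionState>(["running", "deactivating", "inactive", "error"]),
  running: new Set<SessionState>(["ready", "waiting", "error", "deactivating"]),
  waiting: new Set<SessionState>(["running", "error", "deactivating"]),
  deactivating: new Set<SessionState>(["inactive", "error"]),
  error: new Set<SessionState>(["inactive", "activating"]),
};

// Hypothetical collaborators, injected so the sketch runs without a
// database or a socket server.
interface SessionRuntime {
  sessionState: { current: SessionState };
  persist: (sessionId: string, state: SessionState) => void;
  broadcast: (sessionId: string, state: SessionState) => void;
  logRejected: (entry: { sessionId: string; from: SessionState; to: SessionState }) => void;
}

// Returns true when the transition was applied, false when it was rejected.
function transitionSessionState(
  sessionId: string,
  runtime: SessionRuntime,
  next: SessionState,
): boolean {
  const from = runtime.sessionState.current;
  // 1. Validate against the guard map; a rejected move is logged and dropped.
  if (!VALID_TRANSITIONS[from].has(next)) {
    runtime.logRejected({ sessionId, from, to: next });
    return false;
  }
  // 2. Update the in-memory state.
  runtime.sessionState.current = next;
  // 3. Persist the new status, then 4. notify subscribers.
  runtime.persist(sessionId, next);
  runtime.broadcast(sessionId, next);
  return true;
}
```

Persisting before broadcasting is a choice made in this sketch so that a subscriber never hears about a state that was not written; the source lists the steps in that order but does not say whether the ordering is load-bearing.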
ConnectionState: Per-Connection Typed Refs
Each active session materializes a ConnectionState, a struct of Effect Ref values that collectively track the full in-flight state of a live session. Where less structured architectures accumulate scattered mutable variables across handler closures, ConnectionState consolidates everything into a single, typed, Ref-backed structure. Every field is an atomic Ref, enabling concurrent reads and writes without locks.
resetTurnState
At the boundary between turns, resetTurnState performs a surgical clearing of all turn-scoped refs. This is the function that enforces the invariant that no state leaks between turns. The previous turn’s accumulated text, pending tool calls, thinking content, billing reservation, and interactive state are all zeroed out, returning the ConnectionState to a clean slate while preserving session-level refs like sessionState and agentMode.
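The split between turn-scoped and session-scoped fields can be sketched as below. Everything here is illustrative: a tiny Cell class stands in for Effect's Ref so the example runs without dependencies, and apart from sessionState and agentMode, which the text names, the field names and types are hypothetical.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error";

// Minimal mutable cell; a dependency-free stand-in for an Effect Ref.
class Cell<A> {
  private value: A;
  constructor(initial: A) {
    this.value = initial;
  }
  get(): A {
    return this.value;
  }
  set(next: A): void {
    this.value = next;
  }
}

interface ConnectionState {
  // Session-scoped: survives turn boundaries.
  sessionState: Cell<SessionState>;
  agentMode: Cell<string>;
  // Turn-scoped (hypothetical names): cleared by resetTurnState.
  accumulatedText: Cell<string>;
  pendingToolCalls: Cell<ReadonlyArray<string>>;
  thinkingContent: Cell<string>;
  reservationId: Cell<string | null>;
  pendingPromptId: Cell<string | null>;
}

// Clears every turn-scoped cell and leaves session-scoped cells untouched.
function resetTurnState(cs: ConnectionState): void {
  cs.accumulatedText.set("");
  cs.pendingToolCalls.set([]);
  cs.thinkingContent.set("");
  cs.reservationId.set(null);
  cs.pendingPromptId.set(null);
}
```

Keeping the reset in one function means a newly added turn-scoped field has exactly one place where it must be cleared, which is what makes the "no state leaks between turns" invariant practical to maintain.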
Stale Session Recovery
A gateway restart is an extinction event for in-memory state. No Podium connections survive the process boundary; no ConnectionState refs persist beyond the runtime that created them. Any session that was in a state other than inactive at the moment of shutdown is stale by definition: its metadata claims a status that its infrastructure can no longer support.
The reconcileStaleSessions function runs on startup for each tenant, querying all sessions whose status is not "inactive" and resetting them to the ground state. For sessions that were running or waiting at crash time, it additionally attempts to recover the last full_content_snapshot from SQLite, so that reconnecting clients can display partial output rather than a blank slate.
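The startup pass described above might look roughly like this. It is a sketch over in-memory rows rather than SQLite: SessionRow, Recovered, and the lastSnapshot lookup are hypothetical stand-ins for the registry query and the full_content_snapshot read.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error";

// Hypothetical row shape for a tenant's session registry.
interface SessionRow {
  id: string;
  status: SessionState;
}

// Hypothetical result describing what was reset and what could be recovered.
interface Recovered {
  sessionId: string;
  previousStatus: SessionState;
  snapshot: string | null;
}

function reconcileStaleSessions(
  sessions: SessionRow[],
  lastSnapshot: (sessionId: string) => string | null,
): Recovered[] {
  const recovered: Recovered[] = [];
  for (const row of sessions) {
    // Sessions already in the ground state need no repair.
    if (row.status === "inactive") continue;
    // Only sessions interrupted mid-turn can have partial output worth showing.
    const wasMidTurn = row.status === "running" || row.status === "waiting";
    recovered.push({
      sessionId: row.id,
      previousStatus: row.status,
      snapshot: wasMidTurn ? lastSnapshot(row.id) : null,
    });
    // Written directly, not via the guard map: this is a repair of stale
    // metadata, not a live transition (running to inactive is not listed).
    row.status = "inactive";
  }
  return recovered;
}
```

Returning the previous status alongside the snapshot lets the caller decide how to present the recovery, for example by showing partial output only for sessions that were mid-turn.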
Legacy State Migration
The current seven-state model superseded an earlier four-state model (idle, running, awaiting_question, error). Sessions persisted under the old schema require translation when read. The migrateLegacyStatus function provides this bridge — a pure mapping from the old vocabulary to the new.
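A mapping of this kind could be as small as the sketch below. The source names the four legacy values but does not state their targets, so both conversions here are assumptions: "idle" is mapped to inactive and "awaiting_question" to waiting. The values "running" and "error" exist in both vocabularies and pass through unchanged.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error";

// The four values of the earlier model, as listed in the text.
type LegacyStatus = "idle" | "running" | "awaiting_question" | "error";

// Pure, read-side translation; nothing here ever produces a legacy value.
function migrateLegacyStatus(status: SessionState | LegacyStatus): SessionState {
  switch (status) {
    case "idle":
      return "inactive"; // ASSUMPTION: target not stated in the source
    case "awaiting_question":
      return "waiting"; // ASSUMPTION: target not stated in the source
    default:
      return status; // already a valid seven-state value
  }
}
```

Because the return type is SessionState, the compiler rejects any code path that would let a legacy string escape back into the new model.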
"idle" and "awaiting_question" are recognized and silently converted — though new code never emits them. This is a one-way valve: the old world can be read, but the new world is the only world that can be written.
Comparison with Crescendo
The state machine was ported from the Crescendo desktop client’s connection management layer and substantially rearchitected for server-side enforcement. The differences are not incremental improvements but categorical changes in where authority resides.
Server-Side Enforcement
In Crescendo, the state machine runs client-side within a Tauri process. Invalid transitions are visible only in local logs — the server has no opinion on session state. In Diminuendo, the state machine is enforced at the gateway: all clients observe the same authoritative state, and invalid transitions are rejected before they can propagate. The server is the single source of truth; clients are projections.
Persistent State
Crescendo holds session state in memory alone — a process restart erases all knowledge of what sessions were doing. Diminuendo persists the current state to SQLite on every transition, enabling stale session recovery after restarts, consistent state across reconnections, and an audit trail of state changes that survives the process boundary.
Multi-Client Broadcast
Crescendo manages a single user’s view of a single session. Diminuendo broadcasts state transitions to all subscribers of a session via Bun pub/sub — dashboards, CLIs, web clients, and desktop applications all observe transitions in real time. The state machine is a shared oracle, not a local notebook.
Billing Integration
Diminuendo’s state transitions are coupled with the billing subsystem. A credit reservation is created when entering running and settled when transitioning to ready (on turn completion) or error (on failure). The state machine does not merely describe what the session is doing; it governs what the session is allowed to cost. Crescendo has no billing integration.
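One way to picture the coupling is a hook that runs on every accepted transition. This is a hedged sketch and not Diminuendo's billing code: BillingHooks, its two callbacks, and applyBillingForTransition are hypothetical names. It assumes a reservation is kept across waiting, as the description of that state says, and it leaves a held reservation untouched on deactivating because the source does not say what happens there.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error";

// Hypothetical billing interface, injected so the sketch stays self-contained.
interface BillingHooks {
  reserve: (sessionId: string) => string; // returns a reservation id
  settle: (reservationId: string, endedIn: "ready" | "error") => void;
}

// Given the state just entered and the reservation currently held (if any),
// performs the billing side effect and returns the reservation to keep.
function applyBillingForTransition(
  sessionId: string,
  entered: SessionState,
  reservation: string | null,
  billing: BillingHooks,
): string | null {
  if (entered === "running") {
    // Returning from waiting keeps the existing reservation; only a fresh
    // turn creates a new one.
    return reservation ?? billing.reserve(sessionId);
  }
  if (reservation !== null && (entered === "ready" || entered === "error")) {
    billing.settle(reservation, entered);
    return null;
  }
  // waiting, deactivating, and the rest: nothing to do in this sketch.
  return reservation;
}
```

Threading the reservation through as a value, rather than hiding it in the hook, makes it easy to see that each reservation is settled at most once.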