Error Handling
Every protocol must reckon with failure. The question is not whether errors will occur (they will, with the certainty of entropy) but how they are classified, communicated, and recovered from. Diminuendo employs a layered error handling strategy that draws clear boundaries between the concerns of transport, domain, and presentation. Transport-level errors (malformed JSON, oversized messages, rate limiting) are intercepted at the WebSocket handler before any business logic executes: rejected at the gate, not in the throne room. Domain-level errors (authentication failures, session not found, insufficient credits) are raised within the Effect TS runtime as typed TaggedError values and mapped to structured error events. All errors, regardless of origin, pass through a sanitization pipeline before reaching clients: stack traces are stripped, API keys are redacted, and messages are truncated to a safe length. The client sees a clean, structured error; the server logs the raw truth.
Error Event Format
All errors are delivered as a server event with type: "error":
- type: Always "error".
- code: A machine-readable error code. This is the key for programmatic error handling: switch statements, retry logic, UI rendering. It is stable across releases; changes to error codes are breaking changes.
- message: A human-readable description. Safe for display to end users: guaranteed to contain no stack traces, no API keys, no internal file paths, no implementation details. Sanitized by the pipeline described below.
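A client might model and dispatch on this event as in the minimal sketch below. The field names `code` and `message` are inferred from the descriptions above, and the names `GatewayErrorEvent` and `isErrorEvent`, along with the sample payload, are hypothetical.

```typescript
// Shape of a generic error event. `code` and `message` are assumed field names.
interface GatewayErrorEvent {
  type: "error";
  code: string;    // stable, machine-readable
  message: string; // sanitized, safe to display
}

// Narrow an unknown parsed payload to GatewayErrorEvent.
function isErrorEvent(value: unknown): value is GatewayErrorEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return v.type === "error" && typeof v.code === "string" && typeof v.message === "string";
}

// Hypothetical payload, for illustration only.
const sample: unknown = JSON.parse(
  '{"type":"error","code":"SessionNotFound","message":"Session not found"}'
);

if (isErrorEvent(sample)) {
  // Branch on the stable code, never on the message text.
  switch (sample.code) {
    case "SessionNotFound":
      console.log("refresh the session list");
      break;
    default:
      console.log(`unhandled error: ${sample.message}`);
  }
}
```

Dispatching on `code` rather than `message` keeps the client robust against wording changes, since only the code is described as stable across releases.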
Error events are not session-scoped. They carry no sessionId, seq, or ts fields. They are sent directly to the connection that triggered the error and are never broadcast to session subscribers. An error is a private conversation between the gateway and the client that provoked it.
Error Codes
Transport-Level Errors
These errors are raised at the WebSocket handler layer, the outermost membrane of the gateway, before the message reaches any domain logic. They represent failures of form rather than failures of intent.
| Code | Condition | Description |
|---|---|---|
| INVALID_JSON | Message is not valid JSON | The raw WebSocket frame could not be parsed. Typically caused by incomplete messages, binary data sent to a text frame, or encoding errors. |
| INVALID_MESSAGE | JSON does not match any schema | The JSON was parsed successfully but does not conform to any of the 21 client message schemas. Common causes: unknown type field, missing required fields, incorrect field types. |
| MESSAGE_TOO_LARGE | Raw message exceeds 1 MB | The message was rejected before parsing. The gateway enforces a 1 MB size limit on inbound WebSocket frames to prevent resource exhaustion. |
| RATE_LIMITED | Per-connection rate limit exceeded | More than 60 messages in a 10-second sliding window. The client must reduce its transmission rate. |
| AUTH_RATE_LIMITED | Authentication attempts exceeded | Too many authenticate messages from the same IP address in a short period. Includes a retryAfterMs hint in the message text. |
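The transport gate described by this table can be sketched as a size check, a sliding-window limiter, and a parse attempt. This is an illustration under stated assumptions, not the gateway's code: the names `SlidingWindowLimiter` and `checkFrame` are hypothetical, 1 MB is taken as 1,048,576 bytes, the order of the rate-limit check relative to the others is a guess, and rejected messages are assumed not to count toward the window.

```typescript
const MAX_MESSAGE_BYTES = 1024 * 1024; // 1 MB limit (assumed to mean 1,048,576 bytes)
const RATE_LIMIT = 60;                 // messages allowed per window
const RATE_WINDOW_MS = 10000;          // 10-second sliding window

// Per-connection sliding-window limiter: remembers when accepted messages arrived.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if a message arriving at `now` (ms) stays within the limit.
  tryAcquire(now: number): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}

type TransportCode = "MESSAGE_TOO_LARGE" | "RATE_LIMITED" | "INVALID_JSON";

// Returns a transport error code, or null if the frame may proceed to schema validation.
function checkFrame(
  raw: string,
  byteLength: number,
  limiter: SlidingWindowLimiter,
  now: number
): TransportCode | null {
  if (byteLength > MAX_MESSAGE_BYTES) return "MESSAGE_TOO_LARGE"; // rejected before parsing
  if (!limiter.tryAcquire(now)) return "RATE_LIMITED";
  try {
    JSON.parse(raw);
  } catch {
    return "INVALID_JSON";
  }
  return null;
}
```

Schema validation (INVALID_MESSAGE) and per-IP authentication limiting (AUTH_RATE_LIMITED) are omitted; they would follow the same pattern of returning a code before any domain logic runs.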
Authentication & Authorization Errors
| Code | Condition | Description |
|---|---|---|
| NOT_AUTHENTICATED | Message sent before authentication | The client attempted to send a message (other than authenticate) before completing the authentication handshake. |
| AUTH_FAILED | Token validation failed | The JWT or API key was invalid, expired, malformed, or could not be verified against the Auth0 JWKS endpoint. |
| Unauthenticated | Authentication required | A gateway-internal error indicating the operation requires an authenticated identity. Semantically equivalent to NOT_AUTHENTICATED but raised within the Effect runtime rather than the transport layer. |
| Unauthorized | Insufficient permissions | The authenticated user does not possess the required RBAC permission for this operation (e.g., a member attempting manage_members with set_role action). |
Domain Errors
| Code | Condition | Description |
|---|---|---|
| SessionNotFound | Session does not exist | The referenced sessionId does not exist in the tenant’s registry, or the session has been deleted. |
| SessionAlreadyExists | Duplicate session creation | Attempted to create a session with an ID that already exists. Rare, since UUIDs are generated server-side. |
| INSUFFICIENT_CREDITS | Billing check failed | The tenant does not have enough credits to start a turn. No agent interaction occurs; no tokens are consumed. |
| PodiumConnectionError | Agent connection failed | Failed to establish or maintain a WebSocket connection to the Podium agent orchestrator. |
| PodiumTimeout | Agent operation timed out | A Podium operation (instance creation, message dispatch, health check) exceeded its timeout threshold. |
| EnsembleError | LLM inference failure | An error from the Ensemble inference service: model unavailable, provider error, token limit exceeded, or rate limiting from the upstream model provider. |
| SandboxNotConfigured | No sandbox available | The operation requires a sandbox environment, but none is configured or provisioned for this session. |
| DbError | Database operation failed | A SQLite operation failed. Possible causes: disk full, database corruption, concurrent write conflict, or WAL checkpoint failure. |
| ProtocolVersionMismatch | Version mismatch | The client and gateway disagree on the protocol version. The client should update to match the gateway’s advertised version. |
| INTERNAL_ERROR | Unclassified error | A catch-all for unexpected failures that do not map to a known error code. The message is sanitized before delivery. |
Turn-Specific Errors
Turn errors occupy a distinct category because they are session-scoped: they are delivered via the turn_error event type (not the generic error event) and carry seq and ts fields that place them in the session’s event timeline.
| Code | Condition | Description |
|---|---|---|
| AGENT_ERROR | Agent reported an error | The Podium agent encountered an error during execution, such as an LLM API failure, a tool crash, or a context overflow. |
| AGENT_DISCONNECTED | WebSocket dropped mid-turn | The gateway lost its WebSocket connection to the Podium agent while a turn was in progress. The session transitions to error state. |
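The distinction between the two event types lends itself to a discriminated union on the client. The sketch below is illustrative only: the source confirms that turn_error carries seq and ts, while the presence of `sessionId`, `code`, and `message` on that event, the numeric type of `ts`, and all type and function names are assumptions.

```typescript
// Connection-scoped error: no sessionId, seq, or ts.
interface ConnectionErrorEvent {
  type: "error";
  code: string;
  message: string;
}

// Session-scoped turn error. seq and ts come from the text above;
// sessionId, code, message, and the number type for ts are assumptions.
interface TurnErrorEvent {
  type: "turn_error";
  sessionId: string;
  seq: number;
  ts: number;
  code: string;
  message: string;
}

type FailureEvent = ConnectionErrorEvent | TurnErrorEvent;

// Route each failure to the right place: the session timeline or the connection.
function describeFailure(e: FailureEvent): string {
  switch (e.type) {
    case "turn_error":
      return `session ${e.sessionId} @ seq ${e.seq}: ${e.code}`;
    case "error":
      return `connection: ${e.code}`;
  }
}
```

Because turn errors carry seq, a client can insert them into the session's ordered event list, whereas generic errors belong in connection-level UI such as a toast or banner.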
Member Management Errors
| Code | Condition | Description |
|---|---|---|
| INVALID_MEMBER_UPDATE | Missing or invalid fields | The set_role action was missing userId or role, or the role value is not one of the three valid roles (owner, admin, member). |
| LAST_OWNER_PROTECTED | Cannot demote last owner | The operation would leave the tenant with no owners. The invariant is absolute: every tenant must have at least one owner. Promote another member first. |
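The two checks in this table can be expressed as a single validation function. This is a minimal sketch of the stated rules, not the gateway's implementation; `Member`, `SetRoleRequest`, and `validateSetRole` are hypothetical names, and how the gateway treats an unknown userId is not covered by the source.

```typescript
type Role = "owner" | "admin" | "member";
const VALID_ROLES: readonly string[] = ["owner", "admin", "member"];

interface Member {
  userId: string;
  role: Role;
}

// Fields are optional because the client may omit them.
interface SetRoleRequest {
  userId?: string;
  role?: string;
}

type MemberErrorCode = "INVALID_MEMBER_UPDATE" | "LAST_OWNER_PROTECTED";

// Returns an error code, or null if the role change may proceed.
function validateSetRole(members: Member[], req: SetRoleRequest): MemberErrorCode | null {
  // Missing fields or an unknown role value.
  if (!req.userId || !req.role || !VALID_ROLES.includes(req.role)) {
    return "INVALID_MEMBER_UPDATE";
  }
  const target = members.find((m) => m.userId === req.userId);
  const ownerCount = members.filter((m) => m.role === "owner").length;
  // Demoting the only remaining owner would leave the tenant ownerless.
  const demotesOwner = target !== undefined && target.role === "owner" && req.role !== "owner";
  if (demotesOwner && ownerCount === 1) return "LAST_OWNER_PROTECTED";
  return null;
}
```

Promoting a second owner first makes `ownerCount` equal to 2, after which the original owner can be demoted without violating the invariant.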
Error Sanitization
All error messages pass through a sanitization pipeline before reaching clients. This is not optional hardening but a critical security boundary. Internal errors may contain stack traces with file paths, API keys embedded in HTTP headers, database connection strings, or other sensitive artifacts that must never cross the trust boundary to an end user. The sanitization process applies three transformations in strict sequence:
Strip Stack Traces
Any line matching the pattern of a JavaScript/TypeScript stack trace (at Function.name (/path/to/file.ts:42:10)) is removed. This prevents leaking internal file paths, function names, line numbers, and the architectural topology they reveal.
Redact Secrets
The following patterns are replaced with [REDACTED]:
- Anthropic API keys: sk-ant-*
- Generic secret keys: sk-*
- GitHub personal access tokens: ghp_*
- Bearer tokens: Bearer *
- URL token parameters: token=*
Truncate Message
The message is truncated to a safe length.
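The three transformations (strip, redact, truncate) can be sketched as one function. The name `sanitizeErrorMessage` appears in this document, but the body below is an illustration: the regular expressions are approximations of the glob patterns listed above, and the 500-character limit is an assumed value, since the source says only "a safe length".

```typescript
const MAX_MESSAGE_LENGTH = 500; // assumed value; the source does not state the limit

// Matches lines such as "    at Function.name (/path/to/file.ts:42:10)".
const STACK_LINE = /^\s*at\s.+:\d+:\d+\)?\s*$/;

// Approximate regex forms of the listed patterns. The word boundary on "sk-"
// avoids redacting ordinary words such as "task-force".
const SECRET_PATTERNS: RegExp[] = [
  /\bsk-ant-[A-Za-z0-9_-]+/g,       // Anthropic API keys
  /\bsk-[A-Za-z0-9_-]+/g,           // generic secret keys
  /\bghp_[A-Za-z0-9]+/g,            // GitHub personal access tokens
  /\bBearer\s+[A-Za-z0-9._~+\/=-]+/g, // bearer tokens
  /token=[^&\s]+/g,                 // URL token parameters
];

function sanitizeErrorMessage(raw: string): string {
  // 1. Strip stack trace lines.
  let msg = raw
    .split("\n")
    .filter((line) => !STACK_LINE.test(line))
    .join("\n");
  // 2. Redact secrets.
  for (const pattern of SECRET_PATTERNS) {
    msg = msg.replace(pattern, "[REDACTED]");
  }
  // 3. Truncate to the maximum length.
  return msg.length > MAX_MESSAGE_LENGTH ? msg.slice(0, MAX_MESSAGE_LENGTH) : msg;
}
```

Running truncation last means a secret cannot survive by straddling the cut point, which is one plausible reason for the strict ordering.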
Gateway Typed Errors
Internally, the gateway uses Effect’s TaggedError pattern to define typed, structured error classes. Each error carries a _tag field that serves as both the discriminant within the Effect type system and, for most errors, the wire code sent to clients; the Wire Code column of the catalog below shows where the two differ. This provides type-safe error handling within the Effect runtime: errors are tracked in the type signature of every Effect, and the compiler ensures they are handled.
Full Error Type Catalog
| Error Class | Tag | Fields | Wire Code |
|---|---|---|---|
| Unauthenticated | "Unauthenticated" | reason: string | Unauthenticated |
| Unauthorized | "Unauthorized" | tenantId: string, resource: string | Unauthorized |
| SessionNotFound | "SessionNotFound" | sessionId: string | SessionNotFound |
| SessionAlreadyExists | "SessionAlreadyExists" | sessionId: string | SessionAlreadyExists |
| InsufficientCredits | "InsufficientCredits" | tenantId: string, required: number, available: number | INSUFFICIENT_CREDITS |
| PaymentFailed | "PaymentFailed" | reason: string, stripeError?: string | PaymentFailed |
| PodiumConnectionError | "PodiumConnectionError" | message: string, cause?: unknown | PodiumConnectionError |
| PodiumTimeout | "PodiumTimeout" | operation: string, timeoutMs: number | PodiumTimeout |
| EnsembleError | "EnsembleError" | message: string, statusCode?: number | EnsembleError |
| SandboxNotConfigured | "SandboxNotConfigured" | message: string | SandboxNotConfigured |
| DbError | "DbError" | message: string, cause?: unknown | DbError |
| InvalidMessage | "InvalidMessage" | reason: string, raw?: string | InvalidMessage |
| ProtocolVersionMismatch | "ProtocolVersionMismatch" | expected: number, received: number | ProtocolVersionMismatch |
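The relationship between tags and wire codes in this catalog can be sketched as a lookup with a fallback. To stay dependency-free, the sketch uses plain objects with a `_tag` field in place of Effect's TaggedError classes. The function name `toWireError`, the message strings, and the fallback wording are hypothetical; only the tag-to-code pairs and the INTERNAL_ERROR catch-all come from this document.

```typescript
// Minimal stand-in for a tagged error: only the `_tag` discriminant is modelled.
interface Tagged {
  readonly _tag: string;
}

// Tags listed in the catalog above.
const KNOWN_TAGS = new Set<string>([
  "Unauthenticated", "Unauthorized", "SessionNotFound", "SessionAlreadyExists",
  "InsufficientCredits", "PaymentFailed", "PodiumConnectionError", "PodiumTimeout",
  "EnsembleError", "SandboxNotConfigured", "DbError", "InvalidMessage",
  "ProtocolVersionMismatch",
]);

// Tags whose wire code differs from the tag itself (one entry, per the catalog).
const WIRE_CODE_OVERRIDES: Record<string, string> = {
  InsufficientCredits: "INSUFFICIENT_CREDITS",
};

// Hypothetical human-authored messages; the real safe message map is not shown in the source.
const SAFE_MESSAGES: Record<string, string> = {
  SessionNotFound: "Session not found",
  InsufficientCredits: "Insufficient credits to start a turn",
};

interface WireError {
  type: "error";
  code: string;
  message: string;
}

function toWireError(err: unknown): WireError {
  if (typeof err === "object" && err !== null && "_tag" in err) {
    const tag = String((err as Tagged)._tag);
    if (KNOWN_TAGS.has(tag)) {
      const code = WIRE_CODE_OVERRIDES[tag] ?? tag;
      return { type: "error", code, message: SAFE_MESSAGES[tag] ?? "Request failed" };
    }
  }
  // Anything unrecognized becomes the catch-all code with a fixed, safe message.
  return { type: "error", code: "INTERNAL_ERROR", message: "Internal error" };
}
```

Note that the sketch never copies text from the original error into the wire message, so fields such as `cause` or `stripeError` cannot leak even before sanitization runs.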
The gateway maps the _tag field of each TaggedError to the wire code and provides safe, predefined messages for known error types: messages that have been authored by humans, not generated by stack unwinding. Error messages are passed through sanitizeErrorMessage before transmission. The safe message map is the first line of defense; sanitization is the last.
Recovery Strategies
Different error codes demand different recovery postures. Some errors are transient and should be retried; some are permanent and should be surfaced to the user; some are bugs in the client that should be fixed in code. The table below provides guidance for client implementations: a field manual for the discipline of failure.
| Error Code | Strategy | Details |
|---|---|---|
| AUTH_FAILED | Re-authenticate | The token may be expired. Obtain a fresh JWT from your identity provider and send a new authenticate message. |
| AUTH_RATE_LIMITED | Exponential backoff | Parse the retry delay from the error message. Wait at least that duration before attempting authentication again. Do not retry immediately; the rate limiter’s window has not expired. |
| NOT_AUTHENTICATED | Re-authenticate | The connection may have been reset or the authentication state lost. Send an authenticate message before retrying the original operation. |
| Unauthorized | Surface to user | The user lacks the required permission. Display an appropriate message and do not retry; retrying an authorization failure is futile without a role change. |
| SessionNotFound | Refresh session list | The session may have been deleted by another client or a concurrent operation. Call list_sessions to refresh the UI and remove stale references. |
| INSUFFICIENT_CREDITS | Surface to user | The tenant has exhausted its credits. Direct the user to the billing interface to purchase additional credits before retrying the turn. |
| RATE_LIMITED | Backoff and retry | Reduce message frequency. Implement a client-side rate limiter to stay within 60 messages per 10 seconds. Consider batching rapid-fire operations. |
| PodiumConnectionError | Retry with backoff | The agent backend may be temporarily unavailable. Wait 2-5 seconds and retry. If the error persists across multiple retries, the agent infrastructure may be experiencing an outage. |
| AGENT_DISCONNECTED | Offer retry | The agent’s WebSocket connection dropped mid-turn. The session is in error state. The user can retry the turn via run_turn; the gateway will establish a fresh Podium connection. |
| AGENT_ERROR | Offer retry | The agent encountered an internal error during execution. The user can retry the turn. Persistent errors may indicate a problem with the agent’s configuration or the underlying LLM provider. |
| INTERNAL_ERROR | Retry once, then surface | An unexpected, unclassified error. Retry the operation once. If it fails again, surface the error to the user and consider reconnecting the WebSocket to obtain a clean connection state. |
| MESSAGE_TOO_LARGE | Reduce payload | The message exceeds the 1 MB limit. Reduce the content size: truncate very long prompts, compress file contents, or split the operation into smaller messages. |
| INVALID_JSON | Fix client bug | The client is sending malformed JSON. This is invariably a client-side defect; examine the serialization path. |
| INVALID_MESSAGE | Fix client bug | The message structure does not match any schema. Verify field names, types, and required fields against this documentation. Check for typos in the type discriminator. |
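A client can encode the table above as a lookup from error code to recovery action, paired with a capped exponential backoff for the retryable cases. The sketch below is one possible arrangement, not SDK code: the `Recovery` labels, the function names, the default of surfacing unlisted codes, and the backoff parameters (2-second base, 30-second cap) are all assumptions.

```typescript
type Recovery =
  | "reauthenticate"
  | "backoff"
  | "surface"
  | "refresh_sessions"
  | "offer_retry"
  | "retry_once"
  | "reduce_payload"
  | "fix_client";

// One entry per row of the recovery table.
const RECOVERY: Record<string, Recovery> = {
  AUTH_FAILED: "reauthenticate",
  NOT_AUTHENTICATED: "reauthenticate",
  AUTH_RATE_LIMITED: "backoff",
  RATE_LIMITED: "backoff",
  PodiumConnectionError: "backoff",
  Unauthorized: "surface",
  INSUFFICIENT_CREDITS: "surface",
  SessionNotFound: "refresh_sessions",
  AGENT_DISCONNECTED: "offer_retry",
  AGENT_ERROR: "offer_retry",
  INTERNAL_ERROR: "retry_once",
  MESSAGE_TOO_LARGE: "reduce_payload",
  INVALID_JSON: "fix_client",
  INVALID_MESSAGE: "fix_client",
};

// Codes not listed in the table are surfaced to the user (an assumed default).
function recoveryFor(code: string): Recovery {
  return RECOVERY[code] ?? "surface";
}

// Capped exponential backoff: baseMs * 2^attempt, never more than capMs.
// The defaults are assumptions; the base sits at the low end of the 2-5 second
// wait suggested for PodiumConnectionError.
function backoffDelayMs(attempt: number, baseMs: number = 2000, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

For AUTH_RATE_LIMITED, the delay hinted in the error message should take precedence over the computed backoff whenever it is larger.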
The SDKs handle several of these recovery strategies automatically. The TypeScript SDK’s autoReconnect option re-establishes the connection and re-authenticates on disconnect. The Python SDK’s auto_reconnect does the same. For session-level recovery (re-joining with afterSeq), the client application must implement its own logic; the SDKs provide the primitives but do not presume to know which sessions the user wishes to rejoin.