Error Handling

Every protocol must reckon with failure. The question is not whether errors will occur — they will, with the certainty of entropy — but how they are classified, communicated, and recovered from. Diminuendo employs a layered error handling strategy that draws clear boundaries between the concerns of transport, domain, and presentation. Transport-level errors (malformed JSON, oversized messages, rate limiting) are intercepted at the WebSocket handler before any business logic executes — rejected at the gate, not in the throne room. Domain-level errors (authentication failures, session not found, insufficient credits) are raised within the Effect TS runtime as typed TaggedError values and mapped to structured error events. All errors, regardless of origin, pass through a sanitization pipeline before reaching clients: stack traces are stripped, API keys are redacted, messages are truncated to a safe length. The client sees a clean, structured error; the server logs the raw truth.

Error Event Format

Connection-level errors are delivered as a server event with type: "error" (turn failures use the separate turn_error event, covered under Turn-Specific Errors):
{
  "type": "error",
  "code": "SessionNotFound",
  "message": "Session not found"
}
  • type (string): Always "error".
  • code (string): A machine-readable error code. This is the key for programmatic error handling: switch statements, retry logic, UI rendering. It is stable across releases; changes to error codes are breaking changes.
  • message (string): A human-readable description. Safe for display to end users: guaranteed to contain no stack traces, no API keys, no internal file paths, no implementation details. Sanitized by the pipeline described below.

Error events are not session-scoped. They carry no sessionId, seq, or ts fields. They are sent directly to the connection that triggered the error and are never broadcast to session subscribers. An error is a private conversation between the gateway and the client that provoked it.
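As a minimal sketch of how a client might recognize this shape, the type guard below checks the three documented fields on a parsed frame. The names GatewayErrorEvent, isErrorEvent, and parseErrorEvent are hypothetical and are not part of any SDK.

```typescript
// Shape of the generic error event described above.
// "GatewayErrorEvent" is an illustrative name, not an SDK type.
interface GatewayErrorEvent {
  type: "error"
  code: string
  message: string
}

// Narrow an arbitrary parsed value to a GatewayErrorEvent.
function isErrorEvent(value: unknown): value is GatewayErrorEvent {
  if (typeof value !== "object" || value === null) return false
  const v = value as Record<string, unknown>
  return (
    v.type === "error" &&
    typeof v.code === "string" &&
    typeof v.message === "string"
  )
}

// Parse a raw text frame; returns null for anything that is not an error event.
function parseErrorEvent(raw: string): GatewayErrorEvent | null {
  try {
    const parsed: unknown = JSON.parse(raw)
    return isErrorEvent(parsed) ? parsed : null
  } catch {
    return null
  }
}
```

Because code is the stable field, client logic should branch on it and treat message purely as display text.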

Error Codes

Transport-Level Errors

These errors are raised at the WebSocket handler layer — the outermost membrane of the gateway — before the message reaches any domain logic. They represent failures of form rather than failures of intent.
| Code | Condition | Description |
| --- | --- | --- |
| INVALID_JSON | Message is not valid JSON | The raw WebSocket frame could not be parsed. Typically caused by incomplete messages, binary data sent to a text frame, or encoding errors. |
| INVALID_MESSAGE | JSON does not match any schema | The JSON was parsed successfully but does not conform to any of the 21 client message schemas. Common causes: unknown type field, missing required fields, incorrect field types. |
| MESSAGE_TOO_LARGE | Raw message exceeds 1 MB | The message was rejected before parsing. The gateway enforces a 1 MB size limit on inbound WebSocket frames to prevent resource exhaustion. |
| RATE_LIMITED | Per-connection rate limit exceeded | More than 60 messages in a 10-second sliding window. The client must reduce its transmission rate. |
| AUTH_RATE_LIMITED | Authentication attempts exceeded | Too many authenticate messages from the same IP address in a short period. Includes a retryAfterMs hint in the message text. |
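A client can avoid RATE_LIMITED by mirroring the documented window locally. The sketch below is one possible client-side sliding-window limiter; the class name, its API, and the injected clock are assumptions for illustration, and the gateway's own limiter may be implemented differently.

```typescript
// Client-side sliding-window limiter mirroring the documented limit of
// 60 messages per 10-second window. Hypothetical helper, not an SDK class.
class SlidingWindowLimiter {
  private readonly sent: number[] = []
  private readonly maxMessages: number
  private readonly windowMs: number

  constructor(maxMessages: number = 60, windowMs: number = 10000) {
    this.maxMessages = maxMessages
    this.windowMs = windowMs
  }

  // Returns true and records the send if it fits in the current window.
  // The caller passes the current time so the logic stays testable.
  tryAcquire(nowMs: number): boolean {
    // Discard timestamps that have aged out of the window.
    for (;;) {
      const oldest = this.sent[0]
      if (oldest === undefined || nowMs - oldest < this.windowMs) break
      this.sent.shift()
    }
    if (this.sent.length >= this.maxMessages) return false
    this.sent.push(nowMs)
    return true
  }
}
```

A caller would typically gate each send with limiter.tryAcquire(Date.now()) and queue or drop the message when it returns false.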

Authentication & Authorization Errors

| Code | Condition | Description |
| --- | --- | --- |
| NOT_AUTHENTICATED | Message sent before authentication | The client attempted to send a message (other than authenticate) before completing the authentication handshake. |
| AUTH_FAILED | Token validation failed | The JWT or API key was invalid, expired, malformed, or could not be verified against the Auth0 JWKS endpoint. |
| Unauthenticated | Authentication required | A gateway-internal error indicating the operation requires an authenticated identity. Semantically equivalent to NOT_AUTHENTICATED but raised within the Effect runtime rather than the transport layer. |
| Unauthorized | Insufficient permissions | The authenticated user does not possess the required RBAC permission for this operation (e.g., a member attempting manage_members with set_role action). |

Domain Errors

| Code | Condition | Description |
| --- | --- | --- |
| SessionNotFound | Session does not exist | The referenced sessionId does not exist in the tenant’s registry, or the session has been deleted. |
| SessionAlreadyExists | Duplicate session creation | Attempted to create a session with an ID that already exists. Rare, since UUIDs are generated server-side. |
| INSUFFICIENT_CREDITS | Billing check failed | The tenant does not have enough credits to start a turn. No agent interaction occurs; no tokens are consumed. |
| PodiumConnectionError | Agent connection failed | Failed to establish or maintain a WebSocket connection to the Podium agent orchestrator. |
| PodiumTimeout | Agent operation timed out | A Podium operation (instance creation, message dispatch, health check) exceeded its timeout threshold. |
| EnsembleError | LLM inference failure | An error from the Ensemble inference service: model unavailable, provider error, token limit exceeded, or rate limiting from the upstream model provider. |
| SandboxNotConfigured | No sandbox available | The operation requires a sandbox environment, but none is configured or provisioned for this session. |
| DbError | Database operation failed | A SQLite operation failed. Possible causes: disk full, database corruption, concurrent write conflict, or WAL checkpoint failure. |
| ProtocolVersionMismatch | Version mismatch | The client and gateway disagree on the protocol version. The client should update to match the gateway’s advertised version. |
| INTERNAL_ERROR | Unclassified error | A catch-all for unexpected failures that do not map to a known error code. The message is sanitized before delivery. |

Turn-Specific Errors

Turn errors occupy a distinct category because they are session-scoped: they are delivered via the turn_error event type (not the generic error event) and carry seq and ts fields that place them in the session’s event timeline.
| Code | Condition | Description |
| --- | --- | --- |
| AGENT_ERROR | Agent reported an error | The Podium agent encountered an error during execution, such as an LLM API failure, a tool crash, or a context overflow. |
| AGENT_DISCONNECTED | WebSocket dropped mid-turn | The gateway lost its WebSocket connection to the Podium agent while a turn was in progress. The session transitions to error state. |

turn_error events are broadcast to all session subscribers, not just the client that initiated the turn. This ensures all connected clients observe the error state and can update their UI accordingly. The session transitions to error or ready state depending on context (see the Session State Machine for the turn_error mapping rules). Clients should offer a retry action.
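The difference between the two event kinds can be modeled as a discriminated union on type. The sketch below is illustrative only: the interface names are hypothetical, and the field types of seq and ts (numbers here) are assumptions, since the text above says only that those fields are present.

```typescript
// Connection-scoped error: no sessionId, seq, or ts.
interface ConnectionErrorEvent {
  type: "error"
  code: string
  message: string
}

// Session-scoped turn error. seq and ts are documented as present;
// typing both as number is an assumption made for this sketch.
interface TurnErrorEvent {
  type: "turn_error"
  code: string
  message: string
  seq: number
  ts: number
}

type FailureEvent = ConnectionErrorEvent | TurnErrorEvent

// Produce a log line that makes the scope of the failure explicit.
function describeFailure(event: FailureEvent): string {
  switch (event.type) {
    case "error":
      return `connection: [${event.code}] ${event.message}`
    case "turn_error":
      return `session @seq ${event.seq}: [${event.code}] ${event.message}`
  }
}
```

Keeping the two shapes distinct in the type system makes it harder to accidentally treat a private connection error as part of a session timeline.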

Member Management Errors

| Code | Condition | Description |
| --- | --- | --- |
| INVALID_MEMBER_UPDATE | Missing or invalid fields | The set_role action was missing userId or role, or the role value is not one of the three valid roles (owner, admin, member). |
| LAST_OWNER_PROTECTED | Cannot demote last owner | The operation would leave the tenant with no owners. The invariant is absolute: every tenant must have at least one owner. Promote another member first. |
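The two checks in this table can be expressed as a small validation function. This is a sketch of the rules as stated, not the gateway's actual implementation; the Member type and checkSetRole function are hypothetical, and the behavior for an unknown userId is left out of scope.

```typescript
type Role = "owner" | "admin" | "member"

interface Member {
  userId: string
  role: Role
}

// Returns the error code a set_role request would be expected to produce
// under the rules in the table above, or null if neither rule is violated.
function checkSetRole(members: Member[], userId: unknown, role: unknown): string | null {
  const roleIsValid = role === "owner" || role === "admin" || role === "member"
  if (typeof userId !== "string" || !roleIsValid) {
    return "INVALID_MEMBER_UPDATE"
  }
  const target = members.find((m) => m.userId === userId)
  const ownerCount = members.filter((m) => m.role === "owner").length
  // Demoting the only remaining owner would violate the invariant.
  const demotesLastOwner =
    target !== undefined &&
    target.role === "owner" &&
    role !== "owner" &&
    ownerCount === 1
  return demotesLastOwner ? "LAST_OWNER_PROTECTED" : null
}
```

Running the same check client-side before sending set_role lets a UI disable the action instead of waiting for the gateway to reject it.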

Error Sanitization

All error messages pass through a sanitization pipeline before reaching clients. This is not optional hardening but a critical security boundary. Internal errors may contain stack traces with file paths, API keys embedded in HTTP headers, database connection strings, or other sensitive artifacts that must never cross the trust boundary to an end user. The sanitization process applies three transformations in strict sequence:
1. Strip Stack Traces: Any line matching the pattern of a JavaScript/TypeScript stack trace (at Function.name (/path/to/file.ts:42:10)) is removed. This prevents leaking internal file paths, function names, line numbers, and the architectural topology they reveal.
2. Redact Secrets: The following patterns are replaced with [REDACTED]:
  • Anthropic API keys: sk-ant-*
  • Generic secret keys: sk-*
  • GitHub personal access tokens: ghp_*
  • Bearer tokens: Bearer *
  • URL token parameters: token=*
3. Truncate: Messages exceeding 500 characters are truncated with an ellipsis (...). This prevents pathologically long error messages (serialized stack traces, large JSON payloads, recursive error chains) from bloating WebSocket frames and consuming client memory.

The sanitization is implemented by the sanitizeErrorMessage function in the gateway’s security module. It processes errors from all sources: Effect TaggedError instances, native JavaScript Error objects, plain strings, and arbitrary objects (via JSON.stringify). The pipeline is deterministic and idempotent: the same error always produces the same sanitized output, and sanitizing an already-sanitized message produces no further changes.
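The three steps can be sketched as follows. This is not the gateway's sanitizeErrorMessage; the function name sanitizeSketch, the regular expressions, and the choice to count the ellipsis inside the 500-character limit are all assumptions that approximate the documented behavior for string input only.

```typescript
const MAX_LENGTH = 500

// Illustrative approximations of the documented secret patterns.
// Order matters: the more specific sk-ant- prefix runs before sk-.
const SECRET_PATTERNS: RegExp[] = [
  /\bsk-ant-[A-Za-z0-9_-]+/g,
  /\bsk-[A-Za-z0-9_-]+/g,
  /\bghp_[A-Za-z0-9]+/g,
  /Bearer\s+[A-Za-z0-9._~+\/=-]+/g,
  /token=[^&\s]+/g,
]

// Matches lines shaped like "    at Function.name (/path/to/file.ts:42:10)".
const STACK_LINE = /^\s*at\s.+:\d+:\d+\)?\s*$/

function sanitizeSketch(input: string): string {
  // Step 1: strip stack-trace lines.
  const withoutStack = input
    .split("\n")
    .filter((line) => !STACK_LINE.test(line))
    .join("\n")

  // Step 2: redact secrets.
  let redacted = withoutStack
  for (const pattern of SECRET_PATTERNS) {
    redacted = redacted.replace(pattern, "[REDACTED]")
  }

  // Step 3: truncate. The ellipsis is kept inside the limit so that a
  // second pass leaves the message unchanged (idempotence).
  if (redacted.length <= MAX_LENGTH) return redacted
  return redacted.slice(0, MAX_LENGTH - 3) + "..."
}
```

Counting the ellipsis inside the limit is what keeps this sketch idempotent: a truncated message is exactly 500 characters long, so a second pass has nothing left to cut.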

Gateway Typed Errors

Internally, the gateway uses Effect’s TaggedError pattern to define typed, structured error classes. Each error carries a _tag field that serves as the discriminant of a tagged union within the Effect type system and, for most errors, as the wire code sent to clients (InsufficientCredits is the exception in the table below: it is sent as INSUFFICIENT_CREDITS). This provides type-safe error handling within the Effect runtime: errors are tracked in the type signature of every Effect, and the compiler ensures they are handled.
| Error Class | Tag | Fields | Wire Code |
| --- | --- | --- | --- |
| Unauthenticated | "Unauthenticated" | reason: string | Unauthenticated |
| Unauthorized | "Unauthorized" | tenantId: string, resource: string | Unauthorized |
| SessionNotFound | "SessionNotFound" | sessionId: string | SessionNotFound |
| SessionAlreadyExists | "SessionAlreadyExists" | sessionId: string | SessionAlreadyExists |
| InsufficientCredits | "InsufficientCredits" | tenantId: string, required: number, available: number | INSUFFICIENT_CREDITS |
| PaymentFailed | "PaymentFailed" | reason: string, stripeError?: string | PaymentFailed |
| PodiumConnectionError | "PodiumConnectionError" | message: string, cause?: unknown | PodiumConnectionError |
| PodiumTimeout | "PodiumTimeout" | operation: string, timeoutMs: number | PodiumTimeout |
| EnsembleError | "EnsembleError" | message: string, statusCode?: number | EnsembleError |
| SandboxNotConfigured | "SandboxNotConfigured" | message: string | SandboxNotConfigured |
| DbError | "DbError" | message: string, cause?: unknown | DbError |
| InvalidMessage | "InvalidMessage" | reason: string, raw?: string | InvalidMessage |
| ProtocolVersionMismatch | "ProtocolVersionMismatch" | expected: number, received: number | ProtocolVersionMismatch |

The message router’s catch-all handler maps the _tag field of each TaggedError to the wire code, and provides safe, predefined messages for known error types. These messages are authored by humans, not generated by stack unwinding:
const safeMessages: Record<string, string> = {
  Unauthenticated: "Authentication required",
  Unauthorized: "Insufficient permissions",
  SessionNotFound: "Session not found",
  PodiumConnectionError: "Failed to connect to agent",
  DbError: "Database operation failed",
  InsufficientCredits: "Insufficient credits",
}
For unrecognized error tags, the original error message is passed through sanitizeErrorMessage before transmission. The safe message map is the first line of defense; sanitization is the last.
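The two-tier lookup described here (safe message first, sanitization as fallback) might look like the sketch below. It uses a plain object with a _tag field instead of importing Effect so that it stays self-contained; TaggedLike, toWireError, the override map, and the "Internal error" default are all hypothetical names and choices, and the single override entry is inferred from the Wire Code column above.

```typescript
// Minimal stand-in for an Effect TaggedError instance.
interface TaggedLike {
  _tag: string
  message?: string
}

// Human-authored messages, copied from the safeMessages map above.
const SAFE_MESSAGES = new Map<string, string>([
  ["Unauthenticated", "Authentication required"],
  ["Unauthorized", "Insufficient permissions"],
  ["SessionNotFound", "Session not found"],
  ["PodiumConnectionError", "Failed to connect to agent"],
  ["DbError", "Database operation failed"],
  ["InsufficientCredits", "Insufficient credits"],
])

// Tags whose wire code differs from the tag itself (assumption based on
// the Wire Code column of the table above).
const WIRE_CODE_OVERRIDES = new Map<string, string>([
  ["InsufficientCredits", "INSUFFICIENT_CREDITS"],
])

function toWireError(
  err: TaggedLike,
  sanitize: (message: string) => string,
): { type: "error"; code: string; message: string } {
  const code = WIRE_CODE_OVERRIDES.get(err._tag) ?? err._tag
  // Known tags get a predefined message; anything else is sanitized.
  const message =
    SAFE_MESSAGES.get(err._tag) ?? sanitize(err.message ?? "Internal error")
  return { type: "error", code, message }
}
```

Passing the sanitizer in as a parameter keeps the mapping independent of how sanitization is implemented.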

Recovery Strategies

Different error codes demand different recovery postures. Some errors are transient and should be retried; some are permanent and should be surfaced to the user; some are bugs in the client that should be fixed in code. The table below provides guidance for client implementations — a field manual for the discipline of failure.
| Error Code | Strategy | Details |
| --- | --- | --- |
| AUTH_FAILED | Re-authenticate | The token may be expired. Obtain a fresh JWT from your identity provider and send a new authenticate message. |
| AUTH_RATE_LIMITED | Exponential backoff | Parse the retry delay from the error message. Wait at least that duration before attempting authentication again. Do not retry immediately: the rate limiter’s window has not expired. |
| NOT_AUTHENTICATED | Re-authenticate | The connection may have been reset or the authentication state lost. Send an authenticate message before retrying the original operation. |
| Unauthorized | Surface to user | The user lacks the required permission. Display an appropriate message and do not retry; retrying an authorization failure is futile without a role change. |
| SessionNotFound | Refresh session list | The session may have been deleted by another client or a concurrent operation. Call list_sessions to refresh the UI and remove stale references. |
| INSUFFICIENT_CREDITS | Surface to user | The tenant has exhausted its credits. Direct the user to the billing interface to purchase additional credits before retrying the turn. |
| RATE_LIMITED | Backoff and retry | Reduce message frequency. Implement a client-side rate limiter to stay within 60 messages per 10 seconds. Consider batching rapid-fire operations. |
| PodiumConnectionError | Retry with backoff | The agent backend may be temporarily unavailable. Wait 2-5 seconds and retry. If the error persists across multiple retries, the agent infrastructure may be experiencing an outage. |
| AGENT_DISCONNECTED | Offer retry | The agent’s WebSocket connection dropped mid-turn. The session is in error state. The user can retry the turn via run_turn; the gateway will establish a fresh Podium connection. |
| AGENT_ERROR | Offer retry | The agent encountered an internal error during execution. The user can retry the turn. Persistent errors may indicate a problem with the agent’s configuration or the underlying LLM provider. |
| INTERNAL_ERROR | Retry once, then surface | An unexpected, unclassified error. Retry the operation once. If it fails again, surface the error to the user and consider reconnecting the WebSocket to obtain a clean connection state. |
| MESSAGE_TOO_LARGE | Reduce payload | The message exceeds the 1 MB limit. Reduce the content size: truncate very long prompts, compress file contents, or split the operation into smaller messages. |
| INVALID_JSON | Fix client bug | The client is sending malformed JSON. This is invariably a client-side defect; examine the serialization path. |
| INVALID_MESSAGE | Fix client bug | The message structure does not match any schema. Verify field names, types, and required fields against this documentation. Check for typos in the type discriminator. |

The SDKs handle several of these recovery strategies automatically. The TypeScript SDK’s autoReconnect option re-establishes the connection and re-authenticates on disconnect. The Python SDK’s auto_reconnect does the same. For session-level recovery (re-joining with afterSeq), the client application must implement its own logic; the SDKs provide the primitives but do not presume to know which sessions the user wishes to rejoin.
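For applications that handle recovery themselves, the table above reduces to a lookup from error code to strategy plus a backoff schedule. The sketch below is one way to encode it; the Recovery labels, the "surface" default for unknown codes, and the 30-second cap are assumptions, while the 2-second base follows the 2-5 second guidance for PodiumConnectionError.

```typescript
type Recovery =
  | "reauthenticate"
  | "backoff"
  | "surface"
  | "refresh_sessions"
  | "offer_retry"
  | "retry_once"
  | "reduce_payload"
  | "fix_client"

// Map a wire error code to the strategy suggested in the table above.
function recoveryFor(code: string): Recovery {
  switch (code) {
    case "AUTH_FAILED":
    case "NOT_AUTHENTICATED":
      return "reauthenticate"
    case "AUTH_RATE_LIMITED":
    case "RATE_LIMITED":
    case "PodiumConnectionError":
      return "backoff"
    case "SessionNotFound":
      return "refresh_sessions"
    case "AGENT_DISCONNECTED":
    case "AGENT_ERROR":
      return "offer_retry"
    case "INTERNAL_ERROR":
      return "retry_once"
    case "MESSAGE_TOO_LARGE":
      return "reduce_payload"
    case "INVALID_JSON":
    case "INVALID_MESSAGE":
      return "fix_client"
    default:
      // Unauthorized, INSUFFICIENT_CREDITS, and any unlisted code.
      return "surface"
  }
}

// Exponential backoff: baseMs, 2x baseMs, 4x baseMs, ... capped at capMs.
function backoffDelayMs(attempt: number, baseMs: number = 2000, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt)
}
```

Adding random jitter to the delay is a common refinement when many clients may be retrying at the same moment.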

Error Handling in SDKs

Each SDK surfaces errors through its host language’s native error handling paradigm. The gateway’s structured error events are translated into exceptions, result types, or callback invocations as appropriate — the wire protocol’s errors are rendered in the idiom of the language.
const client = new DiminuendoClient({ url: "ws://localhost:8080/ws" })

// Connection errors throw during connect()
try {
  await client.connect()
} catch (err) {
  // "AUTH_FAILED: Authentication failed"
  // "Connection timeout while waiting for authentication"
}

// Request-response methods reject their promises
try {
  await client.joinSession("nonexistent-id")
} catch (err) {
  // "SessionNotFound: Session not found"
}

// Listen for error events on the connection
client.on("error", (event) => {
  console.error(`[${event.code}] ${event.message}`)
})

// Listen for turn errors on the session
client.on("turn_error", (event) => {
  console.error(`Turn failed: [${event.code}] ${event.message}`)
})
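The comments in the example above suggest that rejected promises carry messages of the form "CODE: message". Assuming that format holds (it is inferred from those comments, not from a documented contract), a small helper can recover the code so that catch blocks can branch on it. The name splitSdkError is hypothetical.

```typescript
// Split an SDK error into its wire code and human-readable message.
// Assumes the "CODE: message" format seen in the comments above; errors
// without that prefix, such as connection timeouts, yield a null code.
function splitSdkError(err: unknown): { code: string | null; message: string } {
  const text = err instanceof Error ? err.message : String(err)
  const match = /^([A-Za-z_]+): ([\s\S]*)$/.exec(text)
  if (match === null) return { code: null, message: text }
  return { code: match[1] ?? null, message: match[2] ?? "" }
}
```

In a catch block this allows, for example, refreshing the session list only when the code is SessionNotFound and rethrowing everything else.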