Architecture Overview
Diminuendo is a WebSocket gateway built on three foundational technology choices: Bun for the runtime, Effect TS for business logic, and SQLite for persistence. Each choice was made deliberately, not by default or by trend, to minimize operational complexity while maximizing correctness and performance. The details of each decision are developed on their dedicated pages; this overview maps the terrain.
Three Foundations
Bun
Native WebSocket server with built-in pub/sub, native SQLite via bun:sqlite, sub-200ms startup, and Web Workers for I/O offloading. One runtime, zero external infrastructure.
Effect TS
Typed errors in every function signature, dependency injection via Layers, structured concurrency with automatic cleanup, and resource management that does not rely on developer discipline.
SQLite
WAL-mode databases at per-tenant and per-session granularity. No connection pool, no cluster, no replication topology. Horizontal scaling is a consequence of the data model, not an afterthought.
Storage Architecture
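Diminuendo uses SQLite exclusively for persistence. The storage hierarchy reflects the isolation model: each tenant has its own registry database at data/tenants/{tenantId}/registry.db, and each session has its own database at data/sessions/{sessionId}/session.db. There is no WHERE tenant_id = ? clause to forget, no row-level security policy to misconfigure, no cross-tenant join to accidentally permit.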
Per-Tenant Isolation
A query against one tenant’s data cannot accidentally touch another tenant’s rows. The databases are physically separate files — isolation by construction, not by convention.
Zero-Contention Writes
Concurrent writes to different sessions hit different SQLite files. WAL mode allows concurrent readers and a single writer per database, which matches the gateway’s access pattern precisely.
Trivial Deletion
Deleting a session means deleting a directory. No cascading DELETE FROM across multiple tables, no orphaned rows, no vacuum required.
Horizontal Scaling
Moving a tenant to a different gateway instance means moving a directory. No data migration, no schema changes, no downtime. The data model makes scaling a file-copy operation.
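As a minimal sketch of this layout, the helpers below derive database locations from IDs. The directory structure follows the paths given in the Per-Tenant Isolation section of this page; the function names, the dataRoot parameter, and the ID validation rule are assumptions made for illustration, not the gateway's actual code.

```typescript
// Hypothetical helpers: names and the allowed-ID pattern are illustrative assumptions.
const SAFE_ID = /^[A-Za-z0-9_-]+$/;

function assertSafeId(id: string): string {
  // Reject anything that could escape its directory (for example "../other").
  if (!SAFE_ID.test(id)) {
    throw new Error(`unsafe id: ${JSON.stringify(id)}`);
  }
  return id;
}

// Registry database for one tenant.
function tenantRegistryPath(tenantId: string, dataRoot: string = "data"): string {
  return `${dataRoot}/tenants/${assertSafeId(tenantId)}/registry.db`;
}

// Directory that holds everything belonging to one session.
function sessionDir(sessionId: string, dataRoot: string = "data"): string {
  return `${dataRoot}/sessions/${assertSafeId(sessionId)}`;
}

// Session database inside that directory.
function sessionDbPath(sessionId: string, dataRoot: string = "data"): string {
  return `${sessionDir(sessionId, dataRoot)}/session.db`;
}
```

Under this layout, deleting a session amounts to recursively removing the directory returned by sessionDir(sessionId).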
Transport Layer
The transport layer consists of two components: the HTTP/WebSocket server and the Broadcaster.
Server
Bun.serve() handles both HTTP requests (health checks, REST API) and WebSocket connections on a single port. The WebSocket lifecycle proceeds as follows:
Each connection carries per-connection state (WsData): client ID, authentication identity, topic subscriptions, negotiated protocol version, and the last event sequence number for replay.
Broadcaster
The Broadcaster abstracts Bun’s native pub/sub into an Effect service. It provides two publishing channels:
- Session events: published to session:{sessionId}, received by all clients that have joined that session
- Tenant events: published to tenant:{tenantId}:sessions, received by all authenticated clients in the tenant, used for session list updates and cross-session notifications
On shutdown, the Broadcaster publishes a server_shutdown event to every active topic before closing connections, so clients see an orderly retreat rather than an abrupt silence.
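The two channels and the shutdown notice can be illustrated with a small in-memory stand-in. This is a sketch only: it uses neither Bun's pub/sub nor Effect, and the names sessionTopic, tenantSessionsTopic, and InMemoryBroadcaster, as well as the shape of the shutdown payload, are hypothetical. Only the topic formats and the server_shutdown event name come from the text.

```typescript
// Topic formats are taken from the text; the function names are hypothetical.
const sessionTopic = (sessionId: string): string => `session:${sessionId}`;
const tenantSessionsTopic = (tenantId: string): string => `tenant:${tenantId}:sessions`;

type Listener = (payload: string) => void;

// In-memory stand-in for a topic-based broadcaster (not Bun's implementation).
class InMemoryBroadcaster {
  private readonly topics = new Map<string, Set<Listener>>();

  // Register a listener and return a function that removes it again.
  subscribe(topic: string, listener: Listener): () => void {
    let listeners = this.topics.get(topic);
    if (!listeners) {
      listeners = new Set<Listener>();
      this.topics.set(topic, listeners);
    }
    listeners.add(listener);
    const owned = listeners;
    return () => {
      owned.delete(listener);
      // Drop the topic once it is empty, but only if it is still this set.
      if (owned.size === 0 && this.topics.get(topic) === owned) {
        this.topics.delete(topic);
      }
    };
  }

  // Deliver the serialized event to every listener; returns the listener count.
  publish(topic: string, event: unknown): number {
    const listeners = this.topics.get(topic);
    if (!listeners) return 0;
    const payload = JSON.stringify(event);
    listeners.forEach((listener) => listener(payload));
    return listeners.size;
  }

  // Notify every active topic, then forget all subscriptions.
  shutdown(): void {
    Array.from(this.topics.keys()).forEach((topic) => {
      this.publish(topic, { type: "server_shutdown" }); // payload shape is assumed
    });
    this.topics.clear();
  }
}
```

A client that joined session s1 in tenant acme would hold subscriptions to sessionTopic("s1") and tenantSessionsTopic("acme"), and would receive one shutdown notice on each.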
Module Layout
The gateway source is organized into eleven modules, each with a single responsibility:

| Module | Directory | Responsibility |
|---|---|---|
| Auth | src/auth/ | JWT verification (Auth0), identity extraction, RBAC permission checks, tenant membership management |
| Transport | src/transport/ | WebSocket server, HTTP REST router (40+ endpoints), Broadcaster service |
| Protocol | src/protocol/ | Effect Schema definitions for all 21 client message types; runtime validation and parsing |
| Session | src/session/ | Session lifecycle, MessageRouter (central dispatch), ConnectionState, SessionState machine, event handlers |
| Automation | src/automation/ | Automation store, scheduler, run execution, heartbeat configuration, inbox management |
| Domain | src/domain/ | Business rules — PodiumEventMapper, BillingService, TurnTracker, ThreadNaming, StreamSnapshots, and more |
| Upstream | src/upstream/ | External service clients — PodiumClient (agent orchestration), EnsembleClient (LLM inference), GitHubClient |
| Security | src/security/ | CSRF protection, security headers, error message sanitization, auth rate limiting |
| Resilience | src/resilience/ | RetryPolicy (exponential backoff with jitter), CircuitBreaker (failure threshold with cooldown) |
| Observability | src/observability/ | Health endpoint with deep checks, OpenTelemetry tracing, structured metrics |
| DB | src/db/ | Schema migrations, WorkerManager (batched async writes to SQLite via Web Workers), TenantDbPool |
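
The Resilience row mentions exponential backoff with jitter. As a rough illustration of that general technique, the sketch below computes "full jitter" delays. The function name, option names, and constants are all hypothetical and say nothing about how RetryPolicy is actually configured.

```typescript
interface BackoffOptions {
  baseMs: number;          // ceiling for the first retry's delay
  maxMs: number;           // upper bound on any delay ceiling
  random?: () => number;   // injectable for tests; defaults to Math.random
}

// Full jitter: choose a uniform delay in [0, min(maxMs, baseMs * 2^attempt)).
function backoffDelayMs(attempt: number, opts: BackoffOptions): number {
  const random = opts.random ?? Math.random;
  const ceiling = Math.min(opts.maxMs, opts.baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

Randomizing the delay keeps many clients that failed at the same moment from retrying in lockstep, while the exponential ceiling spaces out repeated attempts.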
Layer Composition
Diminuendo uses Effect’s Layer system for dependency injection. Every service is a Context.Tag with a corresponding Live implementation. No service imports another service directly; dependencies flow through the Layer graph, making them explicit, testable, and replaceable.
The composition is defined in src/main.ts:
Data Flow
The complete path of a client message through the gateway:
Wire
Client sends JSON over WebSocket. Bun.serve receives raw bytes and invokes the message() handler.
Validation
The raw string is parsed as JSON, then validated against ClientMessage using Schema.decodeUnknownEither. Invalid messages are rejected with an error event; they never reach business logic.
Authentication Check
The server verifies that the connection has completed authentication (the authenticated flag on WsData). Unauthenticated connections can only send authenticate messages.
Rate Limiting
A per-connection sliding window rate limiter checks whether the client has exceeded 60 messages per 10-second window. Rate-limited messages are rejected with RATE_LIMITED.
Routing
MessageRouter.route(identity, message) dispatches to the appropriate handler based on message.type. The router returns a RouteResult: either respond (send to this client), broadcast (publish to session topic), or none (no response needed).
Execution
The handler performs its work — querying the registry, creating a Podium connection, reserving billing credits, writing to SQLite, or forwarding to an upstream service.
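The steps up to this point (validation, authentication check, rate limiting, routing) can be sketched as one function. This is a simplified illustration, not the gateway's code: a hand-written type guard stands in for Schema.decodeUnknownEither, the handlers are stubs, and every name other than RouteResult, RATE_LIMITED, run_turn, authenticate, and the 60-per-10-seconds limit is an assumption.

```typescript
// The three outcomes named in the text; the field names are assumed.
type RouteResult =
  | { kind: "respond"; event: unknown }
  | { kind: "broadcast"; sessionId: string; event: unknown }
  | { kind: "none" };

interface ClientMessage {
  type: string;
  sessionId?: string;
}

// Hand-written stand-in for schema validation.
function isClientMessage(value: unknown): value is ClientMessage {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const sessionOk = v.sessionId === undefined || typeof v.sessionId === "string";
  return typeof v.type === "string" && sessionOk;
}

// Sliding window over accepted messages: at most `limit` per `windowMs`.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 60, windowMs: number = 10000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  tryAcquire(nowMs: number): boolean {
    // Keep only timestamps that are still inside the window.
    this.timestamps = this.timestamps.filter((t) => t > nowMs - this.windowMs);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}

interface Connection {
  authenticated: boolean;
  limiter: SlidingWindowLimiter;
}

// Error codes other than RATE_LIMITED are placeholders.
function errorEvent(code: string): RouteResult {
  return { kind: "respond", event: { type: "error", code } };
}

function handleRaw(conn: Connection, raw: string, nowMs: number): RouteResult {
  // Validation: parse, then check the shape.
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return errorEvent("INVALID_MESSAGE");
  }
  if (!isClientMessage(parsed)) return errorEvent("INVALID_MESSAGE");

  // Authentication check: only `authenticate` is allowed before auth completes.
  if (!conn.authenticated && parsed.type !== "authenticate") {
    return errorEvent("NOT_AUTHENTICATED");
  }

  // Rate limiting.
  if (!conn.limiter.tryAcquire(nowMs)) return errorEvent("RATE_LIMITED");

  // Routing: dispatch on message.type (stub handlers, assumed event names).
  switch (parsed.type) {
    case "authenticate":
      // A real handler would verify the JWT before answering.
      return { kind: "respond", event: { type: "authenticated" } };
    case "run_turn":
      if (parsed.sessionId === undefined) return errorEvent("INVALID_MESSAGE");
      return { kind: "broadcast", sessionId: parsed.sessionId, event: { type: "turn_started" } };
    default:
      return { kind: "none" };
  }
}
```

Passing the current time in as a parameter keeps the limiter deterministic in tests; a caller would typically supply Date.now().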
Response
For respond results, the event is sent directly to the requesting client. For session-mutating operations (create, archive, delete, rename), the event is also broadcast to the tenant topic so other clients can update their session lists.
Streaming
For run_turn, the response is broadcast: all events from the agent are streamed to the session topic via the Broadcaster. The startEventStreamFiber consumes the Podium event stream, maps each event through PodiumEventMapper, publishes to the session topic, persists important events to SQLite, and dispatches to specialized event handlers for state management.
Per-Tenant Isolation
Diminuendo enforces tenant isolation at three levels, each independent of the others:
- Authentication: every JWT contains a tenant_id claim. The gateway extracts this during authentication and attaches it to the connection’s identity. All subsequent operations are scoped to this tenant.
- Storage: each tenant has its own SQLite registry database at data/tenants/{tenantId}/registry.db. Session databases are stored at data/sessions/{sessionId}/session.db. A session can only be accessed by providing a sessionId that exists in the requesting tenant’s registry.
- Pub/Sub: topic subscriptions are namespaced by tenant. A client authenticated as tenant acme subscribes to tenant:acme:sessions; it will never receive events intended for tenant globex. The topic namespace is the third wall of the isolation boundary.
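
The storage-level rule, that a session is reachable only through the requesting tenant's registry, can be illustrated with an in-memory stand-in. The TenantRegistry class, the resolveSessionDb function, and the SESSION_NOT_FOUND reason are hypothetical; the gateway itself keeps each registry in a per-tenant SQLite database rather than a Map.

```typescript
// In-memory stand-in for the per-tenant registries.
class TenantRegistry {
  private readonly sessionsByTenant = new Map<string, Set<string>>();

  addSession(tenantId: string, sessionId: string): void {
    const sessions = this.sessionsByTenant.get(tenantId) ?? new Set<string>();
    sessions.add(sessionId);
    this.sessionsByTenant.set(tenantId, sessions);
  }

  hasSession(tenantId: string, sessionId: string): boolean {
    const sessions = this.sessionsByTenant.get(tenantId);
    return sessions !== undefined && sessions.has(sessionId);
  }
}

// Identity attached to the connection; tenantId comes from the JWT's tenant_id claim.
interface Identity {
  tenantId: string;
}

type Resolution =
  | { ok: true; dbPath: string }
  | { ok: false; reason: "SESSION_NOT_FOUND" };

// Resolve a session database only if the session is listed in the caller's own registry.
function resolveSessionDb(
  registry: TenantRegistry,
  identity: Identity,
  sessionId: string,
): Resolution {
  if (!registry.hasSession(identity.tenantId, sessionId)) {
    // Same answer for "does not exist" and "belongs to another tenant".
    return { ok: false, reason: "SESSION_NOT_FOUND" };
  }
  return { ok: true, dbPath: `data/sessions/${sessionId}/session.db` };
}
```

Returning the same result for a missing session and for another tenant's session is a design choice in this sketch: it avoids revealing which session IDs exist elsewhere.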