Architecture Overview

Diminuendo is a WebSocket gateway built on three foundational technology choices: Bun for the runtime, Effect TS for business logic, and SQLite for persistence. Each choice was made deliberately — not by default, not by trend — to minimize operational complexity while maximizing correctness and performance. The details of each decision are developed on their dedicated pages; this overview maps the terrain.

Three Foundations

Bun serves as the runtime, Effect TS structures the business logic, and SQLite provides persistence. Each foundation is examined in depth on its dedicated page.

Storage Architecture

Diminuendo uses SQLite exclusively for persistence. The storage hierarchy reflects the isolation model:
data/
  tenants/
    {tenantId}/
      registry.db          # Session metadata for this tenant
  sessions/
    {sessionId}/
      session.db           # Conversation history, events, and turn usage
Each tenant has a registry database containing session metadata — id, name, status, timestamps, project associations. Each session has its own session database containing messages, events, and turn usage records. These are physically separate files on separate filesystem paths. There is no WHERE tenant_id = ? clause to forget, no row-level security policy to misconfigure, no cross-tenant join to accidentally permit.
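The path layout above can be sketched as two small helpers. This is illustrative only — the function names and the DATA_ROOT constant are assumptions, not Diminuendo's actual code.

```typescript
import { join } from "node:path";

// Assumed root of the storage hierarchy shown above.
const DATA_ROOT = "data";

// Per-tenant registry: session metadata only.
function registryDbPath(tenantId: string): string {
  return join(DATA_ROOT, "tenants", tenantId, "registry.db");
}

// Per-session database: messages, events, turn usage.
function sessionDbPath(sessionId: string): string {
  return join(DATA_ROOT, "sessions", sessionId, "session.db");
}
```

Because the tenant ID is baked into the filesystem path rather than a query parameter, there is no scoping clause that code can forget to apply.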

Per-Tenant Isolation

A query against one tenant’s data cannot accidentally touch another tenant’s rows. The databases are physically separate files — isolation by construction, not by convention.

Zero-Contention Writes

Concurrent writes to different sessions hit different SQLite files. WAL mode allows concurrent readers and a single writer per database, which matches the gateway’s access pattern precisely.

Trivial Deletion

Deleting a session means deleting a directory. No cascading DELETE FROM across multiple tables, no orphaned rows, no vacuum required.
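A minimal sketch of deletion-as-directory-removal, using Node's standard fs module; the helper name and data-root argument are hypothetical.

```typescript
import { mkdirSync, rmSync, existsSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Hypothetical helper: deleting a session is one recursive directory removal.
function deleteSession(dataRoot: string, sessionId: string): void {
  rmSync(join(dataRoot, "sessions", sessionId), { recursive: true, force: true });
}

// Demo against a throwaway data root.
const root = join(tmpdir(), `diminuendo-demo-${process.pid}`);
const sessionDir = join(root, "sessions", "abc123");
mkdirSync(sessionDir, { recursive: true });
writeFileSync(join(sessionDir, "session.db"), "");

deleteSession(root, "abc123");
const gone = !existsSync(sessionDir);
```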

Horizontal Scaling

Moving a tenant to a different gateway instance means moving a directory. No data migration, no schema changes, no downtime. The data model makes scaling a file-copy operation.
Migrations are applied automatically on first access. The registry schema includes sessions and tenant members tables; the session schema includes messages, events, and turn usage tables.

Transport Layer

The transport layer consists of two components: the HTTP/WebSocket server and the Broadcaster.

Server

Bun.serve() handles both HTTP requests (health checks, REST API) and WebSocket connections on a single port. The WebSocket lifecycle proceeds as follows:
Client connects --> fetch() upgrades to WS --> open() sends welcome + connected
  --> In dev mode: auto-sends authenticated
  --> In production: client must send authenticate with JWT
--> Authenticated: client may send any message
--> message() --> Schema.decodeUnknownEither --> MessageRouter.route()
--> close() --> cleanup rate limiters, unsubscribe topics, remove from active sessions
Each connection carries typed per-connection state (WsData): client ID, authentication identity, topic subscriptions, negotiated protocol version, and the last event sequence number for replay.
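The per-connection state can be sketched as a TypeScript interface. Field names follow the prose above; the exact shapes in Diminuendo may differ.

```typescript
// Assumed shape of the authenticated identity (see the Auth module).
interface Identity {
  userId: string;
  tenantId: string;
}

// Sketch of the typed per-connection state described above.
interface WsData {
  clientId: string;
  authenticated: boolean;
  identity?: Identity;          // set once `authenticate` succeeds
  subscriptions: Set<string>;   // pub/sub topics this socket has joined
  protocolVersion: number;      // negotiated during the handshake
  lastEventSeq: number;         // last delivered sequence number, for replay
}

// A freshly opened, not-yet-authenticated connection.
const fresh: WsData = {
  clientId: "c-1",
  authenticated: false,
  subscriptions: new Set(),
  protocolVersion: 1,
  lastEventSeq: 0,
};
```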

Broadcaster

The Broadcaster abstracts Bun’s native pub/sub into an Effect service. It provides two publishing channels:
  • Session events — published to session:{sessionId}, received by all clients that have joined that session
  • Tenant events — published to tenant:{tenantId}:sessions, received by all authenticated clients in the tenant, used for session list updates and cross-session notifications
The Broadcaster tracks all known topics for graceful shutdown. When the gateway receives SIGINT/SIGTERM, it publishes a server_shutdown event to every active topic before closing connections — an orderly retreat, not an abrupt silence.
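The two topic namespaces and the shutdown fan-out can be sketched as follows. The publish callback stands in for Bun's native pub/sub; the class and method names are illustrative, not the real Broadcaster API.

```typescript
// The two topic namespaces described above.
const sessionTopic = (sessionId: string) => `session:${sessionId}`;
const tenantTopic = (tenantId: string) => `tenant:${tenantId}:sessions`;

class BroadcasterSketch {
  private topics = new Set<string>();
  constructor(private publish: (topic: string, message: string) => void) {}

  publishTo(topic: string, message: string): void {
    this.topics.add(topic); // remember every topic for graceful shutdown
    this.publish(topic, message);
  }

  // On SIGINT/SIGTERM: notify every active topic before closing connections.
  shutdown(): void {
    const msg = JSON.stringify({ type: "server_shutdown" });
    for (const topic of this.topics) this.publish(topic, msg);
  }
}

// Demo: capture published messages instead of hitting real sockets.
const sent: Array<[string, string]> = [];
const b = new BroadcasterSketch((t, m) => sent.push([t, m]));
b.publishTo(sessionTopic("s1"), "event");
b.publishTo(tenantTopic("acme"), "list-update");
b.shutdown(); // fans server_shutdown out to both known topics
```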

Module Layout

The gateway source is organized into eleven modules, each with a single responsibility:
  • Auth (src/auth/): JWT verification (Auth0), identity extraction, RBAC permission checks, tenant membership management
  • Transport (src/transport/): WebSocket server, HTTP REST router (40+ endpoints), Broadcaster service
  • Protocol (src/protocol/): Effect Schema definitions for all 21 client message types; runtime validation and parsing
  • Session (src/session/): Session lifecycle, MessageRouter (central dispatch), ConnectionState, SessionState machine, event handlers
  • Automation (src/automation/): Automation store, scheduler, run execution, heartbeat configuration, inbox management
  • Domain (src/domain/): Business rules — PodiumEventMapper, BillingService, TurnTracker, ThreadNaming, StreamSnapshots, and more
  • Upstream (src/upstream/): External service clients — PodiumClient (agent orchestration), EnsembleClient (LLM inference), GitHubClient
  • Security (src/security/): CSRF protection, security headers, error message sanitization, auth rate limiting
  • Resilience (src/resilience/): RetryPolicy (exponential backoff with jitter), CircuitBreaker (failure threshold with cooldown)
  • Observability (src/observability/): Health endpoint with deep checks, OpenTelemetry tracing, structured metrics
  • DB (src/db/): Schema migrations, WorkerManager (batched async writes to SQLite via Web Workers), TenantDbPool

Layer Composition

Diminuendo uses Effect’s Layer system for dependency injection. Every service is a Context.Tag with a corresponding Live implementation. No service imports another service directly — dependencies flow through the Layer graph, making them explicit, testable, and replaceable. The composition is defined in src/main.ts:
AppConfigLive (environment variables)
  |
  +--> TenantDbPoolLive    (SQLite connection pool)
  |
  +--> ConfigProvidedLayers (13 services that depend on AppConfig + TenantDbPool)
  |      SessionRegistryService, ProjectRegistryService, PodiumClient,
  |      AuthService, WorkerManager, EnsembleClient, MembershipService,
  |      TurnTracker, CredentialService, InvitationService, AuditService,
  |      FileStorageService, SkillService, UserPreferencesService
  |
  +--> BillingLayer         (depends on ConfigProvidedLayers)
  +--> ThreadNamingLayer    (depends on ConfigProvidedLayers + Broadcaster)
  +--> AutomationStoreLayer
  +--> SessionRuntimeLayer  (Podium + Broadcaster + Worker + AppConfig)
  +--> AutomationEngineLayer (6 sub-layer dependencies)
  |
  +--> RouterDeps = merge(all of the above)
  |      |
  |      +--> RouterLayer (MessageRouterLive -- the central dispatch)
  |
  +--> AppLayer = merge(everything)
         |
         +--> program (startServer + stale recovery + automation + shutdown)
                |
                +--> Effect.provide(AppLayer)
                +--> Effect.provide(LoggerLive)
If a layer requires a dependency that is not provided, the Effect compiler rejects it at build time. The entire dependency graph is statically verified before a single byte of runtime work occurs.
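The build-time guarantee can be illustrated with a toy analogue of tag-based dependency provision. This is NOT the Effect API — the Services interface, Program type, and all names below are invented for illustration — but it shows the same property: a program's type lists the services it requires, so running it without providing every one of them fails to compile.

```typescript
// Invented service shapes, standing in for Context.Tag'd services.
interface Services {
  config: { port: number };
  db: { query: (sql: string) => string };
}

// A program that needs services R from its environment and yields A.
type Program<R extends keyof Services, A> = (env: Pick<Services, R>) => A;

// This program declares, in its type, that it needs both config and db.
const listSessions: Program<"config" | "db", string> = (env) =>
  env.db.query(`SELECT id FROM sessions /* gateway on :${env.config.port} */`);

// "Providing the layers": supplying a concrete environment.
const env: Services = {
  config: { port: 8080 },
  db: { query: (sql) => `ok: ${sql}` },
};

// Omitting `db` from env would be a type error, not a runtime surprise.
const result = listSessions(env);
```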

Data Flow

The complete path of a client message through the gateway:
  1. Wire: Client sends JSON over WebSocket. Bun.serve receives raw bytes and invokes the message() handler.
  2. Validation: The raw string is parsed as JSON, then validated against ClientMessage using Schema.decodeUnknownEither. Invalid messages are rejected with an error event — they never reach business logic.
  3. Authentication Check: The server verifies that the connection has completed authentication (the authenticated flag on WsData). Unauthenticated connections can only send authenticate messages.
  4. Rate Limiting: A per-connection sliding window rate limiter checks whether the client has exceeded 60 messages per 10-second window. Rate-limited messages are rejected with RATE_LIMITED.
  5. Routing: MessageRouter.route(identity, message) dispatches to the appropriate handler based on message.type. The router returns a RouteResult: either respond (send to this client), broadcast (publish to session topic), or none (no response needed).
  6. Execution: The handler performs its work — querying the registry, creating a Podium connection, reserving billing credits, writing to SQLite, or forwarding to an upstream service.
  7. Response: For respond results, the event is sent directly to the requesting client. For session-mutating operations (create, archive, delete, rename), the event is also broadcast to the tenant topic so other clients can update their session lists.
  8. Streaming: For run_turn, the response is broadcast — all events from the agent are streamed to the session topic via the Broadcaster. The startEventStreamFiber consumes the Podium event stream, maps each event through PodiumEventMapper, publishes to the session topic, persists important events to SQLite, and dispatches to specialized event handlers for state management.
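The per-connection sliding window from step 4 can be sketched as a small class. The class and method names are illustrative; Diminuendo's actual limiter may be structured differently.

```typescript
// Sliding-window rate limiter sketch: at most `limit` messages per `windowMs`.
class SlidingWindowLimiter {
  private stamps: number[] = [];
  constructor(
    private readonly limit = 60,
    private readonly windowMs = 10_000,
  ) {}

  allow(now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    this.stamps = this.stamps.filter((t) => t > cutoff); // drop expired entries
    if (this.stamps.length >= this.limit) return false;  // reject: RATE_LIMITED
    this.stamps.push(now);
    return true;
  }
}
```

One limiter instance would live per connection, created at open() and discarded at close() along with the rest of the per-connection state.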

Per-Tenant Isolation

Diminuendo enforces tenant isolation at three levels, each independent of the others:
  1. Authentication — every JWT contains a tenant_id claim. The gateway extracts this during authentication and attaches it to the connection’s identity. All subsequent operations are scoped to this tenant.
  2. Storage — each tenant has its own SQLite registry database at data/tenants/{tenantId}/registry.db. Session databases are stored at data/sessions/{sessionId}/session.db. A session can only be accessed by providing a sessionId that exists in the requesting tenant’s registry.
  3. Pub/Sub — topic subscriptions are namespaced by tenant. A client authenticated as tenant acme subscribes to tenant:acme:sessions; it will never receive events intended for tenant globex. The topic namespace is the third wall of the isolation boundary.
The sessionId is a UUID, so guessing a valid session ID across tenants is computationally infeasible. However, the gateway does not rely on this obscurity — it performs an explicit tenant ownership check on every session access.
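The explicit ownership check can be sketched as follows, with a Map standing in for the per-tenant SQLite registry databases; the class and method names are invented for illustration.

```typescript
class TenantRegistrySketch {
  // tenantId -> set of sessionIds recorded in that tenant's registry.db
  private registries = new Map<string, Set<string>>();

  addSession(tenantId: string, sessionId: string): void {
    if (!this.registries.has(tenantId)) this.registries.set(tenantId, new Set());
    this.registries.get(tenantId)!.add(sessionId);
  }

  // Ownership check: a session resolves only through its own tenant's registry.
  ownsSession(tenantId: string, sessionId: string): boolean {
    return this.registries.get(tenantId)?.has(sessionId) ?? false;
  }
}

// Even with a known session ID, the wrong tenant gets nothing.
const reg = new TenantRegistrySketch();
reg.addSession("acme", "11111111-1111-1111-1111-111111111111");
const acmeOk = reg.ownsSession("acme", "11111111-1111-1111-1111-111111111111");
const globexOk = reg.ownsSession("globex", "11111111-1111-1111-1111-111111111111");
```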