Diminuendo is a pure gateway. It does not execute agent code, does not invoke tools, and does not call language models. Those responsibilities belong to three upstream platform services — Podium, Chronicle, and Ensemble — each accessed through an Effect service with circuit-breaker resilience, exponential retry, and health probing. The gateway’s role is to mediate: it translates between the client-facing wire protocol and the platform services’ internal APIs, maps events from one vocabulary to another, and ensures that transient failures in any upstream service degrade gracefully rather than cascading into the client connection.
Podium is the agent orchestration platform that manages compute instances, agent lifecycles, and message routing. Diminuendo connects to Podium as an upstream service — creating compute instances for coding agents, establishing WebSocket connections to stream events, and proxying file operations to agent workspaces.
The lifecycle of an agent instance follows a predictable four-phase pattern:

1. Create Instance: When a user starts a session or an inactive session is reactivated, the gateway calls createInstance with the agent type, deployment descriptor, secrets, and environment variables.
2. Establish Connection: createInstance returns a PodiumConnection object with methods to send messages, access the event stream, and close the connection.
3. Stream Events and Send Messages: The connection remains open for the duration of the session. Messages from the user are sent via sendMessage(content). Events from the agent arrive through the events stream (Stream.Stream<PodiumEvent>).
4. Stop Instance: When a session is deactivated or the gateway shuts down, the instance is stopped:

```
DELETE /api/v1/instances/{instanceId}
```

A 404 response is treated as success — the instance may have already been reclaimed by Podium.
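The four phases above can be sketched with a hypothetical in-memory stub in place of the real Podium client. The method names (createInstance, sendMessage, events, stop) follow the prose; the stub's internals are illustrative only.

```typescript
type PodiumEvent = { type: string; content?: string }

interface PodiumConnection {
  sendMessage(content: string): Promise<void>
  events(): AsyncGenerator<PodiumEvent>
  stop(): Promise<void>
}

// Phase 1 + 2: createInstance returns a connection handle.
async function createInstance(agentType: string): Promise<PodiumConnection> {
  const queue: PodiumEvent[] = [{ type: "instance_ready", content: agentType }]
  let open = true
  return {
    // Phase 3: user messages go upstream...
    async sendMessage(content) {
      queue.push({ type: "text_delta", content: `echo: ${content}` })
    },
    // ...and agent events stream back down.
    async *events() {
      while (open || queue.length > 0) {
        const ev = queue.shift()
        if (ev) yield ev
        else break
      }
    },
    // Phase 4: stopping the instance closes the stream. A 404 from the
    // real DELETE endpoint would be treated as success here.
    async stop() {
      open = false
    },
  }
}
```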
The PodiumEventMapper (src/domain/PodiumEventMapper.ts) is a pure function that translates Podium’s 30+ message types into Diminuendo’s 51-event wire protocol. Each Podium event is transformed into one or more client events with session-scoped sequence numbers and timestamps.
Any Podium event that does not match a known message type but contains text content is treated as a text_delta fallback. Events with no matching type and no content are silently dropped (mapped to an empty array).
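The fallback rule can be sketched as follows. The event shapes and the known-type set are illustrative, not the real wire types from src/domain/PodiumEventMapper.ts.

```typescript
type PodiumEvent = { type: string; content?: string }
type ClientEvent = { event: string; seq: number; text?: string }

// Illustrative subset of the 30+ known Podium message types.
const KNOWN_TYPES = new Set(["assistant_message", "tool_call"])

function mapEvent(ev: PodiumEvent, seq: number): ClientEvent[] {
  if (KNOWN_TYPES.has(ev.type)) {
    return [{ event: ev.type, seq, text: ev.content }]
  }
  // Fallback: unknown type but text content present -> text_delta.
  if (ev.content !== undefined && ev.content.length > 0) {
    return [{ event: "text_delta", seq, text: ev.content }]
  }
  // Unknown type, no content -> silently dropped (empty array).
  return []
}
```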
File access messages from clients are proxied through to Podium’s REST file API:
| Client Message | Podium API |
| --- | --- |
| list_files | GET /api/v1/instances/{id}/files?path={path}&depth={depth} |
| read_file | GET /api/v1/instances/{id}/files/{path} |
| file_history | GET /api/v1/instances/{id}/files/{path}/history |
| file_at_iteration | GET /api/v1/instances/{id}/files/{path}/at/{iteration} |
All file API calls use a 15-second timeout and propagate PodiumConnectionError on failure. File operations require an active Podium instance for the session — if the session is inactive, the gateway must first activate it before file operations can proceed.
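A minimal sketch of the read_file proxy path, assuming a fetch-based transport: a GET against Podium's file endpoint with a 15-second timeout, and any failure surfaced as PodiumConnectionError. The error class name and URL shape follow the prose; the wiring is illustrative, not the gateway's actual Effect-based implementation.

```typescript
class PodiumConnectionError extends Error {
  constructor(message: string) {
    super(message)
    this.name = "PodiumConnectionError"
  }
}

async function readFileViaPodium(
  baseUrl: string,
  instanceId: string,
  path: string,
): Promise<string> {
  const controller = new AbortController()
  // All file API calls use a 15-second timeout.
  const timer = setTimeout(() => controller.abort(), 15_000)
  try {
    const res = await fetch(
      `${baseUrl}/api/v1/instances/${instanceId}/files/${path}`,
      { signal: controller.signal },
    )
    if (!res.ok) throw new PodiumConnectionError(`file read failed: ${res.status}`)
    return await res.text()
  } catch (err) {
    if (err instanceof PodiumConnectionError) throw err
    // Transport errors and timeouts propagate as PodiumConnectionError.
    throw new PodiumConnectionError(`file read failed: ${String(err)}`)
  } finally {
    clearTimeout(timer)
  }
}
```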
The gateway prevents parallel createInstance calls for the same session. If a session is already in the activating state (meaning a createInstance call is in flight), a second activation request for the same session will wait for the first call to complete rather than issuing a duplicate request to Podium.
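The single-flight guard can be sketched with a map of in-flight promises: a second activation request for a session that is already activating joins the pending promise instead of issuing a duplicate createInstance call. Names here are illustrative.

```typescript
const inFlight = new Map<string, Promise<string>>()
let createCalls = 0

// Stands in for the real createInstance call to Podium.
async function createInstance(sessionId: string): Promise<string> {
  createCalls++
  return `instance-for-${sessionId}`
}

function activate(sessionId: string): Promise<string> {
  const existing = inFlight.get(sessionId)
  // Session already in the "activating" state: wait on the in-flight call.
  if (existing) return existing
  const p = createInstance(sessionId).finally(() => inFlight.delete(sessionId))
  inFlight.set(sessionId, p)
  return p
}
```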
Podium's health probe is used by the /health endpoint to assess Podium availability, and by the stale session recovery mechanism to determine whether a session's Podium instance survived a gateway restart (it never does — Podium connections do not survive process death).
Chronicle is a versioned filesystem that provides each agent session with an isolated, content-addressable workspace. Every file change creates a new iteration, enabling clients to browse version history, view files at any point in time, and synchronize workspaces to local directories for editing in a native IDE.
Chronicle sits behind Podium in the service hierarchy. The gateway does not communicate with Chronicle directly. All file operations are routed through Podium’s REST API, which manages the mapping between compute instances and their associated Chronicle workspaces:
```
Client --> Gateway --> Podium --> Chronicle

  list_files          GET /files                  Content-addressable store
  read_file           GET /files/{path}           Versioned iterations
  file_history        GET /files/{path}/history
  file_at_iteration   GET /files/{path}/at/{n}
```
Chronicle tracks every modification to every file in an agent’s workspace as a discrete iteration — an immutable snapshot of a file at a specific point in time, identified by:
Iteration number — monotonically increasing integer, scoped to the file path within a workspace
Timestamp — epoch milliseconds when the change was persisted
Size — byte count of the file content at this iteration
Hash — content-addressable hash for deduplication and integrity verification
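The iteration metadata above maps directly onto a small record type. The field names mirror the list; the sample values are illustrative.

```typescript
interface FileIteration {
  iteration: number // monotonically increasing, scoped to the file path
  timestamp: number // epoch milliseconds when the change was persisted
  size: number      // byte count of the file content at this iteration
  hash: string      // content-addressable hash for dedup and integrity
}

const sample: FileIteration = {
  iteration: 1,
  timestamp: 1709312400000,
  size: 1234,
  hash: "abc...",
}
```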
Clients can browse a file’s complete history using the file_history message and retrieve any past version using file_at_iteration:
```typescript
// Get version history
const history = await client.fileHistory(sessionId, "src/auth.ts")
// [
//   { iteration: 1, timestamp: 1709312400000, size: 1234, hash: "abc..." },
//   { iteration: 2, timestamp: 1709312460000, size: 1456, hash: "def..." },
//   { iteration: 3, timestamp: 1709312520000, size: 1389, hash: "ghi..." },
// ]

// Read the file at iteration 1 (before the agent's changes)
const original = await client.fileAtIteration(sessionId, "src/auth.ts", 1)
```
Chronicle’s local-sync mode replaces FUSE or NFS-based filesystem abstractions with real files on the local filesystem, synchronized bidirectionally using platform-native file watching. This enables users to open agent workspace files in their preferred IDE while the agent is actively modifying them.
Local-sync is a desktop-only feature. It requires filesystem access from the Tauri Rust backend and is enabled via the local-sync Cargo feature flag in Chronicle.
Local-sync consists of three cooperating components:

- LocalMaterializer: Writes files from the upstream replication stream to the local filesystem. Implements the EventHandler trait. Includes echo suppression to prevent its own writes from being detected as local changes.
- FsWatcher: Monitors the local directory for changes using notify::RecommendedWatcher (FSEvents on macOS). Changes are debounced via notify-debouncer-full and filtered to exclude editor temp files, .git directories, and other noise.
- WriteJournal: Tracks all changes (both local and upstream) in a journal for replication. Provides the mechanism for conflict detection when both local and remote changes affect the same file.
```
Upstream (agent changes)                  Local (user changes)
        |                                        |
  Replication stream                     FSEvents / inotify
        |                                        |
  LocalMaterializer                         FsWatcher
  (write to local)                   (detect local change)
        |                                        |
  Echo suppression <--------------------> Change filtering
        |                                        |
  WriteJournal -------------------------> WriteJournal
        |                                        |
  Local filesystem <--------------------- Local filesystem
```
Echo suppression is the critical invariant: when the LocalMaterializer writes a file, the FsWatcher will detect that write as a local change. Without suppression, this would create an infinite sync loop. The materializer registers each write in a short-lived “echo set,” and the watcher ignores any change events for paths currently in the echo set.

Debouncing prevents rapid successive changes (common during agent file writes or IDE auto-save) from generating a flood of upstream sync operations. The watcher uses notify-debouncer-full with a configurable debounce interval.
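The echo-set mechanism can be sketched in a few lines. The actual implementation is Rust; this TypeScript sketch keeps only the shape, and the TTL value is an assumption for illustration.

```typescript
const ECHO_TTL_MS = 500 // illustrative; short-lived by design
const echoSet = new Map<string, number>() // path -> expiry timestamp

// Called by the materializer just before it writes a replicated file.
function registerMaterializerWrite(path: string, now: number): void {
  echoSet.set(path, now + ECHO_TTL_MS)
}

// Called by the watcher for every detected filesystem change.
function shouldSyncUpstream(path: string, now: number): boolean {
  const expiry = echoSet.get(path)
  if (expiry !== undefined && now < expiry) {
    return false // our own write echoing back: suppress it
  }
  echoSet.delete(path) // expired or absent: treat as a genuine local change
  return true
}
```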
When both the agent (upstream) and the user (local) modify the same file within the same debounce window, a conflict is detected. The desktop client surfaces this conflict to the user via the Tauri IPC resolve_conflict command, presenting options to keep the local version, accept the upstream version, or merge manually.
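The same-window conflict rule can be sketched against a journal of change entries: a path conflicts when a local and an upstream change land within one debounce window of each other. The journal entry shape is an assumption, not the real WriteJournal schema.

```typescript
type Origin = "local" | "upstream"

interface JournalEntry {
  path: string
  origin: Origin
  at: number // epoch milliseconds
}

// Returns the paths that need the resolve_conflict flow.
function detectConflicts(entries: JournalEntry[], windowMs: number): string[] {
  const conflicts = new Set<string>()
  for (const a of entries) {
    for (const b of entries) {
      if (
        a.path === b.path &&
        a.origin === "local" &&
        b.origin === "upstream" &&
        Math.abs(a.at - b.at) <= windowMs
      ) {
        conflicts.add(a.path)
      }
    }
  }
  return [...conflicts]
}
```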
Chronicle’s local-sync feature is behind a Cargo feature flag:
```
# Build Chronicle with local-sync support
cargo build --no-default-features --features local-sync

# Run with local-sync enabled
chronicle /path/to/workspace --local-sync --ws-connect ws://podium:5082/sync
```
The local-sync feature pulls in the notify (v7) and notify-debouncer-full crates, which provide the platform-native file watching implementation. On macOS, this uses FSEvents. Files are materialized directly on APFS, so workspaces appear as real files in Finder and editors — no FUSE layer or kernel extension required.
Ensemble is the LLM inference proxy that handles model routing, rate limiting, and cost tracking. Diminuendo connects to Ensemble as an upstream service for any gateway-level inference needs that are separate from the agent’s own LLM calls (which flow through Podium).
Inference requests are sent as a POST to {ENSEMBLE_URL}/api/v1/generate/stream with a 120-second timeout. The response body is piped through a TextDecoderStream.
The EnsembleClientLive layer reads two environment variables:
| Variable | Default | Description |
| --- | --- | --- |
| ENSEMBLE_URL | http://localhost:5180 | Base URL for the Ensemble API |
| ENSEMBLE_API_KEY | (empty) | Bearer token for authentication |
If ENSEMBLE_URL is set to a non-default value but ENSEMBLE_API_KEY is empty, the gateway logs a warning at startup: “ENSEMBLE_URL is set but ENSEMBLE_API_KEY is empty — Ensemble integration will fail.” This catches a common misconfiguration where the URL is set but the secret was not provisioned.
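The misconfiguration check reduces to a single predicate. The default URL and warning text follow the prose above; the function name and logger wiring are illustrative.

```typescript
const DEFAULT_ENSEMBLE_URL = "http://localhost:5180"

// Returns the startup warning, or undefined if the config is consistent.
function checkEnsembleConfig(url: string, apiKey: string): string | undefined {
  if (url !== DEFAULT_ENSEMBLE_URL && apiKey === "") {
    return "ENSEMBLE_URL is set but ENSEMBLE_API_KEY is empty — Ensemble integration will fail."
  }
  return undefined
}
```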
If either ENSEMBLE_URL or ENSEMBLE_API_KEY is missing, the EnsembleClient falls back to a no-op implementation. Both generate and generateStream return Effect.fail(new EnsembleError({ message: "Ensemble unavailable: ..." })). The isHealthy probe returns false.

This ensures the gateway starts successfully even without Ensemble configured — it degrades gracefully rather than failing to boot. Ensemble is an optional dependency, not a hard prerequisite.
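A Promise-based sketch of the no-op fallback (the real client is built on Effect; this simplification keeps only the shape): every call fails with EnsembleError, and the health probe reports false, so the gateway boots regardless.

```typescript
class EnsembleError extends Error {
  constructor(message: string) {
    super(message)
    this.name = "EnsembleError"
  }
}

interface EnsembleClient {
  generate(prompt: string): Promise<string>
  isHealthy(): Promise<boolean>
}

function makeNoopEnsembleClient(): EnsembleClient {
  const fail = () =>
    Promise.reject(
      new EnsembleError("Ensemble unavailable: ENSEMBLE_URL or ENSEMBLE_API_KEY not configured"),
    )
  return {
    generate: fail,
    // Degrades overall health status without marking the gateway unhealthy.
    isHealthy: async () => false,
  }
}
```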
The breaker transitions to half-open after a cooldown period; the first successful call closes it again.
When the circuit breaker opens, all inference calls fail immediately with EnsembleError (status code 503) rather than attempting the HTTP request. This prevents a failing Ensemble service from consuming gateway resources with timeout-bound requests.
The Ensemble health probe feeds the gateway’s /health endpoint. Unlike Podium (which is critical), an unhealthy Ensemble degrades the overall health status but does not mark the gateway as unhealthy. The gateway continues to function for all operations that do not require direct inference.
Agent LLM usage is tracked through the gateway’s event system. When an agent consumes tokens during a turn, the gateway maps Podium’s usage events to two client-facing event types.
Frontend clients use usage_context events to render a context window utilization indicator, helping users understand how much context capacity remains before the agent needs to summarize or truncate its working memory.
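A client-side utilization computation might look like the following. The usage_context event name comes from the prose above, but the field names here are assumptions, not the real wire schema.

```typescript
interface UsageContextEvent {
  tokensUsed: number     // assumed field: tokens consumed so far this session
  contextWindow: number  // assumed field: model's total context capacity
}

// Fraction of the context window consumed, clamped to [0, 1], suitable for
// driving a utilization indicator in the UI.
function contextUtilization(ev: UsageContextEvent): number {
  return Math.min(1, Math.max(0, ev.tokensUsed / ev.contextWindow))
}
```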