From 7b8b136e92180a3503ffa60e05412c51f2ea74e5 Mon Sep 17 00:00:00 2001
From: Tudisco
Date: Sun, 24 May 2026 22:45:29 -0600
Subject: [PATCH] plan(kez-chat): NATS is bundled in docker-compose, not in Rust code
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Correcting an overcorrection. Previous version pushed NATS fully
external — "operator brings their own, we don't ship it." That went
too far. The right line is:

- NATS isn't *Rust code we wrote* — it's the official Go nats-server,
  separate process. We don't embed it. ✓ (unchanged)
- NATS *is* part of our deployment recipe — docker-compose includes
  a `nats` service alongside chat-server and sig-server so operators
  can `docker compose up` and have everything working.

This is the standard "we ship docker-compose with the dependencies
wired up" pattern (like projects that include Postgres in their
compose). Operators with existing NATS deployments can disable the
bundled service and set NATS_URL to their own broker.

Changes:
- §4.2 process diagram: NATS back inside the "our deployment" box,
  with a note that it's bundled-but-separable
- §4.3 docker-compose: nats service restored alongside chat-server
  and sig-server. Reference nats.conf path documented. Instructions
  for swapping in your own NATS broker.
- §6.4 NATS section retitled from "external dependency" to "bundled
  in compose, not in code." Same requirements (NATS 2.10+, JetStream,
  auth_callout) but framed as turn-key by default.
- Decisions-locked NATS row updated: "not in Rust code, yes in
  docker-compose; swap-able by config."
- §11 sequenced plan step 3: wire up the bundled nats service rather
  than "spin up a separate broker for dev."
- §12 summary: "we ship two Rust services PLUS a docker-compose
  recipe that includes nats-server."
- Appendix A trimmed: now just "running NATS standalone if you're
  iterating on chat-server in cargo watch and don't want the full
  compose stack." The full compose IS the standard dev setup.
---
 kez-chat/document.md | 208 ++++++++++++++++++++++---------------------
 1 file changed, 108 insertions(+), 100 deletions(-)

diff --git a/kez-chat/document.md b/kez-chat/document.md
index def7d56..e4bd985 100644
--- a/kez-chat/document.md
+++ b/kez-chat/document.md
@@ -194,65 +194,71 @@ becomes a fourth optional container.
 
 ### 4.2 Process / deployment model
 
-NATS is **not part of our deployment.** The operator runs NATS however
-they want (Synadia Cloud, their own cluster, a friend's broker, a single
-local container) and gives the chat-server a URL. Same idea as a
-database: we connect to one; we don't ship one.
+NATS is **not embedded in our Rust code** — it's a separate process
+(the official Go `nats-server`). But we **do bundle it in our deployment
+recipe** so operators get a turn-key setup. Same pattern as projects
+that ship docker-compose with Postgres included: we don't write the
+database, but we wire it up so you can `docker compose up` and have
+everything working.
 
 ```
-                  External infrastructure
-                (operator's responsibility)
-                  ┌──────────────────────┐
-                  │     NATS broker      │
-                  │     + JetStream      │
-                  │      somewhere       │
-                  └─────────▲─────▲──────┘
-                            │     │
-          chat-server ──────┘     │ ◄────── client app
-          (auth callout)          │         (publish/subscribe)
-                                  │
-┌─────────────── our deployment ─────────────────┐
-│                                                │
-│   ┌─────────────────┐    ┌────────────────┐    │
-│   │ kez-chat-server │    │ kez-sig-server │    │
-│   │ (Rust)          │    │ (Rust)         │    │
-│   │                 │    │                │    │
-│   │ ↓ handles       │    │ ↓ sigchain     │    │
-│   │ ↓ nats auth     │    │   storage      │    │
-│   │ ↓ HTTP API      │    │                │    │
-│   └─────────────────┘    └────────────────┘    │
-│         ▲                      ▲               │
-└─────────┼──────────────────────┼───────────────┘
-          │                      │
-   ┌──────┴──────────────────────┴────────────────────────┐
-   │  Chat app (per user, runs on phone/desktop)          │
-   │                                                      │
-   │  • talks to the operator's NATS broker (NATS proto)  │
-   │  • talks to kez-chat-server over HTTPS               │
-   │  • talks to kez-sig-server over HTTPS                │
-   │  • runs local iroh::Node for file send/receive       │
-   └──────────────────────────────────────────────────────┘
+┌────────────────── our deployment (docker-compose) ────────────────┐
+│                                                                   │
+│   ┌──────────────┐   ┌─────────────────┐   ┌────────────────┐     │
+│   │ nats-server  │   │ kez-chat-server │   │ kez-sig-server │     │
+│   │ (Go)         │◄──┤ (Rust)          ├──►│ (Rust)         │     │
+│   │ + JetStream  │   │                 │   │ (existing)     │     │
+│   │              │   │ ↓ handles       │   │ ↓ sigchain     │     │
+│   │ ↓ chat msgs  │   │ ↓ nats auth     │   │   storage      │     │
+│   │ ↓ tickets    │   │ ↓ HTTP API      │   │                │     │
+│   └──────────────┘   └─────────────────┘   └────────────────┘     │
+│         ▲                   ▲                      ▲              │
+└─────────┼───────────────────┼──────────────────────┼──────────────┘
+          │                   │                      │
+   ┌──────┴───────────────────┴──────────────────────┴───────────┐
+   │  Chat app (per user, runs on phone/desktop)                 │
+   │                                                             │
+   │  • talks to nats-server over native NATS protocol           │
+   │  • talks to kez-chat-server over HTTPS                      │
+   │  • talks to kez-sig-server over HTTPS                       │
+   │  • runs local iroh::Node for file send/receive              │
+   └─────────────────────────────────────────────────────────────┘
 ```
 
-The chat-server orchestrates auth against whatever NATS broker is
-configured, but doesn't run, host, supervise, or ship NATS in any form.
+The chat-server orchestrates auth between NATS and the handle registry.
+NATS runs in its own container; we ship the config wired up.
 
-### 4.3 docker-compose sketch (our two services only)
+**Operators who already run NATS** can disable our bundled `nats`
+service and point `NATS_URL` at their own broker — same auth_callout
+config snippet works in any NATS deployment. Bundled NATS is the
+default for convenience, not a requirement.
+
+### 4.3 docker-compose recipe
 
 ```yaml
-# deploy/docker-compose.yml — what we ship
+# deploy/docker-compose.yml
 services:
+  nats:
+    image: nats:latest
+    command: ["-c", "/etc/nats/nats.conf", "--jetstream"]
+    volumes:
+      - ./nats.conf:/etc/nats/nats.conf:ro
+      - nats-data:/data
+    ports:
+      - "4222:4222"   # client connections (TLS in prod)
+      - "8222:8222"   # monitoring
+
   chat-server:
     build: .                          # kez-chat-server Rust binary
     environment:
-      NATS_URL: ${NATS_URL}           # operator points us at their NATS broker
+      NATS_URL: nats://nats:4222
       SIG_SERVER_URL: http://sig-server:7878
       DB_PATH: /data/handles.db
       AUTH_CALLOUT_NKEY_PATH: /etc/kez/auth-callout.nkey
     volumes:
       - chat-data:/data
       - ./auth-callout.nkey:/etc/kez/auth-callout.nkey:ro
-    depends_on: [sig-server]
+    depends_on: [nats, sig-server]
     ports:
       - "8080:8080"                   # HTTP API for clients
 
@@ -266,19 +272,23 @@ services:
       - "7878:7878"
 
 volumes:
+  nats-data:
   chat-data:
   sig-data:
 ```
 
-**NATS is not in this file.** The operator brings their own — running
-on a different host, in a different compose project, on Synadia Cloud,
-or wherever. They give us `NATS_URL` and a place to put our auth
-callout endpoint URL in their `nats.conf`.
+We ship a reference `deploy/nats.conf` with the auth_callout wired up
+to talk to our chat-server. Operators who want to bring their own
+NATS:
 
-What the operator needs to add on the NATS side (in **their** config):
+1. Comment out (or delete) the `nats` service from the compose file.
+2. Set `NATS_URL=nats://your-broker:4222` in the chat-server's env.
+3. Apply our reference `nats.conf` snippet to their NATS deployment.
+
+The auth_callout config snippet:
 
 ```conf
-# nats.conf — added to whatever NATS deployment the operator runs
+# nats.conf — patched into whichever NATS deployment is used
 authorization {
   auth_callout {
     issuer: ""
@@ -293,11 +303,6 @@ that NATS trusts. When a client connects to NATS with their KEZ
 ed25519 key, NATS forwards the auth request to our chat-server, which
 checks the handle registry and signs a yes/no response.
 
-We provide a reference `nats.conf` snippet in the docs. The operator
-patches it into their own NATS deployment.
-
-For local development, see Appendix A.
-
 ### 4.4 Endpoints
 
 ```
@@ -519,43 +524,46 @@ import cleanly from the start.
 - `iroh` — server doesn't run an Iroh node in v0 (no pinning)
 - nats-server (Go) — separate container, not a Rust dep
 
-### 6.4 NATS broker — external dependency
+### 6.4 NATS broker — bundled in compose, not in code
 
-NATS is **not part of our project**. It's external infrastructure the
-operator provides, the same way they'd provide a database or an SMTP
-relay. We ship:
+NATS is **not embedded in the Rust binary** — it's the official Go
+`nats-server` running as its own container. But we **do include it
+in the docker-compose deployment** so `docker compose up` is the
+whole setup for new operators. Same pattern as projects shipping
+Postgres-in-compose: it's bundled for convenience, not because we
+wrote a database.
 
-- An `async-nats` client used by the chat-server (admin/utility work)
-- An auth-callout HTTP endpoint that NATS calls during client connection
-- A documented `nats.conf` snippet operators add to their NATS deployment
-- A reference local-dev setup (Appendix A) for running NATS yourself
-  while developing
+What we ship:
 
-What we require from the operator's NATS:
+- `deploy/docker-compose.yml` with a `nats` service alongside our
+  Rust services
+- `deploy/nats.conf` — reference config with auth_callout wired up
+- `async-nats` client inside chat-server for admin/utility work
+- The auth-callout HTTP endpoint chat-server exposes for NATS to call
+
+What NATS we require (whether bundled or BYO):
 
 | Requirement | Why |
 |---|---|
-| **NATS 2.10+** (for auth_callout) | We rely on auth callout to bridge KEZ identity into NATS |
+| **NATS 2.10+** (for auth_callout) | We use auth_callout to bridge KEZ identity into NATS |
 | **JetStream enabled** | For offline message buffering (durable consumers) |
 | **TCP reachable** from chat-server and clients | Standard |
 | **TLS** (in production) | Standard |
-| **auth_callout configured** to hit our endpoint | Required for client auth |
+| **auth_callout configured** to hit our chat-server endpoint | Required for client auth |
 
-That's it. Operator can run a single Docker container, a clustered
-production deployment, or a managed service — we don't care, as long
-as `NATS_URL` and the callout config are correct.
+**Swapping in your own NATS** is a config change, not a code change:
+disable the bundled `nats` service in the compose, set `NATS_URL` to
+your own broker, apply our `nats.conf` snippet there. Useful for
+operators with existing NATS infrastructure, Synadia Cloud users, etc.
 
-Why fully external rather than alongside us:
+Why bundled rather than embedded:
 
-- NATS is a serious piece of infrastructure with its own scaling and
-  operational concerns. Bundling it implies we're responsible for it.
-  We're not.
-- Operators with existing NATS deployments can reuse them. No "now run
-  our copy of NATS too."
-- Different teams might run different NATS topologies (single instance,
-  cluster, mesh, leaf nodes). None of that is our problem.
-- Swapping NATS implementations or moving to a managed provider is a
-  config change, not a code change.
+- NATS is a 200KLOC Go service with its own ops story. We're not
+  rewriting it in Rust just to embed it.
+- Bundling it as a separate process keeps the architecture honest —
+  if NATS misbehaves, it's a separate process to restart, debug, log.
+- Operators can swap to a different broker deployment without touching
+  our code.
 
 ### 6.5 Iroh — client-side only
 
@@ -646,7 +654,7 @@ Settle yes/no on this and the design is locked.
 | Question | Decision |
 |---|---|
 | Bundle sigchain in chat-server? | **No.** Use existing `kez-sig-server`. Microservices. |
-| Bundle NATS into Rust server? | **No.** NATS is external infrastructure the operator provides. We don't ship, embed, or supervise it. We connect to whatever broker `NATS_URL` points at. |
+| Bundle NATS into Rust server? | **Not in the Rust code** — NATS stays the official Go `nats-server` running as its own process. **Yes in our docker-compose** — operators get `nats + chat-server + sig-server` wired up out of the box. Operators with existing NATS deployments can disable the bundled service and set `NATS_URL` to point elsewhere. |
 | KEZ + nostr coexistence for chat? | **No nostr in chat.** KEZ is identity-only; nostr only as a verifiable claim in someone's sigchain, not as transport. |
 | Handle scope: federation or global? | **Global for v0**, federation-ready design (see §3.5). |
 | Recovery if key lost? | **Paper backup (24-word mnemonic), Keybase-style.** No server-side recovery. |
@@ -701,8 +709,9 @@ When we start building:
    Handle registry + WebFinger first — these unblock client-side
    account creation.
 
-3. **NATS auth callout.** Bring up a NATS broker for development (see
-   Appendix A), configure its auth_callout to hit our chat-server's
+3. **NATS auth callout.** Wire up the `nats` service in our compose
+   (or, in dev, run `nats-server -c deploy/nats.conf --jetstream`
+   locally). Its auth_callout hits our chat-server's
    `/internal/nats/auth`. End-to-end: a client can register a handle
    and then connect to NATS authenticated by its KEZ key.
 
@@ -737,11 +746,15 @@ ed25519 primary key. The same key authenticates to a NATS broker (chat,
 presence, file tickets — broker is dumb, clients do E2E with
 ChaCha20-Poly1305 over X25519-derived keys) and identifies an Iroh
 node (P2P bulk transfer, content-addressed blobs, on-demand fetch).
-**Our project ships two services**: a thin Rust `kez-chat-server`
-that handles the handle registry + NATS auth callout + HTTP API, and
-the existing `kez-sig-server` that stores sigchains. **NATS is
-external infrastructure the operator provides** — we never ship,
-embed, or supervise it. The chat-server does not run an Iroh node
+**Our project ships two Rust services** (`kez-chat-server` for handle
+registry + NATS auth callout + HTTP API, and the existing
+`kez-sig-server` for sigchain storage) **plus a docker-compose recipe
+that includes `nats-server`** for turn-key deployment. NATS isn't in
+our Rust code — it's the official Go binary running as its own
+container — but it's wired up in our compose so operators can
+`docker compose up` and have everything working. Operators with
+existing NATS deployments can disable the bundled service and point
+us elsewhere. The chat-server does not run an Iroh node
 and does not pin files in v0; file transfer is pure P2P between
 online peers. Account recovery is via a 24-word paper-backup
 mnemonic. Federation across home servers is deferred but the design
@@ -749,26 +762,21 @@ keeps it as a flip-the-switch future change.
 
 ---
 
-## Appendix A: running a NATS broker locally for development
+## Appendix A: running just NATS during development
 
-NATS is not part of our project, but you need one running to test the
-chat-server end-to-end. Easiest path during development:
+The full deployment is `docker compose up` in `deploy/` — that brings
+nats, chat-server, and sig-server together. But if you're iterating on
+chat-server in `cargo watch` and want a standalone NATS to point at:
 
 ```sh
 docker run -d --name kez-dev-nats \
   -p 4222:4222 -p 8222:8222 \
-  -v "$PWD/dev-nats.conf:/etc/nats/nats.conf:ro" \
+  -v "$PWD/deploy/nats.conf:/etc/nats/nats.conf:ro" \
   nats:latest -c /etc/nats/nats.conf --jetstream
 ```
 
-Where `dev-nats.conf` enables the auth callout pointing at your
-locally-running chat-server (e.g. `http://host.docker.internal:8080/internal/nats/auth`).
+Point your locally-running chat-server at it with
+`NATS_URL=nats://127.0.0.1:4222`. The auth_callout in the same
+`nats.conf` will reach back to `http://host.docker.internal:8080/internal/nats/auth`.
 
-A full reference `dev-nats.conf` will live at `deploy/dev-nats.conf`
-when we start building. This appendix exists so developers have a
-one-liner to spin up NATS for testing; **it is not the production
-deployment recipe** (operators run their own NATS however they want).
-
-For production: see the NATS docs (https://docs.nats.io). Our project
-has no opinion beyond "must be 2.10+ with JetStream + auth_callout
-configured to hit our endpoint."
+Tear down with `docker rm -f kez-dev-nats` when done.
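
The "swap-able by config" behavior this patch documents (bundled `nats` service by default, operator's broker when `NATS_URL` is set) could look roughly like this on the chat-server side. This is a minimal sketch, not the project's code: the helper `resolve_nats_url` and the constant `BUNDLED_NATS_URL` are hypothetical names, and falling back to the compose address when the variable is unset or blank is an assumption. The recipe itself always sets `NATS_URL` explicitly.

```rust
use std::env;

/// Address of the bundled `nats` service as written in deploy/docker-compose.yml.
/// Using it as a fallback is an assumption made for this sketch.
const BUNDLED_NATS_URL: &str = "nats://nats:4222";

/// Hypothetical helper: prefer an operator-supplied broker URL and fall back
/// to the bundled compose service when none is given or it is blank.
fn resolve_nats_url(override_url: Option<&str>) -> String {
    match override_url.map(str::trim) {
        Some(url) if !url.is_empty() => url.to_string(),
        _ => BUNDLED_NATS_URL.to_string(),
    }
}

fn main() {
    // NATS_URL is the variable named in the compose recipe and Appendix A.
    let from_env = env::var("NATS_URL").ok();
    let url = resolve_nats_url(from_env.as_deref());
    println!("chat-server would connect to NATS at {url}");
}
```

With `NATS_URL=nats://127.0.0.1:4222` exported (the Appendix A case), the helper returns that address unchanged; with the variable unset it returns the bundled service address.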