Kez/kez-chat/document.md
Tudisco f0aa86f71a plan(kez-chat): NATS is external infrastructure, not part of our stack
Sharpen the framing: our project doesn't ship, embed, supervise, or
even sit-next-to NATS. NATS is external infrastructure the operator
provides (their own server, Synadia Cloud, whatever) and we connect
to it the way an app connects to a database.

Changes:

- §4.2 process model: redraw the diagram showing NATS *outside* our
  deployment boundary (with a dashed line for "external"), our two
  services on one side, chat-server reaches out to the operator's
  NATS via the auth callout.

- §4.3 docker-compose sketch: remove the nats container entirely.
  Our compose ships chat-server + sig-server only. NATS_URL is an
  environment variable the operator sets. We document the nats.conf
  snippet the operator needs to add to their own NATS deployment.

- §6.4 NATS broker section rewritten as "external dependency" — what
  we require from the operator's NATS (version, JetStream, callout
  config), and why we don't bundle it (NATS is its own ops problem;
  operators may already have one; we shouldn't lock them in).

- §11 sequenced plan step 3: developers spin up a local NATS for
  testing via Appendix A, not "run nats-server in a sibling container."

- Decisions-locked row for NATS now explicit: "We don't ship, embed,
  or supervise it. We connect to whatever broker NATS_URL points at."

- New Appendix A: "running a NATS broker locally for development" —
  one-liner docker run for testing, with explicit "this is dev only,
  not the production deployment recipe."

- §12 one-paragraph summary updated to reflect "our project ships two
  services" (chat-server + sig-server), NATS is external.
2026-05-24 22:40:15 -06:00


KEZ Chat & File Share — Design Document

Status: Pre-implementation planning. No code yet.
Last updated: 2026-05-24
Goal: A Keybase-class chat + file sharing experience built on the KEZ identity stack, with NATS for messaging and Iroh for file transfer.


1. What this is

A real-time chat + file-sharing application with verified identities.

  • Users get human-friendly handles like @tudisco@kez.lat.
  • The handle is bound to a KEZ ed25519 primary key; the same key authenticates to the chat infrastructure.
  • Conversations are end-to-end encrypted; the broker is dumb.
  • Files are visible in the sender's "shared files" list but only downloaded when a recipient actually wants them. No background sync.
  • Identity is portable: the underlying key + sigchain survives the home server going dark. Handles can be migrated to other servers later.

This is the Keybase model rebuilt on a decentralized substrate:

  • Identity layer → KEZ (instead of Keybase's central account system)
  • Chat layer → NATS with client-side E2E (instead of Keybase Chat)
  • File layer → Iroh peer-to-peer with content addressing (instead of KBFS)

2. Three-layer architecture

┌─────────────────────────────────────────────────────────────┐
│                kez-chat application                         │
│           (chat UI, file browser, profile views)            │
└────┬──────────────┬─────────────────────┬───────────────────┘
     │              │                     │
     ▼              ▼                     ▼
┌─────────┐   ┌──────────┐         ┌────────────────┐
│   KEZ   │   │   NATS   │         │     Iroh       │
│         │   │          │         │                │
│ ↓ who   │   │ ↓ chat   │         │ ↓ file blobs   │
│ ↓ what  │   │ ↓ tickets│         │ ↓ on-demand    │
│   they  │   │↓ presence│         │ ↓ NAT travers. │
│   own   │   │ ↓ small  │         │ ↓ E2E in QUIC  │
│ ↓ where │   │   stuff  │         │                │
│   they  │   │          │         │                │
│   listen│   │ dumb     │         │                │
│         │   │ broker;  │         │                │
│         │   │ clients  │         │                │
│         │   │ do E2E   │         │                │
└─────────┘   └──────────┘         └────────────────┘
     │              ▲                       ▲
     └─────────── sigchain ──────────────────┘
       (handle → KEZ primary → endpoints
        and links to other identities)

Each layer does one thing well. Each is replaceable without touching the others. The KEZ sigchain is the bridge that ties them together — it tells a verifier "this user's broker is X, their Iroh nodes are Y₁ and Y₂."


3. Identity & username model

3.1 Handles

Handles look like email and Mastodon addresses:

@tudisco@kez.lat
@chris@kez.lat
@alice@chris.com         ← custom domain, opted out of default (future)

kez.lat is the placeholder default home server domain. We'll lock in the real production domain before launch.

For v0, the handle namespace is global — registration is on the one default home server. Federation (multiple servers with their own namespaces) is deliberately not in v0, but the design must not preclude it. See §3.5.

In the UI, since there's only one home server in v0, handles are displayed bare (@tudisco). The @kez.lat suffix is implied and stored internally.

3.2 Key generation tied to username

When a user creates an account:

  1. App generates a fresh ed25519 keypair locally.
    • This is the user's KEZ primary key.
    • It's also their NATS nkey for the chat broker (same key, same algorithm).
    • It's also their Iroh node identity (same primitive again).
  2. App registers @username on the home server's handle registry.
    • Sends a signed registration request proving control of the private key.
    • Registry rejects squatting (first-come-first-served).
  3. App initializes a sigchain for the new primary.
    • First event: add_endpoint advertising the NATS broker the app will use.
    • Second event: add_endpoint advertising the Iroh NodeId of the local device.
  4. App uploads the sigchain to the deployed kez-sig-server.

After this flow the user has a fully working KEZ identity:

  • @tudisco@kez.lat resolves via the handle registry to their primary key.
  • That key's sigchain (on kez-sig-server) advertises their NATS broker and Iroh nodes.
  • Other users can verify them and reach them.
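The endpoint discovery described above can be sketched as a fold over sigchain events. This is a minimal illustration, not the kez-core API: the `Event` and `Endpoints` types are hypothetical, signature verification is assumed to have happened already, and "a later NATS endpoint replaces an earlier one" is an assumption the document does not state.

```rust
/// Hypothetical event shapes. The real sigchain encoding lives in kez-core.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    AddNatsEndpoint { url: String, inbox: String },
    AddIrohEndpoint { node_id: String },
}

/// What a verifier learns about a user after walking their sigchain.
#[derive(Debug, Default, PartialEq)]
struct Endpoints {
    nats: Option<(String, String)>, // (broker URL, inbox subject)
    iroh_nodes: Vec<String>,
}

/// Walk already-verified events in order and collect the advertised endpoints.
fn collect_endpoints(events: &[Event]) -> Endpoints {
    let mut out = Endpoints::default();
    for event in events {
        match event {
            // Assumption: the most recent NATS endpoint wins.
            Event::AddNatsEndpoint { url, inbox } => {
                out.nats = Some((url.clone(), inbox.clone()));
            }
            // A user may advertise several Iroh nodes; skip exact repeats.
            Event::AddIrohEndpoint { node_id } => {
                if !out.iroh_nodes.contains(node_id) {
                    out.iroh_nodes.push(node_id.clone());
                }
            }
        }
    }
    out
}
```

The same walk is what a contact's app would run in §5.2 after fetching the sigchain.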

3.3 Why ed25519 only for this app

Both KEZ primary types work in general, but the chat app requires ed25519:

  • NATS nkeys are ed25519. Direct alignment: the user's KEZ primary key is their NATS credential. No second auth scheme.
  • Iroh node IDs are ed25519. Same primitive, native fit.
  • One key type to manage. Users with a pre-existing nostr key can still attach it to their KEZ sigchain as a verifiable claim (so they're cross-referenced on nostr too), but the primary that runs the app is ed25519. The nostr key never participates in chat or file transport.

3.4 Account recovery: paper backup (Keybase-style)

The user's ed25519 private key is the only thing that can prove their identity. Lose it, lose the account.

Recovery model for v0:

  • On account creation, the app converts the 32-byte ed25519 seed to a mnemonic phrase (BIP-39 style, 24 words). Standard, well-tested word lists, deterministic encoding.
  • App forces the user to write it down before continuing — shows the words, asks for confirmation, then asks them to retype a few random words back to prove they recorded it.
  • App stores the seed locally in OS-protected storage (Keychain, Credential Manager, libsecret). Mnemonic is shown only at creation and on-demand from settings.
  • Lost device flow: user installs the app on a new device, types their mnemonic, app regenerates the same ed25519 keypair, then pulls the sigchain from kez-sig-server to restore their identity state.
  • The handle is still theirs because the registry knows the primary key.

No server-side recovery. No email reset. No customer support. Same model Bitcoin wallets and Keybase used — user holds the seed phrase, user is responsible for it.
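The 24-word figure follows from the BIP-39 layout: 256 seed bits plus 8 checksum bits give 264 bits, which split into 24 groups of 11 bits, each an index into a 2048-word list. The sketch below shows only that bit layout. It assumes the caller supplies the checksum byte (in BIP-39, the first byte of the SHA-256 of the seed) and leaves out the word list itself; the function names are illustrative.

```rust
/// Split a 32-byte seed plus one checksum byte into 24 word indices (11 bits each).
fn word_indices(seed: &[u8; 32], checksum: u8) -> [u16; 24] {
    let mut bytes = [0u8; 33];
    bytes[..32].copy_from_slice(seed);
    bytes[32] = checksum;

    let mut indices = [0u16; 24];
    for (word, slot) in indices.iter_mut().enumerate() {
        let mut value: u16 = 0;
        for offset in 0..11 {
            let bit = word * 11 + offset; // big-endian bit position in the stream
            let set = (bytes[bit / 8] >> (7 - bit % 8)) & 1;
            value = (value << 1) | u16::from(set);
        }
        *slot = value;
    }
    indices
}

/// Inverse of `word_indices`: rebuild the seed and the checksum byte.
/// A real implementation would recompute the checksum and reject a mismatch.
fn seed_from_indices(indices: &[u16; 24]) -> ([u8; 32], u8) {
    let mut bytes = [0u8; 33];
    for (word, &value) in indices.iter().enumerate() {
        for offset in 0..11 {
            if (value >> (10 - offset)) & 1 == 1 {
                let bit = word * 11 + offset;
                bytes[bit / 8] |= 1 << (7 - bit % 8);
            }
        }
    }
    let mut seed = [0u8; 32];
    seed.copy_from_slice(&bytes[..32]);
    (seed, bytes[32])
}
```

On the lost-device flow, `seed_from_indices` is the step that turns the typed words back into the seed from which the keypair is regenerated.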

3.5 Federation-ready design (not in v0)

For v0 we have one home server (kez.lat). All handles live there. To make sure we don't paint ourselves into a corner:

  1. Internal representation of a handle is always the qualified form (tudisco@kez.lat), never just tudisco. The UI strips the suffix for display; storage always keeps the full form.
  2. Handle resolution is HTTP-based, not hard-coded. The chat app looks up chris@kez.lat by hitting https://kez.lat/v1/u/chris. When federation lands, looking up chris@example.com hits https://example.com/v1/u/chris instead.
  3. WebFinger endpoint included from v0 — so cross-server discovery already works via standard tooling, even if our app only uses our own server for now.
  4. Sigchain endpoint URLs are fully qualified. A user's sigchain lives at https://sig.kez.lat/v1/sigchains/ed25519/<hex> — when another server's user wants to verify ours, the URL is right there.

The v0 chat app might hard-code "lookups go to kez.lat" for now; flipping that to "lookups go to whatever's after the @" is a config change later, not a redesign.
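A minimal sketch of points 1 and 2, assuming a hypothetical `Handle` type: storage always holds the qualified form, the UI strips the default domain, and the lookup URL is derived from whatever follows the `@`. The validation rules here are deliberately loose and are not the registry's real policy.

```rust
/// Placeholder default home server domain from §3.1.
const DEFAULT_DOMAIN: &str = "kez.lat";

#[derive(Debug, Clone, PartialEq, Eq)]
struct Handle {
    user: String,
    domain: String,
}

impl Handle {
    /// Accepts "@tudisco", "tudisco", "@tudisco@kez.lat" or "chris@example.com".
    fn parse(input: &str) -> Option<Handle> {
        let s = input.strip_prefix('@').unwrap_or(input);
        let (user, domain) = match s.split_once('@') {
            Some((user, domain)) => (user, domain),
            None => (s, DEFAULT_DOMAIN), // bare handles imply the default server
        };
        if user.is_empty() || domain.is_empty() || domain.contains('@') {
            return None;
        }
        Some(Handle { user: user.to_string(), domain: domain.to_string() })
    }

    /// Storage form: always fully qualified.
    fn qualified(&self) -> String {
        format!("{}@{}", self.user, self.domain)
    }

    /// Display form: the default domain is implied and hidden.
    fn display(&self) -> String {
        if self.domain == DEFAULT_DOMAIN {
            format!("@{}", self.user)
        } else {
            format!("@{}@{}", self.user, self.domain)
        }
    }

    /// Resolution URL: goes to whatever server is named after the '@'.
    fn lookup_url(&self) -> String {
        format!("https://{}/v1/u/{}", self.domain, self.user)
    }
}
```

With this shape, federation changes nothing in the parsing or storage code; only the set of domains the client is willing to contact grows.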


4. The home server (kez-chat-server)

A single Rust binary, deployed as one container alongside kez-sig-server. The NATS broker it connects to is external infrastructure (see §4.2).

4.1 What it does (and what it doesn't)

| Responsibility | This server? |
|---|---|
| Handle registry | Yes |
| NATS auth callout | Yes |
| WebFinger endpoint | Yes |
| HTTP API for clients | Yes |
| Sigchain storage | No. Deferred to kez-sig-server (separate container). |
| NATS broker | No. External infrastructure the operator provides (see §4.2). |
| Iroh pinning | No for v0. Files transfer P2P when both peers are online. Pinning is a future tier. |
| Channel verification (gist/dns/etc.) | No. Clients do it locally via kez-channels. KEZ is used only for identity, not as part of chat. |

The chat server is deliberately small. Microservices: each service does one thing, deployed independently. The operator runs two of our containers (chat-server + sig-server) and points them at a NATS broker they provide. When pinning lands later, that becomes a third optional container.

4.2 Process / deployment model

NATS is not part of our deployment. The operator runs NATS however they want (Synadia Cloud, their own cluster, a friend's broker, a single local container) and gives the chat-server a URL. Same idea as a database: we connect to one; we don't ship one.

                External infrastructure
                (operator's responsibility)
                ┌──────────────────────┐
                │     NATS broker      │
                │   + JetStream        │
                │   somewhere          │
                └─────────▲─────▲──────┘
                          │     │
       chat-server ──────┘     │ ◄────── client app
       (auth callout)           │         (publish/subscribe)
                                │
┌─────────────── our deployment ─────────────────┐
│                                                │
│  ┌─────────────────┐   ┌────────────────┐      │
│  │ kez-chat-server │   │ kez-sig-server │      │
│  │   (Rust)        │   │   (Rust)       │      │
│  │                 │   │                │      │
│  │ ↓ handles       │   │ ↓ sigchain     │      │
│  │ ↓ nats auth     │   │   storage      │      │
│  │ ↓ HTTP API      │   │                │      │
│  └─────────────────┘   └────────────────┘      │
│         ▲                      ▲               │
└─────────┼──────────────────────┼───────────────┘
          │                      │
   ┌──────┴──────────────────────┴────────────────────────┐
   │  Chat app (per user, runs on phone/desktop)          │
   │                                                      │
   │  • talks to the operator's NATS broker (NATS proto)  │
   │  • talks to kez-chat-server over HTTPS               │
   │  • talks to kez-sig-server over HTTPS                │
   │  • runs local iroh::Node for file send/receive       │
   └──────────────────────────────────────────────────────┘

The chat-server orchestrates auth against whatever NATS broker is configured, but doesn't run, host, supervise, or ship NATS in any form.

4.3 docker-compose sketch (our two services only)

# deploy/docker-compose.yml — what we ship
services:
  chat-server:
    build: .                   # kez-chat-server Rust binary
    environment:
      NATS_URL: ${NATS_URL}    # operator points us at their NATS broker
      SIG_SERVER_URL: http://sig-server:7878
      DB_PATH: /data/handles.db
      AUTH_CALLOUT_NKEY_PATH: /etc/kez/auth-callout.nkey
    volumes:
      - chat-data:/data
      - ./auth-callout.nkey:/etc/kez/auth-callout.nkey:ro
    depends_on: [sig-server]
    ports:
      - "8080:8080"            # HTTP API for clients

  sig-server:
    image: kez-sig-server:latest   # the existing rust-sig-server
    environment:
      KEZ_DB: /data/sigchains.db
    volumes:
      - sig-data:/data
    ports:
      - "7878:7878"

volumes:
  chat-data:
  sig-data:

NATS is not in this file. The operator brings their own: running on a different host, in a different compose project, on Synadia Cloud, or wherever. They give us NATS_URL and add our auth callout settings to their nats.conf.
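On the chat-server side, the four environment variables from the compose file could be read at startup roughly as follows. This is a sketch: the `Config` struct and the rule that all four settings are required are assumptions, and the lookup is passed in as a closure so the parsing can be tested without touching the process environment.

```rust
use std::path::PathBuf;

/// Hypothetical startup configuration, mirroring the compose file's variables.
#[derive(Debug, PartialEq)]
struct Config {
    nats_url: String,
    sig_server_url: String,
    db_path: PathBuf,
    auth_callout_nkey_path: PathBuf,
}

impl Config {
    /// Build the config from any key lookup; unset or empty values are errors.
    fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Result<Config, String> {
        let need = |key: &str| -> Result<String, String> {
            get(key)
                .filter(|value| !value.is_empty())
                .ok_or_else(|| format!("missing required setting {key}"))
        };
        Ok(Config {
            nats_url: need("NATS_URL")?,
            sig_server_url: need("SIG_SERVER_URL")?,
            db_path: PathBuf::from(need("DB_PATH")?),
            auth_callout_nkey_path: PathBuf::from(need("AUTH_CALLOUT_NKEY_PATH")?),
        })
    }

    /// Production entry point: read from the process environment.
    fn from_env() -> Result<Config, String> {
        Config::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Failing fast on a missing NATS_URL fits the "we connect to one; we don't ship one" stance: there is no bundled broker to fall back to.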

What the operator needs to add on the NATS side (in their config):

# nats.conf — added to whatever NATS deployment the operator runs
authorization {
  auth_callout {
    issuer: "<our auth-callout signing nkey public part>"
    auth_users: ["AUTHUSER"]      # placeholder identity NATS uses
    account: "DEFAULT"
  }
}

The chat-server signs auth-callout responses with a long-lived nkey that NATS trusts. When a client connects to NATS with their KEZ ed25519 key, NATS forwards the auth request to our chat-server, which checks the handle registry and signs a yes/no response.
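The "yes" half of that response carries the subjects the client may use. The sketch below shows one possible policy and a checker for it, using NATS subject wildcards (`*` matches exactly one token, `>` matches one or more trailing tokens). The policy itself is an assumption drawn from §5.3: subscribe only to your own inbox, publish to any inbox. The `Permissions` type and function names are hypothetical, and a real policy would also have to cover the JetStream API and reply subjects, which this leaves out.

```rust
/// NATS-style subject match: '*' is one token, '>' is the non-empty tail.
fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut p = pattern.split('.');
    let mut s = subject.split('.');
    loop {
        match (p.next(), s.next()) {
            // '>' is only meaningful as the last pattern token.
            (Some(">"), Some(_)) => return p.next().is_none(),
            (Some("*"), Some(tok)) if !tok.is_empty() => {}
            (Some(a), Some(b)) if a == b => {}
            (None, None) => return true,
            _ => return false,
        }
    }
}

/// Hypothetical per-user permission set returned by the auth callout.
#[derive(Debug, PartialEq)]
struct Permissions {
    publish_allow: Vec<String>,
    subscribe_allow: Vec<String>,
}

/// Assumed v0 policy: read only your own inbox, write to anyone's.
fn permissions_for(pubkey_hex: &str) -> Permissions {
    Permissions {
        publish_allow: vec!["kez.inbox.*".to_string()],
        subscribe_allow: vec![format!("kez.inbox.{pubkey_hex}")],
    }
}

fn may_subscribe(perms: &Permissions, subject: &str) -> bool {
    perms.subscribe_allow.iter().any(|p| subject_matches(p, subject))
}

fn may_publish(perms: &Permissions, subject: &str) -> bool {
    perms.publish_allow.iter().any(|p| subject_matches(p, subject))
}
```

Keeping the policy in a pure function like `permissions_for` makes the permission test cases in §10 cheap to write without a running broker.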

We provide a reference nats.conf snippet in the docs. The operator patches it into their own NATS deployment.

For local development, see Appendix A.

4.4 Endpoints

GET   /v1/healthz
GET   /v1/u/:handle                      handle → { primary, sigchain_url, endpoints }
POST  /v1/register                       claim a handle (signed body)
GET   /.well-known/webfinger?resource=acct:tudisco@kez.lat

# NATS auth callout (called BY nats-server, not by users)
POST  /internal/nats/auth                verify nkey signature, return permissions

Sigchain endpoints are not on this server — clients talk directly to kez-sig-server for those.


5. End-to-end flows

5.1 Account creation — @tudisco@kez.lat

1. User opens chat app, clicks "Create account"
2. App: generates ed25519 keypair locally
3. App: converts seed to 24-word mnemonic, makes user write it down,
   verifies recall before continuing
4. App: user picks handle "tudisco"
5. App → chat-server:
     POST /v1/register
     { "handle": "tudisco",
       "primary": "ed25519:<pubkey-hex>",
       "registration_sig": "<sig over canonical message>" }
6. Server: validates signature, checks handle is free, stores in registry
7. Server: 201 Created
8. App: initializes sigchain locally, signs:
     - add_endpoint { protocol: "nats", url: "...", inbox: "kez.inbox.<pk>" }
     - add_endpoint { protocol: "iroh", node_id: "<local iroh id>" }
9. App → sig-server: POST /v1/sigchains/ed25519/<pk>/events  (one per event)
10. App: connects to nats-server with nkey auth (signed challenge,
    nats-server invokes chat-server's auth callout, gets back yes/no
    + allowed subjects)
11. App: subscribes to JetStream durable consumer on its inbox subject
12. Done — @tudisco@kez.lat is live and reachable
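Steps 5 to 7 on the server side could look roughly like this in-memory sketch. The real registry is planned to be sqlite-backed; the `Registry` type, the handle character rules, and the reserved list are assumptions, and signature verification (done with kez-core) is reduced to a boolean the caller passes in.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum RegisterError {
    BadSignature,
    InvalidHandle,
    Reserved,
    Taken,
}

/// In-memory stand-in for the handle registry.
struct Registry {
    by_handle: HashMap<String, String>, // handle -> "ed25519:<pubkey-hex>"
    reserved: Vec<&'static str>,
}

impl Registry {
    fn new() -> Registry {
        Registry { by_handle: HashMap::new(), reserved: vec!["admin"] }
    }

    /// First-come-first-served registration.
    /// `sig_ok` stands in for verifying the registration signature.
    fn register(&mut self, handle: &str, primary: &str, sig_ok: bool) -> Result<(), RegisterError> {
        if !sig_ok {
            return Err(RegisterError::BadSignature);
        }
        // Assumed rule: 1 to 32 chars of lowercase ASCII, digits, underscore.
        let well_formed = !handle.is_empty()
            && handle.len() <= 32
            && handle
                .chars()
                .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_');
        if !well_formed {
            return Err(RegisterError::InvalidHandle);
        }
        if self.reserved.contains(&handle) {
            return Err(RegisterError::Reserved);
        }
        if self.by_handle.contains_key(handle) {
            return Err(RegisterError::Taken);
        }
        self.by_handle.insert(handle.to_string(), primary.to_string());
        Ok(())
    }

    fn lookup(&self, handle: &str) -> Option<&str> {
        self.by_handle.get(handle).map(|s| s.as_str())
    }
}
```

In the HTTP layer, `Ok(())` would map to the 201 Created of step 7; the status codes for the error cases are not specified in this document.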

5.2 Adding a contact

1. Tudisco types "@chris" in app
2. App → chat-server: GET /v1/u/chris
   Returns: { primary: "ed25519:abc...", sigchain_url: "https://sig.kez.lat/..." }
3. App → sig-server (URL from above): fetch sigchain
4. App walks events to extract:
     - NATS broker URL + inbox subject (from add_endpoint nats)
     - Iroh node IDs (from add_endpoint iroh)
     - Other identity claims (github:chris, dns:chris.com, etc. — for display)
5. App caches LOCALLY: { "@chris@kez.lat" => ed25519:abc..., endpoints: {...} }
   (TOFU — trust on first use)
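The trust-on-first-use cache in step 5 can be sketched as a small pinning map. The `ContactCache` and `Tofu` names are hypothetical, and how the UI reacts to a mismatch is left open here, since the document does not specify it.

```rust
use std::collections::HashMap;

/// Outcome of comparing a freshly resolved key against the local pin.
#[derive(Debug, PartialEq)]
enum Tofu {
    /// Never seen before: the key is now pinned.
    FirstUse,
    /// Same key as the one pinned earlier.
    Match,
    /// The registry now returns a different key than the pinned one.
    Mismatch { pinned: String },
}

/// Local cache mapping qualified handles to pinned primary keys.
struct ContactCache {
    pinned: HashMap<String, String>,
}

impl ContactCache {
    fn new() -> ContactCache {
        ContactCache { pinned: HashMap::new() }
    }

    /// Record or check the key resolved for `handle`.
    /// A mismatch never overwrites the existing pin.
    fn observe(&mut self, handle: &str, primary: &str) -> Tofu {
        let existing = self.pinned.get(handle).cloned();
        match existing {
            None => {
                self.pinned.insert(handle.to_string(), primary.to_string());
                Tofu::FirstUse
            }
            Some(pinned) if pinned == primary => Tofu::Match,
            Some(pinned) => Tofu::Mismatch { pinned },
        }
    }
}
```

Keying the cache by the qualified handle (`chris@kez.lat`) keeps it correct once handles from other servers exist.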

5.3 Sending a chat message

1. Tudisco types "hello" to Chris
2. App looks up Chris's primary key + NATS endpoint from local cache
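3. App derives the symmetric key for this pair of users (v0 reuses it for
   every message between them; see §8). Both ed25519 keys are first
   converted to their X25519 form:
     X25519(tudisco_priv, chris_pub) → HKDF → 32-byte ChaCha20-Poly1305 key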
4. App encrypts "hello" with that key (+ random nonce)
5. App signs ciphertext with tudisco's KEZ primary
6. App publishes to subject `kez.inbox.<chris-pubkey-hex>` on the NATS
   broker, JetStream-published so the broker stores it durably
7. Chris's app (subscribed via durable consumer) receives the message
   whenever next online — broker buffers it if offline
8. Chris's app verifies signature against tudisco's key, decrypts,
   shows "tudisco: hello"

The broker sees:

  • An nkey-authenticated client publishing encrypted bytes to a subject
  • Metadata: which nkey published, and the recipient's public key in the subject name
  • It does NOT see message contents. The signed sender identity is in the payload, not in the NATS frame.
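One way to frame what actually goes on the wire in step 6 is a fixed-layout envelope. The layout below (sender key, nonce, ciphertext, signature) is purely an assumption for illustration; the document does not define a wire format. The sizes are the standard ones: 32-byte ed25519 public key, 12-byte ChaCha20-Poly1305 nonce, 16-byte Poly1305 tag, 64-byte ed25519 signature. Encryption and signing themselves are out of scope here.

```rust
const KEY_LEN: usize = 32; // ed25519 public key
const NONCE_LEN: usize = 12; // ChaCha20-Poly1305 nonce
const TAG_LEN: usize = 16; // Poly1305 tag, appended to the ciphertext
const SIG_LEN: usize = 64; // ed25519 signature

/// Hypothetical wire envelope: sender || nonce || ciphertext || signature.
#[derive(Debug, Clone, PartialEq)]
struct Envelope {
    sender: [u8; KEY_LEN],
    nonce: [u8; NONCE_LEN],
    ciphertext: Vec<u8>, // includes the authentication tag
    signature: [u8; SIG_LEN],
}

impl Envelope {
    fn encode(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(KEY_LEN + NONCE_LEN + self.ciphertext.len() + SIG_LEN);
        out.extend_from_slice(&self.sender);
        out.extend_from_slice(&self.nonce);
        out.extend_from_slice(&self.ciphertext);
        out.extend_from_slice(&self.signature);
        out
    }

    /// Returns None if the buffer is too short to hold every field.
    fn decode(bytes: &[u8]) -> Option<Envelope> {
        if bytes.len() < KEY_LEN + NONCE_LEN + TAG_LEN + SIG_LEN {
            return None;
        }
        let (sender_bytes, rest) = bytes.split_at(KEY_LEN);
        let (nonce_bytes, rest) = rest.split_at(NONCE_LEN);
        let (ciphertext, sig_bytes) = rest.split_at(rest.len() - SIG_LEN);

        let mut sender = [0u8; KEY_LEN];
        sender.copy_from_slice(sender_bytes);
        let mut nonce = [0u8; NONCE_LEN];
        nonce.copy_from_slice(nonce_bytes);
        let mut signature = [0u8; SIG_LEN];
        signature.copy_from_slice(sig_bytes);

        Some(Envelope { sender, nonce, ciphertext: ciphertext.to_vec(), signature })
    }
}
```

Decoding only checks lengths. The receiver still has to verify the signature against the cached key for the sender before attempting decryption, as in step 8.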

5.4 Sharing a file (v0: both peers online)

1. Tudisco drags `report.pdf` into the chat with Chris
2. App imports the blob into local Iroh node → BLAKE3 hash + ticket
3. App optionally adds an entry to tudisco's "shared files" manifest
   (visible if Chris later browses tudisco's profile)
4. App generates a per-file symmetric content key
5. App encrypts the blob in place (or stores both plaintext + encrypted —
   detail for later) with the content key
6. App wraps the content key for chris's KEZ key (X25519 → HKDF)
7. App sends a NATS message to chris's inbox:
     { type: "file_share",
       filename: "report.pdf",
       size: 1234567,
       iroh_ticket: "blobac://...",
       wrapped_content_key: "..." }
   (same encryption as chat messages, so chris can read this)
8. Chris's app sees the notification: "tudisco shared report.pdf (1.2 MB)"
   File NOT downloaded yet.
9. Chris clicks Download.
10. Chris's app opens an Iroh connection to tudisco's NodeId (from
    tudisco's sigchain), pulls the blob via the ticket, decrypts with
    the unwrapped content key, verifies BLAKE3 hash. File appears.

v0 limitation: If tudisco is offline at step 10, chris waits. Iroh will retry; download starts when tudisco's node comes back. Pinning (the server holding a copy) is not in v0 — we accept this limitation in exchange for zero server-side storage cost and the simplest possible architecture.
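The `file_share` message from step 7 and the notification from step 8 can be sketched as a plain struct plus a formatter. The field names follow the message shown above; the `FileShare` type, the decimal size units, and the helper names are assumptions.

```rust
/// Decrypted `file_share` payload, as received on the inbox subject.
#[derive(Debug, Clone, PartialEq)]
struct FileShare {
    filename: String,
    size: u64, // bytes
    iroh_ticket: String,
    wrapped_content_key: String,
}

/// Human-readable size using decimal units (1 MB = 1,000,000 bytes).
fn human_size(bytes: u64) -> String {
    const UNITS: [&str; 4] = ["B", "KB", "MB", "GB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    while value >= 1000.0 && unit < UNITS.len() - 1 {
        value /= 1000.0;
        unit += 1;
    }
    if unit == 0 {
        format!("{bytes} B")
    } else {
        format!("{value:.1} {}", UNITS[unit])
    }
}

/// Text shown to the recipient. Nothing is downloaded at this point:
/// the ticket and wrapped key are only used once the user clicks Download.
fn notification(sender: &str, share: &FileShare) -> String {
    format!("{sender} shared {} ({})", share.filename, human_size(share.size))
}
```

Because rendering needs only `filename` and `size`, the notification works while the sender is offline; only the fetch in step 10 needs the sender's node.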

5.5 Browsing someone's files (Keybase-style)

1. Chris opens tudisco's profile
2. App resolves @tudisco → primary → sigchain
3. Sigchain has a `set_shared_files` op pointing at a manifest blob hash
4. App fetches the manifest blob via Iroh (small, fast)
5. App decrypts entries wrapped for chris's key, ignores ones it can't
   decrypt (those are wrapped for other people)
6. App renders the visible entries: name, size, share date,
   thumbnail (if present)
7. Chris clicks an entry → flow continues like §5.4 step 9

Manifest is small (KB-scale); blobs are MB-to-GB. Browsing is cheap; fetching is per-file deliberate. Recipient never auto-syncs.
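Step 5 amounts to trial decryption: try to open every entry, keep the ones that open. A minimal sketch, with the actual key unwrapping and decryption abstracted behind a closure; `SealedEntry`, `FileEntry`, and `visible_entries` are hypothetical names, and the manifest's real encoding is still the open question in §9.

```rust
/// A decrypted manifest entry, ready to render.
#[derive(Debug, PartialEq)]
struct FileEntry {
    name: String,
    size: u64,
}

/// An opaque encrypted entry as stored in the manifest blob.
struct SealedEntry {
    bytes: Vec<u8>,
}

/// Keep only the entries the local key can open.
/// `try_open` stands in for "unwrap the content key with my KEZ key and
/// decrypt"; it returns None for entries wrapped for someone else.
fn visible_entries<F>(manifest: &[SealedEntry], try_open: F) -> Vec<FileEntry>
where
    F: Fn(&[u8]) -> Option<FileEntry>,
{
    manifest
        .iter()
        .filter_map(|entry| try_open(&entry.bytes))
        .collect()
}
```

Entries that fail to open are dropped silently rather than reported, so a viewer learns nothing about files shared with other people beyond the number of entries in the blob.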


6. Project & folder layout

6.1 Where this project lives

/Kez
├── rust-lib/                ← (proposed refactor) shared Rust libraries
│   ├── Cargo.toml           workspace
│   └── crates/
│       ├── kez-core/        moved from rust/crates/
│       └── kez-channels/    moved from rust/crates/
│
├── rust/                    ← Rust CLI (kez binary)
│   └── crates/
│       └── kez-cli/         depends on ../../rust-lib/crates/...
│
├── rust-sig-server/         ← existing sigchain storage (reused as-is)
│
├── kez-chat/                ← THIS PROJECT
│   ├── document.md          (this file)
│   ├── Cargo.toml
│   ├── src/
│   │   ├── main.rs          binary entry
│   │   ├── handles.rs       handle registry (sqlite-backed)
│   │   ├── nats_auth.rs     NATS auth callout endpoint
│   │   ├── webfinger.rs     WebFinger discovery endpoint
│   │   └── api.rs           axum routes + state
│   ├── deploy/
│   │   ├── docker-compose.yml   chat-server + sig-server (no NATS)
│   │   ├── nats.conf            reference auth_callout snippet for operators
│   │   └── systemd/             alternative deployment
│   └── tests/
│       └── http.rs              integration tests
│
├── nodejs/                  ← (unchanged)
└── crosstest.sh             ← (path updates if rust-lib moves)

6.2 The rust-lib/ proposal — share code, no duplication

Right now, kez-core and kez-channels live inside rust/crates/. The sig-server and the new chat-server both want to use them. Today's downstream path-dep is:

kez-core = { path = "../rust/crates/kez-core" }

…which works but reaches into another project's crate tree.

Recommendation: move the pure libraries out into a top-level rust-lib/ workspace. The CLI stays in rust/. Downstream servers depend on ../rust-lib/crates/kez-core. Clean structure, no duplication, no confusion about which folder owns what.

Refactor steps:

  • mv rust/crates/kez-core rust-lib/crates/
  • mv rust/crates/kez-channels rust-lib/crates/
  • Create rust-lib/Cargo.toml (workspace).
  • Update rust/Cargo.toml to have just kez-cli.
  • Update path deps in: rust/crates/kez-cli/Cargo.toml, rust-sig-server/Cargo.toml.
  • Update crosstest.sh if any paths are hardcoded.

Suggested order: do the refactor before starting kez-chat so we import cleanly from the start.

6.3 Dependencies (planned)

| Crate | Why |
|---|---|
| kez-core (path) | Identity types, ed25519, signing |
| kez-channels (path) | Verify users' linked accounts when displayed |
| axum 0.8 | HTTP API |
| tokio | Async runtime |
| rusqlite (bundled) | Handle registry |
| async-nats | NATS client (admin work + the auth callout glue) |
| serde / serde_json | Standard |
| thiserror / anyhow | Standard |
| tracing / tracing-subscriber | Logging |
| tower-http | CORS, request tracing |
| clap | CLI args |

Not depended on by the chat-server:

  • iroh — server doesn't run an Iroh node in v0 (no pinning)
  • nats-server (Go): external infrastructure the operator runs, not a Rust dep

6.4 NATS broker — external dependency

NATS is not part of our project. It's external infrastructure the operator provides, the same way they'd provide a database or an SMTP relay. We ship:

  • An async-nats client used by the chat-server (admin/utility work)
  • An auth-callout HTTP endpoint that NATS calls during client connection
  • A documented nats.conf snippet operators add to their NATS deployment
  • A reference local-dev setup (Appendix A) for running NATS yourself while developing

What we require from the operator's NATS:

| Requirement | Why |
|---|---|
| NATS 2.10+ (for auth_callout) | We rely on auth callout to bridge KEZ identity into NATS |
| JetStream enabled | For offline message buffering (durable consumers) |
| TCP reachable from chat-server and clients | Standard |
| TLS (in production) | Standard |
| auth_callout configured to hit our endpoint | Required for client auth |

That's it. Operator can run a single Docker container, a clustered production deployment, or a managed service — we don't care, as long as NATS_URL and the callout config are correct.

Why fully external rather than alongside us:

  • NATS is a serious piece of infrastructure with its own scaling and operational concerns. Bundling it implies we're responsible for it. We're not.
  • Operators with existing NATS deployments can reuse them. No "now run our copy of NATS too."
  • Different teams might run different NATS topologies (single instance, cluster, mesh, leaf nodes). None of that is our problem.
  • Swapping NATS implementations or moving to a managed provider is a config change, not a code change.

6.5 Iroh — client-side only

Clients run a local Iroh node for sending and receiving files. The chat-server does NOT run an Iroh node in v0.

Implication: when @tudisco shares a file with @chris, the bytes go directly from tudisco's device to chris's device via Iroh. If tudisco is offline, chris waits. There's no fallback to a server-stored copy.

This is the simplest possible operational model. Pinning (server-side fallback storage) is a future addition (§8).


7. MVP scope

Server (kez-chat-server)

  • HTTP API scaffold (axum + tokio)
  • Handle registry (POST /register, GET /u/:handle)
  • Registration signature validation (uses kez-core)
  • WebFinger endpoint
  • NATS auth callout (POST /internal/nats/auth)
  • Healthz / metrics
  • Integration tests against real nats-server + sig-server in a test docker-compose

Deployment

  • docker-compose.yml (chat-server + sig-server)
  • Reference nats.conf snippet with auth_callout, for the operator's NATS
  • systemd alternative deployment recipe
  • README with TLS / reverse proxy guidance

Client (kez-chat-cli — separate project later)

Out of scope for the server work, but the server isn't usable without at least a CLI client that does:

  • Account creation (key gen + mnemonic backup + handle registration)
  • Contact lookup + verification
  • Send / receive 1:1 chat messages (E2E via NATS)
  • Send / receive files (E2E via Iroh)
  • Browse @user shared-files manifest

UI app comes after CLI proves the flow works.


8. Out of scope (v0)

  • Iroh pinning (sender must be online for receiver to fetch)
  • Group chat (only 1:1 for v0)
  • Forward secrecy / ratcheting (Double Ratchet, MLS) — chat is encrypted but each message uses the same X25519-derived key per pair
  • Voice / video calls
  • Multi-device key sync — one device per user in v0
  • Account recovery beyond mnemonic — paper backup is the only recovery
  • Federation across home servers — one server (kez.lat) in v0; design preserves the option
  • Channel-based identity verification — the CLI already does kez verify id ...; not duplicated in the chat-server. Users add KEZ channel proofs (gist, dns, etc.) via the existing CLI separately.
  • Avatars / display names — defer the design. For v0 the UI shows the handle and that's enough.

9. The one remaining open question

Manifest format for "@chris's shared files":

| Option | How | Tradeoff |
|---|---|---|
| A. Signed JSON blob, hash in sigchain | Manifest is a JSON blob stored on Iroh. A new sigchain op set_shared_files commits the latest manifest hash. Recipients walk the sigchain → find the pointer → fetch the manifest blob from Iroh. | Simpler. No Iroh Docs dep. Sigchain anchors the version (signed). Update = new sigchain event. |
| B. Iroh Doc | Manifest is a mutable CRDT document. Recipients subscribe; updates sync in near-real-time. | Fancier UX (live updates). Requires Iroh Docs subsystem (heavier dep, less stable). |

Recommended default: A. Simpler, fewer moving parts, reuses primitives we already have. We can upgrade to B later if real users need real-time profile feed updates.

Settle yes/no on this and the design is locked.


Decisions locked from earlier discussion

| Question | Decision |
|---|---|
| Bundle sigchain in chat-server? | No. Use existing kez-sig-server. Microservices. |
| Bundle NATS into Rust server? | No. NATS is external infrastructure the operator provides. We don't ship, embed, or supervise it. We connect to whatever broker NATS_URL points at. |
| KEZ + nostr coexistence for chat? | No nostr in chat. KEZ is identity-only; nostr only as a verifiable claim in someone's sigchain, not as transport. |
| Handle scope: federation or global? | Global for v0, federation-ready design (see §3.5). |
| Recovery if key lost? | Paper backup (24-word mnemonic), Keybase-style. No server-side recovery. |
| Iroh pinning in v0? | No. Sender must be online for receiver to fetch. Pinning is a future tier. |

10. Risks & honest concerns

  1. NATS auth callout integration depth. Documented but fiddly. nkey signature verification is straightforward; the per-user subject permission glue needs care. Test cases for "user can subscribe to their own inbox only" / "user can publish to other users' inboxes but cannot read them" matter.

  2. Iroh is pre-1.0. Pin a version. Migration is a chore but only touches client code, not identity. Identity stays stable (KEZ).

  3. Single-device assumption. Real users have phones AND laptops. v0 assumes one device per primary. Designing multi-device is a real follow-up.

  4. No offline file delivery. A natural user complaint will be "Chris sent me a file but he's offline now." We've made the trade knowingly; document the limitation clearly in-app ("File will download when @chris is back online").

  5. Handle squatting. First-come-first-served. Mitigations:

    • Rate-limit registration by IP
    • Reserve some handles (@admin, common project names)
    • Accept that some squatting will happen; document the policy
  6. NAT traversal. Iroh handles it with relays. Test on hostile networks (corporate firewalls, mobile carriers with CGNAT) before claiming "just works."

  7. Operational cost. Two containers of ours (chat-server + sig-server), plus the NATS broker the operator provides, bandwidth, and a domain. Cheap at small scale, scales with users. Need a "running kez.lat for 1k users: what does it cost?" answer before community adoption.

11. The plan, sequenced

When we start building:

  1. Refactor: move kez-core + kez-channels to rust-lib/. Small but unblocks clean imports from kez-chat.

  2. Scaffold kez-chat-server (axum + tokio + sqlite + tracing). Handle registry + WebFinger first — these unblock client-side account creation.

  3. NATS auth callout. Bring up a NATS broker for development (see Appendix A), configure its auth_callout to hit our chat-server's /internal/nats/auth. End-to-end: a client can register a handle and then connect to NATS authenticated by its KEZ key.

  4. Minimal kez-chat-cli client (separate project) that does:

    • kez-chat register tudisco
    • kez-chat add @chris
    • kez-chat send @chris "hello"
    • kez-chat listen

     No UI. Enough to prove the chat flow works end-to-end against the server.
  5. Iroh integration in the client (not the server).

    • Client runs a local Iroh node
    • kez-chat share @chris ./file.pdf
    • kez-chat fetch <ticket>
  6. Shared-files manifest. New set_shared_files sigchain op. kez-chat browse @tudisco lists his shared files.

  7. Deployment recipe. docker-compose, systemd, deployment doc.

  8. Then start the GUI app. Could be Tauri (Rust + web frontend), Iced (pure Rust UI), or something else.


12. One-paragraph summary

kez-chat is a Keybase-class chat and file-sharing app built on the KEZ identity stack. Users get @username@kez.lat handles backed by an ed25519 primary key. The same key authenticates to a NATS broker (chat, presence, file tickets — broker is dumb, clients do E2E with ChaCha20-Poly1305 over X25519-derived keys) and identifies an Iroh node (P2P bulk transfer, content-addressed blobs, on-demand fetch). Our project ships two services: a thin Rust kez-chat-server that handles the handle registry + NATS auth callout + HTTP API, and the existing kez-sig-server that stores sigchains. NATS is external infrastructure the operator provides — we never ship, embed, or supervise it. The chat-server does not run an Iroh node and does not pin files in v0; file transfer is pure P2P between online peers. Account recovery is via a 24-word paper-backup mnemonic. Federation across home servers is deferred but the design keeps it as a flip-the-switch future change.


Appendix A: running a NATS broker locally for development

NATS is not part of our project, but you need one running to test the chat-server end-to-end. Easiest path during development:

docker run -d --name kez-dev-nats \
  -p 4222:4222 -p 8222:8222 \
  -v "$PWD/dev-nats.conf:/etc/nats/nats.conf:ro" \
  nats:latest -c /etc/nats/nats.conf --jetstream

Where dev-nats.conf enables the auth callout pointing at your locally-running chat-server (e.g. http://host.docker.internal:8080/internal/nats/auth).

A full reference dev-nats.conf will live at deploy/dev-nats.conf when we start building. This appendix exists so developers have a one-liner to spin up NATS for testing; it is not the production deployment recipe (operators run their own NATS however they want).

For production: see the NATS docs (https://docs.nats.io). Our project has no opinion beyond "must be 2.10+ with JetStream + auth_callout configured to hit our endpoint."