# kez-chat security + protocol TODO

Consolidated from a nostr-protocol expert review and an independent security audit (both run 2026-06-08). Ordered by impact-per-hour-of-work, not by review severity: a CRIT that's a half-week of design discussion isn't a ship-stopper when there are real CRITs that take 30 minutes.

Update the status column as we land things. Cross-links to the original review reports are at the bottom.

## Status legend

- `TODO`: not started
- `WIP`: actively being worked
- `DONE`: landed in main
- `ROADMAP`: committed but multi-day; will need its own design doc
- `WONTFIX`: accepted trade-off, documented

---

## Day 1: ~half a day of work, biggest wins

### #1. Strip envelope metadata (ephemeral x25519 per message) [TODO]

**Why it matters:** the `SealedEnvelope.from` (KEZ identity) and `to` (handle) fields sit in cleartext alongside the ciphertext in `event.content`. Any nostr relay can JSON-parse the content and build a perfect social graph: who-messages-whom, when, how often. The actual message body stays encrypted; all the metadata is wide open.

**Files:**

- `kez-chat/web/src/lib/crypto.ts:42-54` (the `SealedEnvelope` shape)
- `kez-chat/web/src/lib/crypto.ts:115-153` (`sealMessage` / `openMessage`)

**Fix:** replace `envelope.from` (the KEZ identity) with `envelope.eph_pub` (a 32-byte x25519 public key the sender generates for this message only). Recipient does ECDH against `eph_pub` instead of deriving it from the KEZ identity. The decrypted plaintext already carries `from` for identity verification.

Bonus: a fresh ephemeral key per message gives partial forward secrecy. Compromise of the recipient's long-term seed still decrypts retained ciphertexts, but a captured single-message ECDH key reveals only that one message.

**Migration:** envelope `v: 1 → v: 2`. Recipient accepts both during a 1-week window, then drops v1 support.

### #3. Empty Web Push payload [TODO]

**Why it matters:** we send `{type, to: , seq}` to FCM/APNs/Mozilla on every fanout. RFC 8291 encrypts the payload so the push provider can't read the bytes, but the provider already knows the endpoint's owner. The `to` field adds no information for the recipient and *does* give Google a clear "message arrived for alice at T" timeline.

**Files:**

- `kez-chat/src/messages.rs:123-127` (the payload we hand to `push.fanout`)
- `kez-chat/web/src/sw.ts:78-99` (the `push` handler)

**Fix:** send `{}` as the payload. The service worker shows a generic "New kez-chat message" notification; it can still focus an existing tab, which navigates to the conversation list. Deep-linking to a specific peer goes away on cold-open (acceptable trade-off: one extra tap to open the right thread is the price of *not* exporting metadata to FCM).

### #5. Rename the routing tag from `h` to something less claimed [TODO]

**Why it matters:** `h` is informally used by NIP-29 (Simple Groups) as the group id. Today's three relays don't enforce NIP-29 semantics, but the moment a NIP-29-aware relay enters our pool it will try to route our `#h` filter as a group join, and we'll get cryptic failures.

**Files:**

- `kez-chat/web/src/lib/nostr-id.ts:28` (`ADDR_TAG = "h"`)
- `kez-chat/web/src/lib/nostr-transport.ts:201, 269` (publish + subscribe)
- `kez-chat/src/nostr_listener.rs:113` (server-side mirror)

**Fix:** switch to `q` (less-claimed single letter, still indexable per NIP-01). Bump envelope/event `v` so the listener can tell old-tag events from new ones during the migration window. Server-side listener subscribes to BOTH `#h` and `#q` for one week.

### #17. Demote handle-revealing logs to `debug!` [TODO]

**Why it matters:** every fanout currently logs `push: fanout triggered handle= sub_count=N` at INFO level. Operator-side log retention turns this into a permanent "who's chatting" ledger.
Even if we trust the operator (it's us), forensics on a stolen log file leaks the social graph in plaintext.

**Files:**

- `kez-chat/src/push.rs:259-262, 275-281` (fanout + send logging)
- `kez-chat/src/api.rs:387-393` (subscribe registration)

**Fix:** demote the handle-bearing INFO lines to DEBUG. Replace the visible field with a short HMAC of the handle under a server-instance secret so we can still group "all sends for X" in logs without exposing X. Set log level in production to INFO, so DEBUG lines are off by default.

---

## Day 2: another half-day

### #2. Replay protection: bound + timestamp freshness [DONE]

**Why it matters:** `SEEN_CAP=500` evicts oldest event ids once we've seen 500 messages. An active user rolls past that in days, then a malicious relay can re-broadcast any old event and we accept it as a fresh message, because the decrypted `sent_at` is never compared to wall-clock.

**Files:**

- `kez-chat/web/src/lib/nostr-transport.ts:107` (`SEEN_CAP = 500`)
- `kez-chat/web/src/lib/nostr-transport.ts:142` (`slice(-SEEN_CAP)`)
- `kez-chat/web/src/lib/crypto.ts:161-205` (`openMessage`, no freshness check)

**Fix:**

1. Bump `SEEN_CAP` to 10_000 and move from localStorage to IndexedDB so the set isn't capped by the 5MB localStorage quota.
2. In `openMessage`, reject envelopes where `|now − sent_at| > 7 days`.
3. Also clamp `ev.created_at` to `[now − 7d, now + 5min]` before using it as a seq generator; otherwise a relay can backdate or future-date events and either replay or skip-ahead `bumpSince`.

### #4. Reveal-recovery-phrase requires fresh auth [DONE]

**Why it matters:** 30 seconds of access to an unlocked phone = full identity exfil. The Settings → Reveal Phrase button decrypts straight from the persistent-session blob with no re-prompt.

**Files:**

- `kez-chat/web/src/routes/Settings.svelte` (the Reveal flow)

**Fix:** gate Reveal Phrase + Lock + biometric setup behind a fresh passphrase prompt OR a WebAuthn assertion.
Same model Apple/1Password use: "this action requires your password again".

### #15. Rate-limit `POST /v1/messages` [DONE]

**Why it matters:** the endpoint currently accepts anonymous posts (no auth on send) capped at 256KB per envelope. A bot can fill any mailbox until disk fills. Acknowledged in `messages.rs:18-20` ("Spam: v0.1 doesn't gate POST").

**Files:**

- `kez-chat/src/messages.rs:70-133`

**Fix (v0.1):** per-IP token bucket: 60 messages/min per source IP. Drop overflow with 429.

**Fix (v0.2):** require the sender to sign with their KEZ primary; chat-server verifies. Becomes useless for cross-server v0.2 unless the sender's server vouches.

---

## Roadmap: multi-day, needs design pass

### #6. Forward secrecy (Double Ratchet) [ROADMAP]

**Why it matters:** today's static-static x25519 means whoever compromises a seed once decrypts ALL retained history that any relay still has, and relays retain indefinitely. The ephemeral-x25519 fix in #1 is partial forward secrecy (per-message) but not post-compromise security.

**What's needed:** Signal-style X3DH + Double Ratchet. Significant refactor of crypto.ts; needs careful API design so existing conversations migrate cleanly. Owner: TBD. ETA: separate sprint.

### #7. WebAuthn-gated session rehydrate [ROADMAP]

**Why it matters:** the persistent-session blob's non-extractable AES key blocks `exportKey` but NOT `decrypt`-then-read-plaintext. Any malicious extension with ``, any XSS, any compromised npm dep can call `restoreSession()` and lift the seed. My comment in `persistent-session.ts:18-23` overstates the actual protection.

**Fix:** gate `restoreSession()` on a user-gesture WebAuthn assertion (touchID / passkey). Background scripts can't fake a user gesture, so the seed never gets decrypted unattended. Falls back to passphrase on devices without WebAuthn.

### #8. Rotate addr daily (`info = "v1|YYYYMMDD"`) [ROADMAP]

**Why it matters:** a relay scrapes `#h` filter values + the public KEZ directory + builds a rainbow table mapping `addr → primary → handle`. The hash buys little when the input space is enumerable. Per-day addr rotation forces the rainbow table to be rebuilt daily and stops long-term correlation.

**Trade-off:** receivers need to subscribe to multiple addrs during the boundary day (yesterday's + today's). Listener server-side needs the same. Migration logic isn't hard but isn't free.

### #9. Unforgeable delivery acks [DONE, Day 3 Option A]

**Why it matters:** anyone who saw an event id can publish a fake kind-4244 ack. Sender's UI shows false "delivered". Cosmetic-only today; will be a real problem when someone builds a tracker bot.

**Fix:** ack payload = recipient's ed25519 signature over the acked event id. Sender verifies against the recipient's known KEZ primary. Free: we already have ed25519 plumbing.

### #10. NIP-65 outbox model [PARTIAL, Day 3 Option B]

Publish-side only. We now emit a `kind:10002` event on first session alongside the kind:0 baseline, listing our 3 default relays as read+write. NIP-65-aware clients can discover where to reach us. What's still missing: when SENDING to a peer, we should fetch their `kind:10002` and union their read-relays with ours. v0.2: needs a deeper transport refactor (per-message relay set).

We hardcode 3 relays for every user. Real nostr clients publish `kind:10002` listing their preferred read+write relays; senders publish to each recipient's published read-relays. Without this, isolated networks of users on different relay sets can't reach each other.

### #11. NIP-42 AUTH support [DONE, Day 3 Option B]

damus.io regularly requires NIP-42 AUTH for DM-kind reads. Without it our subscriptions get rejected silently. Add the client AUTH handshake + support being prompted by the relay.

### #12. Publish a minimal kind-0 profile on first use [DONE, Day 3 Option B]

Some relays silently drop writes from "unknown" pubkeys (no kind-0). A single minimal `kind:0` per derived nostr pubkey (just `{"name":"kez-chat user"}`) unblocks this without revealing anything.

### #13. NIP-25 ack shape with `["p", senderNostrPubkey]` [DONE, Day 3 Option B]

Our kind-4244 ack is custom. Adopting the NIP-25 shape gets free interop with nostr clients that already render reactions, handy if we ever expose the underlying events.

### #14. Shorten `since=` default cursor [DONE, Day 3 Option A]

Default 7-day cursor exceeds most relay retention windows (often 1–3 days). Fresh devices on quiet conversations silently miss messages. Shorten to 48h + augment with explicit "fetch full history" UI for the rare resurrect case.

### #16. Bounded concurrency on push fanout [DONE, Day 3 Option A]

**Why it matters:** every send spawns an unbounded `tokio::spawn` to fan out push. Under flood, OOM.

**Files:**

- `kez-chat/src/messages.rs:128`

**Fix:** semaphore-bound to ~32 concurrent fanouts. Excess queues; under extreme flood we drop with a warn-log rather than swap-thrash.

---

## Visually-encrypted profile pictures (new feature, in progress)

### Phase 1A: local scramble + per-contact key wrap [DONE this commit]

- `kez-chat/web/src/lib/visual-crypto.ts`: keyed Fisher-Yates pixel permutation + xoshiro256** PRNG. Output is a valid PNG with same dimensions, scrambled content. Salt embedded as `#kez-visual-v1:` URL fragment so descramble doesn't need out-of-band metadata.
- `profile-store.ts`: profile gains `encrypted: boolean` (default true) + `picture_key` (local-only). On publish: scramble the picture, wrap the visual key for each contact via the existing `sealMessage()` envelope, embed as `kez_visual_keys` map in the kind:0 content.
- Settings: "Visually encrypt picture (recommended)" toggle, default ON.

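
For orientation, the keyed pixel permutation described for `visual-crypto.ts` could look roughly like the sketch below. It is not the code in that file. The function names (`keyedPermutation`, `scramble`, `descramble`) are hypothetical, pixels are assumed to be packed one RGBA value per `Uint32Array` element, the key is assumed to have already been reduced to a 32-bit numeric seed, and a small `mulberry32` generator stands in for xoshiro256** to keep the example short. It is not cryptographically strong.

```typescript
// Small seeded PRNG (mulberry32). Stand-in for xoshiro256**; assumption for brevity.
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // uniform in [0, 1)
  };
}

// Fisher-Yates shuffle of the indices 0..n-1, driven entirely by the seed.
function keyedPermutation(n: number, seed: number): Uint32Array {
  const rand = mulberry32(seed);
  const perm = new Uint32Array(n);
  for (let i = 0; i < n; i++) perm[i] = i;
  for (let i = n - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = perm[i];
    perm[i] = perm[j];
    perm[j] = tmp;
  }
  return perm;
}

// Output pixel i is taken from input pixel perm[i]; dimensions are unchanged.
function scramble(pixels: Uint32Array, seed: number): Uint32Array {
  const perm = keyedPermutation(pixels.length, seed);
  const out = new Uint32Array(pixels.length);
  for (let i = 0; i < pixels.length; i++) out[i] = pixels[perm[i]];
  return out;
}

// Inverse mapping: regenerate the same permutation and put each pixel back.
function descramble(scrambled: Uint32Array, seed: number): Uint32Array {
  const perm = keyedPermutation(scrambled.length, seed);
  const out = new Uint32Array(scrambled.length);
  for (let i = 0; i < scrambled.length; i++) out[perm[i]] = scrambled[i];
  return out;
}
```

Because the permutation is regenerated from the seed on both sides, only the key has to be shared with a contact, which is what the per-contact key wrap provides. A pure permutation preserves the image's colour histogram, which is the motivation for the optional AES-CTR mode listed under Phase 1C.
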
### Phase 1B: peer descramble [DONE this commit]

- `peer-profile-store.ts` (new): IDB cache + one-shot `pool.querySync` fetch of the peer's kind:0 metadata event. On hit, looks up our primary in `kez_visual_keys`, opens the SealedEnvelope wrap to recover the visual key, descrambles `metadata.picture`, caches the rendered data URL.
- `peer-profile-cell.svelte.ts` (new): reactive Svelte 5 mirror over the IDB cache so component re-renders are automatic on fetch.
- `nostr-transport.ts`: surfaces `sender_nostr_pubkey` on every inbound DM. `conversations-store.ts` persists it on the conversation row so we can locate the peer's kind:0 later.
- `inbox-service.svelte.ts`: on every fresh DM, fires off a profile fetch for the sender, so the first DM lights up their avatar.
- `Messages.svelte`: hydrates the cache on mount, kicks off refreshes for every visible conversation, threads cached pictures through both Avatar usages (conversation list + thread header).
- Conversation list re-renders on cache update; staleness window 24h.

Edges noted for later: peers we've only *sent* to (never received from) have no `peer_nostr_pubkey` until they reply, so they don't get a picture lookup yet. Easy follow-up: backfill pubkey from a NIP-05 or WebFinger lookup, or proactively probe relays for `kind:0` events whose content tags match a known primary.

### Phase 1C: UX polish [TODO]

- "X contacts can see your real picture" hint in Settings.
- Re-publish kind:0 automatically when a new conversation is created (so the new contact gets key-wrapped without the user re-saving).
- Optional: per-image AES-CTR mode for uniform-noise output (stronger, less "visually meaningful").

---

## Acknowledged trade-offs (won't fix in v0.1)

### Persistent-session is no stronger than the biometric path

The `non-extractable AES key` story stops `exportKey`, NOT `decrypt`+`read`. Anyone with origin-execution access (XSS, malicious extension) can lift the seed. Document this honestly in the README and the file header.
Real fix is #7 above.

### 30-day TTL is client-only

`expiresAt` in localStorage is editable by anyone with file-system access. Server-side device binding (issue a signed nonce on unlock, expire at the server) would help but adds round-trips. v0.2 candidate.

### Identity-key reuse is safe under current crypto

ed25519 seed → ed25519 (sigchain, envelope sig) + x25519 (ECDH) + HKDF → secp256k1 (nostr signer). The auditor confirmed: no cross-curve chosen-message attack path. Standard libsodium pattern.

---

## Tracking + cross-references

- Nostr-protocol review: see commit message of this commit; full report in the audit-trail.
- Security audit: ditto.
- Owner: tudisco
- Last updated: 2026-06-08