Add README for main project and each example

Main README covers quick start, API overview, and links to example READMEs.
Each example (paste, filemanager, can-sync, canfs) gets its own README
with setup instructions, architecture, and configuration details.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Jason Tudisco 2026-03-17 14:45:20 -06:00
parent e7def4b819
commit 689d14202b
5 changed files with 425 additions and 242 deletions

README.md Normal file

@@ -0,0 +1,162 @@
# CAN Service
**Containerized Asset Network** -- A self-healing local storage daemon with HTTP REST and Protobuf APIs for ingesting, managing, and retrieving files.
CAN stores any file you throw at it, tags it with metadata, verifies integrity in the background, and syncs between machines over encrypted P2P connections. Think of it as a personal S3 that runs on your laptop and replicates to your other devices automatically.
---
## Quick Start
```bash
# Build and run (listens on port 3210)
cargo run
```
The service reads `config.yaml` from the current directory:
```yaml
storage_root: "./can_data"
admin_token: "super_secret_rebuild"
enable_thumbnail_cache: true
verify_interval_hours: 12
sync_api_key: "can-sync-default-key" # enables P2P sync endpoints
```
Override the port with `CAN_PORT=8080 cargo run`.
### Upload a file
```bash
curl -X POST http://localhost:3210/api/v1/can/0/ingest \
-F "file=@photo.jpg" \
-F "tags=vacation,summer" \
-F "application=my-app"
```
### Upload JSON data (agent-friendly)
```bash
curl -X POST http://localhost:3210/api/v1/can/0/ingest/data \
-H "Content-Type: application/json" \
-d '{"data": {"key": "value"}, "tags": "config,backup"}'
```
### Download a file
```bash
curl http://localhost:3210/api/v1/can/0/asset/{hash} -o file.jpg
```
---
## How It Works
```
+-----------+
Upload ---->| |----> SQLite index (millisecond queries)
| CAN |
Download <---| Service |----> Flat file storage (one file per asset)
| |
Search ---->| port 3210 |----> OS file attributes (disaster recovery)
| |
SSE <----| |----> Background verifier (integrity checks)
+-----------+
|
P2P Sync (protobuf over QUIC)
|
+-----------+
| CAN |
| Service | (another machine)
| port 3210 |
+-----------+
```
Each asset is saved as `{timestamp}_{sha256hash}_{tags}.{ext}` in a flat directory. Metadata lives in SQLite for fast queries and is redundantly written to OS-level file attributes (xattr on macOS/Linux, NTFS ADS on Windows) so you can recover even if the database is lost.
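Since the filename embeds the timestamp, hash, and tags, the core metadata can be recovered with nothing but string splitting. A toy sketch (the filename below is made up, and the hash is shortened for readability):

```shell
f="1710000000000_abc123def456_vacation-summer.jpg"   # hypothetical asset
base="${f%.*}"       # drop the extension
ext="${f##*.}"
ts="${base%%_*}"     # ingest timestamp (ms)
rest="${base#*_}"
hash="${rest%%_*}"   # SHA-256 content hash (shortened here)
tags="${rest#*_}"    # tag segment
echo "$ts $hash $tags $ext"
```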
A background verifier re-hashes every file periodically and flags corruption. It also watches for filesystem changes in real time.
---
## API
All endpoints live under `/api/v1/can/0/`. See [API.md](API.md) for the full specification.
| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/ingest` | Upload a file (multipart form) |
| `POST` | `/ingest/data` | Upload JSON data (no multipart needed) |
| `GET` | `/asset/{hash}` | Download an asset by its SHA-256 hash |
| `GET` | `/asset/{hash}/meta` | Get metadata as JSON |
| `PATCH` | `/asset/{hash}` | Update tags and/or description |
| `GET` | `/asset/{hash}/thumb/{w}/{h}` | Get a resized thumbnail (images only) |
| `GET` | `/list` | Paginated listing with filters |
| `GET` | `/search` | Search by hash prefix, time range, MIME, tags, etc. |
| `GET` | `/events` | SSE stream of new asset notifications |
Private sync endpoints (`/sync/*`) use protobuf and require the `X-Sync-Key` header.
---
## Examples
Four example apps show what you can build on top of CAN:
| Example | Port | Description | README |
|---------|------|-------------|--------|
| **[Paste](examples/paste/)** | 3211 | Pastebin -- type text or paste images, auto-tags with #hashtags | [README](examples/paste/README.md) |
| **[File Manager](examples/filemanager/)** | 3212 | Web file browser with grid/list views, search, and filters | [README](examples/filemanager/README.md) |
| **[CAN Sync](examples/can-sync/)** | -- | P2P replication agent -- encrypted sync via shared passphrase | [README](examples/can-sync/README.md) |
| **[CanFS](examples/canfs/)** | -- | Mount assets as a read-only Windows drive (WinFSP) | [README](examples/canfs/README.md) |
### Run everything at once
```powershell
# Windows
.\go_example_1.ps1
# macOS / Linux
./go_example_1.sh
```
Builds everything, starts CAN Service + Sync Agent + Paste, and cleans up on Ctrl+C.
---
## Configuration
| Field | Default | Description |
|-------|---------|-------------|
| `storage_root` | (required) | Directory where assets and the database are stored |
| `admin_token` | `"changeme"` | Bearer token for admin endpoints |
| `enable_thumbnail_cache` | `true` | Cache resized thumbnails in `.thumbs/` |
| `rebuild_error_threshold` | `50` | Max errors before triggering a full rebuild |
| `verify_interval_hours` | `12` | Hours between full integrity scans |
| `sync_api_key` | (none) | API key for sync endpoints; omit to disable sync |
---
## Project Structure
```
src/
main.rs Entry point: config, DB, verifier, HTTP server
config.rs YAML config loading
db.rs SQLite CRUD (assets, tags, search)
hash.rs SHA-256 content hashing
storage.rs File I/O (write, read, trash, filename parsing)
verifier.rs Background integrity checker + file watcher
xattr.rs OS-level file attributes (xattr / NTFS ADS)
routes/ HTTP API handlers (ingest, asset, list, search, thumb, sync, events)
examples/
paste/ Pastebin web app
filemanager/ File browser web app
can-sync/ P2P sync agent (iroh + gossip + pkarr)
canfs/ Windows virtual filesystem (WinFSP)
```
## Requirements
- **Rust** 1.75+
- **SQLite** bundled (no system install needed)
- **WinFSP** only for the canfs example (Windows only)

examples/can-sync/README.md

@@ -1,263 +1,91 @@
# CAN Sync
P2P full-mirror replication for [CAN Service](../../), built on [iroh](https://iroh.computer/) for encrypted peer-to-peer networking with NAT traversal. Two machines with the same passphrase discover each other and mirror all assets automatically -- no port forwarding or static IPs needed.
```
┌─────────────┐ protobuf ┌─────────────┐ iroh (QUIC) ┌─────────────┐ protobuf ┌─────────────┐
│ CAN Service │◄───────────►│ CAN Sync │◄─────────────►│ CAN Sync │◄───────────►│ CAN Service │
│ Machine A │ sync API │ Agent A │ encrypted │ Agent B │ sync API │ Machine B │
│ port 3210 │ │ │ │ │ │ port 3210 │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
```
## Quick Start
1. **Start CAN Service** on each machine (port 3210):
```bash
cargo run
```
2. **Configure the sync agent** -- edit `config.yaml`:
```yaml
can_service_url: "http://127.0.0.1:3210"
sync_api_key: "can-sync-default-key"
sync_passphrase: "my-secret-phrase" # must be the same on all machines
poll_interval_secs: 30
```
3. **Start the sync agent** on each machine:
```bash
cd examples/can-sync
cargo run -- config.yaml
```
That's it. Any file uploaded to either CAN Service will appear on the other within seconds.
## How It Works
### Peer Discovery
Peers find each other through two mechanisms (both run simultaneously):
- **Gossip** -- [iroh-gossip](https://docs.rs/iroh-gossip) uses a topic derived from the shared passphrase. Peers on the same local network or connected to the same relay discover each other by broadcasting their node IDs.
- **Internet rendezvous** -- Each agent publishes its node ID to [pkarr](https://pkarr.org) relay servers using deterministic DNS-like "slots" derived from the passphrase. All agents scan these slots periodically to find peers worldwide.
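The slot scheme itself is not specified here, but the general idea of deriving deterministic identifiers from a shared secret can be sketched like this (purely illustrative -- the real derivation lives in `rendezvous.rs`):

```shell
passphrase="my-secret-phrase"
# Every agent with the same passphrase derives the same slot names,
# so they all publish to and scan the same places.
for i in 0 1 2; do
  slot=$(printf '%s/%d' "$passphrase" "$i" | sha256sum | cut -c1-16)
  echo "slot $i: $slot"
done
```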
### Sync Protocol
Once two peers connect over iroh's encrypted QUIC transport:
1. **Hash exchange** -- Both sides send their full list of asset hashes
2. **Diff** -- Each side computes what the other is missing
3. **Transfer** -- Missing assets are sent concurrently in both directions (metadata + file content bundled together as protobuf)
4. **Live sync** -- After the initial reconciliation, each agent subscribes to SSE events from its local CAN Service. When a new asset is ingested locally, it's pushed to the connected peer instantly.
The live sync uses SSE events (not polling) for instant propagation. A fallback incremental poll runs every `poll_interval_secs` seconds as a safety net.
### Echo Prevention
When peer A sends an asset to peer B, B's CAN Service emits an SSE event for the new ingest. Without protection, B would try to push that asset right back to A. The sync agent tracks which hashes were received from each peer and filters them out of the push loop.
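The filtering itself is just a set difference: the hashes pushed to a peer are the locally-new hashes minus those received from that peer. A toy shell illustration (hashes are made up; the real tracking lives in `peer.rs`):

```shell
received_from_peer='abc123
def456'
local_new='abc123
zzz999'
# Push only what the peer did not already send us.
comm -13 <(sort <<<"$received_from_peer") <(sort <<<"$local_new")
```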
## Configuration
`config.yaml`:
| Field | Default | Description |
|-------|---------|-------------|
| `can_service_url` | (required) | URL of the local CAN Service |
| `sync_api_key` | (required) | Must match `sync_api_key` in CAN Service's config |
| `sync_passphrase` | (required) | Shared secret for peer discovery (all peers must match) |
| `poll_interval_secs` | `3` | Fallback poll interval for catching missed events |
| `ticket_file` | (none) | Write this node's address to a file (for direct connection in tests) |
| `connect_ticket_file` | (none) | Read a peer's address from a file (for direct connection in tests) |
CAN Service must have `sync_api_key` set in its `config.yaml` for the sync endpoints to be enabled.
## Security
- **Transport** -- All peer traffic is encrypted with QUIC + TLS 1.3 (mandatory in iroh)
- **Identity** -- Each node gets an Ed25519 keypair on first run
- **Discovery** -- Only peers with the same passphrase can find each other
- **NAT traversal** -- iroh's built-in relay servers and hole-punching
- **Hash verification** -- Every received asset is re-hashed and compared before being stored
## Project Structure
```
src/
main.rs Entry point: config, iroh endpoint, discovery, peer connections
config.rs YAML config loading
can_client.rs HTTP client for CAN Service's sync API (protobuf + SSE)
protocol.rs Protobuf message types (shared with CAN Service)
discovery.rs Peer discovery via iroh-gossip
rendezvous.rs Internet peer discovery via pkarr relay
peer.rs Per-peer sync: reconciliation, live push/receive, echo prevention
```

examples/canfs/README.md Normal file

@@ -0,0 +1,90 @@
# CanFS
Mount [CAN Service](../../) assets as a read-only Windows drive using [WinFSP](https://winfsp.dev). Browse your assets in Windows Explorer like regular files.
## Features
- **Drive letter mount** -- assets appear as files under a drive like `X:\`
- **Virtual folder structure** -- files organized into `CAN\`, `APPLICATION\`, `DATES\`, and `TAGS\` directories
- **Lazy file loading** -- file content is fetched from CAN Service only when you actually open/read a file
- **Background refresh** -- the file tree updates periodically to pick up new assets
## Requirements
- **Windows** (this example uses WinFSP, which is Windows-only)
- **[WinFSP](https://winfsp.dev/rel/)** must be installed (the filesystem driver)
## Running
Make sure CAN Service is running on port 3210 first:
```bash
# From the repo root
cargo run
```
Then mount the filesystem:
```bash
cd examples/canfs
cargo run
```
By default, it mounts on `X:`. Customize with flags:
```bash
cargo run -- --mount Z: --can-url http://127.0.0.1:3210/api/v1/can/0 --refresh-secs 30
```
Press Ctrl+C to unmount.
## Folder Structure
When mounted, the drive shows these virtual directories:
```
X:\
CAN\ All assets by timestamp and hash
1710000000000_abc123.pdf
1710000005000_def456.jpg
APPLICATION\ Grouped by the "application" field
paste\
readme.txt
my-app\
report.pdf
DATES\ Grouped by year and month
2025\
01\
photo.jpg
03\
report.pdf
TAGS\ One folder per tag
vacation\
photo.jpg
work\
report.pdf
```
Files with a `human_filename` show their friendly name in `APPLICATION/`, `DATES/`, and `TAGS/` folders. The `CAN/` folder always shows the raw `{timestamp}_{hash}.{ext}` format.
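The `DATES\` grouping can be derived entirely from the millisecond timestamp prefix of the raw filename. A rough sketch of the idea (the real logic is in `tree.rs`; GNU `date` assumed):

```shell
f="1710000000000_abc123.pdf"   # raw CAN filename
ts_ms="${f%%_*}"               # millisecond timestamp prefix
date -u -d "@$((ts_ms / 1000))" +"DATES/%Y/%m"   # -> DATES/2024/03
```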
## CLI Options
| Flag | Default | Description |
|------|---------|-------------|
| `-m, --mount` | `X:` | Drive letter or directory to mount on |
| `--can-url` | `http://127.0.0.1:3210/api/v1/can/0` | CAN Service API base URL |
| `--refresh-secs` | `60` | Seconds between cache refreshes |
## Project Structure
```
src/
main.rs Entry point: CLI args, WinFSP host setup, background refresh
api.rs Blocking HTTP client for CAN Service (list, fetch)
fs.rs WinFSP filesystem implementation (open, read, readdir, etc.)
tree.rs Virtual directory tree builder (turns flat asset list into folders)
util.rs Helpers: MIME-to-extension, timestamp conversion, path sanitization
```

examples/filemanager/README.md Normal file

@@ -0,0 +1,52 @@
# File Manager
A web-based file browser for [CAN Service](../../) assets. Grid and list views, search, filters, and a detail modal with previews.
## Features
- **Grid and list views** -- toggle between thumbnail cards and a compact file list
- **Virtual folder tree** -- assets organized into `CAN/`, `APPLICATION/`, `DATES/`, `TAGS/`, and `TYPE/` folders
- **Search** -- filter by filename, description, or hash prefix
- **Filters** -- narrow by application, MIME type, tag, or date range
- **Detail modal** -- click any file to see full metadata, preview images, and download
## Running
Make sure CAN Service is running on port 3210 first:
```bash
# From the repo root
cargo run
```
Then start the File Manager:
```bash
cd examples/filemanager
cargo run
```
Opens automatically at [http://127.0.0.1:3212](http://127.0.0.1:3212).
## How It Works
The Rust backend serves a single-page app and proxies all data requests to CAN Service:
| File Manager Route | Proxies To | Purpose |
|--------------------|-----------|---------|
| `GET /` | -- | Serve the HTML/JS/CSS frontend |
| `GET /fm/list` | `GET /api/v1/can/0/list` | Paginated asset listing |
| `GET /fm/search` | `GET /api/v1/can/0/search` | Search with filters |
| `GET /fm/asset/{hash}` | `GET /api/v1/can/0/asset/{hash}` | Download/preview a file |
| `GET /fm/asset/{hash}/meta` | `GET /api/v1/can/0/asset/{hash}/meta` | Asset metadata |
| `GET /fm/thumb/{hash}` | `GET /api/v1/can/0/asset/{hash}/thumb/200/200` | Thumbnail |
The virtual folder tree is built entirely in the browser from the flat asset list -- no folder structure exists on disk.
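The grouping itself is a simple fan-out of each asset across its attributes; a toy shell rendition of what the browser-side JS in `html.rs` does for the `TAGS/` folders (data is made up):

```shell
# name<space>tag pairs standing in for the flat asset list
assets='a.jpg vacation
b.pdf work
b.pdf vacation'
# Emit one virtual path per (asset, tag) pair.
while read -r name tag; do
  echo "TAGS/$tag/$name"
done <<<"$assets" | sort
```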
## Project Structure
```
src/
main.rs HTTP server: proxy handlers and query forwarding
html.rs Single-page frontend (HTML + CSS + JS, embedded as a string)
```

examples/paste/README.md Normal file

@@ -0,0 +1,51 @@
# Paste
A minimal pastebin web app built on [CAN Service](../../). Type text and press Enter, or paste an image from your clipboard. Everything gets stored as a CAN asset.
## Features
- **Text paste** -- type and hit Enter to store a text snippet
- **Image paste** -- Ctrl+V an image from your clipboard, or click the paperclip to attach a file
- **Auto-tagging** -- use `#hashtags` in your text and they're extracted as CAN tags
- **Live refresh** -- new pastes appear instantly via Server-Sent Events (including content arriving from P2P sync on another machine)
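Hashtag extraction is plain pattern matching; a shell approximation of what the backend's tag extraction in `main.rs` does (`paste` below is the coreutils command, not this app):

```shell
text='deploy notes #work #urgent -- ship friday'
# Pull out #tags, strip the #, join with commas for the tags field.
grep -oE '#[[:alnum:]_]+' <<<"$text" | tr -d '#' | paste -sd, -
```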
## Running
Make sure CAN Service is running on port 3210 first:
```bash
# From the repo root
cargo run
```
Then start Paste:
```bash
cd examples/paste
cargo run
```
Opens automatically at [http://127.0.0.1:3211](http://127.0.0.1:3211).
## How It Works
Paste is a thin proxy layer. The Rust backend serves a single-page HTML/JS frontend and forwards requests to the CAN Service API:
| Paste Route | Proxies To | Purpose |
|-------------|-----------|---------|
| `POST /paste/text` | `POST /api/v1/can/0/ingest` | Store text as a `.txt` asset |
| `POST /paste/file` | `POST /api/v1/can/0/ingest` | Store an uploaded file |
| `GET /paste/list` | `GET /api/v1/can/0/list?application=paste` | List paste assets |
| `GET /paste/asset/{hash}` | `GET /api/v1/can/0/asset/{hash}` | Download an asset |
| `GET /paste/thumb/{hash}` | `GET /api/v1/can/0/asset/{hash}/thumb/200/200` | Image thumbnail |
| `GET /paste/events` | `GET /api/v1/can/0/events` | SSE stream for live updates |
All pastes are tagged with `application=paste` so they're scoped separately from other CAN content.
## Project Structure
```
src/
main.rs HTTP server: proxy handlers, tag extraction, SSE relay
html.rs Single-page frontend (HTML + CSS + JS, embedded as a string)
```