Jason Tudisco 360ecbdad0 Initial commit: CAN Service + examples (can-sync v1, canfs, filemanager, paste)
CAN Service: content-addressable storage with HTTP API, SQLite metadata,
file-based blob storage, thumbnail generation, and integrity verification.

can-sync v1: P2P sync sidecar using iroh-docs for encrypted peer-to-peer
replication with library/filter-based selective sync. Fully builds but
being superseded by v2 (simplified full-mirror approach).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 10:32:04 -06:00


# CAN Sync
P2P file synchronization service that runs on top of [CAN Service](../../). Uses [iroh](https://iroh.computer/) for encrypted peer-to-peer networking with NAT traversal.
```
┌─────────────┐ HTTP API ┌─────────────┐ iroh (QUIC) ┌─────────────┐
│ CAN Service │◄───────────►│ CAN Sync │◄─────────────►│ CAN Sync │
│ (port 3210)│ │ (port 3213)│ │ (remote) │
│ storage + │ │ P2P node + │ │ │
│ SQLite │ │ libraries │ │ │
└─────────────┘ └─────────────┘ └─────────────┘
```
CAN Sync communicates with CAN Service **only** via its public HTTP API — zero changes to CAN Service required.
## Quick Start
1. **Start CAN Service** (default port 3210):
```bash
cd ../..
cargo run
```
2. **Edit config** (optional — defaults work out of the box):
```bash
cp config.yaml my-config.yaml
# edit my-config.yaml if needed
```
3. **Start CAN Sync**:
```bash
cargo run
# or with a custom config:
cargo run -- my-config.yaml
```
CAN Sync starts on `http://127.0.0.1:3213` and connects to CAN Service at `http://127.0.0.1:3210/api/v1/can/0`.
## Configuration
`config.yaml`:
```yaml
# URL of the local CAN Service API
can_service_url: "http://127.0.0.1:3210/api/v1/can/0"
# Address for the CAN Sync HTTP API
listen_addr: "127.0.0.1:3213"
# Directory for persistent data (peer key, sync state DB)
data_dir: "./can_sync_data"
# Custom relay server URL (null = iroh's public relay)
relay_url: null
# Seconds between fast polls for new assets
poll_interval_secs: 5
# Seconds between full scans of all assets
full_scan_interval_secs: 300
```
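To make the defaults and the two polling intervals concrete, here is a minimal std-only sketch. The names `SyncConfig`, `PollAction`, and `next_action` are hypothetical and not taken from `config.rs` or `announcer.rs`. The choice to let a due full scan take priority over a fast poll is also an assumption, made only for illustration.

```rust
use std::time::Duration;

/// Hypothetical mirror of the keys in `config.yaml`; the real `config.rs` may differ.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct SyncConfig {
    can_service_url: String,
    listen_addr: String,
    data_dir: String,
    relay_url: Option<String>, // None = iroh's public relay
    poll_interval_secs: u64,
    full_scan_interval_secs: u64,
}

impl Default for SyncConfig {
    /// Defaults copied from the sample `config.yaml` above.
    fn default() -> Self {
        Self {
            can_service_url: "http://127.0.0.1:3210/api/v1/can/0".to_string(),
            listen_addr: "127.0.0.1:3213".to_string(),
            data_dir: "./can_sync_data".to_string(),
            relay_url: None,
            poll_interval_secs: 5,
            full_scan_interval_secs: 300,
        }
    }
}

/// What a polling loop could do on a given tick (illustrative only).
#[derive(Debug, PartialEq)]
enum PollAction {
    FullScan,
    FastPoll,
    Wait,
}

/// Pick the next action from the time elapsed since each kind of poll.
/// Assumption: a due full scan wins over a due fast poll.
fn next_action(cfg: &SyncConfig, since_fast: Duration, since_full: Duration) -> PollAction {
    if since_full >= Duration::from_secs(cfg.full_scan_interval_secs) {
        PollAction::FullScan
    } else if since_fast >= Duration::from_secs(cfg.poll_interval_secs) {
        PollAction::FastPoll
    } else {
        PollAction::Wait
    }
}

fn main() {
    let cfg = SyncConfig::default();
    // 6 s since the last fast poll, 60 s since the last full scan.
    let action = next_action(&cfg, Duration::from_secs(6), Duration::from_secs(60));
    println!("{:?}", action); // prints "FastPoll"
}
```

With the default values, a fast poll becomes due every 5 seconds and a full scan every 300 seconds.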
## Concepts
### Libraries
A **library** is a shared collection of CAN assets that syncs between peers. Each library has a **filter** that determines which assets belong to it.
Filter options (combined with AND logic):
- `application` — match assets with this application tag (e.g. `"paste"`)
- `tags` — match assets with any of these tags (e.g. `["photos", "backup"]`)
- `user` — match assets from this user identity
- `mime_prefix` — match assets whose MIME type starts with this (e.g. `"image/"`)
- `hashes` — manual list of specific asset hashes to include
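The filter semantics above can be sketched as a small predicate. This is not the code in `library.rs`: the `AssetMeta` and `LibraryFilter` types are hypothetical, and two behaviours are assumptions made for the example, namely that an unset field places no constraint and that `hashes` is ANDed with the other fields like any other option.

```rust
/// Hypothetical subset of asset metadata that a filter inspects.
#[derive(Debug, Default)]
struct AssetMeta {
    hash: String,
    application: Option<String>,
    tags: Vec<String>,
    user: Option<String>,
    mime_type: String,
}

/// Hypothetical filter; a field left as `None` is assumed to match everything.
#[derive(Debug, Default)]
struct LibraryFilter {
    application: Option<String>,
    tags: Option<Vec<String>>,
    user: Option<String>,
    mime_prefix: Option<String>,
    hashes: Option<Vec<String>>,
}

impl LibraryFilter {
    /// Every field that is set must match (AND); `tags` matches if the
    /// asset carries at least one of the listed tags (ANY within the field).
    fn matches(&self, asset: &AssetMeta) -> bool {
        let application_ok = self
            .application
            .as_ref()
            .map_or(true, |want| asset.application.as_ref() == Some(want));
        let tags_ok = self
            .tags
            .as_ref()
            .map_or(true, |want| want.iter().any(|t| asset.tags.contains(t)));
        let user_ok = self
            .user
            .as_ref()
            .map_or(true, |want| asset.user.as_ref() == Some(want));
        let mime_ok = self
            .mime_prefix
            .as_ref()
            .map_or(true, |prefix| asset.mime_type.starts_with(prefix.as_str()));
        let hash_ok = self
            .hashes
            .as_ref()
            .map_or(true, |list| list.contains(&asset.hash));
        application_ok && tags_ok && user_ok && mime_ok && hash_ok
    }
}

fn main() {
    let photo = AssetMeta {
        hash: "abc123".to_string(), // placeholder, not a real CAN hash
        application: Some("paste".to_string()),
        tags: vec!["photos".to_string()],
        user: None,
        mime_type: "image/jpeg".to_string(),
    };
    let filter = LibraryFilter {
        application: Some("paste".to_string()),
        mime_prefix: Some("image/".to_string()),
        ..Default::default()
    };
    println!("matches: {}", filter.matches(&photo)); // prints "matches: true"
}
```

Under these assumptions, a filter with both `application` and `mime_prefix` set selects only assets that satisfy both conditions.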
### Sync Flow
**Outbound** (local → remote):
1. Announcer polls CAN Service for new/changed assets
2. Assets matching a library's filter get announced to the library's iroh document
3. iroh replicates the entry to all subscribed peers
4. The remote peer's fetcher downloads the blob and ingests it into its own local CAN Service
**Inbound** (remote → local):
1. iroh document receives new entry from remote peer
2. Fetcher downloads the blob via iroh's encrypted QUIC transport
3. Fetcher verifies the CAN hash (SHA-256) independently
4. Fetcher ingests the file into local CAN Service with all metadata preserved
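The ordering of inbound steps 3 and 4 (verify first, ingest only on a match) can be sketched as follows. This is an illustration rather than the code in `fetcher.rs`: `RemoteEntry`, `FetchOutcome`, and `verify_then_ingest` are hypothetical names, and metadata handling is omitted. Because Rust's standard library has no SHA-256, the hash function is passed in as a parameter; the `toy_digest` used here is a stand-in and is NOT SHA-256. A real fetcher would plug in an actual SHA-256 implementation.

```rust
/// Hypothetical shape of an entry received from a remote peer.
struct RemoteEntry {
    expected_hash: String, // hash announced by the remote peer (hex)
    blob: Vec<u8>,         // bytes downloaded over iroh
}

#[derive(Debug, PartialEq)]
enum FetchOutcome {
    Ingested,
    HashMismatch { expected: String, actual: String },
}

/// Hash the downloaded bytes locally and call `ingest` only when the
/// result equals the announced hash.
fn verify_then_ingest<H, I>(entry: &RemoteEntry, hash_hex: H, mut ingest: I) -> FetchOutcome
where
    H: Fn(&[u8]) -> String,
    I: FnMut(&[u8]),
{
    let actual = hash_hex(&entry.blob);
    if !actual.eq_ignore_ascii_case(&entry.expected_hash) {
        // Reject the blob: nothing reaches the local CAN Service.
        return FetchOutcome::HashMismatch {
            expected: entry.expected_hash.clone(),
            actual,
        };
    }
    ingest(&entry.blob);
    FetchOutcome::Ingested
}

/// Stand-in digest for the example only (a byte sum). NOT SHA-256.
fn toy_digest(data: &[u8]) -> String {
    let sum: u64 = data.iter().map(|b| *b as u64).sum();
    format!("{:016x}", sum)
}

fn main() {
    let mut stored: Vec<Vec<u8>> = Vec::new();

    let good = RemoteEntry { expected_hash: toy_digest(b"hi"), blob: b"hi".to_vec() };
    let outcome = verify_then_ingest(&good, toy_digest, |bytes| stored.push(bytes.to_vec()));
    println!("{:?}", outcome);

    let bad = RemoteEntry { expected_hash: "deadbeef".to_string(), blob: b"hi".to_vec() };
    let outcome = verify_then_ingest(&bad, toy_digest, |bytes| stored.push(bytes.to_vec()));
    println!("{:?}, stored = {}", outcome, stored.len());
}
```

Injecting the hasher keeps the sketch dependency-free and makes the gate easy to test in isolation.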
## API
All endpoints return JSON with `{ "status": "success", "data": ... }` or `{ "status": "error", "error": "..." }`.
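A client might model this envelope as a two-variant type. The sketch below is a hypothetical client-side model, not a type from `routes.rs`; it performs no JSON parsing and only shows how the two shapes map onto a `Result`.

```rust
/// Hypothetical client-side model of the response envelope.
#[derive(Debug, PartialEq)]
enum ApiResponse<T> {
    /// `{ "status": "success", "data": ... }`
    Success { data: T },
    /// `{ "status": "error", "error": "..." }`
    Error { error: String },
}

impl<T> ApiResponse<T> {
    /// Value of the `status` field for this variant.
    fn status(&self) -> &'static str {
        match self {
            ApiResponse::Success { .. } => "success",
            ApiResponse::Error { .. } => "error",
        }
    }

    /// Convert the envelope into a `Result` so callers can use `?`.
    fn into_result(self) -> Result<T, String> {
        match self {
            ApiResponse::Success { data } => Ok(data),
            ApiResponse::Error { error } => Err(error),
        }
    }
}

fn main() {
    // Pretend this value came from decoding a `GET /libraries` response.
    let resp: ApiResponse<Vec<String>> = ApiResponse::Success {
        data: vec!["my-pastes".to_string()],
    };
    println!("status = {}", resp.status());
    match resp.into_result() {
        Ok(names) => println!("{} libraries", names.len()),
        Err(e) => eprintln!("request failed: {e}"),
    }
}
```

Treating the error branch as a plain `String` mirrors the single `error` field shown above; a richer error type would be a design choice beyond what this document specifies.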
### Status & Peers
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/status` | Node status, CAN service health, library count |
| GET | `/peers` | Connected peers list |
### Libraries
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/libraries` | Create a library |
| GET | `/libraries` | List all libraries |
| GET | `/libraries/{id}` | Get library details |
| DELETE | `/libraries/{id}` | Remove a library |
### Sharing
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/libraries/{id}/invite` | Generate a share ticket |
| POST | `/join` | Join a library from a ticket |
### Examples
**Create a library** that syncs all assets with `application=paste`:
```bash
curl -X POST http://127.0.0.1:3213/libraries \
-H "Content-Type: application/json" \
-d '{"name": "my-pastes", "filter": {"application": "paste"}}'
```
**Create a library** that syncs all images:
```bash
curl -X POST http://127.0.0.1:3213/libraries \
-H "Content-Type: application/json" \
-d '{"name": "images", "filter": {"mime_prefix": "image/"}}'
```
**Generate an invite ticket** to share with another machine:
```bash
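# Replace <library-id> with the id returned when the library was created.
# The URL is quoted so the shell and curl's URL globbing leave it untouched.
curl -X POST "http://127.0.0.1:3213/libraries/<library-id>/invite"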
```
**Join a library** on another machine using the ticket:
```bash
curl -X POST http://127.0.0.1:3213/join \
-H "Content-Type: application/json" \
-d '{"ticket": "eyJsaWJyYXJ5X25hbWUiOi..."}'
```
**List all libraries**:
```bash
curl http://127.0.0.1:3213/libraries
```
**Check status**:
```bash
curl http://127.0.0.1:3213/status
```
## Two-Machine Setup
### Machine A (the host)
**1. Start CAN Service** (default port 3210):
```bash
cd /path/to/CanService
cargo run
```
**2. Start CAN Sync** with default config (port 3213):
```bash
cd examples/can-sync
cargo run
```
**3. Create a library** (e.g. sync all images):
```bash
curl -X POST http://127.0.0.1:3213/libraries \
-H "Content-Type: application/json" \
-d '{"name": "shared-images", "filter": {"mime_prefix": "image/"}}'
```
Save the `id` from the response (e.g. `"id": "a1b2c3d4-..."`).
**4. Generate an invite ticket:**
```bash
curl -X POST http://127.0.0.1:3213/libraries/a1b2c3d4-.../invite
```
Copy the `ticket` string from the response — this is what Machine B needs.
### Machine B (the joiner)
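**1. Start CAN Service** on a different port (this matters only when both instances share one host; on a physically separate machine the default port is free):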
```bash
cd /path/to/CanService
CAN_PORT=3220 cargo run
```
**2. Create a config file** for CAN Sync pointing at Machine B's CAN Service:
```yaml
# machine-b-config.yaml
can_service_url: "http://127.0.0.1:3220/api/v1/can/0"
listen_addr: "127.0.0.1:3223"
data_dir: "./can_sync_data_b"
```
**3. Start CAN Sync** with that config:
```bash
cd examples/can-sync
cargo run -- machine-b-config.yaml
```
**4. Join the library** using Machine A's ticket:
```bash
curl -X POST http://127.0.0.1:3223/join \
-H "Content-Type: application/json" \
-d '{"ticket": "eyJsaWJyYXJ5X25hbWUiOi..."}'
```
### Verify it works
**Ingest a file on Machine A:**
```bash
curl -X POST http://127.0.0.1:3210/api/v1/can/0/ingest \
-F "file=@photo.jpg" \
-F "mime_type=image/jpeg"
```
**Check Machine B** — the file should appear within a few seconds:
```bash
# Quote the URL so the shell does not treat "?" as a glob character.
curl "http://127.0.0.1:3220/api/v1/can/0/list?limit=5"
```
The same image, with matching hash and metadata, should now be in Machine B's CAN Service, synced over iroh's encrypted P2P connection.
## Architecture
```
src/
├── main.rs — entry point: config, iroh node, announcer, fetcher, HTTP server
├── config.rs — YAML config loading
├── can_client.rs — HTTP client for CAN Service API (list, search, ingest, meta, etc.)
├── node.rs — iroh endpoint + blobs + docs + gossip + router
├── library.rs — library/filter definitions + SQLite state tracking
├── manifest.rs — AssetSyncEntry serialized into iroh document entries
├── announcer.rs — polls CAN Service, announces matching assets to libraries
├── fetcher.rs — receives remote entries, downloads blobs, ingests into CAN Service
└── routes.rs — Axum HTTP API handlers
```
## Security
- **Transport**: All peer-to-peer traffic is encrypted with QUIC + TLS 1.3 (mandatory in iroh)
- **Identity**: Each node has an Ed25519 keypair generated on first run
- **Access control**: Library access via cryptographic capability tickets — only peers with a valid ticket can read/write
- **NAT traversal**: iroh's built-in relay servers and hole-punching
- **Hash verification**: Downloaded files are independently verified against CAN's SHA-256 hash before ingestion
## Current Status
The service compiles and runs. The following pieces are fully implemented:
- iroh P2P node startup with all protocol handlers (blobs, docs, gossip)
- CAN Service HTTP client with full API coverage
- Library management with SQLite persistence
- Announcer polling loop (fast + full scan) with real iroh-docs writes
- Fetcher with iroh document event subscription for real-time sync
- Fetcher blob download via iroh and CAN hash verification before ingestion
- Real DocTicket-based invite/join with cryptographic capability tokens
- HTTP API for library CRUD, invite, and join