CAN (Containerized Asset Network) Service Specification
Version: 1.0 (Final MVP)
Target Language: Rust
1. System Overview
The CAN service is a robust, self-healing local network daemon designed to simulate a high-speed, append-oriented file system. It provides an HTTP REST and Protobuf API to ingest, manage, and retrieve assets (files and data).
To avoid slow OS-level file searches, it indexes metadata in an embedded SQLite database for millisecond-scale queries. For disaster recovery, critical metadata is also written redundantly to the host's native OS file attributes, so the index can be rebuilt from the stored files alone.
MVP Scope: This version supports a single, default container. To future-proof the API, all routes require a {can_id} parameter, which must always be 0. Physically, all data is mapped flatly to the configured storage root.
2. Directory Structure & Configuration
The system uses a flat directory structure within the configured root folder.
Physical Structure:
/var/lib/can_data/ # Defined by storage_root
├── .can.db # Master SQLite Index (Hidden)
├── .trash/ # Soft-deleted physical assets (Hidden)
├── .thumbs/ # Cached thumbnail images (Hidden, if enabled)
├── 1773014400123_a3b2... # Physical Asset
└── 1773014405999_f8c9... # Physical Asset
Configuration (config.yaml):
storage_root: "/var/lib/can_data" # Absolute path to the storage folder
admin_token: "super_secret_rebuild" # Bearer token for admin operations
enable_thumbnail_cache: true # Toggle caching in .thumbs/
rebuild_error_threshold: 50 # Tolerance before triggering a hard rebuild
verify_interval_hours: 12 # Frequency of full background hash verification
3. Storage Mechanics & Disaster Recovery
3.1 Cryptographic Naming Convention
Files are written with a strict physical naming format to allow offline, mathematical verification of integrity.
Format: {timestamp}_{sha256}_{truncated_tags}.{extension}
- timestamp: Unix epoch timestamp in milliseconds (e.g., 1773014400123).
- sha256: A SHA-256 hash calculated exactly as: SHA256([timestamp_bytes] + [raw_file_content_bytes]).
- truncated_tags: Tags joined by underscores (_). Non-alphanumeric characters are stripped. Truncated so that the total filename stays under OS filename length limits (~255 chars). Omitted if no tags are provided.
- extension: Derived from the mime_type or magic bytes (e.g., .pdf, .json).
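The naming rules above can be sketched as a small helper. This is only an illustration under several assumptions that the specification does not state: the SHA-256 hex digest is computed elsewhere and passed in (the Rust standard library has no SHA-256), "alphanumeric" is read as ASCII alphanumeric, truncation drops whole tags rather than cutting one in half, and the extension is passed without its leading dot. The names `build_filename`, `sanitize_tag`, and `MAX_FILENAME_LEN` are hypothetical.

```rust
/// Assumed upper bound for a filename, based on the "~255 chars" figure in the text.
const MAX_FILENAME_LEN: usize = 255;

/// Strip every character that is not ASCII alphanumeric (an assumed reading
/// of "non-alphanumeric characters stripped").
fn sanitize_tag(tag: &str) -> String {
    tag.chars().filter(|c| c.is_ascii_alphanumeric()).collect()
}

/// Build `{timestamp}_{sha256}_{truncated_tags}.{extension}`.
/// `hash_hex` is assumed to be the already-computed SHA-256 hex digest.
fn build_filename(timestamp_ms: u64, hash_hex: &str, tags: &[&str], extension: &str) -> String {
    let prefix = format!("{}_{}", timestamp_ms, hash_hex);
    let suffix = if extension.is_empty() {
        String::new()
    } else {
        format!(".{}", extension)
    };

    // Bytes left for the tag segment, after reserving one for its leading "_".
    let budget = MAX_FILENAME_LEN.saturating_sub(prefix.len() + suffix.len() + 1);

    let mut tag_part = String::new();
    for tag in tags.iter().map(|t| sanitize_tag(t)).filter(|t| !t.is_empty()) {
        let extra = if tag_part.is_empty() { tag.len() } else { tag.len() + 1 };
        if tag_part.len() + extra > budget {
            break; // drop remaining tags instead of cutting one in half
        }
        if !tag_part.is_empty() {
            tag_part.push('_');
        }
        tag_part.push_str(&tag);
    }

    if tag_part.is_empty() {
        // No usable tags: the tag segment is omitted entirely.
        format!("{}{}", prefix, suffix)
    } else {
        format!("{}_{}{}", prefix, tag_part, suffix)
    }
}
```

With these assumptions, `build_filename(1773014400123, "abc", &["tag"], "pdf")` produces `1773014400123_abc_tag.pdf`. The complete tag list is not lost by truncation, because it is also stored in the `can.tags` attribute described in 3.2.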
3.2 Native OS File Attributes
To guarantee the SQLite database can be rebuilt from scratch, critical metadata is bound directly to the file using OS-level attributes (Extended Attributes / xattr on Linux/macOS; NTFS Alternate Data Streams on Windows).
Required Attributes:
- can.application: Software that ingested the file.
- can.user: User identity.
- can.tags: The complete, unbounded, comma-separated list of tags.
- can.description: Human-readable description.
- can.human_filename: The logical filename provided during ingestion.
- can.human_path: The logical folder path provided during ingestion.
4. Metadata Indexing (.can.db)
A fully normalized SQLite database located at {storage_root}/.can.db.
Schema:
CREATE TABLE assets (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp INTEGER NOT NULL,
    hash TEXT NOT NULL UNIQUE,
    mime_type TEXT NOT NULL,
    application TEXT,
    user_identity TEXT,
    description TEXT,
    actual_filename TEXT NOT NULL,
    human_filename TEXT,
    human_path TEXT,
    is_trashed BOOLEAN NOT NULL DEFAULT 0,
    is_corrupted BOOLEAN NOT NULL DEFAULT 0
);

CREATE TABLE tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL UNIQUE
);

CREATE TABLE asset_tags (
    asset_id INTEGER NOT NULL,
    tag_id INTEGER NOT NULL,
    PRIMARY KEY (asset_id, tag_id),
    FOREIGN KEY (asset_id) REFERENCES assets(id) ON DELETE CASCADE,
    FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);

-- Optimization Indexes
CREATE INDEX idx_hash ON assets(hash);
CREATE INDEX idx_timestamp ON assets(timestamp);
CREATE INDEX idx_application ON assets(application);
CREATE INDEX idx_user ON assets(user_identity);
CREATE INDEX idx_trashed ON assets(is_trashed);
CREATE INDEX idx_tag_name ON tags(name);
5. Background Verifier Subsystem
A low-priority background thread dedicated to data integrity.
- Initial Scrub: Runs on startup. Verifies SHA256(timestamp + content) for all files against their filenames.
- Continuous Monitoring: Hooks into OS file system events (e.g., inotify). If a file is touched or altered by an external program, the verifier immediately rescans it.
- Periodic Scrub: Runs every verify_interval_hours to catch silent bit rot.
- Corruption Handling: If a hash mismatch is found, it flags is_corrupted = 1 in .can.db. Corrupted files are explicitly marked in API responses and excluded from standard operations.
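The per-file check behind each scrub can be sketched as follows. This is a minimal illustration, not the service's implementation: the hash function is injected as a closure because the standard library has no SHA-256 and the specification does not say how the timestamp is encoded into bytes. `Verdict`, `parse_asset_filename`, and `verify_file` are hypothetical names.

```rust
/// Outcome of checking one stored file (hypothetical type).
#[derive(Debug, PartialEq)]
enum Verdict {
    Ok,
    /// Recomputed hash differs from the one embedded in the filename.
    Corrupted,
    /// Filename does not follow the naming convention (e.g. `.can.db`).
    Unparseable,
}

/// Split `{timestamp}_{sha256}[_{tags}][.{ext}]` into (timestamp, hash).
fn parse_asset_filename(name: &str) -> Option<(u64, &str)> {
    let mut parts = name.splitn(3, '_');
    let timestamp = parts.next()?.parse::<u64>().ok()?;
    let rest = parts.next()?;
    // Without tags, the hash segment still carries the extension: cut it off.
    let hash = rest.split('.').next()?;
    if hash.is_empty() {
        None
    } else {
        Some((timestamp, hash))
    }
}

/// Recompute the hash with the caller-supplied `hash_fn(timestamp, content)`
/// and compare it, case-insensitively, with the hash in the filename.
fn verify_file<F>(name: &str, content: &[u8], hash_fn: F) -> Verdict
where
    F: Fn(u64, &[u8]) -> String,
{
    match parse_asset_filename(name) {
        None => Verdict::Unparseable,
        Some((timestamp, expected)) => {
            if hash_fn(timestamp, content).eq_ignore_ascii_case(expected) {
                Verdict::Ok
            } else {
                Verdict::Corrupted
            }
        }
    }
}
```

A `Verdict::Corrupted` result would correspond to setting is_corrupted = 1 for that asset. In a real service, `hash_fn` would wrap a SHA-256 implementation from a crate and read the file in chunks rather than holding it in memory.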
6. API Endpoints
Protocol Negotiation: All endpoints communicate in JSON by default. Clients can request or send Protocol Buffers by providing the HTTP headers:
- Accept: application/x-protobuf
- Content-Type: application/x-protobuf
(Note: Endpoint paths below use {can_id} which must be passed as 0)
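The JSON-by-default negotiation rule can be sketched as a function over a single header value. This is a simplified illustration that ignores quality values (`q=`) and wildcards, which the specification does not address. `WireFormat` and `negotiate` are hypothetical names.

```rust
/// Serialization formats the API can speak (hypothetical type).
#[derive(Debug, PartialEq)]
enum WireFormat {
    Json,
    Protobuf,
}

const PROTOBUF_MIME: &str = "application/x-protobuf";

/// Pick a format from an `Accept` or `Content-Type` header value.
/// Falls back to JSON when the header is absent or names anything else.
fn negotiate(header_value: Option<&str>) -> WireFormat {
    let wants_protobuf = header_value
        .map(|value| {
            value
                .split(',')
                // Drop parameters such as ";q=0.9" and surrounding whitespace.
                .map(|item| item.split(';').next().unwrap_or("").trim())
                .any(|mime| mime.eq_ignore_ascii_case(PROTOBUF_MIME))
        })
        .unwrap_or(false);

    if wants_protobuf {
        WireFormat::Protobuf
    } else {
        WireFormat::Json
    }
}
```

A handler would call this once with the request's `Content-Type` to decode the body and once with `Accept` to encode the response, so a client may send JSON and still receive Protobuf.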
6.1 Ingest Data
- Method: POST
- Path: /api/v1/can/0/ingest
- Content-Type: multipart/form-data
- Form Payload:
  - file (Binary File) - Required
  - mime_type (String) - Optional
  - human_file_name (String) - Optional
  - human_readable_path (String) - Optional
  - application (String) - Optional
  - user (String) - Optional
  - tags (String) - Optional (comma-separated)
  - description (String) - Optional
- Action: Hashes file, writes to {storage_root}, attaches OS attributes, logs to DB.
- Response (JSON):
{ "status": "success", "data": { "timestamp": 1773014400123, "hash": "abc...", "filename": "1773014400123_abc_tag.pdf" } }
6.2 Retrieve Physical Asset
- Method: GET
- Path: /api/v1/can/0/asset/{hash}
- Action: Streams the physical file. Sets Content-Type via DB mapping and Content-Disposition using human_filename. Returns 500/Warning if is_corrupted = 1.
6.3 Retrieve Asset Metadata
- Method: GET
- Path: /api/v1/can/0/asset/{hash}/meta
- Action: Returns DB record.
- Response (JSON):
{ "status": "success", "data": { "hash": "abc...", "mime_type": "image/jpeg", "application": "WebUI", "user": "Jason", "tags": ["tag1", "tag2"], "description": "...", "human_filename": "photo.jpg", "human_path": "/img/", "timestamp": 1773014400123, "is_trashed": false, "is_corrupted": false } }
6.4 Retrieve Thumbnail
- Method: GET
- Path: /api/v1/can/0/asset/{hash}/thumb/{max_width}/{max_height}
- Action: Resizes image strictly preserving aspect ratio. Falls back to static icon (SVG/PNG) for non-images. If enable_thumbnail_cache=true, reads/writes to {storage_root}/.thumbs/{hash}_{max_width}x{max_height}.jpg. Streams byte payload.
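The aspect-ratio rule and the cache path layout can be sketched without any image library. This is only an illustration: decoding and resizing would need an external crate and are not shown. It assumes images that already fit the box are not upscaled, which the specification does not state. `fit_within` and `thumb_cache_path` are hypothetical names.

```rust
use std::path::{Path, PathBuf};

/// Largest size with the source aspect ratio that fits inside the
/// `max_w` x `max_h` box. Returns None for zero-sized inputs.
/// Assumption: images that already fit are returned unchanged (no upscaling).
fn fit_within(src_w: u32, src_h: u32, max_w: u32, max_h: u32) -> Option<(u32, u32)> {
    if src_w == 0 || src_h == 0 || max_w == 0 || max_h == 0 {
        return None;
    }
    if src_w <= max_w && src_h <= max_h {
        return Some((src_w, src_h));
    }
    // Compare aspect ratios by cross-multiplication in u64: no floats, no overflow.
    let (sw, sh, mw, mh) = (src_w as u64, src_h as u64, max_w as u64, max_h as u64);
    let (w, h) = if sw * mh >= sh * mw {
        // Source is relatively wider than the box: width is the limiting side.
        (mw, (sh * mw / sw).max(1))
    } else {
        // Source is relatively taller than the box: height is the limiting side.
        ((sw * mh / sh).max(1), mh)
    };
    Some((w as u32, h as u32))
}

/// Cache location `{storage_root}/.thumbs/{hash}_{max_width}x{max_height}.jpg`.
fn thumb_cache_path(storage_root: &Path, hash: &str, max_w: u32, max_h: u32) -> PathBuf {
    storage_root
        .join(".thumbs")
        .join(format!("{}_{}x{}.jpg", hash, max_w, max_h))
}
```

Because the cache key uses the requested box rather than the computed output size, two requests with different boxes are cached separately even when they yield identical thumbnails.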
6.5 Modify Metadata
- Method: PATCH
- Path: /api/v1/can/0/asset/{hash}
- Body (JSON/Protobuf): { "tags": ["new_tag1", "new_tag2"], "description": "New description" }
- Action: Updates can.tags and can.description OS Attributes. Updates SQLite assets, tags, and asset_tags tables inside a transaction. Physical filename remains unchanged.
6.6 List Assets
- Method: GET
- Path: /api/v1/can/0/list
- Query Parameters:
  - limit (Integer) - Default 50
  - offset (Integer) - Default 0
  - offset_time (Integer) - Optional. Epoch ms. High-speed cursor. Lists items strictly after/before this timestamp based on order.
  - order (String) - asc or desc. Default desc.
  - application (String) - Optional. Scopes list exclusively to files ingested by this Application ID.
  - include_trashed (Boolean) - Default false.
  - include_corrupted (Boolean) - Default false.
- Response: Paginated array of metadata objects (matching 6.3 output) + pagination block.
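One way the list parameters could map onto the schema from section 4 is sketched below. This is an assumption about the query shape, not the service's actual SQL: it builds the statement text with `?` placeholders and returns the values to bind in order, and it reads "strictly after/before" as `>` for asc and `<` for desc. `ListParams`, `Order`, `Bind`, and `build_list_query` are hypothetical names.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Order {
    Asc,
    Desc,
}

/// A value to bind to a `?` placeholder, kept typed so integers stay integers.
#[derive(Debug, PartialEq)]
enum Bind {
    Int(i64),
    Text(String),
}

struct ListParams<'a> {
    limit: u32,
    offset: u32,
    offset_time: Option<i64>,
    order: Order,
    application: Option<&'a str>,
    include_trashed: bool,
    include_corrupted: bool,
}

impl<'a> Default for ListParams<'a> {
    /// Defaults as listed in 6.6.
    fn default() -> Self {
        ListParams {
            limit: 50,
            offset: 0,
            offset_time: None,
            order: Order::Desc,
            application: None,
            include_trashed: false,
            include_corrupted: false,
        }
    }
}

/// Returns the SQL text and the values to bind, in placeholder order.
fn build_list_query(p: &ListParams<'_>) -> (String, Vec<Bind>) {
    let mut clauses: Vec<&str> = Vec::new();
    let mut binds: Vec<Bind> = Vec::new();

    if !p.include_trashed {
        clauses.push("is_trashed = 0");
    }
    if !p.include_corrupted {
        clauses.push("is_corrupted = 0");
    }
    if let Some(app) = p.application {
        clauses.push("application = ?");
        binds.push(Bind::Text(app.to_string()));
    }
    if let Some(t) = p.offset_time {
        // Cursor: strictly after the timestamp when ascending, strictly before when descending.
        clauses.push(match p.order {
            Order::Asc => "timestamp > ?",
            Order::Desc => "timestamp < ?",
        });
        binds.push(Bind::Int(t));
    }

    let mut sql = String::from("SELECT * FROM assets");
    if !clauses.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&clauses.join(" AND "));
    }
    let direction = match p.order {
        Order::Asc => "ASC",
        Order::Desc => "DESC",
    };
    sql.push_str(&format!(" ORDER BY timestamp {} LIMIT ? OFFSET ?", direction));
    binds.push(Bind::Int(p.limit as i64));
    binds.push(Bind::Int(p.offset as i64));
    (sql, binds)
}
```

Paging with offset_time lets SQLite seek through idx_timestamp instead of scanning and discarding `offset` rows, which is presumably why the specification calls it a high-speed cursor. Assets that share a millisecond timestamp could be skipped at a page boundary under this reading.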
6.7 Search Assets
- Method: GET
- Path: /api/v1/can/0/search
- Query Parameters:
  - hash (String) - Exact or partial prefix.
  - start_time (Integer) - Epoch ms.
  - end_time (Integer) -