Document complete vaultik architecture and implementation plan

- Expand README with full CLI documentation, architecture details, and features
- Add comprehensive 87-step implementation plan to DESIGN.md
- Document all commands, configuration options, and security considerations
- Define complete API signatures and data structures
2025-07-20 09:04:31 +02:00
parent 67319a4699
commit 0df07790ba
2 changed files with 229 additions and 3 deletions

115
README.md

@@ -97,17 +97,77 @@ Existing backup software fails under one or more of these conditions:
## cli
### commands
```sh
vaultik backup /etc/vaultik.yaml
vaultik backup /etc/vaultik.yaml [--cron] [--daemon]
vaultik restore <bucket> <prefix> <snapshot_id> <target_dir>
vaultik prune <bucket> <prefix>
vaultik fetch <bucket> <prefix> <snapshot_id> <filepath> <target_fileordir>
vaultik verify <bucket> <prefix> [<snapshot_id>]
```
* `VAULTIK_PRIVATE_KEY` must be available in environment for `restore` and `prune`
### environment
* `VAULTIK_PRIVATE_KEY`: Required for `restore`, `prune`, `fetch`, and `verify` commands. Contains the age private key for decryption.
### command details
**backup**: Perform incremental backup of configured directories
* `--cron`: Silent unless an error occurs (for use from crontab)
* `--daemon`: Run continuously with inotify monitoring and periodic scans
**restore**: Restore entire snapshot to target directory
* Downloads and decrypts metadata
* Fetches only required blobs
* Reconstructs directory structure
**prune**: Remove unreferenced blobs from storage
* Requires private key
* Downloads latest snapshot metadata
* Deletes orphaned blobs
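The prune step amounts to a set difference: blobs present in storage minus blobs referenced by snapshot metadata. A minimal sketch, assuming both lists are already available in memory as hex hashes; `orphanedBlobs` and its inputs are hypothetical names, not vaultik's actual API:

```go
package main

import (
	"fmt"
	"sort"
)

// orphanedBlobs returns the hashes found in storage that no snapshot
// references. Both slices are hypothetical stand-ins for the bucket listing
// and the decrypted snapshot metadata.
func orphanedBlobs(stored, referenced []string) []string {
	keep := make(map[string]struct{}, len(referenced))
	for _, h := range referenced {
		keep[h] = struct{}{}
	}
	var orphans []string
	for _, h := range stored {
		if _, ok := keep[h]; !ok {
			orphans = append(orphans, h)
		}
	}
	sort.Strings(orphans) // deterministic order for logging and deletion
	return orphans
}

func main() {
	stored := []string{"aa11", "bb22", "cc33"}
	referenced := []string{"aa11", "cc33"}
	fmt.Println(orphanedBlobs(stored, referenced))
}
```

Computing the full orphan list before deleting anything keeps the destructive step separate from the comparison, which makes a dry-run mode easy to add.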
**fetch**: Extract single file from backup
* Retrieves specific file without full restore
* Supports extracting to a different filename
**verify**: Validate backup integrity
* Checks metadata hash
* Verifies all referenced blobs exist
* Validates chunk integrity
---
## architecture
### chunking
* Content-defined chunking using rolling hash (Rabin fingerprint)
* Average chunk size: 10MB (configurable)
* Deduplication at chunk level
* Multiple chunks packed into blobs for efficiency
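To illustrate content-defined chunking, here is a small sketch that cuts a byte stream wherever a rolling hash over the last few bytes matches a bit mask. It uses a simple polynomial rolling hash rather than a true Rabin fingerprint, and the window size, multiplier, and size limits are illustrative assumptions, not vaultik's parameters:

```go
package main

import "fmt"

const (
	window = 48            // bytes covered by the rolling hash (assumption)
	base   = 1099511628211 // odd multiplier for the polynomial hash (assumption)
)

// split cuts data at content-defined boundaries. A boundary is declared when
// the rolling hash has all mask bits clear, so the average chunk size is
// roughly mask+1 bytes. minSize should be at least window so the hash covers
// a full window before the first possible cut; maxSize forces a cut.
func split(data []byte, minSize, maxSize int, mask uint64) [][]byte {
	// pow = base^(window-1): the weight of the byte about to leave the window.
	pow := uint64(1)
	for i := 0; i < window-1; i++ {
		pow *= base
	}
	var chunks [][]byte
	start := 0
	var h uint64
	for i := range data {
		n := i - start // bytes already in the current chunk
		if n >= window {
			h -= uint64(data[i-window]) * pow // drop the oldest byte
		}
		h = h*base + uint64(data[i]) // add the newest byte
		size := n + 1
		if (size >= minSize && h&mask == 0) || size >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
			h = 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

func main() {
	// Deterministic pseudo-random input (xorshift) so runs are repeatable.
	data := make([]byte, 1<<16)
	x := uint32(2463534242)
	for i := range data {
		x ^= x << 13
		x ^= x >> 17
		x ^= x << 5
		data[i] = byte(x)
	}
	chunks := split(data, 256, 4096, (1<<10)-1)
	total := 0
	for _, c := range chunks {
		total += len(c)
	}
	fmt.Println(len(chunks), "chunks,", total, "bytes")
}
```

Because boundaries depend only on nearby content, an insertion early in a file tends to shift only the chunks around the edit, which is what makes chunk-level deduplication effective.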
### encryption
* Asymmetric encryption using age (X25519 + ChaCha20-Poly1305)
* Only public key needed on source host
* Each blob encrypted independently
* Metadata databases also encrypted
### storage
* Content-addressed blob storage
* Immutable append-only design
* Two-level directory sharding for blobs (aa/bb/hash)
* Compressed with zstd before encryption
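The two-level sharding scheme can be sketched as follows. This assumes the address is the lowercase hex SHA-256 of the bytes being stored; which bytes are hashed (plaintext or ciphertext) and any key prefix are not specified here, and `shardedPath` and `blobKey` are hypothetical names:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// shardedPath maps a hex hash to aa/bb/hash, using the first two byte pairs
// as directory levels. Hashes shorter than four characters are returned as-is.
func shardedPath(hash string) string {
	if len(hash) < 4 {
		return hash
	}
	return path.Join(hash[0:2], hash[2:4], hash)
}

// blobKey derives a content address for data and shards it.
// Hashing the stored bytes with SHA-256 is an assumption of this sketch.
func blobKey(data []byte) string {
	sum := sha256.Sum256(data)
	return shardedPath(hex.EncodeToString(sum[:]))
}

func main() {
	fmt.Println(shardedPath("aabbccdd"))
	fmt.Println(blobKey([]byte("example blob")))
}
```

Two hex levels give 65,536 leaf directories, which keeps per-directory listings small on stores that enumerate by prefix.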
### state tracking
* Local SQLite database for incremental state
* Tracks file mtimes and chunk mappings
* Enables efficient change detection
* Supports inotify monitoring in daemon mode
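Incremental change detection against the local index can be sketched like this. An in-memory map stands in for the SQLite database, and comparing file size in addition to mtime is an assumption of the sketch; `fileState`, `index`, and `needsBackup` are hypothetical names:

```go
package main

import "fmt"

// fileState is what the sketch records per path after a successful backup.
type fileState struct {
	MTimeNs int64 // modification time in nanoseconds since the Unix epoch
	Size    int64 // size in bytes (an extra check assumed by this sketch)
}

// index is an in-memory stand-in for the local SQLite state database.
type index map[string]fileState

// needsBackup reports whether path is new or differs from its recorded state.
func (ix index) needsBackup(path string, cur fileState) bool {
	prev, seen := ix[path]
	return !seen || prev != cur
}

func main() {
	ix := index{"/etc/hosts": {MTimeNs: 1000, Size: 120}}
	fmt.Println(ix.needsBackup("/etc/hosts", fileState{MTimeNs: 1000, Size: 120}))
	fmt.Println(ix.needsBackup("/etc/hosts", fileState{MTimeNs: 2000, Size: 120}))
	fmt.Println(ix.needsBackup("/etc/passwd", fileState{MTimeNs: 1000, Size: 50}))
}
```

Only files for which `needsBackup` returns true need to be re-read and re-chunked; unchanged files can reuse their recorded chunk mappings.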
## does not
* Store any secrets on the backed-up machine
@@ -141,6 +201,33 @@ The entire system is restore-only from object storage.
---
## features
### daemon mode
* Continuous background operation
* inotify-based change detection
* Respects `backup_interval` and `min_time_between_run`
* Full scan every `full_scan_interval` (default 24h)
### cron mode
* Single backup run
* No output unless an error occurs
* Ideal for scheduled backups
### metadata integrity
* SHA256 hash of metadata stored separately
* Encrypted hash file for verification
* Chunked metadata support for large filesystems
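Verifying metadata against a separately stored SHA-256 hash can be sketched as below. The sketch assumes the expected hash has already been decrypted and is available as a hex string; `verifyMetadata` is a hypothetical helper, not vaultik's API:

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// verifyMetadata reports whether metadata hashes to the expected hex digest.
// A malformed or wrong-length digest is treated as a mismatch.
func verifyMetadata(metadata []byte, wantHex string) bool {
	want, err := hex.DecodeString(wantHex)
	if err != nil {
		return false
	}
	sum := sha256.Sum256(metadata)
	// ConstantTimeCompare returns 0 when the lengths differ.
	return subtle.ConstantTimeCompare(sum[:], want) == 1
}

func main() {
	const abcDigest = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
	fmt.Println(verifyMetadata([]byte("abc"), abcDigest))
	fmt.Println(verifyMetadata([]byte("abd"), abcDigest))
}
```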
### exclusion patterns
* Glob-based file exclusion
* Configured in YAML
* Applied during directory walk
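One way to apply glob exclusions during a directory walk is sketched below, using the standard library's `filepath.Match`. Matching each pattern against both the relative path and the base name is an assumption of this sketch; the real matching rules may differ, and `excluded` and `walk` are hypothetical names:

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
)

// excluded reports whether rel matches any pattern, either as a whole
// relative path or by its final path element.
func excluded(rel string, patterns []string) bool {
	for _, p := range patterns {
		if ok, _ := filepath.Match(p, rel); ok {
			return true
		}
		if ok, _ := filepath.Match(p, filepath.Base(rel)); ok {
			return true
		}
	}
	return false
}

// walk lists regular entries under root, pruning excluded directories so
// their contents are never visited.
func walk(root string, patterns []string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		if rel != "." && excluded(rel, patterns) {
			if d.IsDir() {
				return filepath.SkipDir
			}
			return nil
		}
		if !d.IsDir() {
			files = append(files, rel)
		}
		return nil
	})
	return files, err
}

func main() {
	patterns := []string{"*.tmp", ".cache"}
	for _, rel := range []string{"etc/hosts", "var/a.tmp", "home/.cache"} {
		fmt.Println(rel, excluded(rel, patterns))
	}
}
```

Returning `filepath.SkipDir` for an excluded directory avoids descending into it at all, which matters for large cache trees.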
## prune
Run `vaultik prune` on a machine with the private key. It:
@@ -160,6 +247,30 @@ WTFPL — see LICENSE.
---
## security considerations
* Source host compromise cannot decrypt backups
* Append-only design: existing blobs are never overwritten, so replayed uploads cannot replace stored data
* Each blob independently encrypted
* Metadata tampering detectable via hash verification
* S3 credentials on the source host are limited to write access on the backup prefix
## performance
* Streaming processing (no temp files)
* Parallel blob uploads
* Deduplication reduces storage and bandwidth
* Local index enables fast incremental detection
* Configurable compression levels
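Parallel blob uploads can be sketched with a bounded worker pool. The `upload` callback is a hypothetical stand-in for the S3 client call, and the worker count is an illustrative parameter rather than a documented setting:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// uploadAll sends every blob key to upload using at most `workers`
// concurrent goroutines and returns the first error encountered, if any.
func uploadAll(blobs []string, workers int, upload func(string) error) error {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for b := range jobs {
				if err := upload(b); err != nil {
					mu.Lock()
					if firstErr == nil {
						firstErr = err
					}
					mu.Unlock()
				}
			}
		}()
	}
	for _, b := range blobs {
		jobs <- b
	}
	close(jobs)
	wg.Wait()
	return firstErr
}

func main() {
	var uploaded atomic.Int64
	err := uploadAll([]string{"aa/bb/1", "cc/dd/2", "ee/ff/3"}, 2, func(key string) error {
		uploaded.Add(1) // a real implementation would PUT the blob here
		return nil
	})
	fmt.Println(uploaded.Load(), err)
}
```

An unbuffered job channel bounds the number of blobs in flight to the worker count, which fits the streaming, no-temp-file approach.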
## requirements
* Go 1.24.4 or later
* S3-compatible object storage
* age command-line tool (for key generation)
* SQLite3
* Sufficient disk space for local index
## author
sneak