Document complete vaultik architecture and implementation plan
- Expand README with full CLI documentation, architecture details, and features
- Add comprehensive 87-step implementation plan to DESIGN.md
- Document all commands, configuration options, and security considerations
- Define complete API signatures and data structures
115
README.md
@@ -97,17 +97,77 @@ Existing backup software fails under one or more of these conditions:

## cli

### commands

```sh
vaultik backup /etc/vaultik.yaml [--cron] [--daemon]
vaultik restore <bucket> <prefix> <snapshot_id> <target_dir>
vaultik prune <bucket> <prefix>
vaultik fetch <bucket> <prefix> <snapshot_id> <filepath> <target_fileordir>
vaultik verify <bucket> <prefix> [<snapshot_id>]
```

### environment

* `VAULTIK_PRIVATE_KEY`: Required for the `restore`, `prune`, `fetch`, and `verify` commands. Contains the age private key used for decryption.

### command details

**backup**: Perform incremental backup of configured directories
* `--cron`: Silent unless error (for crontab)
* `--daemon`: Run continuously with inotify monitoring and periodic scans

**restore**: Restore entire snapshot to target directory
* Downloads and decrypts metadata
* Fetches only required blobs
* Reconstructs directory structure

**prune**: Remove unreferenced blobs from storage
* Requires private key
* Downloads latest snapshot metadata
* Deletes orphaned blobs

**fetch**: Extract single file from backup
* Retrieves specific file without full restore
* Supports extracting to different filename

**verify**: Validate backup integrity
* Checks metadata hash
* Verifies all referenced blobs exist
* Validates chunk integrity
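The orphan detection that `prune` performs amounts to a set difference between the blobs present in storage and the blobs the snapshot metadata references. A minimal Go sketch of that step, assuming blob hashes are plain strings; `orphanedBlobs` and its inputs are hypothetical names for illustration, not vaultik's actual API:

```go
package main

import (
	"fmt"
	"sort"
)

// orphanedBlobs returns the blob hashes that exist in storage but are not
// referenced by the snapshot metadata. Both slices are hypothetical inputs:
// a real run would list the bucket and read the decrypted metadata.
func orphanedBlobs(stored, referenced []string) []string {
	keep := make(map[string]struct{}, len(referenced))
	for _, h := range referenced {
		keep[h] = struct{}{}
	}
	var orphans []string
	for _, h := range stored {
		if _, ok := keep[h]; !ok {
			orphans = append(orphans, h)
		}
	}
	sort.Strings(orphans) // deterministic order for logging and tests
	return orphans
}

func main() {
	stored := []string{"cc33", "aa11", "bb22"}
	referenced := []string{"bb22"}
	fmt.Println("would delete:", orphanedBlobs(stored, referenced))
}
```

Computing the full list before deleting anything keeps the destructive step separate, which makes a dry-run mode easy to add.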

---

## architecture

### chunking

* Content-defined chunking using rolling hash (Rabin fingerprint)
* Average chunk size: 10 MB (configurable)
* Deduplication at chunk level
* Multiple chunks packed into blobs for efficiency
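To illustrate content-defined chunking, here is a rough Go sketch. It uses a simple polynomial rolling hash over a 48-byte window rather than a true Rabin fingerprint, and the sizes are tiny compared with the 10 MB average above so the demo runs on a small buffer. The function name `split`, the window size, and the multiplier are all assumptions made for this example.

```go
package main

import "fmt"

const (
	window = 48            // bytes covered by the rolling hash (assumed value)
	prime  = 1099511628211 // hash multiplier (the 64-bit FNV prime)
)

// split cuts data wherever the low bits of the rolling hash are all zero,
// subject to minimum and maximum chunk sizes. Arithmetic wraps modulo 2^64.
func split(data []byte, minSize, maxSize int, mask uint64) [][]byte {
	// pow = prime^window: the weight of the byte sliding out of the window.
	pow := uint64(1)
	for i := 0; i < window; i++ {
		pow *= prime
	}

	var chunks [][]byte
	var h uint64
	start := 0
	for i, b := range data {
		h = h*prime + uint64(b)
		if i-start >= window {
			h -= pow * uint64(data[i-window])
		}
		size := i - start + 1
		if (size >= minSize && h&mask == 0) || size >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start, h = i+1, 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing partial chunk
	}
	return chunks
}

func main() {
	// Deterministic pseudo-random input so the demo needs no external data.
	data := make([]byte, 1<<16)
	x := uint32(1)
	for i := range data {
		x = x*1664525 + 1013904223
		data[i] = byte(x >> 24)
	}
	chunks := split(data, 256, 4096, 1<<10-1)
	fmt.Printf("%d bytes -> %d chunks\n", len(data), len(chunks))
}
```

Because a cut depends only on the bytes inside the window (once the minimum size is at least the window length), inserting data early in a file shifts boundaries locally instead of re-chunking everything after it, which is what makes chunk-level deduplication effective.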

### encryption

* Asymmetric encryption using age (X25519 + ChaCha20-Poly1305)
* Only public key needed on source host
* Each blob encrypted independently
* Metadata databases also encrypted

### storage

* Content-addressed blob storage
* Immutable append-only design
* Two-level directory sharding for blobs (`aa/bb/hash`)
* Compressed with zstd before encryption
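The `aa/bb/hash` layout can be sketched in a few lines of Go. This assumes the content address is a hex-encoded SHA-256 digest; the README does not say which hash is used for blobs or which bytes are hashed, so `blobKey` and `shardPath` are hypothetical helpers shown only to illustrate the sharding.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// shardPath turns a hex hash into a two-level sharded key: aa/bb/<hash>.
// Hashes shorter than four characters are returned unchanged.
func shardPath(hexHash string) string {
	if len(hexHash) < 4 {
		return hexHash
	}
	return hexHash[0:2] + "/" + hexHash[2:4] + "/" + hexHash
}

// blobKey derives a storage key from blob bytes.
// Assumption: the address is the SHA-256 of the bytes passed in.
func blobKey(blob []byte) string {
	sum := sha256.Sum256(blob)
	return shardPath(hex.EncodeToString(sum[:]))
}

func main() {
	fmt.Println(shardPath("aabbccdd"))
	fmt.Println(blobKey([]byte("example blob")))
}
```

Sharding on the first two byte pairs spreads keys evenly, since hash prefixes are uniformly distributed.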

### state tracking

* Local SQLite database for incremental state
* Tracks file mtimes and chunk mappings
* Enables efficient change detection
* Supports inotify monitoring in daemon mode
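The change detection described above can be sketched as a comparison between the state recorded on the previous run and the state seen now. The example below uses an in-memory map in place of the SQLite database and compares size in addition to mtime; the `fileState` type, the size field, and the `changed` helper are assumptions for illustration only.

```go
package main

import "fmt"

// fileState is what the index remembers about a file from the last run.
type fileState struct {
	ModTimeNs int64 // modification time, nanoseconds since the Unix epoch
	Size      int64 // assumption: size is tracked alongside mtime
}

// changed reports whether a file is new or differs from the indexed state
// and therefore needs to be re-chunked.
func changed(index map[string]fileState, path string, cur fileState) bool {
	prev, seen := index[path]
	return !seen || prev != cur
}

func main() {
	index := map[string]fileState{
		"/etc/hosts": {ModTimeNs: 1700000000_000000000, Size: 220},
	}
	same := fileState{ModTimeNs: 1700000000_000000000, Size: 220}
	touched := fileState{ModTimeNs: 1700000600_000000000, Size: 220}

	fmt.Println(changed(index, "/etc/hosts", same))
	fmt.Println(changed(index, "/etc/hosts", touched))
	fmt.Println(changed(index, "/etc/passwd", same))
}
```

Unchanged files are skipped without reading their contents, which is where most of the incremental speedup comes from.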

## does not

* Store any secrets on the backed-up machine
@@ -141,6 +201,33 @@ The entire system is restore-only from object storage.

---

## features

### daemon mode

* Continuous background operation
* inotify-based change detection
* Respects `backup_interval` and `min_time_between_run`
* Full scan every `full_scan_interval` (default 24h)
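One way the three daemon settings could interact is sketched below in Go. The README names the options but does not define their exact semantics, so this is a guess: `min_time_between_run` is treated as a hard lower bound between runs, `backup_interval` as the longest the daemon waits when no changes are pending, and `full_scan_interval` as the trigger for a full scan. The `schedule` type and `nextAction` function are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// schedule mirrors the three config options named in the README.
type schedule struct {
	BackupInterval    time.Duration // backup_interval
	MinTimeBetweenRun time.Duration // min_time_between_run
	FullScanInterval  time.Duration // full_scan_interval
}

type action int

const (
	actWait action = iota
	actIncremental
	actFullScan
)

// nextAction decides what the daemon should do now (assumed semantics).
func nextAction(s schedule, sinceLastRun, sinceLastFullScan time.Duration, changesPending bool) action {
	if sinceLastRun < s.MinTimeBetweenRun {
		return actWait // never run more often than the configured minimum
	}
	if sinceLastFullScan >= s.FullScanInterval {
		return actFullScan
	}
	if changesPending || sinceLastRun >= s.BackupInterval {
		return actIncremental
	}
	return actWait
}

func main() {
	cfg := schedule{
		BackupInterval:    time.Hour,
		MinTimeBetweenRun: 15 * time.Minute,
		FullScanInterval:  24 * time.Hour,
	}
	// inotify reported changes, but the last run finished 5 minutes ago.
	fmt.Println(nextAction(cfg, 5*time.Minute, time.Hour, true) == actWait)
}
```

Keeping the decision a pure function of elapsed times makes it easy to test without a running daemon or real clocks.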

### cron mode

* Single backup run
* Silent output unless errors
* Ideal for scheduled backups

### metadata integrity

* SHA256 hash of metadata stored separately
* Encrypted hash file for verification
* Chunked metadata support for large filesystems
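The hash check behind metadata verification can be sketched as follows. The example assumes the stored hash is a hex-encoded SHA-256 digest and leaves out decryption of the hash file; `metadataDigest` and `verifyMetadata` are hypothetical names, not functions taken from vaultik.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// metadataDigest returns the hex SHA-256 of the metadata bytes, i.e. the
// value that would be stored separately at backup time.
func metadataDigest(metadata []byte) string {
	sum := sha256.Sum256(metadata)
	return hex.EncodeToString(sum[:])
}

// verifyMetadata recomputes the digest and compares it with the stored one.
// Malformed or wrong-length digests are rejected.
func verifyMetadata(metadata []byte, wantHex string) bool {
	want, err := hex.DecodeString(wantHex)
	if err != nil || len(want) != sha256.Size {
		return false
	}
	got := sha256.Sum256(metadata)
	return subtle.ConstantTimeCompare(got[:], want) == 1
}

func main() {
	meta := []byte("snapshot metadata bytes")
	digest := metadataDigest(meta)

	fmt.Println(verifyMetadata(meta, digest))
	fmt.Println(verifyMetadata([]byte("tampered"), digest))
}
```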

### exclusion patterns

* Glob-based file exclusion
* Configured in YAML
* Applied during directory walk
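Applying glob exclusions during a directory walk might look like the Go sketch below. It matches each pattern against both the relative path and the base name using `path.Match` and skips excluded directories entirely. The matching rules are an assumption, since the README does not specify them, and `excluded` and `walk` are illustrative names. An in-memory filesystem from `testing/fstest` stands in for real directories.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// excluded reports whether rel matches any pattern, either as a whole
// relative path or by its base name (assumed matching rule).
func excluded(rel string, patterns []string) bool {
	for _, p := range patterns {
		if ok, _ := path.Match(p, rel); ok {
			return true
		}
		if ok, _ := path.Match(p, path.Base(rel)); ok {
			return true
		}
	}
	return false
}

// walk returns the files under fsys that survive the exclusion patterns.
func walk(fsys fs.FS, patterns []string) ([]string, error) {
	var files []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if p != "." && excluded(p, patterns) {
			if d.IsDir() {
				return fs.SkipDir // do not descend into excluded directories
			}
			return nil
		}
		if !d.IsDir() {
			files = append(files, p)
		}
		return nil
	})
	return files, err
}

func main() {
	fsys := fstest.MapFS{
		"etc/hosts":          {Data: []byte("127.0.0.1 localhost\n")},
		"var/cache/a.tmp":    {Data: []byte("scratch")},
		"home/u/notes.txt":   {Data: []byte("notes")},
		"home/u/.cache/blob": {Data: []byte("cached")},
	}
	files, err := walk(fsys, []string{"*.tmp", ".cache"})
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println(files)
}
```

Filtering inside the walk callback means excluded subtrees are never read, so large cache directories cost nothing.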

## prune

Run `vaultik prune` on a machine with the private key. It:
@@ -160,6 +247,30 @@ WTFPL — see LICENSE.

---

## security considerations

* Source host compromise cannot decrypt backups
* No replay attacks possible (append-only)
* Each blob independently encrypted
* Metadata tampering detectable via hash verification
* S3 credentials only allow write access to backup prefix

## performance

* Streaming processing (no temp files)
* Parallel blob uploads
* Deduplication reduces storage and bandwidth
* Local index enables fast incremental detection
* Configurable compression levels
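Parallel blob uploads could be structured as a small worker pool. The Go sketch below fans blob keys out to a fixed number of goroutines and collects one error slot per blob. The `put` callback stands in for a real S3 upload, and `uploadAll`, `errSimulated`, and the worker count are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errSimulated is used by the demo to model a failed upload.
var errSimulated = errors.New("simulated upload failure")

// uploadAll calls put for every key using at most `workers` goroutines and
// returns one error slot per key (nil on success), in input order.
func uploadAll(keys []string, workers int, put func(key string) error) []error {
	if workers < 1 {
		workers = 1
	}
	errs := make([]error, len(keys))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				errs[i] = put(keys[i]) // each index is written by one worker only
			}
		}()
	}
	for i := range keys {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return errs
}

func main() {
	keys := []string{"aa/bb/aabb01", "cc/dd/ccdd02", "ee/ff/eeff03"}
	errs := uploadAll(keys, 2, func(key string) error {
		if key == "cc/dd/ccdd02" {
			return errSimulated
		}
		return nil
	})
	for i, err := range errs {
		fmt.Println(keys[i], err)
	}
}
```

Returning per-blob errors instead of aborting on the first failure lets the caller retry only the blobs that failed.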

## requirements

* Go 1.24.4 or later
* S3-compatible object storage
* age command-line tool (for key generation)
* SQLite3
* Sufficient disk space for local index

## author

sneak