Document complete vaultik architecture and implementation plan

- Expand README with full CLI documentation, architecture details, and features
- Add comprehensive 87-step implementation plan to DESIGN.md
- Document all commands, configuration options, and security considerations
- Define complete API signatures and data structures

parent 67319a4699
commit 0df07790ba

DESIGN.md | 117
@@ -359,4 +359,119 @@ func RunPrune(bucket, prefix, privateKey string) error

## Implementation TODO

To be completed by claude
### Phase 1: Core Infrastructure
1. Set up Go module and project structure
2. Create Makefile with test, fmt, and lint targets
3. Set up cobra CLI skeleton with all commands
4. Implement config loading and validation from YAML
5. Create data structures for FileInfo, ChunkInfo, BlobInfo, etc.

### Phase 2: Local Index Database
6. Implement SQLite schema creation and migrations
7. Create Index type with all database operations
8. Add transaction support and proper locking
9. Implement file tracking (save, lookup, delete)
10. Implement chunk tracking and deduplication
11. Implement blob tracking and chunk-to-blob mapping
12. Write tests for all index operations

### Phase 3: Chunking and Hashing
13. Implement Rabin fingerprint chunker
14. Create streaming chunk processor
15. Implement SHA256 hashing for chunks
16. Add configurable chunk size parameters
17. Write tests for chunking consistency

### Phase 4: Compression and Encryption
18. Implement zstd compression wrapper
19. Integrate age encryption library
20. Create Encryptor type for public key encryption
21. Create Decryptor type for private key decryption
22. Implement streaming encrypt/decrypt pipelines
23. Write tests for compression and encryption

### Phase 5: Blob Packing
24. Implement BlobWriter with size limits
25. Add chunk accumulation and flushing
26. Create blob hash calculation
27. Implement proper error handling and rollback
28. Write tests for blob packing scenarios
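
Steps 24, 25, and 27 can be sketched with the standard library alone. The type name `BlobPacker`, its fields, and the `flush` callback are hypothetical stand-ins for the planned `BlobWriter`; a real blob would also need framing so that individual chunks can be located again, and would be compressed and encrypted before upload.

```go
package main

import "bytes"

// BlobPacker is a hypothetical stand-in for the BlobWriter of step 24.
// It accumulates chunks in memory and hands a finished blob to flush
// whenever the next chunk would push the blob past maxSize.
type BlobPacker struct {
	maxSize int
	buf     bytes.Buffer
	flush   func(blob []byte) error // receives each completed blob
}

func NewBlobPacker(maxSize int, flush func([]byte) error) *BlobPacker {
	return &BlobPacker{maxSize: maxSize, flush: flush}
}

// Add appends one chunk, flushing the current blob first if the chunk would
// not fit. A chunk larger than maxSize still ends up in a blob of its own.
func (p *BlobPacker) Add(chunk []byte) error {
	if p.buf.Len() > 0 && p.buf.Len()+len(chunk) > p.maxSize {
		if err := p.Flush(); err != nil {
			return err
		}
	}
	_, err := p.buf.Write(chunk)
	return err
}

// Flush emits whatever has accumulated. It does nothing on an empty packer.
func (p *BlobPacker) Flush() error {
	if p.buf.Len() == 0 {
		return nil
	}
	blob := append([]byte(nil), p.buf.Bytes()...) // copy: the buffer is reused
	if err := p.flush(blob); err != nil {
		return err // keep the buffered chunks so the caller can retry
	}
	p.buf.Reset()
	return nil
}
```

The buffer is cleared only after `flush` succeeds, which is one simple reading of the "rollback" in step 27: a failed upload leaves the pending chunks in place.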

### Phase 6: S3 Operations
29. Integrate MinIO client library
30. Implement S3Client wrapper type
31. Add multipart upload support for large blobs
32. Implement retry logic with exponential backoff
33. Add connection pooling and timeout handling
34. Write tests using MinIO container
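
Step 32 does not depend on any particular S3 client, so it can be illustrated in isolation. This is a minimal sketch: `Retry`, `RetryUpload`, `ErrTransient`, and the default of 5 attempts starting at 10ms are all assumptions made for the example, not part of the design.

```go
package main

import (
	"context"
	"errors"
	"time"
)

// ErrTransient is a hypothetical marker for failures worth retrying
// (timeouts, throttling). Any other error is returned immediately.
var ErrTransient = errors.New("transient failure")

// Retry calls op up to attempts times, doubling the wait after each
// transient failure and giving up early if ctx is cancelled.
func Retry(ctx context.Context, attempts int, base time.Duration, op func() error) error {
	delay := base
	var err error
	for i := 0; i < attempts; i++ {
		err = op()
		if err == nil || !errors.Is(err, ErrTransient) {
			return err
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-time.After(delay):
			delay *= 2
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return err
}

// RetryUpload shows one possible call site with made-up defaults.
func RetryUpload(op func() error) error {
	return Retry(context.Background(), 5, 10*time.Millisecond, op)
}
```

A production version would probably add random jitter to the delay so that parallel uploads do not retry in lockstep.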

### Phase 7: Backup Command - Basic
35. Implement directory walking with exclusion patterns
36. Add file change detection using index
37. Integrate chunking pipeline for changed files
38. Implement blob upload coordination
39. Add progress reporting to stderr
40. Write integration tests for backup

### Phase 8: Snapshot Metadata
41. Implement snapshot metadata extraction from index
42. Create SQLite snapshot database builder
43. Add metadata compression and encryption
44. Implement metadata chunking for large snapshots
45. Add hash calculation and verification
46. Implement metadata upload to S3

47. Write tests for metadata operations

### Phase 9: Restore Command
48. Implement snapshot listing and selection
49. Add metadata download and reconstruction
50. Implement hash verification for metadata
51. Create file restoration logic with chunk retrieval
52. Add blob caching for efficiency
53. Implement proper file permissions and mtime restoration
54. Write integration tests for restore
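
Step 53 maps onto two standard-library calls. The sketch below assumes the snapshot metadata yields an `os.FileMode` and a `time.Time` per file; the function names and the `demo` helper are illustrative only.

```go
package main

import (
	"os"
	"path/filepath"
	"time"
)

// restoreAttrs applies a recorded permission mode and modification time to
// a file whose contents have already been written.
func restoreAttrs(path string, mode os.FileMode, mtime time.Time) error {
	if err := os.Chmod(path, mode.Perm()); err != nil {
		return err
	}
	// Chtimes needs an access time as well; reusing mtime is an assumption.
	return os.Chtimes(path, mtime, mtime)
}

// demo restores attributes onto a scratch file and reads them back.
func demo() (os.FileMode, int64, error) {
	dir, err := os.MkdirTemp("", "vaultik-sketch")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	p := filepath.Join(dir, "restored.txt")
	if err := os.WriteFile(p, []byte("data"), 0o600); err != nil {
		return 0, 0, err
	}
	if err := restoreAttrs(p, 0o640, time.Unix(1700000000, 0)); err != nil {
		return 0, 0, err
	}
	fi, err := os.Stat(p)
	if err != nil {
		return 0, 0, err
	}
	return fi.Mode().Perm(), fi.ModTime().Unix(), nil
}
```

Attributes are applied after the contents are written, since a restored read-only mode would otherwise block the write, and any later write would bump the mtime again.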

### Phase 10: Prune Command
55. Implement latest snapshot detection
56. Add referenced blob extraction from metadata
57. Create S3 blob listing and comparison
58. Implement safe deletion of unreferenced blobs
59. Add dry-run mode for safety
60. Write tests for prune scenarios

### Phase 11: Verify Command
61. Implement metadata integrity checking
62. Add blob existence verification
63. Create optional deep verification mode
64. Implement detailed error reporting
65. Write tests for verification

### Phase 12: Fetch Command
66. Implement single-file metadata query
67. Add minimal blob downloading for file
68. Create streaming file reconstruction
69. Add support for output redirection
70. Write tests for fetch command

### Phase 13: Daemon Mode
71. Implement inotify watcher for Linux
72. Add dirty path tracking in index
73. Create periodic full scan scheduler
74. Implement backup interval enforcement
75. Add proper signal handling and shutdown
76. Write tests for daemon behavior

### Phase 14: Cron Mode
77. Implement silent operation mode
78. Add proper exit codes for cron
79. Implement lock file to prevent concurrent runs
80. Add error summary reporting
81. Write tests for cron mode
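
One common way to approach step 79 is exclusive file creation. This is a sketch under that assumption; `acquireLock`, `ErrLocked`, and the lock file name used in examples are hypothetical, and the plan does not say which locking mechanism vaultik will use.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// ErrLocked reports that another run appears to hold the lock.
var ErrLocked = errors.New("another run holds the lock")

// acquireLock creates path exclusively. O_EXCL makes the open fail if the
// file already exists, so only one process can succeed. The returned
// function removes the lock file again.
func acquireLock(path string) (release func() error, err error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
	if err != nil {
		if errors.Is(err, os.ErrExist) {
			return nil, ErrLocked
		}
		return nil, err
	}
	// Record the PID so a human can judge whether a leftover lock is stale.
	_, werr := fmt.Fprintf(f, "%d\n", os.Getpid())
	cerr := f.Close()
	if err := errors.Join(werr, cerr); err != nil {
		os.Remove(path)
		return nil, err
	}
	return func() error { return os.Remove(path) }, nil
}
```

A crash leaves the file behind, so this approach needs some stale-lock handling; an advisory lock held on an open file descriptor avoids that at the cost of portability.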

### Phase 15: Finalization
82. Add comprehensive logging throughout
83. Implement proper error wrapping and context
84. Add performance metrics collection
85. Create end-to-end integration tests
86. Write documentation and examples
87. Set up CI/CD pipeline

README.md | 115
@@ -97,17 +97,77 @@ Existing backup software fails under one or more of these conditions:

## cli

### commands

```sh
vaultik backup /etc/vaultik.yaml [--cron] [--daemon]
vaultik restore <bucket> <prefix> <snapshot_id> <target_dir>
vaultik prune <bucket> <prefix>
vaultik fetch <bucket> <prefix> <snapshot_id> <filepath> <target_fileordir>
vaultik verify <bucket> <prefix> [<snapshot_id>]
```

### environment

* `VAULTIK_PRIVATE_KEY`: Required for `restore`, `prune`, `fetch`, and `verify` commands. Contains the age private key for decryption.

### command details

**backup**: Perform incremental backup of configured directories
* `--cron`: Silent unless error (for crontab)
* `--daemon`: Run continuously with inotify monitoring and periodic scans

**restore**: Restore entire snapshot to target directory
* Downloads and decrypts metadata
* Fetches only required blobs
* Reconstructs directory structure

**prune**: Remove unreferenced blobs from storage
* Requires private key
* Downloads latest snapshot metadata
* Deletes orphaned blobs
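
The core of prune is a set difference between what storage holds and what the snapshot metadata references. A minimal sketch, assuming both sides are available as lists of blob keys (the function name and the string-key representation are illustrative):

```go
package main

import "sort"

// orphanedBlobs returns the keys that exist in storage but are not
// referenced by the snapshot metadata, sorted for stable output.
func orphanedBlobs(stored, referenced []string) []string {
	ref := make(map[string]struct{}, len(referenced))
	for _, k := range referenced {
		ref[k] = struct{}{}
	}
	var orphans []string
	for _, k := range stored {
		if _, ok := ref[k]; !ok {
			orphans = append(orphans, k)
		}
	}
	sort.Strings(orphans)
	return orphans
}
```

Computing the list separately from deleting it makes a dry-run mode straightforward: print the result instead of acting on it.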

**fetch**: Extract single file from backup
* Retrieves specific file without full restore
* Supports extracting to different filename

**verify**: Validate backup integrity
* Checks metadata hash
* Verifies all referenced blobs exist
* Validates chunk integrity

---

## architecture

### chunking

* Content-defined chunking using rolling hash (Rabin fingerprint)
* Average chunk size: 10MB (configurable)
* Deduplication at chunk level
* Multiple chunks packed into blobs for efficiency
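
To illustrate content-defined chunking, the sketch below cuts wherever a rolling hash hits a fixed bit pattern, so boundaries follow content rather than byte offsets. It deliberately swaps in a simpler gear-style rolling hash (the kind used by FastCDC) instead of a Rabin fingerprint, and every name and parameter in it is an assumption. Sizes in a real run would be far larger than any toy values, given the 10MB average above.

```go
package main

// gear holds one pseudo-random 64-bit value per byte value. It is filled by
// a fixed-seed splitmix64 generator so boundaries are identical on every run.
var gear = func() (t [256]uint64) {
	x := uint64(0x9E3779B97F4A7C15)
	for i := range t {
		x += 0x9E3779B97F4A7C15
		z := x
		z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9
		z = (z ^ (z >> 27)) * 0x94D049BB133111EB
		t[i] = z ^ (z >> 31)
	}
	return
}()

// splitChunks cuts data where the top maskBits bits of the rolling hash are
// all zero, subject to minSize and maxSize (maskBits must be in 1..63).
// With a large maxSize the average chunk is roughly minSize + 2^maskBits bytes.
func splitChunks(data []byte, minSize, maxSize int, maskBits uint) [][]byte {
	var chunks [][]byte
	start := 0
	var h uint64
	for i, b := range data {
		// Shifting ages out old bytes: only the last 64 bytes affect h.
		h = h<<1 + gear[b]
		size := i - start + 1
		if size < minSize {
			continue
		}
		if h>>(64-maskBits) == 0 || size >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
			h = 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing partial chunk
	}
	return chunks
}
```

Each returned chunk would then be hashed with SHA256, and a chunk whose hash is already in the index can be skipped, which is where the deduplication comes from.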

### encryption

* Asymmetric encryption using age (X25519 + ChaCha20-Poly1305)
* Only public key needed on source host
* Each blob encrypted independently
* Metadata databases also encrypted

### storage

* Content-addressed blob storage
* Immutable append-only design
* Two-level directory sharding for blobs (aa/bb/hash)
* Compressed with zstd before encryption
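
The `aa/bb/hash` layout can be derived directly from a hex digest. In this sketch the use of SHA256 for the blob key, the decision to hash the bytes passed in, and the `blobs` prefix in the usage below are assumptions; the README does not say which bytes the blob hash covers.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"path"
)

// blobKey maps blob bytes to a content-addressed key of the form
// <prefix>/aa/bb/<hash>, where aa and bb are the first two bytes of the
// hex digest.
func blobKey(prefix string, blob []byte) string {
	sum := sha256.Sum256(blob)
	h := hex.EncodeToString(sum[:])
	return path.Join(prefix, h[0:2], h[2:4], h)
}
```

For example, `blobKey("blobs", nil)` hashes the empty input and yields a key under `blobs/e3/b0/`. Identical content always maps to the same key, which is what makes the store content-addressed.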

### state tracking

* Local SQLite database for incremental state
* Tracks file mtimes and chunk mappings
* Enables efficient change detection
* Supports inotify monitoring in daemon mode
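
Change detection against the index amounts to comparing a fresh scan with the stored state. The sketch uses in-memory maps in place of SQLite to stay self-contained; `fileState`, `diffIndex`, and the decision to track size alongside mtime are assumptions.

```go
package main

import "sort"

// fileState is a hypothetical subset of the per-file data an index might
// hold. The README only mentions mtimes; tracking size too is an assumption.
type fileState struct {
	Size    int64
	ModTime int64 // Unix nanoseconds
}

// diffIndex compares the previous index with a fresh scan. It returns the
// paths that are new or modified (and so need re-chunking) and the paths
// that have disappeared. Both slices are sorted for stable output.
func diffIndex(prev, cur map[string]fileState) (changed, deleted []string) {
	for p, now := range cur {
		if old, ok := prev[p]; !ok || old != now {
			changed = append(changed, p)
		}
	}
	for p := range prev {
		if _, ok := cur[p]; !ok {
			deleted = append(deleted, p)
		}
	}
	sort.Strings(changed)
	sort.Strings(deleted)
	return changed, deleted
}
```

Unchanged files never reach the chunker, which is what keeps incremental runs cheap.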

## does not

* Store any secrets on the backed-up machine
@@ -141,6 +201,33 @@ The entire system is restore-only from object storage.

---

## features

### daemon mode

* Continuous background operation
* inotify-based change detection
* Respects `backup_interval` and `min_time_between_run`
* Full scan every `full_scan_interval` (default 24h)

### cron mode

* Single backup run
* Silent output unless errors
* Ideal for scheduled backups

### metadata integrity

* SHA256 hash of metadata stored separately
* Encrypted hash file for verification
* Chunked metadata support for large filesystems
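
Checking downloaded metadata against its separately stored hash could look like the sketch below. It assumes the hash file, once decrypted, contains a hex-encoded SHA256 digest; that format, and the names `verifySHA256`, `verifyBytes`, and `ErrHashMismatch`, are not confirmed by the README.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"errors"
	"io"
	"strings"
)

// ErrHashMismatch reports that the data does not match the expected digest.
var ErrHashMismatch = errors.New("metadata hash mismatch")

// verifySHA256 streams r through SHA256 and compares the digest with
// wantHex, so large metadata never has to sit in memory at once.
func verifySHA256(r io.Reader, wantHex string) error {
	want, err := hex.DecodeString(strings.TrimSpace(wantHex))
	if err != nil {
		return err
	}
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return err
	}
	if subtle.ConstantTimeCompare(h.Sum(nil), want) != 1 {
		return ErrHashMismatch
	}
	return nil
}

// verifyBytes is a convenience wrapper for data already in memory.
func verifyBytes(data []byte, wantHex string) error {
	return verifySHA256(bytes.NewReader(data), wantHex)
}
```

With chunked metadata, the same function can run over an `io.MultiReader` of the parts, provided the stored hash covers the concatenation; whether it does is another assumption.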

### exclusion patterns

* Glob-based file exclusion
* Configured in YAML
* Applied during directory walk
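
A minimal sketch of glob exclusion applied during a walk, using only `path/filepath`. The matching semantics (each pattern tried against both the relative path and the base name) and the function names are assumptions; the README does not define how patterns are interpreted.

```go
package main

import (
	"io/fs"
	"path/filepath"
)

// excluded reports whether rel matches any pattern. Each pattern is tried
// against the whole relative path and against the base name alone.
func excluded(rel string, patterns []string) (bool, error) {
	for _, p := range patterns {
		for _, candidate := range []string{rel, filepath.Base(rel)} {
			ok, err := filepath.Match(p, candidate)
			if err != nil {
				return false, err // malformed pattern
			}
			if ok {
				return true, nil
			}
		}
	}
	return false, nil
}

// walk lists regular files under root and prunes excluded directories
// entirely, so their contents are never visited.
func walk(root string, patterns []string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		if rel != "." {
			skip, err := excluded(rel, patterns)
			if err != nil {
				return err
			}
			if skip {
				if d.IsDir() {
					return filepath.SkipDir
				}
				return nil
			}
		}
		if d.Type().IsRegular() {
			files = append(files, p)
		}
		return nil
	})
	return files, err
}
```

Note that `filepath.Match` has no `**` wildcard and `*` does not cross path separators, so recursive patterns would need a dedicated glob library.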

## prune

Run `vaultik prune` on a machine with the private key. It:
@@ -160,6 +247,30 @@ WTFPL — see LICENSE.

---

## security considerations

* Source host compromise cannot decrypt backups
* No replay attacks possible (append-only)
* Each blob independently encrypted
* Metadata tampering detectable via hash verification
* S3 credentials only allow write access to backup prefix

## performance

* Streaming processing (no temp files)
* Parallel blob uploads
* Deduplication reduces storage and bandwidth
* Local index enables fast incremental detection
* Configurable compression levels

## requirements

* Go 1.24.4 or later
* S3-compatible object storage
* age command-line tool (for key generation)
* SQLite3
* Sufficient disk space for local index

## author

sneak