Add custom types, version command, and restore --verify flag
- Add internal/types package with type-safe wrappers for IDs, hashes, paths, and credentials (FileID, BlobID, ChunkHash, etc.)
- Implement driver.Valuer and sql.Scanner for UUID-based types
- Add `vaultik version` command showing version, commit, go version
- Add `--verify` flag to restore command that checksums all restored files against expected chunk hashes with progress bar
- Remove fetch.go (dead code, functionality in restore)
- Clean up TODO.md, remove completed items
- Update all database and snapshot code to use new custom types
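As a rough illustration of the `driver.Valuer` / `sql.Scanner` item, the sketch below shows one way a UUID-backed ID type could satisfy both interfaces using only the standard library. The `FileID` name comes from the list above, but its representation as a string, the `validUUID` helper, and the error messages are assumptions made for this example and may not match the actual internal/types package.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"fmt"
)

// FileID is a hypothetical UUID-backed identifier stored as canonical text.
// The real type in internal/types may be defined differently.
type FileID string

// Compile-time checks that FileID satisfies both database interfaces.
var (
	_ driver.Valuer = FileID("")
	_ sql.Scanner   = (*FileID)(nil)
)

// validUUID (hypothetical helper) reports whether s has the canonical
// 8-4-4-4-12 hexadecimal layout.
func validUUID(s string) bool {
	if len(s) != 36 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch i {
		case 8, 13, 18, 23:
			if c != '-' {
				return false
			}
		default:
			isHex := (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')
			if !isHex {
				return false
			}
		}
	}
	return true
}

// Value implements driver.Valuer: the ID is written to the database as text.
func (id FileID) Value() (driver.Value, error) {
	if !validUUID(string(id)) {
		return nil, fmt.Errorf("invalid FileID %q", string(id))
	}
	return string(id), nil
}

// Scan implements sql.Scanner: it accepts the string or []byte forms a
// driver may return for a text column and rejects everything else.
func (id *FileID) Scan(src any) error {
	var s string
	switch v := src.(type) {
	case string:
		s = v
	case []byte:
		s = string(v)
	default:
		return fmt.Errorf("cannot scan %T into FileID", src)
	}
	if !validUUID(s) {
		return fmt.Errorf("invalid FileID %q", s)
	}
	*id = FileID(s)
	return nil
}

func main() {
	var id FileID
	if err := id.Scan([]byte("123e4567-e89b-12d3-a456-426614174000")); err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	v, err := id.Value()
	fmt.Println(v, err)
}
```

A distinct named type per ID kind means the compiler rejects passing, say, a blob ID where a file ID is expected, which is presumably the motivation for the type-safe wrappers.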
268
TODO.md
@@ -1,177 +1,107 @@
# Implementation TODO
# Vaultik 1.0 TODO

## Proposed: Store and Snapshot Commands

### Overview
Reorganize commands to provide better visibility into stored data and snapshots.

### Command Structure

#### `vaultik store` - Storage information commands
- `vaultik store info`
  - Lists S3 bucket configuration
  - Shows total number of snapshots (from metadata/ listing)
  - Shows total number of blobs (from blobs/ listing)
  - Shows total size of all blobs
  - **No decryption required** - uses S3 listing only

#### `vaultik snapshot` - Snapshot management commands
- `vaultik snapshot create [path]`
  - Renamed from `vaultik backup`
  - Same functionality as current backup command

- `vaultik snapshot list [--json]`
  - Lists all snapshots with:
    - Snapshot ID
    - Creation timestamp (parsed from snapshot ID)
    - Compressed size (sum of referenced blob sizes from manifest)
  - **No decryption required** - uses blob manifests only
  - `--json` flag outputs in JSON format instead of table

- `vaultik snapshot purge`
  - Requires one of:
    - `--keep-latest` - keeps only the most recent snapshot
    - `--older-than <duration>` - removes snapshots older than duration (e.g., "30d", "6m", "1y")
  - Removes snapshot metadata and runs pruning to clean up unreferenced blobs
  - Shows what would be deleted and requires confirmation

- `vaultik snapshot verify [--deep] <snapshot-id>`
  - Basic mode: Verifies all blobs referenced in manifest exist in S3
  - `--deep` mode: Downloads each blob and verifies its hash matches the stored hash
  - **Stub implementation for now**

- `vaultik snapshot remove <snapshot-id>` (alias: `rm`)
  - Removes a snapshot and any blobs that become orphaned
  - Algorithm:
    1. Validate target snapshot exists in storage
    2. List all snapshots in storage
    3. Download manifests from all OTHER snapshots to build "in-use" blob set
    4. Download target snapshot's manifest to get its blob hashes
    5. Identify orphaned blobs: target blobs NOT in the in-use set
    6. Delete orphaned blobs from storage
    7. Delete snapshot metadata using existing `deleteSnapshot()` helper
  - Flags:
    - `--force` / `-f`: Skip confirmation prompt
    - `--dry-run`: Show what would be deleted without deleting
  - Files to modify:
    - `internal/cli/snapshot.go`: Add `newSnapshotRemoveCommand()`
    - `internal/vaultik/snapshot.go`: Add `RemoveSnapshot()` method
  - Reuse existing code:
    - Snapshot enumeration pattern from `PruneBlobs()` in `prune.go`
    - `v.downloadManifest(snapshotID)` for manifest downloading
    - Blob path format: `blobs/{hash[:2]}/{hash[2:4]}/{hash}`
    - `v.deleteSnapshot(snapshotID)` for metadata deletion

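The orphan-detection part of the `snapshot remove` algorithm (steps 3 to 5) and the blob path format can be sketched as a plain set difference. This is only an illustration under stated assumptions: the function names `orphanedBlobs` and `blobPath` are hypothetical, manifests are modeled as slices of blob hash strings, and no S3 access is shown.

```go
package main

import (
	"fmt"
	"sort"
)

// blobPath (hypothetical helper) builds the storage key for a blob hash
// using the layout blobs/{hash[:2]}/{hash[2:4]}/{hash}.
func blobPath(hash string) (string, error) {
	if len(hash) < 4 {
		return "", fmt.Errorf("blob hash %q is too short", hash)
	}
	return "blobs/" + hash[:2] + "/" + hash[2:4] + "/" + hash, nil
}

// orphanedBlobs (hypothetical helper) returns, in sorted order, the hashes
// in target that are not referenced by any of the other manifests.
func orphanedBlobs(target []string, others [][]string) []string {
	// Step 3: build the "in-use" set from every other snapshot.
	inUse := make(map[string]struct{})
	for _, manifest := range others {
		for _, h := range manifest {
			inUse[h] = struct{}{}
		}
	}
	// Steps 4-5: keep target hashes that are absent from the in-use set,
	// skipping duplicates within the target manifest.
	seen := make(map[string]struct{})
	var orphans []string
	for _, h := range target {
		if _, used := inUse[h]; used {
			continue
		}
		if _, dup := seen[h]; dup {
			continue
		}
		seen[h] = struct{}{}
		orphans = append(orphans, h)
	}
	sort.Strings(orphans)
	return orphans
}

func main() {
	target := []string{"aabbcc01", "ddeeff02", "11223344"}
	others := [][]string{{"aabbcc01"}, {"99887766", "11223344"}}
	for _, h := range orphanedBlobs(target, others) {
		p, err := blobPath(h)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(p) // prints blobs/dd/ee/ddeeff02
	}
}
```

Building the in-use set before looking at the target keeps the check safe by default: a blob is only reported as orphaned when no other manifest mentions it.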
### Implementation Notes

1. **No Decryption Required**: All commands work with unencrypted blob manifests
2. **Blob Manifests**: Located at `metadata/{snapshot-id}/manifest.json.zst`
3. **S3 Operations**: Use S3 ListObjects to enumerate snapshots and blobs
4. **Size Calculations**: Sum blob sizes from S3 object metadata
5. **Timestamp Parsing**: Extract from snapshot ID format (e.g., `2024-01-15-143052-hostname`)
6. **S3 Metadata**: Only used for `snapshot verify` command

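The timestamp-parsing note could look roughly like the following. The sketch assumes the ID always starts with a `YYYY-MM-DD-HHMMSS` prefix followed by `-` and a hostname, as in the example ID above, and that the timestamp is interpreted as UTC; neither the function name `snapshotTime` nor the timezone choice is taken from the Vaultik source.

```go
package main

import (
	"fmt"
	"time"
)

// snapshotTimeLayout is the Go reference-time layout matching the
// "2024-01-15-143052" prefix of a snapshot ID.
const snapshotTimeLayout = "2006-01-02-150405"

// snapshotTime (hypothetical helper) extracts the creation time from a
// snapshot ID of the form <timestamp>-<hostname>. The hostname may itself
// contain hyphens, so only the fixed-width prefix is parsed.
func snapshotTime(id string) (time.Time, error) {
	n := len(snapshotTimeLayout)
	if len(id) <= n || id[n] != '-' {
		return time.Time{}, fmt.Errorf("snapshot ID %q does not match <timestamp>-<hostname>", id)
	}
	// time.Parse returns UTC when the layout has no zone information
	// (assumption: snapshot IDs are recorded in UTC).
	return time.Parse(snapshotTimeLayout, id[:n])
}

func main() {
	ts, err := snapshotTime("2024-01-15-143052-hostname")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ts.Format(time.RFC3339)) // prints 2024-01-15T14:30:52Z
}
```

Parsing a fixed-width prefix rather than splitting on `-` avoids misreading hostnames such as `my-host`.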
### Benefits
- Users can see storage usage without decryption keys
- Snapshot management doesn't require access to encrypted metadata
- Clean separation between storage info and snapshot operations

## Chunking and Hashing
1. ~~Implement content-defined chunking~~ (done with FastCDC)
1. ~~Create streaming chunk processor~~ (done in chunker)
1. ~~Implement SHA256 hashing for chunks~~ (done in scanner)
1. ~~Add configurable chunk size parameters~~ (done in scanner)
1. ~~Write tests for chunking consistency~~ (done)

## Compression and Encryption
1. ~~Implement compression~~ (done with zlib in blob packer)
1. ~~Integrate age encryption library~~ (done in crypto package)
1. ~~Create Encryptor type for public key encryption~~ (done)
1. ~~Implement streaming encrypt/decrypt pipelines~~ (done in packer)
1. ~~Write tests for compression and encryption~~ (done)

## Blob Packing
1. ~~Implement BlobWriter with size limits~~ (done in packer)
1. ~~Add chunk accumulation and flushing~~ (done)
1. ~~Create blob hash calculation~~ (done)
1. ~~Implement proper error handling and rollback~~ (done with transactions)
1. ~~Write tests for blob packing scenarios~~ (done)

## S3 Operations
1. ~~Integrate MinIO client library~~ (done in s3 package)
1. ~~Implement S3Client wrapper type~~ (done)
1. ~~Add multipart upload support for large blobs~~ (done - using standard upload)
1. ~~Implement retry logic~~ (handled by MinIO client)
1. ~~Write tests using MinIO container~~ (done with testcontainers)

## Backup Command - Basic
1. ~~Implement directory walking with exclusion patterns~~ (done with afero)
1. Add file change detection using index
1. ~~Integrate chunking pipeline for changed files~~ (done in scanner)
1. Implement blob upload coordination to S3
1. Add progress reporting to stderr
1. Write integration tests for backup

## Snapshot Metadata
1. Implement snapshot metadata extraction from index
1. Create SQLite snapshot database builder
1. Add metadata compression and encryption
1. Implement metadata chunking for large snapshots
1. Add hash calculation and verification
1. Implement metadata upload to S3
1. Write tests for metadata operations
Linear list of tasks to complete before 1.0 release.

## Restore Command
1. Implement snapshot listing and selection
1. Add metadata download and reconstruction
1. Implement hash verification for metadata
1. Create file restoration logic with chunk retrieval
1. Add blob caching for efficiency
1. Implement proper file permissions and mtime restoration
1. Write integration tests for restore

## Prune Command
1. Implement latest snapshot detection
1. Add referenced blob extraction from metadata
1. Create S3 blob listing and comparison
1. Implement safe deletion of unreferenced blobs
1. Add dry-run mode for safety
1. Write tests for prune scenarios

## Verify Command
1. Implement metadata integrity checking
1. Add blob existence verification
1. Implement quick mode (S3 hash checking)
1. Implement deep mode (download and verify chunks)
1. Add detailed error reporting
1. Write tests for verification

## Fetch Command
1. Implement single-file metadata query
1. Add minimal blob downloading for file
1. Create streaming file reconstruction
1. Add support for output redirection
1. Write tests for fetch command
1. Write integration tests for restore command

## Daemon Mode
1. Implement inotify watcher for Linux
1. Add dirty path tracking in index
1. Create periodic full scan scheduler
1. Implement backup interval enforcement
1. Add proper signal handling and shutdown
1. Write tests for daemon behavior

## Cron Mode
1. Implement silent operation mode
1. Add proper exit codes for cron
1. Implement lock file to prevent concurrent runs
1. Add error summary reporting
1. Write tests for cron mode
1. Implement inotify file watcher for Linux
   - Watch source directories for changes
   - Track dirty paths in memory

## Finalization
1. Add comprehensive logging throughout
1. Implement proper error wrapping and context
1. Add performance metrics collection
1. Create end-to-end integration tests
1. Write documentation and examples
1. Set up CI/CD pipeline
1. Implement FSEvents watcher for macOS
   - Watch source directories for changes
   - Track dirty paths in memory

1. Implement backup scheduler in daemon mode
   - Respect backup_interval config
   - Trigger backup when dirty paths exist and interval elapsed
   - Implement full_scan_interval for periodic full scans

1. Add proper signal handling for daemon
   - Graceful shutdown on SIGTERM/SIGINT
   - Complete in-progress backup before exit

1. Write tests for daemon mode

## CLI Polish

1. Add `--quiet` flag to all commands
   - Suppress non-error output
   - Useful for scripting

1. Add `--json` output flag to more commands
   - `snapshot verify` - output verification results as JSON
   - `snapshot remove` - output deletion stats as JSON
   - `prune` - output pruning stats as JSON

1. Improve error messages throughout
   - Ensure all errors include actionable context
   - Add suggestions for common issues

## Testing

1. Write end-to-end integration test
   - Create backup
   - Verify backup
   - Restore backup
   - Compare restored files to originals

1. Add tests for edge cases
   - Empty directories
   - Symlinks
   - Special characters in filenames
   - Very large files (multi-GB)
   - Many small files (100k+)

1. Add tests for error conditions
   - Network failures during upload
   - Disk full during restore
   - Corrupted blobs
   - Missing blobs

## Documentation

1. Add man page or --help improvements
   - Detailed help for each command
   - Examples in help output

## Performance

1. Profile and optimize restore performance
   - Parallel blob downloads
   - Streaming decompression/decryption
   - Efficient chunk reassembly

1. Add bandwidth limiting option
   - `--bwlimit` flag for upload/download speed limiting

## Security

1. Audit encryption implementation
   - Verify age encryption is used correctly
   - Ensure no plaintext leaks in logs or errors

1. Add config file permission check
   - Warn if config file is world-readable (contains secrets)

1. Secure memory handling for secrets
   - Clear age_secret_key from memory after use

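The config file permission check listed above might be as small as the sketch below. It assumes Unix-style permission bits (on Windows the mode bits do not carry the same meaning), and the names `worldReadable` and `warnIfWorldReadable` as well as the warning text are made up for this example.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
)

// worldReadable (hypothetical helper) reports whether the "other" read
// bit is set in the file's permission bits.
func worldReadable(mode fs.FileMode) bool {
	return mode.Perm()&0o004 != 0
}

// warnIfWorldReadable (hypothetical helper) prints a warning to stderr
// when the config file at path can be read by any local user.
func warnIfWorldReadable(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return fmt.Errorf("stat config file: %w", err)
	}
	if worldReadable(info.Mode()) {
		fmt.Fprintf(os.Stderr,
			"warning: %s is world-readable (mode %04o) and may contain secrets\n",
			path, uint32(info.Mode().Perm()))
	}
	return nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: permcheck <config-file>")
		os.Exit(2)
	}
	if err := warnIfWorldReadable(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```

Warning rather than failing matches the wording of the TODO item; a stricter variant could also flag group-readable files by testing `0o040`.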
## Final Polish

1. Ensure version is set correctly in releases

1. Create release process
   - Binary releases for supported platforms
   - Checksums for binaries
   - Release notes template

1. Final code review
   - Remove debug statements
   - Ensure consistent code style

1. Tag and release v1.0.0