Add custom types, version command, and restore --verify flag

- Add internal/types package with type-safe wrappers for IDs, hashes,
  paths, and credentials (FileID, BlobID, ChunkHash, etc.)
- Implement driver.Valuer and sql.Scanner for UUID-based types
- Add `vaultik version` command showing version, commit, and Go version
- Add `--verify` flag to the restore command that checksums all restored
  files against expected chunk hashes, with a progress bar
- Remove fetch.go (dead code, functionality in restore)
- Clean up TODO.md, remove completed items
- Update all database and snapshot code to use new custom types
2026-01-14 17:11:52 -08:00
parent 2afd54d693
commit 417b25a5f5
53 changed files with 2330 additions and 1581 deletions
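
As context for the `internal/types` change described in the commit message, here is a minimal sketch of what a UUID-backed ID implementing `driver.Valuer` and `sql.Scanner` might look like. The field layout, the `github.com/google/uuid` dependency, and string-based storage are assumptions for illustration, not the commit's actual code.

```go
package types

import (
	"database/sql/driver"
	"fmt"

	"github.com/google/uuid"
)

// FileID is a hypothetical type-safe wrapper: embedding the UUID keeps
// file IDs from being confused with BlobID or other ID kinds at compile time.
type FileID struct{ uuid.UUID }

// Value implements driver.Valuer, storing the ID as its canonical string.
func (id FileID) Value() (driver.Value, error) {
	return id.UUID.String(), nil
}

// Scan implements sql.Scanner, accepting TEXT or BLOB columns.
func (id *FileID) Scan(src any) error {
	switch v := src.(type) {
	case string:
		u, err := uuid.Parse(v)
		if err != nil {
			return err
		}
		id.UUID = u
	case []byte:
		u, err := uuid.ParseBytes(v)
		if err != nil {
			return err
		}
		id.UUID = u
	default:
		return fmt.Errorf("cannot scan %T into FileID", src)
	}
	return nil
}
```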

TODO.md

@@ -1,177 +1,107 @@
# Implementation TODO
# Vaultik 1.0 TODO
## Proposed: Store and Snapshot Commands
### Overview
Reorganize commands to provide better visibility into stored data and snapshots.
### Command Structure
#### `vaultik store` - Storage information commands
- `vaultik store info`
- Lists S3 bucket configuration
- Shows total number of snapshots (from metadata/ listing)
- Shows total number of blobs (from blobs/ listing)
- Shows total size of all blobs
- **No decryption required** - uses S3 listing only
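
A rough sketch of how `store info` could gather blob counts and total size from a plain S3 listing using the MinIO client the project already uses; the package, function name, and prefix handling are illustrative only.

```go
package vaultik

import (
	"context"

	"github.com/minio/minio-go/v7"
)

// storeBlobStats counts blobs and sums their sizes by listing the
// blobs/ prefix; nothing is downloaded or decrypted.
func storeBlobStats(ctx context.Context, mc *minio.Client, bucket string) (int, int64, error) {
	var count int
	var totalSize int64
	opts := minio.ListObjectsOptions{Prefix: "blobs/", Recursive: true}
	for obj := range mc.ListObjects(ctx, bucket, opts) {
		if obj.Err != nil {
			return 0, 0, obj.Err
		}
		count++
		totalSize += obj.Size
	}
	return count, totalSize, nil
}
```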
#### `vaultik snapshot` - Snapshot management commands
- `vaultik snapshot create [path]`
- Renamed from `vaultik backup`
- Same functionality as current backup command
- `vaultik snapshot list [--json]`
- Lists all snapshots with:
- Snapshot ID
- Creation timestamp (parsed from snapshot ID)
- Compressed size (sum of referenced blob sizes from manifest)
- **No decryption required** - uses blob manifests only
- `--json` flag outputs in JSON format instead of table
- `vaultik snapshot purge`
- Requires one of:
- `--keep-latest` - keeps only the most recent snapshot
- `--older-than <duration>` - removes snapshots older than duration (e.g., "30d", "6m", "1y")
- Removes snapshot metadata and runs pruning to clean up unreferenced blobs
- Shows what would be deleted and requires confirmation
- `vaultik snapshot verify [--deep] <snapshot-id>`
- Basic mode: Verifies all blobs referenced in manifest exist in S3
- `--deep` mode: Downloads each blob and verifies its hash matches the stored hash
- **Stub implementation for now**
- `vaultik snapshot remove <snapshot-id>` (alias: `rm`)
- Removes a snapshot and any blobs that become orphaned
- Algorithm (a Go sketch of steps 3-5 follows this command list):
1. Validate target snapshot exists in storage
2. List all snapshots in storage
3. Download manifests from all OTHER snapshots to build the "in-use" blob set
4. Download target snapshot's manifest to get its blob hashes
5. Identify orphaned blobs: target blobs NOT in the in-use set
6. Delete orphaned blobs from storage
7. Delete snapshot metadata using existing `deleteSnapshot()` helper
- Flags:
- `--force` / `-f`: Skip confirmation prompt
- `--dry-run`: Show what would be deleted without deleting
- Files to modify:
- `internal/cli/snapshot.go`: Add `newSnapshotRemoveCommand()`
- `internal/vaultik/snapshot.go`: Add `RemoveSnapshot()` method
- Reuse existing code:
- Snapshot enumeration pattern from `PruneBlobs()` in `prune.go`
- `v.downloadManifest(snapshotID)` for manifest downloading
- Blob path format: `blobs/{hash[:2]}/{hash[2:4]}/{hash}`
- `v.deleteSnapshot(snapshotID)` for metadata deletion
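
A sketch of the orphan-detection portion of the `snapshot remove` algorithm above. The `Manifest` shape is a minimal stand-in, and `download` stands in for the existing `downloadManifest` helper; only the set-difference logic is the point.

```go
package vaultik

import "fmt"

// Manifest is a minimal stand-in for the real blob manifest type.
type Manifest struct {
	BlobHashes []string
}

// findOrphanedBlobs follows steps 3-5 above: collect every blob
// referenced by OTHER snapshots, then return the target's blobs that
// appear nowhere else.
func findOrphanedBlobs(target string, all []string, download func(id string) (*Manifest, error)) ([]string, error) {
	inUse := make(map[string]struct{})
	for _, id := range all {
		if id == target {
			continue // only other snapshots contribute to the in-use set
		}
		m, err := download(id)
		if err != nil {
			return nil, err
		}
		for _, h := range m.BlobHashes {
			inUse[h] = struct{}{}
		}
	}
	tm, err := download(target)
	if err != nil {
		return nil, err
	}
	var orphaned []string
	for _, h := range tm.BlobHashes {
		if _, ok := inUse[h]; !ok {
			orphaned = append(orphaned, h)
		}
	}
	return orphaned, nil
}

// blobPath mirrors the layout noted above: blobs/{hash[:2]}/{hash[2:4]}/{hash}.
func blobPath(hash string) string {
	return fmt.Sprintf("blobs/%s/%s/%s", hash[:2], hash[2:4], hash)
}
```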
### Implementation Notes
1. **No Decryption Required**: All commands work with unencrypted blob manifests
2. **Blob Manifests**: Located at `metadata/{snapshot-id}/manifest.json.zst`
3. **S3 Operations**: Use S3 ListObjects to enumerate snapshots and blobs
4. **Size Calculations**: Sum blob sizes from S3 object metadata
5. **Timestamp Parsing**: Extract from snapshot ID format (e.g., `2024-01-15-143052-hostname`)
6. **S3 Metadata**: Only used for `snapshot verify` command
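
For note 5, a small sketch of extracting the creation time from a snapshot ID; the layout string is inferred from the example ID and may not match the real format exactly.

```go
package vaultik

import (
	"fmt"
	"strings"
	"time"
)

// snapshotTime parses IDs like "2024-01-15-143052-hostname" into a
// time.Time; the layout is inferred from the example in note 5.
func snapshotTime(id string) (time.Time, error) {
	parts := strings.SplitN(id, "-", 5)
	if len(parts) < 4 {
		return time.Time{}, fmt.Errorf("malformed snapshot id %q", id)
	}
	return time.Parse("2006-01-02-150405", strings.Join(parts[:4], "-"))
}
```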
### Benefits
- Users can see storage usage without decryption keys
- Snapshot management doesn't require access to encrypted metadata
- Clean separation between storage info and snapshot operations
## Chunking and Hashing
1. ~~Implement content-defined chunking~~ (done with FastCDC)
1. ~~Create streaming chunk processor~~ (done in chunker)
1. ~~Implement SHA256 hashing for chunks~~ (done in scanner)
1. ~~Add configurable chunk size parameters~~ (done in scanner)
1. ~~Write tests for chunking consistency~~ (done)
## Compression and Encryption
1. ~~Implement compression~~ (done with zlib in blob packer)
1. ~~Integrate age encryption library~~ (done in crypto package)
1. ~~Create Encryptor type for public key encryption~~ (done)
1. ~~Implement streaming encrypt/decrypt pipelines~~ (done in packer)
1. ~~Write tests for compression and encryption~~ (done)
## Blob Packing
1. ~~Implement BlobWriter with size limits~~ (done in packer)
1. ~~Add chunk accumulation and flushing~~ (done)
1. ~~Create blob hash calculation~~ (done)
1. ~~Implement proper error handling and rollback~~ (done with transactions)
1. ~~Write tests for blob packing scenarios~~ (done)
## S3 Operations
1. ~~Integrate MinIO client library~~ (done in s3 package)
1. ~~Implement S3Client wrapper type~~ (done)
1. ~~Add multipart upload support for large blobs~~ (done - using standard upload)
1. ~~Implement retry logic~~ (handled by MinIO client)
1. ~~Write tests using MinIO container~~ (done with testcontainers)
## Backup Command - Basic
1. ~~Implement directory walking with exclusion patterns~~ (done with afero)
1. Add file change detection using index
1. ~~Integrate chunking pipeline for changed files~~ (done in scanner)
1. Implement blob upload coordination to S3
1. Add progress reporting to stderr
1. Write integration tests for backup
## Snapshot Metadata
1. Implement snapshot metadata extraction from index
1. Create SQLite snapshot database builder
1. Add metadata compression and encryption
1. Implement metadata chunking for large snapshots
1. Add hash calculation and verification
1. Implement metadata upload to S3
1. Write tests for metadata operations
Linear list of tasks to complete before 1.0 release.
## Restore Command
1. Implement snapshot listing and selection
1. Add metadata download and reconstruction
1. Implement hash verification for metadata
1. Create file restoration logic with chunk retrieval
1. Add blob caching for efficiency
1. Implement proper file permissions and mtime restoration
1. Write integration tests for restore
## Prune Command
1. Implement latest snapshot detection
1. Add referenced blob extraction from metadata
1. Create S3 blob listing and comparison
1. Implement safe deletion of unreferenced blobs
1. Add dry-run mode for safety
1. Write tests for prune scenarios
## Verify Command
1. Implement metadata integrity checking
1. Add blob existence verification
1. Implement quick mode (S3 hash checking)
1. Implement deep mode (download and verify chunks)
1. Add detailed error reporting
1. Write tests for verification
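
Deep mode as described above amounts to re-downloading each blob and comparing a fresh SHA-256 against the recorded hash; a hedged sketch using the MinIO client (function name and manifest wiring are assumptions):

```go
package vaultik

import (
	"context"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"

	"github.com/minio/minio-go/v7"
)

// verifyBlobDeep downloads one blob and checks its SHA-256 against the
// expected hash; quick mode would stop at an existence/metadata check instead.
func verifyBlobDeep(ctx context.Context, mc *minio.Client, bucket, key, wantHex string) error {
	obj, err := mc.GetObject(ctx, bucket, key, minio.GetObjectOptions{})
	if err != nil {
		return err
	}
	defer func() { _ = obj.Close() }()

	h := sha256.New()
	if _, err := io.Copy(h, obj); err != nil {
		return err
	}
	if got := hex.EncodeToString(h.Sum(nil)); got != wantHex {
		return fmt.Errorf("blob %s: hash mismatch (got %s, want %s)", key, got, wantHex)
	}
	return nil
}
```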
## Fetch Command
1. Implement single-file metadata query
1. Add minimal blob downloading for file
1. Create streaming file reconstruction
1. Add support for output redirection
1. Write tests for fetch command
1. Write integration tests for restore command
## Daemon Mode
1. Implement inotify watcher for Linux
1. Add dirty path tracking in index
1. Create periodic full scan scheduler
1. Implement backup interval enforcement
1. Add proper signal handling and shutdown
1. Write tests for daemon behavior
## Cron Mode
1. Implement silent operation mode
1. Add proper exit codes for cron
1. Implement lock file to prevent concurrent runs
1. Add error summary reporting
1. Write tests for cron mode
1. Implement inotify file watcher for Linux (see the sketch after the daemon items below)
- Watch source directories for changes
- Track dirty paths in memory
## Finalization
1. Add comprehensive logging throughout
1. Implement proper error wrapping and context
1. Add performance metrics collection
1. Create end-to-end integration tests
1. Write documentation and examples
1. Set up CI/CD pipeline
1. Implement FSEvents watcher for macOS
- Watch source directories for changes
- Track dirty paths in memory
1. Implement backup scheduler in daemon mode
- Respect backup_interval config
- Trigger backup when dirty paths exist and interval elapsed
- Implement full_scan_interval for periodic full scans
1. Add proper signal handling for daemon
- Graceful shutdown on SIGTERM/SIGINT
- Complete in-progress backup before exit
1. Write tests for daemon mode
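
One way the inotify watcher and in-memory dirty-path tracking above could fit together, sketched with github.com/fsnotify/fsnotify; the library choice and function shape are assumptions, not what the daemon will necessarily use.

```go
package daemon

import (
	"context"
	"log"

	"github.com/fsnotify/fsnotify"
)

// collectDirtyPaths watches the given roots and records any path that
// changes until the context is cancelled. Recursing into subdirectories
// is omitted here; inotify watches are per-directory.
func collectDirtyPaths(ctx context.Context, roots []string) (map[string]struct{}, error) {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		return nil, err
	}
	defer func() { _ = w.Close() }()

	for _, r := range roots {
		if err := w.Add(r); err != nil {
			return nil, err
		}
	}

	dirty := make(map[string]struct{})
	for {
		select {
		case <-ctx.Done():
			return dirty, nil
		case ev, ok := <-w.Events:
			if !ok {
				return dirty, nil
			}
			if ev.Op&(fsnotify.Create|fsnotify.Write|fsnotify.Remove|fsnotify.Rename) != 0 {
				dirty[ev.Name] = struct{}{}
			}
		case werr, ok := <-w.Errors:
			if !ok {
				return dirty, nil
			}
			log.Printf("watch error: %v", werr)
		}
	}
}
```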
## CLI Polish
1. Add `--quiet` flag to all commands
- Suppress non-error output
- Useful for scripting
1. Add `--json` output flag to more commands
- `snapshot verify` - output verification results as JSON
- `snapshot remove` - output deletion stats as JSON
- `prune` - output pruning stats as JSON
1. Improve error messages throughout
- Ensure all errors include actionable context
- Add suggestions for common issues
## Testing
1. Write end-to-end integration test
- Create backup
- Verify backup
- Restore backup
- Compare restored files to originals
1. Add tests for edge cases
- Empty directories
- Symlinks
- Special characters in filenames
- Very large files (multi-GB)
- Many small files (100k+)
1. Add tests for error conditions
- Network failures during upload
- Disk full during restore
- Corrupted blobs
- Missing blobs
## Documentation
1. Add man page or --help improvements
- Detailed help for each command
- Examples in help output
## Performance
1. Profile and optimize restore performance
- Parallel blob downloads
- Streaming decompression/decryption
- Efficient chunk reassembly
1. Add bandwidth limiting option
- `--bwlimit` flag for upload/download speed limiting
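
A possible shape for `--bwlimit`: wrap the upload or download stream in a reader throttled by golang.org/x/time/rate. The package choice and exact wiring are assumptions.

```go
package ratelimit

import (
	"context"
	"io"

	"golang.org/x/time/rate"
)

// limitedReader throttles reads to the limiter's rate, capping effective
// bandwidth for whatever stream it wraps.
type limitedReader struct {
	r   io.Reader
	lim *rate.Limiter
}

func (l *limitedReader) Read(p []byte) (int, error) {
	// Never request more than the burst, or WaitN would fail outright.
	if b := l.lim.Burst(); len(p) > b {
		p = p[:b]
	}
	n, err := l.r.Read(p)
	if n > 0 {
		if werr := l.lim.WaitN(context.Background(), n); werr != nil {
			return n, werr
		}
	}
	return n, err
}

// NewLimitedReader caps r at roughly bytesPerSec bytes per second.
func NewLimitedReader(r io.Reader, bytesPerSec int) io.Reader {
	return &limitedReader{r: r, lim: rate.NewLimiter(rate.Limit(bytesPerSec), bytesPerSec)}
}
```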
## Security
1. Audit encryption implementation
- Verify age encryption is used correctly
- Ensure no plaintext leaks in logs or errors
1. Add config file permission check
- Warn if config file is world-readable (contains secrets)
1. Secure memory handling for secrets
- Clear age_secret_key from memory after use
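
A minimal sketch of the world-readable config check above; the exact mode bits to warn on and the message wording are choices, not established behaviour.

```go
package config

import (
	"fmt"
	"os"
)

// warnIfWorldReadable prints a warning when the config file grants read
// access to other users, since it contains secrets such as the age
// secret key.
func warnIfWorldReadable(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	if info.Mode().Perm()&0o004 != 0 {
		fmt.Fprintf(os.Stderr,
			"warning: %s is world-readable (mode %04o) and contains secrets; consider chmod 600\n",
			path, info.Mode().Perm())
	}
	return nil
}
```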
## Final Polish
1. Ensure version is set correctly in releases
1. Create release process
- Binary releases for supported platforms
- Checksums for binaries
- Release notes template
1. Final code review
- Remove debug statements
- Ensure consistent code style
1. Tag and release v1.0.0