sneak 470bf648c4 Add deterministic deduplication, rclone backend, and database purge command
- Implement deterministic blob hashing using double SHA256 of uncompressed
  plaintext data, enabling deduplication even after local DB is cleared
- Add Stat() check before blob upload to skip existing blobs in storage
- Add rclone storage backend for additional remote storage options
- Add 'vaultik database purge' command to erase local state DB
- Add 'vaultik remote check' command to verify remote connectivity
- Show configured snapshots in 'vaultik snapshot list' output
- Skip macOS resource fork files (._*) when listing remote snapshots
- Use multi-threaded zstd compression (CPUs - 2 threads)
- Add writer tests for double hashing behavior
2026-01-28 15:50:17 -08:00


Vaultik 1.0 TODO

A linear list of tasks to complete before the 1.0 release.

Rclone Storage Backend (Complete)

Add rclone as a storage backend via Go library import, allowing vaultik to use any of rclone's 70+ supported cloud storage providers.

Configuration:

storage_url: "rclone://myremote/path/to/backups"

The user must have rclone configured separately (via rclone config).

Implementation Steps:

  1. Add rclone dependency to go.mod
  2. Create internal/storage/rclone.go implementing Storer interface
    • NewRcloneStorer(remote, path) - init with configfile.Install() and fs.NewFs()
    • Put / PutWithProgress - use operations.Rcat()
    • Get - use fs.NewObject() then obj.Open()
    • Stat - use fs.NewObject() for size/metadata
    • Delete - use obj.Remove()
    • List / ListStream - use operations.ListFn()
    • Info - return remote name
  3. Update internal/storage/url.go - parse rclone://remote/path URLs
  4. Update internal/storage/module.go - add rclone case to storerFromURL()
  5. Test with real rclone remote

Error Mapping:

  • fs.ErrorObjectNotFound → ErrNotFound
  • fs.ErrorDirNotFound → ErrNotFound
  • fs.ErrorNotFoundInConfigFile → ErrRemoteNotFound (new)

CLI Polish (Priority)

  1. Improve error messages throughout
    • Ensure all errors include actionable context
    • Add suggestions for common issues (e.g., "did you set VAULTIK_AGE_SECRET_KEY?")

Security (Priority)

  1. Audit encryption implementation

    • Verify age encryption is used correctly
    • Ensure no plaintext leaks in logs or errors
    • Verify blob hashes are computed correctly
  2. Secure memory handling for secrets

    • Clear S3 credentials from memory after client init
    • Document that age_secret_key is env-var only (already implemented)

Testing

  1. Write integration tests for restore command

  2. Write end-to-end integration test

    • Create backup
    • Verify backup
    • Restore backup
    • Compare restored files to originals
  3. Add tests for edge cases

    • Empty directories
    • Symlinks
    • Special characters in filenames
    • Very large files (multi-GB)
    • Many small files (100k+)
  4. Add tests for error conditions

    • Network failures during upload
    • Disk full during restore
    • Corrupted blobs
    • Missing blobs

Performance

  1. Profile and optimize restore performance

    • Parallel blob downloads
    • Streaming decompression/decryption
    • Efficient chunk reassembly
  2. Add bandwidth limiting option

    • --bwlimit flag for upload/download speed limiting

Documentation

  1. Add man page or --help improvements
    • Detailed help for each command
    • Examples in help output

Final Polish

  1. Ensure version is set correctly in releases

  2. Create release process

    • Binary releases for supported platforms
    • Checksums for binaries
    • Release notes template
  3. Final code review

    • Remove debug statements
    • Ensure consistent code style
  4. Tag and release v1.0.0


Post-1.0 (Daemon Mode)

  1. Implement inotify file watcher for Linux

    • Watch source directories for changes
    • Track dirty paths in memory
  2. Implement FSEvents watcher for macOS

    • Watch source directories for changes
    • Track dirty paths in memory
  3. Implement backup scheduler in daemon mode

    • Respect backup_interval config
    • Trigger backup when dirty paths exist and interval elapsed
    • Implement full_scan_interval for periodic full scans
  4. Add proper signal handling for daemon

    • Graceful shutdown on SIGTERM/SIGINT
    • Complete in-progress backup before exit
  5. Write tests for daemon mode