- Scan phase now only collects files to process, no DB writes
- Unchanged files get snapshot_files associations via batch (no new records)
- New/changed files get records created during processing after chunking
- Reduces DB writes significantly (only changed files need new records)
- Avoids orphaned file records if a backup is interrupted midway
New format: Progress [5.7k/610k] 6.7 GB/44 GB (15.4%), 106 MB/sec, 500 files/sec, running for 1m30s, ETA: 5m49s
- Compact file counts with k/M suffixes in brackets
- Bytes processed/total with percentage
- Both byte rate and file rate
- Elapsed time shown as "running for X"
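The compact counts in the format above can be sketched as follows. This is a minimal illustration, not the actual vaultik code: the helper names `formatCount` and `compact` are hypothetical, and the rounding rule (one decimal below 100, none above) is an assumption inferred from the sample line.

```go
package main

import (
	"fmt"
	"strings"
)

// compact renders a scaled value with one decimal below 100 and none above,
// dropping a trailing ".0" (assumed rounding rule, inferred from the sample).
func compact(v float64) string {
	if v >= 100 {
		return fmt.Sprintf("%.0f", v)
	}
	return strings.TrimSuffix(fmt.Sprintf("%.1f", v), ".0")
}

// formatCount shortens a file count with k/M suffixes.
func formatCount(n int64) string {
	switch {
	case n >= 1_000_000:
		return compact(float64(n)/1e6) + "M"
	case n >= 1_000:
		return compact(float64(n)/1e3) + "k"
	default:
		return fmt.Sprintf("%d", n)
	}
}

func main() {
	// Prints: Progress [5.7k/610k]
	fmt.Printf("Progress [%s/%s]\n", formatCount(5700), formatCount(610000))
}
```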
Delete uploads table entries before deleting the snapshot itself.
The uploads table has a foreign key to snapshots(id) without ON DELETE
CASCADE, so we must explicitly delete upload records first.
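A rough sketch of the required ordering, assuming the uploads rows are keyed by a snapshot_id column. The `execer` interface, the `deleteSnapshot` function, and the exact SQL text are illustrative assumptions; the `recorder` type only stands in for a real `*sql.Tx` so the example runs without a database.

```go
package main

import (
	"database/sql"
	"fmt"
)

// Assumed statements; the real schema and queries may differ.
const (
	deleteUploadsSQL  = "DELETE FROM uploads WHERE snapshot_id = ?"
	deleteSnapshotSQL = "DELETE FROM snapshots WHERE id = ?"
)

// execer is satisfied by *sql.DB and *sql.Tx.
type execer interface {
	Exec(query string, args ...any) (sql.Result, error)
}

// deleteSnapshot removes child rows first so the foreign key is never violated.
func deleteSnapshot(db execer, snapshotID string) error {
	if _, err := db.Exec(deleteUploadsSQL, snapshotID); err != nil {
		return fmt.Errorf("deleting uploads: %w", err)
	}
	if _, err := db.Exec(deleteSnapshotSQL, snapshotID); err != nil {
		return fmt.Errorf("deleting snapshot: %w", err)
	}
	return nil
}

// recorder is a stand-in that only records the statements it receives.
type recorder struct{ queries []string }

func (r *recorder) Exec(query string, args ...any) (sql.Result, error) {
	r.queries = append(r.queries, query)
	return nil, nil
}

func main() {
	r := &recorder{}
	if err := deleteSnapshot(r, "example-snapshot"); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, q := range r.queries {
		fmt.Println(q)
	}
}
```

In real use both statements would run inside one transaction so a failure between them cannot leave a snapshot without its upload records.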
- Show bytes processed/total instead of just files
- Display data rate in bytes/sec
- Calculate ETA based on bytes (more accurate than files)
- Print message when each blob is stored with size and speed
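One way to compute a byte-based ETA is to extrapolate from the average byte rate so far. The function name `estimateETA` and its signature are hypothetical; the real implementation may smooth the rate differently.

```go
package main

import (
	"fmt"
	"time"
)

// estimateETA extrapolates the remaining time from the average byte rate.
// The second result is false when no rate can be computed yet.
func estimateETA(processed, total int64, elapsed time.Duration) (time.Duration, bool) {
	if processed <= 0 || elapsed <= 0 {
		return 0, false
	}
	if processed >= total {
		return 0, true
	}
	rate := float64(processed) / elapsed.Seconds() // bytes per second
	remaining := float64(total - processed)
	return time.Duration(remaining / rate * float64(time.Second)), true
}

func main() {
	// 100 bytes in 10s is 10 B/s, so 300 remaining bytes take about 30s.
	if eta, ok := estimateETA(100, 400, 10*time.Second); ok {
		fmt.Println("ETA:", eta)
	}
}
```

Bytes track the remaining work more closely than file counts because a single large file can dominate the run time.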
Remove the separate enumerateFiles() function that was doing a full
directory walk using Readdir(), which calls stat() on every file.
Instead, build the existingFiles map during the scan phase walk,
and detect deleted files afterward.
This eliminates one full filesystem traversal, significantly speeding
up the scan phase for large directories.
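The single-walk approach might look like the sketch below. The function names `scan` and `deletedFiles` are hypothetical, and the list of known paths stands in for whatever the database holds; only the idea (collect the set during the walk, diff afterward) comes from the description above.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// scan walks root once and records every regular file it sees.
// WalkDir hands out directory entries, so no extra stat() per file is needed here.
func scan(root string) (map[string]struct{}, error) {
	existing := make(map[string]struct{})
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type().IsRegular() {
			existing[path] = struct{}{}
		}
		return nil
	})
	return existing, err
}

// deletedFiles returns the known paths that the walk did not encounter.
func deletedFiles(known []string, existing map[string]struct{}) []string {
	var deleted []string
	for _, p := range known {
		if _, ok := existing[p]; !ok {
			deleted = append(deleted, p)
		}
	}
	sort.Strings(deleted)
	return deleted
}

func main() {
	root, err := os.MkdirTemp("", "scan-sketch")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(root)

	kept := filepath.Join(root, "kept.txt")
	if err := os.WriteFile(kept, []byte("x"), 0o600); err != nil {
		fmt.Println("error:", err)
		return
	}
	existing, err := scan(root)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	known := []string{kept, filepath.Join(root, "gone.txt")}
	fmt.Println("deleted:", deletedFiles(known, existing))
}
```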
Performance improvements:
- Load all known files from DB into memory at startup
- Check file changes against in-memory map (no per-file DB queries)
- Batch database writes in groups of 1000 files per transaction
- Scan phase now only counts regular files, not directories
This should improve scan speed from ~600 files/sec to potentially
10,000+ files/sec by eliminating per-file database round trips.
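The batching step can be sketched like this, assuming a hypothetical `inBatches` helper whose `commit` callback stands in for one database transaction. The group size of 1000 comes from the list above; everything else is illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// inBatches calls commit once per group of at most size paths.
// In real use, commit would wrap the group in a single transaction.
func inBatches(paths []string, size int, commit func(batch []string) error) error {
	if size <= 0 {
		return errors.New("batch size must be positive")
	}
	for start := 0; start < len(paths); start += size {
		end := start + size
		if end > len(paths) {
			end = len(paths)
		}
		if err := commit(paths[start:end]); err != nil {
			return fmt.Errorf("batch starting at %d: %w", start, err)
		}
	}
	return nil
}

func main() {
	paths := make([]string, 2500)
	err := inBatches(paths, 1000, func(batch []string) error {
		fmt.Println("committing", len(batch), "files")
		return nil
	})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```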
Storage backend:
- Add internal/storage package with Storer interface
- Implement FileStorer for local filesystem storage (file:// URLs)
- Implement S3Storer wrapping existing s3.Client
- Support storage_url config field (s3:// or file://)
- Migrate all consumers to use storage.Storer interface
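Dispatching on the storage_url scheme could look roughly like this. The `target` struct, the `parseStorageURL` function, and the mapping of the URL host to a bucket are assumptions made for illustration; the actual Storer constructors are not shown in this log.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// target is a hypothetical parsed form of a storage_url value.
type target struct {
	Scheme string
	Bucket string // s3:// only (assumed to be the URL host)
	Path   string // key prefix for s3://, directory for file://
}

// parseStorageURL accepts s3:// and file:// URLs and rejects everything else.
func parseStorageURL(raw string) (target, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return target{}, fmt.Errorf("parsing storage URL: %w", err)
	}
	switch u.Scheme {
	case "s3":
		return target{Scheme: "s3", Bucket: u.Host, Path: strings.TrimPrefix(u.Path, "/")}, nil
	case "file":
		return target{Scheme: "file", Path: u.Path}, nil
	default:
		return target{}, fmt.Errorf("unsupported storage scheme %q", u.Scheme)
	}
}

func main() {
	for _, raw := range []string{"s3://my-bucket/backups/host1", "file:///var/backups/vaultik"} {
		t, err := parseStorageURL(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", t)
	}
}
```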
PID locking:
- Add internal/pidlock package to prevent concurrent instances
- Acquire lock before app start, release on exit
- Detect stale locks from crashed processes
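Stale lock detection is commonly done by probing the recorded PID with signal 0. The sketch below is Unix-only and assumes a lock file that contains just the PID; `writeLock`, `processAlive`, and `lockIsStale` are hypothetical names, not the internal/pidlock API. A real lock would also need exclusive creation, which is left out here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
	"syscall"
)

// writeLock records the current PID in path (not atomic; illustration only).
func writeLock(path string) error {
	return os.WriteFile(path, []byte(strconv.Itoa(os.Getpid())), 0o600)
}

// processAlive reports whether a process with this PID exists.
// Signal 0 performs the existence check without delivering a signal;
// EPERM means the process exists but belongs to another user.
func processAlive(pid int) bool {
	if pid <= 0 {
		return false
	}
	err := syscall.Kill(pid, 0)
	return err == nil || errors.Is(err, syscall.EPERM)
}

// lockIsStale reports whether the lock file's owner is no longer running.
// Unparseable content is treated as stale.
func lockIsStale(path string) (bool, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return false, err
	}
	pid, err := strconv.Atoi(strings.TrimSpace(string(data)))
	if err != nil {
		return true, nil
	}
	return !processAlive(pid), nil
}

func main() {
	path := filepath.Join(os.TempDir(), "pidlock-sketch.pid")
	if err := writeLock(path); err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.Remove(path)
	stale, err := lockIsStale(path)
	fmt.Println("stale:", stale, "err:", err)
}
```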
Scan progress improvements:
- Add fast file enumeration pass before stat() phase
- Use enumerated set for deletion detection (no extra filesystem access)
- Show progress with percentage, files/sec, elapsed time, and ETA
- Change "changed" to "changed/new" for clarity
Config improvements:
- Add tilde expansion for paths (~/)
- Use xdg library for platform-specific default index path
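Tilde expansion can be sketched as below. The `expandTilde` helper is hypothetical, and leaving the `~user/` form untouched is an assumption about the intended behavior.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// expandTilde replaces a leading "~" or "~/" with home.
// Other forms, such as "~user/...", are returned unchanged.
func expandTilde(path, home string) string {
	if path == "~" {
		return home
	}
	if strings.HasPrefix(path, "~/") {
		return filepath.Join(home, path[2:])
	}
	return path
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(expandTilde("~/vaultik/index.sqlite", home))
}
```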
Document the data model, type instantiation flow, and module
responsibilities. Covers chunker, packer, vaultik, cli, snapshot,
and database modules with detailed explanations of relationships
between File, Chunk, Blob, and Snapshot entities.
- Create comprehensive integration tests with mock S3 client
- Add in-memory filesystem and SQLite database support for testing
- Test full backup workflow including chunking, packing, and uploading
- Add test to verify encrypted blob content
- Fix scanner to use afero filesystem for temp file cleanup
- Demonstrate successful backup and verification with mock dependencies
- Add afero.Fs field to Vaultik struct for filesystem operations
- Vaultik now owns and manages the filesystem instance
- SnapshotManager receives filesystem via SetFilesystem() setter
- Update blob packer to use afero for temporary files
- Convert all filesystem operations to use afero abstraction
- Remove filesystem module; Vaultik now manages the filesystem directly
- Update tests: remove symlink test (unsupported by afero memfs)
- Fix TestMultipleFileChanges to handle scanner examining directories
This enables full end-to-end testing without touching disk by using
memory-backed filesystems. Database operations continue to use the real
filesystem, as SQLite requires actual files.
- Created new internal/vaultik package with unified Vaultik struct
- Moved all command methods (snapshot, info, prune, verify) from CLI to vaultik package
- Implemented single constructor that handles crypto capabilities automatically
- Added CanDecrypt() method to check if decryption is available
- Updated all CLI commands to use the new vaultik.Vaultik struct
- Removed old fragmented App structs and WithCrypto wrapper
- Fixed context management - Vaultik now owns its context lifecycle
- Cleaned up package imports and dependencies
This creates a cleaner separation between CLI/Cobra code and business logic,
with all vaultik operations now centralized in the internal/vaultik package.
- Remove error suppression for manifest decoding errors
- Manifest read/deserialize errors now fail immediately with clear error messages
- This ensures we catch format mismatches and other issues early
- Created manifest.go with proper Manifest structure including blob sizes
- Updated manifest generation to include compressed size for each blob
- Added TotalCompressedSize field to manifest for quick access
- Renamed backup package to snapshot for clarity
- Updated snapshot list to show all remote snapshots
- For remote snapshots missing from the local DB, fetch the manifest to get the size
- Local snapshots missing from remote storage are automatically deleted
- Removed backwards compatibility code (pre-1.0, no users)
- Fixed prune command to use new manifest format
- Updated all imports and references from backup to snapshot
- Removed unnecessary manifest downloads from snapshot list command
- Removed blob size calculation from listing operation
- Removed COMPRESSED SIZE column from output since we're not calculating it
- This makes snapshot list much faster and avoids 404 errors for old snapshots
- Added syncWithRemote method that lists remote snapshots from S3
- Removes local snapshots that don't exist in remote storage
- Ensures local database stays in sync with actual remote state
- This prevents showing snapshots that have been deleted from S3
- Manifests are now only compressed (not encrypted) so pruning operations can work without private keys
- Updated generateBlobManifest to use zstd compression directly
- Updated prune command to handle unencrypted manifests
- Updated snapshot list command to handle new manifest format
- Updated documentation to reflect manifest.json.zst (not .age)
- Removed unnecessary VAULTIK_PRIVATE_KEY check from prune command
- Remove --bucket and --prefix command line flags
- Use bucket and prefix from S3 configuration in config file
- Update command to follow same pattern as other commands
- Maintain consistency that all configuration comes from config file
- Delete old file_chunks and chunk_files when file content changes
- Add DeleteByFileID method to ChunkFileRepository
- Add tests to verify old chunks are properly disassociated
- Make log messages more precise throughout scanner and snapshot
- Support metadata-only snapshots when no files have changed
- Add periodic status output during scan and snapshot operations
- Improve scan summary output with clearer information
- Add unified compression/encryption package in internal/blobgen
- Update DATAMODEL.md to reflect current schema implementation
- Refactor snapshot cleanup into well-named methods for clarity
- Add snapshot_id to uploads table to track new blobs per snapshot
- Fix blob count reporting for incremental backups
- Add DeleteOrphaned method to BlobChunkRepository
- Fix cleanup order to respect foreign key constraints
- Update tests to reflect schema changes
- Changed blob table to use ID (UUID) as primary key instead of hash
- Blob records are now created at packing start, enabling immediate chunk associations
- Implemented streaming chunking to process large files without memory exhaustion
- Fixed blob manifest generation to include all referenced blobs
- Updated all foreign key references from blob_hash to blob_id
- Added progress reporting and improved error handling
- Enforced encryption requirement for all blob packing
- Updated tests to use test encryption keys
- Added Cyrillic transliteration to README
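The streaming chunking mentioned in this change can be illustrated with a reader that is consumed one buffer at a time, so memory use stays bounded by the chunk size. This sketch uses fixed-size chunks and SHA-256 as stand-ins; the actual chunking algorithm and hash are not stated here, and `chunkStream` and `chunkSizes` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"strings"
)

// chunkStream reads r in pieces of at most size bytes and passes each piece's
// hash and length to emit, reusing a single buffer for the whole stream.
func chunkStream(r io.Reader, size int, emit func(hash string, n int) error) error {
	if size <= 0 {
		return errors.New("chunk size must be positive")
	}
	buf := make([]byte, size)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			sum := sha256.Sum256(buf[:n])
			if e := emit(hex.EncodeToString(sum[:]), n); e != nil {
				return e
			}
		}
		// EOF: nothing left; ErrUnexpectedEOF: the final short chunk was read.
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// chunkSizes is a small demo helper that collects the emitted chunk lengths.
func chunkSizes(data string, size int) ([]int, error) {
	var sizes []int
	err := chunkStream(strings.NewReader(data), size, func(_ string, n int) error {
		sizes = append(sizes, n)
		return nil
	})
	return sizes, err
}

func main() {
	sizes, err := chunkSizes("abcdefghij", 4)
	fmt.Println(sizes, err)
}
```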
- Remove StartTime initialization from globals.New()
- Add setupGlobals function in app.go to set StartTime during fx OnStart
- Simplify globals package to be just a key/value store
- Remove fx dependencies from globals test
- Add gofakes3 for in-process S3-compatible test server
- Create test server that runs on localhost:9999 with temp directory
- Implement basic S3 client wrapper with standard operations
- Add comprehensive tests for blob and metadata storage patterns
- Test cleanup properly removes temporary directories
- Use AWS SDK v2 for S3 operations with proper error handling
- Update bucket structure to include unencrypted blob manifest files
- Add <snapshot_id>.manifest.json.zst containing list of referenced blobs
- This enables pruning operations without requiring decryption keys
- Add snapshot management commands: list, rm, latest (stubs)
- Add --prune flag to backup command for automatic cleanup
- Update DESIGN.md to document manifest format and updated prune flow
- Add pure Go SQLite driver (modernc.org/sqlite) to avoid CGO dependency
- Implement database connection management with WAL mode
- Add write mutex for serializing concurrent writes
- Create schema for all tables matching DESIGN.md specifications
- Implement repository pattern for all database entities: Files, FileChunks, Chunks, Blobs, BlobChunks, ChunkFiles, Snapshots
- Add transaction support with proper rollback handling
- Add fatal error handling for database integrity issues
- Add snapshot fields for tracking file sizes and compression ratios
- Make index path configurable via VAULTIK_INDEX_PATH environment variable
- Add comprehensive test coverage for all repositories
- Add format check to Makefile to ensure code formatting
- Add SQLite database connection management with proper error handling
- Implement schema for files, chunks, blobs, and snapshots tables
- Create repository pattern for each database table
- Add transaction support with proper rollback handling
- Integrate database module with fx dependency injection
- Make index path configurable via VAULTIK_INDEX_PATH env var
- Add fatal error handling for database integrity issues
- Update DESIGN.md to clarify file_chunks vs chunk_files distinction
- Remove FinalHash from BlobInfo (blobs are content-addressable)
- Add file metadata support (mtime, ctime, mode, uid, gid, symlinks)
- Extract implementation TODO from DESIGN.md to TODO.md
- Remove completed Phase 1 tasks
- Add --quick option to verify command for S3 hash checking
- Update documentation to reflect deep verification as default
- Change all commands to use flags (--bucket, --prefix, etc.)
- Add --config flag to backup command
- Support VAULTIK_CONFIG environment variable for config path
- Use /etc/vaultik/config.yml as default config location
- Add test/config.yaml for testing
- Update tests to use environment variable for config path
- Add .gitignore for build artifacts and local configs
- Update documentation to reflect new CLI syntax
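The config lookup described in this change can be sketched as follows. The precedence shown (the --config flag first, then VAULTIK_CONFIG, then the default) is an assumption, and `resolveConfigPath` is a hypothetical helper; passing the environment lookup as a function keeps it easy to test.

```go
package main

import (
	"fmt"
	"os"
)

// defaultConfigPath is the default location named in the change above.
const defaultConfigPath = "/etc/vaultik/config.yml"

// resolveConfigPath picks the config file location.
// Assumed precedence: explicit flag, then VAULTIK_CONFIG, then the default.
func resolveConfigPath(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	if v := getenv("VAULTIK_CONFIG"); v != "" {
		return v
	}
	return defaultConfigPath
}

func main() {
	fmt.Println(resolveConfigPath("", os.Getenv))
}
```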
- Set up cobra CLI with all commands (backup, restore, prune, verify, fetch)
- Integrate uber/fx for dependency injection and lifecycle management
- Add globals package with build-time variables (Version, Commit)
- Implement config loading from YAML with validation
- Create core data models (FileInfo, ChunkInfo, BlobInfo, Snapshot)
- Add Makefile with build, test, lint, and clean targets
- Include minimal test suite for compilation verification
- Update documentation with --quick flag for verify command
- Fix markdown numbering in implementation TODO
- Expand README with full CLI documentation, architecture details, and features
- Add comprehensive 87-step implementation plan to DESIGN.md
- Document all commands, configuration options, and security considerations
- Define complete API signatures and data structures