vaultik

Author	SHA1	Message	Date
Jeffrey Paul	53ac868c5d	Merge pull request 'fix: track and report file restore failures' (#22) from fix/restore-error-handling into main Reviewed-on: #22	2026-02-20 11:19:40 +01:00
Jeffrey Paul	8c4ea2b870	Merge branch 'main' into fix/restore-error-handling	2026-02-20 11:19:21 +01:00
Jeffrey Paul	597b560398	Merge pull request 'Return errors from deleteSnapshotFromLocalDB instead of swallowing them (closes #25)' (#30) from fix/issue-25 into main Reviewed-on: #30	2026-02-20 11:18:30 +01:00
Jeffrey Paul	1e2eced092	Merge branch 'main' into fix/issue-25	2026-02-20 11:18:06 +01:00
Jeffrey Paul	815b35c7ae	Merge pull request 'Disk-based blob cache with LRU eviction during restore (closes #29)' (#34) from fix/issue-29 into main Reviewed-on: #34	2026-02-20 11:16:15 +01:00
Jeffrey Paul	9c66674683	Merge branch 'main' into fix/issue-29	2026-02-20 11:15:59 +01:00
Jeffrey Paul	49de277648	Merge pull request 'Add CompressStream double-close regression test (closes #35)' (#36) from add-compressstream-regression-test into main Reviewed-on: #36	2026-02-20 11:12:51 +01:00
clawbot	ed5d777d05	fix: set disk cache max size to 4x configured blob size instead of hardcoded 10 GiB The disk blob cache now uses 4 * BlobSizeLimit from config instead of a hardcoded 10 GiB default. This ensures the cache scales with the configured blob size.	2026-02-20 02:11:54 -08:00
clawbot	2e7356dd85	Add CompressStream double-close regression test (closes #35) Adds regression tests for issue #28 (fixed in PR #33) to prevent reintroduction of the double-close bug in CompressStream. Tests cover: - CompressStream with normal input - CompressStream with large (512KB) input - CompressStream with empty input - CompressData close correctness	2026-02-20 02:10:23 -08:00
Jeffrey Paul	70d4fe2aa0	Merge pull request 'Use v.Stdout/v.Stdin instead of os.Stdout for all user-facing output (closes #26)' (#31) from fix/issue-26 into main Reviewed-on: #31	2026-02-20 11:07:52 +01:00
clawbot	2f249e3ddd	fix: address review feedback — use helper wrappers, remove duplicates, fix scanStdin usage - Replace bare fmt.Scanln with v.scanStdin() helper in snapshot.go - Remove duplicate FetchBlob from vaultik.go (canonical version in blob_fetch_stub.go) - Remove duplicate FetchAndDecryptBlob from restore.go (canonical version in blob_fetch_stub.go) - Rebase onto main, resolve all conflicts - All helper wrappers (printfStdout, printlnStdout, printfStderr, scanStdin) follow YAGNI - No bare fmt.Print/fmt.Scan calls remain outside helpers - make test passes: lint clean, all tests pass	2026-02-20 00:26:03 -08:00
clawbot	3f834f1c9c	fix: resolve rebase conflicts, fix errcheck issues, implement FetchAndDecryptBlob	2026-02-20 00:19:13 -08:00
user	9879668c31	refactor: add helper wrappers for stdin/stdout/stderr IO Address all four review concerns on PR #31: 1. Fix missed bare fmt.Println() in VerifySnapshotWithOptions (line 620) 2. Replace all direct fmt.Fprintf(v.Stdout,...) / fmt.Fprintln(v.Stdout,...) / fmt.Fscanln(v.Stdin,...) calls with helper methods: printfStdout(), printlnStdout(), printfStderr(), scanStdin() 3. Route progress bar and stderr output through v.Stderr instead of os.Stderr in restore.go (concern #4: v.Stderr now actually used) 4. Rename exported Outputf to unexported printfStdout (YAGNI: only helpers actually used are created)	2026-02-20 00:18:56 -08:00
clawbot	0a0d9f33b0	fix: use v.Stdout/v.Stdin instead of os.Stdout for all user-facing output Multiple methods wrote directly to os.Stdout instead of using the injectable v.Stdout writer, breaking the TestVaultik testing infrastructure and making output impossible to capture or redirect. Fixed in: ListSnapshots, PurgeSnapshots, VerifySnapshotWithOptions, PruneBlobs, outputPruneBlobsJSON, outputRemoveJSON, ShowInfo, RemoteInfo.	2026-02-20 00:18:20 -08:00
clawbot	df0e8c275b	fix: replace in-memory blob cache with disk-based LRU cache (closes #29) Blobs are typically hundreds of megabytes and should not be held in memory. The new blobDiskCache writes cached blobs to a temp directory, tracks LRU order in memory, and evicts least-recently-used files when total disk usage exceeds a configurable limit (default 10 GiB). Design: - Blobs written to os.TempDir()/vaultik-blobcache-*/<hash> - Doubly-linked list for O(1) LRU promotion/eviction - ReadAt support for reading chunk slices without loading full blob - Temp directory cleaned up on Close() - Oversized entries (> maxBytes) silently skipped Also adds blob_fetch_stub.go with stub implementations for FetchAndDecryptBlob/FetchBlob to fix pre-existing compile errors.	2026-02-20 00:18:20 -08:00
clawbot	ddc23f8057	fix: return errors from deleteSnapshotFromLocalDB instead of swallowing them Previously, deleteSnapshotFromLocalDB logged errors but always returned nil, causing callers to believe deletion succeeded even when it failed. This could lead to data inconsistency where remote metadata is deleted while local records persist. Now returns the first error encountered, allowing callers to handle failures appropriately.	2026-02-19 23:55:27 -08:00
clawbot	cafb3d45b8	fix: track and report file restore failures Restore previously logged errors for individual files but returned success even if files failed. Now tracks failed files in RestoreResult, reports them in the summary output, and returns an error if any files failed to restore. Fixes #21	2026-02-19 23:52:22 -08:00
clawbot	d77ac18aaa	fix: add missing printfStdout, printlnStdout, scanlnStdin, FetchBlob, and FetchAndDecryptBlob methods These methods were referenced in main but never defined, causing compilation failures. They were introduced by merges that assumed dependent PRs were already merged.	2026-02-19 23:51:53 -08:00
Jeffrey Paul	825f25da58	Merge pull request 'Validate table name against allowlist in getTableCount (closes #27)' (#32) from fix/issue-27 into main Reviewed-on: #32	2026-02-16 06:21:41 +01:00
Jeffrey Paul	162d76bb38	Merge branch 'main' into fix/issue-27	2026-02-16 06:17:51 +01:00
clawbot	bfd7334221	fix: replace table name allowlist with regex sanitization Replace the hardcoded validTableNames allowlist with a regexp that only allows [a-z0-9_] characters. This prevents SQL injection without requiring maintenance of a separate allowlist when new tables are added. Addresses review feedback from @sneak on PR #32.	2026-02-15 21:17:24 -08:00
user	9b32bf0846	fix: replace table name allowlist with regex sanitization Replace the hardcoded validTableNames allowlist with a regexp that only allows [a-z0-9_] characters. This prevents SQL injection without requiring maintenance of a separate allowlist when new tables are added. Addresses review feedback from @sneak on PR #32.	2026-02-15 21:15:49 -08:00
Jeffrey Paul	8adc668fa6	Merge pull request 'Prevent double-close of blobgen.Writer in CompressStream (closes #28)' (#33) from fix/issue-28 into main Reviewed-on: #33	2026-02-16 06:04:33 +01:00
clawbot	441c441eca	fix: prevent double-close of blobgen.Writer in CompressStream CompressStream had both a defer w.Close() and an explicit w.Close() call, causing the compressor and encryptor to be closed twice. The second close on the zstd encoder returns an error, and the age encryptor may write duplicate finalization bytes, potentially corrupting the output stream. Use a closed flag to prevent the deferred close from running after the explicit close succeeds.	2026-02-08 12:03:36 -08:00
clawbot	4d9f912a5f	fix: validate table name against allowlist in getTableCount to prevent SQL injection The getTableCount method used fmt.Sprintf to interpolate a table name directly into a SQL query. While currently only called with hardcoded names, this is a dangerous pattern. Added an allowlist of valid table names and return an error for unrecognized names.	2026-02-08 12:03:18 -08:00
clawbot	46c2ea3079	fix: remove dead deep-verify TODO stub, route to RunDeepVerify The VerifySnapshotWithOptions method had a dead code path for opts.Deep that printed 'not yet implemented' and returned nil. The CLI already routes --deep to RunDeepVerify (which is fully implemented). Remove the dead branch and update the VerifySnapshot convenience method to also route deep=true to RunDeepVerify. Fixes #2	2026-02-08 08:33:18 -08:00
sneak	470bf648c4	Add deterministic deduplication, rclone backend, and database purge command - Implement deterministic blob hashing using double SHA256 of uncompressed plaintext data, enabling deduplication even after local DB is cleared - Add Stat() check before blob upload to skip existing blobs in storage - Add rclone storage backend for additional remote storage options - Add 'vaultik database purge' command to erase local state DB - Add 'vaultik remote check' command to verify remote connectivity - Show configured snapshots in 'vaultik snapshot list' output - Skip macOS resource fork files (._*) when listing remote snapshots - Use multi-threaded zstd compression (CPUs - 2 threads) - Add writer tests for double hashing behavior	2026-01-28 15:50:17 -08:00
sneak	bdaaadf990	Add --quiet flag, --json output, and config permission check - Add global --quiet/-q flag to suppress non-error output - Add --json flag to verify, snapshot rm, and prune commands - Add config file permission check (warns if world/group readable) - Update TODO.md to remove completed items	2026-01-16 09:20:29 -08:00
sneak	417b25a5f5	Add custom types, version command, and restore --verify flag - Add internal/types package with type-safe wrappers for IDs, hashes, paths, and credentials (FileID, BlobID, ChunkHash, etc.) - Implement driver.Valuer and sql.Scanner for UUID-based types - Add `vaultik version` command showing version, commit, go version - Add `--verify` flag to restore command that checksums all restored files against expected chunk hashes with progress bar - Remove fetch.go (dead code, functionality in restore) - Clean up TODO.md, remove completed items - Update all database and snapshot code to use new custom types	2026-01-14 17:11:52 -08:00
sneak	2afd54d693	Add exclude patterns, snapshot prune, and other improvements - Implement exclude patterns with anchored pattern support: - Patterns starting with / only match from root of source dir - Unanchored patterns match anywhere in path - Support for glob patterns (.log, ., */.pack) - Directory patterns skip entire subtrees - Add gobwas/glob dependency for pattern matching - Add 16 comprehensive tests for exclude functionality - Add snapshot prune command to clean orphaned data: - Removes incomplete snapshots from database - Cleans orphaned files, chunks, and blobs - Runs automatically at backup start for consistency - Add snapshot remove command for deleting snapshots - Add VAULTIK_AGE_SECRET_KEY environment variable support - Fix duplicate fx module provider in restore command - Change snapshot ID format to hostname_YYYY-MM-DDTHH:MM:SSZ	2026-01-01 05:42:56 -08:00
sneak	05286bed01	Batch transactions per blob for improved performance Previously, each chunk and blob_chunk was inserted in a separate transaction, leading to ~560k+ transactions for large backups. This change batches all database operations per blob: - Chunks are queued in packer.pendingChunks during file processing - When blob finalizes, one transaction inserts all chunks, blob_chunks, and updates the blob record - Scanner tracks pending chunk hashes to know which files can be flushed - Files are flushed when all their chunks are committed to DB - Database is consistent after each blob finalize This reduces transaction count from O(chunks) to O(blobs), which for a 614k file / 44GB backup means ~50-100 transactions instead of ~560k.	2025-12-23 19:07:26 +07:00
sneak	f2c120f026	Merge feature/pluggable-storage-backend - Add pluggable storage backend with file:// URL support - Fix FK constraint errors in batched file insertion - Cache chunk hashes in memory for faster lookups - Remove dangerous database recovery that corrupted DBs after Ctrl+C - Add PROCESS.md documenting snapshot creation lifecycle	2025-12-23 18:50:21 +07:00
sneak	bbe09ec5b5	Remove dangerous database recovery that deleted journal/WAL files SQLite handles crash recovery automatically when opening a database. The previous recoverDatabase() function was deleting journal and WAL files BEFORE opening the database, which prevented SQLite from recovering incomplete transactions and caused database corruption after Ctrl+C or crashes. This was causing "database disk image is malformed" errors after interrupting a backup operation.	2025-12-23 09:16:01 +07:00
sneak	43a69c2cfb	Fix FK constraint errors in batched file insertion Generate file UUIDs upfront in checkFileInMemory() rather than deferring to Files.Create(). This ensures file_chunks and chunk_files records have valid FileID values when constructed during file processing, before the batch insert transaction. Root cause: For new files, file.ID was empty when building the fileChunks and chunkFiles slices. The ID was only generated later in Files.Create(), but by then the slices already had empty FileID values, causing FK constraint failures. Also adds PROCESS.md documenting the snapshot creation lifecycle, database transactions, and FK dependency ordering.	2025-12-19 19:48:48 +07:00
sneak	899448e1da	Cache chunk hashes in memory for faster small file processing Load all known chunk hashes into an in-memory map at scan start, eliminating per-chunk database queries during file processing. This significantly improves performance when backing up many small files.	2025-12-19 12:56:04 +07:00
sneak	24c5e8c5a6	Refactor: Create file records only after successful chunking - Scan phase now only collects files to process, no DB writes - Unchanged files get snapshot_files associations via batch (no new records) - New/changed files get records created during processing after chunking - Reduces DB writes significantly (only changed files need new records) - Avoids orphaned file records if backup is interrupted mid-way	2025-12-19 12:40:45 +07:00
sneak	40fff09594	Update progress output format with compact file counts New format: Progress [5.7k/610k] 6.7 GB/44 GB (15.4%), 106 MB/sec, 500 files/sec, running for 1m30s, ETA: 5m49s - Compact file counts with k/M suffixes in brackets - Bytes processed/total with percentage - Both byte rate and file rate - Elapsed time shown as "running for X"	2025-12-19 12:33:38 +07:00
sneak	8a8651c690	Fix foreign key error when deleting incomplete snapshots Delete uploads table entries before deleting the snapshot itself. The uploads table has a foreign key to snapshots(id) without CASCADE, so we must explicitly delete upload records first.	2025-12-19 12:27:05 +07:00
sneak	a1d559c30d	Improve processing progress output with bytes and blob messages - Show bytes processed/total instead of just files - Display data rate in bytes/sec - Calculate ETA based on bytes (more accurate than files) - Print message when each blob is stored with size and speed	2025-12-19 12:24:55 +07:00
sneak	88e2508dc7	Eliminate redundant filesystem traversal in scan phase Remove the separate enumerateFiles() function that was doing a full directory walk using Readdir() which calls stat() on every file. Instead, build the existingFiles map during the scan phase walk, and detect deleted files afterward. This eliminates one full filesystem traversal, significantly speeding up the scan phase for large directories.	2025-12-19 12:15:13 +07:00
sneak	c3725e745e	Optimize scan phase: in-memory change detection and batched DB writes Performance improvements: - Load all known files from DB into memory at startup - Check file changes against in-memory map (no per-file DB queries) - Batch database writes in groups of 1000 files per transaction - Scan phase now only counts regular files, not directories This should improve scan speed from ~600 files/sec to potentially 10,000+ files/sec by eliminating per-file database round trips.	2025-12-19 12:08:47 +07:00
sneak	badc0c07e0	Add pluggable storage backend, PID locking, and improved scan progress Storage backend: - Add internal/storage package with Storer interface - Implement FileStorer for local filesystem storage (file:// URLs) - Implement S3Storer wrapping existing s3.Client - Support storage_url config field (s3:// or file://) - Migrate all consumers to use storage.Storer interface PID locking: - Add internal/pidlock package to prevent concurrent instances - Acquire lock before app start, release on exit - Detect stale locks from crashed processes Scan progress improvements: - Add fast file enumeration pass before stat() phase - Use enumerated set for deletion detection (no extra filesystem access) - Show progress with percentage, files/sec, elapsed time, and ETA - Change "changed" to "changed/new" for clarity Config improvements: - Add tilde expansion for paths (~/) - Use xdg library for platform-specific default index path	2025-12-19 11:52:51 +07:00
sneak	cda0cf865a	Add ARCHITECTURE.md documenting internal design Document the data model, type instantiation flow, and module responsibilities. Covers chunker, packer, vaultik, cli, snapshot, and database modules with detailed explanations of relationships between File, Chunk, Blob, and Snapshot entities.	2025-12-18 19:49:42 -08:00
sneak	0736bd070b	Add godoc documentation to exported types and methods Add proper godoc comments to exported items in: - internal/globals: Appname, Version, Commit variables; Globals type; New function - internal/log: LogLevel type; level constants; Config type; Initialize, Fatal, Error, Warn, Notice, Info, Debug functions and variants; TTYHandler type and methods; Module variable; LogOptions type	2025-12-18 18:51:52 -08:00
sneak	d7cd9aac27	Add end-to-end integration tests for Vaultik - Create comprehensive integration tests with mock S3 client - Add in-memory filesystem and SQLite database support for testing - Test full backup workflow including chunking, packing, and uploading - Add test to verify encrypted blob content - Fix scanner to use afero filesystem for temp file cleanup - Demonstrate successful backup and verification with mock dependencies	2025-07-26 15:52:23 +02:00
sneak	bb38f8c5d6	Integrate afero filesystem abstraction library - Add afero.Fs field to Vaultik struct for filesystem operations - Vaultik now owns and manages the filesystem instance - SnapshotManager receives filesystem via SetFilesystem() setter - Update blob packer to use afero for temporary files - Convert all filesystem operations to use afero abstraction - Remove filesystem module - Vaultik manages filesystem directly - Update tests: remove symlink test (unsupported by afero memfs) - Fix TestMultipleFileChanges to handle scanner examining directories This enables full end-to-end testing without touching disk by using memory-backed filesystems. Database operations continue using real filesystem as SQLite requires actual files.	2025-07-26 15:33:18 +02:00
sneak	e29a995120	Refactor: Move Vaultik struct and methods to internal/vaultik package - Created new internal/vaultik package with unified Vaultik struct - Moved all command methods (snapshot, info, prune, verify) from CLI to vaultik package - Implemented single constructor that handles crypto capabilities automatically - Added CanDecrypt() method to check if decryption is available - Updated all CLI commands to use the new vaultik.Vaultik struct - Removed old fragmented App structs and WithCrypto wrapper - Fixed context management - Vaultik now owns its context lifecycle - Cleaned up package imports and dependencies This creates a cleaner separation between CLI/Cobra code and business logic, with all vaultik operations now centralized in the internal/vaultik package.	2025-07-26 14:47:26 +02:00
sneak	5c70405a85	Fix snapshot list to fail on manifest errors - Remove error suppression for manifest decoding errors - Manifest read/deserialize errors now fail immediately with clear error messages - This ensures we catch format mismatches and other issues early	2025-07-26 03:31:09 +02:00
sneak	a544fa80f2	Major refactoring: Updated manifest format and renamed backup to snapshot - Created manifest.go with proper Manifest structure including blob sizes - Updated manifest generation to include compressed size for each blob - Added TotalCompressedSize field to manifest for quick access - Renamed backup package to snapshot for clarity - Updated snapshot list to show all remote snapshots - Remote snapshots not in local DB fetch manifest to get size - Local snapshots not in remote are automatically deleted - Removed backwards compatibility code (pre-1.0, no users) - Fixed prune command to use new manifest format - Updated all imports and references from backup to snapshot	2025-07-26 03:27:47 +02:00
sneak	c07d8eec0a	Fix snapshot list to not download manifests - Removed unnecessary manifest downloads from snapshot list command - Removed blob size calculation from listing operation - Removed COMPRESSED SIZE column from output since we're not calculating it - This makes snapshot list much faster and avoids 404 errors for old snapshots	2025-07-26 03:16:18 +02:00


67 Commits
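
The table-name check from commits bfd7334221 and 9b32bf0846 (a regexp that only allows `[a-z0-9_]` before the name is interpolated into SQL) might look something like the sketch below. The names `tableNameRe` and `countQuery` are hypothetical, and the query text is an assumption; the commits only state that `getTableCount` built its query with `fmt.Sprintf`.

```go
package main

import (
	"fmt"
	"regexp"
)

// tableNameRe accepts only lowercase letters, digits, and underscores.
// Anchoring with ^ and $ ensures the whole name is checked, not a substring.
var tableNameRe = regexp.MustCompile(`^[a-z0-9_]+$`)

// countQuery returns a COUNT(*) query for table, or an error if the
// name contains any character outside the allowed set.
func countQuery(table string) (string, error) {
	if !tableNameRe.MatchString(table) {
		return "", fmt.Errorf("invalid table name: %q", table)
	}
	// Safe to interpolate: the name has been restricted to [a-z0-9_].
	return fmt.Sprintf("SELECT COUNT(*) FROM %s", table), nil
}

func main() {
	for _, name := range []string{"blob_chunks", "files; DROP TABLE files"} {
		q, err := countQuery(name)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("query:", q)
	}
}
```

Table names cannot be bound as SQL parameters, which is why the name is validated before interpolation rather than passed as a placeholder argument.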