vaultik

Author	SHA1	Message	Date
sneak	2185421c01	Reformat progress lines and prune output Progress lines now use the form: ..., <subject> elapsed: <dur>, <subject> ETA: <time> (est remain <dur>). ui.Time formats same-day times as HH:MM:SS and other-day times as YYYY-MM-DD HH:MM:SS, with no timezone suffix (local time is implied). The local-index-database prune complete line now shows remaining counts for each category: ... 1 incomplete snapshots removed (3 remain), 3783 orphaned files removed (42 remain), ...	2026-06-17 05:44:48 +02:00
sneak	00d4b36e35	Introduce internal/ui package and rewrite user-facing output All user-facing output now goes through a single ui.Writer with a uniform style: 》 (white) for begin / info / notice 》 (green) for complete / success Warning: for warnings (orange) ERROR: for errors (red) 》 (indented) for progress heartbeats Color is enabled when stdout is a TTY and NO_COLOR is unset. Standards: - Complete-sentence messages with fully qualified terms ("backup destination store", "local index database", "snapshot source files enumeration"). - Every Complete has a matching Begin. - Natural verb tense conveys state ("Uploading" -> "Uploaded"). The words "begin"/"complete" never appear in message bodies; the marker color carries that information. - ETA means clock time, not duration. Progress lines say "estimated remaining time (<dur>), finish at <time>" with both labeled. Adds globals.CommitDate (populated by Makefile/Dockerfile/goreleaser via ldflags from `git show -s --format=%cI HEAD`) and a startup banner printed once per invocation. Strips fx call-chain noise from startup errors so users see the actual underlying error (e.g. "creating base path: mkdir /Volumes/BACKUPS: permission denied" instead of three layers of "could not build arguments for function ..."). README documents the output style and the ui package conventions.	2026-06-17 04:32:05 +02:00
sneak	6bb6f7c8a8	Make blob upload progress heartbeat unambiguous (vs snapshot progress)	2026-06-17 02:29:25 +02:00
sneak	b0747657e3	Print upload start line and 15s heartbeat during blob upload Long-running uploads (multi-GB blobs over slow links) previously produced silence between the start of the upload and the "Blob stored" line at the end. Now we print: Uploading blob: <hash> (<size>) before the upload starts, and a heartbeat line at most every 15s: uploading <hash>: <done>/<total> (NN%), <speed>/sec, <elapsed> elapsed, ETA <eta> This gives the user visible progress on large uploads, especially over SMB or remote storage where 10+ second stalls are normal.	2026-06-17 02:27:23 +02:00
sneak	485f3296d9	Fix config-not-found errors, dev-build hint, unify output writer ResolveConfigPath now stats explicit paths from --config and $VAULTIK_CONFIG and produces an actionable error naming the bad path and suggesting 'vaultik config init' (with the right path in the --config case). The default-search failure message lists the paths it tried. The scanner no longer hard-codes os.Stdout vs io.Discard based on EnableProgress. ScannerConfig and ScannerParams take an explicit Output io.Writer, and the Vaultik caller passes v.Stdout — which itself is set to io.Discard in --cron mode. One knob controls both scanner-level and Vaultik-level user-facing output. The version command prints a hint when Version == "dev" telling the user this is a development build without embedded version metadata.	2026-06-17 01:41:09 +02:00
sneak	8959741c90	Add actionable permission-error message with macOS Full Disk Access hint When the scanner hits a permission-denied error (TCC-protected directories on macOS without Full Disk Access, or any other EPERM), the error now names the offending path and includes platform-specific remediation instructions. On macOS it points the user at System Settings -> Privacy & Security -> Full Disk Access. On other platforms it suggests --skip-errors. The error wraps os.ErrPermission so errors.Is still works for callers that care about the underlying error. README quickstart and snapshot create docs now mention the macOS FDA requirement.	2026-06-16 05:20:33 -07:00
sneak	d479bfcd52	Adopt sneak.berlin/go/vaultik vanity import path, README overhaul Module path changed from git.eeqj.de/sneak/vaultik to sneak.berlin/go/vaultik (vanity redirect). All imports, ldflags, Dockerfile, goreleaser config, and docs updated. App data/config directories now use plain "vaultik" instead of the reverse-DNS name. README: - New copy-pasteable quickstart at top: go install, config init, age keypair, config set for key + file:// destination, home backup - All command names in command details are code-quoted - config set/get gained sequence index support (age_recipients.0) so lists are settable from the CLI - Dockerfile build is CGO_ENABLED=0 to match the pure-Go build	2026-06-10 11:37:23 -07:00
sneak	ac5d2f4a0d	Back up symlinks, empty directories, and file permissions Scanner now records symlinks (with their target) and directories during the walk phase instead of skipping them. processFileStreaming detects non-regular entries and writes the DB record without chunking. The e2e test (TestEndToEndFileStorage) now verifies: - Symlink target preserved through backup→restore - Empty directory survives round-trip - File permissions (0600) restored correctly	2026-06-09 12:47:18 -04:00
sneak	ebd6619638	Route scanner output through writer, fix S3 error handling, improve error messages Scanner now writes all user-facing output to an io.Writer (os.Stdout when progress is enabled, io.Discard in --cron mode). This fixes the long-standing issue where --cron still printed progress lines. S3 HeadObject now properly distinguishes not-found from other errors instead of swallowing all errors as not-found. Config/CLI error messages include actionable hints (where to find the config, how to generate keys, what storage options exist).	2026-06-09 12:31:50 -04:00
clawbot	1c72a37bc8	Remove all ctime usage and storage (#55) Remove all ctime from the codebase per sneak's decision on [PR #48](#48). ## Rationale - ctime means different things on macOS (birth time) vs Linux (inode change time) — ambiguous cross-platform - Vaultik never uses ctime operationally (scanning triggers on mtime change) - Cannot be restored on either platform - Write-only forensic data with no consumer ## Changes - Schema (`internal/database/schema.sql`): Removed `ctime` column from `files` table - Model (`internal/database/models.go`): Removed `CTime` field from `File` struct - Database layer (`internal/database/files.go`): Removed ctime from all INSERT/SELECT queries, ON CONFLICT updates, and scan targets in both `scanFile` and `scanFileRows` helpers; updated `CreateBatch` accordingly - Scanner (`internal/snapshot/scanner.go`): Removed `CTime: info.ModTime()` assignment in `checkFileInMemory()` - Tests: Removed all `CTime` field assignments from 8 test files - Documentation: Removed ctime references from `ARCHITECTURE.md` and `docs/DATAMODEL.md` `docker build .` passes clean (lint, fmt-check, all tests). closes #54 Co-authored-by: user <user@Mac.lan guest wan> Reviewed-on: #55 Co-authored-by: clawbot <clawbot@noreply.example.org> Co-committed-by: clawbot <clawbot@noreply.example.org>	2026-03-20 03:12:46 +01:00
clawbot	ac2f21a89d	Refactor: break up oversized methods into smaller descriptive helpers (#41) Closes #40 Per sneak's feedback on PR #37: methods were too long. This PR breaks all methods over 100-150 lines into smaller, descriptively named helper methods. ## Refactored methods (8 total) - `createNamedSnapshot` (214 lines): `resolveSnapshotPaths`, `scanAllDirectories`, `collectUploadStats`, `finalizeSnapshotMetadata`, `printSnapshotSummary`, `getSnapshotBlobSizes`, `formatUploadSpeed` - `ListSnapshots` (159 lines): `listRemoteSnapshotIDs`, `reconcileLocalWithRemote`, `buildSnapshotInfoList`, `printSnapshotTable` - `PruneBlobs` (170 lines): `collectReferencedBlobs`, `listUniqueSnapshotIDs`, `listAllRemoteBlobs`, `findUnreferencedBlobs`, `deleteUnreferencedBlobs` - `RunDeepVerify` (182 lines): `loadVerificationData`, `runVerificationSteps`, `deepVerifyFailure` - `RemoteInfo` (187 lines): `collectSnapshotMetadata`, `collectReferencedBlobsFromManifests`, `populateRemoteInfoResult`, `scanRemoteBlobStorage`, `printRemoteInfoTable` - `handleBlobReady` (173 lines): `uploadBlobIfNeeded`, `makeUploadProgressCallback`, `recordBlobMetadata`, `cleanupBlobTempFile` - `processFileStreaming` (146 lines): `updateChunkStats`, `addChunkToPacker`, `queueFileForBatchInsert` - `finalizeCurrentBlob` (167 lines): `closeBlobWriter`, `buildChunkRefs`, `commitBlobToDatabase`, `deliverFinishedBlob` ## Verification - `go build ./...` ✅ - `make test` ✅ (all tests pass) - `golangci-lint run` ✅ (0 issues) - No behavioral changes, pure restructuring Co-authored-by: user <user@Mac.lan guest wan> Reviewed-on: #41 Co-authored-by: clawbot <clawbot@noreply.example.org> Co-committed-by: clawbot <clawbot@noreply.example.org>	2026-03-19 00:23:45 +01:00
sneak	470bf648c4	Add deterministic deduplication, rclone backend, and database purge command - Implement deterministic blob hashing using double SHA256 of uncompressed plaintext data, enabling deduplication even after local DB is cleared - Add Stat() check before blob upload to skip existing blobs in storage - Add rclone storage backend for additional remote storage options - Add 'vaultik database purge' command to erase local state DB - Add 'vaultik remote check' command to verify remote connectivity - Show configured snapshots in 'vaultik snapshot list' output - Skip macOS resource fork files (._*) when listing remote snapshots - Use multi-threaded zstd compression (CPUs - 2 threads) - Add writer tests for double hashing behavior	2026-01-28 15:50:17 -08:00
sneak	417b25a5f5	Add custom types, version command, and restore --verify flag - Add internal/types package with type-safe wrappers for IDs, hashes, paths, and credentials (FileID, BlobID, ChunkHash, etc.) - Implement driver.Valuer and sql.Scanner for UUID-based types - Add `vaultik version` command showing version, commit, go version - Add `--verify` flag to restore command that checksums all restored files against expected chunk hashes with progress bar - Remove fetch.go (dead code, functionality in restore) - Clean up TODO.md, remove completed items - Update all database and snapshot code to use new custom types	2026-01-14 17:11:52 -08:00
sneak	2afd54d693	Add exclude patterns, snapshot prune, and other improvements - Implement exclude patterns with anchored pattern support: - Patterns starting with / only match from root of source dir - Unanchored patterns match anywhere in path - Support for glob patterns (*.log, .*, **/*.pack) - Directory patterns skip entire subtrees - Add gobwas/glob dependency for pattern matching - Add 16 comprehensive tests for exclude functionality - Add snapshot prune command to clean orphaned data: - Removes incomplete snapshots from database - Cleans orphaned files, chunks, and blobs - Runs automatically at backup start for consistency - Add snapshot remove command for deleting snapshots - Add VAULTIK_AGE_SECRET_KEY environment variable support - Fix duplicate fx module provider in restore command - Change snapshot ID format to hostname_YYYY-MM-DDTHH:MM:SSZ	2026-01-01 05:42:56 -08:00
sneak	05286bed01	Batch transactions per blob for improved performance Previously, each chunk and blob_chunk was inserted in a separate transaction, leading to ~560k+ transactions for large backups. This change batches all database operations per blob: - Chunks are queued in packer.pendingChunks during file processing - When blob finalizes, one transaction inserts all chunks, blob_chunks, and updates the blob record - Scanner tracks pending chunk hashes to know which files can be flushed - Files are flushed when all their chunks are committed to DB - Database is consistent after each blob finalize This reduces transaction count from O(chunks) to O(blobs), which for a 614k file / 44GB backup means ~50-100 transactions instead of ~560k.	2025-12-23 19:07:26 +07:00
sneak	43a69c2cfb	Fix FK constraint errors in batched file insertion Generate file UUIDs upfront in checkFileInMemory() rather than deferring to Files.Create(). This ensures file_chunks and chunk_files records have valid FileID values when constructed during file processing, before the batch insert transaction. Root cause: For new files, file.ID was empty when building the fileChunks and chunkFiles slices. The ID was only generated later in Files.Create(), but by then the slices already had empty FileID values, causing FK constraint failures. Also adds PROCESS.md documenting the snapshot creation lifecycle, database transactions, and FK dependency ordering.	2025-12-19 19:48:48 +07:00
sneak	899448e1da	Cache chunk hashes in memory for faster small file processing Load all known chunk hashes into an in-memory map at scan start, eliminating per-chunk database queries during file processing. This significantly improves performance when backing up many small files.	2025-12-19 12:56:04 +07:00
sneak	24c5e8c5a6	Refactor: Create file records only after successful chunking - Scan phase now only collects files to process, no DB writes - Unchanged files get snapshot_files associations via batch (no new records) - New/changed files get records created during processing after chunking - Reduces DB writes significantly (only changed files need new records) - Avoids orphaned file records if backup is interrupted mid-way	2025-12-19 12:40:45 +07:00
sneak	40fff09594	Update progress output format with compact file counts New format: Progress [5.7k/610k] 6.7 GB/44 GB (15.4%), 106 MB/sec, 500 files/sec, running for 1m30s, ETA: 5m49s - Compact file counts with k/M suffixes in brackets - Bytes processed/total with percentage - Both byte rate and file rate - Elapsed time shown as "running for X"	2025-12-19 12:33:38 +07:00
sneak	8a8651c690	Fix foreign key error when deleting incomplete snapshots Delete uploads table entries before deleting the snapshot itself. The uploads table has a foreign key to snapshots(id) without CASCADE, so we must explicitly delete upload records first.	2025-12-19 12:27:05 +07:00
sneak	a1d559c30d	Improve processing progress output with bytes and blob messages - Show bytes processed/total instead of just files - Display data rate in bytes/sec - Calculate ETA based on bytes (more accurate than files) - Print message when each blob is stored with size and speed	2025-12-19 12:24:55 +07:00
sneak	88e2508dc7	Eliminate redundant filesystem traversal in scan phase Remove the separate enumerateFiles() function that was doing a full directory walk using Readdir() which calls stat() on every file. Instead, build the existingFiles map during the scan phase walk, and detect deleted files afterward. This eliminates one full filesystem traversal, significantly speeding up the scan phase for large directories.	2025-12-19 12:15:13 +07:00
sneak	c3725e745e	Optimize scan phase: in-memory change detection and batched DB writes Performance improvements: - Load all known files from DB into memory at startup - Check file changes against in-memory map (no per-file DB queries) - Batch database writes in groups of 1000 files per transaction - Scan phase now only counts regular files, not directories This should improve scan speed from ~600 files/sec to potentially 10,000+ files/sec by eliminating per-file database round trips.	2025-12-19 12:08:47 +07:00
sneak	badc0c07e0	Add pluggable storage backend, PID locking, and improved scan progress Storage backend: - Add internal/storage package with Storer interface - Implement FileStorer for local filesystem storage (file:// URLs) - Implement S3Storer wrapping existing s3.Client - Support storage_url config field (s3:// or file://) - Migrate all consumers to use storage.Storer interface PID locking: - Add internal/pidlock package to prevent concurrent instances - Acquire lock before app start, release on exit - Detect stale locks from crashed processes Scan progress improvements: - Add fast file enumeration pass before stat() phase - Use enumerated set for deletion detection (no extra filesystem access) - Show progress with percentage, files/sec, elapsed time, and ETA - Change "changed" to "changed/new" for clarity Config improvements: - Add tilde expansion for paths (~/) - Use xdg library for platform-specific default index path	2025-12-19 11:52:51 +07:00
sneak	d7cd9aac27	Add end-to-end integration tests for Vaultik - Create comprehensive integration tests with mock S3 client - Add in-memory filesystem and SQLite database support for testing - Test full backup workflow including chunking, packing, and uploading - Add test to verify encrypted blob content - Fix scanner to use afero filesystem for temp file cleanup - Demonstrate successful backup and verification with mock dependencies	2025-07-26 15:52:23 +02:00
sneak	bb38f8c5d6	Integrate afero filesystem abstraction library - Add afero.Fs field to Vaultik struct for filesystem operations - Vaultik now owns and manages the filesystem instance - SnapshotManager receives filesystem via SetFilesystem() setter - Update blob packer to use afero for temporary files - Convert all filesystem operations to use afero abstraction - Remove filesystem module - Vaultik manages filesystem directly - Update tests: remove symlink test (unsupported by afero memfs) - Fix TestMultipleFileChanges to handle scanner examining directories This enables full end-to-end testing without touching disk by using memory-backed filesystems. Database operations continue using real filesystem as SQLite requires actual files.	2025-07-26 15:33:18 +02:00
sneak	e29a995120	Refactor: Move Vaultik struct and methods to internal/vaultik package - Created new internal/vaultik package with unified Vaultik struct - Moved all command methods (snapshot, info, prune, verify) from CLI to vaultik package - Implemented single constructor that handles crypto capabilities automatically - Added CanDecrypt() method to check if decryption is available - Updated all CLI commands to use the new vaultik.Vaultik struct - Removed old fragmented App structs and WithCrypto wrapper - Fixed context management - Vaultik now owns its context lifecycle - Cleaned up package imports and dependencies This creates a cleaner separation between CLI/Cobra code and business logic, with all vaultik operations now centralized in the internal/vaultik package.	2025-07-26 14:47:26 +02:00
sneak	a544fa80f2	Major refactoring: Updated manifest format and renamed backup to snapshot - Created manifest.go with proper Manifest structure including blob sizes - Updated manifest generation to include compressed size for each blob - Added TotalCompressedSize field to manifest for quick access - Renamed backup package to snapshot for clarity - Updated snapshot list to show all remote snapshots - Remote snapshots not in local DB fetch manifest to get size - Local snapshots not in remote are automatically deleted - Removed backwards compatibility code (pre-1.0, no users) - Fixed prune command to use new manifest format - Updated all imports and references from backup to snapshot	2025-07-26 03:27:47 +02:00

28 Commits
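
Commit 470bf648c4 describes deterministic blob hashing as a double SHA256 of the uncompressed plaintext, so that the same content maps to the same blob identifier even after the local database is cleared. A minimal sketch of that idea follows, assuming "double SHA256" means hashing the first digest a second time. The function name `doubleSHA256Hex` and the hex encoding of the result are assumptions made for illustration, not taken from the vaultik source.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// doubleSHA256Hex is a hypothetical helper: it hashes the plaintext with
// SHA-256, hashes the resulting 32-byte digest again, and returns the
// second digest as lowercase hex. The output depends only on the input
// bytes, so it is stable across runs and machines.
func doubleSHA256Hex(plaintext []byte) string {
	first := sha256.Sum256(plaintext)
	second := sha256.Sum256(first[:])
	return hex.EncodeToString(second[:])
}

func main() {
	a := doubleSHA256Hex([]byte("example blob contents"))
	b := doubleSHA256Hex([]byte("example blob contents"))
	fmt.Println(a == b) // identical input yields an identical identifier
	fmt.Println(len(a)) // 64 hex characters for a 32-byte digest
}
```

Because the identifier is derived from the plaintext rather than from the compressed or encrypted output, a storage existence check before upload (the `Stat()` check mentioned in the same commit) can recognize a blob that already exists remotely.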