vaultik

Author	SHA1	Message	Date
sneak	fd759a921a	Hash snapshot IDs at the storage boundary; make snapshot list resilient Two related changes, both addressing leakage and brittleness around the public bytes the destination store sees. First, every remote storage path that previously embedded a human snapshot ID (e.g. metadata/heraklion_berlin.sneak.fs.photos.2026. catalog_2026-06-24T07:00:15Z/...) now uses the hashed remote key: RemoteSnapshotKey(id) = hex(SHA256(SHA256("vaultik|" + id))) Applied at: * uploadSnapshotArtifacts (snapshot create write path) * the manifest.json.zst snapshot_id field — manifest is unencrypted, so the human ID would otherwise be readable to anyone with bucket-list permission * cleanupIncompleteSnapshots metadata-existence probe * snapshot restore / verify (downloadSnapshotDB, loadVerificationData) * downloadManifestByKey, deleteRemoteSnapshotByKey * CleanupLocalSnapshots reconciliation * the locally-driven removal paths (RemoveSnapshot, RemoveAllSnapshots, confirmAndExecutePurge) The local index database keeps human IDs everywhere — the hash is a boundary translation, not a rename. A directory listing of the backup destination now looks like "metadata/<64-hex>/{db.zst.age,manifest.json.zst}" with no host, snapshot-name, or timestamp information visible. Second, snapshot list no longer fails just because remote storage is unreachable, and only consults the remote when the local machine can plausibly decrypt: * Listing is always driven by the local index database — that's what holds the human IDs, timestamps, and per-snapshot stats that the table actually shows. * If no age secret key is configured, we skip remote listing entirely (the box is treated as a write-only backup machine — there's no value showing it remote-only keys it could never restore). * If a key IS configured, we try the remote listing; failures (volume unmounted, permission denied, network error) downgrade to a warning instead of aborting the command. * When the remote listing succeeds, we cross-reference by hashing each local human ID and diffing against the returned key set. Local-only snapshots get the existing "stale local record" cleanup hint; remote-only keys are surfaced as a single "NOTE: N remote snapshot(s) found in backup destination store but not in local database" line. FileStorer construction also no longer does an eager mkdir — the basePath is recorded and the directory is created lazily on first write. A missing or unmounted destination during `snapshot list` should NOT block the command, and now it doesn't. RemoveAllSnapshots is rewritten to drive deletion from the local index instead of from a remote listing, hashing each local ID to find the corresponding remote key. Orphan remote keys (no matching local snapshot) are handled separately and only deleted when --remote is set. Existing tests are updated to hash storage paths through the new RemoteSnapshotKey helper. The hash format is a hard pre-1.0 break: existing remote snapshots written under the human-ID path scheme are no longer readable; they need to be either re-uploaded under the new scheme or manually renamed. There is no fallback path, matching the project policy of "no migrations pre-1.0."	2026-06-26 01:54:35 +02:00
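The key derivation and the list-time cross-reference described in commit fd759a921a can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes the domain separator is the literal string `vaultik|` (read from the formula in the message), and `crossReference` and the sample snapshot ID are hypothetical names invented for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// RemoteSnapshotKey follows the formula quoted in the commit message:
// hex(SHA256(SHA256("vaultik|" + id))). The "|" separator is an assumption.
func RemoteSnapshotKey(id string) string {
	first := sha256.Sum256([]byte("vaultik|" + id))
	second := sha256.Sum256(first[:])
	return hex.EncodeToString(second[:])
}

// crossReference is a hypothetical helper: it hashes each local human ID and
// diffs the result against the key set returned by a remote listing.
// It returns the local IDs with no remote counterpart and the number of
// remote keys that match no local ID.
func crossReference(localIDs, remoteKeys []string) (localOnly []string, remoteOnly int) {
	remote := make(map[string]bool, len(remoteKeys))
	for _, k := range remoteKeys {
		remote[k] = true
	}
	matched := make(map[string]bool, len(localIDs))
	for _, id := range localIDs {
		key := RemoteSnapshotKey(id)
		if remote[key] {
			matched[key] = true
		} else {
			localOnly = append(localOnly, id)
		}
	}
	for k := range remote {
		if !matched[k] {
			remoteOnly++
		}
	}
	return localOnly, remoteOnly
}

func main() {
	id := "examplehost_2026-06-24T07:00:15Z" // hypothetical snapshot ID
	fmt.Println("metadata/" + RemoteSnapshotKey(id) + "/manifest.json.zst")

	localOnly, remoteOnly := crossReference(
		[]string{id, "examplehost_2026-06-25T07:00:15Z"},
		[]string{RemoteSnapshotKey(id), "0000"},
	)
	fmt.Println("stale local records:", localOnly)
	fmt.Printf("NOTE: %d remote snapshot(s) not in local database\n", remoteOnly)
}
```

Because the hash is one-way, remote-only keys can only be counted, not named, which is consistent with the single summary line the commit describes.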
sneak	132f7149ca	Populate snapshot_blobs for dedup-referenced blobs at completion The bug: fully-deduplicated snapshots (every chunk already in storage from a prior run) had an empty snapshot_blobs table. The metadata- export pipeline then dropped all blob/blob_chunks rows from the exported database, leaving file_chunks references to chunks whose blobs were no longer recorded. Restore fails on every file with "chunk X not found in any blob". Fix: at CompleteSnapshot time, run an INSERT OR IGNORE that links every blob holding a chunk referenced by this snapshot's files into snapshot_blobs. New blobs uploaded during the snapshot are already recorded (no-op for them); dedup-referenced blobs are added. The cleanup query in deleteOrphanedBlobs already restricts to snapshot_blobs entries for the current snapshot — so once snapshot_blobs is correctly populated, the exported database contains the full set of blob/blob_chunks rows needed for restore. Regression test: TestDedupOnlySnapshotRestores creates two identical snapshots (the second uploads zero new blobs) and restores the second. Without the fix, restore fails on every file.	2026-06-17 06:05:52 +02:00
sneak	d479bfcd52	Adopt sneak.berlin/go/vaultik vanity import path, README overhaul Module path changed from git.eeqj.de/sneak/vaultik to sneak.berlin/go/vaultik (vanity redirect). All imports, ldflags, Dockerfile, goreleaser config, and docs updated. App data/config directories now use plain "vaultik" instead of the reverse-DNS name. README: - New copy-pasteable quickstart at top: go install, config init, age keypair, config set for key + file:// destination, home backup - All command names in command details are code-quoted - config set/get gained sequence index support (age_recipients.0) so lists are settable from the CLI - Dockerfile build is CGO_ENABLED=0 to match the pure-Go build	2026-06-10 11:37:23 -07:00
clawbot	ac2f21a89d	Refactor: break up oversized methods into smaller descriptive helpers (#41) Closes #40 Per sneak's feedback on PR #37: methods were too long. This PR breaks all methods over 100-150 lines into smaller, descriptively named helper methods. ## Refactored methods (8 total) | Original | Lines | Helpers extracted | |---|---|---| | `createNamedSnapshot` | 214 | `resolveSnapshotPaths`, `scanAllDirectories`, `collectUploadStats`, `finalizeSnapshotMetadata`, `printSnapshotSummary`, `getSnapshotBlobSizes`, `formatUploadSpeed` | | `ListSnapshots` | 159 | `listRemoteSnapshotIDs`, `reconcileLocalWithRemote`, `buildSnapshotInfoList`, `printSnapshotTable` | | `PruneBlobs` | 170 | `collectReferencedBlobs`, `listUniqueSnapshotIDs`, `listAllRemoteBlobs`, `findUnreferencedBlobs`, `deleteUnreferencedBlobs` | | `RunDeepVerify` | 182 | `loadVerificationData`, `runVerificationSteps`, `deepVerifyFailure` | | `RemoteInfo` | 187 | `collectSnapshotMetadata`, `collectReferencedBlobsFromManifests`, `populateRemoteInfoResult`, `scanRemoteBlobStorage`, `printRemoteInfoTable` | | `handleBlobReady` | 173 | `uploadBlobIfNeeded`, `makeUploadProgressCallback`, `recordBlobMetadata`, `cleanupBlobTempFile` | | `processFileStreaming` | 146 | `updateChunkStats`, `addChunkToPacker`, `queueFileForBatchInsert` | | `finalizeCurrentBlob` | 167 | `closeBlobWriter`, `buildChunkRefs`, `commitBlobToDatabase`, `deliverFinishedBlob` | ## Verification - `go build ./...` ✅ - `make test` ✅ (all tests pass) - `golangci-lint run` ✅ (0 issues) - No behavioral changes, pure restructuring Co-authored-by: user <user@Mac.lan guest wan> Reviewed-on: #41 Co-authored-by: clawbot <clawbot@noreply.example.org> Co-committed-by: clawbot <clawbot@noreply.example.org>	2026-03-19 00:23:45 +01:00
sneak	470bf648c4	Add deterministic deduplication, rclone backend, and database purge command - Implement deterministic blob hashing using double SHA256 of uncompressed plaintext data, enabling deduplication even after local DB is cleared - Add Stat() check before blob upload to skip existing blobs in storage - Add rclone storage backend for additional remote storage options - Add 'vaultik database purge' command to erase local state DB - Add 'vaultik remote check' command to verify remote connectivity - Show configured snapshots in 'vaultik snapshot list' output - Skip macOS resource fork files (._*) when listing remote snapshots - Use multi-threaded zstd compression (CPUs - 2 threads) - Add writer tests for double hashing behavior	2026-01-28 15:50:17 -08:00
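The idea behind commit 470bf648c4 (content-derived blob IDs plus an existence check before upload) might look something like the sketch below. It is an illustration under stated assumptions: `memStore`, `uploadIfMissing`, and the `blobs/` key prefix are hypothetical, and the in-memory map stands in for a real storage backend.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blobID derives a deterministic ID from the uncompressed plaintext using
// double SHA-256, so the same content maps to the same ID on every run.
func blobID(plaintext []byte) string {
	first := sha256.Sum256(plaintext)
	second := sha256.Sum256(first[:])
	return hex.EncodeToString(second[:])
}

// memStore is a hypothetical in-memory stand-in for a remote store.
type memStore struct {
	objects map[string][]byte
}

// Stat reports whether an object already exists under key.
func (s *memStore) Stat(key string) bool {
	_, ok := s.objects[key]
	return ok
}

// Put stores data under key.
func (s *memStore) Put(key string, data []byte) {
	s.objects[key] = data
}

// uploadIfMissing checks the store before uploading and reports whether an
// upload actually happened. The "blobs/" prefix is an assumption.
func uploadIfMissing(s *memStore, plaintext, encoded []byte) bool {
	key := "blobs/" + blobID(plaintext)
	if s.Stat(key) {
		return false // already present: skip the upload
	}
	s.Put(key, encoded)
	return true
}

func main() {
	store := &memStore{objects: map[string][]byte{}}
	data := []byte("same plaintext")
	fmt.Println("first run uploaded:", uploadIfMissing(store, data, []byte("ciphertext-1")))
	// A later run with no local database still finds the blob by its ID.
	fmt.Println("second run uploaded:", uploadIfMissing(store, data, []byte("ciphertext-2")))
}
```

Hashing the plaintext rather than the compressed, encrypted output matters here because encryption output would generally differ between runs even for identical input.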
sneak	417b25a5f5	Add custom types, version command, and restore --verify flag - Add internal/types package with type-safe wrappers for IDs, hashes, paths, and credentials (FileID, BlobID, ChunkHash, etc.) - Implement driver.Valuer and sql.Scanner for UUID-based types - Add `vaultik version` command showing version, commit, go version - Add `--verify` flag to restore command that checksums all restored files against expected chunk hashes with progress bar - Remove fetch.go (dead code, functionality in restore) - Clean up TODO.md, remove completed items - Update all database and snapshot code to use new custom types	2026-01-14 17:11:52 -08:00
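A type-safe wrapper of the kind commit 417b25a5f5 describes can implement `driver.Valuer` and `sql.Scanner` from the standard library so it round-trips through `database/sql`. The sketch below is an assumption-laden illustration: it uses a string-backed `ChunkHash` rather than the UUID-based types the commit mentions, and the method bodies are not taken from the project.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"fmt"
)

// ChunkHash is a hypothetical string-backed wrapper; a distinct type keeps a
// chunk hash from being passed where some other ID is expected.
type ChunkHash string

// Compile-time checks that the wrapper satisfies both interfaces.
var (
	_ driver.Valuer = ChunkHash("")
	_ sql.Scanner   = (*ChunkHash)(nil)
)

// Value converts the wrapper to a type the database driver understands.
func (h ChunkHash) Value() (driver.Value, error) {
	return string(h), nil
}

// Scan converts a database value back into the wrapper. Drivers may hand
// back text columns as either string or []byte, so both are accepted.
func (h *ChunkHash) Scan(src any) error {
	switch v := src.(type) {
	case string:
		*h = ChunkHash(v)
	case []byte:
		*h = ChunkHash(v)
	case nil:
		*h = ""
	default:
		return fmt.Errorf("cannot scan %T into ChunkHash", src)
	}
	return nil
}

func main() {
	var h ChunkHash
	if err := h.Scan([]byte("abc123")); err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	v, _ := h.Value()
	fmt.Println(h, v)
}
```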
sneak	2afd54d693	Add exclude patterns, snapshot prune, and other improvements - Implement exclude patterns with anchored pattern support: - Patterns starting with / only match from root of source dir - Unanchored patterns match anywhere in path - Support for glob patterns (.log, ., */.pack) - Directory patterns skip entire subtrees - Add gobwas/glob dependency for pattern matching - Add 16 comprehensive tests for exclude functionality - Add snapshot prune command to clean orphaned data: - Removes incomplete snapshots from database - Cleans orphaned files, chunks, and blobs - Runs automatically at backup start for consistency - Add snapshot remove command for deleting snapshots - Add VAULTIK_AGE_SECRET_KEY environment variable support - Fix duplicate fx module provider in restore command - Change snapshot ID format to hostname_YYYY-MM-DDTHH:MM:SSZ	2026-01-01 05:42:56 -08:00
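The anchored versus unanchored matching rules in commit 2afd54d693 can be illustrated with a simplified matcher. Note the swap: the project uses gobwas/glob, while this sketch uses `path.Match` from the Go standard library to stay self-contained, so the supported glob syntax is narrower. The function name `excluded` and all sample patterns and paths are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// excluded reports whether rel (a slash-separated path relative to the
// source directory) matches pattern under simplified rules:
//   - a pattern starting with "/" is anchored and only matches from the root;
//   - any other pattern may match at any depth;
//   - a match on a directory prefix excludes everything beneath it.
func excluded(pattern, rel string) bool {
	segs := strings.Split(strings.TrimPrefix(rel, "/"), "/")

	if strings.HasPrefix(pattern, "/") {
		p := strings.TrimPrefix(pattern, "/")
		n := strings.Count(p, "/") + 1 // number of segments in the pattern
		if len(segs) < n {
			return false
		}
		ok, _ := path.Match(p, strings.Join(segs[:n], "/"))
		return ok
	}

	// Unanchored: slide a window of the pattern's segment count over the path.
	n := strings.Count(pattern, "/") + 1
	for i := 0; i+n <= len(segs); i++ {
		if ok, _ := path.Match(pattern, strings.Join(segs[i:i+n], "/")); ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(excluded("/build", "build/out/a.o"))        // anchored, at root
	fmt.Println(excluded("/build", "src/build/a.o"))        // anchored, not at root
	fmt.Println(excluded("*.log", "var/app/x.log"))         // unanchored file glob
	fmt.Println(excluded("node_modules", "a/node_modules/b.js")) // directory subtree
}
```

A real scanner would more likely test the pattern when it first visits a directory and skip the subtree there, instead of re-checking every descendant path.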
sneak	8a8651c690	Fix foreign key error when deleting incomplete snapshots Delete uploads table entries before deleting the snapshot itself. The uploads table has a foreign key to snapshots(id) without CASCADE, so we must explicitly delete upload records first.	2025-12-19 12:27:05 +07:00
sneak	badc0c07e0	Add pluggable storage backend, PID locking, and improved scan progress Storage backend: - Add internal/storage package with Storer interface - Implement FileStorer for local filesystem storage (file:// URLs) - Implement S3Storer wrapping existing s3.Client - Support storage_url config field (s3:// or file://) - Migrate all consumers to use storage.Storer interface PID locking: - Add internal/pidlock package to prevent concurrent instances - Acquire lock before app start, release on exit - Detect stale locks from crashed processes Scan progress improvements: - Add fast file enumeration pass before stat() phase - Use enumerated set for deletion detection (no extra filesystem access) - Show progress with percentage, files/sec, elapsed time, and ETA - Change "changed" to "changed/new" for clarity Config improvements: - Add tilde expansion for paths (~/) - Use xdg library for platform-specific default index path	2025-12-19 11:52:51 +07:00
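Selecting a backend from the `storage_url` config field (s3:// or file://), as commit badc0c07e0 describes, could be sketched with `net/url` as below. The `backend` struct and `parseStorageURL` are hypothetical names for this example, and the real package presumably returns a `Storer` implementation rather than a plain description.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// backend is a hypothetical description of the storage target parsed from a
// storage_url value.
type backend struct {
	Kind   string // "file" or "s3"
	Bucket string // S3 bucket name; empty for file://
	Path   string // filesystem base path, or key prefix within the bucket
}

// parseStorageURL maps a storage URL to a backend description and rejects
// schemes other than the two named in the commit message.
func parseStorageURL(raw string) (backend, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return backend{}, fmt.Errorf("invalid storage URL: %w", err)
	}
	switch u.Scheme {
	case "file":
		return backend{Kind: "file", Path: u.Path}, nil
	case "s3":
		return backend{
			Kind:   "s3",
			Bucket: u.Host,
			Path:   strings.TrimPrefix(u.Path, "/"),
		}, nil
	default:
		return backend{}, fmt.Errorf("unsupported storage scheme %q", u.Scheme)
	}
}

func main() {
	for _, raw := range []string{"file:///mnt/backup", "s3://example-bucket/prefix", "ftp://host/x"} {
		b, err := parseStorageURL(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", b)
	}
}
```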
sneak	bb38f8c5d6	Integrate afero filesystem abstraction library - Add afero.Fs field to Vaultik struct for filesystem operations - Vaultik now owns and manages the filesystem instance - SnapshotManager receives filesystem via SetFilesystem() setter - Update blob packer to use afero for temporary files - Convert all filesystem operations to use afero abstraction - Remove filesystem module - Vaultik manages filesystem directly - Update tests: remove symlink test (unsupported by afero memfs) - Fix TestMultipleFileChanges to handle scanner examining directories This enables full end-to-end testing without touching disk by using memory-backed filesystems. Database operations continue using real filesystem as SQLite requires actual files.	2025-07-26 15:33:18 +02:00
sneak	e29a995120	Refactor: Move Vaultik struct and methods to internal/vaultik package - Created new internal/vaultik package with unified Vaultik struct - Moved all command methods (snapshot, info, prune, verify) from CLI to vaultik package - Implemented single constructor that handles crypto capabilities automatically - Added CanDecrypt() method to check if decryption is available - Updated all CLI commands to use the new vaultik.Vaultik struct - Removed old fragmented App structs and WithCrypto wrapper - Fixed context management - Vaultik now owns its context lifecycle - Cleaned up package imports and dependencies This creates a cleaner separation between CLI/Cobra code and business logic, with all vaultik operations now centralized in the internal/vaultik package.	2025-07-26 14:47:26 +02:00
sneak	a544fa80f2	Major refactoring: Updated manifest format and renamed backup to snapshot - Created manifest.go with proper Manifest structure including blob sizes - Updated manifest generation to include compressed size for each blob - Added TotalCompressedSize field to manifest for quick access - Renamed backup package to snapshot for clarity - Updated snapshot list to show all remote snapshots - Remote snapshots not in local DB fetch manifest to get size - Local snapshots not in remote are automatically deleted - Removed backwards compatibility code (pre-1.0, no users) - Fixed prune command to use new manifest format - Updated all imports and references from backup to snapshot	2025-07-26 03:27:47 +02:00

12 Commits