28 Commits

Author SHA1 Message Date
162d76bb38 Merge branch 'main' into fix/issue-27 2026-02-16 06:17:51 +01:00
clawbot
bfd7334221 fix: replace table name allowlist with regex sanitization
Replace the hardcoded validTableNames allowlist with a regexp that
only allows [a-z0-9_] characters. This prevents SQL injection without
requiring maintenance of a separate allowlist when new tables are added.

Addresses review feedback from @sneak on PR #32.
2026-02-15 21:17:24 -08:00
user
9b32bf0846 fix: replace table name allowlist with regex sanitization
Replace the hardcoded validTableNames allowlist with a regexp that
only allows [a-z0-9_] characters. This prevents SQL injection without
requiring maintenance of a separate allowlist when new tables are added.

Addresses review feedback from @sneak on PR #32.
2026-02-15 21:15:49 -08:00
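(A minimal sketch of the pattern these two commits describe; the `getTableCount` shape comes from the earlier allowlist commit below, and the receiver and exact signature are assumptions.)
```go
// Only [a-z0-9_] is accepted, per the sanitization rule in this commit.
var tableNameRe = regexp.MustCompile(`^[a-z0-9_]+$`)

func (d *DB) getTableCount(ctx context.Context, table string) (int, error) {
	// Identifiers cannot be bound as SQL parameters, so validate before
	// interpolating the table name into the query.
	if !tableNameRe.MatchString(table) {
		return 0, fmt.Errorf("invalid table name: %q", table)
	}
	var n int
	err := d.conn.QueryRowContext(ctx,
		fmt.Sprintf("SELECT COUNT(*) FROM %s", table)).Scan(&n)
	return n, err
}
```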
8adc668fa6 Merge pull request 'Prevent double-close of blobgen.Writer in CompressStream (closes #28)' (#33) from fix/issue-28 into main
Reviewed-on: #33
2026-02-16 06:04:33 +01:00
clawbot
441c441eca fix: prevent double-close of blobgen.Writer in CompressStream
CompressStream had both a defer w.Close() and an explicit w.Close() call,
causing the compressor and encryptor to be closed twice. The second close
on the zstd encoder returns an error, and the age encryptor may write
duplicate finalization bytes, potentially corrupting the output stream.

Use a closed flag to prevent the deferred close from running after the
explicit close succeeds.
2026-02-08 12:03:36 -08:00
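(A sketch of the closed-flag pattern this commit describes; apart from `blobgen.Writer`, the names and signatures here are assumptions.)
```go
func CompressStream(dst io.Writer, src io.Reader) error {
	w, err := blobgen.NewWriter(dst) // assumed constructor for blobgen.Writer
	if err != nil {
		return err
	}
	closed := false
	defer func() {
		if !closed {
			_ = w.Close() // runs only on early-error paths
		}
	}()
	if _, err := io.Copy(w, src); err != nil {
		return err
	}
	// The explicit close flushes the zstd encoder and finalizes age encryption.
	if err := w.Close(); err != nil {
		return err
	}
	closed = true
	return nil
}
```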
clawbot
4d9f912a5f fix: validate table name against allowlist in getTableCount to prevent SQL injection
The getTableCount method used fmt.Sprintf to interpolate a table name directly
into a SQL query. While currently only called with hardcoded names, this is a
dangerous pattern. Added an allowlist of valid table names and return an error
for unrecognized names.
2026-02-08 12:03:18 -08:00
46c2ea3079 fix: remove dead deep-verify TODO stub, route to RunDeepVerify
The VerifySnapshotWithOptions method had a dead code path for opts.Deep
that printed 'not yet implemented' and returned nil. The CLI already
routes --deep to RunDeepVerify (which is fully implemented). Remove the
dead branch and update the VerifySnapshot convenience method to also
route deep=true to RunDeepVerify.

Fixes #2
2026-02-08 08:33:18 -08:00
470bf648c4 Add deterministic deduplication, rclone backend, and database purge command
- Implement deterministic blob hashing using double SHA256 of uncompressed
  plaintext data, enabling deduplication even after local DB is cleared
- Add Stat() check before blob upload to skip existing blobs in storage
- Add rclone storage backend for additional remote storage options
- Add 'vaultik database purge' command to erase local state DB
- Add 'vaultik remote check' command to verify remote connectivity
- Show configured snapshots in 'vaultik snapshot list' output
- Skip macOS resource fork files (._*) when listing remote snapshots
- Use multi-threaded zstd compression (CPUs - 2 threads)
- Add writer tests for double hashing behavior
2026-01-28 15:50:17 -08:00
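(A sketch of the deterministic hashing and upload-skip described above; `Stat()` and the `storage.Storer` interface are mentioned in this repo, but the exact signatures are assumptions.)
```go
// Deterministic blob naming: double SHA256 of the uncompressed plaintext,
// so identical content maps to the same blob even after the local DB is cleared.
func blobID(plaintext []byte) string {
	first := sha256.Sum256(plaintext)
	second := sha256.Sum256(first[:])
	return hex.EncodeToString(second[:])
}

// Skip the upload when the blob already exists in remote storage.
func uploadIfMissing(ctx context.Context, store storage.Storer, key string, data []byte) error {
	if _, err := store.Stat(ctx, key); err == nil {
		return nil // blob already present; deduplicated
	}
	return store.Put(ctx, key, bytes.NewReader(data))
}
```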
bdaaadf990 Add --quiet flag, --json output, and config permission check
- Add global --quiet/-q flag to suppress non-error output
- Add --json flag to verify, snapshot rm, and prune commands
- Add config file permission check (warns if world/group readable)
- Update TODO.md to remove completed items
2026-01-16 09:20:29 -08:00
417b25a5f5 Add custom types, version command, and restore --verify flag
- Add internal/types package with type-safe wrappers for IDs, hashes,
  paths, and credentials (FileID, BlobID, ChunkHash, etc.)
- Implement driver.Valuer and sql.Scanner for UUID-based types
- Add `vaultik version` command showing version, commit, go version
- Add `--verify` flag to restore command that checksums all restored
  files against expected chunk hashes with progress bar
- Remove fetch.go (dead code, functionality in restore)
- Clean up TODO.md, remove completed items
- Update all database and snapshot code to use new custom types
2026-01-14 17:11:52 -08:00
2afd54d693 Add exclude patterns, snapshot prune, and other improvements
- Implement exclude patterns with anchored pattern support:
  - Patterns starting with / only match from root of source dir
  - Unanchored patterns match anywhere in path
  - Support for glob patterns (*.log, .*, **/*.pack)
  - Directory patterns skip entire subtrees
  - Add gobwas/glob dependency for pattern matching
  - Add 16 comprehensive tests for exclude functionality

- Add snapshot prune command to clean orphaned data:
  - Removes incomplete snapshots from database
  - Cleans orphaned files, chunks, and blobs
  - Runs automatically at backup start for consistency

- Add snapshot remove command for deleting snapshots

- Add VAULTIK_AGE_SECRET_KEY environment variable support

- Fix duplicate fx module provider in restore command

- Change snapshot ID format to hostname_YYYY-MM-DDTHH:MM:SSZ
2026-01-01 05:42:56 -08:00
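(A sketch of the anchored/unanchored semantics described above, using the gobwas/glob dependency this commit adds; the helper is illustrative, not the repo's actual code.)
```go
// matchesExclude applies one pattern to a path relative to the source dir.
func matchesExclude(pattern, relPath string) bool {
	if strings.HasPrefix(pattern, "/") {
		// Anchored: only matches from the root of the source directory.
		return glob.MustCompile(strings.TrimPrefix(pattern, "/"), '/').Match(relPath)
	}
	// Unanchored: matches anywhere in the path, including the basename.
	if glob.MustCompile(pattern, '/').Match(path.Base(relPath)) {
		return true
	}
	return glob.MustCompile("**/"+pattern, '/').Match(relPath)
}
```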
05286bed01 Batch transactions per blob for improved performance
Previously, each chunk and blob_chunk was inserted in a separate
transaction, leading to ~560k+ transactions for large backups.
This change batches all database operations per blob:

- Chunks are queued in packer.pendingChunks during file processing
- When blob finalizes, one transaction inserts all chunks, blob_chunks,
  and updates the blob record
- Scanner tracks pending chunk hashes to know which files can be flushed
- Files are flushed when all their chunks are committed to DB
- Database is consistent after each blob finalize

This reduces transaction count from O(chunks) to O(blobs), which for a
614k file / 44GB backup means ~50-100 transactions instead of ~560k.
2025-12-23 19:07:26 +07:00
f2c120f026 Merge feature/pluggable-storage-backend
- Add pluggable storage backend with file:// URL support
- Fix FK constraint errors in batched file insertion
- Cache chunk hashes in memory for faster lookups
- Remove dangerous database recovery that corrupted DBs after Ctrl+C
- Add PROCESS.md documenting snapshot creation lifecycle
2025-12-23 18:50:21 +07:00
bbe09ec5b5 Remove dangerous database recovery that deleted journal/WAL files
SQLite handles crash recovery automatically when opening a database.
The previous recoverDatabase() function was deleting journal and WAL
files BEFORE opening the database, which prevented SQLite from
recovering incomplete transactions and caused database corruption
after Ctrl+C or crashes.

This was causing "database disk image is malformed" errors after
interrupting a backup operation.
2025-12-23 09:16:01 +07:00
43a69c2cfb Fix FK constraint errors in batched file insertion
Generate file UUIDs upfront in checkFileInMemory() rather than
deferring to Files.Create(). This ensures file_chunks and chunk_files
records have valid FileID values when constructed during file
processing, before the batch insert transaction.

Root cause: For new files, file.ID was empty when building the
fileChunks and chunkFiles slices. The ID was only generated later
in Files.Create(), but by then the slices already had empty FileID
values, causing FK constraint failures.

Also adds PROCESS.md documenting the snapshot creation lifecycle,
database transactions, and FK dependency ordering.
2025-12-19 19:48:48 +07:00
899448e1da Cache chunk hashes in memory for faster small file processing
Load all known chunk hashes into an in-memory map at scan start,
eliminating per-chunk database queries during file processing.
This significantly improves performance when backing up many small files.
2025-12-19 12:56:04 +07:00
24c5e8c5a6 Refactor: Create file records only after successful chunking
- Scan phase now only collects files to process, no DB writes
- Unchanged files get snapshot_files associations via batch (no new records)
- New/changed files get records created during processing after chunking
- Reduces DB writes significantly (only changed files need new records)
- Avoids orphaned file records if backup is interrupted mid-way
2025-12-19 12:40:45 +07:00
40fff09594 Update progress output format with compact file counts
New format: Progress [5.7k/610k] 6.7 GB/44 GB (15.4%), 106 MB/sec, 500 files/sec, running for 1m30s, ETA: 5m49s

- Compact file counts with k/M suffixes in brackets
- Bytes processed/total with percentage
- Both byte rate and file rate
- Elapsed time shown as "running for X"
2025-12-19 12:33:38 +07:00
8a8651c690 Fix foreign key error when deleting incomplete snapshots
Delete uploads table entries before deleting the snapshot itself.
The uploads table has a foreign key to snapshots(id) without CASCADE,
so we must explicitly delete upload records first.
2025-12-19 12:27:05 +07:00
a1d559c30d Improve processing progress output with bytes and blob messages
- Show bytes processed/total instead of just files
- Display data rate in bytes/sec
- Calculate ETA based on bytes (more accurate than files)
- Print message when each blob is stored with size and speed
2025-12-19 12:24:55 +07:00
88e2508dc7 Eliminate redundant filesystem traversal in scan phase
Remove the separate enumerateFiles() function that was doing a full
directory walk using Readdir() which calls stat() on every file.
Instead, build the existingFiles map during the scan phase walk,
and detect deleted files afterward.

This eliminates one full filesystem traversal, significantly speeding
up the scan phase for large directories.
2025-12-19 12:15:13 +07:00
c3725e745e Optimize scan phase: in-memory change detection and batched DB writes
Performance improvements:
- Load all known files from DB into memory at startup
- Check file changes against in-memory map (no per-file DB queries)
- Batch database writes in groups of 1000 files per transaction
- Scan phase now only counts regular files, not directories

This should improve scan speed from ~600 files/sec to potentially
10,000+ files/sec by eliminating per-file database round trips.
2025-12-19 12:08:47 +07:00
badc0c07e0 Add pluggable storage backend, PID locking, and improved scan progress
Storage backend:
- Add internal/storage package with Storer interface
- Implement FileStorer for local filesystem storage (file:// URLs)
- Implement S3Storer wrapping existing s3.Client
- Support storage_url config field (s3:// or file://)
- Migrate all consumers to use storage.Storer interface

PID locking:
- Add internal/pidlock package to prevent concurrent instances
- Acquire lock before app start, release on exit
- Detect stale locks from crashed processes

Scan progress improvements:
- Add fast file enumeration pass before stat() phase
- Use enumerated set for deletion detection (no extra filesystem access)
- Show progress with percentage, files/sec, elapsed time, and ETA
- Change "changed" to "changed/new" for clarity

Config improvements:
- Add tilde expansion for paths (~/)
- Use xdg library for platform-specific default index path
2025-12-19 11:52:51 +07:00
cda0cf865a Add ARCHITECTURE.md documenting internal design
Document the data model, type instantiation flow, and module
responsibilities. Covers chunker, packer, vaultik, cli, snapshot,
and database modules with detailed explanations of relationships
between File, Chunk, Blob, and Snapshot entities.
2025-12-18 19:49:42 -08:00
0736bd070b Add godoc documentation to exported types and methods
Add proper godoc comments to exported items in:
- internal/globals: Appname, Version, Commit variables; Globals type; New function
- internal/log: LogLevel type; level constants; Config type; Initialize, Fatal,
  Error, Warn, Notice, Info, Debug functions and variants; TTYHandler type and
  methods; Module variable; LogOptions type
2025-12-18 18:51:52 -08:00
d7cd9aac27 Add end-to-end integration tests for Vaultik
- Create comprehensive integration tests with mock S3 client
- Add in-memory filesystem and SQLite database support for testing
- Test full backup workflow including chunking, packing, and uploading
- Add test to verify encrypted blob content
- Fix scanner to use afero filesystem for temp file cleanup
- Demonstrate successful backup and verification with mock dependencies
2025-07-26 15:52:23 +02:00
bb38f8c5d6 Integrate afero filesystem abstraction library
- Add afero.Fs field to Vaultik struct for filesystem operations
- Vaultik now owns and manages the filesystem instance
- SnapshotManager receives filesystem via SetFilesystem() setter
- Update blob packer to use afero for temporary files
- Convert all filesystem operations to use afero abstraction
- Remove filesystem module - Vaultik manages filesystem directly
- Update tests: remove symlink test (unsupported by afero memfs)
- Fix TestMultipleFileChanges to handle scanner examining directories

This enables full end-to-end testing without touching disk by using
memory-backed filesystems. Database operations continue using real
filesystem as SQLite requires actual files.
2025-07-26 15:33:18 +02:00
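(An illustration of what the memory-backed filesystem enables in tests; the test below is a sketch, not one of the repo's actual tests.)
```go
func TestBackupReadsFromMemFs(t *testing.T) {
	fs := afero.NewMemMapFs()
	if err := afero.WriteFile(fs, "/src/hello.txt", []byte("hello"), 0o644); err != nil {
		t.Fatal(err)
	}
	// Everything under test receives this afero.Fs, so no real disk is
	// touched (except SQLite, as noted in the commit message).
	data, err := afero.ReadFile(fs, "/src/hello.txt")
	if err != nil || string(data) != "hello" {
		t.Fatalf("unexpected read: %q, %v", data, err)
	}
}
```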
e29a995120 Refactor: Move Vaultik struct and methods to internal/vaultik package
- Created new internal/vaultik package with unified Vaultik struct
- Moved all command methods (snapshot, info, prune, verify) from CLI to vaultik package
- Implemented single constructor that handles crypto capabilities automatically
- Added CanDecrypt() method to check if decryption is available
- Updated all CLI commands to use the new vaultik.Vaultik struct
- Removed old fragmented App structs and WithCrypto wrapper
- Fixed context management - Vaultik now owns its context lifecycle
- Cleaned up package imports and dependencies

This creates a cleaner separation between CLI/Cobra code and business logic,
with all vaultik operations now centralized in the internal/vaultik package.
2025-07-26 14:47:26 +02:00
90 changed files with 12442 additions and 3433 deletions

ARCHITECTURE.md (new file)

@@ -0,0 +1,380 @@
# Vaultik Architecture
This document describes the internal architecture of Vaultik, focusing on the data model, type instantiation, and the relationships between core modules.
## Overview
Vaultik is a backup system that uses content-defined chunking for deduplication and packs chunks into large, compressed, encrypted blobs for efficient cloud storage. The system is built around dependency injection using [uber-go/fx](https://github.com/uber-go/fx).
## Data Flow
```
Source Files
      │
      ▼
┌─────────────────┐
│     Scanner     │  Walks directories, detects changed files
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     Chunker     │  Splits files into variable-size chunks (FastCDC)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     Packer      │  Accumulates chunks, compresses (zstd), encrypts (age)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    S3 Client    │  Uploads blobs to remote storage
└─────────────────┘
```
## Data Model
### Core Entities
The database tracks five primary entities and their relationships:
```
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│   Snapshot   │─────▶│     File     │─────▶│    Chunk     │
└──────┬───────┘      └──────────────┘      └──────┬───────┘
       │                                           │
       ▼                                           ▼
┌──────────────┐                           ┌──────────────┐
│     Blob     │◀──────────────────────────│  BlobChunk   │
└──────────────┘                           └──────────────┘
```
### Entity Descriptions
#### File (`database.File`)
Represents a file or directory in the backup system. Stores metadata needed for restoration:
- Path, timestamps (mtime, ctime)
- Size, mode, ownership (uid, gid)
- Symlink target (if applicable)
#### Chunk (`database.Chunk`)
A content-addressed unit of data. Files are split into variable-size chunks using the FastCDC algorithm:
- `ChunkHash`: SHA256 hash of chunk content (primary key)
- `Size`: Chunk size in bytes
Chunk sizes vary between `avgChunkSize/4` and `avgChunkSize*4` (typically 16KB-256KB for 64KB average).
#### FileChunk (`database.FileChunk`)
Maps files to their constituent chunks:
- `FileID`: Reference to the file
- `Idx`: Position of this chunk within the file (0-indexed)
- `ChunkHash`: Reference to the chunk
#### Blob (`database.Blob`)
The final storage unit uploaded to S3. Contains many compressed and encrypted chunks:
- `ID`: UUID assigned at creation
- `Hash`: SHA256 of final compressed+encrypted content
- `UncompressedSize`: Total raw chunk data before compression
- `CompressedSize`: Size after zstd compression and age encryption
- `CreatedTS`, `FinishedTS`, `UploadedTS`: Lifecycle timestamps
Blob creation process:
1. Chunks are accumulated (up to MaxBlobSize, typically 10GB)
2. Compressed with zstd
3. Encrypted with age (recipients configured in config)
4. SHA256 hash computed → becomes filename in S3
5. Uploaded to `blobs/{hash[0:2]}/{hash[2:4]}/{hash}`
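As a small illustration of step 5, the object key can be derived from the hash alone (the helper name is ours, not necessarily the repo's):
```go
// blobKey fans objects out on the first two hash bytes:
// blobs/{hash[0:2]}/{hash[2:4]}/{hash}
func blobKey(hash string) string {
	return fmt.Sprintf("blobs/%s/%s/%s", hash[:2], hash[2:4], hash)
}
```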
#### BlobChunk (`database.BlobChunk`)
Maps chunks to their position within blobs:
- `BlobID`: Reference to the blob
- `ChunkHash`: Reference to the chunk
- `Offset`: Byte offset within the uncompressed blob
- `Length`: Chunk size
#### Snapshot (`database.Snapshot`)
Represents a point-in-time backup:
- `ID`: Format is `{hostname}-{YYYYMMDD}-{HHMMSS}Z`
- Tracks file count, chunk count, blob count, sizes, compression ratio
- `CompletedAt`: Null until snapshot finishes successfully
#### SnapshotFile / SnapshotBlob
Join tables linking snapshots to their files and blobs.
### Relationship Summary
```
Snapshot 1──────────▶ N SnapshotFile N ◀────────── 1 File
Snapshot 1──────────▶ N SnapshotBlob N ◀────────── 1 Blob
File 1──────────▶ N FileChunk N ◀────────── 1 Chunk
Blob 1──────────▶ N BlobChunk N ◀────────── 1 Chunk
```
## Type Instantiation
### Application Startup
The CLI uses fx for dependency injection. Here's the instantiation order:
```go
// cli/app.go: NewApp()
fx.New(
	fx.Supply(config.ConfigPath(opts.ConfigPath)), // 1. Config path
	fx.Supply(opts.LogOptions),                    // 2. Log options
	fx.Provide(globals.New),                       // 3. Globals
	fx.Provide(log.New),                           // 4. Logger config
	config.Module,                                 // 5. Config
	database.Module,                               // 6. Database + Repositories
	log.Module,                                    // 7. Logger initialization
	s3.Module,                                     // 8. S3 client
	snapshot.Module,                               // 9. SnapshotManager + ScannerFactory
	fx.Provide(vaultik.New),                       // 10. Vaultik orchestrator
)
```
### Key Type Instantiation Points
#### 1. Config (`config.Config`)
- **Created by**: `config.Module` via `config.LoadConfig()`
- **When**: Application startup (fx DI)
- **Contains**: All configuration from YAML file (S3 credentials, encryption keys, paths, etc.)
#### 2. Database (`database.DB`)
- **Created by**: `database.Module` via `database.New()`
- **When**: Application startup (fx DI)
- **Contains**: SQLite connection, path reference
#### 3. Repositories (`database.Repositories`)
- **Created by**: `database.Module` via `database.NewRepositories()`
- **When**: Application startup (fx DI)
- **Contains**: All repository interfaces (Files, Chunks, Blobs, Snapshots, etc.)
#### 4. Vaultik (`vaultik.Vaultik`)
- **Created by**: `vaultik.New(VaultikParams)`
- **When**: Application startup (fx DI)
- **Contains**: All dependencies for backup operations
```go
type Vaultik struct {
	Globals         *globals.Globals
	Config          *config.Config
	DB              *database.DB
	Repositories    *database.Repositories
	S3Client        *s3.Client
	ScannerFactory  snapshot.ScannerFactory
	SnapshotManager *snapshot.SnapshotManager
	Shutdowner      fx.Shutdowner
	Fs              afero.Fs

	ctx    context.Context
	cancel context.CancelFunc
}
```
#### 5. SnapshotManager (`snapshot.SnapshotManager`)
- **Created by**: `snapshot.Module` via `snapshot.NewSnapshotManager()`
- **When**: Application startup (fx DI)
- **Responsibility**: Creates/completes snapshots, exports metadata to S3
#### 6. Scanner (`snapshot.Scanner`)
- **Created by**: `ScannerFactory(ScannerParams)`
- **When**: Each `CreateSnapshot()` call
- **Contains**: Chunker, Packer, progress reporter
```go
// vaultik/snapshot.go: CreateSnapshot()
scanner := v.ScannerFactory(snapshot.ScannerParams{
	EnableProgress: !opts.Cron,
	Fs:             v.Fs,
})
```
#### 7. Chunker (`chunker.Chunker`)
- **Created by**: `chunker.NewChunker(avgChunkSize)`
- **When**: Inside `snapshot.NewScanner()`
- **Configuration**:
- `avgChunkSize`: From config (typically 64KB)
- `minChunkSize`: avgChunkSize / 4
- `maxChunkSize`: avgChunkSize * 4
#### 8. Packer (`blob.Packer`)
- **Created by**: `blob.NewPacker(PackerConfig)`
- **When**: Inside `snapshot.NewScanner()`
- **Configuration**:
- `MaxBlobSize`: Maximum blob size before finalization (typically 10GB)
- `CompressionLevel`: zstd level (1-19)
- `Recipients`: age public keys for encryption
```go
// snapshot/scanner.go: NewScanner()
packerCfg := blob.PackerConfig{
	MaxBlobSize:      cfg.MaxBlobSize,
	CompressionLevel: cfg.CompressionLevel,
	Recipients:       cfg.AgeRecipients,
	Repositories:     cfg.Repositories,
	Fs:               cfg.FS,
}
packer, err := blob.NewPacker(packerCfg)
```
## Module Responsibilities
### `internal/cli`
Entry point for fx application. Combines all modules and handles signal interrupts.
Key functions:
- `NewApp(AppOptions)` → Creates fx.App with all modules
- `RunApp(ctx, app)` → Starts app, handles graceful shutdown
- `RunWithApp(ctx, opts)` → Convenience wrapper
### `internal/vaultik`
Main orchestrator containing all dependencies and command implementations.
Key methods:
- `New(VaultikParams)` → Constructor (fx DI)
- `CreateSnapshot(opts)` → Main backup operation
- `ListSnapshots(jsonOutput)` → List available snapshots
- `VerifySnapshot(id, deep)` → Verify snapshot integrity
- `PurgeSnapshots(...)` → Remove old snapshots
### `internal/chunker`
Content-defined chunking using FastCDC algorithm.
Key types:
- `Chunk` → Hash, Data, Offset, Size
- `Chunker` → avgChunkSize, minChunkSize, maxChunkSize
Key methods:
- `NewChunker(avgChunkSize)` → Constructor
- `ChunkReaderStreaming(reader, callback)` → Stream chunks with callback (preferred)
- `ChunkReader(reader)` → Return all chunks at once (memory-intensive)
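A usage sketch of the streaming interface; the exact callback signature is an assumption based on the `Chunk` fields listed above:
```go
ck := chunker.NewChunker(64 * 1024) // 64KB average chunk size

f, err := os.Open("/path/to/file")
if err != nil {
	return err
}
defer f.Close()

// Chunks are handed to the callback one at a time, so only a single
// chunk needs to be held in memory.
err = ck.ChunkReaderStreaming(f, func(c *chunker.Chunk) error {
	fmt.Printf("chunk %s at offset %d (%d bytes)\n", c.Hash, c.Offset, c.Size)
	return nil
})
```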
### `internal/blob`
Blob packing: accumulates chunks, compresses, encrypts, tracks metadata.
Key types:
- `Packer` → Thread-safe blob accumulator
- `ChunkRef` → Hash + Data for adding to packer
- `FinishedBlob` → Completed blob ready for upload
- `BlobWithReader` → FinishedBlob + io.Reader for streaming upload
Key methods:
- `NewPacker(PackerConfig)` → Constructor
- `AddChunk(ChunkRef)` → Add chunk to current blob
- `FinalizeBlob()` → Compress, encrypt, hash current blob
- `Flush()` → Finalize any in-progress blob
- `SetBlobHandler(func)` → Set callback for upload
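A sketch of the add/finalize loop these methods imply; `ErrBlobSizeLimitExceeded` and the retry shape are taken from the scanner code quoted in PROCESS.md below, the rest is illustrative:
```go
for _, c := range chunks {
	err := packer.AddChunk(&blob.ChunkRef{Hash: c.Hash, Data: c.Data})
	if err == blob.ErrBlobSizeLimitExceeded {
		// Current blob is full: compress, encrypt, hash, then retry the chunk.
		if err := packer.FinalizeBlob(); err != nil {
			return err
		}
		err = packer.AddChunk(&blob.ChunkRef{Hash: c.Hash, Data: c.Data})
	}
	if err != nil {
		return err
	}
}
// Finalize whatever remains in the in-progress blob.
if err := packer.Flush(); err != nil {
	return err
}
```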
### `internal/snapshot`
#### Scanner
Orchestrates the backup process for a directory.
Key methods:
- `NewScanner(ScannerConfig)` → Constructor (creates Chunker + Packer)
- `Scan(ctx, path, snapshotID)` → Main scan operation
Scan phases:
1. **Phase 0**: Detect deleted files from previous snapshots
2. **Phase 1**: Walk directory, identify files needing processing
3. **Phase 2**: Process files (chunk → pack → upload)
#### SnapshotManager
Manages snapshot lifecycle and metadata export.
Key methods:
- `CreateSnapshot(ctx, hostname, version, commit)` → Create snapshot record
- `CompleteSnapshot(ctx, snapshotID)` → Mark snapshot complete
- `ExportSnapshotMetadata(ctx, dbPath, snapshotID)` → Export to S3
- `CleanupIncompleteSnapshots(ctx, hostname)` → Remove failed snapshots
### `internal/database`
SQLite database for local index. Single-writer mode for thread safety.
Key types:
- `DB` → Database connection wrapper
- `Repositories` → Collection of all repository interfaces
Repository interfaces:
- `FilesRepository` → CRUD for File records
- `ChunksRepository` → CRUD for Chunk records
- `BlobsRepository` → CRUD for Blob records
- `SnapshotsRepository` → CRUD for Snapshot records
- Plus join table repositories (FileChunks, BlobChunks, etc.)
## Snapshot Creation Flow
```
CreateSnapshot(opts)
├─► CleanupIncompleteSnapshots() // Critical: avoid dedup errors
├─► SnapshotManager.CreateSnapshot() // Create DB record
├─► For each source directory:
│ │
│ ├─► scanner.Scan(ctx, path, snapshotID)
│ │ │
│ │ ├─► Phase 0: detectDeletedFiles()
│ │ │
│ │ ├─► Phase 1: scanPhase()
│ │ │ Walk directory
│ │ │ Check file metadata changes
│ │ │ Build list of files to process
│ │ │
│ │ └─► Phase 2: processPhase()
│ │ For each file:
│ │ chunker.ChunkReaderStreaming()
│ │ For each chunk:
│ │ packer.AddChunk()
│ │ If blob full → FinalizeBlob()
│ │ → handleBlobReady()
│ │ → s3Client.PutObjectWithProgress()
│ │ packer.Flush() // Final blob
│ │
│ └─► Accumulate statistics
├─► SnapshotManager.UpdateSnapshotStatsExtended()
├─► SnapshotManager.CompleteSnapshot()
└─► SnapshotManager.ExportSnapshotMetadata()
├─► Copy database to temp file
├─► Clean to only current snapshot data
├─► Dump to SQL
├─► Compress with zstd
├─► Encrypt with age
├─► Upload db.zst.age to S3
└─► Upload manifest.json.zst to S3
```
## Deduplication Strategy
1. **File-level**: Files unchanged since last backup are skipped (metadata comparison: size, mtime, mode, uid, gid)
2. **Chunk-level**: Chunks are content-addressed by SHA256 hash. If a chunk hash already exists in the database, the chunk data is not re-uploaded.
3. **Blob-level**: Blobs contain only unique chunks. Duplicate chunks within a blob are skipped.
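A sketch of the first two checks, assuming the in-memory maps described elsewhere in this document; field and method names are illustrative:
```go
// File-level: skip files whose metadata is unchanged since the last backup.
func fileUnchanged(prev, cur *database.File) bool {
	return prev.Size == cur.Size && prev.Mtime == cur.Mtime &&
		prev.Mode == cur.Mode && prev.UID == cur.UID && prev.GID == cur.GID
}

// Chunk-level: content-addressed by SHA256; a known hash is never re-packed.
func (s *Scanner) chunkIsNew(hash string) bool {
	_, known := s.knownChunks[hash] // in-memory cache loaded at scan start
	return !known
}
```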
## Storage Layout in S3
```
bucket/
├── blobs/
│ └── {hash[0:2]}/
│ └── {hash[2:4]}/
│ └── {full-hash} # Compressed+encrypted blob
└── metadata/
└── {snapshot-id}/
├── db.zst.age # Encrypted database dump
└── manifest.json.zst # Blob list (for verification)
```
## Thread Safety
- `Packer`: Thread-safe via mutex. Multiple goroutines can call `AddChunk()`.
- `Scanner`: Uses `packerMu` mutex to coordinate blob finalization.
- `Database`: Single-writer mode (`MaxOpenConns=1`) ensures SQLite thread safety.
- `Repositories.WithTx()`: Handles transaction lifecycle automatically.


@@ -10,6 +10,9 @@ Read the rules in AGENTS.md and follow them.
corporate advertising for Anthropic and is therefore completely
unacceptable in commit messages.
* NEVER use `git add -A`. Always add only the files you intentionally
changed.
* Tests should always be run before committing code. No commits should be
made that do not pass tests.
@@ -33,6 +36,9 @@ Read the rules in AGENTS.md and follow them.
* When testing on a 2.5Gbit/s ethernet to an s3 server backed by 2000MB/sec SSD,
estimate about 4 seconds per gigabyte of backup time.
* When running tests, don't run individual tests, or grep the output. run the entire test suite every time and read the full output.
* When running tests, don't run individual tests, or grep the output. run
the entire test suite every time and read the full output.
* When running tests, don't run individual tests, or try to grep the output. never run "go test". only ever run "make test" to run the full test suite, and examine the full output.
* When running tests, don't run individual tests, or try to grep the output.
never run "go test". only ever run "make test" to run the full test
suite, and examine the full output.

DESIGN.md (deleted)

@@ -1,387 +0,0 @@
# vaultik: Design Document
`vaultik` is a secure backup tool written in Go. It performs
streaming backups using content-defined chunking, blob grouping, asymmetric
encryption, and object storage. The system is designed for environments
where the backup source host cannot store secrets and cannot retrieve or
decrypt any data from the destination.
The source host is **stateful**: it maintains a local SQLite index to detect
changes, deduplicate content, and track uploads across backup runs. All
remote storage is encrypted and append-only. Pruning of unreferenced data is
done from a trusted host with access to decryption keys, as even the
metadata indices are encrypted in the blob store.
---
## Why
ANOTHER backup tool??
Other backup tools like `restic`, `borg`, and `duplicity` are designed for
environments where the source host can store secrets and has access to
decryption keys. I don't want to store backup decryption keys on my hosts,
only public keys for encryption.
My requirements are:
* open source
* no passphrases or private keys on the source host
* incremental
* compressed
* encrypted
* s3 compatible without an intermediate step or tool
Surprisingly, no existing tool meets these requirements, so I wrote `vaultik`.
## Design Goals
1. Backups must require only a public key on the source host.
2. No secrets or private keys may exist on the source system.
3. Obviously, restore must be possible using **only** the backup bucket and
a private key.
4. Prune must be possible, although this requires a private key so must be
done on different hosts.
5. All encryption is done using [`age`](https://github.com/FiloSottile/age)
(X25519, XChaCha20-Poly1305).
6. Compression uses `zstd` at a configurable level.
7. Files are chunked, and multiple chunks are packed into encrypted blobs.
This reduces the number of objects in the blob store for filesystems with
many small files.
8. All metadata (snapshots) is stored remotely as encrypted SQLite DBs.
9. If a snapshot metadata file exceeds a configured size threshold, it is
chunked into multiple encrypted `.age` parts, to support large
filesystems.
10. CLI interface is structured using `cobra`.
---
## S3 Bucket Layout
S3 stores only four things:
1) Blobs: encrypted, compressed packs of file chunks.
2) Metadata: encrypted SQLite databases containing the current state of the
filesystem at the time of the snapshot.
3) Metadata hashes: encrypted hashes of the metadata SQLite databases.
4) Blob manifests: unencrypted compressed JSON files listing all blob hashes
referenced in the snapshot, enabling pruning without decryption.
```
s3://<bucket>/<prefix>/
├── blobs/
│ ├── <aa>/<bb>/<full_blob_hash>.zst.age
├── metadata/
│ ├── <snapshot_id>.sqlite.age
│ ├── <snapshot_id>.sqlite.00.age
│ ├── <snapshot_id>.sqlite.01.age
│ ├── <snapshot_id>.manifest.json.zst
```
To retrieve a given file, you would:
* fetch `metadata/<snapshot_id>.sqlite.age` or `metadata/<snapshot_id>.sqlite.{seq}.age`
* fetch `metadata/<snapshot_id>.hash.age`
* decrypt the metadata SQLite database using the private key and reconstruct
the full database file
* verify the hash of the decrypted database matches the decrypted hash
* query the database for the file in question
* determine all chunks for the file
* for each chunk, look up the metadata for all blobs in the db
* fetch each blob from `blobs/<aa>/<bb>/<blob_hash>.zst.age`
* decrypt each blob using the private key
* decompress each blob using `zstd`
* reconstruct the file from the set of file chunks stored in the blobs
If clever, it may be possible to do this chunk by chunk without touching
disk (except for the output file) as each uncompressed blob should fit in
memory (<10GB).
### Path Rules
* `<snapshot_id>`: UTC timestamp in ISO 8601 format, e.g. `2023-10-01T12:00:00Z`. These are lexicographically sortable.
* `blobs/<aa>/<bb>/...`: where `aa` and `bb` are the first 2 hex bytes of the blob hash.
### Blob Manifest Format
The `<snapshot_id>.manifest.json.zst` file is an unencrypted, compressed JSON file containing:
```json
{
"snapshot_id": "2023-10-01T12:00:00Z",
"blob_hashes": [
"aa1234567890abcdef...",
"bb2345678901bcdef0...",
...
]
}
```
This allows pruning operations to determine which blobs are referenced without requiring decryption keys.
---
## 3. Local SQLite Index Schema (source host)
```sql
CREATE TABLE files (
id TEXT PRIMARY KEY, -- UUID
path TEXT NOT NULL UNIQUE,
mtime INTEGER NOT NULL,
size INTEGER NOT NULL
);
-- Maps files to their constituent chunks in sequence order
-- Used for reconstructing files from chunks during restore
CREATE TABLE file_chunks (
file_id TEXT NOT NULL,
idx INTEGER NOT NULL,
chunk_hash TEXT NOT NULL,
PRIMARY KEY (file_id, idx)
);
CREATE TABLE chunks (
chunk_hash TEXT PRIMARY KEY,
sha256 TEXT NOT NULL,
size INTEGER NOT NULL
);
CREATE TABLE blobs (
blob_hash TEXT PRIMARY KEY,
final_hash TEXT NOT NULL,
created_ts INTEGER NOT NULL
);
CREATE TABLE blob_chunks (
blob_hash TEXT NOT NULL,
chunk_hash TEXT NOT NULL,
offset INTEGER NOT NULL,
length INTEGER NOT NULL,
PRIMARY KEY (blob_hash, chunk_hash)
);
-- Reverse mapping: tracks which files contain a given chunk
-- Used for deduplication and tracking chunk usage across files
CREATE TABLE chunk_files (
chunk_hash TEXT NOT NULL,
file_id TEXT NOT NULL,
file_offset INTEGER NOT NULL,
length INTEGER NOT NULL,
PRIMARY KEY (chunk_hash, file_id)
);
CREATE TABLE snapshots (
id TEXT PRIMARY KEY,
hostname TEXT NOT NULL,
vaultik_version TEXT NOT NULL,
vaultik_git_revision TEXT NOT NULL,
created_ts INTEGER NOT NULL,
file_count INTEGER NOT NULL,
chunk_count INTEGER NOT NULL,
blob_count INTEGER NOT NULL
);
```
---
## 4. Snapshot Metadata Schema (stored in S3)
Identical schema to the local index, filtered to live snapshot state. Stored
as a SQLite DB, compressed with `zstd`, encrypted with `age`. If larger than
a configured `chunk_size`, it is split and uploaded as:
```
metadata/<snapshot_id>.sqlite.00.age
metadata/<snapshot_id>.sqlite.01.age
...
```
---
## 5. Data Flow
### 5.1 Backup
1. Load config
2. Open local SQLite index
3. Walk source directories:
* For each file:
* Check mtime and size in index
* If changed or new:
* Chunk file
* For each chunk:
* Hash with SHA256
* Check if already uploaded
* If not:
* Add chunk to blob packer
* Record file-chunk mapping in index
4. When blob reaches threshold size (e.g. 1GB):
* Compress with `zstd`
* Encrypt with `age`
* Upload to: `s3://<bucket>/<prefix>/blobs/<aa>/<bb>/<hash>.zst.age`
* Record blob-chunk layout in local index
5. Once all files are processed:
* Build snapshot SQLite DB from index delta
* Compress + encrypt
* If larger than `chunk_size`, split into parts
* Upload to:
`s3://<bucket>/<prefix>/metadata/<snapshot_id>.sqlite(.xx).age`
6. Create snapshot record in local index that lists:
* snapshot ID
* hostname
* vaultik version
* timestamp
* counts of files, chunks, and blobs
* list of all blobs referenced in the snapshot (some new, some old) for
efficient pruning later
7. Create snapshot database for upload
8. Calculate checksum of snapshot database
9. Compress, encrypt, split, and upload to S3
10. Encrypt the hash of the snapshot database to the backup age key
11. Upload the encrypted hash to S3 as `metadata/<snapshot_id>.hash.age`
12. Create blob manifest JSON listing all blob hashes referenced in snapshot
13. Compress manifest with zstd and upload as `metadata/<snapshot_id>.manifest.json.zst`
14. Optionally prune remote blobs that are no longer referenced in the
snapshot, based on local state db
### 5.2 Manual Prune
1. List all objects under `metadata/`
2. Determine the latest valid `snapshot_id` by timestamp
3. Download and decompress the latest `<snapshot_id>.manifest.json.zst`
4. Extract set of referenced blob hashes from manifest (no decryption needed)
5. List all blob objects under `blobs/`
6. For each blob:
* If the hash is not in the manifest:
* Issue `DeleteObject` to remove it
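A sketch of steps 3-6 under the layout above; the manifest struct matches the format in the previous section, while `deleteKey` and the key parsing are assumptions:
```go
type Manifest struct {
	SnapshotID string   `json:"snapshot_id"`
	BlobHashes []string `json:"blob_hashes"`
}

func prune(m Manifest, blobKeys []string, deleteKey func(string) error) error {
	referenced := make(map[string]bool, len(m.BlobHashes))
	for _, h := range m.BlobHashes {
		referenced[h] = true
	}
	for _, key := range blobKeys {
		// Keys look like blobs/<aa>/<bb>/<full_blob_hash>.zst.age.
		hash := strings.TrimSuffix(path.Base(key), ".zst.age")
		if !referenced[hash] {
			if err := deleteKey(key); err != nil { // issues DeleteObject
				return err
			}
		}
	}
	return nil
}
```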
### 5.3 Verify
Verify runs on a host that has no state, but access to the bucket.
1. Fetch latest metadata snapshot files from S3
2. Fetch latest metadata db hash from S3
3. Decrypt the hash using the private key
4. Decrypt the metadata SQLite database chunks using the private key and
reassemble the snapshot db file
5. Calculate the SHA256 hash of the decrypted snapshot database
6. Verify the db file hash matches the decrypted hash
7. For each blob in the snapshot:
* Fetch the blob metadata from the snapshot db
* Ensure the blob exists in S3
* Check the S3 content hash matches the expected blob hash
* If not using --quick mode:
* Download and decrypt the blob
* Decompress and verify chunk hashes match metadata
---
## 6. CLI Commands
```
vaultik backup [--config <path>] [--cron] [--daemon] [--prune]
vaultik restore --bucket <bucket> --prefix <prefix> --snapshot <id> --target <dir>
vaultik prune --bucket <bucket> --prefix <prefix> [--dry-run]
vaultik verify --bucket <bucket> --prefix <prefix> [--snapshot <id>] [--quick]
vaultik fetch --bucket <bucket> --prefix <prefix> --snapshot <id> --file <path> --target <path>
vaultik snapshot list --bucket <bucket> --prefix <prefix> [--limit <n>]
vaultik snapshot rm --bucket <bucket> --prefix <prefix> --snapshot <id>
vaultik snapshot latest --bucket <bucket> --prefix <prefix>
```
* `VAULTIK_PRIVATE_KEY` is required for `restore`, `prune`, `verify`, and
`fetch` commands.
* It is passed via environment variable containing the age private key.
---
## 7. Function and Method Signatures
### 7.1 CLI
```go
func RootCmd() *cobra.Command
func backupCmd() *cobra.Command
func restoreCmd() *cobra.Command
func pruneCmd() *cobra.Command
func verifyCmd() *cobra.Command
```
### 7.2 Configuration
```go
type Config struct {
BackupPubKey string // age recipient
BackupInterval time.Duration // used in daemon mode, irrelevant for cron mode
BlobSizeLimit int64 // default 10GB
ChunkSize int64 // default 10MB
Exclude []string // list of regex of files to exclude from backup, absolute path
Hostname string
IndexPath string // path to local SQLite index db, default /var/lib/vaultik/index.db
MetadataPrefix string // S3 prefix for metadata, default "metadata/"
MinTimeBetweenRun time.Duration // minimum time between backup runs, default 1 hour - for daemon mode
S3 S3Config // S3 configuration
ScanInterval time.Duration // interval to full stat() scan source dirs, default 24h
SourceDirs []string // list of source directories to back up, absolute paths
}
type S3Config struct {
Endpoint string
Bucket string
Prefix string
AccessKeyID string
SecretAccessKey string
Region string
}
func Load(path string) (*Config, error)
```
### 7.3 Index
```go
type Index struct {
db *sql.DB
}
func OpenIndex(path string) (*Index, error)
func (ix *Index) LookupFile(path string, mtime int64, size int64) ([]string, bool, error)
func (ix *Index) SaveFile(path string, mtime int64, size int64, chunkHashes []string) error
func (ix *Index) AddChunk(chunkHash string, size int64) error
func (ix *Index) MarkBlob(blobHash, finalHash string, created time.Time) error
func (ix *Index) MapChunkToBlob(blobHash, chunkHash string, offset, length int64) error
func (ix *Index) MapChunkToFile(chunkHash, filePath string, offset, length int64) error
```
### 7.4 Blob Packing
```go
type BlobWriter struct {
// internal buffer, current size, encrypted writer, etc
}
func NewBlobWriter(...) *BlobWriter
func (bw *BlobWriter) AddChunk(chunk []byte, chunkHash string) error
func (bw *BlobWriter) Flush() (finalBlobHash string, err error)
```
### 7.5 Metadata
```go
func BuildSnapshotMetadata(ix *Index, snapshotID string) (sqlitePath string, err error)
func EncryptAndUploadMetadata(path string, cfg *Config, snapshotID string) error
```
### 7.6 Prune
```go
func RunPrune(bucket, prefix, privateKey string) error
```

Makefile

@@ -11,7 +11,7 @@ LDFLAGS := -X 'git.eeqj.de/sneak/vaultik/internal/globals.Version=$(VERSION)' \
-X 'git.eeqj.de/sneak/vaultik/internal/globals.Commit=$(GIT_REVISION)'
# Default target
all: test
all: vaultik
# Run tests
test: lint fmt-check
@@ -39,8 +39,8 @@ lint:
golangci-lint run
# Build binary
build:
go build -ldflags "$(LDFLAGS)" -o vaultik ./cmd/vaultik
vaultik: internal/*/*.go cmd/vaultik/*.go
go build -ldflags "$(LDFLAGS)" -o $@ ./cmd/vaultik
# Clean build artifacts
clean:
@@ -60,3 +60,10 @@ test-coverage:
# Run integration tests
test-integration:
go test -v -tags=integration ./...
local:
VAULTIK_CONFIG=$(HOME)/etc/vaultik/config.yml ./vaultik snapshot --debug list 2>&1
VAULTIK_CONFIG=$(HOME)/etc/vaultik/config.yml ./vaultik snapshot --debug create 2>&1
install: vaultik
cp ./vaultik $(HOME)/bin/

PROCESS.md (new file)

@@ -0,0 +1,556 @@
# Vaultik Snapshot Creation Process
This document describes the lifecycle of objects during snapshot creation, with a focus on database transactions and foreign key constraints.
## Database Schema Overview
### Tables and Foreign Key Dependencies
```
FOREIGN KEY GRAPH

snapshot_files ──► snapshots, files
snapshot_blobs ──► snapshots, blobs
file_chunks    ──► files, chunks
chunk_files    ──► chunks, files
blob_chunks    ──► blobs, chunks
uploads        ──► blobs.blob_hash, snapshots.id
```
### Critical Constraint: `chunks` Must Exist First
These tables reference `chunks.chunk_hash` **without CASCADE**:
- `file_chunks.chunk_hash` → `chunks.chunk_hash`
- `chunk_files.chunk_hash` → `chunks.chunk_hash`
- `blob_chunks.chunk_hash` → `chunks.chunk_hash`
**Implication**: A chunk record MUST be committed to the database BEFORE any of these referencing records can be created.
### Order of Operations Required by Schema
```
1. snapshots (created first, before scan)
2. blobs (created when packer starts new blob)
3. chunks (created during file processing)
4. blob_chunks (created immediately after chunk added to packer)
5. files (created after file fully chunked)
6. file_chunks (created with file record)
7. chunk_files (created with file record)
8. snapshot_files (created with file record)
9. snapshot_blobs (created after blob uploaded)
10. uploads (created after blob uploaded)
```
---
## Snapshot Creation Phases
### Phase 0: Initialization
**Actions:**
1. Snapshot record created in database (Transaction T0)
2. Known files loaded into memory from `files` table
3. Known chunks loaded into memory from `chunks` table
**Transactions:**
```
T0: INSERT INTO snapshots (id, hostname, ...) VALUES (...)
COMMIT
```
---
### Phase 1: Scan Directory
**Actions:**
1. Walk filesystem directory tree
2. For each file, compare against in-memory `knownFiles` map
3. Classify files as: unchanged, new, or modified
4. Collect unchanged file IDs for later association
5. Collect new/modified files for processing
**Transactions:**
```
(None during scan - all in-memory)
```
---
### Phase 1b: Associate Unchanged Files
**Actions:**
1. For unchanged files, add entries to `snapshot_files` table
2. Done in batches of 1000
**Transactions:**
```
For each batch of 1000 file IDs:
T: BEGIN
INSERT INTO snapshot_files (snapshot_id, file_id) VALUES (?, ?)
... (up to 1000 inserts)
COMMIT
```
---
### Phase 2: Process Files
For each file that needs processing:
#### Step 2a: Open and Chunk File
**Location:** `processFileStreaming()`
For each chunk produced by content-defined chunking:
##### Step 2a-1: Check Chunk Existence
```go
chunkExists := s.chunkExists(chunk.Hash) // In-memory lookup
```
##### Step 2a-2: Create Chunk Record (if new)
```go
// TRANSACTION: Create chunk in database
err := s.repos.WithTx(ctx, func(txCtx context.Context, tx *sql.Tx) error {
	dbChunk := &database.Chunk{ChunkHash: chunk.Hash, Size: chunk.Size}
	return s.repos.Chunks.Create(txCtx, tx, dbChunk)
})
// COMMIT immediately after WithTx returns
// Update in-memory cache
s.addKnownChunk(chunk.Hash)
```
**Transaction:**
```
T_chunk: BEGIN
INSERT INTO chunks (chunk_hash, size) VALUES (?, ?)
COMMIT
```
##### Step 2a-3: Add Chunk to Packer
```go
s.packer.AddChunk(&blob.ChunkRef{Hash: chunk.Hash, Data: chunk.Data})
```
**Inside packer.AddChunk → addChunkToCurrentBlob():**
```go
// TRANSACTION: Create blob_chunks record IMMEDIATELY
if p.repos != nil {
	blobChunk := &database.BlobChunk{
		BlobID:    p.currentBlob.id,
		ChunkHash: chunk.Hash,
		Offset:    offset,
		Length:    chunkSize,
	}
	err := p.repos.WithTx(context.Background(), func(ctx context.Context, tx *sql.Tx) error {
		return p.repos.BlobChunks.Create(ctx, tx, blobChunk)
	})
	// COMMIT immediately
}
```
**Transaction:**
```
T_blob_chunk: BEGIN
INSERT INTO blob_chunks (blob_id, chunk_hash, offset, length) VALUES (?, ?, ?, ?)
COMMIT
```
**⚠️ CRITICAL DEPENDENCY**: This transaction requires `chunks.chunk_hash` to exist (FK constraint).
The chunk MUST be committed in Step 2a-2 BEFORE this can succeed.
---
#### Step 2b: Blob Size Limit Handling
If adding a chunk would exceed blob size limit:
```go
if err == blob.ErrBlobSizeLimitExceeded {
	if err := s.packer.FinalizeBlob(); err != nil { ... }
	// Retry adding the chunk
	if err := s.packer.AddChunk(...); err != nil { ... }
}
```
**FinalizeBlob() transactions:**
```
T_blob_finish: BEGIN
UPDATE blobs SET blob_hash=?, uncompressed_size=?, compressed_size=?, finished_ts=? WHERE id=?
COMMIT
```
Then blob handler is called (handleBlobReady):
```
(Upload to S3 - no transaction)
T_blob_uploaded: BEGIN
UPDATE blobs SET uploaded_ts=? WHERE id=?
INSERT INTO snapshot_blobs (snapshot_id, blob_id, blob_hash) VALUES (?, ?, ?)
INSERT INTO uploads (blob_hash, snapshot_id, uploaded_at, size, duration_ms) VALUES (?, ?, ?, ?, ?)
COMMIT
```
---
#### Step 2c: Queue File for Batch Insertion
After all chunks for a file are processed:
```go
// Build file data (in-memory, no DB)
fileChunks := make([]database.FileChunk, len(chunks))
chunkFiles := make([]database.ChunkFile, len(chunks))
// Queue for batch insertion
return s.addPendingFile(ctx, pendingFileData{
	file:       fileToProcess.File,
	fileChunks: fileChunks,
	chunkFiles: chunkFiles,
})
```
**No transaction yet** - just adds to `pendingFiles` slice.
If `len(pendingFiles) >= fileBatchSize (100)`, triggers `flushPendingFiles()`.
---
### Step 2d: Flush Pending Files
**Location:** `flushPendingFiles()` - called when batch is full or at end of processing
```go
return s.repos.WithTx(ctx, func(txCtx context.Context, tx *sql.Tx) error {
	for _, data := range files {
		// 1. Create file record
		s.repos.Files.Create(txCtx, tx, data.file) // INSERT OR REPLACE

		// 2. Delete old associations
		s.repos.FileChunks.DeleteByFileID(txCtx, tx, data.file.ID)
		s.repos.ChunkFiles.DeleteByFileID(txCtx, tx, data.file.ID)

		// 3. Create file_chunks records
		for _, fc := range data.fileChunks {
			s.repos.FileChunks.Create(txCtx, tx, &fc) // FK: chunks.chunk_hash
		}

		// 4. Create chunk_files records
		for _, cf := range data.chunkFiles {
			s.repos.ChunkFiles.Create(txCtx, tx, &cf) // FK: chunks.chunk_hash
		}

		// 5. Add file to snapshot
		s.repos.Snapshots.AddFileByID(txCtx, tx, s.snapshotID, data.file.ID)
	}
	return nil
})
// COMMIT (all or nothing for the batch)
```
**Transaction:**
```
T_files_batch: BEGIN
-- For each file in batch:
INSERT OR REPLACE INTO files (...) VALUES (...)
DELETE FROM file_chunks WHERE file_id = ?
DELETE FROM chunk_files WHERE file_id = ?
INSERT INTO file_chunks (file_id, idx, chunk_hash) VALUES (?, ?, ?) -- FK: chunks
INSERT INTO chunk_files (chunk_hash, file_id, ...) VALUES (?, ?, ...) -- FK: chunks
INSERT INTO snapshot_files (snapshot_id, file_id) VALUES (?, ?)
-- Repeat for each file
COMMIT
```
**⚠️ CRITICAL DEPENDENCY**: `file_chunks` and `chunk_files` require `chunks.chunk_hash` to exist.
---
### Phase 2 End: Final Flush
```go
// Flush any remaining pending files
if err := s.flushAllPending(ctx); err != nil { ... }
// Final packer flush
s.packer.Flush()
```
---
## The Current Bug
### Problem
The current code attempts to batch file insertions, but `file_chunks` and `chunk_files` have foreign keys to `chunks.chunk_hash`. The batched file flush tries to insert these records, but if the chunks haven't been committed yet, the FK constraint fails.
### Why It's Happening
Looking at the sequence:
1. Process file A, chunk X
2. Create chunk X in DB (Transaction commits)
3. Add chunk X to packer
4. Packer creates blob_chunks for chunk X (needs chunk X - OK, committed in step 2)
5. Queue file A with chunk references
6. Process file B, chunk Y
7. Create chunk Y in DB (Transaction commits)
8. ... etc ...
9. At end: flushPendingFiles()
10. Insert file_chunks for file A referencing chunk X (chunk X committed - should work)
The chunks ARE being created individually, yet the constraint still fails.
### Actual Issue
Re-reading the code, the issue is:
In `processFileStreaming`, when we queue file data:
```go
fileChunks[i] = database.FileChunk{
	FileID:    fileToProcess.File.ID,
	Idx:       ci.fileChunk.Idx,
	ChunkHash: ci.fileChunk.ChunkHash,
}
```
The `FileID` is set, but `fileToProcess.File.ID` might be empty at this point because the file record hasn't been created yet!
Looking at `checkFileInMemory`:
```go
// For new files:
if !exists {
	return file, true // file.ID is empty string!
}

// For existing files:
file.ID = existingFile.ID // Reuse existing ID
```
**For NEW files, `file.ID` is empty!**
Then in `flushPendingFiles`:
```go
s.repos.Files.Create(txCtx, tx, data.file) // This generates/uses the ID
```
But `data.fileChunks` was built with the EMPTY ID!
### The Real Problem
For new files:
1. `checkFileInMemory` creates file record with empty ID
2. `processFileStreaming` queues file_chunks with empty `FileID`
3. `flushPendingFiles` creates file (generates ID), but file_chunks still have empty `FileID`
`Files.Create` does an `INSERT OR REPLACE` by path, but the ordering is the problem: the file IS created first in the flush, yet the `fileChunks` slice was already built with the old (possibly empty) ID, and the ID is not updated after the file is created:
```go
fileChunks[i] = database.FileChunk{
	FileID: fileToProcess.File.ID, // Uses the ID from the File struct
```
In `checkFileInMemory`, new files get a file struct with no ID set, and nothing pre-generates a UUID before the `fileChunks` and `chunkFiles` slices are built.
The test failures point at a foreign key violation:
```
creating file chunk: inserting file_chunk: constraint failed: FOREIGN KEY constraint failed (787)
```
Error 787 is SQLite's foreign key constraint error. The failing FK is on `file_chunks.chunk_hash → chunks.chunk_hash`, which would mean the chunks are not in the database when the file_chunks insert runs. Tracing through more carefully:
---
## Transaction Timing Issue
The problem is transaction visibility in SQLite.
Each `WithTx` creates a new transaction that commits at the end. But with batched file insertion:
1. Chunk transactions commit one at a time
2. File batch transaction runs later
If something went wrong with transaction isolation, the file batch might not see the committed chunks. However, SQLite in WAL mode provides serializable isolation by default, so committed transactions should be visible. Another possibility is that the in-memory chunk cache is masking a database problem; re-checking the current broken code suggests the issue is simpler.
---
## Current Code Flow Analysis
Looking at `processFileStreaming` in the current broken state:
```go
// For each chunk:
if !chunkExists {
	err := s.repos.WithTx(ctx, func(txCtx context.Context, tx *sql.Tx) error {
		dbChunk := &database.Chunk{ChunkHash: chunk.Hash, Size: chunk.Size}
		return s.repos.Chunks.Create(txCtx, tx, dbChunk)
	})
	// ... check error ...
	s.addKnownChunk(chunk.Hash)
}

// ... add to packer (creates blob_chunks) ...

// Collect chunk info for file
chunks = append(chunks, chunkInfo{...})
```
Then at end of function:
```go
// Queue file for batch insertion
return s.addPendingFile(ctx, pendingFileData{
	file:       fileToProcess.File,
	fileChunks: fileChunks,
	chunkFiles: chunkFiles,
})
```
At end of `processPhase`:
```go
if err := s.flushAllPending(ctx); err != nil { ... }
```
The chunks are being created one-by-one with individual transactions. By the time `flushPendingFiles` runs, all chunk transactions should have committed.
The remaining possibilities are a bug in how the chunks are referenced (incorrect chunk_hash values), the test database being recreated between operations, or something specific to the test environment setup.
---
## Summary of Object Lifecycle
| Object | When Created | Transaction | Dependencies |
|--------|--------------|-------------|--------------|
| snapshot | Before scan | Individual tx | None |
| blob | When packer needs new blob | Individual tx | None |
| chunk | During file chunking (each chunk) | Individual tx | None |
| blob_chunks | Immediately after adding chunk to packer | Individual tx | chunks, blobs |
| files | Batched at end of processing | Batch tx | None |
| file_chunks | With file (batched) | Batch tx | files, chunks |
| chunk_files | With file (batched) | Batch tx | files, chunks |
| snapshot_files | With file (batched) | Batch tx | snapshots, files |
| snapshot_blobs | After blob upload | Individual tx | snapshots, blobs |
| uploads | After blob upload | Same tx as snapshot_blobs | blobs, snapshots |
---
## Root Cause Analysis
After detailed analysis, I believe the issue is one of the following:
### Hypothesis 1: File ID Not Set
Looking at `checkFileInMemory()` for NEW files:
```go
if !exists {
	return file, true // file.ID is empty string!
}
```
For new files, `file.ID` is empty. Then in `processFileStreaming`:
```go
fileChunks[i] = database.FileChunk{
	FileID: fileToProcess.File.ID, // Empty for new files!
	...
}
```
The `FileID` in the built `fileChunks` slice is empty.
Then in `flushPendingFiles`:
```go
s.repos.Files.Create(txCtx, tx, data.file) // This generates the ID

// But data.fileChunks still has empty FileID!
for i := range data.fileChunks {
	s.repos.FileChunks.Create(...) // Uses empty FileID
}
```
**Solution**: Generate file IDs upfront in `checkFileInMemory()`:
```go
file := &database.File{
	ID:   uuid.New().String(), // Generate ID immediately
	Path: path,
	...
}
```
### Hypothesis 2: Transaction Isolation
SQLite with a single connection pool (`MaxOpenConns(1)`) should serialize all transactions. Committed data should be visible to subsequent transactions.
However, there might be a subtle issue with how `context.Background()` is used in the packer vs the scanner's context.
## Recommended Fix
**Step 1: Generate file IDs upfront**
In `checkFileInMemory()`, generate the UUID for new files immediately:
```go
file := &database.File{
	ID:   uuid.New().String(), // Always generate ID
	Path: path,
	...
}
```
This ensures `file.ID` is set when building `fileChunks` and `chunkFiles` slices.
**Step 2: Verify by reverting to per-file transactions**
If Step 1 doesn't fix it, revert to non-batched file insertion to isolate the issue:
```go
// Instead of queuing:
// return s.addPendingFile(ctx, pendingFileData{...})
// Do immediate insertion:
return s.repos.WithTx(ctx, func(txCtx context.Context, tx *sql.Tx) error {
	// Create file
	s.repos.Files.Create(txCtx, tx, fileToProcess.File)

	// Delete old associations
	s.repos.FileChunks.DeleteByFileID(...)
	s.repos.ChunkFiles.DeleteByFileID(...)

	// Create new associations
	for _, fc := range fileChunks {
		s.repos.FileChunks.Create(...)
	}
	for _, cf := range chunkFiles {
		s.repos.ChunkFiles.Create(...)
	}

	// Add to snapshot
	s.repos.Snapshots.AddFileByID(...)
	return nil
})
```
**Step 3: If batching is still desired**
After confirming per-file transactions work, re-implement batching with the ID fix in place, and add debug logging to trace exactly which chunk_hash is failing and why.

README.md

@@ -1,39 +1,27 @@
# vaultik (ваултик)
`vaultik` is a incremental backup daemon written in Go. It
encrypts data using an `age` public key and uploads each encrypted blob
directly to a remote S3-compatible object store. It requires no private
keys, secrets, or credentials stored on the backed-up system.
WIP: pre-1.0, some functions may not be fully implemented yet
`vaultik` is an incremental backup daemon written in Go. It encrypts data
using an `age` public key and uploads each encrypted blob directly to a
remote S3-compatible object store. It requires no private keys, secrets, or
credentials (other than those required to PUT to encrypted object storage,
such as S3 API keys) stored on the backed-up system.
It includes table-stakes features such as:
* modern authenticated encryption
* modern encryption (the excellent `age`)
* deduplication
* incremental backups
* modern multithreaded zstd compression with configurable levels
* content-addressed immutable storage
* local state tracking in standard SQLite database
* inotify-based change detection
* streaming processing of all data to not require lots of ram or temp file
storage
* local state tracking in standard SQLite database, enables write-only
incremental backups to destination
* no mutable remote metadata
* no plaintext file paths or metadata stored in remote
* does not create huge numbers of small files (to keep S3 operation counts
down) even if the source system has many small files
## what
`vaultik` walks a set of configured directories and builds a
content-addressable chunk map of changed files using deterministic chunking.
Each chunk is streamed into a blob packer. Blobs are compressed with `zstd`,
encrypted with `age`, and uploaded directly to remote storage under a
content-addressed S3 path.
No plaintext file contents ever hit disk. No private key or secret
passphrase is needed or stored locally. All encrypted data is
streaming-processed and immediately discarded once uploaded. Metadata is
encrypted and pushed with the same mechanism.
## why
Existing backup software fails under one or more of these conditions:
@@ -42,16 +30,48 @@ Existing backup software fails under one or more of these conditions:
compromises encrypted backups in the case of host system compromise
* Depends on symmetric encryption unsuitable for zero-trust environments
* Creates one-blob-per-file, which results in excessive S3 operation counts
* is slow
Other backup tools like `restic`, `borg`, and `duplicity` are designed for
environments where the source host can store secrets and has access to
decryption keys. I don't want to store backup decryption keys on my hosts,
only public keys for encryption.

`vaultik` addresses these by using:

* Public-key-only encryption (via `age`) requires no secrets (other than
  remote storage api key) on the source system
* Local state cache for incremental detection does not require reading from
  or decrypting remote storage
* Content-addressed immutable storage allows efficient deduplication
* Storage only of large encrypted blobs of configurable size (1G by default)
  reduces S3 operation counts and improves performance

My requirements are:

* open source
* no passphrases or private keys on the source host
* incremental
* compressed
* encrypted
* s3 compatible without an intermediate step or tool

Surprisingly, no existing tool meets these requirements, so I wrote `vaultik`.
## design goals
1. Backups must require only a public key on the source host.
1. No secrets or private keys may exist on the source system.
1. Restore must be possible using **only** the backup bucket and a private key.
1. Prune must be possible (requires private key, done on different hosts).
1. All encryption uses [`age`](https://age-encryption.org/) (X25519, XChaCha20-Poly1305).
1. Compression uses `zstd` at a configurable level.
1. Files are chunked, and multiple chunks are packed into encrypted blobs
to reduce object count for filesystems with many small files.
1. All metadata (snapshots) is stored remotely as encrypted SQLite DBs.
## what
`vaultik` walks a set of configured directories and builds a
content-addressable chunk map of changed files using deterministic chunking.
Each chunk is streamed into a blob packer. Blobs are compressed with `zstd`,
encrypted with `age`, and uploaded directly to remote storage under a
content-addressed S3 path. At the end, a pruned snapshot-specific sqlite
database of metadata is created, encrypted, and uploaded alongside the
blobs.
No plaintext file contents ever hit disk. No private key or secret
passphrase is needed or stored locally.
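To make that write path concrete, here is a minimal standalone sketch of compress-then-encrypt using the same libraries vaultik depends on (`filippo.io/age` and `github.com/klauspost/compress/zstd`); the real packer wraps this pattern, plus hashing, in its `blobgen.Writer`:

```go
import (
	"io"

	"filippo.io/age"
	"github.com/klauspost/compress/zstd"
)

// encryptStream writes zstd-compressed, age-encrypted src to dst.
// Compression sits inside the pipeline before encryption, so the
// compressor sees plaintext and the remote sees only ciphertext.
func encryptStream(dst io.Writer, src io.Reader, recipientStr string) error {
	recipient, err := age.ParseX25519Recipient(recipientStr)
	if err != nil {
		return err
	}
	encw, err := age.Encrypt(dst, recipient) // outermost: encryption
	if err != nil {
		return err
	}
	zw, err := zstd.NewWriter(encw) // inner: compression
	if err != nil {
		return err
	}
	if _, err := io.Copy(zw, src); err != nil {
		return err
	}
	if err := zw.Close(); err != nil { // flush the compressor first
		return err
	}
	return encw.Close() // then finalize the age stream
}
```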
## how
@@ -61,59 +81,63 @@ Existing backup software fails under one or more of these conditions:
```sh
go install git.eeqj.de/sneak/vaultik@latest
```
1. **generate keypair**
```sh
age-keygen -o agekey.txt
grep 'public key:' agekey.txt
```
1. **write config**
```yaml
# Named snapshots - each snapshot can contain multiple paths
snapshots:
  system:
    paths:
      - /etc
      - /home/user/data
      - /var/lib
    exclude:
      - '*.cache'   # Snapshot-specific exclusions
  home:
    paths:
      - /home/user/documents
      - /home/user/photos

# Global exclusions (apply to all snapshots)
exclude:
  - '*.log'
  - '*.tmp'
  - '.git'
  - 'node_modules'

age_recipients:
  - age1278m9q7dp3chsh2dcy82qk27v047zywyvtxwnj4cvt0z65jw6a7q5dqhfj

s3:
  # endpoint is optional if using AWS S3, but who even does that?
  endpoint: https://s3.example.com
  bucket: vaultik-data
  prefix: host1/
  access_key_id: ...
  secret_access_key: ...
  region: us-east-1

backup_interval: 1h
full_scan_interval: 24h
min_time_between_run: 15m
chunk_size: 10MB
blob_size_limit: 1GB
```
1. **run**
```sh
# Create all configured snapshots
vaultik --config /etc/vaultik.yaml snapshot create

# Create specific snapshots by name
vaultik --config /etc/vaultik.yaml snapshot create home system

# Silent mode for cron
vaultik --config /etc/vaultik.yaml snapshot create --cron
```
---
@@ -123,76 +147,211 @@ Existing backup software fails under one or more of these conditions:
### commands
```sh
vaultik [--config <path>] snapshot create [snapshot-names...] [--cron] [--daemon] [--prune]
vaultik [--config <path>] snapshot list [--json]
vaultik [--config <path>] snapshot verify <snapshot-id> [--deep]
vaultik [--config <path>] snapshot purge [--keep-latest | --older-than <duration>] [--force]
vaultik [--config <path>] snapshot remove <snapshot-id> [--dry-run] [--force]
vaultik [--config <path>] snapshot prune
vaultik [--config <path>] restore <snapshot-id> <target-dir> [paths...]
vaultik [--config <path>] prune [--dry-run] [--force]
vaultik [--config <path>] info
vaultik [--config <path>] store info
# FIXME: remove 'bucket' and 'prefix' and 'snapshot' flags. it should be
# 'vaultik restore snapshot <snapshot> --target <dir>'. bucket and prefix are always
# from config file.
vaultik restore --bucket <bucket> --prefix <prefix> --snapshot <id> --target <dir>
# FIXME: remove prune, it's the old version of "snapshot purge"
vaultik prune --bucket <bucket> --prefix <prefix> [--dry-run]
# FIXME: change fetch to 'vaultik restore path <snapshot> <path> --target <path>'
vaultik fetch --bucket <bucket> --prefix <prefix> --snapshot <id> --file <path> --target <path>
# FIXME: remove this, it's redundant with 'snapshot verify'
vaultik verify --bucket <bucket> --prefix <prefix> [--snapshot <id>] [--quick]
```
### environment
* `VAULTIK_AGE_SECRET_KEY`: Required for `restore` and deep `verify`. Contains the age private key for decryption.
* `VAULTIK_CONFIG`: Optional path to config file.
### command details
**snapshot create**: Perform incremental backup of configured snapshots
* Config is located at `/etc/vaultik/config.yml` by default
* Optional snapshot names argument to create specific snapshots (default: all)
* `--cron`: Silent unless error (for crontab)
* `--daemon`: Run continuously with inotify monitoring and periodic scans
* `--prune`: Delete old snapshots and orphaned blobs after backup
**snapshot list**: List all snapshots with their timestamps and sizes
* `--json`: Output in JSON format
**snapshot purge**: Remove old snapshots based on criteria
* `--keep-latest`: Keep only the most recent snapshot
* `--older-than`: Remove snapshots older than duration (e.g., 30d, 6mo, 1y)
* `--force`: Skip confirmation prompt
**snapshot verify**: Verify snapshot integrity
* `--deep`: Download and verify blob hashes (not just existence)
**snapshot remove**: Remove a specific snapshot
* `--dry-run`: Show what would be deleted without deleting
* `--force`: Skip confirmation prompt
**snapshot prune**: Clean orphaned data from local database
**restore**: Restore snapshot to target directory
* Requires `VAULTIK_AGE_SECRET_KEY` environment variable with age private key
* Optional path arguments to restore specific files/directories (default: all)
* Downloads and decrypts metadata, fetches required blobs, reconstructs files
* Preserves file permissions, timestamps, and ownership (ownership requires root)
* Handles symlinks and directories
**prune**: Remove unreferenced blobs from remote storage
* Scans all snapshots for referenced blobs
* Deletes orphaned blobs
**fetch**: Extract single file from backup
* Retrieves specific file without full restore
* Supports extracting to different filename
**info**: Display system and configuration information
**verify**: Validate backup integrity
* Checks metadata hash
* Verifies all referenced blobs exist
* Default: Downloads blobs and validates chunk integrity
* `--quick`: Only checks blob existence and S3 content hashes
**store info**: Display S3 bucket configuration and storage statistics
---
## architecture
### s3 bucket layout
```
s3://<bucket>/<prefix>/
├── blobs/
│ └── <aa>/<bb>/<full_blob_hash>
└── metadata/
├── <snapshot_id>/
│ ├── db.zst.age
│ └── manifest.json.zst
```
* `blobs/<aa>/<bb>/...`: Two-level directory sharding using first 4 hex chars of blob hash (see the sketch below)
* `metadata/<snapshot_id>/db.zst.age`: Encrypted, compressed SQLite database
* `metadata/<snapshot_id>/manifest.json.zst`: Unencrypted blob list for pruning
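The sharded key is derived directly from the blob hash; a trivial illustrative sketch (the function name is ours, not vaultik's):

```go
// blobPath returns the sharded object key for a blob hash,
// e.g. "host1/blobs/aa/bb/aabb...".
func blobPath(prefix, blobHash string) string {
	return fmt.Sprintf("%sblobs/%s/%s/%s",
		prefix, blobHash[0:2], blobHash[2:4], blobHash)
}
```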
### blob manifest format
The `manifest.json.zst` file is unencrypted (compressed JSON) to enable pruning without decryption:
```json
{
"snapshot_id": "hostname_snapshotname_2025-01-01T12:00:00Z",
"blob_hashes": [
"aa1234567890abcdef...",
"bb2345678901bcdef0..."
]
}
```
Snapshot IDs follow the format `<hostname>_<snapshot-name>_<timestamp>` (e.g., `server1_home_2025-01-01T12:00:00Z`).
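Because the manifest is plain zstd-compressed JSON, tooling can enumerate a snapshot's blobs without any age key. A minimal sketch (names are illustrative, not vaultik's internal API):

```go
import (
	"encoding/json"
	"io"

	"github.com/klauspost/compress/zstd"
)

// blobManifest mirrors the JSON structure shown above.
type blobManifest struct {
	SnapshotID string   `json:"snapshot_id"`
	BlobHashes []string `json:"blob_hashes"`
}

// readManifest decompresses and decodes a manifest.json.zst stream.
func readManifest(r io.Reader) (*blobManifest, error) {
	zr, err := zstd.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var m blobManifest
	if err := json.NewDecoder(zr).Decode(&m); err != nil {
		return nil, err
	}
	return &m, nil
}
```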
### local sqlite schema
```sql
CREATE TABLE files (
id TEXT PRIMARY KEY,
path TEXT NOT NULL UNIQUE,
mtime INTEGER NOT NULL,
size INTEGER NOT NULL,
mode INTEGER NOT NULL,
uid INTEGER NOT NULL,
gid INTEGER NOT NULL
);
CREATE TABLE file_chunks (
file_id TEXT NOT NULL,
idx INTEGER NOT NULL,
chunk_hash TEXT NOT NULL,
PRIMARY KEY (file_id, idx),
FOREIGN KEY (file_id) REFERENCES files(id) ON DELETE CASCADE
);
CREATE TABLE chunks (
chunk_hash TEXT PRIMARY KEY,
size INTEGER NOT NULL
);
CREATE TABLE blobs (
id TEXT PRIMARY KEY,
blob_hash TEXT NOT NULL UNIQUE,
uncompressed INTEGER NOT NULL,
compressed INTEGER NOT NULL,
uploaded_at INTEGER
);
CREATE TABLE blob_chunks (
blob_hash TEXT NOT NULL,
chunk_hash TEXT NOT NULL,
offset INTEGER NOT NULL,
length INTEGER NOT NULL,
PRIMARY KEY (blob_hash, chunk_hash)
);
CREATE TABLE chunk_files (
chunk_hash TEXT NOT NULL,
file_id TEXT NOT NULL,
file_offset INTEGER NOT NULL,
length INTEGER NOT NULL,
PRIMARY KEY (chunk_hash, file_id)
);
CREATE TABLE snapshots (
id TEXT PRIMARY KEY,
hostname TEXT NOT NULL,
vaultik_version TEXT NOT NULL,
started_at INTEGER NOT NULL,
completed_at INTEGER,
file_count INTEGER NOT NULL,
chunk_count INTEGER NOT NULL,
blob_count INTEGER NOT NULL,
total_size INTEGER NOT NULL,
blob_size INTEGER NOT NULL,
compression_ratio REAL NOT NULL
);
CREATE TABLE snapshot_files (
snapshot_id TEXT NOT NULL,
file_id TEXT NOT NULL,
PRIMARY KEY (snapshot_id, file_id)
);
CREATE TABLE snapshot_blobs (
snapshot_id TEXT NOT NULL,
blob_id TEXT NOT NULL,
blob_hash TEXT NOT NULL,
PRIMARY KEY (snapshot_id, blob_id)
);
```
### data flow
#### backup
1. Load config, open local SQLite index
1. Walk source directories, check mtime/size against index
1. For changed/new files: chunk using content-defined chunking
1. For each chunk: hash, check if already uploaded, add to blob packer
1. When blob reaches threshold: compress, encrypt, upload to S3
1. Build snapshot metadata, compress, encrypt, upload
1. Create blob manifest (unencrypted) for pruning support
#### restore
1. Download `metadata/<snapshot_id>/db.zst.age`
1. Decrypt and decompress SQLite database
1. Query files table (optionally filtered by paths)
1. For each file, get ordered chunk list from file_chunks (see the sketch after this section)
1. Download required blobs, decrypt, decompress
1. Extract chunks and reconstruct files
1. Restore permissions, mtime, uid/gid
#### prune
1. List all snapshot manifests
1. Build set of all referenced blob hashes
1. List all blobs in storage
1. Delete any blob not in referenced set
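Step 4 of the restore flow is a plain join over the schema above. A sketch using `database/sql` (type and function names here are illustrative, not vaultik's internal API):

```go
import (
	"context"
	"database/sql"
)

// chunkLocation is where one chunk of a file lives inside a blob.
type chunkLocation struct {
	Idx       int
	ChunkHash string
	BlobHash  string
	Offset    int64
	Length    int64
}

// fileChunkPlan returns the ordered chunk list for one file, joined to
// the blob each chunk was packed into.
func fileChunkPlan(ctx context.Context, db *sql.DB, fileID string) ([]chunkLocation, error) {
	rows, err := db.QueryContext(ctx, `
		SELECT fc.idx, fc.chunk_hash, bc.blob_hash, bc.offset, bc.length
		FROM file_chunks fc
		JOIN blob_chunks bc ON bc.chunk_hash = fc.chunk_hash
		WHERE fc.file_id = ?
		ORDER BY fc.idx`, fileID)
	if err != nil {
		return nil, err
	}
	defer func() { _ = rows.Close() }()
	var plan []chunkLocation
	for rows.Next() {
		var c chunkLocation
		if err := rows.Scan(&c.Idx, &c.ChunkHash, &c.BlobHash, &c.Offset, &c.Length); err != nil {
			return nil, err
		}
		plan = append(plan, c)
	}
	return plan, rows.Err()
}
```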
### chunking
* Content-defined chunking using the FastCDC algorithm (see the sketch below)
* Average chunk size: configurable (default 10MB)
* Deduplication at chunk level
* Multiple chunks packed into blobs for efficiency
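A minimal sketch of the chunking loop, using the `fastcdc-go` library this project depends on (option values here are illustrative, not vaultik's exact configuration):

```go
import (
	"crypto/sha256"
	"fmt"
	"io"

	fastcdc "github.com/jotfs/fastcdc-go"
)

// chunkStream splits r into content-defined chunks and hashes each one.
func chunkStream(r io.Reader) error {
	chunker, err := fastcdc.NewChunker(r, fastcdc.Options{
		AverageSize: 10 * 1024 * 1024, // 10MB average, per the default above
	})
	if err != nil {
		return err
	}
	for {
		chunk, err := chunker.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
		sum := sha256.Sum256(chunk.Data)
		fmt.Printf("chunk %x: %d bytes\n", sum[:8], chunk.Length)
	}
	return nil
}
```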
@@ -203,19 +362,13 @@ vaultik verify --bucket <bucket> --prefix <prefix> [--snapshot <id>] [--quick]
* Each blob encrypted independently
* Metadata databases also encrypted
### compression
* zstd compression at configurable level
* Applied before encryption
* Blob-level compression for efficiency
### state tracking
* Local SQLite database for incremental state
* Tracks file mtimes and chunk mappings
* Enables efficient change detection (see the sketch below)
* Supports inotify monitoring in daemon mode
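The change check itself reduces to comparing stat results against the stored row. A sketch, with field names assumed from the schema above rather than taken from vaultik's actual structs:

```go
import "os"

// knownFile holds the stored stat info for a previously indexed file.
type knownFile struct {
	Size  int64
	MTime int64 // unix seconds, matching the mtime INTEGER column
}

// unchanged reports whether a file can be skipped, per the mtime/size
// comparison described above.
func unchanged(fi os.FileInfo, known *knownFile) bool {
	return known != nil &&
		known.Size == fi.Size() &&
		known.MTime == fi.ModTime().Unix()
}
```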
---
## does not
@@ -225,8 +378,6 @@ vaultik verify --bucket <bucket> --prefix <prefix> [--snapshot <id>] [--quick]
* Require a symmetric passphrase or password
* Trust the source system with anything
---
## does
* Incremental deduplicated backup
@@ -238,70 +389,16 @@ vaultik verify --bucket <bucket> --prefix <prefix> [--snapshot <id>] [--quick]
---
## restore
`vaultik restore` downloads only the snapshot metadata and required blobs. It
never contacts the source system. All restore operations depend only on:
* `VAULTIK_PRIVATE_KEY`
* The bucket
The entire system is restore-only from object storage.
---
---
## requirements
* Go 1.24 or later
* S3-compatible object storage
* Sufficient disk space for local index (typically <1GB)
## license
[MIT](https://opensource.org/license/mit/)
## author
Made with love and lots of expensive SOTA AI by [sneak](https://sneak.berlin) in Berlin in the summer of 2025.

TODO.md

@@ -1,155 +1,128 @@
# Vaultik 1.0 TODO
Linear list of tasks to complete before 1.0 release.
## Rclone Storage Backend (Complete)
Add rclone as a storage backend via Go library import, allowing vaultik to use any of rclone's 70+ supported cloud storage providers.
**Configuration:**
```yaml
storage_url: "rclone://myremote/path/to/backups"
```
User must have rclone configured separately (via `rclone config`).
**Implementation Steps:**
1. [x] Add rclone dependency to go.mod
2. [x] Create `internal/storage/rclone.go` implementing `Storer` interface
- `NewRcloneStorer(remote, path)` - init with `configfile.Install()` and `fs.NewFs()`
- `Put` / `PutWithProgress` - use `operations.Rcat()`
- `Get` - use `fs.NewObject()` then `obj.Open()` (sketched below)
- `Stat` - use `fs.NewObject()` for size/metadata
- `Delete` - use `obj.Remove()`
- `List` / `ListStream` - use `operations.ListFn()`
- `Info` - return remote name
3. [x] Update `internal/storage/url.go` - parse `rclone://remote/path` URLs
4. [x] Update `internal/storage/module.go` - add rclone case to `storerFromURL()`
5. [x] Test with real rclone remote
**Error Mapping:**
- `fs.ErrorObjectNotFound``ErrNotFound`
- `fs.ErrorDirNotFound``ErrNotFound`
- `fs.ErrorNotFoundInConfigFile``ErrRemoteNotFound` (new)
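A hedged sketch of the `Get` path listed in the steps above, using rclone's public library API (the error-to-`ErrNotFound` mapping is omitted, and the function name is ours, not the actual `Storer` implementation):

```go
import (
	"context"
	"io"

	"github.com/rclone/rclone/fs"
	"github.com/rclone/rclone/fs/config/configfile"
)

// rcloneOpen opens one object on a configured rclone remote.
func rcloneOpen(ctx context.Context, remote, name string) (io.ReadCloser, error) {
	configfile.Install()            // load the user's rclone.conf
	f, err := fs.NewFs(ctx, remote) // e.g. "myremote:path/to/backups"
	if err != nil {
		return nil, err
	}
	obj, err := f.NewObject(ctx, name) // fs.ErrorObjectNotFound maps to ErrNotFound
	if err != nil {
		return nil, err
	}
	return obj.Open(ctx)
}
```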
---
## CLI Polish (Priority)
1. Improve error messages throughout
- Ensure all errors include actionable context
- Add suggestions for common issues (e.g., "did you set VAULTIK_AGE_SECRET_KEY?")
## Security (Priority)
1. Audit encryption implementation
- Verify age encryption is used correctly
- Ensure no plaintext leaks in logs or errors
- Verify blob hashes are computed correctly
1. Secure memory handling for secrets
- Clear S3 credentials from memory after client init
- Document that age_secret_key is env-var only (already implemented)
## Testing
1. Write integration tests for restore command
1. Write end-to-end integration test
- Create backup
- Verify backup
- Restore backup
- Compare restored files to originals
1. Add tests for edge cases
- Empty directories
- Symlinks
- Special characters in filenames
- Very large files (multi-GB)
- Many small files (100k+)
1. Add tests for error conditions
- Network failures during upload
- Disk full during restore
- Corrupted blobs
- Missing blobs
## Performance
1. Profile and optimize restore performance
- Parallel blob downloads
- Streaming decompression/decryption
- Efficient chunk reassembly
1. Add bandwidth limiting option
- `--bwlimit` flag for upload/download speed limiting
## Documentation
1. Add man page or --help improvements
- Detailed help for each command
- Examples in help output
## Finalization
1. Ensure version is set correctly in releases
1. Create release process
- Binary releases for supported platforms
- Checksums for binaries
- Release notes template
1. Final code review
- Remove debug statements
- Ensure consistent code style
1. Tag and release v1.0.0
---
## Post-1.0 (Daemon Mode)
1. Implement inotify file watcher for Linux
- Watch source directories for changes
- Track dirty paths in memory
1. Implement FSEvents watcher for macOS
- Watch source directories for changes
- Track dirty paths in memory
1. Implement backup scheduler in daemon mode
- Respect backup_interval config
- Trigger backup when dirty paths exist and interval elapsed
- Implement full_scan_interval for periodic full scans
1. Add proper signal handling for daemon
- Graceful shutdown on SIGTERM/SIGINT
- Complete in-progress backup before exit
1. Write tests for daemon mode


@@ -1,9 +1,41 @@
```go
package main

import (
	"os"
	"runtime"
	"runtime/pprof"

	"git.eeqj.de/sneak/vaultik/internal/cli"
)

func main() {
	// CPU profiling: set VAULTIK_CPUPROFILE=/path/to/cpu.prof
	if cpuProfile := os.Getenv("VAULTIK_CPUPROFILE"); cpuProfile != "" {
		f, err := os.Create(cpuProfile)
		if err != nil {
			panic("could not create CPU profile: " + err.Error())
		}
		defer func() { _ = f.Close() }()
		if err := pprof.StartCPUProfile(f); err != nil {
			panic("could not start CPU profile: " + err.Error())
		}
		defer pprof.StopCPUProfile()
	}

	// Memory profiling: set VAULTIK_MEMPROFILE=/path/to/mem.prof
	if memProfile := os.Getenv("VAULTIK_MEMPROFILE"); memProfile != "" {
		defer func() {
			f, err := os.Create(memProfile)
			if err != nil {
				panic("could not create memory profile: " + err.Error())
			}
			defer func() { _ = f.Close() }()
			runtime.GC() // get up-to-date statistics
			if err := pprof.WriteHeapProfile(f); err != nil {
				panic("could not write memory profile: " + err.Error())
			}
		}()
	}

	cli.CLIEntry()
}
```
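To capture a profile one would run, for example, `VAULTIK_CPUPROFILE=/tmp/cpu.prof vaultik snapshot create` and then inspect the result with `go tool pprof /tmp/cpu.prof`; the heap profile written at exit via `VAULTIK_MEMPROFILE` can be examined the same way.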


@@ -2,25 +2,222 @@
# This file shows all available configuration options with their default values
# Copy this file and uncomment/modify the values you need
# Age recipient public keys for encryption
# This is REQUIRED - backups are encrypted to these public keys
# Generate with: age-keygen | grep "public key"
age_recipients:
- age1cj2k2addawy294f6k2gr2mf9gps9r3syplryxca3nvxj3daqm96qfp84tz
# Named snapshots - each snapshot can contain multiple paths
# Each snapshot gets its own ID and can have snapshot-specific excludes
snapshots:
testing:
paths:
- ~/dev/vaultik
apps:
paths:
- /Applications
exclude:
# System directories that should not be backed up
- "/App Store.app"
- "/Apps.app"
- "/Automator.app"
- "/Books.app"
- "/Calculator.app"
- "/Calendar.app"
- "/Chess.app"
- "/Clock.app"
- "/Contacts.app"
- "/Dictionary.app"
- "/FaceTime.app"
- "/FindMy.app"
- "/Font Book.app"
- "/Freeform.app"
- "/Games.app"
- "/GarageBand.app"
- "/Home.app"
- "/Image Capture.app"
- "/Image Playground.app"
- "/Journal.app"
- "/Keynote.app"
- "/Mail.app"
- "/Maps.app"
- "/Messages.app"
- "/Mission Control.app"
- "/Music.app"
- "/News.app"
- "/Notes.app"
- "/Numbers.app"
- "/Pages.app"
- "/Passwords.app"
- "/Phone.app"
- "/Photo Booth.app"
- "/Photos.app"
- "/Podcasts.app"
- "/Preview.app"
- "/QuickTime Player.app"
- "/Reminders.app"
- "/Safari.app"
- "/Shortcuts.app"
- "/Siri.app"
- "/Stickies.app"
- "/Stocks.app"
- "/System Settings.app"
- "/TV.app"
- "/TextEdit.app"
- "/Time Machine.app"
- "/Tips.app"
- "/Utilities/Activity Monitor.app"
- "/Utilities/AirPort Utility.app"
- "/Utilities/Audio MIDI Setup.app"
- "/Utilities/Bluetooth File Exchange.app"
- "/Utilities/Boot Camp Assistant.app"
- "/Utilities/ColorSync Utility.app"
- "/Utilities/Console.app"
- "/Utilities/Digital Color Meter.app"
- "/Utilities/Disk Utility.app"
- "/Utilities/Grapher.app"
- "/Utilities/Magnifier.app"
- "/Utilities/Migration Assistant.app"
- "/Utilities/Print Center.app"
- "/Utilities/Screen Sharing.app"
- "/Utilities/Screenshot.app"
- "/Utilities/Script Editor.app"
- "/Utilities/System Information.app"
- "/Utilities/Terminal.app"
- "/Utilities/VoiceOver Utility.app"
- "/VoiceMemos.app"
- "/Weather.app"
- "/iMovie.app"
- "/iPhone Mirroring.app"
home:
paths:
- "~"
exclude:
- "/.Trash"
- "/tmp"
- "/Library/Caches"
- "/Library/Accounts"
- "/Library/AppleMediaServices"
- "/Library/Application Support/AddressBook"
- "/Library/Application Support/CallHistoryDB"
- "/Library/Application Support/CallHistoryTransactions"
- "/Library/Application Support/DifferentialPrivacy"
- "/Library/Application Support/FaceTime"
- "/Library/Application Support/FileProvider"
- "/Library/Application Support/Knowledge"
- "/Library/Application Support/com.apple.TCC"
- "/Library/Application Support/com.apple.avfoundation/Frecents"
- "/Library/Application Support/com.apple.sharedfilelist"
- "/Library/Assistant/SiriVocabulary"
- "/Library/Autosave Information"
- "/Library/Biome"
- "/Library/ContainerManager"
- "/Library/Containers/com.apple.Home"
- "/Library/Containers/com.apple.Maps/Data/Maps"
- "/Library/Containers/com.apple.MobileSMS"
- "/Library/Containers/com.apple.Notes"
- "/Library/Containers/com.apple.Safari"
- "/Library/Containers/com.apple.Safari.WebApp"
- "/Library/Containers/com.apple.VoiceMemos"
- "/Library/Containers/com.apple.archiveutility"
- "/Library/Containers/com.apple.corerecents.recentsd/Data/Library/Recents"
- "/Library/Containers/com.apple.mail"
- "/Library/Containers/com.apple.news"
- "/Library/Containers/com.apple.stocks"
- "/Library/Cookies"
- "/Library/CoreFollowUp"
- "/Library/Daemon Containers"
- "/Library/DoNotDisturb"
- "/Library/DuetExpertCenter"
- "/Library/Group Containers/com.apple.Home.group"
- "/Library/Group Containers/com.apple.MailPersonaStorage"
- "/Library/Group Containers/com.apple.PreviewLegacySignaturesConversion"
- "/Library/Group Containers/com.apple.bird"
- "/Library/Group Containers/com.apple.stickersd.group"
- "/Library/Group Containers/com.apple.systempreferences.cache"
- "/Library/Group Containers/group.com.apple.AppleSpell"
- "/Library/Group Containers/group.com.apple.ArchiveUtility.PKSignedContainer"
- "/Library/Group Containers/group.com.apple.DeviceActivity"
- "/Library/Group Containers/group.com.apple.Journal"
- "/Library/Group Containers/group.com.apple.ManagedSettings"
- "/Library/Group Containers/group.com.apple.PegasusConfiguration"
- "/Library/Group Containers/group.com.apple.Safari.SandboxBroker"
- "/Library/Group Containers/group.com.apple.SiriTTS"
- "/Library/Group Containers/group.com.apple.UserNotifications"
- "/Library/Group Containers/group.com.apple.VoiceMemos.shared"
- "/Library/Group Containers/group.com.apple.accessibility.voicebanking"
- "/Library/Group Containers/group.com.apple.amsondevicestoraged"
- "/Library/Group Containers/group.com.apple.appstoreagent"
- "/Library/Group Containers/group.com.apple.calendar"
- "/Library/Group Containers/group.com.apple.chronod"
- "/Library/Group Containers/group.com.apple.contacts"
- "/Library/Group Containers/group.com.apple.controlcenter"
- "/Library/Group Containers/group.com.apple.corerepair"
- "/Library/Group Containers/group.com.apple.coreservices.useractivityd"
- "/Library/Group Containers/group.com.apple.energykit"
- "/Library/Group Containers/group.com.apple.feedback"
- "/Library/Group Containers/group.com.apple.feedbacklogger"
- "/Library/Group Containers/group.com.apple.findmy.findmylocateagent"
- "/Library/Group Containers/group.com.apple.iCloudDrive"
- "/Library/Group Containers/group.com.apple.icloud.fmfcore"
- "/Library/Group Containers/group.com.apple.icloud.fmipcore"
- "/Library/Group Containers/group.com.apple.icloud.searchpartyuseragent"
- "/Library/Group Containers/group.com.apple.liveactivitiesd"
- "/Library/Group Containers/group.com.apple.loginwindow.persistent-apps"
- "/Library/Group Containers/group.com.apple.mail"
- "/Library/Group Containers/group.com.apple.mlhost"
- "/Library/Group Containers/group.com.apple.moments"
- "/Library/Group Containers/group.com.apple.news"
- "/Library/Group Containers/group.com.apple.newsd"
- "/Library/Group Containers/group.com.apple.notes"
- "/Library/Group Containers/group.com.apple.notes.import"
- "/Library/Group Containers/group.com.apple.photolibraryd.private"
- "/Library/Group Containers/group.com.apple.portrait.BackgroundReplacement"
- "/Library/Group Containers/group.com.apple.printtool"
- "/Library/Group Containers/group.com.apple.private.translation"
- "/Library/Group Containers/group.com.apple.reminders"
- "/Library/Group Containers/group.com.apple.replicatord"
- "/Library/Group Containers/group.com.apple.scopedbookmarkagent"
- "/Library/Group Containers/group.com.apple.secure-control-center-preferences"
- "/Library/Group Containers/group.com.apple.sharingd"
- "/Library/Group Containers/group.com.apple.shortcuts"
- "/Library/Group Containers/group.com.apple.siri.inference"
- "/Library/Group Containers/group.com.apple.siri.referenceResolution"
- "/Library/Group Containers/group.com.apple.siri.remembers"
- "/Library/Group Containers/group.com.apple.siri.userfeedbacklearning"
- "/Library/Group Containers/group.com.apple.spotlight"
- "/Library/Group Containers/group.com.apple.stocks"
- "/Library/Group Containers/group.com.apple.stocks-news"
- "/Library/Group Containers/group.com.apple.studentd"
- "/Library/Group Containers/group.com.apple.swtransparency"
- "/Library/Group Containers/group.com.apple.telephonyutilities.callservicesd"
- "/Library/Group Containers/group.com.apple.tips"
- "/Library/Group Containers/group.com.apple.tipsnext"
- "/Library/Group Containers/group.com.apple.transparency"
- "/Library/Group Containers/group.com.apple.usernoted"
- "/Library/Group Containers/group.com.apple.weather"
- "/Library/HomeKit"
- "/Library/IdentityServices"
- "/Library/IntelligencePlatform"
- "/Library/Mail"
- "/Library/Messages"
- "/Library/Metadata/CoreSpotlight"
- "/Library/Metadata/com.apple.IntelligentSuggestions"
- "/Library/PersonalizationPortrait"
- "/Library/Safari"
- "/Library/Sharing"
- "/Library/Shortcuts"
- "/Library/StatusKit"
- "/Library/Suggestions"
- "/Library/Trial"
- "/Library/Weather"
- "/Library/com.apple.aiml.instrumentation"
- "/Movies/TV"
system:
paths:
- /
exclude:
# Virtual/transient filesystems
- /proc
- /sys
- /dev
@@ -30,73 +227,69 @@ exclude:
- /var/run
- /var/lock
- /var/cache
- /lost+found
- /media
- /mnt
# Swap
- /swapfile
- /swap.img
- "*.swap"
- "*.swp"
# Log files (optional - you may want to keep some logs)
- "*.log"
- "*.log.*"
- /var/log
# Package manager caches
- /var/cache/apt
- /var/cache/yum
- /var/cache/dnf
- /var/cache/pacman
# User caches and temporary files
- "*/.cache"
# Trash
- "*/.local/share/Trash"
- "*/Downloads"
- "*/.thumbnails"
# Development artifacts
dev:
paths:
- /Users/user/dev
exclude:
- "**/node_modules"
- "**/.git/objects"
- "**/target"
- "**/build"
- "**/__pycache__"
- "**/*.pyc"
- "**/.venv"
- "**/vendor"
# Global patterns to exclude from all backups
exclude:
- "*.tmp"
# Storage URL - use either this OR the s3 section below
# Supports: s3://bucket/prefix, file:///path, rclone://remote/path
storage_url: "rclone://las1stor1//srv/pool.2024.04/backups/heraklion"
#s3:
# # S3-compatible endpoint URL
# # Examples: https://s3.amazonaws.com, https://storage.googleapis.com
# endpoint: http://10.100.205.122:8333
#
# # Bucket name where backups will be stored
# bucket: testbucket
#
# # Prefix (folder) within the bucket for this host's backups
# # Useful for organizing backups from multiple hosts
# # Default: empty (root of bucket)
# #prefix: "hosts/myserver/"
#
# # S3 access credentials
# access_key_id: Z9GT22M9YFU08WRMC5D4
# secret_access_key: Pi0tPKjFbN4rZlRhcA4zBtEkib04yy2WcIzI+AXk
#
# # S3 region
# # Default: us-east-1
# #region: us-east-1
#
# # Use SSL/TLS for S3 connections
# # Default: true
# #use_ssl: true
#
# # Part size for multipart uploads
# # Minimum 5MB, affects memory usage during upload
# # Supports: 5MB, 10M, 100MiB, etc.
# # Default: 5MB
# #part_size: 5MB
# How often to run backups in daemon mode
# Format: 1h, 30m, 24h, etc
@@ -133,8 +326,7 @@ s3:
# Compression level (1-19)
# Higher = better compression but slower
# Default: 3
compression_level: 5
# Hostname to use in backup metadata
# Default: system hostname
#hostname: myserver

go.mod

@@ -4,59 +4,302 @@ go 1.24.4
require (
filippo.io/age v1.2.1
git.eeqj.de/sneak/smartconfig v1.0.0
github.com/adrg/xdg v0.5.3
github.com/aws/aws-sdk-go-v2 v1.39.6
github.com/aws/aws-sdk-go-v2/config v1.31.17
github.com/aws/aws-sdk-go-v2/credentials v1.18.21
github.com/aws/aws-sdk-go-v2/feature/s3/manager v1.20.4
github.com/aws/aws-sdk-go-v2/service/s3 v1.90.0
github.com/aws/smithy-go v1.23.2
github.com/dustin/go-humanize v1.0.1
github.com/gobwas/glob v0.2.3
github.com/google/uuid v1.6.0
github.com/johannesboyne/gofakes3 v0.0.0-20250603205740-ed9094be7668
github.com/jotfs/fastcdc-go v0.2.0
github.com/klauspost/compress v1.18.1
github.com/mattn/go-sqlite3 v1.14.29
github.com/rclone/rclone v1.72.1
github.com/schollz/progressbar/v3 v3.19.0
github.com/spf13/afero v1.15.0
github.com/spf13/cobra v1.10.1
github.com/stretchr/testify v1.11.1
go.uber.org/fx v1.24.0
golang.org/x/term v0.37.0
gopkg.in/yaml.v3 v3.0.1
modernc.org/sqlite v1.38.0
)
require (
cloud.google.com/go/auth v0.17.0 // indirect
cloud.google.com/go/auth/oauth2adapt v0.2.8 // indirect
cloud.google.com/go/compute/metadata v0.9.0 // indirect
cloud.google.com/go/iam v1.5.2 // indirect
cloud.google.com/go/secretmanager v1.15.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.20.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.2 // indirect
github.com/Azure/azure-sdk-for-go/sdk/keyvault/azsecrets v0.12.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/keyvault/internal v0.7.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.6.3 // indirect
github.com/Azure/azure-sdk-for-go/sdk/storage/azfile v1.5.3 // indirect
github.com/Azure/go-ntlmssp v0.0.2-0.20251110135918-10b7b7e7cd26 // indirect
github.com/AzureAD/microsoft-authentication-library-for-go v1.6.0 // indirect
github.com/Files-com/files-sdk-go/v3 v3.2.264 // indirect
github.com/IBM/go-sdk-core/v5 v5.21.0 // indirect
github.com/Max-Sum/base32768 v0.0.0-20230304063302-18e6ce5945fd // indirect
github.com/Microsoft/go-winio v0.6.2 // indirect
github.com/ProtonMail/bcrypt v0.0.0-20211005172633-e235017c1baf // indirect
github.com/ProtonMail/gluon v0.17.1-0.20230724134000-308be39be96e // indirect
github.com/ProtonMail/go-crypto v1.3.0 // indirect
github.com/ProtonMail/go-mime v0.0.0-20230322103455-7d82a3887f2f // indirect
github.com/ProtonMail/go-srp v0.0.7 // indirect
github.com/ProtonMail/gopenpgp/v2 v2.9.0 // indirect
github.com/PuerkitoBio/goquery v1.10.3 // indirect
github.com/a1ex3/zstd-seekable-format-go/pkg v0.10.0 // indirect
github.com/abbot/go-http-auth v0.4.0 // indirect
github.com/anchore/go-lzo v0.1.0 // indirect
github.com/andybalholm/cascadia v1.3.3 // indirect
github.com/appscode/go-querystring v0.0.0-20170504095604-0126cfb3f1dc // indirect
github.com/armon/go-metrics v0.4.1 // indirect
github.com/aws/aws-sdk-go v1.44.256 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.6.11 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.16.33 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.3.37 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.6.37 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 // indirect
github.com/aws/aws-sdk-go-v2/internal/v4a v1.3.37 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.12.4 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.7.5 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.12.18 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.18.18 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.25.6 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.30.4 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.34.1 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.3 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.13 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.13 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.13 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.4 // indirect
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.13 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.3 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.9.4 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.13 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.13 // indirect
github.com/aws/aws-sdk-go-v2/service/secretsmanager v1.35.8 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.30.1 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.5 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.39.1 // indirect
github.com/bahlo/generic-list-go v0.2.0 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/boombuler/barcode v1.1.0 // indirect
github.com/bradenaw/juniper v0.15.3 // indirect
github.com/bradfitz/iter v0.0.0-20191230175014-e8f45d346db8 // indirect
github.com/buengese/sgzip v0.1.1 // indirect
github.com/buger/jsonparser v1.1.1 // indirect
github.com/calebcase/tmpfile v1.0.3 // indirect
github.com/cenkalti/backoff/v4 v4.3.0 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/chilts/sid v0.0.0-20190607042430-660e94789ec9 // indirect
github.com/clipperhouse/stringish v0.1.1 // indirect
github.com/clipperhouse/uax29/v2 v2.3.0 // indirect
github.com/cloudflare/circl v1.6.1 // indirect
github.com/cloudinary/cloudinary-go/v2 v2.13.0 // indirect
github.com/cloudsoda/go-smb2 v0.0.0-20250228001242-d4c70e6251cc // indirect
github.com/cloudsoda/sddl v0.0.0-20250224235906-926454e91efc // indirect
github.com/colinmarc/hdfs/v2 v2.4.0 // indirect
github.com/coreos/go-semver v0.3.1 // indirect
github.com/coreos/go-systemd/v22 v22.6.0 // indirect
github.com/creasty/defaults v1.8.0 // indirect
github.com/cronokirby/saferith v0.33.0 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/diskfs/go-diskfs v1.7.0 // indirect
github.com/dropbox/dropbox-sdk-go-unofficial/v6 v6.0.5 // indirect
github.com/ebitengine/purego v0.9.1 // indirect
github.com/emersion/go-message v0.18.2 // indirect
github.com/emersion/go-vcard v0.0.0-20241024213814-c9703dde27ff // indirect
github.com/emicklei/go-restful/v3 v3.11.0 // indirect
github.com/fatih/color v1.16.0 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/flynn/noise v1.1.0 // indirect
github.com/fxamacker/cbor/v2 v2.7.0 // indirect
github.com/gabriel-vasile/mimetype v1.4.11 // indirect
github.com/geoffgarside/ber v1.2.0 // indirect
github.com/go-chi/chi/v5 v5.2.3 // indirect
github.com/go-darwin/apfs v0.0.0-20211011131704-f84b94dbf348 // indirect
github.com/go-git/go-billy/v5 v5.6.2 // indirect
github.com/go-jose/go-jose/v4 v4.1.2 // indirect
github.com/go-logr/logr v1.4.3 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-ole/go-ole v1.3.0 // indirect
github.com/go-openapi/errors v0.22.4 // indirect
github.com/go-openapi/jsonpointer v0.21.0 // indirect
github.com/go-openapi/jsonreference v0.20.2 // indirect
github.com/go-openapi/strfmt v0.25.0 // indirect
github.com/go-openapi/swag v0.23.0 // indirect
github.com/go-playground/locales v0.14.1 // indirect
github.com/go-playground/universal-translator v0.18.1 // indirect
github.com/go-playground/validator/v10 v10.28.0 // indirect
github.com/go-resty/resty/v2 v2.16.5 // indirect
github.com/go-viper/mapstructure/v2 v2.4.0 // indirect
github.com/gofrs/flock v0.13.0 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang-jwt/jwt/v4 v4.5.2 // indirect
github.com/golang-jwt/jwt/v5 v5.3.0 // indirect
github.com/golang/protobuf v1.5.4 // indirect
github.com/google/btree v1.1.3 // indirect
github.com/google/gnostic-models v0.6.9 // indirect
github.com/google/go-cmp v0.7.0 // indirect
github.com/google/s2a-go v0.1.9 // indirect
github.com/googleapis/enterprise-certificate-proxy v0.3.7 // indirect
github.com/googleapis/gax-go/v2 v2.15.0 // indirect
github.com/gopherjs/gopherjs v1.17.2 // indirect
github.com/gorilla/schema v1.4.1 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.3 // indirect
github.com/hashicorp/consul/api v1.32.1 // indirect
github.com/hashicorp/errwrap v1.1.0 // indirect
github.com/hashicorp/go-cleanhttp v0.5.2 // indirect
github.com/hashicorp/go-hclog v1.6.3 // indirect
github.com/hashicorp/go-immutable-radix v1.3.1 // indirect
github.com/hashicorp/go-multierror v1.1.1 // indirect
github.com/hashicorp/go-retryablehttp v0.7.8 // indirect
github.com/hashicorp/go-rootcerts v1.0.2 // indirect
github.com/hashicorp/go-secure-stdlib/parseutil v0.1.6 // indirect
github.com/hashicorp/go-secure-stdlib/strutil v0.1.2 // indirect
github.com/hashicorp/go-sockaddr v1.0.2 // indirect
github.com/hashicorp/go-uuid v1.0.3 // indirect
github.com/hashicorp/golang-lru v0.5.4 // indirect
github.com/hashicorp/hcl v1.0.1-vault-7 // indirect
github.com/hashicorp/serf v0.10.1 // indirect
github.com/hashicorp/vault/api v1.20.0 // indirect
github.com/henrybear327/Proton-API-Bridge v1.0.0 // indirect
github.com/henrybear327/go-proton-api v1.0.0 // indirect
github.com/inconshreveable/mousetrap v1.1.0 // indirect
github.com/jcmturner/aescts/v2 v2.0.0 // indirect
github.com/jcmturner/dnsutils/v2 v2.0.0 // indirect
github.com/jcmturner/gofork v1.7.6 // indirect
github.com/jcmturner/goidentity/v6 v6.0.1 // indirect
github.com/jcmturner/gokrb5/v8 v8.4.4 // indirect
github.com/jcmturner/rpc/v2 v2.0.3 // indirect
github.com/jlaffaye/ftp v0.2.1-0.20240918233326-1b970516f5d3 // indirect
github.com/josharian/intern v1.0.0 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/jtolds/gls v4.20.0+incompatible // indirect
github.com/jtolio/noiseconn v0.0.0-20231127013910-f6d9ecbf1de7 // indirect
github.com/jzelinskie/whirlpool v0.0.0-20201016144138-0675e54bb004 // indirect
github.com/klauspost/cpuid/v2 v2.3.0 // indirect
github.com/koofr/go-httpclient v0.0.0-20240520111329-e20f8f203988 // indirect
github.com/koofr/go-koofrclient v0.0.0-20221207135200-cbd7fc9ad6a6 // indirect
github.com/kr/fs v0.1.0 // indirect
github.com/kylelemons/godebug v1.1.0 // indirect
github.com/lanrat/extsort v1.4.2 // indirect
github.com/leodido/go-urn v1.4.0 // indirect
github.com/lpar/date v1.0.0 // indirect
github.com/lufia/plan9stats v0.0.0-20251013123823-9fd1530e3ec3 // indirect
github.com/mailru/easyjson v0.9.1 // indirect
github.com/mattn/go-colorable v0.1.14 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/mattn/go-runewidth v0.0.19 // indirect
github.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db // indirect
github.com/mitchellh/go-homedir v1.1.0 // indirect
github.com/mitchellh/mapstructure v1.5.0 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/ncruces/go-strftime v0.1.9 // indirect
github.com/ncw/swift/v2 v2.0.5 // indirect
github.com/oklog/ulid v1.3.1 // indirect
github.com/onsi/ginkgo/v2 v2.23.3 // indirect
github.com/oracle/oci-go-sdk/v65 v65.104.0 // indirect
github.com/panjf2000/ants/v2 v2.11.3 // indirect
github.com/patrickmn/go-cache v2.1.0+incompatible // indirect
github.com/pengsrc/go-shared v0.2.1-0.20190131101655-1999055a4a14 // indirect
github.com/peterh/liner v1.2.2 // indirect
github.com/pierrec/lz4/v4 v4.1.22 // indirect
github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/pkg/sftp v1.13.10 // indirect
github.com/pkg/xattr v0.4.12 // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/power-devops/perfstat v0.0.0-20240221224432-82ca36839d55 // indirect
github.com/pquerna/otp v1.5.0 // indirect
github.com/prometheus/client_golang v1.23.2 // indirect
github.com/prometheus/client_model v0.6.2 // indirect
github.com/prometheus/common v0.67.2 // indirect
github.com/prometheus/procfs v0.19.2 // indirect
github.com/putdotio/go-putio/putio v0.0.0-20200123120452-16d982cac2b8 // indirect
github.com/relvacode/iso8601 v1.7.0 // indirect
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
github.com/rfjakob/eme v1.1.2 // indirect
github.com/rivo/uniseg v0.4.7 // indirect
github.com/ryanuber/go-glob v1.0.0 // indirect
github.com/ryszard/goskiplist v0.0.0-20150312221310-2dfbae5fcf46 // indirect
github.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06 // indirect
github.com/samber/lo v1.52.0 // indirect
github.com/shirou/gopsutil/v4 v4.25.10 // indirect
github.com/sirupsen/logrus v1.9.4-0.20230606125235-dd1b4c2e81af // indirect
github.com/skratchdot/open-golang v0.0.0-20200116055534-eef842397966 // indirect
github.com/smarty/assertions v1.16.0 // indirect
github.com/sony/gobreaker v1.0.0 // indirect
github.com/spacemonkeygo/monkit/v3 v3.0.25-0.20251022131615-eb24eb109368 // indirect
github.com/spf13/pflag v1.0.10 // indirect
github.com/t3rm1n4l/go-mega v0.0.0-20251031123324-a804aaa87491 // indirect
github.com/tidwall/gjson v1.18.0 // indirect
github.com/tidwall/match v1.1.1 // indirect
github.com/tidwall/pretty v1.2.0 // indirect
github.com/tklauser/go-sysconf v0.3.15 // indirect
github.com/tklauser/numcpus v0.10.0 // indirect
github.com/ulikunitz/xz v0.5.15 // indirect
github.com/unknwon/goconfig v1.0.0 // indirect
github.com/wk8/go-ordered-map/v2 v2.1.8 // indirect
github.com/x448/float16 v0.8.4 // indirect
github.com/xanzy/ssh-agent v0.3.3 // indirect
github.com/youmark/pkcs8 v0.0.0-20240726163527-a2c0da244d78 // indirect
github.com/yunify/qingstor-sdk-go/v3 v3.2.0 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
github.com/zeebo/blake3 v0.2.4 // indirect
github.com/zeebo/errs v1.4.0 // indirect
github.com/zeebo/xxh3 v1.0.2 // indirect
go.etcd.io/bbolt v1.4.3 // indirect
go.etcd.io/etcd/api/v3 v3.6.2 // indirect
go.etcd.io/etcd/client/pkg/v3 v3.6.2 // indirect
go.etcd.io/etcd/client/v3 v3.6.2 // indirect
go.mongodb.org/mongo-driver v1.17.6 // indirect
go.opentelemetry.io/auto/sdk v1.2.1 // indirect
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.61.0 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.63.0 // indirect
go.opentelemetry.io/otel v1.38.0 // indirect
go.opentelemetry.io/otel/metric v1.38.0 // indirect
go.opentelemetry.io/otel/trace v1.38.0 // indirect
go.shabbyrobe.org/gocovmerge v0.0.0-20230507111327-fa4f82cfbf4d // indirect
go.uber.org/dig v1.19.0 // indirect
go.uber.org/multierr v1.11.0 // indirect
go.uber.org/zap v1.27.0 // indirect
go.yaml.in/yaml/v2 v2.4.3 // indirect
golang.org/x/crypto v0.45.0 // indirect
golang.org/x/exp v0.0.0-20251023183803-a4bb9ffd2546 // indirect
golang.org/x/net v0.47.0 // indirect
golang.org/x/oauth2 v0.33.0 // indirect
golang.org/x/sync v0.18.0 // indirect
golang.org/x/sys v0.38.0 // indirect
golang.org/x/text v0.31.0 // indirect
golang.org/x/time v0.14.0 // indirect
golang.org/x/tools v0.38.0 // indirect
google.golang.org/api v0.255.0 // indirect
google.golang.org/genproto v0.0.0-20250603155806-513f23925822 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20250804133106-a7a43d27e69b // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20251103181224-f26f9409b101 // indirect
google.golang.org/grpc v1.76.0 // indirect
google.golang.org/protobuf v1.36.10 // indirect
gopkg.in/evanphx/json-patch.v4 v4.12.0 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/natefinch/lumberjack.v2 v2.2.1 // indirect
gopkg.in/validator.v2 v2.0.1 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
k8s.io/api v0.33.3 // indirect
k8s.io/apimachinery v0.33.3 // indirect
k8s.io/client-go v0.33.3 // indirect
k8s.io/klog/v2 v2.130.1 // indirect
k8s.io/kube-openapi v0.0.0-20250318190949-c8a335a9a2ff // indirect
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738 // indirect
modernc.org/libc v1.65.10 // indirect
modernc.org/mathutil v1.7.1 // indirect
modernc.org/memory v1.11.0 // indirect
moul.io/http2curl/v2 v2.3.0 // indirect
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 // indirect
sigs.k8s.io/randfill v1.0.0 // indirect
sigs.k8s.io/structured-merge-diff/v4 v4.6.0 // indirect
sigs.k8s.io/yaml v1.6.0 // indirect
storj.io/common v0.0.0-20251107171817-6221ae45072c // indirect
storj.io/drpc v0.0.35-0.20250513201419-f7819ea69b55 // indirect
storj.io/eventkit v0.0.0-20250410172343-61f26d3de156 // indirect
storj.io/infectious v0.0.2 // indirect
storj.io/picobuf v0.0.4 // indirect
storj.io/uplink v1.13.1 // indirect
)

go.sum

File diff suppressed because it is too large.


@@ -20,14 +20,15 @@ import (
"encoding/hex"
"fmt"
"io"
"os"
"sync"
"time"
"git.eeqj.de/sneak/vaultik/internal/blobgen"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/types"
"github.com/google/uuid"
"github.com/spf13/afero"
)
// BlobHandler is a callback function invoked when a blob is finalized and ready for upload.
@@ -44,6 +45,13 @@ type PackerConfig struct {
Recipients []string // Age recipients for encryption
Repositories *database.Repositories // Database repositories for tracking blob metadata
BlobHandler BlobHandler // Optional callback when blob is ready for upload
Fs afero.Fs // Filesystem for temporary files
}
// PendingChunk represents a chunk waiting to be inserted into the database.
type PendingChunk struct {
Hash string
Size int64
}
// Packer accumulates chunks and packs them into blobs.
@@ -55,6 +63,7 @@ type Packer struct {
recipients []string // Age recipients for encryption
blobHandler BlobHandler // Called when blob is ready
repos *database.Repositories // For creating blob records
fs afero.Fs // Filesystem for temporary files
// Mutex for thread-safe blob creation
mu sync.Mutex
@@ -62,6 +71,9 @@ type Packer struct {
// Current blob being packed
currentBlob *blobInProgress
finishedBlobs []*FinishedBlob // Only used if no handler provided
// Pending chunks to be inserted when blob finalizes
pendingChunks []PendingChunk
}
// blobInProgress represents a blob being assembled
@@ -69,7 +81,7 @@ type blobInProgress struct {
id string // UUID of the blob
chunks []*chunkInfo // Track chunk metadata
chunkSet map[string]bool // Track unique chunks in this blob
tempFile afero.File // Temporary file for encrypted compressed data
writer *blobgen.Writer // Unified compression/encryption/hashing writer
startTime time.Time
size int64 // Current uncompressed size
@@ -113,7 +125,8 @@ type BlobChunkRef struct {
type BlobWithReader struct {
*FinishedBlob
Reader io.ReadSeeker
- TempFile *os.File // Optional, only set for disk-based blobs
+ TempFile afero.File // Optional, only set for disk-based blobs
+ InsertedChunkHashes []string // Chunk hashes that were inserted to DB with this blob
}
// NewPacker creates a new blob packer that accumulates chunks into blobs.
@@ -126,12 +139,16 @@ func NewPacker(cfg PackerConfig) (*Packer, error) {
if cfg.MaxBlobSize <= 0 {
return nil, fmt.Errorf("max blob size must be positive")
}
if cfg.Fs == nil {
return nil, fmt.Errorf("filesystem is required")
}
return &Packer{
maxBlobSize: cfg.MaxBlobSize,
compressionLevel: cfg.CompressionLevel,
recipients: cfg.Recipients,
blobHandler: cfg.BlobHandler,
repos: cfg.Repositories,
fs: cfg.Fs,
finishedBlobs: make([]*FinishedBlob, 0),
}, nil
}
@@ -146,6 +163,15 @@ func (p *Packer) SetBlobHandler(handler BlobHandler) {
p.blobHandler = handler
}
// AddPendingChunk queues a chunk to be inserted into the database when the
// current blob is finalized. This batches chunk inserts to reduce transaction
// overhead. Thread-safe.
func (p *Packer) AddPendingChunk(hash string, size int64) {
p.mu.Lock()
defer p.mu.Unlock()
p.pendingChunks = append(p.pendingChunks, PendingChunk{Hash: hash, Size: size})
}
// AddChunk adds a chunk to the current blob being packed.
// If adding the chunk would exceed MaxBlobSize, returns ErrBlobSizeLimitExceeded.
// In this case, the caller should finalize the current blob and retry.
@@ -237,25 +263,28 @@ func (p *Packer) startNewBlob() error {
// Create blob record in database
if p.repos != nil {
+ blobIDTyped, err := types.ParseBlobID(blobID)
+ if err != nil {
+ return fmt.Errorf("parsing blob ID: %w", err)
+ }
blob := &database.Blob{
- ID: blobID,
- Hash: "temp-placeholder-" + blobID, // Temporary placeholder until finalized
+ ID: blobIDTyped,
+ Hash: types.BlobHash("temp-placeholder-" + blobID), // Temporary placeholder until finalized
CreatedTS: time.Now().UTC(),
FinishedTS: nil,
UncompressedSize: 0,
CompressedSize: 0,
UploadedTS: nil,
}
- err := p.repos.WithTx(context.Background(), func(ctx context.Context, tx *sql.Tx) error {
+ if err := p.repos.WithTx(context.Background(), func(ctx context.Context, tx *sql.Tx) error {
return p.repos.Blobs.Create(ctx, tx, blob)
- })
- if err != nil {
+ }); err != nil {
return fmt.Errorf("creating blob record: %w", err)
}
}
// Create temporary file
- tempFile, err := os.CreateTemp("", "vaultik-blob-*.tmp")
+ tempFile, err := afero.TempFile(p.fs, "", "vaultik-blob-*.tmp")
if err != nil {
return fmt.Errorf("creating temp file: %w", err)
}
@@ -264,7 +293,7 @@ func (p *Packer) startNewBlob() error {
writer, err := blobgen.NewWriter(tempFile, p.compressionLevel, p.recipients)
if err != nil {
_ = tempFile.Close()
- _ = os.Remove(tempFile.Name())
+ _ = p.fs.Remove(tempFile.Name())
return fmt.Errorf("creating blobgen writer: %w", err)
}
@@ -308,23 +337,9 @@ func (p *Packer) addChunkToCurrentBlob(chunk *ChunkRef) error {
p.currentBlob.chunks = append(p.currentBlob.chunks, chunkInfo)
p.currentBlob.chunkSet[chunk.Hash] = true
- // Store blob-chunk association in database immediately
- if p.repos != nil {
- blobChunk := &database.BlobChunk{
- BlobID: p.currentBlob.id,
- ChunkHash: chunk.Hash,
- Offset: offset,
- Length: chunkSize,
- }
- err := p.repos.WithTx(context.Background(), func(ctx context.Context, tx *sql.Tx) error {
- return p.repos.BlobChunks.Create(ctx, tx, blobChunk)
- })
- if err != nil {
- log.Error("Failed to store blob-chunk association in database", "error", err,
- "blob_id", p.currentBlob.id, "chunk_hash", chunk.Hash)
- // Continue anyway - we can reconstruct this later if needed
- }
- }
+ // Note: blob_chunk records are inserted in batch when blob is finalized
+ // to reduce transaction overhead. The chunk info is already stored in
+ // p.currentBlob.chunks for later insertion.
// Update total size
p.currentBlob.size += chunkSize
@@ -386,16 +401,54 @@ func (p *Packer) finalizeCurrentBlob() error {
})
}
- // Update blob record in database with hash and sizes
+ // Get pending chunks (will be inserted to DB and reported to handler)
+ chunksToInsert := p.pendingChunks
+ p.pendingChunks = nil // Clear pending list
+ // Insert pending chunks, blob_chunks, and update blob in a single transaction
if p.repos != nil {
blobIDTyped, parseErr := types.ParseBlobID(p.currentBlob.id)
if parseErr != nil {
p.cleanupTempFile()
return fmt.Errorf("parsing blob ID: %w", parseErr)
}
err := p.repos.WithTx(context.Background(), func(ctx context.Context, tx *sql.Tx) error {
// First insert all pending chunks (required for blob_chunks FK)
for _, chunk := range chunksToInsert {
dbChunk := &database.Chunk{
ChunkHash: types.ChunkHash(chunk.Hash),
Size: chunk.Size,
}
if err := p.repos.Chunks.Create(ctx, tx, dbChunk); err != nil {
return fmt.Errorf("creating chunk: %w", err)
}
}
// Insert all blob_chunk records in batch
for _, chunk := range p.currentBlob.chunks {
blobChunk := &database.BlobChunk{
BlobID: blobIDTyped,
ChunkHash: types.ChunkHash(chunk.Hash),
Offset: chunk.Offset,
Length: chunk.Size,
}
if err := p.repos.BlobChunks.Create(ctx, tx, blobChunk); err != nil {
return fmt.Errorf("creating blob_chunk: %w", err)
}
}
// Update blob record with final hash and sizes
return p.repos.Blobs.UpdateFinished(ctx, tx, p.currentBlob.id, blobHash,
p.currentBlob.size, finalSize)
})
if err != nil {
p.cleanupTempFile()
return fmt.Errorf("updating blob record: %w", err)
return fmt.Errorf("finalizing blob transaction: %w", err)
}
log.Debug("Committed blob transaction",
"chunks_inserted", len(chunksToInsert),
"blob_chunks_inserted", len(p.currentBlob.chunks))
}
// Create finished blob
@@ -418,9 +471,14 @@ func (p *Packer) finalizeCurrentBlob() error {
"ratio", fmt.Sprintf("%.2f", compressionRatio),
"duration", time.Since(p.currentBlob.startTime))
// Collect inserted chunk hashes for the scanner to track
var insertedChunkHashes []string
for _, chunk := range chunksToInsert {
insertedChunkHashes = append(insertedChunkHashes, chunk.Hash)
}
// Call blob handler if set
if p.blobHandler != nil {
log.Debug("Invoking blob handler callback", "blob_hash", blobHash[:8]+"...")
// Reset file position for handler
if _, err := p.currentBlob.tempFile.Seek(0, io.SeekStart); err != nil {
p.cleanupTempFile()
@@ -432,6 +490,7 @@ func (p *Packer) finalizeCurrentBlob() error {
FinishedBlob: finished,
Reader: p.currentBlob.tempFile,
TempFile: p.currentBlob.tempFile,
InsertedChunkHashes: insertedChunkHashes,
}
if err := p.blobHandler(blobWithReader); err != nil {
@@ -470,7 +529,7 @@ func (p *Packer) cleanupTempFile() {
if p.currentBlob != nil && p.currentBlob.tempFile != nil {
name := p.currentBlob.tempFile.Name()
_ = p.currentBlob.tempFile.Close()
- _ = os.Remove(name)
+ _ = p.fs.Remove(name)
}
}
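Taken together, the packer hunks above move per-chunk database writes out of the hot path: callers queue chunk rows with AddPendingChunk while streaming, and chunks, blob_chunks, and the blob row are committed together when the blob finalizes. A minimal calling sketch (the chunk list and the ChunkRef literal are illustrative; only AddPendingChunk, AddChunk, and ErrBlobSizeLimitExceeded are confirmed by the hunks above):

packer, err := NewPacker(PackerConfig{
	MaxBlobSize:      10 << 20, // illustrative 10 MiB cap
	CompressionLevel: 3,
	Recipients:       []string{recipient},
	Repositories:     repos,
	Fs:               afero.NewMemMapFs(),
})
if err != nil {
	return err
}
for _, c := range chunks {
	packer.AddPendingChunk(c.Hash, int64(len(c.Data))) // queued; inserted when the blob finalizes
	if err := packer.AddChunk(&ChunkRef{Hash: c.Hash, Data: c.Data}); err != nil {
		// On ErrBlobSizeLimitExceeded, finalize the current blob and retry.
		return err
	}
}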


@@ -12,7 +12,9 @@ import (
"filippo.io/age"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/types"
"github.com/klauspost/compress/zstd"
"github.com/spf13/afero"
)
const (
@@ -45,6 +47,7 @@ func TestPacker(t *testing.T) {
CompressionLevel: 3,
Recipients: []string{testPublicKey},
Repositories: repos,
Fs: afero.NewMemMapFs(),
}
packer, err := NewPacker(cfg)
if err != nil {
@@ -58,7 +61,7 @@ func TestPacker(t *testing.T) {
// Create chunk in database first
dbChunk := &database.Chunk{
- ChunkHash: hashStr,
+ ChunkHash: types.ChunkHash(hashStr),
Size: int64(len(data)),
}
err = repos.WithTx(context.Background(), func(ctx context.Context, tx *sql.Tx) error {
@@ -134,6 +137,7 @@ func TestPacker(t *testing.T) {
CompressionLevel: 3,
Recipients: []string{testPublicKey},
Repositories: repos,
Fs: afero.NewMemMapFs(),
}
packer, err := NewPacker(cfg)
if err != nil {
@@ -149,7 +153,7 @@ func TestPacker(t *testing.T) {
// Create chunk in database first
dbChunk := &database.Chunk{
- ChunkHash: hashStr,
+ ChunkHash: types.ChunkHash(hashStr),
Size: int64(len(data)),
}
err = repos.WithTx(context.Background(), func(ctx context.Context, tx *sql.Tx) error {
@@ -216,6 +220,7 @@ func TestPacker(t *testing.T) {
CompressionLevel: 3,
Recipients: []string{testPublicKey},
Repositories: repos,
Fs: afero.NewMemMapFs(),
}
packer, err := NewPacker(cfg)
if err != nil {
@@ -231,7 +236,7 @@ func TestPacker(t *testing.T) {
// Create chunk in database first
dbChunk := &database.Chunk{
- ChunkHash: hashStr,
+ ChunkHash: types.ChunkHash(hashStr),
Size: int64(len(data)),
}
err = repos.WithTx(context.Background(), func(ctx context.Context, tx *sql.Tx) error {
@@ -304,6 +309,7 @@ func TestPacker(t *testing.T) {
CompressionLevel: 3,
Recipients: []string{testPublicKey},
Repositories: repos,
Fs: afero.NewMemMapFs(),
}
packer, err := NewPacker(cfg)
if err != nil {
@@ -317,7 +323,7 @@ func TestPacker(t *testing.T) {
// Create chunk in database first
dbChunk := &database.Chunk{
- ChunkHash: hashStr,
+ ChunkHash: types.ChunkHash(hashStr),
Size: int64(len(data)),
}
err = repos.WithTx(context.Background(), func(ctx context.Context, tx *sql.Tx) error {


@@ -51,7 +51,13 @@ func CompressStream(dst io.Writer, src io.Reader, compressionLevel int, recipien
if err != nil {
return 0, "", fmt.Errorf("creating writer: %w", err)
}
- defer func() { _ = w.Close() }()
+ closed := false
+ defer func() {
+ if !closed {
+ _ = w.Close()
+ }
+ }()
// Copy data
if _, err := io.Copy(w, src); err != nil {
@@ -62,6 +68,7 @@ func CompressStream(dst io.Writer, src io.Reader, compressionLevel int, recipien
if err := w.Close(); err != nil {
return 0, "", fmt.Errorf("closing writer: %w", err)
}
+ closed = true
return w.BytesWritten(), hex.EncodeToString(w.Sum256()), nil
}
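The fix generalizes to any io.WriteCloser whose Close both flushes and finalizes: close explicitly so flush errors surface, and let a flag-gated defer cover only the early-return paths. A self-contained sketch of the pattern (gzip stands in for the zstd-plus-age stack):

package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// copyAndClose closes w exactly once: explicitly on success so flush
// errors are reported, and via the deferred guard on error paths.
func copyAndClose(w io.WriteCloser, src io.Reader) (err error) {
	closed := false
	defer func() {
		if !closed {
			_ = w.Close() // best-effort cleanup; the original error wins
		}
	}()
	if _, err = io.Copy(w, src); err != nil {
		return err
	}
	if err = w.Close(); err != nil {
		return err
	}
	closed = true
	return nil
}

func main() {
	var buf bytes.Buffer
	fmt.Println(copyAndClose(gzip.NewWriter(&buf), strings.NewReader("hello"))) // <nil>
}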


@@ -5,30 +5,33 @@ import (
"fmt"
"hash"
"io"
"runtime"
"filippo.io/age"
"github.com/klauspost/compress/zstd"
)
- // Writer wraps compression and encryption with SHA256 hashing
+ // Writer wraps compression and encryption with SHA256 hashing.
+ // Data flows: input -> tee(hasher, compressor -> encryptor -> destination)
+ // The hash is computed on the uncompressed input for deterministic content-addressing.
type Writer struct {
writer io.Writer // Final destination
- teeWriter io.Writer // Tee to hasher and compressor
compressor *zstd.Encoder // Compression layer
encryptor io.WriteCloser // Encryption layer
- hasher hash.Hash // SHA256 hasher
+ teeWriter io.Writer // Tees data to hasher
+ hasher hash.Hash // SHA256 hasher (on uncompressed input)
compressionLevel int
bytesWritten int64
}
- // NewWriter creates a new Writer that compresses, encrypts, and hashes data
+ // NewWriter creates a new Writer that compresses, encrypts, and hashes data.
+ // The hash is computed on the uncompressed input for deterministic content-addressing.
func NewWriter(w io.Writer, compressionLevel int, recipients []string) (*Writer, error) {
// Validate compression level
if err := validateCompressionLevel(compressionLevel); err != nil {
return nil, err
}
- // Create SHA256 hasher
+ // Create SHA256 hasher for the uncompressed input
hasher := sha256.New()
// Parse recipients
@@ -41,31 +44,36 @@ func NewWriter(w io.Writer, compressionLevel int, recipients []string) (*Writer,
ageRecipients = append(ageRecipients, r)
}
- // Create encryption writer
+ // Create encryption writer that outputs to destination
encWriter, err := age.Encrypt(w, ageRecipients...)
if err != nil {
return nil, fmt.Errorf("creating encryption writer: %w", err)
}
// Calculate compression concurrency: CPUs - 2, minimum 1
concurrency := runtime.NumCPU() - 2
if concurrency < 1 {
concurrency = 1
}
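// e.g. runtime.NumCPU() == 8 gives 6 encoder workers; 1- or 2-CPU machines clamp to 1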
// Create compression writer with encryption as destination
compressor, err := zstd.NewWriter(encWriter,
zstd.WithEncoderLevel(zstd.EncoderLevelFromZstd(compressionLevel)),
- zstd.WithEncoderConcurrency(1), // Use single thread for streaming
+ zstd.WithEncoderConcurrency(concurrency),
)
if err != nil {
_ = encWriter.Close()
return nil, fmt.Errorf("creating compression writer: %w", err)
}
- // Create tee writer that writes to both compressor and hasher
- teeWriter := io.MultiWriter(compressor, hasher)
+ // Create tee writer: input goes to both hasher and compressor
+ teeWriter := io.MultiWriter(hasher, compressor)
return &Writer{
writer: w,
- teeWriter: teeWriter,
compressor: compressor,
encryptor: encWriter,
hasher: hasher,
+ teeWriter: teeWriter,
compressionLevel: compressionLevel,
}, nil
}
@@ -92,9 +100,16 @@ func (w *Writer) Close() error {
return nil
}
- // Sum256 returns the SHA256 hash of all data written
+ // Sum256 returns the double SHA256 hash of the uncompressed input data.
+ // Double hashing (SHA256(SHA256(data))) prevents information leakage about
+ // the plaintext - an attacker cannot confirm existence of known content
+ // by computing its hash and checking for a matching blob filename.
func (w *Writer) Sum256() []byte {
- return w.hasher.Sum(nil)
+ // First hash: SHA256(plaintext)
+ firstHash := w.hasher.Sum(nil)
+ // Second hash: SHA256(firstHash) - this is the blob ID
+ secondHash := sha256.Sum256(firstHash)
+ return secondHash[:]
}
// BytesWritten returns the number of uncompressed bytes written
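Because the hash covers the plaintext rather than the ciphertext, the blob name is reproducible without the age key. A standalone sketch of the same derivation, using only the standard library:

package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blobID computes SHA256(SHA256(plaintext)), matching Writer.Sum256.
func blobID(plaintext []byte) string {
	first := sha256.Sum256(plaintext)
	second := sha256.Sum256(first[:])
	return hex.EncodeToString(second[:])
}

func main() {
	fmt.Println(blobID([]byte("example data")))
}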


@@ -0,0 +1,105 @@
package blobgen
import (
"bytes"
"crypto/rand"
"crypto/sha256"
"encoding/hex"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// TestWriterHashIsDoubleHash verifies that Writer.Sum256() returns
// the double hash SHA256(SHA256(plaintext)) for security.
// Double hashing prevents attackers from confirming existence of known content.
func TestWriterHashIsDoubleHash(t *testing.T) {
// Test data - random data that doesn't compress well
testData := make([]byte, 1024*1024) // 1MB
_, err := rand.Read(testData)
require.NoError(t, err)
// Test recipient (generated with age-keygen)
testRecipient := "age1cplgrwj77ta54dnmydvvmzn64ltk83ankxl5sww04mrtmu62kv3s89gmvv"
// Create a buffer to capture the encrypted output
var encryptedBuf bytes.Buffer
// Create blobgen writer
writer, err := NewWriter(&encryptedBuf, 3, []string{testRecipient})
require.NoError(t, err)
// Write test data
n, err := writer.Write(testData)
require.NoError(t, err)
assert.Equal(t, len(testData), n)
// Close to flush all data
err = writer.Close()
require.NoError(t, err)
// Get the hash from the writer
writerHash := hex.EncodeToString(writer.Sum256())
// Calculate the expected double hash: SHA256(SHA256(plaintext))
firstHash := sha256.Sum256(testData)
secondHash := sha256.Sum256(firstHash[:])
expectedDoubleHash := hex.EncodeToString(secondHash[:])
// Also compute single hash to verify it's different
singleHashStr := hex.EncodeToString(firstHash[:])
t.Logf("Input size: %d bytes", len(testData))
t.Logf("Single hash (SHA256(data)): %s", singleHashStr)
t.Logf("Double hash (SHA256(SHA256(data))): %s", expectedDoubleHash)
t.Logf("Writer hash: %s", writerHash)
// The writer hash should match the double hash
assert.Equal(t, expectedDoubleHash, writerHash,
"Writer.Sum256() should return SHA256(SHA256(plaintext)) for security")
// Verify it's NOT the single hash (would leak information)
assert.NotEqual(t, singleHashStr, writerHash,
"Writer hash should not be single hash (would allow content confirmation attacks)")
}
// TestWriterDeterministicHash verifies that the same input always produces
// the same hash, even with non-deterministic encryption.
func TestWriterDeterministicHash(t *testing.T) {
// Test data
testData := []byte("Hello, World! This is test data for deterministic hashing.")
// Test recipient
testRecipient := "age1cplgrwj77ta54dnmydvvmzn64ltk83ankxl5sww04mrtmu62kv3s89gmvv"
// Create two writers and verify they produce the same hash
var buf1, buf2 bytes.Buffer
writer1, err := NewWriter(&buf1, 3, []string{testRecipient})
require.NoError(t, err)
_, err = writer1.Write(testData)
require.NoError(t, err)
require.NoError(t, writer1.Close())
writer2, err := NewWriter(&buf2, 3, []string{testRecipient})
require.NoError(t, err)
_, err = writer2.Write(testData)
require.NoError(t, err)
require.NoError(t, writer2.Close())
hash1 := hex.EncodeToString(writer1.Sum256())
hash2 := hex.EncodeToString(writer2.Sum256())
// Hashes should be identical (deterministic)
assert.Equal(t, hash1, hash2, "Same input should produce same hash")
// Encrypted outputs should be different (non-deterministic encryption)
assert.NotEqual(t, buf1.Bytes(), buf2.Bytes(),
"Encrypted outputs should differ due to non-deterministic encryption")
t.Logf("Hash 1: %s", hash1)
t.Logf("Hash 2: %s", hash2)
t.Logf("Encrypted size 1: %d bytes", buf1.Len())
t.Logf("Encrypted size 2: %d bytes", buf2.Len())
}


@@ -6,8 +6,6 @@ import (
"fmt"
"io"
"os"
"github.com/jotfs/fastcdc-go"
)
// Chunk represents a single chunk of data produced by the content-defined chunking algorithm.
@@ -48,16 +46,8 @@ func NewChunker(avgChunkSize int64) *Chunker {
// reasonably sized inputs. For large files or streams, use ChunkReaderStreaming instead.
// Returns an error if chunking fails or if reading from the input fails.
func (c *Chunker) ChunkReader(r io.Reader) ([]Chunk, error) {
- opts := fastcdc.Options{
- MinSize: c.minChunkSize,
- AverageSize: c.avgChunkSize,
- MaxSize: c.maxChunkSize,
- }
- chunker, err := fastcdc.NewChunker(r, opts)
- if err != nil {
- return nil, fmt.Errorf("creating chunker: %w", err)
- }
+ chunker := AcquireReusableChunker(r, c.minChunkSize, c.avgChunkSize, c.maxChunkSize)
+ defer chunker.Release()
var chunks []Chunk
offset := int64(0)
@@ -74,7 +64,7 @@ func (c *Chunker) ChunkReader(r io.Reader) ([]Chunk, error) {
// Calculate hash
hash := sha256.Sum256(chunk.Data)
- // Make a copy of the data since FastCDC reuses the buffer
+ // Make a copy of the data since the chunker reuses the buffer
chunkData := make([]byte, len(chunk.Data))
copy(chunkData, chunk.Data)
@@ -107,16 +97,8 @@ func (c *Chunker) ChunkReaderStreaming(r io.Reader, callback ChunkCallback) (str
fileHasher := sha256.New()
teeReader := io.TeeReader(r, fileHasher)
- opts := fastcdc.Options{
- MinSize: c.minChunkSize,
- AverageSize: c.avgChunkSize,
- MaxSize: c.maxChunkSize,
- }
- chunker, err := fastcdc.NewChunker(teeReader, opts)
- if err != nil {
- return "", fmt.Errorf("creating chunker: %w", err)
- }
+ chunker := AcquireReusableChunker(teeReader, c.minChunkSize, c.avgChunkSize, c.maxChunkSize)
+ defer chunker.Release()
offset := int64(0)
@@ -132,13 +114,12 @@ func (c *Chunker) ChunkReaderStreaming(r io.Reader, callback ChunkCallback) (str
// Calculate chunk hash
hash := sha256.Sum256(chunk.Data)
- // Make a copy of the data since FastCDC reuses the buffer
- chunkData := make([]byte, len(chunk.Data))
- copy(chunkData, chunk.Data)
+ // Pass the data directly - caller must process it before we call Next() again
+ // (chunker reuses its internal buffer, but since we process synchronously
+ // and completely before continuing, no copy is needed)
if err := callback(Chunk{
Hash: hex.EncodeToString(hash[:]),
- Data: chunkData,
+ Data: chunk.Data,
Offset: offset,
Size: int64(len(chunk.Data)),
}); err != nil {
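The streaming path now hands the callback a slice that aliases the chunker's internal buffer, so any callback that keeps data past its own return must copy it. A sketch of a retaining callback (variable names are illustrative):

var retained [][]byte
cb := func(c Chunk) error {
	// c.Data is invalidated by the next chunk; copy before retaining.
	buf := make([]byte, len(c.Data))
	copy(buf, c.Data)
	retained = append(retained, buf)
	return nil
}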

internal/chunker/fastcdc.go (new file, 265 lines)

@@ -0,0 +1,265 @@
package chunker
import (
"io"
"math"
"sync"
)
// ReusableChunker implements FastCDC with reusable buffers to minimize allocations.
// Unlike the upstream fastcdc-go library which allocates a new buffer per file,
// this implementation uses sync.Pool to reuse buffers across files.
type ReusableChunker struct {
minSize int
maxSize int
normSize int
bufSize int
maskS uint64
maskL uint64
rd io.Reader
buf []byte
cursor int
offset int
eof bool
}
// reusableChunkerPool pools ReusableChunker instances to avoid allocations.
var reusableChunkerPool = sync.Pool{
New: func() interface{} {
return &ReusableChunker{}
},
}
// bufferPools contains pools for different buffer sizes.
// Key is the buffer size.
var bufferPools = sync.Map{}
func getBuffer(size int) []byte {
poolI, _ := bufferPools.LoadOrStore(size, &sync.Pool{
New: func() interface{} {
buf := make([]byte, size)
return &buf
},
})
pool := poolI.(*sync.Pool)
return *pool.Get().(*[]byte)
}
func putBuffer(buf []byte) {
size := cap(buf)
poolI, ok := bufferPools.Load(size)
if ok {
pool := poolI.(*sync.Pool)
b := buf[:size]
pool.Put(&b)
}
}
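// The pools store *[]byte rather than []byte so that Put and Get reuse a
// pointer and avoid re-boxing the slice header into an interface on each
// call (the allocation staticcheck SA6002 warns about)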
// FastCDCChunk represents a chunk from the FastCDC algorithm.
type FastCDCChunk struct {
Offset int
Length int
Data []byte
Fingerprint uint64
}
// AcquireReusableChunker gets a chunker from the pool and initializes it for the given reader.
func AcquireReusableChunker(rd io.Reader, minSize, avgSize, maxSize int) *ReusableChunker {
c := reusableChunkerPool.Get().(*ReusableChunker)
bufSize := maxSize * 2
// Reuse buffer if it's the right size, otherwise get a new one
if c.buf == nil || cap(c.buf) != bufSize {
if c.buf != nil {
putBuffer(c.buf)
}
c.buf = getBuffer(bufSize)
} else {
// Restore buffer to full capacity (may have been truncated by previous EOF)
c.buf = c.buf[:cap(c.buf)]
}
bits := int(math.Round(math.Log2(float64(avgSize))))
normalization := 2
smallBits := bits + normalization
largeBits := bits - normalization
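// e.g. avgSize = 1 MiB gives bits = 20, so maskS has 22 set bits (cut points
// are rare before normSize, discouraging short chunks) and maskL has 18
// (cut points are likely after it, discouraging oversized chunks)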
c.minSize = minSize
c.maxSize = maxSize
c.normSize = avgSize
c.bufSize = bufSize
c.maskS = (1 << smallBits) - 1
c.maskL = (1 << largeBits) - 1
c.rd = rd
c.cursor = bufSize
c.offset = 0
c.eof = false
return c
}
// Release returns the chunker to the pool for reuse.
func (c *ReusableChunker) Release() {
c.rd = nil
reusableChunkerPool.Put(c)
}
func (c *ReusableChunker) fillBuffer() error {
n := len(c.buf) - c.cursor
if n >= c.maxSize {
return nil
}
// Move all data after the cursor to the start of the buffer
copy(c.buf[:n], c.buf[c.cursor:])
c.cursor = 0
if c.eof {
c.buf = c.buf[:n]
return nil
}
// Restore buffer to full capacity for reading
c.buf = c.buf[:c.bufSize]
// Fill the rest of the buffer
m, err := io.ReadFull(c.rd, c.buf[n:])
if err == io.EOF || err == io.ErrUnexpectedEOF {
c.buf = c.buf[:n+m]
c.eof = true
} else if err != nil {
return err
}
return nil
}
// Next returns the next chunk or io.EOF when done.
// The returned Data slice is only valid until the next call to Next.
func (c *ReusableChunker) Next() (FastCDCChunk, error) {
if err := c.fillBuffer(); err != nil {
return FastCDCChunk{}, err
}
if len(c.buf) == 0 {
return FastCDCChunk{}, io.EOF
}
length, fp := c.nextChunk(c.buf[c.cursor:])
chunk := FastCDCChunk{
Offset: c.offset,
Length: length,
Data: c.buf[c.cursor : c.cursor+length],
Fingerprint: fp,
}
c.cursor += length
c.offset += chunk.Length
return chunk, nil
}
func (c *ReusableChunker) nextChunk(data []byte) (int, uint64) {
fp := uint64(0)
i := c.minSize
if len(data) <= c.minSize {
return len(data), fp
}
n := min(len(data), c.maxSize)
for ; i < min(n, c.normSize); i++ {
fp = (fp << 1) + table[data[i]]
if (fp & c.maskS) == 0 {
return i + 1, fp
}
}
for ; i < n; i++ {
fp = (fp << 1) + table[data[i]]
if (fp & c.maskL) == 0 {
return i + 1, fp
}
}
return i, fp
}
func min(a, b int) int {
if a < b {
return a
}
return b
}
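// (Go 1.21 added a predeclared min; this package-level helper shadows it
// here, presumably to keep older toolchains compiling)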
// 256 random uint64s for the rolling hash function (from FastCDC paper)
var table = [256]uint64{
0xe80e8d55032474b3, 0x11b25b61f5924e15, 0x03aa5bd82a9eb669, 0xc45a153ef107a38c,
0xeac874b86f0f57b9, 0xa5ccedec95ec79c7, 0xe15a3320ad42ac0a, 0x5ed3583fa63cec15,
0xcd497bf624a4451d, 0xf9ade5b059683605, 0x773940c03fb11ca1, 0xa36b16e4a6ae15b2,
0x67afd1adb5a89eac, 0xc44c75ee32f0038e, 0x2101790f365c0967, 0x76415c64a222fc4a,
0x579929249a1e577a, 0xe4762fc41fdbf750, 0xea52198e57dfcdcc, 0xe2535aafe30b4281,
0xcb1a1bd6c77c9056, 0x5a1aa9bfc4612a62, 0x15a728aef8943eb5, 0x2f8f09738a8ec8d9,
0x200f3dec9fac8074, 0x0fa9a7b1e0d318df, 0x06c0804ffd0d8e3a, 0x630cbc412669dd25,
0x10e34f85f4b10285, 0x2a6fe8164b9b6410, 0xcacb57d857d55810, 0x77f8a3a36ff11b46,
0x66af517e0dc3003e, 0x76c073c789b4009a, 0x853230dbb529f22a, 0x1e9e9c09a1f77e56,
0x1e871223802ee65d, 0x37fe4588718ff813, 0x10088539f30db464, 0x366f7470b80b72d1,
0x33f2634d9a6b31db, 0xd43917751d69ea18, 0xa0f492bc1aa7b8de, 0x3f94e5a8054edd20,
0xedfd6e25eb8b1dbf, 0x759517a54f196a56, 0xe81d5006ec7b6b17, 0x8dd8385fa894a6b7,
0x45f4d5467b0d6f91, 0xa1f894699de22bc8, 0x33829d09ef93e0fe, 0x3e29e250caed603c,
0xf7382cba7f63a45e, 0x970f95412bb569d1, 0xc7fcea456d356b4b, 0x723042513f3e7a57,
0x17ae7688de3596f1, 0x27ac1fcd7cd23c1a, 0xf429beeb78b3f71f, 0xd0780692fb93a3f9,
0x9f507e28a7c9842f, 0x56001ad536e433ae, 0x7e1dd1ecf58be306, 0x15fee353aa233fc6,
0xb033a0730b7638e8, 0xeb593ad6bd2406d1, 0x7c86502574d0f133, 0xce3b008d4ccb4be7,
0xf8566e3d383594c8, 0xb2c261e9b7af4429, 0xf685e7e253799dbb, 0x05d33ed60a494cbc,
0xeaf88d55a4cb0d1a, 0x3ee9368a902415a1, 0x8980fe6a8493a9a4, 0x358ed008cb448631,
0xd0cb7e37b46824b8, 0xe9bc375c0bc94f84, 0xea0bf1d8e6b55bb3, 0xb66a60d0f9f6f297,
0x66db2cc4807b3758, 0x7e4e014afbca8b4d, 0xa5686a4938b0c730, 0xa5f0d7353d623316,
0x26e38c349242d5e8, 0xeeefa80a29858e30, 0x8915cb912aa67386, 0x4b957a47bfc420d4,
0xbb53d051a895f7e1, 0x09f5e3235f6911ce, 0x416b98e695cfb7ce, 0x97a08183344c5c86,
0xbf68e0791839a861, 0xea05dde59ed3ed56, 0x0ca732280beda160, 0xac748ed62fe7f4e2,
0xc686da075cf6e151, 0xe1ba5658f4af05c8, 0xe9ff09fbeb67cc35, 0xafaea9470323b28d,
0x0291e8db5bb0ac2a, 0x342072a9bbee77ae, 0x03147eed6b3d0a9c, 0x21379d4de31dbadb,
0x2388d965226fb986, 0x52c96988bfebabfa, 0xa6fc29896595bc2d, 0x38fa4af70aa46b8b,
0xa688dd13939421ee, 0x99d5275d9b1415da, 0x453d31bb4fe73631, 0xde51debc1fbe3356,
0x75a3c847a06c622f, 0xe80e32755d272579, 0x5444052250d8ec0d, 0x8f17dfda19580a3b,
0xf6b3e9363a185e42, 0x7a42adec6868732f, 0x32cb6a07629203a2, 0x1eca8957defe56d9,
0x9fa85e4bc78ff9ed, 0x20ff07224a499ca7, 0x3fa6295ff9682c70, 0xe3d5b1e3ce993eff,
0xa341209362e0b79a, 0x64bd9eae5712ffe8, 0xceebb537babbd12a, 0x5586ef404315954f,
0x46c3085c938ab51a, 0xa82ccb9199907cee, 0x8c51b6690a3523c8, 0xc4dbd4c9ae518332,
0x979898dbb23db7b2, 0x1b5b585e6f672a9d, 0xce284da7c4903810, 0x841166e8bb5f1c4f,
0xb7d884a3fceca7d0, 0xa76468f5a4572374, 0xc10c45f49ee9513d, 0x68f9a5663c1908c9,
0x0095a13476a6339d, 0xd1d7516ffbe9c679, 0xfd94ab0c9726f938, 0x627468bbdb27c959,
0xedc3f8988e4a8c9a, 0x58efd33f0dfaa499, 0x21e37d7e2ef4ac8b, 0x297f9ab5586259c6,
0xda3ba4dc6cb9617d, 0xae11d8d9de2284d2, 0xcfeed88cb3729865, 0xefc2f9e4f03e2633,
0x8226393e8f0855a4, 0xd6e25fd7acf3a767, 0x435784c3bfd6d14a, 0xf97142e6343fe757,
0xd73b9fe826352f85, 0x6c3ac444b5b2bd76, 0xd8e88f3e9fd4a3fd, 0x31e50875c36f3460,
0xa824f1bf88cf4d44, 0x54a4d2c8f5f25899, 0xbff254637ce3b1e6, 0xa02cfe92561b3caa,
0x7bedb4edee9f0af7, 0x879c0620ac49a102, 0xa12c4ccd23b332e7, 0x09a5ff47bf94ed1e,
0x7b62f43cd3046fa0, 0xaa3af0476b9c2fb9, 0x22e55301abebba8e, 0x3a6035c42747bd58,
0x1705373106c8ec07, 0xb1f660de828d0628, 0x065fe82d89ca563d, 0xf555c2d8074d516d,
0x6bb6c186b423ee99, 0x54a807be6f3120a8, 0x8a3c7fe2f88860b8, 0xbeffc344f5118e81,
0xd686e80b7d1bd268, 0x661aef4ef5e5e88b, 0x5bf256c654cd1dda, 0x9adb1ab85d7640f4,
0x68449238920833a2, 0x843279f4cebcb044, 0xc8710cdefa93f7bb, 0x236943294538f3e6,
0x80d7d136c486d0b4, 0x61653956b28851d3, 0x3f843be9a9a956b5, 0xf73cfbbf137987e5,
0xcf0cb6dee8ceac2c, 0x50c401f52f185cae, 0xbdbe89ce735c4c1c, 0xeef3ade9c0570bc7,
0xbe8b066f8f64cbf6, 0x5238d6131705dcb9, 0x20219086c950e9f6, 0x634468d9ed74de02,
0x0aba4b3d705c7fa5, 0x3374416f725a6672, 0xe7378bdf7beb3bc6, 0x0f7b6a1b1cee565b,
0x234e4c41b0c33e64, 0x4efa9a0c3f21fe28, 0x1167fc551643e514, 0x9f81a69d3eb01fa4,
0xdb75c22b12306ed0, 0xe25055d738fc9686, 0x9f9f167a3f8507bb, 0x195f8336d3fbe4d3,
0x8442b6feffdcb6f6, 0x1e07ed24746ffde9, 0x140e31462d555266, 0x8bd0ce515ae1406e,
0x2c0be0042b5584b3, 0x35a23d0e15d45a60, 0xc14f1ba147d9bc83, 0xbbf168691264b23f,
0xad2cc7b57e589ade, 0x9501963154c7815c, 0x9664afa6b8d67d47, 0x7f9e5101fea0a81c,
0x45ecffb610d25bfd, 0x3157f7aecf9b6ab3, 0xc43ca6f88d87501d, 0x9576ff838dee38dc,
0x93f21afe0ce1c7d7, 0xceac699df343d8f9, 0x2fec49e29f03398d, 0x8805ccd5730281ed,
0xf9fc16fc750a8e59, 0x35308cc771adf736, 0x4a57b7c9ee2b7def, 0x03a4c6cdc937a02a,
0x6c9a8a269fc8c4fc, 0x4681decec7a03f43, 0x342eecded1353ef9, 0x8be0552d8413a867,
0xc7b4ac51beda8be8, 0xebcc64fb719842c0, 0xde8e4c7fb6d40c1c, 0xcc8263b62f9738b1,
0xd3cfc0f86511929a, 0x466024ce8bb226ea, 0x459ff690253a3c18, 0x98b27e9d91284c9c,
0x75c3ae8aa3af373d, 0xfbf8f8e79a866ffc, 0x32327f59d0662799, 0x8228b57e729e9830,
0x065ceb7a18381b58, 0xd2177671a31dc5ff, 0x90cd801f2f8701f9, 0x9d714428471c65fe,
}
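A minimal driver loop for the pooled chunker (sizes illustrative; error handling abbreviated):

chunker := AcquireReusableChunker(r, 64<<10, 256<<10, 1<<20)
defer chunker.Release()
for {
	chunk, err := chunker.Next()
	if err == io.EOF {
		break
	}
	if err != nil {
		return err
	}
	// chunk.Data aliases the pooled buffer: consume or copy it before
	// calling Next again.
	process(chunk.Data)
}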


@@ -2,9 +2,11 @@ package cli
import (
"context"
"errors"
"fmt"
"os"
"os/signal"
"path/filepath"
"syscall"
"time"
@@ -12,6 +14,11 @@ import (
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/globals"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/pidlock"
"git.eeqj.de/sneak/vaultik/internal/snapshot"
"git.eeqj.de/sneak/vaultik/internal/storage"
"git.eeqj.de/sneak/vaultik/internal/vaultik"
"github.com/adrg/xdg"
"go.uber.org/fx"
)
@@ -48,6 +55,9 @@ func NewApp(opts AppOptions) *fx.App {
config.Module,
database.Module,
log.Module,
storage.Module,
snapshot.Module,
fx.Provide(vaultik.New),
fx.Invoke(setupGlobals),
fx.NopLogger,
}
@@ -112,7 +122,23 @@ func RunApp(ctx context.Context, app *fx.App) error {
// RunWithApp is a helper that creates and runs an fx app with the given options.
// It combines NewApp and RunApp into a single convenient function. This is the
// preferred way to run CLI commands that need the full application context.
// It acquires a PID lock before starting to prevent concurrent instances.
func RunWithApp(ctx context.Context, opts AppOptions) error {
// Acquire PID lock to prevent concurrent instances
lockDir := filepath.Join(xdg.DataHome, "berlin.sneak.app.vaultik")
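// typically ~/.local/share/berlin.sneak.app.vaultik on Linux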
lock, err := pidlock.Acquire(lockDir)
if err != nil {
if errors.Is(err, pidlock.ErrAlreadyRunning) {
return fmt.Errorf("cannot start: %w", err)
}
return fmt.Errorf("failed to acquire lock: %w", err)
}
defer func() {
if err := lock.Release(); err != nil {
log.Warn("Failed to release PID lock", "error", err)
}
}()
app := NewApp(opts)
return RunApp(ctx, app)
}

internal/cli/database.go (new file, 102 lines)

@@ -0,0 +1,102 @@
package cli
import (
"fmt"
"os"
"git.eeqj.de/sneak/vaultik/internal/config"
"git.eeqj.de/sneak/vaultik/internal/log"
"github.com/spf13/cobra"
)
// NewDatabaseCommand creates the database command group
func NewDatabaseCommand() *cobra.Command {
cmd := &cobra.Command{
Use: "database",
Short: "Manage the local state database",
Long: `Commands for managing the local SQLite state database.`,
}
cmd.AddCommand(
newDatabasePurgeCommand(),
)
return cmd
}
// newDatabasePurgeCommand creates the database purge command
func newDatabasePurgeCommand() *cobra.Command {
var force bool
cmd := &cobra.Command{
Use: "purge",
Short: "Delete the local state database",
Long: `Completely removes the local SQLite state database.
This will erase all local tracking of:
- File metadata and change detection state
- Chunk and blob mappings
- Local snapshot records
The remote storage is NOT affected. After purging, the next backup will
perform a full scan and re-deduplicate against existing remote blobs.
Use --force to skip the confirmation prompt.`,
Args: cobra.NoArgs,
RunE: func(cmd *cobra.Command, args []string) error {
// Resolve config path
configPath, err := ResolveConfigPath()
if err != nil {
return err
}
// Load config to get database path
cfg, err := config.Load(configPath)
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}
dbPath := cfg.IndexPath
// Check if database exists
if _, err := os.Stat(dbPath); os.IsNotExist(err) {
fmt.Printf("Database does not exist: %s\n", dbPath)
return nil
}
// Confirm unless --force
if !force {
fmt.Printf("This will delete the local state database at:\n %s\n\n", dbPath)
fmt.Print("Are you sure? Type 'yes' to confirm: ")
var confirm string
if _, err := fmt.Scanln(&confirm); err != nil || confirm != "yes" {
fmt.Println("Aborted.")
return nil
}
}
// Delete the database file
if err := os.Remove(dbPath); err != nil {
return fmt.Errorf("failed to delete database: %w", err)
}
// Also delete WAL and SHM files if they exist
walPath := dbPath + "-wal"
shmPath := dbPath + "-shm"
_ = os.Remove(walPath) // Ignore errors - files may not exist
_ = os.Remove(shmPath)
rootFlags := GetRootFlags()
if !rootFlags.Quiet {
fmt.Printf("Database purged: %s\n", dbPath)
}
log.Info("Local state database purged", "path", dbPath)
return nil
},
}
cmd.Flags().BoolVar(&force, "force", false, "Skip confirmation prompt")
return cmd
}


@@ -18,7 +18,7 @@ func TestCLIEntry(t *testing.T) {
}
// Verify all subcommands are registered
- expectedCommands := []string{"snapshot", "store", "restore", "prune", "verify", "fetch"}
+ expectedCommands := []string{"snapshot", "store", "restore", "prune", "verify", "info", "version"}
for _, expected := range expectedCommands {
found := false
for _, cmd := range cmd.Commands() {


@@ -1,88 +0,0 @@
package cli
import (
"context"
"fmt"
"os"
"git.eeqj.de/sneak/vaultik/internal/globals"
"github.com/spf13/cobra"
"go.uber.org/fx"
)
// FetchOptions contains options for the fetch command
type FetchOptions struct {
Bucket string
Prefix string
SnapshotID string
FilePath string
Target string
}
// NewFetchCommand creates the fetch command
func NewFetchCommand() *cobra.Command {
opts := &FetchOptions{}
cmd := &cobra.Command{
Use: "fetch",
Short: "Extract single file from backup",
Long: `Download and decrypt a single file from a backup snapshot`,
Args: cobra.NoArgs,
RunE: func(cmd *cobra.Command, args []string) error {
// Validate required flags
if opts.Bucket == "" {
return fmt.Errorf("--bucket is required")
}
if opts.Prefix == "" {
return fmt.Errorf("--prefix is required")
}
if opts.SnapshotID == "" {
return fmt.Errorf("--snapshot is required")
}
if opts.FilePath == "" {
return fmt.Errorf("--file is required")
}
if opts.Target == "" {
return fmt.Errorf("--target is required")
}
return runFetch(cmd.Context(), opts)
},
}
cmd.Flags().StringVar(&opts.Bucket, "bucket", "", "S3 bucket name")
cmd.Flags().StringVar(&opts.Prefix, "prefix", "", "S3 prefix")
cmd.Flags().StringVar(&opts.SnapshotID, "snapshot", "", "Snapshot ID")
cmd.Flags().StringVar(&opts.FilePath, "file", "", "Path of file to extract from backup")
cmd.Flags().StringVar(&opts.Target, "target", "", "Target path for extracted file")
return cmd
}
func runFetch(ctx context.Context, opts *FetchOptions) error {
if os.Getenv("VAULTIK_PRIVATE_KEY") == "" {
return fmt.Errorf("VAULTIK_PRIVATE_KEY environment variable must be set")
}
app := fx.New(
fx.Supply(opts),
fx.Provide(globals.New),
// Additional modules will be added here
fx.Invoke(func(g *globals.Globals) error {
// TODO: Implement fetch logic
fmt.Printf("Fetching %s from snapshot %s to %s\n", opts.FilePath, opts.SnapshotID, opts.Target)
return nil
}),
fx.NopLogger,
)
if err := app.Start(ctx); err != nil {
return fmt.Errorf("failed to start fetch: %w", err)
}
defer func() {
if err := app.Stop(ctx); err != nil {
fmt.Printf("error stopping app: %v\n", err)
}
}()
return nil
}

internal/cli/info.go (new file, 71 lines)

@@ -0,0 +1,71 @@
package cli
import (
"context"
"os"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/vaultik"
"github.com/spf13/cobra"
"go.uber.org/fx"
)
// NewInfoCommand creates the info command
func NewInfoCommand() *cobra.Command {
cmd := &cobra.Command{
Use: "info",
Short: "Display system and configuration information",
Long: `Shows information about the current vaultik configuration, including:
- System details (OS, architecture, version)
- Storage configuration (S3 bucket, endpoint)
- Backup settings (source directories, compression)
- Encryption configuration (recipients)
- Local database statistics`,
Args: cobra.NoArgs,
RunE: func(cmd *cobra.Command, args []string) error {
// Use unified config resolution
configPath, err := ResolveConfigPath()
if err != nil {
return err
}
// Use the app framework
rootFlags := GetRootFlags()
return RunWithApp(cmd.Context(), AppOptions{
ConfigPath: configPath,
LogOptions: log.LogOptions{
Verbose: rootFlags.Verbose,
Debug: rootFlags.Debug,
Quiet: rootFlags.Quiet,
},
Modules: []fx.Option{},
Invokes: []fx.Option{
fx.Invoke(func(v *vaultik.Vaultik, lc fx.Lifecycle) {
lc.Append(fx.Hook{
OnStart: func(ctx context.Context) error {
go func() {
if err := v.ShowInfo(); err != nil {
if err != context.Canceled {
log.Error("Failed to show info", "error", err)
os.Exit(1)
}
}
if err := v.Shutdowner.Shutdown(); err != nil {
log.Error("Failed to shutdown", "error", err)
}
}()
return nil
},
OnStop: func(ctx context.Context) error {
v.Cancel()
return nil
},
})
}),
},
})
},
}
return cmd
}


@@ -2,51 +2,28 @@ package cli
import (
"context"
"fmt"
"strings"
"os"
"git.eeqj.de/sneak/vaultik/internal/config"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/globals"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/s3"
"git.eeqj.de/sneak/vaultik/internal/snapshot"
"github.com/dustin/go-humanize"
"git.eeqj.de/sneak/vaultik/internal/vaultik"
"github.com/spf13/cobra"
"go.uber.org/fx"
)
- // PruneOptions contains options for the prune command
- type PruneOptions struct {
- DryRun bool
- }
- // PruneApp contains all dependencies needed for pruning
- type PruneApp struct {
- Globals *globals.Globals
- Config *config.Config
- Repositories *database.Repositories
- S3Client *s3.Client
- DB *database.DB
- Shutdowner fx.Shutdowner
- }
// NewPruneCommand creates the prune command
func NewPruneCommand() *cobra.Command {
- opts := &PruneOptions{}
+ opts := &vaultik.PruneOptions{}
cmd := &cobra.Command{
Use: "prune",
Short: "Remove unreferenced blobs",
- Long: `Delete blobs that are no longer referenced by any snapshot.
+ Long: `Removes blobs that are not referenced by any snapshot.
- This command will:
- 1. Download the manifest from the last successful snapshot
- 2. List all blobs in S3
- 3. Delete any blobs not referenced in the manifest
+ This command scans all snapshots and their manifests to build a list of
+ referenced blobs, then removes any blobs in storage that are not in this list.
- Config is located at /etc/vaultik/config.yml by default, but can be overridden by
- specifying a path using --config or by setting VAULTIK_CONFIG to a path.`,
+ Use this command after deleting snapshots with 'vaultik purge' to reclaim
+ storage space.`,
Args: cobra.NoArgs,
RunE: func(cmd *cobra.Command, args []string) error {
// Use unified config resolution
@@ -62,39 +39,27 @@ specifying a path using --config or by setting VAULTIK_CONFIG to a path.`,
LogOptions: log.LogOptions{
Verbose: rootFlags.Verbose,
Debug: rootFlags.Debug,
Quiet: rootFlags.Quiet || opts.JSON,
},
- Modules: []fx.Option{
- snapshot.Module,
- s3.Module,
- fx.Provide(fx.Annotate(
- func(g *globals.Globals, cfg *config.Config, repos *database.Repositories,
- s3Client *s3.Client, db *database.DB, shutdowner fx.Shutdowner) *PruneApp {
- return &PruneApp{
- Globals: g,
- Config: cfg,
- Repositories: repos,
- S3Client: s3Client,
- DB: db,
- Shutdowner: shutdowner,
- }
- },
- )),
- },
+ Modules: []fx.Option{},
Invokes: []fx.Option{
- fx.Invoke(func(app *PruneApp, lc fx.Lifecycle) {
+ fx.Invoke(func(v *vaultik.Vaultik, lc fx.Lifecycle) {
lc.Append(fx.Hook{
OnStart: func(ctx context.Context) error {
// Start the prune operation in a goroutine
go func() {
// Run the prune operation
- if err := app.runPrune(ctx, opts); err != nil {
+ if err := v.PruneBlobs(opts); err != nil {
if err != context.Canceled {
if !opts.JSON {
log.Error("Prune operation failed", "error", err)
}
os.Exit(1)
}
}
// Shutdown the app when prune completes
- if err := app.Shutdowner.Shutdown(); err != nil {
+ if err := v.Shutdowner.Shutdown(); err != nil {
log.Error("Failed to shutdown", "error", err)
}
}()
@@ -102,6 +67,7 @@ specifying a path using --config or by setting VAULTIK_CONFIG to a path.`,
},
OnStop: func(ctx context.Context) error {
log.Debug("Stopping prune operation")
v.Cancel()
return nil
},
})
@@ -111,186 +77,8 @@ specifying a path using --config or by setting VAULTIK_CONFIG to a path.`,
},
}
cmd.Flags().BoolVar(&opts.DryRun, "dry-run", false, "Show what would be deleted without actually deleting")
cmd.Flags().BoolVar(&opts.Force, "force", false, "Skip confirmation prompt")
cmd.Flags().BoolVar(&opts.JSON, "json", false, "Output pruning stats as JSON")
return cmd
}
// runPrune executes the prune operation
func (app *PruneApp) runPrune(ctx context.Context, opts *PruneOptions) error {
log.Info("Starting prune operation",
"bucket", app.Config.S3.Bucket,
"prefix", app.Config.S3.Prefix,
"dry_run", opts.DryRun,
)
// Step 1: Get the latest complete snapshot from the database
log.Info("Getting latest snapshot from database")
snapshots, err := app.Repositories.Snapshots.ListRecent(ctx, 1)
if err != nil {
return fmt.Errorf("listing snapshots: %w", err)
}
if len(snapshots) == 0 {
return fmt.Errorf("no snapshots found in database")
}
latestSnapshot := snapshots[0]
if latestSnapshot.CompletedAt == nil {
return fmt.Errorf("latest snapshot %s is incomplete", latestSnapshot.ID)
}
log.Info("Found latest snapshot",
"id", latestSnapshot.ID,
"completed_at", latestSnapshot.CompletedAt.Format("2006-01-02 15:04:05"))
// Step 2: Find and download the manifest from the last successful snapshot in S3
log.Info("Finding last successful snapshot in S3")
metadataPrefix := "metadata/"
// List all snapshots in S3
var s3Snapshots []string
objectCh := app.S3Client.ListObjectsStream(ctx, metadataPrefix, false)
for obj := range objectCh {
if obj.Err != nil {
return fmt.Errorf("listing metadata objects: %w", obj.Err)
}
// Extract snapshot ID from path like "metadata/hostname-20240115-143052Z/manifest.json.zst"
parts := strings.Split(obj.Key, "/")
if len(parts) >= 2 && strings.HasSuffix(obj.Key, "/manifest.json.zst") {
s3Snapshots = append(s3Snapshots, parts[1])
}
}
if len(s3Snapshots) == 0 {
return fmt.Errorf("no snapshot manifests found in S3")
}
// Find the most recent snapshot (they're named with timestamps)
var lastS3Snapshot string
for _, snap := range s3Snapshots {
if lastS3Snapshot == "" || snap > lastS3Snapshot {
lastS3Snapshot = snap
}
}
log.Info("Found last S3 snapshot", "id", lastS3Snapshot)
// Step 3: Verify the last S3 snapshot matches the latest DB snapshot
if lastS3Snapshot != latestSnapshot.ID {
return fmt.Errorf("latest snapshot in database (%s) does not match last successful snapshot in S3 (%s)",
latestSnapshot.ID, lastS3Snapshot)
}
// Step 4: Download and parse the manifest
log.Info("Downloading manifest", "snapshot_id", lastS3Snapshot)
manifest, err := app.downloadManifest(ctx, lastS3Snapshot)
if err != nil {
return fmt.Errorf("downloading manifest: %w", err)
}
log.Info("Manifest loaded", "blob_count", len(manifest.Blobs))
// Step 5: Build set of referenced blobs
referencedBlobs := make(map[string]bool)
for _, blob := range manifest.Blobs {
referencedBlobs[blob.Hash] = true
}
// Step 6: List all blobs in S3
log.Info("Listing all blobs in S3")
blobPrefix := "blobs/"
var totalBlobs int
var unreferencedBlobs []s3.ObjectInfo
var unreferencedSize int64
objectCh = app.S3Client.ListObjectsStream(ctx, blobPrefix, true)
for obj := range objectCh {
if obj.Err != nil {
return fmt.Errorf("listing blobs: %w", obj.Err)
}
totalBlobs++
// Extract blob hash from path like "blobs/ca/fe/cafebabe..."
parts := strings.Split(obj.Key, "/")
if len(parts) == 4 {
blobHash := parts[3]
if !referencedBlobs[blobHash] {
unreferencedBlobs = append(unreferencedBlobs, obj)
unreferencedSize += obj.Size
}
}
}
log.Info("Blob scan complete",
"total_blobs", totalBlobs,
"referenced_blobs", len(referencedBlobs),
"unreferenced_blobs", len(unreferencedBlobs),
"unreferenced_size", humanize.Bytes(uint64(unreferencedSize)))
// Step 7: Delete or report unreferenced blobs
if opts.DryRun {
fmt.Printf("\nDry run mode - would delete %d unreferenced blobs\n", len(unreferencedBlobs))
fmt.Printf("Total size of blobs to delete: %s\n", humanize.Bytes(uint64(unreferencedSize)))
if len(unreferencedBlobs) > 0 {
log.Debug("Unreferenced blobs found", "count", len(unreferencedBlobs))
for _, obj := range unreferencedBlobs {
log.Debug("Would delete blob", "key", obj.Key, "size", humanize.Bytes(uint64(obj.Size)))
}
}
} else {
if len(unreferencedBlobs) == 0 {
fmt.Println("No unreferenced blobs to delete")
return nil
}
fmt.Printf("\nDeleting %d unreferenced blobs (%s)...\n",
len(unreferencedBlobs), humanize.Bytes(uint64(unreferencedSize)))
deletedCount := 0
deletedSize := int64(0)
for _, obj := range unreferencedBlobs {
if err := app.S3Client.RemoveObject(ctx, obj.Key); err != nil {
log.Error("Failed to delete blob", "key", obj.Key, "error", err)
continue
}
deletedCount++
deletedSize += obj.Size
// Show progress every 100 blobs
if deletedCount%100 == 0 {
fmt.Printf(" Deleted %d/%d blobs (%s)...\n",
deletedCount, len(unreferencedBlobs),
humanize.Bytes(uint64(deletedSize)))
}
}
fmt.Printf("\nDeleted %d blobs (%s)\n", deletedCount, humanize.Bytes(uint64(deletedSize)))
}
log.Info("Prune operation completed successfully")
return nil
}
// downloadManifest downloads and decompresses a snapshot manifest
func (app *PruneApp) downloadManifest(ctx context.Context, snapshotID string) (*snapshot.Manifest, error) {
manifestPath := fmt.Sprintf("metadata/%s/manifest.json.zst", snapshotID)
// Download the compressed manifest
reader, err := app.S3Client.GetObject(ctx, manifestPath)
if err != nil {
return nil, fmt.Errorf("downloading manifest: %w", err)
}
defer func() { _ = reader.Close() }()
// Decode manifest
manifest, err := snapshot.DecodeManifest(reader)
if err != nil {
return nil, fmt.Errorf("decoding manifest: %w", err)
}
return manifest, nil
}

internal/cli/purge.go (new file, 100 lines)

@@ -0,0 +1,100 @@
package cli
import (
"context"
"fmt"
"os"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/vaultik"
"github.com/spf13/cobra"
"go.uber.org/fx"
)
// PurgeOptions contains options for the purge command
type PurgeOptions struct {
KeepLatest bool
OlderThan string
Force bool
}
// NewPurgeCommand creates the purge command
func NewPurgeCommand() *cobra.Command {
opts := &PurgeOptions{}
cmd := &cobra.Command{
Use: "purge",
Short: "Purge old snapshots",
Long: `Removes snapshots based on age or count criteria.
This command allows you to:
- Keep only the latest snapshot (--keep-latest)
- Remove snapshots older than a specific duration (--older-than)
Config is located at /etc/vaultik/config.yml by default, but can be overridden by
specifying a path using --config or by setting VAULTIK_CONFIG to a path.`,
Args: cobra.NoArgs,
RunE: func(cmd *cobra.Command, args []string) error {
// Validate flags
if !opts.KeepLatest && opts.OlderThan == "" {
return fmt.Errorf("must specify either --keep-latest or --older-than")
}
if opts.KeepLatest && opts.OlderThan != "" {
return fmt.Errorf("cannot specify both --keep-latest and --older-than")
}
// Use unified config resolution
configPath, err := ResolveConfigPath()
if err != nil {
return err
}
// Use the app framework like other commands
rootFlags := GetRootFlags()
return RunWithApp(cmd.Context(), AppOptions{
ConfigPath: configPath,
LogOptions: log.LogOptions{
Verbose: rootFlags.Verbose,
Debug: rootFlags.Debug,
Quiet: rootFlags.Quiet,
},
Modules: []fx.Option{},
Invokes: []fx.Option{
fx.Invoke(func(v *vaultik.Vaultik, lc fx.Lifecycle) {
lc.Append(fx.Hook{
OnStart: func(ctx context.Context) error {
// Start the purge operation in a goroutine
go func() {
// Run the purge operation
if err := v.PurgeSnapshots(opts.KeepLatest, opts.OlderThan, opts.Force); err != nil {
if err != context.Canceled {
log.Error("Purge operation failed", "error", err)
os.Exit(1)
}
}
// Shutdown the app when purge completes
if err := v.Shutdowner.Shutdown(); err != nil {
log.Error("Failed to shutdown", "error", err)
}
}()
return nil
},
OnStop: func(ctx context.Context) error {
log.Debug("Stopping purge operation")
v.Cancel()
return nil
},
})
}),
},
})
},
}
cmd.Flags().BoolVar(&opts.KeepLatest, "keep-latest", false, "Keep only the latest snapshot")
cmd.Flags().StringVar(&opts.OlderThan, "older-than", "", "Remove snapshots older than duration (e.g. 30d, 6m, 1y)")
cmd.Flags().BoolVar(&opts.Force, "force", false, "Skip confirmation prompts")
return cmd
}

internal/cli/remote.go (new file, 89 lines)

@@ -0,0 +1,89 @@
package cli
import (
"context"
"os"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/vaultik"
"github.com/spf13/cobra"
"go.uber.org/fx"
)
// NewRemoteCommand creates the remote command and subcommands
func NewRemoteCommand() *cobra.Command {
cmd := &cobra.Command{
Use: "remote",
Short: "Remote storage management commands",
Long: "Commands for inspecting and managing remote storage",
}
// Add subcommands
cmd.AddCommand(newRemoteInfoCommand())
return cmd
}
// newRemoteInfoCommand creates the 'remote info' subcommand
func newRemoteInfoCommand() *cobra.Command {
var jsonOutput bool
cmd := &cobra.Command{
Use: "info",
Short: "Display remote storage information",
Long: `Shows detailed information about remote storage, including:
- Size of all snapshot metadata (per snapshot and total)
- Count and total size of all blobs
- Count and size of referenced blobs (from all manifests)
- Count and size of orphaned blobs (not referenced by any manifest)`,
Args: cobra.NoArgs,
RunE: func(cmd *cobra.Command, args []string) error {
// Use unified config resolution
configPath, err := ResolveConfigPath()
if err != nil {
return err
}
rootFlags := GetRootFlags()
return RunWithApp(cmd.Context(), AppOptions{
ConfigPath: configPath,
LogOptions: log.LogOptions{
Verbose: rootFlags.Verbose,
Debug: rootFlags.Debug,
Quiet: rootFlags.Quiet || jsonOutput,
},
Modules: []fx.Option{},
Invokes: []fx.Option{
fx.Invoke(func(v *vaultik.Vaultik, lc fx.Lifecycle) {
lc.Append(fx.Hook{
OnStart: func(ctx context.Context) error {
go func() {
if err := v.RemoteInfo(jsonOutput); err != nil {
if err != context.Canceled {
if !jsonOutput {
log.Error("Failed to get remote info", "error", err)
}
os.Exit(1)
}
}
if err := v.Shutdowner.Shutdown(); err != nil {
log.Error("Failed to shutdown", "error", err)
}
}()
return nil
},
OnStop: func(ctx context.Context) error {
v.Cancel()
return nil
},
})
}),
},
})
},
}
cmd.Flags().BoolVar(&jsonOutput, "json", false, "Output in JSON format")
return cmd
}


@@ -2,20 +2,30 @@ package cli
import (
"context"
"fmt"
"os"
"git.eeqj.de/sneak/vaultik/internal/config"
"git.eeqj.de/sneak/vaultik/internal/globals"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/storage"
"git.eeqj.de/sneak/vaultik/internal/vaultik"
"github.com/spf13/cobra"
"go.uber.org/fx"
)
// RestoreOptions contains options for the restore command
type RestoreOptions struct {
- Bucket string
- Prefix string
SnapshotID string
TargetDir string
Paths []string // Optional paths to restore (empty = all)
Verify bool // Verify restored files after restore
}
// RestoreApp contains all dependencies needed for restore
type RestoreApp struct {
Globals *globals.Globals
Config *config.Config
Storage storage.Storer
Vaultik *vaultik.Vaultik
Shutdowner fx.Shutdowner
}
// NewRestoreCommand creates the restore command
@@ -23,61 +33,104 @@ func NewRestoreCommand() *cobra.Command {
opts := &RestoreOptions{}
cmd := &cobra.Command{
Use: "restore",
Use: "restore <snapshot-id> <target-dir> [paths...]",
Short: "Restore files from backup",
- Long: `Download and decrypt files from a backup snapshot`,
- Args: cobra.NoArgs,
+ Long: `Download and decrypt files from a backup snapshot.
+ This command will restore files from the specified snapshot to the target directory.
+ If no paths are specified, all files are restored.
+ If paths are specified, only matching files/directories are restored.
+ Requires the VAULTIK_AGE_SECRET_KEY environment variable to be set with the age private key.
+ Examples:
+ # Restore entire snapshot
+ vaultik restore myhost_docs_2025-01-01T12:00:00Z /restore
+ # Restore specific file
+ vaultik restore myhost_docs_2025-01-01T12:00:00Z /restore /home/user/important.txt
+ # Restore specific directory
+ vaultik restore myhost_docs_2025-01-01T12:00:00Z /restore /home/user/documents/
+ # Restore and verify all files
+ vaultik restore --verify myhost_docs_2025-01-01T12:00:00Z /restore`,
+ Args: cobra.MinimumNArgs(2),
RunE: func(cmd *cobra.Command, args []string) error {
- // Validate required flags
- if opts.Bucket == "" {
- return fmt.Errorf("--bucket is required")
+ snapshotID := args[0]
+ opts.TargetDir = args[1]
+ if len(args) > 2 {
+ opts.Paths = args[2:]
}
- if opts.Prefix == "" {
- return fmt.Errorf("--prefix is required")
+ // Use unified config resolution
+ configPath, err := ResolveConfigPath()
+ if err != nil {
+ return err
}
- if opts.SnapshotID == "" {
- return fmt.Errorf("--snapshot is required")
+ // Use the app framework like other commands
rootFlags := GetRootFlags()
return RunWithApp(cmd.Context(), AppOptions{
ConfigPath: configPath,
LogOptions: log.LogOptions{
Verbose: rootFlags.Verbose,
Debug: rootFlags.Debug,
Quiet: rootFlags.Quiet,
},
Modules: []fx.Option{
fx.Provide(fx.Annotate(
func(g *globals.Globals, cfg *config.Config,
storer storage.Storer, v *vaultik.Vaultik, shutdowner fx.Shutdowner) *RestoreApp {
return &RestoreApp{
Globals: g,
Config: cfg,
Storage: storer,
Vaultik: v,
Shutdowner: shutdowner,
}
- if opts.TargetDir == "" {
- return fmt.Errorf("--target is required")
},
)),
},
Invokes: []fx.Option{
fx.Invoke(func(app *RestoreApp, lc fx.Lifecycle) {
lc.Append(fx.Hook{
OnStart: func(ctx context.Context) error {
// Start the restore operation in a goroutine
go func() {
// Run the restore operation
restoreOpts := &vaultik.RestoreOptions{
SnapshotID: snapshotID,
TargetDir: opts.TargetDir,
Paths: opts.Paths,
Verify: opts.Verify,
}
- return runRestore(cmd.Context(), opts)
if err := app.Vaultik.Restore(restoreOpts); err != nil {
if err != context.Canceled {
log.Error("Restore operation failed", "error", err)
}
}
// Shutdown the app when restore completes
if err := app.Shutdowner.Shutdown(); err != nil {
log.Error("Failed to shutdown", "error", err)
}
}()
return nil
},
OnStop: func(ctx context.Context) error {
log.Debug("Stopping restore operation")
app.Vaultik.Cancel()
return nil
},
})
}),
},
})
},
}
cmd.Flags().StringVar(&opts.Bucket, "bucket", "", "S3 bucket name")
cmd.Flags().StringVar(&opts.Prefix, "prefix", "", "S3 prefix")
cmd.Flags().StringVar(&opts.SnapshotID, "snapshot", "", "Snapshot ID to restore")
cmd.Flags().StringVar(&opts.TargetDir, "target", "", "Target directory for restore")
cmd.Flags().BoolVar(&opts.Verify, "verify", false, "Verify restored files by checking chunk hashes")
return cmd
}
- func runRestore(ctx context.Context, opts *RestoreOptions) error {
- if os.Getenv("VAULTIK_PRIVATE_KEY") == "" {
- return fmt.Errorf("VAULTIK_PRIVATE_KEY environment variable must be set")
- }
- app := fx.New(
- fx.Supply(opts),
- fx.Provide(globals.New),
- // Additional modules will be added here
- fx.Invoke(func(g *globals.Globals) error {
- // TODO: Implement restore logic
- fmt.Printf("Restoring snapshot %s to %s\n", opts.SnapshotID, opts.TargetDir)
- return nil
- }),
- fx.NopLogger,
- )
- if err := app.Start(ctx); err != nil {
- return fmt.Errorf("failed to start restore: %w", err)
- }
- defer func() {
- if err := app.Stop(ctx); err != nil {
- fmt.Printf("error stopping app: %v\n", err)
- }
- }()
- return nil
- }


@@ -13,6 +13,7 @@ type RootFlags struct {
ConfigPath string
Verbose bool
Debug bool
Quiet bool
}
var rootFlags RootFlags
@@ -34,15 +35,19 @@ on the source system.`,
cmd.PersistentFlags().StringVar(&rootFlags.ConfigPath, "config", "", "Path to config file (default: $VAULTIK_CONFIG or /etc/vaultik/config.yml)")
cmd.PersistentFlags().BoolVarP(&rootFlags.Verbose, "verbose", "v", false, "Enable verbose output")
cmd.PersistentFlags().BoolVar(&rootFlags.Debug, "debug", false, "Enable debug output")
cmd.PersistentFlags().BoolVarP(&rootFlags.Quiet, "quiet", "q", false, "Suppress non-error output")
// Add subcommands
cmd.AddCommand(
NewRestoreCommand(),
NewPruneCommand(),
NewVerifyCommand(),
- NewFetchCommand(),
NewStoreCommand(),
NewSnapshotCommand(),
+ NewInfoCommand(),
+ NewVersionCommand(),
+ NewRemoteCommand(),
+ NewDatabaseCommand(),
)
return cmd

File diff suppressed because it is too large.

@@ -7,14 +7,14 @@ import (
"time"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/s3"
"git.eeqj.de/sneak/vaultik/internal/storage"
"github.com/spf13/cobra"
"go.uber.org/fx"
)
// StoreApp contains dependencies for store commands
type StoreApp struct {
- S3Client *s3.Client
+ Storage storage.Storer
Shutdowner fx.Shutdowner
}
@@ -23,7 +23,7 @@ func NewStoreCommand() *cobra.Command {
cmd := &cobra.Command{
Use: "store",
Short: "Storage information commands",
Long: "Commands for viewing information about the S3 storage backend",
Long: "Commands for viewing information about the storage backend",
}
// Add subcommands
@@ -37,7 +37,7 @@ func newStoreInfoCommand() *cobra.Command {
return &cobra.Command{
Use: "info",
Short: "Display storage information",
Long: "Shows S3 bucket configuration and storage statistics including snapshots and blobs",
Long: "Shows storage configuration and statistics including snapshots and blobs",
RunE: func(cmd *cobra.Command, args []string) error {
return runWithApp(cmd.Context(), func(app *StoreApp) error {
return app.Info(cmd.Context())
@@ -48,19 +48,18 @@ func newStoreInfoCommand() *cobra.Command {
// Info displays storage information
func (app *StoreApp) Info(ctx context.Context) error {
- // Get bucket info
- bucketName := app.S3Client.BucketName()
- endpoint := app.S3Client.Endpoint()
+ // Get storage info
+ storageInfo := app.Storage.Info()
fmt.Printf("Storage Information\n")
fmt.Printf("==================\n\n")
fmt.Printf("S3 Configuration:\n")
fmt.Printf(" Endpoint: %s\n", endpoint)
fmt.Printf(" Bucket: %s\n\n", bucketName)
fmt.Printf("Storage Configuration:\n")
fmt.Printf(" Type: %s\n", storageInfo.Type)
fmt.Printf(" Location: %s\n\n", storageInfo.Location)
// Count snapshots by listing metadata/ prefix
snapshotCount := 0
snapshotCh := app.S3Client.ListObjectsStream(ctx, "metadata/", true)
snapshotCh := app.Storage.ListStream(ctx, "metadata/")
snapshotDirs := make(map[string]bool)
for object := range snapshotCh {
@@ -79,7 +78,7 @@ func (app *StoreApp) Info(ctx context.Context) error {
blobCount := 0
var totalSize int64
blobCh := app.S3Client.ListObjectsStream(ctx, "blobs/", false)
blobCh := app.Storage.ListStream(ctx, "blobs/")
for object := range blobCh {
if object.Err != nil {
return fmt.Errorf("listing blobs: %w", object.Err)
@@ -128,12 +127,12 @@ func runWithApp(ctx context.Context, fn func(*StoreApp) error) error {
LogOptions: log.LogOptions{
Verbose: rootFlags.Verbose,
Debug: rootFlags.Debug,
Quiet: rootFlags.Quiet,
},
Modules: []fx.Option{
s3.Module,
fx.Provide(func(s3Client *s3.Client, shutdowner fx.Shutdowner) *StoreApp {
fx.Provide(func(storer storage.Storer, shutdowner fx.Shutdowner) *StoreApp {
return &StoreApp{
S3Client: s3Client,
Storage: storer,
Shutdowner: shutdowner,
}
}),

View File

@@ -0,0 +1,10 @@
package cli
import "time"
// SnapshotInfo represents snapshot information for listing
type SnapshotInfo struct {
ID string `json:"id"`
Timestamp time.Time `json:"timestamp"`
CompressedSize int64 `json:"compressed_size"`
}
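
For illustration (not part of the diff), a slice of these marshals to the array shape that consumers of --json can expect. The ID value below is hypothetical, and the example assumes encoding/json, fmt, and time are imported in package cli:

func ExampleSnapshotInfo() {
	snaps := []SnapshotInfo{{
		ID:             "example-snapshot-id", // hypothetical ID value
		Timestamp:      time.Date(2026, 2, 16, 0, 0, 0, 0, time.UTC),
		CompressedSize: 123456,
	}}
	out, _ := json.MarshalIndent(snaps, "", "  ")
	fmt.Println(string(out))
	// Output:
	// [
	//   {
	//     "id": "example-snapshot-id",
	//     "timestamp": "2026-02-16T00:00:00Z",
	//     "compressed_size": 123456
	//   }
	// ]
}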

View File

@@ -2,85 +2,97 @@ package cli
import (
"context"
"fmt"
"os"
"git.eeqj.de/sneak/vaultik/internal/globals"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/vaultik"
"github.com/spf13/cobra"
"go.uber.org/fx"
)
// VerifyOptions contains options for the verify command
type VerifyOptions struct {
Bucket string
Prefix string
SnapshotID string
Quick bool
}
// NewVerifyCommand creates the verify command
func NewVerifyCommand() *cobra.Command {
opts := &VerifyOptions{}
opts := &vaultik.VerifyOptions{}
cmd := &cobra.Command{
Use: "verify",
Short: "Verify backup integrity",
Long: `Check that all referenced blobs exist and verify metadata integrity`,
Args: cobra.NoArgs,
Use: "verify <snapshot-id>",
Short: "Verify snapshot integrity",
Long: `Verifies that all blobs referenced in a snapshot exist and optionally verifies their contents.
Shallow verification (default):
- Downloads and decompresses manifest
- Checks existence of all blobs in S3
- Reports missing blobs
Deep verification (--deep):
- Downloads and decrypts database
- Verifies blob lists match between manifest and database
- Downloads, decrypts, and decompresses each blob
- Verifies SHA256 hash of each chunk matches database
- Ensures chunks are ordered correctly
The command will fail immediately on any verification error and exit with non-zero status.`,
Args: cobra.ExactArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
// Validate required flags
if opts.Bucket == "" {
return fmt.Errorf("--bucket is required")
snapshotID := args[0]
// Use unified config resolution
configPath, err := ResolveConfigPath()
if err != nil {
return err
}
if opts.Prefix == "" {
return fmt.Errorf("--prefix is required")
// Use the app framework for all verification
rootFlags := GetRootFlags()
return RunWithApp(cmd.Context(), AppOptions{
ConfigPath: configPath,
LogOptions: log.LogOptions{
Verbose: rootFlags.Verbose,
Debug: rootFlags.Debug,
Quiet: rootFlags.Quiet || opts.JSON, // Suppress log output in JSON mode
},
Modules: []fx.Option{},
Invokes: []fx.Option{
fx.Invoke(func(v *vaultik.Vaultik, lc fx.Lifecycle) {
lc.Append(fx.Hook{
OnStart: func(ctx context.Context) error {
// Run the verify operation directly
go func() {
var err error
if opts.Deep {
err = v.RunDeepVerify(snapshotID, opts)
} else {
err = v.VerifySnapshotWithOptions(snapshotID, opts)
}
return runVerify(cmd.Context(), opts)
if err != nil {
if err != context.Canceled {
if !opts.JSON {
log.Error("Verification failed", "error", err)
}
os.Exit(1)
}
}
if err := v.Shutdowner.Shutdown(); err != nil {
log.Error("Failed to shutdown", "error", err)
}
}()
return nil
},
OnStop: func(ctx context.Context) error {
log.Debug("Stopping verify operation")
v.Cancel()
return nil
},
})
}),
},
})
},
}
cmd.Flags().StringVar(&opts.Bucket, "bucket", "", "S3 bucket name")
cmd.Flags().StringVar(&opts.Prefix, "prefix", "", "S3 prefix")
cmd.Flags().StringVar(&opts.SnapshotID, "snapshot", "", "Snapshot ID to verify (optional, defaults to latest)")
cmd.Flags().BoolVar(&opts.Quick, "quick", false, "Perform quick verification by checking blob existence and S3 content hashes without downloading")
cmd.Flags().BoolVar(&opts.Deep, "deep", false, "Perform deep verification by downloading and verifying all blob contents")
cmd.Flags().BoolVar(&opts.JSON, "json", false, "Output verification results as JSON")
return cmd
}
func runVerify(ctx context.Context, opts *VerifyOptions) error {
if os.Getenv("VAULTIK_PRIVATE_KEY") == "" {
return fmt.Errorf("VAULTIK_PRIVATE_KEY environment variable must be set")
}
app := fx.New(
fx.Supply(opts),
fx.Provide(globals.New),
// Additional modules will be added here
fx.Invoke(func(g *globals.Globals) error {
// TODO: Implement verify logic
if opts.SnapshotID == "" {
fmt.Printf("Verifying latest snapshot in bucket %s with prefix %s\n", opts.Bucket, opts.Prefix)
} else {
fmt.Printf("Verifying snapshot %s in bucket %s with prefix %s\n", opts.SnapshotID, opts.Bucket, opts.Prefix)
}
if opts.Quick {
fmt.Println("Performing quick verification")
} else {
fmt.Println("Performing deep verification")
}
return nil
}),
fx.NopLogger,
)
if err := app.Start(ctx); err != nil {
return fmt.Errorf("failed to start verify: %w", err)
}
defer func() {
if err := app.Stop(ctx); err != nil {
fmt.Printf("error stopping app: %v\n", err)
}
}()
return nil
}
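
To make the deep path described in the help text concrete, here is a minimal per-blob sketch under stated assumptions: the fetch and decrypt callbacks stand in for the storage and crypto layers, and the zstd reader assumes the klauspost/compress library. The real logic lives in RunDeepVerify and is not reproduced here.

package sketch

import (
	"context"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"

	"github.com/klauspost/compress/zstd"
)

type expectedChunk struct {
	Hash string // hex SHA256 recorded in the local database
	Size int64
}

// deepVerifyBlob downloads, decrypts, and decompresses one blob, then checks
// that each chunk's SHA256 matches the database and appears in order.
func deepVerifyBlob(ctx context.Context,
	fetch func(context.Context, string) (io.ReadCloser, error),
	decrypt func(io.Reader) (io.Reader, error),
	blobKey string, chunks []expectedChunk) error {
	rc, err := fetch(ctx, blobKey)
	if err != nil {
		return fmt.Errorf("fetching blob %s: %w", blobKey, err)
	}
	defer func() { _ = rc.Close() }()
	plain, err := decrypt(rc)
	if err != nil {
		return fmt.Errorf("decrypting blob %s: %w", blobKey, err)
	}
	zr, err := zstd.NewReader(plain)
	if err != nil {
		return fmt.Errorf("decompressing blob %s: %w", blobKey, err)
	}
	defer zr.Close()
	for i, ch := range chunks {
		h := sha256.New()
		if _, err := io.CopyN(h, zr, ch.Size); err != nil {
			return fmt.Errorf("reading chunk %d: %w", i, err)
		}
		if got := hex.EncodeToString(h.Sum(nil)); got != ch.Hash {
			return fmt.Errorf("chunk %d: hash mismatch (got %s, want %s)", i, got, ch.Hash)
		}
	}
	return nil
}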

internal/cli/version.go (new file, 27 lines)
View File

@@ -0,0 +1,27 @@
package cli
import (
"fmt"
"runtime"
"git.eeqj.de/sneak/vaultik/internal/globals"
"github.com/spf13/cobra"
)
// NewVersionCommand creates the version command
func NewVersionCommand() *cobra.Command {
cmd := &cobra.Command{
Use: "version",
Short: "Print version information",
Long: `Print version, git commit, and build information for vaultik.`,
Args: cobra.NoArgs,
Run: func(cmd *cobra.Command, args []string) {
fmt.Printf("vaultik %s\n", globals.Version)
fmt.Printf(" commit: %s\n", globals.Commit)
fmt.Printf(" go: %s\n", runtime.Version())
fmt.Printf(" os/arch: %s/%s\n", runtime.GOOS, runtime.GOARCH)
},
}
return cmd
}

View File

@@ -3,29 +3,107 @@ package config
import (
"fmt"
"os"
"path/filepath"
"sort"
"strings"
"time"
"filippo.io/age"
"git.eeqj.de/sneak/smartconfig"
"git.eeqj.de/sneak/vaultik/internal/log"
"github.com/adrg/xdg"
"go.uber.org/fx"
"gopkg.in/yaml.v3"
)
const appName = "berlin.sneak.app.vaultik"
// expandTilde expands ~ at the start of a path to the user's home directory.
func expandTilde(path string) string {
if path == "~" {
home, _ := os.UserHomeDir()
return home
}
if strings.HasPrefix(path, "~/") {
home, _ := os.UserHomeDir()
return filepath.Join(home, path[2:])
}
return path
}
// expandTildeInURL expands ~ in file:// URLs.
func expandTildeInURL(url string) string {
if strings.HasPrefix(url, "file://~/") {
home, _ := os.UserHomeDir()
return "file://" + filepath.Join(home, url[9:])
}
return url
}
// SnapshotConfig represents configuration for a named snapshot.
// Each snapshot backs up one or more paths and can have its own exclude patterns
// in addition to the global excludes.
type SnapshotConfig struct {
Paths []string `yaml:"paths"`
Exclude []string `yaml:"exclude"` // Additional excludes for this snapshot
}
// GetExcludes returns the combined exclude patterns for a named snapshot.
// It merges global excludes with the snapshot-specific excludes.
func (c *Config) GetExcludes(snapshotName string) []string {
snap, ok := c.Snapshots[snapshotName]
if !ok {
return c.Exclude
}
if len(snap.Exclude) == 0 {
return c.Exclude
}
// Combine global and snapshot-specific excludes
combined := make([]string, 0, len(c.Exclude)+len(snap.Exclude))
combined = append(combined, c.Exclude...)
combined = append(combined, snap.Exclude...)
return combined
}
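
An illustrative example test (patterns are hypothetical; assumes fmt is imported) pinning down the merge order: global excludes come first, snapshot-specific excludes are appended, and unknown names fall back to the globals alone.

func ExampleConfig_GetExcludes() {
	cfg := &Config{
		Exclude: []string{"*.tmp"},
		Snapshots: map[string]SnapshotConfig{
			"home": {Paths: []string{"/home"}, Exclude: []string{"*.cache"}},
		},
	}
	fmt.Println(cfg.GetExcludes("home"))
	fmt.Println(cfg.GetExcludes("nonexistent"))
	// Output:
	// [*.tmp *.cache]
	// [*.tmp]
}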
// SnapshotNames returns the names of all configured snapshots in sorted order.
func (c *Config) SnapshotNames() []string {
names := make([]string, 0, len(c.Snapshots))
for name := range c.Snapshots {
names = append(names, name)
}
// Sort for deterministic order
sort.Strings(names)
return names
}
// Config represents the application configuration for Vaultik.
// It defines all settings for backup operations, including source directories,
// encryption recipients, S3 storage configuration, and performance tuning parameters.
// encryption recipients, storage configuration, and performance tuning parameters.
// Configuration is typically loaded from a YAML file.
type Config struct {
AgeRecipients []string `yaml:"age_recipients"`
AgeSecretKey string `yaml:"age_secret_key"`
BackupInterval time.Duration `yaml:"backup_interval"`
BlobSizeLimit Size `yaml:"blob_size_limit"`
ChunkSize Size `yaml:"chunk_size"`
Exclude []string `yaml:"exclude"`
Exclude []string `yaml:"exclude"` // Global excludes applied to all snapshots
FullScanInterval time.Duration `yaml:"full_scan_interval"`
Hostname string `yaml:"hostname"`
IndexPath string `yaml:"index_path"`
MinTimeBetweenRun time.Duration `yaml:"min_time_between_run"`
S3 S3Config `yaml:"s3"`
SourceDirs []string `yaml:"source_dirs"`
Snapshots map[string]SnapshotConfig `yaml:"snapshots"`
CompressionLevel int `yaml:"compression_level"`
// StorageURL specifies the storage backend using a URL format.
// Takes precedence over S3Config if set.
// Supported formats:
// - s3://bucket/prefix?endpoint=host&region=us-east-1
// - file:///path/to/backup
// For S3 URLs, credentials are still read from s3.access_key_id and s3.secret_access_key.
StorageURL string `yaml:"storage_url"`
}
// S3Config represents S3 storage configuration for backup storage.
@@ -65,13 +143,14 @@ func New(path ConfigPath) (*Config, error) {
// Load reads and parses the configuration file from the specified path.
// It applies default values for optional fields, performs environment variable
// substitution for certain fields (like IndexPath), and validates the configuration.
// substitution using smartconfig, and validates the configuration.
// The configuration file should be in YAML format. Returns an error if the file
// cannot be read, parsed, or if validation fails.
func Load(path string) (*Config, error) {
data, err := os.ReadFile(path)
// Load config using smartconfig for interpolation
sc, err := smartconfig.NewFromConfigPath(path)
if err != nil {
return nil, fmt.Errorf("failed to read config file: %w", err)
return nil, fmt.Errorf("failed to load config file: %w", err)
}
cfg := &Config{
@@ -81,17 +160,41 @@ func Load(path string) (*Config, error) {
BackupInterval: 1 * time.Hour,
FullScanInterval: 24 * time.Hour,
MinTimeBetweenRun: 15 * time.Minute,
IndexPath: "/var/lib/vaultik/index.sqlite",
IndexPath: filepath.Join(xdg.DataHome, appName, "index.sqlite"),
CompressionLevel: 3,
}
if err := yaml.Unmarshal(data, cfg); err != nil {
// Convert smartconfig data to YAML then unmarshal
configData := sc.Data()
yamlBytes, err := yaml.Marshal(configData)
if err != nil {
return nil, fmt.Errorf("failed to marshal config data: %w", err)
}
if err := yaml.Unmarshal(yamlBytes, cfg); err != nil {
return nil, fmt.Errorf("failed to parse config: %w", err)
}
// Expand tilde in all path fields
cfg.IndexPath = expandTilde(cfg.IndexPath)
cfg.StorageURL = expandTildeInURL(cfg.StorageURL)
// Expand tildes in snapshot paths
for name, snap := range cfg.Snapshots {
for i, path := range snap.Paths {
snap.Paths[i] = expandTilde(path)
}
cfg.Snapshots[name] = snap
}
// Check for environment variable override for IndexPath
if envIndexPath := os.Getenv("VAULTIK_INDEX_PATH"); envIndexPath != "" {
cfg.IndexPath = envIndexPath
cfg.IndexPath = expandTilde(envIndexPath)
}
// Check for environment variable override for AgeSecretKey
if envAgeSecretKey := os.Getenv("VAULTIK_AGE_SECRET_KEY"); envAgeSecretKey != "" {
cfg.AgeSecretKey = extractAgeSecretKey(envAgeSecretKey)
}
// Get hostname if not set
@@ -111,6 +214,17 @@ func Load(path string) (*Config, error) {
cfg.S3.PartSize = Size(5 * 1024 * 1024) // 5MB
}
// Check config file permissions (warn if world or group readable)
if info, err := os.Stat(path); err == nil {
mode := info.Mode().Perm()
if mode&0044 != 0 { // group or world readable
log.Warn("Config file has insecure permissions (contains S3 credentials)",
"path", path,
"mode", fmt.Sprintf("%04o", mode),
"recommendation", "chmod 600 "+path)
}
}
if err := cfg.Validate(); err != nil {
return nil, fmt.Errorf("invalid config: %w", err)
}
@@ -121,8 +235,8 @@ func Load(path string) (*Config, error) {
// Validate checks if the configuration is valid and complete.
// It ensures all required fields are present and have valid values:
// - At least one age recipient must be specified
// - At least one source directory must be configured
// - S3 credentials and endpoint must be provided
// - At least one snapshot must be configured with at least one path
// - Storage must be configured (either storage_url or s3.* fields)
// - Chunk size must be at least 1MB
// - Blob size limit must be at least the chunk size
// - Compression level must be between 1 and 19
@@ -132,24 +246,19 @@ func (c *Config) Validate() error {
return fmt.Errorf("at least one age_recipient is required")
}
if len(c.SourceDirs) == 0 {
return fmt.Errorf("at least one source directory is required")
if len(c.Snapshots) == 0 {
return fmt.Errorf("at least one snapshot must be configured")
}
if c.S3.Endpoint == "" {
return fmt.Errorf("s3.endpoint is required")
for name, snap := range c.Snapshots {
if len(snap.Paths) == 0 {
return fmt.Errorf("snapshot %q must have at least one path", name)
}
}
if c.S3.Bucket == "" {
return fmt.Errorf("s3.bucket is required")
}
if c.S3.AccessKeyID == "" {
return fmt.Errorf("s3.access_key_id is required")
}
if c.S3.SecretAccessKey == "" {
return fmt.Errorf("s3.secret_access_key is required")
// Validate storage configuration
if err := c.validateStorage(); err != nil {
return err
}
if c.ChunkSize.Int64() < 1024*1024 { // 1MB minimum
@@ -167,6 +276,69 @@ func (c *Config) Validate() error {
return nil
}
// validateStorage validates storage configuration.
// If StorageURL is set, it takes precedence. S3 URLs require credentials.
// File URLs don't require any S3 configuration.
// If StorageURL is not set, legacy S3 configuration is required.
func (c *Config) validateStorage() error {
if c.StorageURL != "" {
// URL-based configuration
if strings.HasPrefix(c.StorageURL, "file://") {
// File storage doesn't need S3 credentials
return nil
}
if strings.HasPrefix(c.StorageURL, "s3://") {
// S3 storage needs credentials
if c.S3.AccessKeyID == "" {
return fmt.Errorf("s3.access_key_id is required for s3:// URLs")
}
if c.S3.SecretAccessKey == "" {
return fmt.Errorf("s3.secret_access_key is required for s3:// URLs")
}
return nil
}
if strings.HasPrefix(c.StorageURL, "rclone://") {
// Rclone storage uses rclone's own config
return nil
}
return fmt.Errorf("storage_url must start with s3://, file://, or rclone://")
}
// Legacy S3 configuration
if c.S3.Endpoint == "" {
return fmt.Errorf("s3.endpoint is required (or set storage_url)")
}
if c.S3.Bucket == "" {
return fmt.Errorf("s3.bucket is required (or set storage_url)")
}
if c.S3.AccessKeyID == "" {
return fmt.Errorf("s3.access_key_id is required")
}
if c.S3.SecretAccessKey == "" {
return fmt.Errorf("s3.secret_access_key is required")
}
return nil
}
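
As a sketch of the URL format documented on StorageURL (this is not the storage package's actual parser), the s3:// form decomposes cleanly with net/url; assumes fmt, net/url, and strings are imported:

// parseS3URL splits s3://bucket/prefix?endpoint=host&region=us-east-1 into
// its parts. Sketch only; the real dispatch lives in the storage package.
func parseS3URL(raw string) (bucket, prefix, endpoint, region string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", "", fmt.Errorf("parsing storage URL: %w", err)
	}
	if u.Scheme != "s3" {
		return "", "", "", "", fmt.Errorf("not an s3:// URL: %q", raw)
	}
	q := u.Query()
	return u.Host, strings.TrimPrefix(u.Path, "/"), q.Get("endpoint"), q.Get("region"), nil
}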
// extractAgeSecretKey extracts the AGE-SECRET-KEY from the input using
// the age library's parser, which handles comments and whitespace.
func extractAgeSecretKey(input string) string {
identities, err := age.ParseIdentities(strings.NewReader(input))
if err != nil || len(identities) == 0 {
// Fall back to trimmed input if parsing fails
return strings.TrimSpace(input)
}
// Return the string representation of the first identity
if id, ok := identities[0].(*age.X25519Identity); ok {
return id.String()
}
return strings.TrimSpace(input)
}
// Module exports the config module for fx dependency injection.
// It provides the Config type to other modules in the application.
var Module = fx.Module("config",

View File

@@ -45,12 +45,21 @@ func TestConfigLoad(t *testing.T) {
t.Errorf("Expected first age recipient to be %s, got '%s'", TEST_SNEAK_AGE_PUBLIC_KEY, cfg.AgeRecipients[0])
}
if len(cfg.SourceDirs) != 2 {
t.Errorf("Expected 2 source dirs, got %d", len(cfg.SourceDirs))
if len(cfg.Snapshots) != 1 {
t.Errorf("Expected 1 snapshot, got %d", len(cfg.Snapshots))
}
if cfg.SourceDirs[0] != "/tmp/vaultik-test-source" {
t.Errorf("Expected first source dir to be '/tmp/vaultik-test-source', got '%s'", cfg.SourceDirs[0])
testSnap, ok := cfg.Snapshots["test"]
if !ok {
t.Fatal("Expected 'test' snapshot to exist")
}
if len(testSnap.Paths) != 2 {
t.Errorf("Expected 2 paths in test snapshot, got %d", len(testSnap.Paths))
}
if testSnap.Paths[0] != "/tmp/vaultik-test-source" {
t.Errorf("Expected first path to be '/tmp/vaultik-test-source', got '%s'", testSnap.Paths[0])
}
if cfg.S3.Bucket != "vaultik-test-bucket" {
@@ -74,3 +83,65 @@ func TestConfigFromEnv(t *testing.T) {
t.Errorf("Config file does not exist at path from VAULTIK_CONFIG: %s", configPath)
}
}
// TestExtractAgeSecretKey tests extraction of AGE-SECRET-KEY from various inputs
func TestExtractAgeSecretKey(t *testing.T) {
tests := []struct {
name string
input string
expected string
}{
{
name: "plain key",
input: "AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5",
expected: "AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5",
},
{
name: "key with trailing newline",
input: "AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5\n",
expected: "AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5",
},
{
name: "full age-keygen output",
input: `# created: 2025-01-14T12:00:00Z
# public key: age1ezrjmfpwsc95svdg0y54mums3zevgzu0x0ecq2f7tp8a05gl0sjq9q9wjg
AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5
`,
expected: "AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5",
},
{
name: "age-keygen output with extra blank lines",
input: `# created: 2025-01-14T12:00:00Z
# public key: age1ezrjmfpwsc95svdg0y54mums3zevgzu0x0ecq2f7tp8a05gl0sjq9q9wjg
AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5
`,
expected: "AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5",
},
{
name: "key with leading whitespace",
input: " AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5 ",
expected: "AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5",
},
{
name: "empty input",
input: "",
expected: "",
},
{
name: "only comments",
input: "# this is a comment\n# another comment",
expected: "# this is a comment\n# another comment",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := extractAgeSecretKey(tt.input)
if result != tt.expected {
t.Errorf("extractAgeSecretKey(%q) = %q, want %q", tt.input, result, tt.expected)
}
})
}
}

View File

@@ -51,3 +51,12 @@ func (s Size) Int64() int64 {
func (s Size) String() string {
return humanize.Bytes(uint64(s))
}
// ParseSize parses a size string into a Size value
func ParseSize(s string) (Size, error) {
bytes, err := humanize.ParseBytes(s)
if err != nil {
return 0, fmt.Errorf("invalid size format: %w", err)
}
return Size(bytes), nil
}
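
A short illustrative example of the round trip: humanize accepts both SI and IEC suffixes, so "256MiB" parses to 256 x 1024 x 1024 bytes.

func ExampleParseSize() {
	limit, err := ParseSize("256MiB")
	if err != nil {
		panic(err)
	}
	fmt.Println(limit.Int64())
	// Output: 268435456
}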

View File

@@ -7,6 +7,7 @@ import (
"sync"
"filippo.io/age"
"go.uber.org/fx"
)
// Encryptor provides thread-safe encryption using the age encryption library.
@@ -143,3 +144,66 @@ func (e *Encryptor) UpdateRecipients(publicKeys []string) error {
return nil
}
// Decryptor provides thread-safe decryption using the age encryption library.
// It uses a private key to decrypt data that was encrypted for the corresponding
// public key.
type Decryptor struct {
identity age.Identity
mu sync.RWMutex
}
// NewDecryptor creates a new decryptor with the given age private key.
// The private key should be a valid age X25519 identity string.
// Returns an error if the private key is invalid.
func NewDecryptor(privateKey string) (*Decryptor, error) {
identity, err := age.ParseX25519Identity(privateKey)
if err != nil {
return nil, fmt.Errorf("parsing age identity: %w", err)
}
return &Decryptor{
identity: identity,
}, nil
}
// Decrypt decrypts data using age decryption.
// This method is suitable for small to medium amounts of data that fit in memory.
// For large data streams, use DecryptStream instead.
func (d *Decryptor) Decrypt(data []byte) ([]byte, error) {
d.mu.RLock()
identity := d.identity
d.mu.RUnlock()
r, err := age.Decrypt(bytes.NewReader(data), identity)
if err != nil {
return nil, fmt.Errorf("creating decrypted reader: %w", err)
}
decrypted, err := io.ReadAll(r)
if err != nil {
return nil, fmt.Errorf("reading decrypted data: %w", err)
}
return decrypted, nil
}
// DecryptStream returns a reader that decrypts data from the provided reader.
// This method is suitable for decrypting large files or streams as it processes
// data in a streaming fashion without loading everything into memory.
// The caller should close the input reader when done.
func (d *Decryptor) DecryptStream(src io.Reader) (io.Reader, error) {
d.mu.RLock()
identity := d.identity
d.mu.RUnlock()
r, err := age.Decrypt(src, identity)
if err != nil {
return nil, fmt.Errorf("creating decrypted reader: %w", err)
}
return r, nil
}
// Module exports the crypto module for fx dependency injection.
var Module = fx.Module("crypto")
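
A minimal usage sketch for DecryptStream, assuming the key arrives via VAULTIK_AGE_SECRET_KEY as elsewhere in this changeset; the helper name is illustrative, and fmt, io, and os are assumed imports:

func decryptToStdout(path string) error {
	d, err := NewDecryptor(os.Getenv("VAULTIK_AGE_SECRET_KEY"))
	if err != nil {
		return fmt.Errorf("creating decryptor: %w", err)
	}
	f, err := os.Open(path)
	if err != nil {
		return fmt.Errorf("opening %s: %w", path, err)
	}
	defer func() { _ = f.Close() }()
	r, err := d.DecryptStream(f) // streaming: no full read into memory
	if err != nil {
		return fmt.Errorf("decrypting %s: %w", path, err)
	}
	_, err = io.Copy(os.Stdout, r)
	return err
}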

View File

@@ -5,6 +5,8 @@ import (
"strings"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
func TestBlobChunkRepository(t *testing.T) {
@@ -16,8 +18,8 @@ func TestBlobChunkRepository(t *testing.T) {
// Create blob first
blob := &Blob{
ID: "blob1-uuid",
Hash: "blob1-hash",
ID: types.NewBlobID(),
Hash: types.BlobHash("blob1-hash"),
CreatedTS: time.Now(),
}
err := repos.Blobs.Create(ctx, nil, blob)
@@ -26,7 +28,7 @@ func TestBlobChunkRepository(t *testing.T) {
}
// Create chunks
chunks := []string{"chunk1", "chunk2", "chunk3"}
chunks := []types.ChunkHash{"chunk1", "chunk2", "chunk3"}
for _, chunkHash := range chunks {
chunk := &Chunk{
ChunkHash: chunkHash,
@@ -41,7 +43,7 @@ func TestBlobChunkRepository(t *testing.T) {
// Test Create
bc1 := &BlobChunk{
BlobID: blob.ID,
ChunkHash: "chunk1",
ChunkHash: types.ChunkHash("chunk1"),
Offset: 0,
Length: 1024,
}
@@ -54,7 +56,7 @@ func TestBlobChunkRepository(t *testing.T) {
// Add more chunks to the same blob
bc2 := &BlobChunk{
BlobID: blob.ID,
ChunkHash: "chunk2",
ChunkHash: types.ChunkHash("chunk2"),
Offset: 1024,
Length: 2048,
}
@@ -65,7 +67,7 @@ func TestBlobChunkRepository(t *testing.T) {
bc3 := &BlobChunk{
BlobID: blob.ID,
ChunkHash: "chunk3",
ChunkHash: types.ChunkHash("chunk3"),
Offset: 3072,
Length: 512,
}
@@ -75,7 +77,7 @@ func TestBlobChunkRepository(t *testing.T) {
}
// Test GetByBlobID
blobChunks, err := repos.BlobChunks.GetByBlobID(ctx, blob.ID)
blobChunks, err := repos.BlobChunks.GetByBlobID(ctx, blob.ID.String())
if err != nil {
t.Fatalf("failed to get blob chunks: %v", err)
}
@@ -134,13 +136,13 @@ func TestBlobChunkRepositoryMultipleBlobs(t *testing.T) {
// Create blobs
blob1 := &Blob{
ID: "blob1-uuid",
Hash: "blob1-hash",
ID: types.NewBlobID(),
Hash: types.BlobHash("blob1-hash"),
CreatedTS: time.Now(),
}
blob2 := &Blob{
ID: "blob2-uuid",
Hash: "blob2-hash",
ID: types.NewBlobID(),
Hash: types.BlobHash("blob2-hash"),
CreatedTS: time.Now(),
}
@@ -154,7 +156,7 @@ func TestBlobChunkRepositoryMultipleBlobs(t *testing.T) {
}
// Create chunks
chunkHashes := []string{"chunk1", "chunk2", "chunk3"}
chunkHashes := []types.ChunkHash{"chunk1", "chunk2", "chunk3"}
for _, chunkHash := range chunkHashes {
chunk := &Chunk{
ChunkHash: chunkHash,
@@ -169,10 +171,10 @@ func TestBlobChunkRepositoryMultipleBlobs(t *testing.T) {
// Create chunks across multiple blobs
// Some chunks are shared between blobs (deduplication scenario)
blobChunks := []BlobChunk{
{BlobID: blob1.ID, ChunkHash: "chunk1", Offset: 0, Length: 1024},
{BlobID: blob1.ID, ChunkHash: "chunk2", Offset: 1024, Length: 1024},
{BlobID: blob2.ID, ChunkHash: "chunk2", Offset: 0, Length: 1024}, // chunk2 is shared
{BlobID: blob2.ID, ChunkHash: "chunk3", Offset: 1024, Length: 1024},
{BlobID: blob1.ID, ChunkHash: types.ChunkHash("chunk1"), Offset: 0, Length: 1024},
{BlobID: blob1.ID, ChunkHash: types.ChunkHash("chunk2"), Offset: 1024, Length: 1024},
{BlobID: blob2.ID, ChunkHash: types.ChunkHash("chunk2"), Offset: 0, Length: 1024}, // chunk2 is shared
{BlobID: blob2.ID, ChunkHash: types.ChunkHash("chunk3"), Offset: 1024, Length: 1024},
}
for _, bc := range blobChunks {
@@ -183,7 +185,7 @@ func TestBlobChunkRepositoryMultipleBlobs(t *testing.T) {
}
// Verify blob1 chunks
chunks, err := repos.BlobChunks.GetByBlobID(ctx, blob1.ID)
chunks, err := repos.BlobChunks.GetByBlobID(ctx, blob1.ID.String())
if err != nil {
t.Fatalf("failed to get blob1 chunks: %v", err)
}
@@ -192,7 +194,7 @@ func TestBlobChunkRepositoryMultipleBlobs(t *testing.T) {
}
// Verify blob2 chunks
chunks, err = repos.BlobChunks.GetByBlobID(ctx, blob2.ID)
chunks, err = repos.BlobChunks.GetByBlobID(ctx, blob2.ID.String())
if err != nil {
t.Fatalf("failed to get blob2 chunks: %v", err)
}

View File

@@ -4,6 +4,8 @@ import (
"context"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
func TestBlobRepository(t *testing.T) {
@@ -15,8 +17,8 @@ func TestBlobRepository(t *testing.T) {
// Test Create
blob := &Blob{
ID: "test-blob-id-123",
Hash: "blobhash123",
ID: types.NewBlobID(),
Hash: types.BlobHash("blobhash123"),
CreatedTS: time.Now().Truncate(time.Second),
}
@@ -26,7 +28,7 @@ func TestBlobRepository(t *testing.T) {
}
// Test GetByHash
retrieved, err := repo.GetByHash(ctx, blob.Hash)
retrieved, err := repo.GetByHash(ctx, blob.Hash.String())
if err != nil {
t.Fatalf("failed to get blob: %v", err)
}
@@ -41,7 +43,7 @@ func TestBlobRepository(t *testing.T) {
}
// Test GetByID
retrievedByID, err := repo.GetByID(ctx, blob.ID)
retrievedByID, err := repo.GetByID(ctx, blob.ID.String())
if err != nil {
t.Fatalf("failed to get blob by ID: %v", err)
}
@@ -54,8 +56,8 @@ func TestBlobRepository(t *testing.T) {
// Test with second blob
blob2 := &Blob{
ID: "test-blob-id-456",
Hash: "blobhash456",
ID: types.NewBlobID(),
Hash: types.BlobHash("blobhash456"),
CreatedTS: time.Now().Truncate(time.Second),
}
err = repo.Create(ctx, nil, blob2)
@@ -65,13 +67,13 @@ func TestBlobRepository(t *testing.T) {
// Test UpdateFinished
now := time.Now()
err = repo.UpdateFinished(ctx, nil, blob.ID, blob.Hash, 1000, 500)
err = repo.UpdateFinished(ctx, nil, blob.ID.String(), blob.Hash.String(), 1000, 500)
if err != nil {
t.Fatalf("failed to update blob as finished: %v", err)
}
// Verify update
updated, err := repo.GetByID(ctx, blob.ID)
updated, err := repo.GetByID(ctx, blob.ID.String())
if err != nil {
t.Fatalf("failed to get updated blob: %v", err)
}
@@ -86,13 +88,13 @@ func TestBlobRepository(t *testing.T) {
}
// Test UpdateUploaded
err = repo.UpdateUploaded(ctx, nil, blob.ID)
err = repo.UpdateUploaded(ctx, nil, blob.ID.String())
if err != nil {
t.Fatalf("failed to update blob as uploaded: %v", err)
}
// Verify upload update
uploaded, err := repo.GetByID(ctx, blob.ID)
uploaded, err := repo.GetByID(ctx, blob.ID.String())
if err != nil {
t.Fatalf("failed to get uploaded blob: %v", err)
}
@@ -113,8 +115,8 @@ func TestBlobRepositoryDuplicate(t *testing.T) {
repo := NewBlobRepository(db)
blob := &Blob{
ID: "duplicate-test-id",
Hash: "duplicate_blob",
ID: types.NewBlobID(),
Hash: types.BlobHash("duplicate_blob"),
CreatedTS: time.Now().Truncate(time.Second),
}

View File

@@ -5,6 +5,8 @@ import (
"fmt"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
// TestCascadeDeleteDebug tests cascade delete with debug output
@@ -42,7 +44,7 @@ func TestCascadeDeleteDebug(t *testing.T) {
// Create chunks and file-chunk mappings
for i := 0; i < 3; i++ {
chunk := &Chunk{
ChunkHash: fmt.Sprintf("cascade-chunk-%d", i),
ChunkHash: types.ChunkHash(fmt.Sprintf("cascade-chunk-%d", i)),
Size: 1024,
}
err = repos.Chunks.Create(ctx, nil, chunk)

View File

@@ -4,6 +4,8 @@ import (
"context"
"database/sql"
"fmt"
"git.eeqj.de/sneak/vaultik/internal/types"
)
type ChunkFileRepository struct {
@@ -23,9 +25,9 @@ func (r *ChunkFileRepository) Create(ctx context.Context, tx *sql.Tx, cf *ChunkF
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, cf.ChunkHash, cf.FileID, cf.FileOffset, cf.Length)
_, err = tx.ExecContext(ctx, query, cf.ChunkHash.String(), cf.FileID.String(), cf.FileOffset, cf.Length)
} else {
_, err = r.db.ExecWithLog(ctx, query, cf.ChunkHash, cf.FileID, cf.FileOffset, cf.Length)
_, err = r.db.ExecWithLog(ctx, query, cf.ChunkHash.String(), cf.FileID.String(), cf.FileOffset, cf.Length)
}
if err != nil {
@@ -35,30 +37,20 @@ func (r *ChunkFileRepository) Create(ctx context.Context, tx *sql.Tx, cf *ChunkF
return nil
}
func (r *ChunkFileRepository) GetByChunkHash(ctx context.Context, chunkHash string) ([]*ChunkFile, error) {
func (r *ChunkFileRepository) GetByChunkHash(ctx context.Context, chunkHash types.ChunkHash) ([]*ChunkFile, error) {
query := `
SELECT chunk_hash, file_id, file_offset, length
FROM chunk_files
WHERE chunk_hash = ?
`
rows, err := r.db.conn.QueryContext(ctx, query, chunkHash)
rows, err := r.db.conn.QueryContext(ctx, query, chunkHash.String())
if err != nil {
return nil, fmt.Errorf("querying chunk files: %w", err)
}
defer CloseRows(rows)
var chunkFiles []*ChunkFile
for rows.Next() {
var cf ChunkFile
err := rows.Scan(&cf.ChunkHash, &cf.FileID, &cf.FileOffset, &cf.Length)
if err != nil {
return nil, fmt.Errorf("scanning chunk file: %w", err)
}
chunkFiles = append(chunkFiles, &cf)
}
return chunkFiles, rows.Err()
return r.scanChunkFiles(rows)
}
func (r *ChunkFileRepository) GetByFilePath(ctx context.Context, filePath string) ([]*ChunkFile, error) {
@@ -75,40 +67,41 @@ func (r *ChunkFileRepository) GetByFilePath(ctx context.Context, filePath string
}
defer CloseRows(rows)
var chunkFiles []*ChunkFile
for rows.Next() {
var cf ChunkFile
err := rows.Scan(&cf.ChunkHash, &cf.FileID, &cf.FileOffset, &cf.Length)
if err != nil {
return nil, fmt.Errorf("scanning chunk file: %w", err)
}
chunkFiles = append(chunkFiles, &cf)
}
return chunkFiles, rows.Err()
return r.scanChunkFiles(rows)
}
// GetByFileID retrieves chunk files by file ID
func (r *ChunkFileRepository) GetByFileID(ctx context.Context, fileID string) ([]*ChunkFile, error) {
func (r *ChunkFileRepository) GetByFileID(ctx context.Context, fileID types.FileID) ([]*ChunkFile, error) {
query := `
SELECT chunk_hash, file_id, file_offset, length
FROM chunk_files
WHERE file_id = ?
`
rows, err := r.db.conn.QueryContext(ctx, query, fileID)
rows, err := r.db.conn.QueryContext(ctx, query, fileID.String())
if err != nil {
return nil, fmt.Errorf("querying chunk files: %w", err)
}
defer CloseRows(rows)
return r.scanChunkFiles(rows)
}
// scanChunkFiles is a helper that scans chunk file rows
func (r *ChunkFileRepository) scanChunkFiles(rows *sql.Rows) ([]*ChunkFile, error) {
var chunkFiles []*ChunkFile
for rows.Next() {
var cf ChunkFile
err := rows.Scan(&cf.ChunkHash, &cf.FileID, &cf.FileOffset, &cf.Length)
var chunkHashStr, fileIDStr string
err := rows.Scan(&chunkHashStr, &fileIDStr, &cf.FileOffset, &cf.Length)
if err != nil {
return nil, fmt.Errorf("scanning chunk file: %w", err)
}
cf.ChunkHash = types.ChunkHash(chunkHashStr)
cf.FileID, err = types.ParseFileID(fileIDStr)
if err != nil {
return nil, fmt.Errorf("parsing file ID: %w", err)
}
chunkFiles = append(chunkFiles, &cf)
}
@@ -116,14 +109,14 @@ func (r *ChunkFileRepository) GetByFileID(ctx context.Context, fileID string) ([
}
// DeleteByFileID deletes all chunk_files entries for a given file ID
func (r *ChunkFileRepository) DeleteByFileID(ctx context.Context, tx *sql.Tx, fileID string) error {
func (r *ChunkFileRepository) DeleteByFileID(ctx context.Context, tx *sql.Tx, fileID types.FileID) error {
query := `DELETE FROM chunk_files WHERE file_id = ?`
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, fileID)
_, err = tx.ExecContext(ctx, query, fileID.String())
} else {
_, err = r.db.ExecWithLog(ctx, query, fileID)
_, err = r.db.ExecWithLog(ctx, query, fileID.String())
}
if err != nil {
@@ -132,3 +125,80 @@ func (r *ChunkFileRepository) DeleteByFileID(ctx context.Context, tx *sql.Tx, fi
return nil
}
// DeleteByFileIDs deletes all chunk_files rows for multiple files, batching the IN clause to stay within SQLite's variable limit.
func (r *ChunkFileRepository) DeleteByFileIDs(ctx context.Context, tx *sql.Tx, fileIDs []types.FileID) error {
if len(fileIDs) == 0 {
return nil
}
// Batch at 500 to stay within SQLite's variable limit
const batchSize = 500
for i := 0; i < len(fileIDs); i += batchSize {
end := i + batchSize
if end > len(fileIDs) {
end = len(fileIDs)
}
batch := fileIDs[i:end]
query := "DELETE FROM chunk_files WHERE file_id IN (?" + repeatPlaceholder(len(batch)-1) + ")"
args := make([]interface{}, len(batch))
for j, id := range batch {
args[j] = id.String()
}
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, args...)
} else {
_, err = r.db.ExecWithLog(ctx, query, args...)
}
if err != nil {
return fmt.Errorf("batch deleting chunk_files: %w", err)
}
}
return nil
}
// CreateBatch inserts multiple chunk_files rows using multi-row INSERTs, batched to stay within SQLite's variable limit.
func (r *ChunkFileRepository) CreateBatch(ctx context.Context, tx *sql.Tx, cfs []ChunkFile) error {
if len(cfs) == 0 {
return nil
}
// Each ChunkFile has 4 values, so batch at 200 to be safe with SQLite's variable limit
const batchSize = 200
for i := 0; i < len(cfs); i += batchSize {
end := i + batchSize
if end > len(cfs) {
end = len(cfs)
}
batch := cfs[i:end]
query := "INSERT INTO chunk_files (chunk_hash, file_id, file_offset, length) VALUES "
args := make([]interface{}, 0, len(batch)*4)
for j, cf := range batch {
if j > 0 {
query += ", "
}
query += "(?, ?, ?, ?)"
args = append(args, cf.ChunkHash.String(), cf.FileID.String(), cf.FileOffset, cf.Length)
}
query += " ON CONFLICT(chunk_hash, file_id) DO NOTHING"
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, args...)
} else {
_, err = r.db.ExecWithLog(ctx, query, args...)
}
if err != nil {
return fmt.Errorf("batch inserting chunk_files: %w", err)
}
}
return nil
}
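
A hypothetical caller (method signatures assumed from database.go) showing why the batch helpers pair naturally inside one transaction: the delete and re-insert of a file's mappings stay atomic.

func replaceChunkFiles(ctx context.Context, db *DB, repo *ChunkFileRepository,
	ids []types.FileID, mappings []ChunkFile) error {
	tx, err := db.BeginTx(ctx, nil) // signature assumed to mirror database/sql
	if err != nil {
		return err
	}
	defer func() { _ = tx.Rollback() }() // no-op once Commit succeeds
	if err := repo.DeleteByFileIDs(ctx, tx, ids); err != nil {
		return err
	}
	if err := repo.CreateBatch(ctx, tx, mappings); err != nil {
		return err
	}
	return tx.Commit()
}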

View File

@@ -4,6 +4,8 @@ import (
"context"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
func TestChunkFileRepository(t *testing.T) {
@@ -49,7 +51,7 @@ func TestChunkFileRepository(t *testing.T) {
// Create chunk first
chunk := &Chunk{
ChunkHash: "chunk1",
ChunkHash: types.ChunkHash("chunk1"),
Size: 1024,
}
err = chunksRepo.Create(ctx, nil, chunk)
@@ -59,7 +61,7 @@ func TestChunkFileRepository(t *testing.T) {
// Test Create
cf1 := &ChunkFile{
ChunkHash: "chunk1",
ChunkHash: types.ChunkHash("chunk1"),
FileID: file1.ID,
FileOffset: 0,
Length: 1024,
@@ -72,7 +74,7 @@ func TestChunkFileRepository(t *testing.T) {
// Add same chunk in different file (deduplication scenario)
cf2 := &ChunkFile{
ChunkHash: "chunk1",
ChunkHash: types.ChunkHash("chunk1"),
FileID: file2.ID,
FileOffset: 2048,
Length: 1024,
@@ -114,7 +116,7 @@ func TestChunkFileRepository(t *testing.T) {
if len(chunkFiles) != 1 {
t.Errorf("expected 1 chunk for file, got %d", len(chunkFiles))
}
if chunkFiles[0].ChunkHash != "chunk1" {
if chunkFiles[0].ChunkHash != types.ChunkHash("chunk1") {
t.Errorf("wrong chunk hash: expected chunk1, got %s", chunkFiles[0].ChunkHash)
}
@@ -151,7 +153,7 @@ func TestChunkFileRepositoryComplexDeduplication(t *testing.T) {
}
// Create chunks first
chunks := []string{"chunk1", "chunk2", "chunk3", "chunk4"}
chunks := []types.ChunkHash{"chunk1", "chunk2", "chunk3", "chunk4"}
for _, chunkHash := range chunks {
chunk := &Chunk{
ChunkHash: chunkHash,
@@ -170,16 +172,16 @@ func TestChunkFileRepositoryComplexDeduplication(t *testing.T) {
chunkFiles := []ChunkFile{
// File1
{ChunkHash: "chunk1", FileID: file1.ID, FileOffset: 0, Length: 1024},
{ChunkHash: "chunk2", FileID: file1.ID, FileOffset: 1024, Length: 1024},
{ChunkHash: "chunk3", FileID: file1.ID, FileOffset: 2048, Length: 1024},
{ChunkHash: types.ChunkHash("chunk1"), FileID: file1.ID, FileOffset: 0, Length: 1024},
{ChunkHash: types.ChunkHash("chunk2"), FileID: file1.ID, FileOffset: 1024, Length: 1024},
{ChunkHash: types.ChunkHash("chunk3"), FileID: file1.ID, FileOffset: 2048, Length: 1024},
// File2
{ChunkHash: "chunk2", FileID: file2.ID, FileOffset: 0, Length: 1024},
{ChunkHash: "chunk3", FileID: file2.ID, FileOffset: 1024, Length: 1024},
{ChunkHash: "chunk4", FileID: file2.ID, FileOffset: 2048, Length: 1024},
{ChunkHash: types.ChunkHash("chunk2"), FileID: file2.ID, FileOffset: 0, Length: 1024},
{ChunkHash: types.ChunkHash("chunk3"), FileID: file2.ID, FileOffset: 1024, Length: 1024},
{ChunkHash: types.ChunkHash("chunk4"), FileID: file2.ID, FileOffset: 2048, Length: 1024},
// File3
{ChunkHash: "chunk1", FileID: file3.ID, FileOffset: 0, Length: 1024},
{ChunkHash: "chunk4", FileID: file3.ID, FileOffset: 1024, Length: 1024},
{ChunkHash: types.ChunkHash("chunk1"), FileID: file3.ID, FileOffset: 0, Length: 1024},
{ChunkHash: types.ChunkHash("chunk4"), FileID: file3.ID, FileOffset: 1024, Length: 1024},
}
for _, cf := range chunkFiles {

View File

@@ -139,7 +139,7 @@ func (r *ChunkRepository) ListUnpacked(ctx context.Context, limit int) ([]*Chunk
return chunks, rows.Err()
}
// DeleteOrphaned deletes chunks that are not referenced by any file
// DeleteOrphaned deletes chunks that are not referenced by any file or blob
func (r *ChunkRepository) DeleteOrphaned(ctx context.Context) error {
query := `
DELETE FROM chunks
@@ -147,6 +147,10 @@ func (r *ChunkRepository) DeleteOrphaned(ctx context.Context) error {
SELECT 1 FROM file_chunks
WHERE file_chunks.chunk_hash = chunks.chunk_hash
)
AND NOT EXISTS (
SELECT 1 FROM blob_chunks
WHERE blob_chunks.chunk_hash = chunks.chunk_hash
)
`
result, err := r.db.ExecWithLog(ctx, query)

View File

@@ -3,6 +3,8 @@ package database
import (
"context"
"testing"
"git.eeqj.de/sneak/vaultik/internal/types"
)
func TestChunkRepository(t *testing.T) {
@@ -14,7 +16,7 @@ func TestChunkRepository(t *testing.T) {
// Test Create
chunk := &Chunk{
ChunkHash: "chunkhash123",
ChunkHash: types.ChunkHash("chunkhash123"),
Size: 4096,
}
@@ -24,7 +26,7 @@ func TestChunkRepository(t *testing.T) {
}
// Test GetByHash
retrieved, err := repo.GetByHash(ctx, chunk.ChunkHash)
retrieved, err := repo.GetByHash(ctx, chunk.ChunkHash.String())
if err != nil {
t.Fatalf("failed to get chunk: %v", err)
}
@@ -46,7 +48,7 @@ func TestChunkRepository(t *testing.T) {
// Test GetByHashes
chunk2 := &Chunk{
ChunkHash: "chunkhash456",
ChunkHash: types.ChunkHash("chunkhash456"),
Size: 8192,
}
err = repo.Create(ctx, nil, chunk2)
@@ -54,7 +56,7 @@ func TestChunkRepository(t *testing.T) {
t.Fatalf("failed to create second chunk: %v", err)
}
chunks, err := repo.GetByHashes(ctx, []string{chunk.ChunkHash, chunk2.ChunkHash})
chunks, err := repo.GetByHashes(ctx, []string{chunk.ChunkHash.String(), chunk2.ChunkHash.String()})
if err != nil {
t.Fatalf("failed to get chunks by hashes: %v", err)
}

View File

@@ -36,26 +36,17 @@ type DB struct {
}
// New creates a new database connection at the specified path.
// It automatically handles database recovery, creates the schema if needed,
// and configures SQLite with appropriate settings for performance and reliability.
// The database uses WAL mode for better concurrency and sets a busy timeout
// to handle concurrent access gracefully.
//
// If the database appears locked, it will attempt recovery by removing stale
// lock files and switching temporarily to TRUNCATE journal mode.
//
// New creates a new database connection at the specified path.
// It automatically handles recovery from stale locks, creates the schema if needed,
// and configures SQLite with WAL mode for better concurrency.
// It creates the schema if needed and configures SQLite with WAL mode for
// better concurrency. SQLite handles crash recovery automatically when
// opening a database with journal/WAL files present.
// The path parameter can be a file path for persistent storage or ":memory:"
// for an in-memory database (useful for testing).
func New(ctx context.Context, path string) (*DB, error) {
log.Debug("Opening database connection", "path", path)
// First, try to recover from any stale locks
if err := recoverDatabase(ctx, path); err != nil {
log.Warn("Failed to recover database", "error", err)
}
// Note: We do NOT delete journal/WAL files before opening.
// SQLite handles crash recovery automatically when the database is opened.
// Deleting these files would corrupt the database after an unclean shutdown.
// First attempt with standard WAL mode
log.Debug("Attempting to open database with WAL mode", "path", path)
@@ -156,62 +147,6 @@ func (db *DB) Close() error {
return nil
}
// recoverDatabase attempts to recover a locked database
func recoverDatabase(ctx context.Context, path string) error {
// Check if database file exists
if _, err := os.Stat(path); os.IsNotExist(err) {
// No database file, nothing to recover
return nil
}
// Remove stale lock files
// SQLite creates -wal and -shm files for WAL mode
walPath := path + "-wal"
shmPath := path + "-shm"
journalPath := path + "-journal"
log.Info("Attempting database recovery", "path", path)
// Always remove lock files on startup to ensure clean state
removed := false
// Check for and remove journal file (from non-WAL mode)
if _, err := os.Stat(journalPath); err == nil {
log.Info("Found journal file, removing", "path", journalPath)
if err := os.Remove(journalPath); err != nil {
log.Warn("Failed to remove journal file", "error", err)
} else {
removed = true
}
}
// Remove WAL file
if _, err := os.Stat(walPath); err == nil {
log.Info("Found WAL file, removing", "path", walPath)
if err := os.Remove(walPath); err != nil {
log.Warn("Failed to remove WAL file", "error", err)
} else {
removed = true
}
}
// Remove SHM file
if _, err := os.Stat(shmPath); err == nil {
log.Info("Found shared memory file, removing", "path", shmPath)
if err := os.Remove(shmPath); err != nil {
log.Warn("Failed to remove shared memory file", "error", err)
} else {
removed = true
}
}
if removed {
log.Info("Database lock files removed")
}
return nil
}
// Conn returns the underlying *sql.DB connection.
// This should be used sparingly and primarily for read operations.
// For write operations, prefer using the ExecWithLog method.
@@ -219,6 +154,11 @@ func (db *DB) Conn() *sql.DB {
return db.conn
}
// Path returns the path to the database file.
func (db *DB) Path() string {
return db.path
}
// BeginTx starts a new database transaction with the given options.
// The caller is responsible for committing or rolling back the transaction.
// For write transactions, consider using the Repositories.WithTx method instead,
@@ -270,6 +210,15 @@ func NewTestDB() (*DB, error) {
return New(context.Background(), ":memory:")
}
// repeatPlaceholder generates a string of ", ?" repeated n times for IN clause construction.
// For example, repeatPlaceholder(2) returns ", ?, ?".
func repeatPlaceholder(n int) string {
if n <= 0 {
return ""
}
return strings.Repeat(", ?", n)
}
// LogSQL logs SQL queries and their arguments when debug mode is enabled.
// Debug mode is activated by setting the GODEBUG environment variable to include "vaultik".
// This is useful for troubleshooting database operations and understanding query patterns.
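
A sketch of the WAL setup that the New doc comment describes, under stated assumptions: the driver name and the exact pragma set (including foreign_keys, which the cascade-delete tests imply) are not confirmed by this diff.

func openWithWAL(ctx context.Context, path string) (*sql.DB, error) {
	conn, err := sql.Open("sqlite3", path) // driver name assumed
	if err != nil {
		return nil, err
	}
	for _, pragma := range []string{
		"PRAGMA journal_mode=WAL",  // readers proceed while a writer commits
		"PRAGMA busy_timeout=5000", // wait up to 5s instead of failing with SQLITE_BUSY
		"PRAGMA foreign_keys=ON",   // required for cascade deletes
	} {
		if _, err := conn.ExecContext(ctx, pragma); err != nil {
			_ = conn.Close()
			return nil, fmt.Errorf("applying %q: %w", pragma, err)
		}
	}
	return conn, nil
}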

View File

@@ -4,6 +4,8 @@ import (
"context"
"database/sql"
"fmt"
"git.eeqj.de/sneak/vaultik/internal/types"
)
type FileChunkRepository struct {
@@ -23,9 +25,9 @@ func (r *FileChunkRepository) Create(ctx context.Context, tx *sql.Tx, fc *FileCh
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, fc.FileID, fc.Idx, fc.ChunkHash)
_, err = tx.ExecContext(ctx, query, fc.FileID.String(), fc.Idx, fc.ChunkHash.String())
} else {
_, err = r.db.ExecWithLog(ctx, query, fc.FileID, fc.Idx, fc.ChunkHash)
_, err = r.db.ExecWithLog(ctx, query, fc.FileID.String(), fc.Idx, fc.ChunkHash.String())
}
if err != nil {
@@ -50,21 +52,11 @@ func (r *FileChunkRepository) GetByPath(ctx context.Context, path string) ([]*Fi
}
defer CloseRows(rows)
var fileChunks []*FileChunk
for rows.Next() {
var fc FileChunk
err := rows.Scan(&fc.FileID, &fc.Idx, &fc.ChunkHash)
if err != nil {
return nil, fmt.Errorf("scanning file chunk: %w", err)
}
fileChunks = append(fileChunks, &fc)
}
return fileChunks, rows.Err()
return r.scanFileChunks(rows)
}
// GetByFileID retrieves file chunks by file ID
func (r *FileChunkRepository) GetByFileID(ctx context.Context, fileID string) ([]*FileChunk, error) {
func (r *FileChunkRepository) GetByFileID(ctx context.Context, fileID types.FileID) ([]*FileChunk, error) {
query := `
SELECT file_id, idx, chunk_hash
FROM file_chunks
@@ -72,23 +64,13 @@ func (r *FileChunkRepository) GetByFileID(ctx context.Context, fileID string) ([
ORDER BY idx
`
rows, err := r.db.conn.QueryContext(ctx, query, fileID)
rows, err := r.db.conn.QueryContext(ctx, query, fileID.String())
if err != nil {
return nil, fmt.Errorf("querying file chunks: %w", err)
}
defer CloseRows(rows)
var fileChunks []*FileChunk
for rows.Next() {
var fc FileChunk
err := rows.Scan(&fc.FileID, &fc.Idx, &fc.ChunkHash)
if err != nil {
return nil, fmt.Errorf("scanning file chunk: %w", err)
}
fileChunks = append(fileChunks, &fc)
}
return fileChunks, rows.Err()
return r.scanFileChunks(rows)
}
// GetByPathTx retrieves file chunks within a transaction
@@ -108,16 +90,28 @@ func (r *FileChunkRepository) GetByPathTx(ctx context.Context, tx *sql.Tx, path
}
defer CloseRows(rows)
fileChunks, err := r.scanFileChunks(rows)
LogSQL("GetByPathTx", "Complete", path, "count", len(fileChunks))
return fileChunks, err
}
// scanFileChunks is a helper that scans file chunk rows
func (r *FileChunkRepository) scanFileChunks(rows *sql.Rows) ([]*FileChunk, error) {
var fileChunks []*FileChunk
for rows.Next() {
var fc FileChunk
err := rows.Scan(&fc.FileID, &fc.Idx, &fc.ChunkHash)
var fileIDStr, chunkHashStr string
err := rows.Scan(&fileIDStr, &fc.Idx, &chunkHashStr)
if err != nil {
return nil, fmt.Errorf("scanning file chunk: %w", err)
}
fc.FileID, err = types.ParseFileID(fileIDStr)
if err != nil {
return nil, fmt.Errorf("parsing file ID: %w", err)
}
fc.ChunkHash = types.ChunkHash(chunkHashStr)
fileChunks = append(fileChunks, &fc)
}
LogSQL("GetByPathTx", "Complete", path, "count", len(fileChunks))
return fileChunks, rows.Err()
}
@@ -140,14 +134,14 @@ func (r *FileChunkRepository) DeleteByPath(ctx context.Context, tx *sql.Tx, path
}
// DeleteByFileID deletes all file_chunks rows for a file by its UUID
func (r *FileChunkRepository) DeleteByFileID(ctx context.Context, tx *sql.Tx, fileID string) error {
func (r *FileChunkRepository) DeleteByFileID(ctx context.Context, tx *sql.Tx, fileID types.FileID) error {
query := `DELETE FROM file_chunks WHERE file_id = ?`
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, fileID)
_, err = tx.ExecContext(ctx, query, fileID.String())
} else {
_, err = r.db.ExecWithLog(ctx, query, fileID)
_, err = r.db.ExecWithLog(ctx, query, fileID.String())
}
if err != nil {
@@ -157,6 +151,86 @@ func (r *FileChunkRepository) DeleteByFileID(ctx context.Context, tx *sql.Tx, fi
return nil
}
// DeleteByFileIDs deletes all file_chunks rows for multiple files, batching the IN clause to stay within SQLite's variable limit.
func (r *FileChunkRepository) DeleteByFileIDs(ctx context.Context, tx *sql.Tx, fileIDs []types.FileID) error {
if len(fileIDs) == 0 {
return nil
}
// Batch at 500 to stay within SQLite's variable limit
const batchSize = 500
for i := 0; i < len(fileIDs); i += batchSize {
end := i + batchSize
if end > len(fileIDs) {
end = len(fileIDs)
}
batch := fileIDs[i:end]
query := "DELETE FROM file_chunks WHERE file_id IN (?" + repeatPlaceholder(len(batch)-1) + ")"
args := make([]interface{}, len(batch))
for j, id := range batch {
args[j] = id.String()
}
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, args...)
} else {
_, err = r.db.ExecWithLog(ctx, query, args...)
}
if err != nil {
return fmt.Errorf("batch deleting file_chunks: %w", err)
}
}
return nil
}
// CreateBatch inserts multiple file_chunks in a single statement for efficiency.
// Batches are automatically split to stay within SQLite's variable limit.
func (r *FileChunkRepository) CreateBatch(ctx context.Context, tx *sql.Tx, fcs []FileChunk) error {
if len(fcs) == 0 {
return nil
}
// SQLite has a limit on variables (typically 999 or 32766).
// Each FileChunk has 3 values, so batch at 300 to be safe.
const batchSize = 300
for i := 0; i < len(fcs); i += batchSize {
end := i + batchSize
if end > len(fcs) {
end = len(fcs)
}
batch := fcs[i:end]
// Build the query with multiple value sets
query := "INSERT INTO file_chunks (file_id, idx, chunk_hash) VALUES "
args := make([]interface{}, 0, len(batch)*3)
for j, fc := range batch {
if j > 0 {
query += ", "
}
query += "(?, ?, ?)"
args = append(args, fc.FileID.String(), fc.Idx, fc.ChunkHash.String())
}
query += " ON CONFLICT(file_id, idx) DO NOTHING"
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, args...)
} else {
_, err = r.db.ExecWithLog(ctx, query, args...)
}
if err != nil {
return fmt.Errorf("batch inserting file_chunks: %w", err)
}
}
return nil
}
// GetByFile is an alias for GetByPath for compatibility
func (r *FileChunkRepository) GetByFile(ctx context.Context, path string) ([]*FileChunk, error) {
LogSQL("GetByFile", "Starting", path)

View File

@@ -5,6 +5,8 @@ import (
"fmt"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
func TestFileChunkRepository(t *testing.T) {
@@ -33,7 +35,7 @@ func TestFileChunkRepository(t *testing.T) {
}
// Create chunks first
chunks := []string{"chunk1", "chunk2", "chunk3"}
chunks := []types.ChunkHash{"chunk1", "chunk2", "chunk3"}
chunkRepo := NewChunkRepository(db)
for _, chunkHash := range chunks {
chunk := &Chunk{
@@ -50,7 +52,7 @@ func TestFileChunkRepository(t *testing.T) {
fc1 := &FileChunk{
FileID: file.ID,
Idx: 0,
ChunkHash: "chunk1",
ChunkHash: types.ChunkHash("chunk1"),
}
err = repo.Create(ctx, nil, fc1)
@@ -62,7 +64,7 @@ func TestFileChunkRepository(t *testing.T) {
fc2 := &FileChunk{
FileID: file.ID,
Idx: 1,
ChunkHash: "chunk2",
ChunkHash: types.ChunkHash("chunk2"),
}
err = repo.Create(ctx, nil, fc2)
if err != nil {
@@ -72,7 +74,7 @@ func TestFileChunkRepository(t *testing.T) {
fc3 := &FileChunk{
FileID: file.ID,
Idx: 2,
ChunkHash: "chunk3",
ChunkHash: types.ChunkHash("chunk3"),
}
err = repo.Create(ctx, nil, fc3)
if err != nil {
@@ -131,7 +133,7 @@ func TestFileChunkRepositoryMultipleFiles(t *testing.T) {
for i, path := range filePaths {
file := &File{
Path: path,
Path: types.FilePath(path),
MTime: testTime,
CTime: testTime,
Size: 2048,
@@ -151,7 +153,7 @@ func TestFileChunkRepositoryMultipleFiles(t *testing.T) {
chunkRepo := NewChunkRepository(db)
for i := range files {
for j := 0; j < 2; j++ {
chunkHash := fmt.Sprintf("file%d_chunk%d", i, j)
chunkHash := types.ChunkHash(fmt.Sprintf("file%d_chunk%d", i, j))
chunk := &Chunk{
ChunkHash: chunkHash,
Size: 1024,
@@ -169,7 +171,7 @@ func TestFileChunkRepositoryMultipleFiles(t *testing.T) {
fc := &FileChunk{
FileID: file.ID,
Idx: j,
ChunkHash: fmt.Sprintf("file%d_chunk%d", i, j),
ChunkHash: types.ChunkHash(fmt.Sprintf("file%d_chunk%d", i, j)),
}
err := repo.Create(ctx, nil, fc)
if err != nil {

View File

@@ -7,7 +7,7 @@ import (
"time"
"git.eeqj.de/sneak/vaultik/internal/log"
"github.com/google/uuid"
"git.eeqj.de/sneak/vaultik/internal/types"
)
type FileRepository struct {
@@ -20,14 +20,15 @@ func NewFileRepository(db *DB) *FileRepository {
func (r *FileRepository) Create(ctx context.Context, tx *sql.Tx, file *File) error {
// Generate UUID if not provided
if file.ID == "" {
file.ID = uuid.New().String()
if file.ID.IsZero() {
file.ID = types.NewFileID()
}
query := `
INSERT INTO files (id, path, mtime, ctime, size, mode, uid, gid, link_target)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
INSERT INTO files (id, path, source_path, mtime, ctime, size, mode, uid, gid, link_target)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT(path) DO UPDATE SET
source_path = excluded.source_path,
mtime = excluded.mtime,
ctime = excluded.ctime,
size = excluded.size,
@@ -38,44 +39,36 @@ func (r *FileRepository) Create(ctx context.Context, tx *sql.Tx, file *File) err
RETURNING id
`
var idStr string
var err error
if tx != nil {
LogSQL("Execute", query, file.ID, file.Path, file.MTime.Unix(), file.CTime.Unix(), file.Size, file.Mode, file.UID, file.GID, file.LinkTarget)
err = tx.QueryRowContext(ctx, query, file.ID, file.Path, file.MTime.Unix(), file.CTime.Unix(), file.Size, file.Mode, file.UID, file.GID, file.LinkTarget).Scan(&file.ID)
LogSQL("Execute", query, file.ID.String(), file.Path.String(), file.SourcePath.String(), file.MTime.Unix(), file.CTime.Unix(), file.Size, file.Mode, file.UID, file.GID, file.LinkTarget.String())
err = tx.QueryRowContext(ctx, query, file.ID.String(), file.Path.String(), file.SourcePath.String(), file.MTime.Unix(), file.CTime.Unix(), file.Size, file.Mode, file.UID, file.GID, file.LinkTarget.String()).Scan(&idStr)
} else {
err = r.db.QueryRowWithLog(ctx, query, file.ID, file.Path, file.MTime.Unix(), file.CTime.Unix(), file.Size, file.Mode, file.UID, file.GID, file.LinkTarget).Scan(&file.ID)
err = r.db.QueryRowWithLog(ctx, query, file.ID.String(), file.Path.String(), file.SourcePath.String(), file.MTime.Unix(), file.CTime.Unix(), file.Size, file.Mode, file.UID, file.GID, file.LinkTarget.String()).Scan(&idStr)
}
if err != nil {
return fmt.Errorf("inserting file: %w", err)
}
// Parse the returned ID
file.ID, err = types.ParseFileID(idStr)
if err != nil {
return fmt.Errorf("parsing file ID: %w", err)
}
return nil
}
func (r *FileRepository) GetByPath(ctx context.Context, path string) (*File, error) {
query := `
SELECT id, path, mtime, ctime, size, mode, uid, gid, link_target
SELECT id, path, source_path, mtime, ctime, size, mode, uid, gid, link_target
FROM files
WHERE path = ?
`
var file File
var mtimeUnix, ctimeUnix int64
var linkTarget sql.NullString
err := r.db.conn.QueryRowContext(ctx, query, path).Scan(
&file.ID,
&file.Path,
&mtimeUnix,
&ctimeUnix,
&file.Size,
&file.Mode,
&file.UID,
&file.GID,
&linkTarget,
)
file, err := r.scanFile(r.db.conn.QueryRowContext(ctx, query, path))
if err == sql.ErrNoRows {
return nil, nil
}
@@ -83,39 +76,18 @@ func (r *FileRepository) GetByPath(ctx context.Context, path string) (*File, err
return nil, fmt.Errorf("querying file: %w", err)
}
file.MTime = time.Unix(mtimeUnix, 0).UTC()
file.CTime = time.Unix(ctimeUnix, 0).UTC()
if linkTarget.Valid {
file.LinkTarget = linkTarget.String
}
return &file, nil
return file, nil
}
// GetByID retrieves a file by its UUID
func (r *FileRepository) GetByID(ctx context.Context, id string) (*File, error) {
func (r *FileRepository) GetByID(ctx context.Context, id types.FileID) (*File, error) {
query := `
SELECT id, path, mtime, ctime, size, mode, uid, gid, link_target
SELECT id, path, source_path, mtime, ctime, size, mode, uid, gid, link_target
FROM files
WHERE id = ?
`
var file File
var mtimeUnix, ctimeUnix int64
var linkTarget sql.NullString
err := r.db.conn.QueryRowContext(ctx, query, id).Scan(
&file.ID,
&file.Path,
&mtimeUnix,
&ctimeUnix,
&file.Size,
&file.Mode,
&file.UID,
&file.GID,
&linkTarget,
)
file, err := r.scanFile(r.db.conn.QueryRowContext(ctx, query, id.String()))
if err == sql.ErrNoRows {
return nil, nil
}
@@ -123,38 +95,18 @@ func (r *FileRepository) GetByID(ctx context.Context, id string) (*File, error)
return nil, fmt.Errorf("querying file: %w", err)
}
file.MTime = time.Unix(mtimeUnix, 0).UTC()
file.CTime = time.Unix(ctimeUnix, 0).UTC()
if linkTarget.Valid {
file.LinkTarget = linkTarget.String
}
return &file, nil
return file, nil
}
func (r *FileRepository) GetByPathTx(ctx context.Context, tx *sql.Tx, path string) (*File, error) {
query := `
SELECT id, path, mtime, ctime, size, mode, uid, gid, link_target
SELECT id, path, source_path, mtime, ctime, size, mode, uid, gid, link_target
FROM files
WHERE path = ?
`
var file File
var mtimeUnix, ctimeUnix int64
var linkTarget sql.NullString
LogSQL("GetByPathTx QueryRowContext", query, path)
err := tx.QueryRowContext(ctx, query, path).Scan(
&file.ID,
&file.Path,
&mtimeUnix,
&ctimeUnix,
&file.Size,
&file.Mode,
&file.UID,
&file.GID,
&linkTarget,
)
file, err := r.scanFile(tx.QueryRowContext(ctx, query, path))
LogSQL("GetByPathTx Scan complete", query, path)
if err == sql.ErrNoRows {
@@ -164,10 +116,80 @@ func (r *FileRepository) GetByPathTx(ctx context.Context, tx *sql.Tx, path strin
return nil, fmt.Errorf("querying file: %w", err)
}
return file, nil
}
// scanFile is a helper that scans a single file row
func (r *FileRepository) scanFile(row *sql.Row) (*File, error) {
var file File
var idStr, pathStr, sourcePathStr string
var mtimeUnix, ctimeUnix int64
var linkTarget sql.NullString
err := row.Scan(
&idStr,
&pathStr,
&sourcePathStr,
&mtimeUnix,
&ctimeUnix,
&file.Size,
&file.Mode,
&file.UID,
&file.GID,
&linkTarget,
)
if err != nil {
return nil, err
}
file.ID, err = types.ParseFileID(idStr)
if err != nil {
return nil, fmt.Errorf("parsing file ID: %w", err)
}
file.Path = types.FilePath(pathStr)
file.SourcePath = types.SourcePath(sourcePathStr)
file.MTime = time.Unix(mtimeUnix, 0).UTC()
file.CTime = time.Unix(ctimeUnix, 0).UTC()
if linkTarget.Valid {
file.LinkTarget = linkTarget.String
file.LinkTarget = types.FilePath(linkTarget.String)
}
return &file, nil
}
// scanFileRows is a helper that scans a file row from a *sql.Rows iterator
func (r *FileRepository) scanFileRows(rows *sql.Rows) (*File, error) {
var file File
var idStr, pathStr, sourcePathStr string
var mtimeUnix, ctimeUnix int64
var linkTarget sql.NullString
err := rows.Scan(
&idStr,
&pathStr,
&sourcePathStr,
&mtimeUnix,
&ctimeUnix,
&file.Size,
&file.Mode,
&file.UID,
&file.GID,
&linkTarget,
)
if err != nil {
return nil, err
}
file.ID, err = types.ParseFileID(idStr)
if err != nil {
return nil, fmt.Errorf("parsing file ID: %w", err)
}
file.Path = types.FilePath(pathStr)
file.SourcePath = types.SourcePath(sourcePathStr)
file.MTime = time.Unix(mtimeUnix, 0).UTC()
file.CTime = time.Unix(ctimeUnix, 0).UTC()
if linkTarget.Valid {
file.LinkTarget = types.FilePath(linkTarget.String)
}
return &file, nil
@@ -175,7 +197,7 @@ func (r *FileRepository) GetByPathTx(ctx context.Context, tx *sql.Tx, path strin
func (r *FileRepository) ListModifiedSince(ctx context.Context, since time.Time) ([]*File, error) {
query := `
SELECT id, path, mtime, ctime, size, mode, uid, gid, link_target
SELECT id, path, source_path, mtime, ctime, size, mode, uid, gid, link_target
FROM files
WHERE mtime >= ?
ORDER BY path
@@ -189,32 +211,11 @@ func (r *FileRepository) ListModifiedSince(ctx context.Context, since time.Time)
var files []*File
for rows.Next() {
var file File
var mtimeUnix, ctimeUnix int64
var linkTarget sql.NullString
err := rows.Scan(
&file.ID,
&file.Path,
&mtimeUnix,
&ctimeUnix,
&file.Size,
&file.Mode,
&file.UID,
&file.GID,
&linkTarget,
)
file, err := r.scanFileRows(rows)
if err != nil {
return nil, fmt.Errorf("scanning file: %w", err)
}
file.MTime = time.Unix(mtimeUnix, 0)
file.CTime = time.Unix(ctimeUnix, 0)
if linkTarget.Valid {
file.LinkTarget = linkTarget.String
}
files = append(files, &file)
files = append(files, file)
}
return files, rows.Err()
@@ -238,14 +239,14 @@ func (r *FileRepository) Delete(ctx context.Context, tx *sql.Tx, path string) er
}
// DeleteByID deletes a file by its UUID
func (r *FileRepository) DeleteByID(ctx context.Context, tx *sql.Tx, id string) error {
func (r *FileRepository) DeleteByID(ctx context.Context, tx *sql.Tx, id types.FileID) error {
query := `DELETE FROM files WHERE id = ?`
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, id)
_, err = tx.ExecContext(ctx, query, id.String())
} else {
_, err = r.db.ExecWithLog(ctx, query, id)
_, err = r.db.ExecWithLog(ctx, query, id.String())
}
if err != nil {
@@ -257,7 +258,7 @@ func (r *FileRepository) DeleteByID(ctx context.Context, tx *sql.Tx, id string)
func (r *FileRepository) ListByPrefix(ctx context.Context, prefix string) ([]*File, error) {
query := `
SELECT id, path, mtime, ctime, size, mode, uid, gid, link_target
SELECT id, path, source_path, mtime, ctime, size, mode, uid, gid, link_target
FROM files
WHERE path LIKE ? || '%'
ORDER BY path
@@ -271,37 +272,92 @@ func (r *FileRepository) ListByPrefix(ctx context.Context, prefix string) ([]*Fi
var files []*File
for rows.Next() {
var file File
var mtimeUnix, ctimeUnix int64
var linkTarget sql.NullString
err := rows.Scan(
&file.ID,
&file.Path,
&mtimeUnix,
&ctimeUnix,
&file.Size,
&file.Mode,
&file.UID,
&file.GID,
&linkTarget,
)
file, err := r.scanFileRows(rows)
if err != nil {
return nil, fmt.Errorf("scanning file: %w", err)
}
file.MTime = time.Unix(mtimeUnix, 0)
file.CTime = time.Unix(ctimeUnix, 0)
if linkTarget.Valid {
file.LinkTarget = linkTarget.String
}
files = append(files, &file)
files = append(files, file)
}
return files, rows.Err()
}
// ListAll returns all files in the database
func (r *FileRepository) ListAll(ctx context.Context) ([]*File, error) {
query := `
SELECT id, path, source_path, mtime, ctime, size, mode, uid, gid, link_target
FROM files
ORDER BY path
`
rows, err := r.db.conn.QueryContext(ctx, query)
if err != nil {
return nil, fmt.Errorf("querying files: %w", err)
}
defer CloseRows(rows)
var files []*File
for rows.Next() {
file, err := r.scanFileRows(rows)
if err != nil {
return nil, fmt.Errorf("scanning file: %w", err)
}
files = append(files, file)
}
return files, rows.Err()
}
// CreateBatch inserts or updates multiple files in a single statement for efficiency.
// File IDs must be pre-generated before calling this method.
func (r *FileRepository) CreateBatch(ctx context.Context, tx *sql.Tx, files []*File) error {
if len(files) == 0 {
return nil
}
// Each File binds 10 SQL parameters, so batch at 100 to stay safely under SQLite's variable limit
const batchSize = 100
for i := 0; i < len(files); i += batchSize {
end := i + batchSize
if end > len(files) {
end = len(files)
}
batch := files[i:end]
query := `INSERT INTO files (id, path, source_path, mtime, ctime, size, mode, uid, gid, link_target) VALUES `
args := make([]interface{}, 0, len(batch)*10)
for j, f := range batch {
if j > 0 {
query += ", "
}
query += "(?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"
args = append(args, f.ID.String(), f.Path.String(), f.SourcePath.String(), f.MTime.Unix(), f.CTime.Unix(), f.Size, f.Mode, f.UID, f.GID, f.LinkTarget.String())
}
query += ` ON CONFLICT(path) DO UPDATE SET
source_path = excluded.source_path,
mtime = excluded.mtime,
ctime = excluded.ctime,
size = excluded.size,
mode = excluded.mode,
uid = excluded.uid,
gid = excluded.gid,
link_target = excluded.link_target`
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, args...)
} else {
_, err = r.db.ExecWithLog(ctx, query, args...)
}
if err != nil {
return fmt.Errorf("batch inserting files: %w", err)
}
}
return nil
}
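// Sketch (illustrative, not part of this change): a caller of CreateBatch
// must pre-generate IDs with types.NewFileID, since the batched statement
// cannot return per-row UUIDs.
func exampleCreateBatch(ctx context.Context, repos *Repositories, paths []string) error {
	now := time.Now()
	files := make([]*File, 0, len(paths))
	for _, p := range paths {
		files = append(files, &File{
			ID:    types.NewFileID(), // pre-generated, as the doc comment requires
			Path:  types.FilePath(p),
			MTime: now,
			CTime: now,
			Mode:  0644,
		})
	}
	// A nil tx runs the batch outside an explicit transaction
	return repos.Files.CreateBatch(ctx, nil, files)
}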
// DeleteOrphaned deletes files that are not referenced by any snapshot
func (r *FileRepository) DeleteOrphaned(ctx context.Context) error {
query := `


@@ -53,7 +53,7 @@ func TestFileRepository(t *testing.T) {
}
// Test GetByPath
retrieved, err := repo.GetByPath(ctx, file.Path)
retrieved, err := repo.GetByPath(ctx, file.Path.String())
if err != nil {
t.Fatalf("failed to get file: %v", err)
}
@@ -81,7 +81,7 @@ func TestFileRepository(t *testing.T) {
t.Fatalf("failed to update file: %v", err)
}
retrieved, err = repo.GetByPath(ctx, file.Path)
retrieved, err = repo.GetByPath(ctx, file.Path.String())
if err != nil {
t.Fatalf("failed to get updated file: %v", err)
}
@@ -99,12 +99,12 @@ func TestFileRepository(t *testing.T) {
}
// Test Delete
err = repo.Delete(ctx, nil, file.Path)
err = repo.Delete(ctx, nil, file.Path.String())
if err != nil {
t.Fatalf("failed to delete file: %v", err)
}
retrieved, err = repo.GetByPath(ctx, file.Path)
retrieved, err = repo.GetByPath(ctx, file.Path.String())
if err != nil {
t.Fatalf("error getting deleted file: %v", err)
}
@@ -137,7 +137,7 @@ func TestFileRepositorySymlink(t *testing.T) {
t.Fatalf("failed to create symlink: %v", err)
}
retrieved, err := repo.GetByPath(ctx, symlink.Path)
retrieved, err := repo.GetByPath(ctx, symlink.Path.String())
if err != nil {
t.Fatalf("failed to get symlink: %v", err)
}


@@ -2,22 +2,27 @@
// It includes types for files, chunks, blobs, snapshots, and their relationships.
package database
import "time"
import (
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
// File represents a file or directory in the backup system.
// It stores metadata about files including timestamps, permissions, ownership,
// and symlink targets. This information is used to restore files with their
// original attributes.
type File struct {
ID string // UUID primary key
Path string
ID types.FileID // UUID primary key
Path types.FilePath // Absolute path of the file
SourcePath types.SourcePath // The source directory this file came from (for restore path stripping)
MTime time.Time
CTime time.Time
Size int64
Mode uint32
UID uint32
GID uint32
LinkTarget string // empty for regular files, target path for symlinks
LinkTarget types.FilePath // empty for regular files, target path for symlinks
}
// IsSymlink returns true if this file is a symbolic link.
@@ -30,16 +35,16 @@ func (f *File) IsSymlink() bool {
// Large files are split into multiple chunks for efficient deduplication and storage.
// The Idx field maintains the order of chunks within a file.
type FileChunk struct {
FileID string
FileID types.FileID
Idx int
ChunkHash string
ChunkHash types.ChunkHash
}
// Chunk represents a data chunk in the deduplication system.
// Files are split into chunks which are content-addressed by their hash.
// The ChunkHash is the SHA256 hash of the chunk content, used for deduplication.
type Chunk struct {
ChunkHash string
ChunkHash types.ChunkHash
Size int64
}
@@ -51,8 +56,8 @@ type Chunk struct {
// The blob creation process is: chunks are accumulated -> compressed with zstd
// -> encrypted with age -> hashed -> uploaded to S3 with the hash as filename.
type Blob struct {
ID string // UUID assigned when blob creation starts
Hash string // SHA256 of final compressed+encrypted content (empty until finalized)
ID types.BlobID // UUID assigned when blob creation starts
Hash types.BlobHash // SHA256 of final compressed+encrypted content (empty until finalized)
CreatedTS time.Time // When blob creation started
FinishedTS *time.Time // When blob was finalized (nil if still packing)
UncompressedSize int64 // Total size of raw chunks before compression
@@ -65,8 +70,8 @@ type Blob struct {
// their position and size within the blob. The offset and length fields
// enable extracting specific chunks from a blob without processing the entire blob.
type BlobChunk struct {
BlobID string
ChunkHash string
BlobID types.BlobID
ChunkHash types.ChunkHash
Offset int64
Length int64
}
@@ -75,18 +80,18 @@ type BlobChunk struct {
// This is used during deduplication to identify all files that share a chunk,
// which is important for garbage collection and integrity verification.
type ChunkFile struct {
ChunkHash string
FileID string
ChunkHash types.ChunkHash
FileID types.FileID
FileOffset int64
Length int64
}
// Snapshot represents a snapshot record in the database
type Snapshot struct {
ID string
Hostname string
VaultikVersion string
VaultikGitRevision string
ID types.SnapshotID
Hostname types.Hostname
VaultikVersion types.Version
VaultikGitRevision types.GitRevision
StartedAt time.Time
CompletedAt *time.Time // nil if still in progress
FileCount int64
@@ -108,13 +113,13 @@ func (s *Snapshot) IsComplete() bool {
// SnapshotFile represents the mapping between snapshots and files
type SnapshotFile struct {
SnapshotID string
FileID string
SnapshotID types.SnapshotID
FileID types.FileID
}
// SnapshotBlob represents the mapping between snapshots and blobs
type SnapshotBlob struct {
SnapshotID string
BlobID string
BlobHash string // Denormalized for easier manifest generation
SnapshotID types.SnapshotID
BlobID types.BlobID
BlobHash types.BlobHash // Denormalized for easier manifest generation
}
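// Sketch (illustrative, not vaultik's actual packer): the pipeline the Blob
// doc comment above describes, i.e. chunks -> zstd compress -> age encrypt
// -> SHA256 of the result. Assumes the filippo.io/age and
// github.com/klauspost/compress/zstd packages plus bytes, crypto/sha256,
// and encoding/hex from the standard library.
func examplePackBlob(chunks [][]byte, recipient age.Recipient) ([]byte, string, error) {
	var buf bytes.Buffer
	// age encryption forms the outermost layer of the stored blob
	encw, err := age.Encrypt(&buf, recipient)
	if err != nil {
		return nil, "", err
	}
	// the zstd writer wraps the encryptor, so plaintext is compressed
	// before it is encrypted
	zw, err := zstd.NewWriter(encw)
	if err != nil {
		return nil, "", err
	}
	for _, c := range chunks {
		if _, err := zw.Write(c); err != nil {
			return nil, "", err
		}
	}
	if err := zw.Close(); err != nil { // flush the compressor first
		return nil, "", err
	}
	if err := encw.Close(); err != nil { // then finalize the encryption
		return nil, "", err
	}
	sum := sha256.Sum256(buf.Bytes()) // Blob.Hash: SHA256 of the final ciphertext
	return buf.Bytes(), hex.EncodeToString(sum[:]), nil
}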


@@ -75,6 +75,11 @@ func (r *Repositories) WithTx(ctx context.Context, fn TxFunc) error {
return tx.Commit()
}
// DB returns the underlying database for direct queries
func (r *Repositories) DB() *DB {
return r.db
}
// WithReadTx executes a function within a read-only transaction.
// Read transactions can run concurrently with other read transactions
// but will be blocked by write transactions. The transaction is


@@ -6,6 +6,8 @@ import (
"fmt"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
func TestRepositoriesTransaction(t *testing.T) {
@@ -33,7 +35,7 @@ func TestRepositoriesTransaction(t *testing.T) {
// Create chunks
chunk1 := &Chunk{
ChunkHash: "tx_chunk1",
ChunkHash: types.ChunkHash("tx_chunk1"),
Size: 512,
}
if err := repos.Chunks.Create(ctx, tx, chunk1); err != nil {
@@ -41,7 +43,7 @@ func TestRepositoriesTransaction(t *testing.T) {
}
chunk2 := &Chunk{
ChunkHash: "tx_chunk2",
ChunkHash: types.ChunkHash("tx_chunk2"),
Size: 512,
}
if err := repos.Chunks.Create(ctx, tx, chunk2); err != nil {
@@ -69,8 +71,8 @@ func TestRepositoriesTransaction(t *testing.T) {
// Create blob
blob := &Blob{
ID: "tx-blob-id-1",
Hash: "tx_blob1",
ID: types.NewBlobID(),
Hash: types.BlobHash("tx_blob1"),
CreatedTS: time.Now().Truncate(time.Second),
}
if err := repos.Blobs.Create(ctx, tx, blob); err != nil {
@@ -156,7 +158,7 @@ func TestRepositoriesTransactionRollback(t *testing.T) {
// Create a chunk
chunk := &Chunk{
ChunkHash: "rollback_chunk",
ChunkHash: types.ChunkHash("rollback_chunk"),
Size: 1024,
}
if err := repos.Chunks.Create(ctx, tx, chunk); err != nil {


@@ -6,6 +6,8 @@ import (
"fmt"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
// TestFileRepositoryUUIDGeneration tests that files get unique UUIDs
@@ -46,15 +48,15 @@ func TestFileRepositoryUUIDGeneration(t *testing.T) {
}
// Check UUID was generated
if file.ID == "" {
if file.ID.IsZero() {
t.Error("file ID was not generated")
}
// Check UUID is unique
if uuids[file.ID] {
if uuids[file.ID.String()] {
t.Errorf("duplicate UUID generated: %s", file.ID)
}
uuids[file.ID] = true
uuids[file.ID.String()] = true
}
}
@@ -96,7 +98,8 @@ func TestFileRepositoryGetByID(t *testing.T) {
}
// Test non-existent ID
nonExistent, err := repo.GetByID(ctx, "non-existent-uuid")
nonExistentID := types.NewFileID() // Generate a new UUID that won't exist in the database
nonExistent, err := repo.GetByID(ctx, nonExistentID)
if err != nil {
t.Fatalf("GetByID should not return error for non-existent ID: %v", err)
}
@@ -154,7 +157,7 @@ func TestOrphanedFileCleanup(t *testing.T) {
}
// Add file2 to snapshot
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot.ID, file2.ID)
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot.ID.String(), file2.ID)
if err != nil {
t.Fatalf("failed to add file to snapshot: %v", err)
}
@@ -194,11 +197,11 @@ func TestOrphanedChunkCleanup(t *testing.T) {
// Create chunks
chunk1 := &Chunk{
ChunkHash: "orphaned-chunk",
ChunkHash: types.ChunkHash("orphaned-chunk"),
Size: 1024,
}
chunk2 := &Chunk{
ChunkHash: "referenced-chunk",
ChunkHash: types.ChunkHash("referenced-chunk"),
Size: 1024,
}
@@ -244,7 +247,7 @@ func TestOrphanedChunkCleanup(t *testing.T) {
}
// Check that orphaned chunk is gone
orphanedChunk, err := repos.Chunks.GetByHash(ctx, chunk1.ChunkHash)
orphanedChunk, err := repos.Chunks.GetByHash(ctx, chunk1.ChunkHash.String())
if err != nil {
t.Fatalf("error getting chunk: %v", err)
}
@@ -253,7 +256,7 @@ func TestOrphanedChunkCleanup(t *testing.T) {
}
// Check that referenced chunk still exists
referencedChunk, err := repos.Chunks.GetByHash(ctx, chunk2.ChunkHash)
referencedChunk, err := repos.Chunks.GetByHash(ctx, chunk2.ChunkHash.String())
if err != nil {
t.Fatalf("error getting chunk: %v", err)
}
@@ -272,13 +275,13 @@ func TestOrphanedBlobCleanup(t *testing.T) {
// Create blobs
blob1 := &Blob{
ID: "orphaned-blob-id",
Hash: "orphaned-blob",
ID: types.NewBlobID(),
Hash: types.BlobHash("orphaned-blob"),
CreatedTS: time.Now().Truncate(time.Second),
}
blob2 := &Blob{
ID: "referenced-blob-id",
Hash: "referenced-blob",
ID: types.NewBlobID(),
Hash: types.BlobHash("referenced-blob"),
CreatedTS: time.Now().Truncate(time.Second),
}
@@ -303,7 +306,7 @@ func TestOrphanedBlobCleanup(t *testing.T) {
}
// Add blob2 to snapshot
err = repos.Snapshots.AddBlob(ctx, nil, snapshot.ID, blob2.ID, blob2.Hash)
err = repos.Snapshots.AddBlob(ctx, nil, snapshot.ID.String(), blob2.ID, blob2.Hash)
if err != nil {
t.Fatalf("failed to add blob to snapshot: %v", err)
}
@@ -315,7 +318,7 @@ func TestOrphanedBlobCleanup(t *testing.T) {
}
// Check that orphaned blob is gone
orphanedBlob, err := repos.Blobs.GetByID(ctx, blob1.ID)
orphanedBlob, err := repos.Blobs.GetByID(ctx, blob1.ID.String())
if err != nil {
t.Fatalf("error getting blob: %v", err)
}
@@ -324,7 +327,7 @@ func TestOrphanedBlobCleanup(t *testing.T) {
}
// Check that referenced blob still exists
referencedBlob, err := repos.Blobs.GetByID(ctx, blob2.ID)
referencedBlob, err := repos.Blobs.GetByID(ctx, blob2.ID.String())
if err != nil {
t.Fatalf("error getting blob: %v", err)
}
@@ -357,7 +360,7 @@ func TestFileChunkRepositoryWithUUIDs(t *testing.T) {
}
// Create chunks
chunks := []string{"chunk1", "chunk2", "chunk3"}
chunks := []types.ChunkHash{"chunk1", "chunk2", "chunk3"}
for i, chunkHash := range chunks {
chunk := &Chunk{
ChunkHash: chunkHash,
@@ -443,7 +446,7 @@ func TestChunkFileRepositoryWithUUIDs(t *testing.T) {
// Create a chunk that appears in both files (deduplication)
chunk := &Chunk{
ChunkHash: "shared-chunk",
ChunkHash: types.ChunkHash("shared-chunk"),
Size: 1024,
}
err = repos.Chunks.Create(ctx, nil, chunk)
@@ -526,7 +529,7 @@ func TestSnapshotRepositoryExtendedFields(t *testing.T) {
}
// Retrieve and verify
retrieved, err := repo.GetByID(ctx, snapshot.ID)
retrieved, err := repo.GetByID(ctx, snapshot.ID.String())
if err != nil {
t.Fatalf("failed to get snapshot: %v", err)
}
@@ -581,7 +584,7 @@ func TestComplexOrphanedDataScenario(t *testing.T) {
files := make([]*File, 3)
for i := range files {
files[i] = &File{
Path: fmt.Sprintf("/file%d.txt", i),
Path: types.FilePath(fmt.Sprintf("/file%d.txt", i)),
MTime: time.Now().Truncate(time.Second),
CTime: time.Now().Truncate(time.Second),
Size: 1024,
@@ -601,29 +604,29 @@ func TestComplexOrphanedDataScenario(t *testing.T) {
// file0: only in snapshot1
// file1: in both snapshots
// file2: only in snapshot2
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot1.ID, files[0].ID)
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot1.ID.String(), files[0].ID)
if err != nil {
t.Fatal(err)
}
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot1.ID, files[1].ID)
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot1.ID.String(), files[1].ID)
if err != nil {
t.Fatal(err)
}
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot2.ID, files[1].ID)
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot2.ID.String(), files[1].ID)
if err != nil {
t.Fatal(err)
}
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot2.ID, files[2].ID)
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot2.ID.String(), files[2].ID)
if err != nil {
t.Fatal(err)
}
// Delete snapshot1
err = repos.Snapshots.DeleteSnapshotFiles(ctx, snapshot1.ID)
err = repos.Snapshots.DeleteSnapshotFiles(ctx, snapshot1.ID.String())
if err != nil {
t.Fatal(err)
}
err = repos.Snapshots.Delete(ctx, snapshot1.ID)
err = repos.Snapshots.Delete(ctx, snapshot1.ID.String())
if err != nil {
t.Fatal(err)
}
@@ -689,7 +692,7 @@ func TestCascadeDelete(t *testing.T) {
// Create chunks and file-chunk mappings
for i := 0; i < 3; i++ {
chunk := &Chunk{
ChunkHash: fmt.Sprintf("cascade-chunk-%d", i),
ChunkHash: types.ChunkHash(fmt.Sprintf("cascade-chunk-%d", i)),
Size: 1024,
}
err = repos.Chunks.Create(ctx, nil, chunk)
@@ -807,7 +810,7 @@ func TestConcurrentOrphanedCleanup(t *testing.T) {
// Create many files, some orphaned
for i := 0; i < 20; i++ {
file := &File{
Path: fmt.Sprintf("/concurrent-%d.txt", i),
Path: types.FilePath(fmt.Sprintf("/concurrent-%d.txt", i)),
MTime: time.Now().Truncate(time.Second),
CTime: time.Now().Truncate(time.Second),
Size: 1024,
@@ -822,7 +825,7 @@ func TestConcurrentOrphanedCleanup(t *testing.T) {
// Add even-numbered files to snapshot
if i%2 == 0 {
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot.ID, file.ID)
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot.ID.String(), file.ID)
if err != nil {
t.Fatal(err)
}
@@ -860,7 +863,7 @@ func TestConcurrentOrphanedCleanup(t *testing.T) {
// Verify all remaining files are even-numbered
for _, file := range files {
var num int
_, err := fmt.Sscanf(file.Path, "/concurrent-%d.txt", &num)
_, err := fmt.Sscanf(file.Path.String(), "/concurrent-%d.txt", &num)
if err != nil {
t.Logf("failed to parse file number from %s: %v", file.Path, err)
}


@@ -67,7 +67,7 @@ func TestOrphanedFileCleanupDebug(t *testing.T) {
t.Logf("snapshot_files count before add: %d", count)
// Add file2 to snapshot
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot.ID, file2.ID)
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot.ID.String(), file2.ID)
if err != nil {
t.Fatalf("failed to add file to snapshot: %v", err)
}


@@ -6,6 +6,8 @@ import (
"strings"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
// TestFileRepositoryEdgeCases tests edge cases for file repository
@@ -38,7 +40,7 @@ func TestFileRepositoryEdgeCases(t *testing.T) {
{
name: "very long path",
file: &File{
Path: "/" + strings.Repeat("a", 4096),
Path: types.FilePath("/" + strings.Repeat("a", 4096)),
MTime: time.Now(),
CTime: time.Now(),
Size: 1024,
@@ -94,7 +96,7 @@ func TestFileRepositoryEdgeCases(t *testing.T) {
t.Run(tt.name, func(t *testing.T) {
// Add a unique suffix to paths to avoid UNIQUE constraint violations
if tt.file.Path != "" {
tt.file.Path = fmt.Sprintf("%s_%d_%d", tt.file.Path, i, time.Now().UnixNano())
tt.file.Path = types.FilePath(fmt.Sprintf("%s_%d_%d", tt.file.Path, i, time.Now().UnixNano()))
}
err := repo.Create(ctx, nil, tt.file)
@@ -169,7 +171,7 @@ func TestDuplicateHandling(t *testing.T) {
// Test duplicate chunk hashes
t.Run("duplicate chunk hashes", func(t *testing.T) {
chunk := &Chunk{
ChunkHash: "duplicate-chunk",
ChunkHash: types.ChunkHash("duplicate-chunk"),
Size: 1024,
}
@@ -202,7 +204,7 @@ func TestDuplicateHandling(t *testing.T) {
}
chunk := &Chunk{
ChunkHash: "test-chunk-dup",
ChunkHash: types.ChunkHash("test-chunk-dup"),
Size: 1024,
}
err = repos.Chunks.Create(ctx, nil, chunk)
@@ -279,7 +281,7 @@ func TestNullHandling(t *testing.T) {
t.Fatal(err)
}
retrieved, err := repos.Snapshots.GetByID(ctx, snapshot.ID)
retrieved, err := repos.Snapshots.GetByID(ctx, snapshot.ID.String())
if err != nil {
t.Fatal(err)
}
@@ -292,8 +294,8 @@ func TestNullHandling(t *testing.T) {
// Test blob with NULL uploaded_ts
t.Run("blob not uploaded", func(t *testing.T) {
blob := &Blob{
ID: "not-uploaded",
Hash: "test-hash",
ID: types.NewBlobID(),
Hash: types.BlobHash("test-hash"),
CreatedTS: time.Now(),
UploadedTS: nil, // Not uploaded yet
}
@@ -303,7 +305,7 @@ func TestNullHandling(t *testing.T) {
t.Fatal(err)
}
retrieved, err := repos.Blobs.GetByID(ctx, blob.ID)
retrieved, err := repos.Blobs.GetByID(ctx, blob.ID.String())
if err != nil {
t.Fatal(err)
}
@@ -339,13 +341,13 @@ func TestLargeDatasets(t *testing.T) {
// Create many files
const fileCount = 1000
fileIDs := make([]string, fileCount)
fileIDs := make([]types.FileID, fileCount)
t.Run("create many files", func(t *testing.T) {
start := time.Now()
for i := 0; i < fileCount; i++ {
file := &File{
Path: fmt.Sprintf("/large/file%05d.txt", i),
Path: types.FilePath(fmt.Sprintf("/large/file%05d.txt", i)),
MTime: time.Now(),
CTime: time.Now(),
Size: int64(i * 1024),
@@ -361,7 +363,7 @@ func TestLargeDatasets(t *testing.T) {
// Add half to snapshot
if i%2 == 0 {
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot.ID, file.ID)
err = repos.Snapshots.AddFileByID(ctx, nil, snapshot.ID.String(), file.ID)
if err != nil {
t.Fatal(err)
}
@@ -413,7 +415,7 @@ func TestErrorPropagation(t *testing.T) {
// Test GetByID with non-existent ID
t.Run("GetByID non-existent", func(t *testing.T) {
file, err := repos.Files.GetByID(ctx, "non-existent-uuid")
file, err := repos.Files.GetByID(ctx, types.NewFileID())
if err != nil {
t.Errorf("GetByID should not return error for non-existent ID, got: %v", err)
}
@@ -436,9 +438,9 @@ func TestErrorPropagation(t *testing.T) {
// Test invalid foreign key reference
t.Run("invalid foreign key", func(t *testing.T) {
fc := &FileChunk{
FileID: "non-existent-file-id",
FileID: types.NewFileID(),
Idx: 0,
ChunkHash: "some-chunk",
ChunkHash: types.ChunkHash("some-chunk"),
}
err := repos.FileChunks.Create(ctx, nil, fc)
if err == nil {
@@ -470,7 +472,7 @@ func TestQueryInjection(t *testing.T) {
t.Run("injection attempt", func(t *testing.T) {
// Try injection in file path
file := &File{
Path: injection,
Path: types.FilePath(injection),
MTime: time.Now(),
CTime: time.Now(),
Size: 1024,


@@ -6,6 +6,7 @@
CREATE TABLE IF NOT EXISTS files (
id TEXT PRIMARY KEY, -- UUID
path TEXT NOT NULL UNIQUE,
source_path TEXT NOT NULL DEFAULT '', -- The source directory this file came from (for restore path stripping)
mtime INTEGER NOT NULL,
ctime INTEGER NOT NULL,
size INTEGER NOT NULL,
@@ -28,6 +29,9 @@ CREATE TABLE IF NOT EXISTS file_chunks (
FOREIGN KEY (chunk_hash) REFERENCES chunks(chunk_hash)
);
-- Index for efficient chunk lookups (used in orphan detection)
CREATE INDEX IF NOT EXISTS idx_file_chunks_chunk_hash ON file_chunks(chunk_hash);
-- Chunks table: stores unique content-defined chunks
CREATE TABLE IF NOT EXISTS chunks (
chunk_hash TEXT PRIMARY KEY,
@@ -56,6 +60,9 @@ CREATE TABLE IF NOT EXISTS blob_chunks (
FOREIGN KEY (chunk_hash) REFERENCES chunks(chunk_hash)
);
-- Index for efficient chunk lookups (used in orphan detection)
CREATE INDEX IF NOT EXISTS idx_blob_chunks_chunk_hash ON blob_chunks(chunk_hash);
-- Chunk files table: reverse mapping of chunks to files
CREATE TABLE IF NOT EXISTS chunk_files (
chunk_hash TEXT NOT NULL,
@@ -67,6 +74,9 @@ CREATE TABLE IF NOT EXISTS chunk_files (
FOREIGN KEY (file_id) REFERENCES files(id) ON DELETE CASCADE
);
-- Index for efficient file lookups (used in orphan detection)
CREATE INDEX IF NOT EXISTS idx_chunk_files_file_id ON chunk_files(file_id);
-- Snapshots table: tracks backup snapshots
CREATE TABLE IF NOT EXISTS snapshots (
id TEXT PRIMARY KEY,
@@ -96,6 +106,9 @@ CREATE TABLE IF NOT EXISTS snapshot_files (
FOREIGN KEY (file_id) REFERENCES files(id)
);
-- Index for efficient file lookups (used in orphan detection)
CREATE INDEX IF NOT EXISTS idx_snapshot_files_file_id ON snapshot_files(file_id);
-- Snapshot blobs table: maps snapshots to blobs
CREATE TABLE IF NOT EXISTS snapshot_blobs (
snapshot_id TEXT NOT NULL,
@@ -106,6 +119,9 @@ CREATE TABLE IF NOT EXISTS snapshot_blobs (
FOREIGN KEY (blob_id) REFERENCES blobs(id)
);
-- Index for efficient blob lookups (used in orphan detection)
CREATE INDEX IF NOT EXISTS idx_snapshot_blobs_blob_id ON snapshot_blobs(blob_id);
-- Uploads table: tracks blob upload metrics
CREATE TABLE IF NOT EXISTS uploads (
blob_hash TEXT PRIMARY KEY,
@@ -116,3 +132,6 @@ CREATE TABLE IF NOT EXISTS uploads (
FOREIGN KEY (blob_hash) REFERENCES blobs(blob_hash),
FOREIGN KEY (snapshot_id) REFERENCES snapshots(id)
);
-- Index for efficient snapshot lookups
CREATE INDEX IF NOT EXISTS idx_uploads_snapshot_id ON uploads(snapshot_id);
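-- Illustrative only, not part of the schema: the orphan-detection queries
-- these indexes serve take roughly this shape. A chunk is orphaned when no
-- file or blob references it:
--
--   DELETE FROM chunks
--   WHERE chunk_hash NOT IN (SELECT chunk_hash FROM file_chunks)
--     AND chunk_hash NOT IN (SELECT chunk_hash FROM blob_chunks);
--
-- Without idx_file_chunks_chunk_hash and idx_blob_chunks_chunk_hash, each
-- of those subquery membership checks would scan the full mapping table.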


@@ -5,6 +5,8 @@ import (
"database/sql"
"fmt"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
type SnapshotRepository struct {
@@ -269,7 +271,7 @@ func (r *SnapshotRepository) AddFile(ctx context.Context, tx *sql.Tx, snapshotID
}
// AddFileByID adds a file to a snapshot by file ID
func (r *SnapshotRepository) AddFileByID(ctx context.Context, tx *sql.Tx, snapshotID string, fileID string) error {
func (r *SnapshotRepository) AddFileByID(ctx context.Context, tx *sql.Tx, snapshotID string, fileID types.FileID) error {
query := `
INSERT OR IGNORE INTO snapshot_files (snapshot_id, file_id)
VALUES (?, ?)
@@ -277,9 +279,9 @@ func (r *SnapshotRepository) AddFileByID(ctx context.Context, tx *sql.Tx, snapsh
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, snapshotID, fileID)
_, err = tx.ExecContext(ctx, query, snapshotID, fileID.String())
} else {
_, err = r.db.ExecWithLog(ctx, query, snapshotID, fileID)
_, err = r.db.ExecWithLog(ctx, query, snapshotID, fileID.String())
}
if err != nil {
@@ -289,8 +291,48 @@ func (r *SnapshotRepository) AddFileByID(ctx context.Context, tx *sql.Tx, snapsh
return nil
}
// AddFilesByIDBatch adds multiple files to a snapshot in batched inserts
func (r *SnapshotRepository) AddFilesByIDBatch(ctx context.Context, tx *sql.Tx, snapshotID string, fileIDs []types.FileID) error {
if len(fileIDs) == 0 {
return nil
}
// Each entry binds 2 SQL parameters, so batch at 400 to stay safely under SQLite's variable limit
const batchSize = 400
for i := 0; i < len(fileIDs); i += batchSize {
end := i + batchSize
if end > len(fileIDs) {
end = len(fileIDs)
}
batch := fileIDs[i:end]
query := "INSERT OR IGNORE INTO snapshot_files (snapshot_id, file_id) VALUES "
args := make([]interface{}, 0, len(batch)*2)
for j, fileID := range batch {
if j > 0 {
query += ", "
}
query += "(?, ?)"
args = append(args, snapshotID, fileID.String())
}
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, args...)
} else {
_, err = r.db.ExecWithLog(ctx, query, args...)
}
if err != nil {
return fmt.Errorf("batch adding files to snapshot: %w", err)
}
}
return nil
}
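// Sketch (illustrative, not part of this change): batching file IDs into a
// snapshot inside a write transaction, using Repositories.WithTx as other
// call sites in this diff do.
func exampleAddFilesBatch(ctx context.Context, repos *Repositories, snapshotID string, fileIDs []types.FileID) error {
	return repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
		return repos.Snapshots.AddFilesByIDBatch(ctx, tx, snapshotID, fileIDs)
	})
}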
// AddBlob adds a blob to a snapshot
func (r *SnapshotRepository) AddBlob(ctx context.Context, tx *sql.Tx, snapshotID string, blobID string, blobHash string) error {
func (r *SnapshotRepository) AddBlob(ctx context.Context, tx *sql.Tx, snapshotID string, blobID types.BlobID, blobHash types.BlobHash) error {
query := `
INSERT OR IGNORE INTO snapshot_blobs (snapshot_id, blob_id, blob_hash)
VALUES (?, ?, ?)
@@ -298,9 +340,9 @@ func (r *SnapshotRepository) AddBlob(ctx context.Context, tx *sql.Tx, snapshotID
var err error
if tx != nil {
_, err = tx.ExecContext(ctx, query, snapshotID, blobID, blobHash)
_, err = tx.ExecContext(ctx, query, snapshotID, blobID.String(), blobHash.String())
} else {
_, err = r.db.ExecWithLog(ctx, query, snapshotID, blobID, blobHash)
_, err = r.db.ExecWithLog(ctx, query, snapshotID, blobID.String(), blobHash.String())
}
if err != nil {
@@ -337,6 +379,24 @@ func (r *SnapshotRepository) GetBlobHashes(ctx context.Context, snapshotID strin
return blobs, rows.Err()
}
// GetSnapshotTotalCompressedSize returns the total compressed size of all blobs referenced by a snapshot
func (r *SnapshotRepository) GetSnapshotTotalCompressedSize(ctx context.Context, snapshotID string) (int64, error) {
query := `
SELECT COALESCE(SUM(b.compressed_size), 0)
FROM snapshot_blobs sb
JOIN blobs b ON sb.blob_hash = b.blob_hash
WHERE sb.snapshot_id = ?
`
var totalSize int64
err := r.db.conn.QueryRowContext(ctx, query, snapshotID).Scan(&totalSize)
if err != nil {
return 0, fmt.Errorf("querying total compressed size: %w", err)
}
return totalSize, nil
}
// GetIncompleteSnapshots returns all snapshots that haven't been completed
func (r *SnapshotRepository) GetIncompleteSnapshots(ctx context.Context) ([]*Snapshot, error) {
query := `
@@ -474,3 +534,15 @@ func (r *SnapshotRepository) DeleteSnapshotBlobs(ctx context.Context, snapshotID
return nil
}
// DeleteSnapshotUploads removes all uploads entries for a snapshot
func (r *SnapshotRepository) DeleteSnapshotUploads(ctx context.Context, snapshotID string) error {
query := `DELETE FROM uploads WHERE snapshot_id = ?`
_, err := r.db.ExecWithLog(ctx, query, snapshotID)
if err != nil {
return fmt.Errorf("deleting snapshot uploads: %w", err)
}
return nil
}


@@ -6,6 +6,8 @@ import (
"math"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
const (
@@ -46,7 +48,7 @@ func TestSnapshotRepository(t *testing.T) {
}
// Test GetByID
retrieved, err := repo.GetByID(ctx, snapshot.ID)
retrieved, err := repo.GetByID(ctx, snapshot.ID.String())
if err != nil {
t.Fatalf("failed to get snapshot: %v", err)
}
@@ -64,12 +66,12 @@ func TestSnapshotRepository(t *testing.T) {
}
// Test UpdateCounts
err = repo.UpdateCounts(ctx, nil, snapshot.ID, 200, 1000, 20, twoHundredMebibytes, sixtyMebibytes)
err = repo.UpdateCounts(ctx, nil, snapshot.ID.String(), 200, 1000, 20, twoHundredMebibytes, sixtyMebibytes)
if err != nil {
t.Fatalf("failed to update counts: %v", err)
}
retrieved, err = repo.GetByID(ctx, snapshot.ID)
retrieved, err = repo.GetByID(ctx, snapshot.ID.String())
if err != nil {
t.Fatalf("failed to get updated snapshot: %v", err)
}
@@ -97,7 +99,7 @@ func TestSnapshotRepository(t *testing.T) {
// Add more snapshots
for i := 2; i <= 5; i++ {
s := &Snapshot{
ID: fmt.Sprintf("2024-01-0%dT12:00:00Z", i),
ID: types.SnapshotID(fmt.Sprintf("2024-01-0%dT12:00:00Z", i)),
Hostname: "test-host",
VaultikVersion: "1.0.0",
StartedAt: time.Now().Add(time.Duration(i) * time.Hour).Truncate(time.Second),


@@ -4,13 +4,16 @@ import (
"time"
)
// these get populated from main() and copied into the Globals object.
var (
Appname string = "vaultik"
Version string = "dev"
Commit string = "unknown"
)
// Appname is the application name, populated from main().
var Appname string = "vaultik"
// Version is the application version, populated from main().
var Version string = "dev"
// Commit is the git commit hash, populated from main().
var Commit string = "unknown"
// Globals contains application-wide configuration and metadata.
type Globals struct {
Appname string
Version string
@@ -18,6 +21,7 @@ type Globals struct {
StartTime time.Time
}
// New creates and returns a new Globals instance initialized with the package-level variables.
func New() (*Globals, error) {
return &Globals{
Appname: Appname,


@@ -12,34 +12,41 @@ import (
"golang.org/x/term"
)
// LogLevel represents the logging level
// LogLevel represents the logging level.
type LogLevel int
const (
// LevelFatal represents a fatal error level that will exit the program.
LevelFatal LogLevel = iota
// LevelError represents an error level.
LevelError
// LevelWarn represents a warning level.
LevelWarn
// LevelNotice represents a notice level (mapped to Info in slog).
LevelNotice
// LevelInfo represents an informational level.
LevelInfo
// LevelDebug represents a debug level.
LevelDebug
)
// Logger configuration
// Config holds logger configuration.
type Config struct {
Verbose bool
Debug bool
Cron bool
Quiet bool
}
var logger *slog.Logger
// Initialize sets up the global logger based on the provided configuration
// Initialize sets up the global logger based on the provided configuration.
func Initialize(cfg Config) {
// Determine log level based on configuration
var level slog.Level
if cfg.Cron {
// In cron mode, only show fatal errors (which we'll handle specially)
if cfg.Cron || cfg.Quiet {
// In quiet/cron mode, only show errors
level = slog.LevelError
} else if cfg.Debug || strings.Contains(os.Getenv("GODEBUG"), "vaultik") {
level = slog.LevelDebug
@@ -76,7 +83,7 @@ func getCaller(skip int) string {
return fmt.Sprintf("%s:%d", filepath.Base(file), line)
}
// Fatal logs a fatal error and exits
// Fatal logs a fatal error message and exits the program with code 1.
func Fatal(msg string, args ...any) {
if logger != nil {
// Add caller info to args
@@ -86,12 +93,12 @@ func Fatal(msg string, args ...any) {
os.Exit(1)
}
// Fatalf logs a formatted fatal error and exits
// Fatalf logs a formatted fatal error message and exits the program with code 1.
func Fatalf(format string, args ...any) {
Fatal(fmt.Sprintf(format, args...))
}
// Error logs an error
// Error logs an error message.
func Error(msg string, args ...any) {
if logger != nil {
args = append(args, "caller", getCaller(2))
@@ -99,12 +106,12 @@ func Error(msg string, args ...any) {
}
}
// Errorf logs a formatted error
// Errorf logs a formatted error message.
func Errorf(format string, args ...any) {
Error(fmt.Sprintf(format, args...))
}
// Warn logs a warning
// Warn logs a warning message.
func Warn(msg string, args ...any) {
if logger != nil {
args = append(args, "caller", getCaller(2))
@@ -112,12 +119,12 @@ func Warn(msg string, args ...any) {
}
}
// Warnf logs a formatted warning
// Warnf logs a formatted warning message.
func Warnf(format string, args ...any) {
Warn(fmt.Sprintf(format, args...))
}
// Notice logs a notice (mapped to Info level)
// Notice logs a notice message (mapped to Info level).
func Notice(msg string, args ...any) {
if logger != nil {
args = append(args, "caller", getCaller(2))
@@ -125,12 +132,12 @@ func Notice(msg string, args ...any) {
}
}
// Noticef logs a formatted notice
// Noticef logs a formatted notice message.
func Noticef(format string, args ...any) {
Notice(fmt.Sprintf(format, args...))
}
// Info logs an info message
// Info logs an informational message.
func Info(msg string, args ...any) {
if logger != nil {
args = append(args, "caller", getCaller(2))
@@ -138,12 +145,12 @@ func Info(msg string, args ...any) {
}
}
// Infof logs a formatted info message
// Infof logs a formatted informational message.
func Infof(format string, args ...any) {
Info(fmt.Sprintf(format, args...))
}
// Debug logs a debug message
// Debug logs a debug message.
func Debug(msg string, args ...any) {
if logger != nil {
args = append(args, "caller", getCaller(2))
@@ -151,12 +158,12 @@ func Debug(msg string, args ...any) {
}
}
// Debugf logs a formatted debug message
// Debugf logs a formatted debug message.
func Debugf(format string, args ...any) {
Debug(fmt.Sprintf(format, args...))
}
// With returns a logger with additional context
// With returns a logger with additional context attributes.
func With(args ...any) *slog.Logger {
if logger != nil {
return logger.With(args...)
@@ -164,12 +171,12 @@ func With(args ...any) *slog.Logger {
return slog.Default()
}
// WithContext returns a logger with context
// WithContext returns a logger with the provided context.
func WithContext(ctx context.Context) *slog.Logger {
return logger
}
// Logger returns the underlying slog.Logger
// Logger returns the underlying slog.Logger instance.
func Logger() *slog.Logger {
return logger
}
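// Sketch (illustrative): typical initialization and use of this package.
// Calls are unqualified here because this file is inside package log;
// external callers write log.Initialize, log.Info, and so on.
func exampleUsage() {
	Initialize(Config{Debug: true})
	Info("scan complete", "files", 42)
	With("snapshot", "2024-01-01T12:00:00Z").Debug("uploading blob")
}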


@@ -4,21 +4,22 @@ import (
"go.uber.org/fx"
)
// Module exports logging functionality
// Module exports logging functionality for dependency injection.
var Module = fx.Module("log",
fx.Invoke(func(cfg Config) {
Initialize(cfg)
}),
)
// New creates a new logger configuration from provided options
// New creates a new logger configuration from provided options.
func New(opts LogOptions) Config {
return Config(opts)
}
// LogOptions are provided by the CLI
// LogOptions are provided by the CLI.
type LogOptions struct {
Verbose bool
Debug bool
Cron bool
Quiet bool
}


@@ -21,14 +21,14 @@ const (
colorBold = "\033[1m"
)
// TTYHandler is a custom handler for TTY output with colors
// TTYHandler is a custom slog handler for TTY output with colors.
type TTYHandler struct {
opts slog.HandlerOptions
mu sync.Mutex
out io.Writer
}
// NewTTYHandler creates a new TTY handler
// NewTTYHandler creates a new TTY handler with colored output.
func NewTTYHandler(out io.Writer, opts *slog.HandlerOptions) *TTYHandler {
if opts == nil {
opts = &slog.HandlerOptions{}
@@ -39,12 +39,12 @@ func NewTTYHandler(out io.Writer, opts *slog.HandlerOptions) *TTYHandler {
}
}
// Enabled reports whether the handler handles records at the given level
// Enabled reports whether the handler handles records at the given level.
func (h *TTYHandler) Enabled(_ context.Context, level slog.Level) bool {
return level >= h.opts.Level.Level()
}
// Handle writes the log record
// Handle writes the log record to the output with color formatting.
func (h *TTYHandler) Handle(_ context.Context, r slog.Record) error {
h.mu.Lock()
defer h.mu.Unlock()
@@ -103,12 +103,12 @@ func (h *TTYHandler) Handle(_ context.Context, r slog.Record) error {
return nil
}
// WithAttrs returns a new handler with the given attributes
// WithAttrs returns a new handler with the given attributes.
func (h *TTYHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
return h // Simplified for now
}
// WithGroup returns a new handler with the given group name
// WithGroup returns a new handler with the given group name.
func (h *TTYHandler) WithGroup(name string) slog.Handler {
return h // Simplified for now
}
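// Sketch (illustrative): installing TTYHandler as the backend for a
// *slog.Logger, e.g. when stderr is a terminal. Assumes an os import.
func exampleTTYLogger() *slog.Logger {
	h := NewTTYHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelDebug})
	return slog.New(h)
}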

internal/pidlock/pidlock.go Normal file

@@ -0,0 +1,108 @@
// Package pidlock provides process-level locking using PID files.
// It prevents multiple instances of vaultik from running simultaneously,
// which would cause database locking conflicts.
package pidlock
import (
"errors"
"fmt"
"os"
"path/filepath"
"strconv"
"strings"
"syscall"
)
// ErrAlreadyRunning indicates another vaultik instance is running.
var ErrAlreadyRunning = errors.New("another vaultik instance is already running")
// Lock represents an acquired PID lock.
type Lock struct {
path string
}
// Acquire attempts to acquire a PID lock in the specified directory.
// If the lock file exists and the process is still running, it returns
// ErrAlreadyRunning with details about the existing process.
// On success, it writes the current PID to the lock file and returns
// a Lock that must be released with Release().
func Acquire(lockDir string) (*Lock, error) {
// Ensure lock directory exists
if err := os.MkdirAll(lockDir, 0700); err != nil {
return nil, fmt.Errorf("creating lock directory: %w", err)
}
lockPath := filepath.Join(lockDir, "vaultik.pid")
// Check for existing lock
existingPID, err := readPIDFile(lockPath)
if err == nil {
// Lock file exists, check if process is running
if isProcessRunning(existingPID) {
return nil, fmt.Errorf("%w (PID %d)", ErrAlreadyRunning, existingPID)
}
// Process is not running; the lock file is stale and we can take it over
}
// Write our PID
pid := os.Getpid()
if err := os.WriteFile(lockPath, []byte(strconv.Itoa(pid)), 0600); err != nil {
return nil, fmt.Errorf("writing PID file: %w", err)
}
return &Lock{path: lockPath}, nil
}
// Release removes the PID lock file.
// It is safe to call Release multiple times.
func (l *Lock) Release() error {
if l == nil || l.path == "" {
return nil
}
// Verify we still own the lock (our PID is in the file)
existingPID, err := readPIDFile(l.path)
if err != nil {
// File already gone or unreadable - that's fine
return nil
}
if existingPID != os.Getpid() {
// Someone else wrote to our lock file - don't remove it
return nil
}
if err := os.Remove(l.path); err != nil && !os.IsNotExist(err) {
return fmt.Errorf("removing PID file: %w", err)
}
l.path = "" // Prevent double-release
return nil
}
// readPIDFile reads and parses the PID from a lock file.
func readPIDFile(path string) (int, error) {
data, err := os.ReadFile(path)
if err != nil {
return 0, err
}
pid, err := strconv.Atoi(strings.TrimSpace(string(data)))
if err != nil {
return 0, fmt.Errorf("parsing PID: %w", err)
}
return pid, nil
}
// isProcessRunning checks if a process with the given PID is running.
func isProcessRunning(pid int) bool {
process, err := os.FindProcess(pid)
if err != nil {
return false
}
// On Unix, FindProcess always succeeds. We need to send signal 0 to check.
err = process.Signal(syscall.Signal(0))
return err == nil
}
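// Sketch (illustrative, not part of this change): guarding a run with the
// lock. Because Release is idempotent, deferring it is safe even if the
// caller also releases explicitly on the happy path.
func exampleGuardedRun(lockDir string) error {
	lock, err := Acquire(lockDir)
	if err != nil {
		return err // wraps ErrAlreadyRunning if another instance holds the lock
	}
	defer func() { _ = lock.Release() }()
	// ... work that must not run concurrently with another vaultik ...
	return nil
}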


@@ -0,0 +1,108 @@
package pidlock
import (
"os"
"path/filepath"
"strconv"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestAcquireAndRelease(t *testing.T) {
tmpDir := t.TempDir()
// Acquire lock
lock, err := Acquire(tmpDir)
require.NoError(t, err)
require.NotNil(t, lock)
// Verify PID file exists with our PID
data, err := os.ReadFile(filepath.Join(tmpDir, "vaultik.pid"))
require.NoError(t, err)
pid, err := strconv.Atoi(string(data))
require.NoError(t, err)
assert.Equal(t, os.Getpid(), pid)
// Release lock
err = lock.Release()
require.NoError(t, err)
// Verify PID file is gone
_, err = os.Stat(filepath.Join(tmpDir, "vaultik.pid"))
assert.True(t, os.IsNotExist(err))
}
func TestAcquireBlocksSecondInstance(t *testing.T) {
tmpDir := t.TempDir()
// Acquire first lock
lock1, err := Acquire(tmpDir)
require.NoError(t, err)
require.NotNil(t, lock1)
defer func() { _ = lock1.Release() }()
// Try to acquire second lock - should fail
lock2, err := Acquire(tmpDir)
assert.ErrorIs(t, err, ErrAlreadyRunning)
assert.Nil(t, lock2)
}
func TestAcquireWithStaleLock(t *testing.T) {
tmpDir := t.TempDir()
// Write a stale PID file (PID that doesn't exist)
stalePID := 999999999 // Unlikely to be a real process
pidPath := filepath.Join(tmpDir, "vaultik.pid")
err := os.WriteFile(pidPath, []byte(strconv.Itoa(stalePID)), 0600)
require.NoError(t, err)
// Should be able to acquire lock (stale lock is cleaned up)
lock, err := Acquire(tmpDir)
require.NoError(t, err)
require.NotNil(t, lock)
defer func() { _ = lock.Release() }()
// Verify our PID is now in the file
data, err := os.ReadFile(pidPath)
require.NoError(t, err)
pid, err := strconv.Atoi(string(data))
require.NoError(t, err)
assert.Equal(t, os.Getpid(), pid)
}
func TestReleaseIsIdempotent(t *testing.T) {
tmpDir := t.TempDir()
lock, err := Acquire(tmpDir)
require.NoError(t, err)
// Release multiple times - should not error
err = lock.Release()
require.NoError(t, err)
err = lock.Release()
require.NoError(t, err)
}
func TestReleaseNilLock(t *testing.T) {
var lock *Lock
err := lock.Release()
assert.NoError(t, err)
}
func TestAcquireCreatesDirectory(t *testing.T) {
tmpDir := t.TempDir()
nestedDir := filepath.Join(tmpDir, "nested", "dir")
lock, err := Acquire(nestedDir)
require.NoError(t, err)
require.NotNil(t, lock)
defer func() { _ = lock.Release() }()
// Verify directory was created
info, err := os.Stat(nestedDir)
require.NoError(t, err)
assert.True(t, info.IsDir())
}


@@ -10,6 +10,7 @@ import (
"github.com/aws/aws-sdk-go-v2/credentials"
"github.com/aws/aws-sdk-go-v2/feature/s3/manager"
"github.com/aws/aws-sdk-go-v2/service/s3"
"github.com/aws/smithy-go/logging"
)
// Client wraps the AWS S3 client for vaultik operations.
@@ -35,12 +36,18 @@ type Config struct {
Region string
}
// nopLogger is a logger that discards all output.
// Used to suppress SDK warnings about checksums.
type nopLogger struct{}
func (nopLogger) Logf(classification logging.Classification, format string, v ...interface{}) {}
// NewClient creates a new S3 client with the provided configuration.
// It establishes a connection to the S3-compatible storage service and
// validates the credentials. The client uses static credentials and
// path-style URLs for compatibility with various S3-compatible services.
func NewClient(ctx context.Context, cfg Config) (*Client, error) {
// Create AWS config
// Create AWS config with a nop logger to suppress SDK warnings
awsCfg, err := config.LoadDefaultConfig(ctx,
config.WithRegion(cfg.Region),
config.WithCredentialsProvider(credentials.NewStaticCredentialsProvider(
@@ -48,6 +55,7 @@ func NewClient(ctx context.Context, cfg Config) (*Client, error) {
cfg.SecretAccessKey,
"",
)),
config.WithLogger(nopLogger{}),
)
if err != nil {
return nil, err


@@ -14,6 +14,7 @@ import (
"time"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/types"
)
// MockS3Client is a mock implementation of S3 operations for testing
@@ -138,13 +139,13 @@ func TestBackupWithInMemoryFS(t *testing.T) {
}
for _, file := range files {
if !expectedFiles[file.Path] {
if !expectedFiles[file.Path.String()] {
t.Errorf("Unexpected file in database: %s", file.Path)
}
delete(expectedFiles, file.Path)
delete(expectedFiles, file.Path.String())
// Verify file metadata
fsFile := testFS[file.Path]
fsFile := testFS[file.Path.String()]
if fsFile == nil {
t.Errorf("File %s not found in test filesystem", file.Path)
continue
@@ -294,8 +295,8 @@ func (b *BackupEngine) Backup(ctx context.Context, fsys fs.FS, root string) (str
hostname, _ := os.Hostname()
snapshotID := time.Now().Format(time.RFC3339)
snapshot := &database.Snapshot{
ID: snapshotID,
Hostname: hostname,
ID: types.SnapshotID(snapshotID),
Hostname: types.Hostname(hostname),
VaultikVersion: "test",
StartedAt: time.Now(),
CompletedAt: nil,
@@ -340,7 +341,7 @@ func (b *BackupEngine) Backup(ctx context.Context, fsys fs.FS, root string) (str
// Create file record in a short transaction
file := &database.File{
Path: path,
Path: types.FilePath(path),
Size: info.Size(),
Mode: uint32(info.Mode()),
MTime: info.ModTime(),
@@ -392,7 +393,7 @@ func (b *BackupEngine) Backup(ctx context.Context, fsys fs.FS, root string) (str
// Create new chunk in a short transaction
err = b.repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
chunk := &database.Chunk{
ChunkHash: chunkHash,
ChunkHash: types.ChunkHash(chunkHash),
Size: int64(n),
}
return b.repos.Chunks.Create(ctx, tx, chunk)
@@ -408,7 +409,7 @@ func (b *BackupEngine) Backup(ctx context.Context, fsys fs.FS, root string) (str
fileChunk := &database.FileChunk{
FileID: file.ID,
Idx: chunkIndex,
ChunkHash: chunkHash,
ChunkHash: types.ChunkHash(chunkHash),
}
return b.repos.FileChunks.Create(ctx, tx, fileChunk)
})
@@ -419,7 +420,7 @@ func (b *BackupEngine) Backup(ctx context.Context, fsys fs.FS, root string) (str
// Create chunk-file mapping in a short transaction
err = b.repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
chunkFile := &database.ChunkFile{
ChunkHash: chunkHash,
ChunkHash: types.ChunkHash(chunkHash),
FileID: file.ID,
FileOffset: int64(chunkIndex * defaultChunkSize),
Length: int64(n),
@@ -463,10 +464,11 @@ func (b *BackupEngine) Backup(ctx context.Context, fsys fs.FS, root string) (str
}
// Create blob entry in a short transaction
blobID := types.NewBlobID()
err = b.repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
blob := &database.Blob{
ID: "test-blob-" + blobHash[:8],
Hash: blobHash,
ID: blobID,
Hash: types.BlobHash(blobHash),
CreatedTS: time.Now(),
}
return b.repos.Blobs.Create(ctx, tx, blob)
@@ -481,8 +483,8 @@ func (b *BackupEngine) Backup(ctx context.Context, fsys fs.FS, root string) (str
// Create blob-chunk mapping in a short transaction
err = b.repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
blobChunk := &database.BlobChunk{
BlobID: "test-blob-" + blobHash[:8],
ChunkHash: chunkHash,
BlobID: blobID,
ChunkHash: types.ChunkHash(chunkHash),
Offset: 0,
Length: chunk.Size,
}
@@ -494,7 +496,7 @@ func (b *BackupEngine) Backup(ctx context.Context, fsys fs.FS, root string) (str
// Add blob to snapshot in a short transaction
err = b.repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
return b.repos.Snapshots.AddBlob(ctx, tx, snapshotID, "test-blob-"+blobHash[:8], blobHash)
return b.repos.Snapshots.AddBlob(ctx, tx, snapshotID, blobID, types.BlobHash(blobHash))
})
if err != nil {
return "", err


@@ -0,0 +1,454 @@
package snapshot_test
import (
"context"
"database/sql"
"path/filepath"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/snapshot"
"git.eeqj.de/sneak/vaultik/internal/types"
"github.com/spf13/afero"
"github.com/stretchr/testify/require"
)
func setupExcludeTestFS(t *testing.T) afero.Fs {
t.Helper()
// Create in-memory filesystem
fs := afero.NewMemMapFs()
// Create test directory structure:
// /backup/
// file1.txt (should be backed up)
// file2.log (should be excluded if *.log is in patterns)
// .git/
// config (should be excluded if .git is in patterns)
// objects/
// pack/
// data.pack (should be excluded if .git is in patterns)
// src/
// main.go (should be backed up)
// test.go (should be backed up)
// node_modules/
// package/
// index.js (should be excluded if node_modules is in patterns)
// cache/
// temp.dat (should be excluded if cache/ is in patterns)
// build/
// output.bin (should be excluded if build is in patterns)
// docs/
// readme.md (should be backed up)
// .DS_Store (should be excluded if .DS_Store is in patterns)
// thumbs.db (should be excluded if thumbs.db is in patterns)
files := map[string]string{
"/backup/file1.txt": "content1",
"/backup/file2.log": "log content",
"/backup/.git/config": "git config",
"/backup/.git/objects/pack/data.pack": "pack data",
"/backup/src/main.go": "package main",
"/backup/src/test.go": "package main_test",
"/backup/node_modules/package/index.js": "module.exports = {}",
"/backup/cache/temp.dat": "cached data",
"/backup/build/output.bin": "binary data",
"/backup/docs/readme.md": "# Documentation",
"/backup/.DS_Store": "ds store data",
"/backup/thumbs.db": "thumbs data",
"/backup/src/.hidden": "hidden file",
"/backup/important.log.bak": "backup of log",
}
testTime := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
for path, content := range files {
dir := filepath.Dir(path)
err := fs.MkdirAll(dir, 0755)
require.NoError(t, err)
err = afero.WriteFile(fs, path, []byte(content), 0644)
require.NoError(t, err)
err = fs.Chtimes(path, testTime, testTime)
require.NoError(t, err)
}
return fs
}
func createTestScanner(t *testing.T, fs afero.Fs, excludePatterns []string) (*snapshot.Scanner, *database.Repositories, func()) {
t.Helper()
// Initialize logger
log.Initialize(log.Config{})
// Create test database
db, err := database.NewTestDB()
require.NoError(t, err)
repos := database.NewRepositories(db)
scanner := snapshot.NewScanner(snapshot.ScannerConfig{
FS: fs,
ChunkSize: 64 * 1024,
Repositories: repos,
MaxBlobSize: 1024 * 1024,
CompressionLevel: 3,
AgeRecipients: []string{"age1ql3z7hjy54pw3hyww5ayyfg7zqgvc7w3j2elw8zmrj2kg5sfn9aqmcac8p"},
Exclude: excludePatterns,
})
cleanup := func() {
_ = db.Close()
}
return scanner, repos, cleanup
}
func createSnapshotRecord(t *testing.T, ctx context.Context, repos *database.Repositories, snapshotID string) {
t.Helper()
err := repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snap := &database.Snapshot{
ID: types.SnapshotID(snapshotID),
Hostname: "test-host",
VaultikVersion: "test",
StartedAt: time.Now(),
CompletedAt: nil,
FileCount: 0,
ChunkCount: 0,
BlobCount: 0,
TotalSize: 0,
BlobSize: 0,
CompressionRatio: 1.0,
}
return repos.Snapshots.Create(ctx, tx, snap)
})
require.NoError(t, err)
}
func TestExcludePatterns_ExcludeGitDirectory(t *testing.T) {
fs := setupExcludeTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{".git"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should have scanned files but NOT .git directory contents
// Expected: file1.txt, file2.log, src/main.go, src/test.go, node_modules/package/index.js,
// cache/temp.dat, build/output.bin, docs/readme.md, .DS_Store, thumbs.db,
// src/.hidden, important.log.bak
// Excluded: .git/config, .git/objects/pack/data.pack
require.Equal(t, 12, result.FilesScanned, "Should exclude .git directory contents")
}
func TestExcludePatterns_ExcludeByExtension(t *testing.T) {
fs := setupExcludeTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"*.log"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should exclude file2.log but NOT important.log.bak (different extension)
// Total files: 14, excluded: 1 (file2.log)
require.Equal(t, 13, result.FilesScanned, "Should exclude *.log files")
}
func TestExcludePatterns_ExcludeNodeModules(t *testing.T) {
fs := setupExcludeTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"node_modules"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should exclude node_modules/package/index.js
// Total files: 14, excluded: 1
require.Equal(t, 13, result.FilesScanned, "Should exclude node_modules directory")
}
func TestExcludePatterns_MultiplePatterns(t *testing.T) {
fs := setupExcludeTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{".git", "node_modules", "*.log", ".DS_Store", "thumbs.db", "cache", "build"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should only have: file1.txt, src/main.go, src/test.go, docs/readme.md, src/.hidden, important.log.bak
// Excluded: .git/*, node_modules/*, *.log (file2.log), .DS_Store, thumbs.db, cache/*, build/*
require.Equal(t, 6, result.FilesScanned, "Should exclude multiple patterns")
}
func TestExcludePatterns_NoExclusions(t *testing.T) {
fs := setupExcludeTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should scan all 14 files
require.Equal(t, 14, result.FilesScanned, "Should scan all files when no exclusions")
}
func TestExcludePatterns_ExcludeHiddenFiles(t *testing.T) {
fs := setupExcludeTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{".*"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should exclude: .git/*, .DS_Store, src/.hidden
// Total files: 14, excluded: 4 (.git/config, .git/objects/pack/data.pack, .DS_Store, src/.hidden)
require.Equal(t, 10, result.FilesScanned, "Should exclude hidden files and directories")
}
func TestExcludePatterns_DoubleStarGlob(t *testing.T) {
fs := setupExcludeTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"**/*.pack"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should exclude .git/objects/pack/data.pack
// Total files: 14, excluded: 1
require.Equal(t, 13, result.FilesScanned, "Should exclude **/*.pack files")
}
func TestExcludePatterns_ExactFileName(t *testing.T) {
fs := setupExcludeTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"thumbs.db", ".DS_Store"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should exclude thumbs.db and .DS_Store
// Total files: 14, excluded: 2
require.Equal(t, 12, result.FilesScanned, "Should exclude exact file names")
}
func TestExcludePatterns_CaseSensitive(t *testing.T) {
// Pattern matching should be case-sensitive
fs := setupExcludeTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"THUMBS.DB"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Case-sensitive matching: THUMBS.DB should NOT match thumbs.db
// All 14 files should be scanned
require.Equal(t, 14, result.FilesScanned, "Pattern matching should be case-sensitive")
}
func TestExcludePatterns_DirectoryWithTrailingSlash(t *testing.T) {
fs := setupExcludeTestFS(t)
// Some users might add trailing slashes to directory patterns
scanner, repos, cleanup := createTestScanner(t, fs, []string{"cache/", "build/"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should exclude cache/temp.dat and build/output.bin
// Total files: 14, excluded: 2
require.Equal(t, 12, result.FilesScanned, "Should handle directory patterns with trailing slashes")
}
func TestExcludePatterns_PatternInSubdirectory(t *testing.T) {
fs := setupExcludeTestFS(t)
// Exclude .hidden file specifically in src directory
scanner, repos, cleanup := createTestScanner(t, fs, []string{"src/.hidden"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// Should exclude only src/.hidden
// Total files: 14, excluded: 1
require.Equal(t, 13, result.FilesScanned, "Should exclude specific subdirectory files")
}
// setupAnchoredTestFS creates a filesystem for testing anchored patterns.
// Source dir: /backup
// Structure:
//
//	/backup/
//	    file.txt     (root file)
//	    projectname/
//	        file.txt (should be excluded with /projectname)
//	    otherproject/
//	        projectname/
//	            file.txt (should NOT be excluded with /projectname, only with projectname)
//	    src/
//	        file.go
func setupAnchoredTestFS(t *testing.T) afero.Fs {
t.Helper()
fs := afero.NewMemMapFs()
files := map[string]string{
"/backup/projectname/file.txt": "root project file",
"/backup/otherproject/projectname/file.txt": "nested project file",
"/backup/src/file.go": "source file",
"/backup/file.txt": "root file",
}
testTime := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
for path, content := range files {
dir := filepath.Dir(path)
err := fs.MkdirAll(dir, 0755)
require.NoError(t, err)
err = afero.WriteFile(fs, path, []byte(content), 0644)
require.NoError(t, err)
err = fs.Chtimes(path, testTime, testTime)
require.NoError(t, err)
}
return fs
}
func TestExcludePatterns_AnchoredPattern(t *testing.T) {
// Pattern starting with / should only match from root of source dir
fs := setupAnchoredTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"/projectname"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// /projectname should ONLY exclude /backup/projectname/file.txt (1 file)
// /backup/otherproject/projectname/file.txt should NOT be excluded
// Total files: 4, excluded: 1
require.Equal(t, 3, result.FilesScanned, "Anchored pattern /projectname should only match at root of source dir")
}
func TestExcludePatterns_UnanchoredPattern(t *testing.T) {
// Pattern without leading / should match anywhere in path
fs := setupAnchoredTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"projectname"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// projectname (without /) should exclude BOTH:
// - /backup/projectname/file.txt
// - /backup/otherproject/projectname/file.txt
// Total files: 4, excluded: 2
require.Equal(t, 2, result.FilesScanned, "Unanchored pattern should match anywhere in path")
}
func TestExcludePatterns_AnchoredPatternWithGlob(t *testing.T) {
// Anchored pattern with glob
fs := setupAnchoredTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"/src/*.go"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// /src/*.go should exclude /backup/src/file.go
// Total files: 4, excluded: 1
require.Equal(t, 3, result.FilesScanned, "Anchored pattern with glob should work")
}
func TestExcludePatterns_AnchoredPatternFile(t *testing.T) {
// Anchored pattern for exact file at root
fs := setupAnchoredTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"/file.txt"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// /file.txt should ONLY exclude /backup/file.txt
// NOT /backup/projectname/file.txt or /backup/otherproject/projectname/file.txt
// Total files: 4, excluded: 1
require.Equal(t, 3, result.FilesScanned, "Anchored pattern for file should only match at root")
}
func TestExcludePatterns_UnanchoredPatternFile(t *testing.T) {
// Unanchored pattern for file should match anywhere
fs := setupAnchoredTestFS(t)
scanner, repos, cleanup := createTestScanner(t, fs, []string{"file.txt"})
defer cleanup()
require.NotNil(t, scanner)
ctx := context.Background()
createSnapshotRecord(t, ctx, repos, "test-snapshot")
result, err := scanner.Scan(ctx, "/backup", "test-snapshot")
require.NoError(t, err)
// file.txt should exclude ALL file.txt files:
// - /backup/file.txt
// - /backup/projectname/file.txt
// - /backup/otherproject/projectname/file.txt
// Total files: 4, excluded: 3
require.Equal(t, 1, result.FilesScanned, "Unanchored pattern for file should match anywhere")
}

View File

@@ -9,6 +9,7 @@ import (
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/snapshot"
"git.eeqj.de/sneak/vaultik/internal/types"
"github.com/spf13/afero"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
@@ -53,7 +54,7 @@ func TestFileContentChange(t *testing.T) {
snapshotID1 := "snapshot1"
err = repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snapshot := &database.Snapshot{
ID: snapshotID1,
ID: types.SnapshotID(snapshotID1),
Hostname: "test-host",
VaultikVersion: "test",
StartedAt: time.Now(),
@@ -87,7 +88,7 @@ func TestFileContentChange(t *testing.T) {
snapshotID2 := "snapshot2"
err = repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snapshot := &database.Snapshot{
ID: snapshotID2,
ID: types.SnapshotID(snapshotID2),
Hostname: "test-host",
VaultikVersion: "test",
StartedAt: time.Now(),
@@ -117,12 +118,12 @@ func TestFileContentChange(t *testing.T) {
assert.Equal(t, newChunkHash, chunkFiles2[0].ChunkHash)
// Verify old chunk still exists (it's still valid data)
oldChunk, err := repos.Chunks.GetByHash(ctx, oldChunkHash)
oldChunk, err := repos.Chunks.GetByHash(ctx, oldChunkHash.String())
require.NoError(t, err)
assert.NotNil(t, oldChunk)
// Verify new chunk exists
newChunk, err := repos.Chunks.GetByHash(ctx, newChunkHash)
newChunk, err := repos.Chunks.GetByHash(ctx, newChunkHash.String())
require.NoError(t, err)
assert.NotNil(t, newChunk)
@@ -182,7 +183,7 @@ func TestMultipleFileChanges(t *testing.T) {
snapshotID1 := "snapshot1"
err = repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snapshot := &database.Snapshot{
ID: snapshotID1,
ID: types.SnapshotID(snapshotID1),
Hostname: "test-host",
VaultikVersion: "test",
StartedAt: time.Now(),
@@ -194,8 +195,8 @@ func TestMultipleFileChanges(t *testing.T) {
// First scan
result1, err := scanner.Scan(ctx, "/", snapshotID1)
require.NoError(t, err)
// 4 files because root directory is also counted
assert.Equal(t, 4, result1.FilesScanned)
// Only regular files are counted, not directories
assert.Equal(t, 3, result1.FilesScanned)
// Modify two files
time.Sleep(10 * time.Millisecond) // Ensure mtime changes
@@ -208,7 +209,7 @@ func TestMultipleFileChanges(t *testing.T) {
snapshotID2 := "snapshot2"
err = repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snapshot := &database.Snapshot{
ID: snapshotID2,
ID: types.SnapshotID(snapshotID2),
Hostname: "test-host",
VaultikVersion: "test",
StartedAt: time.Now(),
@@ -220,8 +221,9 @@ func TestMultipleFileChanges(t *testing.T) {
// Second scan
result2, err := scanner.Scan(ctx, "/", snapshotID2)
require.NoError(t, err)
// 4 files because root directory is also counted
assert.Equal(t, 4, result2.FilesScanned)
// Only regular files are counted, not directories
assert.Equal(t, 3, result2.FilesScanned)
// Verify each file has exactly one set of chunks
for path := range files {

View File

@@ -3,7 +3,7 @@ package snapshot
import (
"git.eeqj.de/sneak/vaultik/internal/config"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/s3"
"git.eeqj.de/sneak/vaultik/internal/storage"
"github.com/spf13/afero"
"go.uber.org/fx"
)
@@ -11,6 +11,9 @@ import (
// ScannerParams holds parameters for scanner creation
type ScannerParams struct {
EnableProgress bool
Fs afero.Fs
Exclude []string // Exclude patterns (combined global + snapshot-specific)
SkipErrors bool // Skip file read errors (log loudly but continue)
}
// Module exports backup functionality as an fx module.
@@ -26,17 +29,25 @@ var Module = fx.Module("backup",
// ScannerFactory creates scanners with custom parameters
type ScannerFactory func(params ScannerParams) *Scanner
func provideScannerFactory(cfg *config.Config, repos *database.Repositories, s3Client *s3.Client) ScannerFactory {
func provideScannerFactory(cfg *config.Config, repos *database.Repositories, storer storage.Storer) ScannerFactory {
return func(params ScannerParams) *Scanner {
// Use provided excludes, or fall back to global config excludes
excludes := params.Exclude
if len(excludes) == 0 {
excludes = cfg.Exclude
}
return NewScanner(ScannerConfig{
FS: afero.NewOsFs(),
FS: params.Fs,
ChunkSize: cfg.ChunkSize.Int64(),
Repositories: repos,
S3Client: s3Client,
Storage: storer,
MaxBlobSize: cfg.BlobSizeLimit.Int64(),
CompressionLevel: cfg.CompressionLevel,
AgeRecipients: cfg.AgeRecipients,
EnableProgress: params.EnableProgress,
Exclude: excludes,
SkipErrors: params.SkipErrors,
})
}
}
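For context, a sketch of how a caller might use the factory once fx has injected it. The fx wiring and snapshot bookkeeping are elided, and the exclude list here is arbitrary.

package example

import (
	"context"

	"git.eeqj.de/sneak/vaultik/internal/snapshot"
	"github.com/spf13/afero"
)

// runBackupScan sketches factory use once fx has provided it; error
// handling and snapshot record creation are elided.
func runBackupScan(ctx context.Context, factory snapshot.ScannerFactory, snapshotID string) error {
	sc := factory(snapshot.ScannerParams{
		Fs:             afero.NewOsFs(),
		Exclude:        []string{".git", "node_modules", "*.log"}, // empty falls back to cfg.Exclude
		SkipErrors:     true,
		EnableProgress: true,
	})
	_, err := sc.Scan(ctx, "/backup", snapshotID)
	return err
}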

View File

@@ -22,6 +22,9 @@ const (
// DetailInterval defines how often multi-line detailed status reports are printed.
// These reports include comprehensive statistics about files, chunks, blobs, and uploads.
DetailInterval = 60 * time.Second
// UploadProgressInterval defines how often upload progress messages are logged.
UploadProgressInterval = 15 * time.Second
)
// ProgressStats holds atomic counters for progress tracking
@@ -36,7 +39,7 @@ type ProgressStats struct {
BlobsCreated atomic.Int64
BlobsUploaded atomic.Int64
BytesUploaded atomic.Int64
UploadDurationMs atomic.Int64 // Total milliseconds spent uploading to S3
UploadDurationMs atomic.Int64 // Total milliseconds spent uploading
CurrentFile atomic.Value // stores string
TotalSize atomic.Int64 // Total size to process (set after scan phase)
TotalFiles atomic.Int64 // Total files to process in phase 2
@@ -55,6 +58,7 @@ type UploadInfo struct {
BlobHash string
Size int64
StartTime time.Time
LastLogTime time.Time
}
// ProgressReporter handles periodic progress reporting
@@ -269,7 +273,7 @@ func (pr *ProgressReporter) printDetailedStatus() {
"created", blobsCreated,
"uploaded", blobsUploaded,
"pending", blobsCreated-blobsUploaded)
log.Info("Total uploaded to S3",
log.Info("Total uploaded to remote",
"uploaded", humanize.Bytes(uint64(bytesUploaded)),
"compression_ratio", formatRatio(bytesUploaded, bytesScanned))
if currentFile != "" {
@@ -330,6 +334,11 @@ func (pr *ProgressReporter) ReportUploadStart(blobHash string, size int64) {
StartTime: time.Now().UTC(),
}
pr.stats.CurrentUpload.Store(info)
// Log the start of upload
log.Info("Starting blob upload",
"hash", blobHash[:8]+"...",
"size", humanize.Bytes(uint64(size)))
}
// ReportUploadComplete marks the completion of a blob upload
@@ -377,18 +386,13 @@ func (pr *ProgressReporter) UpdateChunkingActivity() {
func (pr *ProgressReporter) ReportUploadProgress(blobHash string, bytesUploaded, totalSize int64, instantSpeed float64) {
// Update the current upload info with progress
if uploadInfo, ok := pr.stats.CurrentUpload.Load().(*UploadInfo); ok && uploadInfo != nil {
// Format speed in bits/second
now := time.Now()
// Only log at the configured interval
if now.Sub(uploadInfo.LastLogTime) >= UploadProgressInterval {
// Format speed in bits/second using humanize
bitsPerSec := instantSpeed * 8
var speedStr string
if bitsPerSec >= 1e9 {
speedStr = fmt.Sprintf("%.1fGbit/sec", bitsPerSec/1e9)
} else if bitsPerSec >= 1e6 {
speedStr = fmt.Sprintf("%.0fMbit/sec", bitsPerSec/1e6)
} else if bitsPerSec >= 1e3 {
speedStr = fmt.Sprintf("%.0fKbit/sec", bitsPerSec/1e3)
} else {
speedStr = fmt.Sprintf("%.0fbit/sec", bitsPerSec)
}
speedStr := humanize.SI(bitsPerSec, "bit/sec")
percent := float64(bytesUploaded) / float64(totalSize) * 100
@@ -408,5 +412,8 @@ func (pr *ProgressReporter) ReportUploadProgress(blobHash string, bytesUploaded,
"total", humanize.Bytes(uint64(totalSize)),
"speed", speedStr,
"eta", etaStr)
uploadInfo.LastLogTime = now
}
}
}
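For reference, humanize.SI scales a float into an SI-prefixed string with the unit appended, which is what replaces the hand-rolled if/else chain above. Expected outputs, assuming go-humanize's default formatting:

humanize.SI(950, "bit/sec")   // "950 bit/sec"
humanize.SI(2.4e6, "bit/sec") // "2.4 Mbit/sec"
humanize.SI(1.5e9, "bit/sec") // "1.5 Gbit/sec"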

File diff suppressed because it is too large

View File

@@ -10,6 +10,7 @@ import (
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/snapshot"
"git.eeqj.de/sneak/vaultik/internal/types"
"github.com/spf13/afero"
)
@@ -74,7 +75,7 @@ func TestScannerSimpleDirectory(t *testing.T) {
snapshotID := "test-snapshot-001"
err = repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snapshot := &database.Snapshot{
ID: snapshotID,
ID: types.SnapshotID(snapshotID),
Hostname: "test-host",
VaultikVersion: "test",
StartedAt: time.Now(),
@@ -99,26 +100,25 @@ func TestScannerSimpleDirectory(t *testing.T) {
t.Fatalf("scan failed: %v", err)
}
// Verify results
// We now scan 6 files + 3 directories (source, subdir, subdir2) = 9 entries
if result.FilesScanned != 9 {
t.Errorf("expected 9 entries scanned, got %d", result.FilesScanned)
// Verify results - we only scan regular files, not directories
if result.FilesScanned != 6 {
t.Errorf("expected 6 files scanned, got %d", result.FilesScanned)
}
// Directories have their own sizes, so the total will be more than just file content
// Total bytes should be the sum of all file contents
if result.BytesScanned < 97 { // At minimum we have 97 bytes of file content
t.Errorf("expected at least 97 bytes scanned, got %d", result.BytesScanned)
}
// Verify files in database
// Verify files in database - only regular files are stored
files, err := repos.Files.ListByPrefix(ctx, "/source")
if err != nil {
t.Fatalf("failed to list files: %v", err)
}
// We should have 6 files + 3 directories = 9 entries
if len(files) != 9 {
t.Errorf("expected 9 entries in database, got %d", len(files))
// We should have 6 files (directories are not stored)
if len(files) != 6 {
t.Errorf("expected 6 files in database, got %d", len(files))
}
// Verify specific file
@@ -159,118 +159,6 @@ func TestScannerSimpleDirectory(t *testing.T) {
}
}
func TestScannerWithSymlinks(t *testing.T) {
// Initialize logger for tests
log.Initialize(log.Config{})
// Create in-memory filesystem
fs := afero.NewMemMapFs()
// Create test files
if err := fs.MkdirAll("/source", 0755); err != nil {
t.Fatal(err)
}
if err := afero.WriteFile(fs, "/source/target.txt", []byte("target content"), 0644); err != nil {
t.Fatal(err)
}
if err := afero.WriteFile(fs, "/outside/file.txt", []byte("outside content"), 0644); err != nil {
t.Fatal(err)
}
// Create symlinks (if supported by the filesystem)
linker, ok := fs.(afero.Symlinker)
if !ok {
t.Skip("filesystem does not support symlinks")
}
// Symlink to file in source
if err := linker.SymlinkIfPossible("target.txt", "/source/link1.txt"); err != nil {
t.Fatal(err)
}
// Symlink to file outside source
if err := linker.SymlinkIfPossible("/outside/file.txt", "/source/link2.txt"); err != nil {
t.Fatal(err)
}
// Create test database
db, err := database.NewTestDB()
if err != nil {
t.Fatalf("failed to create test database: %v", err)
}
defer func() {
if err := db.Close(); err != nil {
t.Errorf("failed to close database: %v", err)
}
}()
repos := database.NewRepositories(db)
// Create scanner
scanner := snapshot.NewScanner(snapshot.ScannerConfig{
FS: fs,
ChunkSize: 1024 * 16,
Repositories: repos,
MaxBlobSize: int64(1024 * 1024),
CompressionLevel: 3,
AgeRecipients: []string{"age1ezrjmfpwsc95svdg0y54mums3zevgzu0x0ecq2f7tp8a05gl0sjq9q9wjg"}, // Test public key
})
// Create a snapshot record for testing
ctx := context.Background()
snapshotID := "test-snapshot-001"
err = repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snapshot := &database.Snapshot{
ID: snapshotID,
Hostname: "test-host",
VaultikVersion: "test",
StartedAt: time.Now(),
CompletedAt: nil,
FileCount: 0,
ChunkCount: 0,
BlobCount: 0,
TotalSize: 0,
BlobSize: 0,
CompressionRatio: 1.0,
}
return repos.Snapshots.Create(ctx, tx, snapshot)
})
if err != nil {
t.Fatalf("failed to create snapshot: %v", err)
}
// Scan the directory
var result *snapshot.ScanResult
result, err = scanner.Scan(ctx, "/source", snapshotID)
if err != nil {
t.Fatalf("scan failed: %v", err)
}
// Should have scanned 3 files (target + 2 symlinks)
if result.FilesScanned != 3 {
t.Errorf("expected 3 files scanned, got %d", result.FilesScanned)
}
// Check symlinks in database
link1, err := repos.Files.GetByPath(ctx, "/source/link1.txt")
if err != nil {
t.Fatalf("failed to get link1.txt: %v", err)
}
if link1.LinkTarget != "target.txt" {
t.Errorf("expected link1.txt target 'target.txt', got %q", link1.LinkTarget)
}
link2, err := repos.Files.GetByPath(ctx, "/source/link2.txt")
if err != nil {
t.Fatalf("failed to get link2.txt: %v", err)
}
if link2.LinkTarget != "/outside/file.txt" {
t.Errorf("expected link2.txt target '/outside/file.txt', got %q", link2.LinkTarget)
}
}
func TestScannerLargeFile(t *testing.T) {
// Initialize logger for tests
log.Initialize(log.Config{})
@@ -322,7 +210,7 @@ func TestScannerLargeFile(t *testing.T) {
snapshotID := "test-snapshot-001"
err = repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snapshot := &database.Snapshot{
ID: snapshotID,
ID: types.SnapshotID(snapshotID),
Hostname: "test-host",
VaultikVersion: "test",
StartedAt: time.Now(),
@@ -347,9 +235,9 @@ func TestScannerLargeFile(t *testing.T) {
t.Fatalf("scan failed: %v", err)
}
// We scan 1 file + 1 directory = 2 entries
if result.FilesScanned != 2 {
t.Errorf("expected 2 entries scanned, got %d", result.FilesScanned)
// We scan only regular files, not directories
if result.FilesScanned != 1 {
t.Errorf("expected 1 file scanned, got %d", result.FilesScanned)
}
// The file size should be at least 1MB

View File

@@ -19,24 +19,19 @@ package snapshot
// - Blobs not containing any remaining chunks
// - All related mapping tables (file_chunks, chunk_files, blob_chunks)
// 7. Close the temporary database
// 8. Use sqlite3 to dump the cleaned database to SQL
// 9. Delete the temporary database file
// 10. Compress the SQL dump with zstd
// 11. Encrypt the compressed dump with age (if encryption is enabled)
// 12. Upload to S3 as: snapshots/{snapshot-id}.sql.zst[.age]
// 13. Reopen the main database
// 8. VACUUM the database to remove deleted data and compact (security critical)
// 9. Compress the binary database with zstd
// 10. Encrypt the compressed database with age (if encryption is enabled)
// 11. Upload to S3 as: metadata/{snapshot-id}/db.zst.age
// 12. Reopen the main database
//
// Advantages of this approach:
// - No custom metadata format needed
// - Reuses existing database schema and relationships
// - SQL dumps are portable and compress well
// - Restore process can simply execute the SQL
// - Binary SQLite files are portable and compress well
// - Fast restore - just decompress and open (no SQL parsing)
// - VACUUM ensures no deleted data leaks
// - Atomic and consistent snapshot of all metadata
//
// TODO: Future improvements:
// - Add snapshot-file relationships to track which files belong to which snapshot
// - Implement incremental snapshots that reference previous snapshots
// - Add snapshot manifest with additional metadata (size, chunk count, etc.)
import (
"bytes"
@@ -44,25 +39,28 @@ import (
"database/sql"
"fmt"
"io"
"os"
"os/exec"
"path/filepath"
"strings"
"time"
"git.eeqj.de/sneak/vaultik/internal/blobgen"
"git.eeqj.de/sneak/vaultik/internal/config"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/s3"
"git.eeqj.de/sneak/vaultik/internal/storage"
"git.eeqj.de/sneak/vaultik/internal/types"
"github.com/dustin/go-humanize"
"github.com/spf13/afero"
"go.uber.org/fx"
)
// SnapshotManager handles snapshot creation and metadata export
type SnapshotManager struct {
repos *database.Repositories
s3Client S3Client
storage storage.Storer
config *config.Config
fs afero.Fs
}
// SnapshotManagerParams holds dependencies for NewSnapshotManager
@@ -70,7 +68,7 @@ type SnapshotManagerParams struct {
fx.In
Repos *database.Repositories
S3Client *s3.Client
Storage storage.Storer
Config *config.Config
}
@@ -78,20 +76,45 @@ type SnapshotManagerParams struct {
func NewSnapshotManager(params SnapshotManagerParams) *SnapshotManager {
return &SnapshotManager{
repos: params.Repos,
s3Client: params.S3Client,
storage: params.Storage,
config: params.Config,
}
}
// CreateSnapshot creates a new snapshot record in the database at the start of a backup
// SetFilesystem sets the filesystem to use for all file operations
func (sm *SnapshotManager) SetFilesystem(fs afero.Fs) {
sm.fs = fs
}
// CreateSnapshot creates a new snapshot record in the database at the start of a backup.
// Deprecated: Use CreateSnapshotWithName instead for multi-snapshot support.
func (sm *SnapshotManager) CreateSnapshot(ctx context.Context, hostname, version, gitRevision string) (string, error) {
snapshotID := fmt.Sprintf("%s-%s", hostname, time.Now().UTC().Format("20060102-150405Z"))
return sm.CreateSnapshotWithName(ctx, hostname, "", version, gitRevision)
}
// CreateSnapshotWithName creates a new snapshot record with an optional snapshot name.
// The snapshot ID format is: hostname_name_timestamp or hostname_timestamp if name is empty.
func (sm *SnapshotManager) CreateSnapshotWithName(ctx context.Context, hostname, name, version, gitRevision string) (string, error) {
// Use short hostname (strip domain if present)
shortHostname := hostname
if idx := strings.Index(hostname, "."); idx != -1 {
shortHostname = hostname[:idx]
}
// Build snapshot ID with optional name
timestamp := time.Now().UTC().Format("2006-01-02T15:04:05Z")
var snapshotID string
if name != "" {
snapshotID = fmt.Sprintf("%s_%s_%s", shortHostname, name, timestamp)
} else {
snapshotID = fmt.Sprintf("%s_%s", shortHostname, timestamp)
}
snapshot := &database.Snapshot{
ID: snapshotID,
Hostname: hostname,
VaultikVersion: version,
VaultikGitRevision: gitRevision,
ID: types.SnapshotID(snapshotID),
Hostname: types.Hostname(hostname),
VaultikVersion: types.Version(version),
VaultikGitRevision: types.GitRevision(gitRevision),
StartedAt: time.Now().UTC(),
CompletedAt: nil, // Not completed yet
FileCount: 0,
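Illustrative resulting IDs, with a hypothetical hostname and name (the timestamp layout is the "2006-01-02T15:04:05Z" format string above):

// CreateSnapshotWithName(ctx, "vault01.example.com", "daily", v, rev)
//   -> "vault01_daily_2026-02-16T06:17:51Z"
// CreateSnapshotWithName(ctx, "vault01.example.com", "", v, rev)
//   -> "vault01_2026-02-16T06:17:51Z"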
@@ -192,14 +215,14 @@ func (sm *SnapshotManager) ExportSnapshotMetadata(ctx context.Context, dbPath st
log.Info("Phase 3/3: Exporting snapshot metadata", "snapshot_id", snapshotID, "source_db", dbPath)
// Create temp directory for all temporary files
tempDir, err := os.MkdirTemp("", "vaultik-snapshot-*")
tempDir, err := afero.TempDir(sm.fs, "", "vaultik-snapshot-*")
if err != nil {
return fmt.Errorf("creating temp dir: %w", err)
}
log.Debug("Created temporary directory", "path", tempDir)
defer func() {
log.Debug("Cleaning up temporary directory", "path", tempDir)
if err := os.RemoveAll(tempDir); err != nil {
if err := sm.fs.RemoveAll(tempDir); err != nil {
log.Debug("Failed to remove temp dir", "path", tempDir, "error", err)
}
}()
@@ -208,20 +231,20 @@ func (sm *SnapshotManager) ExportSnapshotMetadata(ctx context.Context, dbPath st
// The main database should be closed at this point
tempDBPath := filepath.Join(tempDir, "snapshot.db")
log.Debug("Copying database to temporary location", "source", dbPath, "destination", tempDBPath)
if err := copyFile(dbPath, tempDBPath); err != nil {
if err := sm.copyFile(dbPath, tempDBPath); err != nil {
return fmt.Errorf("copying database: %w", err)
}
log.Debug("Database copy complete", "size", getFileSize(tempDBPath))
log.Debug("Database copy complete", "size", sm.getFileSize(tempDBPath))
// Step 2: Clean the temp database to only contain current snapshot data
log.Debug("Cleaning temporary database to contain only current snapshot data", "snapshot_id", snapshotID, "db_path", tempDBPath)
log.Debug("Cleaning temporary database", "snapshot_id", snapshotID)
stats, err := sm.cleanSnapshotDB(ctx, tempDBPath, snapshotID)
if err != nil {
return fmt.Errorf("cleaning snapshot database: %w", err)
}
log.Info("Temporary database cleanup complete",
"db_path", tempDBPath,
"size_after_clean", humanize.Bytes(uint64(getFileSize(tempDBPath))),
"size_after_clean", humanize.Bytes(uint64(sm.getFileSize(tempDBPath))),
"files", stats.FileCount,
"chunks", stats.ChunkCount,
"blobs", stats.BlobCount,
@@ -229,31 +252,29 @@ func (sm *SnapshotManager) ExportSnapshotMetadata(ctx context.Context, dbPath st
"total_uncompressed_size", humanize.Bytes(uint64(stats.UncompressedSize)),
"compression_ratio", fmt.Sprintf("%.2fx", float64(stats.UncompressedSize)/float64(stats.CompressedSize)))
// Step 3: Dump the cleaned database to SQL
dumpPath := filepath.Join(tempDir, "snapshot.sql")
log.Debug("Dumping database to SQL", "source", tempDBPath, "destination", dumpPath)
if err := sm.dumpDatabase(tempDBPath, dumpPath); err != nil {
return fmt.Errorf("dumping database: %w", err)
// Step 3: VACUUM the database to remove deleted data and compact
// This is critical for security - ensures no stale/deleted data is uploaded
if err := sm.vacuumDatabase(tempDBPath); err != nil {
return fmt.Errorf("vacuuming database: %w", err)
}
log.Debug("SQL dump complete", "size", getFileSize(dumpPath))
log.Debug("Database vacuumed", "size", humanize.Bytes(uint64(sm.getFileSize(tempDBPath))))
// Step 4: Compress and encrypt the SQL dump
compressedPath := filepath.Join(tempDir, "snapshot.sql.zst.age")
log.Debug("Compressing and encrypting SQL dump", "source", dumpPath, "destination", compressedPath)
if err := sm.compressDump(dumpPath, compressedPath); err != nil {
return fmt.Errorf("compressing dump: %w", err)
// Step 4: Compress and encrypt the binary database file
compressedPath := filepath.Join(tempDir, "db.zst.age")
if err := sm.compressFile(tempDBPath, compressedPath); err != nil {
return fmt.Errorf("compressing database: %w", err)
}
log.Debug("Compression complete", "original_size", getFileSize(dumpPath), "compressed_size", getFileSize(compressedPath))
log.Debug("Compression complete",
"original_size", humanize.Bytes(uint64(sm.getFileSize(tempDBPath))),
"compressed_size", humanize.Bytes(uint64(sm.getFileSize(compressedPath))))
// Step 5: Read compressed and encrypted data for upload
log.Debug("Reading compressed and encrypted data for upload", "path", compressedPath)
finalData, err := os.ReadFile(compressedPath)
finalData, err := afero.ReadFile(sm.fs, compressedPath)
if err != nil {
return fmt.Errorf("reading compressed dump: %w", err)
}
// Step 6: Generate blob manifest (before closing temp DB)
log.Debug("Generating blob manifest from temporary database", "db_path", tempDBPath)
blobManifest, err := sm.generateBlobManifest(ctx, tempDBPath, snapshotID)
if err != nil {
return fmt.Errorf("generating blob manifest: %w", err)
@@ -263,14 +284,13 @@ func (sm *SnapshotManager) ExportSnapshotMetadata(ctx context.Context, dbPath st
// Upload database backup (compressed and encrypted)
dbKey := fmt.Sprintf("metadata/%s/db.zst.age", snapshotID)
log.Debug("Uploading snapshot database to S3", "key", dbKey, "size", len(finalData))
dbUploadStart := time.Now()
if err := sm.s3Client.PutObject(ctx, dbKey, bytes.NewReader(finalData)); err != nil {
if err := sm.storage.Put(ctx, dbKey, bytes.NewReader(finalData)); err != nil {
return fmt.Errorf("uploading snapshot database: %w", err)
}
dbUploadDuration := time.Since(dbUploadStart)
dbUploadSpeed := float64(len(finalData)) * 8 / dbUploadDuration.Seconds() // bits per second
log.Info("Uploaded snapshot database to S3",
log.Info("Uploaded snapshot database",
"path", dbKey,
"size", humanize.Bytes(uint64(len(finalData))),
"duration", dbUploadDuration,
@@ -278,14 +298,13 @@ func (sm *SnapshotManager) ExportSnapshotMetadata(ctx context.Context, dbPath st
// Upload blob manifest (compressed only, not encrypted)
manifestKey := fmt.Sprintf("metadata/%s/manifest.json.zst", snapshotID)
log.Debug("Uploading blob manifest to S3", "key", manifestKey, "size", len(blobManifest))
manifestUploadStart := time.Now()
if err := sm.s3Client.PutObject(ctx, manifestKey, bytes.NewReader(blobManifest)); err != nil {
if err := sm.storage.Put(ctx, manifestKey, bytes.NewReader(blobManifest)); err != nil {
return fmt.Errorf("uploading blob manifest: %w", err)
}
manifestUploadDuration := time.Since(manifestUploadStart)
manifestUploadSpeed := float64(len(blobManifest)) * 8 / manifestUploadDuration.Seconds() // bits per second
log.Info("Uploaded blob manifest to S3",
log.Info("Uploaded blob manifest",
"path", manifestKey,
"size", humanize.Bytes(uint64(len(blobManifest))),
"duration", manifestUploadDuration,
@@ -411,67 +430,61 @@ func (sm *SnapshotManager) cleanSnapshotDB(ctx context.Context, dbPath string, s
stats.CompressedSize = compressedSize.Int64
stats.UncompressedSize = uncompressedSize.Int64
log.Debug("[Temp DB Cleanup] Database cleanup complete", "stats", stats)
return stats, nil
}
// dumpDatabase creates a SQL dump of the database
func (sm *SnapshotManager) dumpDatabase(dbPath, dumpPath string) error {
log.Debug("Running sqlite3 dump command", "source", dbPath, "destination", dumpPath)
cmd := exec.Command("sqlite3", dbPath, ".dump")
// vacuumDatabase runs VACUUM on the database to remove deleted data and compact
// This is critical for security - ensures no stale/deleted data pages are uploaded
func (sm *SnapshotManager) vacuumDatabase(dbPath string) error {
log.Debug("Running VACUUM on database", "path", dbPath)
cmd := exec.Command("sqlite3", dbPath, "VACUUM;")
output, err := cmd.Output()
if err != nil {
return fmt.Errorf("running sqlite3 dump: %w", err)
}
log.Debug("SQL dump generated", "size", len(output))
if err := os.WriteFile(dumpPath, output, 0644); err != nil {
return fmt.Errorf("writing dump file: %w", err)
if output, err := cmd.CombinedOutput(); err != nil {
return fmt.Errorf("running VACUUM: %w (output: %s)", err, string(output))
}
return nil
}
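An alternative sketch, not what the change above does: since the binary already links a SQLite driver, VACUUM could run through database/sql instead of shelling out to the sqlite3 binary, dropping the external dependency. The driver name "sqlite3" (mattn/go-sqlite3) is an assumption.

package example

import (
	"context"
	"database/sql"
	"fmt"

	_ "github.com/mattn/go-sqlite3" // assumed driver
)

// vacuumViaSQL runs VACUUM in-process; hypothetical alternative to exec.Command.
func vacuumViaSQL(ctx context.Context, dbPath string) error {
	db, err := sql.Open("sqlite3", dbPath)
	if err != nil {
		return fmt.Errorf("opening database: %w", err)
	}
	defer func() { _ = db.Close() }()
	if _, err := db.ExecContext(ctx, "VACUUM"); err != nil {
		return fmt.Errorf("running VACUUM: %w", err)
	}
	return nil
}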
// compressDump compresses the SQL dump using zstd
func (sm *SnapshotManager) compressDump(inputPath, outputPath string) error {
log.Debug("Opening SQL dump for compression", "path", inputPath)
input, err := os.Open(inputPath)
// compressFile compresses a file using zstd and encrypts with age
func (sm *SnapshotManager) compressFile(inputPath, outputPath string) error {
input, err := sm.fs.Open(inputPath)
if err != nil {
return fmt.Errorf("opening input file: %w", err)
}
defer func() {
log.Debug("Closing input file", "path", inputPath)
if err := input.Close(); err != nil {
log.Debug("Failed to close input file", "path", inputPath, "error", err)
}
}()
log.Debug("Creating output file for compressed and encrypted data", "path", outputPath)
output, err := os.Create(outputPath)
output, err := sm.fs.Create(outputPath)
if err != nil {
return fmt.Errorf("creating output file: %w", err)
}
defer func() {
log.Debug("Closing output file", "path", outputPath)
if err := output.Close(); err != nil {
log.Debug("Failed to close output file", "path", outputPath, "error", err)
}
}()
// Use blobgen for compression and encryption
log.Debug("Creating compressor/encryptor", "level", sm.config.CompressionLevel)
log.Debug("Compressing and encrypting data")
writer, err := blobgen.NewWriter(output, sm.config.CompressionLevel, sm.config.AgeRecipients)
if err != nil {
return fmt.Errorf("creating blobgen writer: %w", err)
}
// Track if writer has been closed to avoid double-close
writerClosed := false
defer func() {
if !writerClosed {
if err := writer.Close(); err != nil {
log.Debug("Failed to close writer", "error", err)
}
}
}()
log.Debug("Compressing and encrypting data")
if _, err := io.Copy(writer, input); err != nil {
return fmt.Errorf("compressing data: %w", err)
}
@@ -480,6 +493,7 @@ func (sm *SnapshotManager) compressDump(inputPath, outputPath string) error {
if err := writer.Close(); err != nil {
return fmt.Errorf("closing writer: %w", err)
}
writerClosed = true
log.Debug("Compression complete", "hash", fmt.Sprintf("%x", writer.Sum256()))
@@ -487,9 +501,9 @@ func (sm *SnapshotManager) compressDump(inputPath, outputPath string) error {
}
// copyFile copies a file from src to dst
func copyFile(src, dst string) error {
func (sm *SnapshotManager) copyFile(src, dst string) error {
log.Debug("Opening source file for copy", "path", src)
sourceFile, err := os.Open(src)
sourceFile, err := sm.fs.Open(src)
if err != nil {
return err
}
@@ -501,7 +515,7 @@ func copyFile(src, dst string) error {
}()
log.Debug("Creating destination file", "path", dst)
destFile, err := os.Create(dst)
destFile, err := sm.fs.Create(dst)
if err != nil {
return err
}
@@ -524,7 +538,6 @@ func copyFile(src, dst string) error {
// generateBlobManifest creates a compressed JSON list of all blobs in the snapshot
func (sm *SnapshotManager) generateBlobManifest(ctx context.Context, dbPath string, snapshotID string) ([]byte, error) {
log.Debug("Generating blob manifest", "db_path", dbPath, "snapshot_id", snapshotID)
// Open the cleaned database using the database package
db, err := database.New(ctx, dbPath)
@@ -573,7 +586,6 @@ func (sm *SnapshotManager) generateBlobManifest(ctx context.Context, dbPath stri
}
// Encode manifest
log.Debug("Encoding manifest")
compressedData, err := EncodeManifest(manifest, sm.config.CompressionLevel)
if err != nil {
return nil, fmt.Errorf("encoding manifest: %w", err)
@@ -591,8 +603,8 @@ func (sm *SnapshotManager) generateBlobManifest(ctx context.Context, dbPath stri
// compressData compresses data using zstd
// getFileSize returns the size of a file in bytes, or -1 if error
func getFileSize(path string) int64 {
info, err := os.Stat(path)
func (sm *SnapshotManager) getFileSize(path string) int64 {
info, err := sm.fs.Stat(path)
if err != nil {
return -1
}
@@ -635,18 +647,18 @@ func (sm *SnapshotManager) CleanupIncompleteSnapshots(ctx context.Context, hostn
log.Info("Found incomplete snapshots", "count", len(incompleteSnapshots))
// Check each incomplete snapshot for metadata in S3
// Check each incomplete snapshot for metadata in storage
for _, snapshot := range incompleteSnapshots {
// Check if metadata exists in S3
// Check if metadata exists in storage
// Must match the upload key format ("db.zst.age") used above, or the check never finds completed snapshots
metadataKey := fmt.Sprintf("metadata/%s/db.zst.age", snapshot.ID)
_, err := sm.s3Client.StatObject(ctx, metadataKey)
_, err := sm.storage.Stat(ctx, metadataKey)
if err != nil {
// Metadata doesn't exist in S3 - this is an incomplete snapshot
log.Info("Cleaning up incomplete snapshot record", "snapshot_id", snapshot.ID, "started_at", snapshot.StartedAt)
// Delete the snapshot and all its associations
if err := sm.deleteSnapshot(ctx, snapshot.ID); err != nil {
if err := sm.deleteSnapshot(ctx, snapshot.ID.String()); err != nil {
return fmt.Errorf("deleting incomplete snapshot %s: %w", snapshot.ID, err)
}
@@ -654,8 +666,8 @@ func (sm *SnapshotManager) CleanupIncompleteSnapshots(ctx context.Context, hostn
} else {
// Metadata exists - this snapshot was completed but database wasn't updated
// This shouldn't happen in normal operation, but mark it complete
log.Warn("Found snapshot with S3 metadata but incomplete in database", "snapshot_id", snapshot.ID)
if err := sm.repos.Snapshots.MarkComplete(ctx, nil, snapshot.ID); err != nil {
log.Warn("Found snapshot with remote metadata but incomplete in database", "snapshot_id", snapshot.ID)
if err := sm.repos.Snapshots.MarkComplete(ctx, nil, snapshot.ID.String()); err != nil {
log.Error("Failed to mark snapshot as complete in database", "snapshot_id", snapshot.ID, "error", err)
}
}
@@ -676,6 +688,11 @@ func (sm *SnapshotManager) deleteSnapshot(ctx context.Context, snapshotID string
return fmt.Errorf("deleting snapshot blobs: %w", err)
}
// Delete uploads entries (has foreign key to snapshots without CASCADE)
if err := sm.repos.Snapshots.DeleteSnapshotUploads(ctx, snapshotID); err != nil {
return fmt.Errorf("deleting snapshot uploads: %w", err)
}
// Delete the snapshot itself
if err := sm.repos.Snapshots.Delete(ctx, snapshotID); err != nil {
return fmt.Errorf("deleting snapshot: %w", err)
@@ -683,15 +700,16 @@ func (sm *SnapshotManager) deleteSnapshot(ctx context.Context, snapshotID string
// Clean up orphaned data
log.Debug("Cleaning up orphaned records in main database")
if err := sm.cleanupOrphanedData(ctx); err != nil {
if err := sm.CleanupOrphanedData(ctx); err != nil {
return fmt.Errorf("cleaning up orphaned data: %w", err)
}
return nil
}
// cleanupOrphanedData removes files, chunks, and blobs that are no longer referenced by any snapshot
func (sm *SnapshotManager) cleanupOrphanedData(ctx context.Context) error {
// CleanupOrphanedData removes files, chunks, and blobs that are no longer referenced by any snapshot.
// This should be called periodically to clean up data from deleted or incomplete snapshots.
func (sm *SnapshotManager) CleanupOrphanedData(ctx context.Context) error {
// Order is important to respect foreign key constraints:
// 1. Delete orphaned files (will cascade delete file_chunks)
// 2. Delete orphaned blobs (will cascade delete blob_chunks for deleted blobs)
@@ -731,6 +749,17 @@ func (sm *SnapshotManager) cleanupOrphanedData(ctx context.Context) error {
// deleteOtherSnapshots deletes all snapshots except the current one
func (sm *SnapshotManager) deleteOtherSnapshots(ctx context.Context, tx *sql.Tx, currentSnapshotID string) error {
log.Debug("[Temp DB Cleanup] Deleting all snapshot records except current", "keeping", currentSnapshotID)
// First delete uploads that reference other snapshots (no CASCADE DELETE on this FK)
database.LogSQL("Execute", "DELETE FROM uploads WHERE snapshot_id != ?", currentSnapshotID)
uploadResult, err := tx.ExecContext(ctx, "DELETE FROM uploads WHERE snapshot_id != ?", currentSnapshotID)
if err != nil {
return fmt.Errorf("deleting uploads for other snapshots: %w", err)
}
uploadsDeleted, _ := uploadResult.RowsAffected()
log.Debug("[Temp DB Cleanup] Deleted upload records", "count", uploadsDeleted)
// Now we can safely delete the snapshots
database.LogSQL("Execute", "DELETE FROM snapshots WHERE id != ?", currentSnapshotID)
result, err := tx.ExecContext(ctx, "DELETE FROM snapshots WHERE id != ?", currentSnapshotID)
if err != nil {
@@ -842,16 +871,21 @@ func (sm *SnapshotManager) deleteOrphanedBlobToChunkMappings(ctx context.Context
return nil
}
// deleteOrphanedChunks deletes chunks not referenced by any file
// deleteOrphanedChunks deletes chunks not referenced by any file or blob
func (sm *SnapshotManager) deleteOrphanedChunks(ctx context.Context, tx *sql.Tx) error {
log.Debug("[Temp DB Cleanup] Deleting orphaned chunk records")
database.LogSQL("Execute", `DELETE FROM chunks WHERE NOT EXISTS (SELECT 1 FROM file_chunks WHERE file_chunks.chunk_hash = chunks.chunk_hash)`)
result, err := tx.ExecContext(ctx, `
query := `
DELETE FROM chunks
WHERE NOT EXISTS (
SELECT 1 FROM file_chunks
WHERE file_chunks.chunk_hash = chunks.chunk_hash
)`)
)
AND NOT EXISTS (
SELECT 1 FROM blob_chunks
WHERE blob_chunks.chunk_hash = chunks.chunk_hash
)`
database.LogSQL("Execute", query)
result, err := tx.ExecContext(ctx, query)
if err != nil {
return fmt.Errorf("deleting orphaned chunks: %w", err)
}

View File

@@ -3,12 +3,14 @@ package snapshot
import (
"context"
"database/sql"
"io"
"path/filepath"
"testing"
"git.eeqj.de/sneak/vaultik/internal/config"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/log"
"github.com/spf13/afero"
)
const (
@@ -16,11 +18,30 @@ const (
testAgeRecipient = "age1ezrjmfpwsc95svdg0y54mums3zevgzu0x0ecq2f7tp8a05gl0sjq9q9wjg"
)
// copyFile is a test helper to copy files using afero
func copyFile(fs afero.Fs, src, dst string) error {
sourceFile, err := fs.Open(src)
if err != nil {
return err
}
defer func() { _ = sourceFile.Close() }()
destFile, err := fs.Create(dst)
if err != nil {
return err
}
defer func() { _ = destFile.Close() }()
_, err = io.Copy(destFile, sourceFile)
return err
}
func TestCleanSnapshotDBEmptySnapshot(t *testing.T) {
// Initialize logger
log.Initialize(log.Config{})
ctx := context.Background()
fs := afero.NewOsFs()
// Create a test database
tempDir := t.TempDir()
@@ -66,7 +87,7 @@ func TestCleanSnapshotDBEmptySnapshot(t *testing.T) {
// Copy database
tempDBPath := filepath.Join(tempDir, "temp.db")
if err := copyFile(dbPath, tempDBPath); err != nil {
if err := copyFile(fs, dbPath, tempDBPath); err != nil {
t.Fatalf("failed to copy database: %v", err)
}
@@ -75,9 +96,12 @@ func TestCleanSnapshotDBEmptySnapshot(t *testing.T) {
CompressionLevel: 3,
AgeRecipients: []string{testAgeRecipient},
}
// Clean the database
sm := &SnapshotManager{config: cfg}
if _, err := sm.cleanSnapshotDB(ctx, tempDBPath, snapshot.ID); err != nil {
// Create SnapshotManager with filesystem
sm := &SnapshotManager{
config: cfg,
fs: fs,
}
if _, err := sm.cleanSnapshotDB(ctx, tempDBPath, snapshot.ID.String()); err != nil {
t.Fatalf("failed to clean snapshot database: %v", err)
}
@@ -95,7 +119,7 @@ func TestCleanSnapshotDBEmptySnapshot(t *testing.T) {
cleanedRepos := database.NewRepositories(cleanedDB)
// Verify snapshot exists
verifySnapshot, err := cleanedRepos.Snapshots.GetByID(ctx, snapshot.ID)
verifySnapshot, err := cleanedRepos.Snapshots.GetByID(ctx, snapshot.ID.String())
if err != nil {
t.Fatalf("failed to get snapshot: %v", err)
}
@@ -104,7 +128,7 @@ func TestCleanSnapshotDBEmptySnapshot(t *testing.T) {
}
// Verify orphan file is gone
f, err := cleanedRepos.Files.GetByPath(ctx, file.Path)
f, err := cleanedRepos.Files.GetByPath(ctx, file.Path.String())
if err != nil {
t.Fatalf("failed to check file: %v", err)
}
@@ -113,7 +137,7 @@ func TestCleanSnapshotDBEmptySnapshot(t *testing.T) {
}
// Verify orphan chunk is gone
c, err := cleanedRepos.Chunks.GetByHash(ctx, chunk.ChunkHash)
c, err := cleanedRepos.Chunks.GetByHash(ctx, chunk.ChunkHash.String())
if err != nil {
t.Fatalf("failed to check chunk: %v", err)
}
@@ -127,6 +151,7 @@ func TestCleanSnapshotDBNonExistentSnapshot(t *testing.T) {
log.Initialize(log.Config{})
ctx := context.Background()
fs := afero.NewOsFs()
// Create a test database
tempDir := t.TempDir()
@@ -143,7 +168,7 @@ func TestCleanSnapshotDBNonExistentSnapshot(t *testing.T) {
// Copy database
tempDBPath := filepath.Join(tempDir, "temp.db")
if err := copyFile(dbPath, tempDBPath); err != nil {
if err := copyFile(fs, dbPath, tempDBPath); err != nil {
t.Fatalf("failed to copy database: %v", err)
}
@@ -153,7 +178,7 @@ func TestCleanSnapshotDBNonExistentSnapshot(t *testing.T) {
AgeRecipients: []string{testAgeRecipient},
}
// Try to clean with non-existent snapshot
sm := &SnapshotManager{config: cfg}
sm := &SnapshotManager{config: cfg, fs: fs}
_, err = sm.cleanSnapshotDB(ctx, tempDBPath, "non-existent-snapshot")
// Should not error - it will just delete everything

262
internal/storage/file.go Normal file
View File

@@ -0,0 +1,262 @@
package storage
import (
"context"
"fmt"
"io"
"os"
"path/filepath"
"strings"
"github.com/spf13/afero"
)
// FileStorer implements Storer using the local filesystem.
// It mirrors the S3 path structure for consistency.
type FileStorer struct {
fs afero.Fs
basePath string
}
// NewFileStorer creates a new filesystem storage backend.
// The basePath directory will be created if it doesn't exist.
// Uses the real OS filesystem by default; call SetFilesystem to override for testing.
func NewFileStorer(basePath string) (*FileStorer, error) {
fs := afero.NewOsFs()
// Ensure base path exists
if err := fs.MkdirAll(basePath, 0755); err != nil {
return nil, fmt.Errorf("creating base path: %w", err)
}
return &FileStorer{
fs: fs,
basePath: basePath,
}, nil
}
// SetFilesystem overrides the filesystem for testing.
func (f *FileStorer) SetFilesystem(fs afero.Fs) {
f.fs = fs
}
// fullPath returns the full filesystem path for a key.
func (f *FileStorer) fullPath(key string) string {
return filepath.Join(f.basePath, key)
}
// Put stores data at the specified key.
func (f *FileStorer) Put(ctx context.Context, key string, data io.Reader) error {
path := f.fullPath(key)
// Create parent directories
dir := filepath.Dir(path)
if err := f.fs.MkdirAll(dir, 0755); err != nil {
return fmt.Errorf("creating directories: %w", err)
}
file, err := f.fs.Create(path)
if err != nil {
return fmt.Errorf("creating file: %w", err)
}
defer func() { _ = file.Close() }()
if _, err := io.Copy(file, data); err != nil {
return fmt.Errorf("writing file: %w", err)
}
return nil
}
// PutWithProgress stores data with progress reporting.
func (f *FileStorer) PutWithProgress(ctx context.Context, key string, data io.Reader, size int64, progress ProgressCallback) error {
path := f.fullPath(key)
// Create parent directories
dir := filepath.Dir(path)
if err := f.fs.MkdirAll(dir, 0755); err != nil {
return fmt.Errorf("creating directories: %w", err)
}
file, err := f.fs.Create(path)
if err != nil {
return fmt.Errorf("creating file: %w", err)
}
defer func() { _ = file.Close() }()
// Wrap with progress tracking
pw := &progressWriter{
writer: file,
callback: progress,
}
if _, err := io.Copy(pw, data); err != nil {
return fmt.Errorf("writing file: %w", err)
}
return nil
}
// Get retrieves data from the specified key.
func (f *FileStorer) Get(ctx context.Context, key string) (io.ReadCloser, error) {
path := f.fullPath(key)
file, err := f.fs.Open(path)
if err != nil {
if os.IsNotExist(err) {
return nil, ErrNotFound
}
return nil, fmt.Errorf("opening file: %w", err)
}
return file, nil
}
// Stat returns metadata about an object without retrieving its contents.
func (f *FileStorer) Stat(ctx context.Context, key string) (*ObjectInfo, error) {
path := f.fullPath(key)
info, err := f.fs.Stat(path)
if err != nil {
if os.IsNotExist(err) {
return nil, ErrNotFound
}
return nil, fmt.Errorf("stat file: %w", err)
}
return &ObjectInfo{
Key: key,
Size: info.Size(),
}, nil
}
// Delete removes an object.
func (f *FileStorer) Delete(ctx context.Context, key string) error {
path := f.fullPath(key)
err := f.fs.Remove(path)
if os.IsNotExist(err) {
return nil // Match S3 behavior: no error if the object doesn't exist
}
if err != nil {
return fmt.Errorf("removing file: %w", err)
}
return nil
}
// List returns all keys with the given prefix.
func (f *FileStorer) List(ctx context.Context, prefix string) ([]string, error) {
var keys []string
basePath := f.fullPath(prefix)
// Check if base path exists
exists, err := afero.Exists(f.fs, basePath)
if err != nil {
return nil, fmt.Errorf("checking path: %w", err)
}
if !exists {
return keys, nil // Empty list for non-existent prefix
}
err = afero.Walk(f.fs, basePath, func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
// Check context cancellation
select {
case <-ctx.Done():
return ctx.Err()
default:
}
if !info.IsDir() {
// Convert back to key (relative path from basePath)
relPath, err := filepath.Rel(f.basePath, path)
if err != nil {
return fmt.Errorf("computing relative path: %w", err)
}
// Normalize path separators to forward slashes for consistency
relPath = strings.ReplaceAll(relPath, string(filepath.Separator), "/")
keys = append(keys, relPath)
}
return nil
})
if err != nil {
return nil, fmt.Errorf("walking directory: %w", err)
}
return keys, nil
}
// ListStream returns a channel of ObjectInfo for large result sets.
func (f *FileStorer) ListStream(ctx context.Context, prefix string) <-chan ObjectInfo {
ch := make(chan ObjectInfo)
go func() {
defer close(ch)
basePath := f.fullPath(prefix)
// Check if base path exists
exists, err := afero.Exists(f.fs, basePath)
if err != nil {
ch <- ObjectInfo{Err: fmt.Errorf("checking path: %w", err)}
return
}
if !exists {
return // Empty channel for non-existent prefix
}
_ = afero.Walk(f.fs, basePath, func(path string, info os.FileInfo, err error) error {
// Check context cancellation
select {
case <-ctx.Done():
ch <- ObjectInfo{Err: ctx.Err()}
return ctx.Err()
default:
}
if err != nil {
ch <- ObjectInfo{Err: err}
return nil // Continue walking despite errors
}
if !info.IsDir() {
relPath, err := filepath.Rel(f.basePath, path)
if err != nil {
ch <- ObjectInfo{Err: fmt.Errorf("computing relative path: %w", err)}
return nil
}
// Normalize path separators
relPath = strings.ReplaceAll(relPath, string(filepath.Separator), "/")
ch <- ObjectInfo{
Key: relPath,
Size: info.Size(),
}
}
return nil
})
}()
return ch
}
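Consuming ListStream looks like this (sketch): the channel closes when the walk finishes, and per-item failures travel in ObjectInfo.Err rather than aborting the stream.

for obj := range storer.ListStream(ctx, "metadata/") {
	if obj.Err != nil {
		return obj.Err // or log and continue
	}
	fmt.Println(obj.Key, obj.Size)
}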
// Info returns human-readable storage location information.
func (f *FileStorer) Info() StorageInfo {
return StorageInfo{
Type: "file",
Location: f.basePath,
}
}
// progressWriter wraps an io.Writer to track write progress.
type progressWriter struct {
writer io.Writer
written int64
callback ProgressCallback
}
func (pw *progressWriter) Write(p []byte) (int, error) {
n, err := pw.writer.Write(p)
if n > 0 {
pw.written += int64(n)
if pw.callback != nil {
if callbackErr := pw.callback(pw.written); callbackErr != nil {
return n, callbackErr
}
}
}
return n, err
}
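A usage sketch for the filesystem backend. The progress callback shape, func(int64) error taking bytes written so far with a non-nil return aborting the transfer, is inferred from progressWriter above; the base path is made up.

package main

import (
	"bytes"
	"context"
	"fmt"

	"git.eeqj.de/sneak/vaultik/internal/storage"
)

func main() {
	st, err := storage.NewFileStorer("/var/lib/vaultik-store") // hypothetical path
	if err != nil {
		panic(err)
	}
	data := []byte("hello")
	err = st.PutWithProgress(context.Background(), "blobs/ab/cd/abcdef",
		bytes.NewReader(data), int64(len(data)),
		func(written int64) error { // assumed ProgressCallback shape
			fmt.Printf("wrote %d/%d bytes\n", written, len(data))
			return nil
		})
	if err != nil {
		panic(err)
	}
}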

113
internal/storage/module.go Normal file
View File

@@ -0,0 +1,113 @@
package storage
import (
"context"
"fmt"
"strings"
"git.eeqj.de/sneak/vaultik/internal/config"
"git.eeqj.de/sneak/vaultik/internal/s3"
"go.uber.org/fx"
)
// Module exports storage functionality as an fx module.
// It provides a Storer implementation based on the configured storage URL
// or falls back to legacy S3 configuration.
var Module = fx.Module("storage",
fx.Provide(NewStorer),
)
// NewStorer creates a Storer based on configuration.
// If StorageURL is set, it uses URL-based configuration.
// Otherwise, it falls back to legacy S3 configuration.
func NewStorer(cfg *config.Config) (Storer, error) {
if cfg.StorageURL != "" {
return storerFromURL(cfg.StorageURL, cfg)
}
return storerFromLegacyS3Config(cfg)
}
func storerFromURL(rawURL string, cfg *config.Config) (Storer, error) {
parsed, err := ParseStorageURL(rawURL)
if err != nil {
return nil, fmt.Errorf("parsing storage URL: %w", err)
}
switch parsed.Scheme {
case "file":
return NewFileStorer(parsed.Prefix)
case "s3":
// Build endpoint URL
endpoint := parsed.Endpoint
if endpoint == "" {
endpoint = "s3.amazonaws.com"
}
// Add protocol if not present
if parsed.UseSSL && !strings.HasPrefix(endpoint, "https://") && !strings.HasPrefix(endpoint, "http://") {
endpoint = "https://" + endpoint
} else if !parsed.UseSSL && !strings.HasPrefix(endpoint, "http://") && !strings.HasPrefix(endpoint, "https://") {
endpoint = "http://" + endpoint
}
region := parsed.Region
if region == "" {
region = cfg.S3.Region
if region == "" {
region = "us-east-1"
}
}
// Credentials come from config (not URL for security)
client, err := s3.NewClient(context.Background(), s3.Config{
Endpoint: endpoint,
Bucket: parsed.Bucket,
Prefix: parsed.Prefix,
AccessKeyID: cfg.S3.AccessKeyID,
SecretAccessKey: cfg.S3.SecretAccessKey,
Region: region,
})
if err != nil {
return nil, fmt.Errorf("creating S3 client: %w", err)
}
return NewS3Storer(client), nil
case "rclone":
return NewRcloneStorer(context.Background(), parsed.RcloneRemote, parsed.Prefix)
default:
return nil, fmt.Errorf("unsupported storage scheme: %s", parsed.Scheme)
}
}
func storerFromLegacyS3Config(cfg *config.Config) (Storer, error) {
endpoint := cfg.S3.Endpoint
// Ensure protocol is present
if !strings.HasPrefix(endpoint, "http://") && !strings.HasPrefix(endpoint, "https://") {
if cfg.S3.UseSSL {
endpoint = "https://" + endpoint
} else {
endpoint = "http://" + endpoint
}
}
region := cfg.S3.Region
if region == "" {
region = "us-east-1"
}
client, err := s3.NewClient(context.Background(), s3.Config{
Endpoint: endpoint,
Bucket: cfg.S3.Bucket,
Prefix: cfg.S3.Prefix,
AccessKeyID: cfg.S3.AccessKeyID,
SecretAccessKey: cfg.S3.SecretAccessKey,
Region: region,
})
if err != nil {
return nil, fmt.Errorf("creating S3 client: %w", err)
}
return NewS3Storer(client), nil
}
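A selection sketch, assuming ParseStorageURL accepts URLs of roughly these shapes; the exact syntax lives in ParseStorageURL, which is not part of this diff:

cfg := &config.Config{StorageURL: "file:///var/lib/vaultik-store"} // -> FileStorer
// cfg.StorageURL = "s3://my-bucket/backups"    // -> S3Storer (credentials from cfg.S3)
// cfg.StorageURL = "rclone://myremote/backups" // -> RcloneStorer
st, err := storage.NewStorer(cfg)
if err != nil {
	return err
}
info := st.Info()
fmt.Println(info.Type, info.Location) // e.g. "file /var/lib/vaultik-store"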

236
internal/storage/rclone.go Normal file
View File

@@ -0,0 +1,236 @@
package storage
import (
"bytes"
"context"
"errors"
"fmt"
"io"
"strings"
"time"
"github.com/rclone/rclone/fs"
"github.com/rclone/rclone/fs/config/configfile"
"github.com/rclone/rclone/fs/operations"
// Import all rclone backends
_ "github.com/rclone/rclone/backend/all"
)
// ErrRemoteNotFound is returned when an rclone remote is not configured.
var ErrRemoteNotFound = errors.New("rclone remote not found in config")
// RcloneStorer implements Storer using rclone's filesystem abstraction.
// This allows vaultik to use any of rclone's 70+ supported storage providers.
type RcloneStorer struct {
fsys fs.Fs // rclone filesystem
remote string // remote name (for Info())
path string // path within remote (for Info())
}
// NewRcloneStorer creates a new rclone storage backend.
// The remote parameter is the rclone remote name (as configured via `rclone config`).
// The path parameter is the path within the remote.
func NewRcloneStorer(ctx context.Context, remote, path string) (*RcloneStorer, error) {
// Install the default config file handler
configfile.Install()
// Build the rclone path string (e.g., "myremote:path/to/backups")
rclonePath := remote + ":"
if path != "" {
rclonePath += path
}
// Create the rclone filesystem
fsys, err := fs.NewFs(ctx, rclonePath)
if err != nil {
// Check for remote not found error
if strings.Contains(err.Error(), "didn't find section in config file") ||
strings.Contains(err.Error(), "failed to find remote") {
return nil, fmt.Errorf("%w: %s", ErrRemoteNotFound, remote)
}
return nil, fmt.Errorf("creating rclone filesystem: %w", err)
}
return &RcloneStorer{
fsys: fsys,
remote: remote,
path: path,
}, nil
}
// Put stores data at the specified key.
func (r *RcloneStorer) Put(ctx context.Context, key string, data io.Reader) error {
// Read all data into memory to get size (required by rclone)
buf, err := io.ReadAll(data)
if err != nil {
return fmt.Errorf("reading data: %w", err)
}
// Upload the object
_, err = operations.Rcat(ctx, r.fsys, key, io.NopCloser(bytes.NewReader(buf)), time.Now(), nil)
if err != nil {
return fmt.Errorf("uploading object: %w", err)
}
return nil
}
// PutWithProgress stores data with progress reporting.
func (r *RcloneStorer) PutWithProgress(ctx context.Context, key string, data io.Reader, size int64, progress ProgressCallback) error {
// Wrap reader with progress tracking
pr := &progressReader{
reader: data,
callback: progress,
}
// Upload the object
_, err := operations.Rcat(ctx, r.fsys, key, io.NopCloser(pr), time.Now(), nil)
if err != nil {
return fmt.Errorf("uploading object: %w", err)
}
return nil
}
// Get retrieves data from the specified key.
func (r *RcloneStorer) Get(ctx context.Context, key string) (io.ReadCloser, error) {
// Get the object
obj, err := r.fsys.NewObject(ctx, key)
if err != nil {
if errors.Is(err, fs.ErrorObjectNotFound) {
return nil, ErrNotFound
}
if errors.Is(err, fs.ErrorDirNotFound) {
return nil, ErrNotFound
}
return nil, fmt.Errorf("getting object: %w", err)
}
// Open the object for reading
reader, err := obj.Open(ctx)
if err != nil {
return nil, fmt.Errorf("opening object: %w", err)
}
return reader, nil
}
// Stat returns metadata about an object without retrieving its contents.
func (r *RcloneStorer) Stat(ctx context.Context, key string) (*ObjectInfo, error) {
obj, err := r.fsys.NewObject(ctx, key)
if err != nil {
if errors.Is(err, fs.ErrorObjectNotFound) {
return nil, ErrNotFound
}
if errors.Is(err, fs.ErrorDirNotFound) {
return nil, ErrNotFound
}
return nil, fmt.Errorf("getting object: %w", err)
}
return &ObjectInfo{
Key: key,
Size: obj.Size(),
}, nil
}
// Delete removes an object.
func (r *RcloneStorer) Delete(ctx context.Context, key string) error {
obj, err := r.fsys.NewObject(ctx, key)
if err != nil {
if errors.Is(err, fs.ErrorObjectNotFound) {
return nil // Match S3 behavior: no error if doesn't exist
}
if errors.Is(err, fs.ErrorDirNotFound) {
return nil
}
return fmt.Errorf("getting object: %w", err)
}
if err := obj.Remove(ctx); err != nil {
return fmt.Errorf("removing object: %w", err)
}
return nil
}
// List returns all keys with the given prefix.
func (r *RcloneStorer) List(ctx context.Context, prefix string) ([]string, error) {
var keys []string
err := operations.ListFn(ctx, r.fsys, func(obj fs.Object) {
key := obj.Remote()
if prefix == "" || strings.HasPrefix(key, prefix) {
keys = append(keys, key)
}
})
if err != nil {
return nil, fmt.Errorf("listing objects: %w", err)
}
return keys, nil
}
// ListStream returns a channel of ObjectInfo for large result sets.
func (r *RcloneStorer) ListStream(ctx context.Context, prefix string) <-chan ObjectInfo {
ch := make(chan ObjectInfo)
go func() {
defer close(ch)
err := operations.ListFn(ctx, r.fsys, func(obj fs.Object) {
// Stop doing work once the context is cancelled
if ctx.Err() != nil {
return
}
key := obj.Remote()
if prefix == "" || strings.HasPrefix(key, prefix) {
// Select on ctx so an abandoned consumer cannot strand this
// goroutine blocked on an unread channel
select {
case ch <- ObjectInfo{Key: key, Size: obj.Size()}:
case <-ctx.Done():
}
}
})
if err != nil {
select {
case ch <- ObjectInfo{Err: fmt.Errorf("listing objects: %w", err)}:
case <-ctx.Done():
}
}
}()
return ch
}
// Info returns human-readable storage location information.
func (r *RcloneStorer) Info() StorageInfo {
location := r.remote
if r.path != "" {
location += ":" + r.path
}
return StorageInfo{
Type: "rclone",
Location: location,
}
}
// progressReader wraps an io.Reader to track read progress.
type progressReader struct {
reader io.Reader
read int64
callback ProgressCallback
}
func (pr *progressReader) Read(p []byte) (int, error) {
n, err := pr.reader.Read(p)
if n > 0 {
pr.read += int64(n)
if pr.callback != nil {
if callbackErr := pr.callback(pr.read); callbackErr != nil {
return n, callbackErr
}
}
}
return n, err
}

internal/storage/s3.go Normal file

@@ -0,0 +1,85 @@
package storage
import (
"context"
"fmt"
"io"
"git.eeqj.de/sneak/vaultik/internal/s3"
)
// S3Storer wraps the existing s3.Client to implement Storer.
type S3Storer struct {
client *s3.Client
}
// NewS3Storer creates a new S3 storage backend.
func NewS3Storer(client *s3.Client) *S3Storer {
return &S3Storer{client: client}
}
// Put stores data at the specified key.
func (s *S3Storer) Put(ctx context.Context, key string, data io.Reader) error {
return s.client.PutObject(ctx, key, data)
}
// PutWithProgress stores data with progress reporting.
func (s *S3Storer) PutWithProgress(ctx context.Context, key string, data io.Reader, size int64, progress ProgressCallback) error {
// Convert storage.ProgressCallback to s3.ProgressCallback
var s3Progress s3.ProgressCallback
if progress != nil {
s3Progress = s3.ProgressCallback(progress)
}
return s.client.PutObjectWithProgress(ctx, key, data, size, s3Progress)
}
// Get retrieves data from the specified key.
func (s *S3Storer) Get(ctx context.Context, key string) (io.ReadCloser, error) {
return s.client.GetObject(ctx, key)
}
// Stat returns metadata about an object without retrieving its contents.
func (s *S3Storer) Stat(ctx context.Context, key string) (*ObjectInfo, error) {
info, err := s.client.StatObject(ctx, key)
if err != nil {
return nil, err
}
return &ObjectInfo{
Key: info.Key,
Size: info.Size,
}, nil
}
// Delete removes an object.
func (s *S3Storer) Delete(ctx context.Context, key string) error {
return s.client.DeleteObject(ctx, key)
}
// List returns all keys with the given prefix.
func (s *S3Storer) List(ctx context.Context, prefix string) ([]string, error) {
return s.client.ListObjects(ctx, prefix)
}
// ListStream returns a channel of ObjectInfo for large result sets.
func (s *S3Storer) ListStream(ctx context.Context, prefix string) <-chan ObjectInfo {
ch := make(chan ObjectInfo)
go func() {
defer close(ch)
for info := range s.client.ListObjectsStream(ctx, prefix, false) {
ch <- ObjectInfo{
Key: info.Key,
Size: info.Size,
Err: info.Err,
}
}
}()
return ch
}
// Info returns human-readable storage location information.
func (s *S3Storer) Info() StorageInfo {
return StorageInfo{
Type: "s3",
Location: fmt.Sprintf("%s/%s", s.client.Endpoint(), s.client.BucketName()),
}
}


@@ -0,0 +1,74 @@
// Package storage provides a unified interface for storage backends.
// It supports S3-compatible object storage, local filesystem storage, and
// any rclone-configured remote, allowing Vaultik to store backups in any of
// these locations with the same API.
//
// Storage backends are selected via URL:
// - s3://bucket/prefix?endpoint=host&region=r - S3-compatible storage
// - file:///path/to/backup - Local filesystem storage
// - rclone://remote/path/to/backups - Any rclone remote
//
// All backends implement the Storer interface and support progress reporting
// during upload/write operations.
package storage
import (
"context"
"errors"
"io"
)
// ErrNotFound is returned when an object does not exist.
var ErrNotFound = errors.New("object not found")
// ProgressCallback is called during storage operations with bytes transferred so far.
// Return an error to cancel the operation.
type ProgressCallback func(bytesTransferred int64) error
// ObjectInfo contains metadata about a stored object.
type ObjectInfo struct {
Key string // Object key/path
Size int64 // Size in bytes
Err error // Error for streaming results (nil on success)
}
// StorageInfo provides human-readable storage configuration.
type StorageInfo struct {
Type string // "s3", "file", or "rclone"
Location string // endpoint/bucket for S3, base path for filesystem, remote:path for rclone
}
// Storer defines the interface for storage backends.
// All paths are relative to the storage root (bucket/prefix for S3, base directory for filesystem).
type Storer interface {
// Put stores data at the specified key.
// Parent directories are created automatically for filesystem backends.
Put(ctx context.Context, key string, data io.Reader) error
// PutWithProgress stores data with progress reporting.
// Size must be the exact size of the data to store.
// The progress callback is called periodically with bytes transferred.
PutWithProgress(ctx context.Context, key string, data io.Reader, size int64, progress ProgressCallback) error
// Get retrieves data from the specified key.
// The caller must close the returned ReadCloser.
// Returns ErrNotFound if the object does not exist.
Get(ctx context.Context, key string) (io.ReadCloser, error)
// Stat returns metadata about an object without retrieving its contents.
// Returns ErrNotFound if the object does not exist.
Stat(ctx context.Context, key string) (*ObjectInfo, error)
// Delete removes an object. No error is returned if the object doesn't exist.
Delete(ctx context.Context, key string) error
// List returns all keys with the given prefix.
// For large result sets, prefer ListStream.
List(ctx context.Context, prefix string) ([]string, error)
// ListStream returns a channel of ObjectInfo for large result sets.
// The channel is closed when listing completes.
// If an error occurs during listing, the final item will have Err set.
ListStream(ctx context.Context, prefix string) <-chan ObjectInfo
// Info returns human-readable storage location information.
Info() StorageInfo
}
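
Because ListStream reports a failure as a final item with Err set, rather than via a separate error return, every consumer must check Err on each received value. A small consumption sketch against the interface above (the helper name is illustrative):

// Sketch: drain a ListStream, separating keys from a terminal error.
func collectKeys(ctx context.Context, s Storer, prefix string) ([]string, error) {
	var keys []string
	for info := range s.ListStream(ctx, prefix) {
		if info.Err != nil {
			return keys, info.Err // delivered as the final channel item
		}
		keys = append(keys, info.Key)
	}
	return keys, nil
}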

internal/storage/url.go Normal file

@@ -0,0 +1,118 @@
package storage
import (
"fmt"
"net/url"
"strings"
)
// StorageURL represents a parsed storage URL.
type StorageURL struct {
Scheme string // "s3", "file", or "rclone"
Bucket string // S3 bucket name (empty for file/rclone)
Prefix string // Path within bucket or filesystem base path
Endpoint string // S3 endpoint (optional, default AWS)
Region string // S3 region (optional)
UseSSL bool // Use HTTPS for S3 (default true)
RcloneRemote string // rclone remote name (for rclone:// URLs)
}
// ParseStorageURL parses a storage URL string.
// Supported formats:
// - s3://bucket/prefix?endpoint=host&region=us-east-1&ssl=true
// - file:///absolute/path/to/backup
// - rclone://remote/path/to/backups
func ParseStorageURL(rawURL string) (*StorageURL, error) {
if rawURL == "" {
return nil, fmt.Errorf("storage URL is empty")
}
// Handle file:// URLs
if strings.HasPrefix(rawURL, "file://") {
path := strings.TrimPrefix(rawURL, "file://")
if path == "" {
return nil, fmt.Errorf("file URL path is empty")
}
return &StorageURL{
Scheme: "file",
Prefix: path,
}, nil
}
// Handle s3:// URLs
if strings.HasPrefix(rawURL, "s3://") {
u, err := url.Parse(rawURL)
if err != nil {
return nil, fmt.Errorf("invalid URL: %w", err)
}
bucket := u.Host
if bucket == "" {
return nil, fmt.Errorf("s3 URL missing bucket name")
}
prefix := strings.TrimPrefix(u.Path, "/")
query := u.Query()
useSSL := true
if query.Get("ssl") == "false" {
useSSL = false
}
return &StorageURL{
Scheme: "s3",
Bucket: bucket,
Prefix: prefix,
Endpoint: query.Get("endpoint"),
Region: query.Get("region"),
UseSSL: useSSL,
}, nil
}
// Handle rclone:// URLs
if strings.HasPrefix(rawURL, "rclone://") {
u, err := url.Parse(rawURL)
if err != nil {
return nil, fmt.Errorf("invalid URL: %w", err)
}
remote := u.Host
if remote == "" {
return nil, fmt.Errorf("rclone URL missing remote name")
}
path := strings.TrimPrefix(u.Path, "/")
return &StorageURL{
Scheme: "rclone",
Prefix: path,
RcloneRemote: remote,
}, nil
}
return nil, fmt.Errorf("unsupported URL scheme: must start with s3://, file://, or rclone://")
}
// String returns a human-readable representation of the storage URL.
func (u *StorageURL) String() string {
switch u.Scheme {
case "file":
return fmt.Sprintf("file://%s", u.Prefix)
case "s3":
endpoint := u.Endpoint
if endpoint == "" {
endpoint = "s3.amazonaws.com"
}
if u.Prefix != "" {
return fmt.Sprintf("s3://%s/%s (endpoint: %s)", u.Bucket, u.Prefix, endpoint)
}
return fmt.Sprintf("s3://%s (endpoint: %s)", u.Bucket, endpoint)
case "rclone":
if u.Prefix != "" {
return fmt.Sprintf("rclone://%s/%s", u.RcloneRemote, u.Prefix)
}
return fmt.Sprintf("rclone://%s", u.RcloneRemote)
default:
return fmt.Sprintf("%s://?", u.Scheme)
}
}
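
For reference, a few inputs and the fields ParseStorageURL populates for them; the bucket, host, and remote names are illustrative, but the field values follow directly from the code above. The first form is the URL equivalent of the legacy UseSSL=false S3 config handled by storerFromLegacyS3Config:

// ParseStorageURL("s3://backups/host1?endpoint=minio.local:9000&ssl=false")
//   => Scheme:"s3" Bucket:"backups" Prefix:"host1" Endpoint:"minio.local:9000" UseSSL:false
// ParseStorageURL("file:///var/backups/vaultik")
//   => Scheme:"file" Prefix:"/var/backups/vaultik"
// ParseStorageURL("rclone://b2/vaultik")
//   => Scheme:"rclone" RcloneRemote:"b2" Prefix:"vaultik"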

internal/types/types.go Normal file

@@ -0,0 +1,203 @@
// Package types provides custom types for better type safety across the vaultik codebase.
// Using distinct types for IDs, hashes, paths, and credentials prevents accidental
// mixing of semantically different values that happen to share the same underlying type.
package types
import (
"database/sql/driver"
"fmt"
"github.com/google/uuid"
)
// FileID is a UUID identifying a file record in the database.
type FileID uuid.UUID
// NewFileID generates a new random FileID.
func NewFileID() FileID {
return FileID(uuid.New())
}
// ParseFileID parses a string into a FileID.
func ParseFileID(s string) (FileID, error) {
id, err := uuid.Parse(s)
if err != nil {
return FileID{}, err
}
return FileID(id), nil
}
// IsZero returns true if the FileID is the zero value.
func (id FileID) IsZero() bool {
return uuid.UUID(id) == uuid.Nil
}
// Value implements driver.Valuer for database serialization.
func (id FileID) Value() (driver.Value, error) {
return uuid.UUID(id).String(), nil
}
// Scan implements sql.Scanner for database deserialization.
func (id *FileID) Scan(src interface{}) error {
if src == nil {
*id = FileID{}
return nil
}
var s string
switch v := src.(type) {
case string:
s = v
case []byte:
s = string(v)
default:
return fmt.Errorf("cannot scan %T into FileID", src)
}
parsed, err := uuid.Parse(s)
if err != nil {
return fmt.Errorf("invalid FileID: %w", err)
}
*id = FileID(parsed)
return nil
}
// BlobID is a UUID identifying a blob record in the database.
// This is distinct from BlobHash which is the content-addressed hash of the blob.
type BlobID uuid.UUID
// NewBlobID generates a new random BlobID.
func NewBlobID() BlobID {
return BlobID(uuid.New())
}
// ParseBlobID parses a string into a BlobID.
func ParseBlobID(s string) (BlobID, error) {
id, err := uuid.Parse(s)
if err != nil {
return BlobID{}, err
}
return BlobID(id), nil
}
// IsZero returns true if the BlobID is the zero value.
func (id BlobID) IsZero() bool {
return uuid.UUID(id) == uuid.Nil
}
// Value implements driver.Valuer for database serialization.
func (id BlobID) Value() (driver.Value, error) {
return uuid.UUID(id).String(), nil
}
// Scan implements sql.Scanner for database deserialization.
func (id *BlobID) Scan(src interface{}) error {
if src == nil {
*id = BlobID{}
return nil
}
var s string
switch v := src.(type) {
case string:
s = v
case []byte:
s = string(v)
default:
return fmt.Errorf("cannot scan %T into BlobID", src)
}
parsed, err := uuid.Parse(s)
if err != nil {
return fmt.Errorf("invalid BlobID: %w", err)
}
*id = BlobID(parsed)
return nil
}
// SnapshotID identifies a snapshot, typically in format "hostname_name_timestamp".
type SnapshotID string
// ChunkHash is the SHA256 hash of a chunk's content.
// Used for content-addressing and deduplication of file chunks.
type ChunkHash string
// BlobHash is the SHA256 hash of a blob's compressed and encrypted content.
// This is used as the filename in S3 storage for content-addressed retrieval.
type BlobHash string
// FilePath represents an absolute path to a file or directory.
type FilePath string
// SourcePath represents the root directory from which files are backed up.
// Used during restore to strip the source prefix from paths.
type SourcePath string
// AgeRecipient is an age public key used for encryption.
// Format: age1... (Bech32-encoded X25519 public key)
type AgeRecipient string
// AgeSecretKey is an age private key used for decryption.
// Format: AGE-SECRET-KEY-... (Bech32-encoded X25519 private key)
// This type should never be logged or serialized in plaintext.
type AgeSecretKey string
// S3Endpoint is the URL of an S3-compatible storage endpoint.
type S3Endpoint string
// BucketName is the name of an S3 bucket.
type BucketName string
// S3Prefix is the path prefix within an S3 bucket.
type S3Prefix string
// AWSRegion is an AWS region identifier (e.g., "us-east-1").
type AWSRegion string
// AWSAccessKeyID is an AWS access key ID for authentication.
type AWSAccessKeyID string
// AWSSecretAccessKey is an AWS secret access key for authentication.
// This type should never be logged or serialized in plaintext.
type AWSSecretAccessKey string
// Hostname identifies a host machine.
type Hostname string
// Version is a semantic version string.
type Version string
// GitRevision is a git commit SHA.
type GitRevision string
// GlobPattern is a glob pattern for file matching (e.g., "*.log", "node_modules").
type GlobPattern string
// String methods for Stringer interface
func (id FileID) String() string { return uuid.UUID(id).String() }
func (id BlobID) String() string { return uuid.UUID(id).String() }
func (id SnapshotID) String() string { return string(id) }
func (h ChunkHash) String() string { return string(h) }
func (h BlobHash) String() string { return string(h) }
func (p FilePath) String() string { return string(p) }
func (p SourcePath) String() string { return string(p) }
func (r AgeRecipient) String() string { return string(r) }
func (e S3Endpoint) String() string { return string(e) }
func (b BucketName) String() string { return string(b) }
func (p S3Prefix) String() string { return string(p) }
func (r AWSRegion) String() string { return string(r) }
func (k AWSAccessKeyID) String() string { return string(k) }
func (h Hostname) String() string { return string(h) }
func (v Version) String() string { return string(v) }
func (r GitRevision) String() string { return string(r) }
func (p GlobPattern) String() string { return string(p) }
// Redacted String methods for sensitive types - prevents accidental logging
func (k AgeSecretKey) String() string { return "[REDACTED]" }
func (k AWSSecretAccessKey) String() string { return "[REDACTED]" }
// Raw returns the actual value for sensitive types when explicitly needed
func (k AgeSecretKey) Raw() string { return string(k) }
func (k AWSSecretAccessKey) Raw() string { return string(k) }
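
The redacted Stringer on the secret types means they are safe to pass through fmt and structured logging by default, while Raw() is the single explicit escape hatch. A short sketch (the key below is a placeholder, not a valid age key):

key := types.AgeSecretKey("AGE-SECRET-KEY-PLACEHOLDER")
fmt.Println(key)        // prints: [REDACTED]
fmt.Printf("%s\n", key) // prints: [REDACTED] (Stringer is used for %s and %v)
secret := key.Raw()     // explicit opt-in to the real value
_ = secret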


@@ -0,0 +1,96 @@
package vaultik
import (
"fmt"
"strconv"
"strings"
"time"
"git.eeqj.de/sneak/vaultik/internal/types"
)
// SnapshotInfo contains information about a snapshot
type SnapshotInfo struct {
ID types.SnapshotID `json:"id"`
Timestamp time.Time `json:"timestamp"`
CompressedSize int64 `json:"compressed_size"`
}
// formatNumber formats a number with commas
func formatNumber(n int) string {
str := fmt.Sprintf("%d", n)
var result []string
for i, digit := range str {
if i > 0 && (len(str)-i)%3 == 0 {
result = append(result, ",")
}
result = append(result, string(digit))
}
return strings.Join(result, "")
}
// formatDuration formats a duration in a human-readable way
func formatDuration(d time.Duration) string {
if d < time.Second {
return fmt.Sprintf("%dms", d.Milliseconds())
}
if d < time.Minute {
return fmt.Sprintf("%.1fs", d.Seconds())
}
if d < time.Hour {
mins := int(d.Minutes())
secs := int(d.Seconds()) % 60
return fmt.Sprintf("%dm %ds", mins, secs)
}
hours := int(d.Hours())
mins := int(d.Minutes()) % 60
return fmt.Sprintf("%dh %dm", hours, mins)
}
// formatBytes formats bytes in a human-readable format
func formatBytes(bytes int64) string {
const unit = 1024
if bytes < unit {
return fmt.Sprintf("%d B", bytes)
}
div, exp := int64(unit), 0
for n := bytes / unit; n >= unit; n /= unit {
div *= unit
exp++
}
return fmt.Sprintf("%.1f %cB", float64(bytes)/float64(div), "KMGTPE"[exp])
}
// parseSnapshotTimestamp extracts the timestamp from a snapshot ID
// Format: hostname_snapshotname_2026-01-12T14:41:15Z
func parseSnapshotTimestamp(snapshotID string) (time.Time, error) {
parts := strings.Split(snapshotID, "_")
if len(parts) < 2 {
return time.Time{}, fmt.Errorf("invalid snapshot ID format: expected hostname_snapshotname_timestamp")
}
// Last part is the RFC3339 timestamp
timestampStr := parts[len(parts)-1]
timestamp, err := time.Parse(time.RFC3339, timestampStr)
if err != nil {
return time.Time{}, fmt.Errorf("invalid timestamp: %w", err)
}
return timestamp.UTC(), nil
}
// parseDuration parses a duration string with support for days
func parseDuration(s string) (time.Duration, error) {
// Check for days suffix
if strings.HasSuffix(s, "d") {
daysStr := strings.TrimSuffix(s, "d")
days, err := strconv.Atoi(daysStr)
if err != nil {
return 0, fmt.Errorf("invalid days value: %w", err)
}
return time.Duration(days) * 24 * time.Hour, nil
}
// Otherwise use standard Go duration parsing
return time.ParseDuration(s)
}
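
Worked examples for the two parsers above (the snapshot ID is illustrative):

// parseSnapshotTimestamp splits on "_" and parses the final field as RFC3339:
//   parseSnapshotTimestamp("myhost_docs_2026-01-12T14:41:15Z") => 2026-01-12 14:41:15 UTC
// parseDuration accepts standard Go durations plus a "d" (days) suffix:
//   parseDuration("90m") => 1h30m0s
//   parseDuration("30d") => 720h0m0s (30 * 24h)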

internal/vaultik/info.go Normal file

@@ -0,0 +1,348 @@
package vaultik
import (
"encoding/json"
"fmt"
"runtime"
"sort"
"strings"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/snapshot"
"github.com/dustin/go-humanize"
)
// ShowInfo displays system and configuration information
func (v *Vaultik) ShowInfo() error {
// System Information
fmt.Printf("=== System Information ===\n")
fmt.Printf("OS/Architecture: %s/%s\n", runtime.GOOS, runtime.GOARCH)
fmt.Printf("Version: %s\n", v.Globals.Version)
fmt.Printf("Commit: %s\n", v.Globals.Commit)
fmt.Printf("Go Version: %s\n", runtime.Version())
fmt.Println()
// Storage Configuration
fmt.Printf("=== Storage Configuration ===\n")
fmt.Printf("S3 Bucket: %s\n", v.Config.S3.Bucket)
if v.Config.S3.Prefix != "" {
fmt.Printf("S3 Prefix: %s\n", v.Config.S3.Prefix)
}
fmt.Printf("S3 Endpoint: %s\n", v.Config.S3.Endpoint)
fmt.Printf("S3 Region: %s\n", v.Config.S3.Region)
fmt.Println()
// Backup Settings
fmt.Printf("=== Backup Settings ===\n")
// Show configured snapshots
fmt.Printf("Snapshots:\n")
for _, name := range v.Config.SnapshotNames() {
snap := v.Config.Snapshots[name]
fmt.Printf(" %s:\n", name)
for _, path := range snap.Paths {
fmt.Printf(" - %s\n", path)
}
if len(snap.Exclude) > 0 {
fmt.Printf(" exclude: %s\n", strings.Join(snap.Exclude, ", "))
}
}
// Global exclude patterns
if len(v.Config.Exclude) > 0 {
fmt.Printf("Global Exclude: %s\n", strings.Join(v.Config.Exclude, ", "))
}
fmt.Printf("Compression: zstd level %d\n", v.Config.CompressionLevel)
fmt.Printf("Chunk Size: %s\n", humanize.Bytes(uint64(v.Config.ChunkSize)))
fmt.Printf("Blob Size Limit: %s\n", humanize.Bytes(uint64(v.Config.BlobSizeLimit)))
fmt.Println()
// Encryption Configuration
fmt.Printf("=== Encryption Configuration ===\n")
fmt.Printf("Recipients:\n")
for _, recipient := range v.Config.AgeRecipients {
fmt.Printf(" - %s\n", recipient)
}
fmt.Println()
// Daemon Settings (if applicable)
if v.Config.BackupInterval > 0 || v.Config.MinTimeBetweenRun > 0 {
fmt.Printf("=== Daemon Settings ===\n")
if v.Config.BackupInterval > 0 {
fmt.Printf("Backup Interval: %s\n", v.Config.BackupInterval)
}
if v.Config.MinTimeBetweenRun > 0 {
fmt.Printf("Minimum Time: %s\n", v.Config.MinTimeBetweenRun)
}
fmt.Println()
}
// Local Database
fmt.Printf("=== Local Database ===\n")
fmt.Printf("Index Path: %s\n", v.Config.IndexPath)
// Check if index file exists and get its size
if info, err := v.Fs.Stat(v.Config.IndexPath); err == nil {
fmt.Printf("Index Size: %s\n", humanize.Bytes(uint64(info.Size())))
// Get snapshot count from database
query := `SELECT COUNT(*) FROM snapshots WHERE completed_at IS NOT NULL`
var snapshotCount int
if err := v.DB.Conn().QueryRowContext(v.ctx, query).Scan(&snapshotCount); err == nil {
fmt.Printf("Snapshots: %d\n", snapshotCount)
}
// Get blob count from database
query = `SELECT COUNT(*) FROM blobs`
var blobCount int
if err := v.DB.Conn().QueryRowContext(v.ctx, query).Scan(&blobCount); err == nil {
fmt.Printf("Blobs: %d\n", blobCount)
}
// Get file count from database
query = `SELECT COUNT(*) FROM files`
var fileCount int
if err := v.DB.Conn().QueryRowContext(v.ctx, query).Scan(&fileCount); err == nil {
fmt.Printf("Files: %d\n", fileCount)
}
} else {
fmt.Printf("Index Size: (not created)\n")
}
return nil
}
// SnapshotMetadataInfo contains information about a single snapshot's metadata
type SnapshotMetadataInfo struct {
SnapshotID string `json:"snapshot_id"`
ManifestSize int64 `json:"manifest_size"`
DatabaseSize int64 `json:"database_size"`
TotalSize int64 `json:"total_size"`
BlobCount int `json:"blob_count"`
BlobsSize int64 `json:"blobs_size"`
}
// RemoteInfoResult contains all remote storage information
type RemoteInfoResult struct {
// Storage info
StorageType string `json:"storage_type"`
StorageLocation string `json:"storage_location"`
// Snapshot metadata
Snapshots []SnapshotMetadataInfo `json:"snapshots"`
TotalMetadataSize int64 `json:"total_metadata_size"`
TotalMetadataCount int `json:"total_metadata_count"`
// All blobs on remote
TotalBlobCount int `json:"total_blob_count"`
TotalBlobSize int64 `json:"total_blob_size"`
// Referenced blobs (from manifests)
ReferencedBlobCount int `json:"referenced_blob_count"`
ReferencedBlobSize int64 `json:"referenced_blob_size"`
// Orphaned blobs
OrphanedBlobCount int `json:"orphaned_blob_count"`
OrphanedBlobSize int64 `json:"orphaned_blob_size"`
}
// RemoteInfo displays information about remote storage
func (v *Vaultik) RemoteInfo(jsonOutput bool) error {
result := &RemoteInfoResult{}
// Get storage info
storageInfo := v.Storage.Info()
result.StorageType = storageInfo.Type
result.StorageLocation = storageInfo.Location
if !jsonOutput {
fmt.Printf("=== Remote Storage ===\n")
fmt.Printf("Type: %s\n", storageInfo.Type)
fmt.Printf("Location: %s\n", storageInfo.Location)
fmt.Println()
}
// List all snapshot metadata
if !jsonOutput {
fmt.Printf("Scanning snapshot metadata...\n")
}
snapshotMetadata := make(map[string]*SnapshotMetadataInfo)
// Collect metadata files
metadataCh := v.Storage.ListStream(v.ctx, "metadata/")
for obj := range metadataCh {
if obj.Err != nil {
return fmt.Errorf("listing metadata: %w", obj.Err)
}
// Parse key: metadata/<snapshot-id>/<filename>
parts := strings.Split(obj.Key, "/")
if len(parts) < 3 {
continue
}
snapshotID := parts[1]
if _, exists := snapshotMetadata[snapshotID]; !exists {
snapshotMetadata[snapshotID] = &SnapshotMetadataInfo{
SnapshotID: snapshotID,
}
}
info := snapshotMetadata[snapshotID]
filename := parts[2]
if strings.HasPrefix(filename, "manifest") {
info.ManifestSize = obj.Size
} else if strings.HasPrefix(filename, "db") {
info.DatabaseSize = obj.Size
}
info.TotalSize = info.ManifestSize + info.DatabaseSize
}
// Sort snapshots by ID for consistent output
var snapshotIDs []string
for id := range snapshotMetadata {
snapshotIDs = append(snapshotIDs, id)
}
sort.Strings(snapshotIDs)
// Download and parse all manifests to get referenced blobs
if !jsonOutput {
fmt.Printf("Downloading %d manifest(s)...\n", len(snapshotIDs))
}
referencedBlobs := make(map[string]int64) // hash -> compressed size
for _, snapshotID := range snapshotIDs {
manifestKey := fmt.Sprintf("metadata/%s/manifest.json.zst", snapshotID)
reader, err := v.Storage.Get(v.ctx, manifestKey)
if err != nil {
log.Warn("Failed to get manifest", "snapshot", snapshotID, "error", err)
continue
}
manifest, err := snapshot.DecodeManifest(reader)
_ = reader.Close()
if err != nil {
log.Warn("Failed to decode manifest", "snapshot", snapshotID, "error", err)
continue
}
// Record blob info from manifest
info := snapshotMetadata[snapshotID]
info.BlobCount = manifest.BlobCount
var blobsSize int64
for _, blob := range manifest.Blobs {
referencedBlobs[blob.Hash] = blob.CompressedSize
blobsSize += blob.CompressedSize
}
info.BlobsSize = blobsSize
}
// Build result snapshots
var totalMetadataSize int64
for _, id := range snapshotIDs {
info := snapshotMetadata[id]
result.Snapshots = append(result.Snapshots, *info)
totalMetadataSize += info.TotalSize
}
result.TotalMetadataSize = totalMetadataSize
result.TotalMetadataCount = len(snapshotIDs)
// Calculate referenced blob stats
for _, size := range referencedBlobs {
result.ReferencedBlobCount++
result.ReferencedBlobSize += size
}
// List all blobs on remote
if !jsonOutput {
fmt.Printf("Scanning blobs...\n")
}
allBlobs := make(map[string]int64) // hash -> size from storage
blobCh := v.Storage.ListStream(v.ctx, "blobs/")
for obj := range blobCh {
if obj.Err != nil {
return fmt.Errorf("listing blobs: %w", obj.Err)
}
// Extract hash from key: blobs/xx/yy/hash
parts := strings.Split(obj.Key, "/")
if len(parts) < 4 {
continue
}
hash := parts[3]
allBlobs[hash] = obj.Size
result.TotalBlobCount++
result.TotalBlobSize += obj.Size
}
// Calculate orphaned blobs
for hash, size := range allBlobs {
if _, referenced := referencedBlobs[hash]; !referenced {
result.OrphanedBlobCount++
result.OrphanedBlobSize += size
}
}
// Output results
if jsonOutput {
enc := json.NewEncoder(v.Stdout)
enc.SetIndent("", " ")
return enc.Encode(result)
}
// Human-readable output
fmt.Printf("\n=== Snapshot Metadata ===\n")
if len(result.Snapshots) == 0 {
fmt.Printf("No snapshots found\n")
} else {
fmt.Printf("%-45s %12s %12s %12s %10s %12s\n", "SNAPSHOT", "MANIFEST", "DATABASE", "TOTAL", "BLOBS", "BLOB SIZE")
fmt.Printf("%-45s %12s %12s %12s %10s %12s\n", strings.Repeat("-", 45), strings.Repeat("-", 12), strings.Repeat("-", 12), strings.Repeat("-", 12), strings.Repeat("-", 10), strings.Repeat("-", 12))
for _, info := range result.Snapshots {
fmt.Printf("%-45s %12s %12s %12s %10s %12s\n",
truncateString(info.SnapshotID, 45),
humanize.Bytes(uint64(info.ManifestSize)),
humanize.Bytes(uint64(info.DatabaseSize)),
humanize.Bytes(uint64(info.TotalSize)),
humanize.Comma(int64(info.BlobCount)),
humanize.Bytes(uint64(info.BlobsSize)),
)
}
fmt.Printf("%-45s %12s %12s %12s %10s %12s\n", strings.Repeat("-", 45), strings.Repeat("-", 12), strings.Repeat("-", 12), strings.Repeat("-", 12), strings.Repeat("-", 10), strings.Repeat("-", 12))
fmt.Printf("%-45s %12s %12s %12s\n", fmt.Sprintf("Total (%d snapshots)", result.TotalMetadataCount), "", "", humanize.Bytes(uint64(result.TotalMetadataSize)))
}
fmt.Printf("\n=== Blob Storage ===\n")
fmt.Printf("Total blobs on remote: %s (%s)\n",
humanize.Comma(int64(result.TotalBlobCount)),
humanize.Bytes(uint64(result.TotalBlobSize)))
fmt.Printf("Referenced by snapshots: %s (%s)\n",
humanize.Comma(int64(result.ReferencedBlobCount)),
humanize.Bytes(uint64(result.ReferencedBlobSize)))
fmt.Printf("Orphaned (unreferenced): %s (%s)\n",
humanize.Comma(int64(result.OrphanedBlobCount)),
humanize.Bytes(uint64(result.OrphanedBlobSize)))
if result.OrphanedBlobCount > 0 {
fmt.Printf("\nRun 'vaultik prune --remote' to remove orphaned blobs.\n")
}
return nil
}
// truncateString truncates a string to maxLen, adding "..." if truncated
func truncateString(s string, maxLen int) string {
if len(s) <= maxLen {
return s
}
if maxLen <= 3 {
return s[:maxLen]
}
return s[:maxLen-3] + "..."
}
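
RemoteInfo above and the prune path later in this changeset both parse the same remote key layout; summarized here as inferred from the key-splitting code:

// Remote key layout assumed by RemoteInfo and PruneBlobs:
//   blobs/<h[0:2]>/<h[2:4]>/<h>               content-addressed blob, h = blob hash
//   metadata/<snapshot-id>/manifest.json.zst  blob manifest for one snapshot
//   metadata/<snapshot-id>/db.zst.age         encrypted, compressed SQLite state DB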


@@ -0,0 +1,543 @@
package vaultik_test
import (
"bytes"
"context"
"database/sql"
"io"
"os"
"path/filepath"
"sync"
"testing"
"time"
"git.eeqj.de/sneak/vaultik/internal/config"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/snapshot"
"git.eeqj.de/sneak/vaultik/internal/storage"
"git.eeqj.de/sneak/vaultik/internal/types"
"git.eeqj.de/sneak/vaultik/internal/vaultik"
"github.com/spf13/afero"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// MockStorer implements storage.Storer for testing
type MockStorer struct {
mu sync.Mutex
data map[string][]byte
calls []string
}
func NewMockStorer() *MockStorer {
return &MockStorer{
data: make(map[string][]byte),
calls: make([]string, 0),
}
}
func (m *MockStorer) Put(ctx context.Context, key string, reader io.Reader) error {
m.mu.Lock()
defer m.mu.Unlock()
m.calls = append(m.calls, "Put:"+key)
data, err := io.ReadAll(reader)
if err != nil {
return err
}
m.data[key] = data
return nil
}
func (m *MockStorer) PutWithProgress(ctx context.Context, key string, reader io.Reader, size int64, progress storage.ProgressCallback) error {
return m.Put(ctx, key, reader)
}
func (m *MockStorer) Get(ctx context.Context, key string) (io.ReadCloser, error) {
m.mu.Lock()
defer m.mu.Unlock()
m.calls = append(m.calls, "Get:"+key)
data, exists := m.data[key]
if !exists {
return nil, storage.ErrNotFound
}
return io.NopCloser(bytes.NewReader(data)), nil
}
func (m *MockStorer) Stat(ctx context.Context, key string) (*storage.ObjectInfo, error) {
m.mu.Lock()
defer m.mu.Unlock()
m.calls = append(m.calls, "Stat:"+key)
data, exists := m.data[key]
if !exists {
return nil, storage.ErrNotFound
}
return &storage.ObjectInfo{
Key: key,
Size: int64(len(data)),
}, nil
}
func (m *MockStorer) Delete(ctx context.Context, key string) error {
m.mu.Lock()
defer m.mu.Unlock()
m.calls = append(m.calls, "Delete:"+key)
delete(m.data, key)
return nil
}
func (m *MockStorer) List(ctx context.Context, prefix string) ([]string, error) {
m.mu.Lock()
defer m.mu.Unlock()
m.calls = append(m.calls, "List:"+prefix)
var keys []string
for key := range m.data {
if len(prefix) == 0 || (len(key) >= len(prefix) && key[:len(prefix)] == prefix) {
keys = append(keys, key)
}
}
return keys, nil
}
func (m *MockStorer) ListStream(ctx context.Context, prefix string) <-chan storage.ObjectInfo {
ch := make(chan storage.ObjectInfo)
go func() {
defer close(ch)
m.mu.Lock()
defer m.mu.Unlock()
for key, data := range m.data {
if len(prefix) == 0 || (len(key) >= len(prefix) && key[:len(prefix)] == prefix) {
ch <- storage.ObjectInfo{
Key: key,
Size: int64(len(data)),
}
}
}
}()
return ch
}
func (m *MockStorer) Info() storage.StorageInfo {
return storage.StorageInfo{
Type: "mock",
Location: "memory",
}
}
// GetCalls returns the list of operations that were called
func (m *MockStorer) GetCalls() []string {
m.mu.Lock()
defer m.mu.Unlock()
calls := make([]string, len(m.calls))
copy(calls, m.calls)
return calls
}
// GetStorageSize returns the number of objects in storage
func (m *MockStorer) GetStorageSize() int {
m.mu.Lock()
defer m.mu.Unlock()
return len(m.data)
}
// TestEndToEndBackup tests the full backup workflow with mocked dependencies
func TestEndToEndBackup(t *testing.T) {
// Initialize logger
log.Initialize(log.Config{})
// Create in-memory filesystem
fs := afero.NewMemMapFs()
// Create test directory structure and files
testFiles := map[string]string{
"/home/user/documents/file1.txt": "This is file 1 content",
"/home/user/documents/file2.txt": "This is file 2 content with more data",
"/home/user/pictures/photo1.jpg": "Binary photo data here...",
"/home/user/code/main.go": "package main\n\nfunc main() {\n\tprintln(\"Hello, World!\")\n}",
}
// Create all directories first
dirs := []string{
"/home/user/documents",
"/home/user/pictures",
"/home/user/code",
}
for _, dir := range dirs {
if err := fs.MkdirAll(dir, 0755); err != nil {
t.Fatalf("failed to create directory %s: %v", dir, err)
}
}
// Create test files
for path, content := range testFiles {
if err := afero.WriteFile(fs, path, []byte(content), 0644); err != nil {
t.Fatalf("failed to create test file %s: %v", path, err)
}
}
// Create mock storage
mockStorage := NewMockStorer()
// Create test configuration
cfg := &config.Config{
Snapshots: map[string]config.SnapshotConfig{
"test": {
Paths: []string{"/home/user"},
},
},
Exclude: []string{"*.tmp", "*.log"},
ChunkSize: config.Size(16 * 1024), // 16KB chunks
BlobSizeLimit: config.Size(100 * 1024), // 100KB blobs
CompressionLevel: 3,
AgeRecipients: []string{"age1ezrjmfpwsc95svdg0y54mums3zevgzu0x0ecq2f7tp8a05gl0sjq9q9wjg"}, // Test public key
AgeSecretKey: "AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5", // Test private key
S3: config.S3Config{
Endpoint: "http://localhost:9000", // MinIO endpoint for testing
Region: "us-east-1",
Bucket: "test-bucket",
AccessKeyID: "test-access",
SecretAccessKey: "test-secret",
},
IndexPath: ":memory:", // In-memory SQLite database
}
// For a true end-to-end test, we'll create a simpler test that focuses on
// the core backup logic using the scanner directly with our mock storage
ctx := context.Background()
// Create in-memory database
db, err := database.New(ctx, ":memory:")
require.NoError(t, err)
defer func() {
if err := db.Close(); err != nil {
t.Errorf("failed to close database: %v", err)
}
}()
repos := database.NewRepositories(db)
// Create scanner with mock storage
scanner := snapshot.NewScanner(snapshot.ScannerConfig{
FS: fs,
ChunkSize: cfg.ChunkSize.Int64(),
Repositories: repos,
Storage: mockStorage,
MaxBlobSize: cfg.BlobSizeLimit.Int64(),
CompressionLevel: cfg.CompressionLevel,
AgeRecipients: cfg.AgeRecipients,
EnableProgress: false,
})
// Create a snapshot record
snapshotID := "test-snapshot-001"
err = repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snapshot := &database.Snapshot{
ID: types.SnapshotID(snapshotID),
Hostname: "test-host",
VaultikVersion: "test-version",
StartedAt: time.Now(),
}
return repos.Snapshots.Create(ctx, tx, snapshot)
})
require.NoError(t, err)
// Run the backup scan
result, err := scanner.Scan(ctx, "/home/user", snapshotID)
require.NoError(t, err)
// Verify scan results
// The scanner counts both files and directories, so we have:
// 4 files + 5 directories (/home, /home/user, /home/user/documents, /home/user/pictures, /home/user/code)
assert.GreaterOrEqual(t, result.FilesScanned, 4, "Should scan at least 4 files")
assert.Greater(t, result.BytesScanned, int64(0), "Should scan some bytes")
assert.Greater(t, result.ChunksCreated, 0, "Should create chunks")
assert.Greater(t, result.BlobsCreated, 0, "Should create blobs")
// Verify storage operations
calls := mockStorage.GetCalls()
t.Logf("Storage operations performed: %v", calls)
// Should have uploaded at least one blob
blobUploads := 0
for _, call := range calls {
if len(call) > 4 && call[:4] == "Put:" {
if len(call) > 10 && call[4:10] == "blobs/" {
blobUploads++
}
}
}
assert.Greater(t, blobUploads, 0, "Should upload at least one blob")
// Verify files in database
files, err := repos.Files.ListByPrefix(ctx, "/home/user")
require.NoError(t, err)
// Count only regular files (not directories)
regularFiles := 0
for _, f := range files {
if f.Mode&uint32(os.ModeDir) == 0 { // not a directory (0x80000000 is the os.ModeDir bit)
regularFiles++
}
}
assert.Equal(t, 4, regularFiles, "Should have 4 regular files in database")
// Verify chunks were created by checking a specific file
fileChunks, err := repos.FileChunks.GetByPath(ctx, "/home/user/documents/file1.txt")
require.NoError(t, err)
assert.Greater(t, len(fileChunks), 0, "Should have chunks for file1.txt")
// Verify blobs were uploaded to storage
assert.Greater(t, mockStorage.GetStorageSize(), 0, "Should have blobs in storage")
// Complete the snapshot - just verify we got results
// In a real integration test, we'd update the snapshot record
// Create snapshot manager to test metadata export
snapshotManager := &snapshot.SnapshotManager{}
snapshotManager.SetFilesystem(fs)
// Note: We can't fully test snapshot metadata export without a proper S3 client mock
// that implements all required methods. This would require refactoring the S3 client
// interface to be more testable.
t.Logf("Backup completed successfully:")
t.Logf(" Files scanned: %d", result.FilesScanned)
t.Logf(" Bytes scanned: %d", result.BytesScanned)
t.Logf(" Chunks created: %d", result.ChunksCreated)
t.Logf(" Blobs created: %d", result.BlobsCreated)
t.Logf(" Storage size: %d objects", mockStorage.GetStorageSize())
}
// TestBackupAndVerify tests backing up files and verifying the blobs
func TestBackupAndVerify(t *testing.T) {
// Initialize logger
log.Initialize(log.Config{})
// Create in-memory filesystem
fs := afero.NewMemMapFs()
// Create test files
testContent := "This is a test file with some content that should be backed up"
err := fs.MkdirAll("/data", 0755)
require.NoError(t, err)
err = afero.WriteFile(fs, "/data/test.txt", []byte(testContent), 0644)
require.NoError(t, err)
// Create mock storage
mockStorage := NewMockStorer()
// Create test database
ctx := context.Background()
db, err := database.New(ctx, ":memory:")
require.NoError(t, err)
defer func() {
if err := db.Close(); err != nil {
t.Errorf("failed to close database: %v", err)
}
}()
repos := database.NewRepositories(db)
// Create scanner
scanner := snapshot.NewScanner(snapshot.ScannerConfig{
FS: fs,
ChunkSize: int64(1024 * 16), // 16KB chunks
Repositories: repos,
Storage: mockStorage,
MaxBlobSize: int64(1024 * 1024), // 1MB blobs
CompressionLevel: 3,
AgeRecipients: []string{"age1ezrjmfpwsc95svdg0y54mums3zevgzu0x0ecq2f7tp8a05gl0sjq9q9wjg"}, // Test public key
})
// Create a snapshot
snapshotID := "test-snapshot-001"
err = repos.WithTx(ctx, func(ctx context.Context, tx *sql.Tx) error {
snapshot := &database.Snapshot{
ID: types.SnapshotID(snapshotID),
Hostname: "test-host",
VaultikVersion: "test-version",
StartedAt: time.Now(),
}
return repos.Snapshots.Create(ctx, tx, snapshot)
})
require.NoError(t, err)
// Run the backup
result, err := scanner.Scan(ctx, "/data", snapshotID)
require.NoError(t, err)
// Verify backup created blobs
assert.Greater(t, result.BlobsCreated, 0, "Should create at least one blob")
assert.Equal(t, mockStorage.GetStorageSize(), result.BlobsCreated, "Storage should have the blobs")
// Verify we can retrieve the blob from storage
objects, err := mockStorage.List(ctx, "blobs/")
require.NoError(t, err)
assert.Len(t, objects, result.BlobsCreated, "Should have correct number of blobs in storage")
// Get the first blob and verify it exists
if len(objects) > 0 {
blobKey := objects[0]
t.Logf("Verifying blob: %s", blobKey)
// Get blob info
blobInfo, err := mockStorage.Stat(ctx, blobKey)
require.NoError(t, err)
assert.Greater(t, blobInfo.Size, int64(0), "Blob should have content")
// Get blob content
reader, err := mockStorage.Get(ctx, blobKey)
require.NoError(t, err)
defer func() { _ = reader.Close() }()
// Verify blob data is encrypted (should not contain plaintext)
blobData, err := io.ReadAll(reader)
require.NoError(t, err)
assert.NotContains(t, string(blobData), testContent, "Blob should be encrypted")
assert.Greater(t, len(blobData), 0, "Blob should have data")
}
t.Logf("Backup and verify test completed successfully")
}
// TestBackupAndRestore tests the full backup and restore workflow
// This test verifies that the restore code correctly handles the binary SQLite
// database format that is exported by the snapshot manager.
func TestBackupAndRestore(t *testing.T) {
// Initialize logger
log.Initialize(log.Config{})
// Create real temp directory for the database (SQLite needs real filesystem)
realTempDir, err := os.MkdirTemp("", "vaultik-test-")
require.NoError(t, err)
defer func() { _ = os.RemoveAll(realTempDir) }()
// Use real OS filesystem for this test
fs := afero.NewOsFs()
// Create test directory structure and files
dataDir := filepath.Join(realTempDir, "data")
testFiles := map[string]string{
filepath.Join(dataDir, "file1.txt"): "This is file 1 content",
filepath.Join(dataDir, "file2.txt"): "This is file 2 content with more data",
filepath.Join(dataDir, "subdir", "file3.txt"): "This is file 3 in a subdirectory",
}
// Create directories and files
for path, content := range testFiles {
dir := filepath.Dir(path)
if err := fs.MkdirAll(dir, 0755); err != nil {
t.Fatalf("failed to create directory %s: %v", dir, err)
}
if err := afero.WriteFile(fs, path, []byte(content), 0644); err != nil {
t.Fatalf("failed to create test file %s: %v", path, err)
}
}
ctx := context.Background()
// Create mock storage
mockStorage := NewMockStorer()
// Test keypair
agePublicKey := "age1ezrjmfpwsc95svdg0y54mums3zevgzu0x0ecq2f7tp8a05gl0sjq9q9wjg"
ageSecretKey := "AGE-SECRET-KEY-19CR5YSFW59HM4TLD6GXVEDMZFTVVF7PPHKUT68TXSFPK7APHXA2QS2NJA5"
// Create database file
dbPath := filepath.Join(realTempDir, "test.db")
db, err := database.New(ctx, dbPath)
require.NoError(t, err)
defer func() { _ = db.Close() }()
repos := database.NewRepositories(db)
// Create config for snapshot manager
cfg := &config.Config{
AgeSecretKey: ageSecretKey,
AgeRecipients: []string{agePublicKey},
CompressionLevel: 3,
}
// Create snapshot manager
sm := snapshot.NewSnapshotManager(snapshot.SnapshotManagerParams{
Repos: repos,
Storage: mockStorage,
Config: cfg,
})
sm.SetFilesystem(fs)
// Create scanner
scanner := snapshot.NewScanner(snapshot.ScannerConfig{
FS: fs,
Storage: mockStorage,
ChunkSize: int64(16 * 1024),
MaxBlobSize: int64(100 * 1024),
CompressionLevel: 3,
AgeRecipients: []string{agePublicKey},
Repositories: repos,
})
// Create a snapshot
snapshotID, err := sm.CreateSnapshot(ctx, "test-host", "test-version", "test-git")
require.NoError(t, err)
t.Logf("Created snapshot: %s", snapshotID)
// Run the backup (scan)
result, err := scanner.Scan(ctx, dataDir, snapshotID)
require.NoError(t, err)
t.Logf("Scan complete: %d files, %d blobs", result.FilesScanned, result.BlobsCreated)
// Complete the snapshot
err = sm.CompleteSnapshot(ctx, snapshotID)
require.NoError(t, err)
// Export snapshot metadata (this uploads db.zst.age and manifest.json.zst)
err = sm.ExportSnapshotMetadata(ctx, dbPath, snapshotID)
require.NoError(t, err)
t.Logf("Exported snapshot metadata")
// Verify metadata was uploaded
keys, err := mockStorage.List(ctx, "metadata/")
require.NoError(t, err)
t.Logf("Metadata keys: %v", keys)
assert.GreaterOrEqual(t, len(keys), 2, "Should have at least db.zst.age and manifest.json.zst")
// Close the source database
err = db.Close()
require.NoError(t, err)
// Create Vaultik instance for restore
vaultikApp := &vaultik.Vaultik{
Config: cfg,
Storage: mockStorage,
Fs: fs,
Stdout: io.Discard,
Stderr: io.Discard,
}
vaultikApp.SetContext(ctx)
// Try to restore - this should work with binary SQLite format
restoreDir := filepath.Join(realTempDir, "restored")
err = vaultikApp.Restore(&vaultik.RestoreOptions{
SnapshotID: snapshotID,
TargetDir: restoreDir,
})
require.NoError(t, err, "Restore should succeed with binary SQLite database format")
// Verify restored files match originals
for origPath, expectedContent := range testFiles {
restoredPath := filepath.Join(restoreDir, origPath)
restoredContent, err := afero.ReadFile(fs, restoredPath)
require.NoError(t, err, "Should be able to read restored file: %s", restoredPath)
assert.Equal(t, expectedContent, string(restoredContent), "Restored content should match original for: %s", origPath)
}
t.Log("Backup and restore test completed successfully")
}
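
These tests rely only on the in-process mocks above, so they should run without MinIO or network access; assuming the file lives under internal/vaultik, something like:

go test ./internal/vaultik/ -run 'TestEndToEndBackup|TestBackupAndVerify|TestBackupAndRestore' -v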

internal/vaultik/prune.go Normal file

@@ -0,0 +1,204 @@
package vaultik
import (
"encoding/json"
"fmt"
"os"
"strings"
"git.eeqj.de/sneak/vaultik/internal/log"
"github.com/dustin/go-humanize"
)
// PruneOptions contains options for the prune command
type PruneOptions struct {
Force bool
JSON bool
}
// PruneBlobsResult contains the result of a blob prune operation
type PruneBlobsResult struct {
BlobsFound int `json:"blobs_found"`
BlobsDeleted int `json:"blobs_deleted"`
BlobsFailed int `json:"blobs_failed,omitempty"`
BytesFreed int64 `json:"bytes_freed"`
}
// PruneBlobs removes unreferenced blobs from storage
func (v *Vaultik) PruneBlobs(opts *PruneOptions) error {
log.Info("Starting prune operation")
// Get all remote snapshots and their manifests
allBlobsReferenced := make(map[string]bool)
manifestCount := 0
// List all snapshots in storage
log.Info("Listing remote snapshots")
objectCh := v.Storage.ListStream(v.ctx, "metadata/")
var snapshotIDs []string
seenSnapshots := make(map[string]bool)
for object := range objectCh {
if object.Err != nil {
return fmt.Errorf("listing remote snapshots: %w", object.Err)
}
// Extract snapshot ID from keys like metadata/<snapshot-id>/manifest.json.zst
parts := strings.Split(object.Key, "/")
if len(parts) >= 2 && parts[0] == "metadata" && parts[1] != "" {
// Check if this is a directory by looking for trailing slash
if strings.HasSuffix(object.Key, "/") || strings.Contains(object.Key, "/manifest.json.zst") {
snapshotID := parts[1]
// Only add unique snapshot IDs
if !seenSnapshots[snapshotID] {
seenSnapshots[snapshotID] = true
snapshotIDs = append(snapshotIDs, snapshotID)
}
}
}
}
log.Info("Found manifests in remote storage", "count", len(snapshotIDs))
// Download and parse each manifest to get referenced blobs
for _, snapshotID := range snapshotIDs {
log.Debug("Processing manifest", "snapshot_id", snapshotID)
manifest, err := v.downloadManifest(snapshotID)
if err != nil {
log.Error("Failed to download manifest", "snapshot_id", snapshotID, "error", err)
continue
}
// Add all blobs from this manifest to our referenced set
for _, blob := range manifest.Blobs {
allBlobsReferenced[blob.Hash] = true
}
manifestCount++
}
log.Info("Processed manifests", "count", manifestCount, "unique_blobs_referenced", len(allBlobsReferenced))
// List all blobs in storage
log.Info("Listing all blobs in storage")
allBlobs := make(map[string]int64) // hash -> size
blobObjectCh := v.Storage.ListStream(v.ctx, "blobs/")
for object := range blobObjectCh {
if object.Err != nil {
return fmt.Errorf("listing blobs: %w", object.Err)
}
// Extract hash from path like blobs/ab/cd/abcdef123456...
parts := strings.Split(object.Key, "/")
if len(parts) == 4 && parts[0] == "blobs" {
hash := parts[3]
allBlobs[hash] = object.Size
}
}
log.Info("Found blobs in storage", "count", len(allBlobs))
// Find unreferenced blobs
var unreferencedBlobs []string
var totalSize int64
for hash, size := range allBlobs {
if !allBlobsReferenced[hash] {
unreferencedBlobs = append(unreferencedBlobs, hash)
totalSize += size
}
}
result := &PruneBlobsResult{
BlobsFound: len(unreferencedBlobs),
}
if len(unreferencedBlobs) == 0 {
log.Info("No unreferenced blobs found")
if opts.JSON {
return outputPruneBlobsJSON(result)
}
fmt.Println("No unreferenced blobs to remove.")
return nil
}
// Show what will be deleted
log.Info("Found unreferenced blobs", "count", len(unreferencedBlobs), "total_size", humanize.Bytes(uint64(totalSize)))
if !opts.JSON {
fmt.Printf("Found %d unreferenced blob(s) totaling %s\n", len(unreferencedBlobs), humanize.Bytes(uint64(totalSize)))
}
// Confirm unless --force is used. JSON mode is non-interactive, so it
// requires --force instead of prompting.
if opts.JSON && !opts.Force {
return fmt.Errorf("refusing to delete %d unreferenced blob(s): --json mode requires --force", len(unreferencedBlobs))
}
if !opts.Force {
fmt.Printf("\nDelete %d unreferenced blob(s)? [y/N] ", len(unreferencedBlobs))
var confirm string
if _, err := fmt.Scanln(&confirm); err != nil {
// Treat EOF or error as "no"
fmt.Println("Cancelled")
return nil
}
if strings.ToLower(confirm) != "y" {
fmt.Println("Cancelled")
return nil
}
}
// Delete unreferenced blobs
log.Info("Deleting unreferenced blobs")
deletedCount := 0
deletedSize := int64(0)
for i, hash := range unreferencedBlobs {
blobPath := fmt.Sprintf("blobs/%s/%s/%s", hash[:2], hash[2:4], hash)
if err := v.Storage.Delete(v.ctx, blobPath); err != nil {
log.Error("Failed to delete blob", "hash", hash, "error", err)
continue
}
deletedCount++
deletedSize += allBlobs[hash]
// Progress update every 100 blobs
if (i+1)%100 == 0 || i == len(unreferencedBlobs)-1 {
log.Info("Deletion progress",
"deleted", i+1,
"total", len(unreferencedBlobs),
"percent", fmt.Sprintf("%.1f%%", float64(i+1)/float64(len(unreferencedBlobs))*100),
)
}
}
result.BlobsDeleted = deletedCount
result.BlobsFailed = len(unreferencedBlobs) - deletedCount
result.BytesFreed = deletedSize
log.Info("Prune complete",
"deleted_count", deletedCount,
"deleted_size", humanize.Bytes(uint64(deletedSize)),
"failed", len(unreferencedBlobs)-deletedCount,
)
if opts.JSON {
return outputPruneBlobsJSON(result)
}
fmt.Printf("\nDeleted %d blob(s) totaling %s\n", deletedCount, humanize.Bytes(uint64(deletedSize)))
if deletedCount < len(unreferencedBlobs) {
fmt.Printf("Failed to delete %d blob(s)\n", len(unreferencedBlobs)-deletedCount)
}
return nil
}
// outputPruneBlobsJSON outputs the prune result as JSON
func outputPruneBlobsJSON(result *PruneBlobsResult) error {
encoder := json.NewEncoder(os.Stdout)
encoder.SetIndent("", " ")
return encoder.Encode(result)
}
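
Given the confirmation rules above, a non-interactive caller must pass Force; a minimal invocation sketch from outside the package (v is a *vaultik.Vaultik):

// Sketch: unattended prune with machine-readable output.
err := v.PruneBlobs(&vaultik.PruneOptions{
	Force: true, // JSON mode never prompts, so Force is required
	JSON:  true,
})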

internal/vaultik/restore.go Normal file

@@ -0,0 +1,632 @@
package vaultik
import (
"bytes"
"context"
"crypto/sha256"
"encoding/hex"
"fmt"
"io"
"os"
"path/filepath"
"time"
"filippo.io/age"
"git.eeqj.de/sneak/vaultik/internal/blobgen"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/types"
"github.com/dustin/go-humanize"
"github.com/schollz/progressbar/v3"
"github.com/spf13/afero"
"golang.org/x/term"
)
// RestoreOptions contains options for the restore operation
type RestoreOptions struct {
SnapshotID string
TargetDir string
Paths []string // Optional paths to restore (empty = all)
Verify bool // Verify restored files by checking chunk hashes
}
// RestoreResult contains statistics from a restore operation
type RestoreResult struct {
FilesRestored int
BytesRestored int64
BlobsDownloaded int
BytesDownloaded int64
Duration time.Duration
// Verification results (only populated if Verify option is set)
FilesVerified int
BytesVerified int64
FilesFailed int
FailedFiles []string // Paths of files that failed verification
}
// Restore restores files from a snapshot to the target directory
func (v *Vaultik) Restore(opts *RestoreOptions) error {
startTime := time.Now()
// Check for age_secret_key
if v.Config.AgeSecretKey == "" {
return fmt.Errorf("decryption key required for restore\n\nSet the VAULTIK_AGE_SECRET_KEY environment variable to your age private key:\n export VAULTIK_AGE_SECRET_KEY='AGE-SECRET-KEY-...'")
}
// Parse the age identity
identity, err := age.ParseX25519Identity(v.Config.AgeSecretKey)
if err != nil {
return fmt.Errorf("parsing age secret key: %w", err)
}
log.Info("Starting restore operation",
"snapshot_id", opts.SnapshotID,
"target_dir", opts.TargetDir,
"paths", opts.Paths,
)
// Step 1: Download and decrypt the snapshot metadata database
log.Info("Downloading snapshot metadata...")
tempDB, err := v.downloadSnapshotDB(opts.SnapshotID, identity)
if err != nil {
return fmt.Errorf("downloading snapshot database: %w", err)
}
defer func() {
if err := tempDB.Close(); err != nil {
log.Debug("Failed to close temp database", "error", err)
}
// Clean up temp file
if err := v.Fs.Remove(tempDB.Path()); err != nil {
log.Debug("Failed to remove temp database", "error", err)
}
}()
repos := database.NewRepositories(tempDB)
// Step 2: Get list of files to restore
files, err := v.getFilesToRestore(v.ctx, repos, opts.Paths)
if err != nil {
return fmt.Errorf("getting files to restore: %w", err)
}
if len(files) == 0 {
log.Warn("No files found to restore")
return nil
}
log.Info("Found files to restore", "count", len(files))
// Step 3: Create target directory
if err := v.Fs.MkdirAll(opts.TargetDir, 0755); err != nil {
return fmt.Errorf("creating target directory: %w", err)
}
// Step 4: Build a map of chunks to blobs for efficient restoration
chunkToBlobMap, err := v.buildChunkToBlobMap(v.ctx, repos)
if err != nil {
return fmt.Errorf("building chunk-to-blob map: %w", err)
}
// Step 5: Restore files
result := &RestoreResult{}
blobCache := make(map[string][]byte) // Cache downloaded and decrypted blobs
for i, file := range files {
if v.ctx.Err() != nil {
return v.ctx.Err()
}
if err := v.restoreFile(v.ctx, repos, file, opts.TargetDir, identity, chunkToBlobMap, blobCache, result); err != nil {
log.Error("Failed to restore file", "path", file.Path, "error", err)
// Continue with other files
continue
}
// Progress logging
if (i+1)%100 == 0 || i+1 == len(files) {
log.Info("Restore progress",
"files", fmt.Sprintf("%d/%d", i+1, len(files)),
"bytes", humanize.Bytes(uint64(result.BytesRestored)),
)
}
}
result.Duration = time.Since(startTime)
log.Info("Restore complete",
"files_restored", result.FilesRestored,
"bytes_restored", humanize.Bytes(uint64(result.BytesRestored)),
"blobs_downloaded", result.BlobsDownloaded,
"bytes_downloaded", humanize.Bytes(uint64(result.BytesDownloaded)),
"duration", result.Duration,
)
_, _ = fmt.Fprintf(v.Stdout, "Restored %d files (%s) in %s\n",
result.FilesRestored,
humanize.Bytes(uint64(result.BytesRestored)),
result.Duration.Round(time.Second),
)
// Run verification if requested
if opts.Verify {
if err := v.verifyRestoredFiles(v.ctx, repos, files, opts.TargetDir, result); err != nil {
return fmt.Errorf("verification failed: %w", err)
}
if result.FilesFailed > 0 {
_, _ = fmt.Fprintf(v.Stdout, "\nVerification FAILED: %d files did not match expected checksums\n", result.FilesFailed)
for _, path := range result.FailedFiles {
_, _ = fmt.Fprintf(v.Stdout, " - %s\n", path)
}
return fmt.Errorf("%d files failed verification", result.FilesFailed)
}
_, _ = fmt.Fprintf(v.Stdout, "Verified %d files (%s)\n",
result.FilesVerified,
humanize.Bytes(uint64(result.BytesVerified)),
)
}
return nil
}
// downloadSnapshotDB downloads and decrypts the snapshot metadata database
func (v *Vaultik) downloadSnapshotDB(snapshotID string, identity age.Identity) (*database.DB, error) {
// Download encrypted database from storage
dbKey := fmt.Sprintf("metadata/%s/db.zst.age", snapshotID)
reader, err := v.Storage.Get(v.ctx, dbKey)
if err != nil {
return nil, fmt.Errorf("downloading %s: %w", dbKey, err)
}
defer func() { _ = reader.Close() }()
// Read all data
encryptedData, err := io.ReadAll(reader)
if err != nil {
return nil, fmt.Errorf("reading encrypted data: %w", err)
}
log.Debug("Downloaded encrypted database", "size", humanize.Bytes(uint64(len(encryptedData))))
// Decrypt and decompress using blobgen.Reader
blobReader, err := blobgen.NewReader(bytes.NewReader(encryptedData), identity)
if err != nil {
return nil, fmt.Errorf("creating decryption reader: %w", err)
}
defer func() { _ = blobReader.Close() }()
// Read the binary SQLite database
dbData, err := io.ReadAll(blobReader)
if err != nil {
return nil, fmt.Errorf("decrypting and decompressing: %w", err)
}
log.Debug("Decrypted database", "size", humanize.Bytes(uint64(len(dbData))))
// Create a temporary database file and write the binary SQLite data directly
tempFile, err := afero.TempFile(v.Fs, "", "vaultik-restore-*.db")
if err != nil {
return nil, fmt.Errorf("creating temp file: %w", err)
}
tempPath := tempFile.Name()
// Write the binary SQLite database directly
if _, err := tempFile.Write(dbData); err != nil {
_ = tempFile.Close()
_ = v.Fs.Remove(tempPath)
return nil, fmt.Errorf("writing database file: %w", err)
}
if err := tempFile.Close(); err != nil {
_ = v.Fs.Remove(tempPath)
return nil, fmt.Errorf("closing temp file: %w", err)
}
log.Debug("Created restore database", "path", tempPath)
// Open the database
db, err := database.New(v.ctx, tempPath)
if err != nil {
return nil, fmt.Errorf("opening restore database: %w", err)
}
return db, nil
}
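The identity argument is a filippo.io/age identity. A minimal sketch of deriving one from the configured secret key, assuming a standard AGE-SECRET-KEY-1... string such as age-keygen produces (VAULTIK_AGE_SECRET_KEY is the variable referenced later in verify.go):

id, err := age.ParseX25519Identity(os.Getenv("VAULTIK_AGE_SECRET_KEY"))
if err != nil {
	return fmt.Errorf("parsing age identity: %w", err)
}
db, err := v.downloadSnapshotDB(snapshotID, id)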
// getFilesToRestore returns the list of files to restore based on path filters
func (v *Vaultik) getFilesToRestore(ctx context.Context, repos *database.Repositories, pathFilters []string) ([]*database.File, error) {
// If no filters, get all files
if len(pathFilters) == 0 {
return repos.Files.ListAll(ctx)
}
// Get files matching the path filters
var result []*database.File
seen := make(map[string]bool)
for _, filter := range pathFilters {
// Normalize the filter path
filter = filepath.Clean(filter)
// Get files with this prefix
files, err := repos.Files.ListByPrefix(ctx, filter)
if err != nil {
return nil, fmt.Errorf("listing files with prefix %s: %w", filter, err)
}
for _, file := range files {
if !seen[file.ID.String()] {
seen[file.ID.String()] = true
result = append(result, file)
}
}
}
return result, nil
}
// buildChunkToBlobMap creates a mapping from chunk hash to blob information
func (v *Vaultik) buildChunkToBlobMap(ctx context.Context, repos *database.Repositories) (map[string]*database.BlobChunk, error) {
// Query all blob_chunks
query := `SELECT blob_id, chunk_hash, offset, length FROM blob_chunks`
rows, err := repos.DB().Conn().QueryContext(ctx, query)
if err != nil {
return nil, fmt.Errorf("querying blob_chunks: %w", err)
}
defer func() { _ = rows.Close() }()
result := make(map[string]*database.BlobChunk)
for rows.Next() {
var bc database.BlobChunk
var blobIDStr, chunkHashStr string
if err := rows.Scan(&blobIDStr, &chunkHashStr, &bc.Offset, &bc.Length); err != nil {
return nil, fmt.Errorf("scanning blob_chunk: %w", err)
}
blobID, err := types.ParseBlobID(blobIDStr)
if err != nil {
return nil, fmt.Errorf("parsing blob ID: %w", err)
}
bc.BlobID = blobID
bc.ChunkHash = types.ChunkHash(chunkHashStr)
result[chunkHashStr] = &bc
}
return result, rows.Err()
}
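Loading the whole blob_chunks table up front trades memory for speed: each chunk lookup during restore becomes a single map access instead of a per-chunk SQL query. The map holds one small struct per unique chunk, so it stays far smaller than the data being restored.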
// restoreFile restores a single file
func (v *Vaultik) restoreFile(
ctx context.Context,
repos *database.Repositories,
file *database.File,
targetDir string,
identity age.Identity,
chunkToBlobMap map[string]*database.BlobChunk,
blobCache map[string][]byte,
result *RestoreResult,
) error {
// Calculate target path - use full original path under target directory
targetPath := filepath.Join(targetDir, file.Path.String())
// Create parent directories
parentDir := filepath.Dir(targetPath)
if err := v.Fs.MkdirAll(parentDir, 0755); err != nil {
return fmt.Errorf("creating parent directory: %w", err)
}
// Handle symlinks
if file.IsSymlink() {
return v.restoreSymlink(file, targetPath, result)
}
// Handle directories
if file.Mode&uint32(os.ModeDir) != 0 {
return v.restoreDirectory(file, targetPath, result)
}
// Handle regular files
return v.restoreRegularFile(ctx, repos, file, targetPath, identity, chunkToBlobMap, blobCache, result)
}
// restoreSymlink restores a symbolic link
func (v *Vaultik) restoreSymlink(file *database.File, targetPath string, result *RestoreResult) error {
// Remove existing file if it exists
_ = v.Fs.Remove(targetPath)
// Create symlink
// Note: afero.MemMapFs doesn't support symlinks, so we use os for real filesystems
if _, ok := v.Fs.(*afero.OsFs); ok {
if err := os.Symlink(file.LinkTarget.String(), targetPath); err != nil {
return fmt.Errorf("creating symlink: %w", err)
}
} else {
log.Debug("Symlink creation not supported on this filesystem", "path", file.Path, "target", file.LinkTarget)
}
result.FilesRestored++
log.Debug("Restored symlink", "path", file.Path, "target", file.LinkTarget)
return nil
}
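An alternative to the *afero.OsFs type assertion is afero's optional Linker interface, which OsFs implements; a sketch, assuming an afero version that exposes SymlinkIfPossible:

if linker, ok := v.Fs.(afero.Linker); ok {
	if err := linker.SymlinkIfPossible(file.LinkTarget.String(), targetPath); err != nil {
		return fmt.Errorf("creating symlink: %w", err)
	}
} else {
	log.Debug("Symlink creation not supported on this filesystem", "path", file.Path)
}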
// restoreDirectory restores a directory with proper permissions
func (v *Vaultik) restoreDirectory(file *database.File, targetPath string, result *RestoreResult) error {
// Create directory
if err := v.Fs.MkdirAll(targetPath, os.FileMode(file.Mode)); err != nil {
return fmt.Errorf("creating directory: %w", err)
}
// Set permissions
if err := v.Fs.Chmod(targetPath, os.FileMode(file.Mode)); err != nil {
log.Debug("Failed to set directory permissions", "path", targetPath, "error", err)
}
// Set ownership (requires root)
if _, ok := v.Fs.(*afero.OsFs); ok {
if err := os.Chown(targetPath, int(file.UID), int(file.GID)); err != nil {
log.Debug("Failed to set directory ownership", "path", targetPath, "error", err)
}
}
// Set mtime
if err := v.Fs.Chtimes(targetPath, file.MTime, file.MTime); err != nil {
log.Debug("Failed to set directory mtime", "path", targetPath, "error", err)
}
result.FilesRestored++
return nil
}
// restoreRegularFile restores a regular file by reconstructing it from chunks
func (v *Vaultik) restoreRegularFile(
ctx context.Context,
repos *database.Repositories,
file *database.File,
targetPath string,
identity age.Identity,
chunkToBlobMap map[string]*database.BlobChunk,
blobCache map[string][]byte,
result *RestoreResult,
) error {
// Get file chunks in order
fileChunks, err := repos.FileChunks.GetByFileID(ctx, file.ID)
if err != nil {
return fmt.Errorf("getting file chunks: %w", err)
}
// Create output file
outFile, err := v.Fs.Create(targetPath)
if err != nil {
return fmt.Errorf("creating output file: %w", err)
}
defer func() { _ = outFile.Close() }() // safety net for early returns; the explicit Close below checks the error
// Write chunks in order
var bytesWritten int64
for _, fc := range fileChunks {
// Find which blob contains this chunk
chunkHashStr := fc.ChunkHash.String()
blobChunk, ok := chunkToBlobMap[chunkHashStr]
if !ok {
return fmt.Errorf("chunk %s not found in any blob", chunkHashStr[:16])
}
// Get the blob's hash from the database
blob, err := repos.Blobs.GetByID(ctx, blobChunk.BlobID.String())
if err != nil {
return fmt.Errorf("getting blob %s: %w", blobChunk.BlobID, err)
}
// Download and decrypt blob if not cached
blobHashStr := blob.Hash.String()
blobData, ok := blobCache[blobHashStr]
if !ok {
blobData, err = v.downloadBlob(ctx, blobHashStr, blob.CompressedSize, identity)
if err != nil {
return fmt.Errorf("downloading blob %s: %w", blobHashStr[:16], err)
}
blobCache[blobHashStr] = blobData
result.BlobsDownloaded++
result.BytesDownloaded += blob.CompressedSize
}
// Extract chunk from blob
if blobChunk.Offset+blobChunk.Length > int64(len(blobData)) {
return fmt.Errorf("chunk %s extends beyond blob data (offset=%d, length=%d, blob_size=%d)",
fc.ChunkHash[:16], blobChunk.Offset, blobChunk.Length, len(blobData))
}
chunkData := blobData[blobChunk.Offset : blobChunk.Offset+blobChunk.Length]
// Write chunk to output file
n, err := outFile.Write(chunkData)
if err != nil {
return fmt.Errorf("writing chunk: %w", err)
}
bytesWritten += int64(n)
}
// Close file before setting metadata
if err := outFile.Close(); err != nil {
return fmt.Errorf("closing output file: %w", err)
}
// Set permissions
if err := v.Fs.Chmod(targetPath, os.FileMode(file.Mode)); err != nil {
log.Debug("Failed to set file permissions", "path", targetPath, "error", err)
}
// Set ownership (requires root)
if _, ok := v.Fs.(*afero.OsFs); ok {
if err := os.Chown(targetPath, int(file.UID), int(file.GID)); err != nil {
log.Debug("Failed to set file ownership", "path", targetPath, "error", err)
}
}
// Set mtime
if err := v.Fs.Chtimes(targetPath, file.MTime, file.MTime); err != nil {
log.Debug("Failed to set file mtime", "path", targetPath, "error", err)
}
result.FilesRestored++
result.BytesRestored += bytesWritten
log.Debug("Restored file", "path", file.Path, "size", humanize.Bytes(uint64(bytesWritten)))
return nil
}
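Note that blobCache grows without bound for the life of a restore; for snapshots larger than available memory, a size-capped cache would be safer. A minimal sketch with a hypothetical helper (not part of the codebase; eviction order here is arbitrary map order, not LRU):

// evictIfOver drops cached blobs until the cache fits under maxBytes.
func evictIfOver(cache map[string][]byte, maxBytes int64) {
	var total int64
	for _, b := range cache {
		total += int64(len(b))
	}
	for hash, b := range cache {
		if total <= maxBytes {
			break
		}
		total -= int64(len(b))
		delete(cache, hash)
	}
}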
// downloadBlob downloads and decrypts a blob
func (v *Vaultik) downloadBlob(ctx context.Context, blobHash string, expectedSize int64, identity age.Identity) ([]byte, error) {
result, err := v.FetchAndDecryptBlob(ctx, blobHash, expectedSize, identity)
if err != nil {
return nil, err
}
return result.Data, nil
}
// verifyRestoredFiles verifies that all restored files match their expected chunk hashes
func (v *Vaultik) verifyRestoredFiles(
ctx context.Context,
repos *database.Repositories,
files []*database.File,
targetDir string,
result *RestoreResult,
) error {
// Calculate total bytes to verify for progress bar
var totalBytes int64
regularFiles := make([]*database.File, 0, len(files))
for _, file := range files {
// Skip symlinks and directories - only verify regular files
if file.IsSymlink() || file.Mode&uint32(os.ModeDir) != 0 {
continue
}
regularFiles = append(regularFiles, file)
totalBytes += file.Size
}
if len(regularFiles) == 0 {
log.Info("No regular files to verify")
return nil
}
log.Info("Verifying restored files",
"files", len(regularFiles),
"bytes", humanize.Bytes(uint64(totalBytes)),
)
_, _ = fmt.Fprintf(v.Stdout, "\nVerifying %d files (%s)...\n",
len(regularFiles),
humanize.Bytes(uint64(totalBytes)),
)
// Create progress bar if output is a terminal
var bar *progressbar.ProgressBar
if isTerminal() {
bar = progressbar.NewOptions64(
totalBytes,
progressbar.OptionSetDescription("Verifying"),
progressbar.OptionSetWriter(os.Stderr),
progressbar.OptionShowBytes(true),
progressbar.OptionShowCount(),
progressbar.OptionSetWidth(40),
progressbar.OptionThrottle(100*time.Millisecond),
progressbar.OptionOnCompletion(func() {
fmt.Fprint(os.Stderr, "\n")
}),
progressbar.OptionSetRenderBlankState(true),
)
}
// Verify each file
for _, file := range regularFiles {
if ctx.Err() != nil {
return ctx.Err()
}
targetPath := filepath.Join(targetDir, file.Path.String())
bytesVerified, err := v.verifyFile(ctx, repos, file, targetPath)
if err != nil {
log.Error("File verification failed", "path", file.Path, "error", err)
result.FilesFailed++
result.FailedFiles = append(result.FailedFiles, file.Path.String())
} else {
result.FilesVerified++
result.BytesVerified += bytesVerified
}
// Update progress bar
if bar != nil {
_ = bar.Add64(file.Size)
}
}
if bar != nil {
_ = bar.Finish()
}
log.Info("Verification complete",
"files_verified", result.FilesVerified,
"bytes_verified", humanize.Bytes(uint64(result.BytesVerified)),
"files_failed", result.FilesFailed,
)
return nil
}
// verifyFile verifies a single restored file by checking its chunk hashes
func (v *Vaultik) verifyFile(
ctx context.Context,
repos *database.Repositories,
file *database.File,
targetPath string,
) (int64, error) {
// Get file chunks in order
fileChunks, err := repos.FileChunks.GetByFileID(ctx, file.ID)
if err != nil {
return 0, fmt.Errorf("getting file chunks: %w", err)
}
// Open the restored file
f, err := v.Fs.Open(targetPath)
if err != nil {
return 0, fmt.Errorf("opening file: %w", err)
}
defer func() { _ = f.Close() }()
// Verify each chunk
var bytesVerified int64
for _, fc := range fileChunks {
// Get chunk size from database
chunk, err := repos.Chunks.GetByHash(ctx, fc.ChunkHash.String())
if err != nil {
return bytesVerified, fmt.Errorf("getting chunk %s: %w", fc.ChunkHash.String()[:16], err)
}
// Read chunk data from file
chunkData := make([]byte, chunk.Size)
n, err := io.ReadFull(f, chunkData)
if err != nil {
return bytesVerified, fmt.Errorf("reading chunk data: %w", err)
}
// io.ReadFull returns an error on short reads, so n == chunk.Size here
// Calculate hash and compare
hash := sha256.Sum256(chunkData)
actualHash := hex.EncodeToString(hash[:])
expectedHash := fc.ChunkHash.String()
if actualHash != expectedHash {
return bytesVerified, fmt.Errorf("chunk %d hash mismatch: expected %s, got %s",
fc.Idx, expectedHash[:16], actualHash[:16])
}
bytesVerified += int64(n)
}
log.Debug("File verified", "path", file.Path, "bytes", bytesVerified, "chunks", len(fileChunks))
return bytesVerified, nil
}
// isTerminal returns true if stdout is a terminal
func isTerminal() bool {
return term.IsTerminal(int(os.Stdout.Fd()))
}

internal/vaultik/snapshot.go Normal file (1151 lines)

File diff suppressed because it is too large

internal/vaultik/vaultik.go Normal file

@@ -0,0 +1,167 @@
package vaultik
import (
"bytes"
"context"
"fmt"
"io"
"os"
"git.eeqj.de/sneak/vaultik/internal/config"
"git.eeqj.de/sneak/vaultik/internal/crypto"
"git.eeqj.de/sneak/vaultik/internal/database"
"git.eeqj.de/sneak/vaultik/internal/globals"
"git.eeqj.de/sneak/vaultik/internal/snapshot"
"git.eeqj.de/sneak/vaultik/internal/storage"
"github.com/spf13/afero"
"go.uber.org/fx"
)
// Vaultik contains all dependencies needed for vaultik operations
type Vaultik struct {
Globals *globals.Globals
Config *config.Config
DB *database.DB
Repositories *database.Repositories
Storage storage.Storer
ScannerFactory snapshot.ScannerFactory
SnapshotManager *snapshot.SnapshotManager
Shutdowner fx.Shutdowner
Fs afero.Fs
// Context management
ctx context.Context
cancel context.CancelFunc
// IO
Stdout io.Writer
Stderr io.Writer
Stdin io.Reader
}
// VaultikParams contains all parameters for New that can be provided by fx
type VaultikParams struct {
fx.In
Globals *globals.Globals
Config *config.Config
DB *database.DB
Repositories *database.Repositories
Storage storage.Storer
ScannerFactory snapshot.ScannerFactory
SnapshotManager *snapshot.SnapshotManager
Shutdowner fx.Shutdowner
Fs afero.Fs `optional:"true"`
}
// New creates a new Vaultik instance with proper context management
// It automatically includes crypto capabilities if age_secret_key is configured
func New(params VaultikParams) *Vaultik {
ctx, cancel := context.WithCancel(context.Background())
// Use provided filesystem or default to OS filesystem
fs := params.Fs
if fs == nil {
fs = afero.NewOsFs()
}
// Set filesystem on SnapshotManager
params.SnapshotManager.SetFilesystem(fs)
return &Vaultik{
Globals: params.Globals,
Config: params.Config,
DB: params.DB,
Repositories: params.Repositories,
Storage: params.Storage,
ScannerFactory: params.ScannerFactory,
SnapshotManager: params.SnapshotManager,
Shutdowner: params.Shutdowner,
Fs: fs,
ctx: ctx,
cancel: cancel,
Stdout: os.Stdout,
Stderr: os.Stderr,
Stdin: os.Stdin,
}
}
// Context returns the Vaultik's context
func (v *Vaultik) Context() context.Context {
return v.ctx
}
// SetContext sets the Vaultik's context (primarily for testing)
func (v *Vaultik) SetContext(ctx context.Context) {
v.ctx = ctx
}
// Cancel cancels the Vaultik's context
func (v *Vaultik) Cancel() {
v.cancel()
}
// CanDecrypt returns true if this Vaultik instance has decryption capabilities
func (v *Vaultik) CanDecrypt() bool {
return v.Config.AgeSecretKey != ""
}
// GetEncryptor creates a new Encryptor instance based on the configured age recipients
// Returns an error if no recipients are configured
func (v *Vaultik) GetEncryptor() (*crypto.Encryptor, error) {
if len(v.Config.AgeRecipients) == 0 {
return nil, fmt.Errorf("no age recipients configured")
}
return crypto.NewEncryptor(v.Config.AgeRecipients)
}
// GetDecryptor creates a new Decryptor instance based on the configured age secret key
// Returns an error if no secret key is configured
func (v *Vaultik) GetDecryptor() (*crypto.Decryptor, error) {
if v.Config.AgeSecretKey == "" {
return nil, fmt.Errorf("no age secret key configured")
}
return crypto.NewDecryptor(v.Config.AgeSecretKey)
}
// GetFilesystem returns the filesystem instance used by Vaultik
func (v *Vaultik) GetFilesystem() afero.Fs {
return v.Fs
}
// Outputf writes formatted output to stdout for user-facing messages.
// This should be used for all non-log user output.
func (v *Vaultik) Outputf(format string, args ...any) {
_, _ = fmt.Fprintf(v.Stdout, format, args...)
}
// TestVaultik wraps a Vaultik with captured stdout/stderr for testing
type TestVaultik struct {
*Vaultik
Stdout *bytes.Buffer
Stderr *bytes.Buffer
Stdin *bytes.Buffer
}
// NewForTesting creates a minimal Vaultik instance for testing purposes.
// Only the Storage field is populated; other fields are nil.
// Returns a TestVaultik that captures stdout/stderr in buffers.
func NewForTesting(storage storage.Storer) *TestVaultik {
ctx, cancel := context.WithCancel(context.Background())
stdout := &bytes.Buffer{}
stderr := &bytes.Buffer{}
stdin := &bytes.Buffer{}
return &TestVaultik{
Vaultik: &Vaultik{
Storage: storage,
ctx: ctx,
cancel: cancel,
Stdout: stdout,
Stderr: stderr,
Stdin: stdin,
},
Stdout: stdout,
Stderr: stderr,
Stdin: stdin,
}
}
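A usage sketch for TestVaultik, exercising Outputf through the captured buffer (Storage may be nil when the code under test never touches it):

func TestOutputf(t *testing.T) {
	tv := vaultik.NewForTesting(nil)
	tv.Outputf("restored %d files\n", 3)
	if got := tv.Stdout.String(); got != "restored 3 files\n" {
		t.Fatalf("unexpected output: %q", got)
	}
}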

internal/vaultik/verify.go Normal file

@@ -0,0 +1,590 @@
package vaultik
import (
"crypto/sha256"
"database/sql"
"encoding/hex"
"fmt"
"io"
"os"
"time"
"git.eeqj.de/sneak/vaultik/internal/log"
"git.eeqj.de/sneak/vaultik/internal/snapshot"
"github.com/dustin/go-humanize"
"github.com/klauspost/compress/zstd"
_ "github.com/mattn/go-sqlite3"
)
// VerifyOptions contains options for the verify command
type VerifyOptions struct {
Deep bool
JSON bool
}
// VerifyResult contains the result of a snapshot verification
type VerifyResult struct {
SnapshotID string `json:"snapshot_id"`
Status string `json:"status"` // "ok" or "failed"
Mode string `json:"mode"` // "shallow" or "deep"
BlobCount int `json:"blob_count"`
TotalSize int64 `json:"total_size"`
Verified int `json:"verified"`
Missing int `json:"missing"`
MissingSize int64 `json:"missing_size,omitempty"`
ErrorMessage string `json:"error,omitempty"`
}
// RunDeepVerify executes deep verification operation
func (v *Vaultik) RunDeepVerify(snapshotID string, opts *VerifyOptions) error {
result := &VerifyResult{
SnapshotID: snapshotID,
Mode: "deep",
}
// Check for decryption capability
if !v.CanDecrypt() {
result.Status = "failed"
result.ErrorMessage = "VAULTIK_AGE_SECRET_KEY environment variable not set - required for deep verification"
if opts.JSON {
return v.outputVerifyJSON(result)
}
return fmt.Errorf("VAULTIK_AGE_SECRET_KEY environment variable not set - required for deep verification")
}
log.Info("Starting snapshot verification",
"snapshot_id", snapshotID,
"mode", "deep",
)
if !opts.JSON {
v.Outputf("Deep verification of snapshot: %s\n\n", snapshotID)
}
// Step 1: Download manifest
manifestPath := fmt.Sprintf("metadata/%s/manifest.json.zst", snapshotID)
log.Info("Downloading manifest", "path", manifestPath)
if !opts.JSON {
v.Outputf("Downloading manifest...\n")
}
manifestReader, err := v.Storage.Get(v.ctx, manifestPath)
if err != nil {
result.Status = "failed"
result.ErrorMessage = fmt.Sprintf("failed to download manifest: %v", err)
if opts.JSON {
return v.outputVerifyJSON(result)
}
return fmt.Errorf("failed to download manifest: %w", err)
}
defer func() { _ = manifestReader.Close() }()
// Decompress manifest
manifest, err := snapshot.DecodeManifest(manifestReader)
if err != nil {
result.Status = "failed"
result.ErrorMessage = fmt.Sprintf("failed to decode manifest: %v", err)
if opts.JSON {
return v.outputVerifyJSON(result)
}
return fmt.Errorf("failed to decode manifest: %w", err)
}
log.Info("Manifest loaded",
"manifest_blob_count", manifest.BlobCount,
"manifest_total_size", humanize.Bytes(uint64(manifest.TotalCompressedSize)),
)
if !opts.JSON {
v.Outputf("Manifest loaded: %d blobs (%s)\n", manifest.BlobCount, humanize.Bytes(uint64(manifest.TotalCompressedSize)))
}
// Step 2: Download and decrypt database (authoritative source)
dbPath := fmt.Sprintf("metadata/%s/db.zst.age", snapshotID)
log.Info("Downloading encrypted database", "path", dbPath)
if !opts.JSON {
v.Outputf("Downloading and decrypting database...\n")
}
dbReader, err := v.Storage.Get(v.ctx, dbPath)
if err != nil {
result.Status = "failed"
result.ErrorMessage = fmt.Sprintf("failed to download database: %v", err)
if opts.JSON {
return v.outputVerifyJSON(result)
}
return fmt.Errorf("failed to download database: %w", err)
}
defer func() { _ = dbReader.Close() }()
// Decrypt and decompress database
tempDB, err := v.decryptAndLoadDatabase(dbReader)
if err != nil {
result.Status = "failed"
result.ErrorMessage = fmt.Sprintf("failed to decrypt database: %v", err)
if opts.JSON {
return v.outputVerifyJSON(result)
}
return fmt.Errorf("failed to decrypt database: %w", err)
}
defer func() {
if tempDB != nil {
_ = tempDB.Close()
}
}()
// Step 3: Get authoritative blob list from database
dbBlobs, err := v.getBlobsFromDatabase(snapshotID, tempDB.DB)
if err != nil {
result.Status = "failed"
result.ErrorMessage = fmt.Sprintf("failed to get blobs from database: %v", err)
if opts.JSON {
return v.outputVerifyJSON(result)
}
return fmt.Errorf("failed to get blobs from database: %w", err)
}
result.BlobCount = len(dbBlobs)
var totalSize int64
for _, blob := range dbBlobs {
totalSize += blob.CompressedSize
}
result.TotalSize = totalSize
log.Info("Database loaded",
"db_blob_count", len(dbBlobs),
"db_total_size", humanize.Bytes(uint64(totalSize)),
)
if !opts.JSON {
v.Outputf("Database loaded: %d blobs (%s)\n", len(dbBlobs), humanize.Bytes(uint64(totalSize)))
v.Outputf("Verifying manifest against database...\n")
}
// Step 4: Verify manifest matches database
if err := v.verifyManifestAgainstDatabase(manifest, dbBlobs); err != nil {
result.Status = "failed"
result.ErrorMessage = err.Error()
if opts.JSON {
return v.outputVerifyJSON(result)
}
return err
}
// Step 5: Verify all blobs exist in S3 (using database as source)
if !opts.JSON {
v.Outputf("Manifest verified.\n")
v.Outputf("Checking blob existence in remote storage...\n")
}
if err := v.verifyBlobExistenceFromDB(dbBlobs); err != nil {
result.Status = "failed"
result.ErrorMessage = err.Error()
if opts.JSON {
return v.outputVerifyJSON(result)
}
return err
}
// Step 6: Deep verification - download and verify blob contents
if !opts.JSON {
v.Outputf("All blobs exist.\n")
v.Outputf("Downloading and verifying blob contents (%d blobs, %s)...\n", len(dbBlobs), humanize.Bytes(uint64(totalSize)))
}
if err := v.performDeepVerificationFromDB(dbBlobs, tempDB.DB, opts); err != nil {
result.Status = "failed"
result.ErrorMessage = err.Error()
if opts.JSON {
return v.outputVerifyJSON(result)
}
return err
}
// Success
result.Status = "ok"
result.Verified = len(dbBlobs)
if opts.JSON {
return v.outputVerifyJSON(result)
}
log.Info("✓ Verification completed successfully",
"snapshot_id", snapshotID,
"mode", "deep",
"blobs_verified", len(dbBlobs),
)
v.Outputf("\n✓ Verification completed successfully\n")
v.Outputf(" Snapshot: %s\n", snapshotID)
v.Outputf(" Blobs verified: %d\n", len(dbBlobs))
v.Outputf(" Total size: %s\n", humanize.Bytes(uint64(totalSize)))
return nil
}
// tempDB wraps sql.DB with cleanup
type tempDB struct {
*sql.DB
tempPath string
}
func (t *tempDB) Close() error {
err := t.DB.Close()
_ = os.Remove(t.tempPath)
return err
}
// decryptAndLoadDatabase decrypts and loads the binary SQLite database from the encrypted stream
func (v *Vaultik) decryptAndLoadDatabase(reader io.ReadCloser) (*tempDB, error) {
// Get decryptor
decryptor, err := v.GetDecryptor()
if err != nil {
return nil, fmt.Errorf("failed to get decryptor: %w", err)
}
// Decrypt the stream
decryptedReader, err := decryptor.DecryptStream(reader)
if err != nil {
return nil, fmt.Errorf("failed to decrypt database: %w", err)
}
// Decompress the binary database
decompressor, err := zstd.NewReader(decryptedReader)
if err != nil {
return nil, fmt.Errorf("failed to create decompressor: %w", err)
}
defer decompressor.Close()
// Create temporary file for the database
tempFile, err := os.CreateTemp("", "vaultik-verify-*.db")
if err != nil {
return nil, fmt.Errorf("failed to create temp file: %w", err)
}
tempPath := tempFile.Name()
// Stream decompress directly to file
log.Info("Decompressing database...")
written, err := io.Copy(tempFile, decompressor)
if err != nil {
_ = tempFile.Close()
_ = os.Remove(tempPath)
return nil, fmt.Errorf("failed to decompress database: %w", err)
}
_ = tempFile.Close()
log.Info("Database decompressed", "size", humanize.Bytes(uint64(written)))
// Open the database
db, err := sql.Open("sqlite3", tempPath)
if err != nil {
_ = os.Remove(tempPath)
return nil, fmt.Errorf("failed to open database: %w", err)
}
return &tempDB{
DB: db,
tempPath: tempPath,
}, nil
}
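Unlike the downloadSnapshotDB method above, which buffers the entire encrypted database in memory before decrypting, this path streams the decompressed bytes straight to the temp file via io.Copy, keeping peak memory roughly constant regardless of database size.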
// verifyBlob downloads and verifies a single blob
func (v *Vaultik) verifyBlob(blobInfo snapshot.BlobInfo, db *sql.DB) error {
// Download blob using shared fetch method
reader, _, err := v.FetchBlob(v.ctx, blobInfo.Hash, blobInfo.CompressedSize)
if err != nil {
return fmt.Errorf("failed to download: %w", err)
}
defer func() { _ = reader.Close() }()
// Get decryptor
decryptor, err := v.GetDecryptor()
if err != nil {
return fmt.Errorf("failed to get decryptor: %w", err)
}
// Hash the encrypted blob data as it streams through to decryption
blobHasher := sha256.New()
teeReader := io.TeeReader(reader, blobHasher)
// Decrypt blob (reading through teeReader to hash encrypted data)
decryptedReader, err := decryptor.DecryptStream(teeReader)
if err != nil {
return fmt.Errorf("failed to decrypt: %w", err)
}
// Decompress blob
decompressor, err := zstd.NewReader(decryptedReader)
if err != nil {
return fmt.Errorf("failed to decompress: %w", err)
}
defer decompressor.Close()
// Query blob chunks from database to get offsets and lengths
query := `
SELECT bc.chunk_hash, bc.offset, bc.length
FROM blob_chunks bc
JOIN blobs b ON bc.blob_id = b.id
WHERE b.blob_hash = ?
ORDER BY bc.offset
`
rows, err := db.QueryContext(v.ctx, query, blobInfo.Hash)
if err != nil {
return fmt.Errorf("failed to query blob chunks: %w", err)
}
defer func() { _ = rows.Close() }()
var lastOffset int64 = -1
chunkCount := 0
totalRead := int64(0)
// Verify each chunk in the blob
for rows.Next() {
var chunkHash string
var offset, length int64
if err := rows.Scan(&chunkHash, &offset, &length); err != nil {
return fmt.Errorf("failed to scan chunk row: %w", err)
}
// Verify chunk ordering
if offset <= lastOffset {
return fmt.Errorf("chunks out of order: offset %d after %d", offset, lastOffset)
}
lastOffset = offset
// Read chunk data from decompressed stream
if offset > totalRead {
// Skip to the correct offset
skipBytes := offset - totalRead
if _, err := io.CopyN(io.Discard, decompressor, skipBytes); err != nil {
return fmt.Errorf("failed to skip to offset %d: %w", offset, err)
}
totalRead = offset
}
// Read chunk data
chunkData := make([]byte, length)
if _, err := io.ReadFull(decompressor, chunkData); err != nil {
return fmt.Errorf("failed to read chunk at offset %d: %w", offset, err)
}
totalRead += length
// Verify chunk hash
hasher := sha256.New()
hasher.Write(chunkData)
calculatedHash := hex.EncodeToString(hasher.Sum(nil))
if calculatedHash != chunkHash {
return fmt.Errorf("chunk hash mismatch at offset %d: calculated %s, expected %s",
offset, calculatedHash, chunkHash)
}
chunkCount++
}
if err := rows.Err(); err != nil {
return fmt.Errorf("error iterating blob chunks: %w", err)
}
// Verify no remaining data in blob - if chunk list is accurate, blob should be fully consumed
remaining, err := io.Copy(io.Discard, decompressor)
if err != nil {
return fmt.Errorf("failed to check for remaining blob data: %w", err)
}
if remaining > 0 {
return fmt.Errorf("blob has %d unexpected trailing bytes not covered by chunk list", remaining)
}
// Verify blob hash matches the encrypted data we downloaded
calculatedBlobHash := hex.EncodeToString(blobHasher.Sum(nil))
if calculatedBlobHash != blobInfo.Hash {
return fmt.Errorf("blob hash mismatch: calculated %s, expected %s",
calculatedBlobHash, blobInfo.Hash)
}
log.Info("Blob verified",
"hash", blobInfo.Hash[:16]+"...",
"chunks", chunkCount,
"size", humanize.Bytes(uint64(blobInfo.CompressedSize)),
)
return nil
}
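The hash-while-decrypting trick above is the standard io.TeeReader pattern; in isolation it looks like this (a generic sketch, not vaultik-specific):

h := sha256.New()
tee := io.TeeReader(src, h) // every byte read from tee is also written to h
if _, err := io.Copy(io.Discard, tee); err != nil {
	return err
}
sum := hex.EncodeToString(h.Sum(nil)) // hash of exactly the bytes consumed

TeeReader only hashes bytes that are actually read, which is why verifyBlob drains any trailing data before comparing the blob hash.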
// getBlobsFromDatabase gets all blobs for the snapshot from the database
func (v *Vaultik) getBlobsFromDatabase(snapshotID string, db *sql.DB) ([]snapshot.BlobInfo, error) {
query := `
SELECT b.blob_hash, b.compressed_size
FROM snapshot_blobs sb
JOIN blobs b ON sb.blob_hash = b.blob_hash
WHERE sb.snapshot_id = ?
ORDER BY b.blob_hash
`
rows, err := db.QueryContext(v.ctx, query, snapshotID)
if err != nil {
return nil, fmt.Errorf("failed to query snapshot blobs: %w", err)
}
defer func() { _ = rows.Close() }()
var blobs []snapshot.BlobInfo
for rows.Next() {
var hash string
var size int64
if err := rows.Scan(&hash, &size); err != nil {
return nil, fmt.Errorf("failed to scan blob row: %w", err)
}
blobs = append(blobs, snapshot.BlobInfo{
Hash: hash,
CompressedSize: size,
})
}
if err := rows.Err(); err != nil {
return nil, fmt.Errorf("error iterating blobs: %w", err)
}
return blobs, nil
}
// verifyManifestAgainstDatabase verifies the manifest matches the authoritative database
func (v *Vaultik) verifyManifestAgainstDatabase(manifest *snapshot.Manifest, dbBlobs []snapshot.BlobInfo) error {
log.Info("Verifying manifest against database")
// Build map of database blobs
dbBlobMap := make(map[string]int64)
for _, blob := range dbBlobs {
dbBlobMap[blob.Hash] = blob.CompressedSize
}
// Build map of manifest blobs
manifestBlobMap := make(map[string]int64)
for _, blob := range manifest.Blobs {
manifestBlobMap[blob.Hash] = blob.CompressedSize
}
// Check counts match
if len(dbBlobMap) != len(manifestBlobMap) {
log.Warn("Manifest blob count mismatch",
"database_blobs", len(dbBlobMap),
"manifest_blobs", len(manifestBlobMap),
)
// This is a warning, not an error - database is authoritative
}
// Check each manifest blob exists in database with correct size
for hash, manifestSize := range manifestBlobMap {
dbSize, exists := dbBlobMap[hash]
if !exists {
return fmt.Errorf("manifest contains blob %s not in database", hash)
}
if dbSize != manifestSize {
return fmt.Errorf("blob %s size mismatch: database has %d bytes, manifest has %d bytes",
hash, dbSize, manifestSize)
}
}
log.Info("✓ Manifest verified against database",
"manifest_blobs", len(manifestBlobMap),
"database_blobs", len(dbBlobMap),
)
return nil
}
// verifyBlobExistenceFromDB checks that all blobs from database exist in S3
func (v *Vaultik) verifyBlobExistenceFromDB(blobs []snapshot.BlobInfo) error {
log.Info("Verifying blob existence in S3", "blob_count", len(blobs))
for i, blob := range blobs {
// Construct blob path
blobPath := fmt.Sprintf("blobs/%s/%s/%s", blob.Hash[:2], blob.Hash[2:4], blob.Hash)
// Check blob exists
stat, err := v.Storage.Stat(v.ctx, blobPath)
if err != nil {
return fmt.Errorf("blob %s missing from storage: %w", blob.Hash, err)
}
// Verify size matches
if stat.Size != blob.CompressedSize {
return fmt.Errorf("blob %s size mismatch: S3 has %d bytes, database has %d bytes",
blob.Hash, stat.Size, blob.CompressedSize)
}
// Progress update every 100 blobs
if (i+1)%100 == 0 || i == len(blobs)-1 {
log.Info("Blob existence check progress",
"checked", i+1,
"total", len(blobs),
"percent", fmt.Sprintf("%.1f%%", float64(i+1)/float64(len(blobs))*100),
)
}
}
log.Info("✓ All blobs exist in storage")
return nil
}
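The two-level fan-out keeps any single key prefix small: hash deadbeef... maps to blobs/de/ad/deadbeef.... A sketch of the path rule as a helper (hypothetical; the code above inlines the fmt.Sprintf):

// blobPath shards blobs by the first two hex byte-pairs of the hash.
func blobPath(hash string) string {
	return fmt.Sprintf("blobs/%s/%s/%s", hash[:2], hash[2:4], hash)
}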
// performDeepVerificationFromDB downloads and verifies the content of each blob using database as source
func (v *Vaultik) performDeepVerificationFromDB(blobs []snapshot.BlobInfo, db *sql.DB, opts *VerifyOptions) error {
// Calculate total bytes for ETA
var totalBytesExpected int64
for _, b := range blobs {
totalBytesExpected += b.CompressedSize
}
log.Info("Starting deep verification - downloading and verifying all blobs",
"blob_count", len(blobs),
"total_size", humanize.Bytes(uint64(totalBytesExpected)),
)
startTime := time.Now()
bytesProcessed := int64(0)
for i, blobInfo := range blobs {
// Verify individual blob
if err := v.verifyBlob(blobInfo, db); err != nil {
return fmt.Errorf("blob %s verification failed: %w", blobInfo.Hash, err)
}
bytesProcessed += blobInfo.CompressedSize
elapsed := time.Since(startTime)
remaining := len(blobs) - (i + 1)
// Calculate ETA based on bytes processed
var eta time.Duration
if bytesProcessed > 0 {
bytesPerSec := float64(bytesProcessed) / elapsed.Seconds()
bytesRemaining := totalBytesExpected - bytesProcessed
if bytesPerSec > 0 {
eta = time.Duration(float64(bytesRemaining)/bytesPerSec) * time.Second
}
}
log.Info("Verification progress",
"blobs_done", i+1,
"blobs_total", len(blobs),
"blobs_remaining", remaining,
"bytes_done", bytesProcessed,
"bytes_done_human", humanize.Bytes(uint64(bytesProcessed)),
"bytes_total", totalBytesExpected,
"bytes_total_human", humanize.Bytes(uint64(totalBytesExpected)),
"elapsed", elapsed.Round(time.Second),
"eta", eta.Round(time.Second),
)
if !opts.JSON {
v.Outputf(" Verified %d/%d blobs (%d remaining) - %s/%s - elapsed %s, eta %s\n",
i+1, len(blobs), remaining,
humanize.Bytes(uint64(bytesProcessed)),
humanize.Bytes(uint64(totalBytesExpected)),
elapsed.Round(time.Second),
eta.Round(time.Second))
}
}
totalElapsed := time.Since(startTime)
log.Info("✓ Deep verification completed successfully",
"blobs_verified", len(blobs),
"total_bytes", bytesProcessed,
"total_bytes_human", humanize.Bytes(uint64(bytesProcessed)),
"duration", totalElapsed.Round(time.Second),
)
return nil
}
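As a worked example of the ETA arithmetic: if 2 GiB of a 10 GiB snapshot has been verified after 40 seconds, throughput is 51.2 MiB/s, so the remaining 8 GiB gives an ETA of roughly 160 seconds.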


@@ -1,7 +1,9 @@
age_recipients:
- age1278m9q7dp3chsh2dcy82qk27v047zywyvtxwnj4cvt0z65jw6a7q5dqhfj # sneak's long term age key
- age1otherpubkey... # add additional recipients as needed
source_dirs:
snapshots:
test:
paths:
- /tmp/vaultik-test-source
- /var/test/data
exclude: