diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
new file mode 100644
index 0000000..4cdb844
--- /dev/null
+++ b/ARCHITECTURE.md
@@ -0,0 +1,380 @@
+# Vaultik Architecture
+
+This document describes the internal architecture of Vaultik, focusing on the data model, type instantiation, and the relationships between core modules.
+
+## Overview
+
+Vaultik is a backup system that uses content-defined chunking for deduplication and packs chunks into large, compressed, encrypted blobs for efficient cloud storage. The system is built around dependency injection using [uber-go/fx](https://github.com/uber-go/fx).
+
+## Data Flow
+
+```
+Source Files
+     │
+     ▼
+┌─────────────────┐
+│    Scanner      │  Walks directories, detects changed files
+└────────┬────────┘
+         │
+         ▼
+┌─────────────────┐
+│    Chunker      │  Splits files into variable-size chunks (FastCDC)
+└────────┬────────┘
+         │
+         ▼
+┌─────────────────┐
+│    Packer       │  Accumulates chunks, compresses (zstd), encrypts (age)
+└────────┬────────┘
+         │
+         ▼
+┌─────────────────┐
+│   S3 Client     │  Uploads blobs to remote storage
+└─────────────────┘
+```
+
+## Data Model
+
+### Core Entities
+
+The database tracks four primary entities (Snapshot, File, Chunk, Blob) and the join tables that relate them:
+
+```
+┌──────────────┐     ┌──────────────┐     ┌──────────────┐
+│   Snapshot   │────▶│     File     │────▶│    Chunk     │
+└──────────────┘     └──────────────┘     └──────────────┘
+       │                                         │
+       │                                         │
+       ▼                                         ▼
+┌──────────────┐                          ┌──────────────┐
+│     Blob     │◀─────────────────────────│  BlobChunk   │
+└──────────────┘                          └──────────────┘
+```
+
+### Entity Descriptions
+
+#### File (`database.File`)
+Represents a file or directory in the backup system. Stores metadata needed for restoration:
+- Path, timestamps (mtime, ctime)
+- Size, mode, ownership (uid, gid)
+- Symlink target (if applicable)
+
+#### Chunk (`database.Chunk`)
+A content-addressed unit of data. Files are split into variable-size chunks using the FastCDC algorithm:
+- `ChunkHash`: SHA256 hash of chunk content (primary key)
+- `Size`: Chunk size in bytes
+
+Chunk sizes vary between `avgChunkSize/4` and `avgChunkSize*4` (typically 16KB-256KB for 64KB average).
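+
+As a quick check of the arithmetic above, the bounds can be derived from the average like this. This is a minimal sketch; `chunkBounds` is a hypothetical helper name, not a function from the `chunker` package.
+
+```go
+package main
+
+import "fmt"
+
+// chunkBounds derives the minimum and maximum chunk sizes from the average,
+// following the avg/4 and avg*4 rule described above.
+// Hypothetical helper, for illustration only.
+func chunkBounds(avgChunkSize int64) (minSize, maxSize int64) {
+	return avgChunkSize / 4, avgChunkSize * 4
+}
+
+func main() {
+	minSize, maxSize := chunkBounds(64 * 1024)
+	fmt.Printf("min=%d max=%d\n", minSize, maxSize) // min=16384 max=262144
+}
+```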
+
+#### FileChunk (`database.FileChunk`)
+Maps files to their constituent chunks:
+- `FileID`: Reference to the file
+- `Idx`: Position of this chunk within the file (0-indexed)
+- `ChunkHash`: Reference to the chunk
+
+#### Blob (`database.Blob`)
+The final storage unit uploaded to S3. Contains many compressed and encrypted chunks:
+- `ID`: UUID assigned at creation
+- `Hash`: SHA256 of final compressed+encrypted content
+- `UncompressedSize`: Total raw chunk data before compression
+- `CompressedSize`: Size after zstd compression and age encryption
+- `CreatedTS`, `FinishedTS`, `UploadedTS`: Lifecycle timestamps
+
+Blob creation process:
+1. Chunks are accumulated (up to MaxBlobSize, typically 10GB)
+2. Compressed with zstd
+3. Encrypted with age (recipients configured in config)
+4. SHA256 hash of the compressed+encrypted output computed → becomes the object name in S3
+5. Uploaded to `blobs/{hash[0:2]}/{hash[2:4]}/{hash}`
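+
+The key layout in step 5 can be sketched as follows. `blobKey` is a hypothetical helper, and the sketch assumes the hash is a lowercase hex string; the input bytes stand in for real compressed and encrypted blob content.
+
+```go
+package main
+
+import (
+	"crypto/sha256"
+	"encoding/hex"
+	"fmt"
+)
+
+// blobKey builds the object key blobs/{hash[0:2]}/{hash[2:4]}/{hash}
+// from a hex-encoded hash. Hypothetical helper; hex encoding is an assumption.
+func blobKey(hash string) (string, error) {
+	if len(hash) < 4 {
+		return "", fmt.Errorf("hash too short: %q", hash)
+	}
+	return fmt.Sprintf("blobs/%s/%s/%s", hash[0:2], hash[2:4], hash), nil
+}
+
+func main() {
+	// Stand-in for the final compressed and encrypted blob bytes.
+	sum := sha256.Sum256([]byte("example blob content"))
+	key, err := blobKey(hex.EncodeToString(sum[:]))
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println(key)
+}
+```
+
+The two-level prefix keeps any single directory-like prefix from accumulating every blob in the bucket.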
+
+#### BlobChunk (`database.BlobChunk`)
+Maps chunks to their position within blobs:
+- `BlobID`: Reference to the blob
+- `ChunkHash`: Reference to the chunk
+- `Offset`: Byte offset within the uncompressed blob
+- `Length`: Chunk length in bytes (uncompressed)
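+
+To illustrate how `Offset` and `Length` locate a chunk, here is a minimal sketch that slices one chunk out of a blob that has already been decrypted and decompressed. The `blobChunk` struct and `extractChunk` function are hypothetical and only mirror the two fields listed above.
+
+```go
+package main
+
+import "fmt"
+
+// blobChunk mirrors the Offset and Length fields described above.
+// Hypothetical type, for illustration only.
+type blobChunk struct {
+	Offset int64
+	Length int64
+}
+
+// extractChunk returns the chunk bytes from the uncompressed blob content,
+// rejecting ranges that fall outside the blob.
+func extractChunk(plain []byte, bc blobChunk) ([]byte, error) {
+	if bc.Offset < 0 || bc.Length < 0 || bc.Offset+bc.Length > int64(len(plain)) {
+		return nil, fmt.Errorf("chunk range [%d, %d) outside blob of %d bytes",
+			bc.Offset, bc.Offset+bc.Length, len(plain))
+	}
+	return plain[bc.Offset : bc.Offset+bc.Length], nil
+}
+
+func main() {
+	plain := []byte("aaaabbbbbbcc") // three chunks packed back to back
+	chunk, err := extractChunk(plain, blobChunk{Offset: 4, Length: 6})
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println(string(chunk)) // bbbbbb
+}
+```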
+
+#### Snapshot (`database.Snapshot`)
+Represents a point-in-time backup:
+- `ID`: Format is `{hostname}-{YYYYMMDD}-{HHMMSS}Z`
+- Tracks file count, chunk count, blob count, sizes, compression ratio
+- `CompletedAt`: Null until snapshot finishes successfully
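+
+The ID format can be reproduced with Go's reference-time layout. This is a sketch: `snapshotID` is a hypothetical helper, and it assumes the trailing `Z` means the timestamp is in UTC.
+
+```go
+package main
+
+import (
+	"fmt"
+	"time"
+)
+
+// snapshotID formats {hostname}-{YYYYMMDD}-{HHMMSS}Z.
+// Hypothetical helper; converting to UTC is an assumption based on the Z suffix.
+func snapshotID(hostname string, t time.Time) string {
+	return fmt.Sprintf("%s-%sZ", hostname, t.UTC().Format("20060102-150405"))
+}
+
+func main() {
+	t := time.Date(2024, time.January, 15, 9, 30, 5, 0, time.UTC)
+	fmt.Println(snapshotID("web01", t)) // web01-20240115-093005Z
+}
+```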
+
+#### SnapshotFile / SnapshotBlob
+Join tables linking snapshots to their files and blobs.
+
+### Relationship Summary
+
+```
+Snapshot 1──────────▶ N SnapshotFile N ◀────────── 1 File
+Snapshot 1──────────▶ N SnapshotBlob N ◀────────── 1 Blob
+File     1──────────▶ N FileChunk    N ◀────────── 1 Chunk
+Blob     1──────────▶ N BlobChunk    N ◀────────── 1 Chunk
+```
+
+## Type Instantiation
+
+### Application Startup
+
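+The CLI uses fx for dependency injection. Options are registered in the order below; fx calls each constructor lazily, when its output is first needed, so the numbering reflects registration order rather than a guaranteed call order: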
+
+```go
+// cli/app.go: NewApp()
+fx.New(
+    fx.Supply(config.ConfigPath(opts.ConfigPath)),  // 1. Config path
+    fx.Supply(opts.LogOptions),                      // 2. Log options
+    fx.Provide(globals.New),                         // 3. Globals
+    fx.Provide(log.New),                             // 4. Logger config
+    config.Module,                                   // 5. Config
+    database.Module,                                 // 6. Database + Repositories
+    log.Module,                                      // 7. Logger initialization
+    s3.Module,                                       // 8. S3 client
+    snapshot.Module,                                 // 9. SnapshotManager + ScannerFactory
+    fx.Provide(vaultik.New),                         // 10. Vaultik orchestrator
+)
+```
+
+### Key Type Instantiation Points
+
+#### 1. Config (`config.Config`)
+- **Created by**: `config.Module` via `config.LoadConfig()`
+- **When**: Application startup (fx DI)
+- **Contains**: All configuration from YAML file (S3 credentials, encryption keys, paths, etc.)
+
+#### 2. Database (`database.DB`)
+- **Created by**: `database.Module` via `database.New()`
+- **When**: Application startup (fx DI)
+- **Contains**: SQLite connection, path reference
+
+#### 3. Repositories (`database.Repositories`)
+- **Created by**: `database.Module` via `database.NewRepositories()`
+- **When**: Application startup (fx DI)
+- **Contains**: All repository interfaces (Files, Chunks, Blobs, Snapshots, etc.)
+
+#### 4. Vaultik (`vaultik.Vaultik`)
+- **Created by**: `vaultik.New(VaultikParams)`
+- **When**: Application startup (fx DI)
+- **Contains**: All dependencies for backup operations
+
+```go
+type Vaultik struct {
+    Globals         *globals.Globals
+    Config          *config.Config
+    DB              *database.DB
+    Repositories    *database.Repositories
+    S3Client        *s3.Client
+    ScannerFactory  snapshot.ScannerFactory
+    SnapshotManager *snapshot.SnapshotManager
+    Shutdowner      fx.Shutdowner
+    Fs              afero.Fs
+    ctx             context.Context
+    cancel          context.CancelFunc
+}
+```
+
+#### 5. SnapshotManager (`snapshot.SnapshotManager`)
+- **Created by**: `snapshot.Module` via `snapshot.NewSnapshotManager()`
+- **When**: Application startup (fx DI)
+- **Responsibility**: Creates/completes snapshots, exports metadata to S3
+
+#### 6. Scanner (`snapshot.Scanner`)
+- **Created by**: `ScannerFactory(ScannerParams)`
+- **When**: Each `CreateSnapshot()` call
+- **Contains**: Chunker, Packer, progress reporter
+
+```go
+// vaultik/snapshot.go: CreateSnapshot()
+scanner := v.ScannerFactory(snapshot.ScannerParams{
+    EnableProgress: !opts.Cron,
+    Fs:             v.Fs,
+})
+```
+
+#### 7. Chunker (`chunker.Chunker`)
+- **Created by**: `chunker.NewChunker(avgChunkSize)`
+- **When**: Inside `snapshot.NewScanner()`
+- **Configuration**:
+  - `avgChunkSize`: From config (typically 64KB)
+  - `minChunkSize`: avgChunkSize / 4
+  - `maxChunkSize`: avgChunkSize * 4
+
+#### 8. Packer (`blob.Packer`)
+- **Created by**: `blob.NewPacker(PackerConfig)`
+- **When**: Inside `snapshot.NewScanner()`
+- **Configuration**:
+  - `MaxBlobSize`: Maximum blob size before finalization (typically 10GB)
+  - `CompressionLevel`: zstd level (1-19)
+  - `Recipients`: age public keys for encryption
+
+```go
+// snapshot/scanner.go: NewScanner()
+packerCfg := blob.PackerConfig{
+    MaxBlobSize:      cfg.MaxBlobSize,
+    CompressionLevel: cfg.CompressionLevel,
+    Recipients:       cfg.AgeRecipients,
+    Repositories:     cfg.Repositories,
+    Fs:               cfg.FS,
+}
+packer, err := blob.NewPacker(packerCfg)
+```
+
+## Module Responsibilities
+
+### `internal/cli`
+Entry point for fx application. Combines all modules and handles signal interrupts.
+
+Key functions:
+- `NewApp(AppOptions)` → Creates fx.App with all modules
+- `RunApp(ctx, app)` → Starts app, handles graceful shutdown
+- `RunWithApp(ctx, opts)` → Convenience wrapper
+
+### `internal/vaultik`
+Main orchestrator containing all dependencies and command implementations.
+
+Key methods:
+- `New(VaultikParams)` → Constructor (fx DI)
+- `CreateSnapshot(opts)` → Main backup operation
+- `ListSnapshots(jsonOutput)` → List available snapshots
+- `VerifySnapshot(id, deep)` → Verify snapshot integrity
+- `PurgeSnapshots(...)` → Remove old snapshots
+
+### `internal/chunker`
+Content-defined chunking using FastCDC algorithm.
+
+Key types:
+- `Chunk` → Hash, Data, Offset, Size
+- `Chunker` → avgChunkSize, minChunkSize, maxChunkSize
+
+Key methods:
+- `NewChunker(avgChunkSize)` → Constructor
+- `ChunkReaderStreaming(reader, callback)` → Stream chunks with callback (preferred)
+- `ChunkReader(reader)` → Return all chunks at once (memory-intensive)
+
+### `internal/blob`
+Blob packing: accumulates chunks, compresses, encrypts, tracks metadata.
+
+Key types:
+- `Packer` → Thread-safe blob accumulator
+- `ChunkRef` → Hash + Data for adding to packer
+- `FinishedBlob` → Completed blob ready for upload
+- `BlobWithReader` → FinishedBlob + io.Reader for streaming upload
+
+Key methods:
+- `NewPacker(PackerConfig)` → Constructor
+- `AddChunk(ChunkRef)` → Add chunk to current blob
+- `FinalizeBlob()` → Compress, encrypt, hash current blob
+- `Flush()` → Finalize any in-progress blob
+- `SetBlobHandler(func)` → Set callback for upload
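+
+The accumulate, finalize, and flush cycle behind these methods can be sketched as below. This is not the real `Packer`: `miniPacker` and its fields are hypothetical, compression and encryption are left out, and whether the real packer finalizes before or after adding the chunk that crosses the limit is an assumption.
+
+```go
+package main
+
+import (
+	"fmt"
+	"sync"
+)
+
+// chunkRef mirrors the Hash + Data pair described above (hypothetical type).
+type chunkRef struct {
+	Hash string
+	Data []byte
+}
+
+// miniPacker is a simplified stand-in: it only accumulates raw chunk bytes.
+type miniPacker struct {
+	mu      sync.Mutex
+	maxSize int
+	buf     []byte
+	handler func(blob []byte)
+}
+
+// AddChunk appends a chunk, finalizing the current blob first if the chunk
+// would push it past maxSize (assumed policy).
+func (p *miniPacker) AddChunk(c chunkRef) {
+	p.mu.Lock()
+	defer p.mu.Unlock()
+	if len(p.buf) > 0 && len(p.buf)+len(c.Data) > p.maxSize {
+		p.finalizeLocked()
+	}
+	p.buf = append(p.buf, c.Data...)
+}
+
+// Flush finalizes any in-progress blob.
+func (p *miniPacker) Flush() {
+	p.mu.Lock()
+	defer p.mu.Unlock()
+	p.finalizeLocked()
+}
+
+// finalizeLocked hands the current blob to the handler; caller holds p.mu.
+func (p *miniPacker) finalizeLocked() {
+	if len(p.buf) == 0 {
+		return
+	}
+	blob := p.buf
+	p.buf = nil
+	if p.handler != nil {
+		p.handler(blob)
+	}
+}
+
+func main() {
+	p := &miniPacker{maxSize: 8, handler: func(blob []byte) {
+		fmt.Printf("blob ready: %d bytes\n", len(blob))
+	}}
+	for _, s := range []string{"aaaa", "bbbb", "cccc"} {
+		p.AddChunk(chunkRef{Hash: s, Data: []byte(s)})
+	}
+	p.Flush() // emits the final, partially filled blob
+}
+```
+
+In this sketch the handler runs while the lock is held, which keeps the example short but would block other callers during a slow upload.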
+
+### `internal/snapshot`
+
+#### Scanner
+Orchestrates the backup process for a directory.
+
+Key methods:
+- `NewScanner(ScannerConfig)` → Constructor (creates Chunker + Packer)
+- `Scan(ctx, path, snapshotID)` → Main scan operation
+
+Scan phases:
+1. **Phase 0**: Detect files that were present in previous snapshots but have since been deleted
+2. **Phase 1**: Walk directory, identify files needing processing
+3. **Phase 2**: Process files (chunk → pack → upload)
+
+#### SnapshotManager
+Manages snapshot lifecycle and metadata export.
+
+Key methods:
+- `CreateSnapshot(ctx, hostname, version, commit)` → Create snapshot record
+- `CompleteSnapshot(ctx, snapshotID)` → Mark snapshot complete
+- `ExportSnapshotMetadata(ctx, dbPath, snapshotID)` → Export to S3
+- `CleanupIncompleteSnapshots(ctx, hostname)` → Remove failed snapshots
+
+### `internal/database`
+SQLite database for local index. Single-writer mode for thread safety.
+
+Key types:
+- `DB` → Database connection wrapper
+- `Repositories` → Collection of all repository interfaces
+
+Repository interfaces:
+- `FilesRepository` → CRUD for File records
+- `ChunksRepository` → CRUD for Chunk records
+- `BlobsRepository` → CRUD for Blob records
+- `SnapshotsRepository` → CRUD for Snapshot records
+- Plus join table repositories (FileChunks, BlobChunks, etc.)
+
+## Snapshot Creation Flow
+
+```
+CreateSnapshot(opts)
+    │
+    ├─► CleanupIncompleteSnapshots()   // Critical: avoid dedup errors
+    │
+    ├─► SnapshotManager.CreateSnapshot()   // Create DB record
+    │
+    ├─► For each source directory:
+    │       │
+    │       ├─► scanner.Scan(ctx, path, snapshotID)
+    │       │       │
+    │       │       ├─► Phase 0: detectDeletedFiles()
+    │       │       │
+    │       │       ├─► Phase 1: scanPhase()
+    │       │       │       Walk directory
+    │       │       │       Check file metadata changes
+    │       │       │       Build list of files to process
+    │       │       │
+    │       │       └─► Phase 2: processPhase()
+    │       │               For each file:
+    │       │                   chunker.ChunkReaderStreaming()
+    │       │                   For each chunk:
+    │       │                       packer.AddChunk()
+    │       │                       If blob full → FinalizeBlob()
+    │       │                           → handleBlobReady()
+    │       │                           → s3Client.PutObjectWithProgress()
+    │       │               packer.Flush()  // Final blob
+    │       │
+    │       └─► Accumulate statistics
+    │
+    ├─► SnapshotManager.UpdateSnapshotStatsExtended()
+    │
+    ├─► SnapshotManager.CompleteSnapshot()
+    │
+    └─► SnapshotManager.ExportSnapshotMetadata()
+            │
+            ├─► Copy database to temp file
+            ├─► Clean to only current snapshot data
+            ├─► Dump to SQL
+            ├─► Compress with zstd
+            ├─► Encrypt with age
+            ├─► Upload db.zst.age to S3
+            └─► Upload manifest.json.zst to S3
+```
+
+## Deduplication Strategy
+
+1. **File-level**: Files unchanged since last backup are skipped (metadata comparison: size, mtime, mode, uid, gid)
+
+2. **Chunk-level**: Chunks are content-addressed by SHA256 hash. If a chunk hash already exists in the database, the chunk data is not re-uploaded.
+
+3. **Blob-level**: Blobs contain only unique chunks. Duplicate chunks within a blob are skipped.
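+
+Chunk-level deduplication reduces to a lookup keyed by content hash. The sketch below uses an in-memory map in place of the SQLite index; `chunkIndex` and `addIfNew` are hypothetical names.
+
+```go
+package main
+
+import (
+	"crypto/sha256"
+	"encoding/hex"
+	"fmt"
+)
+
+// chunkIndex is an in-memory stand-in for the chunk table (hypothetical type).
+type chunkIndex map[string]struct{}
+
+// addIfNew hashes the chunk and records it, reporting whether the hash was
+// unseen. Only chunks reported as new would need to be packed and uploaded.
+func (idx chunkIndex) addIfNew(data []byte) (hash string, isNew bool) {
+	sum := sha256.Sum256(data)
+	hash = hex.EncodeToString(sum[:])
+	if _, seen := idx[hash]; seen {
+		return hash, false
+	}
+	idx[hash] = struct{}{}
+	return hash, true
+}
+
+func main() {
+	idx := chunkIndex{}
+	for _, c := range [][]byte{[]byte("alpha"), []byte("beta"), []byte("alpha")} {
+		hash, isNew := idx.addIfNew(c)
+		fmt.Printf("%s new=%v\n", hash[:8], isNew)
+	}
+}
+```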
+
+## Storage Layout in S3
+
+```
+bucket/
+├── blobs/
+│   └── {hash[0:2]}/
+│       └── {hash[2:4]}/
+│           └── {full-hash}          # Compressed+encrypted blob
+│
+└── metadata/
+    └── {snapshot-id}/
+        ├── db.zst.age               # Encrypted database dump
+        └── manifest.json.zst        # Blob list (for verification)
+```
+
+## Thread Safety
+
+- `Packer`: Thread-safe via mutex. Multiple goroutines can call `AddChunk()`.
+- `Scanner`: Uses `packerMu` mutex to coordinate blob finalization.
+- `Database`: Single-writer mode (`MaxOpenConns=1`) serializes all SQLite access over one connection, so concurrent writers cannot conflict.
+- `Repositories.WithTx()`: Handles transaction lifecycle automatically.