# vaultik (ваултик)

*WIP: pre-1.0, some functions may not be fully implemented yet.*
vaultik is an incremental backup daemon written in Go. It encrypts data
using an age public key and uploads each encrypted blob directly to a
remote S3-compatible object store. No private keys, secrets, or credentials
are stored on the backed-up system, other than those required to PUT to the
object store (such as S3 API keys).
It includes table-stakes features such as:
- modern encryption (the excellent age)
- deduplication
- incremental backups
- modern multithreaded zstd compression with configurable levels
- content-addressed immutable storage
- local state tracking in a standard SQLite database, enabling write-only incremental backups to the destination
- no mutable remote metadata
- no plaintext file paths or metadata stored remotely
- does not create huge numbers of small files (to keep S3 operation counts down) even if the source system has many small files
## why
Existing backup software has one or more of these problems:
- Requires secrets (passwords, private keys) on the source system, which compromises encrypted backups in the case of host system compromise
- Depends on symmetric encryption unsuitable for zero-trust environments
- Creates one-blob-per-file, which results in excessive S3 operation counts
- Is slow
Other backup tools like restic, borg, and duplicity are designed for
environments where the source host can store secrets and has access to
decryption keys. I don't want to store backup decryption keys on my hosts,
only public keys for encryption.
My requirements are:
- open source
- no passphrases or private keys on the source host
- incremental
- compressed
- encrypted
- S3-compatible, without an intermediate step or tool
Surprisingly, no existing tool meets these requirements, so I wrote vaultik.
## design goals
- Backups must require only a public key on the source host.
- No secrets or private keys may exist on the source system.
- Restore must be possible using only the backup bucket and a private key.
- Prune must be possible (requires the private key; performed on a different host).
- All encryption uses age (X25519, XChaCha20-Poly1305).
- Compression uses zstd at a configurable level.
- Files are chunked, and multiple chunks are packed into encrypted blobs to reduce object count for filesystems with many small files.
- All metadata (snapshots) is stored remotely as encrypted SQLite DBs.
## what
vaultik walks a set of configured directories and builds a
content-addressable chunk map of changed files using deterministic chunking.
Each chunk is streamed into a blob packer. Blobs are compressed with zstd,
encrypted with age, and uploaded directly to remote storage under a
content-addressed S3 path. At the end, a pruned, snapshot-specific SQLite
database of metadata is created, encrypted, and uploaded alongside the
blobs.
No plaintext file contents ever hit disk. No private key or secret passphrase is needed or stored locally.
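A minimal sketch of that write path, assuming the filippo.io/age and github.com/klauspost/compress/zstd Go APIs (the real packer and S3 uploader are more involved):

```go
package example

import (
	"io"

	"filippo.io/age"
	"github.com/klauspost/compress/zstd"
)

// packBlob streams plaintext chunk data through zstd and then age, so the
// ciphertext written to dst is compressed before encryption and no
// plaintext is ever buffered on disk. dst would be the S3 upload stream.
func packBlob(dst io.Writer, chunks io.Reader, pubkey string) error {
	recipient, err := age.ParseX25519Recipient(pubkey)
	if err != nil {
		return err
	}
	enc, err := age.Encrypt(dst, recipient) // outer layer: age
	if err != nil {
		return err
	}
	zw, err := zstd.NewWriter(enc) // inner layer: zstd, applied first
	if err != nil {
		return err
	}
	if _, err := io.Copy(zw, chunks); err != nil {
		return err
	}
	if err := zw.Close(); err != nil { // flush compressed frames
		return err
	}
	return enc.Close() // finalize the age stream
}
```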
## how

1. install

   ```sh
   go install git.eeqj.de/sneak/vaultik@latest
   ```

2. generate keypair

   ```sh
   age-keygen -o agekey.txt
   grep 'public key:' agekey.txt
   ```

3. write config

   ```yaml
   # Named snapshots - each snapshot can contain multiple paths
   snapshots:
     system:
       paths:
         - /etc
         - /var/lib
       # Snapshot-specific exclusions
       exclude:
         - '*.cache'
     home:
       paths:
         - /home/user/documents
         - /home/user/photos

   # Global exclusions (apply to all snapshots)
   exclude:
     - '*.log'
     - '*.tmp'
     - '.git'
     - 'node_modules'

   age_recipients:
     - age1278m9q7dp3chsh2dcy82qk27v047zywyvtxwnj4cvt0z65jw6a7q5dqhfj

   s3:
     endpoint: https://s3.example.com
     bucket: vaultik-data
     prefix: host1/
     access_key_id: ...
     secret_access_key: ...
     region: us-east-1

   backup_interval: 1h
   full_scan_interval: 24h
   min_time_between_run: 15m
   chunk_size: 10MB
   blob_size_limit: 1GB
   ```

4. run

   ```sh
   # Create all configured snapshots
   vaultik --config /etc/vaultik.yaml snapshot create

   # Create specific snapshots by name
   vaultik --config /etc/vaultik.yaml snapshot create home system

   # Silent mode for cron
   vaultik --config /etc/vaultik.yaml snapshot create --cron
   ```
## cli

### commands

```
vaultik [--config <path>] snapshot create [snapshot-names...] [--cron] [--daemon] [--prune]
vaultik [--config <path>] snapshot list [--json]
vaultik [--config <path>] snapshot verify <snapshot-id> [--deep]
vaultik [--config <path>] snapshot purge [--keep-latest | --older-than <duration>] [--force]
vaultik [--config <path>] snapshot remove <snapshot-id> [--dry-run] [--force]
vaultik [--config <path>] snapshot prune
vaultik [--config <path>] restore <snapshot-id> <target-dir> [paths...]
vaultik [--config <path>] prune [--dry-run] [--force]
vaultik [--config <path>] info
vaultik [--config <path>] store info
```
### environment

- `VAULTIK_AGE_SECRET_KEY`: Required for `restore` and deep `verify`. Contains the age private key for decryption.
- `VAULTIK_CONFIG`: Optional path to config file.
### command details

`snapshot create`: Perform incremental backup of configured snapshots

- Config is located at `/etc/vaultik/config.yml` by default
- Optional snapshot names argument to create specific snapshots (default: all)
- `--cron`: Silent unless error (for crontab)
- `--daemon`: Run continuously with inotify monitoring and periodic scans
- `--prune`: Delete old snapshots and orphaned blobs after backup

`snapshot list`: List all snapshots with their timestamps and sizes

- `--json`: Output in JSON format

`snapshot verify`: Verify snapshot integrity

- `--deep`: Download and verify blob contents (not just existence)

`snapshot purge`: Remove old snapshots based on criteria

- `--keep-latest`: Keep only the most recent snapshot
- `--older-than`: Remove snapshots older than duration (e.g., 30d, 6mo, 1y)
- `--force`: Skip confirmation prompt

`snapshot remove`: Remove a specific snapshot

- `--dry-run`: Show what would be deleted without deleting
- `--force`: Skip confirmation prompt

`snapshot prune`: Clean orphaned data from local database
`restore`: Restore snapshot to target directory

- Requires `VAULTIK_AGE_SECRET_KEY` environment variable with age private key
- Optional path arguments to restore specific files/directories (default: all)
- Downloads and decrypts metadata, fetches required blobs, reconstructs files
- Preserves file permissions, timestamps, and ownership (ownership requires root)
- Handles symlinks and directories
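For example (the snapshot ID and target directory here are illustrative):

```sh
VAULTIK_AGE_SECRET_KEY="$(cat agekey.txt)" \
  vaultik --config /etc/vaultik.yaml restore server1_home_2025-01-01T12:00:00Z /mnt/restore
```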
`prune`: Remove unreferenced blobs from remote storage

- Scans all snapshots for referenced blobs
- Deletes orphaned blobs

`info`: Display system and configuration information

`store info`: Display S3 bucket configuration and storage statistics
## architecture

### s3 bucket layout

```
s3://<bucket>/<prefix>/
├── blobs/
│   └── <aa>/<bb>/<full_blob_hash>
└── metadata/
    └── <snapshot_id>/
        ├── db.zst.age
        └── manifest.json.zst
```

- `blobs/<aa>/<bb>/...`: Two-level directory sharding using first 4 hex chars of blob hash
- `metadata/<snapshot_id>/db.zst.age`: Encrypted, compressed SQLite database
- `metadata/<snapshot_id>/manifest.json.zst`: Unencrypted blob list for pruning
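A hypothetical helper showing how a blob hash maps to its sharded object key:

```go
package example

import "fmt"

// blobKey maps a hex blob hash to its sharded S3 key under the configured
// prefix, e.g. "aa1234..." becomes "<prefix>blobs/aa/12/aa1234...".
func blobKey(prefix, blobHash string) string {
	return fmt.Sprintf("%sblobs/%s/%s/%s",
		prefix, blobHash[:2], blobHash[2:4], blobHash)
}
```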
### blob manifest format

The `manifest.json.zst` file is unencrypted (compressed JSON) to enable pruning without decryption:

```json
{
  "snapshot_id": "hostname_snapshotname_2025-01-01T12:00:00Z",
  "blob_hashes": [
    "aa1234567890abcdef...",
    "bb2345678901bcdef0..."
  ]
}
```

Snapshot IDs follow the format `<hostname>_<snapshot-name>_<timestamp>` (e.g., `server1_home_2025-01-01T12:00:00Z`).
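A sketch of reading one of these manifests during prune, assuming the github.com/klauspost/compress/zstd bindings:

```go
package example

import (
	"encoding/json"
	"io"

	"github.com/klauspost/compress/zstd"
)

// Manifest mirrors the JSON structure above.
type Manifest struct {
	SnapshotID string   `json:"snapshot_id"`
	BlobHashes []string `json:"blob_hashes"`
}

// readManifest decompresses and decodes a manifest.json.zst stream.
// No age key is needed: the manifest is compressed but not encrypted.
func readManifest(r io.Reader) (*Manifest, error) {
	zr, err := zstd.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var m Manifest
	if err := json.NewDecoder(zr).Decode(&m); err != nil {
		return nil, err
	}
	return &m, nil
}
```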
### local sqlite schema

```sql
CREATE TABLE files (
id TEXT PRIMARY KEY,
path TEXT NOT NULL UNIQUE,
mtime INTEGER NOT NULL,
size INTEGER NOT NULL,
mode INTEGER NOT NULL,
uid INTEGER NOT NULL,
gid INTEGER NOT NULL
);
CREATE TABLE file_chunks (
file_id TEXT NOT NULL,
idx INTEGER NOT NULL,
chunk_hash TEXT NOT NULL,
PRIMARY KEY (file_id, idx),
FOREIGN KEY (file_id) REFERENCES files(id) ON DELETE CASCADE
);
CREATE TABLE chunks (
chunk_hash TEXT PRIMARY KEY,
size INTEGER NOT NULL
);
CREATE TABLE blobs (
id TEXT PRIMARY KEY,
blob_hash TEXT NOT NULL UNIQUE,
uncompressed INTEGER NOT NULL,
compressed INTEGER NOT NULL,
uploaded_at INTEGER
);
CREATE TABLE blob_chunks (
blob_hash TEXT NOT NULL,
chunk_hash TEXT NOT NULL,
offset INTEGER NOT NULL,
length INTEGER NOT NULL,
PRIMARY KEY (blob_hash, chunk_hash)
);
CREATE TABLE chunk_files (
chunk_hash TEXT NOT NULL,
file_id TEXT NOT NULL,
file_offset INTEGER NOT NULL,
length INTEGER NOT NULL,
PRIMARY KEY (chunk_hash, file_id)
);
CREATE TABLE snapshots (
id TEXT PRIMARY KEY,
hostname TEXT NOT NULL,
vaultik_version TEXT NOT NULL,
started_at INTEGER NOT NULL,
completed_at INTEGER,
file_count INTEGER NOT NULL,
chunk_count INTEGER NOT NULL,
blob_count INTEGER NOT NULL,
total_size INTEGER NOT NULL,
blob_size INTEGER NOT NULL,
compression_ratio REAL NOT NULL
);
CREATE TABLE snapshot_files (
snapshot_id TEXT NOT NULL,
file_id TEXT NOT NULL,
PRIMARY KEY (snapshot_id, file_id)
);
CREATE TABLE snapshot_blobs (
snapshot_id TEXT NOT NULL,
blob_id TEXT NOT NULL,
blob_hash TEXT NOT NULL,
PRIMARY KEY (snapshot_id, blob_id)
);
```
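For illustration, the join a restore performs against this schema to locate each chunk of one file inside its packed blobs (the path shown is an example):

```sql
-- For each chunk of the file, in index order: the blob that holds it,
-- and the byte range of the chunk within that blob.
SELECT fc.idx, bc.blob_hash, bc.offset, bc.length
FROM files f
JOIN file_chunks fc ON fc.file_id = f.id
JOIN blob_chunks bc ON bc.chunk_hash = fc.chunk_hash
WHERE f.path = '/etc/hostname'
ORDER BY fc.idx;
```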
### data flow

#### backup
- Load config, open local SQLite index
- Walk source directories, check mtime/size against index
- For changed/new files: chunk using content-defined chunking
- For each chunk: hash, check if already uploaded, add to blob packer
- When blob reaches threshold: compress, encrypt, upload to S3
- Build snapshot metadata, compress, encrypt, upload
- Create blob manifest (unencrypted) for pruning support
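A minimal sketch of the mtime/size check against the local index, assuming mtimes are stored as Unix seconds in the `files` table above:

```go
package example

import (
	"database/sql"
	"os"
)

// needsBackup reports whether path must be re-chunked: true for new files
// and for files whose recorded mtime or size no longer matches the stat.
func needsBackup(db *sql.DB, path string, info os.FileInfo) (bool, error) {
	var mtime, size int64
	err := db.QueryRow(
		`SELECT mtime, size FROM files WHERE path = ?`, path,
	).Scan(&mtime, &size)
	if err == sql.ErrNoRows {
		return true, nil // never seen before
	}
	if err != nil {
		return false, err
	}
	return mtime != info.ModTime().Unix() || size != info.Size(), nil
}
```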
#### restore

- Download `metadata/<snapshot_id>/db.zst.age`
- Decrypt and decompress SQLite database
- Query files table (optionally filtered by paths)
- For each file, get ordered chunk list from file_chunks
- Download required blobs, decrypt, decompress
- Extract chunks and reconstruct files
- Restore permissions, mtime, uid/gid
#### prune
- List all snapshot manifests
- Build set of all referenced blob hashes
- List all blobs in storage
- Delete any blob not in referenced set
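A sketch of that set difference (hypothetical helper; object keys laid out as in the bucket layout above):

```go
package example

import "path"

// orphanedBlobKeys returns the object keys of blobs that no snapshot
// manifest references. blobHashesBySnapshot holds each manifest's
// blob_hashes list; blobKeys are the objects listed under blobs/.
func orphanedBlobKeys(blobHashesBySnapshot [][]string, blobKeys []string) []string {
	referenced := make(map[string]bool)
	for _, hashes := range blobHashesBySnapshot {
		for _, h := range hashes {
			referenced[h] = true
		}
	}
	var orphans []string
	for _, key := range blobKeys {
		// keys look like "blobs/aa/bb/<hash>"; the base name is the hash
		if !referenced[path.Base(key)] {
			orphans = append(orphans, key)
		}
	}
	return orphans
}
```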
### chunking
- Content-defined chunking using FastCDC algorithm
- Average chunk size: configurable (default 10MB)
- Deduplication at chunk level
- Multiple chunks packed into blobs for efficiency
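The cut-point selection at the heart of content-defined chunking, in a deliberately simplified sketch; real FastCDC adds normalized chunking and tuned masks on top of this idea:

```go
package example

import "math/rand"

// gear holds one fixed random value per byte; a fixed seed keeps chunk
// boundaries deterministic across runs and hosts.
var gear [256]uint64

func init() {
	r := rand.New(rand.NewSource(1))
	for i := range gear {
		gear[i] = r.Uint64()
	}
}

// nextCut returns the length of the next chunk: the first position after
// minSize where the rolling hash has avgBits trailing zero bits. Because
// the hash depends only on content, an insertion early in a file shifts
// nearby chunk boundaries instead of invalidating every later chunk.
func nextCut(data []byte, minSize int, avgBits uint) int {
	mask := (uint64(1) << avgBits) - 1
	var h uint64
	for i, b := range data {
		h = (h << 1) + gear[b]
		if i >= minSize && h&mask == 0 {
			return i + 1
		}
	}
	return len(data) // no cut point; final chunk
}
```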
### encryption
- Asymmetric encryption using age (X25519 + XChaCha20-Poly1305)
- Only public key needed on source host
- Each blob encrypted independently
- Metadata databases also encrypted
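A sketch of the restore-side decryption, assuming the filippo.io/age Go API and the `VAULTIK_AGE_SECRET_KEY` variable described above:

```go
package example

import (
	"io"
	"os"

	"filippo.io/age"
)

// decryptBlob unwraps the age layer of a downloaded blob using the
// X25519 identity from the environment; zstd decompression follows.
func decryptBlob(ciphertext io.Reader) (io.Reader, error) {
	identity, err := age.ParseX25519Identity(os.Getenv("VAULTIK_AGE_SECRET_KEY"))
	if err != nil {
		return nil, err
	}
	return age.Decrypt(ciphertext, identity)
}
```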
### compression
- zstd compression at configurable level
- Applied before encryption
- Blob-level compression for efficiency
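A sketch of constructing such a writer, assuming the github.com/klauspost/compress/zstd bindings (the option names are that library's, not necessarily what the project uses):

```go
package example

import (
	"io"
	"runtime"

	"github.com/klauspost/compress/zstd"
)

// newCompressor wraps w in a multithreaded zstd writer at the given
// zstd level (e.g. 3); its output then flows into the age layer.
func newCompressor(w io.Writer, level int) (*zstd.Encoder, error) {
	return zstd.NewWriter(w,
		zstd.WithEncoderLevel(zstd.EncoderLevelFromZstd(level)),
		zstd.WithEncoderConcurrency(runtime.NumCPU()),
	)
}
```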
## does not
- Store any secrets on the backed-up machine
- Require mutable remote metadata
- Use tarballs, restic, rsync, or ssh
- Require a symmetric passphrase or password
- Trust the source system with anything
## does
- Incremental deduplicated backup
- Blob-packed chunk encryption
- Content-addressed immutable blobs
- Public-key encryption only
- SQLite-based local and snapshot metadata
- Fully stream-processed storage
## requirements
- Go 1.24 or later
- S3-compatible object storage
- Sufficient disk space for local index (typically <1GB)
## license
## author
Made with love and lots of expensive SOTA AI by sneak in Berlin in the summer of 2025.
Released as a free software gift to the world, no strings attached.
Contact: sneak@sneak.berlin
https://keys.openpgp.org/vks/v1/by-fingerprint/5539AD00DE4C42F3AFE11575052443F4DF2A55C2