README.md

vaultik (ваултик)

vaultik is an incremental backup daemon written in Go. It encrypts data using an age public key and uploads each encrypted blob directly to a remote S3-compatible object store. It requires no private keys or decryption secrets on the backed-up system; the only credential stored there is the remote storage API key.

It includes table-stakes features such as:

modern authenticated encryption
deduplication
incremental backups
modern multithreaded zstd compression with configurable levels
content-addressed immutable storage
local state tracking in standard SQLite database
inotify-based change detection
streaming processing of all data, so it does not need much RAM or temporary file storage
no mutable remote metadata
no plaintext file paths or metadata stored in remote
does not create huge numbers of small files (to keep S3 operation counts down) even if the source system has many small files

what

vaultik walks a set of configured directories and builds a content-addressable chunk map of changed files using deterministic chunking. Each chunk is streamed into a blob packer. Blobs are compressed with zstd, encrypted with age, and uploaded directly to remote storage under a content-addressed S3 path.

No plaintext file contents ever hit disk. No private key or secret passphrase is needed or stored locally. All encrypted data is streaming-processed and immediately discarded once uploaded. Metadata is encrypted and pushed with the same mechanism.
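The chunk-to-blob flow described above can be sketched roughly as follows. This is a minimal illustration, not vaultik's actual code: the `Packer` and `Blob` types are hypothetical, SHA-256 chunk IDs are an assumption, and the zstd and age stages are left out so the example runs with the standard library alone.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Blob is a hypothetical in-memory stand-in for one packed blob.
type Blob struct {
	ChunkIDs []string
	Size     int
}

// Packer groups previously unseen chunks into blobs of at most Limit bytes.
// A single chunk larger than Limit ends up in a blob of its own.
type Packer struct {
	Limit int
	seen  map[string]bool
	cur   Blob
	Blobs []Blob
}

func NewPacker(limit int) *Packer {
	return &Packer{Limit: limit, seen: make(map[string]bool)}
}

// Add records a chunk and reports whether it was new (not a duplicate).
func (p *Packer) Add(chunk []byte) bool {
	sum := sha256.Sum256(chunk) // assumption: chunks are identified by SHA-256
	id := hex.EncodeToString(sum[:])
	if p.seen[id] {
		return false // already stored: deduplicated
	}
	p.seen[id] = true
	if p.cur.Size > 0 && p.cur.Size+len(chunk) > p.Limit {
		p.Flush() // current blob is full, start a new one
	}
	p.cur.ChunkIDs = append(p.cur.ChunkIDs, id)
	p.cur.Size += len(chunk)
	return true
}

// Flush closes the current blob if it holds any chunks.
func (p *Packer) Flush() {
	if len(p.cur.ChunkIDs) == 0 {
		return
	}
	p.Blobs = append(p.Blobs, p.cur)
	p.cur = Blob{}
}

func main() {
	p := NewPacker(10)
	for _, c := range []string{"aaaa", "bbbb", "aaaa", "cccc"} {
		p.Add([]byte(c))
	}
	p.Flush()
	for i, b := range p.Blobs {
		fmt.Printf("blob %d: %d chunks, %d bytes\n", i, len(b.ChunkIDs), b.Size)
	}
}
```

In a real pipeline each flushed blob would be compressed, encrypted and uploaded at the point where `Flush` appends it to `Blobs`.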

why

Existing backup software fails under one or more of these conditions:

Requires secrets (passwords, private keys) on the source system, which compromises encrypted backups in the case of host system compromise
Depends on symmetric encryption unsuitable for zero-trust environments
Creates one blob per file, which results in excessive S3 operation counts

vaultik addresses these by using:

Public-key-only encryption (via age) requires no secrets (other than the remote storage API key) on the source system
Local state cache for incremental detection does not require reading from or decrypting remote storage
Content-addressed immutable storage allows efficient deduplication
Storage only of large encrypted blobs of configurable size (1G by default) reduces S3 operation counts and improves performance

how

install

go install git.eeqj.de/sneak/vaultik@latest

generate keypair

age-keygen -o agekey.txt
grep 'public key:' agekey.txt

write config

source_dirs:
  - /etc
  - /home/user/data
exclude:
  - '*.log'
  - '*.tmp'
age_recipient: age1278m9q7dp3chsh2dcy82qk27v047zywyvtxwnj4cvt0z65jw6a7q5dqhfj
s3:
  # endpoint is optional if using AWS S3, but who even does that?
  endpoint: https://s3.example.com
  bucket: vaultik-data
  prefix: host1/
  access_key_id: ...
  secret_access_key: ...
  region: us-east-1
backup_interval: 1h      # only used in daemon mode, not for --cron mode
full_scan_interval: 24h  # normally we use inotify to mark dirty, but
                         # every 24h we do a full stat() scan
min_time_between_run: 15m  # again, only for daemon mode
#index_path: /var/lib/vaultik/index.sqlite
chunk_size: 10MB
blob_size_limit: 10GB

run

vaultik --config /etc/vaultik.yaml snapshot create

vaultik --config /etc/vaultik.yaml snapshot create --cron # silent unless error

vaultik --config /etc/vaultik.yaml snapshot create --daemon # runs continuously in foreground, uses inotify to detect changes

# TODO
* make sure daemon mode does not make a snapshot if no files have
  changed, even if the backup_interval has passed
* in daemon mode, if enough time has passed since the last snapshot and we get
  an inotify event, we should schedule the next snapshot creation for 10 minutes
  after the mark-dirty event.

cli

commands

vaultik [--config <path>] snapshot create [--cron] [--daemon]
vaultik [--config <path>] snapshot list [--json]
vaultik [--config <path>] snapshot purge [--keep-latest | --older-than <duration>] [--force]
vaultik [--config <path>] snapshot verify <snapshot-id> [--deep]
vaultik [--config <path>] store info
# FIXME: remove 'bucket' and 'prefix' and 'snapshot' flags.  it should be
# 'vaultik restore snapshot <snapshot> --target <dir>'.  bucket and prefix are always
# from config file.
vaultik restore --bucket <bucket> --prefix <prefix> --snapshot <id> --target <dir>
# FIXME: remove prune, it's the old version of "snapshot purge"
vaultik prune --bucket <bucket> --prefix <prefix> [--dry-run]
# FIXME: change fetch to 'vaultik restore path <snapshot> <path> --target <path>'
vaultik fetch --bucket <bucket> --prefix <prefix> --snapshot <id> --file <path> --target <path>
# FIXME: remove this, it's redundant with 'snapshot verify'
vaultik verify --bucket <bucket> --prefix <prefix> [--snapshot <id>] [--quick]

environment

VAULTIK_PRIVATE_KEY: Required for restore, prune, fetch, and verify commands. Contains the age private key for decryption.
VAULTIK_CONFIG: Optional path to config file. If set, config file path doesn't need to be specified on the command line.

command details

snapshot create: Perform incremental backup of configured directories

Config is located at /etc/vaultik/config.yml by default
--cron: Silent unless error (for crontab)
--daemon: Run continuously with inotify monitoring and periodic scans

snapshot list: List all snapshots with their timestamps and sizes

--json: Output in JSON format

snapshot purge: Remove old snapshots based on criteria

--keep-latest: Keep only the most recent snapshot
--older-than: Remove snapshots older than duration (e.g., 30d, 6mo, 1y)
--force: Skip confirmation prompt
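Go's `time.ParseDuration` does not accept day, month or year units, so durations such as 30d, 6mo or 1y need a small parser of their own. The sketch below is one possible approach, not vaultik's implementation: `ParseAge`, `OlderThan`, `Snapshot` and `demoNow` are hypothetical names, and a month is assumed to be 30 days and a year 365 days.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

const day = 24 * time.Hour

// demoNow is a fixed clock so the example is reproducible.
var demoNow = time.Date(2025, 7, 26, 0, 0, 0, 0, time.UTC)

// ParseAge parses durations such as "30d", "6mo" or "1y".
// Assumption: a month is 30 days and a year is 365 days.
func ParseAge(s string) (time.Duration, error) {
	units := []struct {
		suffix string
		length time.Duration
	}{{"mo", 30 * day}, {"y", 365 * day}, {"d", day}}
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.Atoi(strings.TrimSuffix(s, u.suffix))
			if err != nil {
				return 0, fmt.Errorf("invalid duration %q: %w", s, err)
			}
			return time.Duration(n) * u.length, nil
		}
	}
	return time.ParseDuration(s) // standard units such as "12h"
}

// Snapshot is a hypothetical record of one snapshot.
type Snapshot struct {
	ID      string
	Created time.Time
}

// OlderThan returns the IDs of snapshots created more than age before now.
func OlderThan(snaps []Snapshot, now time.Time, age time.Duration) []string {
	cutoff := now.Add(-age)
	var ids []string
	for _, s := range snaps {
		if s.Created.Before(cutoff) {
			ids = append(ids, s.ID)
		}
	}
	return ids
}

func main() {
	age, err := ParseAge("30d")
	if err != nil {
		panic(err)
	}
	snaps := []Snapshot{
		{ID: "old", Created: demoNow.Add(-40 * day)},
		{ID: "recent", Created: demoNow.Add(-2 * day)},
	}
	fmt.Println("would purge:", OlderThan(snaps, demoNow, age))
}
```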

snapshot verify: Verify snapshot integrity

--deep: Download and verify blob hashes (not just existence)

store info: Display S3 bucket configuration and storage statistics

restore: Restore entire snapshot to target directory

Downloads and decrypts metadata
Fetches only required blobs
Reconstructs directory structure
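"Fetches only required blobs" amounts to resolving the requested paths to chunks and the chunks to blobs, then downloading each blob once. A rough sketch, assuming the snapshot metadata can be loaded into two lookup maps (the map shapes and the `RequiredBlobs` name are hypothetical, not vaultik's schema):

```go
package main

import (
	"fmt"
	"sort"
)

// RequiredBlobs returns the sorted, de-duplicated blob IDs needed to
// restore the given paths. fileChunks maps a path to its ordered chunk IDs;
// chunkBlob maps a chunk ID to the blob that contains it.
func RequiredBlobs(fileChunks map[string][]string, chunkBlob map[string]string, paths []string) ([]string, error) {
	need := make(map[string]struct{})
	for _, p := range paths {
		chunks, ok := fileChunks[p]
		if !ok {
			return nil, fmt.Errorf("path not in snapshot: %s", p)
		}
		for _, c := range chunks {
			b, ok := chunkBlob[c]
			if !ok {
				return nil, fmt.Errorf("chunk %s is not mapped to a blob", c)
			}
			need[b] = struct{}{}
		}
	}
	out := make([]string, 0, len(need))
	for b := range need {
		out = append(out, b)
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	fileChunks := map[string][]string{
		"/etc/hosts":  {"c1"},
		"/etc/passwd": {"c2", "c3"},
	}
	chunkBlob := map[string]string{"c1": "blobA", "c2": "blobA", "c3": "blobB"}

	blobs, err := RequiredBlobs(fileChunks, chunkBlob, []string{"/etc/hosts"})
	if err != nil {
		panic(err)
	}
	fmt.Println("blobs to download:", blobs)
}
```

The same lookup covers a full restore (pass every path in the snapshot) and a single-file fetch (pass one path).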

prune: Remove unreferenced blobs from storage

Requires private key
Downloads latest snapshot metadata
Deletes orphaned blobs

fetch: Extract single file from backup

Retrieves specific file without full restore
Supports extracting to different filename

verify: Validate backup integrity

Checks metadata hash
Verifies all referenced blobs exist
Default: Downloads blobs and validates chunk integrity
--quick: Only checks blob existence and S3 content hashes

architecture

chunking

Content-defined chunking using rolling hash (Rabin fingerprint)
Average chunk size: 10MB (configurable)
Deduplication at chunk level
Multiple chunks packed into blobs for efficiency
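To illustrate content-defined chunking, here is a small sketch that cuts a byte stream wherever a rolling hash over the last few bytes hits a fixed pattern. It uses a simple polynomial rolling hash (Rabin-Karp style) as a stand-in for a true Rabin fingerprint, and toy sizes (a few dozen bytes rather than 10MB); every constant and name here is an assumption made for the example.

```go
package main

import "fmt"

const (
	window   = 16       // bytes covered by the rolling hash
	minChunk = 32       // never cut before this many bytes
	maxChunk = 1024     // always cut once a chunk reaches this size
	cutBits  = 6        // cut when the top 6 hash bits are all zero
	prime    = 16777619 // multiplier for the polynomial hash
)

// Split cuts data at content-defined boundaries. Because a boundary depends
// only on the last `window` bytes (plus the min/max limits), an edit early in
// a file tends to leave later chunk boundaries where they were.
func Split(data []byte) [][]byte {
	// pow = prime^(window-1): the weight of the byte leaving the window.
	var pow uint32 = 1
	for i := 0; i < window-1; i++ {
		pow *= prime
	}
	var chunks [][]byte
	var h uint32
	start := 0
	for i, b := range data {
		if i-start >= window {
			h -= uint32(data[i-window]) * pow // drop the oldest byte
		}
		h = h*prime + uint32(b) // shift the window and add the new byte
		size := i - start + 1
		if (size >= minChunk && h>>(32-cutBits) == 0) || size >= maxChunk {
			chunks = append(chunks, data[start:i+1])
			start, h = i+1, 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing partial chunk
	}
	return chunks
}

func main() {
	// Deterministic pseudo-random input so the example is reproducible.
	data := make([]byte, 8192)
	var x uint32 = 1
	for i := range data {
		x = x*1664525 + 1013904223
		data[i] = byte(x >> 24)
	}
	chunks := Split(data)
	fmt.Printf("%d bytes split into %d chunks\n", len(data), len(chunks))
}
```

Hashing each resulting chunk then gives the chunk-level deduplication described above: identical chunks produce identical IDs regardless of which file they came from.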

encryption

Asymmetric encryption using age (X25519 + ChaCha20-Poly1305)
Only public key needed on source host
Each blob encrypted independently
Metadata databases also encrypted

storage

Content-addressed blob storage
Immutable append-only design
Two-level directory sharding for blobs (aa/bb/hash)
Compressed with zstd before encryption
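The aa/bb/hash layout can be derived from the blob hash with a couple of slices. The sketch below assumes a hex-encoded SHA-256 over the final blob bytes and a configurable key prefix; the function names are hypothetical and the real key layout may differ in detail.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ShardedKey places a hex hash under two directory levels: aa/bb/<hash>.
func ShardedKey(prefix, hash string) string {
	if len(hash) < 4 {
		return prefix + hash // too short to shard; not expected for real hashes
	}
	return prefix + hash[:2] + "/" + hash[2:4] + "/" + hash
}

// BlobKey hashes the blob bytes and returns the sharded object key.
// Assumption: the content address is the SHA-256 of the blob as uploaded.
func BlobKey(prefix string, blob []byte) string {
	sum := sha256.Sum256(blob)
	return ShardedKey(prefix, hex.EncodeToString(sum[:]))
}

func main() {
	fmt.Println(BlobKey("host1/", []byte("example blob")))
}
```

Because the key is a pure function of the content, uploading the same blob twice targets the same object, which is what makes the store immutable and append-only.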

state tracking

Local SQLite database for incremental state
Tracks file mtimes and chunk mappings
Enables efficient change detection
Supports inotify monitoring in daemon mode
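Change detection against the local index boils down to comparing what stat() reports now with what was recorded last time. A minimal sketch, using an in-memory map in place of the SQLite tables (the `FileState` fields and the `Changed` function are assumptions, not the real schema):

```go
package main

import (
	"fmt"
	"sort"
)

// FileState is what the index remembers about a file from the last scan.
type FileState struct {
	Size  int64
	MTime int64 // modification time, Unix nanoseconds
}

// Changed returns, sorted, every path in current that is new or whose
// size or mtime differs from the indexed state.
func Changed(index, current map[string]FileState) []string {
	var out []string
	for path, now := range current {
		if old, ok := index[path]; !ok || old != now {
			out = append(out, path)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	index := map[string]FileState{
		"/etc/hosts":  {Size: 120, MTime: 1000},
		"/etc/passwd": {Size: 900, MTime: 1000},
	}
	current := map[string]FileState{
		"/etc/hosts":  {Size: 120, MTime: 1000}, // untouched
		"/etc/passwd": {Size: 910, MTime: 2000}, // modified
		"/etc/motd":   {Size: 10, MTime: 2000},  // new
	}
	fmt.Println("needs chunking:", Changed(index, current))
}
```

Only the paths returned here need to be re-read and re-chunked; unchanged files keep their existing chunk mappings.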

does not

Store any secrets on the backed-up machine
Require mutable remote metadata
Use tarballs, restic, rsync, or ssh
Require a symmetric passphrase or password
Trust the source system with anything

does

Incremental deduplicated backup
Blob-packed chunk encryption
Content-addressed immutable blobs
Public-key encryption only
SQLite-based local and snapshot metadata
Fully stream-processed storage

restore

vaultik restore downloads only the snapshot metadata and required blobs. It never contacts the source system. All restore operations depend only on:

VAULTIK_PRIVATE_KEY
The bucket

A restore needs nothing but object storage and the private key; the source system is never required.

features

daemon mode

Continuous background operation
inotify-based change detection
Respects backup_interval and min_time_between_run
Full scan every full_scan_interval (default 24h)

cron mode

Single backup run
Silent output unless errors
Ideal for scheduled backups

metadata integrity

SHA256 hash of metadata stored separately
Encrypted hash file for verification
Chunked metadata support for large filesystems
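Checking metadata against a separately stored SHA256 hash could look like the sketch below. It only shows the hashing and comparison step on plaintext bytes; how the hash file is encrypted, named and fetched is not covered, and `Digest` and `Verify` are hypothetical helper names.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// Digest returns the hex-encoded SHA-256 of the metadata bytes.
func Digest(meta []byte) string {
	sum := sha256.Sum256(meta)
	return hex.EncodeToString(sum[:])
}

// Verify reports whether meta hashes to the expected hex digest.
// A malformed or wrong-length digest simply fails verification.
func Verify(meta []byte, wantHex string) bool {
	want, err := hex.DecodeString(wantHex)
	if err != nil {
		return false
	}
	sum := sha256.Sum256(meta)
	return subtle.ConstantTimeCompare(sum[:], want) == 1
}

func main() {
	meta := []byte("snapshot metadata bytes")
	stored := Digest(meta) // written alongside the metadata at backup time
	fmt.Println("intact:", Verify(meta, stored))
	fmt.Println("tampered:", Verify(append(meta, '!'), stored))
}
```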

exclusion patterns

Glob-based file exclusion
Configured in YAML
Applied during directory walk
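Glob exclusion with patterns like the '*.log' and '*.tmp' entries from the config can be sketched with the standard library's `filepath.Match`. Matching against the base name only is an assumption made for this example; vaultik may match full paths or use a different glob dialect.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Excluded reports whether the file's base name matches any glob pattern.
func Excluded(patterns []string, path string) bool {
	base := filepath.Base(path)
	for _, p := range patterns {
		// filepath.Match returns an error only for a malformed pattern;
		// such patterns are treated as non-matching here.
		if ok, err := filepath.Match(p, base); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	patterns := []string{"*.log", "*.tmp"}
	for _, p := range []string{"/etc/hosts", "/var/log/app.log", "/home/user/data/x.tmp"} {
		fmt.Println(p, "excluded:", Excluded(patterns, p))
	}
}
```

During a directory walk this check would run inside the `filepath.WalkDir` callback, skipping any entry for which it returns true.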

prune

Run vaultik prune on a machine with the private key. It:

Downloads the most recent snapshot
Decrypts metadata
Lists referenced blobs
Deletes any blob in the bucket not referenced

This enables garbage collection from immutable storage.
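The last two steps above are a set difference between the keys present in the bucket and the keys the snapshot references. A minimal sketch with plain string slices (listing the bucket and decrypting metadata are out of scope here, and `Orphans` is a hypothetical name):

```go
package main

import (
	"fmt"
	"sort"
)

// Orphans returns, sorted, every key in inBucket that is not referenced.
func Orphans(inBucket, referenced []string) []string {
	keep := make(map[string]struct{}, len(referenced))
	for _, k := range referenced {
		keep[k] = struct{}{}
	}
	var out []string
	for _, k := range inBucket {
		if _, ok := keep[k]; !ok {
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	inBucket := []string{"aa/bb/blob1", "cc/dd/blob2", "ee/ff/blob3"}
	referenced := []string{"aa/bb/blob1", "ee/ff/blob3"}
	for _, k := range Orphans(inBucket, referenced) {
		fmt.Println("would delete:", k) // a --dry-run would stop here
	}
}
```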

LICENSE

MIT

requirements

Go 1.24.4 or later
S3-compatible object storage
Sufficient disk space for local index (typically <1GB)

author

Made with love and lots of expensive SOTA AI by sneak in Berlin in the summer of 2025.

Released as a free software gift to the world, no strings attached.

Contact: sneak@sneak.berlin

https://keys.openpgp.org/vks/v1/by-fingerprint/5539AD00DE4C42F3AFE11575052443F4DF2A55C2