# vaultik (ваултик)

`vaultik` is an incremental backup daemon written in Go. It
encrypts data using an `age` public key and uploads each encrypted blob
directly to a remote S3-compatible object store. It requires no private
keys or decryption secrets on the backed-up system; the only credential
stored there is the bucket access key.

---

## what

`vaultik` walks a set of configured directories and builds a
content-addressable chunk map of changed files using deterministic chunking.
Each chunk is streamed into a blob packer. Blobs are compressed with `zstd`,
encrypted with `age`, and uploaded directly to remote storage under a
content-addressed S3 path.

No plaintext file contents ever hit disk. No private key is needed or stored
locally. All encrypted data is streaming-processed and immediately discarded
once uploaded. Metadata is encrypted and pushed with the same mechanism.

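
The packing step can be pictured with a minimal sketch. It assumes chunks are packed in arrival order and that a blob is closed as soon as the next chunk would push it past the size limit; the `chunk` type and `packChunks` function are hypothetical names for illustration, not taken from the vaultik source, and compression and encryption are left out.

```go
package main

import "fmt"

// chunk is a hypothetical unit of file content produced by the chunker.
type chunk struct {
	ID   string
	Data []byte
}

// packChunks groups chunks into blobs in order. A new blob is started
// whenever adding the next chunk would exceed limit bytes. A chunk larger
// than limit still gets a blob of its own.
func packChunks(chunks []chunk, limit int) [][]chunk {
	var blobs [][]chunk
	var current []chunk
	size := 0
	for _, c := range chunks {
		if len(current) > 0 && size+len(c.Data) > limit {
			blobs = append(blobs, current)
			current, size = nil, 0
		}
		current = append(current, c)
		size += len(c.Data)
	}
	if len(current) > 0 {
		blobs = append(blobs, current)
	}
	return blobs
}

func main() {
	chunks := []chunk{
		{ID: "a", Data: make([]byte, 4)},
		{ID: "b", Data: make([]byte, 4)},
		{ID: "c", Data: make([]byte, 4)},
	}
	for i, blob := range packChunks(chunks, 8) {
		fmt.Printf("blob %d holds %d chunk(s)\n", i, len(blob))
	}
}
```
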
## why

Existing backup software fails under one or more of these conditions:

* Requires secrets (passwords, private keys) on the source system
* Depends on symmetric encryption unsuitable for zero-trust environments
* Stages temporary archives or repositories
* Writes plaintext metadata or plaintext file paths

`vaultik` addresses all of these by using:

* Public-key-only encryption (via `age`), which requires no secrets on the
  source system other than the bucket access key
* Blob-level deduplication and batching
* Local state cache for incremental detection
* S3-native chunked upload interface
* Self-contained encrypted snapshot metadata

## how

1. **install**

```sh
go install git.eeqj.de/sneak/vaultik@latest
```

2. **generate keypair**

```sh
age-keygen -o agekey.txt
grep 'public key:' agekey.txt
```

3. **write config**

```yaml
source_dirs:
  - /etc
  - /home/user/data
exclude:
  - '*.log'
  - '*.tmp'
age_recipient: age1278m9q7dp3chsh2dcy82qk27v047zywyvtxwnj4cvt0z65jw6a7q5dqhfj
s3:
  endpoint: https://s3.example.com
  bucket: vaultik-data
  prefix: host1/
  access_key_id: ...
  secret_access_key: ...
  region: us-east-1
backup_interval: 1h # only used in daemon mode, not for --cron mode
full_scan_interval: 24h # normally we use inotify to mark dirty, but
                        # every 24h we do a full stat() scan
min_time_between_run: 15m # again, only for daemon mode
index_path: /var/lib/vaultik/index.sqlite
chunk_size: 10MB
blob_size_limit: 10GB
index_prefix: index/
```

4. **run**

```sh
vaultik backup /etc/vaultik.yaml
```

```sh
vaultik backup /etc/vaultik.yaml --cron # silent unless error
```

```sh
vaultik backup /etc/vaultik.yaml --daemon # runs in background, uses inotify
```

---

## cli

### commands

```sh
vaultik backup [--config <path>] [--cron] [--daemon]
vaultik restore --bucket <bucket> --prefix <prefix> --snapshot <id> --target <dir>
vaultik prune --bucket <bucket> --prefix <prefix> [--dry-run]
vaultik fetch --bucket <bucket> --prefix <prefix> --snapshot <id> --file <path> --target <path>
vaultik verify --bucket <bucket> --prefix <prefix> [--snapshot <id>] [--quick]
```

### environment

* `VAULTIK_PRIVATE_KEY`: Required for `restore`, `prune`, `fetch`, and `verify` commands. Contains the age private key for decryption.
* `VAULTIK_CONFIG`: Optional path to config file. If set, `vaultik backup` can be run without specifying the config file path.

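
One way the config path could be resolved is sketched below. The precedence (explicit `--config` value, then `VAULTIK_CONFIG`, then the documented default of `/etc/vaultik/config.yml`) is an assumption, and `resolveConfigPath` is a hypothetical helper rather than a function from the vaultik source. The environment lookup is passed in as a function so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// defaultConfigPath is the default location named in the backup command details.
const defaultConfigPath = "/etc/vaultik/config.yml"

// resolveConfigPath picks the config file path. The order used here
// (flag value, then VAULTIK_CONFIG, then the default) is an assumption.
func resolveConfigPath(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	if env := getenv("VAULTIK_CONFIG"); env != "" {
		return env
	}
	return defaultConfigPath
}

func main() {
	// With no flag value, the result depends on the current environment.
	fmt.Println(resolveConfigPath("", os.Getenv))
}
```
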
### command details

**backup**: Perform incremental backup of configured directories
* Config is located at `/etc/vaultik/config.yml` by default
* `--config`: Override config file path
* `--cron`: Silent unless error (for crontab)
* `--daemon`: Run continuously with inotify monitoring and periodic scans

**restore**: Restore entire snapshot to target directory
* Downloads and decrypts metadata
* Fetches only required blobs
* Reconstructs directory structure

**prune**: Remove unreferenced blobs from storage
* Requires private key
* Downloads latest snapshot metadata
* Deletes orphaned blobs

**fetch**: Extract single file from backup
* Retrieves specific file without full restore
* Supports extracting to different filename

**verify**: Validate backup integrity
* Checks metadata hash
* Verifies all referenced blobs exist
* Default: Downloads blobs and validates chunk integrity
* `--quick`: Only checks blob existence and S3 content hashes

---

## architecture

### chunking

* Content-defined chunking using rolling hash (Rabin fingerprint)
* Average chunk size: 10MB (configurable)
* Deduplication at chunk level
* Multiple chunks packed into blobs for efficiency

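
To illustrate content-defined chunking, here is a minimal sketch using only the Go standard library. It swaps in a gear-style rolling hash for the Rabin fingerprint named above, uses far smaller chunk sizes than the 10MB average so the demo runs quickly, and assumes chunks are identified by their SHA-256 hash. `splitChunks`, `chunkID`, and `gearTable` are hypothetical names, not taken from the vaultik source.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// gearTable holds one pseudo-random 64-bit value per byte value. It is filled
// by a fixed-seed xorshift generator so chunk boundaries are deterministic.
var gearTable = func() [256]uint64 {
	var t [256]uint64
	x := uint64(0x9E3779B97F4A7C15)
	for i := range t {
		x ^= x << 13
		x ^= x >> 7
		x ^= x << 17
		t[i] = x
	}
	return t
}()

// splitChunks cuts data where the top `bits` bits of the rolling hash are
// all zero, so boundaries depend on content rather than on byte offsets.
// No chunk is longer than maxSize, and only the final chunk may be shorter
// than minSize.
func splitChunks(data []byte, bits uint, minSize, maxSize int) [][]byte {
	var chunks [][]byte
	start := 0
	var h uint64
	for i, b := range data {
		// Shifting left ages out old bytes: only the last 64 bytes affect h.
		h = (h << 1) + gearTable[b]
		size := i - start + 1
		if size < minSize {
			continue
		}
		if h>>(64-bits) == 0 || size >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
			h = 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

// chunkID returns the hex SHA-256 of a chunk, used here as its dedup key.
func chunkID(c []byte) string {
	sum := sha256.Sum256(c)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Pseudo-random sample data from a fixed-seed xorshift generator.
	data := make([]byte, 1<<20)
	x := uint64(1)
	for i := range data {
		x ^= x << 13
		x ^= x >> 7
		x ^= x << 17
		data[i] = byte(x >> 32)
	}
	chunks := splitChunks(data, 13, 2048, 65536)
	fmt.Println("chunks:", len(chunks))
	fmt.Println("lossless:", bytes.Equal(bytes.Join(chunks, nil), data))
	fmt.Println("first chunk id:", chunkID(chunks[0]))
}
```

Because a boundary depends only on the bytes just before it, inserting data near the start of a file shifts the early chunks but tends to leave later boundaries, and therefore later chunk IDs, unchanged. That property is what makes chunk-level deduplication effective.
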
### encryption

* Asymmetric encryption using age (X25519 + ChaCha20-Poly1305)
* Only public key needed on source host
* Each blob encrypted independently
* Metadata databases also encrypted

### storage

* Content-addressed blob storage
* Immutable append-only design
* Two-level directory sharding for blobs (aa/bb/hash)
* Compressed with zstd before encryption

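
The `aa/bb/hash` layout can be sketched as a small key-building function. It assumes the shard directories are the first two and next two characters of the hex hash and that the configured prefix is simply prepended; `blobPath` is a hypothetical helper, not a function from the vaultik source.

```go
package main

import (
	"fmt"
	"path"
)

// blobPath builds an object key of the form <prefix>/aa/bb/<hash>, where aa
// and bb are the first two character pairs of the hash (an assumed layout).
func blobPath(prefix, hash string) (string, error) {
	if len(hash) < 4 {
		return "", fmt.Errorf("hash %q is too short to shard", hash)
	}
	return path.Join(prefix, hash[:2], hash[2:4], hash), nil
}

func main() {
	key, err := blobPath("host1/", "aabbccddeeff")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(key)
}
```
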
### state tracking

* Local SQLite database for incremental state
* Tracks file mtimes and chunk mappings
* Enables efficient change detection
* Supports inotify monitoring in daemon mode

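
A minimal sketch of mtime-based change detection follows. An in-memory map stands in for the SQLite index, and comparing file size alongside mtime is an assumption; the `fileState` and `index` types are hypothetical and not taken from the vaultik source.

```go
package main

import "fmt"

// fileState is the per-file record kept in the index.
type fileState struct {
	ModTimeNs int64 // modification time in Unix nanoseconds
	Size      int64 // file size in bytes
}

// index maps file paths to their last recorded state.
type index map[string]fileState

// changed reports whether path is new or differs from its recorded state.
func (idx index) changed(path string, cur fileState) bool {
	prev, ok := idx[path]
	return !ok || prev != cur
}

func main() {
	idx := index{"/etc/hosts": {ModTimeNs: 100, Size: 220}}

	fmt.Println(idx.changed("/etc/hosts", fileState{ModTimeNs: 100, Size: 220}))
	fmt.Println(idx.changed("/etc/hosts", fileState{ModTimeNs: 200, Size: 220}))
	fmt.Println(idx.changed("/etc/passwd", fileState{ModTimeNs: 100, Size: 10}))
}
```

Only files for which `changed` returns true would need to be re-chunked; everything else can reuse the chunk mappings already stored in the index.
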
## does not

* Store any decryption secrets on the backed-up machine
* Require mutable remote metadata
* Use tarballs, restic, rsync, or ssh
* Require a symmetric passphrase or password
* Trust the source system with anything

---

## does

* Incremental deduplicated backup
* Blob-packed chunk encryption
* Content-addressed immutable blobs
* Public-key encryption only
* SQLite-based local and snapshot metadata
* Fully stream-processed storage

---

## restore

`vaultik restore` downloads only the snapshot metadata and required blobs. It
never contacts the source system. All restore operations depend only on:

* `VAULTIK_PRIVATE_KEY`
* The bucket

A restore therefore needs nothing beyond object storage and the private key.

---

## features

### daemon mode

* Continuous background operation
* inotify-based change detection
* Respects `backup_interval` and `min_time_between_run`
* Full scan every `full_scan_interval` (default 24h)

### cron mode

* Single backup run
* Silent output unless errors
* Ideal for scheduled backups

### metadata integrity

* SHA256 hash of metadata stored separately
* Encrypted hash file for verification
* Chunked metadata support for large filesystems

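
The hash check can be sketched as follows. It assumes the stored hash is a hex-encoded SHA-256 of the metadata bytes, already decrypted; `verifyMetadata` is a hypothetical helper, not a function from the vaultik source.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// verifyMetadata checks metadata against an expected hex-encoded SHA-256.
func verifyMetadata(metadata []byte, wantHex string) error {
	want, err := hex.DecodeString(wantHex)
	if err != nil {
		return fmt.Errorf("decode expected hash: %w", err)
	}
	got := sha256.Sum256(metadata)
	// ConstantTimeCompare returns 1 only when both slices are equal.
	if subtle.ConstantTimeCompare(got[:], want) != 1 {
		return fmt.Errorf("metadata hash mismatch")
	}
	return nil
}

func main() {
	metadata := []byte("abc")
	sum := sha256.Sum256(metadata)
	stored := hex.EncodeToString(sum[:])

	fmt.Println("intact:", verifyMetadata(metadata, stored) == nil)
	fmt.Println("tampered:", verifyMetadata([]byte("abd"), stored) == nil)
}
```
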
### exclusion patterns

* Glob-based file exclusion
* Configured in YAML
* Applied during directory walk

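
A minimal sketch of glob-based exclusion, using Go's `path/filepath.Match`. Matching patterns against the file's base name (so that `*.log` excludes log files in any directory) is an assumption about how the patterns are applied, and `excluded` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// excluded reports whether the base name of path matches any glob pattern.
// Matching on the base name only is an assumption.
func excluded(path string, patterns []string) (bool, error) {
	base := filepath.Base(path)
	for _, p := range patterns {
		ok, err := filepath.Match(p, base)
		if err != nil {
			return false, fmt.Errorf("bad pattern %q: %w", p, err)
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	patterns := []string{"*.log", "*.tmp"}
	for _, p := range []string{"/var/log/app.log", "/etc/hosts"} {
		skip, err := excluded(p, patterns)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s excluded=%v\n", p, skip)
	}
}
```

During a directory walk, a check like this would run on each entry before the file is opened, so excluded files are never read or chunked.
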
## prune

Run `vaultik prune` on a machine with the private key. It:

* Downloads the most recent snapshot
* Decrypts metadata
* Lists referenced blobs
* Deletes any blob in the bucket not referenced

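
The core of the steps above is a set difference between what the bucket holds and what the snapshot references. A minimal sketch, with the S3 listing and the decrypted metadata replaced by plain string slices; `orphanedBlobs` is a hypothetical helper, not a function from the vaultik source.

```go
package main

import (
	"fmt"
	"sort"
)

// orphanedBlobs returns the keys in stored that do not appear in referenced,
// in sorted order. These are the candidates for deletion.
func orphanedBlobs(stored, referenced []string) []string {
	keep := make(map[string]struct{}, len(referenced))
	for _, k := range referenced {
		keep[k] = struct{}{}
	}
	var orphans []string
	for _, k := range stored {
		if _, ok := keep[k]; !ok {
			orphans = append(orphans, k)
		}
	}
	sort.Strings(orphans)
	return orphans
}

func main() {
	stored := []string{"aa/bb/blob1", "cc/dd/blob2", "ee/ff/blob3"}
	referenced := []string{"cc/dd/blob2"}
	for _, key := range orphanedBlobs(stored, referenced) {
		fmt.Println("would delete:", key)
	}
}
```
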
This enables garbage collection from immutable storage.

---

## license

WTFPL; see LICENSE.

---

## security considerations

* Source host compromise cannot decrypt backups
* No replay attacks possible (append-only)
* Each blob independently encrypted
* Metadata tampering detectable via hash verification
* S3 credentials only allow write access to backup prefix

## performance

* Streaming processing (no temp files)
* Parallel blob uploads
* Deduplication reduces storage and bandwidth
* Local index enables fast incremental detection
* Configurable compression levels

## requirements

* Go 1.24.4 or later
* S3-compatible object storage
* age command-line tool (for key generation)
* SQLite3
* Sufficient disk space for local index

## author

Made with love and lots of expensive SOTA AI by [sneak](https://sneak.berlin) in Berlin in the summer of 2025.

Released as a free software gift to the world, no strings attached, under the [WTFPL](https://www.wtfpl.net/) license.

Contact: [sneak@sneak.berlin](mailto:sneak@sneak.berlin)

[https://keys.openpgp.org/vks/v1/by-fingerprint/5539AD00DE4C42F3AFE11575052443F4DF2A55C2](https://keys.openpgp.org/vks/v1/by-fingerprint/5539AD00DE4C42F3AFE11575052443F4DF2A55C2)