# vaultik (ваултик)

`vaultik` is an incremental backup tool written in Go. It encrypts data using an `age` public key and uploads each encrypted blob directly to a remote S3-compatible object store. No private keys, secrets, or credentials need to be stored on the backed-up system, other than those required to write to the object store (such as S3 API keys).

## quickstart

```sh
# install
go install sneak.berlin/go/vaultik/cmd/vaultik@latest

# create a default config file (prints the path it wrote to)
vaultik config init

# generate an age keypair; keep the private key file somewhere safe and
# offline: you need it to restore, and the backed-up machine does not need it
age-keygen -o vaultik_backup_private_key.txt
grep 'public key' vaultik_backup_private_key.txt

# configure the encryption key and backup destination
vaultik config set age_recipients.0 age1YOUR_PUBLIC_KEY_HERE
vaultik config set storage_url "file:///Volumes/usbstick/mybackup"

# macOS only: grant your terminal app Full Disk Access first
# (System Settings → Privacy & Security → Full Disk Access), otherwise
# the backup will abort with a permission error on protected directories

# run your first backup (the default config backs up ~ and /Applications
# with sensible excludes)
vaultik snapshot create

# see what you have
vaultik snapshot list
```

Features:

* modern encryption ([age](https://age-encryption.org/), X25519 + ChaCha20-Poly1305)
* content-defined chunking with deduplication (FastCDC)
* incremental backups (only changed files are re-chunked)
* multithreaded zstd compression at configurable levels
* content-addressed immutable storage
* local state tracking in SQLite (enables write-only incremental backups)
* no mutable remote metadata
* no plaintext file paths or metadata in remote storage
* packs small files into large blobs (keeps S3 operation counts down)
* backs up regular files, symlinks, empty directories, and file permissions
* pluggable storage backends: S3, local filesystem,
rclone (70+ providers)
* pure Go (no CGO), cross-compiles to linux/darwin × amd64/arm64

## why

Other backup tools like `restic`, `borg`, and `duplicity` are designed for environments where the source host can store secrets and has access to decryption keys. `vaultik` is for environments where you don't want to store backup decryption keys on your hosts, only public keys for encryption.

Requirements that no existing tool meets in combination:

* open source
* no passphrases or private keys on the source host
* incremental
* compressed
* encrypted
* s3 compatible without an intermediate step or tool

## daily use

```sh
# verify a snapshot (shallow: checks all blobs exist)
vaultik snapshot verify

# deep verify (downloads and cryptographically verifies every blob)
VAULTIK_AGE_SECRET_KEY='AGE-SECRET-KEY-...' vaultik snapshot verify --deep

# restore (requires the private key)
VAULTIK_AGE_SECRET_KEY='AGE-SECRET-KEY-...' vaultik snapshot restore /tmp/restored

# daily cron job: back up, keep a 4-week rolling window of snapshots
# 0 3 * * * vaultik snapshot create --cron --prune --keep-newer-than 4w
```

---

## cli

### commands

```sh
vaultik [--config ] config init
vaultik [--config ] config edit
vaultik [--config ] config get
vaultik [--config ] config set
vaultik [--config ] snapshot create [snapshot-names...] [--cron] [--prune] [--keep-newer-than ]
vaultik [--config ] snapshot list [--json]
vaultik [--config ] snapshot verify [--deep] [--json]
vaultik [--config ] snapshot purge [--keep-latest | --older-than ] [--snapshot ...] [--force]
vaultik [--config ] snapshot remove [--dry-run] [--force] [--remote] [--json]
vaultik [--config ] snapshot prune
vaultik [--config ] snapshot cleanup
vaultik [--config ] restore [paths...] [--verify]
vaultik [--config ] prune [--force] [--json]
vaultik [--config ] info
vaultik [--config ] remote info [--json]
vaultik [--config ] remote nuke --force
vaultik [--config ] store info
vaultik [--config ] database purge [--force]
vaultik completion
vaultik version
```

### global flags

* `--config `: Path to config file (default: `$VAULTIK_CONFIG`, then platform config dir, then `/etc/vaultik/config.yml`)
* `--verbose`, `-v`: Enable verbose output
* `--debug`: Enable debug output
* `--quiet`, `-q`: Suppress non-error output (also suppresses startup banner)
* `--skip-errors`: Continue past per-file errors instead of aborting (applies to `snapshot create` and `restore`)

### environment variables

* `VAULTIK_AGE_SECRET_KEY`: Age private key for decryption (required for `restore` and `verify --deep`)
* `VAULTIK_CONFIG`: Path to config file (overridden by `--config`)
* `VAULTIK_INDEX_PATH`: Override local SQLite index path

### shell completion

```sh
# zsh: load for the current session
source <(vaultik completion zsh)

# zsh: install permanently
vaultik completion zsh > "${fpath[1]}/_vaultik"

# bash: load for the current session
source <(vaultik completion bash)

# bash: install permanently (Linux)
vaultik completion bash > /etc/bash_completion.d/vaultik

# fish
vaultik completion fish > ~/.config/fish/completions/vaultik.fish
```

### command details

**`config init`**: Write a default config file with commented explanations for every setting. Writes to the path from `--config`, `$VAULTIK_CONFIG`, or the platform config directory (`~/Library/Application Support/vaultik/` on macOS, `~/.config/vaultik/` on Linux, `/etc/vaultik/` as root). Refuses to overwrite an existing file. Created with mode `0600` since it will contain credentials.

**`config edit`**: Open the config file in `$EDITOR` (falls back to `vi`).

**`config get`**: Print a config value addressed by dotted YAML path (e.g. `vaultik config get s3.bucket`). Non-scalar values print as YAML.
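To make the dotted-path addressing concrete, here is a minimal sketch of how a path such as `s3.bucket` or `age_recipients.0` could be resolved against a decoded config. It assumes the YAML has already been parsed into nested maps and slices; the `lookup` function is a hypothetical helper for illustration, not vaultik's actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// lookup resolves a dotted path against a decoded YAML document
// represented as nested map[string]any and []any values.
// Hypothetical helper for illustration; not taken from vaultik.
func lookup(doc any, path string) (any, error) {
	cur := doc
	for _, key := range strings.Split(path, ".") {
		switch node := cur.(type) {
		case map[string]any:
			next, ok := node[key]
			if !ok {
				return nil, fmt.Errorf("key %q not found", key)
			}
			cur = next
		case []any:
			// A numeric segment indexes into a list, as in age_recipients.0.
			i, err := strconv.Atoi(key)
			if err != nil || i < 0 || i >= len(node) {
				return nil, fmt.Errorf("invalid list index %q", key)
			}
			cur = node[i]
		default:
			return nil, fmt.Errorf("cannot descend into scalar at %q", key)
		}
	}
	return cur, nil
}

func main() {
	// Example values only; field names follow the ones mentioned in this README.
	cfg := map[string]any{
		"compression_level": 3,
		"s3":                map[string]any{"bucket": "example-bucket"},
		"age_recipients":    []any{"age1YOUR_PUBLIC_KEY_HERE"},
	}
	for _, p := range []string{"s3.bucket", "age_recipients.0", "compression_level"} {
		v, err := lookup(cfg, p)
		fmt.Println(p, "=", v, err)
	}
}
```

A path that ends on a map or list returns that whole subtree, which corresponds to the "non-scalar values print as YAML" behavior described above.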
**`config set`**: Set a scalar config value by dotted YAML path (e.g. `vaultik config set compression_level 9`). Comments and formatting in the file are preserved; intermediate maps are created as needed.

**`snapshot create`**: Perform incremental backup of configured snapshots.
* Optional snapshot names argument to create specific snapshots (default: all)
* On macOS, the terminal application running vaultik needs Full Disk Access (System Settings → Privacy & Security → Full Disk Access) to read TCC-protected directories; without it the backup aborts with a permission error that explains how to fix it
* `--cron`: Silent unless error (for crontab)
* `--prune`: After backup, drop older snapshots of each backed-up name and remove orphaned blobs from remote storage. By default keeps only the latest snapshot per name; use `--keep-newer-than` for a rolling window.
* `--keep-newer-than `: With `--prune`, keep snapshots newer than this duration instead of only the latest (e.g. `4w`, `30d`, `6mo`, `1y`)

**`snapshot list`**: List all snapshots with their timestamps and sizes.
* `--json`: Output in JSON format

**`snapshot verify`**: Verify snapshot integrity.
* Default (shallow): checks that all blobs referenced in the manifest exist in storage
* `--deep`: Downloads and decrypts each blob, verifies chunk hashes against the encrypted metadata database
* `--json`: Output results as JSON

**`snapshot purge`**: Remove old snapshots based on criteria. Retention is per-snapshot-name (`--keep-latest` keeps the latest of each name, not the latest globally).
* `--keep-latest`: Keep only the most recent snapshot of each name
* `--older-than `: Remove snapshots older than duration (e.g. `30d`, `6m`, `1y`)
* `--snapshot `: Restrict to specific snapshot names (repeat for multiple)
* `--force`: Skip confirmation prompt

**`snapshot remove`**: Remove a specific snapshot from the local database.
* `--remote`: Also remove snapshot metadata from remote storage
* `--all`: Remove all snapshots (requires `--force`)
* `--dry-run`: Show what would be deleted without deleting
* `--force`: Skip confirmation prompt
* `--json`: Output result as JSON

**`snapshot prune`**: Clean orphaned data from the local database (files, chunks, blobs not referenced by any snapshot).

**`snapshot cleanup`**: Remove stale local snapshot records that have no corresponding metadata in remote storage. These are typically left behind by incomplete or interrupted backups. Does not touch remote storage.

**`restore`**: Restore files from a backup snapshot.
* Requires `VAULTIK_AGE_SECRET_KEY` environment variable
* Optional path arguments to restore specific files/directories (default: all)
* Preserves file permissions, timestamps, ownership (ownership requires root), symlinks, and empty directories
* `--verify`: After restoring, verify every file's chunk hashes match

**`prune`**: Remove unreferenced blobs from remote storage.
* Scans all snapshot manifests for referenced blobs, deletes any blob not referenced
* `--force`: Skip confirmation prompt
* `--json`: Output stats as JSON

**`info`**: Display system configuration, storage settings, encryption recipients, and local database statistics.

**`remote info`**: Show detailed remote storage information including per-snapshot metadata sizes, blob counts, and orphaned blob detection.
* `--json`: Output as JSON

**`remote nuke`**: Delete every snapshot's metadata and every blob from the backup destination store, leaving the bucket prefix empty. Destructive and irreversible.
* `--force`: Required to confirm destruction.

**`store info`**: Display storage backend type and statistics.

**`database purge`**: Delete the local SQLite state database entirely. Remote storage is unaffected; the next backup will do a full scan and re-deduplicate against existing remote blobs.
* `--force`: Skip confirmation prompt

---

## storage backends

vaultik supports three storage backends, selected via the `storage_url` config field:

**S3** (`s3://bucket/prefix?endpoint=host&region=us-east-1`): Any S3-compatible object store. Credentials are read from `s3.access_key_id` and `s3.secret_access_key` in the config file.

**Local filesystem** (`file:///path/to/backup`): Stores blobs and metadata on a local or mounted filesystem. Useful for testing or backing up to a NAS.

**Rclone** (`rclone://remote/path`): Uses rclone's 70+ supported cloud providers. Requires rclone to be configured separately (`rclone config`).

Legacy S3 configuration via `s3.*` fields (endpoint, bucket, prefix, etc.) is still supported for backward compatibility. `storage_url` takes precedence if both are set.

---

## architecture

### remote storage layout

```
//
├── blobs/
│   └── //
└── metadata/
    └── /
        ├── db.zst.age          # Encrypted binary SQLite database
        └── manifest.json.zst   # Unencrypted blob list (for pruning)
```

* Blobs are sharded into two directory levels using the first 4 hex chars of the blob hash
* `db.zst.age` is a binary SQLite database (zstd compressed, age encrypted) containing all file metadata, chunk mappings, and relationships for the snapshot
* `manifest.json.zst` is an unencrypted compressed JSON blob list, enabling pruning without the private key

Snapshot IDs follow the format `__` (e.g. `server1_home_2025-06-01T12:00:00Z`).

### data flow

**backup:**

1. Open local SQLite index, load known files and chunks into memory
2. Walk source directories, compare mtime/size/mode against index
3. For changed/new files: chunk using content-defined chunking (FastCDC)
4. For symlinks and directories: record metadata (no chunking)
5. For each chunk: hash, check dedup, add to blob packer
6. When blob reaches size threshold: compress (zstd), encrypt (age), upload
7. Build snapshot metadata database, compress, encrypt, upload
8. Create unencrypted blob manifest for pruning support

**restore:**

1. Download and decrypt `metadata//db.zst.age`
2. Open the binary SQLite database
3. Query files (optionally filtered by paths)
4. Download and decrypt required blobs
5. Extract chunks, reconstruct files
6. Restore permissions, timestamps, ownership, symlinks

**prune:**

1. List all snapshot manifests
2. Build set of all referenced blob hashes
3. List all blobs in storage
4. Delete any blob not in the referenced set

### chunking and deduplication

* Content-defined chunking using the FastCDC algorithm
* Average chunk size: configurable (default 10MB)
* Deduplication at file level (unchanged files skipped) and chunk level (identical chunks across files stored once)
* Multiple chunks packed into blobs to reduce object count

### encryption

* Asymmetric encryption using age (X25519 + ChaCha20-Poly1305)
* Only the public key is needed on the source host
* Each blob and each metadata database is encrypted independently
* Multiple recipients supported (encrypt to multiple keys)

### compression

* zstd compression at configurable level (1-19, default 3)
* Applied before encryption at the blob level

---

## configuration reference

Run `vaultik config init` to generate a fully commented config file.
Key fields:

| Field | Default | Description |
|-------|---------|-------------|
| `age_recipients` | (required) | Age public keys for encryption |
| `snapshots` | (required) | Named snapshot definitions with paths and excludes |
| `storage_url` | | Storage backend URL (`s3://`, `file://`, `rclone://`) |
| `s3.*` | | Legacy S3 configuration (endpoint, bucket, credentials) |
| `exclude` | | Global exclude patterns (applied to all snapshots) |
| `chunk_size` | `10MB` | Average chunk size for content-defined chunking |
| `blob_size_limit` | `10GB` | Maximum blob size before splitting |
| `compression_level` | `3` | zstd compression level (1-19) |
| `hostname` | system hostname | Hostname used in snapshot IDs |
| `index_path` | platform data dir | Local SQLite index path |

---

## limitations

* **No extended attributes (xattrs).** ACLs, macOS Finder metadata, quarantine flags, SELinux labels, and other extended attributes are not backed up or restored.
* **No hard link detection.** Two hard links to the same inode are backed up as independent files. Content deduplication means the data is stored once, but the hard link relationship is lost on restore.
* **No sparse file support.** Sparse files are fully materialized during backup. A 100 GB sparse VM disk that is mostly zeros will consume the full (compressed) size in storage.
* **No bandwidth limiting.** Uploads and downloads use whatever bandwidth is available. There is no `--bwlimit` flag yet.
* **No parallel blob downloads during restore.** Blobs are fetched sequentially. Restore speed is bound by single-stream throughput.
* **Device nodes, named pipes, and sockets are silently skipped.** Only regular files, directories, and symlinks are backed up.
* **No database migrations.** If the local SQLite schema changes between versions, delete the local database (`vaultik database purge`) and run a full backup. Remote storage is unaffected.
* **Files that change during backup may be inconsistent.** There is no filesystem snapshot or freeze. If a file is modified between the scan and chunk phases, the backed-up copy may reflect a partial write.
* **Ownership restoration requires root.** File uid/gid are recorded and restored, but `chown` requires elevated privileges. Without root, files are restored with the current user's ownership.

---

## roadmap

Items for future releases:

* Error-condition tests (network failures, disk full, corrupted/missing blobs)
* Parallel blob downloads during restore
* Bandwidth limiting (`--bwlimit`)
* Security audit of encryption implementation
* Man pages and richer `--help` examples

---

## output style

All user-facing output goes through helpers in `internal/ui` and conforms to a uniform style. Color is enabled when stdout is a TTY and the `NO_COLOR` environment variable is unset (https://no-color.org/).

Message classes:

| Class | Marker | Alignment | Use for |
|-------|--------|-----------|---------|
| Banner | none | column 0 | The startup line printed once per invocation |
| Begin | `》` (white) | column 0 | An operation is about to start (present-continuous verb) |
| Complete | `》` (green) | column 0 | An operation just finished (past-tense verb) |
| Info | `》` (white) | column 0 | Neutral status update |
| Notice | `》` (cyan) | column 0 | Important note that is not a warning |
| Warning | `⚠️ Warning:` (orange/yellow) | column 0 | Recoverable problem |
| Error | `🛑 ERROR:` (red) | column 0 | Operation aborted |
| Progress | ` 》` (white) | column 2 | Heartbeat or per-item status during a long-running operation |
| Detail | ` 》` (white) | column 2 | Continuation/sub-line of a preceding Complete (visually identical to Progress) |

Conventions:

* Messages are complete English sentences ending with a period.
* Fully qualify terms: say "backup destination store" instead of "storage", "snapshot source files enumeration" instead of "scan", "local index database" instead of "database".
* Every operation that emits a Complete also emits a corresponding Begin. Operations that print only a Begin (because completion is obvious from a later Begin) should be rare and intentional.
* Use natural verb tense to signal state: "Uploading" for Begin, "Uploaded" for Complete. Never write the words "begin" or "complete" in the body; the marker color already conveys that.
* All elapsed and remaining-time fields are explicitly scoped to their subject: write "blob upload elapsed: 30s, blob upload ETA: 03:15:00 (est remain 14s)", never just "elapsed 30s, ETA 14s".
* "ETA" means an absolute clock time (when the operation will finish), not a remaining-duration. Use `ui.Time()` for the former and `ui.Duration()` for the latter, and label both.
* `ui.Time` formats same-day times as `HH:MM:SS` and other-day times as `YYYY-MM-DD HH:MM:SS`. No timezone is shown; local time is implied.

Value colorizers in `internal/ui` colorize specific value types consistently. Compose messages from these helpers rather than embedding ANSI escapes inline:

| Helper | Color | Use for |
|--------|-------|---------|
| `Hex` | cyan | Blob hashes, chunk hashes (truncated to 12 chars + `...`) |
| `Snapshot` | bold cyan | Snapshot IDs (untruncated) |
| `Path` | blue | Filesystem paths |
| `Size` | magenta | Byte counts (human-readable) |
| `Speed` | magenta | Bytes-per-second rates |
| `Duration` | yellow | Elapsed or remaining time |
| `Time` | yellow | Absolute clock times |
| `Count` | magenta | Integer counts with thousands separators |
| `Percent` | magenta | Percentages |

When `NO_COLOR` is set or output is not a TTY, all helpers return plain text and the marker prefixes (`》`, `Warning:`, `ERROR:`) emit without ANSI escapes.
The emoji prefixes on Warning and Error are always emitted regardless of color setting (emoji are not color).

## requirements

* Go 1.26 or later
* S3-compatible object storage (or local filesystem, or rclone remote)

## development workflow

All changes follow this workflow. No exceptions.

1. Create a feature branch off `main`.
2. Write tests.
3. Write the implementation.
4. Fix implementation errors until it compiles and tests pass.
5. Fix linting errors (`make lint`).
6. Update documentation and README as required by the change.
7. Format code (`make fmt`).
8. Run `make check` (lint + fmt-check + test). Fix any issues. Repeat until clean.
9. Commit on the branch.
10. Merge to `main`.
11. Push.

Do not commit directly to `main`. Do not skip steps.

Repository policies for AI agents are in [`AGENTS.md`](AGENTS.md).

## license

[MIT](https://opensource.org/license/mit/)

## author

Made with love and lots of expensive SOTA AI by [sneak](https://sneak.berlin) in Berlin in the summer of 2025.

Released as a free software gift to the world, no strings attached.

Contact: [sneak@sneak.berlin](mailto:sneak@sneak.berlin)

[https://keys.openpgp.org/vks/v1/by-fingerprint/5539AD00DE4C42F3AFE11575052443F4DF2A55C2](https://keys.openpgp.org/vks/v1/by-fingerprint/5539AD00DE4C42F3AFE11575052443F4DF2A55C2)