# vaultik: Design Document

`vaultik` is a secure backup tool written in Go. It performs streaming backups using content-defined chunking, blob grouping, asymmetric encryption, and object storage. The system is designed for environments where the backup source host cannot store secrets and cannot retrieve or decrypt any data from the destination.

The source host is stateful: it maintains a local SQLite index to detect changes, deduplicate content, and track uploads across backup runs. All remote storage is encrypted and append-only. Pruning of unreferenced data is done from a trusted host with access to decryption keys, as even the metadata indices are encrypted in the blob store.
## Why ANOTHER backup tool??

Other backup tools like restic, borg, and duplicity are designed for environments where the source host can store secrets and has access to decryption keys. I don't want to store backup decryption keys on my hosts, only public keys for encryption.

My requirements are:
- open source
- no passphrases or private keys on the source host
- incremental
- compressed
- encrypted
- s3 compatible without an intermediate step or tool
Surprisingly, no existing tool meets these requirements, so I wrote `vaultik`.
## Design Goals
- Backups must require only a public key on the source host.
- No secrets or private keys may exist on the source system.
- Obviously, restore must be possible using only the backup bucket and a private key.
- Prune must be possible, although this requires a private key and so must be done on a different, trusted host.
- All encryption is done using `age` (X25519, ChaCha20-Poly1305).
- Compression uses `zstd` at a configurable level.
- Files are chunked, and multiple chunks are packed into encrypted blobs. This reduces the number of objects in the blob store for filesystems with many small files.
- All metadata (snapshots) is stored remotely as encrypted SQLite DBs.
- If a snapshot metadata file exceeds a configured size threshold, it is chunked into multiple encrypted `.age` parts, to support large filesystems.
- CLI interface is structured using `cobra`.
## S3 Bucket Layout
S3 stores only three things:
- Blobs: encrypted, compressed packs of file chunks.
- Metadata: encrypted SQLite databases containing the current state of the filesystem at the time of the snapshot.
- Metadata hashes: encrypted hashes of the metadata SQLite databases.

```
s3://<bucket>/<prefix>/
├── blobs/
│   ├── <aa>/<bb>/<full_blob_hash>.zst.age
├── metadata/
│   ├── <snapshot_id>.sqlite.age
│   ├── <snapshot_id>.sqlite.00.age
│   ├── <snapshot_id>.sqlite.01.age
│   └── <snapshot_id>.hash.age
```
To retrieve a given file, you would:

- fetch `metadata/<snapshot_id>.sqlite.age` or `metadata/<snapshot_id>.sqlite.{seq}.age`
- fetch `metadata/<snapshot_id>.hash.age`
- decrypt the metadata SQLite database using the private key and reconstruct the full database file
- verify the hash of the decrypted database matches the decrypted hash
- query the database for the file in question
- determine all chunks for the file
- for each chunk, look up the metadata for all blobs in the db
- fetch each blob from `blobs/<aa>/<bb>/<blob_hash>.zst.age`
- decrypt each blob using the private key
- decompress each blob using zstd
- reconstruct the file from the set of file chunks stored in the blobs

If clever, it may be possible to do this chunk by chunk without touching disk (except for the output file), as each uncompressed blob should fit in memory (<10GB).
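
The decrypt-and-decompress half of that pipeline is small enough to sketch. This assumes the `filippo.io/age` and `github.com/klauspost/compress/zstd` libraries (consistent with the TODO list later in this document); `decodeBlob` is an illustrative name, not part of the design.

```go
package restore

import (
	"io"

	"filippo.io/age"
	"github.com/klauspost/compress/zstd"
)

// decodeBlob takes the raw bytes of one blobs/<aa>/<bb>/<hash>.zst.age object
// and returns a reader over the decompressed chunk data. Blobs are compressed
// and then encrypted on the way up, so reads decrypt first and decompress second.
func decodeBlob(encrypted io.Reader, identity age.Identity) (io.ReadCloser, error) {
	decrypted, err := age.Decrypt(encrypted, identity)
	if err != nil {
		return nil, err
	}
	zr, err := zstd.NewReader(decrypted)
	if err != nil {
		return nil, err
	}
	return zr.IOReadCloser(), nil
}
```

Because both readers stream, a caller can copy the needed chunk ranges straight into the output file without buffering the whole blob.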
## Path Rules

- `<snapshot_id>`: UTC timestamp in ISO 8601 format, e.g. `2023-10-01T12:00:00Z`. These are lexicographically sortable.
- `blobs/<aa>/<bb>/...`: where `aa` and `bb` are the first 2 hex bytes of the blob hash.
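
These rules are mechanical; a minimal sketch of both (helper names are illustrative, not part of the design):

```go
package layout

import (
	"fmt"
	"time"
)

// NewSnapshotID formats a UTC timestamp as ISO 8601 / RFC 3339 at second
// precision, e.g. 2023-10-01T12:00:00Z, which sorts lexicographically.
func NewSnapshotID(t time.Time) string {
	return t.UTC().Format("2006-01-02T15:04:05Z")
}

// BlobKey maps a hex blob hash (at least 4 hex characters) to its object key
// under blobs/, e.g. "blobs/ab/cd/abcd1234....zst.age".
func BlobKey(blobHash string) string {
	return fmt.Sprintf("blobs/%s/%s/%s.zst.age", blobHash[0:2], blobHash[2:4], blobHash)
}
```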
## 3. Local SQLite Index Schema (source host)

```sql
CREATE TABLE files (
    path TEXT PRIMARY KEY,
    mtime INTEGER NOT NULL,
    size INTEGER NOT NULL
);

CREATE TABLE file_chunks (
    path TEXT NOT NULL,
    idx INTEGER NOT NULL,
    chunk_hash TEXT NOT NULL,
    PRIMARY KEY (path, idx)
);

CREATE TABLE chunks (
    chunk_hash TEXT PRIMARY KEY,
    sha256 TEXT NOT NULL,
    size INTEGER NOT NULL
);

CREATE TABLE blobs (
    blob_hash TEXT PRIMARY KEY,
    final_hash TEXT NOT NULL,
    created_ts INTEGER NOT NULL
);

CREATE TABLE blob_chunks (
    blob_hash TEXT NOT NULL,
    chunk_hash TEXT NOT NULL,
    offset INTEGER NOT NULL,
    length INTEGER NOT NULL,
    PRIMARY KEY (blob_hash, chunk_hash)
);

CREATE TABLE chunk_files (
    chunk_hash TEXT NOT NULL,
    file_path TEXT NOT NULL,
    file_offset INTEGER NOT NULL,
    length INTEGER NOT NULL,
    PRIMARY KEY (chunk_hash, file_path)
);

CREATE TABLE snapshots (
    id TEXT PRIMARY KEY,
    hostname TEXT NOT NULL,
    vaultik_version TEXT NOT NULL,
    created_ts INTEGER NOT NULL,
    file_count INTEGER NOT NULL,
    chunk_count INTEGER NOT NULL,
    blob_count INTEGER NOT NULL
);
```
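
The `chunks` table is what makes deduplication cheap: before packing a chunk, the backup path only has to ask whether its hash is already indexed. A sketch, assuming `database/sql` with any SQLite driver (the helper name is illustrative):

```go
package index

import "database/sql"

// HasChunk reports whether a chunk hash is already known to the local index,
// meaning its data is already packed into some blob and need not be re-uploaded.
func HasChunk(db *sql.DB, chunkHash string) (bool, error) {
	var one int
	err := db.QueryRow(`SELECT 1 FROM chunks WHERE chunk_hash = ?`, chunkHash).Scan(&one)
	if err == sql.ErrNoRows {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}
```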
## 4. Snapshot Metadata Schema (stored in S3)

Identical schema to the local index, filtered to live snapshot state. Stored as a SQLite DB, compressed with `zstd`, encrypted with `age`. If larger than a configured `chunk_size`, it is split and uploaded as:

```
metadata/<snapshot_id>.sqlite.00.age
metadata/<snapshot_id>.sqlite.01.age
...
```
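
One way to do the split, assuming the finished compressed-and-encrypted stream is simply cut into `chunk_size` pieces (whether each part is independently encrypted is left open by this design); `uploadPart` stands in for the S3 upload call:

```go
package metadata

import (
	"fmt"
	"io"
)

// uploadParts cuts the stream r into parts of at most chunkSize bytes and
// hands each one to uploadPart under the metadata/<snapshot_id>.sqlite.NN.age
// naming scheme described above.
func uploadParts(snapshotID string, r io.Reader, chunkSize int64,
	uploadPart func(key string, data []byte) error) error {

	buf := make([]byte, chunkSize)
	for seq := 0; ; seq++ {
		// ReadFull fills buf unless the stream ends first.
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			key := fmt.Sprintf("metadata/%s.sqlite.%02d.age", snapshotID, seq)
			if uerr := uploadPart(key, buf[:n]); uerr != nil {
				return uerr
			}
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}
```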
## 5. Data Flow

### 5.1 Backup
- Load config
- Open local SQLite index
- Walk source directories:
  - For each file:
    - Check mtime and size in index
    - If changed or new:
      - Chunk file
      - For each chunk:
        - Hash with SHA256
        - Check if already uploaded
        - If not, add chunk to blob packer
      - Record file-chunk mapping in index
- When blob reaches threshold size (e.g. 1GB), as sketched after this list:
  - Compress with `zstd`
  - Encrypt with `age`
  - Upload to `s3://<bucket>/<prefix>/blobs/<aa>/<bb>/<hash>.zst.age`
  - Record blob-chunk layout in local index
- Once all files are processed:
  - Build snapshot SQLite DB from index delta
  - Compress + encrypt
  - If larger than `chunk_size`, split into parts
  - Upload to `s3://<bucket>/<prefix>/metadata/<snapshot_id>.sqlite(.xx).age`
- Create snapshot record in local index that lists:
  - snapshot ID
  - hostname
  - vaultik version
  - timestamp
  - counts of files, chunks, and blobs
  - list of all blobs referenced in the snapshot (some new, some old) for efficient pruning later
- Create snapshot database for upload
- Calculate checksum of snapshot database
- Compress, encrypt, split, and upload to S3
- Encrypt the hash of the snapshot database to the backup age key
- Upload the encrypted hash to S3 as `metadata/<snapshot_id>.hash.age`
- Optionally prune remote blobs that are no longer referenced in the snapshot, based on the local state db
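
A sketch of the compress-encrypt-upload step referenced above, assuming `filippo.io/age`, `github.com/klauspost/compress/zstd`, and the MinIO client named in the TODO list; `uploadBlob` and its parameters are illustrative:

```go
package backup

import (
	"context"
	"io"

	"filippo.io/age"
	"github.com/klauspost/compress/zstd"
	"github.com/minio/minio-go/v7"
)

// uploadBlob streams a packed blob through zstd and age into S3 at
// blobs/<aa>/<bb>/<hash>.zst.age. blobHash is the blob's hex hash as recorded
// in the local index.
func uploadBlob(ctx context.Context, s3 *minio.Client, bucket, prefix, blobHash string,
	plaintext io.Reader, recipient age.Recipient) error {

	pr, pw := io.Pipe()

	go func() {
		// Writer stack: plaintext -> zstd -> age -> pipe -> S3.
		enc, err := age.Encrypt(pw, recipient)
		if err != nil {
			pw.CloseWithError(err)
			return
		}
		zw, err := zstd.NewWriter(enc)
		if err != nil {
			pw.CloseWithError(err)
			return
		}
		if _, err := io.Copy(zw, plaintext); err != nil {
			pw.CloseWithError(err)
			return
		}
		if err := zw.Close(); err != nil {
			pw.CloseWithError(err)
			return
		}
		pw.CloseWithError(enc.Close())
	}()

	key := prefix + "blobs/" + blobHash[:2] + "/" + blobHash[2:4] + "/" + blobHash + ".zst.age"
	// Size -1 lets the client stream the pipe as a multipart upload.
	_, err := s3.PutObject(ctx, bucket, key, pr, -1, minio.PutObjectOptions{})
	return err
}
```

Nothing in this path needs the private key: the source host only ever holds the `age` recipient (public key).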
### 5.2 Manual Prune
- List all objects under `metadata/`
- Determine the latest valid `snapshot_id` by timestamp
- Download, decrypt, and reconstruct the latest snapshot SQLite database
- Extract the set of referenced blob hashes
- List all blob objects under `blobs/`
- For each blob:
  - If the hash is not in the latest snapshot:
    - Issue `DeleteObject` to remove it (see the sketch after this list)
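
The deletion loop might look like the sketch below, assuming the MinIO client from the TODO list and that the referenced-blob set has already been extracted from the decrypted snapshot; `pruneBlobs` and `dryRun` mirror the `--dry-run` flag but are otherwise illustrative:

```go
package prune

import (
	"context"
	"log"
	"path"
	"strings"

	"github.com/minio/minio-go/v7"
)

// pruneBlobs deletes every blob object whose hash is not in the referenced set.
func pruneBlobs(ctx context.Context, s3 *minio.Client, bucket, prefix string,
	referenced map[string]bool, dryRun bool) error {

	opts := minio.ListObjectsOptions{Prefix: prefix + "blobs/", Recursive: true}
	for obj := range s3.ListObjects(ctx, bucket, opts) {
		if obj.Err != nil {
			return obj.Err
		}
		// Object keys look like <prefix>blobs/aa/bb/<hash>.zst.age.
		hash := strings.TrimSuffix(path.Base(obj.Key), ".zst.age")
		if referenced[hash] {
			continue
		}
		if dryRun {
			log.Printf("would delete %s", obj.Key)
			continue
		}
		if err := s3.RemoveObject(ctx, bucket, obj.Key, minio.RemoveObjectOptions{}); err != nil {
			return err
		}
	}
	return nil
}
```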
### 5.3 Verify

Verify runs on a host that has no local state but has access to the bucket.
- Fetch latest metadata snapshot files from S3
- Fetch latest metadata db hash from S3
- Decrypt the hash using the private key
- Decrypt the metadata SQLite database chunks using the private key and reassemble the snapshot db file
- Calculate the SHA256 hash of the decrypted snapshot database
- Verify the db file hash matches the decrypted hash (see the sketch after this list)
- For each blob in the snapshot:
  - Fetch the blob metadata from the snapshot db
  - Ensure the blob exists in S3
  - Check the S3 content hash matches the expected blob hash
  - If not using `--quick` mode:
    - Download and decrypt the blob
    - Decompress and verify chunk hashes match metadata
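
The hash check reduces to hashing the reassembled database and comparing it with the decrypted value from `metadata/<snapshot_id>.hash.age`. A minimal sketch (the function name is illustrative):

```go
package verify

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// verifySnapshotDB returns an error unless the SHA256 of the decrypted
// snapshot database at dbPath matches expectedHex.
func verifySnapshotDB(dbPath, expectedHex string) error {
	f, err := os.Open(dbPath)
	if err != nil {
		return err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return err
	}
	got := hex.EncodeToString(h.Sum(nil))
	if got != expectedHex {
		return fmt.Errorf("snapshot db hash mismatch: got %s, want %s", got, expectedHex)
	}
	return nil
}
```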
## 6. CLI Commands

```
vaultik backup [--config <path>] [--cron] [--daemon]
vaultik restore --bucket <bucket> --prefix <prefix> --snapshot <id> --target <dir>
vaultik prune --bucket <bucket> --prefix <prefix> [--dry-run]
vaultik verify --bucket <bucket> --prefix <prefix> [--snapshot <id>] [--quick]
vaultik fetch --bucket <bucket> --prefix <prefix> --snapshot <id> --file <path> --target <path>
```
- `VAULTIK_PRIVATE_KEY` is required for the `restore`, `prune`, `verify`, and `fetch` commands.
- It is passed via an environment variable containing the age private key.
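
Parsing that key is one call in the `age` library; a sketch (the helper name and error text are illustrative):

```go
package cli

import (
	"fmt"
	"os"
	"strings"

	"filippo.io/age"
)

// identityFromEnv reads the age X25519 private key ("AGE-SECRET-KEY-1...")
// from VAULTIK_PRIVATE_KEY for the restore/prune/verify/fetch commands.
func identityFromEnv() (*age.X25519Identity, error) {
	key := strings.TrimSpace(os.Getenv("VAULTIK_PRIVATE_KEY"))
	if key == "" {
		return nil, fmt.Errorf("VAULTIK_PRIVATE_KEY is not set")
	}
	return age.ParseX25519Identity(key)
}
```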
## 7. Function and Method Signatures

### 7.1 CLI

```go
func RootCmd() *cobra.Command
func backupCmd() *cobra.Command
func restoreCmd() *cobra.Command
func pruneCmd() *cobra.Command
func verifyCmd() *cobra.Command
```
### 7.2 Configuration

```go
type Config struct {
	BackupPubKey      string        // age recipient
	BackupInterval    time.Duration // used in daemon mode, irrelevant for cron mode
	BlobSizeLimit     int64         // default 10GB
	ChunkSize         int64         // default 10MB
	Exclude           []string      // list of regex of files to exclude from backup, absolute path
	Hostname          string
	IndexPath         string        // path to local SQLite index db, default /var/lib/vaultik/index.db
	MetadataPrefix    string        // S3 prefix for metadata, default "metadata/"
	MinTimeBetweenRun time.Duration // minimum time between backup runs, default 1 hour - for daemon mode
	S3                S3Config      // S3 configuration
	ScanInterval      time.Duration // interval to full stat() scan source dirs, default 24h
	SourceDirs        []string      // list of source directories to back up, absolute paths
}

type S3Config struct {
	Endpoint        string
	Bucket          string
	Prefix          string
	AccessKeyID     string
	SecretAccessKey string
	Region          string
}

func Load(path string) (*Config, error)
```
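
A sketch of `Load`, assuming `gopkg.in/yaml.v3` and YAML struct tags on `Config` (tags omitted from the declaration above); the defaults follow the field comments:

```go
package config

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

// Load reads and parses the YAML config file, then fills in documented defaults.
func Load(path string) (*Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var cfg Config
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parsing %s: %w", path, err)
	}
	if cfg.BlobSizeLimit == 0 {
		cfg.BlobSizeLimit = 10 << 30 // 10 GiB
	}
	if cfg.ChunkSize == 0 {
		cfg.ChunkSize = 10 << 20 // 10 MiB
	}
	if cfg.IndexPath == "" {
		cfg.IndexPath = "/var/lib/vaultik/index.db"
	}
	return &cfg, nil
}
```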
### 7.3 Index

```go
type Index struct {
	db *sql.DB
}

func OpenIndex(path string) (*Index, error)
func (ix *Index) LookupFile(path string, mtime int64, size int64) ([]string, bool, error)
func (ix *Index) SaveFile(path string, mtime int64, size int64, chunkHashes []string) error
func (ix *Index) AddChunk(chunkHash string, size int64) error
func (ix *Index) MarkBlob(blobHash, finalHash string, created time.Time) error
func (ix *Index) MapChunkToBlob(blobHash, chunkHash string, offset, length int64) error
func (ix *Index) MapChunkToFile(chunkHash, filePath string, offset, length int64) error
```
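
An illustrative caller of these methods, assuming the boolean returned by `LookupFile` reports that the path is already indexed with matching mtime and size (the design leaves this unspecified); `processFile` and `rechunk` are hypothetical:

```go
// processFile skips unchanged files and otherwise re-chunks the file and
// records the new file-to-chunk mapping in the index.
func processFile(ix *Index, path string, mtime, size int64, rechunk func() ([]string, error)) error {
	_, unchanged, err := ix.LookupFile(path, mtime, size)
	if err != nil {
		return err
	}
	if unchanged {
		return nil // mtime and size match: nothing to upload
	}
	chunkHashes, err := rechunk()
	if err != nil {
		return err
	}
	return ix.SaveFile(path, mtime, size, chunkHashes)
}
```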
### 7.4 Blob Packing

```go
type BlobWriter struct {
	// internal buffer, current size, encrypted writer, etc
}

func NewBlobWriter(...) *BlobWriter
func (bw *BlobWriter) AddChunk(chunk []byte, chunkHash string) error
func (bw *BlobWriter) Flush() (finalBlobHash string, err error)
```
### 7.5 Metadata

```go
func BuildSnapshotMetadata(ix *Index, snapshotID string) (sqlitePath string, err error)
func EncryptAndUploadMetadata(path string, cfg *Config, snapshotID string) error
```
### 7.6 Prune

```go
func RunPrune(bucket, prefix, privateKey string) error
```
## Implementation TODO

### Core Infrastructure
- Set up Go module and project structure
- Create Makefile with test, fmt, and lint targets
- Set up cobra CLI skeleton with all commands
- Implement config loading and validation from YAML
- Create data structures for FileInfo, ChunkInfo, BlobInfo, etc.
### Local Index Database
- Implement SQLite schema creation and migrations
- Create Index type with all database operations
- Add transaction support and proper locking
- Implement file tracking (save, lookup, delete)
- Implement chunk tracking and deduplication
- Implement blob tracking and chunk-to-blob mapping
- Write tests for all index operations
### Chunking and Hashing

- Implement Rabin fingerprint chunker (see the sketch after this list)
- Create streaming chunk processor
- Implement SHA256 hashing for chunks
- Add configurable chunk size parameters
- Write tests for chunking consistency
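
A sketch of the chunker referenced above, using `github.com/restic/chunker` as one possible Rabin-fingerprint implementation (the design does not pin a library); `ChunkAndHash` is illustrative:

```go
package chunking

import (
	"crypto/sha256"
	"encoding/hex"
	"io"

	"github.com/restic/chunker"
)

// ChunkAndHash splits r into content-defined chunks and calls emit with each
// chunk's hex SHA256 and its data. pol must be the same polynomial across runs
// so chunk boundaries (and therefore deduplication) stay stable.
func ChunkAndHash(r io.Reader, pol chunker.Pol, emit func(hash string, data []byte) error) error {
	ch := chunker.New(r, pol)
	buf := make([]byte, chunker.MaxSize)
	for {
		c, err := ch.Next(buf)
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		sum := sha256.Sum256(c.Data)
		if err := emit(hex.EncodeToString(sum[:]), c.Data); err != nil {
			return err
		}
	}
}
```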
### Compression and Encryption
- Implement zstd compression wrapper
- Integrate age encryption library
- Create Encryptor type for public key encryption
- Create Decryptor type for private key decryption
- Implement streaming encrypt/decrypt pipelines
- Write tests for compression and encryption
### Blob Packing
- Implement BlobWriter with size limits
- Add chunk accumulation and flushing
- Create blob hash calculation
- Implement proper error handling and rollback
- Write tests for blob packing scenarios
### S3 Operations
- Integrate MinIO client library
- Implement S3Client wrapper type
- Add multipart upload support for large blobs
- Implement retry logic with exponential backoff (sketched after this list)
- Add connection pooling and timeout handling
- Write tests using MinIO container
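
The retry item above can be a small generic wrapper that the `S3Client` type applies around individual calls; the attempt count and base delay here are illustrative:

```go
package s3ops

import (
	"context"
	"time"
)

// withRetry runs op up to attempts times, doubling the delay between tries,
// and stops early if the context is cancelled.
func withRetry(ctx context.Context, attempts int, base time.Duration, op func() error) error {
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		select {
		case <-time.After(delay):
			delay *= 2
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return err
}
```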
### Backup Command - Basic
- Implement directory walking with exclusion patterns
- Add file change detection using index
- Integrate chunking pipeline for changed files
- Implement blob upload coordination
- Add progress reporting to stderr
- Write integration tests for backup
### Snapshot Metadata
- Implement snapshot metadata extraction from index
- Create SQLite snapshot database builder
- Add metadata compression and encryption
- Implement metadata chunking for large snapshots
- Add hash calculation and verification
- Implement metadata upload to S3
- Write tests for metadata operations
### Restore Command
- Implement snapshot listing and selection
- Add metadata download and reconstruction
- Implement hash verification for metadata
- Create file restoration logic with chunk retrieval
- Add blob caching for efficiency
- Implement proper file permissions and mtime restoration
- Write integration tests for restore
### Prune Command
- Implement latest snapshot detection
- Add referenced blob extraction from metadata
- Create S3 blob listing and comparison
- Implement safe deletion of unreferenced blobs
- Add dry-run mode for safety
- Write tests for prune scenarios
### Verify Command
- Implement metadata integrity checking
- Add blob existence verification
- Implement quick mode (S3 hash checking)
- Implement deep mode (download and verify chunks)
- Add detailed error reporting
- Write tests for verification
### Fetch Command
- Implement single-file metadata query
- Add minimal blob downloading for file
- Create streaming file reconstruction
- Add support for output redirection
- Write tests for fetch command
### Daemon Mode

- Implement inotify watcher for Linux (see the sketch after this list)
- Add dirty path tracking in index
- Create periodic full scan scheduler
- Implement backup interval enforcement
- Add proper signal handling and shutdown
- Write tests for daemon behavior
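
A sketch of the watcher item above, using `github.com/fsnotify/fsnotify` (which wraps inotify on Linux). Note that fsnotify watches are per-directory, so a real implementation would add watches recursively; `markDirty` stands in for recording the dirty path in the index:

```go
package daemon

import (
	"github.com/fsnotify/fsnotify"
)

// watch adds a watch on each directory and marks any changed path as dirty
// for the next backup run.
func watch(dirs []string, markDirty func(path string)) error {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	defer w.Close()

	for _, d := range dirs {
		if err := w.Add(d); err != nil {
			return err
		}
	}
	for {
		select {
		case ev, ok := <-w.Events:
			if !ok {
				return nil
			}
			// Any create/write/remove/rename event dirties the path.
			markDirty(ev.Name)
		case err, ok := <-w.Errors:
			if !ok {
				return nil
			}
			return err
		}
	}
}
```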
### Cron Mode
- Implement silent operation mode
- Add proper exit codes for cron
- Implement lock file to prevent concurrent runs
- Add error summary reporting
- Write tests for cron mode
### Finalization
- Add comprehensive logging throughout
- Implement proper error wrapping and context
- Add performance metrics collection
- Create end-to-end integration tests
- Write documentation and examples
- Set up CI/CD pipeline