# vaultik: Design Document

`vaultik` is a secure backup tool written in Go. It performs streaming backups using content-defined chunking, blob grouping, asymmetric encryption, and object storage. The system is designed for environments where the backup source host cannot store secrets and cannot retrieve or decrypt any data from the destination.

The source host is stateful: it maintains a local SQLite index to detect changes, deduplicate content, and track uploads across backup runs. All remote storage is encrypted and append-only. Pruning of unreferenced data is done from a trusted host with access to decryption keys, as even the metadata indices are encrypted in the blob store.
## Why ANOTHER backup tool??

Other backup tools like restic, borg, and duplicity are designed for environments where the source host can store secrets and has access to decryption keys. I don't want to store backup decryption keys on my hosts, only public keys for encryption.

My requirements are:
- open source
- no passphrases or private keys on the source host
- incremental
- compressed
- encrypted
- s3 compatible without an intermediate step or tool
Surprisingly, no existing tool meets these requirements, so I wrote `vaultik`.
## Design Goals
- Backups must require only a public key on the source host.
- No secrets or private keys may exist on the source system.
- Obviously, restore must be possible using only the backup bucket and a private key.
- Prune must be possible, although this requires a private key and so must be done on a different, trusted host.
- All encryption is done using `age` (X25519, ChaCha20-Poly1305).
- Compression uses `zstd` at a configurable level.
- Files are chunked, and multiple chunks are packed into encrypted blobs. This reduces the number of objects in the blob store for filesystems with many small files.
- All metadata (snapshots) is stored remotely as encrypted SQLite DBs.
- If a snapshot metadata file exceeds a configured size threshold, it is chunked into multiple encrypted `.age` parts, to support large filesystems.
- CLI interface is structured using `cobra`.
## S3 Bucket Layout
S3 stores only three things:
- Blobs: encrypted, compressed packs of file chunks.
- Metadata: encrypted SQLite databases containing the current state of the filesystem at the time of the snapshot.
- Metadata hashes: encrypted hashes of the metadata SQLite databases.

```
s3://<bucket>/<prefix>/
├── blobs/
│   ├── <aa>/<bb>/<full_blob_hash>.zst.age
├── metadata/
│   ├── <snapshot_id>.sqlite.age
│   ├── <snapshot_id>.sqlite.00.age
│   ├── <snapshot_id>.sqlite.01.age
│   └── <snapshot_id>.hash.age
```
To retrieve a given file, you would:

- fetch `metadata/<snapshot_id>.sqlite.age` or `metadata/<snapshot_id>.sqlite.{seq}.age`
- fetch `metadata/<snapshot_id>.hash.age`
- decrypt the metadata SQLite database using the private key and reconstruct the full database file
- verify the hash of the decrypted database matches the decrypted hash
- query the database for the file in question
- determine all chunks for the file
- for each chunk, look up the metadata for all blobs in the db
- fetch each blob from `blobs/<aa>/<bb>/<blob_hash>.zst.age`
- decrypt each blob using the private key
- decompress each blob using zstd
- reconstruct the file from the set of file chunks stored in the blobs

If clever, it may be possible to do this chunk by chunk without touching disk (except for the output file), as each uncompressed blob should fit in memory (<10GB).
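
The decrypt-and-decompress half of that pipeline is small enough to sketch. This assumes the `filippo.io/age` and `github.com/klauspost/compress/zstd` libraries (consistent with the TODO list later in this document); `decodeBlob` is an illustrative name, not part of the design.

```go
package restore

import (
	"io"

	"filippo.io/age"
	"github.com/klauspost/compress/zstd"
)

// decodeBlob takes the raw bytes of one blobs/<aa>/<bb>/<hash>.zst.age object
// and returns a reader over the decompressed chunk data. Blobs are compressed
// and then encrypted on the way up, so reads decrypt first and decompress second.
func decodeBlob(encrypted io.Reader, identity age.Identity) (io.ReadCloser, error) {
	decrypted, err := age.Decrypt(encrypted, identity)
	if err != nil {
		return nil, err
	}
	zr, err := zstd.NewReader(decrypted)
	if err != nil {
		return nil, err
	}
	return zr.IOReadCloser(), nil
}
```

Because both readers stream, a caller can copy the needed chunk ranges straight into the output file without buffering the whole blob.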
## Path Rules

- `<snapshot_id>`: UTC timestamp in ISO 8601 format, e.g. `2023-10-01T12:00:00Z`. These are lexicographically sortable.
- `blobs/<aa>/<bb>/...`: where `aa` and `bb` are the first 2 hex bytes of the blob hash.
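
These rules are mechanical; a minimal sketch of both (helper names are illustrative, not part of the design):

```go
package layout

import (
	"fmt"
	"time"
)

// NewSnapshotID formats a UTC timestamp as ISO 8601 / RFC 3339 at second
// precision, e.g. 2023-10-01T12:00:00Z, which sorts lexicographically.
func NewSnapshotID(t time.Time) string {
	return t.UTC().Format("2006-01-02T15:04:05Z")
}

// BlobKey maps a hex blob hash (at least 4 hex characters) to its object key
// under blobs/, e.g. "blobs/ab/cd/abcd1234....zst.age".
func BlobKey(blobHash string) string {
	return fmt.Sprintf("blobs/%s/%s/%s.zst.age", blobHash[0:2], blobHash[2:4], blobHash)
}
```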
## 3. Local SQLite Index Schema (source host)

```sql
CREATE TABLE files (
    path TEXT PRIMARY KEY,
    mtime INTEGER NOT NULL,
    size INTEGER NOT NULL
);

CREATE TABLE file_chunks (
    path TEXT NOT NULL,
    idx INTEGER NOT NULL,
    chunk_hash TEXT NOT NULL,
    PRIMARY KEY (path, idx)
);

CREATE TABLE chunks (
    chunk_hash TEXT PRIMARY KEY,
    sha256 TEXT NOT NULL,
    size INTEGER NOT NULL
);

CREATE TABLE blobs (
    blob_hash TEXT PRIMARY KEY,
    final_hash TEXT NOT NULL,
    created_ts INTEGER NOT NULL
);

CREATE TABLE blob_chunks (
    blob_hash TEXT NOT NULL,
    chunk_hash TEXT NOT NULL,
    offset INTEGER NOT NULL,
    length INTEGER NOT NULL,
    PRIMARY KEY (blob_hash, chunk_hash)
);

CREATE TABLE chunk_files (
    chunk_hash TEXT NOT NULL,
    file_path TEXT NOT NULL,
    file_offset INTEGER NOT NULL,
    length INTEGER NOT NULL,
    PRIMARY KEY (chunk_hash, file_path)
);

CREATE TABLE snapshots (
    id TEXT PRIMARY KEY,
    hostname TEXT NOT NULL,
    vaultik_version TEXT NOT NULL,
    created_ts INTEGER NOT NULL,
    file_count INTEGER NOT NULL,
    chunk_count INTEGER NOT NULL,
    blob_count INTEGER NOT NULL
);
```
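
The `chunks` table is what makes deduplication cheap: before packing a chunk, the backup path only has to ask whether its hash is already indexed. A sketch, assuming `database/sql` with any SQLite driver (the helper name is illustrative):

```go
package index

import "database/sql"

// HasChunk reports whether a chunk hash is already known to the local index,
// meaning its data is already packed into some blob and need not be re-uploaded.
func HasChunk(db *sql.DB, chunkHash string) (bool, error) {
	var one int
	err := db.QueryRow(`SELECT 1 FROM chunks WHERE chunk_hash = ?`, chunkHash).Scan(&one)
	if err == sql.ErrNoRows {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}
```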
## 4. Snapshot Metadata Schema (stored in S3)

Identical schema to the local index, filtered to live snapshot state. Stored as a SQLite DB, compressed with `zstd`, encrypted with `age`. If larger than a configured `chunk_size`, it is split and uploaded as:

```
metadata/<snapshot_id>.sqlite.00.age
metadata/<snapshot_id>.sqlite.01.age
...
```
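
One way to do the split, assuming the finished compressed-and-encrypted stream is simply cut into `chunk_size` pieces (whether each part is independently encrypted is left open by this design); `uploadPart` stands in for the S3 upload call:

```go
package metadata

import (
	"fmt"
	"io"
)

// uploadParts cuts the stream r into parts of at most chunkSize bytes and
// hands each one to uploadPart under the metadata/<snapshot_id>.sqlite.NN.age
// naming scheme described above.
func uploadParts(snapshotID string, r io.Reader, chunkSize int64,
	uploadPart func(key string, data []byte) error) error {

	buf := make([]byte, chunkSize)
	for seq := 0; ; seq++ {
		// ReadFull fills buf unless the stream ends first.
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			key := fmt.Sprintf("metadata/%s.sqlite.%02d.age", snapshotID, seq)
			if uerr := uploadPart(key, buf[:n]); uerr != nil {
				return uerr
			}
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}
```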
## 5. Data Flow

### 5.1 Backup
- Load config
- Open local SQLite index
- Walk source directories:
  - For each file:
    - Check mtime and size in index
    - If changed or new:
      - Chunk file
      - For each chunk:
        - Hash with SHA256
        - Check if already uploaded
        - If not, add chunk to blob packer
      - Record file-chunk mapping in index
- When blob reaches threshold size (e.g. 1GB), as sketched after this list:
  - Compress with `zstd`
  - Encrypt with `age`
  - Upload to `s3://<bucket>/<prefix>/blobs/<aa>/<bb>/<hash>.zst.age`
  - Record blob-chunk layout in local index
- Once all files are processed:
  - Build snapshot SQLite DB from index delta
  - Compress + encrypt
  - If larger than `chunk_size`, split into parts
  - Upload to `s3://<bucket>/<prefix>/metadata/<snapshot_id>.sqlite(.xx).age`
- Create snapshot record in local index that lists:
  - snapshot ID
  - hostname
  - vaultik version
  - timestamp
  - counts of files, chunks, and blobs
  - list of all blobs referenced in the snapshot (some new, some old) for efficient pruning later
- Create snapshot database for upload
- Calculate checksum of snapshot database
- Compress, encrypt, split, and upload to S3
- Encrypt the hash of the snapshot database to the backup age key
- Upload the encrypted hash to S3 as `metadata/<snapshot_id>.hash.age`
- Optionally prune remote blobs that are no longer referenced in the snapshot, based on the local state db
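
A sketch of the compress-encrypt-upload step referenced above, assuming `filippo.io/age`, `github.com/klauspost/compress/zstd`, and the MinIO client named in the TODO list; `uploadBlob` and its parameters are illustrative:

```go
package backup

import (
	"context"
	"io"

	"filippo.io/age"
	"github.com/klauspost/compress/zstd"
	"github.com/minio/minio-go/v7"
)

// uploadBlob streams a packed blob through zstd and age into S3 at
// blobs/<aa>/<bb>/<hash>.zst.age. blobHash is the blob's hex hash as recorded
// in the local index.
func uploadBlob(ctx context.Context, s3 *minio.Client, bucket, prefix, blobHash string,
	plaintext io.Reader, recipient age.Recipient) error {

	pr, pw := io.Pipe()

	go func() {
		// Writer stack: plaintext -> zstd -> age -> pipe -> S3.
		enc, err := age.Encrypt(pw, recipient)
		if err != nil {
			pw.CloseWithError(err)
			return
		}
		zw, err := zstd.NewWriter(enc)
		if err != nil {
			pw.CloseWithError(err)
			return
		}
		if _, err := io.Copy(zw, plaintext); err != nil {
			pw.CloseWithError(err)
			return
		}
		if err := zw.Close(); err != nil {
			pw.CloseWithError(err)
			return
		}
		pw.CloseWithError(enc.Close())
	}()

	key := prefix + "blobs/" + blobHash[:2] + "/" + blobHash[2:4] + "/" + blobHash + ".zst.age"
	// Size -1 lets the client stream the pipe as a multipart upload.
	_, err := s3.PutObject(ctx, bucket, key, pr, -1, minio.PutObjectOptions{})
	return err
}
```

Nothing in this path needs the private key: the source host only ever holds the `age` recipient (public key).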
### 5.2 Manual Prune
- List all objects under `metadata/`
- Determine the latest valid `snapshot_id` by timestamp
- Download, decrypt, and reconstruct the latest snapshot SQLite database
- Extract the set of referenced blob hashes
- List all blob objects under `blobs/`
- For each blob:
  - If the hash is not in the latest snapshot:
    - Issue `DeleteObject` to remove it (see the sketch after this list)
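
The deletion loop might look like the sketch below, assuming the MinIO client from the TODO list and that the referenced-blob set has already been extracted from the decrypted snapshot; `pruneBlobs` and `dryRun` mirror the `--dry-run` flag but are otherwise illustrative:

```go
package prune

import (
	"context"
	"log"
	"path"
	"strings"

	"github.com/minio/minio-go/v7"
)

// pruneBlobs deletes every blob object whose hash is not in the referenced set.
func pruneBlobs(ctx context.Context, s3 *minio.Client, bucket, prefix string,
	referenced map[string]bool, dryRun bool) error {

	opts := minio.ListObjectsOptions{Prefix: prefix + "blobs/", Recursive: true}
	for obj := range s3.ListObjects(ctx, bucket, opts) {
		if obj.Err != nil {
			return obj.Err
		}
		// Object keys look like <prefix>blobs/aa/bb/<hash>.zst.age.
		hash := strings.TrimSuffix(path.Base(obj.Key), ".zst.age")
		if referenced[hash] {
			continue
		}
		if dryRun {
			log.Printf("would delete %s", obj.Key)
			continue
		}
		if err := s3.RemoveObject(ctx, bucket, obj.Key, minio.RemoveObjectOptions{}); err != nil {
			return err
		}
	}
	return nil
}
```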
### 5.3 Verify

Verify runs on a host that has no local state but has access to the bucket.
- Fetch latest metadata snapshot files from S3
- Fetch latest metadata db hash from S3
- Decrypt the hash using the private key
- Decrypt the metadata SQLite database chunks using the private key and reassemble the snapshot db file
- Calculate the SHA256 hash of the decrypted snapshot database
- Verify the db file hash matches the decrypted hash (see the sketch after this list)
- For each blob in the snapshot:
  - Fetch the blob metadata from the snapshot db
  - Ensure the blob exists in S3
  - Check the S3 content hash matches the expected blob hash
  - If not using `--quick` mode:
    - Download and decrypt the blob
    - Decompress and verify chunk hashes match metadata
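
The hash check reduces to hashing the reassembled database and comparing it with the decrypted value from `metadata/<snapshot_id>.hash.age`. A minimal sketch (the function name is illustrative):

```go
package verify

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// verifySnapshotDB returns an error unless the SHA256 of the decrypted
// snapshot database at dbPath matches expectedHex.
func verifySnapshotDB(dbPath, expectedHex string) error {
	f, err := os.Open(dbPath)
	if err != nil {
		return err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return err
	}
	got := hex.EncodeToString(h.Sum(nil))
	if got != expectedHex {
		return fmt.Errorf("snapshot db hash mismatch: got %s, want %s", got, expectedHex)
	}
	return nil
}
```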
## 6. CLI Commands

```
vaultik backup [--config <path>] [--cron] [--daemon]
vaultik restore --bucket <bucket> --prefix <prefix> --snapshot <id> --target <dir>
vaultik prune --bucket <bucket> --prefix <prefix> [--dry-run]
vaultik verify --bucket <bucket> --prefix <prefix> [--snapshot <id>] [--quick]
vaultik fetch --bucket <bucket> --prefix <prefix> --snapshot <id> --file <path> --target <path>
```
- `VAULTIK_PRIVATE_KEY` is required for the `restore`, `prune`, `verify`, and `fetch` commands.
- It is passed via an environment variable containing the age private key.
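
Parsing that key is one call in the `age` library; a sketch (the helper name and error text are illustrative):

```go
package cli

import (
	"fmt"
	"os"
	"strings"

	"filippo.io/age"
)

// identityFromEnv reads the age X25519 private key ("AGE-SECRET-KEY-1...")
// from VAULTIK_PRIVATE_KEY for the restore/prune/verify/fetch commands.
func identityFromEnv() (*age.X25519Identity, error) {
	key := strings.TrimSpace(os.Getenv("VAULTIK_PRIVATE_KEY"))
	if key == "" {
		return nil, fmt.Errorf("VAULTIK_PRIVATE_KEY is not set")
	}
	return age.ParseX25519Identity(key)
}
```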
## 7. Function and Method Signatures

### 7.1 CLI

```go
func RootCmd() *cobra.Command
func backupCmd() *cobra.Command
func restoreCmd() *cobra.Command
func pruneCmd() *cobra.Command
func verifyCmd() *cobra.Command
```
### 7.2 Configuration

```go
type Config struct {
	BackupPubKey      string        // age recipient
	BackupInterval    time.Duration // used in daemon mode, irrelevant for cron mode
	BlobSizeLimit     int64         // default 10GB
	ChunkSize         int64         // default 10MB
	Exclude           []string      // list of regex of files to exclude from backup, absolute path
	Hostname          string
	IndexPath         string        // path to local SQLite index db, default /var/lib/vaultik/index.db
	MetadataPrefix    string        // S3 prefix for metadata, default "metadata/"
	MinTimeBetweenRun time.Duration // minimum time between backup runs, default 1 hour - for daemon mode
	S3                S3Config      // S3 configuration
	ScanInterval      time.Duration // interval to full stat() scan source dirs, default 24h
	SourceDirs        []string      // list of source directories to back up, absolute paths
}

type S3Config struct {
	Endpoint        string
	Bucket          string
	Prefix          string
	AccessKeyID     string
	SecretAccessKey string
	Region          string
}

func Load(path string) (*Config, error)
```
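
A sketch of `Load`, assuming `gopkg.in/yaml.v3` and YAML struct tags on `Config` (tags omitted from the declaration above); the defaults follow the field comments:

```go
package config

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

// Load reads and parses the YAML config file, then fills in documented defaults.
func Load(path string) (*Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var cfg Config
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parsing %s: %w", path, err)
	}
	if cfg.BlobSizeLimit == 0 {
		cfg.BlobSizeLimit = 10 << 30 // 10 GiB
	}
	if cfg.ChunkSize == 0 {
		cfg.ChunkSize = 10 << 20 // 10 MiB
	}
	if cfg.IndexPath == "" {
		cfg.IndexPath = "/var/lib/vaultik/index.db"
	}
	return &cfg, nil
}
```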
### 7.3 Index

```go
type Index struct {
	db *sql.DB
}

func OpenIndex(path string) (*Index, error)
func (ix *Index) LookupFile(path string, mtime int64, size int64) ([]string, bool, error)
func (ix *Index) SaveFile(path string, mtime int64, size int64, chunkHashes []string) error
func (ix *Index) AddChunk(chunkHash string, size int64) error
func (ix *Index) MarkBlob(blobHash, finalHash string, created time.Time) error
func (ix *Index) MapChunkToBlob(blobHash, chunkHash string, offset, length int64) error
func (ix *Index) MapChunkToFile(chunkHash, filePath string, offset, length int64) error
```
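
An illustrative caller of these methods, assuming the boolean returned by `LookupFile` reports that the path is already indexed with matching mtime and size (the design leaves this unspecified); `processFile` and `rechunk` are hypothetical:

```go
// processFile skips unchanged files and otherwise re-chunks the file and
// records the new file-to-chunk mapping in the index.
func processFile(ix *Index, path string, mtime, size int64, rechunk func() ([]string, error)) error {
	_, unchanged, err := ix.LookupFile(path, mtime, size)
	if err != nil {
		return err
	}
	if unchanged {
		return nil // mtime and size match: nothing to upload
	}
	chunkHashes, err := rechunk()
	if err != nil {
		return err
	}
	return ix.SaveFile(path, mtime, size, chunkHashes)
}
```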
### 7.4 Blob Packing

```go
type BlobWriter struct {
	// internal buffer, current size, encrypted writer, etc
}

func NewBlobWriter(...) *BlobWriter
func (bw *BlobWriter) AddChunk(chunk []byte, chunkHash string) error
func (bw *BlobWriter) Flush() (finalBlobHash string, err error)
```
### 7.5 Metadata

```go
func BuildSnapshotMetadata(ix *Index, snapshotID string) (sqlitePath string, err error)
func EncryptAndUploadMetadata(path string, cfg *Config, snapshotID string) error
```
### 7.6 Prune

```go
func RunPrune(bucket, prefix, privateKey string) error
```
## Implementation TODO

### Core Infrastructure
- Set up Go module and project structure
- Create Makefile with test, fmt, and lint targets
- Set up cobra CLI skeleton with all commands
- Implement config loading and validation from YAML
- Create data structures for FileInfo, ChunkInfo, BlobInfo, etc.
### Local Index Database
- Implement SQLite schema creation and migrations
- Create Index type with all database operations
- Add transaction support and proper locking
- Implement file tracking (save, lookup, delete)
- Implement chunk tracking and deduplication
- Implement blob tracking and chunk-to-blob mapping
- Write tests for all index operations
### Chunking and Hashing

- Implement Rabin fingerprint chunker (see the sketch after this list)
- Create streaming chunk processor
- Implement SHA256 hashing for chunks
- Add configurable chunk size parameters
- Write tests for chunking consistency
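
A sketch of the chunker referenced above, using `github.com/restic/chunker` as one possible Rabin-fingerprint implementation (the design does not pin a library); `ChunkAndHash` is illustrative:

```go
package chunking

import (
	"crypto/sha256"
	"encoding/hex"
	"io"

	"github.com/restic/chunker"
)

// ChunkAndHash splits r into content-defined chunks and calls emit with each
// chunk's hex SHA256 and its data. pol must be the same polynomial across runs
// so chunk boundaries (and therefore deduplication) stay stable.
func ChunkAndHash(r io.Reader, pol chunker.Pol, emit func(hash string, data []byte) error) error {
	ch := chunker.New(r, pol)
	buf := make([]byte, chunker.MaxSize)
	for {
		c, err := ch.Next(buf)
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		sum := sha256.Sum256(c.Data)
		if err := emit(hex.EncodeToString(sum[:]), c.Data); err != nil {
			return err
		}
	}
}
```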
### Compression and Encryption
- Implement zstd compression wrapper
- Integrate age encryption library
- Create Encryptor type for public key encryption
- Create Decryptor type for private key decryption
- Implement streaming encrypt/decrypt pipelines
- Write tests for compression and encryption
### Blob Packing
- Implement BlobWriter with size limits
- Add chunk accumulation and flushing
- Create blob hash calculation
- Implement proper error handling and rollback
- Write tests for blob packing scenarios
### S3 Operations
- Integrate MinIO client library
- Implement S3Client wrapper type
- Add multipart upload support for large blobs
- Implement retry logic with exponential backoff (sketched after this list)
- Add connection pooling and timeout handling
- Write tests using MinIO container
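
The retry item above can be a small generic wrapper that the `S3Client` type applies around individual calls; the attempt count and base delay here are illustrative:

```go
package s3ops

import (
	"context"
	"time"
)

// withRetry runs op up to attempts times, doubling the delay between tries,
// and stops early if the context is cancelled.
func withRetry(ctx context.Context, attempts int, base time.Duration, op func() error) error {
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		select {
		case <-time.After(delay):
			delay *= 2
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return err
}
```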
### Backup Command - Basic
- Implement directory walking with exclusion patterns
- Add file change detection using index
- Integrate chunking pipeline for changed files
- Implement blob upload coordination
- Add progress reporting to stderr
- Write integration tests for backup
### Snapshot Metadata
- Implement snapshot metadata extraction from index
- Create SQLite snapshot database builder
- Add metadata compression and encryption
- Implement metadata chunking for large snapshots
- Add hash calculation and verification
- Implement metadata upload to S3
- Write tests for metadata operations
### Restore Command
- Implement snapshot listing and selection
- Add metadata download and reconstruction
- Implement hash verification for metadata
- Create file restoration logic with chunk retrieval
- Add blob caching for efficiency
- Implement proper file permissions and mtime restoration
- Write integration tests for restore
### Prune Command
- Implement latest snapshot detection
- Add referenced blob extraction from metadata
- Create S3 blob listing and comparison
- Implement safe deletion of unreferenced blobs
- Add dry-run mode for safety
- Write tests for prune scenarios
### Verify Command
- Implement metadata integrity checking
- Add blob existence verification
- Implement quick mode (S3 hash checking)
- Implement deep mode (download and verify chunks)
- Add detailed error reporting
- Write tests for verification
### Fetch Command
- Implement single-file metadata query
- Add minimal blob downloading for file
- Create streaming file reconstruction
- Add support for output redirection
- Write tests for fetch command
### Daemon Mode

- Implement inotify watcher for Linux (see the sketch after this list)
- Add dirty path tracking in index
- Create periodic full scan scheduler
- Implement backup interval enforcement
- Add proper signal handling and shutdown
- Write tests for daemon behavior
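
A sketch of the watcher item above, using `github.com/fsnotify/fsnotify` (which wraps inotify on Linux). Note that fsnotify watches are per-directory, so a real implementation would add watches recursively; `markDirty` stands in for recording the dirty path in the index:

```go
package daemon

import (
	"github.com/fsnotify/fsnotify"
)

// watch adds a watch on each directory and marks any changed path as dirty
// for the next backup run.
func watch(dirs []string, markDirty func(path string)) error {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	defer w.Close()

	for _, d := range dirs {
		if err := w.Add(d); err != nil {
			return err
		}
	}
	for {
		select {
		case ev, ok := <-w.Events:
			if !ok {
				return nil
			}
			// Any create/write/remove/rename event dirties the path.
			markDirty(ev.Name)
		case err, ok := <-w.Errors:
			if !ok {
				return nil
			}
			return err
		}
	}
}
```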
### Cron Mode
- Implement silent operation mode
- Add proper exit codes for cron
- Implement lock file to prevent concurrent runs
- Add error summary reporting
- Write tests for cron mode
### Finalization
- Add comprehensive logging throughout
- Implement proper error wrapping and context
- Add performance metrics collection
- Create end-to-end integration tests
- Write documentation and examples
- Set up CI/CD pipeline