Collapse snapshot prune into vaultik prune; auto-clean on removal

The CLI had two commands named "prune" doing different jobs (local
DB orphan cleanup vs. remote blob garbage collection), which was
confusing and forced a manual two-step workflow after deleting any
snapshot.

The single user-facing prune surface is now `vaultik prune`, which calls
PruneDatabase (local orphan cleanup) and then PruneBlobs (remote
unreferenced-blob GC). The snapshot deletion paths (snapshot remove,
snapshot remove --all, snapshot purge) now run CleanupOrphanedData inline
so the local index database doesn't accumulate ghost rows after every
removal. The user observed ~39k orphaned files and 2 orphaned blobs
after a remove --all because that cleanup was previously a separate
opt-in command. `snapshot prune` is removed.

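For illustration, the two-phase ordering described above can be sketched as follows. This is a minimal sketch, not the actual implementation: the method names PruneDatabase and PruneBlobs come from this message, but the `pruner` interface, the zero-argument signatures, `runPrune`, and `fakePruner` are all assumptions made up for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// pruner is a hypothetical interface. The method names come from the commit
// message; the zero-argument, error-returning signatures are assumed.
type pruner interface {
	PruneDatabase() error
	PruneBlobs() error
}

// errIndexLocked is an illustrative failure used by the demo below.
var errIndexLocked = errors.New("index locked")

// runPrune performs local orphan cleanup first, then remote blob GC.
// In this sketch it stops at the first failure, so the remote pass is
// skipped when the local pass did not complete.
func runPrune(p pruner) error {
	if err := p.PruneDatabase(); err != nil {
		return fmt.Errorf("prune database: %w", err)
	}
	if err := p.PruneBlobs(); err != nil {
		return fmt.Errorf("prune blobs: %w", err)
	}
	return nil
}

// fakePruner records the order of calls; dbErr, when set, makes the
// database pass fail.
type fakePruner struct {
	calls []string
	dbErr error
}

func (f *fakePruner) PruneDatabase() error {
	f.calls = append(f.calls, "PruneDatabase")
	return f.dbErr
}

func (f *fakePruner) PruneBlobs() error {
	f.calls = append(f.calls, "PruneBlobs")
	return nil
}

func main() {
	ok := &fakePruner{}
	fmt.Println(runPrune(ok), ok.calls)

	failing := &fakePruner{dbErr: errIndexLocked}
	fmt.Println(runPrune(failing), failing.calls)
}
```

Whether the real command aborts or continues after a failed local pass is not stated here; the early return is only one plausible choice.
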
Also addresses the doc/help-string drift the user audit caught:

* cli/prune.go help text used to reference a non-existent
  `vaultik purge` command.
* cli/config.go get/set short/long examples were S3-specific
  (s3.bucket) when the primary storage configuration is
  storage_url.
* vaultik/info.go printed S3 Bucket/Endpoint/Region labels
  unconditionally; for file:// or rclone:// users those rows
  were empty. The Storage Configuration block now prints the
  storer's Type+Location first, then the storage_url string when
  set, and only emits S3 rows that are actually populated.
* vaultik/info.go's "Run 'vaultik prune --remote'" hint
  referenced a flag that doesn't exist.
* vaultik/blobcache.go's doc comment claimed LRU eviction, which
  is no longer the restore-time policy (the sweeper drives
  eviction; LRU is the safety-net fallback when maxBytes is
  finite).
* README.md listed `vaultik restore`, `vaultik snapshot prune`,
  and an `s3.bucket` example, all out of date.

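As a rough illustration of the info.go change listed above (always print Type and Location, then print only the optional rows that have a value), here is a small sketch. The `storageInfo` struct, its field names, the row labels, and `renderStorageConfig` are hypothetical; the real output format may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// storageInfo is a hypothetical view of the values the info command prints.
type storageInfo struct {
	Type       string
	Location   string
	StorageURL string
	S3Bucket   string
	S3Endpoint string
	S3Region   string
}

// renderStorageConfig always emits Type and Location, then emits each
// optional row only when its value is non-empty.
func renderStorageConfig(s storageInfo) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Type: %s\n", s.Type)
	fmt.Fprintf(&b, "Location: %s\n", s.Location)

	optional := []struct{ label, value string }{
		{"Storage URL", s.StorageURL},
		{"S3 Bucket", s.S3Bucket},
		{"S3 Endpoint", s.S3Endpoint},
		{"S3 Region", s.S3Region},
	}
	for _, row := range optional {
		if row.value != "" {
			fmt.Fprintf(&b, "%s: %s\n", row.label, row.value)
		}
	}
	return b.String()
}

func main() {
	// A file:// user: no S3 rows appear in the output.
	fmt.Print(renderStorageConfig(storageInfo{
		Type:       "file",
		Location:   "/mnt/backups",
		StorageURL: "file:///mnt/backups",
	}))
}
```
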
README's roadmap section is rewritten with concrete pre-1.0 items
(security audit, error-condition tests, parallel blob downloads,
restart of interrupted restore, …) so the next-steps surface
matches what the project actually still needs.

The cleanup calls are guarded against a nil SnapshotManager so
tests that construct a bare Vaultik struct continue to work.
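
The nil guard mentioned above might look roughly like this. The type names Vaultik and SnapshotManager and the method name CleanupOrphanedData appear in this message; the field layout, the signatures, the `cleanups` counter, and the `cleanupAfterRemoval` helper are assumptions for the sketch only.

```go
package main

import "fmt"

// SnapshotManager is a stand-in for the real type; the counter exists only
// so the demo can observe that cleanup ran.
type SnapshotManager struct {
	cleanups int
}

// CleanupOrphanedData is assumed to take no arguments and return an error.
func (sm *SnapshotManager) CleanupOrphanedData() error {
	sm.cleanups++
	return nil
}

// Vaultik is a stand-in with only the field this sketch needs.
type Vaultik struct {
	SnapshotManager *SnapshotManager
}

// cleanupAfterRemoval is a hypothetical helper: it skips cleanup when no
// SnapshotManager is wired up, as in tests that build a bare Vaultik.
func (v *Vaultik) cleanupAfterRemoval() error {
	if v.SnapshotManager == nil {
		return nil
	}
	return v.SnapshotManager.CleanupOrphanedData()
}

func main() {
	bare := &Vaultik{}
	fmt.Println(bare.cleanupAfterRemoval()) // no panic, nothing to clean

	sm := &SnapshotManager{}
	full := &Vaultik{SnapshotManager: sm}
	fmt.Println(full.cleanupAfterRemoval(), sm.cleanups)
}
```
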
104
README.md
@@ -100,9 +100,8 @@ vaultik [--config <path>] snapshot list [--json]
 vaultik [--config <path>] snapshot verify <snapshot-id> [--deep] [--json]
 vaultik [--config <path>] snapshot purge [--keep-latest | --older-than <duration>] [--snapshot <name>...] [--force]
 vaultik [--config <path>] snapshot remove <snapshot-id|--all> [--dry-run] [--force] [--remote] [--json]
-vaultik [--config <path>] snapshot prune
 vaultik [--config <path>] snapshot cleanup
-vaultik [--config <path>] restore <snapshot-id> <target-dir> [paths...] [--verify]
+vaultik [--config <path>] snapshot restore <snapshot-id> <target-dir> [paths...] [--verify]
 vaultik [--config <path>] prune [--force] [--json]
 vaultik [--config <path>] info
 vaultik [--config <path>] remote info [--json]
@@ -123,7 +122,7 @@ vaultik version
 
 ### environment variables
 
-* `VAULTIK_AGE_SECRET_KEY`: Age private key for decryption (required for `restore` and `verify --deep`)
+* `VAULTIK_AGE_SECRET_KEY`: Age private key for decryption (required for `snapshot restore` and `snapshot verify --deep`)
 * `VAULTIK_CONFIG`: Path to config file (overridden by `--config`)
 * `VAULTIK_INDEX_PATH`: Override local SQLite index path
 
@@ -157,11 +156,13 @@ existing file. Created with mode `0600` since it will contain credentials.
 **`config edit`**: Open the config file in `$EDITOR` (falls back to `vi`).
 
 **`config get`**: Print a config value addressed by dotted YAML path
-(e.g. `vaultik config get s3.bucket`). Non-scalar values print as YAML.
+(e.g. `vaultik config get storage_url`). Non-scalar values print as YAML.
 
 **`config set`**: Set a scalar config value by dotted YAML path
-(e.g. `vaultik config set compression_level 9`). Comments and formatting
-in the file are preserved; intermediate maps are created as needed.
+(e.g. `vaultik config set compression_level 9`,
+`vaultik config set storage_url "file:///mnt/backups"`). Comments and
+formatting in the file are preserved; intermediate maps are created as
+needed.
 
 **`snapshot create`**: Perform incremental backup of configured snapshots.
 * Optional snapshot names argument to create specific snapshots (default: all)
@@ -176,7 +177,11 @@ in the file are preserved; intermediate maps are created as needed.
 * `--keep-newer-than <duration>`: With `--prune`, keep snapshots newer than
 this duration instead of only the latest (e.g. `4w`, `30d`, `6mo`, `1y`)
 
-**`snapshot list`**: List all snapshots with their timestamps and sizes.
+**`snapshot list`**: Show every snapshot known to the destination
+store with timestamps and three sizes per snapshot (compressed
+remote size; total uncompressed chunk size; size of chunks newly
+referenced by that snapshot). The uncompressed and "new chunk"
+columns show `<remote only>` for snapshots not in the local index.
 * `--json`: Output in JSON format
 
 **`snapshot verify`**: Verify snapshot integrity.
@@ -194,28 +199,31 @@ latest globally).
 * `--force`: Skip confirmation prompt
 
 **`snapshot remove`**: Remove a specific snapshot from the local database.
+Automatically cleans up local rows (files, chunks, blobs) that the removed
+snapshot was the last referrer for — you don't need a separate prune step
+after removal.
 * `--remote`: Also remove snapshot metadata from remote storage
 * `--all`: Remove all snapshots (requires `--force`)
 * `--dry-run`: Show what would be deleted without deleting
 * `--force`: Skip confirmation prompt
 * `--json`: Output result as JSON
 
-**`snapshot prune`**: Clean orphaned data from the local database (files,
-chunks, blobs not referenced by any snapshot).
-
 **`snapshot cleanup`**: Remove stale local snapshot records that have no
 corresponding metadata in remote storage. These are typically left behind
 by incomplete or interrupted backups. Does not touch remote storage.
 
-**`restore`**: Restore files from a backup snapshot.
+**`snapshot restore`**: Restore files from a backup snapshot.
 * Requires `VAULTIK_AGE_SECRET_KEY` environment variable
 * Optional path arguments to restore specific files/directories (default: all)
 * Preserves file permissions, timestamps, ownership (ownership requires root),
 symlinks, and empty directories
 * `--verify`: After restoring, verify every file's chunk hashes match
 
-**`prune`**: Remove unreferenced blobs from remote storage.
-* Scans all snapshot manifests for referenced blobs, deletes any blob not referenced
+**`prune`**: Tidy up everything that isn't needed. Removes orphaned local
+database rows (files, chunks, blobs no longer referenced by any completed
+snapshot) AND deletes unreferenced blobs from remote storage. `snapshot
+create --prune`, `snapshot remove`, and `snapshot purge` run the same
+cleanup automatically; this is the manual entry point for the same work.
 * `--force`: Skip confirmation prompt
 * `--json`: Output stats as JSON
 
@@ -385,13 +393,71 @@ Key fields:
 
 ## roadmap
 
-Items for future releases:
+Items still to do before / shortly after 1.0. Loosely ordered by
+priority.
 
-* Error-condition tests (network failures, disk full, corrupted/missing blobs)
-* Parallel blob downloads during restore
-* Bandwidth limiting (`--bwlimit`)
-* Security audit of encryption implementation
-* Man pages and richer `--help` examples
+### correctness and operability
+
+* **Security audit of the encryption implementation.** Pre-1.0
+blocker if we're advertising "secure" at the top of this README.
+age + zstd + content-defined chunking is mostly off-the-shelf
+pieces, but the seams (key handling, recipient parsing, manifest
+trust boundary, restore-time identity validation) need an outside
+read.
+* **Error-condition tests.** Today's coverage is the happy path
+plus a few specific regressions. Need fault-injection coverage:
+network failures mid-blob, disk-full during restore, corrupted /
+truncated / missing blobs, partial uploads, kill -9 between
+manifest and db.zst.age writes.
+* **Verify restored content end-to-end in CI.** The current
+integration test does this for a small synthetic snapshot but
+not at scale. A nightly job against a multi-GB representative
+snapshot would catch silent regressions in the chunker, packer,
+or restore planner.
+
+### performance
+
+* **Parallel blob downloads during restore.** Single-stream right
+now. With a fast S3 endpoint and a multi-core machine restore is
+bound by per-blob fetch + decrypt + decompress; running N of
+those in parallel against the disk cache would close most of the
+remaining gap. Needs to interact correctly with the locality
+planner and sweeper.
+* **Bandwidth limiting (`--bwlimit`).** Both upload and download.
+Useful for backing up over a shared link. Tricky to make work
+correctly with the parallel-download story.
+* **Restart of interrupted restore.** Today restore is restartable
+in the sense that re-running it overwrites partial output; it
+doesn't resume from where it stopped or skip already-present
+files. A `--resume` mode that checks targets before fetching
+blobs would matter for very large restores.
+
+### usability
+
+* **Man pages and richer `--help` examples.** Cobra generates
+basic help; man pages would be a separate target.
+* **`--bwlimit` style human-readable size flags** across the
+command surface where they're currently raw integers.
+* **`vaultik snapshot diff <a> <b>`** — show which files changed
+between two snapshots without restoring either.
+* **Status reporting hook for `--cron`.** When a backup fails
+silently in cron, the user has no idea. A configurable
+webhook / email / `notify-send` hook on completion (success and
+failure) would close the loop.
+
+### infrastructure
+
+* **Cross-machine restore documentation.** The "restore from
+another host" workflow works but isn't documented as a
+first-class operation in this README. Worth a dedicated section
+once it's settled.
+* **Schema migrations.** Currently nonexistent — pre-1.0 schema
+changes are handled by `vaultik database purge` plus a full
+re-scan. Post-1.0 we'll need a migration story to keep existing
+index databases usable across upgrades.
+* **Storage backend coverage tests.** S3, file://, and rclone://
+all share the Storer interface but the rclone path is the least
+exercised in CI.
 
 ---
 