Two related changes, both addressing leakage and brittleness around
the public bytes the destination store sees.
First, every remote storage path that previously embedded a human
snapshot ID (e.g. metadata/heraklion_berlin.sneak.fs.photos.2026.
catalog_2026-06-24T07:00:15Z/...) now uses the hashed remote key:
RemoteSnapshotKey(id) = hex(SHA256(SHA256("vaultik|" + id)))
Applied at:
* uploadSnapshotArtifacts (snapshot create write path)
* the manifest.json.zst snapshot_id field: the manifest is
unencrypted, so the human ID would otherwise be readable to
anyone with read access to the bucket
* cleanupIncompleteSnapshots metadata-existence probe
* snapshot restore / verify (downloadSnapshotDB,
loadVerificationData)
* downloadManifestByKey, deleteRemoteSnapshotByKey
* CleanupLocalSnapshots reconciliation
* the locally-driven removal paths (RemoveSnapshot,
RemoveAllSnapshots, confirmAndExecutePurge)
The local index database keeps human IDs everywhere — the hash is a
boundary translation, not a rename. A directory listing of the
backup destination now looks like
"metadata/<64-hex>/{db.zst.age,manifest.json.zst}" with no host,
snapshot-name, or timestamp information visible.
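
A minimal sketch of the boundary translation described above. RemoteSnapshotKey follows the formula given in this message; remoteMetadataPath and the sample snapshot ID are hypothetical names invented for illustration and are not part of the change.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// remoteKeyPrefix domain-separates the hash, per the formula in the text.
const remoteKeyPrefix = "vaultik|"

// RemoteSnapshotKey computes hex(SHA256(SHA256("vaultik|" + id))).
func RemoteSnapshotKey(snapshotID string) string {
	first := sha256.Sum256([]byte(remoteKeyPrefix + snapshotID))
	second := sha256.Sum256(first[:])
	return hex.EncodeToString(second[:])
}

// remoteMetadataPath is a hypothetical helper (an assumption, not from
// the commit): it builds the storage-side path for one metadata object,
// so the human ID never appears in the object key.
func remoteMetadataPath(snapshotID, object string) string {
	return path.Join("metadata", RemoteSnapshotKey(snapshotID), object)
}

func main() {
	id := "example-host_photos_2026-06-24T07:00:15Z" // made-up human ID
	fmt.Println(remoteMetadataPath(id, "db.zst.age"))
	fmt.Println(remoteMetadataPath(id, "manifest.json.zst"))
}
```

Both printed paths share the same 64-character hex directory, and neither contains the host, snapshot name, or timestamp from the human ID.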
Second, snapshot list no longer fails just because remote storage is
unreachable, and only consults the remote when the local machine can
plausibly decrypt:
* Listing is always driven by the local index database — that's
what holds the human IDs, timestamps, and per-snapshot stats
that the table actually shows.
* If no age secret key is configured, we skip remote listing
entirely (the box is treated as a write-only backup machine, so
there's no value in showing it remote-only keys it could never
restore).
* If a key IS configured, we try the remote listing; failures
(volume unmounted, permission denied, network error) downgrade
to a warning instead of aborting the command.
* When the remote listing succeeds, we cross-reference by hashing
each local human ID and diffing against the returned key set.
Local-only snapshots get the existing "stale local record"
cleanup hint; remote-only keys are surfaced as a single
"NOTE: N remote snapshot(s) found in backup destination store
but not in local database" line.
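
The listing rules above could be sketched roughly as follows. remoteKeySet and crossReference are hypothetical helpers, and the list callback stands in for whatever the storage backend actually exposes; only the hash formula and the NOTE wording come from this message.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"os"
)

// RemoteSnapshotKey computes hex(SHA256(SHA256("vaultik|" + id))).
func RemoteSnapshotKey(id string) string {
	first := sha256.Sum256([]byte("vaultik|" + id))
	second := sha256.Sum256(first[:])
	return hex.EncodeToString(second[:])
}

// errUnmounted is a sample failure used by the demo below.
var errUnmounted = errors.New("volume unmounted")

// remoteKeySet (hypothetical) decides whether to consult the remote.
// ok is false when there is no secret key or when listing fails; a
// failure only produces a warning, never an abort.
func remoteKeySet(haveSecretKey bool, list func() ([]string, error)) (map[string]struct{}, bool) {
	if !haveSecretKey {
		return nil, false // write-only machine: skip the remote entirely
	}
	keys, err := list()
	if err != nil {
		fmt.Fprintln(os.Stderr, "warning: remote listing failed:", err)
		return nil, false
	}
	set := make(map[string]struct{}, len(keys))
	for _, k := range keys {
		set[k] = struct{}{}
	}
	return set, true
}

// crossReference (hypothetical) hashes each local human ID and diffs
// the result against the remote key set.
func crossReference(localIDs []string, remote map[string]struct{}) (localOnly []string, remoteOnly int) {
	seen := make(map[string]struct{}, len(localIDs))
	for _, id := range localIDs {
		key := RemoteSnapshotKey(id)
		seen[key] = struct{}{}
		if _, ok := remote[key]; !ok {
			localOnly = append(localOnly, id)
		}
	}
	for key := range remote {
		if _, ok := seen[key]; !ok {
			remoteOnly++
		}
	}
	return localOnly, remoteOnly
}

func main() {
	localIDs := []string{"snap-a", "snap-b"} // made-up human IDs
	list := func() ([]string, error) {
		return []string{RemoteSnapshotKey("snap-a"), RemoteSnapshotKey("snap-elsewhere")}, nil
	}
	if remote, ok := remoteKeySet(true, list); ok {
		localOnly, remoteOnly := crossReference(localIDs, remote)
		fmt.Println("stale local records:", localOnly)
		if remoteOnly > 0 {
			fmt.Printf("NOTE: %d remote snapshot(s) found in backup destination store but not in local database\n", remoteOnly)
		}
	}
	_, ok := remoteKeySet(true, func() ([]string, error) { return nil, errUnmounted })
	fmt.Println("remote consulted after failure:", ok)
}
```

Because the hash is one-way, remote-only keys can only be counted, not named, which is consistent with the single summary line.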
FileStorer construction also no longer does an eager mkdir — the
basePath is recorded and the directory is created lazily on first
write. A missing or unmounted destination during `snapshot list`
should NOT block the command, and now it doesn't.
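
One way to picture the lazy directory creation, as a sketch only: the FileStorer type below is an illustrative stand-in with an assumed Put method, not the project's actual implementation, and demo and exists are helpers invented for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// FileStorer is an illustrative stand-in for the real type.
type FileStorer struct {
	basePath string
}

// NewFileStorer only records basePath; it touches no filesystem state,
// so a missing or unmounted destination cannot fail construction.
func NewFileStorer(basePath string) *FileStorer {
	return &FileStorer{basePath: basePath}
}

// Put (assumed method name) creates parent directories on demand and
// then writes the object.
func (s *FileStorer) Put(key string, data []byte) error {
	full := filepath.Join(s.basePath, filepath.FromSlash(key))
	if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
		return fmt.Errorf("create parent dir: %w", err)
	}
	return os.WriteFile(full, data, 0o600)
}

// exists reports whether a path can be stat'ed.
func exists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

// demo reports whether the base directory exists right after
// construction and again after the first write.
func demo() (afterNew, afterPut bool, err error) {
	tmp, err := os.MkdirTemp("", "filestorer-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(tmp)

	base := filepath.Join(tmp, "dest") // does not exist yet
	s := NewFileStorer(base)
	afterNew = exists(base)

	if err := s.Put("metadata/abc/manifest.json.zst", []byte("x")); err != nil {
		return afterNew, false, err
	}
	return afterNew, exists(base), nil
}

func main() {
	afterNew, afterPut, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "demo failed:", err)
		os.Exit(1)
	}
	fmt.Println("base dir exists after construction:", afterNew)
	fmt.Println("base dir exists after first write:", afterPut)
}
```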
RemoveAllSnapshots is rewritten to drive deletion from the local
index instead of from a remote listing, hashing each local ID to
find the corresponding remote key. Orphan remote keys (no matching
local snapshot) are handled separately and only deleted when
--remote is set. Existing tests are updated to hash storage paths
through the new RemoteSnapshotKey helper.
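
The local-driven removal could look roughly like this. It is a sketch under stated assumptions: removeAll is a hypothetical function, the remote store is modeled as an in-memory key set, and the sketch assumes the local-driven pass always deletes the matching remote key, which the real command may gate differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// RemoteSnapshotKey computes hex(SHA256(SHA256("vaultik|" + id))).
func RemoteSnapshotKey(id string) string {
	first := sha256.Sum256([]byte("vaultik|" + id))
	second := sha256.Sum256(first[:])
	return hex.EncodeToString(second[:])
}

// removeAll (hypothetical) drives deletion from the local ID list.
// The remote map stands in for the storage backend. Keys with no local
// record are orphans and are deleted only when deleteOrphans is true,
// mirroring the --remote flag described in the text.
func removeAll(localIDs []string, remote map[string]struct{}, deleteOrphans bool) (removed, orphansKept int) {
	for _, id := range localIDs {
		key := RemoteSnapshotKey(id)
		if _, ok := remote[key]; ok {
			delete(remote, key)
			removed++
		}
	}
	// Whatever is left has no matching local snapshot.
	for key := range remote {
		if deleteOrphans {
			delete(remote, key)
			removed++
		} else {
			orphansKept++
		}
	}
	return removed, orphansKept
}

func main() {
	remote := map[string]struct{}{
		RemoteSnapshotKey("snap-a"):         {},
		RemoteSnapshotKey("snap-elsewhere"): {}, // orphan: no local record
	}
	removed, kept := removeAll([]string{"snap-a", "snap-b"}, remote, false)
	fmt.Printf("removed=%d orphans kept=%d\n", removed, kept)
}
```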
The hash format is a hard pre-1.0 break: existing remote snapshots
written under the human-ID path scheme are no longer readable; they
need to be either re-uploaded under the new scheme or manually
renamed. There is no fallback path, in keeping with the project policy of
"no migrations pre-1.0."
41 lines
1.6 KiB
Go
package snapshot

import (
	"crypto/sha256"
	"encoding/hex"
)

// remoteKeyPrefix is mixed into the snapshot ID hash so the resulting
// hex digest is domain-separated from any other "double SHA256 of a
// string" identifier the user might also use. Keeping this stable is a
// hard compatibility requirement: changing it invalidates every
// existing snapshot's remote storage path.
const remoteKeyPrefix = "vaultik|"

// RemoteSnapshotKey returns the storage-side identifier for a snapshot
// given its human snapshot ID. It is hex(SHA256(SHA256(prefix + id))).
// The two SHA256 rounds match Bitcoin's "hash256" convention so the
// output looks like a 64-character hex blob with no exploitable
// structure visible to a remote observer.
//
// We use this in three places:
//
//   - the "metadata/<remote-key>/..." subdirectory on the storage
//     backend so a directory listing of the bucket / file:// dest
//     doesn't reveal hostnames, configured snapshot names, or backup
//     timestamps;
//   - the `snapshot_id` field of the unencrypted manifest.json.zst
//     for the same reason;
//   - any code path that needs to translate a known local snapshot ID
//     into the path it would occupy on remote storage.
//
// The human ID stays the user-visible handle everywhere else — local
// database joins, CLI arguments, summary lines, log fields — because
// it's never written to the public bytes once this function gates
// every storage-path construction.
func RemoteSnapshotKey(snapshotID string) string {
	first := sha256.Sum256([]byte(remoteKeyPrefix + snapshotID))
	second := sha256.Sum256(first[:])
	return hex.EncodeToString(second[:])
}