7 Commits

Author SHA1 Message Date
clawbot
ea8edd653f fix: replace O(n²) duplicate detection with map-based O(1) lookups
All checks were successful
check / check (pull_request) Successful in 2m27s
Replace linear scan deduplication of snapshot IDs in RemoveAllSnapshots()
and PruneBlobs() with map[string]bool for O(1) lookups.

Previously, each new snapshot ID was checked against the entire collected
slice via a linear scan, resulting in O(n²) overall complexity. Now a
'seen' map provides constant-time membership checks while preserving
insertion order in the slice.

closes #12
2026-03-19 06:05:44 -07:00
60b6746db9 schema: add ON DELETE CASCADE to snapshot_files.file_id and snapshot_blobs.blob_id FKs (#46)
All checks were successful
check / check (push) Successful in 2m47s
Add `ON DELETE CASCADE` to the two foreign keys that were missing it:

- `snapshot_files.file_id` → `files(id)`
- `snapshot_blobs.blob_id` → `blobs(id)`

This ensures that when a file or blob row is deleted, the corresponding snapshot junction rows are automatically cleaned up, consistent with the other CASCADE FKs already in the schema.

closes #19

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #46
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-19 14:03:39 +01:00
f28c8a73b7 fix: add ON DELETE CASCADE to uploads FK on snapshot_id (#44)
All checks were successful
check / check (push) Successful in 2m24s
The `uploads` table's foreign key on `snapshot_id` did not cascade deletes, unlike `snapshot_files` and `snapshot_blobs`. This caused FK violations when deleting snapshots with associated upload records (if FK enforcement is enabled) unless uploads were manually deleted first.

Adds `ON DELETE CASCADE` to the `snapshot_id` FK in `schema.sql` for consistency with the other snapshot-referencing tables.

`docker build .` passes (fmt-check, lint, all tests, build).

closes #18

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Reviewed-on: #44
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-19 13:59:27 +01:00
1c0f5b8eb2 Rename blob_fetch_stub.go to blob_fetch.go (#53)
All checks were successful
check / check (push) Successful in 4m28s
Renames `internal/vaultik/blob_fetch_stub.go` to `internal/vaultik/blob_fetch.go`.

The file contains production code (`hashVerifyReader`, `FetchAndDecryptBlob`), not stubs. The `_stub` suffix was a misnomer from the original implementation in [PR #39](#39).

Pure rename — no code changes. All tests, linting, and formatting pass.

closes #52

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #53
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-19 09:33:35 +01:00
689109a2b8 fix: remove destructive sync from ListSnapshots (#49)
Some checks failed
check / check (push) Has been cancelled
## Summary

`ListSnapshots()` silently deleted local snapshot records not found in remote storage. A list/read operation should not have destructive side effects.

## Changes

1. **Removed destructive sync from `ListSnapshots()`** — the inline loop that deleted local snapshots not present in remote storage has been removed entirely. `ListSnapshots()` now only reads and displays data.

2. **Improved `syncWithRemote()` cascade cleanup** — updated `syncWithRemote()` to use `deleteSnapshotFromLocalDB()` instead of directly calling `Repositories.Snapshots.Delete()`. This ensures proper cascade deletion of related records (`snapshot_files`, `snapshot_blobs`, `snapshot_uploads`) before deleting the snapshot record itself, matching the thorough cleanup that the removed `ListSnapshots` code was doing.

The explicit sync behavior remains available via `syncWithRemote()`, which is called by `PurgeSnapshots()`.

## Testing

- `docker build .` passes (lint, fmt-check, all tests, compilation)

closes #15

Co-authored-by: clawbot <clawbot@eeqj.de>
Reviewed-on: #49
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-19 09:32:52 +01:00
ac2f21a89d Refactor: break up oversized methods into smaller descriptive helpers (#41)
All checks were successful
check / check (push) Successful in 4m17s
Closes #40

Per sneak's feedback on PR #37: methods were too long. This PR breaks all methods over 100-150 lines into smaller, descriptively named helper methods.

## Refactored methods (8 total)

| Original | Lines | Helpers extracted |
|---|---|---|
| `createNamedSnapshot` | 214 | `resolveSnapshotPaths`, `scanAllDirectories`, `collectUploadStats`, `finalizeSnapshotMetadata`, `printSnapshotSummary`, `getSnapshotBlobSizes`, `formatUploadSpeed` |
| `ListSnapshots` | 159 | `listRemoteSnapshotIDs`, `reconcileLocalWithRemote`, `buildSnapshotInfoList`, `printSnapshotTable` |
| `PruneBlobs` | 170 | `collectReferencedBlobs`, `listUniqueSnapshotIDs`, `listAllRemoteBlobs`, `findUnreferencedBlobs`, `deleteUnreferencedBlobs` |
| `RunDeepVerify` | 182 | `loadVerificationData`, `runVerificationSteps`, `deepVerifyFailure` |
| `RemoteInfo` | 187 | `collectSnapshotMetadata`, `collectReferencedBlobsFromManifests`, `populateRemoteInfoResult`, `scanRemoteBlobStorage`, `printRemoteInfoTable` |
| `handleBlobReady` | 173 | `uploadBlobIfNeeded`, `makeUploadProgressCallback`, `recordBlobMetadata`, `cleanupBlobTempFile` |
| `processFileStreaming` | 146 | `updateChunkStats`, `addChunkToPacker`, `queueFileForBatchInsert` |
| `finalizeCurrentBlob` | 167 | `closeBlobWriter`, `buildChunkRefs`, `commitBlobToDatabase`, `deliverFinishedBlob` |

## Verification

- `go build ./...` 
- `make test`  (all tests pass)
- `golangci-lint run`  (0 issues)
- No behavioral changes, pure restructuring

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #41
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-19 00:23:45 +01:00
8c59f55096 fix: verify blob hash after download and decryption (closes #5) (#39)
All checks were successful
check / check (push) Successful in 2m27s
## Summary

Add double-SHA-256 hash verification of decrypted plaintext in `FetchAndDecryptBlob`. This ensures blob integrity during restore operations by comparing the computed hash against the expected blob hash before returning data to the caller.

The blob hash is `SHA256(SHA256(plaintext))` as produced by `blobgen.Writer.Sum256()`. Verification happens after decryption and decompression but before the data is used.

## Test

Added `blob_fetch_hash_test.go` with tests for:
- Correct hash passes verification
- Mismatched hash returns descriptive error

## make test output

```
golangci-lint run
0 issues.

ok  git.eeqj.de/sneak/vaultik/internal/blob       4.563s
ok  git.eeqj.de/sneak/vaultik/internal/blobgen    3.981s
ok  git.eeqj.de/sneak/vaultik/internal/chunker    4.127s
ok  git.eeqj.de/sneak/vaultik/internal/cli        1.499s
ok  git.eeqj.de/sneak/vaultik/internal/config     1.905s
ok  git.eeqj.de/sneak/vaultik/internal/crypto     0.519s
ok  git.eeqj.de/sneak/vaultik/internal/database   4.590s
ok  git.eeqj.de/sneak/vaultik/internal/globals    0.650s
ok  git.eeqj.de/sneak/vaultik/internal/models     0.779s
ok  git.eeqj.de/sneak/vaultik/internal/pidlock    2.945s
ok  git.eeqj.de/sneak/vaultik/internal/s3         3.286s
ok  git.eeqj.de/sneak/vaultik/internal/snapshot   3.979s
ok  git.eeqj.de/sneak/vaultik/internal/vaultik    4.418s
```

All tests pass, 0 lint issues.

Co-authored-by: user <user@Mac.lan guest wan>
Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Reviewed-on: #39
Co-authored-by: clawbot <sneak+clawbot@sneak.cloud>
Co-committed-by: clawbot <sneak+clawbot@sneak.cloud>
2026-03-19 00:21:11 +01:00
6 changed files with 215 additions and 83 deletions

View File

@@ -103,7 +103,7 @@ CREATE TABLE IF NOT EXISTS snapshot_files (
file_id TEXT NOT NULL, file_id TEXT NOT NULL,
PRIMARY KEY (snapshot_id, file_id), PRIMARY KEY (snapshot_id, file_id),
FOREIGN KEY (snapshot_id) REFERENCES snapshots(id) ON DELETE CASCADE, FOREIGN KEY (snapshot_id) REFERENCES snapshots(id) ON DELETE CASCADE,
FOREIGN KEY (file_id) REFERENCES files(id) FOREIGN KEY (file_id) REFERENCES files(id) ON DELETE CASCADE
); );
-- Index for efficient file lookups (used in orphan detection) -- Index for efficient file lookups (used in orphan detection)
@@ -116,7 +116,7 @@ CREATE TABLE IF NOT EXISTS snapshot_blobs (
blob_hash TEXT NOT NULL, blob_hash TEXT NOT NULL,
PRIMARY KEY (snapshot_id, blob_id), PRIMARY KEY (snapshot_id, blob_id),
FOREIGN KEY (snapshot_id) REFERENCES snapshots(id) ON DELETE CASCADE, FOREIGN KEY (snapshot_id) REFERENCES snapshots(id) ON DELETE CASCADE,
FOREIGN KEY (blob_id) REFERENCES blobs(id) FOREIGN KEY (blob_id) REFERENCES blobs(id) ON DELETE CASCADE
); );
-- Index for efficient blob lookups (used in orphan detection) -- Index for efficient blob lookups (used in orphan detection)
@@ -130,7 +130,7 @@ CREATE TABLE IF NOT EXISTS uploads (
size INTEGER NOT NULL, size INTEGER NOT NULL,
duration_ms INTEGER NOT NULL, duration_ms INTEGER NOT NULL,
FOREIGN KEY (blob_hash) REFERENCES blobs(blob_hash), FOREIGN KEY (blob_hash) REFERENCES blobs(blob_hash),
FOREIGN KEY (snapshot_id) REFERENCES snapshots(id) FOREIGN KEY (snapshot_id) REFERENCES snapshots(id) ON DELETE CASCADE
); );
-- Index for efficient snapshot lookups -- Index for efficient snapshot lookups

View File

@@ -0,0 +1,93 @@
package vaultik
import (
"context"
"crypto/sha256"
"encoding/hex"
"fmt"
"io"
"filippo.io/age"
"git.eeqj.de/sneak/vaultik/internal/blobgen"
)
// hashVerifyReader wraps a blobgen.Reader and verifies the double-SHA-256 hash
// of decrypted plaintext when Close is called. It reuses the hash that
// blobgen.Reader already computes internally via its TeeReader, avoiding
// redundant SHA-256 computation.
type hashVerifyReader struct {
reader *blobgen.Reader // underlying decrypted blob reader (has internal hasher)
fetcher io.ReadCloser // raw fetched stream (closed on Close)
blobHash string // expected double-SHA-256 hex
done bool // EOF reached
}
func (h *hashVerifyReader) Read(p []byte) (int, error) {
n, err := h.reader.Read(p)
if err == io.EOF {
h.done = true
}
return n, err
}
// Close verifies the hash (if the stream was fully read) and closes underlying readers.
func (h *hashVerifyReader) Close() error {
readerErr := h.reader.Close()
fetcherErr := h.fetcher.Close()
if h.done {
firstHash := h.reader.Sum256()
secondHasher := sha256.New()
secondHasher.Write(firstHash)
actualHashHex := hex.EncodeToString(secondHasher.Sum(nil))
if actualHashHex != h.blobHash {
return fmt.Errorf("blob hash mismatch: expected %s, got %s", h.blobHash[:16], actualHashHex[:16])
}
}
if readerErr != nil {
return readerErr
}
return fetcherErr
}
// FetchAndDecryptBlob downloads a blob, decrypts and decompresses it, and
// returns a streaming reader that computes the double-SHA-256 hash on the fly.
// The hash is verified when the returned reader is closed (after fully reading).
// This avoids buffering the entire blob in memory.
func (v *Vaultik) FetchAndDecryptBlob(ctx context.Context, blobHash string, expectedSize int64, identity age.Identity) (io.ReadCloser, error) {
rc, _, err := v.FetchBlob(ctx, blobHash, expectedSize)
if err != nil {
return nil, err
}
reader, err := blobgen.NewReader(rc, identity)
if err != nil {
_ = rc.Close()
return nil, fmt.Errorf("creating blob reader: %w", err)
}
return &hashVerifyReader{
reader: reader,
fetcher: rc,
blobHash: blobHash,
}, nil
}
// FetchBlob downloads a blob and returns a reader for the encrypted data.
func (v *Vaultik) FetchBlob(ctx context.Context, blobHash string, expectedSize int64) (io.ReadCloser, int64, error) {
blobPath := fmt.Sprintf("blobs/%s/%s/%s", blobHash[:2], blobHash[2:4], blobHash)
rc, err := v.Storage.Get(ctx, blobPath)
if err != nil {
return nil, 0, fmt.Errorf("downloading blob %s: %w", blobHash[:16], err)
}
info, err := v.Storage.Stat(ctx, blobPath)
if err != nil {
_ = rc.Close()
return nil, 0, fmt.Errorf("stat blob %s: %w", blobHash[:16], err)
}
return rc, info.Size, nil
}

View File

@@ -0,0 +1,100 @@
package vaultik_test
import (
"bytes"
"context"
"crypto/sha256"
"encoding/hex"
"io"
"strings"
"testing"
"filippo.io/age"
"git.eeqj.de/sneak/vaultik/internal/blobgen"
"git.eeqj.de/sneak/vaultik/internal/vaultik"
)
// TestFetchAndDecryptBlobVerifiesHash verifies that FetchAndDecryptBlob checks
// the double-SHA-256 hash of the decrypted plaintext against the expected blob hash.
func TestFetchAndDecryptBlobVerifiesHash(t *testing.T) {
identity, err := age.GenerateX25519Identity()
if err != nil {
t.Fatalf("generating identity: %v", err)
}
// Create test data and encrypt it using blobgen.Writer
plaintext := []byte("hello world test data for blob hash verification")
var encBuf bytes.Buffer
writer, err := blobgen.NewWriter(&encBuf, 1, []string{identity.Recipient().String()})
if err != nil {
t.Fatalf("creating blobgen writer: %v", err)
}
if _, err := writer.Write(plaintext); err != nil {
t.Fatalf("writing plaintext: %v", err)
}
if err := writer.Close(); err != nil {
t.Fatalf("closing writer: %v", err)
}
encryptedData := encBuf.Bytes()
// Compute correct double-SHA-256 hash of the plaintext (matches blobgen.Writer.Sum256)
firstHash := sha256.Sum256(plaintext)
secondHash := sha256.Sum256(firstHash[:])
correctHash := hex.EncodeToString(secondHash[:])
// Verify our hash matches what blobgen.Writer produces
writerHash := hex.EncodeToString(writer.Sum256())
if correctHash != writerHash {
t.Fatalf("hash computation mismatch: manual=%s, writer=%s", correctHash, writerHash)
}
// Set up mock storage with the blob at the correct path
mockStorage := NewMockStorer()
blobPath := "blobs/" + correctHash[:2] + "/" + correctHash[2:4] + "/" + correctHash
mockStorage.mu.Lock()
mockStorage.data[blobPath] = encryptedData
mockStorage.mu.Unlock()
tv := vaultik.NewForTesting(mockStorage)
ctx := context.Background()
t.Run("correct hash succeeds", func(t *testing.T) {
rc, err := tv.FetchAndDecryptBlob(ctx, correctHash, int64(len(encryptedData)), identity)
if err != nil {
t.Fatalf("expected success, got error: %v", err)
}
data, err := io.ReadAll(rc)
if err != nil {
t.Fatalf("reading stream: %v", err)
}
if err := rc.Close(); err != nil {
t.Fatalf("close (hash verification) failed: %v", err)
}
if !bytes.Equal(data, plaintext) {
t.Fatalf("decrypted data mismatch: got %q, want %q", data, plaintext)
}
})
t.Run("wrong hash fails", func(t *testing.T) {
// Use a fake hash that doesn't match the actual plaintext
fakeHash := strings.Repeat("ab", 32) // 64 hex chars
fakePath := "blobs/" + fakeHash[:2] + "/" + fakeHash[2:4] + "/" + fakeHash
mockStorage.mu.Lock()
mockStorage.data[fakePath] = encryptedData
mockStorage.mu.Unlock()
rc, err := tv.FetchAndDecryptBlob(ctx, fakeHash, int64(len(encryptedData)), identity)
if err != nil {
t.Fatalf("unexpected error opening stream: %v", err)
}
// Read all data — hash is verified on Close
_, _ = io.ReadAll(rc)
err = rc.Close()
if err == nil {
t.Fatal("expected error for mismatched hash, got nil")
}
if !strings.Contains(err.Error(), "hash mismatch") {
t.Fatalf("expected hash mismatch error, got: %v", err)
}
})
}

View File

@@ -1,55 +0,0 @@
package vaultik
import (
"context"
"fmt"
"io"
"filippo.io/age"
"git.eeqj.de/sneak/vaultik/internal/blobgen"
)
// FetchAndDecryptBlobResult holds the result of fetching and decrypting a blob.
type FetchAndDecryptBlobResult struct {
Data []byte
}
// FetchAndDecryptBlob downloads a blob, decrypts it, and returns the plaintext data.
func (v *Vaultik) FetchAndDecryptBlob(ctx context.Context, blobHash string, expectedSize int64, identity age.Identity) (*FetchAndDecryptBlobResult, error) {
rc, _, err := v.FetchBlob(ctx, blobHash, expectedSize)
if err != nil {
return nil, err
}
defer func() { _ = rc.Close() }()
reader, err := blobgen.NewReader(rc, identity)
if err != nil {
return nil, fmt.Errorf("creating blob reader: %w", err)
}
defer func() { _ = reader.Close() }()
data, err := io.ReadAll(reader)
if err != nil {
return nil, fmt.Errorf("reading blob data: %w", err)
}
return &FetchAndDecryptBlobResult{Data: data}, nil
}
// FetchBlob downloads a blob and returns a reader for the encrypted data.
func (v *Vaultik) FetchBlob(ctx context.Context, blobHash string, expectedSize int64) (io.ReadCloser, int64, error) {
blobPath := fmt.Sprintf("blobs/%s/%s/%s", blobHash[:2], blobHash[2:4], blobHash)
rc, err := v.Storage.Get(ctx, blobPath)
if err != nil {
return nil, 0, fmt.Errorf("downloading blob %s: %w", blobHash[:16], err)
}
info, err := v.Storage.Stat(ctx, blobPath)
if err != nil {
_ = rc.Close()
return nil, 0, fmt.Errorf("stat blob %s: %w", blobHash[:16], err)
}
return rc, info.Size, nil
}

View File

@@ -558,11 +558,23 @@ func (v *Vaultik) restoreRegularFile(
// downloadBlob downloads and decrypts a blob // downloadBlob downloads and decrypts a blob
func (v *Vaultik) downloadBlob(ctx context.Context, blobHash string, expectedSize int64, identity age.Identity) ([]byte, error) { func (v *Vaultik) downloadBlob(ctx context.Context, blobHash string, expectedSize int64, identity age.Identity) ([]byte, error) {
result, err := v.FetchAndDecryptBlob(ctx, blobHash, expectedSize, identity) rc, err := v.FetchAndDecryptBlob(ctx, blobHash, expectedSize, identity)
if err != nil { if err != nil {
return nil, err return nil, err
} }
return result.Data, nil
data, err := io.ReadAll(rc)
if err != nil {
_ = rc.Close()
return nil, fmt.Errorf("reading blob data: %w", err)
}
// Close triggers hash verification
if err := rc.Close(); err != nil {
return nil, err
}
return data, nil
} }
// verifyRestoredFiles verifies that all restored files match their expected chunk hashes // verifyRestoredFiles verifies that all restored files match their expected chunk hashes

View File

@@ -419,7 +419,7 @@ func (v *Vaultik) listRemoteSnapshotIDs() (map[string]bool, error) {
return remoteSnapshots, nil return remoteSnapshots, nil
} }
// reconcileLocalWithRemote removes local snapshots not in remote and returns the surviving local map // reconcileLocalWithRemote builds a map of local snapshots keyed by ID for cross-referencing with remote
func (v *Vaultik) reconcileLocalWithRemote(remoteSnapshots map[string]bool) (map[string]*database.Snapshot, error) { func (v *Vaultik) reconcileLocalWithRemote(remoteSnapshots map[string]bool) (map[string]*database.Snapshot, error) {
localSnapshots, err := v.Repositories.Snapshots.ListRecent(v.ctx, 10000) localSnapshots, err := v.Repositories.Snapshots.ListRecent(v.ctx, 10000)
if err != nil { if err != nil {
@@ -431,19 +431,6 @@ func (v *Vaultik) reconcileLocalWithRemote(remoteSnapshots map[string]bool) (map
localSnapshotMap[s.ID.String()] = s localSnapshotMap[s.ID.String()] = s
} }
for _, snap := range localSnapshots {
snapshotIDStr := snap.ID.String()
if !remoteSnapshots[snapshotIDStr] {
log.Info("Removing local snapshot not found in remote", "snapshot_id", snap.ID)
if err := v.deleteSnapshotFromLocalDB(snapshotIDStr); err != nil {
log.Error("Failed to delete local snapshot", "snapshot_id", snap.ID, "error", err)
} else {
log.Info("Deleted local snapshot not found in remote", "snapshot_id", snap.ID)
delete(localSnapshotMap, snapshotIDStr)
}
}
}
return localSnapshotMap, nil return localSnapshotMap, nil
} }
@@ -872,7 +859,7 @@ func (v *Vaultik) syncWithRemote() error {
snapshotIDStr := snapshot.ID.String() snapshotIDStr := snapshot.ID.String()
if !remoteSnapshots[snapshotIDStr] { if !remoteSnapshots[snapshotIDStr] {
log.Info("Removing local snapshot not found in remote", "snapshot_id", snapshot.ID) log.Info("Removing local snapshot not found in remote", "snapshot_id", snapshot.ID)
if err := v.Repositories.Snapshots.Delete(v.ctx, snapshotIDStr); err != nil { if err := v.deleteSnapshotFromLocalDB(snapshotIDStr); err != nil {
log.Error("Failed to delete local snapshot", "snapshot_id", snapshot.ID, "error", err) log.Error("Failed to delete local snapshot", "snapshot_id", snapshot.ID, "error", err)
} else { } else {
removedCount++ removedCount++
@@ -1001,6 +988,7 @@ func (v *Vaultik) listAllRemoteSnapshotIDs() ([]string, error) {
log.Info("Listing all snapshots") log.Info("Listing all snapshots")
objectCh := v.Storage.ListStream(v.ctx, "metadata/") objectCh := v.Storage.ListStream(v.ctx, "metadata/")
seen := make(map[string]bool)
var snapshotIDs []string var snapshotIDs []string
for object := range objectCh { for object := range objectCh {
if object.Err != nil { if object.Err != nil {
@@ -1015,14 +1003,8 @@ func (v *Vaultik) listAllRemoteSnapshotIDs() ([]string, error) {
} }
if strings.HasSuffix(object.Key, "/") || strings.Contains(object.Key, "/manifest.json.zst") { if strings.HasSuffix(object.Key, "/") || strings.Contains(object.Key, "/manifest.json.zst") {
sid := parts[1] sid := parts[1]
found := false if !seen[sid] {
for _, id := range snapshotIDs { seen[sid] = true
if id == sid {
found = true
break
}
}
if !found {
snapshotIDs = append(snapshotIDs, sid) snapshotIDs = append(snapshotIDs, sid)
} }
} }