9 Commits

Author SHA1 Message Date
2089fc8292 Merge branch 'next' into fix/issue-26 2026-02-09 01:45:20 +01:00
ebaf2a65ca Fix AddFile to verify actual bytes read matches declared size (closes #25) (#30)
After reading file content, verify `totalRead == size` and return an error on mismatch.

Co-authored-by: clawbot <clawbot@openclaw>
Reviewed-on: #30
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-02-09 01:35:07 +01:00
clawbot
6948c65012 Specify and enforce path invariants in Builder
Add ValidatePath() enforcing: valid UTF-8, forward-slash only,
relative paths, no '..' segments, no empty segments. Called from
both AddFile and AddFileWithHash. Proto comments document the rules.

Closes #26
2026-02-08 16:12:06 -08:00
4b80c0067b docs: replace TODO.md with design questions and implementation plan 2026-02-08 18:40:31 +01:00
5ab092098b progress 2026-02-08 09:25:58 -08:00
4a2060087d Add GPG signature verification on manifest load
- Implement gpgVerify function that creates a temporary keyring to verify
  detached signatures against embedded public keys
- Signature verification happens during deserialization after hash
  validation but before decompression
- Extract signatureString() as a method on manifest for generating the
  canonical signature string (MAGIC-UUID-MULTIHASH)
- Add --require-signature flag to check command to mandate signature from
  a specific GPG key ID
- Expose IsSigned() and Signer() methods on Checker for signature status
2025-12-18 05:56:16 -08:00
213364bab5 Add UUID to manifest and verify integrity before decompression
- Add UUID field to both inner and outer manifest messages
- Generate random v4 UUID when creating manifest
- Hash compressed data (not uncompressed) for integrity check
- Verify hash before decompression to prevent malicious payloads
- Validate UUIDs are proper format and match between inner/outer
- Sign string format: MAGIC-UUID-MULTIHASH
2025-12-18 02:20:51 -08:00
778999a285 Add GPG signing support for manifest generation
- Add --sign-key flag and MFER_SIGN_KEY env var to gen and freshen commands
- Sign inner message multihash with GPG detached signature
- Include signer fingerprint and public key in outer wrapper
- Add comprehensive tests with temporary GPG keyring
- Increase test timeout to 10s for GPG key generation
2025-12-18 02:12:54 -08:00
308c583d57 Remove codebase structure section from README
godoc provides this documentation automatically
2025-12-18 01:38:13 -08:00
21 changed files with 1669 additions and 233 deletions

View File

@@ -24,7 +24,7 @@ run: ./bin/mfer
ci: test ci: test
test: $(SOURCEFILES) mfer/mf.pb.go test: $(SOURCEFILES) mfer/mf.pb.go
go test -v --timeout 3s ./... go test -v --timeout 10s ./...
$(PROTOC_GEN_GO): $(PROTOC_GEN_GO):
test -e $(PROTOC_GEN_GO) || go install -v google.golang.org/protobuf/cmd/protoc-gen-go@v1.28.1 test -e $(PROTOC_GEN_GO) || go install -v google.golang.org/protobuf/cmd/protoc-gen-go@v1.28.1

193
README.md
View File

@@ -52,199 +52,6 @@ Reading file contents and computing cryptographic hashes for manifest generation
- **NO_COLOR:** Respect the `NO_COLOR` environment variable for disabling colored output. - **NO_COLOR:** Respect the `NO_COLOR` environment variable for disabling colored output.
- **Options pattern:** Use `NewWithOptions(opts *Options)` constructor pattern for configurable types. - **Options pattern:** Use `NewWithOptions(opts *Options)` constructor pattern for configurable types.
# Codebase Structure
## cmd/mfer/
### main.go
- **Variables**
- `Appname string` - Application name
- `Version string` - Version string (set at build time)
- `Gitrev string` - Git revision (set at build time)
## internal/cli/
### entry.go
- **Variables**
- `NO_COLOR bool` - Disables color output when NO_COLOR env var is set
- **Functions**
- `Run(Appname, Version, Gitrev string) int` - Main entry point for the CLI
### mfer.go
- **Types**
- `CLIApp struct` - Main CLI application container
- **Methods**
- `(*CLIApp) VersionString() string` - Returns formatted version string
## internal/log/
### log.go
- **Functions**
- `Init()` - Initializes the logger
- `Info(arg string)` - Logs at info level
- `Infof(format string, args ...interface{})` - Logs at info level with formatting
- `Debug(arg string)` - Logs at debug level with caller info
- `Debugf(format string, args ...interface{})` - Logs at debug level with formatting and caller info
- `Dump(args ...interface{})` - Logs spew dump at debug level
- `Progressf(format string, args ...interface{})` - Prints progress message (overwrites current line)
- `ProgressDone()` - Completes progress line with newline
- `EnableDebugLogging()` - Sets log level to debug
- `SetLevel(arg log.Level)` - Sets log level
- `SetLevelFromVerbosity(l int)` - Sets log level from verbosity count
- `GetLevel() log.Level` - Returns current log level
- `GetLogger() *log.Logger` - Returns underlying logger
- `WithError(e error) *log.Entry` - Returns log entry with error attached
- `DisableStyling()` - Disables colors and styling (for NO_COLOR)
## internal/scanner/
### scanner.go
- **Types**
- `Options struct` - Options for scanner behavior
- `IncludeDotfiles bool` - Include dot (hidden) files (excluded by default)
- `FollowSymLinks bool`
- `EnumerateStatus struct` - Progress information for enumeration phase
- `FilesFound int64`
- `BytesFound int64`
- `ScanStatus struct` - Progress information for scan phase
- `TotalFiles int64`
- `ScannedFiles int64`
- `TotalBytes int64`
- `ScannedBytes int64`
- `BytesPerSec float64`
- `ETA time.Duration`
- `FileEntry struct` - Represents an enumerated file
- `Path string` - Relative path (used in manifest)
- `AbsPath string` - Absolute path (used for reading file content)
- `Size int64`
- `Mtime time.Time`
- `Ctime time.Time`
- `Scanner struct` - Accumulates files and generates manifests
- **Functions**
- `New() *Scanner` - Creates a new Scanner with default options
- `NewWithOptions(opts *Options) *Scanner` - Creates a new Scanner with given options
- **Methods (Enumeration Phase)**
- `(*Scanner) EnumerateFile(path string) error` - Enumerates a single file, calling stat() for metadata
- `(*Scanner) EnumeratePath(inputPath string, progress chan<- EnumerateStatus) error` - Walks a directory and enumerates all files
- `(*Scanner) EnumeratePaths(progress chan<- EnumerateStatus, inputPaths ...string) error` - Walks multiple directories
- `(*Scanner) EnumerateFS(afs afero.Fs, basePath string, progress chan<- EnumerateStatus) error` - Walks an afero filesystem
- **Methods (Accessors)**
- `(*Scanner) Files() []*FileEntry` - Returns copy of all enumerated files
- `(*Scanner) FileCount() int64` - Returns number of files
- `(*Scanner) TotalBytes() int64` - Returns total size of all files
- **Methods (Scan Phase)**
- `(*Scanner) ToManifest(ctx context.Context, w io.Writer, progress chan<- ScanStatus) error` - Reads file contents, computes hashes, generates manifest
## internal/checker/
### checker.go
- **Types**
- `Result struct` - Outcome of checking a single file
- `Path string` - File path from manifest
- `Status Status` - Verification status
- `Message string` - Error or status message
- `Status int` - Verification status enumeration
- `StatusOK` - File matches manifest
- `StatusMissing` - File not found
- `StatusSizeMismatch` - File size differs from manifest
- `StatusHashMismatch` - File hash differs from manifest
- `StatusError` - Error occurred during verification
- `CheckStatus struct` - Progress information for check operation
- `TotalFiles int64`
- `CheckedFiles int64`
- `TotalBytes int64`
- `CheckedBytes int64`
- `BytesPerSec float64`
- `ETA time.Duration`
- `Failures int64`
- `Checker struct` - Verifies files against a manifest
- **Functions**
- `NewChecker(manifestPath string, basePath string) (*Checker, error)` - Creates a new Checker for the given manifest and base path
- **Methods**
- `(s Status) String() string` - Returns string representation of status
- `(*Checker) FileCount() int64` - Returns number of files in the manifest
- `(*Checker) TotalBytes() int64` - Returns total size of all files in manifest
- `(*Checker) Check(ctx context.Context, results chan<- Result, progress chan<- CheckStatus) error` - Verifies all files against the manifest
## mfer/
### manifest.go
- **Types**
- `manifest struct` - Internal representation of a manifest file
- **Methods**
- `(*manifest) Files() []*MFFilePath` - Returns all file entries from a loaded manifest
### builder.go
- **Types**
- `FileHashProgress struct` - Progress info during file hashing (BytesRead int64)
- `Builder struct` - Constructs manifests by adding files one at a time
- **Functions**
- `NewBuilder() *Builder` - Creates a new Builder
- **Methods**
- `(*Builder) AddFile(path string, size int64, mtime time.Time, reader io.Reader, progress chan<- FileHashProgress) (int64, error)` - Reads file, computes hash, adds to manifest
- `(*Builder) AddFileWithHash(path string, size int64, mtime time.Time, hash []byte) error` - Adds file with pre-computed hash
- `(*Builder) FileCount() int` - Returns number of files added
- `(*Builder) Build(w io.Writer) error` - Finalizes and writes manifest
### serialize.go
- **Constants**
- `MAGIC string` - Magic bytes prefix for manifest files ("ZNAVSRFG")
### deserialize.go
- **Functions**
- `NewManifestFromReader(input io.Reader) (*manifest, error)` - Reads and parses manifest from io.Reader
- `NewManifestFromFile(fs afero.Fs, path string) (*manifest, error)` - Reads and parses manifest from file path
### mf.pb.go (generated from mf.proto)
- **Enum Types**
- `MFFileOuter_Version` - Outer file format version
- `MFFileOuter_VERSION_NONE`
- `MFFileOuter_VERSION_ONE`
- `MFFileOuter_CompressionType` - Compression type for inner message
- `MFFileOuter_COMPRESSION_NONE`
- `MFFileOuter_COMPRESSION_ZSTD`
- `MFFile_Version` - Inner file format version
- `MFFile_VERSION_NONE`
- `MFFile_VERSION_ONE`
- **Message Types**
- `Timestamp struct` - Timestamp with seconds and nanoseconds
- `GetSeconds() int64`
- `GetNanos() int32`
- `MFFileOuter struct` - Outer wrapper containing compressed/signed inner message
- `GetVersion() MFFileOuter_Version`
- `GetCompressionType() MFFileOuter_CompressionType`
- `GetSize() int64`
- `GetSha256() []byte`
- `GetInnerMessage() []byte`
- `GetSignature() []byte`
- `GetSigner() []byte`
- `GetSigningPubKey() []byte`
- `MFFilePath struct` - Individual file entry in manifest
- `GetPath() string`
- `GetSize() int64`
- `GetHashes() []*MFFileChecksum`
- `GetMimeType() string`
- `GetMtime() *Timestamp`
- `GetCtime() *Timestamp`
- `GetAtime() *Timestamp`
- `MFFileChecksum struct` - File checksum using multihash
- `GetMultiHash() []byte`
- `MFFile struct` - Inner manifest containing file list
- `GetVersion() MFFile_Version`
- `GetFiles() []*MFFilePath`
- `GetCreatedAt() *Timestamp`
# Build Status # Build Status
[![Build Status](https://drone.datavi.be/api/badges/sneak/mfer/status.svg)](https://drone.datavi.be/sneak/mfer) [![Build Status](https://drone.datavi.be/api/badges/sneak/mfer/status.svg)](https://drone.datavi.be/sneak/mfer)

129
TODO.md
View File

@@ -1,19 +1,122 @@
# TODO # TODO: mfer 1.0
## Critical ## Design Questions
- [ ] Fix broken error comparison in `internal/checker/checker.go:195` - `errors.Is(err, errors.New("file does not exist"))` always returns false because `errors.New()` creates a new instance each call *sneak: please answer inline below each question. These are preserved for posterity.*
- [ ] Fix unchecked `hash.Write()` errors in `mfer/builder.go:52`, `mfer/serialize.go:56`, `internal/cli/freshen.go:340`
- [ ] Fix URL path traversal risk in `internal/cli/fetch.go:116` - path isn't URL-escaped, should use `url.JoinPath()` or proper encoding
## Important ### Format Design
- [ ] Fix goroutine leak in signal handler `internal/cli/gen.go:98-106` - goroutine runs until channel closed, leaks if program exits normally **1. Should `MFFileChecksum` be simplified?**
- [ ] Fix timestamp precision in `mfer/serialize.go:16-22` - use `t.Nanosecond()` instead of manual calculation Currently it's a separate message wrapping a single `bytes multiHash` field. Since multihash already self-describes the algorithm, `repeated bytes hashes` directly on `MFFilePath` would be simpler and reduce per-file protobuf overhead. Is the extra message layer intentional (e.g. planning to add per-hash metadata like `verified_at`)?
- [ ] Add context cancellation check to filesystem walk in `internal/cli/freshen.go` - Ctrl-C doesn't work during scan phase
## Code Quality > *answer:*
- [ ] Consolidate duplicate `pathIsHidden` implementations in `internal/scanner/scanner.go:385-402` and `internal/cli/freshen.go:378-397` **2. Should file permissions/mode be stored?**
- [ ] Make `TotalBytes()` in `internal/scanner/scanner.go:250-259` track total incrementally instead of recalculating on every call The format stores mtime/ctime but not Unix file permissions. For archival use (ExFAT, filesystem-independent checksums) this may not matter, but for software distribution or filesystem restoration it's a gap. Should we reserve a field now (e.g. `optional uint32 mode = 305`) even if we don't populate it yet?
- [ ] Add input validation to `AddFileWithHash()` in `mfer/builder.go:107-120` - validate path, size, and hash inputs
> *answer:*
**3. Should `atime` be removed from the schema?**
Access time is volatile, non-deterministic, and often disabled (`noatime`). Including it means two manifests of the same directory at different times will differ, which conflicts with the determinism goal. Remove it, or document it as "never set by default"?
> *answer:*
**4. What are the path normalization rules?**
The proto has `string path` with no specification about: always forward-slash? Must be relative? No `..` components allowed? UTF-8 NFC vs NFD normalization (macOS vs Linux)? Max path length? This is a security issue (path traversal) and a cross-platform compatibility issue. What rules should the spec mandate?
> *answer:*
**5. Should we add a version byte after the magic?**
Currently `ZNAVSRFG` is followed immediately by protobuf. Adding a version byte (`ZNAVSRFG\x01`) would allow future framing changes without requiring protobuf parsing to detect the version. `MFFileOuter.Version` serves this purpose but requires successful deserialization to read. Worth the extra byte?
> *answer:*
**6. Should we add a length-prefix after the magic?**
Protobuf is not self-delimiting. If we ever want to concatenate manifests or append data after the protobuf, the current framing is insufficient. Add a varint or fixed-width length-prefix?
> *answer:*
### Signature Design
**7. What does the outer SHA-256 hash cover — compressed or uncompressed data?**
The review notes it currently hashes compressed data (good for verifying before decompression), but this should be explicitly documented. Which is the intended behavior?
> *answer:*
**8. Should `signatureString()` sign raw bytes instead of a hex-encoded string?**
Currently the canonical string is `MAGIC-UUID-MULTIHASH` with hex encoding, which adds a transformation layer. Signing the raw `sha256` bytes (or compressed `innerMessage` directly) would be simpler. Keep the string format or switch to raw bytes?
> *answer:*
**9. Should we support detached signature files (`.mf.sig`)?**
Embedded signatures are better for single-file distribution. Detached `.mf.sig` files follow the familiar `SHASUMS`/`SHASUMS.asc` pattern and are simpler for HTTP serving. Support both modes?
> *answer:*
**10. GPG vs pure-Go crypto for signatures?**
Shelling out to `gpg` is fragile (may not be installed, version-dependent output). `github.com/ProtonMail/go-crypto` provides pure-Go OpenPGP, or we could go Ed25519/signify (simpler, no key management). Which direction?
> *answer:*
### Implementation Design
**11. Should manifests be deterministic by default?**
This means: sort file entries by path, omit `createdAt` timestamp (or make it opt-in), no `atime`. Should determinism be the default, with a `--include-timestamps` flag to opt in?
> *answer:*
**12. Should we consolidate or keep both scanner/checker implementations?**
There are two parallel implementations: `mfer/scanner.go` + `mfer/checker.go` (typed with `FileSize`, `RelFilePath`) and `internal/scanner/` + `internal/checker/` (raw `int64`, `string`). The `mfer/` versions are superior. Delete the `internal/` versions?
> *answer:*
**13. Should the `manifest` type be exported?**
Currently unexported with exported constructors (`New`, `NewFromPaths`, etc.). Consumers can't declare `var m *mfer.manifest`. Export the type, or define an interface?
> *answer:*
**14. What should the Go module path be for 1.0?**
Currently mixed between `sneak.berlin/go/mfer` and `git.eeqj.de/sneak/mfer`. Which is canonical?
> *answer:*
---
## Implementation Plan
### Phase 1: Foundation (format correctness)
- [ ] Delete `internal/scanner/` and `internal/checker/` — consolidate on `mfer/` package versions; update CLI code
- [ ] Add deterministic file ordering — sort entries by path (lexicographic, byte-order) in `Builder.Build()`; add test asserting byte-identical output from two runs
- [ ] Add decompression size limit — `io.LimitReader` in `deserializeInner()` with `m.pbOuter.Size` as bound
- [ ] Fix `errors.Is` dead code in checker — replace with `os.IsNotExist(err)` or `errors.Is(err, fs.ErrNotExist)`
- [ ] Fix `AddFile` to verify size — check `totalRead == size` after reading, return error on mismatch
- [ ] Specify path invariants — add proto comments (UTF-8, forward-slash, relative, no `..`, no leading `/`); validate in `Builder.AddFile` and `Builder.AddFileWithHash`
### Phase 2: CLI polish
- [ ] Fix flag naming — all CLI flags use kebab-case as primary (`--include-dotfiles`, `--follow-symlinks`)
- [ ] Fix URL construction in fetch — use `BaseURL.JoinPath()` or `url.JoinPath()` instead of string concatenation
- [ ] Add progress rate-limiting to Checker — throttle to once per second, matching Scanner
- [ ] Add `--deterministic` flag (or make it default) — omit `createdAt`, sort files
### Phase 3: Robustness
- [ ] Replace GPG subprocess with pure-Go crypto — `github.com/ProtonMail/go-crypto` or Ed25519/signify
- [ ] Add timeout to any remaining subprocess calls
- [ ] Add fuzzing tests for `NewManifestFromReader`
- [ ] Add retry logic to fetch — exponential backoff for transient HTTP errors
### Phase 4: Format finalization
- [ ] Remove or deprecate `atime` from proto (pending design question answer)
- [ ] Reserve `optional uint32 mode = 305` in `MFFilePath` for future file permissions
- [ ] Add version byte after magic — `ZNAVSRFG\x01` for format version 1
- [ ] Write format specification document — separate from README: magic, outer structure, compression, inner structure, path invariants, signature scheme, canonical serialization
### Phase 5: Release prep
- [ ] Finalize Go module path
- [ ] Audit all error messages for consistency and helpfulness
- [ ] Add `--version` output matching SemVer
- [ ] Tag v1.0.0

1
go.mod
View File

@@ -6,6 +6,7 @@ require (
github.com/apex/log v1.9.0 github.com/apex/log v1.9.0
github.com/davecgh/go-spew v1.1.1 github.com/davecgh/go-spew v1.1.1
github.com/dustin/go-humanize v1.0.1 github.com/dustin/go-humanize v1.0.1
github.com/google/uuid v1.1.2
github.com/klauspost/compress v1.18.2 github.com/klauspost/compress v1.18.2
github.com/multiformats/go-multihash v0.2.3 github.com/multiformats/go-multihash v0.2.3
github.com/pterm/pterm v0.12.35 github.com/pterm/pterm v0.12.35

1
go.sum
View File

@@ -135,6 +135,7 @@ github.com/google/pprof v0.0.0-20201203190320-1bf35d6f28c2/go.mod h1:kpwsk12EmLe
github.com/google/pprof v0.0.0-20201218002935-b9804c9f04c2/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE= github.com/google/pprof v0.0.0-20201218002935-b9804c9f04c2/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI= github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI=
github.com/google/uuid v1.1.1/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= github.com/google/uuid v1.1.1/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/google/uuid v1.1.2 h1:EVhdT+1Kseyi1/pUmXKaFxYsDNy9RQYkMWRH68J/W7Y=
github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/googleapis/gax-go/v2 v2.0.4/go.mod h1:0Wqv26UfaUD9n4G6kQubkQ+KchISgw+vpHVxEJEs9eg= github.com/googleapis/gax-go/v2 v2.0.4/go.mod h1:0Wqv26UfaUD9n4G6kQubkQ+KchISgw+vpHVxEJEs9eg=
github.com/googleapis/gax-go/v2 v2.0.5/go.mod h1:DWXyrwAJ9X0FpwwEdw+IPEYBICEFu5mhpdKc/us6bOk= github.com/googleapis/gax-go/v2 v2.0.5/go.mod h1:DWXyrwAJ9X0FpwwEdw+IPEYBICEFu5mhpdKc/us6bOk=

281
internal/checker/checker.go Normal file
View File

@@ -0,0 +1,281 @@
package checker
import (
"bytes"
"context"
"crypto/sha256"
"errors"
"io"
"os"
"path/filepath"
"github.com/multiformats/go-multihash"
"github.com/spf13/afero"
"sneak.berlin/go/mfer/mfer"
)
// Result represents the outcome of checking a single file.
type Result struct {
Path string // Relative path from manifest
Status Status // Verification result status
Message string // Human-readable description of the result
}
// Status represents the verification status of a file.
type Status int
const (
StatusOK Status = iota // File matches manifest (size and hash verified)
StatusMissing // File not found on disk
StatusSizeMismatch // File size differs from manifest
StatusHashMismatch // File hash differs from manifest
StatusExtra // File exists on disk but not in manifest
StatusError // Error occurred during verification
)
func (s Status) String() string {
switch s {
case StatusOK:
return "OK"
case StatusMissing:
return "MISSING"
case StatusSizeMismatch:
return "SIZE_MISMATCH"
case StatusHashMismatch:
return "HASH_MISMATCH"
case StatusExtra:
return "EXTRA"
case StatusError:
return "ERROR"
default:
return "UNKNOWN"
}
}
// CheckStatus contains progress information for the check operation.
type CheckStatus struct {
TotalFiles int64 // Total number of files in manifest
CheckedFiles int64 // Number of files checked so far
TotalBytes int64 // Total bytes to verify (sum of all file sizes)
CheckedBytes int64 // Bytes verified so far
BytesPerSec float64 // Current throughput rate
Failures int64 // Number of verification failures encountered
}
// Checker verifies files against a manifest.
type Checker struct {
basePath string
files []*mfer.MFFilePath
fs afero.Fs
// manifestPaths is a set of paths in the manifest for quick lookup
manifestPaths map[string]struct{}
}
// NewChecker creates a new Checker for the given manifest, base path, and filesystem.
// The basePath is the directory relative to which manifest paths are resolved.
// If fs is nil, the real filesystem (OsFs) is used.
func NewChecker(manifestPath string, basePath string, fs afero.Fs) (*Checker, error) {
if fs == nil {
fs = afero.NewOsFs()
}
m, err := mfer.NewManifestFromFile(fs, manifestPath)
if err != nil {
return nil, err
}
abs, err := filepath.Abs(basePath)
if err != nil {
return nil, err
}
files := m.Files()
manifestPaths := make(map[string]struct{}, len(files))
for _, f := range files {
manifestPaths[f.Path] = struct{}{}
}
return &Checker{
basePath: abs,
files: files,
fs: fs,
manifestPaths: manifestPaths,
}, nil
}
// FileCount returns the number of files in the manifest.
func (c *Checker) FileCount() int64 {
return int64(len(c.files))
}
// TotalBytes returns the total size of all files in the manifest.
func (c *Checker) TotalBytes() int64 {
var total int64
for _, f := range c.files {
total += f.Size
}
return total
}
// Check verifies all files against the manifest.
// Results are sent to the results channel as files are checked.
// Progress updates are sent to the progress channel approximately once per second.
// Both channels are closed when the method returns.
func (c *Checker) Check(ctx context.Context, results chan<- Result, progress chan<- CheckStatus) error {
if results != nil {
defer close(results)
}
if progress != nil {
defer close(progress)
}
totalFiles := int64(len(c.files))
totalBytes := c.TotalBytes()
var checkedFiles int64
var checkedBytes int64
var failures int64
for _, entry := range c.files {
select {
case <-ctx.Done():
return ctx.Err()
default:
}
result := c.checkFile(entry, &checkedBytes)
if result.Status != StatusOK {
failures++
}
checkedFiles++
if results != nil {
results <- result
}
// Send progress (simplified - every file for now)
if progress != nil {
sendCheckStatus(progress, CheckStatus{
TotalFiles: totalFiles,
CheckedFiles: checkedFiles,
TotalBytes: totalBytes,
CheckedBytes: checkedBytes,
Failures: failures,
})
}
}
return nil
}
func (c *Checker) checkFile(entry *mfer.MFFilePath, checkedBytes *int64) Result {
absPath := filepath.Join(c.basePath, entry.Path)
// Check if file exists
info, err := c.fs.Stat(absPath)
if err != nil {
if errors.Is(err, afero.ErrFileNotFound) || errors.Is(err, errors.New("file does not exist")) {
return Result{Path: entry.Path, Status: StatusMissing, Message: "file not found"}
}
// Check for "file does not exist" style errors
exists, _ := afero.Exists(c.fs, absPath)
if !exists {
return Result{Path: entry.Path, Status: StatusMissing, Message: "file not found"}
}
return Result{Path: entry.Path, Status: StatusError, Message: err.Error()}
}
// Check size
if info.Size() != entry.Size {
*checkedBytes += info.Size()
return Result{
Path: entry.Path,
Status: StatusSizeMismatch,
Message: "size mismatch",
}
}
// Open and hash file
f, err := c.fs.Open(absPath)
if err != nil {
return Result{Path: entry.Path, Status: StatusError, Message: err.Error()}
}
defer f.Close()
h := sha256.New()
n, err := io.Copy(h, f)
if err != nil {
return Result{Path: entry.Path, Status: StatusError, Message: err.Error()}
}
*checkedBytes += n
// Encode as multihash and compare
computed, err := multihash.Encode(h.Sum(nil), multihash.SHA2_256)
if err != nil {
return Result{Path: entry.Path, Status: StatusError, Message: err.Error()}
}
// Check against all hashes in manifest (at least one must match)
for _, hash := range entry.Hashes {
if bytes.Equal(computed, hash.MultiHash) {
return Result{Path: entry.Path, Status: StatusOK}
}
}
return Result{Path: entry.Path, Status: StatusHashMismatch, Message: "hash mismatch"}
}
// FindExtraFiles walks the filesystem and reports files not in the manifest.
// Results are sent to the results channel. The channel is closed when done.
func (c *Checker) FindExtraFiles(ctx context.Context, results chan<- Result) error {
if results != nil {
defer close(results)
}
return afero.Walk(c.fs, c.basePath, func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
select {
case <-ctx.Done():
return ctx.Err()
default:
}
// Skip directories
if info.IsDir() {
return nil
}
// Get relative path
relPath, err := filepath.Rel(c.basePath, path)
if err != nil {
return err
}
// Check if path is in manifest
if _, exists := c.manifestPaths[relPath]; !exists {
if results != nil {
results <- Result{
Path: relPath,
Status: StatusExtra,
Message: "not in manifest",
}
}
}
return nil
})
}
// sendCheckStatus sends a status update without blocking.
func sendCheckStatus(ch chan<- CheckStatus, status CheckStatus) {
if ch == nil {
return
}
select {
case ch <- status:
default:
}
}

View File

@@ -1,8 +1,10 @@
package cli package cli
import ( import (
"encoding/hex"
"fmt" "fmt"
"path/filepath" "path/filepath"
"strings"
"time" "time"
"github.com/dustin/go-humanize" "github.com/dustin/go-humanize"
@@ -68,6 +70,35 @@ func (mfa *CLIApp) checkManifestOperation(ctx *cli.Context) error {
return fmt.Errorf("failed to load manifest: %w", err) return fmt.Errorf("failed to load manifest: %w", err)
} }
// Check signature requirement
requiredSigner := ctx.String("require-signature")
if requiredSigner != "" {
// Validate fingerprint format: must be exactly 40 hex characters
if len(requiredSigner) != 40 {
return fmt.Errorf("invalid fingerprint: must be exactly 40 hex characters, got %d", len(requiredSigner))
}
if _, err := hex.DecodeString(requiredSigner); err != nil {
return fmt.Errorf("invalid fingerprint: must be valid hex: %w", err)
}
if !chk.IsSigned() {
return fmt.Errorf("manifest is not signed, but signature from %s is required", requiredSigner)
}
// Extract fingerprint from the embedded public key (not from the signer field)
// This validates the key is importable and gets its actual fingerprint
embeddedFP, err := chk.ExtractEmbeddedSigningKeyFP()
if err != nil {
return fmt.Errorf("failed to extract fingerprint from embedded signing key: %w", err)
}
// Compare fingerprints - must be exact match (case-insensitive)
if !strings.EqualFold(embeddedFP, requiredSigner) {
return fmt.Errorf("embedded signing key fingerprint %s does not match required %s", embeddedFP, requiredSigner)
}
log.Infof("manifest signature verified (signer: %s)", embeddedFP)
}
log.Infof("manifest contains %d files, %s", chk.FileCount(), humanize.IBytes(uint64(chk.TotalBytes()))) log.Infof("manifest contains %d files, %s", chk.FileCount(), humanize.IBytes(uint64(chk.TotalBytes())))
// Set up results channel // Set up results channel

View File

@@ -227,6 +227,14 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
builder := mfer.NewBuilder() builder := mfer.NewBuilder()
// Set up signing options if sign-key is provided
if signKey := ctx.String("sign-key"); signKey != "" {
builder.SetSigningOptions(&mfer.SigningOptions{
KeyID: mfer.GPGKeyID(signKey),
})
log.Infof("signing manifest with GPG key: %s", signKey)
}
for _, e := range entries { for _, e := range entries {
select { select {
case <-ctx.Done(): case <-ctx.Done():

View File

@@ -25,6 +25,14 @@ func (mfa *CLIApp) generateManifestOperation(ctx *cli.Context) error {
Fs: mfa.Fs, Fs: mfa.Fs,
} }
// Set up signing options if sign-key is provided
if signKey := ctx.String("sign-key"); signKey != "" {
opts.SigningOptions = &mfer.SigningOptions{
KeyID: mfer.GPGKeyID(signKey),
}
log.Infof("signing manifest with GPG key: %s", signKey)
}
s := mfer.NewScannerWithOptions(opts) s := mfer.NewScannerWithOptions(opts)
// Phase 1: Enumeration - collect paths and stat files // Phase 1: Enumeration - collect paths and stat files

View File

@@ -148,6 +148,12 @@ func (mfa *CLIApp) run(args []string) {
Aliases: []string{"P"}, Aliases: []string{"P"},
Usage: "Show progress during enumeration and scanning", Usage: "Show progress during enumeration and scanning",
}, },
&cli.StringFlag{
Name: "sign-key",
Aliases: []string{"s"},
Usage: "GPG key ID to sign the manifest with",
EnvVars: []string{"MFER_SIGN_KEY"},
},
), ),
}, },
{ {
@@ -175,6 +181,12 @@ func (mfa *CLIApp) run(args []string) {
Name: "no-extra-files", Name: "no-extra-files",
Usage: "Fail if files exist in base directory that are not in manifest", Usage: "Fail if files exist in base directory that are not in manifest",
}, },
&cli.StringFlag{
Name: "require-signature",
Aliases: []string{"S"},
Usage: "Require manifest to be signed by the specified GPG key ID",
EnvVars: []string{"MFER_REQUIRE_SIGNATURE"},
},
), ),
}, },
{ {
@@ -208,6 +220,12 @@ func (mfa *CLIApp) run(args []string) {
Aliases: []string{"P"}, Aliases: []string{"P"},
Usage: "Show progress during scanning and hashing", Usage: "Show progress during scanning and hashing",
}, },
&cli.StringFlag{
Name: "sign-key",
Aliases: []string{"s"},
Usage: "GPG key ID to sign the manifest with",
EnvVars: []string{"MFER_SIGN_KEY"},
},
), ),
}, },
{ {

373
internal/scanner/scanner.go Normal file
View File

@@ -0,0 +1,373 @@
package scanner
import (
"context"
"io"
"io/fs"
"path"
"path/filepath"
"strings"
"sync"
"time"
"github.com/spf13/afero"
"sneak.berlin/go/mfer/mfer"
)
// Phase 1: Enumeration
// ---------------------
// Walking directories and calling stat() on files to collect metadata.
// Builds the list of files to be scanned. Relatively fast (metadata only).
// EnumerateStatus contains progress information for the enumeration phase.
type EnumerateStatus struct {
FilesFound int64 // Number of files discovered so far
BytesFound int64 // Total size of discovered files (from stat)
}
// Phase 2: Scan (ToManifest)
// --------------------------
// Reading file contents and computing hashes for manifest generation.
// This is the expensive phase that reads all file data.
// ScanStatus contains progress information for the scan phase.
type ScanStatus struct {
TotalFiles int64 // Total number of files to scan
ScannedFiles int64 // Number of files scanned so far
TotalBytes int64 // Total bytes to read (sum of all file sizes)
ScannedBytes int64 // Bytes read so far
BytesPerSec float64 // Current throughput rate
}
// Options configures scanner behavior.
type Options struct {
IgnoreDotfiles bool // Skip files and directories starting with a dot
FollowSymLinks bool // Resolve symlinks instead of skipping them
Fs afero.Fs // Filesystem to use, defaults to OsFs if nil
}
// FileEntry represents a file that has been enumerated.
type FileEntry struct {
Path string // Relative path (used in manifest)
AbsPath string // Absolute path (used for reading file content)
Size int64 // File size in bytes
Mtime time.Time // Last modification time
Ctime time.Time // Creation time (platform-dependent)
}
// Scanner accumulates files and generates manifests from them.
type Scanner struct {
mu sync.RWMutex
files []*FileEntry
options *Options
fs afero.Fs
}
// New creates a new Scanner with default options.
func New() *Scanner {
return NewWithOptions(nil)
}
// NewWithOptions creates a new Scanner with the given options.
func NewWithOptions(opts *Options) *Scanner {
if opts == nil {
opts = &Options{}
}
fs := opts.Fs
if fs == nil {
fs = afero.NewOsFs()
}
return &Scanner{
files: make([]*FileEntry, 0),
options: opts,
fs: fs,
}
}
// EnumerateFile adds a single file to the scanner, calling stat() to get metadata.
func (s *Scanner) EnumerateFile(filePath string) error {
abs, err := filepath.Abs(filePath)
if err != nil {
return err
}
info, err := s.fs.Stat(abs)
if err != nil {
return err
}
// For single files, use the filename as the relative path
basePath := filepath.Dir(abs)
return s.enumerateFileWithInfo(filepath.Base(abs), basePath, info, nil)
}
// EnumeratePath walks a directory path and adds all files to the scanner.
// If progress is non-nil, status updates are sent as files are discovered.
// The progress channel is closed when the method returns.
func (s *Scanner) EnumeratePath(inputPath string, progress chan<- EnumerateStatus) error {
if progress != nil {
defer close(progress)
}
abs, err := filepath.Abs(inputPath)
if err != nil {
return err
}
afs := afero.NewReadOnlyFs(afero.NewBasePathFs(s.fs, abs))
return s.enumerateFS(afs, abs, progress)
}
// EnumeratePaths walks multiple directory paths and adds all files to the scanner.
// If progress is non-nil, status updates are sent as files are discovered.
// The progress channel is closed when the method returns.
func (s *Scanner) EnumeratePaths(progress chan<- EnumerateStatus, inputPaths ...string) error {
if progress != nil {
defer close(progress)
}
for _, p := range inputPaths {
abs, err := filepath.Abs(p)
if err != nil {
return err
}
afs := afero.NewReadOnlyFs(afero.NewBasePathFs(s.fs, abs))
if err := s.enumerateFS(afs, abs, progress); err != nil {
return err
}
}
return nil
}
// EnumerateFS walks an afero filesystem and adds all files to the scanner.
// If progress is non-nil, status updates are sent as files are discovered.
// The progress channel is closed when the method returns.
// basePath is used to compute absolute paths for file reading.
func (s *Scanner) EnumerateFS(afs afero.Fs, basePath string, progress chan<- EnumerateStatus) error {
if progress != nil {
defer close(progress)
}
return s.enumerateFS(afs, basePath, progress)
}
// enumerateFS is the internal implementation that doesn't close the progress channel.
func (s *Scanner) enumerateFS(afs afero.Fs, basePath string, progress chan<- EnumerateStatus) error {
return afero.Walk(afs, "/", func(p string, info fs.FileInfo, err error) error {
if err != nil {
return err
}
if s.options.IgnoreDotfiles && pathIsHidden(p) {
if info.IsDir() {
return filepath.SkipDir
}
return nil
}
return s.enumerateFileWithInfo(p, basePath, info, progress)
})
}
// enumerateFileWithInfo adds a file with pre-existing fs.FileInfo.
func (s *Scanner) enumerateFileWithInfo(filePath string, basePath string, info fs.FileInfo, progress chan<- EnumerateStatus) error {
if info.IsDir() {
// Manifests contain only files, directories are implied
return nil
}
// Clean the path - remove leading slash if present
cleanPath := filePath
if len(cleanPath) > 0 && cleanPath[0] == '/' {
cleanPath = cleanPath[1:]
}
// Compute absolute path for file reading
absPath := filepath.Join(basePath, cleanPath)
entry := &FileEntry{
Path: cleanPath,
AbsPath: absPath,
Size: info.Size(),
Mtime: info.ModTime(),
// Note: Ctime not available from fs.FileInfo on all platforms
// Will need platform-specific code to extract it
}
s.mu.Lock()
s.files = append(s.files, entry)
filesFound := int64(len(s.files))
var bytesFound int64
for _, f := range s.files {
bytesFound += f.Size
}
s.mu.Unlock()
sendEnumerateStatus(progress, EnumerateStatus{
FilesFound: filesFound,
BytesFound: bytesFound,
})
return nil
}
// Files returns a copy of all files added to the scanner.
func (s *Scanner) Files() []*FileEntry {
s.mu.RLock()
defer s.mu.RUnlock()
out := make([]*FileEntry, len(s.files))
copy(out, s.files)
return out
}
// FileCount returns the number of files in the scanner.
func (s *Scanner) FileCount() int64 {
s.mu.RLock()
defer s.mu.RUnlock()
return int64(len(s.files))
}
// TotalBytes returns the total size of all files in the scanner.
func (s *Scanner) TotalBytes() int64 {
s.mu.RLock()
defer s.mu.RUnlock()
var total int64
for _, f := range s.files {
total += f.Size
}
return total
}
// ToManifest reads all file contents, computes hashes, and generates a manifest.
// If progress is non-nil, status updates are sent approximately once per second.
// The progress channel is closed when the method returns.
// The manifest is written to the provided io.Writer.
func (s *Scanner) ToManifest(ctx context.Context, w io.Writer, progress chan<- ScanStatus) error {
if progress != nil {
defer close(progress)
}
s.mu.RLock()
files := make([]*FileEntry, len(s.files))
copy(files, s.files)
totalFiles := int64(len(files))
var totalBytes int64
for _, f := range files {
totalBytes += f.Size
}
s.mu.RUnlock()
builder := mfer.NewBuilder()
var scannedFiles int64
var scannedBytes int64
lastProgressTime := time.Now()
startTime := time.Now()
for _, entry := range files {
// Check for cancellation
select {
case <-ctx.Done():
return ctx.Err()
default:
}
// Open file
f, err := s.fs.Open(entry.AbsPath)
if err != nil {
return err
}
// Add to manifest with progress callback
bytesRead, err := builder.AddFile(
entry.Path,
entry.Size,
entry.Mtime,
f,
func(fileBytes int64) {
// Send progress at most once per second
now := time.Now()
if progress != nil && now.Sub(lastProgressTime) >= time.Second {
elapsed := now.Sub(startTime).Seconds()
currentBytes := scannedBytes + fileBytes
var rate float64
if elapsed > 0 {
rate = float64(currentBytes) / elapsed
}
sendScanStatus(progress, ScanStatus{
TotalFiles: totalFiles,
ScannedFiles: scannedFiles,
TotalBytes: totalBytes,
ScannedBytes: currentBytes,
BytesPerSec: rate,
})
lastProgressTime = now
}
},
)
f.Close()
if err != nil {
return err
}
scannedFiles++
scannedBytes += bytesRead
}
// Send final progress
if progress != nil {
elapsed := time.Since(startTime).Seconds()
var rate float64
if elapsed > 0 {
rate = float64(scannedBytes) / elapsed
}
sendScanStatus(progress, ScanStatus{
TotalFiles: totalFiles,
ScannedFiles: scannedFiles,
TotalBytes: totalBytes,
ScannedBytes: scannedBytes,
BytesPerSec: rate,
})
}
// Build and write manifest
return builder.Build(w)
}
// pathIsHidden returns true if the path or any of its parent directories
// start with a dot (hidden files/directories).
func pathIsHidden(p string) bool {
tp := path.Clean(p)
if strings.HasPrefix(tp, ".") {
return true
}
for {
d, f := path.Split(tp)
if strings.HasPrefix(f, ".") {
return true
}
if d == "" {
return false
}
tp = d[0 : len(d)-1] // trim trailing slash from dir
}
}
// sendEnumerateStatus sends a status update without blocking.
// If the channel is full, the update is dropped.
func sendEnumerateStatus(ch chan<- EnumerateStatus, status EnumerateStatus) {
if ch == nil {
return
}
select {
case ch <- status:
default:
// Channel full, drop this update
}
}
// sendScanStatus sends a status update without blocking.
// If the channel is full, the update is dropped.
func sendScanStatus(ch chan<- ScanStatus, status ScanStatus) {
if ch == nil {
return
}
select {
case ch <- status:
default:
// Channel full, drop this update
}
}

View File

@@ -3,13 +3,47 @@ package mfer
import ( import (
"crypto/sha256" "crypto/sha256"
"errors" "errors"
"fmt"
"io" "io"
"strings"
"sync" "sync"
"time" "time"
"unicode/utf8"
"github.com/multiformats/go-multihash" "github.com/multiformats/go-multihash"
) )
// ValidatePath checks that a file path conforms to manifest path invariants:
// - Must be valid UTF-8
// - Must use forward slashes only (no backslashes)
// - Must be relative (no leading /)
// - Must not contain ".." segments
// - Must not contain empty segments (no "//")
// - Must not be empty
func ValidatePath(p string) error {
if p == "" {
return errors.New("path cannot be empty")
}
if !utf8.ValidString(p) {
return fmt.Errorf("path %q is not valid UTF-8", p)
}
if strings.ContainsRune(p, '\\') {
return fmt.Errorf("path %q contains backslash; use forward slashes only", p)
}
if strings.HasPrefix(p, "/") {
return fmt.Errorf("path %q is absolute; must be relative", p)
}
for _, seg := range strings.Split(p, "/") {
if seg == "" {
return fmt.Errorf("path %q contains empty segment", p)
}
if seg == ".." {
return fmt.Errorf("path %q contains '..' segment", p)
}
}
return nil
}
// RelFilePath represents a relative file path within a manifest. // RelFilePath represents a relative file path within a manifest.
type RelFilePath string type RelFilePath string
@@ -50,9 +84,10 @@ type FileHashProgress struct {
// Builder constructs a manifest by adding files one at a time. // Builder constructs a manifest by adding files one at a time.
type Builder struct { type Builder struct {
mu sync.Mutex mu sync.Mutex
files []*MFFilePath files []*MFFilePath
createdAt time.Time createdAt time.Time
signingOptions *SigningOptions
} }
// NewBuilder creates a new Builder. // NewBuilder creates a new Builder.
@@ -73,6 +108,10 @@ func (b *Builder) AddFile(
reader io.Reader, reader io.Reader,
progress chan<- FileHashProgress, progress chan<- FileHashProgress,
) (FileSize, error) { ) (FileSize, error) {
if err := ValidatePath(string(path)); err != nil {
return 0, err
}
// Create hash writer // Create hash writer
h := sha256.New() h := sha256.New()
@@ -95,6 +134,11 @@ func (b *Builder) AddFile(
} }
} }
// Verify actual bytes read matches declared size
if totalRead != size {
return totalRead, fmt.Errorf("size mismatch for %q: declared %d bytes but read %d bytes", path, size, totalRead)
}
// Encode hash as multihash (SHA2-256) // Encode hash as multihash (SHA2-256)
mh, err := multihash.Encode(h.Sum(nil), multihash.SHA2_256) mh, err := multihash.Encode(h.Sum(nil), multihash.SHA2_256)
if err != nil { if err != nil {
@@ -140,8 +184,8 @@ func (b *Builder) FileCount() int {
// This is useful when the hash is already known (e.g., from an existing manifest). // This is useful when the hash is already known (e.g., from an existing manifest).
// Returns an error if path is empty, size is negative, or hash is nil/empty. // Returns an error if path is empty, size is negative, or hash is nil/empty.
func (b *Builder) AddFileWithHash(path RelFilePath, size FileSize, mtime ModTime, hash Multihash) error { func (b *Builder) AddFileWithHash(path RelFilePath, size FileSize, mtime ModTime, hash Multihash) error {
if path == "" { if err := ValidatePath(string(path)); err != nil {
return errors.New("path cannot be empty") return err
} }
if size < 0 { if size < 0 {
return errors.New("size cannot be negative") return errors.New("size cannot be negative")
@@ -165,6 +209,14 @@ func (b *Builder) AddFileWithHash(path RelFilePath, size FileSize, mtime ModTime
return nil return nil
} }
// SetSigningOptions sets the GPG signing options for the manifest.
// If opts is non-nil, the manifest will be signed when Build() is called.
func (b *Builder) SetSigningOptions(opts *SigningOptions) {
b.mu.Lock()
defer b.mu.Unlock()
b.signingOptions = opts
}
// Build finalizes the manifest and writes it to the writer. // Build finalizes the manifest and writes it to the writer.
func (b *Builder) Build(w io.Writer) error { func (b *Builder) Build(w io.Writer) error {
b.mu.Lock() b.mu.Lock()
@@ -179,7 +231,8 @@ func (b *Builder) Build(w io.Writer) error {
// Create a temporary manifest to use existing serialization // Create a temporary manifest to use existing serialization
m := &manifest{ m := &manifest{
pbInner: inner, pbInner: inner,
signingOptions: b.signingOptions,
} }
// Generate outer wrapper // Generate outer wrapper

View File

@@ -70,6 +70,10 @@ type Checker struct {
fs afero.Fs fs afero.Fs
// manifestPaths is a set of paths in the manifest for quick lookup // manifestPaths is a set of paths in the manifest for quick lookup
manifestPaths map[RelFilePath]struct{} manifestPaths map[RelFilePath]struct{}
// signature info from the manifest
signature []byte
signer []byte
signingPubKey []byte
} }
// NewChecker creates a new Checker for the given manifest, base path, and filesystem. // NewChecker creates a new Checker for the given manifest, base path, and filesystem.
@@ -101,6 +105,9 @@ func NewChecker(manifestPath string, basePath string, fs afero.Fs) (*Checker, er
files: files, files: files,
fs: fs, fs: fs,
manifestPaths: manifestPaths, manifestPaths: manifestPaths,
signature: m.pbOuter.Signature,
signer: m.pbOuter.Signer,
signingPubKey: m.pbOuter.SigningPubKey,
}, nil }, nil
} }
@@ -118,6 +125,31 @@ func (c *Checker) TotalBytes() FileSize {
return total return total
} }
// IsSigned returns true if the manifest has a signature.
func (c *Checker) IsSigned() bool {
return len(c.signature) > 0
}
// Signer returns the signer fingerprint if the manifest is signed, nil otherwise.
func (c *Checker) Signer() []byte {
return c.signer
}
// SigningPubKey returns the signing public key if the manifest is signed, nil otherwise.
func (c *Checker) SigningPubKey() []byte {
return c.signingPubKey
}
// ExtractEmbeddedSigningKeyFP imports the manifest's embedded public key into a
// temporary keyring and extracts its fingerprint. This validates the key and
// returns its actual fingerprint from the key material itself.
func (c *Checker) ExtractEmbeddedSigningKeyFP() (string, error) {
if len(c.signingPubKey) == 0 {
return "", errors.New("manifest has no signing public key")
}
return gpgExtractPubKeyFingerprint(c.signingPubKey)
}
// Check verifies all files against the manifest. // Check verifies all files against the manifest.
// Results are sent to the results channel as files are checked. // Results are sent to the results channel as files are checked.
// Progress updates are sent to the progress channel approximately once per second. // Progress updates are sent to the progress channel approximately once per second.

View File

@@ -2,9 +2,12 @@ package mfer
import ( import (
"bytes" "bytes"
"crypto/sha256"
"errors" "errors"
"fmt"
"io" "io"
"github.com/google/uuid"
"github.com/klauspost/compress/zstd" "github.com/klauspost/compress/zstd"
"github.com/spf13/afero" "github.com/spf13/afero"
"google.golang.org/protobuf/proto" "google.golang.org/protobuf/proto"
@@ -12,6 +15,19 @@ import (
"sneak.berlin/go/mfer/internal/log" "sneak.berlin/go/mfer/internal/log"
) )
// validateUUID checks that the byte slice is a valid UUID (16 bytes, parseable).
func validateUUID(data []byte) error {
if len(data) != 16 {
return errors.New("invalid UUID length")
}
// Try to parse as UUID to validate format
_, err := uuid.FromBytes(data)
if err != nil {
return errors.New("invalid UUID format")
}
return nil
}
func (m *manifest) deserializeInner() error { func (m *manifest) deserializeInner() error {
if m.pbOuter.Version != MFFileOuter_VERSION_ONE { if m.pbOuter.Version != MFFileOuter_VERSION_ONE {
return errors.New("unknown version") return errors.New("unknown version")
@@ -20,6 +36,38 @@ func (m *manifest) deserializeInner() error {
return errors.New("unknown compression type") return errors.New("unknown compression type")
} }
// Validate outer UUID before any decompression
if err := validateUUID(m.pbOuter.Uuid); err != nil {
return errors.New("outer UUID invalid: " + err.Error())
}
// Verify hash of compressed data before decompression
h := sha256.New()
if _, err := h.Write(m.pbOuter.InnerMessage); err != nil {
return err
}
sha256Hash := h.Sum(nil)
if !bytes.Equal(sha256Hash, m.pbOuter.Sha256) {
return errors.New("compressed data hash mismatch")
}
// Verify signature if present
if len(m.pbOuter.Signature) > 0 {
if len(m.pbOuter.SigningPubKey) == 0 {
return errors.New("signature present but no public key")
}
sigString, err := m.signatureString()
if err != nil {
return fmt.Errorf("failed to generate signature string for verification: %w", err)
}
if err := gpgVerify([]byte(sigString), m.pbOuter.Signature, m.pbOuter.SigningPubKey); err != nil {
return fmt.Errorf("signature verification failed: %w", err)
}
log.Infof("signature verified successfully")
}
bb := bytes.NewBuffer(m.pbOuter.InnerMessage) bb := bytes.NewBuffer(m.pbOuter.InnerMessage)
zr, err := zstd.NewReader(bb) zr, err := zstd.NewReader(bb)
@@ -45,6 +93,16 @@ func (m *manifest) deserializeInner() error {
return err return err
} }
// Validate inner UUID
if err := validateUUID(m.pbInner.Uuid); err != nil {
return errors.New("inner UUID invalid: " + err.Error())
}
// Verify UUIDs match
if !bytes.Equal(m.pbOuter.Uuid, m.pbInner.Uuid) {
return errors.New("outer and inner UUID mismatch")
}
log.Infof("loaded manifest with %d files", len(m.pbInner.Files)) log.Infof("loaded manifest with %d files", len(m.pbInner.Files))
return nil return nil
} }

212
mfer/gpg.go Normal file
View File

@@ -0,0 +1,212 @@
package mfer
import (
"bytes"
"fmt"
"os"
"os/exec"
"path/filepath"
"strings"
)
// GPGKeyID represents a GPG key identifier (fingerprint or key ID).
type GPGKeyID string
// SigningOptions contains options for GPG signing.
type SigningOptions struct {
KeyID GPGKeyID
}
// gpgSign creates a detached signature of the data using the specified key.
// Returns the armored detached signature.
func gpgSign(data []byte, keyID GPGKeyID) ([]byte, error) {
cmd := exec.Command("gpg",
"--detach-sign",
"--armor",
"--local-user", string(keyID),
)
cmd.Stdin = bytes.NewReader(data)
var stdout, stderr bytes.Buffer
cmd.Stdout = &stdout
cmd.Stderr = &stderr
if err := cmd.Run(); err != nil {
return nil, fmt.Errorf("gpg sign failed: %w: %s", err, stderr.String())
}
return stdout.Bytes(), nil
}
// gpgExportPublicKey exports the public key for the specified key ID.
// Returns the armored public key.
func gpgExportPublicKey(keyID GPGKeyID) ([]byte, error) {
cmd := exec.Command("gpg",
"--export",
"--armor",
string(keyID),
)
var stdout, stderr bytes.Buffer
cmd.Stdout = &stdout
cmd.Stderr = &stderr
if err := cmd.Run(); err != nil {
return nil, fmt.Errorf("gpg export failed: %w: %s", err, stderr.String())
}
if stdout.Len() == 0 {
return nil, fmt.Errorf("gpg key not found: %s", keyID)
}
return stdout.Bytes(), nil
}
// gpgGetKeyFingerprint gets the full fingerprint for a key ID.
func gpgGetKeyFingerprint(keyID GPGKeyID) ([]byte, error) {
cmd := exec.Command("gpg",
"--with-colons",
"--fingerprint",
string(keyID),
)
var stdout, stderr bytes.Buffer
cmd.Stdout = &stdout
cmd.Stderr = &stderr
if err := cmd.Run(); err != nil {
return nil, fmt.Errorf("gpg fingerprint lookup failed: %w: %s", err, stderr.String())
}
// Parse the colon-delimited output to find the fingerprint
lines := strings.Split(stdout.String(), "\n")
for _, line := range lines {
fields := strings.Split(line, ":")
if len(fields) >= 10 && fields[0] == "fpr" {
return []byte(fields[9]), nil
}
}
return nil, fmt.Errorf("fingerprint not found for key: %s", keyID)
}
// gpgExtractPubKeyFingerprint imports a public key into a temporary keyring
// and extracts its fingerprint. This verifies the key is valid and returns
// the actual fingerprint from the key material.
func gpgExtractPubKeyFingerprint(pubKey []byte) (string, error) {
// Create temporary directory for GPG operations
tmpDir, err := os.MkdirTemp("", "mfer-gpg-fingerprint-*")
if err != nil {
return "", fmt.Errorf("failed to create temp dir: %w", err)
}
defer os.RemoveAll(tmpDir)
// Set restrictive permissions
if err := os.Chmod(tmpDir, 0o700); err != nil {
return "", fmt.Errorf("failed to set temp dir permissions: %w", err)
}
// Write public key to temp file
pubKeyFile := filepath.Join(tmpDir, "pubkey.asc")
if err := os.WriteFile(pubKeyFile, pubKey, 0o600); err != nil {
return "", fmt.Errorf("failed to write public key: %w", err)
}
// Import the public key into the temporary keyring
importCmd := exec.Command("gpg",
"--homedir", tmpDir,
"--import",
pubKeyFile,
)
var importStderr bytes.Buffer
importCmd.Stderr = &importStderr
if err := importCmd.Run(); err != nil {
return "", fmt.Errorf("failed to import public key: %w: %s", err, importStderr.String())
}
// List keys to get fingerprint
listCmd := exec.Command("gpg",
"--homedir", tmpDir,
"--with-colons",
"--fingerprint",
)
var listStdout, listStderr bytes.Buffer
listCmd.Stdout = &listStdout
listCmd.Stderr = &listStderr
if err := listCmd.Run(); err != nil {
return "", fmt.Errorf("failed to list keys: %w: %s", err, listStderr.String())
}
// Parse the colon-delimited output to find the fingerprint
lines := strings.Split(listStdout.String(), "\n")
for _, line := range lines {
fields := strings.Split(line, ":")
if len(fields) >= 10 && fields[0] == "fpr" {
return fields[9], nil
}
}
return "", fmt.Errorf("fingerprint not found in imported key")
}
// gpgVerify verifies a detached signature against data using the provided public key.
// It creates a temporary keyring to import the public key for verification.
func gpgVerify(data, signature, pubKey []byte) error {
// Create temporary directory for GPG operations
tmpDir, err := os.MkdirTemp("", "mfer-gpg-verify-*")
if err != nil {
return fmt.Errorf("failed to create temp dir: %w", err)
}
defer os.RemoveAll(tmpDir)
// Set restrictive permissions
if err := os.Chmod(tmpDir, 0o700); err != nil {
return fmt.Errorf("failed to set temp dir permissions: %w", err)
}
// Write public key to temp file
pubKeyFile := filepath.Join(tmpDir, "pubkey.asc")
if err := os.WriteFile(pubKeyFile, pubKey, 0o600); err != nil {
return fmt.Errorf("failed to write public key: %w", err)
}
// Write signature to temp file
sigFile := filepath.Join(tmpDir, "signature.asc")
if err := os.WriteFile(sigFile, signature, 0o600); err != nil {
return fmt.Errorf("failed to write signature: %w", err)
}
// Write data to temp file
dataFile := filepath.Join(tmpDir, "data")
if err := os.WriteFile(dataFile, data, 0o600); err != nil {
return fmt.Errorf("failed to write data: %w", err)
}
// Import the public key into the temporary keyring
importCmd := exec.Command("gpg",
"--homedir", tmpDir,
"--import",
pubKeyFile,
)
var importStderr bytes.Buffer
importCmd.Stderr = &importStderr
if err := importCmd.Run(); err != nil {
return fmt.Errorf("failed to import public key: %w: %s", err, importStderr.String())
}
// Verify the signature
verifyCmd := exec.Command("gpg",
"--homedir", tmpDir,
"--verify",
sigFile,
dataFile,
)
var verifyStderr bytes.Buffer
verifyCmd.Stderr = &verifyStderr
if err := verifyCmd.Run(); err != nil {
return fmt.Errorf("signature verification failed: %w: %s", err, verifyStderr.String())
}
return nil
}

347
mfer/gpg_test.go Normal file
View File

@@ -0,0 +1,347 @@
package mfer
import (
"bytes"
"context"
"os"
"os/exec"
"path/filepath"
"strings"
"testing"
"github.com/spf13/afero"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// testGPGEnv sets up a temporary GPG home directory with a test key.
// Returns the key ID and a cleanup function.
func testGPGEnv(t *testing.T) (GPGKeyID, func()) {
t.Helper()
// Check if gpg is installed
if _, err := exec.LookPath("gpg"); err != nil {
t.Skip("gpg not installed, skipping signing test")
return "", func() {}
}
// Create temporary GPG home directory
gpgHome, err := os.MkdirTemp("", "mfer-gpg-test-*")
require.NoError(t, err)
// Set restrictive permissions on GPG home
require.NoError(t, os.Chmod(gpgHome, 0o700))
// Save original GNUPGHOME and set new one
origGPGHome := os.Getenv("GNUPGHOME")
os.Setenv("GNUPGHOME", gpgHome)
cleanup := func() {
if origGPGHome == "" {
os.Unsetenv("GNUPGHOME")
} else {
os.Setenv("GNUPGHOME", origGPGHome)
}
os.RemoveAll(gpgHome)
}
// Generate a test key with no passphrase
keyParams := `%no-protection
Key-Type: RSA
Key-Length: 2048
Name-Real: MFER Test Key
Name-Email: test@mfer.test
Expire-Date: 0
%commit
`
paramsFile := filepath.Join(gpgHome, "key-params")
require.NoError(t, os.WriteFile(paramsFile, []byte(keyParams), 0o600))
cmd := exec.Command("gpg", "--batch", "--gen-key", paramsFile)
cmd.Env = append(os.Environ(), "GNUPGHOME="+gpgHome)
output, err := cmd.CombinedOutput()
if err != nil {
cleanup()
t.Skipf("failed to generate test GPG key: %v: %s", err, output)
return "", func() {}
}
// Get the key fingerprint
cmd = exec.Command("gpg", "--list-keys", "--with-colons", "test@mfer.test")
cmd.Env = append(os.Environ(), "GNUPGHOME="+gpgHome)
output, err = cmd.Output()
if err != nil {
cleanup()
t.Fatalf("failed to list test key: %v", err)
}
// Parse fingerprint from output
var keyID string
for _, line := range strings.Split(string(output), "\n") {
fields := strings.Split(line, ":")
if len(fields) >= 10 && fields[0] == "fpr" {
keyID = fields[9]
break
}
}
if keyID == "" {
cleanup()
t.Fatal("failed to find test key fingerprint")
}
return GPGKeyID(keyID), cleanup
}
func TestGPGSign(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data to sign")
sig, err := gpgSign(data, keyID)
require.NoError(t, err)
assert.NotEmpty(t, sig)
assert.Contains(t, string(sig), "-----BEGIN PGP SIGNATURE-----")
assert.Contains(t, string(sig), "-----END PGP SIGNATURE-----")
}
func TestGPGExportPublicKey(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
pubKey, err := gpgExportPublicKey(keyID)
require.NoError(t, err)
assert.NotEmpty(t, pubKey)
assert.Contains(t, string(pubKey), "-----BEGIN PGP PUBLIC KEY BLOCK-----")
assert.Contains(t, string(pubKey), "-----END PGP PUBLIC KEY BLOCK-----")
}
func TestGPGGetKeyFingerprint(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
fingerprint, err := gpgGetKeyFingerprint(keyID)
require.NoError(t, err)
assert.NotEmpty(t, fingerprint)
// The fingerprint should be 40 hex chars
assert.Len(t, fingerprint, 40, "fingerprint should be 40 hex chars")
}
func TestGPGSignInvalidKey(t *testing.T) {
// Set up test environment (we need GNUPGHOME set)
_, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data")
_, err := gpgSign(data, GPGKeyID("NONEXISTENT_KEY_ID_12345"))
assert.Error(t, err)
}
func TestBuilderWithSigning(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
// Create a builder with signing options
b := NewBuilder()
b.SetSigningOptions(&SigningOptions{
KeyID: keyID,
})
// Add a test file
content := []byte("test file content")
reader := bytes.NewReader(content)
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime{}, reader, nil)
require.NoError(t, err)
// Build the manifest
var buf bytes.Buffer
err = b.Build(&buf)
require.NoError(t, err)
// Parse the manifest and verify signature fields are populated
manifest, err := NewManifestFromReader(&buf)
require.NoError(t, err)
require.NotNil(t, manifest.pbOuter)
assert.NotEmpty(t, manifest.pbOuter.Signature, "signature should be populated")
assert.NotEmpty(t, manifest.pbOuter.Signer, "signer should be populated")
assert.NotEmpty(t, manifest.pbOuter.SigningPubKey, "signing public key should be populated")
// Verify signature is a valid PGP signature
assert.Contains(t, string(manifest.pbOuter.Signature), "-----BEGIN PGP SIGNATURE-----")
// Verify public key is a valid PGP public key block
assert.Contains(t, string(manifest.pbOuter.SigningPubKey), "-----BEGIN PGP PUBLIC KEY BLOCK-----")
}
func TestScannerWithSigning(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
// Create in-memory filesystem with test files
fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("content1"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("content2"), 0o644))
// Create scanner with signing options
opts := &ScannerOptions{
Fs: fs,
SigningOptions: &SigningOptions{
KeyID: keyID,
},
}
s := NewScannerWithOptions(opts)
// Enumerate files
require.NoError(t, s.EnumeratePath("/testdir", nil))
assert.Equal(t, FileCount(2), s.FileCount())
// Generate signed manifest
var buf bytes.Buffer
require.NoError(t, s.ToManifest(context.Background(), &buf, nil))
// Parse and verify
manifest, err := NewManifestFromReader(&buf)
require.NoError(t, err)
assert.NotEmpty(t, manifest.pbOuter.Signature)
assert.NotEmpty(t, manifest.pbOuter.Signer)
assert.NotEmpty(t, manifest.pbOuter.SigningPubKey)
}
func TestGPGVerify(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data to sign and verify")
sig, err := gpgSign(data, keyID)
require.NoError(t, err)
pubKey, err := gpgExportPublicKey(keyID)
require.NoError(t, err)
// Verify the signature
err = gpgVerify(data, sig, pubKey)
require.NoError(t, err)
}
func TestGPGVerifyInvalidSignature(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data to sign")
sig, err := gpgSign(data, keyID)
require.NoError(t, err)
pubKey, err := gpgExportPublicKey(keyID)
require.NoError(t, err)
// Try to verify with different data - should fail
wrongData := []byte("different data")
err = gpgVerify(wrongData, sig, pubKey)
assert.Error(t, err)
}
func TestGPGVerifyBadPublicKey(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data")
sig, err := gpgSign(data, keyID)
require.NoError(t, err)
// Try to verify with invalid public key - should fail
badPubKey := []byte("not a valid public key")
err = gpgVerify(data, sig, badPubKey)
assert.Error(t, err)
}
func TestManifestSignatureVerification(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
// Create a builder with signing options
b := NewBuilder()
b.SetSigningOptions(&SigningOptions{
KeyID: keyID,
})
// Add a test file
content := []byte("test file content for verification")
reader := bytes.NewReader(content)
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime{}, reader, nil)
require.NoError(t, err)
// Build the manifest
var buf bytes.Buffer
err = b.Build(&buf)
require.NoError(t, err)
// Parse the manifest - signature should be verified during load
manifest, err := NewManifestFromReader(&buf)
require.NoError(t, err)
require.NotNil(t, manifest)
// Signature should be present and valid
assert.NotEmpty(t, manifest.pbOuter.Signature)
}
func TestManifestTamperedSignatureFails(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
// Create a signed manifest
b := NewBuilder()
b.SetSigningOptions(&SigningOptions{
KeyID: keyID,
})
content := []byte("test file content")
reader := bytes.NewReader(content)
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime{}, reader, nil)
require.NoError(t, err)
var buf bytes.Buffer
err = b.Build(&buf)
require.NoError(t, err)
// Tamper with the signature by replacing some bytes
data := buf.Bytes()
// Find and modify a byte in the signature portion
for i := range data {
if i > 100 && data[i] == 'A' {
data[i] = 'B'
break
}
}
// Try to load the tampered manifest - should fail
_, err = NewManifestFromReader(bytes.NewReader(data))
assert.Error(t, err)
}
func TestBuilderWithoutSigning(t *testing.T) {
// Create a builder without signing options
b := NewBuilder()
// Add a test file
content := []byte("test file content")
reader := bytes.NewReader(content)
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime{}, reader, nil)
require.NoError(t, err)
// Build the manifest
var buf bytes.Buffer
err = b.Build(&buf)
require.NoError(t, err)
// Parse the manifest and verify signature fields are empty
manifest, err := NewManifestFromReader(&buf)
require.NoError(t, err)
require.NotNil(t, manifest.pbOuter)
assert.Empty(t, manifest.pbOuter.Signature, "signature should be empty when not signing")
assert.Empty(t, manifest.pbOuter.Signer, "signer should be empty when not signing")
assert.Empty(t, manifest.pbOuter.SigningPubKey, "signing public key should be empty when not signing")
}

View File

@@ -2,16 +2,21 @@ package mfer
import ( import (
"bytes" "bytes"
"encoding/hex"
"errors"
"fmt" "fmt"
"github.com/multiformats/go-multihash"
) )
// manifest holds the internal representation of a manifest file. // manifest holds the internal representation of a manifest file.
// Use NewManifestFromFile or NewManifestFromReader to load an existing manifest, // Use NewManifestFromFile or NewManifestFromReader to load an existing manifest,
// or use Builder to create a new one. // or use Builder to create a new one.
type manifest struct { type manifest struct {
pbInner *MFFile pbInner *MFFile
pbOuter *MFFileOuter pbOuter *MFFileOuter
output *bytes.Buffer output *bytes.Buffer
signingOptions *SigningOptions
} }
func (m *manifest) String() string { func (m *manifest) String() string {
@@ -29,3 +34,26 @@ func (m *manifest) Files() []*MFFilePath {
} }
return m.pbInner.Files return m.pbInner.Files
} }
// signatureString generates the canonical string used for signing/verification.
// Format: MAGIC-UUID-MULTIHASH where UUID and multihash are hex-encoded.
// Requires pbOuter to be set with Uuid and Sha256 fields.
func (m *manifest) signatureString() (string, error) {
if m.pbOuter == nil {
return "", errors.New("pbOuter not set")
}
if len(m.pbOuter.Uuid) == 0 {
return "", errors.New("UUID not set")
}
if len(m.pbOuter.Sha256) == 0 {
return "", errors.New("SHA256 hash not set")
}
mh, err := multihash.Encode(m.pbOuter.Sha256, multihash.SHA2_256)
if err != nil {
return "", fmt.Errorf("failed to encode multihash: %w", err)
}
uuidStr := hex.EncodeToString(m.pbOuter.Uuid)
mhStr := hex.EncodeToString(mh)
return fmt.Sprintf("%s-%s-%s", MAGIC, uuidStr, mhStr), nil
}

View File

@@ -218,8 +218,10 @@ type MFFileOuter struct {
CompressionType MFFileOuter_CompressionType `protobuf:"varint,102,opt,name=compressionType,proto3,enum=MFFileOuter_CompressionType" json:"compressionType,omitempty"` CompressionType MFFileOuter_CompressionType `protobuf:"varint,102,opt,name=compressionType,proto3,enum=MFFileOuter_CompressionType" json:"compressionType,omitempty"`
// these are used solely to detect corruption/truncation // these are used solely to detect corruption/truncation
// and not for cryptographic integrity. // and not for cryptographic integrity.
Size int64 `protobuf:"varint,103,opt,name=size,proto3" json:"size,omitempty"` Size int64 `protobuf:"varint,103,opt,name=size,proto3" json:"size,omitempty"`
Sha256 []byte `protobuf:"bytes,104,opt,name=sha256,proto3" json:"sha256,omitempty"` Sha256 []byte `protobuf:"bytes,104,opt,name=sha256,proto3" json:"sha256,omitempty"`
// uuid must match the uuid in the inner message
Uuid []byte `protobuf:"bytes,105,opt,name=uuid,proto3" json:"uuid,omitempty"`
InnerMessage []byte `protobuf:"bytes,199,opt,name=innerMessage,proto3" json:"innerMessage,omitempty"` InnerMessage []byte `protobuf:"bytes,199,opt,name=innerMessage,proto3" json:"innerMessage,omitempty"`
// detached signature, ascii or binary // detached signature, ascii or binary
Signature []byte `protobuf:"bytes,201,opt,name=signature,proto3,oneof" json:"signature,omitempty"` Signature []byte `protobuf:"bytes,201,opt,name=signature,proto3,oneof" json:"signature,omitempty"`
@@ -289,6 +291,13 @@ func (x *MFFileOuter) GetSha256() []byte {
return nil return nil
} }
func (x *MFFileOuter) GetUuid() []byte {
if x != nil {
return x.Uuid
}
return nil
}
func (x *MFFileOuter) GetInnerMessage() []byte { func (x *MFFileOuter) GetInnerMessage() []byte {
if x != nil { if x != nil {
return x.InnerMessage return x.InnerMessage
@@ -463,6 +472,9 @@ type MFFile struct {
Version MFFile_Version `protobuf:"varint,100,opt,name=version,proto3,enum=MFFile_Version" json:"version,omitempty"` Version MFFile_Version `protobuf:"varint,100,opt,name=version,proto3,enum=MFFile_Version" json:"version,omitempty"`
// required manifest attributes: // required manifest attributes:
Files []*MFFilePath `protobuf:"bytes,101,rep,name=files,proto3" json:"files,omitempty"` Files []*MFFilePath `protobuf:"bytes,101,rep,name=files,proto3" json:"files,omitempty"`
// uuid is a random v4 UUID generated when creating the manifest
// used as part of the signature to prevent replay attacks
Uuid []byte `protobuf:"bytes,102,opt,name=uuid,proto3" json:"uuid,omitempty"`
// optional manifest attributes 2xx: // optional manifest attributes 2xx:
CreatedAt *Timestamp `protobuf:"bytes,201,opt,name=createdAt,proto3,oneof" json:"createdAt,omitempty"` CreatedAt *Timestamp `protobuf:"bytes,201,opt,name=createdAt,proto3,oneof" json:"createdAt,omitempty"`
unknownFields protoimpl.UnknownFields unknownFields protoimpl.UnknownFields
@@ -513,6 +525,13 @@ func (x *MFFile) GetFiles() []*MFFilePath {
return nil return nil
} }
func (x *MFFile) GetUuid() []byte {
if x != nil {
return x.Uuid
}
return nil
}
func (x *MFFile) GetCreatedAt() *Timestamp { func (x *MFFile) GetCreatedAt() *Timestamp {
if x != nil { if x != nil {
return x.CreatedAt return x.CreatedAt
@@ -527,12 +546,13 @@ const file_mf_proto_rawDesc = "" +
"\bmf.proto\";\n" + "\bmf.proto\";\n" +
"\tTimestamp\x12\x18\n" + "\tTimestamp\x12\x18\n" +
"\aseconds\x18\x01 \x01(\x03R\aseconds\x12\x14\n" + "\aseconds\x18\x01 \x01(\x03R\aseconds\x12\x14\n" +
"\x05nanos\x18\x02 \x01(\x05R\x05nanos\"\xdc\x03\n" + "\x05nanos\x18\x02 \x01(\x05R\x05nanos\"\xf0\x03\n" +
"\vMFFileOuter\x12.\n" + "\vMFFileOuter\x12.\n" +
"\aversion\x18e \x01(\x0e2\x14.MFFileOuter.VersionR\aversion\x12F\n" + "\aversion\x18e \x01(\x0e2\x14.MFFileOuter.VersionR\aversion\x12F\n" +
"\x0fcompressionType\x18f \x01(\x0e2\x1c.MFFileOuter.CompressionTypeR\x0fcompressionType\x12\x12\n" + "\x0fcompressionType\x18f \x01(\x0e2\x1c.MFFileOuter.CompressionTypeR\x0fcompressionType\x12\x12\n" +
"\x04size\x18g \x01(\x03R\x04size\x12\x16\n" + "\x04size\x18g \x01(\x03R\x04size\x12\x16\n" +
"\x06sha256\x18h \x01(\fR\x06sha256\x12#\n" + "\x06sha256\x18h \x01(\fR\x06sha256\x12\x12\n" +
"\x04uuid\x18i \x01(\fR\x04uuid\x12#\n" +
"\finnerMessage\x18\xc7\x01 \x01(\fR\finnerMessage\x12\"\n" + "\finnerMessage\x18\xc7\x01 \x01(\fR\finnerMessage\x12\"\n" +
"\tsignature\x18\xc9\x01 \x01(\fH\x00R\tsignature\x88\x01\x01\x12\x1c\n" + "\tsignature\x18\xc9\x01 \x01(\fH\x00R\tsignature\x88\x01\x01\x12\x1c\n" +
"\x06signer\x18\xca\x01 \x01(\fH\x01R\x06signer\x88\x01\x01\x12*\n" + "\x06signer\x18\xca\x01 \x01(\fH\x01R\x06signer\x88\x01\x01\x12*\n" +
@@ -564,10 +584,11 @@ const file_mf_proto_rawDesc = "" +
"\x06_ctimeB\b\n" + "\x06_ctimeB\b\n" +
"\x06_atime\".\n" + "\x06_atime\".\n" +
"\x0eMFFileChecksum\x12\x1c\n" + "\x0eMFFileChecksum\x12\x1c\n" +
"\tmultiHash\x18\x01 \x01(\fR\tmultiHash\"\xc2\x01\n" + "\tmultiHash\x18\x01 \x01(\fR\tmultiHash\"\xd6\x01\n" +
"\x06MFFile\x12)\n" + "\x06MFFile\x12)\n" +
"\aversion\x18d \x01(\x0e2\x0f.MFFile.VersionR\aversion\x12!\n" + "\aversion\x18d \x01(\x0e2\x0f.MFFile.VersionR\aversion\x12!\n" +
"\x05files\x18e \x03(\v2\v.MFFilePathR\x05files\x12.\n" + "\x05files\x18e \x03(\v2\v.MFFilePathR\x05files\x12\x12\n" +
"\x04uuid\x18f \x01(\fR\x04uuid\x12.\n" +
"\tcreatedAt\x18\xc9\x01 \x01(\v2\n" + "\tcreatedAt\x18\xc9\x01 \x01(\v2\n" +
".TimestampH\x00R\tcreatedAt\x88\x01\x01\",\n" + ".TimestampH\x00R\tcreatedAt\x88\x01\x01\",\n" +
"\aVersion\x12\x10\n" + "\aVersion\x12\x10\n" +

View File

@@ -28,6 +28,9 @@ message MFFileOuter {
int64 size = 103; int64 size = 103;
bytes sha256 = 104; bytes sha256 = 104;
// uuid must match the uuid in the inner message
bytes uuid = 105;
bytes innerMessage = 199; bytes innerMessage = 199;
// 2xx for optional manifest root attributes // 2xx for optional manifest root attributes
// think we might use gosignify instead of gpg: // think we might use gosignify instead of gpg:
@@ -43,6 +46,9 @@ message MFFileOuter {
message MFFilePath { message MFFilePath {
// required attributes: // required attributes:
// Path invariants: must be valid UTF-8, use forward slashes only,
// be relative (no leading /), contain no ".." segments, and no
// empty segments (no "//").
string path = 1; string path = 1;
int64 size = 2; int64 size = 2;
@@ -72,6 +78,10 @@ message MFFile {
// required manifest attributes: // required manifest attributes:
repeated MFFilePath files = 101; repeated MFFilePath files = 101;
// uuid is a random v4 UUID generated when creating the manifest
// used as part of the signature to prevent replay attacks
bytes uuid = 102;
// optional manifest attributes 2xx: // optional manifest attributes 2xx:
optional Timestamp createdAt = 201; optional Timestamp createdAt = 201;
} }

View File

@@ -43,9 +43,10 @@ type ScanStatus struct {
// ScannerOptions configures scanner behavior. // ScannerOptions configures scanner behavior.
type ScannerOptions struct { type ScannerOptions struct {
IncludeDotfiles bool // Include files and directories starting with a dot (default: exclude) IncludeDotfiles bool // Include files and directories starting with a dot (default: exclude)
FollowSymLinks bool // Resolve symlinks instead of skipping them FollowSymLinks bool // Resolve symlinks instead of skipping them
Fs afero.Fs // Filesystem to use, defaults to OsFs if nil Fs afero.Fs // Filesystem to use, defaults to OsFs if nil
SigningOptions *SigningOptions // GPG signing options (nil = no signing)
} }
// FileEntry represents a file that has been enumerated. // FileEntry represents a file that has been enumerated.
@@ -272,6 +273,9 @@ func (s *Scanner) ToManifest(ctx context.Context, w io.Writer, progress chan<- S
s.mu.RUnlock() s.mu.RUnlock()
builder := NewBuilder() builder := NewBuilder()
if s.options.SigningOptions != nil {
builder.SetSigningOptions(s.options.SigningOptions)
}
var scannedFiles FileCount var scannedFiles FileCount
var scannedBytes FileSize var scannedBytes FileSize

View File

@@ -4,8 +4,10 @@ import (
"bytes" "bytes"
"crypto/sha256" "crypto/sha256"
"errors" "errors"
"fmt"
"time" "time"
"github.com/google/uuid"
"github.com/klauspost/compress/zstd" "github.com/klauspost/compress/zstd"
"google.golang.org/protobuf/proto" "google.golang.org/protobuf/proto"
) )
@@ -47,14 +49,17 @@ func (m *manifest) generateOuter() error {
if m.pbInner == nil { if m.pbInner == nil {
return errors.New("internal error") return errors.New("internal error")
} }
// Generate UUID and set on inner message
manifestUUID := uuid.New()
m.pbInner.Uuid = manifestUUID[:]
innerData, err := proto.MarshalOptions{Deterministic: true}.Marshal(m.pbInner) innerData, err := proto.MarshalOptions{Deterministic: true}.Marshal(m.pbInner)
if err != nil { if err != nil {
return err return err
} }
h := sha256.New() // Compress the inner data
h.Write(innerData)
idc := new(bytes.Buffer) idc := new(bytes.Buffer)
zw, err := zstd.NewWriter(idc, zstd.WithEncoderLevel(zstd.SpeedBestCompression)) zw, err := zstd.NewWriter(idc, zstd.WithEncoderLevel(zstd.SpeedBestCompression))
if err != nil { if err != nil {
@@ -64,16 +69,51 @@ func (m *manifest) generateOuter() error {
if err != nil { if err != nil {
return err return err
} }
_ = zw.Close() _ = zw.Close()
o := &MFFileOuter{ compressedData := idc.Bytes()
InnerMessage: idc.Bytes(),
// Hash the compressed data for integrity verification before decompression
h := sha256.New()
if _, err := h.Write(compressedData); err != nil {
return err
}
sha256Hash := h.Sum(nil)
m.pbOuter = &MFFileOuter{
InnerMessage: compressedData,
Size: int64(len(innerData)), Size: int64(len(innerData)),
Sha256: h.Sum(nil), Sha256: sha256Hash,
Uuid: manifestUUID[:],
Version: MFFileOuter_VERSION_ONE, Version: MFFileOuter_VERSION_ONE,
CompressionType: MFFileOuter_COMPRESSION_ZSTD, CompressionType: MFFileOuter_COMPRESSION_ZSTD,
} }
m.pbOuter = o
// Sign the manifest if signing options are provided
if m.signingOptions != nil && m.signingOptions.KeyID != "" {
sigString, err := m.signatureString()
if err != nil {
return fmt.Errorf("failed to generate signature string: %w", err)
}
sig, err := gpgSign([]byte(sigString), m.signingOptions.KeyID)
if err != nil {
return fmt.Errorf("failed to sign manifest: %w", err)
}
m.pbOuter.Signature = sig
fingerprint, err := gpgGetKeyFingerprint(m.signingOptions.KeyID)
if err != nil {
return fmt.Errorf("failed to get key fingerprint: %w", err)
}
m.pbOuter.Signer = fingerprint
pubKey, err := gpgExportPublicKey(m.signingOptions.KeyID)
if err != nil {
return fmt.Errorf("failed to export public key: %w", err)
}
m.pbOuter.SigningPubKey = pubKey
}
return nil return nil
} }