27 Commits

Author SHA1 Message Date
c62a4dd5e9 Merge branch 'next' into fix/issue-13 2026-02-09 02:13:09 +01:00
70af055d4e Fix newTimestampFromTime panic on extreme dates (closes #15) (#20)
Co-authored-by: clawbot <clawbot@openclaw>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #20
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-02-09 02:10:21 +01:00
04b05e01e8 Consolidate scanner/checker — delete internal/scanner/ and internal/checker/ (closes #22) (#27)
Remove unused `internal/scanner/` and `internal/checker/` packages. The CLI already uses `mfer.Scanner` and `mfer.Checker` from the `mfer/` package directly, so these were dead code.

Co-authored-by: clawbot <clawbot@openclaw>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #27
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-02-09 02:09:01 +01:00
7144617d0e Add decompression size limit in deserializeInner() (closes #24) (#29)
Wrap zstd decompressor with `io.LimitReader` (256MB max) to prevent decompression bombs.

Co-authored-by: clawbot <clawbot@openclaw>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #29
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-02-09 01:45:55 +01:00
2efffd9da8 Specify and enforce path invariants (closes #26) (#31)
Add `ValidatePath()` enforcing UTF-8, forward-slash, relative, no `..`, no empty segments. Applied in `AddFile` and `AddFileWithHash`. Proto comments document the rules.

Co-authored-by: clawbot <clawbot@openclaw>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #31
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-02-09 01:45:29 +01:00
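The rules listed in this commit could be checked roughly as follows. This is a sketch under stated assumptions, not the repository's `ValidatePath()`: the function name `validatePathSketch`, its signature, and the error messages are invented for illustration, and treating "forward-slash" as "reject any backslash" is an interpretation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// validatePathSketch is a hypothetical stand-in that enforces the rules
// named in the commit message: UTF-8, forward slashes only, relative,
// no ".." segments, no empty segments.
func validatePathSketch(p string) error {
	if p == "" {
		return errors.New("path is empty")
	}
	if !utf8.ValidString(p) {
		return errors.New("path is not valid UTF-8")
	}
	if strings.Contains(p, "\\") {
		return errors.New("path contains a backslash")
	}
	if strings.HasPrefix(p, "/") {
		return errors.New("path is absolute")
	}
	for _, seg := range strings.Split(p, "/") {
		switch seg {
		case "":
			return errors.New("path has an empty segment")
		case "..":
			return errors.New("path contains a .. segment")
		}
	}
	return nil
}

func main() {
	for _, p := range []string{"dir/file.txt", "/etc/passwd", "a/../b", "a//b"} {
		fmt.Printf("%q: %v\n", p, validatePathSketch(p))
	}
}
```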
ebaf2a65ca Fix AddFile to verify actual bytes read matches declared size (closes #25) (#30)
After reading file content, verify `totalRead == size` and return an error on mismatch.

Co-authored-by: clawbot <clawbot@openclaw>
Reviewed-on: #30
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-02-09 01:35:07 +01:00
clawbot
34438cb5b9 fix: URL-encode file paths in fetch command to handle special characters
File paths with spaces, #, ?, %, etc. were concatenated directly into
URLs without encoding, producing malformed download URLs.

Add encodeFilePath() that encodes each path segment individually
(preserving directory separators) and use it in fetch.
2026-02-08 12:03:11 -08:00
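Per-segment encoding as described in this commit might look like the sketch below. The name `encodeFilePathSketch` is a placeholder; the repository's `encodeFilePath()` may differ in detail. `url.PathEscape` is applied to each segment so the `/` separators survive.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// encodeFilePathSketch percent-encodes each path segment individually
// and rejoins them with "/", so directory separators are preserved.
func encodeFilePathSketch(p string) string {
	segments := strings.Split(p, "/")
	for i, seg := range segments {
		segments[i] = url.PathEscape(seg)
	}
	return strings.Join(segments, "/")
}

func main() {
	fmt.Println(encodeFilePathSketch("my dir/a#b?.txt"))
	fmt.Println(encodeFilePathSketch("100%.txt"))
}
```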
4b80c0067b docs: replace TODO.md with design questions and implementation plan 2026-02-08 18:40:31 +01:00
5ab092098b progress 2026-02-08 09:25:58 -08:00
4a2060087d Add GPG signature verification on manifest load
- Implement gpgVerify function that creates a temporary keyring to verify
  detached signatures against embedded public keys
- Signature verification happens during deserialization after hash
  validation but before decompression
- Extract signatureString() as a method on manifest for generating the
  canonical signature string (MAGIC-UUID-MULTIHASH)
- Add --require-signature flag to check command to mandate signature from
  a specific GPG key ID
- Expose IsSigned() and Signer() methods on Checker for signature status
2025-12-18 05:56:16 -08:00
213364bab5 Add UUID to manifest and verify integrity before decompression
- Add UUID field to both inner and outer manifest messages
- Generate random v4 UUID when creating manifest
- Hash compressed data (not uncompressed) for integrity check
- Verify hash before decompression to prevent malicious payloads
- Validate UUIDs are proper format and match between inner/outer
- Sign string format: MAGIC-UUID-MULTIHASH
2025-12-18 02:20:51 -08:00
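A rough illustration of the `MAGIC-UUID-MULTIHASH` sign string. Assumptions: the UUID is used in its canonical text form, the multihash is hex-encoded (as the TODO.md design questions suggest), and the multihash is sha2-256 (prefix bytes `0x12 0x20`). The function names and the example UUID are hypothetical; only the magic value `ZNAVSRFG` comes from the README.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// magic is the manifest file prefix documented in the README.
const magic = "ZNAVSRFG"

// sha256Multihash builds a sha2-256 multihash by hand:
// 0x12 = sha2-256 function code, 0x20 = digest length (32 bytes).
func sha256Multihash(data []byte) []byte {
	digest := sha256.Sum256(data)
	return append([]byte{0x12, 0x20}, digest[:]...)
}

// signatureStringSketch joins the three parts with dashes. How the real
// signatureString() encodes each part is an assumption here.
func signatureStringSketch(uuid string, multihash []byte) string {
	return fmt.Sprintf("%s-%s-%s", magic, uuid, hex.EncodeToString(multihash))
}

func main() {
	// The hash covers the compressed inner message, per the commit above.
	compressed := []byte("compressed inner message bytes")
	mh := sha256Multihash(compressed)
	fmt.Println(signatureStringSketch("123e4567-e89b-42d3-a456-426614174000", mh))
}
```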
778999a285 Add GPG signing support for manifest generation
- Add --sign-key flag and MFER_SIGN_KEY env var to gen and freshen commands
- Sign inner message multihash with GPG detached signature
- Include signer fingerprint and public key in outer wrapper
- Add comprehensive tests with temporary GPG keyring
- Increase test timeout to 10s for GPG key generation
2025-12-18 02:12:54 -08:00
308c583d57 Remove codebase structure section from README
godoc provides this documentation automatically
2025-12-18 01:38:13 -08:00
019fe41c3d Update .gitignore for new bin/ build directory 2025-12-18 01:30:50 -08:00
fc0b38ea19 Add TODO.md with codebase audit findings
Document issues found during code audit including:
- Critical: broken error comparison, unchecked hash writes, URL path traversal
- Important: goroutine leak, timestamp precision, missing context cancellation
- Code quality: duplicate functions, inefficient calculations, missing validation
2025-12-18 01:30:01 -08:00
61c17ca585 Normalize markdown formatting in documentation
- Use consistent dash-style bullet points
- Remove trailing whitespace
- Add missing blank lines between sections
- Add trailing newline to README.md
2025-12-18 01:29:56 -08:00
dae6c64e24 Change build output path from mfer.cmd to bin/mfer
Use conventional bin/ directory for build output instead of
placing executable in project root.
2025-12-18 01:29:47 -08:00
a5b0343b28 Use Go 1.13+ octal literal syntax throughout codebase
Update file permission literals from legacy octal format (0755, 0644)
to explicit Go 1.13+ format (0o755, 0o644) for improved readability.
2025-12-18 01:29:40 -08:00
e25e309581 Move checker package into mfer package
Consolidate checker functionality into the mfer package alongside
scanner, removing the need for a separate internal/checker package.
2025-12-18 01:28:35 -08:00
dc115c5ba2 Add custom types for type safety throughout codebase
- Add FileCount, FileSize, RelFilePath, AbsFilePath, ModTime, Multihash types
- Add UnixSeconds and UnixNanos types for timestamp handling
- Add URL types (ManifestURL, FileURL, BaseURL) with safe path joining
- Consolidate scanner package into mfer package
- Update checker to use custom types in Result and CheckStatus
- Add ModTime.Timestamp() method for protobuf conversion
- Update all tests to use proper custom types
2025-12-18 01:01:18 -08:00
a9f0d2abe4 Update README to reflect current API (FileProgress was already a channel) 2025-12-17 17:19:08 -08:00
1588e1bb9f Remove unused legacy manifest APIs
Removed:
- New(), NewFromPaths(), NewFromFS() - unused constructors
- Scan(), addFile(), addInputPath(), addInputFS() - unused scanning code
- WriteToFile(), Write() - unused output methods (Builder.Build() is used)
- GetFileCount(), GetTotalFileSize() - unused accessors
- pathIsHidden() - duplicated in internal/scanner
- ManifestScanOptions - unused options struct
- HasError(), AddError(), WithContext() - unused error/context handling
- NewFromProto() - deprecated alias
- manifestFile struct - unused internal type

Kept:
- manifest struct (simplified to just pbInner, pbOuter, output)
- NewManifestFromReader(), NewManifestFromFile() - for loading manifests
- Files() - returns files from loaded manifest
- Builder and its methods - for creating manifests
2025-12-17 17:16:35 -08:00
09e8da0855 Update CLAUDE.md and clean up completed TODOs in README 2025-12-17 17:09:33 -08:00
efa4bb929a Update README: mark FIXMEs as resolved 2025-12-17 17:08:37 -08:00
16e3538ea6 Document WriteToFile overwrite behavior, remove misplaced FIXME 2025-12-17 17:08:11 -08:00
1ae384b6f6 Add context cancellation support to Scan 2025-12-17 17:07:02 -08:00
b55ae961c8 Validate filesystem in addInputFS 2025-12-17 17:05:42 -08:00
33 changed files with 1666 additions and 894 deletions

2
.gitignore vendored

@@ -1,4 +1,4 @@
-/mfer.cmd
+/bin/
 /tmp
 *.tmp
 *.dockerimage

BIN
.index.mf Normal file

Binary file not shown.


@@ -1,15 +1,20 @@
 # Important Rules
-* never, ever mention claude or anthropic in commit messages. do not use attribution
-* after each change, run "make fmt".
-* after each change, run "make test" and ensure all tests pass.
-* after each change, run "make lint" and ensure no linting errors. fix any
+- when fixing a bug, write a failing test FIRST. only after the test fails, write
+the code to fix the bug. then ensure the test passes. leave the test in
+place and commit it with the bugfix. don't run shell commands to test
+bugfixes or reproduce bugs. write tests!
+- never, ever mention claude or anthropic in commit messages. do not use attribution
+- after each change, run "make fmt".
+- after each change, run "make test" and ensure all tests pass.
+- after each change, run "make lint" and ensure no linting errors. fix any
 you find, one by one.
-* after each change, commit the files you've changed. push after
+- after each change, commit the files you've changed. push after
 committing.
-* NEVER use `git add -A`. always add only individual files that you've changed.
+- NEVER use `git add -A`. always add only individual files that you've changed.


@@ -17,14 +17,14 @@ GOFLAGS := -ldflags "$(GOLDFLAGS)"
 default: fmt test
-run: ./mfer.cmd
+run: ./bin/mfer
 	./$<
 	./$< gen
 ci: test
 test: $(SOURCEFILES) mfer/mf.pb.go
-	go test -v --timeout 3s ./...
+	go test -v --timeout 10s ./...
 $(PROTOC_GEN_GO):
 	test -e $(PROTOC_GEN_GO) || go install -v google.golang.org/protobuf/cmd/protoc-gen-go@v1.28.1
@@ -38,12 +38,12 @@ devprereqs:
 mfer/mf.pb.go: mfer/mf.proto
 	cd mfer && go generate .
-mfer.cmd: $(SOURCEFILES) mfer/mf.pb.go
+bin/mfer: $(SOURCEFILES) mfer/mf.pb.go
 	protoc --version
-	cd cmd/mfer && go build -tags urfave_cli_no_docs -o ../../mfer.cmd $(GOFLAGS) .
+	cd cmd/mfer && go build -tags urfave_cli_no_docs -o ../../bin/mfer $(GOFLAGS) .
 clean:
-	rm -rfv mfer/*.pb.go mfer.cmd cmd/mfer/mfer *.dockerimage
+	rm -rfv mfer/*.pb.go bin/mfer cmd/mfer/mfer *.dockerimage
 fmt: mfer/mf.pb.go
 	gofumpt -l -w mfer internal cmd

224
README.md

@@ -52,205 +52,6 @@ Reading file contents and computing cryptographic hashes for manifest generation
 - **NO_COLOR:** Respect the `NO_COLOR` environment variable for disabling colored output.
 - **Options pattern:** Use `NewWithOptions(opts *Options)` constructor pattern for configurable types.
-# Codebase Structure
-## cmd/mfer/
-### main.go
-- **Variables**
-- `Appname string` - Application name
-- `Version string` - Version string (set at build time)
-- `Gitrev string` - Git revision (set at build time)
-## internal/cli/
-### entry.go
-- **Variables**
-- `NO_COLOR bool` - Disables color output when NO_COLOR env var is set
-- **Functions**
-- `Run(Appname, Version, Gitrev string) int` - Main entry point for the CLI
-### mfer.go
-- **Types**
-- `CLIApp struct` - Main CLI application container
-- **Methods**
-- `(*CLIApp) VersionString() string` - Returns formatted version string
-## internal/log/
-### log.go
-- **Functions**
-- `Init()` - Initializes the logger
-- `Info(arg string)` - Logs at info level
-- `Infof(format string, args ...interface{})` - Logs at info level with formatting
-- `Debug(arg string)` - Logs at debug level with caller info
-- `Debugf(format string, args ...interface{})` - Logs at debug level with formatting and caller info
-- `Dump(args ...interface{})` - Logs spew dump at debug level
-- `Progressf(format string, args ...interface{})` - Prints progress message (overwrites current line)
-- `ProgressDone()` - Completes progress line with newline
-- `EnableDebugLogging()` - Sets log level to debug
-- `SetLevel(arg log.Level)` - Sets log level
-- `SetLevelFromVerbosity(l int)` - Sets log level from verbosity count
-- `GetLevel() log.Level` - Returns current log level
-- `GetLogger() *log.Logger` - Returns underlying logger
-- `WithError(e error) *log.Entry` - Returns log entry with error attached
-- `DisableStyling()` - Disables colors and styling (for NO_COLOR)
-## internal/scanner/
-### scanner.go
-- **Types**
-- `Options struct` - Options for scanner behavior
-- `IncludeDotfiles bool` - Include dot (hidden) files (excluded by default)
-- `FollowSymLinks bool`
-- `EnumerateStatus struct` - Progress information for enumeration phase
-- `FilesFound int64`
-- `BytesFound int64`
-- `ScanStatus struct` - Progress information for scan phase
-- `TotalFiles int64`
-- `ScannedFiles int64`
-- `TotalBytes int64`
-- `ScannedBytes int64`
-- `BytesPerSec float64`
-- `ETA time.Duration`
-- `FileEntry struct` - Represents an enumerated file
-- `Path string` - Relative path (used in manifest)
-- `AbsPath string` - Absolute path (used for reading file content)
-- `Size int64`
-- `Mtime time.Time`
-- `Ctime time.Time`
-- `Scanner struct` - Accumulates files and generates manifests
-- **Functions**
-- `New() *Scanner` - Creates a new Scanner with default options
-- `NewWithOptions(opts *Options) *Scanner` - Creates a new Scanner with given options
-- **Methods (Enumeration Phase)**
-- `(*Scanner) EnumerateFile(path string) error` - Enumerates a single file, calling stat() for metadata
-- `(*Scanner) EnumeratePath(inputPath string, progress chan<- EnumerateStatus) error` - Walks a directory and enumerates all files
-- `(*Scanner) EnumeratePaths(progress chan<- EnumerateStatus, inputPaths ...string) error` - Walks multiple directories
-- `(*Scanner) EnumerateFS(afs afero.Fs, basePath string, progress chan<- EnumerateStatus) error` - Walks an afero filesystem
-- **Methods (Accessors)**
-- `(*Scanner) Files() []*FileEntry` - Returns copy of all enumerated files
-- `(*Scanner) FileCount() int64` - Returns number of files
-- `(*Scanner) TotalBytes() int64` - Returns total size of all files
-- **Methods (Scan Phase)**
-- `(*Scanner) ToManifest(ctx context.Context, w io.Writer, progress chan<- ScanStatus) error` - Reads file contents, computes hashes, generates manifest
-## internal/checker/
-### checker.go
-- **Types**
-- `Result struct` - Outcome of checking a single file
-- `Path string` - File path from manifest
-- `Status Status` - Verification status
-- `Message string` - Error or status message
-- `Status int` - Verification status enumeration
-- `StatusOK` - File matches manifest
-- `StatusMissing` - File not found
-- `StatusSizeMismatch` - File size differs from manifest
-- `StatusHashMismatch` - File hash differs from manifest
-- `StatusError` - Error occurred during verification
-- `CheckStatus struct` - Progress information for check operation
-- `TotalFiles int64`
-- `CheckedFiles int64`
-- `TotalBytes int64`
-- `CheckedBytes int64`
-- `BytesPerSec float64`
-- `ETA time.Duration`
-- `Failures int64`
-- `Checker struct` - Verifies files against a manifest
-- **Functions**
-- `NewChecker(manifestPath string, basePath string) (*Checker, error)` - Creates a new Checker for the given manifest and base path
-- **Methods**
-- `(s Status) String() string` - Returns string representation of status
-- `(*Checker) FileCount() int64` - Returns number of files in the manifest
-- `(*Checker) TotalBytes() int64` - Returns total size of all files in manifest
-- `(*Checker) Check(ctx context.Context, results chan<- Result, progress chan<- CheckStatus) error` - Verifies all files against the manifest
-## mfer/
-### manifest.go
-- **Types**
-- `ManifestScanOptions struct` - Options for scanning directories
-- `IncludeDotfiles bool` - Include dot (hidden) files (excluded by default)
-- `FollowSymLinks bool`
-- **Functions**
-- `New() *manifest` - Creates a new empty manifest
-- `NewFromPaths(options *ManifestScanOptions, inputPaths ...string) (*manifest, error)` - Creates manifest from filesystem paths
-- `NewFromFS(options *ManifestScanOptions, fs afero.Fs) (*manifest, error)` - Creates manifest from afero filesystem
-- **Methods**
-- `(*manifest) HasError() bool` - Returns true if manifest has errors
-- `(*manifest) AddError(e error) *manifest` - Adds an error to the manifest
-- `(*manifest) WithContext(c context.Context) *manifest` - Sets context for cancellation
-- `(*manifest) GetFileCount() int64` - Returns number of files in manifest
-- `(*manifest) GetTotalFileSize() int64` - Returns total size of all files
-- `(*manifest) Files() []*MFFilePath` - Returns all file entries from a loaded manifest
-- `(*manifest) Scan() error` - Scans source filesystems and populates file list
-### output.go
-- **Methods**
-- `(*manifest) WriteToFile(path string) error` - Writes manifest to file path
-- `(*manifest) WriteTo(output io.Writer) error` - Writes manifest to io.Writer
-### builder.go
-- **Types**
-- `FileProgress func(bytesRead int64)` - Callback for file processing progress
-- `Builder struct` - Constructs manifests by adding files one at a time
-- **Functions**
-- `NewBuilder() *Builder` - Creates a new Builder
-- **Methods**
-- `(*Builder) AddFile(path string, size int64, mtime time.Time, reader io.Reader, progress FileProgress) (int64, error)` - Reads file, computes hash, adds to manifest
-- `(*Builder) FileCount() int` - Returns number of files added
-- `(*Builder) Build(w io.Writer) error` - Finalizes and writes manifest
-### serialize.go
-- **Constants**
-- `MAGIC string` - Magic bytes prefix for manifest files ("ZNAVSRFG")
-### deserialize.go
-- **Functions**
-- `NewFromProto(input io.Reader) (*manifest, error)` - Deserializes manifest from protobuf
-- `NewManifestFromReader(input io.Reader) (*manifest, error)` - Reads and parses manifest from io.Reader
-- `NewManifestFromFile(path string) (*manifest, error)` - Reads and parses manifest from file path
-### mf.pb.go (generated from mf.proto)
-- **Enum Types**
-- `MFFileOuter_Version` - Outer file format version
-- `MFFileOuter_VERSION_NONE`
-- `MFFileOuter_VERSION_ONE`
-- `MFFileOuter_CompressionType` - Compression type for inner message
-- `MFFileOuter_COMPRESSION_NONE`
-- `MFFileOuter_COMPRESSION_ZSTD`
-- `MFFile_Version` - Inner file format version
-- `MFFile_VERSION_NONE`
-- `MFFile_VERSION_ONE`
-- **Message Types**
-- `Timestamp struct` - Timestamp with seconds and nanoseconds
-- `GetSeconds() int64`
-- `GetNanos() int32`
-- `MFFileOuter struct` - Outer wrapper containing compressed/signed inner message
-- `GetVersion() MFFileOuter_Version`
-- `GetCompressionType() MFFileOuter_CompressionType`
-- `GetSize() int64`
-- `GetSha256() []byte`
-- `GetInnerMessage() []byte`
-- `GetSignature() []byte`
-- `GetSigner() []byte`
-- `GetSigningPubKey() []byte`
-- `MFFilePath struct` - Individual file entry in manifest
-- `GetPath() string`
-- `GetSize() int64`
-- `GetHashes() []*MFFileChecksum`
-- `GetMimeType() string`
-- `GetMtime() *Timestamp`
-- `GetCtime() *Timestamp`
-- `GetAtime() *Timestamp`
-- `MFFileChecksum struct` - File checksum using multihash
-- `GetMultiHash() []byte`
-- `MFFile struct` - Inner manifest containing file list
-- `GetVersion() MFFile_Version`
-- `GetFiles() []*MFFilePath`
-- `GetCreatedAt() *Timestamp`
 # Build Status
 [![Build Status](https://drone.datavi.be/api/badges/sneak/mfer/status.svg)](https://drone.datavi.be/sneak/mfer)
@@ -270,7 +71,6 @@ requests](https://git.eeqj.de/sneak/mfer/pulls) and pass CI to be merged.
 Any changes submitted to this project must also be
 [WTFPL-licensed](https://wtfpl.net) to be considered.
 # Problem Statement
 Given a plain URL, there is no standard way to safely and programmatically
@@ -352,22 +152,6 @@ The manifest file would do several important things:
 - **Manifest size:** Manifests must fit entirely in system memory during reading and writing.
-# TODO
-## Medium Priority
-- [x] **Atomic writes for `mfer gen`** - Writes to temp file then atomic rename; cleans up temp file on error/interrupt.
-- [ ] **Change FileProgress callback to channel** - `mfer/builder.go` uses a callback for progress reporting; should use channels like `EnumerateStatus` and `ScanStatus` for consistency.
-- [ ] **Consolidate legacy manifest code** - `mfer/manifest.go` has old scanning code (`Scan()`, `addFile()`) that duplicates the new `internal/scanner` + `mfer/builder.go` pattern.
-- [ ] **Add context cancellation to legacy code** - The old `manifest.Scan()` doesn't support context cancellation; the new scanner does.
-## Lower Priority
-- [x] **Add unit tests for `internal/checker`** - 88.5% coverage.
-- [x] **Add unit tests for `internal/scanner`** - 80.1% coverage.
-- [ ] **Clean up FIXMEs in manifest.go** - Validate input paths exist, validate filesystem.
-- [x] **Validate input paths before scanning** - Fails fast with clear error if paths don't exist.
 # Open Questions
 - Should the manifest file include checksums of individual file chunks, or just for the whole assembled file?
@@ -457,13 +241,13 @@ desired username for an account on this Gitea instance.
 ## Links
-* Repo: [https://git.eeqj.de/sneak/mfer](https://git.eeqj.de/sneak/mfer)
-* Issues: [https://git.eeqj.de/sneak/mfer/issues](https://git.eeqj.de/sneak/mfer/issues)
+- Repo: [https://git.eeqj.de/sneak/mfer](https://git.eeqj.de/sneak/mfer)
+- Issues: [https://git.eeqj.de/sneak/mfer/issues](https://git.eeqj.de/sneak/mfer/issues)
 # Authors
-* [@sneak &lt;sneak@sneak.berlin&gt;](mailto:sneak@sneak.berlin)
+- [@sneak &lt;sneak@sneak.berlin&gt;](mailto:sneak@sneak.berlin)
 # License
-* [WTFPL](https://wtfpl.net)
+- [WTFPL](https://wtfpl.net)

127
TODO.md

@@ -1,33 +1,122 @@
-# TODO for 1.0 Release
-## High Priority
-- [ ] **Fix panic in log.go** - `internal/log/log.go:141` has a `panic("unable to get logger")` that should return an error or handle gracefully instead.
-- [ ] **Clean up FIXMEs in manifest.go** - Multiple FIXMEs need attention:
-- Line 67: Validate input paths exist before processing
-- Line 77: Add validation for filesystem input
-- Line 163: Avoid redundant stat calls
-- Line 182: Add context support for cancellation
-- [ ] **Fix WriteToFile overwrite behavior** - `mfer/output.go:9` has FIXME to refuse overwriting without `-f` flag.
-- [ ] **Consolidate legacy manifest code** - `mfer/manifest.go` has old scanning code (`Scan()`, `addFile()`) that duplicates the new `internal/scanner` + `mfer/builder.go` pattern. Remove duplication.
-## Medium Priority
-- [ ] **Add unit tests for `internal/checker`** - Currently has no test files; only tested indirectly via CLI tests.
-- [ ] **Add unit tests for `internal/scanner`** - Currently has no test files.
-- [ ] **Add context cancellation to legacy code** - The old `manifest.Scan()` doesn't support context cancellation; the new scanner does.
-- [ ] **Validate input paths before scanning** - Should fail fast with a clear error if paths don't exist.
-- [ ] **Add resume support for fetch** - Allow resuming partial downloads using HTTP Range requests and existing temp files.
-## Lower Priority
-- [ ] **Add manifest signature support** - Implement signing and verification using signify or similar.
-- [ ] **Improve error messages** - Ensure all error messages are clear and actionable.
+# TODO: mfer 1.0
+## Design Questions
+*sneak: please answer inline below each question. These are preserved for posterity.*
+### Format Design
+**1. Should `MFFileChecksum` be simplified?**
+Currently it's a separate message wrapping a single `bytes multiHash` field. Since multihash already self-describes the algorithm, `repeated bytes hashes` directly on `MFFilePath` would be simpler and reduce per-file protobuf overhead. Is the extra message layer intentional (e.g. planning to add per-hash metadata like `verified_at`)?
+> *answer:*
+**2. Should file permissions/mode be stored?**
+The format stores mtime/ctime but not Unix file permissions. For archival use (ExFAT, filesystem-independent checksums) this may not matter, but for software distribution or filesystem restoration it's a gap. Should we reserve a field now (e.g. `optional uint32 mode = 305`) even if we don't populate it yet?
+> *answer:*
+**3. Should `atime` be removed from the schema?**
+Access time is volatile, non-deterministic, and often disabled (`noatime`). Including it means two manifests of the same directory at different times will differ, which conflicts with the determinism goal. Remove it, or document it as "never set by default"?
+> *answer:*
+**4. What are the path normalization rules?**
+The proto has `string path` with no specification about: always forward-slash? Must be relative? No `..` components allowed? UTF-8 NFC vs NFD normalization (macOS vs Linux)? Max path length? This is a security issue (path traversal) and a cross-platform compatibility issue. What rules should the spec mandate?
+> *answer:*
+**5. Should we add a version byte after the magic?**
+Currently `ZNAVSRFG` is followed immediately by protobuf. Adding a version byte (`ZNAVSRFG\x01`) would allow future framing changes without requiring protobuf parsing to detect the version. `MFFileOuter.Version` serves this purpose but requires successful deserialization to read. Worth the extra byte?
+> *answer:*
+**6. Should we add a length-prefix after the magic?**
+Protobuf is not self-delimiting. If we ever want to concatenate manifests or append data after the protobuf, the current framing is insufficient. Add a varint or fixed-width length-prefix?
+> *answer:*
+### Signature Design
+**7. What does the outer SHA-256 hash cover — compressed or uncompressed data?**
+The review notes it currently hashes compressed data (good for verifying before decompression), but this should be explicitly documented. Which is the intended behavior?
+> *answer:*
+**8. Should `signatureString()` sign raw bytes instead of a hex-encoded string?**
+Currently the canonical string is `MAGIC-UUID-MULTIHASH` with hex encoding, which adds a transformation layer. Signing the raw `sha256` bytes (or compressed `innerMessage` directly) would be simpler. Keep the string format or switch to raw bytes?
+> *answer:*
+**9. Should we support detached signature files (`.mf.sig`)?**
+Embedded signatures are better for single-file distribution. Detached `.mf.sig` files follow the familiar `SHASUMS`/`SHASUMS.asc` pattern and are simpler for HTTP serving. Support both modes?
+> *answer:*
+**10. GPG vs pure-Go crypto for signatures?**
+Shelling out to `gpg` is fragile (may not be installed, version-dependent output). `github.com/ProtonMail/go-crypto` provides pure-Go OpenPGP, or we could go Ed25519/signify (simpler, no key management). Which direction?
+> *answer:*
+### Implementation Design
+**11. Should manifests be deterministic by default?**
+This means: sort file entries by path, omit `createdAt` timestamp (or make it opt-in), no `atime`. Should determinism be the default, with a `--include-timestamps` flag to opt in?
+> *answer:*
+**12. Should we consolidate or keep both scanner/checker implementations?**
+There are two parallel implementations: `mfer/scanner.go` + `mfer/checker.go` (typed with `FileSize`, `RelFilePath`) and `internal/scanner/` + `internal/checker/` (raw `int64`, `string`). The `mfer/` versions are superior. Delete the `internal/` versions?
+> *answer:*
+**13. Should the `manifest` type be exported?**
+Currently unexported with exported constructors (`New`, `NewFromPaths`, etc.). Consumers can't declare `var m *mfer.manifest`. Export the type, or define an interface?
+> *answer:*
+**14. What should the Go module path be for 1.0?**
+Currently mixed between `sneak.berlin/go/mfer` and `git.eeqj.de/sneak/mfer`. Which is canonical?
+> *answer:*
+---
+## Implementation Plan
+### Phase 1: Foundation (format correctness)
+- [ ] Delete `internal/scanner/` and `internal/checker/` — consolidate on `mfer/` package versions; update CLI code
+- [ ] Add deterministic file ordering — sort entries by path (lexicographic, byte-order) in `Builder.Build()`; add test asserting byte-identical output from two runs
+- [ ] Add decompression size limit — `io.LimitReader` in `deserializeInner()` with `m.pbOuter.Size` as bound
+- [ ] Fix `errors.Is` dead code in checker — replace with `os.IsNotExist(err)` or `errors.Is(err, fs.ErrNotExist)`
+- [ ] Fix `AddFile` to verify size — check `totalRead == size` after reading, return error on mismatch
+- [ ] Specify path invariants — add proto comments (UTF-8, forward-slash, relative, no `..`, no leading `/`); validate in `Builder.AddFile` and `Builder.AddFileWithHash`
+### Phase 2: CLI polish
+- [ ] Fix flag naming — all CLI flags use kebab-case as primary (`--include-dotfiles`, `--follow-symlinks`)
+- [ ] Fix URL construction in fetch — use `BaseURL.JoinPath()` or `url.JoinPath()` instead of string concatenation
+- [ ] Add progress rate-limiting to Checker — throttle to once per second, matching Scanner
+- [ ] Add `--deterministic` flag (or make it default) — omit `createdAt`, sort files
+### Phase 3: Robustness
+- [ ] Replace GPG subprocess with pure-Go crypto — `github.com/ProtonMail/go-crypto` or Ed25519/signify
+- [ ] Add timeout to any remaining subprocess calls
+- [ ] Add fuzzing tests for `NewManifestFromReader`
+- [ ] Add retry logic to fetch — exponential backoff for transient HTTP errors
+### Phase 4: Format finalization
+- [ ] Remove or deprecate `atime` from proto (pending design question answer)
+- [ ] Reserve `optional uint32 mode = 305` in `MFFilePath` for future file permissions
+- [ ] Add version byte after magic — `ZNAVSRFG\x01` for format version 1
+- [ ] Write format specification document — separate from README: magic, outer structure, compression, inner structure, path invariants, signature scheme, canonical serialization
+### Phase 5: Release prep
+- [ ] Finalize Go module path
+- [ ] Audit all error messages for consistency and helpfulness
+- [ ] Add `--version` output matching SemVer
+- [ ] Tag v1.0.0
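The "deterministic file ordering" item in Phase 1 (sort by path, lexicographic, byte-order) could be approached along these lines. This is only a sketch: `fileEntry` and `sortEntriesByPath` are hypothetical names, and the real `Builder.Build()` works on protobuf types not shown here. Go's `<` on strings compares byte-wise, which matches the byte-order requirement.

```go
package main

import (
	"fmt"
	"sort"
)

// fileEntry is a hypothetical stand-in for a manifest file record.
type fileEntry struct {
	Path string
	Size int64
}

// sortEntriesByPath orders entries by path using byte-wise string
// comparison, so output does not depend on enumeration order or locale.
func sortEntriesByPath(entries []fileEntry) {
	sort.Slice(entries, func(i, j int) bool {
		return entries[i].Path < entries[j].Path
	})
}

func main() {
	entries := []fileEntry{{Path: "b.txt"}, {Path: "a/z.txt"}, {Path: "B.txt"}}
	sortEntriesByPath(entries)
	for _, e := range entries {
		fmt.Println(e.Path)
	}
}
```

Note that byte order places uppercase ASCII before lowercase, so `B.txt` sorts ahead of `a/z.txt`.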

1
go.mod

@@ -6,6 +6,7 @@ require (
 	github.com/apex/log v1.9.0
 	github.com/davecgh/go-spew v1.1.1
 	github.com/dustin/go-humanize v1.0.1
+	github.com/google/uuid v1.1.2
 	github.com/klauspost/compress v1.18.2
 	github.com/multiformats/go-multihash v0.2.3
 	github.com/pterm/pterm v0.12.35

1
go.sum

@@ -135,6 +135,7 @@ github.com/google/pprof v0.0.0-20201203190320-1bf35d6f28c2/go.mod h1:kpwsk12EmLe
 github.com/google/pprof v0.0.0-20201218002935-b9804c9f04c2/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
 github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI=
 github.com/google/uuid v1.1.1/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
+github.com/google/uuid v1.1.2 h1:EVhdT+1Kseyi1/pUmXKaFxYsDNy9RQYkMWRH68J/W7Y=
 github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
 github.com/googleapis/gax-go/v2 v2.0.4/go.mod h1:0Wqv26UfaUD9n4G6kQubkQ+KchISgw+vpHVxEJEs9eg=
 github.com/googleapis/gax-go/v2 v2.0.5/go.mod h1:DWXyrwAJ9X0FpwwEdw+IPEYBICEFu5mhpdKc/us6bOk=


@@ -1,15 +1,17 @@
 package cli
 import (
+"encoding/hex"
 "fmt"
 "path/filepath"
+"strings"
 "time"
 "github.com/dustin/go-humanize"
 "github.com/spf13/afero"
 "github.com/urfave/cli/v2"
-"sneak.berlin/go/mfer/internal/checker"
 "sneak.berlin/go/mfer/internal/log"
+"sneak.berlin/go/mfer/mfer"
 )
 // findManifest looks for a manifest file in the given directory.
@@ -63,20 +65,49 @@ func (mfa *CLIApp) checkManifestOperation(ctx *cli.Context) error {
 log.Infof("checking manifest %s with base %s", manifestPath, basePath)
 // Create checker
-chk, err := checker.NewChecker(manifestPath, basePath, mfa.Fs)
+chk, err := mfer.NewChecker(manifestPath, basePath, mfa.Fs)
 if err != nil {
 return fmt.Errorf("failed to load manifest: %w", err)
 }
+// Check signature requirement
+requiredSigner := ctx.String("require-signature")
+if requiredSigner != "" {
+// Validate fingerprint format: must be exactly 40 hex characters
+if len(requiredSigner) != 40 {
+return fmt.Errorf("invalid fingerprint: must be exactly 40 hex characters, got %d", len(requiredSigner))
+}
+if _, err := hex.DecodeString(requiredSigner); err != nil {
+return fmt.Errorf("invalid fingerprint: must be valid hex: %w", err)
+}
+if !chk.IsSigned() {
+return fmt.Errorf("manifest is not signed, but signature from %s is required", requiredSigner)
+}
+// Extract fingerprint from the embedded public key (not from the signer field)
+// This validates the key is importable and gets its actual fingerprint
+embeddedFP, err := chk.ExtractEmbeddedSigningKeyFP()
+if err != nil {
+return fmt.Errorf("failed to extract fingerprint from embedded signing key: %w", err)
+}
+// Compare fingerprints - must be exact match (case-insensitive)
+if !strings.EqualFold(embeddedFP, requiredSigner) {
+return fmt.Errorf("embedded signing key fingerprint %s does not match required %s", embeddedFP, requiredSigner)
+}
+log.Infof("manifest signature verified (signer: %s)", embeddedFP)
+}
 log.Infof("manifest contains %d files, %s", chk.FileCount(), humanize.IBytes(uint64(chk.TotalBytes())))
 // Set up results channel
-results := make(chan checker.Result, 1)
+results := make(chan mfer.Result, 1)
 // Set up progress channel
-var progress chan checker.CheckStatus
+var progress chan mfer.CheckStatus
 if showProgress {
-progress = make(chan checker.CheckStatus, 1)
+progress = make(chan mfer.CheckStatus, 1)
 go func() {
 for status := range progress {
 if status.ETA > 0 {
@@ -103,7 +134,7 @@ func (mfa *CLIApp) checkManifestOperation(ctx *cli.Context) error {
 done := make(chan struct{})
 go func() {
 for result := range results {
-if result.Status != checker.StatusOK {
+if result.Status != mfer.StatusOK {
 failures++
 log.Infof("%s: %s (%s)", result.Status, result.Path, result.Message)
 } else {
@@ -124,7 +155,7 @@ func (mfa *CLIApp) checkManifestOperation(ctx *cli.Context) error {
 // Check for extra files if requested
 if ctx.Bool("no-extra-files") {
-extraResults := make(chan checker.Result, 1)
+extraResults := make(chan mfer.Result, 1)
 extraDone := make(chan struct{})
 go func() {
 for result := range extraResults {


@@ -65,9 +65,9 @@ func TestGenerateCommand(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test files in memory filesystem
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0644))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("test content"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0o644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("test content"), 0o644))
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/testdir/test.mf", "/testdir"}, fs)
@@ -85,9 +85,9 @@ func TestGenerateAndCheckCommand(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test files with subdirectory
-require.NoError(t, fs.MkdirAll("/testdir/subdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0644))
-require.NoError(t, afero.WriteFile(fs, "/testdir/subdir/file2.txt", []byte("test content"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir/subdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0o644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/subdir/file2.txt", []byte("test content"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/testdir/test.mf", "/testdir"}, fs)
@@ -104,8 +104,8 @@ func TestCheckCommandWithMissingFile(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test file
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/testdir/test.mf", "/testdir"}, fs)
@@ -125,8 +125,8 @@ func TestCheckCommandWithCorruptedFile(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test file
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/testdir/test.mf", "/testdir"}, fs)
@@ -134,7 +134,7 @@ func TestCheckCommandWithCorruptedFile(t *testing.T) {
 require.Equal(t, 0, exitCode, "generate failed: %s", opts.Stderr.(*bytes.Buffer).String())
 // Corrupt the file (change content but keep same size)
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("HELLO WORLD"), 0644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("HELLO WORLD"), 0o644))
 // Check manifest - should fail with hash mismatch
 opts = testOpts([]string{"mfer", "check", "-q", "--base", "/testdir", "/testdir/test.mf"}, fs)
@@ -146,8 +146,8 @@ func TestCheckCommandWithSizeMismatch(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test file
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello world"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/testdir/test.mf", "/testdir"}, fs)
@@ -155,7 +155,7 @@ func TestCheckCommandWithSizeMismatch(t *testing.T) {
 require.Equal(t, 0, exitCode, "generate failed: %s", opts.Stderr.(*bytes.Buffer).String())
 // Change file size
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("different size content here"), 0644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("different size content here"), 0o644))
 // Check manifest - should fail with size mismatch
 opts = testOpts([]string{"mfer", "check", "-q", "--base", "/testdir", "/testdir/test.mf"}, fs)
@@ -167,8 +167,8 @@ func TestBannerOutput(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test file
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
 // Run without -q to see banner
 opts := testOpts([]string{"mfer", "generate", "-o", "/testdir/test.mf", "/testdir"}, fs)
@@ -193,9 +193,9 @@ func TestGenerateExcludesDotfilesByDefault(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test files including dotfiles
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
-require.NoError(t, afero.WriteFile(fs, "/testdir/.hidden", []byte("secret"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/.hidden", []byte("secret"), 0o644))
 // Generate manifest without --include-dotfiles (default excludes dotfiles)
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/testdir/test.mf", "/testdir"}, fs)
@@ -217,9 +217,9 @@ func TestGenerateWithIncludeDotfiles(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test files including dotfiles
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
-require.NoError(t, afero.WriteFile(fs, "/testdir/.hidden", []byte("secret"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/.hidden", []byte("secret"), 0o644))
 // Generate manifest with --include-dotfiles
 opts := testOpts([]string{"mfer", "generate", "-q", "--include-dotfiles", "-o", "/testdir/test.mf", "/testdir"}, fs)
@@ -236,10 +236,10 @@ func TestMultipleInputPaths(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test files in multiple directories
-require.NoError(t, fs.MkdirAll("/dir1", 0755))
-require.NoError(t, fs.MkdirAll("/dir2", 0755))
-require.NoError(t, afero.WriteFile(fs, "/dir1/file1.txt", []byte("content1"), 0644))
-require.NoError(t, afero.WriteFile(fs, "/dir2/file2.txt", []byte("content2"), 0644))
+require.NoError(t, fs.MkdirAll("/dir1", 0o755))
+require.NoError(t, fs.MkdirAll("/dir2", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/dir1/file1.txt", []byte("content1"), 0o644))
+require.NoError(t, afero.WriteFile(fs, "/dir2/file2.txt", []byte("content2"), 0o644))
 // Generate manifest from multiple paths
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/output.mf", "/dir1", "/dir2"}, fs)
@@ -254,9 +254,9 @@ func TestNoExtraFilesPass(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test files
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("world"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("world"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/manifest.mf", "/testdir"}, fs)
@@ -273,8 +273,8 @@ func TestNoExtraFilesFail(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test files
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/manifest.mf", "/testdir"}, fs)
@@ -282,7 +282,7 @@ func TestNoExtraFilesFail(t *testing.T) {
 require.Equal(t, 0, exitCode)
 // Add an extra file after manifest generation
-require.NoError(t, afero.WriteFile(fs, "/testdir/extra.txt", []byte("extra"), 0644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/extra.txt", []byte("extra"), 0o644))
 // Check with --no-extra-files (should fail - extra file exists)
 opts = testOpts([]string{"mfer", "check", "-q", "--no-extra-files", "--base", "/testdir", "/manifest.mf"}, fs)
@@ -294,9 +294,9 @@ func TestNoExtraFilesWithSubdirectory(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test files with subdirectory
-require.NoError(t, fs.MkdirAll("/testdir/subdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
-require.NoError(t, afero.WriteFile(fs, "/testdir/subdir/file2.txt", []byte("world"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir/subdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/subdir/file2.txt", []byte("world"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/manifest.mf", "/testdir"}, fs)
@@ -304,7 +304,7 @@ func TestNoExtraFilesWithSubdirectory(t *testing.T) {
 require.Equal(t, 0, exitCode)
 // Add extra file in subdirectory
-require.NoError(t, afero.WriteFile(fs, "/testdir/subdir/extra.txt", []byte("extra"), 0644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/subdir/extra.txt", []byte("extra"), 0o644))
 // Check with --no-extra-files (should fail)
 opts = testOpts([]string{"mfer", "check", "-q", "--no-extra-files", "--base", "/testdir", "/manifest.mf"}, fs)
@@ -316,8 +316,8 @@ func TestCheckWithoutNoExtraFilesIgnoresExtra(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test file
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/manifest.mf", "/testdir"}, fs)
@@ -325,7 +325,7 @@ func TestCheckWithoutNoExtraFilesIgnoresExtra(t *testing.T) {
 require.Equal(t, 0, exitCode)
 // Add extra file
-require.NoError(t, afero.WriteFile(fs, "/testdir/extra.txt", []byte("extra"), 0644))
+require.NoError(t, afero.WriteFile(fs, "/testdir/extra.txt", []byte("extra"), 0o644))
 // Check WITHOUT --no-extra-files (should pass - extra files ignored)
 opts = testOpts([]string{"mfer", "check", "-q", "--base", "/testdir", "/manifest.mf"}, fs)
@@ -337,8 +337,8 @@ func TestGenerateAtomicWriteNoTempFileOnSuccess(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test file
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/output.mf", "/testdir"}, fs)
@@ -360,11 +360,11 @@ func TestGenerateAtomicWriteOverwriteWithForce(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test file
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
 // Create existing manifest with different content
-require.NoError(t, afero.WriteFile(fs, "/output.mf", []byte("old content"), 0644))
+require.NoError(t, afero.WriteFile(fs, "/output.mf", []byte("old content"), 0o644))
 // Generate manifest with --force
 opts := testOpts([]string{"mfer", "generate", "-q", "-f", "-o", "/output.mf", "/testdir"}, fs)
@@ -386,11 +386,11 @@ func TestGenerateFailsWithoutForceWhenOutputExists(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test file
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
 // Create existing manifest
-require.NoError(t, afero.WriteFile(fs, "/output.mf", []byte("existing"), 0644))
+require.NoError(t, afero.WriteFile(fs, "/output.mf", []byte("existing"), 0o644))
 // Generate manifest WITHOUT --force (should fail)
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/output.mf", "/testdir"}, fs)
@@ -411,8 +411,8 @@ func TestGenerateAtomicWriteUsesTemp(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create test file
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0644))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("hello"), 0o644))
 // Generate manifest
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/output.mf", "/testdir"}, fs)
@@ -464,8 +464,8 @@ func TestGenerateAtomicWriteCleansUpOnError(t *testing.T) {
 baseFs := afero.NewMemMapFs()
 // Create test files - need enough content to trigger the write failure
-require.NoError(t, baseFs.MkdirAll("/testdir", 0755))
-require.NoError(t, afero.WriteFile(baseFs, "/testdir/file1.txt", []byte("hello world this is a test file"), 0644))
+require.NoError(t, baseFs.MkdirAll("/testdir", 0o755))
+require.NoError(t, afero.WriteFile(baseFs, "/testdir/file1.txt", []byte("hello world this is a test file"), 0o644))
 // Wrap with failing writer that fails after writing some bytes
 fs := &failingWriterFs{Fs: baseFs, failAfter: 10}
@@ -489,8 +489,8 @@ func TestGenerateValidatesInputPaths(t *testing.T) {
 fs := afero.NewMemMapFs()
 // Create one valid directory
-require.NoError(t, fs.MkdirAll("/validdir", 0755))
-require.NoError(t, afero.WriteFile(fs, "/validdir/file.txt", []byte("content"), 0644))
+require.NoError(t, fs.MkdirAll("/validdir", 0o755))
+require.NoError(t, afero.WriteFile(fs, "/validdir/file.txt", []byte("content"), 0o644))
 t.Run("nonexistent path fails fast", func(t *testing.T) {
 opts := testOpts([]string{"mfer", "generate", "-q", "-o", "/output.mf", "/nonexistent"}, fs)
@@ -527,7 +527,7 @@ func TestCheckDetectsManifestCorruption(t *testing.T) {
 // Create many small files with random names to generate a ~1MB manifest
 // Each manifest entry is roughly 50-60 bytes, so we need ~20000 files
-require.NoError(t, fs.MkdirAll("/testdir", 0755))
+require.NoError(t, fs.MkdirAll("/testdir", 0o755))
 numFiles := 20000
 for i := 0; i < numFiles; i++ {
@@ -536,7 +536,7 @@ func TestCheckDetectsManifestCorruption(t *testing.T) {
 // Small random content
 content := make([]byte, 16+rng.Intn(48))
 rng.Read(content)
-require.NoError(t, afero.WriteFile(fs, filename, content, 0644))
+require.NoError(t, afero.WriteFile(fs, filename, content, 0o644))
 }
 // Generate manifest outside of testdir
@@ -551,7 +551,7 @@ func TestCheckDetectsManifestCorruption(t *testing.T) {
 t.Logf("manifest size: %d bytes (%d files)", len(validManifest), numFiles)
 // First corruption: truncate the manifest
-require.NoError(t, afero.WriteFile(fs, "/manifest.mf", validManifest[:len(validManifest)/2], 0644))
+require.NoError(t, afero.WriteFile(fs, "/manifest.mf", validManifest[:len(validManifest)/2], 0o644))
 // Check should fail with truncated manifest
 opts = testOpts([]string{"mfer", "check", "-q", "--base", "/testdir", "/manifest.mf"}, fs)
@@ -559,7 +559,7 @@ func TestCheckDetectsManifestCorruption(t *testing.T) {
 assert.Equal(t, 1, exitCode, "check should fail with truncated manifest")
 // Verify check passes with valid manifest
-require.NoError(t, afero.WriteFile(fs, "/manifest.mf", validManifest, 0644))
+require.NoError(t, afero.WriteFile(fs, "/manifest.mf", validManifest, 0o644))
 opts = testOpts([]string{"mfer", "check", "-q", "--base", "/testdir", "/manifest.mf"}, fs)
 exitCode = RunWithOptions(opts)
 require.Equal(t, 0, exitCode, "check should pass with valid manifest")
@@ -579,7 +579,7 @@ func TestCheckDetectsManifestCorruption(t *testing.T) {
 }
 corrupted[offset] = newByte
-require.NoError(t, afero.WriteFile(fs, "/manifest.mf", corrupted, 0644))
+require.NoError(t, afero.WriteFile(fs, "/manifest.mf", corrupted, 0o644))
 // Check should fail with corrupted manifest
 opts = testOpts([]string{"mfer", "check", "-q", "--base", "/testdir", "/manifest.mf"}, fs)
@@ -588,6 +588,6 @@ func TestCheckDetectsManifestCorruption(t *testing.T) {
 i, offset, originalByte, newByte)
 // Restore valid manifest for next iteration
-require.NoError(t, afero.WriteFile(fs, "/manifest.mf", validManifest, 0644))
+require.NoError(t, afero.WriteFile(fs, "/manifest.mf", validManifest, 0o644))
 }
 }
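
Every change in this test file swaps a legacy octal literal (`0755`, `0644`) for the `0o` prefix form, which Go has accepted since version 1.13. The two spellings denote the same value, so the change is purely stylistic. A small sketch to confirm that, using a hypothetical `modeString` helper:

```go
package main

import (
	"fmt"
	"os"
)

// modeString renders a permission value the way `ls -l` would show it.
// Hypothetical helper used only for this demonstration.
func modeString(m os.FileMode) string {
	return m.String()
}

func main() {
	// Both literals are base 8; only the prefix differs.
	fmt.Println(0755 == 0o755, 0644 == 0o644)

	// The rendered permission bits are identical for either spelling.
	fmt.Println(modeString(0755), modeString(0o755))
	fmt.Println(modeString(0644), modeString(0o644))
}
```

The `0o` form makes the base explicit, which avoids misreading `0755` as a decimal number with a stray leading zero.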


@@ -113,7 +113,7 @@ func (mfa *CLIApp) fetchManifestOperation(ctx *cli.Context) error {
 return fmt.Errorf("invalid path in manifest: %w", err)
 }
-fileURL := baseURL.String() + f.Path
+fileURL := baseURL.String() + encodeFilePath(f.Path)
 log.Infof("fetching %s", f.Path)
 if err := downloadFile(fileURL, localPath, f, progress); err != nil {
@@ -139,6 +139,15 @@ func (mfa *CLIApp) fetchManifestOperation(ctx *cli.Context) error {
 return nil
 }
+// encodeFilePath URL-encodes each segment of a file path while preserving slashes.
+func encodeFilePath(p string) string {
+segments := strings.Split(p, "/")
+for i, seg := range segments {
+segments[i] = url.PathEscape(seg)
+}
+return strings.Join(segments, "/")
+}
 // sanitizePath validates and sanitizes a file path from the manifest.
 // It prevents path traversal attacks and rejects unsafe paths.
 func sanitizePath(p string) (string, error) {
@@ -257,7 +266,7 @@ func downloadFile(fileURL, localPath string, entry *mfer.MFFilePath, progress ch
 // Create parent directories if needed
 dir := filepath.Dir(localPath)
 if dir != "" && dir != "." {
-if err := os.MkdirAll(dir, 0755); err != nil {
+if err := os.MkdirAll(dir, 0o755); err != nil {
 return err
 }
 }
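
To see why the fetch URL needs per-segment escaping, it helps to parse the URL both ways. The sketch below copies `encodeFilePath` from the diff into a standalone program; the base URL and file name are made-up values chosen for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// encodeFilePath URL-encodes each segment of a file path while preserving
// slashes (same logic as the function added in the diff).
func encodeFilePath(p string) string {
	segments := strings.Split(p, "/")
	for i, seg := range segments {
		segments[i] = url.PathEscape(seg)
	}
	return strings.Join(segments, "/")
}

func main() {
	base := "https://example.org/files/" // hypothetical base URL
	name := "file#1.txt"                 // file name containing a reserved character

	// Naive concatenation: everything after '#' is parsed as a fragment.
	naive, err := url.Parse(base + name)
	if err != nil {
		panic(err)
	}
	fmt.Printf("naive:   path=%q fragment=%q\n", naive.Path, naive.Fragment)

	// Escaped concatenation: the full file name stays in the path.
	escaped, err := url.Parse(base + encodeFilePath(name))
	if err != nil {
		panic(err)
	}
	fmt.Printf("escaped: path=%q fragment=%q\n", escaped.Path, escaped.Fragment)
}
```

With naive concatenation the parsed path stops at `/files/file` and `1.txt` becomes the fragment, which an HTTP client does not send to the server. With escaping, the decoded path is `/files/file#1.txt`, so the request targets the intended file. Splitting on `/` first matters because `url.PathEscape` would otherwise encode the separators as `%2F`.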


@@ -13,10 +13,32 @@ import (
 "github.com/spf13/afero"
 "github.com/stretchr/testify/assert"
 "github.com/stretchr/testify/require"
-"sneak.berlin/go/mfer/internal/scanner"
 "sneak.berlin/go/mfer/mfer"
 )
+func TestEncodeFilePath(t *testing.T) {
+tests := []struct {
+input string
+expected string
+}{
+{"file.txt", "file.txt"},
+{"dir/file.txt", "dir/file.txt"},
+{"my file.txt", "my%20file.txt"},
+{"dir/my file.txt", "dir/my%20file.txt"},
+{"file#1.txt", "file%231.txt"},
+{"file?v=1.txt", "file%3Fv=1.txt"},
+{"path/to/file with spaces.txt", "path/to/file%20with%20spaces.txt"},
+{"100%done.txt", "100%25done.txt"},
+}
+for _, tt := range tests {
+t.Run(tt.input, func(t *testing.T) {
+result := encodeFilePath(tt.input)
+assert.Equal(t, tt.expected, result)
+})
+}
+}
 func TestSanitizePath(t *testing.T) {
 // Valid paths that should be accepted
 validTests := []struct {
@@ -107,15 +129,15 @@ func TestFetchFromHTTP(t *testing.T) {
 for path, content := range testFiles {
 fullPath := "/" + path // MemMapFs needs absolute paths
 dir := filepath.Dir(fullPath)
-require.NoError(t, sourceFs.MkdirAll(dir, 0755))
-require.NoError(t, afero.WriteFile(sourceFs, fullPath, content, 0644))
+require.NoError(t, sourceFs.MkdirAll(dir, 0o755))
+require.NoError(t, afero.WriteFile(sourceFs, fullPath, content, 0o644))
 }
 // Generate manifest using scanner
-opts := &scanner.Options{
+opts := &mfer.ScannerOptions{
 Fs: sourceFs,
 }
-s := scanner.NewWithOptions(opts)
+s := mfer.NewScannerWithOptions(opts)
 require.NoError(t, s.EnumerateFS(sourceFs, "/", nil))
 var manifestBuf bytes.Buffer
@@ -197,11 +219,11 @@ func TestFetchHashMismatch(t *testing.T) {
 // Create source filesystem with a test file
 sourceFs := afero.NewMemMapFs()
 originalContent := []byte("Original content")
-require.NoError(t, afero.WriteFile(sourceFs, "/file.txt", originalContent, 0644))
+require.NoError(t, afero.WriteFile(sourceFs, "/file.txt", originalContent, 0o644))
 // Generate manifest
-opts := &scanner.Options{Fs: sourceFs}
-s := scanner.NewWithOptions(opts)
+opts := &mfer.ScannerOptions{Fs: sourceFs}
+s := mfer.NewScannerWithOptions(opts)
 require.NoError(t, s.EnumerateFS(sourceFs, "/", nil))
 var manifestBuf bytes.Buffer
@@ -249,11 +271,11 @@ func TestFetchSizeMismatch(t *testing.T) {
 // Create source filesystem with a test file
 sourceFs := afero.NewMemMapFs()
 originalContent := []byte("Original content with specific size")
-require.NoError(t, afero.WriteFile(sourceFs, "/file.txt", originalContent, 0644))
+require.NoError(t, afero.WriteFile(sourceFs, "/file.txt", originalContent, 0o644))
 // Generate manifest
-opts := &scanner.Options{Fs: sourceFs}
-s := scanner.NewWithOptions(opts)
+opts := &mfer.ScannerOptions{Fs: sourceFs}
+s := mfer.NewScannerWithOptions(opts)
 require.NoError(t, s.EnumerateFS(sourceFs, "/", nil))
 var manifestBuf bytes.Buffer
@@ -298,11 +320,11 @@ func TestFetchProgress(t *testing.T) {
 sourceFs := afero.NewMemMapFs()
 // Create content large enough to trigger multiple progress updates
 content := bytes.Repeat([]byte("x"), 100*1024) // 100KB
-require.NoError(t, afero.WriteFile(sourceFs, "/large.txt", content, 0644))
+require.NoError(t, afero.WriteFile(sourceFs, "/large.txt", content, 0o644))
 // Generate manifest
-opts := &scanner.Options{Fs: sourceFs}
-s := scanner.NewWithOptions(opts)
+opts := &mfer.ScannerOptions{Fs: sourceFs}
+s := mfer.NewScannerWithOptions(opts)
 require.NoError(t, s.EnumerateFS(sourceFs, "/", nil))
 var manifestBuf bytes.Buffer


@@ -113,7 +113,7 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
} }
// Handle dotfiles // Handle dotfiles
if !includeDotfiles && pathIsHidden(relPath) { if !includeDotfiles && mfer.IsHiddenPath(filepath.ToSlash(relPath)) {
if info.IsDir() { if info.IsDir() {
return filepath.SkipDir return filepath.SkipDir
} }
@@ -227,6 +227,14 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
builder := mfer.NewBuilder() builder := mfer.NewBuilder()
// Set up signing options if sign-key is provided
if signKey := ctx.String("sign-key"); signKey != "" {
builder.SetSigningOptions(&mfer.SigningOptions{
KeyID: mfer.GPGKeyID(signKey),
})
log.Infof("signing manifest with GPG key: %s", signKey)
}
for _, e := range entries { for _, e := range entries {
select { select {
case <-ctx.Done(): case <-ctx.Done():
@@ -274,10 +282,14 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
hashedFiles++ hashedFiles++
// Add to builder with computed hash // Add to builder with computed hash
addFileToBuilder(builder, e.path, e.size, e.mtime, hash) if err := addFileToBuilder(builder, e.path, e.size, e.mtime, hash); err != nil {
return fmt.Errorf("failed to add %s: %w", e.path, err)
}
} else { } else {
// Use existing entry // Use existing entry
addExistingToBuilder(builder, e.existing) if err := addExistingToBuilder(builder, e.existing); err != nil {
return fmt.Errorf("failed to add %s: %w", e.path, err)
}
} }
} }
@@ -360,38 +372,15 @@ func hashFile(r io.Reader, size int64, progress func(int64)) ([]byte, int64, err
} }
// addFileToBuilder adds a new file entry to the builder // addFileToBuilder adds a new file entry to the builder
func addFileToBuilder(b *mfer.Builder, path string, size int64, mtime time.Time, hash []byte) { func addFileToBuilder(b *mfer.Builder, path string, size int64, mtime time.Time, hash []byte) error {
// Use the builder's internal method indirectly by creating an entry return b.AddFileWithHash(mfer.RelFilePath(path), mfer.FileSize(size), mfer.ModTime(mtime), hash)
// Since Builder.AddFile reads from a reader, we need to use a different approach
// We'll access the builder's files directly through a custom method
b.AddFileWithHash(path, size, mtime, hash)
} }
// addExistingToBuilder adds an existing manifest entry to the builder // addExistingToBuilder adds an existing manifest entry to the builder
func addExistingToBuilder(b *mfer.Builder, entry *mfer.MFFilePath) { func addExistingToBuilder(b *mfer.Builder, entry *mfer.MFFilePath) error {
mtime := time.Unix(entry.Mtime.Seconds, int64(entry.Mtime.Nanos)) mtime := time.Unix(entry.Mtime.Seconds, int64(entry.Mtime.Nanos))
if len(entry.Hashes) > 0 { if len(entry.Hashes) == 0 {
b.AddFileWithHash(entry.Path, entry.Size, mtime, entry.Hashes[0].MultiHash) return nil
} }
} return b.AddFileWithHash(mfer.RelFilePath(entry.Path), mfer.FileSize(entry.Size), mfer.ModTime(mtime), entry.Hashes[0].MultiHash)
// pathIsHidden checks if a path contains hidden components
func pathIsHidden(p string) bool {
// "." is not hidden, it's the current directory
if p == "." {
return false
}
// Check each path component
for p != "" && p != "." && p != "/" {
base := filepath.Base(p)
if len(base) > 0 && base[0] == '.' {
return true
}
parent := filepath.Dir(p)
if parent == p {
break
}
p = parent
}
return false
} }
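
Note on the hunk above: the CLI-local `pathIsHidden` helper is removed and the call site now passes `filepath.ToSlash(relPath)` to `mfer.IsHiddenPath`. The body of `mfer.IsHiddenPath` is not part of this diff, so the following is only a minimal sketch of how a segment-based check on forward-slash paths could look. The function name `isHiddenPath` and its exact rules are assumptions modelled on the removed helper, not the library's code.

```go
package main

import (
	"fmt"
	"strings"
)

// isHiddenPath is a hypothetical stand-in for mfer.IsHiddenPath.
// It reports whether any segment of a forward-slash path starts with a dot.
// Like the removed pathIsHidden helper, a bare "." segment is not hidden.
func isHiddenPath(p string) bool {
	for _, seg := range strings.Split(p, "/") {
		if seg == "" || seg == "." {
			continue // skip empty segments and the current-directory marker
		}
		if strings.HasPrefix(seg, ".") {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{".", "a/b.txt", ".git/config", "a/.cache/x"} {
		fmt.Printf("%q hidden=%v\n", p, isHiddenPath(p))
	}
}
```

Splitting on `/` only is why the caller converts with `filepath.ToSlash` first: on Windows the walked path uses backslashes, and without the conversion the whole path would be treated as a single segment.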


@@ -8,7 +8,6 @@ import (
"github.com/spf13/afero" "github.com/spf13/afero"
"github.com/stretchr/testify/assert" "github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require" "github.com/stretchr/testify/require"
"sneak.berlin/go/mfer/internal/scanner"
"sneak.berlin/go/mfer/mfer" "sneak.berlin/go/mfer/mfer"
) )
@@ -16,20 +15,20 @@ func TestFreshenUnchanged(t *testing.T) {
// Create filesystem with test files // Create filesystem with test files
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir", 0755)) require.NoError(t, fs.MkdirAll("/testdir", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("content1"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("content1"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("content2"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("content2"), 0o644))
// Generate initial manifest // Generate initial manifest
opts := &scanner.Options{Fs: fs} opts := &mfer.ScannerOptions{Fs: fs}
s := scanner.NewWithOptions(opts) s := mfer.NewScannerWithOptions(opts)
require.NoError(t, s.EnumeratePath("/testdir", nil)) require.NoError(t, s.EnumeratePath("/testdir", nil))
var manifestBuf bytes.Buffer var manifestBuf bytes.Buffer
require.NoError(t, s.ToManifest(context.Background(), &manifestBuf, nil)) require.NoError(t, s.ToManifest(context.Background(), &manifestBuf, nil))
// Write manifest to filesystem // Write manifest to filesystem
require.NoError(t, afero.WriteFile(fs, "/testdir/.index.mf", manifestBuf.Bytes(), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/.index.mf", manifestBuf.Bytes(), 0o644))
// Parse manifest to verify // Parse manifest to verify
manifest, err := mfer.NewManifestFromFile(fs, "/testdir/.index.mf") manifest, err := mfer.NewManifestFromFile(fs, "/testdir/.index.mf")
@@ -41,20 +40,20 @@ func TestFreshenWithChanges(t *testing.T) {
// Create filesystem with test files // Create filesystem with test files
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir", 0755)) require.NoError(t, fs.MkdirAll("/testdir", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("content1"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("content1"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("content2"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("content2"), 0o644))
// Generate initial manifest // Generate initial manifest
opts := &scanner.Options{Fs: fs} opts := &mfer.ScannerOptions{Fs: fs}
s := scanner.NewWithOptions(opts) s := mfer.NewScannerWithOptions(opts)
require.NoError(t, s.EnumeratePath("/testdir", nil)) require.NoError(t, s.EnumeratePath("/testdir", nil))
var manifestBuf bytes.Buffer var manifestBuf bytes.Buffer
require.NoError(t, s.ToManifest(context.Background(), &manifestBuf, nil)) require.NoError(t, s.ToManifest(context.Background(), &manifestBuf, nil))
// Write manifest to filesystem // Write manifest to filesystem
require.NoError(t, afero.WriteFile(fs, "/testdir/.index.mf", manifestBuf.Bytes(), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/.index.mf", manifestBuf.Bytes(), 0o644))
// Verify initial manifest has 2 files // Verify initial manifest has 2 files
manifest, err := mfer.NewManifestFromFile(fs, "/testdir/.index.mf") manifest, err := mfer.NewManifestFromFile(fs, "/testdir/.index.mf")
@@ -62,10 +61,10 @@ func TestFreshenWithChanges(t *testing.T) {
assert.Len(t, manifest.Files(), 2) assert.Len(t, manifest.Files(), 2)
// Add a new file // Add a new file
require.NoError(t, afero.WriteFile(fs, "/testdir/file3.txt", []byte("content3"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file3.txt", []byte("content3"), 0o644))
// Modify file2 (change content and size) // Modify file2 (change content and size)
require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("modified content2"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("modified content2"), 0o644))
// Remove file1 // Remove file1
require.NoError(t, fs.Remove("/testdir/file1.txt")) require.NoError(t, fs.Remove("/testdir/file1.txt"))
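
Note on the `0644` to `0o644` changes in these tests: this is a spelling change only. Go accepts the `0o` octal prefix since Go 1.13, and both literals denote the same value. A small sketch to confirm this (the helper names are made up for illustration):

```go
package main

import (
	"fmt"
	"os"
)

// legacyMode uses the older leading-zero octal spelling.
func legacyMode() os.FileMode { return 0644 }

// prefixedMode uses the explicit 0o octal prefix (Go 1.13 and later).
func prefixedMode() os.FileMode { return 0o644 }

func main() {
	// Both literals are the same number, so the permissions are identical.
	fmt.Println(legacyMode() == prefixedMode(), prefixedMode())
}
```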


@@ -13,29 +13,37 @@ import (
"github.com/spf13/afero" "github.com/spf13/afero"
"github.com/urfave/cli/v2" "github.com/urfave/cli/v2"
"sneak.berlin/go/mfer/internal/log" "sneak.berlin/go/mfer/internal/log"
"sneak.berlin/go/mfer/internal/scanner" "sneak.berlin/go/mfer/mfer"
) )
func (mfa *CLIApp) generateManifestOperation(ctx *cli.Context) error { func (mfa *CLIApp) generateManifestOperation(ctx *cli.Context) error {
log.Debug("generateManifestOperation()") log.Debug("generateManifestOperation()")
opts := &scanner.Options{ opts := &mfer.ScannerOptions{
IncludeDotfiles: ctx.Bool("IncludeDotfiles"), IncludeDotfiles: ctx.Bool("IncludeDotfiles"),
FollowSymLinks: ctx.Bool("FollowSymLinks"), FollowSymLinks: ctx.Bool("FollowSymLinks"),
Fs: mfa.Fs, Fs: mfa.Fs,
} }
s := scanner.NewWithOptions(opts) // Set up signing options if sign-key is provided
if signKey := ctx.String("sign-key"); signKey != "" {
opts.SigningOptions = &mfer.SigningOptions{
KeyID: mfer.GPGKeyID(signKey),
}
log.Infof("signing manifest with GPG key: %s", signKey)
}
s := mfer.NewScannerWithOptions(opts)
// Phase 1: Enumeration - collect paths and stat files // Phase 1: Enumeration - collect paths and stat files
args := ctx.Args() args := ctx.Args()
showProgress := ctx.Bool("progress") showProgress := ctx.Bool("progress")
// Set up enumeration progress reporting // Set up enumeration progress reporting
var enumProgress chan scanner.EnumerateStatus var enumProgress chan mfer.EnumerateStatus
var enumWg sync.WaitGroup var enumWg sync.WaitGroup
if showProgress { if showProgress {
enumProgress = make(chan scanner.EnumerateStatus, 1) enumProgress = make(chan mfer.EnumerateStatus, 1)
enumWg.Add(1) enumWg.Add(1)
go func() { go func() {
defer enumWg.Done() defer enumWg.Done()
@@ -117,10 +125,10 @@ func (mfa *CLIApp) generateManifestOperation(ctx *cli.Context) error {
}() }()
// Phase 2: Scan - read file contents and generate manifest // Phase 2: Scan - read file contents and generate manifest
var scanProgress chan scanner.ScanStatus var scanProgress chan mfer.ScanStatus
var scanWg sync.WaitGroup var scanWg sync.WaitGroup
if showProgress { if showProgress {
scanProgress = make(chan scanner.ScanStatus, 1) scanProgress = make(chan mfer.ScanStatus, 1)
scanWg.Add(1) scanWg.Add(1)
go func() { go func() {
defer scanWg.Done() defer scanWg.Done()


@@ -148,6 +148,12 @@ func (mfa *CLIApp) run(args []string) {
Aliases: []string{"P"}, Aliases: []string{"P"},
Usage: "Show progress during enumeration and scanning", Usage: "Show progress during enumeration and scanning",
}, },
&cli.StringFlag{
Name: "sign-key",
Aliases: []string{"s"},
Usage: "GPG key ID to sign the manifest with",
EnvVars: []string{"MFER_SIGN_KEY"},
},
), ),
}, },
{ {
@@ -175,6 +181,12 @@ func (mfa *CLIApp) run(args []string) {
Name: "no-extra-files", Name: "no-extra-files",
Usage: "Fail if files exist in base directory that are not in manifest", Usage: "Fail if files exist in base directory that are not in manifest",
}, },
&cli.StringFlag{
Name: "require-signature",
Aliases: []string{"S"},
Usage: "Require manifest to be signed by the specified GPG key ID",
EnvVars: []string{"MFER_REQUIRE_SIGNATURE"},
},
), ),
}, },
{ {
@@ -208,6 +220,12 @@ func (mfa *CLIApp) run(args []string) {
Aliases: []string{"P"}, Aliases: []string{"P"},
Usage: "Show progress during scanning and hashing", Usage: "Show progress during scanning and hashing",
}, },
&cli.StringFlag{
Name: "sign-key",
Aliases: []string{"s"},
Usage: "GPG key ID to sign the manifest with",
EnvVars: []string{"MFER_SIGN_KEY"},
},
), ),
}, },
{ {


@@ -2,16 +2,84 @@ package mfer
import ( import (
"crypto/sha256" "crypto/sha256"
"errors"
"fmt"
"io" "io"
"strings"
"sync" "sync"
"time" "time"
"unicode/utf8"
"github.com/multiformats/go-multihash" "github.com/multiformats/go-multihash"
) )
// ValidatePath checks that a file path conforms to manifest path invariants:
// - Must be valid UTF-8
// - Must use forward slashes only (no backslashes)
// - Must be relative (no leading /)
// - Must not contain ".." segments
// - Must not contain empty segments (no "//")
// - Must not be empty
func ValidatePath(p string) error {
if p == "" {
return errors.New("path cannot be empty")
}
if !utf8.ValidString(p) {
return fmt.Errorf("path %q is not valid UTF-8", p)
}
if strings.ContainsRune(p, '\\') {
return fmt.Errorf("path %q contains backslash; use forward slashes only", p)
}
if strings.HasPrefix(p, "/") {
return fmt.Errorf("path %q is absolute; must be relative", p)
}
for _, seg := range strings.Split(p, "/") {
if seg == "" {
return fmt.Errorf("path %q contains empty segment", p)
}
if seg == ".." {
return fmt.Errorf("path %q contains '..' segment", p)
}
}
return nil
}
// RelFilePath represents a relative file path within a manifest.
type RelFilePath string
// AbsFilePath represents an absolute file path on the filesystem.
type AbsFilePath string
// FileSize represents the size of a file in bytes.
type FileSize int64
// FileCount represents a count of files.
type FileCount int64
// ModTime represents a file's modification time.
type ModTime time.Time
// UnixSeconds represents seconds since Unix epoch.
type UnixSeconds int64
// UnixNanos represents the nanosecond component of a timestamp (0-999999999).
type UnixNanos int32
// Timestamp converts ModTime to a protobuf Timestamp.
func (m ModTime) Timestamp() *Timestamp {
t := time.Time(m)
return &Timestamp{
Seconds: t.Unix(),
Nanos: int32(t.Nanosecond()),
}
}
// Multihash represents a multihash-encoded file hash (typically SHA2-256).
type Multihash []byte
// FileHashProgress reports progress during file hashing. // FileHashProgress reports progress during file hashing.
type FileHashProgress struct { type FileHashProgress struct {
BytesRead int64 // Total bytes read so far for the current file BytesRead FileSize // Total bytes read so far for the current file
} }
// Builder constructs a manifest by adding files one at a time. // Builder constructs a manifest by adding files one at a time.
@@ -19,6 +87,7 @@ type Builder struct {
mu sync.Mutex mu sync.Mutex
files []*MFFilePath files []*MFFilePath
createdAt time.Time createdAt time.Time
signingOptions *SigningOptions
} }
// NewBuilder creates a new Builder. // NewBuilder creates a new Builder.
@@ -33,24 +102,28 @@ func NewBuilder() *Builder {
// Progress updates are sent to the progress channel (if non-nil) without blocking. // Progress updates are sent to the progress channel (if non-nil) without blocking.
// Returns the number of bytes read. // Returns the number of bytes read.
func (b *Builder) AddFile( func (b *Builder) AddFile(
path string, path RelFilePath,
size int64, size FileSize,
mtime time.Time, mtime ModTime,
reader io.Reader, reader io.Reader,
progress chan<- FileHashProgress, progress chan<- FileHashProgress,
) (int64, error) { ) (FileSize, error) {
if err := ValidatePath(string(path)); err != nil {
return 0, err
}
// Create hash writer // Create hash writer
h := sha256.New() h := sha256.New()
// Read file in chunks, updating hash and progress // Read file in chunks, updating hash and progress
var totalRead int64 var totalRead FileSize
buf := make([]byte, 64*1024) // 64KB chunks buf := make([]byte, 64*1024) // 64KB chunks
for { for {
n, err := reader.Read(buf) n, err := reader.Read(buf)
if n > 0 { if n > 0 {
h.Write(buf[:n]) h.Write(buf[:n])
totalRead += int64(n) totalRead += FileSize(n)
sendFileHashProgress(progress, FileHashProgress{BytesRead: totalRead}) sendFileHashProgress(progress, FileHashProgress{BytesRead: totalRead})
} }
if err == io.EOF { if err == io.EOF {
@@ -61,6 +134,11 @@ func (b *Builder) AddFile(
} }
} }
// Verify actual bytes read matches declared size
if totalRead != size {
return totalRead, fmt.Errorf("size mismatch for %q: declared %d bytes but read %d bytes", path, size, totalRead)
}
// Encode hash as multihash (SHA2-256) // Encode hash as multihash (SHA2-256)
mh, err := multihash.Encode(h.Sum(nil), multihash.SHA2_256) mh, err := multihash.Encode(h.Sum(nil), multihash.SHA2_256)
if err != nil { if err != nil {
@@ -69,12 +147,12 @@ func (b *Builder) AddFile(
// Create file entry // Create file entry
entry := &MFFilePath{ entry := &MFFilePath{
Path: path, Path: string(path),
Size: size, Size: int64(size),
Hashes: []*MFFileChecksum{ Hashes: []*MFFileChecksum{
{MultiHash: mh}, {MultiHash: mh},
}, },
Mtime: newTimestampFromTime(mtime), Mtime: mtime.Timestamp(),
} }
b.mu.Lock() b.mu.Lock()
@@ -104,19 +182,39 @@ func (b *Builder) FileCount() int {
// AddFileWithHash adds a file entry with a pre-computed hash. // AddFileWithHash adds a file entry with a pre-computed hash.
// This is useful when the hash is already known (e.g., from an existing manifest). // This is useful when the hash is already known (e.g., from an existing manifest).
func (b *Builder) AddFileWithHash(path string, size int64, mtime time.Time, hash []byte) { // Returns an error if path is empty, size is negative, or hash is nil/empty.
func (b *Builder) AddFileWithHash(path RelFilePath, size FileSize, mtime ModTime, hash Multihash) error {
if err := ValidatePath(string(path)); err != nil {
return err
}
if size < 0 {
return errors.New("size cannot be negative")
}
if len(hash) == 0 {
return errors.New("hash cannot be nil or empty")
}
entry := &MFFilePath{ entry := &MFFilePath{
Path: path, Path: string(path),
Size: size, Size: int64(size),
Hashes: []*MFFileChecksum{ Hashes: []*MFFileChecksum{
{MultiHash: hash}, {MultiHash: hash},
}, },
Mtime: newTimestampFromTime(mtime), Mtime: mtime.Timestamp(),
} }
b.mu.Lock() b.mu.Lock()
b.files = append(b.files, entry) b.files = append(b.files, entry)
b.mu.Unlock() b.mu.Unlock()
return nil
}
// SetSigningOptions sets the GPG signing options for the manifest.
// If opts is non-nil, the manifest will be signed when Build() is called.
func (b *Builder) SetSigningOptions(opts *SigningOptions) {
b.mu.Lock()
defer b.mu.Unlock()
b.signingOptions = opts
} }
// Build finalizes the manifest and writes it to the writer. // Build finalizes the manifest and writes it to the writer.
@@ -134,6 +232,7 @@ func (b *Builder) Build(w io.Writer) error {
// Create a temporary manifest to use existing serialization // Create a temporary manifest to use existing serialization
m := &manifest{ m := &manifest{
pbInner: inner, pbInner: inner,
signingOptions: b.signingOptions,
} }
// Generate outer wrapper // Generate outer wrapper

mfer/builder_test.go (new file, 127 lines)

@@ -0,0 +1,127 @@
package mfer
import (
"bytes"
"strings"
"testing"
"time"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestNewBuilder(t *testing.T) {
b := NewBuilder()
assert.NotNil(t, b)
assert.Equal(t, 0, b.FileCount())
}
func TestBuilderAddFile(t *testing.T) {
b := NewBuilder()
content := []byte("test content")
reader := bytes.NewReader(content)
bytesRead, err := b.AddFile("test.txt", FileSize(len(content)), ModTime(time.Now()), reader, nil)
require.NoError(t, err)
assert.Equal(t, FileSize(len(content)), bytesRead)
assert.Equal(t, 1, b.FileCount())
}
func TestBuilderAddFileWithHash(t *testing.T) {
b := NewBuilder()
hash := make([]byte, 34) // SHA256 multihash is 34 bytes
err := b.AddFileWithHash("test.txt", 100, ModTime(time.Now()), hash)
require.NoError(t, err)
assert.Equal(t, 1, b.FileCount())
}
func TestBuilderAddFileWithHashValidation(t *testing.T) {
t.Run("empty path", func(t *testing.T) {
b := NewBuilder()
hash := make([]byte, 34)
err := b.AddFileWithHash("", 100, ModTime(time.Now()), hash)
assert.Error(t, err)
assert.Contains(t, err.Error(), "path")
})
t.Run("negative size", func(t *testing.T) {
b := NewBuilder()
hash := make([]byte, 34)
err := b.AddFileWithHash("test.txt", -1, ModTime(time.Now()), hash)
assert.Error(t, err)
assert.Contains(t, err.Error(), "size")
})
t.Run("nil hash", func(t *testing.T) {
b := NewBuilder()
err := b.AddFileWithHash("test.txt", 100, ModTime(time.Now()), nil)
assert.Error(t, err)
assert.Contains(t, err.Error(), "hash")
})
t.Run("empty hash", func(t *testing.T) {
b := NewBuilder()
err := b.AddFileWithHash("test.txt", 100, ModTime(time.Now()), []byte{})
assert.Error(t, err)
assert.Contains(t, err.Error(), "hash")
})
t.Run("valid inputs", func(t *testing.T) {
b := NewBuilder()
hash := make([]byte, 34)
err := b.AddFileWithHash("test.txt", 100, ModTime(time.Now()), hash)
assert.NoError(t, err)
assert.Equal(t, 1, b.FileCount())
})
}
func TestBuilderBuild(t *testing.T) {
b := NewBuilder()
content := []byte("test content")
reader := bytes.NewReader(content)
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime(time.Now()), reader, nil)
require.NoError(t, err)
var buf bytes.Buffer
err = b.Build(&buf)
require.NoError(t, err)
// Should have magic bytes
assert.True(t, strings.HasPrefix(buf.String(), MAGIC))
}
func TestNewTimestampFromTimeExtremeDate(t *testing.T) {
// Regression test: newTimestampFromTime used UnixNano() which panics
// for dates outside ~1678-2262. Now uses Nanosecond() which is safe.
tests := []struct {
name string
time time.Time
}{
{"zero time", time.Time{}},
{"year 1000", time.Date(1000, 1, 1, 0, 0, 0, 0, time.UTC)},
{"year 3000", time.Date(3000, 1, 1, 0, 0, 0, 123456789, time.UTC)},
{"unix epoch", time.Unix(0, 0)},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Should not panic
ts := newTimestampFromTime(tt.time)
assert.Equal(t, tt.time.Unix(), ts.Seconds)
assert.Equal(t, int32(tt.time.Nanosecond()), ts.Nanos)
})
}
}
func TestBuilderBuildEmpty(t *testing.T) {
b := NewBuilder()
var buf bytes.Buffer
err := b.Build(&buf)
require.NoError(t, err)
// Should still produce valid manifest with 0 files
assert.True(t, strings.HasPrefix(buf.String(), MAGIC))
}


@@ -1,4 +1,4 @@
package checker package mfer
import ( import (
"bytes" "bytes"
@@ -12,12 +12,11 @@ import (
"github.com/multiformats/go-multihash" "github.com/multiformats/go-multihash"
"github.com/spf13/afero" "github.com/spf13/afero"
"sneak.berlin/go/mfer/mfer"
) )
// Result represents the outcome of checking a single file. // Result represents the outcome of checking a single file.
type Result struct { type Result struct {
Path string // Relative path from manifest Path RelFilePath // Relative path from manifest
Status Status // Verification result status Status Status // Verification result status
Message string // Human-readable description of the result Message string // Human-readable description of the result
} }
@@ -55,22 +54,26 @@ func (s Status) String() string {
// CheckStatus contains progress information for the check operation. // CheckStatus contains progress information for the check operation.
type CheckStatus struct { type CheckStatus struct {
TotalFiles int64 // Total number of files in manifest TotalFiles FileCount // Total number of files in manifest
CheckedFiles int64 // Number of files checked so far CheckedFiles FileCount // Number of files checked so far
TotalBytes int64 // Total bytes to verify (sum of all file sizes) TotalBytes FileSize // Total bytes to verify (sum of all file sizes)
CheckedBytes int64 // Bytes verified so far CheckedBytes FileSize // Bytes verified so far
BytesPerSec float64 // Current throughput rate BytesPerSec float64 // Current throughput rate
ETA time.Duration // Estimated time to completion ETA time.Duration // Estimated time to completion
Failures int64 // Number of verification failures encountered Failures FileCount // Number of verification failures encountered
} }
// Checker verifies files against a manifest. // Checker verifies files against a manifest.
type Checker struct { type Checker struct {
basePath string basePath AbsFilePath
files []*mfer.MFFilePath files []*MFFilePath
fs afero.Fs fs afero.Fs
// manifestPaths is a set of paths in the manifest for quick lookup // manifestPaths is a set of paths in the manifest for quick lookup
manifestPaths map[string]struct{} manifestPaths map[RelFilePath]struct{}
// signature info from the manifest
signature []byte
signer []byte
signingPubKey []byte
} }
// NewChecker creates a new Checker for the given manifest, base path, and filesystem. // NewChecker creates a new Checker for the given manifest, base path, and filesystem.
@@ -81,7 +84,7 @@ func NewChecker(manifestPath string, basePath string, fs afero.Fs) (*Checker, er
fs = afero.NewOsFs() fs = afero.NewOsFs()
} }
m, err := mfer.NewManifestFromFile(fs, manifestPath) m, err := NewManifestFromFile(fs, manifestPath)
if err != nil { if err != nil {
return nil, err return nil, err
} }
@@ -92,33 +95,61 @@ func NewChecker(manifestPath string, basePath string, fs afero.Fs) (*Checker, er
} }
files := m.Files() files := m.Files()
manifestPaths := make(map[string]struct{}, len(files)) manifestPaths := make(map[RelFilePath]struct{}, len(files))
for _, f := range files { for _, f := range files {
manifestPaths[f.Path] = struct{}{} manifestPaths[RelFilePath(f.Path)] = struct{}{}
} }
return &Checker{ return &Checker{
basePath: abs, basePath: AbsFilePath(abs),
files: files, files: files,
fs: fs, fs: fs,
manifestPaths: manifestPaths, manifestPaths: manifestPaths,
signature: m.pbOuter.Signature,
signer: m.pbOuter.Signer,
signingPubKey: m.pbOuter.SigningPubKey,
}, nil }, nil
} }
// FileCount returns the number of files in the manifest. // FileCount returns the number of files in the manifest.
func (c *Checker) FileCount() int64 { func (c *Checker) FileCount() FileCount {
return int64(len(c.files)) return FileCount(len(c.files))
} }
// TotalBytes returns the total size of all files in the manifest. // TotalBytes returns the total size of all files in the manifest.
func (c *Checker) TotalBytes() int64 { func (c *Checker) TotalBytes() FileSize {
var total int64 var total FileSize
for _, f := range c.files { for _, f := range c.files {
total += f.Size total += FileSize(f.Size)
} }
return total return total
} }
// IsSigned returns true if the manifest has a signature.
func (c *Checker) IsSigned() bool {
return len(c.signature) > 0
}
// Signer returns the signer fingerprint if the manifest is signed, nil otherwise.
func (c *Checker) Signer() []byte {
return c.signer
}
// SigningPubKey returns the signing public key if the manifest is signed, nil otherwise.
func (c *Checker) SigningPubKey() []byte {
return c.signingPubKey
}
// ExtractEmbeddedSigningKeyFP imports the manifest's embedded public key into a
// temporary keyring and extracts its fingerprint. This validates the key and
// returns its actual fingerprint from the key material itself.
func (c *Checker) ExtractEmbeddedSigningKeyFP() (string, error) {
if len(c.signingPubKey) == 0 {
return "", errors.New("manifest has no signing public key")
}
return gpgExtractPubKeyFingerprint(c.signingPubKey)
}
// Check verifies all files against the manifest. // Check verifies all files against the manifest.
// Results are sent to the results channel as files are checked. // Results are sent to the results channel as files are checked.
// Progress updates are sent to the progress channel approximately once per second. // Progress updates are sent to the progress channel approximately once per second.
@@ -131,12 +162,12 @@ func (c *Checker) Check(ctx context.Context, results chan<- Result, progress cha
defer close(progress) defer close(progress)
} }
totalFiles := int64(len(c.files)) totalFiles := FileCount(len(c.files))
totalBytes := c.TotalBytes() totalBytes := c.TotalBytes()
var checkedFiles int64 var checkedFiles FileCount
var checkedBytes int64 var checkedBytes FileSize
var failures int64 var failures FileCount
startTime := time.Now() startTime := time.Now()
@@ -186,28 +217,29 @@ func (c *Checker) Check(ctx context.Context, results chan<- Result, progress cha
return nil return nil
} }
func (c *Checker) checkFile(entry *mfer.MFFilePath, checkedBytes *int64) Result { func (c *Checker) checkFile(entry *MFFilePath, checkedBytes *FileSize) Result {
absPath := filepath.Join(c.basePath, entry.Path) absPath := filepath.Join(string(c.basePath), entry.Path)
relPath := RelFilePath(entry.Path)
// Check if file exists // Check if file exists
info, err := c.fs.Stat(absPath) info, err := c.fs.Stat(absPath)
if err != nil { if err != nil {
if errors.Is(err, afero.ErrFileNotFound) || errors.Is(err, errors.New("file does not exist")) { if errors.Is(err, afero.ErrFileNotFound) || errors.Is(err, errors.New("file does not exist")) {
return Result{Path: entry.Path, Status: StatusMissing, Message: "file not found"} return Result{Path: relPath, Status: StatusMissing, Message: "file not found"}
} }
// Check for "file does not exist" style errors // Check for "file does not exist" style errors
exists, _ := afero.Exists(c.fs, absPath) exists, _ := afero.Exists(c.fs, absPath)
if !exists { if !exists {
return Result{Path: entry.Path, Status: StatusMissing, Message: "file not found"} return Result{Path: relPath, Status: StatusMissing, Message: "file not found"}
} }
return Result{Path: entry.Path, Status: StatusError, Message: err.Error()} return Result{Path: relPath, Status: StatusError, Message: err.Error()}
} }
// Check size // Check size
if info.Size() != entry.Size { if info.Size() != entry.Size {
*checkedBytes += info.Size() *checkedBytes += FileSize(info.Size())
return Result{ return Result{
Path: entry.Path, Path: relPath,
Status: StatusSizeMismatch, Status: StatusSizeMismatch,
Message: "size mismatch", Message: "size mismatch",
} }
@@ -216,31 +248,31 @@ func (c *Checker) checkFile(entry *mfer.MFFilePath, checkedBytes *int64) Result
// Open and hash file // Open and hash file
f, err := c.fs.Open(absPath) f, err := c.fs.Open(absPath)
if err != nil { if err != nil {
return Result{Path: entry.Path, Status: StatusError, Message: err.Error()} return Result{Path: relPath, Status: StatusError, Message: err.Error()}
} }
defer func() { _ = f.Close() }() defer func() { _ = f.Close() }()
h := sha256.New() h := sha256.New()
n, err := io.Copy(h, f) n, err := io.Copy(h, f)
if err != nil { if err != nil {
return Result{Path: entry.Path, Status: StatusError, Message: err.Error()} return Result{Path: relPath, Status: StatusError, Message: err.Error()}
} }
*checkedBytes += n *checkedBytes += FileSize(n)
// Encode as multihash and compare // Encode as multihash and compare
computed, err := multihash.Encode(h.Sum(nil), multihash.SHA2_256) computed, err := multihash.Encode(h.Sum(nil), multihash.SHA2_256)
if err != nil { if err != nil {
return Result{Path: entry.Path, Status: StatusError, Message: err.Error()} return Result{Path: relPath, Status: StatusError, Message: err.Error()}
} }
// Check against all hashes in manifest (at least one must match) // Check against all hashes in manifest (at least one must match)
for _, hash := range entry.Hashes { for _, hash := range entry.Hashes {
if bytes.Equal(computed, hash.MultiHash) { if bytes.Equal(computed, hash.MultiHash) {
return Result{Path: entry.Path, Status: StatusOK} return Result{Path: relPath, Status: StatusOK}
} }
} }
return Result{Path: entry.Path, Status: StatusHashMismatch, Message: "hash mismatch"} return Result{Path: relPath, Status: StatusHashMismatch, Message: "hash mismatch"}
} }
// FindExtraFiles walks the filesystem and reports files not in the manifest. // FindExtraFiles walks the filesystem and reports files not in the manifest.
@@ -250,7 +282,7 @@ func (c *Checker) FindExtraFiles(ctx context.Context, results chan<- Result) err
defer close(results) defer close(results)
} }
return afero.Walk(c.fs, c.basePath, func(path string, info os.FileInfo, err error) error { return afero.Walk(c.fs, string(c.basePath), func(path string, info os.FileInfo, err error) error {
if err != nil { if err != nil {
return err return err
} }
@@ -267,10 +299,11 @@ func (c *Checker) FindExtraFiles(ctx context.Context, results chan<- Result) err
} }
// Get relative path // Get relative path
relPath, err := filepath.Rel(c.basePath, path) rel, err := filepath.Rel(string(c.basePath), path)
if err != nil { if err != nil {
return err return err
} }
relPath := RelFilePath(rel)
// Check if path is in manifest // Check if path is in manifest
if _, exists := c.manifestPaths[relPath]; !exists { if _, exists := c.manifestPaths[relPath]; !exists {


@@ -1,4 +1,4 @@
package checker package mfer
import ( import (
"bytes" "bytes"
@@ -9,7 +9,6 @@ import (
"github.com/spf13/afero" "github.com/spf13/afero"
"github.com/stretchr/testify/assert" "github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require" "github.com/stretchr/testify/require"
"sneak.berlin/go/mfer/mfer"
) )
func TestStatusString(t *testing.T) { func TestStatusString(t *testing.T) {
@@ -37,16 +36,16 @@ func TestStatusString(t *testing.T) {
func createTestManifest(t *testing.T, fs afero.Fs, manifestPath string, files map[string][]byte) { func createTestManifest(t *testing.T, fs afero.Fs, manifestPath string, files map[string][]byte) {
t.Helper() t.Helper()
builder := mfer.NewBuilder() builder := NewBuilder()
for path, content := range files { for path, content := range files {
reader := bytes.NewReader(content) reader := bytes.NewReader(content)
_, err := builder.AddFile(path, int64(len(content)), time.Now(), reader, nil) _, err := builder.AddFile(RelFilePath(path), FileSize(len(content)), ModTime(time.Now()), reader, nil)
require.NoError(t, err) require.NoError(t, err)
} }
var buf bytes.Buffer var buf bytes.Buffer
require.NoError(t, builder.Build(&buf)) require.NoError(t, builder.Build(&buf))
require.NoError(t, afero.WriteFile(fs, manifestPath, buf.Bytes(), 0644)) require.NoError(t, afero.WriteFile(fs, manifestPath, buf.Bytes(), 0o644))
} }
// createFilesOnDisk creates the given files on the filesystem. // createFilesOnDisk creates the given files on the filesystem.
@@ -55,8 +54,8 @@ func createFilesOnDisk(t *testing.T, fs afero.Fs, basePath string, files map[str
for path, content := range files { for path, content := range files {
fullPath := basePath + "/" + path fullPath := basePath + "/" + path
require.NoError(t, fs.MkdirAll(basePath, 0755)) require.NoError(t, fs.MkdirAll(basePath, 0o755))
require.NoError(t, afero.WriteFile(fs, fullPath, content, 0644)) require.NoError(t, afero.WriteFile(fs, fullPath, content, 0o644))
} }
} }
@@ -72,7 +71,7 @@ func TestNewChecker(t *testing.T) {
chk, err := NewChecker("/manifest.mf", "/", fs) chk, err := NewChecker("/manifest.mf", "/", fs)
require.NoError(t, err) require.NoError(t, err)
assert.NotNil(t, chk) assert.NotNil(t, chk)
assert.Equal(t, int64(2), chk.FileCount()) assert.Equal(t, FileCount(2), chk.FileCount())
}) })
t.Run("missing manifest", func(t *testing.T) { t.Run("missing manifest", func(t *testing.T) {
@@ -83,7 +82,7 @@ func TestNewChecker(t *testing.T) {
t.Run("invalid manifest", func(t *testing.T) { t.Run("invalid manifest", func(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, afero.WriteFile(fs, "/bad.mf", []byte("not a manifest"), 0644)) require.NoError(t, afero.WriteFile(fs, "/bad.mf", []byte("not a manifest"), 0o644))
_, err := NewChecker("/bad.mf", "/", fs) _, err := NewChecker("/bad.mf", "/", fs)
assert.Error(t, err) assert.Error(t, err)
}) })
@@ -101,8 +100,8 @@ func TestCheckerFileCountAndTotalBytes(t *testing.T) {
chk, err := NewChecker("/manifest.mf", "/", fs) chk, err := NewChecker("/manifest.mf", "/", fs)
require.NoError(t, err) require.NoError(t, err)
assert.Equal(t, int64(3), chk.FileCount()) assert.Equal(t, FileCount(3), chk.FileCount())
assert.Equal(t, int64(2+11+1000), chk.TotalBytes()) assert.Equal(t, FileSize(2+11+1000), chk.TotalBytes())
} }
func TestCheckAllFilesOK(t *testing.T) { func TestCheckAllFilesOK(t *testing.T) {
@@ -158,7 +157,7 @@ func TestCheckMissingFile(t *testing.T) {
okCount++ okCount++
case StatusMissing: case StatusMissing:
missingCount++ missingCount++
assert.Equal(t, "missing.txt", r.Path) assert.Equal(t, RelFilePath("missing.txt"), r.Path)
} }
} }
@@ -186,7 +185,7 @@ func TestCheckSizeMismatch(t *testing.T) {
r := <-results r := <-results
assert.Equal(t, StatusSizeMismatch, r.Status) assert.Equal(t, StatusSizeMismatch, r.Status)
assert.Equal(t, "file.txt", r.Path) assert.Equal(t, RelFilePath("file.txt"), r.Path)
} }
func TestCheckHashMismatch(t *testing.T) { func TestCheckHashMismatch(t *testing.T) {
@@ -212,7 +211,7 @@ func TestCheckHashMismatch(t *testing.T) {
r := <-results r := <-results
assert.Equal(t, StatusHashMismatch, r.Status) assert.Equal(t, StatusHashMismatch, r.Status)
assert.Equal(t, "file.txt", r.Path) assert.Equal(t, RelFilePath("file.txt"), r.Path)
} }
func TestCheckWithProgress(t *testing.T) { func TestCheckWithProgress(t *testing.T) {
@@ -246,11 +245,11 @@ func TestCheckWithProgress(t *testing.T) {
assert.NotEmpty(t, progressUpdates) assert.NotEmpty(t, progressUpdates)
// Final progress should show all files checked // Final progress should show all files checked
final := progressUpdates[len(progressUpdates)-1] final := progressUpdates[len(progressUpdates)-1]
assert.Equal(t, int64(2), final.TotalFiles) assert.Equal(t, FileCount(2), final.TotalFiles)
assert.Equal(t, int64(2), final.CheckedFiles) assert.Equal(t, FileCount(2), final.CheckedFiles)
assert.Equal(t, int64(300), final.TotalBytes) assert.Equal(t, FileSize(300), final.TotalBytes)
assert.Equal(t, int64(300), final.CheckedBytes) assert.Equal(t, FileSize(300), final.CheckedBytes)
assert.Equal(t, int64(0), final.Failures) assert.Equal(t, FileCount(0), final.Failures)
} }
func TestCheckContextCancellation(t *testing.T) { func TestCheckContextCancellation(t *testing.T) {
@@ -301,7 +300,7 @@ func TestFindExtraFiles(t *testing.T) {
} }
assert.Len(t, extras, 1) assert.Len(t, extras, 1)
assert.Equal(t, "file2.txt", extras[0].Path) assert.Equal(t, RelFilePath("file2.txt"), extras[0].Path)
assert.Equal(t, StatusExtra, extras[0].Status) assert.Equal(t, StatusExtra, extras[0].Status)
assert.Equal(t, "not in manifest", extras[0].Message) assert.Equal(t, "not in manifest", extras[0].Message)
} }
@@ -363,8 +362,8 @@ func TestCheckSubdirectories(t *testing.T) {
// Create files with full directory structure // Create files with full directory structure
for path, content := range files { for path, content := range files {
fullPath := "/data/" + path fullPath := "/data/" + path
require.NoError(t, fs.MkdirAll("/data/dir1/dir2/dir3", 0755)) require.NoError(t, fs.MkdirAll("/data/dir1/dir2/dir3", 0o755))
require.NoError(t, afero.WriteFile(fs, fullPath, content, 0644)) require.NoError(t, afero.WriteFile(fs, fullPath, content, 0o644))
} }
chk, err := NewChecker("/manifest.mf", "/data", fs) chk, err := NewChecker("/manifest.mf", "/data", fs)
@@ -390,8 +389,8 @@ func TestCheckEmptyManifest(t *testing.T) {
chk, err := NewChecker("/manifest.mf", "/data", fs) chk, err := NewChecker("/manifest.mf", "/data", fs)
require.NoError(t, err) require.NoError(t, err)
assert.Equal(t, int64(0), chk.FileCount()) assert.Equal(t, FileCount(0), chk.FileCount())
assert.Equal(t, int64(0), chk.TotalBytes()) assert.Equal(t, FileSize(0), chk.TotalBytes())
results := make(chan Result, 10) results := make(chan Result, 10)
err = chk.Check(context.Background(), results, nil) err = chk.Check(context.Background(), results, nil)
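
Reviewer note: the assertions in this file change from `int64(2)` to `FileCount(2)` (and from bare strings to `RelFilePath(...)`). As far as I know, testify's `assert.Equal` falls back to `reflect.DeepEqual` for values that are not `[]byte`, and that comparison treats a named type and its underlying type as different. The sketch below reproduces the behaviour with the standard library only; `FileCount` here is a local stand-in declared for the example, not the repository's definition.

```go
package main

import (
	"fmt"
	"reflect"
)

// FileCount is a hypothetical stand-in for the named type used in the diff.
type FileCount int64

// sameValueAndType reports whether a and b are equal under reflect.DeepEqual,
// which compares dynamic types as well as values.
func sameValueAndType(a, b any) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	// Same numeric value, different types: not equal.
	fmt.Println(sameValueAndType(FileCount(2), int64(2)))
	// Same value and same type: equal.
	fmt.Println(sameValueAndType(FileCount(2), FileCount(2)))
	// An explicit conversion makes a plain == comparison possible.
	fmt.Println(int64(FileCount(2)) == int64(2))
}
```

This is why every expected value in the tests has to be wrapped in the new type once the API returns it.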


@@ -3,4 +3,9 @@ package mfer
const ( const (
Version = "0.1.0" Version = "0.1.0"
ReleaseDate = "2025-12-17" ReleaseDate = "2025-12-17"
// MaxDecompressedSize is the maximum allowed size of decompressed manifest
// data (256 MiB). This prevents decompression bombs from consuming excessive
// memory.
MaxDecompressedSize int64 = 256 * 1024 * 1024
) )
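
Reviewer note: a minimal sketch of the bounded-read idea behind this constant. It is not the code from this change: it uses the standard library's `compress/gzip` as a stand-in for zstd, a small hypothetical cap (`maxDecompressed`), and the helper names `gzipBytes` and `decompressBounded` are invented for the example. The reader is limited to one byte more than the cap so that "exactly at the cap" and "over the cap" can be told apart.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// maxDecompressed is a hypothetical cap, kept small so the demo is cheap.
const maxDecompressed int64 = 1024

// gzipBytes compresses data with gzip; it stands in for the zstd writer.
func gzipBytes(data []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	_, _ = zw.Write(data) // writes into a bytes.Buffer do not fail
	_ = zw.Close()
	return buf.Bytes()
}

// decompressBounded inflates compressed data but refuses to return more
// than limit bytes.
func decompressBounded(compressed []byte, limit int64) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	// Read at most limit+1 bytes; the extra byte signals an overflow.
	dat, err := io.ReadAll(io.LimitReader(zr, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(dat)) > limit {
		return nil, fmt.Errorf("decompressed data exceeds %d bytes", limit)
	}
	return dat, nil
}

func main() {
	small := gzipBytes(make([]byte, 1000))
	large := gzipBytes(make([]byte, 1<<20)) // highly compressible zeros

	out, err := decompressBounded(small, maxDecompressed)
	fmt.Println(len(out), err)

	_, err = decompressBounded(large, maxDecompressed)
	fmt.Println(err)
}
```

Memory use stays bounded by the cap regardless of how small the compressed input is.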


@@ -2,9 +2,12 @@ package mfer
import ( import (
"bytes" "bytes"
"crypto/sha256"
"errors" "errors"
"fmt"
"io" "io"
"github.com/google/uuid"
"github.com/klauspost/compress/zstd" "github.com/klauspost/compress/zstd"
"github.com/spf13/afero" "github.com/spf13/afero"
"google.golang.org/protobuf/proto" "google.golang.org/protobuf/proto"
@@ -12,6 +15,19 @@ import (
"sneak.berlin/go/mfer/internal/log" "sneak.berlin/go/mfer/internal/log"
) )
// validateUUID checks that the byte slice is a valid UUID (16 bytes, parseable).
func validateUUID(data []byte) error {
if len(data) != 16 {
return errors.New("invalid UUID length")
}
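// Parse as a UUID. Note: uuid.FromBytes appears to check only the length,
// so this acts as a secondary guard rather than a separate format check.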
_, err := uuid.FromBytes(data)
if err != nil {
return errors.New("invalid UUID format")
}
return nil
}
func (m *manifest) deserializeInner() error { func (m *manifest) deserializeInner() error {
if m.pbOuter.Version != MFFileOuter_VERSION_ONE { if m.pbOuter.Version != MFFileOuter_VERSION_ONE {
return errors.New("unknown version") return errors.New("unknown version")
@@ -20,6 +36,38 @@ func (m *manifest) deserializeInner() error {
return errors.New("unknown compression type") return errors.New("unknown compression type")
} }
// Validate outer UUID before any decompression
if err := validateUUID(m.pbOuter.Uuid); err != nil {
return errors.New("outer UUID invalid: " + err.Error())
}
// Verify hash of compressed data before decompression
h := sha256.New()
if _, err := h.Write(m.pbOuter.InnerMessage); err != nil {
return err
}
sha256Hash := h.Sum(nil)
if !bytes.Equal(sha256Hash, m.pbOuter.Sha256) {
return errors.New("compressed data hash mismatch")
}
// Verify signature if present
if len(m.pbOuter.Signature) > 0 {
if len(m.pbOuter.SigningPubKey) == 0 {
return errors.New("signature present but no public key")
}
sigString, err := m.signatureString()
if err != nil {
return fmt.Errorf("failed to generate signature string for verification: %w", err)
}
if err := gpgVerify([]byte(sigString), m.pbOuter.Signature, m.pbOuter.SigningPubKey); err != nil {
return fmt.Errorf("signature verification failed: %w", err)
}
log.Infof("signature verified successfully")
}
bb := bytes.NewBuffer(m.pbOuter.InnerMessage) bb := bytes.NewBuffer(m.pbOuter.InnerMessage)
zr, err := zstd.NewReader(bb) zr, err := zstd.NewReader(bb)
@@ -28,10 +76,20 @@ func (m *manifest) deserializeInner() error {
} }
defer zr.Close() defer zr.Close()
dat, err := io.ReadAll(zr) // Limit decompressed size to prevent decompression bombs.
// Use declared size + 1 byte to detect overflow, capped at MaxDecompressedSize.
maxSize := MaxDecompressedSize
if m.pbOuter.Size > 0 && m.pbOuter.Size < int64(maxSize) {
maxSize = int64(m.pbOuter.Size) + 1
}
limitedReader := io.LimitReader(zr, maxSize)
dat, err := io.ReadAll(limitedReader)
if err != nil { if err != nil {
return err return err
} }
if int64(len(dat)) >= MaxDecompressedSize {
return fmt.Errorf("decompressed data exceeds maximum allowed size of %d bytes", MaxDecompressedSize)
}
isize := len(dat) isize := len(dat)
if int64(isize) != m.pbOuter.Size { if int64(isize) != m.pbOuter.Size {
@@ -45,6 +103,16 @@ func (m *manifest) deserializeInner() error {
return err return err
} }
// Validate inner UUID
if err := validateUUID(m.pbInner.Uuid); err != nil {
return errors.New("inner UUID invalid: " + err.Error())
}
// Verify UUIDs match
if !bytes.Equal(m.pbOuter.Uuid, m.pbInner.Uuid) {
return errors.New("outer and inner UUID mismatch")
}
log.Infof("loaded manifest with %d files", len(m.pbInner.Files)) log.Infof("loaded manifest with %d files", len(m.pbInner.Files))
return nil return nil
} }
@@ -61,7 +129,7 @@ func validateMagic(dat []byte) bool {
// NewManifestFromReader reads a manifest from an io.Reader. // NewManifestFromReader reads a manifest from an io.Reader.
func NewManifestFromReader(input io.Reader) (*manifest, error) { func NewManifestFromReader(input io.Reader) (*manifest, error) {
m := New() m := &manifest{}
dat, err := io.ReadAll(input) dat, err := io.ReadAll(input)
if err != nil { if err != nil {
return nil, err return nil, err
@@ -102,8 +170,3 @@ func NewManifestFromFile(fs afero.Fs, path string) (*manifest, error) {
defer func() { _ = f.Close() }() defer func() { _ = f.Close() }()
return NewManifestFromReader(f) return NewManifestFromReader(f)
} }
// NewFromProto is deprecated, use NewManifestFromReader instead.
func NewFromProto(input io.Reader) (*manifest, error) {
return NewManifestFromReader(input)
}
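
Reviewer note: `deserializeInner` now checks the SHA-256 of the compressed inner message before any decompression happens. The sketch below shows that ordering in isolation, using only the standard library; `digest` and `checkPayload` are names made up for the example, and the payload is placeholder bytes rather than real zstd data.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

// digest returns the SHA-256 of data as a byte slice.
func digest(data []byte) []byte {
	sum := sha256.Sum256(data)
	return sum[:]
}

// checkPayload rejects a payload whose SHA-256 does not match the digest
// recorded alongside it. It is meant to run before the payload is handed
// to a decompressor.
func checkPayload(payload, recorded []byte) error {
	if !bytes.Equal(digest(payload), recorded) {
		return errors.New("compressed data hash mismatch")
	}
	return nil
}

func main() {
	payload := []byte("compressed bytes would go here")
	recorded := digest(payload)

	// Untouched payload passes the check.
	fmt.Println(checkPayload(payload, recorded))

	// Flipping one byte makes the check fail before decompression is attempted.
	payload[0] ^= 0xff
	fmt.Println(checkPayload(payload, recorded))
}
```

Checking the digest first means a corrupted or substituted payload never reaches the decompressor.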

212
mfer/gpg.go Normal file

@@ -0,0 +1,212 @@
package mfer
import (
"bytes"
"fmt"
"os"
"os/exec"
"path/filepath"
"strings"
)
// GPGKeyID represents a GPG key identifier (fingerprint or key ID).
type GPGKeyID string
// SigningOptions contains options for GPG signing.
type SigningOptions struct {
KeyID GPGKeyID
}
// gpgSign creates a detached signature of the data using the specified key.
// Returns the armored detached signature.
func gpgSign(data []byte, keyID GPGKeyID) ([]byte, error) {
cmd := exec.Command("gpg",
"--detach-sign",
"--armor",
"--local-user", string(keyID),
)
cmd.Stdin = bytes.NewReader(data)
var stdout, stderr bytes.Buffer
cmd.Stdout = &stdout
cmd.Stderr = &stderr
if err := cmd.Run(); err != nil {
return nil, fmt.Errorf("gpg sign failed: %w: %s", err, stderr.String())
}
return stdout.Bytes(), nil
}
// gpgExportPublicKey exports the public key for the specified key ID.
// Returns the armored public key.
func gpgExportPublicKey(keyID GPGKeyID) ([]byte, error) {
cmd := exec.Command("gpg",
"--export",
"--armor",
string(keyID),
)
var stdout, stderr bytes.Buffer
cmd.Stdout = &stdout
cmd.Stderr = &stderr
if err := cmd.Run(); err != nil {
return nil, fmt.Errorf("gpg export failed: %w: %s", err, stderr.String())
}
if stdout.Len() == 0 {
return nil, fmt.Errorf("gpg key not found: %s", keyID)
}
return stdout.Bytes(), nil
}
// gpgGetKeyFingerprint returns the first fingerprint (fpr record) that gpg
// lists for the given key ID.
func gpgGetKeyFingerprint(keyID GPGKeyID) ([]byte, error) {
cmd := exec.Command("gpg",
"--with-colons",
"--fingerprint",
string(keyID),
)
var stdout, stderr bytes.Buffer
cmd.Stdout = &stdout
cmd.Stderr = &stderr
if err := cmd.Run(); err != nil {
return nil, fmt.Errorf("gpg fingerprint lookup failed: %w: %s", err, stderr.String())
}
// Parse the colon-delimited output to find the fingerprint
lines := strings.Split(stdout.String(), "\n")
for _, line := range lines {
fields := strings.Split(line, ":")
if len(fields) >= 10 && fields[0] == "fpr" {
return []byte(fields[9]), nil
}
}
return nil, fmt.Errorf("fingerprint not found for key: %s", keyID)
}
// gpgExtractPubKeyFingerprint imports a public key into a temporary keyring
// and extracts its fingerprint. This verifies the key is valid and returns
// the actual fingerprint from the key material.
func gpgExtractPubKeyFingerprint(pubKey []byte) (string, error) {
// Create temporary directory for GPG operations
tmpDir, err := os.MkdirTemp("", "mfer-gpg-fingerprint-*")
if err != nil {
return "", fmt.Errorf("failed to create temp dir: %w", err)
}
defer os.RemoveAll(tmpDir)
// Set restrictive permissions
if err := os.Chmod(tmpDir, 0o700); err != nil {
return "", fmt.Errorf("failed to set temp dir permissions: %w", err)
}
// Write public key to temp file
pubKeyFile := filepath.Join(tmpDir, "pubkey.asc")
if err := os.WriteFile(pubKeyFile, pubKey, 0o600); err != nil {
return "", fmt.Errorf("failed to write public key: %w", err)
}
// Import the public key into the temporary keyring
importCmd := exec.Command("gpg",
"--homedir", tmpDir,
"--import",
pubKeyFile,
)
var importStderr bytes.Buffer
importCmd.Stderr = &importStderr
if err := importCmd.Run(); err != nil {
return "", fmt.Errorf("failed to import public key: %w: %s", err, importStderr.String())
}
// List keys to get fingerprint
listCmd := exec.Command("gpg",
"--homedir", tmpDir,
"--with-colons",
"--fingerprint",
)
var listStdout, listStderr bytes.Buffer
listCmd.Stdout = &listStdout
listCmd.Stderr = &listStderr
if err := listCmd.Run(); err != nil {
return "", fmt.Errorf("failed to list keys: %w: %s", err, listStderr.String())
}
// Parse the colon-delimited output to find the fingerprint
lines := strings.Split(listStdout.String(), "\n")
for _, line := range lines {
fields := strings.Split(line, ":")
if len(fields) >= 10 && fields[0] == "fpr" {
return fields[9], nil
}
}
return "", fmt.Errorf("fingerprint not found in imported key")
}
// gpgVerify verifies a detached signature against data using the provided public key.
// It creates a temporary keyring to import the public key for verification.
func gpgVerify(data, signature, pubKey []byte) error {
// Create temporary directory for GPG operations
tmpDir, err := os.MkdirTemp("", "mfer-gpg-verify-*")
if err != nil {
return fmt.Errorf("failed to create temp dir: %w", err)
}
defer os.RemoveAll(tmpDir)
// Set restrictive permissions
if err := os.Chmod(tmpDir, 0o700); err != nil {
return fmt.Errorf("failed to set temp dir permissions: %w", err)
}
// Write public key to temp file
pubKeyFile := filepath.Join(tmpDir, "pubkey.asc")
if err := os.WriteFile(pubKeyFile, pubKey, 0o600); err != nil {
return fmt.Errorf("failed to write public key: %w", err)
}
// Write signature to temp file
sigFile := filepath.Join(tmpDir, "signature.asc")
if err := os.WriteFile(sigFile, signature, 0o600); err != nil {
return fmt.Errorf("failed to write signature: %w", err)
}
// Write data to temp file
dataFile := filepath.Join(tmpDir, "data")
if err := os.WriteFile(dataFile, data, 0o600); err != nil {
return fmt.Errorf("failed to write data: %w", err)
}
// Import the public key into the temporary keyring
importCmd := exec.Command("gpg",
"--homedir", tmpDir,
"--import",
pubKeyFile,
)
var importStderr bytes.Buffer
importCmd.Stderr = &importStderr
if err := importCmd.Run(); err != nil {
return fmt.Errorf("failed to import public key: %w: %s", err, importStderr.String())
}
// Verify the signature
verifyCmd := exec.Command("gpg",
"--homedir", tmpDir,
"--verify",
sigFile,
dataFile,
)
var verifyStderr bytes.Buffer
verifyCmd.Stderr = &verifyStderr
if err := verifyCmd.Run(); err != nil {
return fmt.Errorf("signature verification failed: %w: %s", err, verifyStderr.String())
}
return nil
}
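
Reviewer note: the fingerprint lookup appears three times in this change (twice in this file and once in the test helper), each time scanning `gpg --with-colons` output for the first `fpr` record and taking its tenth field. A sketch of that parsing as a single helper follows. `firstFingerprint` and `fprRecord` are hypothetical names, and the sample input is made up in the colon-delimited layout rather than captured from gpg.

```go
package main

import (
	"fmt"
	"strings"
)

// fprRecord builds a colon-delimited "fpr" line whose tenth field is fp.
// It exists only to produce sample input for this sketch.
func fprRecord(fp string) string {
	return "fpr" + strings.Repeat(":", 9) + fp + ":"
}

// firstFingerprint returns the tenth field of the first "fpr" record found
// in colon-delimited gpg output, and false if there is none.
func firstFingerprint(colonOutput string) (string, bool) {
	for _, line := range strings.Split(colonOutput, "\n") {
		fields := strings.Split(line, ":")
		if len(fields) >= 10 && fields[0] == "fpr" {
			return fields[9], true
		}
	}
	return "", false
}

func main() {
	// Made-up sample input; the "pub" line is a placeholder, not real gpg output.
	sample := "pub:placeholder\n" +
		fprRecord("0123456789ABCDEF0123456789ABCDEF01234567") + "\n"

	fp, ok := firstFingerprint(sample)
	fmt.Println(fp, ok)
}
```

Sharing one parser would keep the three call sites from drifting apart if the field handling ever changes.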

347
mfer/gpg_test.go Normal file

@@ -0,0 +1,347 @@
package mfer
import (
"bytes"
"context"
"os"
"os/exec"
"path/filepath"
"strings"
"testing"
"github.com/spf13/afero"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// testGPGEnv sets up a temporary GPG home directory with a test key.
// Returns the key ID and a cleanup function.
func testGPGEnv(t *testing.T) (GPGKeyID, func()) {
t.Helper()
// Check if gpg is installed
if _, err := exec.LookPath("gpg"); err != nil {
t.Skip("gpg not installed, skipping signing test")
return "", func() {}
}
// Create temporary GPG home directory
gpgHome, err := os.MkdirTemp("", "mfer-gpg-test-*")
require.NoError(t, err)
// Set restrictive permissions on GPG home
require.NoError(t, os.Chmod(gpgHome, 0o700))
// Save original GNUPGHOME and set new one
origGPGHome := os.Getenv("GNUPGHOME")
os.Setenv("GNUPGHOME", gpgHome)
cleanup := func() {
if origGPGHome == "" {
os.Unsetenv("GNUPGHOME")
} else {
os.Setenv("GNUPGHOME", origGPGHome)
}
os.RemoveAll(gpgHome)
}
// Generate a test key with no passphrase
keyParams := `%no-protection
Key-Type: RSA
Key-Length: 2048
Name-Real: MFER Test Key
Name-Email: test@mfer.test
Expire-Date: 0
%commit
`
paramsFile := filepath.Join(gpgHome, "key-params")
require.NoError(t, os.WriteFile(paramsFile, []byte(keyParams), 0o600))
cmd := exec.Command("gpg", "--batch", "--gen-key", paramsFile)
cmd.Env = append(os.Environ(), "GNUPGHOME="+gpgHome)
output, err := cmd.CombinedOutput()
if err != nil {
cleanup()
t.Skipf("failed to generate test GPG key: %v: %s", err, output)
return "", func() {}
}
// Get the key fingerprint
cmd = exec.Command("gpg", "--list-keys", "--with-colons", "test@mfer.test")
cmd.Env = append(os.Environ(), "GNUPGHOME="+gpgHome)
output, err = cmd.Output()
if err != nil {
cleanup()
t.Fatalf("failed to list test key: %v", err)
}
// Parse fingerprint from output
var keyID string
for _, line := range strings.Split(string(output), "\n") {
fields := strings.Split(line, ":")
if len(fields) >= 10 && fields[0] == "fpr" {
keyID = fields[9]
break
}
}
if keyID == "" {
cleanup()
t.Fatal("failed to find test key fingerprint")
}
return GPGKeyID(keyID), cleanup
}
func TestGPGSign(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data to sign")
sig, err := gpgSign(data, keyID)
require.NoError(t, err)
assert.NotEmpty(t, sig)
assert.Contains(t, string(sig), "-----BEGIN PGP SIGNATURE-----")
assert.Contains(t, string(sig), "-----END PGP SIGNATURE-----")
}
func TestGPGExportPublicKey(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
pubKey, err := gpgExportPublicKey(keyID)
require.NoError(t, err)
assert.NotEmpty(t, pubKey)
assert.Contains(t, string(pubKey), "-----BEGIN PGP PUBLIC KEY BLOCK-----")
assert.Contains(t, string(pubKey), "-----END PGP PUBLIC KEY BLOCK-----")
}
func TestGPGGetKeyFingerprint(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
fingerprint, err := gpgGetKeyFingerprint(keyID)
require.NoError(t, err)
assert.NotEmpty(t, fingerprint)
// The fingerprint should be 40 hex chars
assert.Len(t, fingerprint, 40, "fingerprint should be 40 hex chars")
}
func TestGPGSignInvalidKey(t *testing.T) {
// Set up test environment (we need GNUPGHOME set)
_, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data")
_, err := gpgSign(data, GPGKeyID("NONEXISTENT_KEY_ID_12345"))
assert.Error(t, err)
}
func TestBuilderWithSigning(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
// Create a builder with signing options
b := NewBuilder()
b.SetSigningOptions(&SigningOptions{
KeyID: keyID,
})
// Add a test file
content := []byte("test file content")
reader := bytes.NewReader(content)
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime{}, reader, nil)
require.NoError(t, err)
// Build the manifest
var buf bytes.Buffer
err = b.Build(&buf)
require.NoError(t, err)
// Parse the manifest and verify signature fields are populated
manifest, err := NewManifestFromReader(&buf)
require.NoError(t, err)
require.NotNil(t, manifest.pbOuter)
assert.NotEmpty(t, manifest.pbOuter.Signature, "signature should be populated")
assert.NotEmpty(t, manifest.pbOuter.Signer, "signer should be populated")
assert.NotEmpty(t, manifest.pbOuter.SigningPubKey, "signing public key should be populated")
// Verify signature is a valid PGP signature
assert.Contains(t, string(manifest.pbOuter.Signature), "-----BEGIN PGP SIGNATURE-----")
// Verify public key is a valid PGP public key block
assert.Contains(t, string(manifest.pbOuter.SigningPubKey), "-----BEGIN PGP PUBLIC KEY BLOCK-----")
}
func TestScannerWithSigning(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
// Create in-memory filesystem with test files
fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("content1"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("content2"), 0o644))
// Create scanner with signing options
opts := &ScannerOptions{
Fs: fs,
SigningOptions: &SigningOptions{
KeyID: keyID,
},
}
s := NewScannerWithOptions(opts)
// Enumerate files
require.NoError(t, s.EnumeratePath("/testdir", nil))
assert.Equal(t, FileCount(2), s.FileCount())
// Generate signed manifest
var buf bytes.Buffer
require.NoError(t, s.ToManifest(context.Background(), &buf, nil))
// Parse and verify
manifest, err := NewManifestFromReader(&buf)
require.NoError(t, err)
assert.NotEmpty(t, manifest.pbOuter.Signature)
assert.NotEmpty(t, manifest.pbOuter.Signer)
assert.NotEmpty(t, manifest.pbOuter.SigningPubKey)
}
func TestGPGVerify(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data to sign and verify")
sig, err := gpgSign(data, keyID)
require.NoError(t, err)
pubKey, err := gpgExportPublicKey(keyID)
require.NoError(t, err)
// Verify the signature
err = gpgVerify(data, sig, pubKey)
require.NoError(t, err)
}
func TestGPGVerifyInvalidSignature(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data to sign")
sig, err := gpgSign(data, keyID)
require.NoError(t, err)
pubKey, err := gpgExportPublicKey(keyID)
require.NoError(t, err)
// Try to verify with different data - should fail
wrongData := []byte("different data")
err = gpgVerify(wrongData, sig, pubKey)
assert.Error(t, err)
}
func TestGPGVerifyBadPublicKey(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
data := []byte("test data")
sig, err := gpgSign(data, keyID)
require.NoError(t, err)
// Try to verify with invalid public key - should fail
badPubKey := []byte("not a valid public key")
err = gpgVerify(data, sig, badPubKey)
assert.Error(t, err)
}
func TestManifestSignatureVerification(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
// Create a builder with signing options
b := NewBuilder()
b.SetSigningOptions(&SigningOptions{
KeyID: keyID,
})
// Add a test file
content := []byte("test file content for verification")
reader := bytes.NewReader(content)
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime{}, reader, nil)
require.NoError(t, err)
// Build the manifest
var buf bytes.Buffer
err = b.Build(&buf)
require.NoError(t, err)
// Parse the manifest - signature should be verified during load
manifest, err := NewManifestFromReader(&buf)
require.NoError(t, err)
require.NotNil(t, manifest)
// Signature should be present and valid
assert.NotEmpty(t, manifest.pbOuter.Signature)
}
func TestManifestTamperedSignatureFails(t *testing.T) {
keyID, cleanup := testGPGEnv(t)
defer cleanup()
// Create a signed manifest
b := NewBuilder()
b.SetSigningOptions(&SigningOptions{
KeyID: keyID,
})
content := []byte("test file content")
reader := bytes.NewReader(content)
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime{}, reader, nil)
require.NoError(t, err)
var buf bytes.Buffer
err = b.Build(&buf)
require.NoError(t, err)
// Tamper with the serialized manifest by changing a single byte
data := buf.Bytes()
// Flip the first 'A' found after offset 100; it is not guaranteed to fall inside the signature itself
for i := range data {
if i > 100 && data[i] == 'A' {
data[i] = 'B'
break
}
}
// Try to load the tampered manifest - should fail
_, err = NewManifestFromReader(bytes.NewReader(data))
assert.Error(t, err)
}
func TestBuilderWithoutSigning(t *testing.T) {
// Create a builder without signing options
b := NewBuilder()
// Add a test file
content := []byte("test file content")
reader := bytes.NewReader(content)
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime{}, reader, nil)
require.NoError(t, err)
// Build the manifest
var buf bytes.Buffer
err = b.Build(&buf)
require.NoError(t, err)
// Parse the manifest and verify signature fields are empty
manifest, err := NewManifestFromReader(&buf)
require.NoError(t, err)
require.NotNil(t, manifest.pbOuter)
assert.Empty(t, manifest.pbOuter.Signature, "signature should be empty when not signing")
assert.Empty(t, manifest.pbOuter.Signer, "signer should be empty when not signing")
assert.Empty(t, manifest.pbOuter.SigningPubKey, "signing public key should be empty when not signing")
}


@@ -2,133 +2,29 @@ package mfer
import ( import (
"bytes" "bytes"
"context" "encoding/hex"
"errors" "errors"
"fmt" "fmt"
"io/fs"
"os"
"path"
"path/filepath"
"strings"
"github.com/spf13/afero" "github.com/multiformats/go-multihash"
"sneak.berlin/go/mfer/internal/log"
) )
type manifestFile struct { // manifest holds the internal representation of a manifest file.
path string // Use NewManifestFromFile or NewManifestFromReader to load an existing manifest,
info fs.FileInfo // or use Builder to create a new one.
}
func (m *manifestFile) String() string {
return fmt.Sprintf("<File \"%s\">", m.path)
}
type manifest struct { type manifest struct {
sourceFS []afero.Fs
files []*manifestFile
scanOptions *ManifestScanOptions
totalFileSize int64
pbInner *MFFile pbInner *MFFile
pbOuter *MFFileOuter pbOuter *MFFileOuter
output *bytes.Buffer output *bytes.Buffer
ctx context.Context signingOptions *SigningOptions
errors []*error
} }
func (m *manifest) String() string { func (m *manifest) String() string {
return fmt.Sprintf("<Manifest count=%d totalSize=%d>", len(m.files), m.totalFileSize) count := 0
}
// ManifestScanOptions configures behavior when scanning directories for manifest generation.
type ManifestScanOptions struct {
IncludeDotfiles bool // Include files and directories starting with a dot (default: exclude)
FollowSymLinks bool // Resolve symlinks instead of skipping them
}
func (m *manifest) HasError() bool {
return len(m.errors) > 0
}
func (m *manifest) AddError(e error) *manifest {
m.errors = append(m.errors, &e)
return m
}
func (m *manifest) WithContext(c context.Context) *manifest {
m.ctx = c
return m
}
func (m *manifest) addInputPath(inputPath string) error {
abs, err := filepath.Abs(inputPath)
if err != nil {
return err
}
// Validate path exists
if _, err := os.Stat(abs); err != nil {
return fmt.Errorf("path does not exist: %s", inputPath)
}
afs := afero.NewReadOnlyFs(afero.NewBasePathFs(afero.NewOsFs(), abs))
return m.addInputFS(afs)
}
func (m *manifest) addInputFS(f afero.Fs) error {
if m.sourceFS == nil {
m.sourceFS = make([]afero.Fs, 0)
}
m.sourceFS = append(m.sourceFS, f)
// FIXME do some sort of check on f here?
return nil
}
// New creates an empty manifest.
func New() *manifest {
m := &manifest{}
return m
}
// NewFromPaths creates a manifest configured to scan the given filesystem paths.
func NewFromPaths(options *ManifestScanOptions, inputPaths ...string) (*manifest, error) {
log.Dump(inputPaths)
m := New()
m.scanOptions = options
for _, p := range inputPaths {
err := m.addInputPath(p)
if err != nil {
return nil, err
}
}
return m, nil
}
// NewFromFS creates a manifest configured to scan the given afero filesystem.
func NewFromFS(options *ManifestScanOptions, fs afero.Fs) (*manifest, error) {
m := New()
m.scanOptions = options
err := m.addInputFS(fs)
if err != nil {
return nil, err
}
return m, nil
}
func (m *manifest) GetFileCount() int64 {
if m.pbInner != nil { if m.pbInner != nil {
return int64(len(m.pbInner.Files)) count = len(m.pbInner.Files)
} }
return int64(len(m.files)) return fmt.Sprintf("<Manifest count=%d>", count)
}
func (m *manifest) GetTotalFileSize() int64 {
if m.pbInner != nil {
var total int64
for _, f := range m.pbInner.Files {
total += f.Size
}
return total
}
return m.totalFileSize
} }
// Files returns all file entries from a loaded manifest. // Files returns all file entries from a loaded manifest.
@@ -139,64 +35,25 @@ func (m *manifest) Files() []*MFFilePath {
return m.pbInner.Files return m.pbInner.Files
} }
func pathIsHidden(p string) bool { // signatureString generates the canonical string used for signing/verification.
tp := path.Clean(p) // Format: MAGIC-UUID-MULTIHASH where UUID and multihash are hex-encoded.
if strings.HasPrefix(tp, ".") { // Requires pbOuter to be set with Uuid and Sha256 fields.
return true func (m *manifest) signatureString() (string, error) {
if m.pbOuter == nil {
return "", errors.New("pbOuter not set")
} }
for { if len(m.pbOuter.Uuid) == 0 {
d, f := path.Split(tp) return "", errors.New("UUID not set")
if strings.HasPrefix(f, ".") {
return true
}
if d == "" {
return false
}
tp = d[0 : len(d)-1] // trim trailing slash from dir
} }
if len(m.pbOuter.Sha256) == 0 {
return "", errors.New("SHA256 hash not set")
} }
func (m *manifest) addFile(p string, fi fs.FileInfo, sfsIndex int) error { mh, err := multihash.Encode(m.pbOuter.Sha256, multihash.SHA2_256)
if !m.scanOptions.IncludeDotfiles && pathIsHidden(p) {
return nil
}
if fi == nil {
// fi should come from Walk; if nil, stat to get info
var err error
fi, err = m.sourceFS[sfsIndex].Stat(p)
if err != nil { if err != nil {
return err return "", fmt.Errorf("failed to encode multihash: %w", err)
} }
} uuidStr := hex.EncodeToString(m.pbOuter.Uuid)
if fi.IsDir() { mhStr := hex.EncodeToString(mh)
// manifests contain only files, directories are implied. return fmt.Sprintf("%s-%s-%s", MAGIC, uuidStr, mhStr), nil
return nil
}
cleanPath := p
if cleanPath[0:1] == "/" {
cleanPath = cleanPath[1:]
}
nf := &manifestFile{
path: cleanPath,
info: fi,
}
m.files = append(m.files, nf)
m.totalFileSize = m.totalFileSize + fi.Size()
return nil
}
func (m *manifest) Scan() error {
// FIXME scan and whatever function does the hashing should take ctx
for idx, sfs := range m.sourceFS {
if sfs == nil {
return errors.New("invalid source fs")
}
e := afero.Walk(sfs, "/", func(p string, info fs.FileInfo, err error) error {
return m.addFile(p, info, idx)
})
if e != nil {
return e
}
}
return nil
} }
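Note on the `signatureString` format above: the canonical string is `MAGIC-UUID-MULTIHASH`, with the UUID and multihash hex-encoded. The following is a minimal standalone sketch of that layout, not the project's implementation. It assumes the magic is the 8-byte string `ZNAVSRFG` that the older tests compare against, and it builds the SHA2-256 multihash by hand (function code `0x12`, digest length `0x20`) instead of calling the multihash library. The names `magic`, `sha256Multihash`, and this free-function `signatureString` are hypothetical.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// magic is an assumption: the 8-byte file magic that the tests check for.
const magic = "ZNAVSRFG"

// sha256Multihash prefixes a 32-byte SHA-256 digest with the multihash
// header: 0x12 (sha2-256 function code) followed by 0x20 (digest length).
func sha256Multihash(digest []byte) ([]byte, error) {
	if len(digest) != 32 {
		return nil, errors.New("sha256 digest must be 32 bytes")
	}
	return append([]byte{0x12, 0x20}, digest...), nil
}

// signatureString builds MAGIC-UUID-MULTIHASH with hex-encoded UUID and
// multihash, mirroring the checks shown in the diff (hypothetical helper).
func signatureString(uuid, digest []byte) (string, error) {
	if len(uuid) == 0 {
		return "", errors.New("UUID not set")
	}
	if len(digest) == 0 {
		return "", errors.New("SHA256 hash not set")
	}
	mh, err := sha256Multihash(digest)
	if err != nil {
		return "", fmt.Errorf("failed to encode multihash: %w", err)
	}
	return fmt.Sprintf("%s-%s-%s", magic,
		hex.EncodeToString(uuid), hex.EncodeToString(mh)), nil
}
```

Because the UUID is part of the signed string, a signature made for one manifest should not verify against another manifest that happens to have the same inner hash.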


@@ -220,6 +220,8 @@ type MFFileOuter struct {
// and not for cryptographic integrity. // and not for cryptographic integrity.
Size int64 `protobuf:"varint,103,opt,name=size,proto3" json:"size,omitempty"` Size int64 `protobuf:"varint,103,opt,name=size,proto3" json:"size,omitempty"`
Sha256 []byte `protobuf:"bytes,104,opt,name=sha256,proto3" json:"sha256,omitempty"` Sha256 []byte `protobuf:"bytes,104,opt,name=sha256,proto3" json:"sha256,omitempty"`
// uuid must match the uuid in the inner message
Uuid []byte `protobuf:"bytes,105,opt,name=uuid,proto3" json:"uuid,omitempty"`
InnerMessage []byte `protobuf:"bytes,199,opt,name=innerMessage,proto3" json:"innerMessage,omitempty"` InnerMessage []byte `protobuf:"bytes,199,opt,name=innerMessage,proto3" json:"innerMessage,omitempty"`
// detached signature, ascii or binary // detached signature, ascii or binary
Signature []byte `protobuf:"bytes,201,opt,name=signature,proto3,oneof" json:"signature,omitempty"` Signature []byte `protobuf:"bytes,201,opt,name=signature,proto3,oneof" json:"signature,omitempty"`
@@ -289,6 +291,13 @@ func (x *MFFileOuter) GetSha256() []byte {
return nil return nil
} }
func (x *MFFileOuter) GetUuid() []byte {
if x != nil {
return x.Uuid
}
return nil
}
func (x *MFFileOuter) GetInnerMessage() []byte { func (x *MFFileOuter) GetInnerMessage() []byte {
if x != nil { if x != nil {
return x.InnerMessage return x.InnerMessage
@@ -463,6 +472,9 @@ type MFFile struct {
Version MFFile_Version `protobuf:"varint,100,opt,name=version,proto3,enum=MFFile_Version" json:"version,omitempty"` Version MFFile_Version `protobuf:"varint,100,opt,name=version,proto3,enum=MFFile_Version" json:"version,omitempty"`
// required manifest attributes: // required manifest attributes:
Files []*MFFilePath `protobuf:"bytes,101,rep,name=files,proto3" json:"files,omitempty"` Files []*MFFilePath `protobuf:"bytes,101,rep,name=files,proto3" json:"files,omitempty"`
// uuid is a random v4 UUID generated when creating the manifest
// used as part of the signature to prevent replay attacks
Uuid []byte `protobuf:"bytes,102,opt,name=uuid,proto3" json:"uuid,omitempty"`
// optional manifest attributes 2xx: // optional manifest attributes 2xx:
CreatedAt *Timestamp `protobuf:"bytes,201,opt,name=createdAt,proto3,oneof" json:"createdAt,omitempty"` CreatedAt *Timestamp `protobuf:"bytes,201,opt,name=createdAt,proto3,oneof" json:"createdAt,omitempty"`
unknownFields protoimpl.UnknownFields unknownFields protoimpl.UnknownFields
@@ -513,6 +525,13 @@ func (x *MFFile) GetFiles() []*MFFilePath {
return nil return nil
} }
func (x *MFFile) GetUuid() []byte {
if x != nil {
return x.Uuid
}
return nil
}
func (x *MFFile) GetCreatedAt() *Timestamp { func (x *MFFile) GetCreatedAt() *Timestamp {
if x != nil { if x != nil {
return x.CreatedAt return x.CreatedAt
@@ -527,12 +546,13 @@ const file_mf_proto_rawDesc = "" +
"\bmf.proto\";\n" + "\bmf.proto\";\n" +
"\tTimestamp\x12\x18\n" + "\tTimestamp\x12\x18\n" +
"\aseconds\x18\x01 \x01(\x03R\aseconds\x12\x14\n" + "\aseconds\x18\x01 \x01(\x03R\aseconds\x12\x14\n" +
"\x05nanos\x18\x02 \x01(\x05R\x05nanos\"\xdc\x03\n" + "\x05nanos\x18\x02 \x01(\x05R\x05nanos\"\xf0\x03\n" +
"\vMFFileOuter\x12.\n" + "\vMFFileOuter\x12.\n" +
"\aversion\x18e \x01(\x0e2\x14.MFFileOuter.VersionR\aversion\x12F\n" + "\aversion\x18e \x01(\x0e2\x14.MFFileOuter.VersionR\aversion\x12F\n" +
"\x0fcompressionType\x18f \x01(\x0e2\x1c.MFFileOuter.CompressionTypeR\x0fcompressionType\x12\x12\n" + "\x0fcompressionType\x18f \x01(\x0e2\x1c.MFFileOuter.CompressionTypeR\x0fcompressionType\x12\x12\n" +
"\x04size\x18g \x01(\x03R\x04size\x12\x16\n" + "\x04size\x18g \x01(\x03R\x04size\x12\x16\n" +
"\x06sha256\x18h \x01(\fR\x06sha256\x12#\n" + "\x06sha256\x18h \x01(\fR\x06sha256\x12\x12\n" +
"\x04uuid\x18i \x01(\fR\x04uuid\x12#\n" +
"\finnerMessage\x18\xc7\x01 \x01(\fR\finnerMessage\x12\"\n" + "\finnerMessage\x18\xc7\x01 \x01(\fR\finnerMessage\x12\"\n" +
"\tsignature\x18\xc9\x01 \x01(\fH\x00R\tsignature\x88\x01\x01\x12\x1c\n" + "\tsignature\x18\xc9\x01 \x01(\fH\x00R\tsignature\x88\x01\x01\x12\x1c\n" +
"\x06signer\x18\xca\x01 \x01(\fH\x01R\x06signer\x88\x01\x01\x12*\n" + "\x06signer\x18\xca\x01 \x01(\fH\x01R\x06signer\x88\x01\x01\x12*\n" +
@@ -564,10 +584,11 @@ const file_mf_proto_rawDesc = "" +
"\x06_ctimeB\b\n" + "\x06_ctimeB\b\n" +
"\x06_atime\".\n" + "\x06_atime\".\n" +
"\x0eMFFileChecksum\x12\x1c\n" + "\x0eMFFileChecksum\x12\x1c\n" +
"\tmultiHash\x18\x01 \x01(\fR\tmultiHash\"\xc2\x01\n" + "\tmultiHash\x18\x01 \x01(\fR\tmultiHash\"\xd6\x01\n" +
"\x06MFFile\x12)\n" + "\x06MFFile\x12)\n" +
"\aversion\x18d \x01(\x0e2\x0f.MFFile.VersionR\aversion\x12!\n" + "\aversion\x18d \x01(\x0e2\x0f.MFFile.VersionR\aversion\x12!\n" +
"\x05files\x18e \x03(\v2\v.MFFilePathR\x05files\x12.\n" + "\x05files\x18e \x03(\v2\v.MFFilePathR\x05files\x12\x12\n" +
"\x04uuid\x18f \x01(\fR\x04uuid\x12.\n" +
"\tcreatedAt\x18\xc9\x01 \x01(\v2\n" + "\tcreatedAt\x18\xc9\x01 \x01(\v2\n" +
".TimestampH\x00R\tcreatedAt\x88\x01\x01\",\n" + ".TimestampH\x00R\tcreatedAt\x88\x01\x01\",\n" +
"\aVersion\x12\x10\n" + "\aVersion\x12\x10\n" +


@@ -28,6 +28,9 @@ message MFFileOuter {
int64 size = 103; int64 size = 103;
bytes sha256 = 104; bytes sha256 = 104;
// uuid must match the uuid in the inner message
bytes uuid = 105;
bytes innerMessage = 199; bytes innerMessage = 199;
// 2xx for optional manifest root attributes // 2xx for optional manifest root attributes
// think we might use gosignify instead of gpg: // think we might use gosignify instead of gpg:
@@ -43,6 +46,9 @@ message MFFileOuter {
message MFFilePath { message MFFilePath {
// required attributes: // required attributes:
// Path invariants: must be valid UTF-8, use forward slashes only,
// be relative (no leading /), contain no ".." segments, and no
// empty segments (no "//").
string path = 1; string path = 1;
int64 size = 2; int64 size = 2;
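Note on the path invariants documented above: the rules (valid UTF-8, forward slashes only, relative, no `..` segments, no empty segments) can be checked with a few standard-library calls. This is a rough sketch only; the `ValidatePath()` in the repository may differ. The name `validatePath` is hypothetical, and treating any backslash as a violation of "forward slashes only" is an assumption.

```go
package main

import (
	"errors"
	"strings"
	"unicode/utf8"
)

// validatePath is a hypothetical checker for the documented invariants.
func validatePath(p string) error {
	if !utf8.ValidString(p) {
		return errors.New("path is not valid UTF-8")
	}
	// Assumption: "forward slashes only" means backslashes are rejected.
	if strings.ContainsRune(p, '\\') {
		return errors.New("path contains a backslash")
	}
	if strings.HasPrefix(p, "/") {
		return errors.New("path must be relative (no leading /)")
	}
	for _, seg := range strings.Split(p, "/") {
		switch seg {
		case "":
			// Covers "a//b", a trailing slash, and the empty path.
			return errors.New("path contains an empty segment")
		case "..":
			return errors.New("path contains a \"..\" segment")
		}
	}
	return nil
}
```

With this sketch, `validatePath("a/b/c.txt")` returns nil, while `"/a"`, `"a//b"`, and `"a/../b"` each return an error.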
@@ -72,6 +78,10 @@ message MFFile {
// required manifest attributes: // required manifest attributes:
repeated MFFilePath files = 101; repeated MFFilePath files = 101;
// uuid is a random v4 UUID generated when creating the manifest
// used as part of the signature to prevent replay attacks
bytes uuid = 102;
// optional manifest attributes 2xx: // optional manifest attributes 2xx:
optional Timestamp createdAt = 201; optional Timestamp createdAt = 201;
} }
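Note on the two `uuid` fields above: the inner message carries a random v4 UUID, and the outer message must carry the same bytes. A minimal sketch of generating and comparing such a value with the standard library follows. The helper names `newUUIDv4` and `checkUUIDMatch` are hypothetical, and the 16-byte binary encoding is an assumption; the diff only shows that the field type is `bytes`.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"errors"
)

// newUUIDv4 returns 16 random bytes with the RFC 4122 version and variant
// bits set (hypothetical helper; assumes raw 16-byte encoding).
func newUUIDv4() ([]byte, error) {
	u := make([]byte, 16)
	if _, err := rand.Read(u); err != nil {
		return nil, err
	}
	u[6] = (u[6] & 0x0f) | 0x40 // version 4
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant
	return u, nil
}

// checkUUIDMatch enforces the "outer uuid must match inner uuid" rule.
func checkUUIDMatch(outer, inner []byte) error {
	if len(outer) != 16 || len(inner) != 16 {
		return errors.New("uuid must be 16 bytes")
	}
	if !bytes.Equal(outer, inner) {
		return errors.New("outer and inner uuid differ")
	}
	return nil
}
```

A loader would typically run the comparison after decompressing and unmarshalling the inner message, before trusting any signature over the outer fields.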


@@ -1,15 +0,0 @@
package mfer
import (
"testing"
"github.com/stretchr/testify/assert"
)
func TestPathHiddenFunc(t *testing.T) {
assert.False(t, pathIsHidden("/a/b/c/hello.txt"))
assert.True(t, pathIsHidden("/a/b/c/.hello.txt"))
assert.True(t, pathIsHidden("/a/.b/c/hello.txt"))
assert.True(t, pathIsHidden("/.a/b/c/hello.txt"))
assert.False(t, pathIsHidden("./a/b/c/hello.txt"))
}


@@ -1,33 +0,0 @@
package mfer
import (
"io"
"os"
)
func (m *manifest) WriteToFile(path string) error {
// FIXME refuse to overwrite without -f if file exists
f, err := os.Create(path)
if err != nil {
return err
}
defer func() { _ = f.Close() }()
return m.Write(f)
}
func (m *manifest) Write(output io.Writer) error {
if m.pbOuter == nil {
err := m.generate()
if err != nil {
return err
}
}
_, err := output.Write(m.output.Bytes())
if err != nil {
return err
}
return nil
}


@@ -1,4 +1,4 @@
package scanner package mfer
import ( import (
"context" "context"
@@ -13,7 +13,6 @@ import (
"github.com/dustin/go-humanize" "github.com/dustin/go-humanize"
"github.com/spf13/afero" "github.com/spf13/afero"
"sneak.berlin/go/mfer/internal/log" "sneak.berlin/go/mfer/internal/log"
"sneak.berlin/go/mfer/mfer"
) )
// Phase 1: Enumeration // Phase 1: Enumeration
@@ -23,8 +22,8 @@ import (
// EnumerateStatus contains progress information for the enumeration phase. // EnumerateStatus contains progress information for the enumeration phase.
type EnumerateStatus struct { type EnumerateStatus struct {
FilesFound int64 // Number of files discovered so far FilesFound FileCount // Number of files discovered so far
BytesFound int64 // Total size of discovered files (from stat) BytesFound FileSize // Total size of discovered files (from stat)
} }
// Phase 2: Scan (ToManifest) // Phase 2: Scan (ToManifest)
@@ -34,27 +33,28 @@ type EnumerateStatus struct {
// ScanStatus contains progress information for the scan phase. // ScanStatus contains progress information for the scan phase.
type ScanStatus struct { type ScanStatus struct {
TotalFiles int64 // Total number of files to scan TotalFiles FileCount // Total number of files to scan
ScannedFiles int64 // Number of files scanned so far ScannedFiles FileCount // Number of files scanned so far
TotalBytes int64 // Total bytes to read (sum of all file sizes) TotalBytes FileSize // Total bytes to read (sum of all file sizes)
ScannedBytes int64 // Bytes read so far ScannedBytes FileSize // Bytes read so far
BytesPerSec float64 // Current throughput rate BytesPerSec float64 // Current throughput rate
ETA time.Duration // Estimated time to completion ETA time.Duration // Estimated time to completion
} }
// Options configures scanner behavior. // ScannerOptions configures scanner behavior.
type Options struct { type ScannerOptions struct {
IncludeDotfiles bool // Include files and directories starting with a dot (default: exclude) IncludeDotfiles bool // Include files and directories starting with a dot (default: exclude)
FollowSymLinks bool // Resolve symlinks instead of skipping them FollowSymLinks bool // Resolve symlinks instead of skipping them
Fs afero.Fs // Filesystem to use, defaults to OsFs if nil Fs afero.Fs // Filesystem to use, defaults to OsFs if nil
SigningOptions *SigningOptions // GPG signing options (nil = no signing)
} }
// FileEntry represents a file that has been enumerated. // FileEntry represents a file that has been enumerated.
type FileEntry struct { type FileEntry struct {
Path string // Relative path (used in manifest) Path RelFilePath // Relative path (used in manifest)
AbsPath string // Absolute path (used for reading file content) AbsPath AbsFilePath // Absolute path (used for reading file content)
Size int64 // File size in bytes Size FileSize // File size in bytes
Mtime time.Time // Last modification time Mtime ModTime // Last modification time
Ctime time.Time // Creation time (platform-dependent) Ctime time.Time // Creation time (platform-dependent)
} }
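Note on the typed fields above: replacing bare `int64`, `string`, and `time.Time` with named types lets the compiler reject accidental mixing, for example passing a byte count where a file count is expected. The definitions are not part of this hunk, so the sketch below assumes the underlying types from the conversions visible in the diff (`FileSize(info.Size())`, `ModTime(info.ModTime())`, `string(entry.AbsPath)`). The struct `fileEntry` and the helper `totalSize` are hypothetical.

```go
package main

import "time"

// Assumed definitions: defined types over the primitives they replace.
type (
	FileCount   int64
	FileSize    int64
	RelFilePath string
	AbsFilePath string
	ModTime     time.Time
)

// fileEntry is a trimmed, hypothetical stand-in for FileEntry.
type fileEntry struct {
	Path    RelFilePath
	AbsPath AbsFilePath
	Size    FileSize
	Mtime   ModTime
}

// totalSize sums entry sizes. The result is a FileSize, so assigning it
// to a FileCount variable would need an explicit conversion.
func totalSize(entries []fileEntry) FileSize {
	var total FileSize
	for _, e := range entries {
		total += e.Size
	}
	return total
}
```

The cost is the explicit conversions at API boundaries, which is why the updated tests wrap literals as `FileCount(1)` or `RelFilePath("test.txt")`.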
@@ -62,19 +62,20 @@ type FileEntry struct {
type Scanner struct { type Scanner struct {
mu sync.RWMutex mu sync.RWMutex
files []*FileEntry files []*FileEntry
options *Options totalBytes FileSize // cached sum of all file sizes
options *ScannerOptions
fs afero.Fs fs afero.Fs
} }
// New creates a new Scanner with default options. // NewScanner creates a new Scanner with default options.
func New() *Scanner { func NewScanner() *Scanner {
return NewWithOptions(nil) return NewScannerWithOptions(nil)
} }
// NewWithOptions creates a new Scanner with the given options. // NewScannerWithOptions creates a new Scanner with the given options.
func NewWithOptions(opts *Options) *Scanner { func NewScannerWithOptions(opts *ScannerOptions) *Scanner {
if opts == nil { if opts == nil {
opts = &Options{} opts = &ScannerOptions{}
} }
fs := opts.Fs fs := opts.Fs
if fs == nil { if fs == nil {
@@ -154,7 +155,7 @@ func (s *Scanner) enumerateFS(afs afero.Fs, basePath string, progress chan<- Enu
if err != nil { if err != nil {
return err return err
} }
if !s.options.IncludeDotfiles && pathIsHidden(p) { if !s.options.IncludeDotfiles && IsHiddenPath(p) {
if info.IsDir() { if info.IsDir() {
return filepath.SkipDir return filepath.SkipDir
} }
@@ -206,21 +207,19 @@ func (s *Scanner) enumerateFileWithInfo(filePath string, basePath string, info f
} }
entry := &FileEntry{ entry := &FileEntry{
Path: cleanPath, Path: RelFilePath(cleanPath),
AbsPath: absPath, AbsPath: AbsFilePath(absPath),
Size: info.Size(), Size: FileSize(info.Size()),
Mtime: info.ModTime(), Mtime: ModTime(info.ModTime()),
// Note: Ctime not available from fs.FileInfo on all platforms // Note: Ctime not available from fs.FileInfo on all platforms
// Will need platform-specific code to extract it // Will need platform-specific code to extract it
} }
s.mu.Lock() s.mu.Lock()
s.files = append(s.files, entry) s.files = append(s.files, entry)
filesFound := int64(len(s.files)) s.totalBytes += entry.Size
var bytesFound int64 filesFound := FileCount(len(s.files))
for _, f := range s.files { bytesFound := s.totalBytes
bytesFound += f.Size
}
s.mu.Unlock() s.mu.Unlock()
sendEnumerateStatus(progress, EnumerateStatus{ sendEnumerateStatus(progress, EnumerateStatus{
@@ -241,21 +240,17 @@ func (s *Scanner) Files() []*FileEntry {
} }
// FileCount returns the number of files in the scanner. // FileCount returns the number of files in the scanner.
func (s *Scanner) FileCount() int64 { func (s *Scanner) FileCount() FileCount {
s.mu.RLock() s.mu.RLock()
defer s.mu.RUnlock() defer s.mu.RUnlock()
return int64(len(s.files)) return FileCount(len(s.files))
} }
// TotalBytes returns the total size of all files in the scanner. // TotalBytes returns the total size of all files in the scanner.
func (s *Scanner) TotalBytes() int64 { func (s *Scanner) TotalBytes() FileSize {
s.mu.RLock() s.mu.RLock()
defer s.mu.RUnlock() defer s.mu.RUnlock()
var total int64 return s.totalBytes
for _, f := range s.files {
total += f.Size
}
return total
} }
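Note on the cached `totalBytes` above: the old code re-summed every file size on each append and on each `TotalBytes()` call, which makes enumerating n files cost O(n²) additions. Keeping a running total under the same mutex makes both operations O(1). A standalone sketch of the pattern, with a hypothetical `sizeTracker` type standing in for the scanner:

```go
package main

import "sync"

// sizeTracker is a hypothetical reduction of the scanner's bookkeeping.
type sizeTracker struct {
	mu         sync.RWMutex
	sizes      []int64
	totalBytes int64 // cached sum of sizes, updated on every add
}

// add records one file size and returns the new count and running total.
// Both values are read while the write lock is still held, so they are
// consistent with each other.
func (s *sizeTracker) add(size int64) (count, total int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sizes = append(s.sizes, size)
	s.totalBytes += size
	return int64(len(s.sizes)), s.totalBytes
}

// TotalBytes returns the cached sum without walking the slice.
func (s *sizeTracker) TotalBytes() int64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.totalBytes
}
```

The cache stays correct only if every code path that changes the file list also updates the total, so any future removal or reset method would need the matching adjustment.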
// ToManifest reads all file contents, computes hashes, and generates a manifest. // ToManifest reads all file contents, computes hashes, and generates a manifest.
@@ -270,17 +265,20 @@ func (s *Scanner) ToManifest(ctx context.Context, w io.Writer, progress chan<- S
s.mu.RLock() s.mu.RLock()
files := make([]*FileEntry, len(s.files)) files := make([]*FileEntry, len(s.files))
copy(files, s.files) copy(files, s.files)
totalFiles := int64(len(files)) totalFiles := FileCount(len(files))
var totalBytes int64 var totalBytes FileSize
for _, f := range files { for _, f := range files {
totalBytes += f.Size totalBytes += f.Size
} }
s.mu.RUnlock() s.mu.RUnlock()
builder := mfer.NewBuilder() builder := NewBuilder()
if s.options.SigningOptions != nil {
builder.SetSigningOptions(s.options.SigningOptions)
}
var scannedFiles int64 var scannedFiles FileCount
var scannedBytes int64 var scannedBytes FileSize
lastProgressTime := time.Now() lastProgressTime := time.Now()
startTime := time.Now() startTime := time.Now()
@@ -293,18 +291,18 @@ func (s *Scanner) ToManifest(ctx context.Context, w io.Writer, progress chan<- S
} }
// Open file // Open file
f, err := s.fs.Open(entry.AbsPath) f, err := s.fs.Open(string(entry.AbsPath))
if err != nil { if err != nil {
return err return err
} }
// Create progress channel for this file // Create progress channel for this file
var fileProgress chan mfer.FileHashProgress var fileProgress chan FileHashProgress
var wg sync.WaitGroup var wg sync.WaitGroup
if progress != nil { if progress != nil {
fileProgress = make(chan mfer.FileHashProgress, 1) fileProgress = make(chan FileHashProgress, 1)
wg.Add(1) wg.Add(1)
go func(baseScannedBytes int64) { go func(baseScannedBytes FileSize) {
defer wg.Done() defer wg.Done()
for p := range fileProgress { for p := range fileProgress {
// Send progress at most once per second // Send progress at most once per second
@@ -382,9 +380,10 @@ func (s *Scanner) ToManifest(ctx context.Context, w io.Writer, progress chan<- S
return builder.Build(w) return builder.Build(w)
} }
// pathIsHidden returns true if the path or any of its parent directories // IsHiddenPath returns true if the path or any of its parent directories
// start with a dot (hidden files/directories). // start with a dot (hidden files/directories).
func pathIsHidden(p string) bool { // The path should use forward slashes.
func IsHiddenPath(p string) bool {
tp := path.Clean(p) tp := path.Clean(p)
if strings.HasPrefix(tp, ".") { if strings.HasPrefix(tp, ".") {
return true return true
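Note on `IsHiddenPath` above: the documented behaviour is that a path is hidden if it, or any parent directory, starts with a dot, after `path.Clean` has normalised it. The hunk cuts off before the loop, so the following is an equivalent formulation written as a sketch, not the repository's code: it cleans the path, splits on `/`, and checks each segment. The lowercase name `isHiddenPath` is hypothetical.

```go
package main

import (
	"path"
	"strings"
)

// isHiddenPath reports whether any segment of the cleaned, slash-separated
// path begins with a dot (hypothetical re-implementation).
func isHiddenPath(p string) bool {
	for _, seg := range strings.Split(path.Clean(p), "/") {
		if strings.HasPrefix(seg, ".") {
			return true
		}
	}
	return false
}
```

Since `path.Clean` removes a leading `./`, the input `./relative` is not treated as hidden, which matches the test table in this change.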


@@ -1,4 +1,4 @@
package scanner package mfer
import ( import (
"bytes" "bytes"
@@ -11,77 +11,77 @@ import (
"github.com/stretchr/testify/require" "github.com/stretchr/testify/require"
) )
func TestNew(t *testing.T) { func TestNewScanner(t *testing.T) {
s := New() s := NewScanner()
assert.NotNil(t, s) assert.NotNil(t, s)
assert.Equal(t, int64(0), s.FileCount()) assert.Equal(t, FileCount(0), s.FileCount())
assert.Equal(t, int64(0), s.TotalBytes()) assert.Equal(t, FileSize(0), s.TotalBytes())
} }
func TestNewWithOptions(t *testing.T) { func TestNewScannerWithOptions(t *testing.T) {
t.Run("nil options", func(t *testing.T) { t.Run("nil options", func(t *testing.T) {
s := NewWithOptions(nil) s := NewScannerWithOptions(nil)
assert.NotNil(t, s) assert.NotNil(t, s)
}) })
t.Run("with options", func(t *testing.T) { t.Run("with options", func(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
opts := &Options{ opts := &ScannerOptions{
IncludeDotfiles: true, IncludeDotfiles: true,
FollowSymLinks: true, FollowSymLinks: true,
Fs: fs, Fs: fs,
} }
s := NewWithOptions(opts) s := NewScannerWithOptions(opts)
assert.NotNil(t, s) assert.NotNil(t, s)
}) })
} }
func TestEnumerateFile(t *testing.T) { func TestScannerEnumerateFile(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, afero.WriteFile(fs, "/test.txt", []byte("hello world"), 0644)) require.NoError(t, afero.WriteFile(fs, "/test.txt", []byte("hello world"), 0o644))
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
err := s.EnumerateFile("/test.txt") err := s.EnumerateFile("/test.txt")
require.NoError(t, err) require.NoError(t, err)
assert.Equal(t, int64(1), s.FileCount()) assert.Equal(t, FileCount(1), s.FileCount())
assert.Equal(t, int64(11), s.TotalBytes()) assert.Equal(t, FileSize(11), s.TotalBytes())
files := s.Files() files := s.Files()
require.Len(t, files, 1) require.Len(t, files, 1)
assert.Equal(t, "test.txt", files[0].Path) assert.Equal(t, RelFilePath("test.txt"), files[0].Path)
assert.Equal(t, int64(11), files[0].Size) assert.Equal(t, FileSize(11), files[0].Size)
} }
func TestEnumerateFileMissing(t *testing.T) { func TestScannerEnumerateFileMissing(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
err := s.EnumerateFile("/nonexistent.txt") err := s.EnumerateFile("/nonexistent.txt")
assert.Error(t, err) assert.Error(t, err)
} }
func TestEnumeratePath(t *testing.T) { func TestScannerEnumeratePath(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir/subdir", 0755)) require.NoError(t, fs.MkdirAll("/testdir/subdir", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("one"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("one"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("two"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("two"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/subdir/file3.txt", []byte("three"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/subdir/file3.txt", []byte("three"), 0o644))
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
err := s.EnumeratePath("/testdir", nil) err := s.EnumeratePath("/testdir", nil)
require.NoError(t, err) require.NoError(t, err)
assert.Equal(t, int64(3), s.FileCount()) assert.Equal(t, FileCount(3), s.FileCount())
assert.Equal(t, int64(3+3+5), s.TotalBytes()) assert.Equal(t, FileSize(3+3+5), s.TotalBytes())
} }
func TestEnumeratePathWithProgress(t *testing.T) { func TestScannerEnumeratePathWithProgress(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir", 0755)) require.NoError(t, fs.MkdirAll("/testdir", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("one"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("one"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("two"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("two"), 0o644))
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
progress := make(chan EnumerateStatus, 10) progress := make(chan EnumerateStatus, 10)
err := s.EnumeratePath("/testdir", progress) err := s.EnumeratePath("/testdir", progress)
@@ -95,80 +95,57 @@ func TestEnumeratePathWithProgress(t *testing.T) {
assert.NotEmpty(t, updates) assert.NotEmpty(t, updates)
// Final update should show all files // Final update should show all files
final := updates[len(updates)-1] final := updates[len(updates)-1]
assert.Equal(t, int64(2), final.FilesFound) assert.Equal(t, FileCount(2), final.FilesFound)
assert.Equal(t, int64(6), final.BytesFound) assert.Equal(t, FileSize(6), final.BytesFound)
} }
func TestEnumeratePaths(t *testing.T) { func TestScannerEnumeratePaths(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/dir1", 0755)) require.NoError(t, fs.MkdirAll("/dir1", 0o755))
require.NoError(t, fs.MkdirAll("/dir2", 0755)) require.NoError(t, fs.MkdirAll("/dir2", 0o755))
require.NoError(t, afero.WriteFile(fs, "/dir1/a.txt", []byte("aaa"), 0644)) require.NoError(t, afero.WriteFile(fs, "/dir1/a.txt", []byte("aaa"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/dir2/b.txt", []byte("bbb"), 0644)) require.NoError(t, afero.WriteFile(fs, "/dir2/b.txt", []byte("bbb"), 0o644))
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
err := s.EnumeratePaths(nil, "/dir1", "/dir2") err := s.EnumeratePaths(nil, "/dir1", "/dir2")
require.NoError(t, err) require.NoError(t, err)
assert.Equal(t, int64(2), s.FileCount()) assert.Equal(t, FileCount(2), s.FileCount())
} }
func TestExcludeDotfiles(t *testing.T) { func TestScannerExcludeDotfiles(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir/.hidden", 0755)) require.NoError(t, fs.MkdirAll("/testdir/.hidden", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/visible.txt", []byte("visible"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/visible.txt", []byte("visible"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/.hidden.txt", []byte("hidden"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/.hidden.txt", []byte("hidden"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/.hidden/inside.txt", []byte("inside"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/.hidden/inside.txt", []byte("inside"), 0o644))
t.Run("exclude by default", func(t *testing.T) { t.Run("exclude by default", func(t *testing.T) {
s := NewWithOptions(&Options{Fs: fs, IncludeDotfiles: false}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs, IncludeDotfiles: false})
err := s.EnumeratePath("/testdir", nil) err := s.EnumeratePath("/testdir", nil)
require.NoError(t, err) require.NoError(t, err)
assert.Equal(t, int64(1), s.FileCount()) assert.Equal(t, FileCount(1), s.FileCount())
files := s.Files() files := s.Files()
assert.Equal(t, "visible.txt", files[0].Path) assert.Equal(t, RelFilePath("visible.txt"), files[0].Path)
}) })
t.Run("include when enabled", func(t *testing.T) { t.Run("include when enabled", func(t *testing.T) {
s := NewWithOptions(&Options{Fs: fs, IncludeDotfiles: true}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs, IncludeDotfiles: true})
err := s.EnumeratePath("/testdir", nil) err := s.EnumeratePath("/testdir", nil)
require.NoError(t, err) require.NoError(t, err)
assert.Equal(t, int64(3), s.FileCount()) assert.Equal(t, FileCount(3), s.FileCount())
}) })
} }
func TestPathIsHidden(t *testing.T) { func TestScannerToManifest(t *testing.T) {
tests := []struct {
path string
hidden bool
}{
{"file.txt", false},
{".hidden", true},
{"dir/file.txt", false},
{"dir/.hidden", true},
{".dir/file.txt", true},
{"/absolute/path", false},
{"/absolute/.hidden", true},
{"./relative", false}, // path.Clean removes leading ./
{"a/b/c/.d/e", true},
}
for _, tt := range tests {
t.Run(tt.path, func(t *testing.T) {
assert.Equal(t, tt.hidden, pathIsHidden(tt.path), "pathIsHidden(%q)", tt.path)
})
}
}
func TestToManifest(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir", 0755)) require.NoError(t, fs.MkdirAll("/testdir", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("content one"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file1.txt", []byte("content one"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("content two"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file2.txt", []byte("content two"), 0o644))
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
err := s.EnumeratePath("/testdir", nil) err := s.EnumeratePath("/testdir", nil)
require.NoError(t, err) require.NoError(t, err)
@@ -178,15 +155,15 @@ func TestToManifest(t *testing.T) {
// Manifest should have magic bytes // Manifest should have magic bytes
assert.True(t, buf.Len() > 0) assert.True(t, buf.Len() > 0)
assert.Equal(t, "ZNAVSRFG", string(buf.Bytes()[:8])) assert.Equal(t, MAGIC, string(buf.Bytes()[:8]))
} }
func TestToManifestWithProgress(t *testing.T) { func TestScannerToManifestWithProgress(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir", 0755)) require.NoError(t, fs.MkdirAll("/testdir", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/file.txt", bytes.Repeat([]byte("x"), 1000), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file.txt", bytes.Repeat([]byte("x"), 1000), 0o644))
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
err := s.EnumeratePath("/testdir", nil) err := s.EnumeratePath("/testdir", nil)
require.NoError(t, err) require.NoError(t, err)
@@ -204,22 +181,22 @@ func TestToManifestWithProgress(t *testing.T) {
assert.NotEmpty(t, updates) assert.NotEmpty(t, updates)
// Final update should show completion // Final update should show completion
final := updates[len(updates)-1] final := updates[len(updates)-1]
assert.Equal(t, int64(1), final.TotalFiles) assert.Equal(t, FileCount(1), final.TotalFiles)
assert.Equal(t, int64(1), final.ScannedFiles) assert.Equal(t, FileCount(1), final.ScannedFiles)
assert.Equal(t, int64(1000), final.TotalBytes) assert.Equal(t, FileSize(1000), final.TotalBytes)
assert.Equal(t, int64(1000), final.ScannedBytes) assert.Equal(t, FileSize(1000), final.ScannedBytes)
} }
func TestToManifestContextCancellation(t *testing.T) { func TestScannerToManifestContextCancellation(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir", 0755)) require.NoError(t, fs.MkdirAll("/testdir", 0o755))
// Create many files to ensure we have time to cancel // Create many files to ensure we have time to cancel
for i := 0; i < 100; i++ { for i := 0; i < 100; i++ {
name := string(rune('a'+i%26)) + string(rune('0'+i/26)) + ".txt" name := string(rune('a'+i%26)) + string(rune('0'+i/26)) + ".txt"
require.NoError(t, afero.WriteFile(fs, "/testdir/"+name, bytes.Repeat([]byte("x"), 100), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/"+name, bytes.Repeat([]byte("x"), 100), 0o644))
} }
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
err := s.EnumeratePath("/testdir", nil) err := s.EnumeratePath("/testdir", nil)
require.NoError(t, err) require.NoError(t, err)
@@ -231,9 +208,9 @@ func TestToManifestContextCancellation(t *testing.T) {
assert.ErrorIs(t, err, context.Canceled) assert.ErrorIs(t, err, context.Canceled)
} }
func TestToManifestEmptyScanner(t *testing.T) { func TestScannerToManifestEmptyScanner(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
var buf bytes.Buffer var buf bytes.Buffer
err := s.ToManifest(context.Background(), &buf, nil) err := s.ToManifest(context.Background(), &buf, nil)
@@ -241,14 +218,14 @@ func TestToManifestEmptyScanner(t *testing.T) {
// Should still produce a valid manifest // Should still produce a valid manifest
assert.True(t, buf.Len() > 0) assert.True(t, buf.Len() > 0)
assert.Equal(t, "ZNAVSRFG", string(buf.Bytes()[:8])) assert.Equal(t, MAGIC, string(buf.Bytes()[:8]))
} }
func TestFilesCopiesSlice(t *testing.T) { func TestScannerFilesCopiesSlice(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, afero.WriteFile(fs, "/test.txt", []byte("hello"), 0644)) require.NoError(t, afero.WriteFile(fs, "/test.txt", []byte("hello"), 0o644))
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
require.NoError(t, s.EnumerateFile("/test.txt")) require.NoError(t, s.EnumerateFile("/test.txt"))
files1 := s.Files() files1 := s.Files()
@@ -258,20 +235,20 @@ func TestFilesCopiesSlice(t *testing.T) {
assert.NotSame(t, &files1[0], &files2[0]) assert.NotSame(t, &files1[0], &files2[0])
} }
func TestEnumerateFS(t *testing.T) { func TestScannerEnumerateFS(t *testing.T) {
fs := afero.NewMemMapFs() fs := afero.NewMemMapFs()
require.NoError(t, fs.MkdirAll("/testdir/sub", 0755)) require.NoError(t, fs.MkdirAll("/testdir/sub", 0o755))
require.NoError(t, afero.WriteFile(fs, "/testdir/file.txt", []byte("hello"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/file.txt", []byte("hello"), 0o644))
require.NoError(t, afero.WriteFile(fs, "/testdir/sub/nested.txt", []byte("world"), 0644)) require.NoError(t, afero.WriteFile(fs, "/testdir/sub/nested.txt", []byte("world"), 0o644))
// Create a basepath filesystem // Create a basepath filesystem
baseFs := afero.NewBasePathFs(fs, "/testdir") baseFs := afero.NewBasePathFs(fs, "/testdir")
s := NewWithOptions(&Options{Fs: fs}) s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
err := s.EnumerateFS(baseFs, "/testdir", nil) err := s.EnumerateFS(baseFs, "/testdir", nil)
require.NoError(t, err) require.NoError(t, err)
assert.Equal(t, int64(2), s.FileCount()) assert.Equal(t, FileCount(2), s.FileCount())
} }
func TestSendEnumerateStatusNonBlocking(t *testing.T) { func TestSendEnumerateStatusNonBlocking(t *testing.T) {
@@ -317,37 +294,37 @@ func TestSendStatusNilChannel(t *testing.T) {
 	sendScanStatus(nil, ScanStatus{})
 }
-func TestFileEntryFields(t *testing.T) {
+func TestScannerFileEntryFields(t *testing.T) {
 	fs := afero.NewMemMapFs()
 	now := time.Now().Truncate(time.Second)
-	require.NoError(t, afero.WriteFile(fs, "/test.txt", []byte("content"), 0644))
+	require.NoError(t, afero.WriteFile(fs, "/test.txt", []byte("content"), 0o644))
 	require.NoError(t, fs.Chtimes("/test.txt", now, now))
-	s := NewWithOptions(&Options{Fs: fs})
+	s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
 	require.NoError(t, s.EnumerateFile("/test.txt"))
 	files := s.Files()
 	require.Len(t, files, 1)
 	entry := files[0]
-	assert.Equal(t, "test.txt", entry.Path)
-	assert.Contains(t, entry.AbsPath, "test.txt")
-	assert.Equal(t, int64(7), entry.Size)
+	assert.Equal(t, RelFilePath("test.txt"), entry.Path)
+	assert.Contains(t, string(entry.AbsPath), "test.txt")
+	assert.Equal(t, FileSize(7), entry.Size)
 	// Mtime should be set (within a second of now)
-	assert.WithinDuration(t, now, entry.Mtime, 2*time.Second)
+	assert.WithinDuration(t, now, time.Time(entry.Mtime), 2*time.Second)
 }
-func TestLargeFileEnumeration(t *testing.T) {
+func TestScannerLargeFileEnumeration(t *testing.T) {
 	fs := afero.NewMemMapFs()
-	require.NoError(t, fs.MkdirAll("/testdir", 0755))
+	require.NoError(t, fs.MkdirAll("/testdir", 0o755))
 	// Create 100 files
 	for i := 0; i < 100; i++ {
 		name := "/testdir/" + string(rune('a'+i%26)) + string(rune('0'+i/26%10)) + ".txt"
-		require.NoError(t, afero.WriteFile(fs, name, []byte("data"), 0644))
+		require.NoError(t, afero.WriteFile(fs, name, []byte("data"), 0o644))
 	}
-	s := NewWithOptions(&Options{Fs: fs})
+	s := NewScannerWithOptions(&ScannerOptions{Fs: fs})
 	progress := make(chan EnumerateStatus, 200)
 	err := s.EnumeratePath("/testdir", progress)
@@ -357,6 +334,29 @@ func TestLargeFileEnumeration(t *testing.T) {
 	for range progress {
 	}
-	assert.Equal(t, int64(100), s.FileCount())
-	assert.Equal(t, int64(400), s.TotalBytes()) // 100 * 4 bytes
+	assert.Equal(t, FileCount(100), s.FileCount())
+	assert.Equal(t, FileSize(400), s.TotalBytes()) // 100 * 4 bytes
+}
+func TestIsHiddenPath(t *testing.T) {
+	tests := []struct {
+		path   string
+		hidden bool
+	}{
+		{"file.txt", false},
+		{".hidden", true},
+		{"dir/file.txt", false},
+		{"dir/.hidden", true},
+		{".dir/file.txt", true},
+		{"/absolute/path", false},
+		{"/absolute/.hidden", true},
+		{"./relative", false}, // path.Clean removes leading ./
+		{"a/b/c/.d/e", true},
+	}
+	for _, tt := range tests {
+		t.Run(tt.path, func(t *testing.T) {
+			assert.Equal(t, tt.hidden, IsHiddenPath(tt.path), "IsHiddenPath(%q)", tt.path)
+		})
+	}
 }
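The table in TestIsHiddenPath pins down the expected behavior: a path counts as hidden when any segment starts with a dot, after cleaning. The real IsHiddenPath is not shown in this diff, so the following is only a minimal sketch of a function that would satisfy the table. The name isHiddenPath and its body are assumptions, guided by the test comment about path.Clean.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isHiddenPath is a hypothetical stand-in for IsHiddenPath (not the project's
// actual implementation). It cleans the path, then reports whether any
// slash-separated segment begins with a dot.
//
// Note: path.Clean("") returns ".", which this sketch reports as hidden;
// the real function may treat that case differently.
func isHiddenPath(p string) bool {
	for _, seg := range strings.Split(path.Clean(p), "/") {
		if strings.HasPrefix(seg, ".") {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"file.txt", "dir/.hidden", ".dir/file.txt", "./relative"} {
		fmt.Printf("%q hidden=%v\n", p, isHiddenPath(p))
	}
}
```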


@@ -4,8 +4,10 @@ import (
 	"bytes"
 	"crypto/sha256"
 	"errors"
+	"fmt"
 	"time"
+	"github.com/google/uuid"
 	"github.com/klauspost/compress/zstd"
 	"google.golang.org/protobuf/proto"
 )
@@ -14,11 +16,10 @@ import (
 const MAGIC string = "ZNAVSRFG"
 func newTimestampFromTime(t time.Time) *Timestamp {
-	out := &Timestamp{
+	return &Timestamp{
 		Seconds: t.Unix(),
-		Nanos:   int32(t.UnixNano() - (t.Unix() * 1000000000)),
+		Nanos:   int32(t.Nanosecond()),
 	}
-	return out
 }
 func (m *manifest) generate() error {
@@ -47,14 +48,17 @@ func (m *manifest) generateOuter() error {
 	if m.pbInner == nil {
 		return errors.New("internal error")
 	}
+	// Generate UUID and set on inner message
+	manifestUUID := uuid.New()
+	m.pbInner.Uuid = manifestUUID[:]
 	innerData, err := proto.MarshalOptions{Deterministic: true}.Marshal(m.pbInner)
 	if err != nil {
 		return err
 	}
-	h := sha256.New()
-	h.Write(innerData)
+	// Compress the inner data
 	idc := new(bytes.Buffer)
 	zw, err := zstd.NewWriter(idc, zstd.WithEncoderLevel(zstd.SpeedBestCompression))
 	if err != nil {
@@ -64,16 +68,51 @@ func (m *manifest) generateOuter() error {
 	if err != nil {
 		return err
 	}
 	_ = zw.Close()
-	o := &MFFileOuter{
-		InnerMessage: idc.Bytes(),
+	compressedData := idc.Bytes()
+	// Hash the compressed data for integrity verification before decompression
+	h := sha256.New()
+	if _, err := h.Write(compressedData); err != nil {
+		return err
+	}
+	sha256Hash := h.Sum(nil)
+	m.pbOuter = &MFFileOuter{
+		InnerMessage:    compressedData,
 		Size:            int64(len(innerData)),
-		Sha256:          h.Sum(nil),
+		Sha256:          sha256Hash,
+		Uuid:            manifestUUID[:],
 		Version:         MFFileOuter_VERSION_ONE,
 		CompressionType: MFFileOuter_COMPRESSION_ZSTD,
 	}
-	m.pbOuter = o
+	// Sign the manifest if signing options are provided
+	if m.signingOptions != nil && m.signingOptions.KeyID != "" {
+		sigString, err := m.signatureString()
+		if err != nil {
+			return fmt.Errorf("failed to generate signature string: %w", err)
+		}
+		sig, err := gpgSign([]byte(sigString), m.signingOptions.KeyID)
+		if err != nil {
+			return fmt.Errorf("failed to sign manifest: %w", err)
+		}
+		m.pbOuter.Signature = sig
+		fingerprint, err := gpgGetKeyFingerprint(m.signingOptions.KeyID)
+		if err != nil {
+			return fmt.Errorf("failed to get key fingerprint: %w", err)
+		}
+		m.pbOuter.Signer = fingerprint
+		pubKey, err := gpgExportPublicKey(m.signingOptions.KeyID)
+		if err != nil {
+			return fmt.Errorf("failed to export public key: %w", err)
+		}
+		m.pbOuter.SigningPubKey = pubKey
+	}
 	return nil
 }
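The generateOuter change moves the SHA-256 from the uncompressed inner message to the compressed bytes, so a reader can check integrity before running the decompressor on untrusted input. The reading side is not part of this hunk, so the sketch below is an assumption about how such a check could look. It uses compress/gzip from the standard library as a stand-in for zstd to stay dependency-free, and the names seal, unseal and maxDecompressed are hypothetical. The 256 MB cap mirrors the limit mentioned in the commit log.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"errors"
	"fmt"
	"io"
)

// maxDecompressed caps how many bytes unseal will read from the decompressor.
const maxDecompressed = 256 << 20

// seal compresses inner and returns the compressed bytes together with the
// SHA-256 of those compressed bytes (not of the plaintext).
func seal(inner []byte) (compressed, sum []byte, err error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err = zw.Write(inner); err != nil {
		return nil, nil, err
	}
	if err = zw.Close(); err != nil {
		return nil, nil, err
	}
	h := sha256.Sum256(buf.Bytes())
	return buf.Bytes(), h[:], nil
}

// unseal verifies the hash first and only then decompresses, so corrupted or
// tampered input never reaches the decompressor. io.LimitReader truncates
// silently at the cap; a stricter reader could treat hitting the cap as an error.
func unseal(compressed, sum []byte) ([]byte, error) {
	h := sha256.Sum256(compressed)
	if !bytes.Equal(h[:], sum) {
		return nil, errors.New("sha256 mismatch on compressed data")
	}
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(io.LimitReader(zr, maxDecompressed))
}

func main() {
	c, sum, err := seal([]byte("inner manifest bytes"))
	if err != nil {
		panic(err)
	}
	out, err := unseal(c, sum)
	fmt.Println(string(out), err)
}
```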

mfer/url.go (new file, 53 lines)

@@ -0,0 +1,53 @@
+package mfer
+
+import (
+	"net/url"
+	"strings"
+)
+
+// ManifestURL represents a URL pointing to a manifest file.
+type ManifestURL string
+
+// FileURL represents a URL pointing to a file to be fetched.
+type FileURL string
+
+// BaseURL represents a base URL for constructing file URLs.
+type BaseURL string
+
+// JoinPath safely joins a relative file path to a base URL.
+// The path is properly URL-encoded to prevent path traversal.
+func (b BaseURL) JoinPath(path RelFilePath) (FileURL, error) {
+	base, err := url.Parse(string(b))
+	if err != nil {
+		return "", err
+	}
+
+	// Ensure base path ends with /
+	if !strings.HasSuffix(base.Path, "/") {
+		base.Path += "/"
+	}
+
+	// Parse and encode the relative path
+	ref, err := url.Parse(url.PathEscape(string(path)))
+	if err != nil {
+		return "", err
+	}
+
+	resolved := base.ResolveReference(ref)
+	return FileURL(resolved.String()), nil
+}
+
+// String returns the URL as a string.
+func (b BaseURL) String() string {
+	return string(b)
+}
+
+// String returns the URL as a string.
+func (f FileURL) String() string {
+	return string(f)
+}
+
+// String returns the URL as a string.
+func (m ManifestURL) String() string {
+	return string(m)
+}
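One point to review in JoinPath: url.PathEscape is documented to escape "/" as well, so a nested RelFilePath such as sub/nested.txt appears to be encoded as the single segment sub%2Fnested.txt. That blocks traversal, but it may not match how nested files are served. The sketch below shows one possible alternative, not the project's method: reject empty, "." and ".." segments explicitly, then let net/url escape the path while keeping the separators. The function name joinSegments and its plain string parameters are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// joinSegments is a hypothetical alternative to BaseURL.JoinPath. It keeps
// "/" as a separator, refuses segments that could alter the resolved
// directory, and relies on net/url to percent-encode the remaining characters.
func joinSegments(base, rel string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	// Ensure the base path ends with "/" so the last element is kept.
	if !strings.HasSuffix(u.Path, "/") {
		u.Path += "/"
	}
	// Reject empty, "." and ".." segments instead of escaping the slash.
	for _, seg := range strings.Split(rel, "/") {
		if seg == "" || seg == "." || seg == ".." {
			return "", errors.New("invalid path segment")
		}
	}
	// A URL with only Path set is a relative reference; its escaped form
	// keeps "/" and percent-encodes characters such as spaces.
	ref := &url.URL{Path: rel}
	return u.ResolveReference(ref).String(), nil
}

func main() {
	got, err := joinSegments("https://example.com/files", "sub/my file.txt")
	fmt.Println(got, err)

	_, err = joinSegments("https://example.com/files", "../secret")
	fmt.Println(err)
}
```

If the paths have already passed the ValidatePath rules described in the commit log (relative, forward-slash, no "..", no empty segments), the segment check here is redundant but harmless.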