Compare commits `386b22efb8...e27f8a6c3b` (17 commits):

- `e27f8a6c3b`
- `ca3e29e802`
- `472221a7f6`
- `dacc97d1f6`
- `2f0005bf64`
- `e18ab550ae`
- `07e0fc166a`
- `a6a72faafb`
- `8bb70bc6a9`
- `008f270d90`
- `bbab6e73f4`
- `615eecff79`
- `9b67de016d`
- `3c779465e2`
- `5572a4901f`
- `2adc275278`
- `6d9c07510a`

`.gitignore` (vendored, 2 lines changed)

```diff
@@ -3,6 +3,8 @@
 *.tmp
 *.dockerimage
 /vendor
+vendor.tzst
+modcache.tzst
 
 # Stale files
 .drone.yml
```

`FORMAT.md` (new file, 142 lines)

# .mf File Format Specification

Version 1.0

## Overview

An `.mf` file is a binary manifest that describes a directory tree of files,
including their paths, sizes, and cryptographic checksums. It supports
optional GPG signatures for integrity verification and optional timestamps
for metadata preservation.

## File Structure

An `.mf` file consists of two parts, concatenated:

1. **Magic bytes** (8 bytes): the ASCII string `ZNAVSRFG`
2. **Outer message**: a Protocol Buffers serialized `MFFileOuter` message

There is no length prefix or version byte between the magic and the protobuf
message. The protobuf message extends to the end of the file.

See [`mfer/mf.proto`](mfer/mf.proto) for exact field numbers and types.

## Outer Message (`MFFileOuter`)

The outer message contains:

| Field | Number | Type | Description |
|---|---|---|---|
| `version` | 101 | enum | Must be `VERSION_ONE` (1) |
| `compressionType` | 102 | enum | Compression of `innerMessage`; must be `COMPRESSION_ZSTD` (1) |
| `size` | 103 | int64 | Uncompressed size of `innerMessage` (corruption detection) |
| `sha256` | 104 | bytes | SHA-256 hash of the **compressed** `innerMessage` (corruption detection) |
| `uuid` | 105 | bytes | Random v4 UUID; must match the inner message UUID |
| `innerMessage` | 199 | bytes | Zstd-compressed serialized `MFFile` message |
| `signature` | 201 | bytes (optional) | GPG signature (ASCII-armored or binary) |
| `signer` | 202 | bytes (optional) | Full GPG key ID of the signer |
| `signingPubKey` | 203 | bytes (optional) | Full GPG signing public key |

### SHA-256 Hash

The `sha256` field (104) covers the **compressed** `innerMessage` bytes.
This allows verifying data integrity before decompression.

## Compression

The `innerMessage` field is compressed with [Zstandard (zstd)](https://facebook.github.io/zstd/).
Implementations must enforce a decompression size limit to prevent
decompression bombs. The reference implementation limits decompressed size to
256 MB.

## Inner Message (`MFFile`)

After decompressing `innerMessage`, the result is a serialized `MFFile`
(referred to as the manifest):

| Field | Number | Type | Description |
|---|---|---|---|
| `version` | 100 | enum | Must be `VERSION_ONE` (1) |
| `files` | 101 | repeated `MFFilePath` | List of files in the manifest |
| `uuid` | 102 | bytes | Random v4 UUID; must match outer UUID |
| `createdAt` | 201 | Timestamp (optional) | When the manifest was created |

## File Entries (`MFFilePath`)

Each file entry contains:

| Field | Number | Type | Description |
|---|---|---|---|
| `path` | 1 | string | Relative file path (see Path Rules) |
| `size` | 2 | int64 | File size in bytes |
| `hashes` | 3 | repeated `MFFileChecksum` | At least one hash required |
| `mimeType` | 301 | string (optional) | MIME type |
| `mtime` | 302 | Timestamp (optional) | Modification time |
| `ctime` | 303 | Timestamp (optional) | Change time (inode metadata change) |

Field 304 (`atime`) has been removed from the specification. Access time is
volatile and non-deterministic; it is not useful for integrity verification.

## Path Rules

All `path` values must satisfy these invariants:

- **UTF-8**: paths must be valid UTF-8
- **Forward slashes**: use `/` as the path separator (never `\`)
- **Relative only**: no leading `/`
- **No parent traversal**: no `..` path segments
- **No empty segments**: no `//` sequences
- **No trailing slash**: paths refer to files, not directories

Implementations must validate these invariants when reading and writing
manifests. Paths that violate these rules must be rejected.
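
These invariants translate directly into a validation routine. A standalone sketch (the reference implementation validates during manifest building; `validatePath` is an illustrative name, not its API):

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// validatePath enforces the manifest path rules: valid UTF-8, forward
// slashes only, relative, no "..", no empty segments, no trailing slash.
func validatePath(p string) error {
	switch {
	case p == "":
		return fmt.Errorf("empty path")
	case !utf8.ValidString(p):
		return fmt.Errorf("path is not valid UTF-8")
	case strings.ContainsRune(p, '\\'):
		return fmt.Errorf("backslash in path (use /)")
	case strings.HasPrefix(p, "/"):
		return fmt.Errorf("absolute path not allowed")
	case strings.HasSuffix(p, "/"):
		return fmt.Errorf("trailing slash not allowed")
	case strings.Contains(p, "//"):
		return fmt.Errorf("empty path segment")
	}
	for _, seg := range strings.Split(p, "/") {
		if seg == ".." {
			return fmt.Errorf("parent traversal not allowed")
		}
	}
	return nil
}

func main() {
	fmt.Println(validatePath("docs/readme.md")) // <nil>
	fmt.Println(validatePath("../etc/passwd"))  // parent traversal not allowed
}
```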

## Hash Format (`MFFileChecksum`)

Each checksum is a single `bytes multiHash` field containing a
[multihash](https://multiformats.io/multihash/)-encoded value. Multihash is
self-describing: the encoded bytes include a varint algorithm identifier
followed by a varint digest length followed by the digest itself.

The 1.0 implementation writes SHA-256 multihashes (`0x12` algorithm code).
Implementations must be able to verify SHA-256 multihashes at minimum.
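
For SHA-256 the framing is small enough to handle by hand: the code `0x12` and the length `0x20` (32 bytes) each fit in a single-byte varint, so the encoded value is `0x12 0x20 <digest>`. A sketch of encoding and checking such a value under that assumption (a production implementation would use a multihash library with full varint decoding):

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

const (
	mhSHA256    = 0x12 // multihash algorithm code for sha2-256
	mhSHA256Len = 0x20 // digest length: 32 bytes
)

// sha256Multihash returns the multihash encoding of SHA-256(data):
// <code><length><digest>. Both prefix varints are one byte here.
func sha256Multihash(data []byte) []byte {
	sum := sha256.Sum256(data)
	return append([]byte{mhSHA256, mhSHA256Len}, sum[:]...)
}

// verifySHA256Multihash checks data against an encoded sha2-256 multihash.
func verifySHA256Multihash(mh, data []byte) error {
	if len(mh) != 2+mhSHA256Len || mh[0] != mhSHA256 || mh[1] != mhSHA256Len {
		return fmt.Errorf("not a sha2-256 multihash")
	}
	sum := sha256.Sum256(data)
	if !bytes.Equal(mh[2:], sum[:]) {
		return fmt.Errorf("digest mismatch")
	}
	return nil
}

func main() {
	mh := sha256Multihash([]byte("hello"))
	fmt.Printf("%x\n", mh[:2]) // 1220
	fmt.Println(verifySHA256Multihash(mh, []byte("hello")) == nil) // true
}
```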

## Signature Scheme

Signing is optional. When present, the signature covers a canonical string
constructed as:

```
ZNAVSRFG-<UUID>-<SHA256>
```

Where:

- `ZNAVSRFG` is the magic bytes string (literal ASCII)
- `<UUID>` is the hex-encoded UUID from the outer message
- `<SHA256>` is the hex-encoded SHA-256 hash from the outer message (covering compressed data)

Components are separated by hyphens. The signature is produced by GPG over
this canonical string and stored in the `signature` field of the outer
message.
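
Constructing the string to be signed is then just hex-encoding the two outer fields and joining with hyphens. A sketch with made-up example bytes (the real values come from the decoded `MFFileOuter`; `signatureString` follows the name used elsewhere in this changeset, but this body is illustrative):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// signatureString builds the canonical string that GPG signs:
// ZNAVSRFG-<hex uuid>-<hex sha256 of the compressed innerMessage>.
func signatureString(uuid, sha256sum []byte) string {
	return fmt.Sprintf("ZNAVSRFG-%s-%s",
		hex.EncodeToString(uuid), hex.EncodeToString(sha256sum))
}

func main() {
	uuid := []byte{0xde, 0xad, 0xbe, 0xef} // illustrative; real v4 UUIDs are 16 bytes
	sum := sha256.Sum256([]byte("stand-in compressed inner message"))
	fmt.Println(signatureString(uuid, sum[:]))
}
```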

## Deterministic Serialization

By default, manifests are generated deterministically:

- File entries are sorted by `path` in **lexicographic byte order**
- `createdAt` is omitted unless explicitly requested
- `atime` is never included (field removed from schema)

This ensures that two independent runs over the same directory tree produce
byte-identical `.mf` files (assuming file contents and metadata have not
changed).
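
Lexicographic byte order is exactly what Go's `<` on strings provides, so the sort is a one-liner. A sketch with a hypothetical stand-in entry type (the reference implementation sorts its own protobuf message type):

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a stand-in for the manifest's per-file message.
type entry struct{ Path string }

// sortEntries orders file entries by path in lexicographic byte order,
// which Go's native string comparison implements directly.
func sortEntries(es []entry) {
	sort.Slice(es, func(i, j int) bool { return es[i].Path < es[j].Path })
}

func main() {
	es := []entry{{"b.txt"}, {"a/z.txt"}, {"a.txt"}}
	sortEntries(es)
	for _, e := range es {
		fmt.Println(e.Path) // a.txt, then a/z.txt ('.' 0x2E < '/' 0x2F), then b.txt
	}
}
```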

## MIME Type

The recommended MIME type for `.mf` files is `application/octet-stream`.
The `.mf` file extension is the canonical identifier.

## Reference

- Proto definition: [`mfer/mf.proto`](mfer/mf.proto)
- Reference implementation: [git.eeqj.de/sneak/mfer](https://git.eeqj.de/sneak/mfer)

`README.md` (36 lines changed)

````diff
@@ -9,9 +9,8 @@ cryptographic checksums or signatures over same) to aid in archiving,
 downloading, and streaming, or mirroring. The manifest files' data is
 serialized with Google's [protobuf serialization
 format](https://developers.google.com/protocol-buffers). The structure of
-these files can be found [in the format
-specification](https://git.eeqj.de/sneak/mfer/src/branch/main/mfer/mf.proto)
-which is included in the [project
+these files can be found in the [format specification](FORMAT.md) and the
+[protobuf schema](mfer/mf.proto), both included in the [project
 repository](https://git.eeqj.de/sneak/mfer).
 
 The current version is pre-1.0 and while the repo was published in 2022,
@@ -52,6 +51,37 @@ Reading file contents and computing cryptographic hashes for manifest generation
 - **NO_COLOR:** Respect the `NO_COLOR` environment variable for disabling colored output.
 - **Options pattern:** Use `NewWithOptions(opts *Options)` constructor pattern for configurable types.
 
+# Building
+
+## Prerequisites
+
+- Go 1.21 or later
+- `protoc` (Protocol Buffers compiler) — only needed if modifying `.proto` files
+- `golangci-lint` — for linting (`go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest`)
+- `gofumpt` — for formatting (`go install mvdan.cc/gofumpt@latest`)
+
+## Build
+
+```sh
+# Build the binary
+make bin/mfer
+
+# Run tests
+make test
+
+# Format code
+make fmt
+
+# Lint
+make lint
+```
+
+## Install from source
+
+```sh
+go install sneak.berlin/go/mfer/cmd/mfer@latest
+```
+
 # Build Status
 
 [](https://drone.datavi.be/sneak/mfer)
````

`TODO.md` (42 lines changed)

```diff
@@ -9,76 +9,76 @@
 **1. Should `MFFileChecksum` be simplified?**
 Currently it's a separate message wrapping a single `bytes multiHash` field. Since multihash already self-describes the algorithm, `repeated bytes hashes` directly on `MFFilePath` would be simpler and reduce per-file protobuf overhead. Is the extra message layer intentional (e.g. planning to add per-hash metadata like `verified_at`)?
 
-> *answer:*
+> *answer:* Leave as-is for now.
 
 **2. Should file permissions/mode be stored?**
 The format stores mtime/ctime but not Unix file permissions. For archival use (ExFAT, filesystem-independent checksums) this may not matter, but for software distribution or filesystem restoration it's a gap. Should we reserve a field now (e.g. `optional uint32 mode = 305`) even if we don't populate it yet?
 
-> *answer:*
+> *answer:* No, not right now.
 
 **3. Should `atime` be removed from the schema?**
 Access time is volatile, non-deterministic, and often disabled (`noatime`). Including it means two manifests of the same directory at different times will differ, which conflicts with the determinism goal. Remove it, or document it as "never set by default"?
 
-> *answer:*
+> *answer:* REMOVED — done. Field 304 has been removed from the proto schema.
 
 **4. What are the path normalization rules?**
 The proto has `string path` with no specification about: always forward-slash? Must be relative? No `..` components allowed? UTF-8 NFC vs NFD normalization (macOS vs Linux)? Max path length? This is a security issue (path traversal) and a cross-platform compatibility issue. What rules should the spec mandate?
 
-> *answer:*
+> *answer:* Implemented — UTF-8, forward-slash only, relative paths only, no `..` segments. Documented in FORMAT.md.
 
 **5. Should we add a version byte after the magic?**
 Currently `ZNAVSRFG` is followed immediately by protobuf. Adding a version byte (`ZNAVSRFG\x01`) would allow future framing changes without requiring protobuf parsing to detect the version. `MFFileOuter.Version` serves this purpose but requires successful deserialization to read. Worth the extra byte?
 
-> *answer:*
+> *answer:* No — protobuf handles versioning via the `MFFileOuter.Version` field.
 
 **6. Should we add a length-prefix after the magic?**
 Protobuf is not self-delimiting. If we ever want to concatenate manifests or append data after the protobuf, the current framing is insufficient. Add a varint or fixed-width length-prefix?
 
-> *answer:*
+> *answer:* Not needed now.
 
 ### Signature Design
 
 **7. What does the outer SHA-256 hash cover — compressed or uncompressed data?**
 The review notes it currently hashes compressed data (good for verifying before decompression), but this should be explicitly documented. Which is the intended behavior?
 
-> *answer:*
+> *answer:* Hash covers compressed data. Documented in FORMAT.md.
 
 **8. Should `signatureString()` sign raw bytes instead of a hex-encoded string?**
 Currently the canonical string is `MAGIC-UUID-MULTIHASH` with hex encoding, which adds a transformation layer. Signing the raw `sha256` bytes (or compressed `innerMessage` directly) would be simpler. Keep the string format or switch to raw bytes?
 
-> *answer:*
+> *answer:* Keep string format as-is (established).
 
 **9. Should we support detached signature files (`.mf.sig`)?**
 Embedded signatures are better for single-file distribution. Detached `.mf.sig` files follow the familiar `SHASUMS`/`SHASUMS.asc` pattern and are simpler for HTTP serving. Support both modes?
 
-> *answer:*
+> *answer:* Not for 1.0.
 
 **10. GPG vs pure-Go crypto for signatures?**
 Shelling out to `gpg` is fragile (may not be installed, version-dependent output). `github.com/ProtonMail/go-crypto` provides pure-Go OpenPGP, or we could go Ed25519/signify (simpler, no key management). Which direction?
 
-> *answer:*
+> *answer:* Keep GPG shelling for now (established).
 
 ### Implementation Design
 
 **11. Should manifests be deterministic by default?**
 This means: sort file entries by path, omit `createdAt` timestamp (or make it opt-in), no `atime`. Should determinism be the default, with a `--include-timestamps` flag to opt in?
 
-> *answer:*
+> *answer:* YES — implemented, default behavior.
 
 **12. Should we consolidate or keep both scanner/checker implementations?**
 There are two parallel implementations: `mfer/scanner.go` + `mfer/checker.go` (typed with `FileSize`, `RelFilePath`) and `internal/scanner/` + `internal/checker/` (raw `int64`, `string`). The `mfer/` versions are superior. Delete the `internal/` versions?
 
-> *answer:*
+> *answer:* Consolidated — done (PR#27).
 
 **13. Should the `manifest` type be exported?**
 Currently unexported with exported constructors (`New`, `NewFromPaths`, etc.). Consumers can't declare `var m *mfer.manifest`. Export the type, or define an interface?
 
-> *answer:*
+> *answer:* Keep unexported.
 
 **14. What should the Go module path be for 1.0?**
 Currently mixed between `sneak.berlin/go/mfer` and `git.eeqj.de/sneak/mfer`. Which is canonical?
 
-> *answer:*
+> *answer:* `sneak.berlin/go/mfer`
 
 ---
 
@@ -86,19 +86,19 @@ Currently mixed between `sneak.berlin/go/mfer` and `git.eeqj.de/sneak/mfer`. Whi
 
 ### Phase 1: Foundation (format correctness)
 
-- [ ] Delete `internal/scanner/` and `internal/checker/` — consolidate on `mfer/` package versions; update CLI code
-- [ ] Add deterministic file ordering — sort entries by path (lexicographic, byte-order) in `Builder.Build()`; add test asserting byte-identical output from two runs
-- [ ] Add decompression size limit — `io.LimitReader` in `deserializeInner()` with `m.pbOuter.Size` as bound
+- [x] Delete `internal/scanner/` and `internal/checker/` — consolidate on `mfer/` package versions; update CLI code
+- [x] Add deterministic file ordering — sort entries by path (lexicographic, byte-order) in `Builder.Build()`; add test asserting byte-identical output from two runs
+- [x] Add decompression size limit — `io.LimitReader` in `deserializeInner()` with `m.pbOuter.Size` as bound
 - [ ] Fix `errors.Is` dead code in checker — replace with `os.IsNotExist(err)` or `errors.Is(err, fs.ErrNotExist)`
 - [ ] Fix `AddFile` to verify size — check `totalRead == size` after reading, return error on mismatch
-- [ ] Specify path invariants — add proto comments (UTF-8, forward-slash, relative, no `..`, no leading `/`); validate in `Builder.AddFile` and `Builder.AddFileWithHash`
+- [x] Specify path invariants — add proto comments (UTF-8, forward-slash, relative, no `..`, no leading `/`); validate in `Builder.AddFile` and `Builder.AddFileWithHash`
 
 ### Phase 2: CLI polish
 
 - [ ] Fix flag naming — all CLI flags use kebab-case as primary (`--include-dotfiles`, `--follow-symlinks`)
 - [ ] Fix URL construction in fetch — use `BaseURL.JoinPath()` or `url.JoinPath()` instead of string concatenation
 - [ ] Add progress rate-limiting to Checker — throttle to once per second, matching Scanner
-- [ ] Add `--deterministic` flag (or make it default) — omit `createdAt`, sort files
+- [x] Add `--deterministic` flag (or make it default) — omit `createdAt`, sort files
 
 ### Phase 3: Robustness
 
@@ -109,10 +109,10 @@ Currently mixed between `sneak.berlin/go/mfer` and `git.eeqj.de/sneak/mfer`. Whi
 
 ### Phase 4: Format finalization
 
-- [ ] Remove or deprecate `atime` from proto (pending design question answer)
+- [x] Remove or deprecate `atime` from proto (pending design question answer)
 - [ ] Reserve `optional uint32 mode = 305` in `MFFilePath` for future file permissions
 - [ ] Add version byte after magic — `ZNAVSRFG\x01` for format version 1
-- [ ] Write format specification document — separate from README: magic, outer structure, compression, inner structure, path invariants, signature scheme, canonical serialization
+- [x] Write format specification document — separate from README: magic, outer structure, compression, inner structure, path invariants, signature scheme, canonical serialization
 
 ### Phase 5: Release prep
 
```

`package cli` (manifest check operation)

```diff
@@ -3,6 +3,7 @@ package cli
 import (
 	"encoding/hex"
 	"fmt"
+	"io"
 	"path/filepath"
 	"strings"
 	"time"
@@ -34,29 +35,32 @@ func findManifest(fs afero.Fs, dir string) (string, error) {
 func (mfa *CLIApp) checkManifestOperation(ctx *cli.Context) error {
 	log.Debug("checkManifestOperation()")
 
-	var manifestPath string
-	var err error
-
-	if ctx.Args().Len() > 0 {
-		arg := ctx.Args().Get(0)
-		// Check if arg is a directory or a file
-		info, statErr := mfa.Fs.Stat(arg)
-		if statErr == nil && info.IsDir() {
-			// It's a directory, look for manifest inside
-			manifestPath, err = findManifest(mfa.Fs, arg)
-			if err != nil {
-				return err
-			}
-		} else {
-			// Treat as a file path
-			manifestPath = arg
-		}
-	} else {
-		// No argument, look in current directory
-		manifestPath, err = findManifest(mfa.Fs, ".")
-		if err != nil {
-			return err
-		}
-	}
+	manifestPath, err := mfa.resolveManifestArg(ctx)
+	if err != nil {
+		return fmt.Errorf("check: %w", err)
+	}
+
+	// URL manifests need to be downloaded to a temp file for the checker
+	if isHTTPURL(manifestPath) {
+		rc, fetchErr := mfa.openManifestReader(manifestPath)
+		if fetchErr != nil {
+			return fmt.Errorf("check: %w", fetchErr)
+		}
+		tmpFile, tmpErr := afero.TempFile(mfa.Fs, "", "mfer-manifest-*.mf")
+		if tmpErr != nil {
+			_ = rc.Close()
+			return fmt.Errorf("check: failed to create temp file: %w", tmpErr)
+		}
+		tmpPath := tmpFile.Name()
+		_, cpErr := io.Copy(tmpFile, rc)
+		_ = rc.Close()
+		_ = tmpFile.Close()
+		if cpErr != nil {
+			_ = mfa.Fs.Remove(tmpPath)
+			return fmt.Errorf("check: failed to download manifest: %w", cpErr)
+		}
+		defer func() { _ = mfa.Fs.Remove(tmpPath) }()
+		manifestPath = tmpPath
+	}
 
 	basePath := ctx.String("base")
```

`internal/cli/export.go` (new file, 72 lines)

```go
package cli

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
	"time"

	"github.com/urfave/cli/v2"
	"sneak.berlin/go/mfer/mfer"
)

// ExportEntry represents a single file entry in the exported JSON output.
type ExportEntry struct {
	Path   string   `json:"path"`
	Size   int64    `json:"size"`
	Hashes []string `json:"hashes"`
	Mtime  *string  `json:"mtime,omitempty"`
	Ctime  *string  `json:"ctime,omitempty"`
}

func (mfa *CLIApp) exportManifestOperation(ctx *cli.Context) error {
	pathOrURL, err := mfa.resolveManifestArg(ctx)
	if err != nil {
		return fmt.Errorf("export: %w", err)
	}

	rc, err := mfa.openManifestReader(pathOrURL)
	if err != nil {
		return fmt.Errorf("export: %w", err)
	}
	defer func() { _ = rc.Close() }()

	manifest, err := mfer.NewManifestFromReader(rc)
	if err != nil {
		return fmt.Errorf("export: failed to parse manifest: %w", err)
	}

	files := manifest.Files()
	entries := make([]ExportEntry, 0, len(files))

	for _, f := range files {
		entry := ExportEntry{
			Path:   f.Path,
			Size:   f.Size,
			Hashes: make([]string, 0, len(f.Hashes)),
		}

		for _, h := range f.Hashes {
			entry.Hashes = append(entry.Hashes, hex.EncodeToString(h.MultiHash))
		}

		if f.Mtime != nil {
			t := time.Unix(f.Mtime.Seconds, int64(f.Mtime.Nanos)).UTC().Format(time.RFC3339Nano)
			entry.Mtime = &t
		}
		if f.Ctime != nil {
			t := time.Unix(f.Ctime.Seconds, int64(f.Ctime.Nanos)).UTC().Format(time.RFC3339Nano)
			entry.Ctime = &t
		}

		entries = append(entries, entry)
	}

	enc := json.NewEncoder(mfa.Stdout)
	enc.SetIndent("", " ")
	if err := enc.Encode(entries); err != nil {
		return fmt.Errorf("export: failed to encode JSON: %w", err)
	}

	return nil
}
```

`internal/cli/export_test.go` (new file, 137 lines; view ends mid-file)

```go
package cli

import (
	"bytes"
	"context"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"

	"github.com/spf13/afero"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
	"sneak.berlin/go/mfer/mfer"
)

// buildTestManifest creates a manifest from in-memory files and returns its bytes.
func buildTestManifest(t *testing.T, files map[string][]byte) []byte {
	t.Helper()
	sourceFs := afero.NewMemMapFs()
	for path, content := range files {
		require.NoError(t, sourceFs.MkdirAll("/", 0o755))
		require.NoError(t, afero.WriteFile(sourceFs, "/"+path, content, 0o644))
	}

	opts := &mfer.ScannerOptions{Fs: sourceFs}
	s := mfer.NewScannerWithOptions(opts)
	require.NoError(t, s.EnumerateFS(sourceFs, "/", nil))

	var buf bytes.Buffer
	require.NoError(t, s.ToManifest(context.Background(), &buf, nil))
	return buf.Bytes()
}

func TestExportManifestOperation(t *testing.T) {
	testFiles := map[string][]byte{
		"hello.txt":    []byte("Hello, World!"),
		"sub/file.txt": []byte("nested content"),
	}
	manifestData := buildTestManifest(t, testFiles)

	// Write manifest to memfs
	fs := afero.NewMemMapFs()
	require.NoError(t, afero.WriteFile(fs, "/test.mf", manifestData, 0o644))

	var stdout, stderr bytes.Buffer
	exitCode := RunWithOptions(&RunOptions{
		Appname: "mfer",
		Args:    []string{"mfer", "export", "/test.mf"},
		Stdin:   &bytes.Buffer{},
		Stdout:  &stdout,
		Stderr:  &stderr,
		Fs:      fs,
	})

	require.Equal(t, 0, exitCode, "stderr: %s", stderr.String())

	var entries []ExportEntry
	require.NoError(t, json.Unmarshal(stdout.Bytes(), &entries))
	assert.Len(t, entries, 2)

	// Verify entries have expected fields
	pathSet := make(map[string]bool)
	for _, e := range entries {
		pathSet[e.Path] = true
		assert.NotEmpty(t, e.Hashes, "entry %s should have hashes", e.Path)
		assert.Greater(t, e.Size, int64(0), "entry %s should have positive size", e.Path)
	}
	assert.True(t, pathSet["hello.txt"])
	assert.True(t, pathSet["sub/file.txt"])
}

func TestExportFromHTTPURL(t *testing.T) {
	testFiles := map[string][]byte{
		"a.txt": []byte("aaa"),
	}
	manifestData := buildTestManifest(t, testFiles)

	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/octet-stream")
		_, _ = w.Write(manifestData)
	}))
	defer server.Close()

	var stdout, stderr bytes.Buffer
	exitCode := RunWithOptions(&RunOptions{
		Appname: "mfer",
		Args:    []string{"mfer", "export", server.URL + "/index.mf"},
		Stdin:   &bytes.Buffer{},
		Stdout:  &stdout,
		Stderr:  &stderr,
		Fs:      afero.NewMemMapFs(),
	})

	require.Equal(t, 0, exitCode, "stderr: %s", stderr.String())

	var entries []ExportEntry
	require.NoError(t, json.Unmarshal(stdout.Bytes(), &entries))
	assert.Len(t, entries, 1)
	assert.Equal(t, "a.txt", entries[0].Path)
}

func TestListFromHTTPURL(t *testing.T) {
	testFiles := map[string][]byte{
		"one.txt": []byte("1"),
		"two.txt": []byte("22"),
	}
	manifestData := buildTestManifest(t, testFiles)

	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, _ = w.Write(manifestData)
	}))
	defer server.Close()

	var stdout, stderr bytes.Buffer
	exitCode := RunWithOptions(&RunOptions{
		Appname: "mfer",
		Args:    []string{"mfer", "list", server.URL + "/index.mf"},
		Stdin:   &bytes.Buffer{},
		Stdout:  &stdout,
		Stderr:  &stderr,
		Fs:      afero.NewMemMapFs(),
	})

	require.Equal(t, 0, exitCode, "stderr: %s", stderr.String())
```
|
||||||
|
output := stdout.String()
|
||||||
|
assert.Contains(t, output, "one.txt")
|
||||||
|
assert.Contains(t, output, "two.txt")
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestIsHTTPURL(t *testing.T) {
|
||||||
|
assert.True(t, isHTTPURL("http://example.com/manifest.mf"))
|
||||||
|
assert.True(t, isHTTPURL("https://example.com/manifest.mf"))
|
||||||
|
assert.False(t, isHTTPURL("/local/path.mf"))
|
||||||
|
assert.False(t, isHTTPURL("relative/path.mf"))
|
||||||
|
assert.False(t, isHTTPURL("ftp://example.com/file"))
|
||||||
|
}
|
||||||
@@ -67,7 +67,7 @@ func (mfa *CLIApp) fetchManifestOperation(ctx *cli.Context) error {
 	// Compute base URL (directory containing manifest)
 	baseURL, err := url.Parse(manifestURL)
 	if err != nil {
-		return err
+		return fmt.Errorf("fetch: invalid manifest URL: %w", err)
 	}
 	baseURL.Path = path.Dir(baseURL.Path)
 	if !strings.HasSuffix(baseURL.Path, "/") {
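The base-URL derivation in this hunk can be exercised standalone. The sketch below shows the same steps (parse the manifest URL, take `path.Dir` of its path, ensure a trailing slash); the helper name `baseURLFor` is illustrative and not from the codebase.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// baseURLFor derives the directory URL that sibling files listed in a
// manifest would be resolved against.
func baseURLFor(manifestURL string) (string, error) {
	u, err := url.Parse(manifestURL)
	if err != nil {
		return "", fmt.Errorf("invalid manifest URL: %w", err)
	}
	u.Path = path.Dir(u.Path) // drop the manifest filename
	if !strings.HasSuffix(u.Path, "/") {
		u.Path += "/"
	}
	return u.String(), nil
}

func main() {
	b, err := baseURLFor("https://example.com/releases/v1/index.mf")
	if err != nil {
		panic(err)
	}
	fmt.Println(b) // https://example.com/releases/v1/
}
```

Relative entries like `sub/file.txt` can then be joined onto that base with `url.Parse`/`ResolveReference`.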
@@ -267,7 +267,7 @@ func downloadFile(fileURL, localPath string, entry *mfer.MFFilePath, progress ch
 	dir := filepath.Dir(localPath)
 	if dir != "" && dir != "." {
 		if err := os.MkdirAll(dir, 0o755); err != nil {
-			return err
+			return fmt.Errorf("failed to create directory %s: %w", dir, err)
 		}
 	}
 
@@ -287,9 +287,9 @@ func downloadFile(fileURL, localPath string, entry *mfer.MFFilePath, progress ch
 	}
 
 	// Fetch file
-	resp, err := http.Get(fileURL)
+	resp, err := http.Get(fileURL) //nolint:gosec // URL constructed from manifest base
 	if err != nil {
-		return err
+		return fmt.Errorf("HTTP request failed: %w", err)
 	}
 	defer func() { _ = resp.Body.Close() }()
 
@@ -307,7 +307,7 @@ func downloadFile(fileURL, localPath string, entry *mfer.MFFilePath, progress ch
 	// Create temp file
 	out, err := os.Create(tmpPath)
 	if err != nil {
-		return err
+		return fmt.Errorf("failed to create temp file: %w", err)
 	}
 
 	// Set up hash computation
@@ -41,8 +41,8 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
 
 	basePath := ctx.String("base")
 	showProgress := ctx.Bool("progress")
-	includeDotfiles := ctx.Bool("IncludeDotfiles")
-	followSymlinks := ctx.Bool("FollowSymLinks")
+	includeDotfiles := ctx.Bool("include-dotfiles")
+	followSymlinks := ctx.Bool("follow-symlinks")
 
 	// Find manifest file
 	var manifestPath string
@@ -54,7 +54,7 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
 		if statErr == nil && info.IsDir() {
 			manifestPath, err = findManifest(mfa.Fs, arg)
 			if err != nil {
-				return err
+				return fmt.Errorf("freshen: %w", err)
 			}
 		} else {
 			manifestPath = arg
@@ -62,7 +62,7 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
 	} else {
 		manifestPath, err = findManifest(mfa.Fs, ".")
 		if err != nil {
-			return err
+			return fmt.Errorf("freshen: %w", err)
 		}
 	}
 
@@ -93,7 +93,7 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
 
 	absBase, err := filepath.Abs(basePath)
 	if err != nil {
-		return err
+		return fmt.Errorf("freshen: invalid base path: %w", err)
 	}
 
 	err = afero.Walk(mfa.Fs, absBase, func(path string, info fs.FileInfo, walkErr error) error {
@@ -104,7 +104,7 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
 		// Get relative path
 		relPath, err := filepath.Rel(absBase, path)
 		if err != nil {
-			return err
+			return fmt.Errorf("freshen: failed to compute relative path for %s: %w", path, err)
 		}
 
 		// Skip the manifest file itself
@@ -226,6 +226,9 @@ func (mfa *CLIApp) freshenManifestOperation(ctx *cli.Context) error {
 	var hashedBytes int64
 
 	builder := mfer.NewBuilder()
+	if ctx.Bool("include-timestamps") {
+		builder.SetIncludeTimestamps(true)
+	}
 
 	// Set up signing options if sign-key is provided
 	if signKey := ctx.String("sign-key"); signKey != "" {
@@ -20,11 +20,18 @@ func (mfa *CLIApp) generateManifestOperation(ctx *cli.Context) error {
 	log.Debug("generateManifestOperation()")
 
 	opts := &mfer.ScannerOptions{
-		IncludeDotfiles: ctx.Bool("IncludeDotfiles"),
-		FollowSymLinks:  ctx.Bool("FollowSymLinks"),
+		IncludeDotfiles:   ctx.Bool("include-dotfiles"),
+		FollowSymLinks:    ctx.Bool("follow-symlinks"),
+		IncludeTimestamps: ctx.Bool("include-timestamps"),
 		Fs:                mfa.Fs,
 	}
 
+	// Set seed for deterministic UUID if provided
+	if seed := ctx.String("seed"); seed != "" {
+		opts.Seed = seed
+		log.Infof("using deterministic seed for manifest UUID")
+	}
+
 	// Set up signing options if sign-key is provided
 	if signKey := ctx.String("sign-key"); signKey != "" {
 		opts.SigningOptions = &mfer.SigningOptions{
@@ -59,7 +66,7 @@ func (mfa *CLIApp) generateManifestOperation(ctx *cli.Context) error {
 	if args.Len() == 0 {
 		// Default to current directory
 		if err := s.EnumeratePath(".", enumProgress); err != nil {
-			return err
+			return fmt.Errorf("generate: failed to enumerate current directory: %w", err)
 		}
 	} else {
 		// Collect and validate all paths first
@@ -68,7 +75,7 @@ func (mfa *CLIApp) generateManifestOperation(ctx *cli.Context) error {
 			inputPath := args.Get(i)
 			ap, err := filepath.Abs(inputPath)
 			if err != nil {
-				return err
+				return fmt.Errorf("generate: invalid path %q: %w", inputPath, err)
 			}
 			// Validate path exists before adding to list
 			if exists, _ := afero.Exists(mfa.Fs, ap); !exists {
@@ -78,7 +85,7 @@ func (mfa *CLIApp) generateManifestOperation(ctx *cli.Context) error {
 			paths = append(paths, ap)
 		}
 		if err := s.EnumeratePaths(enumProgress, paths...); err != nil {
-			return err
+			return fmt.Errorf("generate: failed to enumerate paths: %w", err)
 		}
 	}
 	enumWg.Wait()
@@ -16,32 +16,20 @@ func (mfa *CLIApp) listManifestOperation(ctx *cli.Context) error {
 	longFormat := ctx.Bool("long")
 	print0 := ctx.Bool("print0")
 
-	// Find manifest file
-	var manifestPath string
-	var err error
-
-	if ctx.Args().Len() > 0 {
-		arg := ctx.Args().Get(0)
-		info, statErr := mfa.Fs.Stat(arg)
-		if statErr == nil && info.IsDir() {
-			manifestPath, err = findManifest(mfa.Fs, arg)
-			if err != nil {
-				return err
-			}
-		} else {
-			manifestPath = arg
-		}
-	} else {
-		manifestPath, err = findManifest(mfa.Fs, ".")
-		if err != nil {
-			return err
-		}
-	}
-
-	// Load manifest
-	manifest, err := mfer.NewManifestFromFile(mfa.Fs, manifestPath)
-	if err != nil {
-		return fmt.Errorf("failed to load manifest: %w", err)
+	pathOrURL, err := mfa.resolveManifestArg(ctx)
+	if err != nil {
+		return fmt.Errorf("list: %w", err)
+	}
+
+	rc, err := mfa.openManifestReader(pathOrURL)
+	if err != nil {
+		return fmt.Errorf("list: %w", err)
+	}
+	defer func() { _ = rc.Close() }()
+
+	manifest, err := mfer.NewManifestFromReader(rc)
+	if err != nil {
+		return fmt.Errorf("list: failed to parse manifest: %w", err)
 	}
 
 	files := manifest.Files()
internal/cli/manifest_loader.go (new file, 54 lines)
@@ -0,0 +1,54 @@
+package cli
+
+import (
+	"fmt"
+	"io"
+	"net/http"
+	"strings"
+
+	"github.com/urfave/cli/v2"
+)
+
+// isHTTPURL returns true if the string starts with http:// or https://.
+func isHTTPURL(s string) bool {
+	return strings.HasPrefix(s, "http://") || strings.HasPrefix(s, "https://")
+}
+
+// openManifestReader opens a manifest from a path or URL and returns a ReadCloser.
+// The caller must close the returned reader.
+func (mfa *CLIApp) openManifestReader(pathOrURL string) (io.ReadCloser, error) {
+	if isHTTPURL(pathOrURL) {
+		resp, err := http.Get(pathOrURL) //nolint:gosec // user-provided URL is intentional
+		if err != nil {
+			return nil, fmt.Errorf("failed to fetch %s: %w", pathOrURL, err)
+		}
+		if resp.StatusCode != http.StatusOK {
+			_ = resp.Body.Close()
+			return nil, fmt.Errorf("failed to fetch %s: HTTP %d", pathOrURL, resp.StatusCode)
+		}
+		return resp.Body, nil
+	}
+	f, err := mfa.Fs.Open(pathOrURL)
+	if err != nil {
+		return nil, err
+	}
+	return f, nil
+}
+
+// resolveManifestArg resolves the manifest path from CLI arguments.
+// HTTP(S) URLs are returned as-is. Directories are searched for index.mf/.index.mf.
+// If no argument is given, the current directory is searched.
+func (mfa *CLIApp) resolveManifestArg(ctx *cli.Context) (string, error) {
+	if ctx.Args().Len() > 0 {
+		arg := ctx.Args().Get(0)
+		if isHTTPURL(arg) {
+			return arg, nil
+		}
+		info, statErr := mfa.Fs.Stat(arg)
+		if statErr == nil && info.IsDir() {
+			return findManifest(mfa.Fs, arg)
+		}
+		return arg, nil
+	}
+	return findManifest(mfa.Fs, ".")
+}
@@ -123,13 +123,13 @@ func (mfa *CLIApp) run(args []string) {
 				},
 				Flags: append(commonFlags(),
 					&cli.BoolFlag{
-						Name:    "FollowSymLinks",
-						Aliases: []string{"follow-symlinks"},
+						Name:    "follow-symlinks",
+						Aliases: []string{"L"},
 						Usage:   "Resolve encountered symlinks",
 					},
 					&cli.BoolFlag{
-						Name:    "IncludeDotfiles",
-						Aliases: []string{"include-dotfiles"},
+						Name:    "include-dotfiles",
 						Usage:   "Include dot (hidden) files (excluded by default)",
 					},
 					&cli.StringFlag{
@@ -154,6 +154,15 @@ func (mfa *CLIApp) run(args []string) {
 						Usage:   "GPG key ID to sign the manifest with",
 						EnvVars: []string{"MFER_SIGN_KEY"},
 					},
+					&cli.StringFlag{
+						Name:    "seed",
+						Usage:   "Seed value for deterministic manifest UUID",
+						EnvVars: []string{"MFER_SEED"},
+					},
+					&cli.BoolFlag{
+						Name:  "include-timestamps",
+						Usage: "Include createdAt timestamp in manifest (omitted by default for determinism)",
+					},
 				),
 			},
 			{
@@ -206,13 +215,13 @@ func (mfa *CLIApp) run(args []string) {
 						Usage: "Base directory for resolving relative paths",
 					},
 					&cli.BoolFlag{
-						Name:    "FollowSymLinks",
-						Aliases: []string{"follow-symlinks"},
+						Name:    "follow-symlinks",
+						Aliases: []string{"L"},
 						Usage:   "Resolve encountered symlinks",
 					},
 					&cli.BoolFlag{
-						Name:    "IncludeDotfiles",
-						Aliases: []string{"include-dotfiles"},
+						Name:    "include-dotfiles",
 						Usage:   "Include dot (hidden) files (excluded by default)",
 					},
 					&cli.BoolFlag{
@@ -226,8 +235,20 @@ func (mfa *CLIApp) run(args []string) {
 						Usage:   "GPG key ID to sign the manifest with",
 						EnvVars: []string{"MFER_SIGN_KEY"},
 					},
+					&cli.BoolFlag{
+						Name:  "include-timestamps",
+						Usage: "Include createdAt timestamp in manifest (omitted by default for determinism)",
+					},
 				),
 			},
+			{
+				Name:      "export",
+				Usage:     "Export manifest contents as JSON",
+				ArgsUsage: "[manifest file or URL]",
+				Action: func(c *cli.Context) error {
+					return mfa.exportManifestOperation(c)
+				},
+			},
 			{
 				Name:  "version",
 				Usage: "Show version",
@@ -269,7 +290,7 @@ func (mfa *CLIApp) run(args []string) {
 		},
 	}
 
-	mfa.app.HideVersion = true
+	mfa.app.HideVersion = false
 	err := mfa.app.Run(args)
 	if err != nil {
 		mfa.exitCode = 1
@@ -5,6 +5,7 @@ import (
 	"errors"
 	"fmt"
 	"io"
+	"sort"
 	"strings"
 	"sync"
 	"time"
@@ -87,7 +88,17 @@ type Builder struct {
 	mu        sync.Mutex
 	files     []*MFFilePath
 	createdAt time.Time
+	includeTimestamps bool
 	signingOptions *SigningOptions
+	fixedUUID      []byte // if set, use this UUID instead of generating one
+}
+
+// SetSeed derives a deterministic UUID from the given seed string.
+// The seed is hashed once with SHA-256 and the first 16 bytes are used
+// as a fixed UUID for the manifest.
+func (b *Builder) SetSeed(seed string) {
+	hash := sha256.Sum256([]byte(seed))
+	b.fixedUUID = hash[:16]
 }
 
 // NewBuilder creates a new Builder.
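The `SetSeed` derivation above is small enough to demonstrate on its own. This sketch (the helper name `uuidFromSeed` is hypothetical) shows the same SHA-256-then-truncate step and why equal seeds always yield the same 16-byte identifier:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// uuidFromSeed hashes the seed once with SHA-256 and keeps the first
// 16 bytes, mirroring Builder.SetSeed: a pure function of the seed.
func uuidFromSeed(seed string) []byte {
	hash := sha256.Sum256([]byte(seed))
	return hash[:16]
}

func main() {
	a := uuidFromSeed("test-seed-value")
	b := uuidFromSeed("test-seed-value")
	c := uuidFromSeed("different-seed")
	fmt.Println(len(a), string(a) == string(b), string(a) == string(c))
	// 16-byte result; same seed matches, different seed does not
}
```

Because the UUID is a pure function of the seed, two runs of `mfer generate --seed X` over identical trees can produce identical manifests.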
@@ -185,7 +196,7 @@ func (b *Builder) FileCount() int {
 // Returns an error if path is empty, size is negative, or hash is nil/empty.
 func (b *Builder) AddFileWithHash(path RelFilePath, size FileSize, mtime ModTime, hash Multihash) error {
 	if err := ValidatePath(string(path)); err != nil {
-		return err
+		return fmt.Errorf("add file: %w", err)
 	}
 	if size < 0 {
 		return errors.New("size cannot be negative")
@@ -209,6 +220,14 @@ func (b *Builder) AddFileWithHash(path RelFilePath, size FileSize, mtime ModTime
 	return nil
 }
 
+// SetIncludeTimestamps controls whether the manifest includes a createdAt timestamp.
+// By default timestamps are omitted for deterministic output.
+func (b *Builder) SetIncludeTimestamps(include bool) {
+	b.mu.Lock()
+	defer b.mu.Unlock()
+	b.includeTimestamps = include
+}
+
 // SetSigningOptions sets the GPG signing options for the manifest.
 // If opts is non-nil, the manifest will be signed when Build() is called.
 func (b *Builder) SetSigningOptions(opts *SigningOptions) {
@@ -222,30 +241,41 @@ func (b *Builder) Build(w io.Writer) error {
 	b.mu.Lock()
 	defer b.mu.Unlock()
 
+	// Sort files by path for deterministic output
+	sort.Slice(b.files, func(i, j int) bool {
+		return b.files[i].Path < b.files[j].Path
+	})
+
 	// Create inner manifest
 	inner := &MFFile{
 		Version: MFFile_VERSION_ONE,
-		CreatedAt: newTimestampFromTime(b.createdAt),
 		Files:   b.files,
 	}
+	if b.includeTimestamps {
+		inner.CreatedAt = newTimestampFromTime(b.createdAt)
+	}
 
 	// Create a temporary manifest to use existing serialization
 	m := &manifest{
 		pbInner:        inner,
 		signingOptions: b.signingOptions,
+		fixedUUID:      b.fixedUUID,
 	}
 
 	// Generate outer wrapper
 	if err := m.generateOuter(); err != nil {
-		return err
+		return fmt.Errorf("build: generate outer: %w", err)
 	}
 
 	// Generate final output
 	if err := m.generate(); err != nil {
-		return err
+		return fmt.Errorf("build: generate: %w", err)
 	}
 
 	// Write to output
 	_, err := w.Write(m.output.Bytes())
-	return err
+	if err != nil {
+		return fmt.Errorf("build: write output: %w", err)
+	}
+	return nil
 }
@@ -115,6 +115,207 @@ func TestNewTimestampFromTimeExtremeDate(t *testing.T) {
 	}
 }
 
+func TestBuilderDeterministicOutput(t *testing.T) {
+	buildManifest := func() []byte {
+		b := NewBuilder()
+		// Use a fixed createdAt and UUID so output is reproducible
+		b.createdAt = time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
+		b.fixedUUID = make([]byte, 16) // all zeros
+
+		mtime := ModTime(time.Date(2025, 6, 1, 0, 0, 0, 0, time.UTC))
+
+		// Add files in reverse order to test sorting
+		files := []struct {
+			path    string
+			content string
+		}{
+			{"c/file.txt", "content c"},
+			{"a/file.txt", "content a"},
+			{"b/file.txt", "content b"},
+		}
+		for _, f := range files {
+			r := bytes.NewReader([]byte(f.content))
+			_, err := b.AddFile(RelFilePath(f.path), FileSize(len(f.content)), mtime, r, nil)
+			require.NoError(t, err)
+		}
+
+		var buf bytes.Buffer
+		err := b.Build(&buf)
+		require.NoError(t, err)
+		return buf.Bytes()
+	}
+
+	out1 := buildManifest()
+	out2 := buildManifest()
+	assert.Equal(t, out1, out2, "two builds with same input should produce byte-identical output")
+}
+
+func TestSetSeedDeterministic(t *testing.T) {
+	b1 := NewBuilder()
+	b1.SetSeed("test-seed-value")
+	b2 := NewBuilder()
+	b2.SetSeed("test-seed-value")
+	assert.Equal(t, b1.fixedUUID, b2.fixedUUID, "same seed should produce same UUID")
+	assert.Len(t, b1.fixedUUID, 16, "UUID should be 16 bytes")
+
+	b3 := NewBuilder()
+	b3.SetSeed("different-seed")
+	assert.NotEqual(t, b1.fixedUUID, b3.fixedUUID, "different seeds should produce different UUIDs")
+}
+
+func TestValidatePath(t *testing.T) {
+	valid := []string{
+		"file.txt",
+		"dir/file.txt",
+		"a/b/c/d.txt",
+		"file with spaces.txt",
+		"日本語.txt",
+	}
+	for _, p := range valid {
+		t.Run("valid:"+p, func(t *testing.T) {
+			assert.NoError(t, ValidatePath(p))
+		})
+	}
+
+	invalid := []struct {
+		path string
+		desc string
+	}{
+		{"", "empty"},
+		{"/absolute", "absolute path"},
+		{"has\\backslash", "backslash"},
+		{"has/../traversal", "dot-dot segment"},
+		{"has//double", "empty segment"},
+		{"..", "just dot-dot"},
+		{string([]byte{0xff, 0xfe}), "invalid UTF-8"},
+	}
+	for _, tt := range invalid {
+		t.Run("invalid:"+tt.desc, func(t *testing.T) {
+			assert.Error(t, ValidatePath(tt.path))
+		})
+	}
+}
+
+func TestBuilderAddFileSizeMismatch(t *testing.T) {
+	b := NewBuilder()
+	content := []byte("short")
+	reader := bytes.NewReader(content)
+
+	// Declare wrong size
+	_, err := b.AddFile("test.txt", FileSize(100), ModTime(time.Now()), reader, nil)
+	assert.Error(t, err)
+	assert.Contains(t, err.Error(), "size mismatch")
+}
+
+func TestBuilderAddFileInvalidPath(t *testing.T) {
+	b := NewBuilder()
+	content := []byte("data")
+	reader := bytes.NewReader(content)
+
+	_, err := b.AddFile("", FileSize(len(content)), ModTime(time.Now()), reader, nil)
+	assert.Error(t, err)
+
+	reader.Reset(content)
+	_, err = b.AddFile("/absolute", FileSize(len(content)), ModTime(time.Now()), reader, nil)
+	assert.Error(t, err)
+}
+
+func TestBuilderAddFileWithProgress(t *testing.T) {
+	b := NewBuilder()
+	content := bytes.Repeat([]byte("x"), 1000)
+	reader := bytes.NewReader(content)
+	progress := make(chan FileHashProgress, 100)
+
+	bytesRead, err := b.AddFile("test.txt", FileSize(len(content)), ModTime(time.Now()), reader, progress)
+	close(progress)
+	require.NoError(t, err)
+	assert.Equal(t, FileSize(1000), bytesRead)
+
+	var updates []FileHashProgress
+	for p := range progress {
+		updates = append(updates, p)
+	}
+	assert.NotEmpty(t, updates)
+	// Last update should show all bytes
+	assert.Equal(t, FileSize(1000), updates[len(updates)-1].BytesRead)
+}
+
+func TestBuilderBuildRoundTrip(t *testing.T) {
+	// Build a manifest, deserialize it, verify all fields survive round-trip
+	b := NewBuilder()
+	now := time.Date(2025, 6, 15, 12, 0, 0, 0, time.UTC)
+
+	files := []struct {
+		path    string
+		content []byte
+	}{
+		{"alpha.txt", []byte("alpha content")},
+		{"beta/gamma.txt", []byte("gamma content")},
+		{"beta/delta.txt", []byte("delta content")},
+	}
+
+	for _, f := range files {
+		reader := bytes.NewReader(f.content)
+		_, err := b.AddFile(RelFilePath(f.path), FileSize(len(f.content)), ModTime(now), reader, nil)
+		require.NoError(t, err)
+	}
+
+	var buf bytes.Buffer
+	require.NoError(t, b.Build(&buf))
+
+	m, err := NewManifestFromReader(&buf)
+	require.NoError(t, err)
+
+	mfiles := m.Files()
+	require.Len(t, mfiles, 3)
+
+	// Verify sorted order
+	assert.Equal(t, "alpha.txt", mfiles[0].Path)
+	assert.Equal(t, "beta/delta.txt", mfiles[1].Path)
+	assert.Equal(t, "beta/gamma.txt", mfiles[2].Path)
+
+	// Verify sizes
+	assert.Equal(t, int64(len("alpha content")), mfiles[0].Size)
+
+	// Verify hashes are present
+	for _, f := range mfiles {
+		require.NotEmpty(t, f.Hashes, "file %s should have hashes", f.Path)
+		assert.NotEmpty(t, f.Hashes[0].MultiHash)
+	}
+}
+
+func TestNewManifestFromReaderInvalidMagic(t *testing.T) {
+	_, err := NewManifestFromReader(bytes.NewReader([]byte("NOT_VALID")))
+	assert.Error(t, err)
+	assert.Contains(t, err.Error(), "invalid file format")
+}
+
+func TestNewManifestFromReaderEmpty(t *testing.T) {
+	_, err := NewManifestFromReader(bytes.NewReader([]byte{}))
+	assert.Error(t, err)
+}
+
+func TestNewManifestFromReaderTruncated(t *testing.T) {
+	// Just the magic with nothing after
+	_, err := NewManifestFromReader(bytes.NewReader([]byte(MAGIC)))
+	assert.Error(t, err)
+}
+
+func TestManifestString(t *testing.T) {
+	b := NewBuilder()
+	content := []byte("test")
+	reader := bytes.NewReader(content)
+	_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime(time.Now()), reader, nil)
+	require.NoError(t, err)
+
+	var buf bytes.Buffer
+	require.NoError(t, b.Build(&buf))
+
+	m, err := NewManifestFromReader(&buf)
+	require.NoError(t, err)
+	assert.Contains(t, m.String(), "count=1")
+}
+
 func TestBuilderBuildEmpty(t *testing.T) {
 	b := NewBuilder()
 
@ -125,3 +326,62 @@ func TestBuilderBuildEmpty(t *testing.T) {
|
|||||||
// Should still produce valid manifest with 0 files
|
// Should still produce valid manifest with 0 files
|
||||||
assert.True(t, strings.HasPrefix(buf.String(), MAGIC))
|
assert.True(t, strings.HasPrefix(buf.String(), MAGIC))
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func TestBuilderOmitsCreatedAtByDefault(t *testing.T) {
|
||||||
|
b := NewBuilder()
|
||||||
|
content := []byte("hello")
|
||||||
|
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime(time.Now()), bytes.NewReader(content), nil)
|
||||||
|
require.NoError(t, err)
|
||||||
|
|
||||||
|
var buf bytes.Buffer
|
||||||
|
require.NoError(t, b.Build(&buf))
|
||||||
|
|
||||||
|
m, err := NewManifestFromReader(&buf)
|
||||||
|
require.NoError(t, err)
|
||||||
|
assert.Nil(t, m.pbInner.CreatedAt, "createdAt should be nil by default for deterministic output")
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestBuilderIncludesCreatedAtWhenRequested(t *testing.T) {
|
||||||
|
b := NewBuilder()
|
||||||
|
b.SetIncludeTimestamps(true)
|
||||||
|
content := []byte("hello")
|
||||||
|
_, err := b.AddFile("test.txt", FileSize(len(content)), ModTime(time.Now()), bytes.NewReader(content), nil)
|
||||||
|
require.NoError(t, err)
|
||||||
|
|
||||||
|
var buf bytes.Buffer
|
||||||
|
require.NoError(t, b.Build(&buf))
|
||||||
|
|
||||||
|
m, err := NewManifestFromReader(&buf)
|
||||||
|
require.NoError(t, err)
|
||||||
|
assert.NotNil(t, m.pbInner.CreatedAt, "createdAt should be set when IncludeTimestamps is true")
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestBuilderDeterministicFileOrder(t *testing.T) {
|
||||||
|
// Two builds with same files in different order should produce same file ordering.
|
||||||
|
// Note: UUIDs differ per build, so we compare parsed file lists, not raw bytes.
|
||||||
|
buildAndParse := func(order []string) []*MFFilePath {
|
||||||
|
b := NewBuilder()
|
||||||
|
for _, name := range order {
|
||||||
|
content := []byte("content of " + name)
|
||||||
|
_, err := b.AddFile(RelFilePath(name), FileSize(len(content)), ModTime(time.Unix(1000, 0)), bytes.NewReader(content), nil)
|
||||||
|
require.NoError(t, err)
|
||||||
|
}
|
||||||
|
var buf bytes.Buffer
|
||||||
|
require.NoError(t, b.Build(&buf))
|
||||||
|
m, err := NewManifestFromReader(&buf)
|
||||||
|
require.NoError(t, err)
|
||||||
|
return m.Files()
|
||||||
|
}
|
||||||
|
|
||||||
|
files1 := buildAndParse([]string{"b.txt", "a.txt"})
|
||||||
|
files2 := buildAndParse([]string{"a.txt", "b.txt"})
|
||||||
|
|
||||||
|
require.Len(t, files1, 2)
|
||||||
|
require.Len(t, files2, 2)
|
||||||
|
for i := range files1 {
|
||||||
|
assert.Equal(t, files1[i].Path, files2[i].Path)
|
||||||
|
assert.Equal(t, files1[i].Size, files2[i].Size)
|
||||||
|
}
|
||||||
|
assert.Equal(t, "a.txt", files1[0].Path)
|
||||||
|
assert.Equal(t, "b.txt", files1[1].Path)
|
||||||
|
}
|
||||||
|
@@ -70,6 +70,8 @@ type Checker struct {
 	fs afero.Fs
 	// manifestPaths is a set of paths in the manifest for quick lookup
 	manifestPaths map[RelFilePath]struct{}
+	// manifestRelPath is the relative path of the manifest file from basePath (for exclusion)
+	manifestRelPath RelFilePath
 	// signature info from the manifest
 	signature []byte
 	signer    []byte

@@ -100,11 +102,22 @@ func NewChecker(manifestPath string, basePath string, fs afero.Fs) (*Checker, er
 		manifestPaths[RelFilePath(f.Path)] = struct{}{}
 	}
+
+	// Compute manifest's relative path from basePath for exclusion in FindExtraFiles
+	absManifest, err := filepath.Abs(manifestPath)
+	if err != nil {
+		return nil, err
+	}
+	manifestRel, err := filepath.Rel(abs, absManifest)
+	if err != nil {
+		manifestRel = ""
+	}
+
 	return &Checker{
 		basePath:      AbsFilePath(abs),
 		files:         files,
 		fs:            fs,
 		manifestPaths: manifestPaths,
+		manifestRelPath: RelFilePath(manifestRel),
 		signature:     m.pbOuter.Signature,
 		signer:        m.pbOuter.Signer,
 		signingPubKey: m.pbOuter.SigningPubKey,

@@ -170,6 +183,7 @@ func (c *Checker) Check(ctx context.Context, results chan<- Result, progress cha
 	var failures FileCount

 	startTime := time.Now()
+	lastProgressTime := time.Now()

 	for _, entry := range c.files {
 		select {

@@ -188,8 +202,11 @@ func (c *Checker) Check(ctx context.Context, results chan<- Result, progress cha
 			results <- result
 		}

-		// Send progress with rate and ETA calculation
+		// Send progress at most once per second (rate-limited)
 		if progress != nil {
+			now := time.Now()
+			isLast := checkedFiles == totalFiles
+			if isLast || now.Sub(lastProgressTime) >= time.Second {
 			elapsed := time.Since(startTime)
 			var bytesPerSec float64
 			var eta time.Duration

@@ -211,6 +228,8 @@ func (c *Checker) Check(ctx context.Context, results chan<- Result, progress cha
 				ETA:      eta,
 				Failures: failures,
 			})
+			lastProgressTime = now
+			}
 		}
 	}

@@ -309,14 +328,13 @@ func (c *Checker) FindExtraFiles(ctx context.Context, results chan<- Result) err
 			return nil
 		}

-		// Skip manifest files
-		base := filepath.Base(rel)
-		if base == "index.mf" || base == ".index.mf" {
+		relPath := RelFilePath(rel)
+
+		// Skip the manifest file itself
+		if relPath == c.manifestRelPath {
 			return nil
 		}
-
-		relPath := RelFilePath(rel)

 		// Check if path is in manifest
 		if _, exists := c.manifestPaths[relPath]; !exists {
 			if results != nil {
@@ -3,6 +3,7 @@ package mfer
 import (
 	"bytes"
 	"context"
+	"fmt"
 	"testing"
 	"time"

@@ -452,6 +453,61 @@ func TestCheckMissingFileDetectedWithoutFallback(t *testing.T) {
 	assert.Equal(t, 0, statusCounts[StatusError], "no files should be ERROR")
 }
+
+func TestFindExtraFilesSkipsDotfiles(t *testing.T) {
+	// Regression test for #16: FindExtraFiles should not report dotfiles
+	// or the manifest file itself as extra files.
+	fs := afero.NewMemMapFs()
+	files := map[string][]byte{
+		"file1.txt": []byte("in manifest"),
+	}
+	createTestManifest(t, fs, "/data/.index.mf", files)
+	createFilesOnDisk(t, fs, "/data", files)
+
+	// Add dotfiles and manifest file on disk
+	require.NoError(t, afero.WriteFile(fs, "/data/.hidden", []byte("dotfile"), 0o644))
+	require.NoError(t, fs.MkdirAll("/data/.git", 0o755))
+	require.NoError(t, afero.WriteFile(fs, "/data/.git/config", []byte("git config"), 0o644))
+
+	chk, err := NewChecker("/data/.index.mf", "/data", fs)
+	require.NoError(t, err)
+
+	results := make(chan Result, 10)
+	err = chk.FindExtraFiles(context.Background(), results)
+	require.NoError(t, err)
+
+	var extras []Result
+	for r := range results {
+		extras = append(extras, r)
+	}
+
+	// Should report NO extra files — dotfiles and manifest should be skipped
+	assert.Empty(t, extras, "FindExtraFiles should not report dotfiles or manifest file as extra; got: %v", extras)
+}
+
+func TestFindExtraFilesSkipsManifestFile(t *testing.T) {
+	// The manifest file itself should never be reported as extra
+	fs := afero.NewMemMapFs()
+	files := map[string][]byte{
+		"file1.txt": []byte("content"),
+	}
+	createTestManifest(t, fs, "/data/index.mf", files)
+	createFilesOnDisk(t, fs, "/data", files)
+
+	chk, err := NewChecker("/data/index.mf", "/data", fs)
+	require.NoError(t, err)
+
+	results := make(chan Result, 10)
+	err = chk.FindExtraFiles(context.Background(), results)
+	require.NoError(t, err)
+
+	var extras []Result
+	for r := range results {
+		extras = append(extras, r)
+	}
+
+	assert.Empty(t, extras, "manifest file should not be reported as extra; got: %v", extras)
+}
+
 func TestCheckEmptyManifest(t *testing.T) {
 	fs := afero.NewMemMapFs()
 	// Create manifest with no files

@@ -473,3 +529,40 @@ func TestCheckEmptyManifest(t *testing.T) {
 	}
 	assert.Equal(t, 0, count)
 }
+
+func TestCheckProgressRateLimited(t *testing.T) {
+	// Create many small files - progress should be rate-limited, not one per file.
+	// With rate-limiting to once per second, we should get far fewer progress
+	// updates than files (plus one final update).
+	fs := afero.NewMemMapFs()
+	files := make(map[string][]byte, 100)
+	for i := 0; i < 100; i++ {
+		name := fmt.Sprintf("file%03d.txt", i)
+		files[name] = []byte("content")
+	}
+	createTestManifest(t, fs, "/manifest.mf", files)
+	createFilesOnDisk(t, fs, "/data", files)
+
+	chk, err := NewChecker("/manifest.mf", "/data", fs)
+	require.NoError(t, err)
+
+	results := make(chan Result, 200)
+	progress := make(chan CheckStatus, 200)
+	err = chk.Check(context.Background(), results, progress)
+	require.NoError(t, err)
+
+	// Drain results
+	for range results {
+	}
+
+	// Count progress updates
+	var progressCount int
+	for range progress {
+		progressCount++
+	}
+
+	// Should be far fewer than 100 (rate-limited to once per second)
+	// At minimum we get the final update
+	assert.GreaterOrEqual(t, progressCount, 1, "should get at least the final progress update")
+	assert.Less(t, progressCount, 100, "progress should be rate-limited, not one per file")
+}
@@ -44,7 +44,7 @@ func (m *manifest) deserializeInner() error {
 	// Verify hash of compressed data before decompression
 	h := sha256.New()
 	if _, err := h.Write(m.pbOuter.InnerMessage); err != nil {
-		return err
+		return fmt.Errorf("deserialize: hash write: %w", err)
 	}
 	sha256Hash := h.Sum(nil)
 	if !bytes.Equal(sha256Hash, m.pbOuter.Sha256) {

@@ -72,7 +72,7 @@ func (m *manifest) deserializeInner() error {
 	zr, err := zstd.NewReader(bb)
 	if err != nil {
-		return err
+		return fmt.Errorf("deserialize: zstd reader: %w", err)
 	}
 	defer zr.Close()

@@ -85,7 +85,7 @@ func (m *manifest) deserializeInner() error {
 	limitedReader := io.LimitReader(zr, maxSize)
 	dat, err := io.ReadAll(limitedReader)
 	if err != nil {
-		return err
+		return fmt.Errorf("deserialize: decompress: %w", err)
 	}
 	if int64(len(dat)) >= MaxDecompressedSize {
 		return fmt.Errorf("decompressed data exceeds maximum allowed size of %d bytes", MaxDecompressedSize)

@@ -100,7 +100,7 @@ func (m *manifest) deserializeInner() error {
 	// Deserialize inner message
 	m.pbInner = new(MFFile)
 	if err := proto.Unmarshal(dat, m.pbInner); err != nil {
-		return err
+		return fmt.Errorf("deserialize: unmarshal inner: %w", err)
 	}

 	// Validate inner UUID
@@ -17,6 +17,7 @@ type manifest struct {
 	pbOuter        *MFFileOuter
 	output         *bytes.Buffer
 	signingOptions *SigningOptions
+	fixedUUID      []byte // if set, use this UUID instead of generating one
 }

 func (m *manifest) String() string {
@@ -1,7 +1,7 @@
 // Code generated by protoc-gen-go. DO NOT EDIT.
 // versions:
 // 	protoc-gen-go v1.36.11
-// 	protoc        v6.33.0
+// 	protoc        v6.33.4
 // source: mf.proto

 package mfer

@@ -329,6 +329,9 @@ func (x *MFFileOuter) GetSigningPubKey() []byte {
 type MFFilePath struct {
 	state protoimpl.MessageState `protogen:"open.v1"`
 	// required attributes:
+	// Path invariants: must be valid UTF-8, use forward slashes only,
+	// be relative (no leading /), contain no ".." segments, and no
+	// empty segments (no "//").
 	Path string `protobuf:"bytes,1,opt,name=path,proto3" json:"path,omitempty"`
 	Size int64  `protobuf:"varint,2,opt,name=size,proto3" json:"size,omitempty"`
 	// gotta have at least one:

@@ -336,8 +339,7 @@ type MFFilePath struct {
 	// optional per-file metadata
 	MimeType *string    `protobuf:"bytes,301,opt,name=mimeType,proto3,oneof" json:"mimeType,omitempty"`
 	Mtime    *Timestamp `protobuf:"bytes,302,opt,name=mtime,proto3,oneof" json:"mtime,omitempty"`
-	Ctime    *Timestamp `protobuf:"bytes,303,opt,name=ctime,proto3,oneof" json:"ctime,omitempty"`
-	Atime    *Timestamp `protobuf:"bytes,304,opt,name=atime,proto3,oneof" json:"atime,omitempty"`
+	Ctime    *Timestamp `protobuf:"bytes,303,opt,name=ctime,proto3,oneof" json:"ctime,omitempty"` // Field 304 (atime) removed — not useful for integrity verification.
 	unknownFields protoimpl.UnknownFields
 	sizeCache     protoimpl.SizeCache
 }

@@ -414,13 +416,6 @@ func (x *MFFilePath) GetCtime() *Timestamp {
 	return nil
 }
-
-func (x *MFFilePath) GetAtime() *Timestamp {
-	if x != nil {
-		return x.Atime
-	}
-	return nil
-}

 type MFFileChecksum struct {
 	state protoimpl.MessageState `protogen:"open.v1"`
 	// 1.0 golang implementation must write a multihash here

@@ -566,7 +561,7 @@ const file_mf_proto_rawDesc = "" +
 	"\n" +
 	"_signatureB\t\n" +
 	"\a_signerB\x10\n" +
-	"\x0e_signingPubKey\"\xa2\x02\n" +
+	"\x0e_signingPubKey\"\xf0\x01\n" +
 	"\n" +
 	"MFFilePath\x12\x12\n" +
 	"\x04path\x18\x01 \x01(\tR\x04path\x12\x12\n" +

@@ -576,13 +571,10 @@ const file_mf_proto_rawDesc = "" +
 	"\x05mtime\x18\xae\x02 \x01(\v2\n" +
 	".TimestampH\x01R\x05mtime\x88\x01\x01\x12&\n" +
 	"\x05ctime\x18\xaf\x02 \x01(\v2\n" +
-	".TimestampH\x02R\x05ctime\x88\x01\x01\x12&\n" +
-	"\x05atime\x18\xb0\x02 \x01(\v2\n" +
-	".TimestampH\x03R\x05atime\x88\x01\x01B\v\n" +
+	".TimestampH\x02R\x05ctime\x88\x01\x01B\v\n" +
 	"\t_mimeTypeB\b\n" +
 	"\x06_mtimeB\b\n" +
-	"\x06_ctimeB\b\n" +
-	"\x06_atime\".\n" +
+	"\x06_ctime\".\n" +
 	"\x0eMFFileChecksum\x12\x1c\n" +
 	"\tmultiHash\x18\x01 \x01(\fR\tmultiHash\"\xd6\x01\n" +
 	"\x06MFFile\x12)\n" +

@@ -627,15 +619,14 @@ var file_mf_proto_depIdxs = []int32{
 	6, // 2: MFFilePath.hashes:type_name -> MFFileChecksum
 	3, // 3: MFFilePath.mtime:type_name -> Timestamp
 	3, // 4: MFFilePath.ctime:type_name -> Timestamp
-	3, // 5: MFFilePath.atime:type_name -> Timestamp
-	2, // 6: MFFile.version:type_name -> MFFile.Version
-	5, // 7: MFFile.files:type_name -> MFFilePath
-	3, // 8: MFFile.createdAt:type_name -> Timestamp
-	9, // [9:9] is the sub-list for method output_type
-	9, // [9:9] is the sub-list for method input_type
-	9, // [9:9] is the sub-list for extension type_name
-	9, // [9:9] is the sub-list for extension extendee
-	0, // [0:9] is the sub-list for field type_name
+	2, // 5: MFFile.version:type_name -> MFFile.Version
+	5, // 6: MFFile.files:type_name -> MFFilePath
+	3, // 7: MFFile.createdAt:type_name -> Timestamp
+	8, // [8:8] is the sub-list for method output_type
+	8, // [8:8] is the sub-list for method input_type
+	8, // [8:8] is the sub-list for extension type_name
+	8, // [8:8] is the sub-list for extension extendee
+	0, // [0:8] is the sub-list for field type_name
 }

 func init() { file_mf_proto_init() }
@@ -59,7 +59,7 @@ message MFFilePath {
 	optional string mimeType = 301;
 	optional Timestamp mtime = 302;
 	optional Timestamp ctime = 303;
-	optional Timestamp atime = 304;
+	// Field 304 (atime) removed — not useful for integrity verification.
 }

 message MFFileChecksum {
@@ -45,8 +45,10 @@ type ScanStatus struct {
 type ScannerOptions struct {
 	IncludeDotfiles bool // Include files and directories starting with a dot (default: exclude)
 	FollowSymLinks  bool // Resolve symlinks instead of skipping them
+	IncludeTimestamps bool // Include createdAt timestamp in manifest (default: omit for determinism)
 	Fs              afero.Fs // Filesystem to use, defaults to OsFs if nil
 	SigningOptions  *SigningOptions // GPG signing options (nil = no signing)
+	Seed            string // If set, derive a deterministic UUID from this seed
 }

 // FileEntry represents a file that has been enumerated.

@@ -273,9 +275,15 @@ func (s *Scanner) ToManifest(ctx context.Context, w io.Writer, progress chan<- S
 	s.mu.RUnlock()

 	builder := NewBuilder()
+	if s.options.IncludeTimestamps {
+		builder.SetIncludeTimestamps(true)
+	}
 	if s.options.SigningOptions != nil {
 		builder.SetSigningOptions(s.options.SigningOptions)
 	}
+	if s.options.Seed != "" {
+		builder.SetSeed(s.options.Seed)
+	}

 	var scannedFiles FileCount
 	var scannedBytes FileSize
@ -352,8 +352,10 @@ func TestIsHiddenPath(t *testing.T) {
|
|||||||
{"/absolute/.hidden", true},
|
{"/absolute/.hidden", true},
|
||||||
{"./relative", false}, // path.Clean removes leading ./
|
{"./relative", false}, // path.Clean removes leading ./
|
||||||
{"a/b/c/.d/e", true},
|
{"a/b/c/.d/e", true},
|
||||||
{".", false}, // current directory is not hidden
|
{".", false}, // current directory is not hidden (#14)
|
||||||
{"/", false}, // root is not hidden
|
{"/", false}, // root is not hidden
|
||||||
|
{"./", false}, // current directory with trailing slash
|
||||||
|
{"./file.txt", false}, // file in current directory
|
||||||
}
|
}
|
||||||
|
|
||||||
for _, tt := range tests {
|
for _, tt := range tests {
|
||||||
|
|||||||
@@ -34,12 +34,12 @@ func (m *manifest) generate() error {
 	}
 	dat, err := proto.MarshalOptions{Deterministic: true}.Marshal(m.pbOuter)
 	if err != nil {
-		return err
+		return fmt.Errorf("serialize: marshal outer: %w", err)
 	}
 	m.output = bytes.NewBuffer([]byte(MAGIC))
 	_, err = m.output.Write(dat)
 	if err != nil {
-		return err
+		return fmt.Errorf("serialize: write output: %w", err)
 	}
 	return nil
 }

@@ -49,24 +49,29 @@ func (m *manifest) generateOuter() error {
 		return errors.New("internal error")
 	}

-	// Generate UUID and set on inner message
-	manifestUUID := uuid.New()
+	// Use fixed UUID if provided, otherwise generate a new one
+	var manifestUUID uuid.UUID
+	if len(m.fixedUUID) == 16 {
+		copy(manifestUUID[:], m.fixedUUID)
+	} else {
+		manifestUUID = uuid.New()
+	}
 	m.pbInner.Uuid = manifestUUID[:]

 	innerData, err := proto.MarshalOptions{Deterministic: true}.Marshal(m.pbInner)
 	if err != nil {
-		return err
+		return fmt.Errorf("serialize: marshal inner: %w", err)
 	}

 	// Compress the inner data
 	idc := new(bytes.Buffer)
 	zw, err := zstd.NewWriter(idc, zstd.WithEncoderLevel(zstd.SpeedBestCompression))
 	if err != nil {
-		return err
+		return fmt.Errorf("serialize: create compressor: %w", err)
 	}
 	_, err = zw.Write(innerData)
 	if err != nil {
-		return err
+		return fmt.Errorf("serialize: compress: %w", err)
 	}
 	_ = zw.Close()

@@ -75,7 +80,7 @@ func (m *manifest) generateOuter() error {
 	// Hash the compressed data for integrity verification before decompression
 	h := sha256.New()
 	if _, err := h.Write(compressedData); err != nil {
-		return err
+		return fmt.Errorf("serialize: hash write: %w", err)
 	}
 	sha256Hash := h.Sum(nil)
@@ -27,8 +27,12 @@ func (b BaseURL) JoinPath(path RelFilePath) (FileURL, error) {
 		base.Path += "/"
 	}

-	// Parse and encode the relative path
-	ref, err := url.Parse(url.PathEscape(string(path)))
+	// Encode each path segment individually to preserve slashes
+	segments := strings.Split(string(path), "/")
+	for i, seg := range segments {
+		segments[i] = url.PathEscape(seg)
+	}
+	ref, err := url.Parse(strings.Join(segments, "/"))
 	if err != nil {
 		return "", err
 	}
mfer/url_test.go (new file, 44 lines)
@@ -0,0 +1,44 @@
+package mfer
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+func TestBaseURLJoinPath(t *testing.T) {
+	tests := []struct {
+		base     BaseURL
+		path     RelFilePath
+		expected string
+	}{
+		{"https://example.com/dir/", "file.txt", "https://example.com/dir/file.txt"},
+		{"https://example.com/dir", "file.txt", "https://example.com/dir/file.txt"},
+		{"https://example.com/", "sub/file.txt", "https://example.com/sub/file.txt"},
+		{"https://example.com/dir/", "file with spaces.txt", "https://example.com/dir/file%20with%20spaces.txt"},
+	}
+
+	for _, tt := range tests {
+		t.Run(string(tt.base)+"+"+string(tt.path), func(t *testing.T) {
+			result, err := tt.base.JoinPath(tt.path)
+			require.NoError(t, err)
+			assert.Equal(t, tt.expected, string(result))
+		})
+	}
+}
+
+func TestBaseURLString(t *testing.T) {
+	b := BaseURL("https://example.com/")
+	assert.Equal(t, "https://example.com/", b.String())
+}
+
+func TestFileURLString(t *testing.T) {
+	f := FileURL("https://example.com/file.txt")
+	assert.Equal(t, "https://example.com/file.txt", f.String())
+}
+
+func TestManifestURLString(t *testing.T) {
+	m := ManifestURL("https://example.com/index.mf")
+	assert.Equal(t, "https://example.com/index.mf", m.String())
+}
modcache.tzst (binary file not shown)
vendor.tzst (binary file not shown)