Compare commits
6 Commits
9712c10fe3
...
47-add-tod
| Author | SHA1 | Date |
|---|---|---|
| | 963dd829e6 | |
| | cebf9785fc | |
| | 6ba32f5b35 | |
| | e62c709d42 | |
| | 89903fa1cd | |
| | b3d10106e1 | |
3
.gitignore
vendored
@@ -6,5 +6,8 @@
 vendor.tzst
 modcache.tzst
 
+# Generated manifest files
+.index.mf
+
 # Stale files
 .drone.yml
30
AGENTS.md
Normal file
@@ -0,0 +1,30 @@
|
|||||||
|
# Agent Instructions
|
||||||
|
|
||||||
|
Read `REPO_POLICIES.md` before making any changes. It is the authoritative
|
||||||
|
source for coding standards, formatting, linting, and workflow rules.
|
||||||
|
|
||||||
|
## Workflow
|
||||||
|
|
||||||
|
- When fixing a bug, write a failing test FIRST. Only after the test fails,
|
||||||
|
write the code to fix the bug. Then ensure the test passes. Leave the test in
|
||||||
|
place and commit it with the bugfix. Don't run shell commands to test bugfixes
|
||||||
|
or reproduce bugs. Write tests!
|
||||||
|
|
||||||
|
- After each change, run `make fmt`, then `make test`, then `make lint`. Fix any
|
||||||
|
failures before committing.
|
||||||
|
|
||||||
|
- After each change, commit only the files you've changed. Push after committing.
|
||||||
|
|
||||||
|
## Attribution
|
||||||
|
|
||||||
|
- Never mention Claude, Anthropic, or any AI/LLM tooling in commit messages. Do
|
||||||
|
not use attribution.
|
||||||
|
|
||||||
|
## Repository-Specific Notes
|
||||||
|
|
||||||
|
- This is a Go library + CLI tool for generating `.mf` manifest files.
|
||||||
|
- The proto definition is in `mfer/mf.proto`; generated `.pb.go` files are
|
||||||
|
committed (required for `go get` compatibility).
|
||||||
|
- The format specification is in `FORMAT.md`.
|
||||||
|
- See the TODO section in `README.md` for the 1.0 implementation plan
|
||||||
|
and open design questions.
|
||||||
20
CLAUDE.md
@@ -1,20 +0,0 @@
|
|||||||
# Important Rules
|
|
||||||
|
|
||||||
- when fixing a bug, write a failing test FIRST. only after the test fails, write
|
|
||||||
the code to fix the bug. then ensure the test passes. leave the test in
|
|
||||||
place and commit it with the bugfix. don't run shell commands to test
|
|
||||||
bugfixes or reproduce bugs. write tests!
|
|
||||||
|
|
||||||
- never, ever mention claude or anthropic in commit messages. do not use attribution
|
|
||||||
|
|
||||||
- after each change, run "make fmt".
|
|
||||||
|
|
||||||
- after each change, run "make test" and ensure all tests pass.
|
|
||||||
|
|
||||||
- after each change, run "make lint" and ensure no linting errors. fix any
|
|
||||||
you find, one by one.
|
|
||||||
|
|
||||||
- after each change, commit the files you've changed. push after
|
|
||||||
committing.
|
|
||||||
|
|
||||||
- NEVER use `git add -A`. always add only individual files that you've changed.
|
|
||||||
30
Dockerfile
@@ -1,16 +1,36 @@
|
|||||||
|
# Lint stage — fast feedback on formatting and lint issues
|
||||||
# golangci/golangci-lint:v2.0.2 (2026-03-14)
|
# golangci/golangci-lint:v2.0.2 (2026-03-14)
|
||||||
FROM golangci/golangci-lint@sha256:d55581f7797e7a0877a7c3aaa399b01bdc57d2874d6412601a046cc4062cb62e AS lint-bin
|
FROM golangci/golangci-lint@sha256:d55581f7797e7a0877a7c3aaa399b01bdc57d2874d6412601a046cc4062cb62e AS lint
|
||||||
|
|
||||||
# golang:1.23 (2026-03-14)
|
|
||||||
FROM golang@sha256:60deed95d3888cc5e4d9ff8a10c54e5edc008c6ae3fba6187be6fb592e19e8c0 AS builder
|
|
||||||
COPY --from=lint-bin /usr/bin/golangci-lint /usr/local/bin/golangci-lint
|
|
||||||
WORKDIR /src
|
WORKDIR /src
|
||||||
COPY go.mod go.sum ./
|
COPY go.mod go.sum ./
|
||||||
RUN go mod download
|
RUN go mod download
|
||||||
|
|
||||||
COPY . .
|
COPY . .
|
||||||
|
|
||||||
# Touch .pb.go so make does not try to regenerate via protoc (file is committed)
|
# Touch .pb.go so make does not try to regenerate via protoc (file is committed)
|
||||||
RUN touch mfer/mf.pb.go
|
RUN touch mfer/mf.pb.go
|
||||||
RUN make check
|
|
||||||
|
RUN make fmt-check
|
||||||
|
RUN make lint
|
||||||
|
|
||||||
|
# Build stage — tests and compilation
|
||||||
|
# golang:1.23 (2026-03-14)
|
||||||
|
FROM golang@sha256:60deed95d3888cc5e4d9ff8a10c54e5edc008c6ae3fba6187be6fb592e19e8c0 AS builder
|
||||||
|
|
||||||
|
# Force BuildKit to run the lint stage by creating a stage dependency
|
||||||
|
COPY --from=lint /src/go.sum /dev/null
|
||||||
|
|
||||||
|
WORKDIR /src
|
||||||
|
COPY go.mod go.sum ./
|
||||||
|
RUN go mod download
|
||||||
|
|
||||||
|
COPY . .
|
||||||
|
|
||||||
|
# Touch .pb.go so make does not try to regenerate via protoc (file is committed)
|
||||||
|
RUN touch mfer/mf.pb.go
|
||||||
|
|
||||||
|
RUN make test
|
||||||
RUN cd cmd/mfer && go build -tags urfave_cli_no_docs -o /mfer .
|
RUN cd cmd/mfer && go build -tags urfave_cli_no_docs -o /mfer .
|
||||||
|
|
||||||
FROM scratch
|
FROM scratch
|
||||||
|
|||||||
@@ -26,7 +26,7 @@ See [`mfer/mf.proto`](mfer/mf.proto) for exact field numbers and types.
 The outer message contains:
 
 | Field | Number | Type | Description |
-|--------------------|--------|-------------------|--------------------------------------------------|
+| ----------------- | ------ | ---------------- | ------------------------------------------------------------------------ |
 | `version` | 101 | enum | Must be `VERSION_ONE` (1) |
 | `compressionType` | 102 | enum | Compression of `innerMessage`; must be `COMPRESSION_ZSTD` (1) |
 | `size` | 103 | int64 | Uncompressed size of `innerMessage` (corruption detection) |
@@ -55,7 +55,7 @@ After decompressing `innerMessage`, the result is a serialized `MFFile`
 (referred to as the manifest):
 
 | Field | Number | Type | Description |
-|-------------|--------|-----------------------|--------------------------------------------|
+| ----------- | ------ | --------------------- | ------------------------------------- |
 | `version` | 100 | enum | Must be `VERSION_ONE` (1) |
 | `files` | 101 | repeated `MFFilePath` | List of files in the manifest |
 | `uuid` | 102 | bytes | Random v4 UUID; must match outer UUID |
@@ -66,7 +66,7 @@ After decompressing `innerMessage`, the result is a serialized `MFFile`
 Each file entry contains:
 
 | Field | Number | Type | Description |
-|------------|--------|---------------------------|--------------------------------------|
+| ---------- | ------ | ------------------------- | ----------------------------------- |
 | `path` | 1 | string | Relative file path (see Path Rules) |
 | `size` | 2 | int64 | File size in bytes |
 | `hashes` | 3 | repeated `MFFileChecksum` | At least one hash required |
@@ -111,6 +111,7 @@ ZNAVSRFG-<UUID>-<SHA256>
 ```
 
 Where:
+
 - `ZNAVSRFG` is the magic bytes string (literal ASCII)
 - `<UUID>` is the hex-encoded UUID from the outer message
 - `<SHA256>` is the hex-encoded SHA-256 hash from the outer message (covering compressed data)
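The `ZNAVSRFG-<UUID>-<SHA256>` string described in the hunk above can be sketched in Go. This is an illustration under stated assumptions, not the library's `signatureString()` implementation: the helper name `canonicalSignatureString`, the 16-byte UUID parameter, and the choice of lowercase hex without separators are hypothetical. The only facts taken from the diff are the magic string, the hyphen-joined layout, and that the SHA-256 covers the compressed data.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// magic is the literal ASCII prefix named in the format text.
const magic = "ZNAVSRFG"

// canonicalSignatureString is a hypothetical helper (not the mfer API).
// It joins the magic, the hex-encoded UUID, and the hex-encoded SHA-256
// of the compressed inner message with hyphens.
// Assumption: lowercase hex, no dashes inside the UUID.
func canonicalSignatureString(uuid [16]byte, compressed []byte) string {
	sum := sha256.Sum256(compressed)
	return fmt.Sprintf("%s-%s-%s",
		magic,
		hex.EncodeToString(uuid[:]),
		hex.EncodeToString(sum[:]),
	)
}

func main() {
	var uuid [16]byte // all-zero placeholder UUID, for illustration only
	fmt.Println(canonicalSignatureString(uuid, []byte("compressed bytes")))
}
```

Hashing the compressed bytes lets a verifier check the digest before running the decompressor on untrusted input.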
2
Makefile
@@ -32,7 +32,7 @@ $(PROTOC_GEN_GO):
 fixme:
 	@grep -nir fixme . | grep -v Makefile
 
-check: test fmt-check
+check: test lint fmt-check
 
 fmt-check: mfer/mf.pb.go
 	sh -c 'test -z "$$(gofmt -l .)"'
234
README.md
@@ -25,7 +25,8 @@ software. A compatible javascript library is planned.
|
|||||||
|
|
||||||
# Build Status
|
# Build Status
|
||||||
|
|
||||||
[](https://drone.datavi.be/sneak/mfer)
|
CI runs via `docker build .` which executes `make check` (formatting,
|
||||||
|
linting, tests). The `main` branch must always be green.
|
||||||
|
|
||||||
# Participation
|
# Participation
|
||||||
|
|
||||||
@@ -42,6 +43,8 @@ requests](https://git.eeqj.de/sneak/mfer/pulls) and pass CI to be merged.
|
|||||||
Any changes submitted to this project must also be
|
Any changes submitted to this project must also be
|
||||||
[WTFPL-licensed](https://wtfpl.net) to be considered.
|
[WTFPL-licensed](https://wtfpl.net) to be considered.
|
||||||
|
|
||||||
|
See [`REPO_POLICIES.md`](REPO_POLICIES.md) for detailed coding standards,
|
||||||
|
tooling requirements, and workflow conventions.
|
||||||
|
|
||||||
# Problem Statement
|
# Problem Statement
|
||||||
|
|
||||||
@@ -123,7 +126,6 @@ The manifest file would do several important things:
|
|||||||
# Open Questions
|
# Open Questions
|
||||||
|
|
||||||
- Should the manifest file include checksums of individual file chunks, or just for the whole assembled file?
|
- Should the manifest file include checksums of individual file chunks, or just for the whole assembled file?
|
||||||
|
|
||||||
- If so, should the chunksize be fixed or dynamic?
|
- If so, should the chunksize be fixed or dynamic?
|
||||||
|
|
||||||
- Should the manifest signature format be GnuPG signatures, or those from
|
- Should the manifest signature format be GnuPG signatures, or those from
|
||||||
@@ -207,24 +209,236 @@ regardless of filesystem format.
|
|||||||
Please email [`sneak@sneak.berlin`](mailto:sneak@sneak.berlin) with your
|
Please email [`sneak@sneak.berlin`](mailto:sneak@sneak.berlin) with your
|
||||||
desired username for an account on this Gitea instance.
|
desired username for an account on this Gitea instance.
|
||||||
|
|
||||||
|
# TODO: Remaining Work for 1.0
|
||||||
|
|
||||||
|
## Design Questions (Owner Decision Required)
|
||||||
|
|
||||||
|
These require @sneak's input before implementation. Answers should be added
|
||||||
|
inline below each question.
|
||||||
|
|
||||||
|
### Format Design
|
||||||
|
|
||||||
|
**1. Should `MFFileChecksum` be simplified?** Currently it's a separate
|
||||||
|
message wrapping a single `bytes multiHash` field. Since multihash
|
||||||
|
already self-describes the algorithm, `repeated bytes hashes` directly on
|
||||||
|
`MFFilePath` would be simpler and reduce per-file protobuf overhead. Is
|
||||||
|
the extra message layer intentional (e.g. planning to add per-hash
|
||||||
|
metadata like `verified_at`)?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**2. Should file permissions/mode be stored?** The format stores
|
||||||
|
mtime/ctime but not Unix file permissions. For archival use this may not
|
||||||
|
matter, but for software distribution or filesystem restoration it's a
|
||||||
|
gap. Should we reserve a field now (e.g. `optional uint32 mode = 305`)
|
||||||
|
even if we don't populate it yet?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**3. Should `atime` be removed from the schema?** Access time is
|
||||||
|
volatile, non-deterministic, and often disabled (`noatime`). Including it
|
||||||
|
means two manifests of the same directory at different times will differ,
|
||||||
|
which conflicts with the determinism goal. Remove it, or document it as
|
||||||
|
"never set by default"?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**4. What are the path normalization rules?** The proto has `string path`
|
||||||
|
with no specification about: always forward-slash? Must be relative? No
|
||||||
|
`..` components allowed? UTF-8 NFC vs NFD normalization (macOS vs
|
||||||
|
Linux)? Max path length? This is a security issue (path traversal) and a
|
||||||
|
cross-platform compatibility issue. What rules should the spec mandate?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**5. Should we add a version byte after the magic?** Currently
|
||||||
|
`ZNAVSRFG` is followed immediately by protobuf. Adding a version byte
|
||||||
|
(`ZNAVSRFG\x01`) would allow future framing changes without requiring
|
||||||
|
protobuf parsing to detect the version. `MFFileOuter.Version` serves
|
||||||
|
this purpose but requires successful deserialization to read. Worth the
|
||||||
|
extra byte?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**6. Should we add a length-prefix after the magic?** Protobuf is not
|
||||||
|
self-delimiting. If we ever want to concatenate manifests or append data
|
||||||
|
after the protobuf, the current framing is insufficient. Add a varint or
|
||||||
|
fixed-width length-prefix?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
### Signature Design
|
||||||
|
|
||||||
|
**7. What does the outer SHA-256 hash cover — compressed or uncompressed
|
||||||
|
data?** The code currently hashes compressed data (good for verifying
|
||||||
|
before decompression), but this should be explicitly documented. Which is
|
||||||
|
the intended behavior?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**8. Should `signatureString()` sign raw bytes instead of a hex-encoded
|
||||||
|
string?** Currently the canonical string is `MAGIC-UUID-MULTIHASH` with
|
||||||
|
hex encoding, which adds a transformation layer. Signing the raw `sha256`
|
||||||
|
bytes (or compressed `innerMessage` directly) would be simpler. Keep the
|
||||||
|
string format or switch to raw bytes?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**9. Should we support detached signature files (`.mf.sig`)?** Embedded
|
||||||
|
signatures are better for single-file distribution. Detached `.mf.sig`
|
||||||
|
files follow the familiar `SHASUMS`/`SHASUMS.asc` pattern and are
|
||||||
|
simpler for HTTP serving. Support both modes?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**10. GPG vs pure-Go crypto for signatures?** Shelling out to `gpg` is
|
||||||
|
fragile (may not be installed, version-dependent output).
|
||||||
|
`github.com/ProtonMail/go-crypto` provides pure-Go OpenPGP, or we could
|
||||||
|
use Ed25519/signify (simpler, no key management). Which direction?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
### Implementation Design
|
||||||
|
|
||||||
|
**11. Should manifests be deterministic by default?** This means: sort
|
||||||
|
file entries by path, omit `createdAt` timestamp (or make it opt-in), no
|
||||||
|
`atime`. Should determinism be the default, with a
|
||||||
|
`--include-timestamps` flag to opt in?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**12. Should we consolidate or keep both scanner/checker
|
||||||
|
implementations?** There are two parallel implementations:
|
||||||
|
`mfer/scanner.go` + `mfer/checker.go` (typed with `FileSize`,
|
||||||
|
`RelFilePath`) and `internal/scanner/` + `internal/checker/` (raw
|
||||||
|
`int64`, `string`). The `mfer/` versions are superior. Delete the
|
||||||
|
`internal/` versions?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**13. Should the `manifest` type be exported?** Currently unexported with
|
||||||
|
exported constructors (`NewManifestFromReader`, `NewManifestFromFile`).
|
||||||
|
Consumers can't declare `var m *mfer.manifest`. Export the type, or
|
||||||
|
define an interface?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
**14. What should the Go module path be for 1.0?** Currently
|
||||||
|
`sneak.berlin/go/mfer` in `go.mod` but `git.eeqj.de/sneak/mfer/mfer` in
|
||||||
|
the proto `go_package` option. Which is canonical?
|
||||||
|
|
||||||
|
> _answer:_
|
||||||
|
|
||||||
|
## Implementation Tasks
|
||||||
|
|
||||||
|
### Repo Infrastructure
|
||||||
|
|
||||||
|
- [ ] Add `.golangci.yml` (fetch from
|
||||||
|
`https://git.eeqj.de/sneak/prompts/raw/branch/main/.golangci.yml`)
|
||||||
|
- [ ] Add `.editorconfig`
|
||||||
|
- [ ] Add `.gitea/workflows/check.yml` that runs `docker build .`
|
||||||
|
|
||||||
|
### Format & Correctness
|
||||||
|
|
||||||
|
- [ ] Resolve proto `go_package` path inconsistency
|
||||||
|
(`git.eeqj.de/sneak/mfer/mfer` vs `sneak.berlin/go/mfer`)
|
||||||
|
- [ ] Specify path invariants — add proto comments requiring UTF-8,
|
||||||
|
forward-slash, relative paths, no `..`, no leading `/`; validate
|
||||||
|
in `Builder.AddFile` and `Builder.AddFileWithHash` (pending design
|
||||||
|
question answer)
|
||||||
|
- [ ] Remove or deprecate `atime` from proto (pending design question
|
||||||
|
answer)
|
||||||
|
- [ ] Reserve `optional uint32 mode = 305` in `MFFilePath` for future
|
||||||
|
file permissions (pending design question answer)
|
||||||
|
- [ ] Add version byte after magic — `ZNAVSRFG\x01` for format version
|
||||||
|
1 (pending design question answer)
|
||||||
|
- [ ] Write format specification document — separate from README:
|
||||||
|
magic, outer structure, compression, inner structure, path
|
||||||
|
invariants, signature scheme, canonical serialization
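
The path invariants listed in the task above (UTF-8, forward-slash, relative, no `..`, no leading `/`) could be checked along these lines. This is a minimal sketch pending the owner's answer to the design question: `validateManifestPath` is a hypothetical name, not an existing function in the `mfer` package, and rejecting backslashes outright is an assumption about how "forward-slash" would be enforced.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// validateManifestPath is a hypothetical validator for manifest paths.
// It enforces the invariants proposed in the TODO list; the final rules
// are still an open design question.
func validateManifestPath(p string) error {
	switch {
	case p == "":
		return errors.New("path is empty")
	case !utf8.ValidString(p):
		return errors.New("path is not valid UTF-8")
	case strings.HasPrefix(p, "/"):
		return errors.New("path must be relative (no leading slash)")
	case strings.Contains(p, "\\"):
		// Assumption: backslashes are rejected rather than normalized.
		return errors.New("path must use forward slashes")
	}
	// Reject ".." only as a whole component, so "a..b" stays valid.
	for _, part := range strings.Split(p, "/") {
		if part == ".." {
			return errors.New("path must not contain '..' components")
		}
	}
	return nil
}

func main() {
	for _, p := range []string{"docs/readme.txt", "../secret", "/etc/passwd", "a..b/c"} {
		fmt.Printf("%q: %v\n", p, validateManifestPath(p))
	}
}
```

Unicode normalization (NFC vs NFD) and a maximum path length are deliberately left out, since the design question above has not settled them.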
|
||||||
|
|
||||||
|
### Library
|
||||||
|
|
||||||
|
- [ ] Delete `internal/scanner/` and `internal/checker/` — consolidate
|
||||||
|
on `mfer/` package versions; update CLI code (pending design
|
||||||
|
question answer)
|
||||||
|
- [ ] Add deterministic file ordering — sort entries by path
|
||||||
|
(lexicographic, byte-order) in `Builder.Build()`; add test
|
||||||
|
asserting byte-identical output from two runs
|
||||||
|
- [ ] Add decompression size limit — `io.LimitReader` in
|
||||||
|
`deserializeInner()` with `m.pbOuter.Size` as bound
|
||||||
|
- [ ] Fix `errors.Is` dead code in checker — replace with
|
||||||
|
`os.IsNotExist(err)` or `errors.Is(err, fs.ErrNotExist)`
|
||||||
|
- [ ] Fix `AddFile` to verify size — check `totalRead == size` after
|
||||||
|
reading, return error on mismatch
|
||||||
|
- [ ] Export the `manifest` type or define a public interface (pending
|
||||||
|
design question answer) — currently consumers cannot hold a reference
|
||||||
|
to a loaded manifest in their own type declarations
|
||||||
|
- [ ] Replace GPG subprocess calls with pure-Go crypto (pending design
|
||||||
|
question answer) — current implementation shells out to `gpg` which
|
||||||
|
may not be installed
|
||||||
|
- [ ] Add timeout to any remaining subprocess calls
|
||||||
|
|
||||||
|
### CLI
|
||||||
|
|
||||||
|
- [ ] Fix flag naming — all CLI flags should use kebab-case as primary
|
||||||
|
(`--include-dotfiles`, `--follow-symlinks`)
|
||||||
|
- [ ] Fix URL construction in fetch — use `BaseURL.JoinPath()` or
|
||||||
|
`url.JoinPath()` instead of string concatenation
|
||||||
|
- [ ] Add progress rate-limiting to Checker — throttle to once per
|
||||||
|
second, matching Scanner
|
||||||
|
- [ ] Add `--deterministic` flag or make it default — omit `createdAt`,
|
||||||
|
sort files (pending design question answer)
|
||||||
|
- [ ] Wire `--version` flag properly (currently only a `version`
|
||||||
|
subcommand exists; top-level `--version` shows urfave/cli generic
|
||||||
|
output)
|
||||||
|
- [ ] Add retry logic to `fetch` — currently no retries on transient
|
||||||
|
HTTP errors; needs exponential backoff
|
||||||
|
- [ ] `fetch` command uses bare `http.Get` with no timeout — needs
|
||||||
|
`http.Client` with configurable timeout
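
Two of the `fetch` items above, URL construction and the missing timeout, can be sketched with the standard library alone. The names `fileURL`, `newFetchClient`, and `defaultTimeout`, and the 30-second value, are assumptions for illustration and do not come from the mfer CLI. `url.JoinPath` requires Go 1.19 or later.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// defaultTimeout is an assumed value; the TODO asks for it to be configurable.
const defaultTimeout = 30 * time.Second

// fileURL joins a manifest-relative path onto a base URL instead of
// concatenating strings, so a trailing slash on the base does not matter.
func fileURL(base, rel string) (string, error) {
	return url.JoinPath(base, rel)
}

// newFetchClient returns a client whose requests are bounded by timeout,
// unlike bare http.Get, which uses http.DefaultClient with no timeout.
func newFetchClient(timeout time.Duration) *http.Client {
	return &http.Client{Timeout: timeout}
}

func main() {
	u, err := fileURL("https://example.com/dir/", "a/b.txt")
	if err != nil {
		fmt.Println("bad base URL:", err)
		return
	}
	client := newFetchClient(defaultTimeout)
	fmt.Println(u, client.Timeout)
}
```

Note that `url.JoinPath` cleans `..` elements rather than rejecting them, so it does not replace validating manifest paths before fetching.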
|
||||||
|
|
||||||
|
### Testing & Robustness
|
||||||
|
|
||||||
|
- [ ] Add fuzzing tests for `NewManifestFromReader` — protobuf
|
||||||
|
deserialization of untrusted input needs fuzz coverage
|
||||||
|
- [ ] Add integration test for `freshen` CLI command — current tests
|
||||||
|
only verify setup, not the actual freshen operation end-to-end
|
||||||
|
- [ ] Add test for `fetch` CLI command end-to-end (currently only
|
||||||
|
`downloadFile` is tested)
|
||||||
|
|
||||||
|
### Documentation
|
||||||
|
|
||||||
|
- [ ] Promote `FORMAT.md` as primary spec reference; README should link
|
||||||
|
to it more prominently
|
||||||
|
- [ ] Audit and update all error messages for consistency and
|
||||||
|
helpfulness
|
||||||
|
- [ ] Document the signature scheme more thoroughly (canonical string
|
||||||
|
format, verification steps)
|
||||||
|
|
||||||
|
### Release
|
||||||
|
|
||||||
|
- [ ] Finalize Go module path
|
||||||
|
- [ ] Update version constant in `mfer/constants.go`
|
||||||
|
- [ ] Add `--version` output matching SemVer
|
||||||
|
- [ ] Tag `v1.0.0`
|
||||||
|
|
||||||
# See Also
|
# See Also
|
||||||
|
|
||||||
## Prior Art: Metalink
|
## Prior Art: Metalink
|
||||||
|
|
||||||
* [Metalink - Mozilla Wiki](https://wiki.mozilla.org/Metalink)
|
- [Metalink - Mozilla Wiki](https://wiki.mozilla.org/Metalink)
|
||||||
* [Metalink - Wikipedia](https://en.wikipedia.org/wiki/Metalink)
|
- [Metalink - Wikipedia](https://en.wikipedia.org/wiki/Metalink)
|
||||||
* [RFC 5854 - The Metalink Download Description Format](https://datatracker.ietf.org/doc/html/rfc5854)
|
- [RFC 5854 - The Metalink Download Description Format](https://datatracker.ietf.org/doc/html/rfc5854)
|
||||||
* [RFC 6249 - Metalink/HTTP: Mirrors and Hashes](https://www.rfc-editor.org/rfc/rfc6249.html)
|
- [RFC 6249 - Metalink/HTTP: Mirrors and Hashes](https://www.rfc-editor.org/rfc/rfc6249.html)
|
||||||
|
|
||||||
## Links
|
## Links
|
||||||
|
|
||||||
* Repo: [https://git.eeqj.de/sneak/mfer](https://git.eeqj.de/sneak/mfer)
|
- Repo: [https://git.eeqj.de/sneak/mfer](https://git.eeqj.de/sneak/mfer)
|
||||||
* Issues: [https://git.eeqj.de/sneak/mfer/issues](https://git.eeqj.de/sneak/mfer/issues)
|
- Issues: [https://git.eeqj.de/sneak/mfer/issues](https://git.eeqj.de/sneak/mfer/issues)
|
||||||
|
|
||||||
# Authors
|
# Authors
|
||||||
|
|
||||||
* [@sneak <sneak@sneak.berlin>](mailto:sneak@sneak.berlin)
|
- [@sneak <sneak@sneak.berlin>](mailto:sneak@sneak.berlin)
|
||||||
|
|
||||||
# License
|
# License
|
||||||
|
|
||||||
* [WTFPL](https://wtfpl.net)
|
- [WTFPL](https://wtfpl.net)
|
||||||
|
|||||||
255
REPO_POLICIES.md
Normal file
@@ -0,0 +1,255 @@
|
|||||||
|
---
|
||||||
|
title: Repository Policies
|
||||||
|
last_modified: 2026-03-10
|
||||||
|
---
|
||||||
|
|
||||||
|
This document covers repository structure, tooling, and workflow standards. Code
|
||||||
|
style conventions are in separate documents:
|
||||||
|
|
||||||
|
- [Code Styleguide](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/CODE_STYLEGUIDE.md)
|
||||||
|
(general, bash, Docker)
|
||||||
|
- [Go](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/CODE_STYLEGUIDE_GO.md)
|
||||||
|
- [JavaScript](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/CODE_STYLEGUIDE_JS.md)
|
||||||
|
- [Python](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/CODE_STYLEGUIDE_PYTHON.md)
|
||||||
|
- [Go HTTP Server Conventions](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/GO_HTTP_SERVER_CONVENTIONS.md)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
- Cross-project documentation (such as this file) must include
|
||||||
|
`last_modified: YYYY-MM-DD` in the YAML front matter so it can be kept in sync
|
||||||
|
with the authoritative source as policies evolve.
|
||||||
|
|
||||||
|
- **ALL external references must be pinned by cryptographic hash.** This
|
||||||
|
includes Docker base images, Go modules, npm packages, GitHub Actions, and
|
||||||
|
anything else fetched from a remote source. Version tags (`@v4`, `@latest`,
|
||||||
|
`:3.21`, etc.) are server-mutable and therefore remote code execution
|
||||||
|
vulnerabilities. The ONLY acceptable way to reference an external dependency
|
||||||
|
is by its content hash (Docker `@sha256:...`, Go module hash in `go.sum`, npm
|
||||||
|
integrity hash in lockfile, GitHub Actions `@<commit-sha>`). No exceptions.
|
||||||
|
This also means never `curl | bash` to install tools like pyenv, nvm, rustup,
|
||||||
|
etc. Instead, download a specific release archive from GitHub, verify its hash
|
||||||
|
(hardcoded in the Dockerfile or script), and only then install. Unverified
|
||||||
|
install scripts are arbitrary remote code execution. This is the single most
|
||||||
|
important rule in this document. Double-check every external reference in
|
||||||
|
every file before committing. There are zero exceptions to this rule.
|
||||||
|
|
||||||
|
- Every repo with software must have a root `Makefile` with these targets:
|
||||||
|
`make test`, `make lint`, `make fmt` (writes), `make fmt-check` (read-only),
|
||||||
|
`make check` (prereqs: `test`, `lint`, `fmt-check`), `make docker`, and
|
||||||
|
`make hooks` (installs pre-commit hook). A model Makefile is at
|
||||||
|
`https://git.eeqj.de/sneak/prompts/raw/branch/main/Makefile`.
|
||||||
|
|
||||||
|
- Always use Makefile targets (`make fmt`, `make test`, `make lint`, etc.)
|
||||||
|
instead of invoking the underlying tools directly. The Makefile is the single
|
||||||
|
source of truth for how these operations are run.
|
||||||
|
|
||||||
|
- The Makefile is authoritative documentation for how the repo is used. Beyond
|
||||||
|
the required targets above, it should have targets for every common operation:
|
||||||
|
running a local development server (`make run`, `make dev`), re-initializing
|
||||||
|
or migrating the database (`make db-reset`, `make migrate`), building
|
||||||
|
artifacts (`make build`), generating code, seeding data, or anything else a
|
||||||
|
developer would do regularly. If someone checks out the repo and types
|
||||||
|
`make<tab>`, they should see every meaningful operation available. A new
|
||||||
|
contributor should be able to understand the entire development workflow by
|
||||||
|
reading the Makefile.
|
||||||
|
|
||||||
|
- Every repo should have a `Dockerfile`. All Dockerfiles must run `make check`
|
||||||
|
as a build step so the build fails if the branch is not green. For non-server
|
||||||
|
repos, the Dockerfile should bring up a development environment and run
|
||||||
|
`make check`. For server repos, `make check` should run as an early build
|
||||||
|
stage before the final image is assembled.
|
||||||
|
|
||||||
|
- Every repo should have a Gitea Actions workflow (`.gitea/workflows/`) that
|
||||||
|
runs `docker build .` on push. Since the Dockerfile already runs `make check`,
|
||||||
|
a successful build implies all checks pass.
|
||||||
|
|
||||||
|
- Use platform-standard formatters: `black` for Python, `prettier` for
|
||||||
|
JS/CSS/Markdown/HTML, `go fmt` for Go. Always use default configuration with
|
||||||
|
two exceptions: four-space indents (except Go), and `proseWrap: always` for
|
||||||
|
Markdown (hard-wrap at 80 columns). Documentation and writing repos (Markdown,
|
||||||
|
HTML, CSS) should also have `.prettierrc` and `.prettierignore`.
|
||||||
|
|
||||||
|
- Pre-commit hook: `make check` if local testing is possible, otherwise
|
||||||
|
`make lint && make fmt-check`. The Makefile should provide a `make hooks`
|
||||||
|
target to install the pre-commit hook.
|
||||||
|
|
||||||
|
- All repos with software must have tests that run via the platform-standard
|
||||||
|
test framework (`go test`, `pytest`, `jest`/`vitest`, etc.). If no meaningful
|
||||||
|
tests exist yet, add the most minimal test possible — e.g. importing the
|
||||||
|
module under test to verify it compiles/parses. There is no excuse for
|
||||||
|
`make test` to be a no-op.
|
||||||
|
|
||||||
|
- `make test` must complete in under 20 seconds. Add a 30-second timeout in the
|
||||||
|
Makefile.
|
||||||
|
|
||||||
|
- Docker builds must complete in under 5 minutes.
|
||||||
|
|
||||||
|
- `make check` must not modify any files in the repo. Tests may use temporary
|
||||||
|
directories.
|
||||||
|
|
||||||
|
- `main` must always pass `make check`, no exceptions.
|
||||||
|
|
||||||
|
- Never commit secrets. `.env` files, credentials, API keys, and private keys
|
||||||
|
must be in `.gitignore`. No exceptions.
|
||||||
|
|
||||||
|
- `.gitignore` should be comprehensive from the start: OS files (`.DS_Store`),
|
||||||
|
editor files (`.swp`, `*~`), language build artifacts, and `node_modules/`.
|
||||||
|
Fetch the standard `.gitignore` from
|
||||||
|
`https://git.eeqj.de/sneak/prompts/raw/branch/main/.gitignore` when setting up
|
||||||
|
a new repo.
|
||||||
|
|
||||||
|
- **No build artifacts in version control.** Code-derived data (compiled
|
||||||
|
bundles, minified output, generated assets) must never be committed to the
|
||||||
|
repository if it can be avoided. The build process (e.g. Dockerfile, Makefile)
|
||||||
|
should generate these at build time. Notable exception: Go protobuf generated
|
||||||
|
files (`.pb.go`) ARE committed because repos need to work with `go get`, which
|
||||||
|
downloads code but does not execute code generation.
|
||||||
|
|
||||||
|
- Never use `git add -A` or `git add .`. Always stage files explicitly by name.
|
||||||
|
|
||||||
|
- Never force-push to `main`.
|
||||||
|
|
||||||
|
- Make all changes on a feature branch. You can do whatever you want on a
|
||||||
|
feature branch.
|
||||||
|
|
||||||
|
- `.golangci.yml` is standardized and must _NEVER_ be modified by an agent, only
|
||||||
|
manually by the user. Fetch from
|
||||||
|
`https://git.eeqj.de/sneak/prompts/raw/branch/main/.golangci.yml`.
|
||||||
|
|
||||||
|
- When pinning images or packages by hash, add a comment above the reference
|
||||||
|
with the version and date (YYYY-MM-DD).
|
||||||
|
|
||||||
|
- Use `yarn`, not `npm`.
|
||||||
|
|
||||||
|
- Write all dates as YYYY-MM-DD (ISO 8601).
|
||||||
|
|
||||||
|
- Simple projects should be configured with environment variables.
|
||||||
|
|
||||||
|
- Dockerized web services listen on port 8080 by default, overridable with
|
||||||
|
`PORT`.
|
||||||
|
|
||||||
|
- **HTTP/web services must be hardened for production internet exposure before
  tagging 1.0.** This means full compliance with security best practices
  including, without limitation, all of the following:
  - **Security headers** on every response:
    - `Strict-Transport-Security` (HSTS) with `max-age` of at least one year
      and `includeSubDomains`.
    - `Content-Security-Policy` (CSP) with a restrictive default policy
      (`default-src 'self'` as a baseline, tightened per-resource as
      needed). Never use `unsafe-inline` or `unsafe-eval` unless
      unavoidable, and document the reason.
    - `X-Frame-Options: DENY` (or `SAMEORIGIN` if framing is required).
      Prefer the `frame-ancestors` CSP directive as the primary control.
    - `X-Content-Type-Options: nosniff`.
    - `Referrer-Policy: strict-origin-when-cross-origin` (or stricter).
    - `Permissions-Policy` restricting access to browser features the
      application does not use (camera, microphone, geolocation, etc.).
  - **Request and response limits:**
    - Maximum request body size enforced on all endpoints (e.g. Go
      `http.MaxBytesReader`). Choose a sane default per-route; never accept
      unbounded input.
    - Maximum response body size where applicable (e.g. paginated APIs).
    - `ReadTimeout` and `ReadHeaderTimeout` on the `http.Server` to defend
      against slowloris attacks.
    - `WriteTimeout` on the `http.Server`.
    - `IdleTimeout` on the `http.Server`.
    - Per-handler execution time limits via `context.WithTimeout`, chi
      `middleware.Timeout`, or stdlib `http.TimeoutHandler`.
  - **Authentication and session security:**
    - Rate limiting on password-based authentication endpoints. API keys are
      high-entropy and not susceptible to brute force, so they are exempt.
    - CSRF tokens on all state-mutating HTML forms. API endpoints
      authenticated via `Authorization` header (Bearer token, API key) are
      exempt because the browser does not attach these automatically.
    - Passwords stored using bcrypt, scrypt, or argon2 — never plain-text,
      MD5, or SHA.
    - Session cookies set with `HttpOnly`, `Secure`, and `SameSite=Lax` (or
      `Strict`) attributes.
  - **Reverse proxy awareness:**
    - True client IP detection when behind a reverse proxy
      (`X-Forwarded-For`, `X-Real-IP`). The application must accept
      forwarded headers only from a configured set of trusted proxy
      addresses — never trust `X-Forwarded-For` unconditionally.
  - **CORS:**
    - Authenticated endpoints must restrict `Access-Control-Allow-Origin` to
      an explicit allowlist of known origins. Wildcard (`*`) is acceptable
      only for public, unauthenticated read-only APIs.
  - **Error handling:**
    - Internal errors must never leak stack traces, SQL queries, file paths,
      or other implementation details to the client. Return generic error
      messages in production; detailed errors only when `DEBUG` is enabled.
  - **TLS:**
    - Services never terminate TLS directly. They are always deployed behind
      a TLS-terminating reverse proxy. The service itself listens on plain
      HTTP. However, HSTS headers and `Secure` cookie flags must still be
      set by the application so that the browser enforces HTTPS end-to-end.

  This list is non-exhaustive. Apply defense-in-depth: if a standard security
  hardening measure exists for HTTP services and is not listed here, it is
  still expected. When in doubt, harden.

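As a rough illustration of the security headers, request body limit, and server timeouts listed above, a Go sketch might look like the following. The names `defaultSecurityHeaders`, `harden`, and `newServer`, the 1 MiB body cap, and every duration are assumptions picked for the example; the policy itself does not prescribe them.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// maxBodyBytes is an assumed default; the policy asks for a sane per-route limit.
const maxBodyBytes = 1 << 20 // 1 MiB

// defaultSecurityHeaders returns one possible baseline for the headers named in
// the policy. The exact values are illustrative.
func defaultSecurityHeaders() map[string]string {
	return map[string]string{
		"Strict-Transport-Security": "max-age=31536000; includeSubDomains",
		"Content-Security-Policy":   "default-src 'self'; frame-ancestors 'none'",
		"X-Frame-Options":           "DENY",
		"X-Content-Type-Options":    "nosniff",
		"Referrer-Policy":           "strict-origin-when-cross-origin",
		"Permissions-Policy":        "camera=(), microphone=(), geolocation=()",
	}
}

// harden sets the headers on every response and caps the request body size.
func harden(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for name, value := range defaultSecurityHeaders() {
			w.Header().Set(name, value)
		}
		r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
		next.ServeHTTP(w, r)
	})
}

// newServer wires up the four http.Server timeouts plus a per-handler limit.
// All durations here are assumptions.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           http.TimeoutHandler(harden(h), 30*time.Second, "request timed out"),
		ReadHeaderTimeout: 5 * time.Second,
		ReadTimeout:       15 * time.Second,
		WriteTimeout:      60 * time.Second,
		IdleTimeout:       120 * time.Second,
	}
}

func main() {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	srv := newServer(":8080", ok)

	// Exercise the handler chain in-process instead of opening a socket.
	rec := httptest.NewRecorder()
	srv.Handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	for name := range defaultSecurityHeaders() {
		fmt.Printf("%s: %s\n", name, rec.Header().Get(name))
	}
}
```

In a real service `srv.ListenAndServe()` would replace the in-process request, with the listen address taken from `PORT`.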
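The reverse-proxy rule above (honor `X-Forwarded-For` only when the direct peer is a trusted proxy) could be sketched roughly as follows. The function name `clientIP`, its string-based signature, and the right-to-left walk over the header are assumptions made for this example, not something the policy specifies.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
	"strings"
)

// isTrusted reports whether addr falls inside any trusted proxy prefix.
func isTrusted(addr netip.Addr, trusted []netip.Prefix) bool {
	for _, p := range trusted {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// clientIP returns the connecting peer's address unless that peer is a trusted
// proxy. In that case it walks X-Forwarded-For from right to left and returns
// the first hop that is not itself a trusted proxy.
func clientIP(remoteAddr, forwardedFor string, trustedCIDRs []string) string {
	var trusted []netip.Prefix
	for _, c := range trustedCIDRs {
		if p, err := netip.ParsePrefix(c); err == nil {
			trusted = append(trusted, p)
		}
	}

	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port present
	}
	peer, err := netip.ParseAddr(host)
	if err != nil || !isTrusted(peer, trusted) {
		return host // direct or unknown peer: ignore forwarded headers
	}

	hops := strings.Split(forwardedFor, ",")
	for i := len(hops) - 1; i >= 0; i-- {
		addr, err := netip.ParseAddr(strings.TrimSpace(hops[i]))
		if err != nil {
			break // malformed entry: stop trusting the header
		}
		if !isTrusted(addr, trusted) {
			return addr.String()
		}
	}
	return host
}

func main() {
	trusted := []string{"10.0.0.0/8"}
	// Peer is a trusted proxy, so the forwarded chain is consulted.
	fmt.Println(clientIP("10.0.0.5:41234", "203.0.113.9, 10.0.0.7", trusted))
	// Peer is not trusted, so a spoofed header is ignored.
	fmt.Println(clientIP("198.51.100.2:5555", "1.2.3.4", trusted))
}
```

Walking from the right means a value the client prepends to the header is never reached before the address appended by the trusted proxy.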
- `README.md` is the primary documentation. Required sections:
  - **Description**: First line must include the project name, purpose,
    category (web server, SPA, CLI tool, etc.), license, and author. Example:
    "µPaaS is an MIT-licensed Go web application by @sneak that receives
    git-frontend webhooks and deploys applications via Docker in realtime."
  - **Getting Started**: Copy-pasteable install/usage code block.
  - **Rationale**: Why does this exist?
  - **Design**: How is the program structured?
  - **TODO**: Update meticulously, even between commits. When planning, put
    the todo list in the README so a new agent can pick up where the last one
    left off.
  - **License**: MIT, GPL, or WTFPL. Ask the user for new projects. Include a
    `LICENSE` file in the repo root and a License section in the README.
  - **Author**: [@sneak](https://sneak.berlin).

- First commit of a new repo should contain only `README.md`.

- Go module root: `sneak.berlin/go/<name>`. Always run `go mod tidy` before
  committing.

- Use SemVer.

- Database migrations live in `internal/db/migrations/` and must be embedded in
  the binary.
  - `000_migration.sql` — contains ONLY the creation of the migrations
    tracking table itself. Nothing else.
  - `001_schema.sql` — the full application schema.
  - **Pre-1.0.0:** never add additional migration files (002, 003, etc.).
    There is no installed base to migrate. Edit `001_schema.sql` directly.
  - **Post-1.0.0:** add new numbered migration files for each schema change.
    Never edit existing migrations after release.

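One way the numbered-migration convention above might be consumed is sketched below. It is written against `fs.FS`, which `embed.FS` satisfies, so the same function would work with files embedded via a `//go:embed migrations/*.sql` directive. The helper names `pendingMigrations` and `exampleFS`, the in-memory stand-in filesystem, and the `schema_migrations` table name are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"sort"
	"testing/fstest"
)

// exampleFS stands in for an embed.FS so the sketch runs without files on disk.
// The SQL bodies and the schema_migrations table name are hypothetical.
func exampleFS() fs.FS {
	return fstest.MapFS{
		"migrations/001_schema.sql": &fstest.MapFile{
			Data: []byte("CREATE TABLE example (id INTEGER PRIMARY KEY);"),
		},
		"migrations/000_migration.sql": &fstest.MapFile{
			Data: []byte("CREATE TABLE schema_migrations (name TEXT PRIMARY KEY);"),
		},
	}
}

// pendingMigrations lists the .sql files under dir in filename order and
// returns the base names that are not yet recorded as applied.
func pendingMigrations(fsys fs.FS, dir string, applied map[string]bool) ([]string, error) {
	matches, err := fs.Glob(fsys, dir+"/*.sql")
	if err != nil {
		return nil, err
	}
	sort.Strings(matches) // zero-padded prefixes sort in application order

	var pending []string
	for _, m := range matches {
		name := path.Base(m)
		if !applied[name] {
			pending = append(pending, name)
		}
	}
	return pending, nil
}

func main() {
	// Pretend the tracking table already exists and nothing else has run.
	applied := map[string]bool{"000_migration.sql": true}
	pending, err := pendingMigrations(exampleFS(), "migrations", applied)
	if err != nil {
		panic(err)
	}
	fmt.Println("pending:", pending)
}
```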
- All repos should have an `.editorconfig` enforcing the project's indentation
  settings.

- Avoid putting files in the repo root unless necessary. Root should contain
  only project-level config files (`README.md`, `Makefile`, `Dockerfile`,
  `LICENSE`, `.gitignore`, `.editorconfig`, `REPO_POLICIES.md`, and
  language-specific config). Everything else goes in a subdirectory. Canonical
  subdirectory names:
  - `bin/` — executable scripts and tools
  - `cmd/` — Go command entrypoints
  - `configs/` — configuration templates and examples
  - `deploy/` — deployment manifests (k8s, compose, terraform)
  - `docs/` — documentation and markdown (README.md stays in root)
  - `internal/` — Go internal packages
  - `internal/db/migrations/` — database migrations
  - `pkg/` — Go library packages
  - `share/` — systemd units, data files
  - `static/` — static assets (images, fonts, etc.)
  - `web/` — web frontend source

- When setting up a new repo, files from the `prompts` repo may be used as
  templates. Fetch them from
  `https://git.eeqj.de/sneak/prompts/raw/branch/main/<path>`.

- New repos must contain at minimum:
  - `README.md`, `.git`, `.gitignore`, `.editorconfig`
  - `LICENSE`, `REPO_POLICIES.md` (copy from the `prompts` repo)
  - `Makefile`
  - `Dockerfile`, `.dockerignore`
  - `.gitea/workflows/check.yml`
  - Go: `go.mod`, `go.sum`, `.golangci.yml`
  - JS: `package.json`, `yarn.lock`, `.prettierrc`, `.prettierignore`
  - Python: `pyproject.toml`

122
TODO.md
@@ -1,122 +0,0 @@
# TODO: mfer 1.0

## Design Questions

*sneak: please answer inline below each question. These are preserved for posterity.*

### Format Design

**1. Should `MFFileChecksum` be simplified?**
Currently it's a separate message wrapping a single `bytes multiHash` field. Since multihash already self-describes the algorithm, `repeated bytes hashes` directly on `MFFilePath` would be simpler and reduce per-file protobuf overhead. Is the extra message layer intentional (e.g. planning to add per-hash metadata like `verified_at`)?

> *answer:*

**2. Should file permissions/mode be stored?**
The format stores mtime/ctime but not Unix file permissions. For archival use (ExFAT, filesystem-independent checksums) this may not matter, but for software distribution or filesystem restoration it's a gap. Should we reserve a field now (e.g. `optional uint32 mode = 305`) even if we don't populate it yet?

> *answer:*

**3. Should `atime` be removed from the schema?**
Access time is volatile, non-deterministic, and often disabled (`noatime`). Including it means two manifests of the same directory at different times will differ, which conflicts with the determinism goal. Remove it, or document it as "never set by default"?

> *answer:*

**4. What are the path normalization rules?**
The proto has `string path` with no specification about: always forward-slash? Must be relative? No `..` components allowed? UTF-8 NFC vs NFD normalization (macOS vs Linux)? Max path length? This is a security issue (path traversal) and a cross-platform compatibility issue. What rules should the spec mandate?

> *answer:*

**5. Should we add a version byte after the magic?**
Currently `ZNAVSRFG` is followed immediately by protobuf. Adding a version byte (`ZNAVSRFG\x01`) would allow future framing changes without requiring protobuf parsing to detect the version. `MFFileOuter.Version` serves this purpose but requires successful deserialization to read. Worth the extra byte?

> *answer:*

**6. Should we add a length-prefix after the magic?**
Protobuf is not self-delimiting. If we ever want to concatenate manifests or append data after the protobuf, the current framing is insufficient. Add a varint or fixed-width length-prefix?

> *answer:*

### Signature Design

**7. What does the outer SHA-256 hash cover — compressed or uncompressed data?**
The review notes it currently hashes compressed data (good for verifying before decompression), but this should be explicitly documented. Which is the intended behavior?

> *answer:*

**8. Should `signatureString()` sign raw bytes instead of a hex-encoded string?**
Currently the canonical string is `MAGIC-UUID-MULTIHASH` with hex encoding, which adds a transformation layer. Signing the raw `sha256` bytes (or compressed `innerMessage` directly) would be simpler. Keep the string format or switch to raw bytes?

> *answer:*

**9. Should we support detached signature files (`.mf.sig`)?**
Embedded signatures are better for single-file distribution. Detached `.mf.sig` files follow the familiar `SHASUMS`/`SHASUMS.asc` pattern and are simpler for HTTP serving. Support both modes?

> *answer:*

**10. GPG vs pure-Go crypto for signatures?**
Shelling out to `gpg` is fragile (may not be installed, version-dependent output). `github.com/ProtonMail/go-crypto` provides pure-Go OpenPGP, or we could go Ed25519/signify (simpler, no key management). Which direction?

> *answer:*

### Implementation Design

**11. Should manifests be deterministic by default?**
This means: sort file entries by path, omit `createdAt` timestamp (or make it opt-in), no `atime`. Should determinism be the default, with a `--include-timestamps` flag to opt in?

> *answer:*

**12. Should we consolidate or keep both scanner/checker implementations?**
There are two parallel implementations: `mfer/scanner.go` + `mfer/checker.go` (typed with `FileSize`, `RelFilePath`) and `internal/scanner/` + `internal/checker/` (raw `int64`, `string`). The `mfer/` versions are superior. Delete the `internal/` versions?

> *answer:*

**13. Should the `manifest` type be exported?**
Currently unexported with exported constructors (`New`, `NewFromPaths`, etc.). Consumers can't declare `var m *mfer.manifest`. Export the type, or define an interface?

> *answer:*

**14. What should the Go module path be for 1.0?**
Currently mixed between `sneak.berlin/go/mfer` and `git.eeqj.de/sneak/mfer`. Which is canonical?

> *answer:*

---

## Implementation Plan

### Phase 1: Foundation (format correctness)

- [ ] Delete `internal/scanner/` and `internal/checker/` — consolidate on `mfer/` package versions; update CLI code
- [ ] Add deterministic file ordering — sort entries by path (lexicographic, byte-order) in `Builder.Build()`; add test asserting byte-identical output from two runs
- [ ] Add decompression size limit — `io.LimitReader` in `deserializeInner()` with `m.pbOuter.Size` as bound
- [ ] Fix `errors.Is` dead code in checker — replace with `os.IsNotExist(err)` or `errors.Is(err, fs.ErrNotExist)`
- [ ] Fix `AddFile` to verify size — check `totalRead == size` after reading, return error on mismatch
- [ ] Specify path invariants — add proto comments (UTF-8, forward-slash, relative, no `..`, no leading `/`); validate in `Builder.AddFile` and `Builder.AddFileWithHash`

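The path invariants in the last Phase 1 item could be checked along these lines. This is only a sketch: `validatePath` is a hypothetical helper, not an existing function in the `mfer` package, and treating any backslash as a violation of the forward-slash rule is an assumption the spec would still need to confirm.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// validatePath checks the invariants listed in the plan: valid UTF-8,
// forward-slash separators, relative, no leading "/", and no ".." components.
func validatePath(p string) error {
	switch {
	case p == "":
		return errors.New("path is empty")
	case !utf8.ValidString(p):
		return errors.New("path is not valid UTF-8")
	case strings.HasPrefix(p, "/"):
		return errors.New("path must be relative (no leading /)")
	case strings.Contains(p, "\\"):
		// Assumption: a backslash is read as a non-forward-slash separator.
		return errors.New("path must use forward slashes")
	}
	for _, seg := range strings.Split(p, "/") {
		if seg == ".." {
			return errors.New("path must not contain .. components")
		}
	}
	return nil
}

func main() {
	for _, p := range []string{"docs/readme.txt", "/etc/passwd", "a/../b", "..hidden/file"} {
		fmt.Printf("%q -> %v\n", p, validatePath(p))
	}
}
```

Splitting on `/` and comparing whole segments keeps names such as `..hidden` legal while still rejecting traversal components.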
### Phase 2: CLI polish

- [ ] Fix flag naming — all CLI flags use kebab-case as primary (`--include-dotfiles`, `--follow-symlinks`)
- [ ] Fix URL construction in fetch — use `BaseURL.JoinPath()` or `url.JoinPath()` instead of string concatenation
- [ ] Add progress rate-limiting to Checker — throttle to once per second, matching Scanner
- [ ] Add `--deterministic` flag (or make it default) — omit `createdAt`, sort files

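For the URL construction item, a minimal sketch of the `url.JoinPath` approach (available since Go 1.19) is shown here. The helper name `fileURL` and the example base URL are assumptions; the actual fetch code is not shown in this diff.

```go
package main

import (
	"fmt"
	"net/url"
)

// fileURL joins a manifest-relative path onto a base URL. Unlike string
// concatenation, the result does not depend on whether base ends in a slash.
func fileURL(base, relPath string) (string, error) {
	return url.JoinPath(base, relPath)
}

func main() {
	for _, base := range []string{"https://example.com/files", "https://example.com/files/"} {
		u, err := fileURL(base, "sub/a.txt")
		if err != nil {
			panic(err)
		}
		fmt.Println(u)
	}
}
```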
### Phase 3: Robustness

- [ ] Replace GPG subprocess with pure-Go crypto — `github.com/ProtonMail/go-crypto` or Ed25519/signify
- [ ] Add timeout to any remaining subprocess calls
- [ ] Add fuzzing tests for `NewManifestFromReader`
- [ ] Add retry logic to fetch — exponential backoff for transient HTTP errors

### Phase 4: Format finalization

- [ ] Remove or deprecate `atime` from proto (pending design question answer)
- [ ] Reserve `optional uint32 mode = 305` in `MFFilePath` for future file permissions
- [ ] Add version byte after magic — `ZNAVSRFG\x01` for format version 1
- [ ] Write format specification document — separate from README: magic, outer structure, compression, inner structure, path invariants, signature scheme, canonical serialization

### Phase 5: Release prep

- [ ] Finalize Go module path
- [ ] Audit all error messages for consistency and helpfulness
- [ ] Add `--version` output matching SemVer
- [ ] Tag v1.0.0