add REPO_POLICIES.md, rename CLAUDE.md to AGENTS.md, deduplicate
- Add REPO_POLICIES.md from the standard template (sneak/prompts) - Rename CLAUDE.md to AGENTS.md with agent-specific workflow instructions - Move policy-like rules (git add, make targets, etc.) to REPO_POLICIES.md - Add reference to REPO_POLICIES.md in README Participation section - Run make fmt (prettier formatting on markdown files)
This commit is contained in:
30
TODO.md
30
TODO.md
@@ -2,83 +2,83 @@
|
||||
|
||||
## Design Questions
|
||||
|
||||
*sneak: please answer inline below each question. These are preserved for posterity.*
|
||||
_sneak: please answer inline below each question. These are preserved for posterity._
|
||||
|
||||
### Format Design
|
||||
|
||||
**1. Should `MFFileChecksum` be simplified?**
|
||||
Currently it's a separate message wrapping a single `bytes multiHash` field. Since multihash already self-describes the algorithm, `repeated bytes hashes` directly on `MFFilePath` would be simpler and reduce per-file protobuf overhead. Is the extra message layer intentional (e.g. planning to add per-hash metadata like `verified_at`)?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**2. Should file permissions/mode be stored?**
|
||||
The format stores mtime/ctime but not Unix file permissions. For archival use (ExFAT, filesystem-independent checksums) this may not matter, but for software distribution or filesystem restoration it's a gap. Should we reserve a field now (e.g. `optional uint32 mode = 305`) even if we don't populate it yet?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**3. Should `atime` be removed from the schema?**
|
||||
Access time is volatile, non-deterministic, and often disabled (`noatime`). Including it means two manifests of the same directory at different times will differ, which conflicts with the determinism goal. Remove it, or document it as "never set by default"?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**4. What are the path normalization rules?**
|
||||
The proto has `string path` with no specification about: always forward-slash? Must be relative? No `..` components allowed? UTF-8 NFC vs NFD normalization (macOS vs Linux)? Max path length? This is a security issue (path traversal) and a cross-platform compatibility issue. What rules should the spec mandate?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**5. Should we add a version byte after the magic?**
|
||||
Currently `ZNAVSRFG` is followed immediately by protobuf. Adding a version byte (`ZNAVSRFG\x01`) would allow future framing changes without requiring protobuf parsing to detect the version. `MFFileOuter.Version` serves this purpose but requires successful deserialization to read. Worth the extra byte?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**6. Should we add a length-prefix after the magic?**
|
||||
Protobuf is not self-delimiting. If we ever want to concatenate manifests or append data after the protobuf, the current framing is insufficient. Add a varint or fixed-width length-prefix?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
### Signature Design
|
||||
|
||||
**7. What does the outer SHA-256 hash cover — compressed or uncompressed data?**
|
||||
The review notes it currently hashes compressed data (good for verifying before decompression), but this should be explicitly documented. Which is the intended behavior?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**8. Should `signatureString()` sign raw bytes instead of a hex-encoded string?**
|
||||
Currently the canonical string is `MAGIC-UUID-MULTIHASH` with hex encoding, which adds a transformation layer. Signing the raw `sha256` bytes (or compressed `innerMessage` directly) would be simpler. Keep the string format or switch to raw bytes?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**9. Should we support detached signature files (`.mf.sig`)?**
|
||||
Embedded signatures are better for single-file distribution. Detached `.mf.sig` files follow the familiar `SHASUMS`/`SHASUMS.asc` pattern and are simpler for HTTP serving. Support both modes?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**10. GPG vs pure-Go crypto for signatures?**
|
||||
Shelling out to `gpg` is fragile (may not be installed, version-dependent output). `github.com/ProtonMail/go-crypto` provides pure-Go OpenPGP, or we could go Ed25519/signify (simpler, no key management). Which direction?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
### Implementation Design
|
||||
|
||||
**11. Should manifests be deterministic by default?**
|
||||
This means: sort file entries by path, omit `createdAt` timestamp (or make it opt-in), no `atime`. Should determinism be the default, with a `--include-timestamps` flag to opt in?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**12. Should we consolidate or keep both scanner/checker implementations?**
|
||||
There are two parallel implementations: `mfer/scanner.go` + `mfer/checker.go` (typed with `FileSize`, `RelFilePath`) and `internal/scanner/` + `internal/checker/` (raw `int64`, `string`). The `mfer/` versions are superior. Delete the `internal/` versions?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**13. Should the `manifest` type be exported?**
|
||||
Currently unexported with exported constructors (`New`, `NewFromPaths`, etc.). Consumers can't declare `var m *mfer.manifest`. Export the type, or define an interface?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
**14. What should the Go module path be for 1.0?**
|
||||
Currently mixed between `sneak.berlin/go/mfer` and `git.eeqj.de/sneak/mfer`. Which is canonical?
|
||||
|
||||
> *answer:*
|
||||
> _answer:_
|
||||
|
||||
---
|
||||
|
||||
|
||||
Reference in New Issue
Block a user