Compare commits
5 Commits
64490e0d17...main
| Author | SHA1 | Date | |
|---|---|---|---|
| | 6ba32f5b35 | | |
| | e62c709d42 | | |
| | 89903fa1cd | | |
| | b3d10106e1 | | |
| | 9712c10fe3 | | |
3
.gitignore
vendored
@@ -6,5 +6,8 @@
 vendor.tzst
 modcache.tzst
 
+# Generated manifest files
+.index.mf
+
 # Stale files
 .drone.yml
29
AGENTS.md
Normal file
@@ -0,0 +1,29 @@
+# Agent Instructions
+
+Read `REPO_POLICIES.md` before making any changes. It is the authoritative
+source for coding standards, formatting, linting, and workflow rules.
+
+## Workflow
+
+- When fixing a bug, write a failing test FIRST. Only after the test fails,
+  write the code to fix the bug. Then ensure the test passes. Leave the test in
+  place and commit it with the bugfix. Don't run shell commands to test bugfixes
+  or reproduce bugs. Write tests!
+
+- After each change, run `make fmt`, then `make test`, then `make lint`. Fix any
+  failures before committing.
+
+- After each change, commit only the files you've changed. Push after committing.
+
+## Attribution
+
+- Never mention Claude, Anthropic, or any AI/LLM tooling in commit messages. Do
+  not use attribution.
+
+## Repository-Specific Notes
+
+- This is a Go library + CLI tool for generating `.mf` manifest files.
+- The proto definition is in `mfer/mf.proto`; generated `.pb.go` files are
+  committed (required for `go get` compatibility).
+- The format specification is in `FORMAT.md`.
+- See `TODO.md` for the 1.0 implementation plan and open design questions.
20
CLAUDE.md
@@ -1,20 +0,0 @@
-# Important Rules
-
-- when fixing a bug, write a failing test FIRST. only after the test fails, write
-the code to fix the bug. then ensure the test passes. leave the test in
-place and commit it with the bugfix. don't run shell commands to test
-bugfixes or reproduce bugs. write tests!
-
-- never, ever mention claude or anthropic in commit messages. do not use attribution
-
-- after each change, run "make fmt".
-
-- after each change, run "make test" and ensure all tests pass.
-
-- after each change, run "make lint" and ensure no linting errors. fix any
-you find, one by one.
-
-- after each change, commit the files you've changed. push after
-committing.
-
-- NEVER use `git add -A`. always add only individual files that you've changed.
31
Dockerfile
@@ -1,9 +1,36 @@
-FROM golang@sha256:60deed95d3888cc5e4d9ff8a10c54e5edc008c6ae3fba6187be6fb592e19e8c0 AS builder
+# Lint stage — fast feedback on formatting and lint issues
+# golangci/golangci-lint:v2.0.2 (2026-03-14)
+FROM golangci/golangci-lint@sha256:d55581f7797e7a0877a7c3aaa399b01bdc57d2874d6412601a046cc4062cb62e AS lint
+
 WORKDIR /src
 COPY go.mod go.sum ./
 RUN go mod download
+
 COPY . .
-RUN go test -v --timeout 30s ./...
+
+# Touch .pb.go so make does not try to regenerate via protoc (file is committed)
+RUN touch mfer/mf.pb.go
+
+RUN make fmt-check
+RUN make lint
+
+# Build stage — tests and compilation
+# golang:1.23 (2026-03-14)
+FROM golang@sha256:60deed95d3888cc5e4d9ff8a10c54e5edc008c6ae3fba6187be6fb592e19e8c0 AS builder
+
+# Force BuildKit to run the lint stage by creating a stage dependency
+COPY --from=lint /src/go.sum /dev/null
+
+WORKDIR /src
+COPY go.mod go.sum ./
+RUN go mod download
+
+COPY . .
+
+# Touch .pb.go so make does not try to regenerate via protoc (file is committed)
+RUN touch mfer/mf.pb.go
+
+RUN make test
 RUN cd cmd/mfer && go build -tags urfave_cli_no_docs -o /mfer .
 
 FROM scratch
51
FORMAT.md
@@ -25,17 +25,17 @@ See [`mfer/mf.proto`](mfer/mf.proto) for exact field numbers and types.
 
 The outer message contains:
 
 | Field | Number | Type | Description |
-|--------------------|--------|-------------------|--------------------------------------------------|
+| ----------------- | ------ | ---------------- | ------------------------------------------------------------------------ |
 | `version` | 101 | enum | Must be `VERSION_ONE` (1) |
 | `compressionType` | 102 | enum | Compression of `innerMessage`; must be `COMPRESSION_ZSTD` (1) |
 | `size` | 103 | int64 | Uncompressed size of `innerMessage` (corruption detection) |
 | `sha256` | 104 | bytes | SHA-256 hash of the **compressed** `innerMessage` (corruption detection) |
 | `uuid` | 105 | bytes | Random v4 UUID; must match the inner message UUID |
 | `innerMessage` | 199 | bytes | Zstd-compressed serialized `MFFile` message |
 | `signature` | 201 | bytes (optional) | GPG signature (ASCII-armored or binary) |
 | `signer` | 202 | bytes (optional) | Full GPG key ID of the signer |
 | `signingPubKey` | 203 | bytes (optional) | Full GPG signing public key |
 
 ### SHA-256 Hash
 
@@ -54,25 +54,25 @@ decompression bombs. The reference implementation limits decompressed size to
 After decompressing `innerMessage`, the result is a serialized `MFFile`
 (referred to as the manifest):
 
 | Field | Number | Type | Description |
-|-------------|--------|-----------------------|--------------------------------------------|
+| ----------- | ------ | --------------------- | ------------------------------------- |
 | `version` | 100 | enum | Must be `VERSION_ONE` (1) |
 | `files` | 101 | repeated `MFFilePath` | List of files in the manifest |
 | `uuid` | 102 | bytes | Random v4 UUID; must match outer UUID |
 | `createdAt` | 201 | Timestamp (optional) | When the manifest was created |
 
 ## File Entries (`MFFilePath`)
 
 Each file entry contains:
 
 | Field | Number | Type | Description |
-|------------|--------|---------------------------|--------------------------------------|
+| ---------- | ------ | ------------------------- | ----------------------------------- |
 | `path` | 1 | string | Relative file path (see Path Rules) |
 | `size` | 2 | int64 | File size in bytes |
 | `hashes` | 3 | repeated `MFFileChecksum` | At least one hash required |
 | `mimeType` | 301 | string (optional) | MIME type |
 | `mtime` | 302 | Timestamp (optional) | Modification time |
 | `ctime` | 303 | Timestamp (optional) | Change time (inode metadata change) |
 
 Field 304 (`atime`) has been removed from the specification. Access time is
 volatile and non-deterministic; it is not useful for integrity verification.
@@ -111,6 +111,7 @@ ZNAVSRFG-<UUID>-<SHA256>
 ```
 
 Where:
+
 - `ZNAVSRFG` is the magic bytes string (literal ASCII)
 - `<UUID>` is the hex-encoded UUID from the outer message
 - `<SHA256>` is the hex-encoded SHA-256 hash from the outer message (covering compressed data)
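The `ZNAVSRFG-<UUID>-<SHA256>` layout described above can be sketched in Go as follows. The function name `buildString` is hypothetical, and two details are assumptions not confirmed by this excerpt: lowercase hex output, and a UUID rendered as plain hex without dashes.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// magic is the literal ASCII prefix named in the format description.
const magic = "ZNAVSRFG"

// buildString joins the magic prefix, the hex-encoded UUID, and the
// hex-encoded SHA-256 with single dashes. hex.EncodeToString emits lowercase
// hex; whether the specification requires a particular case is an assumption.
func buildString(uuid, sha256sum []byte) string {
	return magic + "-" + hex.EncodeToString(uuid) + "-" + hex.EncodeToString(sha256sum)
}

func main() {
	// Shortened placeholder values; a real UUID is 16 bytes and a real
	// SHA-256 digest is 32 bytes.
	uuid := []byte{0x01, 0x23, 0xab}
	sum := []byte{0xde, 0xad}
	fmt.Println(buildString(uuid, sum)) // prints "ZNAVSRFG-0123ab-dead"
}
```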
15
Makefile
@@ -5,7 +5,7 @@ export PATH := $(PATH):$(GOPATH)/bin
 PROTOC_GEN_GO := $(GOPATH)/bin/protoc-gen-go
 SOURCEFILES := mfer/*.go mfer/*.proto internal/*/*.go cmd/*/*.go go.mod go.sum
 ARCH := $(shell uname -m)
-GITREV_BUILD := $(shell bash $(PWD)/bin/gitrev.sh)
+GITREV_BUILD := $(shell bash $(PWD)/bin/gitrev.sh 2>/dev/null || echo unknown)
 APPNAME := mfer
 VERSION := 0.1.0
 export DOCKER_IMAGE_CACHE_DIR := $(HOME)/Library/Caches/Docker/$(APPNAME)-$(ARCH)
@@ -13,7 +13,7 @@ GOLDFLAGS += -X main.Version=$(VERSION)
 GOLDFLAGS += -X main.Gitrev=$(GITREV_BUILD)
 GOFLAGS := -ldflags "$(GOLDFLAGS)"
 
-.PHONY: docker default run ci test fixme
+.PHONY: docker default run ci test check lint fmt fmt-check hooks fixme
 
 default: fmt test
 
@@ -32,8 +32,17 @@ $(PROTOC_GEN_GO):
 fixme:
 	@grep -nir fixme . | grep -v Makefile
 
+check: test lint fmt-check
+
+fmt-check: mfer/mf.pb.go
+	sh -c 'test -z "$$(gofmt -l .)"'
+
+hooks:
+	echo '#!/bin/sh\nmake check' > .git/hooks/pre-commit
+	chmod +x .git/hooks/pre-commit
+
 devprereqs:
-	which golangci-lint || go install -v github.com/golangci/golangci-lint/cmd/golangci-lint@latest
+	which golangci-lint || go install -v github.com/golangci/golangci-lint/cmd/golangci-lint@v2.0.2
 
 mfer/mf.pb.go: mfer/mf.proto
 	cd mfer && go generate .
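The Makefile above passes `-X main.Version=$(VERSION)` and `-X main.Gitrev=$(GITREV_BUILD)` to the linker. As a rough sketch of the receiving side, a `main` package might declare the two variables as shown here. The variable names come from the ldflags; the default values and the `versionString` helper are assumptions for illustration and are not taken from the repository.

```go
package main

import "fmt"

// Version and Gitrev are overwritten at link time by
// -ldflags "-X main.Version=... -X main.Gitrev=...".
// The defaults below are hypothetical and only apply to builds that
// skip those flags (for example a plain `go run`).
var (
	Version = "dev"
	Gitrev  = "unknown"
)

// versionString is a hypothetical helper that formats both values.
func versionString() string {
	return fmt.Sprintf("%s (%s)", Version, Gitrev)
}

func main() {
	fmt.Println(versionString())
}
```

With the `|| echo unknown` fallback added in this change, `Gitrev` would be set to `unknown` when `bin/gitrev.sh` fails, for instance in a build context without Git metadata.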
33
README.md
@@ -3,25 +3,25 @@
 [mfer](https://git.eeqj.de/sneak/mfer) is a reference implementation library
 and thin wrapper command-line utility written in [Go](https://golang.org)
 and first published in 2022 under the [WTFPL](https://wtfpl.net) (public
 domain) license. It specifies and generates `.mf` manifest files over a
 directory tree of files to encapsulate metadata about them (such as
 cryptographic checksums or signatures over same) to aid in archiving,
 downloading, and streaming, or mirroring. The manifest files' data is
 serialized with Google's [protobuf serialization
 format](https://developers.google.com/protocol-buffers). The structure of
 these files can be found [in the format
 specification](https://git.eeqj.de/sneak/mfer/src/branch/main/mfer/mf.proto)
 which is included in the [project
 repository](https://git.eeqj.de/sneak/mfer).
 
 The current version is pre-1.0 and while the repo was published in 2022,
 there has not yet been any versioned release. [SemVer](https://semver.org)
 will be used for releases.
 
 This project was started by [@sneak](https://sneak.berlin) to scratch an
 itch in 2022 and is currently a one-person effort, though the goal is for
 this to emerge as a de-facto standard and be incorporated into other
 software. A compatible javascript library is planned.
 
 # Build Status
 
@@ -30,18 +30,20 @@ software. A compatible javascript library is planned.
 # Participation
 
 The community is as yet nonexistent so there are no defined policies or
 norms yet. Primary development happens on a privately-run Gitea instance at
 [https://git.eeqj.de/sneak/mfer](https://git.eeqj.de/sneak/mfer) and issues
 are [tracked there](https://git.eeqj.de/sneak/mfer/issues).
 
 Changes must always be formatted with a standard `go fmt`, syntactically
 valid, and must pass the linting defined in the repository (presently only
 the `golangci-lint` defaults), which can be run with a `make lint`. The
 `main` branch is protected and all changes must be made via [pull
 requests](https://git.eeqj.de/sneak/mfer/pulls) and pass CI to be merged.
 Any changes submitted to this project must also be
 [WTFPL-licensed](https://wtfpl.net) to be considered.
 
+See [`REPO_POLICIES.md`](REPO_POLICIES.md) for detailed coding standards,
+tooling requirements, and workflow conventions.
 
 # Problem Statement
 
@@ -123,7 +125,6 @@ The manifest file would do several important things:
 # Open Questions
 
 - Should the manifest file include checksums of individual file chunks, or just for the whole assembled file?
-
 - If so, should the chunksize be fixed or dynamic?
 
 - Should the manifest signature format be GnuPG signatures, or those from
@@ -211,20 +212,20 @@ desired username for an account on this Gitea instance.
 
 ## Prior Art: Metalink
 
-* [Metalink - Mozilla Wiki](https://wiki.mozilla.org/Metalink)
-* [Metalink - Wikipedia](https://en.wikipedia.org/wiki/Metalink)
-* [RFC 5854 - The Metalink Download Description Format](https://datatracker.ietf.org/doc/html/rfc5854)
-* [RFC 6249 - Metalink/HTTP: Mirrors and Hashes](https://www.rfc-editor.org/rfc/rfc6249.html)
+- [Metalink - Mozilla Wiki](https://wiki.mozilla.org/Metalink)
+- [Metalink - Wikipedia](https://en.wikipedia.org/wiki/Metalink)
+- [RFC 5854 - The Metalink Download Description Format](https://datatracker.ietf.org/doc/html/rfc5854)
+- [RFC 6249 - Metalink/HTTP: Mirrors and Hashes](https://www.rfc-editor.org/rfc/rfc6249.html)
 
 ## Links
 
-* Repo: [https://git.eeqj.de/sneak/mfer](https://git.eeqj.de/sneak/mfer)
-* Issues: [https://git.eeqj.de/sneak/mfer/issues](https://git.eeqj.de/sneak/mfer/issues)
+- Repo: [https://git.eeqj.de/sneak/mfer](https://git.eeqj.de/sneak/mfer)
+- Issues: [https://git.eeqj.de/sneak/mfer/issues](https://git.eeqj.de/sneak/mfer/issues)
 
 # Authors
 
-* [@sneak <sneak@sneak.berlin>](mailto:sneak@sneak.berlin)
+- [@sneak <sneak@sneak.berlin>](mailto:sneak@sneak.berlin)
 
 # License
 
-* [WTFPL](https://wtfpl.net)
+- [WTFPL](https://wtfpl.net)
255
REPO_POLICIES.md
Normal file
@@ -0,0 +1,255 @@
+---
+title: Repository Policies
+last_modified: 2026-03-10
+---
+
+This document covers repository structure, tooling, and workflow standards. Code
+style conventions are in separate documents:
+
+- [Code Styleguide](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/CODE_STYLEGUIDE.md)
+  (general, bash, Docker)
+- [Go](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/CODE_STYLEGUIDE_GO.md)
+- [JavaScript](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/CODE_STYLEGUIDE_JS.md)
+- [Python](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/CODE_STYLEGUIDE_PYTHON.md)
+- [Go HTTP Server Conventions](https://git.eeqj.de/sneak/prompts/raw/branch/main/prompts/GO_HTTP_SERVER_CONVENTIONS.md)
+
+---
+
+- Cross-project documentation (such as this file) must include
+  `last_modified: YYYY-MM-DD` in the YAML front matter so it can be kept in sync
+  with the authoritative source as policies evolve.
+
+- **ALL external references must be pinned by cryptographic hash.** This
+  includes Docker base images, Go modules, npm packages, GitHub Actions, and
+  anything else fetched from a remote source. Version tags (`@v4`, `@latest`,
+  `:3.21`, etc.) are server-mutable and therefore remote code execution
+  vulnerabilities. The ONLY acceptable way to reference an external dependency
+  is by its content hash (Docker `@sha256:...`, Go module hash in `go.sum`, npm
+  integrity hash in lockfile, GitHub Actions `@<commit-sha>`). No exceptions.
+  This also means never `curl | bash` to install tools like pyenv, nvm, rustup,
+  etc. Instead, download a specific release archive from GitHub, verify its hash
+  (hardcoded in the Dockerfile or script), and only then install. Unverified
+  install scripts are arbitrary remote code execution. This is the single most
+  important rule in this document. Double-check every external reference in
+  every file before committing. There are zero exceptions to this rule.
+
+- Every repo with software must have a root `Makefile` with these targets:
+  `make test`, `make lint`, `make fmt` (writes), `make fmt-check` (read-only),
+  `make check` (prereqs: `test`, `lint`, `fmt-check`), `make docker`, and
+  `make hooks` (installs pre-commit hook). A model Makefile is at
+  `https://git.eeqj.de/sneak/prompts/raw/branch/main/Makefile`.
+
+- Always use Makefile targets (`make fmt`, `make test`, `make lint`, etc.)
+  instead of invoking the underlying tools directly. The Makefile is the single
+  source of truth for how these operations are run.
+
+- The Makefile is authoritative documentation for how the repo is used. Beyond
+  the required targets above, it should have targets for every common operation:
+  running a local development server (`make run`, `make dev`), re-initializing
+  or migrating the database (`make db-reset`, `make migrate`), building
+  artifacts (`make build`), generating code, seeding data, or anything else a
+  developer would do regularly. If someone checks out the repo and types
+  `make<tab>`, they should see every meaningful operation available. A new
+  contributor should be able to understand the entire development workflow by
+  reading the Makefile.
+
+- Every repo should have a `Dockerfile`. All Dockerfiles must run `make check`
+  as a build step so the build fails if the branch is not green. For non-server
+  repos, the Dockerfile should bring up a development environment and run
+  `make check`. For server repos, `make check` should run as an early build
+  stage before the final image is assembled.
+
+- Every repo should have a Gitea Actions workflow (`.gitea/workflows/`) that
+  runs `docker build .` on push. Since the Dockerfile already runs `make check`,
+  a successful build implies all checks pass.
+
+- Use platform-standard formatters: `black` for Python, `prettier` for
+  JS/CSS/Markdown/HTML, `go fmt` for Go. Always use default configuration with
+  two exceptions: four-space indents (except Go), and `proseWrap: always` for
+  Markdown (hard-wrap at 80 columns). Documentation and writing repos (Markdown,
+  HTML, CSS) should also have `.prettierrc` and `.prettierignore`.
+
+- Pre-commit hook: `make check` if local testing is possible, otherwise
+  `make lint && make fmt-check`. The Makefile should provide a `make hooks`
+  target to install the pre-commit hook.
+
+- All repos with software must have tests that run via the platform-standard
+  test framework (`go test`, `pytest`, `jest`/`vitest`, etc.). If no meaningful
+  tests exist yet, add the most minimal test possible — e.g. importing the
+  module under test to verify it compiles/parses. There is no excuse for
+  `make test` to be a no-op.
+
+- `make test` must complete in under 20 seconds. Add a 30-second timeout in the
+  Makefile.
+
+- Docker builds must complete in under 5 minutes.
+
+- `make check` must not modify any files in the repo. Tests may use temporary
+  directories.
+
+- `main` must always pass `make check`, no exceptions.
+
+- Never commit secrets. `.env` files, credentials, API keys, and private keys
+  must be in `.gitignore`. No exceptions.
+
+- `.gitignore` should be comprehensive from the start: OS files (`.DS_Store`),
+  editor files (`.swp`, `*~`), language build artifacts, and `node_modules/`.
+  Fetch the standard `.gitignore` from
+  `https://git.eeqj.de/sneak/prompts/raw/branch/main/.gitignore` when setting up
+  a new repo.
+
+- **No build artifacts in version control.** Code-derived data (compiled
+  bundles, minified output, generated assets) must never be committed to the
+  repository if it can be avoided. The build process (e.g. Dockerfile, Makefile)
+  should generate these at build time. Notable exception: Go protobuf generated
+  files (`.pb.go`) ARE committed because repos need to work with `go get`, which
+  downloads code but does not execute code generation.
+
+- Never use `git add -A` or `git add .`. Always stage files explicitly by name.
+
+- Never force-push to `main`.
+
+- Make all changes on a feature branch. You can do whatever you want on a
+  feature branch.
+
+- `.golangci.yml` is standardized and must _NEVER_ be modified by an agent, only
+  manually by the user. Fetch from
+  `https://git.eeqj.de/sneak/prompts/raw/branch/main/.golangci.yml`.
+
+- When pinning images or packages by hash, add a comment above the reference
+  with the version and date (YYYY-MM-DD).
+
+- Use `yarn`, not `npm`.
+
+- Write all dates as YYYY-MM-DD (ISO 8601).
+
+- Simple projects should be configured with environment variables.
+
+- Dockerized web services listen on port 8080 by default, overridable with
+  `PORT`.
+
+- **HTTP/web services must be hardened for production internet exposure before
+  tagging 1.0.** This means full compliance with security best practices
+  including, without limitation, all of the following:
+    - **Security headers** on every response:
+        - `Strict-Transport-Security` (HSTS) with `max-age` of at least one year
+          and `includeSubDomains`.
+        - `Content-Security-Policy` (CSP) with a restrictive default policy
+          (`default-src 'self'` as a baseline, tightened per-resource as
+          needed). Never use `unsafe-inline` or `unsafe-eval` unless
+          unavoidable, and document the reason.
+        - `X-Frame-Options: DENY` (or `SAMEORIGIN` if framing is required).
+          Prefer the `frame-ancestors` CSP directive as the primary control.
+        - `X-Content-Type-Options: nosniff`.
+        - `Referrer-Policy: strict-origin-when-cross-origin` (or stricter).
+        - `Permissions-Policy` restricting access to browser features the
+          application does not use (camera, microphone, geolocation, etc.).
+    - **Request and response limits:**
+        - Maximum request body size enforced on all endpoints (e.g. Go
+          `http.MaxBytesReader`). Choose a sane default per-route; never accept
+          unbounded input.
+        - Maximum response body size where applicable (e.g. paginated APIs).
+        - `ReadTimeout` and `ReadHeaderTimeout` on the `http.Server` to defend
+          against slowloris attacks.
+        - `WriteTimeout` on the `http.Server`.
+        - `IdleTimeout` on the `http.Server`.
+        - Per-handler execution time limits via `context.WithTimeout` or
+          chi/stdlib `middleware.Timeout`.
+    - **Authentication and session security:**
+        - Rate limiting on password-based authentication endpoints. API keys are
+          high-entropy and not susceptible to brute force, so they are exempt.
+        - CSRF tokens on all state-mutating HTML forms. API endpoints
+          authenticated via `Authorization` header (Bearer token, API key) are
+          exempt because the browser does not attach these automatically.
+        - Passwords stored using bcrypt, scrypt, or argon2 — never plain-text,
+          MD5, or SHA.
+        - Session cookies set with `HttpOnly`, `Secure`, and `SameSite=Lax` (or
+          `Strict`) attributes.
+    - **Reverse proxy awareness:**
+        - True client IP detection when behind a reverse proxy
+          (`X-Forwarded-For`, `X-Real-IP`). The application must accept
+          forwarded headers only from a configured set of trusted proxy
+          addresses — never trust `X-Forwarded-For` unconditionally.
+    - **CORS:**
+        - Authenticated endpoints must restrict `Access-Control-Allow-Origin` to
+          an explicit allowlist of known origins. Wildcard (`*`) is acceptable
+          only for public, unauthenticated read-only APIs.
+    - **Error handling:**
+        - Internal errors must never leak stack traces, SQL queries, file paths,
+          or other implementation details to the client. Return generic error
+          messages in production; detailed errors only when `DEBUG` is enabled.
+    - **TLS:**
+        - Services never terminate TLS directly. They are always deployed behind
+          a TLS-terminating reverse proxy. The service itself listens on plain
+          HTTP. However, HSTS headers and `Secure` cookie flags must still be
+          set by the application so that the browser enforces HTTPS end-to-end.
+
+    This list is non-exhaustive. Apply defense-in-depth: if a standard security
+    hardening measure exists for HTTP services and is not listed here, it is
+    still expected. When in doubt, harden.
+
|
- `README.md` is the primary documentation. Required sections:
|
||||||
|
- **Description**: First line must include the project name, purpose,
|
||||||
|
category (web server, SPA, CLI tool, etc.), license, and author. Example:
|
||||||
|
"µPaaS is an MIT-licensed Go web application by @sneak that receives
|
||||||
|
git-frontend webhooks and deploys applications via Docker in realtime."
|
||||||
|
- **Getting Started**: Copy-pasteable install/usage code block.
|
||||||
|
- **Rationale**: Why does this exist?
|
||||||
|
- **Design**: How is the program structured?
|
||||||
|
- **TODO**: Update meticulously, even between commits. When planning, put
|
||||||
|
the todo list in the README so a new agent can pick up where the last one
|
||||||
|
left off.
|
||||||
|
- **License**: MIT, GPL, or WTFPL. Ask the user for new projects. Include a
|
||||||
|
`LICENSE` file in the repo root and a License section in the README.
|
||||||
|
- **Author**: [@sneak](https://sneak.berlin).
|
||||||
|
|
||||||
|
- First commit of a new repo should contain only `README.md`.
|
||||||
|
|
||||||
|
- Go module root: `sneak.berlin/go/<name>`. Always run `go mod tidy` before
|
||||||
|
committing.
|
||||||
|
|
||||||
|
- Use SemVer.
|
||||||
|
|
||||||
|
- Database migrations live in `internal/db/migrations/` and must be embedded in
|
||||||
|
the binary.
|
||||||
|
- `000_migration.sql` — contains ONLY the creation of the migrations
|
||||||
|
tracking table itself. Nothing else.
|
||||||
|
- `001_schema.sql` — the full application schema.
|
||||||
|
- **Pre-1.0.0:** never add additional migration files (002, 003, etc.).
|
||||||
|
There is no installed base to migrate. Edit `001_schema.sql` directly.
|
||||||
|
- **Post-1.0.0:** add new numbered migration files for each schema change.
|
||||||
|
Never edit existing migrations after release.
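
The migration rules above could be enforced mechanically, for example in a test. The following is a rough sketch under stated assumptions: `checkMigrations` is a hypothetical helper, the `released` flag stands in for "1.0.0 has shipped", and the file-name pattern for later migrations (`NNN_name.sql`) is an assumption, since the policy only says the files are numbered.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// migrationName matches names such as "002_add_users.sql". The exact pattern
// is an assumption; the policy only requires numbered files.
var migrationName = regexp.MustCompile(`^\d{3}_[a-z0-9_]+\.sql$`)

// checkMigrations validates a list of migration file names against the policy:
// 000_migration.sql and 001_schema.sql must exist, and any further files are
// only permitted once released is true (that is, after 1.0.0).
func checkMigrations(names []string, released bool) error {
	sorted := append([]string(nil), names...) // copy so the caller's slice is untouched
	sort.Strings(sorted)
	if len(sorted) < 2 || sorted[0] != "000_migration.sql" || sorted[1] != "001_schema.sql" {
		return fmt.Errorf("expected 000_migration.sql and 001_schema.sql first, got %v", sorted)
	}
	for _, n := range sorted[2:] {
		if !released {
			return fmt.Errorf("pre-1.0.0: %s is not allowed, edit 001_schema.sql instead", n)
		}
		if !migrationName.MatchString(n) {
			return fmt.Errorf("unexpected migration file name: %s", n)
		}
	}
	return nil
}

func main() {
	base := []string{"000_migration.sql", "001_schema.sql"}
	fmt.Println(checkMigrations(base, false))
	fmt.Println(checkMigrations(append(base, "002_add_users.sql"), false))
	fmt.Println(checkMigrations(append(base, "002_add_users.sql"), true))
}
```

In a real repo the name list would presumably come from reading the embedded migrations directory rather than from a literal slice.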

- All repos should have an `.editorconfig` enforcing the project's indentation
  settings.

- Avoid putting files in the repo root unless necessary. Root should contain
  only project-level config files (`README.md`, `Makefile`, `Dockerfile`,
  `LICENSE`, `.gitignore`, `.editorconfig`, `REPO_POLICIES.md`, and
  language-specific config). Everything else goes in a subdirectory. Canonical
  subdirectory names:
  - `bin/` — executable scripts and tools
  - `cmd/` — Go command entrypoints
  - `configs/` — configuration templates and examples
  - `deploy/` — deployment manifests (k8s, compose, terraform)
  - `docs/` — documentation and markdown (README.md stays in root)
  - `internal/` — Go internal packages
  - `internal/db/migrations/` — database migrations
  - `pkg/` — Go library packages
  - `share/` — systemd units, data files
  - `static/` — static assets (images, fonts, etc.)
  - `web/` — web frontend source

- When setting up a new repo, files from the `prompts` repo may be used as
  templates. Fetch them from
  `https://git.eeqj.de/sneak/prompts/raw/branch/main/<path>`.

- New repos must contain at minimum:
  - `README.md`, `.git`, `.gitignore`, `.editorconfig`
  - `LICENSE`, `REPO_POLICIES.md` (copy from the `prompts` repo)
  - `Makefile`
  - `Dockerfile`, `.dockerignore`
  - `.gitea/workflows/check.yml`
  - Go: `go.mod`, `go.sum`, `.golangci.yml`
  - JS: `package.json`, `yarn.lock`, `.prettierrc`, `.prettierignore`
  - Python: `pyproject.toml`
30
TODO.md
@@ -2,83 +2,83 @@

## Design Questions

-*sneak: please answer inline below each question. These are preserved for posterity.*
+_sneak: please answer inline below each question. These are preserved for posterity._

### Format Design

**1. Should `MFFileChecksum` be simplified?**
Currently it's a separate message wrapping a single `bytes multiHash` field. Since multihash already self-describes the algorithm, `repeated bytes hashes` directly on `MFFilePath` would be simpler and reduce per-file protobuf overhead. Is the extra message layer intentional (e.g. planning to add per-hash metadata like `verified_at`)?

-> *answer:*
+> _answer:_

**2. Should file permissions/mode be stored?**
The format stores mtime/ctime but not Unix file permissions. For archival use (ExFAT, filesystem-independent checksums) this may not matter, but for software distribution or filesystem restoration it's a gap. Should we reserve a field now (e.g. `optional uint32 mode = 305`) even if we don't populate it yet?

-> *answer:*
+> _answer:_

**3. Should `atime` be removed from the schema?**
Access time is volatile, non-deterministic, and often disabled (`noatime`). Including it means two manifests of the same directory at different times will differ, which conflicts with the determinism goal. Remove it, or document it as "never set by default"?

-> *answer:*
+> _answer:_

**4. What are the path normalization rules?**
The proto has `string path` with no specification about: always forward-slash? Must be relative? No `..` components allowed? UTF-8 NFC vs NFD normalization (macOS vs Linux)? Max path length? This is a security issue (path traversal) and a cross-platform compatibility issue. What rules should the spec mandate?

-> *answer:*
+> _answer:_

**5. Should we add a version byte after the magic?**
Currently `ZNAVSRFG` is followed immediately by protobuf. Adding a version byte (`ZNAVSRFG\x01`) would allow future framing changes without requiring protobuf parsing to detect the version. `MFFileOuter.Version` serves this purpose but requires successful deserialization to read. Worth the extra byte?

-> *answer:*
+> _answer:_

**6. Should we add a length-prefix after the magic?**
Protobuf is not self-delimiting. If we ever want to concatenate manifests or append data after the protobuf, the current framing is insufficient. Add a varint or fixed-width length-prefix?

-> *answer:*
+> _answer:_

### Signature Design

**7. What does the outer SHA-256 hash cover — compressed or uncompressed data?**
The review notes it currently hashes compressed data (good for verifying before decompression), but this should be explicitly documented. Which is the intended behavior?

-> *answer:*
+> _answer:_

**8. Should `signatureString()` sign raw bytes instead of a hex-encoded string?**
Currently the canonical string is `MAGIC-UUID-MULTIHASH` with hex encoding, which adds a transformation layer. Signing the raw `sha256` bytes (or compressed `innerMessage` directly) would be simpler. Keep the string format or switch to raw bytes?

-> *answer:*
+> _answer:_

**9. Should we support detached signature files (`.mf.sig`)?**
Embedded signatures are better for single-file distribution. Detached `.mf.sig` files follow the familiar `SHASUMS`/`SHASUMS.asc` pattern and are simpler for HTTP serving. Support both modes?

-> *answer:*
+> _answer:_

**10. GPG vs pure-Go crypto for signatures?**
Shelling out to `gpg` is fragile (may not be installed, version-dependent output). `github.com/ProtonMail/go-crypto` provides pure-Go OpenPGP, or we could go Ed25519/signify (simpler, no key management). Which direction?

-> *answer:*
+> _answer:_

### Implementation Design

**11. Should manifests be deterministic by default?**
This means: sort file entries by path, omit `createdAt` timestamp (or make it opt-in), no `atime`. Should determinism be the default, with a `--include-timestamps` flag to opt in?

-> *answer:*
+> _answer:_

**12. Should we consolidate or keep both scanner/checker implementations?**
There are two parallel implementations: `mfer/scanner.go` + `mfer/checker.go` (typed with `FileSize`, `RelFilePath`) and `internal/scanner/` + `internal/checker/` (raw `int64`, `string`). The `mfer/` versions are superior. Delete the `internal/` versions?

-> *answer:*
+> _answer:_

**13. Should the `manifest` type be exported?**
Currently unexported with exported constructors (`New`, `NewFromPaths`, etc.). Consumers can't declare `var m *mfer.manifest`. Export the type, or define an interface?

-> *answer:*
+> _answer:_

**14. What should the Go module path be for 1.0?**
Currently mixed between `sneak.berlin/go/mfer` and `git.eeqj.de/sneak/mfer`. Which is canonical?

-> *answer:*
+> _answer:_

---
