# mfer
Manifest file generator and checker.
# Problem Statement
Given a plain URL, there is no standard way to safely and programmatically download everything "under" that URL path. `wget -r` can traverse directory listings if they're enabled, but every server formats its listings differently, and `wget` does not verify the cryptographic integrity of the files.
Currently, the solution people use is sidecar files: a `SHASUMS` checksum file plus a `SHASUMS.asc` PGP detached signature. This approach is not checksum-algorithm-agnostic, and the sidecar file is not consistently named.
# Proposed Solution
A standard, a manifest file format, and a tool for generating it.
The manifest file would be called `index.mf`, and the tool that generates it would be called `mfer`.
The manifest file would do several important things:
* have a standard filename, so if given `https://example.com/downloadpackage/` one could fetch `https://example.com/downloadpackage/index.mf` to enumerate the full directory listing.
* contain a version field for extensibility
* contain structured data (protobuf, json, or cbor)
* provide an inner signed container, so that the manifest file itself can embed a signature and a public key alongside in a single file
* contain a list of files, each with a path relative to the manifest
* contain a manifest timestamp
* contain mtime information for files so that file metadata can be preserved
* contain cryptographic checksums in several different formats for each file
  * probably encoded with multihash to indicate algo + hash
  * sha256 at the minimum
  * would be nice to include an IPFS/IPLD CIDv1 root hash for each file, which likely involves chunking each file the way IPFS does for its file objects
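
The list above could map onto a structure like the following. This is only a sketch, assuming JSON as the encoding (one of the three candidates listed) and plain hex digests keyed by algorithm name instead of multihash. The `Manifest` and `FileEntry` types, every field name, and the version number `1` are hypothetical, and the signed inner container is left out entirely.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// FileEntry describes one file listed in the manifest.
// All field names are hypothetical; the format is not yet decided.
type FileEntry struct {
	Path   string            `json:"path"`   // relative to the manifest's location
	Size   int64             `json:"size"`   // size in bytes
	Mtime  time.Time         `json:"mtime"`  // preserved file modification time
	Hashes map[string]string `json:"hashes"` // algorithm name -> hex digest
}

// Manifest is the top-level document that would be stored as index.mf.
type Manifest struct {
	Version   int         `json:"version"`
	CreatedAt time.Time   `json:"createdAt"`
	Files     []FileEntry `json:"files"`
}

// Encode serializes a manifest as indented JSON.
func Encode(m Manifest) ([]byte, error) {
	return json.MarshalIndent(m, "", "  ")
}

// Decode parses a manifest and rejects versions this sketch does not know.
func Decode(data []byte) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return Manifest{}, err
	}
	if m.Version != 1 {
		return Manifest{}, fmt.Errorf("unsupported manifest version %d", m.Version)
	}
	return m, nil
}

func main() {
	m := Manifest{
		Version:   1,
		CreatedAt: time.Date(2021, 1, 18, 0, 0, 0, 0, time.UTC),
		Files: []FileEntry{{
			Path:  "docs/empty.txt",
			Size:  0,
			Mtime: time.Date(2021, 1, 17, 0, 0, 0, 0, time.UTC),
			// SHA-256 of zero bytes of input.
			Hashes: map[string]string{"sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"},
		}},
	}
	out, err := Encode(m)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Checking the version field on decode is what would let a later format revision be detected instead of silently misread.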
# Design Goals
* Replace SHASUMS/SHASUMS.asc files
* be easy to download/resume
* be easy to use across protocols (given an HTTPS URL, fetch manifest, then download file contents via BitTorrent or IPFS)
# Non-Goals
* Manifest generation speed
* Small manifest file size (within reason)
# Open Questions
* Should the manifest file include checksums of individual file chunks, or just for the whole assembled file?
  * If so, should the chunk size be fixed or dynamic?
# Tool Examples
* `mfer gen` / `mfer gen .`
  * recurses under current directory and writes out an `index.mf`
* `mfer check` / `mfer check .`
  * verifies checksums of all files in manifest, displaying an error and exiting nonzero if any files are missing or corrupted
# Implementation Plan
## Phase One:
* golang module for reusability/embedding
* golang module client providing `mfer` CLI
## Phase Two:
* ES5 or TypeScript module for reusability/embedding
* ES5/TypeScript module client providing `mfjs` CLI
# Hopes And Dreams
* `aria2c https://example.com/manifestdirectory/`
  * (fetches `https://example.com/manifestdirectory/index.mf`, downloads and checksums all files, resumes any that exist locally already)
* `mfer fetch https://example.com/manifestdirectory/`
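
The first step of such a fetch, turning a directory URL into the manifest URL and then into per-file URLs, could be sketched as follows. `ManifestURL`, `FileURL`, and `dirURL` are hypothetical helpers. Treating a URL without a trailing slash as a directory, and refusing manifest paths that resolve outside that directory, are assumptions of this sketch rather than stated requirements.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// dirURL parses an absolute URL and normalizes its path to end in "/".
func dirURL(dir string) (*url.URL, error) {
	u, err := url.Parse(dir)
	if err != nil {
		return nil, err
	}
	if u.Scheme == "" || u.Host == "" {
		return nil, fmt.Errorf("not an absolute URL: %q", dir)
	}
	if !strings.HasSuffix(u.Path, "/") {
		u.Path += "/"
	}
	return u, nil
}

// ManifestURL returns the URL of index.mf for a directory URL.
func ManifestURL(dir string) (string, error) {
	base, err := dirURL(dir)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(&url.URL{Path: "index.mf"}).String(), nil
}

// FileURL resolves a manifest-relative path against the directory URL and
// rejects paths that would resolve outside of that directory.
func FileURL(dir, rel string) (string, error) {
	base, err := dirURL(dir)
	if err != nil {
		return "", err
	}
	ref := base.ResolveReference(&url.URL{Path: rel})
	if !strings.HasPrefix(ref.Path, base.Path) {
		return "", fmt.Errorf("path %q escapes %s", rel, base)
	}
	return ref.String(), nil
}

func main() {
	m, err := ManifestURL("https://example.com/manifestdirectory/")
	if err != nil {
		panic(err)
	}
	fmt.Println(m) // https://example.com/manifestdirectory/index.mf
}
```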