From 9637c8de05231370b98374e61bfea8bed87089df Mon Sep 17 00:00:00 2001
From: Jeffrey Paul
Date: Mon, 18 Jan 2021 23:36:40 +0000
Subject: [PATCH] Update 'README.md'

---
 README.md | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/README.md b/README.md
index b430799..c18a95d 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,60 @@
 # mfer
 
+Manifest file generator and checker.
+
+# Problem Statement
+
+Given a plain URL, there is no standard way to safely and programmatically download everything "under" that URL path. `wget -r` can traverse directory listings if they're enabled, but every server has a different format, and this does not verify cryptographic integrity of the files.
+
+Currently, the solution that people are using are sidecar files in the format of SHASUMS checksum files, as well as a SHASUMS.asc PGP detached signature. This is not checksum-agnostic and the sidecar file is not always consistently named.
+
+# Proposed Solution
+
+A standard, a manifest file format, and a tool for generating same.
+
+The manifest file would be called `index.mf`, and the tool for generating such would be called `mfer`.
+
+The manifest file would do several important things:
+
+* have a standard filename, so if given `https://example.com/downloadpackage/` one could fetch `https://example.com/downloadpackage/index.mf` to enumerate the full directory listing.
+* contain a version field for extensibility
+* contain structured data (protobuf, json, or cbor)
+* provide an inner signed container, so that the manifest file itself can embed a signature and a public key alongside in a single file
+* contain a list of files, each with a relative path to the manifest
+* contain manifest timestamp
+* contain mtime information for files so that file metadata can be preserved
+* contain cryptographic checksums in several different formats for each file
+    * probably encoded with multihash to indicate algo + hash
+    * sha256 at the minimum
+    * would be nice to include an IPFS/IPLD CIDv1 root hash for each file, which likely involves doing an ipfs file object chunking
+
+# Design Goals
+
+* Replace SHASUMS/SHASUMS.asc files
+* be easy to download/resume
+* be easy to use across protocols (given an HTTPS url, fetch manifest, then download file contents via bittorrent or ipfs)
+
+# Tool Examples
+
+* `mfer gen` / `mfer gen .`
+    * recurses under current directory and writes out an `index.mf`
+* `mfer check` / `mfer check .`
+    * verifies checksums of all files in manifest, displaying error and exiting nonzero if any files are missing or corrupted
+
+# Implementation Plan
+
+## Phase One:
+
+* golang module for reusability/embedding
+* golang module client providing `mfer` CLI
+
+## Phase Two:
+
+* ES5 or TypeScript module for reusability/embedding
+* ES5/TypeScript module client providing `mfjs` CLI
+
+# Hopes And Dreams
+
+* `aria2c https://example.com/manifestdirectory/`
+    * (fetches `https://example.com/manifestdirectory/index.mf`, downloads and checksums all files, resumes any that exist locally already)
+* `mfer fetch https://example.com/manifestdirectory/`
\ No newline at end of file