Add --ml and --exif flags to backup-metadata

--ml fetches face detections and CLIP embeddings from the /files/data/fetch
endpoint (type 'mldata'). Each blob is gzipped JSON encrypted with the
file's key; we decrypt it with decryptBlob, gunzip it, and include the
parsed JSON as 'mlData' in the per-file output. Blobs are fetched in
batches of 200 file IDs.
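
The --ml pipeline described above (batch the file IDs, then decrypt, gunzip, and parse each blob) can be sketched roughly as follows. This is an illustration only, not the code in this commit: `chunk`, `decodeMlBlob`, and the `Decrypt` type are hypothetical names, the real decryptBlob signature is not shown here, and the sample payload is made up.

```typescript
import { Buffer } from "node:buffer";
import { gunzipSync, gzipSync } from "node:zlib";

// Hypothetical stand-in for the project's decryptBlob; the real
// signature (key handling, nonce, etc.) is not shown in this commit.
type Decrypt = (ciphertext: Uint8Array) => Uint8Array;

// Split file IDs into request-sized batches (200 per request here).
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Decrypt first, then gunzip, then parse the JSON payload.
function decodeMlBlob(blob: Uint8Array, decrypt: Decrypt): unknown {
  const compressed = decrypt(blob);
  const json = gunzipSync(compressed).toString("utf8");
  return JSON.parse(json);
}

// Usage with an identity "decrypt" and a made-up payload.
const ids = Array.from({ length: 450 }, (_, i) => i + 1);
const batches = chunk(ids, 200);
const blob = gzipSync(Buffer.from(JSON.stringify({ example: true })));
const mlData = decodeMlBlob(blob, (b) => b);
console.log(batches.map((b) => b.length), mlData);
```

Passing the decrypt step in as a function keeps the sketch independent of the actual crypto code and makes the decode order easy to test in isolation.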

--exif downloads each file, runs sharp().metadata() to extract image
properties (format, dimensions, color space, orientation), then parses
the raw EXIF buffer with exif-reader for structured tags (lens, ISO,
shutter, aperture, GPS altitude, etc.). Also captures raw IPTC, XMP,
and ICC profile data. Included as 'imageMetadata' in the per-file output.
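
As a rough sketch of how the 'imageMetadata' record might be assembled: the fields in `ImageInfo` mirror a subset of what sharp's metadata() reports, and the EXIF parser is passed in as a function (in real use this would presumably be exif-reader's default export). The `ImageMetadata` shape, the base64 encoding of the raw segments, and the `exifError` field are assumptions for illustration, not the format this commit writes.

```typescript
import { Buffer } from "node:buffer";

// Subset of the fields sharp's metadata() reports, declared locally
// so the sketch has no third-party dependencies.
interface ImageInfo {
  format?: string;
  width?: number;
  height?: number;
  space?: string;
  orientation?: number;
  exif?: Uint8Array;
  iptc?: Uint8Array;
  xmp?: Uint8Array;
  icc?: Uint8Array;
}

// Injected parser for the raw EXIF buffer (e.g. exif-reader).
type ExifParser = (raw: Uint8Array) => unknown;

// Hypothetical output shape; the commit's actual layout may differ.
interface ImageMetadata {
  format?: string;
  width?: number;
  height?: number;
  colorSpace?: string;
  orientation?: number;
  exif?: unknown;
  exifError?: string;
  iptc?: string;
  xmp?: string;
  icc?: string;
}

// Raw segments are binary, so encode them as base64 for JSON output.
function toBase64(buf?: Uint8Array): string | undefined {
  return buf ? Buffer.from(buf).toString("base64") : undefined;
}

function buildImageMetadata(info: ImageInfo, parseExif: ExifParser): ImageMetadata {
  let exif: unknown;
  let exifError: string | undefined;
  if (info.exif) {
    try {
      exif = parseExif(info.exif);
    } catch (err) {
      // A malformed EXIF segment should not abort the whole backup.
      exifError = err instanceof Error ? err.message : String(err);
    }
  }
  return {
    format: info.format,
    width: info.width,
    height: info.height,
    colorSpace: info.space,
    orientation: info.orientation,
    exif,
    exifError,
    iptc: toBase64(info.iptc),
    xmp: toBase64(info.xmp),
    icc: toBase64(info.icc),
  };
}

// Usage with made-up values and a parser that always fails.
const sample = buildImageMetadata(
  { format: "jpeg", width: 4000, height: 3000, space: "srgb", orientation: 6,
    exif: new Uint8Array([1, 2, 3]), xmp: new Uint8Array([104, 105]) },
  () => { throw new Error("malformed EXIF"); },
);
console.log(JSON.stringify(sample));
```

Catching parser errors per file is a design choice in this sketch: one corrupt EXIF segment is recorded on that file's record instead of failing the run.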

Without either flag, behavior is unchanged (fast metadata-only dump).

Adds exif-reader 2.0.3 as a runtime dependency.
3 new tests (ML data decrypted, ML data absent when flag not set, EXIF
extraction). 119 total tests, all green.
2026-06-09 17:35:35 -04:00
parent 73bfec5a9e
commit c8e7971445
5 changed files with 357 additions and 23 deletions

@@ -337,13 +337,25 @@ program
 program
   .command("backup-metadata")
   .description(
-    "Dump all decrypted account metadata (no file content) to a directory",
+    "Dump all decrypted account metadata to a directory of JSON files",
   )
   .argument("<dir>", "Output directory")
-  .action(async (dir: string) => {
+  .option(
+    "--ml",
+    "Include ML data (face detections, CLIP embeddings) from the Ente server",
+  )
+  .option(
+    "--exif",
+    "Download each file and extract full EXIF/IPTC/XMP metadata (slow)",
+  )
+  .action(async (dir: string, opts: { ml?: boolean; exif?: boolean }) => {
     await init();
     const client = requireSession();
-    await runMetadataBackup(client, dir, (msg) => stderr.write(msg + "\n"));
+    await runMetadataBackup(client, dir, {
+      mlData: opts.ml,
+      exif: opts.exif,
+      onProgress: (msg) => stderr.write(msg + "\n"),
+    });
   });
 program