closes #31

## Problem

`ImageProcessor.Process` used `io.ReadAll(input)` without any size limit, allowing arbitrarily large inputs to exhaust all available memory. This is a DoS vector: even though the upstream fetcher has a `MaxResponseSize` limit (50 MiB), the processor interface accepts any `io.Reader` and should defend itself independently.

Additionally, the service layer's `processFromSourceOrFetch` read cached source content with `io.ReadAll` without a bound, so an unexpectedly large cached file could also cause unbounded memory consumption.

## Changes

### Processor (`processor.go`)

- Added `maxInputBytes` field to `ImageProcessor` (configurable, defaults to 50 MiB via `DefaultMaxInputBytes`)
- `NewImageProcessor` now accepts a `maxInputBytes` parameter (0 or negative uses the default)
- `Process` now wraps the input reader with `io.LimitReader` and rejects inputs exceeding the limit with `ErrInputDataTooLarge`
- Added `DefaultMaxInputBytes` and `ErrInputDataTooLarge` exported constants/errors

### Service (`service.go`)

- `NewService` now wires the fetcher's `MaxResponseSize` through to the processor
- Extracted `loadCachedSource` helper method to flatten nesting in `processFromSourceOrFetch`
- Cached source reads are now bounded by `maxResponseSize`; oversized cached files are discarded and re-fetched

### Tests (`processor_test.go`)

- `TestImageProcessor_RejectsOversizedInputData`: verifies that inputs exceeding `maxInputBytes` are rejected with `ErrInputDataTooLarge`
- `TestImageProcessor_AcceptsInputWithinLimit`: verifies that inputs within the limit are processed normally
- `TestImageProcessor_DefaultMaxInputBytes`: verifies that 0 and negative values use the default
- All existing tests updated to use `NewImageProcessor(0)` (default limit)

Co-authored-by: user <user@Mac.lan guest wan>
Co-authored-by: clawbot <clawbot@eeqj.de>
Reviewed-on: #37
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
pixa
pixa is a GPL-3.0-licensed Go web server by @sneak that proxies images from upstream sources, optionally resizing or transforming them, and serves the results. Both source and transformed images are cached to disk so that subsequent requests are served without origin fetches or additional processing.
Getting Started
# clone and build
git clone https://git.eeqj.de/sneak/pixa.git
cd pixa
make build
# run with a config file
./bin/pixad --config config.example.yml
# or build and run via Docker
make docker
docker run -p 8080:8080 pixad:latest
Rationale
Image-heavy web applications need a fast, caching reverse proxy that can resize and transcode images on the fly. pixa fills that role as a single, self-contained binary with no external runtime dependencies beyond libvips. It supports HMAC-SHA256 signed URLs with expiration to prevent abuse, and whitelisted source hosts for open access.
Design
Storage
- Source content: <statedir>/cache/src-content/<ab>/<cd>/<sha256 of source content>
- Source metadata: <statedir>/cache/src-metadata/<hostname>/<sha256 of path>.json (fetch time, original headers, request, content hash)
- Database: <statedir>/state.sqlite3 (SQLite)
- Output documents: <statedir>/cache/dst-content/<ab>/<cd>/<sha256 of output content>
Multiple source paths may reference the same content blob; the database tracks references rather than using filesystem refcounting. An in-process cache of request-to-output mappings is intended to sustain 1-5k requests per second.
Routes
/v1/image/<host>/<path>/<size>.<format>?sig=<signature>&exp=<expiration>
Images are only fetched from origins using TLS with valid certificates.
- <format>: one of orig, png, jpeg, webp
- <size>: orig or <width>x<height> (e.g. 800x600)
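The final path segment of the route could be parsed along these lines. This is only a sketch: parseSizeFormat is a hypothetical name, mapping orig to a width and height of 0 follows the signature specification's convention, and the format is deliberately not validated against a fixed list.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSizeFormat splits a segment such as "800x600.webp" or "orig.png"
// into width, height, and format. A size of "orig" yields 0 and 0.
func parseSizeFormat(segment string) (int, int, string, error) {
	dot := strings.LastIndex(segment, ".")
	if dot <= 0 || dot == len(segment)-1 {
		return 0, 0, "", fmt.Errorf("expected <size>.<format>, got %q", segment)
	}
	size, format := segment[:dot], segment[dot+1:]
	if size == "orig" {
		return 0, 0, format, nil
	}
	w, h, ok := strings.Cut(size, "x")
	if !ok {
		return 0, 0, "", fmt.Errorf("expected <width>x<height>, got %q", size)
	}
	width, err := strconv.Atoi(w)
	if err != nil || width <= 0 {
		return 0, 0, "", fmt.Errorf("invalid width %q", w)
	}
	height, err := strconv.Atoi(h)
	if err != nil || height <= 0 {
		return 0, 0, "", fmt.Errorf("invalid height %q", h)
	}
	return width, height, format, nil
}

func main() {
	fmt.Println(parseSizeFormat("800x600.webp")) // 800 600 webp <nil>
	fmt.Println(parseSizeFormat("orig.png"))     // 0 0 png <nil>
}
```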
Source Hosts
Source hosts may be whitelisted in the configuration. Non-whitelisted hosts require an HMAC-SHA256 signature.
Signature Specification
Signatures use HMAC-SHA256 and include an expiration timestamp to prevent replay attacks.
Signed data format (colon-separated):
HMAC-SHA256(secret, "host:path:query:width:height:format:expiration")
Where:
- host: source origin hostname (e.g. cdn.example.com)
- path: source path (e.g. /photos/cat.jpg)
- query: source query string, empty string if none
- width: requested width in pixels, 0 for original
- height: requested height in pixels, 0 for original
- format: output format (jpeg, png, webp, avif, gif, orig)
- expiration: Unix timestamp when signature expires
Example: resize https://cdn.example.com/photos/cat.jpg to 800x600 WebP with expiration 1704067200:
- Build input: cdn.example.com:/photos/cat.jpg::800:600:webp:1704067200
- Compute HMAC-SHA256 with your secret key
- Base64URL-encode the result
- URL: /v1/image/cdn.example.com/photos/cat.jpg/800x600.webp?sig=<base64url>&exp=1704067200
Whitelist patterns:
- Exact match: cdn.example.com matches only that host
- Suffix match: .example.com matches cdn.example.com, images.example.com, and example.com
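The two pattern types can be illustrated with a small matcher. hostAllowed is a hypothetical function, and the case-insensitive comparison is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// hostAllowed reports whether host matches any whitelist pattern.
// A pattern with a leading dot is a suffix match that also covers the
// bare domain; any other pattern must match the host exactly.
func hostAllowed(host string, patterns []string) bool {
	host = strings.ToLower(host) // assumption: matching ignores case
	for _, p := range patterns {
		p = strings.ToLower(p)
		if strings.HasPrefix(p, ".") {
			if host == p[1:] || strings.HasSuffix(host, p) {
				return true
			}
			continue
		}
		if host == p {
			return true
		}
	}
	return false
}

func main() {
	suffix := []string{".example.com"}
	fmt.Println(hostAllowed("cdn.example.com", suffix)) // true
	fmt.Println(hostAllowed("example.com", suffix))     // true
	fmt.Println(hostAllowed("badexample.com", suffix))  // false

	exact := []string{"cdn.example.com"}
	fmt.Println(hostAllowed("images.example.com", exact)) // false
}
```

Keeping the leading dot in the suffix comparison is what prevents badexample.com from matching .example.com.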
Configuration
Configured via YAML file (--config). Key settings:
- access_control_allow_origin: CORS origin
- source_host_whitelist: list of allowed upstream hosts
- upstream_fetch_timeout: timeout for origin requests
- upstream_max_response_size: max origin response size
- downstream_timeout: client response timeout
- signing_key: HMAC secret for URL signatures
See config.example.yml for all options with defaults.
Architecture
- Dependency injection: Uber fx
- HTTP router: go-chi
- Image processing: govips (CGO wrapper for libvips)
- Database: SQLite via modernc.org/sqlite
- Static assets: embedded via //go:embed
- Metrics: Prometheus
- Logging: stdlib slog
TODO
See TODO.md for the full prioritized task list.
License
GPL-3.0. See LICENSE.