clawbot a50364bfca
Enforce and document exact-match-only for signature verification (#40)
Closes #27

Signatures are per-URL only — this PR adds explicit tests and documentation enforcing that HMAC-SHA256 signatures verify against exact URLs only. No suffix matching, wildcard matching, or partial matching is supported.

## What this does NOT touch

**The host whitelist code (`whitelist.go`) is not modified.** This PR is exclusively about signature verification, per sneak's instructions on [issue #27](#27), [PR #32](#32), and [PR #35](#35).

## Changes

### `internal/imgcache/signature.go`
- Added documentation comments on `Verify()` and `buildSignatureData()` explicitly specifying that signatures are exact-match only — no suffix, wildcard, or partial matching

### `internal/imgcache/signature_test.go`
- **`TestSigner_Verify_ExactMatchOnly`**: 14 tamper cases verifying that modifying any signed component (host, path, query, dimensions, format) causes verification to fail. Host-specific cases include:
  - Parent domain (`example.com`) does not match subdomain signature (`cdn.example.com`)
  - Sibling subdomain (`images.example.com`) does not match
  - Deeper subdomain (`images.cdn.example.com`) does not match
  - Evil suffix domain (`cdn.example.com.evil.com`) does not match
  - Prefixed host (`evilcdn.example.com`) does not match
- **`TestSigner_Sign_ExactHostInData`**: Verifies that suffix-related hosts (`cdn.example.com`, `example.com`, `images.example.com`, etc.) all produce distinct signatures

### `internal/imgcache/service_test.go`
- **`TestService_ValidateRequest_SignatureExactHostMatch`**: Integration test through `ValidateRequest` verifying that a valid signature for `cdn.example.com` is rejected when presented with a different host (parent domain, sibling subdomain, deeper subdomain, evil suffix, prefixed host)

### `README.md`
- Updated Signature Specification section to explicitly document exact-match-only semantics

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #40
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-20 23:56:45 +01:00


# pixa
pixa is a GPL-3.0-licensed Go web server by
[@sneak](https://sneak.berlin) that proxies images from upstream
sources, optionally resizing or transforming them, and serves the
results. Both source and transformed images are cached to disk so that
subsequent requests are served without origin fetches or additional
processing.
## Getting Started
```bash
# clone and build
git clone https://git.eeqj.de/sneak/pixa.git
cd pixa
make build
# run with a config file
./bin/pixad --config config.example.yml
# or build and run via Docker
make docker
docker run -p 8080:8080 pixad:latest
```
## Rationale
Image-heavy web applications need a fast, caching reverse proxy that
can resize and transcode images on the fly. pixa fills that role as a
single, self-contained binary with no external runtime dependencies
beyond libvips. It supports HMAC-SHA256 signed URLs with expiration to
prevent abuse, and a whitelist of source hosts that can be requested
without a signature.
## Design
### Storage
- **Source content**:
`<statedir>/cache/src-content/<ab>/<cd>/<sha256 of source content>`
- **Source metadata**:
`<statedir>/cache/src-metadata/<hostname>/<sha256 of path>.json`
(fetch time, original headers, request, content hash)
- **Database**: `<statedir>/state.sqlite3` (SQLite)
- **Output documents**:
`<statedir>/cache/dst-content/<ab>/<cd>/<sha256 of output content>`
Multiple source paths may reference the same content blob; the
database tracks references rather than using filesystem refcounting.
An in-process cache of request-to-output mappings is intended to
sustain 1,000 to 5,000 requests per second.
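The content-addressed layout above can be sketched in Go as follows. This is a minimal illustration, not pixa's actual code: the function name `contentPath` is hypothetical, and it assumes that `<ab>` and `<cd>` are the first and second pairs of hex characters of the SHA-256 digest, which the layout suggests but does not state.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// contentPath returns <statedir>/cache/<kind>/<ab>/<cd>/<sha256 of content>.
// Assumption: <ab> and <cd> are hex characters 0-1 and 2-3 of the digest.
// kind would be "src-content" or "dst-content".
func contentPath(stateDir, kind string, content []byte) string {
	sum := sha256.Sum256(content)
	digest := hex.EncodeToString(sum[:])
	return filepath.Join(stateDir, "cache", kind, digest[0:2], digest[2:4], digest)
}

func main() {
	// Identical content always maps to the same path, which is why several
	// source paths can share one blob.
	fmt.Println(contentPath("/var/lib/pixa", "src-content", []byte("hello")))
}
```

Sharding on leading digest characters keeps any single directory from accumulating every blob, a common choice for content-addressed stores.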
### Routes
```
/v1/image/<host>/<path>/<size>.<format>?sig=<signature>&exp=<expiration>
```
Images are only fetched from origins using TLS with valid certificates.
- `<format>`: one of `orig`, `png`, `jpeg`, `webp`
- `<size>`: `orig` or `<width>x<height>` (e.g. `800x600`)
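As a rough sketch of how the final `<size>.<format>` path segment could be parsed, assuming a hypothetical helper `parseOutputSpec` (not a function from the pixa codebase). Returning `0, 0` for `orig` follows the signature specification, where `0` means original; rejecting explicit zero dimensions is an assumption, and the format is returned without validation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOutputSpec splits a segment such as "800x600.webp" or "orig.png"
// into width, height and format. "orig" yields 0, 0.
func parseOutputSpec(segment string) (width, height int, format string, err error) {
	size, format, ok := strings.Cut(segment, ".")
	if !ok || format == "" {
		return 0, 0, "", fmt.Errorf("missing format in %q", segment)
	}
	if size == "orig" {
		return 0, 0, format, nil
	}
	w, h, ok := strings.Cut(size, "x")
	if !ok {
		return 0, 0, "", fmt.Errorf("invalid size %q", size)
	}
	// ParseUint rejects signs, so "-1" and "+1" are both errors.
	pw, err := strconv.ParseUint(w, 10, 32)
	if err != nil || pw == 0 {
		return 0, 0, "", fmt.Errorf("invalid width %q", w)
	}
	ph, err := strconv.ParseUint(h, 10, 32)
	if err != nil || ph == 0 {
		return 0, 0, "", fmt.Errorf("invalid height %q", h)
	}
	return int(pw), int(ph), format, nil
}

func main() {
	for _, s := range []string{"800x600.webp", "orig.png", "800.webp"} {
		w, h, f, err := parseOutputSpec(s)
		fmt.Println(s, w, h, f, err)
	}
}
```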
### Source Hosts
Source hosts may be whitelisted in the configuration. Non-whitelisted
hosts require an HMAC-SHA256 signature.
#### Signature Specification
Signatures use HMAC-SHA256 and include an expiration timestamp that
limits how long a signed URL can be replayed. Signatures are **exact match only**: every
component (host, path, query, dimensions, format, expiration) must
match exactly what was signed. No suffix matching, wildcard matching,
or partial matching is supported.
**Signed data format** (colon-separated):
```
HMAC-SHA256(secret, "host:path:query:width:height:format:expiration")
```
Where:
- `host` — source origin hostname (e.g. `cdn.example.com`)
- `path` — source path (e.g. `/photos/cat.jpg`)
- `query` — source query string, empty string if none
- `width` — requested width in pixels, `0` for original
- `height` — requested height in pixels, `0` for original
- `format` — output format (jpeg, png, webp, avif, gif, orig)
- `expiration` — Unix timestamp when signature expires
**Example:** resize
`https://cdn.example.com/photos/cat.jpg` to 800x600 WebP with
expiration 1704067200:
1. Build input:
`cdn.example.com:/photos/cat.jpg::800:600:webp:1704067200`
2. Compute HMAC-SHA256 with your secret key
3. Base64URL-encode the result
4. URL:
`/v1/image/cdn.example.com/photos/cat.jpg/800x600.webp?sig=<base64url>&exp=1704067200`
**Whitelist patterns** (these apply to entries in the source host
whitelist only; signatures never use suffix matching):
- **Exact match**: `cdn.example.com` — matches only that host
- **Suffix match**: `.example.com` — matches `cdn.example.com`,
`images.example.com`, and `example.com`
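The two pattern kinds can be illustrated with a short Go sketch. The function `hostAllowed` is a hypothetical name and this is not the code in `whitelist.go`; case-insensitive comparison is an assumption, since the README does not specify case handling.

```go
package main

import (
	"fmt"
	"strings"
)

// hostAllowed reports whether host matches any whitelist pattern.
// A pattern with a leading dot is a suffix match that also covers the
// bare domain; any other pattern must match the host exactly.
// Assumption: hosts are compared case-insensitively.
func hostAllowed(host string, patterns []string) bool {
	host = strings.ToLower(host)
	for _, p := range patterns {
		p = strings.ToLower(p)
		if len(p) > 1 && strings.HasPrefix(p, ".") {
			// ".example.com" matches "example.com" and "*.example.com".
			if host == p[1:] || strings.HasSuffix(host, p) {
				return true
			}
			continue
		}
		if host == p {
			return true
		}
	}
	return false
}

func main() {
	patterns := []string{".example.com"}
	for _, h := range []string{"cdn.example.com", "example.com", "evilexample.com"} {
		fmt.Println(h, hostAllowed(h, patterns))
	}
}
```

Keeping the leading dot in the suffix comparison is what stops `evilexample.com` from matching `.example.com`.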
### Configuration
Configured via YAML file (`--config`). Key settings:
- `access_control_allow_origin` — CORS origin
- `source_host_whitelist` — list of allowed upstream hosts
- `upstream_fetch_timeout` — timeout for origin requests
- `upstream_max_response_size` — max origin response size
- `downstream_timeout` — client response timeout
- `signing_key` — HMAC secret for URL signatures
See `config.example.yml` for all options with defaults.
### Architecture
- **Dependency injection**: Uber fx
- **HTTP router**: go-chi
- **Image processing**: govips (CGO wrapper for libvips)
- **Database**: SQLite via modernc.org/sqlite
- **Static assets**: embedded via `//go:embed`
- **Metrics**: Prometheus
- **Logging**: stdlib slog
## TODO
See [TODO.md](TODO.md) for the full prioritized task list.
## License
GPL-3.0. See [LICENSE](LICENSE).
## Author
[@sneak](https://sneak.berlin)