5 Commits

Author SHA1 Message Date
clawbot
a853fe7ee7 refactor: extract httpfetcher package from imgcache
All checks were successful
check / check (push) Successful in 57s
Move HTTPFetcher, Config (was FetcherConfig), SSRF-safe dialer, rate
limiting, content-type validation, and related error vars from
internal/imgcache/fetcher.go into new internal/httpfetcher/ package.

The Fetcher interface and FetchResult type also move to httpfetcher
to avoid circular imports (imgcache imports httpfetcher, not the other
way around).

Renames to avoid stuttering:
  NewHTTPFetcher -> httpfetcher.New
  FetcherConfig  -> httpfetcher.Config
  NewMockFetcher -> httpfetcher.NewMock

The ServiceConfig.FetcherConfig field is retained (it describes what
kind of config it holds, not a stutter).

Pure refactor - no behavior changes. Unit tests for the httpfetcher
package are included.

refs #39
2026-04-17 06:47:05 +00:00
6b4a1d7607 refactor: extract magic byte detection into internal/magic package (#42)
All checks were successful
check / check (push) Successful in 1m39s
## Summary

Extract magic byte detection and MIME type handling from `internal/imgcache/` into a new focused `internal/magic/` package.

Part of [issue #39](#39)

## Changes

### New package: `internal/magic/`

Moved the following from `internal/imgcache/magic.go`:
- `MIMEType` type and constants (`MIMETypeJPEG`, `MIMETypePNG`, etc.)
- `DetectFormat()` — detects image format from magic bytes
- `ValidateMagicBytes()` — validates content matches declared MIME type
- `PeekAndValidate()` — reads minimum bytes, validates, returns combined reader
- `IsSupportedMIMEType()` — checks if a MIME type is supported
- `MIMEToImageFormat()` — converts MIME type to ImageFormat
- `ImageFormatToMIME()` — converts ImageFormat to MIME string
- All error sentinels (`ErrUnknownFormat`, `ErrMagicByteMismatch`, `ErrNotEnoughData`)
- All helper functions (`detectSVG`, `skipBOM`, `normalizeMIMEType`)

The magic package defines its own `ImageFormat` type and constants to avoid circular imports (`imgcache` → `magic` for validation; `magic` cannot import `imgcache`).

### Updated imports
- `internal/imgcache/service.go`: uses `magic.ValidateMagicBytes()`
- `internal/imgcache/service_test.go`: uses `magic.DetectFormat()` and `magic.MIMEToImageFormat()`

### Naming
- Clean package-qualified names: `magic.DetectFormat()`, `magic.ValidateMagicBytes()`, etc.
- No stuttering names

### Tests
- Full test suite moved to `internal/magic/magic_test.go` (all 15 test functions preserved)
- All existing tests pass unchanged
- `docker build .` passes (includes `make check`: fmt, lint, tests)

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #42
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-04-07 00:41:48 +02:00
e34743f070 refactor: extract whitelist package from internal/imgcache (#41)
All checks were successful
check / check (push) Successful in 4s
Extract `HostWhitelist`, `NewHostWhitelist`, `IsWhitelisted`, `IsEmpty`, and `Count` from `internal/imgcache/` into the new `internal/whitelist/` package.

The whitelist package is completely self-contained, depending only on `net/url` and `strings` from the standard library. No circular imports introduced.

**Changes:**
- Moved `whitelist.go` → `internal/whitelist/whitelist.go` (added package comment)
- Moved `whitelist_test.go` → `internal/whitelist/whitelist_test.go` (adapted to external test style)
- Updated `internal/imgcache/service.go` to import from `sneak.berlin/go/pixa/internal/whitelist`

`docker build .` passes (lint, tests, build).

Part of [issue #39](#39)

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #41
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-25 20:44:56 +01:00
7010d55d72 Move schema_migrations table creation into 000.sql (#36)
All checks were successful
check / check (push) Successful in 1m43s
## Summary

Moves the `schema_migrations` table definition from inline Go code into `internal/database/schema/000.sql`, so the migration tracking table schema lives alongside all other schema files.

closes #29

## Changes

### New file: `internal/database/schema/000.sql`
- Contains the `CREATE TABLE IF NOT EXISTS schema_migrations` DDL
- This is applied as a bootstrap step before the normal migration loop

### Refactored: `internal/database/database.go`
- Removed the inline `CREATE TABLE IF NOT EXISTS schema_migrations` SQL from both `runMigrations` and `ApplyMigrations`
- Added `bootstrapMigrationsTable()` which:
  - Checks `sqlite_master` to see if the table already exists
  - If missing: reads and executes `000.sql` to create it, then records version `000`
  - If present (backwards compat with existing DBs created by old inline code): back-fills version `000` so the normal loop skips the bootstrap file
- Deduplicated: both `Database.runMigrations()` and the exported `ApplyMigrations()` now delegate to a single `applyMigrations()` helper
- Added `logInfo`/`logDebug` helpers to handle the optional logger (nil when called from `ApplyMigrations` in tests)

### New file: `internal/database/database_test.go`
- `TestApplyMigrations_CreatesSchemaAndTables` — verifies all migrations apply and all expected tables exist
- `TestApplyMigrations_Idempotent` — verifies running migrations twice produces no errors or duplicates
- `TestBootstrapMigrationsTable_FreshDatabase` — verifies bootstrap creates the table and records version 000
- `TestBootstrapMigrationsTable_ExistingTableBackwardsCompat` — verifies existing DBs (from old inline-SQL code) get version 000 back-filled without data loss

## Conflict note

[PR #33](#33) (for [issue #28](#28)) is also modifying migration code. This PR is based on current `main` and the conflict will be resolved at merge time.

Co-authored-by: user <user@Mac.lan guest wan>
Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Co-authored-by: clawbot <clawbot@sneak.berlin>
Co-authored-by: clawbot <clawbot@eeqj.de>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #36
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-25 02:20:52 +01:00
a50364bfca Enforce and document exact-match-only for signature verification (#40)
All checks were successful
check / check (push) Successful in 58s
Closes #27

Signatures are per-URL only — this PR adds explicit tests and documentation enforcing that HMAC-SHA256 signatures verify against exact URLs only. No suffix matching, wildcard matching, or partial matching is supported.

## What this does NOT touch

**The host whitelist code (`whitelist.go`) is not modified.** This PR is exclusively about signature verification, per sneak's instructions on [issue #27](#27), [PR #32](#32), and [PR #35](#35).

## Changes

### `internal/imgcache/signature.go`
- Added documentation comments on `Verify()` and `buildSignatureData()` explicitly specifying that signatures are exact-match only — no suffix, wildcard, or partial matching

### `internal/imgcache/signature_test.go`
- **`TestSigner_Verify_ExactMatchOnly`**: 14 tamper cases verifying that modifying any signed component (host, path, query, dimensions, format) causes verification to fail. Host-specific cases include:
  - Parent domain (`example.com`) does not match subdomain signature (`cdn.example.com`)
  - Sibling subdomain (`images.example.com`) does not match
  - Deeper subdomain (`images.cdn.example.com`) does not match
  - Evil suffix domain (`cdn.example.com.evil.com`) does not match
  - Prefixed host (`evilcdn.example.com`) does not match
- **`TestSigner_Sign_ExactHostInData`**: Verifies that suffix-related hosts (`cdn.example.com`, `example.com`, `images.example.com`, etc.) all produce distinct signatures

### `internal/imgcache/service_test.go`
- **`TestService_ValidateRequest_SignatureExactHostMatch`**: Integration test through `ValidateRequest` verifying that a valid signature for `cdn.example.com` is rejected when presented with a different host (parent domain, sibling subdomain, deeper subdomain, evil suffix, prefixed host)

### `README.md`
- Updated Signature Specification section to explicitly document exact-match-only semantics

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #40
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-20 23:56:45 +01:00
23 changed files with 896 additions and 178 deletions

View File

@@ -67,7 +67,10 @@ hosts require an HMAC-SHA256 signature.
#### Signature Specification #### Signature Specification
Signatures use HMAC-SHA256 and include an expiration timestamp to Signatures use HMAC-SHA256 and include an expiration timestamp to
prevent replay attacks. prevent replay attacks. Signatures are **exact match only**: every
component (host, path, query, dimensions, format, expiration) must
match exactly what was signed. No suffix matching, wildcard matching,
or partial matching is supported.
**Signed data format** (colon-separated): **Signed data format** (colon-separated):

View File

@@ -1,25 +1,26 @@
package imgcache // Package allowlist provides host-based URL allow-listing for the image proxy.
package allowlist
import ( import (
"net/url" "net/url"
"strings" "strings"
) )
// HostWhitelist implements the Whitelist interface for checking allowed source hosts. // HostAllowList checks whether source hosts are permitted.
type HostWhitelist struct { type HostAllowList struct {
// exactHosts contains hosts that must match exactly (e.g., "cdn.example.com") // exactHosts contains hosts that must match exactly (e.g., "cdn.example.com")
exactHosts map[string]struct{} exactHosts map[string]struct{}
// suffixHosts contains domain suffixes to match (e.g., ".example.com" matches "cdn.example.com") // suffixHosts contains domain suffixes to match (e.g., ".example.com" matches "cdn.example.com")
suffixHosts []string suffixHosts []string
} }
// NewHostWhitelist creates a whitelist from a list of host patterns. // New creates a HostAllowList from a list of host patterns.
// Patterns starting with "." are treated as suffix matches. // Patterns starting with "." are treated as suffix matches.
// Examples: // Examples:
// - "cdn.example.com" - exact match only // - "cdn.example.com" - exact match only
// - ".example.com" - matches cdn.example.com, images.example.com, etc. // - ".example.com" - matches cdn.example.com, images.example.com, etc.
func NewHostWhitelist(patterns []string) *HostWhitelist { func New(patterns []string) *HostAllowList {
w := &HostWhitelist{ w := &HostAllowList{
exactHosts: make(map[string]struct{}), exactHosts: make(map[string]struct{}),
suffixHosts: make([]string, 0), suffixHosts: make([]string, 0),
} }
@@ -40,8 +41,8 @@ func NewHostWhitelist(patterns []string) *HostWhitelist {
return w return w
} }
// IsWhitelisted checks if a URL's host is in the whitelist. // IsAllowed checks if a URL's host is in the allow list.
func (w *HostWhitelist) IsWhitelisted(u *url.URL) bool { func (w *HostAllowList) IsAllowed(u *url.URL) bool {
if u == nil { if u == nil {
return false return false
} }
@@ -71,12 +72,12 @@ func (w *HostWhitelist) IsWhitelisted(u *url.URL) bool {
return false return false
} }
// IsEmpty returns true if the whitelist has no entries. // IsEmpty returns true if the allow list has no entries.
func (w *HostWhitelist) IsEmpty() bool { func (w *HostAllowList) IsEmpty() bool {
return len(w.exactHosts) == 0 && len(w.suffixHosts) == 0 return len(w.exactHosts) == 0 && len(w.suffixHosts) == 0
} }
// Count returns the total number of whitelist entries. // Count returns the total number of allow list entries.
func (w *HostWhitelist) Count() int { func (w *HostAllowList) Count() int {
return len(w.exactHosts) + len(w.suffixHosts) return len(w.exactHosts) + len(w.suffixHosts)
} }

View File

@@ -1,11 +1,13 @@
package imgcache package allowlist_test
import ( import (
"net/url" "net/url"
"testing" "testing"
"sneak.berlin/go/pixa/internal/allowlist"
) )
func TestHostWhitelist_IsWhitelisted(t *testing.T) { func TestHostAllowList_IsAllowed(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
patterns []string patterns []string
@@ -67,7 +69,7 @@ func TestHostWhitelist_IsWhitelisted(t *testing.T) {
want: true, want: true,
}, },
{ {
name: "empty whitelist", name: "empty allow list",
patterns: []string{}, patterns: []string{},
testURL: "https://cdn.example.com/image.jpg", testURL: "https://cdn.example.com/image.jpg",
want: false, want: false,
@@ -94,7 +96,7 @@ func TestHostWhitelist_IsWhitelisted(t *testing.T) {
for _, tt := range tests { for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) { t.Run(tt.name, func(t *testing.T) {
w := NewHostWhitelist(tt.patterns) w := allowlist.New(tt.patterns)
var u *url.URL var u *url.URL
if tt.testURL != "" { if tt.testURL != "" {
@@ -105,15 +107,15 @@ func TestHostWhitelist_IsWhitelisted(t *testing.T) {
} }
} }
got := w.IsWhitelisted(u) got := w.IsAllowed(u)
if got != tt.want { if got != tt.want {
t.Errorf("IsWhitelisted() = %v, want %v", got, tt.want) t.Errorf("IsAllowed() = %v, want %v", got, tt.want)
} }
}) })
} }
} }
func TestHostWhitelist_IsEmpty(t *testing.T) { func TestHostAllowList_IsEmpty(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
patterns []string patterns []string
@@ -143,7 +145,7 @@ func TestHostWhitelist_IsEmpty(t *testing.T) {
for _, tt := range tests { for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) { t.Run(tt.name, func(t *testing.T) {
w := NewHostWhitelist(tt.patterns) w := allowlist.New(tt.patterns)
if got := w.IsEmpty(); got != tt.want { if got := w.IsEmpty(); got != tt.want {
t.Errorf("IsEmpty() = %v, want %v", got, tt.want) t.Errorf("IsEmpty() = %v, want %v", got, tt.want)
} }
@@ -151,7 +153,7 @@ func TestHostWhitelist_IsEmpty(t *testing.T) {
} }
} }
func TestHostWhitelist_Count(t *testing.T) { func TestHostAllowList_Count(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
patterns []string patterns []string
@@ -181,7 +183,7 @@ func TestHostWhitelist_Count(t *testing.T) {
for _, tt := range tests { for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) { t.Run(tt.name, func(t *testing.T) {
w := NewHostWhitelist(tt.patterns) w := allowlist.New(tt.patterns)
if got := w.Count(); got != tt.want { if got := w.Count(); got != tt.want {
t.Errorf("Count() = %v, want %v", got, tt.want) t.Errorf("Count() = %v, want %v", got, tt.want)
} }

View File

@@ -9,6 +9,7 @@ import (
"log/slog" "log/slog"
"path/filepath" "path/filepath"
"sort" "sort"
"strconv"
"strings" "strings"
"go.uber.org/fx" "go.uber.org/fx"
@@ -21,6 +22,10 @@ import (
//go:embed schema/*.sql //go:embed schema/*.sql
var schemaFS embed.FS var schemaFS embed.FS
// bootstrapVersion is the migration that creates the schema_migrations
// table itself. It is applied before the normal migration loop.
const bootstrapVersion = 0
// Params defines dependencies for Database. // Params defines dependencies for Database.
type Params struct { type Params struct {
fx.In fx.In
@@ -38,35 +43,40 @@ type Database struct {
// ParseMigrationVersion extracts the numeric version prefix from a migration // ParseMigrationVersion extracts the numeric version prefix from a migration
// filename. Filenames must follow the pattern "<version>.sql" or // filename. Filenames must follow the pattern "<version>.sql" or
// "<version>_<description>.sql", where version is a zero-padded numeric // "<version>_<description>.sql", where version is a zero-padded numeric
// string (e.g. "001", "002"). Returns the version string and an error if // string (e.g. "001", "002"). Returns the version as an integer and an
// the filename does not match the expected pattern. // error if the filename does not match the expected pattern.
func ParseMigrationVersion(filename string) (string, error) { func ParseMigrationVersion(filename string) (int, error) {
name := strings.TrimSuffix(filename, filepath.Ext(filename)) name := strings.TrimSuffix(filename, filepath.Ext(filename))
if name == "" { if name == "" {
return "", fmt.Errorf("invalid migration filename %q: empty name", filename) return 0, fmt.Errorf("invalid migration filename %q: empty name", filename)
} }
// Split on underscore to separate version from description. // Split on underscore to separate version from description.
// If there's no underscore, the entire stem is the version. // If there's no underscore, the entire stem is the version.
version := name versionStr := name
if idx := strings.IndexByte(name, '_'); idx >= 0 { if idx := strings.IndexByte(name, '_'); idx >= 0 {
version = name[:idx] versionStr = name[:idx]
} }
if version == "" { if versionStr == "" {
return "", fmt.Errorf("invalid migration filename %q: empty version prefix", filename) return 0, fmt.Errorf("invalid migration filename %q: empty version prefix", filename)
} }
// Validate the version is purely numeric. // Validate the version is purely numeric.
for _, ch := range version { for _, ch := range versionStr {
if ch < '0' || ch > '9' { if ch < '0' || ch > '9' {
return "", fmt.Errorf( return 0, fmt.Errorf(
"invalid migration filename %q: version %q contains non-numeric character %q", "invalid migration filename %q: version %q contains non-numeric character %q",
filename, version, string(ch), filename, versionStr, string(ch),
) )
} }
} }
version, err := strconv.Atoi(versionStr)
if err != nil {
return 0, fmt.Errorf("invalid migration filename %q: %w", filename, err)
}
return version, nil return version, nil
} }
@@ -143,17 +153,34 @@ func collectMigrations() ([]string, error) {
return migrations, nil return migrations, nil
} }
// ensureMigrationsTable creates the schema_migrations tracking table if // bootstrapMigrationsTable ensures the schema_migrations table exists
// it does not already exist. // by applying 000.sql if the table is missing.
func ensureMigrationsTable(ctx context.Context, db *sql.DB) error { func bootstrapMigrationsTable(ctx context.Context, db *sql.DB, log *slog.Logger) error {
_, err := db.ExecContext(ctx, ` var tableExists int
CREATE TABLE IF NOT EXISTS schema_migrations (
version TEXT PRIMARY KEY, err := db.QueryRowContext(ctx,
applied_at DATETIME DEFAULT CURRENT_TIMESTAMP "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='schema_migrations'",
) ).Scan(&tableExists)
`)
if err != nil { if err != nil {
return fmt.Errorf("failed to create migrations table: %w", err) return fmt.Errorf("failed to check for migrations table: %w", err)
}
if tableExists > 0 {
return nil
}
content, err := schemaFS.ReadFile("schema/000.sql")
if err != nil {
return fmt.Errorf("failed to read bootstrap migration 000.sql: %w", err)
}
if log != nil {
log.Info("applying bootstrap migration", "version", bootstrapVersion)
}
_, err = db.ExecContext(ctx, string(content))
if err != nil {
return fmt.Errorf("failed to apply bootstrap migration: %w", err)
} }
return nil return nil
@@ -164,7 +191,7 @@ func ensureMigrationsTable(ctx context.Context, db *sql.DB) error {
// This is exported so tests can apply the real schema without the full fx // This is exported so tests can apply the real schema without the full fx
// lifecycle. // lifecycle.
func ApplyMigrations(ctx context.Context, db *sql.DB, log *slog.Logger) error { func ApplyMigrations(ctx context.Context, db *sql.DB, log *slog.Logger) error {
if err := ensureMigrationsTable(ctx, db); err != nil { if err := bootstrapMigrationsTable(ctx, db, log); err != nil {
return err return err
} }

View File

@@ -8,37 +8,51 @@ import (
_ "modernc.org/sqlite" // SQLite driver registration _ "modernc.org/sqlite" // SQLite driver registration
) )
// openTestDB returns a fresh in-memory SQLite database.
func openTestDB(t *testing.T) *sql.DB {
t.Helper()
db, err := sql.Open("sqlite", ":memory:")
if err != nil {
t.Fatalf("failed to open test db: %v", err)
}
t.Cleanup(func() { db.Close() })
return db
}
func TestParseMigrationVersion(t *testing.T) { func TestParseMigrationVersion(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
filename string filename string
want string want int
wantErr bool wantErr bool
}{ }{
{ {
name: "version only", name: "version only",
filename: "001.sql", filename: "001.sql",
want: "001", want: 1,
}, },
{ {
name: "version with description", name: "version with description",
filename: "001_initial_schema.sql", filename: "001_initial_schema.sql",
want: "001", want: 1,
}, },
{ {
name: "multi-digit version", name: "multi-digit version",
filename: "042_add_indexes.sql", filename: "042_add_indexes.sql",
want: "042", want: 42,
}, },
{ {
name: "long version number", name: "long version number",
filename: "00001_long_prefix.sql", filename: "00001_long_prefix.sql",
want: "00001", want: 1,
}, },
{ {
name: "description with multiple underscores", name: "description with multiple underscores",
filename: "003_add_user_auth_tables.sql", filename: "003_add_user_auth_tables.sql",
want: "003", want: 3,
}, },
{ {
name: "empty filename", name: "empty filename",
@@ -67,7 +81,7 @@ func TestParseMigrationVersion(t *testing.T) {
got, err := ParseMigrationVersion(tt.filename) got, err := ParseMigrationVersion(tt.filename)
if tt.wantErr { if tt.wantErr {
if err == nil { if err == nil {
t.Errorf("ParseMigrationVersion(%q) expected error, got %q", tt.filename, got) t.Errorf("ParseMigrationVersion(%q) expected error, got %d", tt.filename, got)
} }
return return
@@ -80,76 +94,131 @@ func TestParseMigrationVersion(t *testing.T) {
} }
if got != tt.want { if got != tt.want {
t.Errorf("ParseMigrationVersion(%q) = %q, want %q", tt.filename, got, tt.want) t.Errorf("ParseMigrationVersion(%q) = %d, want %d", tt.filename, got, tt.want)
} }
}) })
} }
} }
func TestApplyMigrations(t *testing.T) { func TestApplyMigrations_CreatesSchemaAndTables(t *testing.T) {
db, err := sql.Open("sqlite", ":memory:") db := openTestDB(t)
if err != nil { ctx := context.Background()
t.Fatalf("failed to open in-memory database: %v", err)
}
defer db.Close()
// Apply migrations should succeed. if err := ApplyMigrations(ctx, db, nil); err != nil {
if err := ApplyMigrations(context.Background(), db, nil); err != nil {
t.Fatalf("ApplyMigrations failed: %v", err) t.Fatalf("ApplyMigrations failed: %v", err)
} }
// Verify the schema_migrations table recorded the version. // The schema_migrations table must exist and contain at least
var version string // version 0 (the bootstrap) and 1 (the initial schema).
rows, err := db.Query("SELECT version FROM schema_migrations ORDER BY version")
err = db.QueryRowContext(context.Background(),
"SELECT version FROM schema_migrations LIMIT 1",
).Scan(&version)
if err != nil { if err != nil {
t.Fatalf("failed to query schema_migrations: %v", err) t.Fatalf("failed to query schema_migrations: %v", err)
} }
defer rows.Close()
if version != "001" { var versions []int
t.Errorf("expected version %q, got %q", "001", version) for rows.Next() {
var v int
if err := rows.Scan(&v); err != nil {
t.Fatalf("failed to scan version: %v", err)
}
versions = append(versions, v)
} }
// Verify a table from the migration exists (source_content). if err := rows.Err(); err != nil {
var tableName string t.Fatalf("row iteration error: %v", err)
}
err = db.QueryRowContext(context.Background(), if len(versions) < 2 {
"SELECT name FROM sqlite_master WHERE type='table' AND name='source_content'", t.Fatalf("expected at least 2 migrations recorded, got %d: %v", len(versions), versions)
).Scan(&tableName) }
if err != nil {
t.Fatalf("expected source_content table to exist: %v", err) if versions[0] != 0 {
t.Errorf("first recorded migration = %d, want %d", versions[0], 0)
}
if versions[1] != 1 {
t.Errorf("second recorded migration = %d, want %d", versions[1], 1)
}
// Verify that the application tables created by 001.sql exist.
for _, table := range []string{"source_content", "source_metadata", "output_content", "request_cache", "negative_cache", "cache_stats"} {
var count int
err := db.QueryRow(
"SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name=?",
table,
).Scan(&count)
if err != nil {
t.Fatalf("failed to check for table %s: %v", table, err)
}
if count != 1 {
t.Errorf("table %s does not exist after migrations", table)
}
} }
} }
func TestApplyMigrationsIdempotent(t *testing.T) { func TestApplyMigrations_Idempotent(t *testing.T) {
db, err := sql.Open("sqlite", ":memory:") db := openTestDB(t)
if err != nil { ctx := context.Background()
t.Fatalf("failed to open in-memory database: %v", err)
}
defer db.Close()
// Apply twice should succeed (idempotent). if err := ApplyMigrations(ctx, db, nil); err != nil {
if err := ApplyMigrations(context.Background(), db, nil); err != nil {
t.Fatalf("first ApplyMigrations failed: %v", err) t.Fatalf("first ApplyMigrations failed: %v", err)
} }
if err := ApplyMigrations(context.Background(), db, nil); err != nil { // Running a second time must succeed without errors.
if err := ApplyMigrations(ctx, db, nil); err != nil {
t.Fatalf("second ApplyMigrations failed: %v", err) t.Fatalf("second ApplyMigrations failed: %v", err)
} }
// Should still have exactly one migration recorded. // Verify no duplicate rows in schema_migrations.
var count int var count int
err = db.QueryRowContext(context.Background(), err := db.QueryRow("SELECT COUNT(*) FROM schema_migrations WHERE version = 0").Scan(&count)
"SELECT COUNT(*) FROM schema_migrations",
).Scan(&count)
if err != nil { if err != nil {
t.Fatalf("failed to count schema_migrations: %v", err) t.Fatalf("failed to count version 0 rows: %v", err)
} }
if count != 1 { if count != 1 {
t.Errorf("expected 1 migration record, got %d", count) t.Errorf("expected exactly 1 row for version 0, got %d", count)
}
}
func TestBootstrapMigrationsTable_FreshDatabase(t *testing.T) {
db := openTestDB(t)
ctx := context.Background()
if err := bootstrapMigrationsTable(ctx, db, nil); err != nil {
t.Fatalf("bootstrapMigrationsTable failed: %v", err)
}
// schema_migrations table must exist.
var tableCount int
err := db.QueryRow(
"SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='schema_migrations'",
).Scan(&tableCount)
if err != nil {
t.Fatalf("failed to check for table: %v", err)
}
if tableCount != 1 {
t.Fatalf("schema_migrations table not created")
}
// Version 0 must be recorded.
var recorded int
err = db.QueryRow(
"SELECT COUNT(*) FROM schema_migrations WHERE version = 0",
).Scan(&recorded)
if err != nil {
t.Fatalf("failed to check version: %v", err)
}
if recorded != 1 {
t.Errorf("expected version 0 to be recorded, got count %d", recorded)
} }
} }

View File

@@ -0,0 +1,9 @@
-- Migration 000: Schema migrations tracking table
-- Applied as a bootstrap step before the normal migration loop.
CREATE TABLE IF NOT EXISTS schema_migrations (
version INTEGER PRIMARY KEY,
applied_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
INSERT OR IGNORE INTO schema_migrations (version) VALUES (0);

View File

@@ -13,6 +13,7 @@ import (
"sneak.berlin/go/pixa/internal/database" "sneak.berlin/go/pixa/internal/database"
"sneak.berlin/go/pixa/internal/encurl" "sneak.berlin/go/pixa/internal/encurl"
"sneak.berlin/go/pixa/internal/healthcheck" "sneak.berlin/go/pixa/internal/healthcheck"
"sneak.berlin/go/pixa/internal/httpfetcher"
"sneak.berlin/go/pixa/internal/imgcache" "sneak.berlin/go/pixa/internal/imgcache"
"sneak.berlin/go/pixa/internal/logger" "sneak.berlin/go/pixa/internal/logger"
"sneak.berlin/go/pixa/internal/session" "sneak.berlin/go/pixa/internal/session"
@@ -72,7 +73,7 @@ func (s *Handlers) initImageService() error {
s.imgCache = cache s.imgCache = cache
// Create the fetcher config // Create the fetcher config
fetcherCfg := imgcache.DefaultFetcherConfig() fetcherCfg := httpfetcher.DefaultConfig()
fetcherCfg.AllowHTTP = s.config.AllowHTTP fetcherCfg.AllowHTTP = s.config.AllowHTTP
if s.config.UpstreamConnectionsPerHost > 0 { if s.config.UpstreamConnectionsPerHost > 0 {
fetcherCfg.MaxConnectionsPerHost = s.config.UpstreamConnectionsPerHost fetcherCfg.MaxConnectionsPerHost = s.config.UpstreamConnectionsPerHost

View File

@@ -18,6 +18,7 @@ import (
"github.com/go-chi/chi/v5" "github.com/go-chi/chi/v5"
"sneak.berlin/go/pixa/internal/database" "sneak.berlin/go/pixa/internal/database"
"sneak.berlin/go/pixa/internal/httpfetcher"
"sneak.berlin/go/pixa/internal/imgcache" "sneak.berlin/go/pixa/internal/imgcache"
) )
@@ -116,16 +117,16 @@ func newMockFetcher(fs fs.FS) *mockFetcher {
return &mockFetcher{fs: fs} return &mockFetcher{fs: fs}
} }
func (f *mockFetcher) Fetch(ctx context.Context, url string) (*imgcache.FetchResult, error) { func (f *mockFetcher) Fetch(ctx context.Context, url string) (*httpfetcher.FetchResult, error) {
// Remove https:// prefix // Remove https:// prefix
path := url[8:] // Remove "https://" path := url[8:] // Remove "https://"
data, err := fs.ReadFile(f.fs, path) data, err := fs.ReadFile(f.fs, path)
if err != nil { if err != nil {
return nil, imgcache.ErrUpstreamError return nil, httpfetcher.ErrUpstreamError
} }
return &imgcache.FetchResult{ return &httpfetcher.FetchResult{
Content: io.NopCloser(bytes.NewReader(data)), Content: io.NopCloser(bytes.NewReader(data)),
ContentLength: int64(len(data)), ContentLength: int64(len(data)),
ContentType: "image/jpeg", ContentType: "image/jpeg",

View File

@@ -8,6 +8,7 @@ import (
"time" "time"
"github.com/go-chi/chi/v5" "github.com/go-chi/chi/v5"
"sneak.berlin/go/pixa/internal/httpfetcher"
"sneak.berlin/go/pixa/internal/imgcache" "sneak.berlin/go/pixa/internal/imgcache"
) )
@@ -97,13 +98,13 @@ func (s *Handlers) HandleImage() http.HandlerFunc {
) )
// Check for specific error types // Check for specific error types
if errors.Is(err, imgcache.ErrSSRFBlocked) { if errors.Is(err, httpfetcher.ErrSSRFBlocked) {
s.respondError(w, "forbidden", http.StatusForbidden) s.respondError(w, "forbidden", http.StatusForbidden)
return return
} }
if errors.Is(err, imgcache.ErrUpstreamError) { if errors.Is(err, httpfetcher.ErrUpstreamError) {
s.respondError(w, "upstream error", http.StatusBadGateway) s.respondError(w, "upstream error", http.StatusBadGateway)
return return

View File

@@ -11,6 +11,7 @@ import (
"github.com/go-chi/chi/v5" "github.com/go-chi/chi/v5"
"sneak.berlin/go/pixa/internal/encurl" "sneak.berlin/go/pixa/internal/encurl"
"sneak.berlin/go/pixa/internal/httpfetcher"
"sneak.berlin/go/pixa/internal/imgcache" "sneak.berlin/go/pixa/internal/imgcache"
) )
@@ -100,11 +101,11 @@ func (s *Handlers) HandleImageEnc() http.HandlerFunc {
// handleImageError converts image service errors to HTTP responses. // handleImageError converts image service errors to HTTP responses.
func (s *Handlers) handleImageError(w http.ResponseWriter, err error) { func (s *Handlers) handleImageError(w http.ResponseWriter, err error) {
switch { switch {
case errors.Is(err, imgcache.ErrSSRFBlocked): case errors.Is(err, httpfetcher.ErrSSRFBlocked):
s.respondError(w, "forbidden", http.StatusForbidden) s.respondError(w, "forbidden", http.StatusForbidden)
case errors.Is(err, imgcache.ErrUpstreamError): case errors.Is(err, httpfetcher.ErrUpstreamError):
s.respondError(w, "upstream error", http.StatusBadGateway) s.respondError(w, "upstream error", http.StatusBadGateway)
case errors.Is(err, imgcache.ErrUpstreamTimeout): case errors.Is(err, httpfetcher.ErrUpstreamTimeout):
s.respondError(w, "upstream timeout", http.StatusGatewayTimeout) s.respondError(w, "upstream timeout", http.StatusGatewayTimeout)
default: default:
s.log.Error("image request failed", "error", err) s.log.Error("image request failed", "error", err)

View File

@@ -1,4 +1,6 @@
package imgcache // Package httpfetcher fetches content from upstream HTTP origins with SSRF
// protection, per-host connection limits, and content-type validation.
package httpfetcher
import ( import (
"context" "context"
@@ -37,25 +39,55 @@ var (
ErrUpstreamTimeout = errors.New("upstream request timeout") ErrUpstreamTimeout = errors.New("upstream request timeout")
) )
// FetcherConfig holds configuration for the upstream fetcher. // Fetcher retrieves content from upstream origins.
type FetcherConfig struct { type Fetcher interface {
// Timeout for upstream requests // Fetch retrieves content from the given URL.
Fetch(ctx context.Context, url string) (*FetchResult, error)
}
// FetchResult contains the result of fetching from upstream.
type FetchResult struct {
// Content is the raw image data.
Content io.ReadCloser
// ContentLength is the size in bytes (-1 if unknown).
ContentLength int64
// ContentType is the MIME type from upstream.
ContentType string
// Headers contains all response headers from upstream.
Headers map[string][]string
// StatusCode is the HTTP status code from upstream.
StatusCode int
// FetchDurationMs is how long the fetch took in milliseconds.
FetchDurationMs int64
// RemoteAddr is the IP:port of the upstream server.
RemoteAddr string
// HTTPVersion is the protocol version (e.g., "1.1", "2.0").
HTTPVersion string
// TLSVersion is the TLS protocol version (e.g., "TLS 1.3").
TLSVersion string
// TLSCipherSuite is the negotiated cipher suite name.
TLSCipherSuite string
}
// Config holds configuration for the upstream fetcher.
type Config struct {
// Timeout for upstream requests.
Timeout time.Duration Timeout time.Duration
// MaxResponseSize is the maximum allowed response body size // MaxResponseSize is the maximum allowed response body size.
MaxResponseSize int64 MaxResponseSize int64
// UserAgent to send to upstream servers // UserAgent to send to upstream servers.
UserAgent string UserAgent string
// AllowedContentTypes is a whitelist of MIME types to accept // AllowedContentTypes is an allow list of MIME types to accept.
AllowedContentTypes []string AllowedContentTypes []string
// AllowHTTP allows non-TLS connections (for testing only) // AllowHTTP allows non-TLS connections (for testing only).
AllowHTTP bool AllowHTTP bool
// MaxConnectionsPerHost limits concurrent connections to each upstream host // MaxConnectionsPerHost limits concurrent connections to each upstream host.
MaxConnectionsPerHost int MaxConnectionsPerHost int
} }
// DefaultFetcherConfig returns sensible defaults. // DefaultConfig returns a Config with sensible defaults.
func DefaultFetcherConfig() *FetcherConfig { func DefaultConfig() *Config {
return &FetcherConfig{ return &Config{
Timeout: DefaultFetchTimeout, Timeout: DefaultFetchTimeout,
MaxResponseSize: DefaultMaxResponseSize, MaxResponseSize: DefaultMaxResponseSize,
UserAgent: "pixa/1.0", UserAgent: "pixa/1.0",
@@ -72,18 +104,18 @@ func DefaultFetcherConfig() *FetcherConfig {
} }
} }
// HTTPFetcher implements the Fetcher interface with SSRF protection. // HTTPFetcher implements Fetcher with SSRF protection and per-host connection limits.
type HTTPFetcher struct { type HTTPFetcher struct {
client *http.Client client *http.Client
config *FetcherConfig config *Config
hostSems map[string]chan struct{} // per-host semaphores hostSems map[string]chan struct{} // per-host semaphores
hostSemMu sync.Mutex // protects hostSems map hostSemMu sync.Mutex // protects hostSems map
} }
// NewHTTPFetcher creates a new fetcher with SSRF protection. // New creates a new HTTPFetcher with SSRF protection.
func NewHTTPFetcher(config *FetcherConfig) *HTTPFetcher { func New(config *Config) *HTTPFetcher {
if config == nil { if config == nil {
config = DefaultFetcherConfig() config = DefaultConfig()
} }
// Create transport with SSRF-safe dialer // Create transport with SSRF-safe dialer
@@ -250,7 +282,7 @@ func (f *HTTPFetcher) Fetch(ctx context.Context, url string) (*FetchResult, erro
}, nil }, nil
} }
// isAllowedContentType checks if the content type is in the whitelist. // isAllowedContentType checks if the content type is in the allow list.
func (f *HTTPFetcher) isAllowedContentType(contentType string) bool { func (f *HTTPFetcher) isAllowedContentType(contentType string) bool {
// Extract the MIME type without parameters // Extract the MIME type without parameters
mediaType := strings.TrimSpace(strings.Split(contentType, ";")[0]) mediaType := strings.TrimSpace(strings.Split(contentType, ";")[0])

View File

@@ -0,0 +1,329 @@
package httpfetcher
import (
"context"
"errors"
"io"
"net"
"testing"
"testing/fstest"
)
func TestDefaultConfig(t *testing.T) {
cfg := DefaultConfig()
if cfg.Timeout != DefaultFetchTimeout {
t.Errorf("Timeout = %v, want %v", cfg.Timeout, DefaultFetchTimeout)
}
if cfg.MaxResponseSize != DefaultMaxResponseSize {
t.Errorf("MaxResponseSize = %d, want %d", cfg.MaxResponseSize, DefaultMaxResponseSize)
}
if cfg.MaxConnectionsPerHost != DefaultMaxConnectionsPerHost {
t.Errorf("MaxConnectionsPerHost = %d, want %d",
cfg.MaxConnectionsPerHost, DefaultMaxConnectionsPerHost)
}
if cfg.AllowHTTP {
t.Error("AllowHTTP should default to false")
}
if len(cfg.AllowedContentTypes) == 0 {
t.Error("AllowedContentTypes should not be empty")
}
}
func TestNewWithNilConfigUsesDefaults(t *testing.T) {
f := New(nil)
if f == nil {
t.Fatal("New(nil) returned nil")
}
if f.config == nil {
t.Fatal("config should be populated from DefaultConfig")
}
if f.config.Timeout != DefaultFetchTimeout {
t.Errorf("Timeout = %v, want %v", f.config.Timeout, DefaultFetchTimeout)
}
}
func TestIsAllowedContentType(t *testing.T) {
f := New(DefaultConfig())
tests := []struct {
contentType string
want bool
}{
{"image/jpeg", true},
{"image/png", true},
{"image/webp", true},
{"image/jpeg; charset=utf-8", true},
{"IMAGE/JPEG", true},
{"text/html", false},
{"application/octet-stream", false},
{"", false},
}
for _, tc := range tests {
t.Run(tc.contentType, func(t *testing.T) {
got := f.isAllowedContentType(tc.contentType)
if got != tc.want {
t.Errorf("isAllowedContentType(%q) = %v, want %v", tc.contentType, got, tc.want)
}
})
}
}
func TestExtractHost(t *testing.T) {
tests := []struct {
url string
want string
}{
{"https://example.com/path", "example.com"},
{"http://example.com:8080/path", "example.com:8080"},
{"https://example.com", "example.com"},
{"https://example.com?q=1", "example.com"},
{"example.com/path", "example.com"},
{"", ""},
}
for _, tc := range tests {
t.Run(tc.url, func(t *testing.T) {
got := extractHost(tc.url)
if got != tc.want {
t.Errorf("extractHost(%q) = %q, want %q", tc.url, got, tc.want)
}
})
}
}
func TestIsLocalhost(t *testing.T) {
tests := []struct {
host string
want bool
}{
{"localhost", true},
{"LOCALHOST", true},
{"127.0.0.1", true},
{"::1", true},
{"[::1]", true},
{"foo.localhost", true},
{"foo.local", true},
{"example.com", false},
{"127.0.0.2", false}, // Handled by isPrivateIP, not isLocalhost string match
}
for _, tc := range tests {
t.Run(tc.host, func(t *testing.T) {
got := isLocalhost(tc.host)
if got != tc.want {
t.Errorf("isLocalhost(%q) = %v, want %v", tc.host, got, tc.want)
}
})
}
}
func TestIsPrivateIP(t *testing.T) {
tests := []struct {
ip string
want bool
}{
{"127.0.0.1", true}, // loopback
{"10.0.0.1", true}, // private
{"192.168.1.1", true}, // private
{"172.16.0.1", true}, // private
{"169.254.1.1", true}, // link-local
{"0.0.0.0", true}, // unspecified
{"224.0.0.1", true}, // multicast
{"::1", true}, // IPv6 loopback
{"fe80::1", true}, // IPv6 link-local
{"8.8.8.8", false}, // public
{"2001:4860:4860::8888", false}, // public IPv6
}
for _, tc := range tests {
t.Run(tc.ip, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
got := isPrivateIP(ip)
if got != tc.want {
t.Errorf("isPrivateIP(%q) = %v, want %v", tc.ip, got, tc.want)
}
})
}
if !isPrivateIP(nil) {
t.Error("isPrivateIP(nil) should return true")
}
}
func TestValidateURL_RejectsNonHTTPS(t *testing.T) {
err := validateURL("http://example.com/path", false)
if !errors.Is(err, ErrUnsupportedScheme) {
t.Errorf("validateURL http = %v, want ErrUnsupportedScheme", err)
}
}
func TestValidateURL_AllowsHTTPWhenConfigured(t *testing.T) {
// Use a host that won't resolve (explicit .invalid TLD) so we don't hit DNS.
err := validateURL("http://nonexistent.invalid/path", true)
// We expect a host resolution error, not ErrUnsupportedScheme.
if errors.Is(err, ErrUnsupportedScheme) {
t.Error("validateURL with AllowHTTP should not return ErrUnsupportedScheme")
}
}
func TestValidateURL_RejectsLocalhost(t *testing.T) {
err := validateURL("https://localhost/path", false)
if !errors.Is(err, ErrSSRFBlocked) {
t.Errorf("validateURL localhost = %v, want ErrSSRFBlocked", err)
}
}
func TestValidateURL_EmptyHost(t *testing.T) {
err := validateURL("https:///path", false)
if !errors.Is(err, ErrInvalidHost) {
t.Errorf("validateURL empty host = %v, want ErrInvalidHost", err)
}
}
func TestMockFetcher_FetchesFile(t *testing.T) {
mockFS := fstest.MapFS{
"example.com/images/photo.jpg": &fstest.MapFile{Data: []byte("fake-jpeg-data")},
}
m := NewMock(mockFS)
result, err := m.Fetch(context.Background(), "https://example.com/images/photo.jpg")
if err != nil {
t.Fatalf("Fetch() error = %v", err)
}
defer func() { _ = result.Content.Close() }()
if result.ContentType != "image/jpeg" {
t.Errorf("ContentType = %q, want image/jpeg", result.ContentType)
}
data, err := io.ReadAll(result.Content)
if err != nil {
t.Fatalf("read content: %v", err)
}
if string(data) != "fake-jpeg-data" {
t.Errorf("Content = %q, want %q", string(data), "fake-jpeg-data")
}
if result.ContentLength != int64(len("fake-jpeg-data")) {
t.Errorf("ContentLength = %d, want %d", result.ContentLength, len("fake-jpeg-data"))
}
}
func TestMockFetcher_MissingFileReturnsUpstreamError(t *testing.T) {
mockFS := fstest.MapFS{}
m := NewMock(mockFS)
_, err := m.Fetch(context.Background(), "https://example.com/missing.jpg")
if !errors.Is(err, ErrUpstreamError) {
t.Errorf("Fetch() error = %v, want ErrUpstreamError", err)
}
}
func TestMockFetcher_RespectsContextCancellation(t *testing.T) {
mockFS := fstest.MapFS{
"example.com/photo.jpg": &fstest.MapFile{Data: []byte("data")},
}
m := NewMock(mockFS)
ctx, cancel := context.WithCancel(context.Background())
cancel()
_, err := m.Fetch(ctx, "https://example.com/photo.jpg")
if !errors.Is(err, context.Canceled) {
t.Errorf("Fetch() error = %v, want context.Canceled", err)
}
}
func TestDetectContentTypeFromPath(t *testing.T) {
tests := []struct {
path string
want string
}{
{"foo/bar.jpg", "image/jpeg"},
{"foo/bar.JPG", "image/jpeg"},
{"foo/bar.jpeg", "image/jpeg"},
{"foo/bar.png", "image/png"},
{"foo/bar.gif", "image/gif"},
{"foo/bar.webp", "image/webp"},
{"foo/bar.avif", "image/avif"},
{"foo/bar.svg", "image/svg+xml"},
{"foo/bar.bin", "application/octet-stream"},
{"foo/bar", "application/octet-stream"},
}
for _, tc := range tests {
t.Run(tc.path, func(t *testing.T) {
got := detectContentTypeFromPath(tc.path)
if got != tc.want {
t.Errorf("detectContentTypeFromPath(%q) = %q, want %q", tc.path, got, tc.want)
}
})
}
}
func TestLimitedReader_EnforcesLimit(t *testing.T) {
src := make([]byte, 100)
r := &limitedReader{
reader: &byteReader{data: src},
remaining: 50,
}
buf := make([]byte, 100)
n, err := r.Read(buf)
if err != nil {
t.Fatalf("first Read error = %v", err)
}
if n > 50 {
t.Errorf("read %d bytes, should be capped at 50", n)
}
// Drain until limit is exhausted.
total := n
for total < 50 {
nn, err := r.Read(buf)
total += nn
if err != nil {
t.Fatalf("during drain: %v", err)
}
}
// Now the limit is exhausted — next read should error.
_, err = r.Read(buf)
if !errors.Is(err, ErrResponseTooLarge) {
t.Errorf("exhausted Read error = %v, want ErrResponseTooLarge", err)
}
}
// byteReader is a minimal io.Reader over a byte slice for testing.
type byteReader struct {
data []byte
pos int
}
func (r *byteReader) Read(p []byte) (int, error) {
if r.pos >= len(r.data) {
return 0, io.EOF
}
n := copy(p, r.data[r.pos:])
r.pos += n
return n, nil
}

View File

@@ -1,4 +1,4 @@
package imgcache package httpfetcher
import ( import (
"context" "context"
@@ -10,15 +10,15 @@ import (
"strings" "strings"
) )
// MockFetcher implements the Fetcher interface using an embedded filesystem. // MockFetcher implements Fetcher using an embedded filesystem.
// Files are organized as: hostname/path/to/file.ext // Files are organized as: hostname/path/to/file.ext
// URLs like https://example.com/images/photo.jpg map to example.com/images/photo.jpg // URLs like https://example.com/images/photo.jpg map to example.com/images/photo.jpg.
type MockFetcher struct { type MockFetcher struct {
fs fs.FS fs fs.FS
} }
// NewMockFetcher creates a new mock fetcher backed by the given filesystem. // NewMock creates a new mock fetcher backed by the given filesystem.
func NewMockFetcher(fsys fs.FS) *MockFetcher { func NewMock(fsys fs.FS) *MockFetcher {
return &MockFetcher{fs: fsys} return &MockFetcher{fs: fsys}
} }

View File

@@ -9,6 +9,8 @@ import (
"io" "io"
"path/filepath" "path/filepath"
"time" "time"
"sneak.berlin/go/pixa/internal/httpfetcher"
) )
// Cache errors. // Cache errors.
@@ -111,7 +113,7 @@ func (c *Cache) StoreSource(
ctx context.Context, ctx context.Context,
req *ImageRequest, req *ImageRequest,
content io.Reader, content io.Reader,
result *FetchResult, result *httpfetcher.FetchResult,
) (ContentHash, error) { ) (ContentHash, error) {
// Store content // Store content
contentHash, size, err := c.srcContent.Store(content) contentHash, size, err := c.srcContent.Store(content)

View File

@@ -9,6 +9,7 @@ import (
"time" "time"
_ "modernc.org/sqlite" _ "modernc.org/sqlite"
"sneak.berlin/go/pixa/internal/httpfetcher"
) )
func setupTestDB(t *testing.T) *sql.DB { func setupTestDB(t *testing.T) *sql.DB {
@@ -152,7 +153,7 @@ func TestCache_StoreAndLookup(t *testing.T) {
// Store source content // Store source content
sourceContent := []byte("fake jpeg data") sourceContent := []byte("fake jpeg data")
fetchResult := &FetchResult{ fetchResult := &httpfetcher.FetchResult{
ContentType: "image/jpeg", ContentType: "image/jpeg",
Headers: map[string][]string{"Content-Type": {"image/jpeg"}}, Headers: map[string][]string{"Content-Type": {"image/jpeg"}},
} }

View File

@@ -169,36 +169,6 @@ type Whitelist interface {
IsWhitelisted(u *url.URL) bool IsWhitelisted(u *url.URL) bool
} }
// Fetcher fetches images from upstream origins
type Fetcher interface {
// Fetch retrieves an image from the origin
Fetch(ctx context.Context, url string) (*FetchResult, error)
}
// FetchResult contains the result of fetching from upstream
type FetchResult struct {
// Content is the raw image data
Content io.ReadCloser
// ContentLength is the size in bytes (-1 if unknown)
ContentLength int64
// ContentType is the MIME type from upstream
ContentType string
// Headers contains all response headers from upstream
Headers map[string][]string
// StatusCode is the HTTP status code from upstream
StatusCode int
// FetchDurationMs is how long the fetch took in milliseconds
FetchDurationMs int64
// RemoteAddr is the IP:port of the upstream server
RemoteAddr string
// HTTPVersion is the protocol version (e.g., "1.1", "2.0")
HTTPVersion string
// TLSVersion is the TLS protocol version (e.g., "TLS 1.3")
TLSVersion string
// TLSCipherSuite is the negotiated cipher suite name
TLSCipherSuite string
}
// Storage handles persistent storage of cached content // Storage handles persistent storage of cached content
type Storage interface { type Storage interface {
// Store saves content and returns its hash // Store saves content and returns its hash

View File

@@ -11,16 +11,19 @@ import (
"time" "time"
"github.com/dustin/go-humanize" "github.com/dustin/go-humanize"
"sneak.berlin/go/pixa/internal/allowlist"
"sneak.berlin/go/pixa/internal/httpfetcher"
"sneak.berlin/go/pixa/internal/imageprocessor" "sneak.berlin/go/pixa/internal/imageprocessor"
"sneak.berlin/go/pixa/internal/magic"
) )
// Service implements the ImageCache interface, orchestrating cache, fetcher, and processor. // Service implements the ImageCache interface, orchestrating cache, fetcher, and processor.
type Service struct { type Service struct {
cache *Cache cache *Cache
fetcher Fetcher fetcher httpfetcher.Fetcher
processor *imageprocessor.ImageProcessor processor *imageprocessor.ImageProcessor
signer *Signer signer *Signer
whitelist *HostWhitelist allowlist *allowlist.HostAllowList
log *slog.Logger log *slog.Logger
allowHTTP bool allowHTTP bool
maxResponseSize int64 maxResponseSize int64
@@ -31,9 +34,9 @@ type ServiceConfig struct {
// Cache is the cache instance // Cache is the cache instance
Cache *Cache Cache *Cache
// FetcherConfig configures the upstream fetcher (ignored if Fetcher is set) // FetcherConfig configures the upstream fetcher (ignored if Fetcher is set)
FetcherConfig *FetcherConfig FetcherConfig *httpfetcher.Config
// Fetcher is an optional custom fetcher (for testing) // Fetcher is an optional custom fetcher (for testing)
Fetcher Fetcher Fetcher httpfetcher.Fetcher
// SigningKey is the HMAC signing key (empty disables signing) // SigningKey is the HMAC signing key (empty disables signing)
SigningKey string SigningKey string
// Whitelist is the list of hosts that don't require signatures // Whitelist is the list of hosts that don't require signatures
@@ -55,15 +58,15 @@ func NewService(cfg *ServiceConfig) (*Service, error) {
// Resolve fetcher config for defaults // Resolve fetcher config for defaults
fetcherCfg := cfg.FetcherConfig fetcherCfg := cfg.FetcherConfig
if fetcherCfg == nil { if fetcherCfg == nil {
fetcherCfg = DefaultFetcherConfig() fetcherCfg = httpfetcher.DefaultConfig()
} }
// Use custom fetcher if provided, otherwise create HTTP fetcher // Use custom fetcher if provided, otherwise create HTTP fetcher
var fetcher Fetcher var fetcher httpfetcher.Fetcher
if cfg.Fetcher != nil { if cfg.Fetcher != nil {
fetcher = cfg.Fetcher fetcher = cfg.Fetcher
} else { } else {
fetcher = NewHTTPFetcher(fetcherCfg) fetcher = httpfetcher.New(fetcherCfg)
} }
signer := NewSigner(cfg.SigningKey) signer := NewSigner(cfg.SigningKey)
@@ -85,7 +88,7 @@ func NewService(cfg *ServiceConfig) (*Service, error) {
fetcher: fetcher, fetcher: fetcher,
processor: imageprocessor.New(imageprocessor.Params{MaxInputBytes: maxResponseSize}), processor: imageprocessor.New(imageprocessor.Params{MaxInputBytes: maxResponseSize}),
signer: signer, signer: signer,
whitelist: NewHostWhitelist(cfg.Whitelist), allowlist: allowlist.New(cfg.Whitelist),
log: log, log: log,
allowHTTP: allowHTTP, allowHTTP: allowHTTP,
maxResponseSize: maxResponseSize, maxResponseSize: maxResponseSize,
@@ -111,7 +114,7 @@ func (s *Service) Get(ctx context.Context, req *ImageRequest) (*ImageResponse, e
"path", req.SourcePath, "path", req.SourcePath,
) )
return nil, fmt.Errorf("%w: %w", ErrUpstreamError, ErrNegativeCached) return nil, fmt.Errorf("%w: %w", httpfetcher.ErrUpstreamError, ErrNegativeCached)
} }
// Check variant cache first (disk only, no DB) // Check variant cache first (disk only, no DB)
@@ -276,7 +279,7 @@ func (s *Service) fetchAndProcess(
) )
// Validate magic bytes match content type // Validate magic bytes match content type
if err := ValidateMagicBytes(sourceData, fetchResult.ContentType); err != nil { if err := magic.ValidateMagicBytes(sourceData, fetchResult.ContentType); err != nil {
return nil, fmt.Errorf("content validation failed: %w", err) return nil, fmt.Errorf("content validation failed: %w", err)
} }
@@ -381,7 +384,7 @@ func (s *Service) Stats(ctx context.Context) (*CacheStats, error) {
// ValidateRequest validates the request signature if required. // ValidateRequest validates the request signature if required.
func (s *Service) ValidateRequest(req *ImageRequest) error { func (s *Service) ValidateRequest(req *ImageRequest) error {
// Check if host is whitelisted (no signature required) // Check if host is allowed (no signature required)
sourceURL := req.SourceURL() sourceURL := req.SourceURL()
parsedURL, err := url.Parse(sourceURL) parsedURL, err := url.Parse(sourceURL)
@@ -389,11 +392,11 @@ func (s *Service) ValidateRequest(req *ImageRequest) error {
return fmt.Errorf("invalid source URL: %w", err) return fmt.Errorf("invalid source URL: %w", err)
} }
if s.whitelist.IsWhitelisted(parsedURL) { if s.allowlist.IsAllowed(parsedURL) {
return nil return nil
} }
// Signature required for non-whitelisted hosts // Signature required for non-allowed hosts
return s.signer.Verify(req) return s.signer.Verify(req)
} }
@@ -416,13 +419,13 @@ const (
// isNegativeCacheable returns true if the error should be cached. // isNegativeCacheable returns true if the error should be cached.
func isNegativeCacheable(err error) bool { func isNegativeCacheable(err error) bool {
return errors.Is(err, ErrUpstreamError) return errors.Is(err, httpfetcher.ErrUpstreamError)
} }
// extractStatusCode extracts HTTP status code from error message. // extractStatusCode extracts HTTP status code from error message.
func extractStatusCode(err error) int { func extractStatusCode(err error) int {
// Default to 502 Bad Gateway for upstream errors // Default to 502 Bad Gateway for upstream errors
if errors.Is(err, ErrUpstreamError) { if errors.Is(err, httpfetcher.ErrUpstreamError) {
return httpStatusBadGateway return httpStatusBadGateway
} }

View File

@@ -5,6 +5,8 @@ import (
"io" "io"
"testing" "testing"
"time" "time"
"sneak.berlin/go/pixa/internal/magic"
) )
func TestService_Get_WhitelistedHost(t *testing.T) { func TestService_Get_WhitelistedHost(t *testing.T) {
@@ -151,6 +153,74 @@ func TestService_Get_NonWhitelistedHost_InvalidSignature(t *testing.T) {
} }
} }
// TestService_ValidateRequest_SignatureExactHostMatch verifies that
// ValidateRequest enforces exact host matching for signatures. A
// signature for one host must not verify for a different host, even
// if they share a domain suffix.
func TestService_ValidateRequest_SignatureExactHostMatch(t *testing.T) {
signingKey := "test-signing-key-must-be-32-chars"
svc, _ := SetupTestService(t,
WithSigningKey(signingKey),
WithNoWhitelist(),
)
signer := NewSigner(signingKey)
// Sign a request for "cdn.example.com"
signedReq := &ImageRequest{
SourceHost: "cdn.example.com",
SourcePath: "/photos/cat.jpg",
Size: Size{Width: 50, Height: 50},
Format: FormatJPEG,
Quality: 85,
FitMode: FitCover,
Expires: time.Now().Add(time.Hour),
}
signedReq.Signature = signer.Sign(signedReq)
// The original request should pass validation
t.Run("exact host passes", func(t *testing.T) {
err := svc.ValidateRequest(signedReq)
if err != nil {
t.Errorf("ValidateRequest() exact host failed: %v", err)
}
})
// Try to reuse the signature with different hosts
tests := []struct {
name string
host string
}{
{"parent domain", "example.com"},
{"sibling subdomain", "images.example.com"},
{"deeper subdomain", "a.cdn.example.com"},
{"evil suffix domain", "cdn.example.com.evil.com"},
{"prefixed host", "evilcdn.example.com"},
}
for _, tt := range tests {
t.Run(tt.name+" rejected", func(t *testing.T) {
req := &ImageRequest{
SourceHost: tt.host,
SourcePath: signedReq.SourcePath,
SourceQuery: signedReq.SourceQuery,
Size: signedReq.Size,
Format: signedReq.Format,
Quality: signedReq.Quality,
FitMode: signedReq.FitMode,
Expires: signedReq.Expires,
Signature: signedReq.Signature,
}
err := svc.ValidateRequest(req)
if err == nil {
t.Errorf("ValidateRequest() should reject signature for host %q (signed for %q)",
tt.host, signedReq.SourceHost)
}
})
}
}
func TestService_Get_InvalidFile(t *testing.T) { func TestService_Get_InvalidFile(t *testing.T) {
svc, fixtures := SetupTestService(t) svc, fixtures := SetupTestService(t)
ctx := context.Background() ctx := context.Background()
@@ -247,17 +317,17 @@ func TestService_Get_FormatConversion(t *testing.T) {
t.Fatalf("failed to read response: %v", err) t.Fatalf("failed to read response: %v", err)
} }
detectedMIME, err := DetectFormat(data) detectedMIME, err := magic.DetectFormat(data)
if err != nil { if err != nil {
t.Fatalf("failed to detect format: %v", err) t.Fatalf("failed to detect format: %v", err)
} }
expectedFormat, ok := MIMEToImageFormat(tt.wantMIME) expectedFormat, ok := magic.MIMEToImageFormat(tt.wantMIME)
if !ok { if !ok {
t.Fatalf("unknown format for MIME type: %s", tt.wantMIME) t.Fatalf("unknown format for MIME type: %s", tt.wantMIME)
} }
detectedFormat, ok := MIMEToImageFormat(string(detectedMIME)) detectedFormat, ok := magic.MIMEToImageFormat(string(detectedMIME))
if !ok { if !ok {
t.Fatalf("unknown format for detected MIME type: %s", detectedMIME) t.Fatalf("unknown format for detected MIME type: %s", detectedMIME)
} }

View File

@@ -43,6 +43,11 @@ func (s *Signer) Sign(req *ImageRequest) string {
} }
// Verify checks if the signature on the request is valid and not expired. // Verify checks if the signature on the request is valid and not expired.
// Signatures are exact-match only: every component of the signed data
// (host, path, query, dimensions, format, expiration) must match exactly.
// No suffix matching, wildcard matching, or partial matching is supported.
// A signature for "cdn.example.com" will NOT verify for "example.com" or
// "other.cdn.example.com", and vice versa.
func (s *Signer) Verify(req *ImageRequest) error { func (s *Signer) Verify(req *ImageRequest) error {
// Check expiration first // Check expiration first
if req.Expires.IsZero() { if req.Expires.IsZero() {
@@ -66,6 +71,8 @@ func (s *Signer) Verify(req *ImageRequest) error {
// buildSignatureData creates the string to be signed. // buildSignatureData creates the string to be signed.
// Format: "host:path:query:width:height:format:expiration" // Format: "host:path:query:width:height:format:expiration"
// All components are used verbatim (exact match). No normalization,
// suffix matching, or wildcard expansion is performed.
func (s *Signer) buildSignatureData(req *ImageRequest) string { func (s *Signer) buildSignatureData(req *ImageRequest) string {
return fmt.Sprintf("%s:%s:%s:%d:%d:%s:%d", return fmt.Sprintf("%s:%s:%s:%d:%d:%s:%d",
req.SourceHost, req.SourceHost,

View File

@@ -152,6 +152,178 @@ func TestSigner_Verify(t *testing.T) {
} }
} }
// TestSigner_Verify_ExactMatchOnly verifies that signatures enforce exact
// matching on every URL component. No suffix matching, wildcard matching,
// or partial matching is supported.
func TestSigner_Verify_ExactMatchOnly(t *testing.T) {
signer := NewSigner("test-secret-key")
// Base request that we'll sign, then tamper with individual fields.
baseReq := func() *ImageRequest {
req := &ImageRequest{
SourceHost: "cdn.example.com",
SourcePath: "/photos/cat.jpg",
SourceQuery: "token=abc",
Size: Size{Width: 800, Height: 600},
Format: FormatWebP,
Expires: time.Now().Add(1 * time.Hour),
}
req.Signature = signer.Sign(req)
return req
}
tests := []struct {
name string
tamper func(req *ImageRequest)
}{
{
name: "parent domain does not match subdomain",
tamper: func(req *ImageRequest) {
// Signed for cdn.example.com, try example.com
req.SourceHost = "example.com"
},
},
{
name: "subdomain does not match parent domain",
tamper: func(req *ImageRequest) {
// Signed for cdn.example.com, try images.cdn.example.com
req.SourceHost = "images.cdn.example.com"
},
},
{
name: "sibling subdomain does not match",
tamper: func(req *ImageRequest) {
// Signed for cdn.example.com, try images.example.com
req.SourceHost = "images.example.com"
},
},
{
name: "host with suffix appended does not match",
tamper: func(req *ImageRequest) {
// Signed for cdn.example.com, try cdn.example.com.evil.com
req.SourceHost = "cdn.example.com.evil.com"
},
},
{
name: "host with prefix does not match",
tamper: func(req *ImageRequest) {
// Signed for cdn.example.com, try evilcdn.example.com
req.SourceHost = "evilcdn.example.com"
},
},
{
name: "different path does not match",
tamper: func(req *ImageRequest) {
req.SourcePath = "/photos/dog.jpg"
},
},
{
name: "path suffix does not match",
tamper: func(req *ImageRequest) {
req.SourcePath = "/photos/cat.jpg/extra"
},
},
{
name: "path prefix does not match",
tamper: func(req *ImageRequest) {
req.SourcePath = "/other/photos/cat.jpg"
},
},
{
name: "different query does not match",
tamper: func(req *ImageRequest) {
req.SourceQuery = "token=xyz"
},
},
{
name: "added query does not match empty query",
tamper: func(req *ImageRequest) {
req.SourceQuery = "extra=1"
},
},
{
name: "removed query does not match",
tamper: func(req *ImageRequest) {
req.SourceQuery = ""
},
},
{
name: "different width does not match",
tamper: func(req *ImageRequest) {
req.Size.Width = 801
},
},
{
name: "different height does not match",
tamper: func(req *ImageRequest) {
req.Size.Height = 601
},
},
{
name: "different format does not match",
tamper: func(req *ImageRequest) {
req.Format = FormatPNG
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
req := baseReq()
tt.tamper(req)
err := signer.Verify(req)
if err != ErrSignatureInvalid {
t.Errorf("Verify() = %v, want %v", err, ErrSignatureInvalid)
}
})
}
// Verify the unmodified base request still passes
t.Run("unmodified request passes", func(t *testing.T) {
req := baseReq()
if err := signer.Verify(req); err != nil {
t.Errorf("Verify() unmodified request failed: %v", err)
}
})
}
// TestSigner_Sign_ExactHostInData verifies that Sign uses the exact host
// string in the signature data, producing different signatures for
// suffix-related hosts.
func TestSigner_Sign_ExactHostInData(t *testing.T) {
signer := NewSigner("test-secret-key")
hosts := []string{
"cdn.example.com",
"example.com",
"images.example.com",
"images.cdn.example.com",
"cdn.example.com.evil.com",
}
sigs := make(map[string]string)
for _, host := range hosts {
req := &ImageRequest{
SourceHost: host,
SourcePath: "/photos/cat.jpg",
SourceQuery: "",
Size: Size{Width: 800, Height: 600},
Format: FormatWebP,
Expires: time.Unix(1704067200, 0),
}
sig := signer.Sign(req)
if existing, ok := sigs[sig]; ok {
t.Errorf("hosts %q and %q produced the same signature", existing, host)
}
sigs[sig] = host
}
}
func TestSigner_DifferentKeys(t *testing.T) { func TestSigner_DifferentKeys(t *testing.T) {
signer1 := NewSigner("secret-key-1") signer1 := NewSigner("secret-key-1")
signer2 := NewSigner("secret-key-2") signer2 := NewSigner("secret-key-2")

View File

@@ -15,6 +15,7 @@ import (
"time" "time"
"sneak.berlin/go/pixa/internal/database" "sneak.berlin/go/pixa/internal/database"
"sneak.berlin/go/pixa/internal/httpfetcher"
) )
// TestFixtures contains paths to test files in the mock filesystem. // TestFixtures contains paths to test files in the mock filesystem.
@@ -172,7 +173,7 @@ func SetupTestService(t *testing.T, opts ...TestServiceOption) (*Service, *TestF
svc, err := NewService(&ServiceConfig{ svc, err := NewService(&ServiceConfig{
Cache: cache, Cache: cache,
Fetcher: NewMockFetcher(mockFS), Fetcher: httpfetcher.NewMock(mockFS),
SigningKey: cfg.signingKey, SigningKey: cfg.signingKey,
Whitelist: cfg.whitelist, Whitelist: cfg.whitelist,
}) })

View File

@@ -1,4 +1,6 @@
package imgcache // Package magic detects image formats from magic bytes and validates
// content against declared MIME types.
package magic
import ( import (
"bytes" "bytes"
@@ -27,6 +29,20 @@ const (
MIMETypeSVG = MIMEType("image/svg+xml") MIMETypeSVG = MIMEType("image/svg+xml")
) )
// ImageFormat represents supported output image formats.
// This mirrors the type in imgcache to avoid circular imports.
type ImageFormat string
// Supported image output formats.
const (
FormatOriginal ImageFormat = "orig"
FormatJPEG ImageFormat = "jpeg"
FormatPNG ImageFormat = "png"
FormatWebP ImageFormat = "webp"
FormatAVIF ImageFormat = "avif"
FormatGIF ImageFormat = "gif"
)
// MinMagicBytes is the minimum number of bytes needed to detect format. // MinMagicBytes is the minimum number of bytes needed to detect format.
const MinMagicBytes = 12 const MinMagicBytes = 12
@@ -189,7 +205,7 @@ func PeekAndValidate(r io.Reader, declaredType string) (io.Reader, error) {
return io.MultiReader(bytes.NewReader(buf), r), nil return io.MultiReader(bytes.NewReader(buf), r), nil
} }
// MIMEToImageFormat converts a MIME type to our ImageFormat type. // MIMEToImageFormat converts a MIME type to an ImageFormat.
func MIMEToImageFormat(mimeType string) (ImageFormat, bool) { func MIMEToImageFormat(mimeType string) (ImageFormat, bool) {
normalized := normalizeMIMEType(mimeType) normalized := normalizeMIMEType(mimeType)
switch MIMEType(normalized) { switch MIMEType(normalized) {
@@ -208,7 +224,7 @@ func MIMEToImageFormat(mimeType string) (ImageFormat, bool) {
} }
} }
// ImageFormatToMIME converts our ImageFormat to a MIME type string. // ImageFormatToMIME converts an ImageFormat to a MIME type string.
func ImageFormatToMIME(format ImageFormat) string { func ImageFormatToMIME(format ImageFormat) string {
switch format { switch format {
case FormatJPEG: case FormatJPEG:

View File

@@ -1,4 +1,4 @@
package imgcache package magic
import ( import (
"bytes" "bytes"