Bound imageprocessor.Process input read to prevent unbounded memory use #37
Reference in New Issue
Block a user
Delete Branch "fix/bounded-processor-read"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
closes #31
Problem
ImageProcessor.Processusedio.ReadAll(input)without any size limit, allowing arbitrarily large inputs to exhaust all available memory. This is a DoS vector — even though the upstream fetcher has aMaxResponseSizelimit (50 MiB), the processor interface accepts anyio.Readerand should defend itself independently.Additionally, the service layer's
processFromSourceOrFetchread cached source content withio.ReadAllwithout a bound, so an unexpectedly large cached file could also cause unbounded memory consumption.Changes
Processor (
processor.go)maxInputBytesfield toImageProcessor(configurable, defaults to 50 MiB viaDefaultMaxInputBytes)NewImageProcessornow accepts amaxInputBytesparameter (0 or negative uses the default)Processnow wraps the input reader withio.LimitReaderand rejects inputs exceeding the limit withErrInputDataTooLargeDefaultMaxInputBytesandErrInputDataTooLargeexported constants/errorsService (
service.go)NewServicenow wires the fetcher'sMaxResponseSizethrough to the processorloadCachedSourcehelper method to flatten nesting inprocessFromSourceOrFetchmaxResponseSize— oversized cached files are discarded and re-fetchedTests (
processor_test.go)TestImageProcessor_RejectsOversizedInputData— verifies that inputs exceedingmaxInputBytesare rejected withErrInputDataTooLargeTestImageProcessor_AcceptsInputWithinLimit— verifies that inputs within the limit are processed normallyTestImageProcessor_DefaultMaxInputBytes— verifies that 0 and negative values use the defaultNewImageProcessor(0)(default limit)Code Review: PR #37 — Bound imageprocessor.Process input read
Policy Compliance
.golangci.ymlunchanged ✅Makefileunchanged ✅Dockerfileunchanged ✅.gitea/workflows/unchanged ✅processor.go,processor_test.go,service.go) — all ininternal/imgcache/✅internal/package ✅Requirements Checklist (Issue #31)
imageprocessor.Processmust not do unbounded read into memoryio.LimitReader(input, p.maxInputBytes+1)bounds the readErrInputDataTooLargesentinel errormaxInputBytesparam, default 50 MiB viaDefaultMaxInputBytes@clawbot pingCode Analysis
processor.go — Core fix:
io.LimitReader(input, p.maxInputBytes+1)+len(data) > p.maxInputBytescheck is the standard bounded-read pattern. The +1 allows detecting overflow without consuming extra memory beyond the limit. Correct.NewImageProcessor(maxInputBytes int64)— breaking API change is fine since it'sinternal/package. All callers updated.DefaultMaxResponseSizein the fetcher — consistent.service.go — Cached source bounding:
loadCachedSourcehelper applies the sameLimitReaderpattern to cached source reads. Directly related to the issue — cached sources also feed intoProcess, so bounding them prevents the same class of problem.reader.Close()is called unconditionally afterio.ReadAll, including on error paths. Correct.maxResponseSizewired fromfetcherCfg.MaxResponseSizeto both the processor and the service. Clean.fetcherCfgresolution moved earlier to extract defaults even when a customFetcheris injected. Correct — previously a custom fetcher with nilFetcherConfigwould skip default resolution.processor_test.go — Test coverage:
TestImageProcessor_RejectsOversizedInputData— small limit (1024), larger JPEG, verifiesErrInputDataTooLarge✅TestImageProcessor_AcceptsInputWithinLimit— 10× headroom, verifies processing succeeds ✅TestImageProcessor_DefaultMaxInputBytes— verifies 0 and -1 both produceDefaultMaxInputBytes✅NewImageProcessor()toNewImageProcessor(0)✅Edge cases considered:
len(data) == maxInputBytes→ not greater → passes ✅maxInputBytes+1bytes → rejected ✅LimitReadercaps atmaxInputBytes+1regardless → memory bounded ✅maxInputBytes <= 0: defaults to 50 MiB ✅README Consistency
No README update needed. The processor byte limit is an internal implementation detail, not a user-facing configuration. The existing
upstream_max_response_sizeconfig option in README remains accurate.Build Result
Verdict: PASS ✅
Clean, focused fix. Standard
io.LimitReaderpattern correctly applied to both the processor's direct input and the service's cached source path. Tests are comprehensive. No policy violations, no scope creep, no cheating.shouldn’t it be imageprocessor.New() and a Params instead of a bare positional argument?
871972f726tod36e511032Rework: Constructor refactored to params pattern
Addressed review feedback — refactored the image processor constructor from a bare positional argument to a params struct:
Changes
processor.go:Paramsstruct withMaxInputBytes int64fieldNewImageProcessor(maxInputBytes int64)→New(params Params)Params{}usesDefaultMaxInputBytes(50 MiB)service.go:NewImageProcessor(maxResponseSize)→New(Params{MaxInputBytes: maxResponseSize})processor_test.go:NewImageProcessor(0)→New(Params{})NewImageProcessor(limit)→New(Params{MaxInputBytes: limit})NewImageProcessor(-1)→New(Params{MaxInputBytes: -1})Verification
make fmt✅docker build .✅ (fmt-check, lint, all tests pass, binary builds)Code Review: PR #37 — Post-Rework Review
Policy Compliance
.golangci.ymlunchangedMakefileunchangedDockerfileunchanged.gitea/workflows/unchanged.dockerignoreunchanged@sha256:with version commentsinternal/package files changedprocessor.go,processor_test.go,service.go)New()per Go StyleguideRequirements Checklist (Issue #31)
imageprocessor.Processmust not do unbounded read into memoryio.LimitReader(input, p.maxInputBytes+1)bounds the readErrInputDataTooLargesentinel errorParams{MaxInputBytes}, default 50 MiB viaDefaultMaxInputBytesloadCachedSourceapplies sameLimitReaderpatternSneak's Rework Request
New()(notNewImageProcessor())Paramsstruct instead of bare positionalint64Params{}gives sensible defaultsDefaultMaxInputBytes= 50 MiB)NewImageProcessorcompletely removed from codebaseCode Analysis
processor.go:
io.LimitReader(input, p.maxInputBytes+1)+int64(len(data)) > p.maxInputBytesis the standard bounded-read pattern. The +1 allows detecting overflow without consuming extra memory beyond the limit. Correct.Paramsstruct withMaxInputBytesfield — clean, extensible. Negative/zero values default to 50 MiB.ImageProcessoris safe for concurrent use (immutable after construction).service.go:
fetcherCfgresolution moved earlier so defaults are available even when a customFetcheris injected. Necessary formaxResponseSizewiring.loadCachedSourcehelper correctly appliesLimitReaderto cached reads, closesreaderunconditionally afterReadAll(including on error paths), and discards oversized cached files (triggering a re-fetch). Correct.maxResponseSizewired fromfetcherCfg.MaxResponseSizeto both processor (Params{MaxInputBytes: maxResponseSize}) and service field. Consistent.allowHTTPcheck still reads fromcfg.FetcherConfig(user-supplied), notfetcherCfg(potentially defaulted). This is pre-existing behavior and correct — default config should not enable HTTP.processor_test.go:
TestImageProcessor_RejectsOversizedInputData— 1024-byte limit, larger JPEG, verifiesErrInputDataTooLarge✅TestImageProcessor_AcceptsInputWithinLimit— 10× headroom, verifies processing succeeds ✅TestImageProcessor_DefaultMaxInputBytes— verifies 0 and -1 both produceDefaultMaxInputBytes✅NewImageProcessor(0)toNew(Params{})— no test assertions weakened ✅TestImageProcessor_ImplementsInterfacestill verifies interface compliance ✅Edge cases:
len(data) == maxInputBytes→ not greater → passes ✅maxInputBytes+1bytes → rejected ✅LimitReadercaps atmaxInputBytes+1→ memory bounded ✅MaxInputBytes <= 0: defaults to 50 MiB ✅README Consistency
No README update needed.
Params,New(), andmaxInputBytesare internal implementation details. The existingupstream_max_response_sizeconfig option in README remains accurate.Build Result
Verdict: PASS ✅
Clean, focused fix with correct rework.
NewImageProcessor(int64)fully replaced withNew(Params{})per sneak's feedback. Standardio.LimitReaderpattern correctly applied to both processor input and cached source reads. Tests comprehensive and properly updated. No policy violations, no scope creep, no cheating.is there a good reason Server and ImageProcessor are in the same package?
*Service
Good question — looking at the package,
internal/imgcache/currently contains the image processor, HTTP fetcher, host whitelist, service orchestrator, and server/handlers all in one package. TheImageProcessoris a distinct concern (image format conversion/resizing) from the HTTP serving layer.This is pre-existing structure though, not introduced by this PR. Want me to open a separate issue to split
ImageProcessorinto its owninternal/imageprocessor/package?yes, and create a new issue documenting a list of every other item that should be moved out into its own focused module. it doesn’t have to be everything if stuff is tightly coupled but anything that can easily be pulled out, should be. meanwhile for this issue let’s just pull out the one.
Reworked per review feedback. Changes:
Extracted
imageprocessorinto its own packageinternal/imageprocessor/imageprocessor.go— standalone package with its ownFormat,FitMode,Size,Request, andResulttypes. No imports fromimgcache, avoiding circular dependencies.internal/imageprocessor/imageprocessor_test.go— all processor tests moved here with a self-containeddetectMIMEhelper (no dependency onimgcache.DetectFormat).internal/imgcache/processor.goandinternal/imgcache/processor_test.goProcessorinterface andProcessResultfromimgcache.go(now unused)service.gostores*imageprocessor.ImageProcessordirectly and convertsImageRequest→imageprocessor.Requestat the boundaryCallers use:
imageprocessor.New(imageprocessor.Params{MaxInputBytes: ...})Created tracking issue for further refactoring
Issue #39: Refactor: split internal/imgcache into focused packages — lists every concern in
internal/imgcache/with analysis of which are easily separable (fetcher, signature, magic, whitelist, urlparser) vs tightly coupled (cache, storage, service, shared types).Verification
docker build .passes — all tests green, lint clean, build succeeds.Code Review: PR #37 — Post-Rework v3 (imageprocessor extraction)
Requirements Checklist (Issue #31)
imageprocessor.Processbounded readio.LimitReader(input, p.maxInputBytes+1)ErrInputDataTooLargesentinel errorParams{MaxInputBytes}, default 50 MiBloadCachedSourceapplies sameLimitReaderpatternSneak's Feedback — All Addressed
New()(notNewImageProcessor())Paramsstruct (not bare positionalint64)ImageProcessorextracted tointernal/imageprocessor/Policy Compliance
.golangci.ymlunchangedMakefileunchangedDockerfileunchanged.gitea/workflows/unchangedNew()per Go StyleguideParamsstruct per Go StyleguideimageprocessorPackage Extraction Analysis
Clean separation verified:
internal/imageprocessor/imports only stdlib + govips — zero imports fromimgcache✅internal/imgcache/service.goimportsimageprocessor— one-way dependency, no circular risk ✅Processorinterface andProcessResultremoved fromimgcache.go✅processor.goandprocessor_test.gofully deleted fromimgcache/✅service.goconvertsImageRequest→imageprocessor.Requestat boundary — clean type translation via compatible underlying string types ✅Type duplication (
imgcache.ImageFormat/FitMode/Sizevsimageprocessor.Format/FitMode/Size): Correct by design. Each package owns its domain types; conversion happens at the service boundary. Avoids coupling the HTTP-facing types to the processor internals.Code Correctness
io.LimitReader(input, p.maxInputBytes+1)+int64(len(data)) > p.maxInputBytes— standard bounded-read pattern. The +1 detects overflow without consuming extra memory. ✅New(Params{})defaultsMaxInputBytestoDefaultMaxInputBytes(50 MiB) for zero/negative values. MatchesDefaultMaxResponseSizein fetcher. ✅loadCachedSourcebounds cached reads AND closes the reader unconditionally (including error paths). ✅maxResponseSizewired fromfetcherCfg.MaxResponseSizeto both processor and service field. Default is 50 MiB. ✅ImageProcessoris safe for concurrent use (immutable after construction). ✅Test Coverage
internal/imageprocessor/— all migrated correctlyRejectsOversizedInputData,AcceptsInputWithinLimit,DefaultMaxInputBytesTestImageProcessor_ImplementsInterfacecorrectly removed (interface removed)detectMIMEtest helper avoids cross-package dependency onimgcache.DetectFormat✅testdata/red.avifcorrectly duplicated tointernal/imageprocessor/testdata/✅Minor Nit (non-blocking)
internal/imgcache/testdata/red.avif(281 bytes) is now orphaned — the test that used it (TestImageProcessor_DecodeAVIF) moved toimageprocessor. Could be cleaned up in a follow-up.Build Result
Verdict: PASS ✅
Clean, correct extraction. All three rounds of feedback fully addressed: bounded reads (v1), Params constructor (v2), package extraction (v3). Standard
io.LimitReaderpattern correctly applied at both processor and cached source layers. Tests comprehensive and properly migrated. No policy violations, no scope creep, no cheating.@clawbot please fix the nit - sloppy and unprofessional