Commit Graph

10 Commits

Author SHA1 Message Date
clawbot
25e27cc57f refactor: merge retry target type into http (max_retries=0 = fire-and-forget)
2026-03-01 23:51:55 -08:00
clawbot
536e5682d6 test: add comprehensive delivery engine and circuit breaker tests
Add unit tests for internal/delivery/ package covering:

Circuit breaker tests (circuit_breaker_test.go):
- Closed state allows deliveries
- Failure counting below threshold
- Open transition after threshold failures
- Cooldown blocks during cooldown period
- Half-open transition after cooldown expires
- Probe success closes circuit
- Probe failure reopens circuit
- Success resets failure counter
- Concurrent access safety (race-safe)
- CooldownRemaining for all states
- CircuitState String() output

Engine tests (engine_test.go):
- Non-blocking Notify when channel is full
- HTTP target success and failure delivery
- Database target immediate success
- Log target immediate success
- Retry target success with circuit breaker
- Max retries exhausted marks delivery failed
- Retry scheduling on failure
- Exponential backoff duration verification
- Backoff cap at shift 30
- Body pointer semantics (inline <16KB, nil >=16KB)
- Worker pool bounded concurrency
- Circuit breaker blocks delivery attempts
- Circuit breaker per-target creation
- HTTP config parsing (valid, empty, missing URL)
- scheduleRetry sends to retry channel
- scheduleRetry drops when channel full
- Header forwarding (forwardable vs hop-by-hop)
- processDelivery routing to correct handler
- Truncate helper function

All tests use real SQLite databases and httptest servers.
All tests pass with -race flag.
2026-03-01 23:16:30 -08:00
clawbot
10db6c5b84 refactor: bounded worker pool with DB-mediated retry fallback
Replace unbounded goroutine-per-delivery fan-out with a fixed-size
worker pool (10 workers). Channels serve as bounded queues (10,000
buffer). Workers are the only goroutines doing HTTP delivery.

When the retry channel overflows, timers are dropped instead of re-armed.
The delivery stays in 'retrying' status in the DB and a periodic sweep
(every 60s) recovers orphaned retries. The database is the durable
fallback — same path used on startup recovery.

Addresses owner feedback on circuit breaker recovery goroutine flood.
2026-03-01 22:52:27 -08:00
clawbot
9b4ae41c44 feat: parallel fan-out delivery + circuit breaker for retry targets
- Fan out all targets for an event in parallel goroutines (fire-and-forget)
- Add per-target circuit breaker for retry targets (closed/open/half-open)
- Circuit breaker trips after 5 consecutive failures, 30s cooldown
- Open circuit skips delivery and reschedules after cooldown
- Half-open allows one probe delivery to test recovery
- HTTP/database/log targets unaffected (no circuit breaker)
- Recovery path also fans out in parallel
- Update README with parallel delivery and circuit breaker docs
2026-03-01 22:20:33 -08:00
clawbot
32bd40b313 refactor: self-contained delivery tasks — engine delivers without DB reads in happy path
The webhook handler now builds DeliveryTask structs carrying all target
config and event data inline (for bodies <16KB) and sends them through
the delivery channel. In the happy path, the engine delivers without
reading from any database — it only writes to record delivery results.

For large bodies (≥16KB), Body is nil and the engine fetches it from the
per-webhook database on demand. Retry timers also carry the full
DeliveryTask, so retries avoid unnecessary DB reads.

The database is used for crash recovery only: on startup the engine scans
for interrupted pending/retrying deliveries and re-queues them.

Implements owner feedback from issue #15:
> the message in the <=16KB case should have everything it needs to do
> its delivery. it shouldn't touch the db until it has a success or
> failure to record.
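
The inline-versus-nil body rule can be sketched as below. Only the DeliveryTask name, the *string Body, and the 16KB threshold come from this log; the other fields, newDeliveryTask, resolveBody, and the fetch callback are hypothetical. The sketch assumes bodies strictly under 16KB are inlined, which matches the ">= 16KB means nil" wording used elsewhere in the history.

```go
package main

// inlineBodyLimit is the 16KB threshold named in the commit message.
const inlineBodyLimit = 16 * 1024

// DeliveryTask carries what a worker needs to deliver without a DB read.
// DeliveryID and TargetURL are illustrative field names.
type DeliveryTask struct {
	DeliveryID string
	TargetURL  string
	Body       *string // nil means "fetch from the per-webhook DB on demand"
}

// newDeliveryTask inlines the body only when it is under the limit.
func newDeliveryTask(deliveryID, targetURL, body string) DeliveryTask {
	t := DeliveryTask{DeliveryID: deliveryID, TargetURL: targetURL}
	if len(body) < inlineBodyLimit {
		t.Body = &body
	}
	return t
}

// resolveBody returns the inline body when present. Otherwise it calls
// fetch, a stand-in for the per-webhook database lookup.
func (t DeliveryTask) resolveBody(fetch func(deliveryID string) (string, error)) (string, error) {
	if t.Body != nil {
		return *t.Body, nil
	}
	return fetch(t.DeliveryID)
}
```

Using a pointer rather than an empty string keeps "no body was inlined" distinct from "the body is empty", so an empty payload never triggers a DB fetch.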
2026-03-01 22:09:41 -08:00
clawbot
5e683af2a4 refactor: event-driven delivery engine with channel notifications and timer-based retries
Replace the polling-based delivery engine with a fully event-driven
architecture using Go channels and goroutines:

- Webhook handler notifies engine via buffered channel after creating
  delivery records, with inline event data for payloads < 16KB
- Large payloads (>= 16KB) use pointer semantics (Body *string = nil)
  and are fetched from DB on demand, keeping channel memory bounded
- Failed retry-target deliveries schedule Go timers with exponential
  backoff; timers fire into a separate retry channel when ready
- On startup, engine scans DB once to recover interrupted deliveries
  (pending processed immediately, retrying get timers for remaining
  backoff)
- DB stores delivery status for crash recovery only, not for
  inter-component communication during normal operation
- delivery.Notifier interface decouples handlers from engine; fx wires
  *Engine as Notifier

No more periodic polling. No more wasted cycles when idle.
2026-03-01 21:46:16 -08:00
clawbot
43c22a9e9a feat: implement per-webhook event databases
Split data storage into main application DB (config only) and
per-webhook event databases (one SQLite file per webhook).

Architecture changes:
- New WebhookDBManager component manages per-webhook DB lifecycle
  (create, open, cache, delete) with lazy connection pooling via sync.Map
- Main DB (DBURL) stores only config: Users, Webhooks, Entrypoints,
  Targets, APIKeys
- Per-webhook DBs (DATA_DIR) store Events, Deliveries, DeliveryResults
  in files named events-{webhook_uuid}.db
- New DATA_DIR env var (default: ./data in dev, /data/events in prod)

Behavioral changes:
- Webhook creation creates per-webhook DB file
- Webhook deletion hard-deletes per-webhook DB file (config soft-deleted)
- Event ingestion writes to per-webhook DB, not main DB
- Delivery engine polls all per-webhook DBs for pending deliveries
- Database target type marks delivery as immediately successful (events
  are already in the dedicated per-webhook DB)
- Event log UI reads from per-webhook DBs with targets from main DB
- Existing webhooks without DB files get them created lazily

Removed:
- ArchivedEvent model (was a half-measure, replaced by per-webhook DBs)
- Event/Delivery/DeliveryResult removed from main DB migrations

Added:
- Comprehensive tests for WebhookDBManager (create, delete, lazy
  creation, delivery workflow, multiple webhooks, close all)
- Dockerfile creates /data/events directory

README updates:
- Per-webhook event databases documented as implemented (was Phase 2)
- DATA_DIR added to configuration table
- Docker instructions updated with data volume mount
- Data model diagram updated
- TODO updated (database separation moved to completed)

Closes #15
2026-03-01 17:06:43 -08:00
clawbot
6c393ccb78 fix: database target writes to dedicated archive table
The "database" target type now writes events to a separate
archived_events table instead of just marking the delivery as done.
This table persists independently of internal event retention/pruning,
allowing the data to be consumed by external systems or preserved
indefinitely.

New ArchivedEvent model copies the full event payload (method, headers,
body, content_type) along with webhook/entrypoint/event/target IDs.
2026-03-01 16:40:27 -08:00
clawbot
d4fbd6c110 fix: delivery engine nil pointer crash on startup (closes #17)
Store the *database.Database wrapper instead of calling .DB() eagerly
at construction time. The GORM *gorm.DB is only available after the
database's OnStart hook runs, but the engine constructor runs during
fx resolution (before OnStart). Accessing .DB() lazily via the wrapper
avoids the nil pointer panic.
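
The fix pattern, reduced to its essentials. All types below are stand-ins (conn for *gorm.DB, database for the *database.Database wrapper, start for the OnStart hook); the sketch only shows why holding the wrapper and calling DB() at use time avoids capturing a nil connection at construction time.

```go
package main

// conn stands in for *gorm.DB.
type conn struct{ name string }

// database stands in for the wrapper whose connection is only set once
// its start hook has run.
type database struct{ db *conn }

// start mimics the OnStart hook that creates the connection.
func (d *database) start() { d.db = &conn{name: "main"} }

// DB returns the connection, or nil before start has run.
func (d *database) DB() *conn { return d.db }

// lazyEngine stores the wrapper, not the connection.
type lazyEngine struct{ database *database }

// newLazyEngine runs at construction time, possibly before start. It must
// not call d.DB() here, because the result would be nil.
func newLazyEngine(d *database) *lazyEngine { return &lazyEngine{database: d} }

// connName resolves the connection at call time, after start has run.
func (e *lazyEngine) connName() string { return e.database.DB().name }
```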
2026-03-01 16:34:16 -08:00
clawbot
7f8469a0f2 feat: implement core webhook engine, delivery system, and management UI (Phase 2)
- Webhook reception handler: look up entrypoint by UUID, verify active,
  capture full HTTP request (method, headers, body, content-type), create
  Event record, queue Delivery records for each active Target, return 200 OK.
  Handles edge cases: unknown UUID → 404, inactive → 410, oversized → 413.

- Delivery engine (internal/delivery): fx-managed background goroutine that
  polls for pending/retrying deliveries and dispatches to target type handlers.
  Graceful shutdown via context cancellation.

- Target type implementations:
  - HTTP: fire-and-forget POST with original headers forwarding
  - Retry: exponential backoff (1s, 2s, 4s...) up to max_retries
  - Database: immediate success (event already stored)
  - Log: slog output with event details

- Webhook management pages with Tailwind CSS + Alpine.js:
  - List (/sources): webhooks with entrypoint/target/event counts
  - Create (/sources/new): form with auto-created default entrypoint
  - Detail (/source/{id}): config, entrypoints, targets, recent events
  - Edit (/source/{id}/edit): name, description, retention_days
  - Delete (/source/{id}/delete): soft-delete with child records
  - Add Entrypoint (/source/{id}/entrypoints): inline form
  - Add Target (/source/{id}/targets): type-aware form
  - Event Log (/source/{id}/logs): paginated with delivery status

- Updated README: marked completed items, updated naming conventions
  table, added delivery engine to package layout and DI docs, updated
  column names to reflect entity rename.

- Rebuilt Tailwind CSS for new template classes.

Part of: #15
2026-03-01 16:14:28 -08:00