feat: implement per-webhook event databases

Split data storage into main application DB (config only) and
per-webhook event databases (one SQLite file per webhook).

Architecture changes:
- New WebhookDBManager component manages per-webhook DB lifecycle
  (create, open, cache, delete) with lazy connection pooling via sync.Map
- Main DB (DBURL) stores only config: Users, Webhooks, Entrypoints,
  Targets, APIKeys
- Per-webhook DBs (DATA_DIR) store Events, Deliveries, DeliveryResults
  in files named events-{webhook_uuid}.db
- New DATA_DIR env var (default: ./data dev, /data/events prod)

Behavioral changes:
- Webhook creation creates per-webhook DB file
- Webhook deletion hard-deletes per-webhook DB file (config soft-deleted)
- Event ingestion writes to per-webhook DB, not main DB
- Delivery engine polls all per-webhook DBs for pending deliveries
- Database target type marks delivery as immediately successful (events
  are already in the dedicated per-webhook DB)
- Event log UI reads from per-webhook DBs with targets from main DB
- Existing webhooks without DB files get them created lazily

Removed:
- ArchivedEvent model (was a half-measure, replaced by per-webhook DBs)
- Event/Delivery/DeliveryResult removed from main DB migrations

Added:
- Comprehensive tests for WebhookDBManager (create, delete, lazy
  creation, delivery workflow, multiple webhooks, close all)
- Dockerfile creates /data/events directory

README updates:
- Per-webhook event databases documented as implemented (was Phase 2)
- DATA_DIR added to configuration table
- Docker instructions updated with data volume mount
- Data model diagram updated
- TODO updated (database separation moved to completed)

Closes #15
clawbot
2026-03-01 17:06:43 -08:00
parent 6c393ccb78
commit 43c22a9e9a
13 changed files with 814 additions and 198 deletions

README.md

@@ -66,7 +66,8 @@ Configuration is resolved in this order (highest priority first):
| ----------------------- | ----------------------------------- | -------- |
| `WEBHOOKER_ENVIRONMENT` | `dev` or `prod` | `dev` |
| `PORT` | HTTP listen port | `8080` |
| `DBURL` | SQLite database connection string | *(required)* |
| `DBURL` | SQLite connection string (main app DB) | *(required)* |
| `DATA_DIR` | Directory for per-webhook event DBs | `./data` (dev) / `/data/events` (prod) |
| `SESSION_KEY` | Base64-encoded 32-byte session key | *(required in prod)* |
| `DEBUG` | Enable debug logging | `false` |
| `METRICS_USERNAME` | Basic auth username for `/metrics` | `""` |
@@ -84,6 +85,7 @@ docker run -d \
-p 8080:8080 \
-v /path/to/data:/data \
-e DBURL="file:/data/webhooker.db?cache=shared&mode=rwc" \
-e DATA_DIR="/data/events" \
-e SESSION_KEY="<base64-encoded-32-byte-key>" \
-e WEBHOOKER_ENVIRONMENT=prod \
webhooker:latest
@@ -91,7 +93,10 @@ docker run -d \
The container runs as a non-root user (`webhooker`, UID 1000), exposes
port 8080, and includes a health check against
`/.well-known/healthcheck`.
`/.well-known/healthcheck`. The `/data` volume holds both the main
application database and the per-webhook event databases (in
`/data/events/`). Mount this as a persistent volume to preserve data
across container restarts.
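
The `DATA_DIR` defaults listed in the configuration table can be sketched as follows. This is a minimal illustration, not the project's actual config code; the function name `resolveDataDir` is hypothetical, and only the two default paths and the environment names come from the README.

```go
package main

import (
	"fmt"
	"os"
)

// resolveDataDir returns the directory for per-webhook event databases.
// An explicit DATA_DIR value wins; otherwise the default depends on the
// environment ("./data" in dev, "/data/events" in prod).
// The function name and signature are hypothetical.
func resolveDataDir(environment, dataDir string) string {
	if dataDir != "" {
		return dataDir
	}
	if environment == "prod" {
		return "/data/events"
	}
	return "./data"
}

func main() {
	dir := resolveDataDir(os.Getenv("WEBHOOKER_ENVIRONMENT"), os.Getenv("DATA_DIR"))
	fmt.Println("event database directory:", dir)
}
```

In the Docker example above, `DATA_DIR` is set explicitly, so the environment default would not be consulted.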
## Rationale
@@ -195,7 +200,7 @@ tier** (event ingestion, delivery, and logging).
┌─────────────────────────────────────────────────────────────┐
│ EVENT TIER │
│ (planned: per-webhook dedicated database) │
│ (per-webhook dedicated databases) │
│ │
│ ┌──────────┐ ┌──────────┐ ┌─────────────────┐ │
│ │ Event │──1:N──│ Delivery │──1:N──│ DeliveryResult │ │
@@ -286,8 +291,10 @@ events should be forwarded.
Fire-and-forget: a single attempt with no retries.
- **`retry`** — Forward the event via HTTP POST with automatic retry on
failure. Uses exponential backoff up to `max_retries` attempts.
- **`database`** — Store the event in the webhook's database only (no
external delivery). Useful for pure logging/archival.
- **`database`** — Confirm the event is stored in the webhook's
per-webhook database (no external delivery). Since events are always
written to the per-webhook DB on ingestion, this target marks delivery
as immediately successful. Useful for ensuring durable event archival.
- **`log`** — Write the event to the application log (stdout). Useful
for debugging.
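
For the `retry` target, the README only states that exponential backoff is used up to `max_retries` attempts. A minimal sketch of such a schedule is shown below, assuming a doubling delay with a cap; the one-second base, the one-minute cap, and the name `backoffDelay` are illustrative assumptions, not values taken from webhooker.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical parameters; the real engine may use different values.
const (
	baseDelay = time.Second
	maxDelay  = time.Minute
)

// backoffDelay returns the wait before the given retry attempt (1-based).
// The delay doubles on every attempt and is clamped to limit.
func backoffDelay(attempt int, base, limit time.Duration) time.Duration {
	delay := base
	for i := 1; i < attempt; i++ {
		delay *= 2
		if delay >= limit {
			return limit // stop early, which also avoids overflow
		}
	}
	if delay > limit {
		return limit
	}
	return delay
}

func main() {
	for attempt := 1; attempt <= 8; attempt++ {
		fmt.Printf("retry %d after %s\n", attempt, backoffDelay(attempt, baseDelay, maxDelay))
	}
}
```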
@@ -384,21 +391,13 @@ All entities include these fields from `BaseModel`:
### Database Architecture
#### Current Implementation
#### Per-Webhook Event Databases
webhooker currently uses a **single SQLite database** for all data —
application configuration, user accounts, and (once implemented) event
storage. The database connection is managed by GORM with a single
connection string configured via `DBURL`. On first startup the database
is auto-migrated and an `admin` user is created.
webhooker uses **separate SQLite database files**: a main application
database for configuration data and per-webhook databases for event
storage.
#### Planned: Per-Webhook Event Databases (Phase 2)
In a future phase (see TODO Phase 2 below), webhooker will split into
**separate SQLite database files**: a main application database for
configuration data and per-webhook databases for event storage.
**Main Application Database** — will store:
**Main Application Database** (`DBURL`) — stores configuration only:
- **Users** — accounts and Argon2id password hashes
- **Webhooks** — webhook configurations
@@ -406,14 +405,22 @@ configuration data and per-webhook databases for event storage.
- **Targets** — delivery destination configurations
- **APIKeys** — programmatic access credentials
**Per-Webhook Event Databases** — each webhook will get its own
dedicated SQLite file containing:
On first startup the main database is auto-migrated and an `admin` user
is created.
**Per-Webhook Event Databases** (`DATA_DIR`) — each webhook gets its own
dedicated SQLite file named `events-{webhook_uuid}.db`, containing:
- **Events** — captured incoming webhook payloads
- **Deliveries** — event-to-target pairings and their status
- **DeliveryResults** — individual delivery attempt logs
This planned separation will provide:
Per-webhook databases are created automatically when a webhook is
created (and lazily on first access for webhooks that predate this
feature). They are managed by the `WebhookDBManager` component, which
handles connection pooling, lazy opening, migrations, and cleanup.
This separation provides:
- **Isolation** — a high-volume webhook won't cause lock contention or
WAL bloat affecting the main application or other webhooks.
@@ -421,14 +428,21 @@ This planned separation will provide:
backed up, archived, rotated, or size-limited without impacting the
application.
- **Clean deletion** — removing a webhook and all its history is as
simple as deleting one file.
simple as deleting one file. Configuration is soft-deleted in the main
DB; the event database file is hard-deleted (permanently removed).
- **Per-webhook retention** — the `retention_days` field on each webhook
will control automatic cleanup of old events in that webhook's
database only.
- **Performance** — each webhook's database will have its own WAL, its
own page cache, and its own lock, so concurrent event ingestion across
controls automatic cleanup of old events in that webhook's database
only.
- **Performance** — each webhook's database has its own WAL, its own
page cache, and its own lock, so concurrent event ingestion across
webhooks won't contend.
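
The "clean deletion" point can be illustrated with a short sketch of the hard-delete step. It assumes the `events-{webhook_uuid}.db` naming described above; the function name `deleteEventDB` is hypothetical, and removing the `-wal` and `-shm` sidecar files that SQLite creates in WAL mode is an assumption about what a thorough cleanup would do, not something the commit states.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// deleteEventDB permanently removes a webhook's event database file.
// It also removes SQLite's WAL sidecar files if present (assumption).
// A file that is already missing is not treated as an error.
func deleteEventDB(dataDir, webhookID string) error {
	base := filepath.Join(dataDir, "events-"+webhookID+".db")
	for _, path := range []string{base, base + "-wal", base + "-shm"} {
		if err := os.Remove(path); err != nil && !errors.Is(err, fs.ErrNotExist) {
			return fmt.Errorf("remove %s: %w", path, err)
		}
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "webhooker-demo")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "events-abc.db")
	if err := os.WriteFile(path, nil, 0o600); err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("delete error:", deleteEventDB(dir, "abc"))
	_, statErr := os.Stat(path)
	fmt.Println("file gone:", errors.Is(statErr, fs.ErrNotExist))
}
```

Any cached connection for the webhook would need to be closed before the file is removed; that step is omitted here.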
The **database target type** leverages this architecture: since events
are already stored in the per-webhook database by design, the database
target simply marks the delivery as immediately successful. The
per-webhook DB IS the dedicated event database — that's the whole point
of the database target type.
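
A minimal sketch of that behaviour, under the assumption that a delivery's starting status is chosen from its target type: the constant names and string values below are hypothetical, since the diff mentions the `TargetType` and `DeliveryStatus` enums but does not show their members.

```go
package main

import "fmt"

// TargetType and DeliveryStatus mirror the enums named in the README;
// the concrete values here are illustrative assumptions.
type TargetType string

type DeliveryStatus string

const (
	TargetTypeHTTP     TargetType = "http"
	TargetTypeRetry    TargetType = "retry"
	TargetTypeDatabase TargetType = "database"
	TargetTypeLog      TargetType = "log"

	DeliveryStatusPending   DeliveryStatus = "pending"
	DeliveryStatusDelivered DeliveryStatus = "delivered"
)

// initialStatus returns the status a new delivery starts with.
// A database target needs no further work because the event was already
// written to the per-webhook database during ingestion.
func initialStatus(t TargetType) DeliveryStatus {
	if t == TargetTypeDatabase {
		return DeliveryStatusDelivered
	}
	return DeliveryStatusPending
}

func main() {
	for _, t := range []TargetType{TargetTypeHTTP, TargetTypeRetry, TargetTypeDatabase, TargetTypeLog} {
		fmt.Printf("%-8s -> %s\n", t, initialStatus(t))
	}
}
```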
The database uses the
[modernc.org/sqlite](https://pkg.go.dev/modernc.org/sqlite) driver at
runtime, though CGO is required at build time due to the transitive
@@ -549,16 +563,17 @@ webhooker/
│ ├── database/
│ │ ├── base_model.go # BaseModel with UUID primary keys
│ │ ├── database.go # GORM connection, migrations, admin seed
│ │ ├── models.go # AutoMigrate for all models
│ │ ├── models.go # AutoMigrate for config-tier models
│ │ ├── model_user.go # User entity
│ │ ├── model_webhook.go # Webhook entity
│ │ ├── model_entrypoint.go # Entrypoint entity
│ │ ├── model_target.go # Target entity and TargetType enum
│ │ ├── model_event.go # Event entity
│ │ ├── model_delivery.go # Delivery entity and DeliveryStatus enum
│ │ ├── model_delivery_result.go # DeliveryResult entity
│ │ ├── model_event.go # Event entity (per-webhook DB)
│ │ ├── model_delivery.go # Delivery entity (per-webhook DB)
│ │ ├── model_delivery_result.go # DeliveryResult entity (per-webhook DB)
│ │ ├── model_apikey.go # APIKey entity
│ │ └── password.go # Argon2id hashing and verification
│ │ ├── password.go # Argon2id hashing and verification
│ │ └── webhook_db_manager.go # Per-webhook DB lifecycle manager
│ ├── globals/
│ │ └── globals.go # Build-time variables (appname, version, arch)
│ ├── delivery/
@@ -604,13 +619,16 @@ Components are wired via Uber fx in this order:
1. `globals.New` — Build-time variables (appname, version, arch)
2. `logger.New` — Structured logging (slog with TTY detection)
3. `config.New` — Configuration loading (pkg/config + environment)
4. `database.New` — SQLite connection, migrations, admin user seed
5. `healthcheck.New` — Health check service
6. `session.New` — Cookie-based session manager
7. `handlers.New` — HTTP handlers
8. `middleware.New` — HTTP middleware
9. `delivery.New` — Background delivery engine
10. `server.New` — HTTP server and router
4. `database.New` — Main SQLite connection, config migrations, admin
user seed
5. `database.NewWebhookDBManager` — Per-webhook event database
lifecycle manager
6. `healthcheck.New` — Health check service
7. `session.New` — Cookie-based session manager
8. `handlers.New` — HTTP handlers
9. `middleware.New` — HTTP middleware
10. `delivery.New` — Background delivery engine
11. `server.New` — HTTP server and router
The server starts via `fx.Invoke(func(*server.Server, *delivery.Engine)
{})` which triggers the fx lifecycle hooks in dependency order.
@@ -657,7 +675,8 @@ The Dockerfile uses a multi-stage build:
1. **Builder stage** (Debian-based `golang:1.24`) — installs
golangci-lint, downloads dependencies, copies source, runs `make
check` (format verification, linting, tests, compilation).
2. **Runtime stage** (`alpine:3.21`) — copies the binary, runs as
2. **Runtime stage** (`alpine:3.21`) — copies the binary, creates the
`/data/events` directory for per-webhook event databases, runs as
non-root user, exposes port 8080, includes a health check.
The builder uses Debian rather than Alpine because GORM's SQLite
@@ -690,12 +709,21 @@ linted, tested, and compiled.
- [x] Build event processing and target delivery engine
- [x] Implement HTTP target type (fire-and-forget POST)
- [x] Implement retry target type (exponential backoff)
- [x] Implement database target type (store only)
- [x] Implement database target type (store events in per-webhook DB)
- [x] Implement log target type (console output)
- [x] Webhook management pages (list, create, edit, delete)
- [x] Webhook request log viewer with pagination
- [x] Entrypoint and target management UI
### Completed: Per-Webhook Event Databases
- [x] Split into main application DB + per-webhook event DBs
- [x] Per-webhook database lifecycle management (create on webhook
creation, delete on webhook removal)
- [x] `WebhookDBManager` component with lazy connection pooling
- [x] Delivery engine polls all per-webhook DBs for pending deliveries
- [x] Database target type marks delivery as immediately successful
(events are already in the per-webhook DB)
### Remaining: Core Features
- [ ] Per-webhook rate limiting in the receiver handler
- [ ] Webhook signature verification (GitHub, Stripe formats)
@@ -708,11 +736,8 @@ linted, tested, and compiled.
- [ ] Analytics dashboard (success rates, response times)
- [ ] Delivery status and retry management UI
### Remaining: Database Separation
- [ ] Split into main application DB + per-webhook event DBs
### Remaining: Event Maintenance
- [ ] Automatic event retention cleanup based on `retention_days`
- [ ] Per-webhook database lifecycle management (create on webhook
creation, delete on webhook removal)
### Remaining: REST API
- [ ] RESTful CRUD for webhooks, entrypoints, targets