feat: add retry with exponential backoff for notification delivery (#87)
All checks were successful
check / check (push) Successful in 37s
## Summary

Notifications were fire-and-forget: if Slack, Mattermost, or ntfy was temporarily down, changes were silently lost. This adds automatic retry with exponential backoff and jitter to all notification endpoints.

## Changes

### New file: `internal/notify/retry.go`

- `RetryConfig` struct with configurable max retries, base delay, max delay
- `backoff()` computes delay as `BaseDelay * 2^attempt`, capped at `MaxDelay`, with ±25% jitter
- `deliverWithRetry()` wraps any send function with the retry loop
- Defaults: 3 retries (4 total attempts), 1s base delay, 10s max delay
- Context-aware: respects cancellation during retry sleep
- Injectable `sleepFn` for test determinism

### Modified: `internal/notify/notify.go`

- Added `retryConfig` and `sleepFn` fields to `Service`
- Updated `dispatchNtfy`, `dispatchSlack`, `dispatchMattermost` to wrap sends in `deliverWithRetry`
- Structured logging: warns on each retry, logs error only after all retries exhausted, logs info on success after retry

### Modified: `internal/notify/export_test.go`

- Added test helpers: `SetRetryConfig`, `SetSleepFunc`, `DeliverWithRetry`, `BackoffDuration`

### New file: `internal/notify/retry_test.go`

- Backoff calculation tests (exponential increase, max cap with jitter)
- `deliverWithRetry` unit tests: first-attempt success, transient failure recovery, exhausted retries, context cancellation
- Integration tests via `SendNotification`: transient failure retries, all-endpoints retry independently, permanent failure exhausts retries

## Verification

- `make fmt` ✅
- `make check` (format + lint + tests + build) ✅
- `docker build .` ✅
- All existing tests continue to pass unchanged
- No DNS client mocking; notification tests use `httptest` servers

closes #62

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Reviewed-on: #87
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
This commit was merged in pull request #87.
@@ -6,6 +6,7 @@ import (
	"log/slog"
	"net/http"
	"net/url"
	"time"
)

// NtfyPriority exports ntfyPriority for testing.
@@ -74,3 +75,31 @@ func (svc *Service) SendSlack(
		ctx, webhookURL, title, message, priority,
	)
}

// SetRetryConfig overrides the retry configuration for
// testing.
func (svc *Service) SetRetryConfig(cfg RetryConfig) {
	svc.retryConfig = cfg
}

// SetSleepFunc overrides the sleep function so tests can
// eliminate real delays.
func (svc *Service) SetSleepFunc(
	fn func(time.Duration) <-chan time.Time,
) {
	svc.sleepFn = fn
}

// DeliverWithRetry exports deliverWithRetry for testing.
func (svc *Service) DeliverWithRetry(
	ctx context.Context,
	endpoint string,
	fn func(context.Context) error,
) error {
	return svc.deliverWithRetry(ctx, endpoint, fn)
}

// BackoffDuration exports RetryConfig.backoff for testing.
func (rc RetryConfig) BackoffDuration(attempt int) time.Duration {
	return rc.defaults().backoff(attempt)
}