Files
dnswatcher/internal/notify/export_test.go
clawbot 31bd6c3228
All checks were successful
check / check (push) Successful in 42s
feat: add retry with exponential backoff for notification delivery
Notifications were fire-and-forget: if Slack, Mattermost, or ntfy was
temporarily down, changes were silently lost. This adds automatic retry
with exponential backoff and jitter to all notification endpoints.

Implementation:
- New retry.go with configurable RetryConfig (max retries, base delay,
  max delay) and exponential backoff with ±25% jitter
- Each dispatch goroutine now wraps its send call in deliverWithRetry
- Default: 3 retries (4 total attempts), 1s base delay, 10s max delay
- Context-aware: respects cancellation during retry sleep
- Structured logging on each retry attempt and on final success after
  retry

All existing tests continue to pass. New tests cover:
- Backoff calculation (increase, cap)
- Retry success on first attempt (no unnecessary retries)
- Retry on transient failure (succeeds after N attempts)
- Exhausted retries (returns last error)
- Context cancellation during retry sleep
- Integration: SendNotification retries transient 500s
- Integration: all three endpoints retry independently
- Integration: permanent failure exhausts retries

closes #62
2026-03-10 11:11:32 -07:00

106 lines
2.4 KiB
Go

package notify
import (
"context"
"io"
"log/slog"
"net/http"
"net/url"
"time"
)
// NtfyPriority exports ntfyPriority for testing.
func NtfyPriority(priority string) string {
return ntfyPriority(priority)
}
// SlackColor exports slackColor for testing.
func SlackColor(priority string) string {
return slackColor(priority)
}
// NewRequestForTest exports newRequest for testing.
func NewRequestForTest(
ctx context.Context,
method string,
target *url.URL,
body io.Reader,
) *http.Request {
return newRequest(ctx, method, target, body)
}
// NewTestService creates a Service suitable for unit testing.
// It discards log output and uses the given transport.
func NewTestService(transport http.RoundTripper) *Service {
return &Service{
log: slog.New(slog.DiscardHandler),
transport: transport,
history: NewAlertHistory(),
}
}
// SetNtfyURL sets the ntfy URL on a Service for testing.
func (svc *Service) SetNtfyURL(u *url.URL) {
svc.ntfyURL = u
}
// SetSlackWebhookURL sets the Slack webhook URL on a
// Service for testing.
func (svc *Service) SetSlackWebhookURL(u *url.URL) {
svc.slackWebhookURL = u
}
// SetMattermostWebhookURL sets the Mattermost webhook URL on
// a Service for testing.
func (svc *Service) SetMattermostWebhookURL(u *url.URL) {
svc.mattermostWebhookURL = u
}
// SendNtfy exports sendNtfy for testing.
func (svc *Service) SendNtfy(
ctx context.Context,
topicURL *url.URL,
title, message, priority string,
) error {
return svc.sendNtfy(ctx, topicURL, title, message, priority)
}
// SendSlack exports sendSlack for testing.
func (svc *Service) SendSlack(
ctx context.Context,
webhookURL *url.URL,
title, message, priority string,
) error {
return svc.sendSlack(
ctx, webhookURL, title, message, priority,
)
}
// SetRetryConfig overrides the retry configuration for
// testing.
func (svc *Service) SetRetryConfig(cfg RetryConfig) {
svc.retryConfig = cfg
}
// SetSleepFunc overrides the sleep function so tests can
// eliminate real delays.
func (svc *Service) SetSleepFunc(
fn func(time.Duration) <-chan time.Time,
) {
svc.sleepFn = fn
}
// DeliverWithRetry exports deliverWithRetry for testing.
func (svc *Service) DeliverWithRetry(
ctx context.Context,
endpoint string,
fn func(context.Context) error,
) error {
return svc.deliverWithRetry(ctx, endpoint, fn)
}
// BackoffDuration exports RetryConfig.backoff for testing.
func (rc RetryConfig) BackoffDuration(attempt int) time.Duration {
return rc.defaults().backoff(attempt)
}