Commit Graph

63 Commits

Author SHA1 Message Date
65180ad661 feat: add DNSWATCHER_SEND_TEST_NOTIFICATION env var (#85)
All checks were successful
check / check (push) Successful in 5s
When set to a truthy value, sends a startup status notification to all configured notification channels after the first full scan completes on application startup. The notification is clearly an all-ok/success message showing the number of monitored domains, hostnames, ports, and certificates.

Changes:
- Added `SendTestNotification` config field reading `DNSWATCHER_SEND_TEST_NOTIFICATION`
- Added `maybeSendTestNotification()` in watcher, called after initial `RunOnce` in `Run`
- Added 3 watcher tests (enabled via Run, enabled via RunOnce alone, disabled)
- Added config tests for the new field
- Updated README: env var table, example .env, Docker example

Closes #84

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #85
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-04 21:41:55 +01:00
1076543c23 feat: add unauthenticated web dashboard showing monitoring state and recent alerts (#83)
All checks were successful
check / check (push) Successful in 4s
## Summary

Adds a read-only web dashboard at `GET /` that shows the current monitoring state and recent alerts. Unauthenticated, single-page, no navigation.

## What it shows

- **Summary bar**: counts of monitored domains, hostnames, ports, certificates
- **Domains**: nameservers with last-checked age
- **Hostnames**: per-nameserver DNS records, status badges, relative age
- **Ports**: open/closed state with associated hostnames and age
- **TLS Certificates**: CN, issuer, expiry (color-coded by urgency), status, age
- **Recent Alerts**: last 100 notifications in reverse chronological order with priority badges

Every data point displays its age (e.g. "5m ago") so freshness is visible at a glance. Auto-refreshes every 30 seconds.

## What it does NOT show

No secrets: webhook URLs, ntfy topics, Slack/Mattermost endpoints, API tokens, and configuration details are never exposed.

## Design

All assets (CSS) are embedded in the binary and served from `/s/`. Zero external HTTP requests at runtime — no CDN dependencies or third-party resources. Dark, technical aesthetic with saturated teals and blues on dark slate. Single page — everything on one screen.

## Implementation

- `internal/notify/history.go` — thread-safe ring buffer (`AlertHistory`) storing last 100 alerts
- `internal/notify/notify.go` — records each alert in history before dispatch; refactored `SendNotification` into smaller `dispatch*` helpers to satisfy funlen
- `internal/handlers/dashboard.go` — `HandleDashboard()` handler with embedded HTML template, helper functions (`relTime`, `formatRecords`, `expiryDays`, `joinStrings`)
- `internal/handlers/templates/dashboard.html` — Tailwind-styled single-page dashboard
- `internal/handlers/handlers.go` — added `State` and `Notify` dependencies via fx
- `internal/server/routes.go` — registered `GET /` route
- `static/` — embedded CSS assets served via `/s/` prefix
- `README.md` — documented the dashboard and new endpoint

## Tests

- `internal/notify/history_test.go` — empty, add+recent ordering, overflow beyond capacity
- `internal/handlers/dashboard_test.go` — `relTime`, `expiryDays`, `formatRecords`
- All existing tests pass unchanged
- `docker build .` passes

closes [#82](#82)

<!-- session: rework-pr-83 -->

Co-authored-by: user <user@Mac.lan guest wan>
Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Reviewed-on: #83
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-04 13:03:38 +01:00
1843d09eb3 test(notify): add comprehensive tests for notification delivery (#79)
All checks were successful
check / check (push) Successful in 50s
## Summary

Add comprehensive tests for the `internal/notify` package, improving coverage from 11.1% to 80.0%.

Closes [issue #71](#71).

## What was added

### `delivery_test.go` — 28 new test functions

**Priority mapping tests:**
- `TestNtfyPriority` — all priority levels (error→urgent, warning→high, success→default, info→low, unknown→default)
- `TestSlackColor` — all color mappings including default fallback

**Request construction:**
- `TestNewRequest` — method, URL, host, headers, body
- `TestNewRequestPreservesContext` — context propagation

**ntfy delivery (`sendNtfy`):**
- `TestSendNtfyHeaders` — Title, Priority headers, POST body content
- `TestSendNtfyAllPriorities` — end-to-end header verification for all priority levels
- `TestSendNtfyClientError` — 403 returns `ErrNtfyFailed`
- `TestSendNtfyServerError` — 500 returns `ErrNtfyFailed`
- `TestSendNtfySuccess` — 200 OK succeeds
- `TestSendNtfyNetworkError` — transport failure handling

**Slack/Mattermost delivery (`sendSlack`):**
- `TestSendSlackPayloadFields` — JSON payload structure, Content-Type header, attachment fields
- `TestSendSlackAllColors` — color mapping for all priorities
- `TestSendSlackClientError` — 400 returns `ErrSlackFailed`
- `TestSendSlackServerError` — 502 returns `ErrSlackFailed`
- `TestSendSlackNetworkError` — transport failure handling

**`SendNotification` goroutine dispatch:**
- `TestSendNotificationAllEndpoints` — all three endpoints receive notifications concurrently
- `TestSendNotificationNoWebhooks` — no-op when no endpoints configured
- `TestSendNotificationNtfyOnly` — ntfy-only dispatch
- `TestSendNotificationSlackOnly` — slack-only dispatch
- `TestSendNotificationMattermostOnly` — mattermost-only dispatch
- `TestSendNotificationNtfyError` — error logging path (no panic)
- `TestSendNotificationSlackError` — error logging path (no panic)
- `TestSendNotificationMattermostError` — error logging path (no panic)

**Payload marshaling:**
- `TestSlackPayloadJSON` — round-trip marshal/unmarshal
- `TestSlackPayloadEmptyAttachments` — `omitempty` behavior

### `export_test.go` — test bridge

Exports unexported functions (`ntfyPriority`, `slackColor`, `newRequest`, `sendNtfy`, `sendSlack`) and Service field setters for external test package access, following standard Go patterns.

## Coverage

| Function | Before | After |
|---|---|---|
| `IsAllowedScheme` | 100% | 100% |
| `ValidateWebhookURL` | 100% | 100% |
| `newRequest` | 0% | 100% |
| `SendNotification` | 0% | 100% |
| `sendNtfy` | 0% | 100% |
| `ntfyPriority` | 0% | 100% |
| `sendSlack` | 0% | 94.1% |
| `slackColor` | 0% | 100% |
| **Total** | **11.1%** | **80.0%** |

The remaining 20% is the `New()` constructor (requires fx wiring) and one unreachable `json.Marshal` error path in `sendSlack`.

## Testing approach

- `httptest.Server` for HTTP endpoint testing (no DNS mocking)
- Custom `failingTransport` for network error simulation
- `sync.Mutex`-protected captures for concurrent goroutine verification
- All tests are parallel

`docker build .` passes 

<!-- session: agent:sdlc-manager:subagent:6158e09a-aba4-4778-89ca-c12b22014ccd -->

Co-authored-by: user <user@Mac.lan guest wan>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #79
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-04 11:26:31 +01:00
c5bf16055e test(state): add comprehensive test coverage for internal/state package (#80)
Some checks failed
check / check (push) Has been cancelled
## Summary

Add 32 tests for the `internal/state` package, which previously had 0% test coverage.

### Tests added:

**Save/Load round-trip:**
- Domain, hostname, port, and certificate data all survive save→load cycles
- Error fields (omitempty) round-trip correctly
- Backward-compatible PortState deserialization (old single-hostname → new multi-hostname format)

**Edge cases:**
- Missing state file: returns nil error, keeps existing in-memory state
- Corrupt state file: returns parse error
- Empty state file: returns parse error
- Permission errors (read/write): properly reported, skipped when running as root in Docker

**Atomic write:**
- No leftover .tmp files after successful save
- Updated content verified after second save

**Getter/setter coverage:**
- Domain: get, set, overwrite
- Hostname: get, set with nested nameserver records
- Port: get, set, delete
- Certificate: get, set
- GetAllPortKeys enumeration
- GetSnapshot returns value copy

**Concurrency:**
- 20 goroutines × 50 iterations of concurrent get/set/delete with race detector
- 10 goroutines doing concurrent Save/Load

**Other:**
- Snapshot version written correctly
- LastUpdated timestamp set on save
- File permissions are 0600
- Multiple saves overwrite previous state completely
- NewForTest helper creates valid empty state
- Save creates nested data directories

Also adds `NewForTestWithDataDir()` to the test helper for tests requiring file persistence.

Closes [issue #70](#70)

<!-- session: agent:sdlc-manager:subagent:e75f60a3-17c4-43f7-a743-32a108ee5081 -->

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Reviewed-on: #80
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-04 11:26:05 +01:00
d6130e5892 test(config): add comprehensive tests for config loading path (#81)
All checks were successful
check / check (push) Successful in 4s
## Summary

Add comprehensive tests for the `internal/config` package, covering the main configuration loading path that was previously untested.

Closes [issue #72](#72)

## What Changed

Added three new test files:

- **`config_test.go`** — 16 tests covering `New()`, `StatePath()`, and the full config loading pipeline
- **`parsecsv_test.go`** — 10 test cases for `parseCSV()` edge cases
- **`export_test.go`** — standard Go export bridge for testing unexported `parseCSV`

## Test Coverage

| Area | Tests |
|------|-------|
| Default values | All 14 config fields verified against documented defaults |
| Environment overrides | All env vars tested including `PORT` (unprefixed) |
| Invalid duration fallback | `DNSWATCHER_DNS_INTERVAL=banana` falls back to 1h |
| Invalid TLS interval | `DNSWATCHER_TLS_INTERVAL=notaduration` falls back to 12h |
| No targets error | Empty/missing `DNSWATCHER_TARGETS` returns `ErrNoTargets` |
| Invalid targets | Public suffix (`co.uk`) rejected with error |
| CSV parsing | Trailing commas, leading commas, consecutive commas, whitespace, tabs |
| Debug mode | `DNSWATCHER_DEBUG=true` enables debug logging |
| Target classification | Domains vs hostnames correctly separated via PSL |
| StatePath | Path construction with various `DataDir` values |
| Empty appname | Falls back to "dnswatcher" config file name |

**Coverage: 23% → 92.5%**

## Notes

- Tests use `viper.Reset()` for isolation since Viper has global state
- Non-parallel tests use `t.Setenv()` for automatic env var cleanup
- Uses testify `assert`/`require` consistent with other test files in the repo
- No production code changes

<!-- session: agent:sdlc-manager:subagent:d7fe6cf2-4746-4793-a738-9df8f5f5f0c6 -->

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #81
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-04 11:23:24 +01:00
0a74971ade docs: fix README inaccuracies found during QA audit (#74)
All checks were successful
check / check (push) Successful in 9s
## Summary

Fixes documentation inaccuracies in README.md identified during QA audit.

### Changes

**API table (closes #67):**
- Removed `GET /api/v1/domains` and `GET /api/v1/hostnames` from the HTTP API table. These endpoints are not implemented — the only routes in `internal/server/routes.go` are `/health`, `/api/v1/status`, and `/metrics` (conditional).

**Feature claims (closes #68):**
- Removed "Inconsistency resolved" from hostname monitoring features. `detectInconsistencies()` detects current inconsistencies but has no state tracking to detect when they resolve.
- Removed `nxdomain` and `nodata` from the state status values table. While the resolver defines these constants, `buildHostnameState()` in the watcher only ever sets status to `"ok"`. Failed queries set `"error"` via the NS disappearance path. These values are never written to state.
- Removed "Empty response" (NODATA/NXDOMAIN) detection claim. Changes are caught generically by `detectRecordChanges()`, not with specific NODATA/NXDOMAIN labeling.

### What was NOT changed

- "Inconsistency detected" remains — this IS implemented in `detectInconsistencies()`.
- All other feature claims were verified against the code and are accurate.
- No Go source code was modified.

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #74
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-02 08:40:42 +01:00
e882e7d237 feat: fail fast when no monitoring targets configured (#75)
Some checks failed
check / check (push) Failing after 46s
## Summary

When `DNSWATCHER_TARGETS` is empty (the default), dnswatcher previously started successfully and ran indefinitely monitoring nothing. This is a common misconfiguration — forgetting to set the variable or making a typo in its name — and gave no indication anything was wrong.

## Changes

- Added `ErrNoTargets` sentinel error in `internal/config/config.go`
- Extracted `parseAndValidateTargets()` helper to validate that at least one domain or hostname is configured after target classification
- If no targets are configured, dnswatcher now exits with a clear error: `"no monitoring targets configured: set DNSWATCHER_TARGETS environment variable"`
- Updated README.md to document that `DNSWATCHER_TARGETS` is required and dnswatcher will refuse to start without it

## How it works

The validation runs during config construction (via uber/fx), before the watcher or any other component starts. If `DNSWATCHER_TARGETS` is empty or contains only whitespace/empty entries, `buildConfig()` returns `ErrNoTargets`, which causes fx to fail startup with a clear error message.

This is fail-fast behavior: a monitoring daemon with nothing to monitor is a misconfiguration and should not silently run.

Closes #69

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Reviewed-on: #75
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-02 01:26:55 +01:00
6ebc4ffa04 fix: use context.Background() for watcher goroutine lifetime (#63)
All checks were successful
check / check (push) Successful in 31s
## Summary

The `OnStart` hook previously derived the watcher's context from the fx startup context (`startCtx`) via `context.WithoutCancel()`. While `WithoutCancel` strips cancellation and deadline, using `context.Background()` makes the intent explicit: the watcher's monitoring loop must outlive the fx startup phase and is controlled solely by the `cancel` func called in `OnStop`.

## Changes

- Replace `context.WithCancel(context.WithoutCancel(startCtx))` with `context.WithCancel(context.Background())`
- Add explanatory comment documenting why the watcher context is not derived from the startup context
- Unused `startCtx` parameter changed to `_`

Closes #53

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #63
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-02 00:39:08 +01:00
b20e75459f fix: track multiple hostnames per IP:port in port state (#65)
All checks were successful
check / check (push) Successful in 34s
## Summary

Port state keys are `ip:port` with a single `hostname` field. When multiple hostnames resolve to the same IP (shared hosting, CDN), only one hostname was associated. This caused orphaned port state when that hostname removed the IP from DNS while the IP remained valid for other hostnames.

## Changes

### State (`internal/state/state.go`)
- `PortState.Hostname` (string) → `PortState.Hostnames` ([]string)
- Custom `UnmarshalJSON` for backward compatibility: reads old single `hostname` field and migrates to a single-element `hostnames` slice
- Added `DeletePortState` and `GetAllPortKeys` methods for cleanup

### Watcher (`internal/watcher/watcher.go`)
- Refactored `checkAllPorts` into three phases:
  1. Build IP:port → hostname associations from current DNS data
  2. Check each unique IP:port once with all associated hostnames
  3. Clean up stale port state entries with no hostname references
- Port change notifications now list all associated hostnames (`Hosts:` instead of `Host:`)
- Added `buildPortAssociations`, `parsePortKey`, and `cleanupStalePorts` helper functions

### README
- Updated state file format example: `hostname` → `hostnames` (array)
- Updated notification description to reflect multiple hostnames

## Backward Compatibility

Existing state files with the old single `hostname` string are handled gracefully via custom JSON unmarshaling — they are read as single-element `hostnames` slices.

Closes #55

Co-authored-by: clawbot <clawbot@noreply.eeqj.de>
Reviewed-on: #65
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-02 00:32:27 +01:00
ee14bd01ae fix: enforce DNS-first ordering for port and TLS checks (#64)
All checks were successful
check / check (push) Successful in 8s
## Summary

DNS checks now always complete before port or TLS checks begin, ensuring those checks use freshly resolved IP addresses instead of potentially stale ones from a previous cycle.

## Problem

Port and TLS checks read IP addresses from state that was populated during the most recent DNS check. If DNS changes between cycles, port/TLS checks may target stale IPs. In particular, when the TLS ticker fired (every 12h), it ran `runTLSChecks` without refreshing DNS first — meaning TLS checks could use IPs that were up to 12 hours old.

## Changes

- **Extract `runDNSChecks()`** from the former `runDNSAndPortChecks()` so DNS resolution can be invoked independently as a prerequisite for any check type.
- **TLS ticker now runs DNS first**: When the TLS ticker fires, DNS checks run before TLS checks, ensuring fresh IPs.
- **`RunOnce` uses explicit 3-phase ordering**: DNS → ports → TLS. Port checks must complete before TLS because TLS checks only target IPs where port 443 is open.
- **New test `TestDNSRunsBeforePortAndTLSChecks`**: Verifies that when DNS IPs change between cycles, port and TLS checks pick up the new IPs.
- **README updated**: Monitoring lifecycle section now documents the DNS-first ordering guarantee.

## Check ordering

| Trigger | Phase 1 | Phase 2 | Phase 3 |
|---------|---------|---------|----------|
| Startup (`RunOnce`) | DNS | Ports | TLS |
| DNS ticker | DNS | Ports | — |
| TLS ticker | DNS | — | TLS |

closes #58

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #64
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-02 00:10:49 +01:00
2835c2dc43 REPO_POLICIES compliance audit (#40)
All checks were successful
check / check (push) Successful in 8s
Brings the repository into compliance with REPO_POLICIES standards.

Closes [issue #39](#39).

## Changes

### Added files
- **REPO_POLICIES.md** — fetched from `sneak/prompts` (last_modified: 2026-02-22)
- **.editorconfig** — fetched from `sneak/prompts`, enforces 4-space indents (tabs for Makefile)
- **.dockerignore** — standard Go exclusions (.git/, bin/, *.md, LICENSE, .editorconfig, .gitignore)

### Makefile updates
- Added `fmt-check` target (read-only gofmt check)
- Added `hooks` target (installs pre-commit hook running `make check`)
- Added `docker` target (runs `docker build .`)
- Added `-timeout 30s` to both `test` and `check` targets
- Updated `.PHONY` list with all new targets

### Removed files
- **CLAUDE.md** — superseded by REPO_POLICIES.md
- **CONVENTIONS.md** — superseded by REPO_POLICIES.md

### README updates
- First line now includes project name, purpose, category (daemon), and author per REPO_POLICIES format
- Updated CONVENTIONS.md reference to REPO_POLICIES.md
- Added **License** section (pending author choice)
- Added **Author** section: [@sneak](https://sneak.berlin)

### Intentionally skipped
- **LICENSE file** — not created; license choice (MIT, GPL, or WTFPL) requires sneak's input

## Verification
- `docker build .` passes (all checks green: fmt, lint, tests, build)
- No changes to `.golangci.yml`, test assertions, or linter config

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #40
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-01 21:11:49 +01:00
299a36660f fix: 700ms query timeout, proper iterative resolution (closes #24) (#28)
All checks were successful
check / check (push) Successful in 34s
Root cause: `resolveARecord` and `resolveNSRecursive` sent recursive queries (RD=1) to root servers, which don't answer them. This caused 5s timeouts × 2 retries × 3 servers = hanging tests.

Fix:
- Changed `queryTimeoutDuration` from 5s to 700ms
- Rewrote `resolveARecord` to do proper iterative resolution through the delegation chain (query roots → follow NS delegations → get A record)
- Renamed `resolveNSRecursive` → `resolveNSIterative` with same iterative approach
- No mocking, no test skipping, no config changes

`make check` passes: all 29 resolver tests pass with real DNS in ~10s.

Co-authored-by: clawbot <clawbot@git.eeqj.de>
Reviewed-on: #28
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-01 21:10:38 +01:00
02ca796085 Merge pull request 'Simplify CI: docker build instead of manual toolchain setup' (#38) from fix/simplify-ci into main
Some checks failed
check / check (push) Failing after 40s
Reviewed-on: #38
2026-02-28 13:04:31 +01:00
clawbot
2e3526986f simplify CI to docker build, pin all image refs by SHA
Some checks failed
check / check (push) Failing after 1m23s
- Replace convoluted CI workflow (setup-go, install golangci-lint, install
  goimports, make check) with simple 'docker build .' per repo policy
- Pin Docker base images by SHA256 hash instead of mutable tags
- Pin golangci-lint and goimports by commit hash instead of @latest
- Add binutils-gold for linker compatibility on alpine
- Run on all pushes, not just main/PR branches
2026-02-28 03:54:11 -08:00
55c6c21b5a Merge pull request 'fix: distinguish timeout from negative DNS responses (closes #35)' (#37) from fix/timeout-vs-negative-response into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #37
2026-02-28 12:38:18 +01:00
clawbot
2993911883 fix: distinguish timeout from negative DNS responses (closes #35)
Some checks failed
Check / check (pull_request) Failing after 5m41s
2026-02-28 03:35:54 -08:00
70fac87254 Merge pull request 'fix: remove ErrNotImplemented stub — all checks fully implemented (closes #16)' (#23) from fix/remove-unimplemented-stubs into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #23
2026-02-28 12:26:27 +01:00
940f7c89da Merge branch 'main' into fix/remove-unimplemented-stubs
Some checks failed
Check / check (pull_request) Failing after 5m45s
2026-02-28 12:09:24 +01:00
0eb57fc15b Merge pull request 'fix: look up A/AAAA records for apex domains to enable port/TLS checks (closes #19)' (#21) from fix/domain-port-tls-state-lookup into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #21
2026-02-28 12:09:04 +01:00
5739108dc7 Merge branch 'main' into fix/domain-port-tls-state-lookup
Some checks failed
Check / check (pull_request) Failing after 5m41s
2026-02-28 12:08:56 +01:00
54272c2be5 Merge pull request 'fix: deduplicate TLS expiry warnings to prevent notification spam (closes #18)' (#22) from fix/tls-expiry-dedup into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #22
2026-02-28 12:08:46 +01:00
7d380aafa4 Merge branch 'main' into fix/remove-unimplemented-stubs
Some checks failed
Check / check (pull_request) Failing after 5m42s
2026-02-28 12:08:35 +01:00
b18d29d586 Merge branch 'main' into fix/domain-port-tls-state-lookup
Some checks failed
Check / check (pull_request) Failing after 5m40s
2026-02-28 12:08:25 +01:00
e63241cc3c Merge branch 'main' into fix/tls-expiry-dedup
Some checks failed
Check / check (pull_request) Failing after 5m40s
2026-02-28 12:08:19 +01:00
5ab217bfd2 Merge pull request 'Reduce DNS query timeout and limit root server fan-out (closes #29)' (#30) from fix/reduce-dns-timeout-and-root-fanout into main
Some checks failed
Check / check (push) Has been cancelled
Reviewed-on: #30
2026-02-28 12:07:20 +01:00
518a2cc42e Merge pull request 'doc: add TESTING.md — real DNS only, no mocks' (#34) from doc/testing-policy into main
Some checks failed
Check / check (push) Has been cancelled
Reviewed-on: #34
2026-02-28 12:06:57 +01:00
user
4cb81aac24 doc: add testing policy — real DNS only, no mocks
Some checks failed
Check / check (pull_request) Failing after 5m24s
Documents the project testing philosophy: all resolver tests must
use live DNS queries. Mocking the DNS client layer is not permitted.
Includes rationale and anti-patterns to avoid.
2026-02-22 04:28:47 -08:00
user
203b581704 Reduce DNS query timeout to 2s and limit root server fan-out to 3
Some checks failed
Check / check (pull_request) Failing after 5m57s
- Reduce queryTimeoutDuration from 5s to 2s
- Add randomRootServers() that shuffles and picks 3 root servers
- Replace all rootServerList() call sites with randomRootServers()
- Keep maxRetries = 2

Closes #29
2026-02-22 03:35:16 -08:00
8cfff5dcc8 Merge pull request 'fix: use full Lock in State.Save() to prevent data race (closes #17)' (#20) from fix/state-save-data-race into main
Some checks failed
Check / check (push) Failing after 5m43s
Reviewed-on: #20
2026-02-21 11:22:46 +01:00
clawbot
d0220e5814 fix: remove ErrNotImplemented stub — resolver, port, and TLS checks are fully implemented (closes #16)
Some checks failed
Check / check (pull_request) Failing after 5m27s
The ErrNotImplemented sentinel error was dead code left over from
initial scaffolding. The resolver performs real iterative DNS
lookups from root servers, PortCheck does TCP connection checks,
and TLSCheck verifies TLS certificates and expiry. Removed the
unused error constant.
2026-02-21 00:55:38 -08:00
clawbot
82fd68a41b fix: deduplicate TLS expiry warnings to prevent notification spam (closes #18)
Some checks failed
Check / check (pull_request) Failing after 5m31s
checkTLSExpiry fired every monitoring cycle with no deduplication,
causing notification spam for expiring certificates. Added an
in-memory map tracking the last notification time per domain/IP
pair, suppressing re-notification within the TLS check interval.

Added TestTLSExpiryWarningDedup to verify deduplication works.
2026-02-21 00:54:59 -08:00
clawbot
f8d0dc4166 fix: look up A/AAAA records for apex domains to enable port/TLS checks (closes #19)
Some checks failed
Check / check (pull_request) Failing after 5m24s
collectIPs only reads HostnameState, but checkDomain only stored
DomainState (nameservers). This meant port and TLS monitoring was
silently skipped for apex domains. Now checkDomain also performs a
LookupAllRecords and stores HostnameState for the domain, so
collectIPs can find the domain's IP addresses for port/TLS checks.

Added TestDomainPortAndTLSChecks to verify the fix.
2026-02-21 00:53:42 -08:00
clawbot
b162ca743b fix: use full Lock in State.Save() to prevent data race (closes #17)
Some checks failed
Check / check (pull_request) Failing after 5m31s
State.Save() was using RLock but mutating s.snapshot.LastUpdated,
which is a write operation. This created a data race since other
goroutines could also hold a read lock and observe a partially
written timestamp. Changed to full Lock to ensure exclusive access
during the mutation.
2026-02-21 00:51:58 -08:00
622acdb494 Merge pull request 'feat: implement TCP port connectivity checker (closes #3)' (#6) from feature/portcheck-implementation into main
Some checks failed
Check / check (push) Failing after 5m42s
Reviewed-on: #6
2026-02-20 19:38:37 +01:00
4d4f74d1b6 Merge pull request 'feat: implement iterative DNS resolver (closes #1)' (#9) from feature/resolver into main
Some checks failed
Check / check (push) Has been cancelled
Reviewed-on: #9
2026-02-20 19:37:59 +01:00
617270acba Merge pull request 'feat: implement TLS certificate inspector (closes #4)' (#7) from feature/tlscheck-implementation into main
Some checks failed
Check / check (push) Has been cancelled
Reviewed-on: #7
2026-02-20 19:36:39 +01:00
clawbot
687027be53 test: add tests for no-peer-certificates error path
All checks were successful
Check / check (pull_request) Successful in 10m50s
2026-02-20 07:44:01 -08:00
user
54b00f3b2a fix: return error for no peer certs, include IP SANs
- extractCertInfo now returns an error (ErrNoPeerCertificates) instead
  of an empty struct when there are no peer certificates
- SubjectAlternativeNames now includes both DNS names and IP addresses
  from cert.IPAddresses

Addresses review feedback on PR #7.
2026-02-20 07:44:01 -08:00
clawbot
3fcf203485 fix: resolve gosec SSRF findings and formatting issues
Validate webhook/ntfy URLs at Service construction time and add
targeted nolint directives for pre-validated URL usage.
Fix goimports formatting in tlscheck_test.go.
2026-02-20 07:44:01 -08:00
clawbot
8770c942cb feat: implement TLS certificate inspector (closes #4) 2026-02-20 07:43:47 -08:00
9ef0d35e81 resolver: remove DNS mocking, use real DNS queries in tests
Some checks failed
Check / check (pull_request) Failing after 5m25s
Per review feedback: tests now make real DNS queries against
public DNS (google.com, cloudflare.com) instead of using a
mock DNS client. The DNSClient interface and mock infrastructure
have been removed.

- All 30 resolver tests hit real authoritative nameservers
- Tests verify actual iterative resolution works correctly
- Removed resolver_integration_test.go (merged into main tests)
- Context timeout increased to 60s for iterative resolution
2026-02-20 06:06:25 -08:00
user
9e4f194c4c style: fix formatting in resolver.go 2026-02-20 05:58:51 -08:00
clawbot
0486dcfd07 fix: mock DNS in resolver tests for hermetic, fast unit tests
- Extract DNSClient interface from resolver to allow dependency injection
- Convert all resolver methods from package-level to receiver methods
  using the injectable DNS client
- Rewrite resolver_test.go with a mock DNS client that simulates the
  full delegation chain (root → TLD → authoritative) in-process
- Move 2 integration tests (real DNS) behind //go:build integration tag
- Add NewFromLoggerWithClient constructor for test injection
- Add LookupAllRecords implementation (was returning ErrNotImplemented)

All unit tests are hermetic (no network) and complete in <1s.
Total make check passes in ~5s.

Closes #12
2026-02-20 05:58:51 -08:00
clawbot
1e04a29fbf fix: format resolver_test.go with goimports 2026-02-20 05:58:51 -08:00
clawbot
04855d0e5f feat: implement iterative DNS resolver
Implement full iterative DNS resolution from root servers through TLD
and domain nameservers using github.com/miekg/dns.

- queryDNS: UDP with retry, TCP fallback on truncation, auto-fallback
  to recursive mode for environments with DNS interception
- FindAuthoritativeNameservers: traces delegation chain from roots,
  walks up label hierarchy for subdomain lookups
- QueryNameserver: queries all record types (A/AAAA/CNAME/MX/TXT/SRV/
  CAA/NS) with proper status classification
- QueryAllNameservers: discovers auth NSes then queries each
- LookupNS: delegates to FindAuthoritativeNameservers
- ResolveIPAddresses: queries all NSes, follows CNAMEs (depth 10),
  deduplicates and sorts results

31/35 tests pass. 4 NXDOMAIN tests fail due to wildcard DNS on
sneak.cloud (nxdomain-surely-does-not-exist.dns.sneak.cloud resolves
to datavi.be/162.55.148.94 via catch-all). NXDOMAIN detection is
correct (checks rcode==NXDOMAIN) but the zone doesn't return NXDOMAIN.
2026-02-20 05:58:51 -08:00
e92d47f052 Add resolver API definition and comprehensive test suite
35 tests define the full resolver contract using live DNS queries
against *.dns.sneak.cloud (Cloudflare). Tests cover:
- FindAuthoritativeNameservers: iterative NS discovery, sorting,
  determinism, trailing dot handling, TLD and subdomain cases
- QueryNameserver: A, AAAA, CNAME, MX, TXT, NXDOMAIN, per-NS
  response model with status field, sorted record values
- QueryAllNameservers: independent per-NS queries, consistency
  verification, NXDOMAIN from all NS
- LookupNS: NS record lookup matching FindAuthoritative
- ResolveIPAddresses: basic, multi-A, IPv6, dual-stack, CNAME
  following, deduplication, sorting, NXDOMAIN returns empty
- Context cancellation for all methods
- Iterative resolution proof (resolves example.com from root)

Also adds DNSSEC validation to planned future features in README.
2026-02-20 05:58:51 -08:00
4394ea9376 Merge pull request 'fix: suppress gosec G704 SSRF false positive on webhook URLs' (#13) from fix/gosec-g704-ssrf into main
All checks were successful
Check / check (push) Successful in 11m4s
Reviewed-on: #13
2026-02-20 14:56:21 +01:00
59ae8cc14a Merge pull request 'ci: add Gitea Actions workflow for make check' (#14) from ci/make-check into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #14
2026-02-20 14:55:07 +01:00
c9c5530f60 security: pin all go install refs to commit SHAs
All checks were successful
Check / check (pull_request) Successful in 10m9s
2026-02-20 03:10:39 -08:00
user
b2e8ffe5e9 security: pin CI actions to commit SHAs
All checks were successful
Check / check (pull_request) Successful in 10m6s
2026-02-20 02:58:07 -08:00