# dnswatcher

dnswatcher is a pre-1.0 Go daemon by [@sneak](https://sneak.berlin) that monitors DNS records, TCP port availability, and TLS certificates, delivering real-time change notifications via Slack, Mattermost, and ntfy webhooks.

> ⚠️ Pre-1.0 software. APIs, configuration, and behavior may change without notice.

dnswatcher watches configured DNS domains and hostnames for changes, monitors TCP port availability, tracks TLS certificate expiry, and delivers real-time notifications via Slack, Mattermost, and/or ntfy webhooks.

It performs all DNS resolution itself via iterative (non-recursive) queries, tracing from root nameservers to authoritative servers directly, never relying on upstream recursive resolvers.

State is persisted to a local JSON file so that monitoring survives restarts without requiring an external database.

---

## Features

### DNS Domain Monitoring (Apex Domains)

- Accepts a list of DNS domain names (apex domains, identified via the [Public Suffix List](https://publicsuffix.org/)).
- Every **1 hour**, performs a full iterative trace from root servers to discover all authoritative nameservers (NS records) for each domain.
- Queries **every** discovered authoritative nameserver independently.
- Stores the NS record set as observed by the delegation chain.
- Any change triggers a notification:
  - NS added to or removed from the delegation.
  - NS IP address changed (glue record change).

### DNS Hostname Monitoring (Subdomains)

- Accepts a list of DNS hostnames (subdomains, distinguished from apex domains via the Public Suffix List).
- Every **1 hour**, performs a full iterative trace to discover the authoritative nameservers for the hostname's parent domain.
- Queries **each** authoritative nameserver independently for **all** record types: A, AAAA, CNAME, MX, TXT, SRV, CAA, NS.
- Stores results **per nameserver**. The state for a hostname is not a merged view; it is a map from nameserver to record set.
- Any observable change in any nameserver's response triggers a notification. This includes:
  - **Record change**: A nameserver returns different records than it did on the previous check (additions, removals, value changes).
  - **NS query failure**: A nameserver that previously responded becomes unreachable (timeout, SERVFAIL, REFUSED, network error). This is distinct from "responded with no records."
  - **NS recovery**: A previously-unreachable nameserver starts responding again.
  - **Inconsistency detected**: Two nameservers that previously agreed now return different record sets for the same hostname.

### TCP Port Monitoring

- For every configured domain and hostname, constructs a deduplicated list of all IPv4 and IPv6 addresses resolved via A, AAAA, and CNAME chain resolution across all authoritative nameservers.
- Checks TCP connectivity on ports **80** and **443** for each IP address.
- Every **1 hour**, re-checks all ports.
- Any change in port availability triggers a notification:
  - Port transitioned from open to closed (or vice versa).
  - New IP appeared (from DNS change) and its port state was recorded.
  - IP disappeared (from DNS change): noted in the DNS change notification; port state for that IP is removed.

### TLS Certificate Monitoring

- Every **12 hours**, for each IP address listening on port 443, connects via TLS using the correct SNI hostname.
- Records the certificate's Subject CN, SANs, issuer, and expiry date.
- Any change triggers a notification:
  - Certificate is expiring within **7 days** (warning, repeated each check until renewed or expired).
  - Certificate CN, issuer, or SANs changed (replacement detected, reports old and new values).
  - TLS connection failure to a previously-reachable IP:443 (handshake error, timeout, connection refused after previously succeeding).
  - TLS recovery: a previously-failing IP:443 now completes a handshake again.

### Notifications

**Every observable state change produces a notification.** dnswatcher is designed as a real-time change feed: degradations, failures, recoveries, and routine changes are all reported equally.

Supported notification backends:

| Backend        | Configuration                              | Payload Format               |
|----------------|--------------------------------------------|------------------------------|
| **Slack**      | Incoming Webhook URL                       | Attachments with color       |
| **Mattermost** | Incoming Webhook URL                       | Slack-compatible attachments |
| **ntfy**       | Topic URL (e.g. `https://ntfy.sh/mytopic`) | Title + body + priority      |

All configured endpoints receive every notification. Notification content includes:

- **DNS record changes**: Which hostname, which nameserver, what record type, old values, new values.
- **DNS NS changes**: Which domain, which nameservers were added/removed.
- **NS query failures**: Which nameserver failed, error type (timeout, SERVFAIL, REFUSED, network error), which hostname/domain affected.
- **NS recoveries**: Which nameserver recovered, which hostname/domain.
- **NS inconsistencies**: Which nameservers disagree, what each one returned, which hostname affected.
- **Port changes**: Which IP:port, old state, new state, all associated hostnames.
- **TLS expiry warnings**: Which certificate, days remaining, CN, issuer, associated hostname and IP.
- **TLS certificate changes**: Old and new CN/issuer/SANs, associated hostname and IP.
- **TLS connection failures/recoveries**: Which IP:port, error details, associated hostname.

### State Management

- All monitoring state is kept in memory and persisted to a JSON file on disk (`DATA_DIR/state.json`).
- State is loaded on startup to resume monitoring without triggering false-positive change notifications.
- State is written atomically (write to temp file, then rename) to prevent corruption.

### Web Dashboard

dnswatcher includes an unauthenticated, read-only web dashboard at the root URL (`/`). It displays:

- **Summary counts** for monitored domains, hostnames, ports, and certificates.
- **Domains** with their discovered nameservers.
- **Hostnames** with per-nameserver DNS records and status.
- **Ports** with open/closed state and associated hostnames.
- **TLS certificates** with CN, issuer, expiry, and status.
- **Recent alerts** (last 100 notifications sent since the process started), displayed in reverse chronological order.

Every data point shows its age (e.g. "5m ago") so you can tell at a glance how fresh the information is. The page auto-refreshes every 30 seconds.

The dashboard intentionally does not expose any configuration details such as webhook URLs, notification endpoints, or API tokens.

All assets (CSS) are embedded in the binary and served from the application itself. The dashboard makes zero external HTTP requests: no CDN dependencies or third-party resources are loaded at runtime.

### HTTP API

dnswatcher exposes a lightweight HTTP API for operational visibility:

| Endpoint                       | Description                   |
|--------------------------------|-------------------------------|
| `GET /`                        | Web dashboard (HTML)          |
| `GET /s/...`                   | Static assets (embedded CSS)  |
| `GET /.well-known/healthcheck` | Health check (JSON)           |
| `GET /health`                  | Health check (JSON, legacy)   |
| `GET /api/v1/status`           | Current monitoring state      |
| `GET /metrics`                 | Prometheus metrics (optional) |

---

## Architecture

```
cmd/dnswatcher/main.go          Entry point (uber/fx bootstrap)
internal/
  config/config.go              Viper-based configuration
  globals/globals.go            Build-time variables (version)
  logger/logger.go              slog structured logging (TTY detection)
  healthcheck/healthcheck.go    Health check service
  middleware/middleware.go      HTTP middleware (logging, CORS, metrics auth)
  handlers/handlers.go          HTTP request handlers
  server/
    server.go                   HTTP server lifecycle
    routes.go                   Route definitions
  state/state.go                JSON file state persistence
  resolver/resolver.go          Iterative DNS resolution engine
  portcheck/portcheck.go        TCP port connectivity checker
  tlscheck/tlscheck.go          TLS certificate inspector
  notify/notify.go              Notification service (Slack, Mattermost, ntfy)
  watcher/watcher.go            Main monitoring orchestrator and scheduler
```

### Design Principles

- **No recursive resolvers**: All DNS resolution is performed iteratively, tracing from root nameservers through the delegation chain to authoritative servers.
- **No external database**: State is persisted as a single JSON file.
- **Dependency injection**: All components are wired via [uber/fx](https://github.com/uber-go/fx).
- **Structured logging**: All logs use `log/slog` with JSON output in production (TTY detection for development).
- **Graceful shutdown**: All background goroutines respect context cancellation and the fx lifecycle.

---

## Configuration

Configuration is loaded via [Viper](https://github.com/spf13/viper) with the following precedence (highest to lowest):

1. Environment variables (prefixed with `DNSWATCHER_`)
2. `.env` file (loaded via godotenv)
3. Config file: `/etc/dnswatcher/dnswatcher.yaml`, `~/.config/dnswatcher/dnswatcher.yaml`, or `./dnswatcher.yaml`
4. Defaults

### Environment Variables

| Variable                             | Description                                         | Default  |
|--------------------------------------|-----------------------------------------------------|----------|
| `PORT`                               | HTTP listen port                                    | `8080`   |
| `DNSWATCHER_DEBUG`                   | Enable debug logging                                | `false`  |
| `DNSWATCHER_DATA_DIR`                | Directory for state file                            | `./data` |
| `DNSWATCHER_TARGETS`                 | Comma-separated DNS names (auto-classified via PSL) | `""`     |
| `DNSWATCHER_SLACK_WEBHOOK`           | Slack incoming webhook URL                          | `""`     |
| `DNSWATCHER_MATTERMOST_WEBHOOK`      | Mattermost incoming webhook URL                     | `""`     |
| `DNSWATCHER_NTFY_TOPIC`              | ntfy topic URL                                      | `""`     |
| `DNSWATCHER_DNS_INTERVAL`            | DNS check interval                                  | `1h`     |
| `DNSWATCHER_TLS_INTERVAL`            | TLS check interval                                  | `12h`    |
| `DNSWATCHER_TLS_EXPIRY_WARNING`      | Days before expiry to warn                          | `7`      |
| `DNSWATCHER_SENTRY_DSN`              | Sentry DSN for error reporting                      | `""`     |
| `DNSWATCHER_MAINTENANCE_MODE`        | Enable maintenance mode                             | `false`  |
| `DNSWATCHER_METRICS_USERNAME`        | Basic auth username for /metrics                    | `""`     |
| `DNSWATCHER_METRICS_PASSWORD`        | Basic auth password for /metrics                    | `""`     |
| `DNSWATCHER_SEND_TEST_NOTIFICATION`  | Send a test notification after first scan completes | `false`  |

**`DNSWATCHER_TARGETS` is required.** dnswatcher will refuse to start if no monitoring targets are configured. A monitoring daemon with nothing to monitor is a misconfiguration, so dnswatcher fails fast with a clear error message rather than running silently. Set `DNSWATCHER_TARGETS` to a comma-separated list of DNS names before starting.

### Example `.env`

```sh
PORT=8080
DNSWATCHER_DEBUG=false
DNSWATCHER_DATA_DIR=./data
DNSWATCHER_TARGETS=example.com,example.org,www.example.com,api.example.com,mail.example.org
DNSWATCHER_SLACK_WEBHOOK=https://hooks.slack.com/services/T.../B.../xxx
DNSWATCHER_MATTERMOST_WEBHOOK=https://mattermost.example.com/hooks/xxx
DNSWATCHER_NTFY_TOPIC=https://ntfy.sh/my-dns-alerts
DNSWATCHER_SEND_TEST_NOTIFICATION=true
```

---

## DNS Resolution Strategy

dnswatcher never uses the system's configured recursive resolver. Instead, it performs full iterative resolution:

1. **Root servers**: Starts from the IANA root nameserver list (hardcoded, with periodic refresh).
2. **TLD delegation**: Queries root servers for the TLD NS records.
3. **Domain delegation**: Queries TLD nameservers for the domain's NS records.
4. **Authoritative query**: Queries all discovered authoritative nameservers directly for the requested records.

This approach ensures:

- Independence from any upstream resolver's cache or filtering.
- Ability to detect split-horizon or inconsistent responses across authoritative servers.
- Visibility into the full delegation chain.

For hostname monitoring, the resolver follows CNAME chains (with a depth limit to prevent loops) before collecting terminal A/AAAA records.

---

## State File Format

The state file (`DATA_DIR/state.json`) contains the complete monitoring snapshot. Hostname records are stored **per authoritative nameserver**, not as a merged view, to enable inconsistency detection.

```json
{
  "version": 1,
  "lastUpdated": "2026-02-19T12:00:00Z",
  "domains": {
    "example.com": {
      "nameservers": ["ns1.example.com.", "ns2.example.com."],
      "lastChecked": "2026-02-19T12:00:00Z"
    }
  },
  "hostnames": {
    "www.example.com": {
      "recordsByNameserver": {
        "ns1.example.com.": {
          "records": {
            "A": ["93.184.216.34"],
            "AAAA": ["2606:2800:220:1:248:1893:25c8:1946"]
          },
          "status": "ok",
          "lastChecked": "2026-02-19T12:00:00Z"
        },
        "ns2.example.com.": {
          "records": {
            "A": ["93.184.216.34"],
            "AAAA": ["2606:2800:220:1:248:1893:25c8:1946"]
          },
          "status": "ok",
          "lastChecked": "2026-02-19T12:00:00Z"
        }
      },
      "lastChecked": "2026-02-19T12:00:00Z"
    }
  },
  "ports": {
    "93.184.216.34:80": {
      "open": true,
      "hostnames": ["www.example.com"],
      "lastChecked": "2026-02-19T12:00:00Z"
    },
    "93.184.216.34:443": {
      "open": true,
      "hostnames": ["www.example.com"],
      "lastChecked": "2026-02-19T12:00:00Z"
    }
  },
  "certificates": {
    "93.184.216.34:443:www.example.com": {
      "commonName": "www.example.com",
      "issuer": "DigiCert TLS RSA SHA256 2020 CA1",
      "notAfter": "2027-01-15T23:59:59Z",
      "subjectAlternativeNames": ["www.example.com"],
      "status": "ok",
      "lastChecked": "2026-02-19T06:00:00Z"
    }
  }
}
```

The `status` field for each per-nameserver entry and certificate entry tracks reachability:

| Status  | Meaning                                         |
|---------|-------------------------------------------------|
| `ok`    | Query succeeded, records are current            |
| `error` | Query failed (timeout, SERVFAIL, network error) |

---

## Building

```sh
make build    # Build binary to bin/dnswatcher
make test     # Run tests with race detector
make lint     # Run golangci-lint
make fmt      # Format code
make check    # Run all checks (format, lint, test, build)
make clean    # Remove build artifacts
```

### Build-Time Variables

Version is injected via `-ldflags`:

```sh
go build -ldflags "-X main.Version=$(git describe --tags --always)" ./cmd/dnswatcher
```

---

## Docker

```sh
docker build -t dnswatcher .
docker run -d \
  -p 8080:8080 \
  -v dnswatcher-data:/var/lib/dnswatcher \
  -e DNSWATCHER_TARGETS=example.com,www.example.com \
  -e DNSWATCHER_NTFY_TOPIC=https://ntfy.sh/my-alerts \
  -e DNSWATCHER_SEND_TEST_NOTIFICATION=true \
  dnswatcher
```

---

## Monitoring Lifecycle

1. **Startup**: Load state from disk. If no state file exists, start with empty state (first check will establish baseline without triggering change notifications).
2. **Initial check**: Immediately perform all DNS, port, and TLS checks on startup.
3. **Periodic checks** (DNS always runs first):
   - DNS checks: every `DNSWATCHER_DNS_INTERVAL` (default 1h). Also re-run before every TLS check cycle to ensure fresh IPs.
   - Port checks: every `DNSWATCHER_DNS_INTERVAL`, after DNS completes.
   - TLS checks: every `DNSWATCHER_TLS_INTERVAL` (default 12h), after DNS completes.
   - Port and TLS checks always use freshly resolved IP addresses from the DNS phase that immediately precedes them, never stale IPs from a previous cycle.
4. **On change detection**: Send notifications to all configured endpoints, update in-memory state, persist to disk.
5. **Shutdown**: Persist final state to disk, complete in-flight notifications, stop gracefully.

---

## Planned Future Features (Post-1.0)

- **DNSSEC validation**: Validate the DNSSEC chain of trust during iterative resolution and report DNSSEC failures as notifications.

---

## Project Structure

Follows the conventions defined in `REPO_POLICIES.md`, adapted from the [upaas](https://git.eeqj.de/sneak/upaas) project template. Uses uber/fx for dependency injection, go-chi for HTTP routing, slog for logging, and Viper for configuration.

---

## License

License has not yet been chosen for this project. Pending decision by the author (MIT, GPL, or WTFPL).

## Author

[@sneak](https://sneak.berlin)