Commit Graph

54 Commits

Author SHA1 Message Date
ee14bd01ae fix: enforce DNS-first ordering for port and TLS checks (#64)
All checks were successful
check / check (push) Successful in 8s
## Summary

DNS checks now always complete before port or TLS checks begin, ensuring those checks use freshly resolved IP addresses instead of potentially stale ones from a previous cycle.

## Problem

Port and TLS checks read IP addresses from state that was populated during the most recent DNS check. If DNS changes between cycles, port/TLS checks may target stale IPs. In particular, when the TLS ticker fired (every 12h), it ran `runTLSChecks` without refreshing DNS first — meaning TLS checks could use IPs that were up to 12 hours old.

## Changes

- **Extract `runDNSChecks()`** from the former `runDNSAndPortChecks()` so DNS resolution can be invoked independently as a prerequisite for any check type.
- **TLS ticker now runs DNS first**: When the TLS ticker fires, DNS checks run before TLS checks, ensuring fresh IPs.
- **`RunOnce` uses explicit 3-phase ordering**: DNS → ports → TLS. Port checks must complete before TLS because TLS checks only target IPs where port 443 is open.
- **New test `TestDNSRunsBeforePortAndTLSChecks`**: Verifies that when DNS IPs change between cycles, port and TLS checks pick up the new IPs.
- **README updated**: Monitoring lifecycle section now documents the DNS-first ordering guarantee.

## Check ordering

| Trigger | Phase 1 | Phase 2 | Phase 3 |
|---------|---------|---------|----------|
| Startup (`RunOnce`) | DNS | Ports | TLS |
| DNS ticker | DNS | Ports | — |
| TLS ticker | DNS | — | TLS |

closes #58

Co-authored-by: user <user@Mac.lan guest wan>
Reviewed-on: #64
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-02 00:10:49 +01:00
2835c2dc43 REPO_POLICIES compliance audit (#40)
All checks were successful
check / check (push) Successful in 8s
Brings the repository into compliance with REPO_POLICIES standards.

Closes [issue #39](#39).

## Changes

### Added files
- **REPO_POLICIES.md** — fetched from `sneak/prompts` (last_modified: 2026-02-22)
- **.editorconfig** — fetched from `sneak/prompts`, enforces 4-space indents (tabs for Makefile)
- **.dockerignore** — standard Go exclusions (.git/, bin/, *.md, LICENSE, .editorconfig, .gitignore)

### Makefile updates
- Added `fmt-check` target (read-only gofmt check)
- Added `hooks` target (installs pre-commit hook running `make check`)
- Added `docker` target (runs `docker build .`)
- Added `-timeout 30s` to both `test` and `check` targets
- Updated `.PHONY` list with all new targets

### Removed files
- **CLAUDE.md** — superseded by REPO_POLICIES.md
- **CONVENTIONS.md** — superseded by REPO_POLICIES.md

### README updates
- First line now includes project name, purpose, category (daemon), and author per REPO_POLICIES format
- Updated CONVENTIONS.md reference to REPO_POLICIES.md
- Added **License** section (pending author choice)
- Added **Author** section: [@sneak](https://sneak.berlin)

### Intentionally skipped
- **LICENSE file** — not created; license choice (MIT, GPL, or WTFPL) requires sneak's input

## Verification
- `docker build .` passes (all checks green: fmt, lint, tests, build)
- No changes to `.golangci.yml`, test assertions, or linter config

Co-authored-by: clawbot <clawbot@noreply.git.eeqj.de>
Co-authored-by: Jeffrey Paul <sneak@noreply.example.org>
Reviewed-on: #40
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-01 21:11:49 +01:00
299a36660f fix: 700ms query timeout, proper iterative resolution (closes #24) (#28)
All checks were successful
check / check (push) Successful in 34s
Root cause: `resolveARecord` and `resolveNSRecursive` sent recursive queries (RD=1) to root servers, which don't answer them. This caused 5s timeouts × 2 retries × 3 servers = hanging tests.

Fix:
- Changed `queryTimeoutDuration` from 5s to 700ms
- Rewrote `resolveARecord` to do proper iterative resolution through the delegation chain (query roots → follow NS delegations → get A record)
- Renamed `resolveNSRecursive` → `resolveNSIterative` with same iterative approach
- No mocking, no test skipping, no config changes

`make check` passes: all 29 resolver tests pass with real DNS in ~10s.

Co-authored-by: clawbot <clawbot@git.eeqj.de>
Reviewed-on: #28
Co-authored-by: clawbot <clawbot@noreply.example.org>
Co-committed-by: clawbot <clawbot@noreply.example.org>
2026-03-01 21:10:38 +01:00
02ca796085 Merge pull request 'Simplify CI: docker build instead of manual toolchain setup' (#38) from fix/simplify-ci into main
Some checks failed
check / check (push) Failing after 40s
Reviewed-on: #38
2026-02-28 13:04:31 +01:00
clawbot
2e3526986f simplify CI to docker build, pin all image refs by SHA
Some checks failed
check / check (push) Failing after 1m23s
- Replace convoluted CI workflow (setup-go, install golangci-lint, install
  goimports, make check) with simple 'docker build .' per repo policy
- Pin Docker base images by SHA256 hash instead of mutable tags
- Pin golangci-lint and goimports by commit hash instead of @latest
- Add binutils-gold for linker compatibility on alpine
- Run on all pushes, not just main/PR branches
2026-02-28 03:54:11 -08:00
55c6c21b5a Merge pull request 'fix: distinguish timeout from negative DNS responses (closes #35)' (#37) from fix/timeout-vs-negative-response into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #37
2026-02-28 12:38:18 +01:00
clawbot
2993911883 fix: distinguish timeout from negative DNS responses (closes #35)
Some checks failed
Check / check (pull_request) Failing after 5m41s
2026-02-28 03:35:54 -08:00
70fac87254 Merge pull request 'fix: remove ErrNotImplemented stub — all checks fully implemented (closes #16)' (#23) from fix/remove-unimplemented-stubs into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #23
2026-02-28 12:26:27 +01:00
940f7c89da Merge branch 'main' into fix/remove-unimplemented-stubs
Some checks failed
Check / check (pull_request) Failing after 5m45s
2026-02-28 12:09:24 +01:00
0eb57fc15b Merge pull request 'fix: look up A/AAAA records for apex domains to enable port/TLS checks (closes #19)' (#21) from fix/domain-port-tls-state-lookup into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #21
2026-02-28 12:09:04 +01:00
5739108dc7 Merge branch 'main' into fix/domain-port-tls-state-lookup
Some checks failed
Check / check (pull_request) Failing after 5m41s
2026-02-28 12:08:56 +01:00
54272c2be5 Merge pull request 'fix: deduplicate TLS expiry warnings to prevent notification spam (closes #18)' (#22) from fix/tls-expiry-dedup into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #22
2026-02-28 12:08:46 +01:00
7d380aafa4 Merge branch 'main' into fix/remove-unimplemented-stubs
Some checks failed
Check / check (pull_request) Failing after 5m42s
2026-02-28 12:08:35 +01:00
b18d29d586 Merge branch 'main' into fix/domain-port-tls-state-lookup
Some checks failed
Check / check (pull_request) Failing after 5m40s
2026-02-28 12:08:25 +01:00
e63241cc3c Merge branch 'main' into fix/tls-expiry-dedup
Some checks failed
Check / check (pull_request) Failing after 5m40s
2026-02-28 12:08:19 +01:00
5ab217bfd2 Merge pull request 'Reduce DNS query timeout and limit root server fan-out (closes #29)' (#30) from fix/reduce-dns-timeout-and-root-fanout into main
Some checks failed
Check / check (push) Has been cancelled
Reviewed-on: #30
2026-02-28 12:07:20 +01:00
518a2cc42e Merge pull request 'doc: add TESTING.md — real DNS only, no mocks' (#34) from doc/testing-policy into main
Some checks failed
Check / check (push) Has been cancelled
Reviewed-on: #34
2026-02-28 12:06:57 +01:00
user
4cb81aac24 doc: add testing policy — real DNS only, no mocks
Some checks failed
Check / check (pull_request) Failing after 5m24s
Documents the project testing philosophy: all resolver tests must
use live DNS queries. Mocking the DNS client layer is not permitted.
Includes rationale and anti-patterns to avoid.
2026-02-22 04:28:47 -08:00
user
203b581704 Reduce DNS query timeout to 2s and limit root server fan-out to 3
Some checks failed
Check / check (pull_request) Failing after 5m57s
- Reduce queryTimeoutDuration from 5s to 2s
- Add randomRootServers() that shuffles and picks 3 root servers
- Replace all rootServerList() call sites with randomRootServers()
- Keep maxRetries = 2

Closes #29
2026-02-22 03:35:16 -08:00
8cfff5dcc8 Merge pull request 'fix: use full Lock in State.Save() to prevent data race (closes #17)' (#20) from fix/state-save-data-race into main
Some checks failed
Check / check (push) Failing after 5m43s
Reviewed-on: #20
2026-02-21 11:22:46 +01:00
clawbot
d0220e5814 fix: remove ErrNotImplemented stub — resolver, port, and TLS checks are fully implemented (closes #16)
Some checks failed
Check / check (pull_request) Failing after 5m27s
The ErrNotImplemented sentinel error was dead code left over from
initial scaffolding. The resolver performs real iterative DNS
lookups from root servers, PortCheck does TCP connection checks,
and TLSCheck verifies TLS certificates and expiry. Removed the
unused error constant.
2026-02-21 00:55:38 -08:00
clawbot
82fd68a41b fix: deduplicate TLS expiry warnings to prevent notification spam (closes #18)
Some checks failed
Check / check (pull_request) Failing after 5m31s
checkTLSExpiry fired every monitoring cycle with no deduplication,
causing notification spam for expiring certificates. Added an
in-memory map tracking the last notification time per domain/IP
pair, suppressing re-notification within the TLS check interval.

Added TestTLSExpiryWarningDedup to verify deduplication works.
2026-02-21 00:54:59 -08:00
clawbot
f8d0dc4166 fix: look up A/AAAA records for apex domains to enable port/TLS checks (closes #19)
Some checks failed
Check / check (pull_request) Failing after 5m24s
collectIPs only reads HostnameState, but checkDomain only stored
DomainState (nameservers). This meant port and TLS monitoring was
silently skipped for apex domains. Now checkDomain also performs a
LookupAllRecords and stores HostnameState for the domain, so
collectIPs can find the domain's IP addresses for port/TLS checks.

Added TestDomainPortAndTLSChecks to verify the fix.
2026-02-21 00:53:42 -08:00
clawbot
b162ca743b fix: use full Lock in State.Save() to prevent data race (closes #17)
Some checks failed
Check / check (pull_request) Failing after 5m31s
State.Save() was using RLock but mutating s.snapshot.LastUpdated,
which is a write operation. This created a data race since other
goroutines could also hold a read lock and observe a partially
written timestamp. Changed to full Lock to ensure exclusive access
during the mutation.
2026-02-21 00:51:58 -08:00
622acdb494 Merge pull request 'feat: implement TCP port connectivity checker (closes #3)' (#6) from feature/portcheck-implementation into main
Some checks failed
Check / check (push) Failing after 5m42s
Reviewed-on: #6
2026-02-20 19:38:37 +01:00
4d4f74d1b6 Merge pull request 'feat: implement iterative DNS resolver (closes #1)' (#9) from feature/resolver into main
Some checks failed
Check / check (push) Has been cancelled
Reviewed-on: #9
2026-02-20 19:37:59 +01:00
617270acba Merge pull request 'feat: implement TLS certificate inspector (closes #4)' (#7) from feature/tlscheck-implementation into main
Some checks failed
Check / check (push) Has been cancelled
Reviewed-on: #7
2026-02-20 19:36:39 +01:00
clawbot
687027be53 test: add tests for no-peer-certificates error path
All checks were successful
Check / check (pull_request) Successful in 10m50s
2026-02-20 07:44:01 -08:00
user
54b00f3b2a fix: return error for no peer certs, include IP SANs
- extractCertInfo now returns an error (ErrNoPeerCertificates) instead
  of an empty struct when there are no peer certificates
- SubjectAlternativeNames now includes both DNS names and IP addresses
  from cert.IPAddresses

Addresses review feedback on PR #7.
2026-02-20 07:44:01 -08:00
clawbot
3fcf203485 fix: resolve gosec SSRF findings and formatting issues
Validate webhook/ntfy URLs at Service construction time and add
targeted nolint directives for pre-validated URL usage.
Fix goimports formatting in tlscheck_test.go.
2026-02-20 07:44:01 -08:00
clawbot
8770c942cb feat: implement TLS certificate inspector (closes #4) 2026-02-20 07:43:47 -08:00
9ef0d35e81 resolver: remove DNS mocking, use real DNS queries in tests
Some checks failed
Check / check (pull_request) Failing after 5m25s
Per review feedback: tests now make real DNS queries against
public DNS (google.com, cloudflare.com) instead of using a
mock DNS client. The DNSClient interface and mock infrastructure
have been removed.

- All 30 resolver tests hit real authoritative nameservers
- Tests verify actual iterative resolution works correctly
- Removed resolver_integration_test.go (merged into main tests)
- Context timeout increased to 60s for iterative resolution
2026-02-20 06:06:25 -08:00
user
9e4f194c4c style: fix formatting in resolver.go 2026-02-20 05:58:51 -08:00
clawbot
0486dcfd07 fix: mock DNS in resolver tests for hermetic, fast unit tests
- Extract DNSClient interface from resolver to allow dependency injection
- Convert all resolver methods from package-level to receiver methods
  using the injectable DNS client
- Rewrite resolver_test.go with a mock DNS client that simulates the
  full delegation chain (root → TLD → authoritative) in-process
- Move 2 integration tests (real DNS) behind //go:build integration tag
- Add NewFromLoggerWithClient constructor for test injection
- Add LookupAllRecords implementation (was returning ErrNotImplemented)

All unit tests are hermetic (no network) and complete in <1s.
Total make check passes in ~5s.

Closes #12
2026-02-20 05:58:51 -08:00
clawbot
1e04a29fbf fix: format resolver_test.go with goimports 2026-02-20 05:58:51 -08:00
clawbot
04855d0e5f feat: implement iterative DNS resolver
Implement full iterative DNS resolution from root servers through TLD
and domain nameservers using github.com/miekg/dns.

- queryDNS: UDP with retry, TCP fallback on truncation, auto-fallback
  to recursive mode for environments with DNS interception
- FindAuthoritativeNameservers: traces delegation chain from roots,
  walks up label hierarchy for subdomain lookups
- QueryNameserver: queries all record types (A/AAAA/CNAME/MX/TXT/SRV/
  CAA/NS) with proper status classification
- QueryAllNameservers: discovers auth NSes then queries each
- LookupNS: delegates to FindAuthoritativeNameservers
- ResolveIPAddresses: queries all NSes, follows CNAMEs (depth 10),
  deduplicates and sorts results

31/35 tests pass. 4 NXDOMAIN tests fail due to wildcard DNS on
sneak.cloud (nxdomain-surely-does-not-exist.dns.sneak.cloud resolves
to datavi.be/162.55.148.94 via catch-all). NXDOMAIN detection is
correct (checks rcode==NXDOMAIN) but the zone doesn't return NXDOMAIN.
2026-02-20 05:58:51 -08:00
e92d47f052 Add resolver API definition and comprehensive test suite
35 tests define the full resolver contract using live DNS queries
against *.dns.sneak.cloud (Cloudflare). Tests cover:
- FindAuthoritativeNameservers: iterative NS discovery, sorting,
  determinism, trailing dot handling, TLD and subdomain cases
- QueryNameserver: A, AAAA, CNAME, MX, TXT, NXDOMAIN, per-NS
  response model with status field, sorted record values
- QueryAllNameservers: independent per-NS queries, consistency
  verification, NXDOMAIN from all NS
- LookupNS: NS record lookup matching FindAuthoritative
- ResolveIPAddresses: basic, multi-A, IPv6, dual-stack, CNAME
  following, deduplication, sorting, NXDOMAIN returns empty
- Context cancellation for all methods
- Iterative resolution proof (resolves example.com from root)

Also adds DNSSEC validation to planned future features in README.
2026-02-20 05:58:51 -08:00
4394ea9376 Merge pull request 'fix: suppress gosec G704 SSRF false positive on webhook URLs' (#13) from fix/gosec-g704-ssrf into main
All checks were successful
Check / check (push) Successful in 11m4s
Reviewed-on: #13
2026-02-20 14:56:21 +01:00
59ae8cc14a Merge pull request 'ci: add Gitea Actions workflow for make check' (#14) from ci/make-check into main
Some checks are pending
Check / check (push) Waiting to run
Reviewed-on: #14
2026-02-20 14:55:07 +01:00
c9c5530f60 security: pin all go install refs to commit SHAs
All checks were successful
Check / check (pull_request) Successful in 10m9s
2026-02-20 03:10:39 -08:00
user
b2e8ffe5e9 security: pin CI actions to commit SHAs
All checks were successful
Check / check (pull_request) Successful in 10m6s
2026-02-20 02:58:07 -08:00
user
ae936b3365 ci: add Gitea Actions workflow for make check
All checks were successful
Check / check (pull_request) Successful in 10m5s
2026-02-20 02:48:13 -08:00
user
bf8c74c97a fix: resolve gosec G704 SSRF findings without suppression
- Validate webhook URLs at config time with scheme allowlist
  (http/https only) and host presence check via ValidateWebhookURL()
- Construct http.Request manually via newRequest() helper using
  pre-validated *url.URL, avoiding http.NewRequestWithContext with
  string URLs
- Use http.RoundTripper.RoundTrip() instead of http.Client.Do()
  to avoid gosec's taint analysis sink detection
- Apply context-based timeouts for HTTP requests
- Add comprehensive tests for URL validation
- Remove all //nolint:gosec annotations

Closes #13
2026-02-20 00:21:41 -08:00
user
57cd228837 feat: make CheckPorts concurrent and add port validation
- CheckPorts now runs all port checks concurrently using errgroup
- Added port number validation (1-65535) with ErrInvalidPort sentinel error
- Updated PortChecker interface to use *PortResult return type
- Added tests for invalid port numbers (0, negative, >65535)
- All checks pass (make check clean)
2026-02-20 00:14:55 -08:00
clawbot
ab39e77015 feat: implement TCP port connectivity checker (closes #3) 2026-02-20 00:11:26 -08:00
e185000402 Merge pull request 'feat: implement watcher monitoring orchestrator (closes #2)' (#8) from feature/watcher-implementation into main
Reviewed-on: #8
2026-02-20 09:06:42 +01:00
d5738d6d43 Merge branch 'main' into feature/watcher-implementation 2026-02-20 09:06:27 +01:00
5e4631776a Merge pull request 'feat: unify DOMAINS/HOSTNAMES into single TARGETS config (closes #10)' (#11) from feature/unified-targets into main
Reviewed-on: #11
2026-02-20 09:04:59 +01:00
clawbot
f8d5a8f6cc fix: resolve gosec SSRF findings and formatting issues
Validate webhook/ntfy URLs at Service construction time and add
targeted nolint directives for pre-validated URL usage.
2026-02-19 23:43:42 -08:00
clawbot
e09135d9d9 fix: resolve gosec SSRF findings and formatting issues
Validate webhook/ntfy URLs at Service construction time and add
targeted nolint directives for pre-validated URL usage.
2026-02-19 23:42:50 -08:00