PASSIVE checkpoints may not fully checkpoint when writers are active.
TRUNCATE mode blocks writers briefly but ensures complete checkpointing,
keeping the WAL small for consistent read performance under heavy load.
Under heavy write load, 30 seconds is too long between checkpoints,
causing the WAL to grow and slow down read queries. More aggressive
checkpointing keeps the WAL small and maintains read performance.
Explicitly checkpoint the WAL on database initialization to consolidate
any large WAL file from previous runs. Log the checkpoint results to
help diagnose issues.
- Log slow requests (>1s) at WARNING level with slow=true flag
- Log request timeouts at WARNING level in TimeoutMiddleware
- Replace heavy GetStatsContext with lightweight Ping in health check
- Add Ping method to database interface (SELECT 1)
The WAL file was growing to 700MB+ which caused COUNT(*) queries to
timeout. Reads must scan the WAL to find current page versions, and
a large WAL makes this slow.
Add Checkpoint method to database interface and run PASSIVE checkpoints
every 30 seconds via the DBMaintainer. This keeps the WAL small and
maintains fast read performance under heavy write load.
Use runuser to drop privileges and execute the app as the routewatch
user (uid 1000). Fix data directory permissions at runtime since host
mounts may have incorrect ownership.
Display oldest and newest route timestamps in the Routing Table card
on the /status page. Timestamps are shown as relative times (e.g.,
"5m ago", "2h 30m ago").
Changes:
- Add OldestRoute and NewestRoute fields to database.Stats
- Query MIN/MAX of last_updated across live_routes_v4 and v6 tables
- Include timestamps in both /status.json and /api/v1/stats responses
- Add formatRelativeTime JavaScript function for display
- Create new home page (/) with overview stats, ASN lookup,
AS name search, and IP address lookup with JSON display
- Add responsive navbar to all pages with app branding
- Navbar shows "routewatch by @sneak" with link to author
- Status page accessible via navbar link
- Remove redirect from / to /status, serve home page instead
Change route from wildcard /prefix/* to explicit /prefix/{prefix}/{len}
to properly handle URL-encoded IPv6 addresses with CIDR notation.
- Separate prefix and length into individual path parameters
- Add prefixURL template function for generating correct links
- Remove url.QueryUnescape from handlers (chi handles decoding)
- Add GC statistics (run count, total/last pause, heap usage)
- Add BGP peer count tracking from RIS Live OPEN/NOTIFICATION messages
- Add route churn rate metric (announcements + withdrawals per second)
- Add announcement and withdrawal counters
- Add footer with attribution, license, and git revision
- Embed git revision at build time via ldflags
- Update HTML template to display all new metrics
- Track reconnection count in metrics tracker
- Display connection duration under Stream Statistics
- Display reconnect count since app startup
- Update both JSON API and HTML status page
- Replace chi's Logger middleware with structured slog-based logging
- Log request start (debug) and completion (info/warn/error by status)
- Include method, path, status, duration_ms, remote_addr in logs
- Increase request timeout from 8s to 30s for slow queries
- Add read/write/idle timeouts to HTTP server config
- Better server startup logging to confirm listening state
- Builder stage: vendor dependencies, build binary, create source archive
- Source archive (.tar.zst) includes all code and vendored dependencies
- Runtime stage: minimal Debian image with binary and source archive
- Health check via curl to /.well-known/healthcheck.json
- Runs as non-root user (routewatch:1000)
- Use PRAGMA incremental_vacuum instead of full VACUUM
- Frees ~1000 pages (~4MB) per run without blocking writes
- Run every 10 minutes instead of 6 hours since it's lightweight
- Set auto_vacuum=INCREMENTAL pragma for new databases
- Remove blocking VACUUM on startup
- Add health check endpoint at /.well-known/healthcheck.json that
verifies database and RIS Live connectivity, returns 200/503
- Fix panic in streamer when encountering unknown RIS message types
by logging a warning and continuing instead of crashing
- Add DBMaintainer for periodic database maintenance:
- VACUUM every 6 hours to reclaim space
- ANALYZE every hour to update query statistics
- Graceful shutdown support
- Add Vacuum() and Analyze() methods to database interface
- Add WHOIS Fetcher card showing fresh/stale/never-fetched ASN counts
- Display hourly success/error counts and current fetch interval
- Increase max WHOIS rate to 1/sec (down from 10 sec minimum)
- Select random stale ASN instead of oldest for better distribution
- Add index on whois_updated_at for query performance
- Track success/error timestamps for hourly stats
- Add GetWHOISStats database method for freshness statistics
- Always return consistent JSON structure with query and results array
- Add PTR field to IPInfo for reverse DNS records
- Support comma-separated IPs and hostnames in single query
- Do PTR lookup for all IPs (direct, resolved from hostname, or listed)
- Remove trailing dots from PTR records
- Accept hostnames in addition to IP addresses for /ip endpoints
- Resolve A and AAAA records for hostnames
- Return list of results with info for each resolved IP
- Include hostname in response when resolving hostnames
- Report per-IP errors while still returning successful lookups
- Reduce base interval from 60s to 15s for faster initial fetching
- Add exponential backoff on failure (up to 5 minute max interval)
- Decrease interval on success (down to 10 second minimum)
- Add mutex to prevent concurrent WHOIS fetches
- Track consecutive failures for backoff calculation
- Add /ip and /ip/{addr} JSON endpoints returning comprehensive IP info
- Include ASN, netblock, country code, org name, abuse contact, RIR data
- Extend ASN schema with WHOIS fields (country, org, abuse contact, etc)
- Create background WHOIS fetcher for rate-limited ASN info updates
- Store raw WHOIS responses for debugging and data preservation
- Queue on-demand WHOIS lookups when stale data is requested
- Refactor handleIPInfo to serve all IP endpoints consistently
The stream stats were showing decompressed data sizes, not actual wire
bandwidth. This change adds wire byte tracking by disabling automatic
gzip decompression in the HTTP client and wrapping the response body
with a counting reader before decompression. Both wire (compressed) and
decompressed bytes are now tracked and exposed in the API responses.
Add comprehensive godoc comments to all exported types, functions,
and constants throughout the codebase. Create README.md documenting
the project architecture, execution flow, database schema, and
component relationships.
Enhance documentation comments for constants, types, and exported methods
in peeringhandler.go to follow Go documentation conventions. The improved
comments provide more context about the purpose and behavior of each item.
Expand the documentation comment for CLIEntry to provide more context
about what the function does, including its use of the fx dependency
injection framework, signal handling, and blocking behavior.
The prefix length links for IPv6 prefixes were incorrectly pointing to
/prefixlength/<len> which would show IPv4 prefixes. Added a new route
/prefixlength6/<len> specifically for IPv6 prefixes and updated the
template to use the correct URL based on whether displaying IPv4 or IPv6
prefix distributions.
Also refactored handlePrefixLength to explicitly handle only IPv4 prefixes
and created handlePrefixLength6 for IPv6 prefixes, removing the ambiguous
auto-detection based on mask length value.
- HTTP request timeout: 2s -> 8s
- Stats collection context timeout: 1s -> 4s
- HTTP read header timeout: 10s -> 40s
This should prevent timeout errors when the database is under load
or when complex queries take longer than expected (slow query
threshold is 50ms).
- Created JSONValidationMiddleware that validates all JSON responses
- Ensures that even on timeout or internal errors, a valid JSON error response is returned
- Applied to all API endpoints including /status.json
- Prevents client-side JSON parse errors when server encounters issues
- Added GetASPeers method to database to fetch all peering relationships
- Updated AS detail handler to fetch and pass peers to template
- Added peers section to AS detail page showing all peer ASNs with their info
- Added peer count to the info cards at the top of the page
- Shows handle, description, and first/last seen dates for each peer
Updated templates to use the new field names after renaming:
- ASN.Number -> ASN.ASN in as_detail.html
- Fixed references to ASN field in prefix_detail.html for ASNInfo and ASPathEntry structs
- Track the maximum queue length seen for each handler
- Display high water marks on the status page with percentage
- Helps identify which handlers are experiencing queue pressure
The 2 minute interval was causing a noticeable delay before peerings
appeared in the database. Reducing to 30 seconds provides a better
user experience while still maintaining efficient batch processing.
- Remove UUID primary keys from ASNs table, use ASN number as primary key
- Update announcements table to reference ASN numbers directly
- Rename asns.number column to asns.asn for consistency
- Add prefix tracking to PrefixHandler to populate prefixes_v4/v6 tables
- Add UpdatePrefixesBatch method for efficient batch updates
- Update all database methods and models to use new schema
- Fix all references in code to use ASN field instead of Number
- Update test mocks to match new interfaces
The prefixes_v4 and prefixes_v6 tables were never being populated
because GetOrCreatePrefix was not being called anywhere. Since we
already track all prefixes in live_routes_v4 and live_routes_v6,
update stats queries to count distinct prefixes from those tables.
- Create separate tables for IPv4 and IPv6 prefixes in schema.sql
- Update indexes for new prefix tables
- Update getOrCreatePrefix to use appropriate table based on IP version
- Update GetStatsContext to count prefixes from both tables
- Remove ip_version column since it's implicit in the table name
- Update status page JavaScript to reset all fields to '-' on error
- Fix status page to not show 'Connected' when API returns error
- Update remaining database methods to use new live_routes_v4/v6 tables
- Fix GetStatsContext to count routes from both IPv4 and IPv6 tables
- Fix UpsertLiveRoute to insert into correct table based on IP version
- Fix DeleteLiveRoute to determine table from prefix IP version
- Create live_routes_v4 and live_routes_v6 tables
- Update all database methods to use appropriate table
- Add IP version detection in database queries
- Remove filtering by ip_version column for better performance
- Fix route count queries that were timing out
- Update PrefixHandler to include IP version in deletions