Commit Graph

62 Commits

Author SHA1 Message Date
9a63553f8d Add index to optimize COUNT(DISTINCT prefix) queries
- Add compound index on (ip_version, mask_length, prefix) to speed up prefix distribution queries
- This index will significantly improve performance of COUNT(DISTINCT prefix) operations
- Note: Existing databases will need to manually create this index or recreate the database
2025-07-28 19:01:45 +02:00
ba13c76c53 Fix prefix distribution bug and add prefix length pages
- Fix GetPrefixDistribution to count unique prefixes using COUNT(DISTINCT prefix) instead of COUNT(*)
- Add /prefixlength/<length> route showing random sample of 500 prefixes
- Make prefix counts on status page clickable links to prefix length pages
- Add GetRandomPrefixesByLength database method
- Create prefix_length.html template with sortable table
- Show prefix age and origin AS with descriptions
2025-07-28 18:42:38 +02:00
1dcde74a90 Update AS path display to show handles with clickable links
- Change AS path from descriptions to handles (short names)
- Make each AS in the path a clickable link to /as/<asn>
- Add font-weight to AS links in path for better visibility
- Prevent word wrapping on all table columns except AS path
- Remove unused maxASDescriptionLength constant
2025-07-28 18:31:35 +02:00
81267431f7 Increase batch sizes to improve write throughput
- Increase prefix batch size from 5K to 20K
- Increase ASN batch size from 10K to 30K
- Add comments warning not to reduce batch timeouts
- Add comments warning not to increase queue sizes above 100K
- Maintains existing batch timeouts for efficiency
2025-07-28 18:27:42 +02:00
dc3ceb8d94 Show AS descriptions in AS path on prefix detail page
- Display AS descriptions alongside AS numbers in format: Description (ASN)
- Truncate descriptions longer than 20 characters with ellipsis
- Increase container max width to 1600px for better display
- Enable word wrapping for AS path cells to handle long paths
- Update mobile responsive styles for AS path display
2025-07-28 18:25:26 +02:00
9ef2a22db3 Remove SQLite pragmas that set default values
- Remove page_size, wal_autocheckpoint, locking_mode, mmap_size
- Keep only pragmas that change behavior from defaults
- Increase cache size to 3GB (upper limit for 2.4GB database)
2025-07-28 18:12:25 +02:00
05805b8847 Optimize SQLite settings for better balance
- Reduce cache size from 8GB to 512MB (still plenty for 2.4GB DB)
- Reduce mmap_size from 10GB to 256MB (reasonable default)
- Use default page size (4KB) instead of 8KB
- Use default WAL checkpoint interval (1000 pages)
- Remove redundant pragmas (threads, cache_spill, read_uncommitted)
- Clean up connection string to only use _txlock parameter
- Keep synchronous=OFF for performance (since we have mutex protection)
2025-07-28 18:06:31 +02:00
ddb3cfa4f0 Add mutex to serialize database access
- Add internal mutex to Database struct with lock/unlock wrappers
- Add debug logging for lock acquisition and release with timing
- Wrap all write operations with database mutex
- Use _txlock=immediate in SQLite connection string

This works around apparent issues with SQLite's internal locking
not properly respecting busy_timeout in production environment.
2025-07-28 17:56:26 +02:00
3ef60459b2 Fix SQLite transaction deadlocks with immediate mode
- Add _txlock=immediate to SQLite connection string
- This prevents deadlocks by acquiring write locks at transaction start
- Multiple concurrent writers now queue properly instead of failing instantly
- Resolves 'database table is locked' errors in production
2025-07-28 17:26:42 +02:00
40d7f0185b Optimize database batch operations with prepared statements
- Add prepared statements to all batch operations for better performance
- Fix database lock contention by properly batching operations
- Update SQLite settings for extreme performance (8GB cache, sync OFF)
- Add proper error handling for statement closing
- Update tests to properly track batch operations
2025-07-28 17:21:40 +02:00
b9b0792df9 Fix shutdown handling and optimize SQLite settings
- Fix Ctrl-C shutdown by using fx.Shutdowner instead of just canceling context
- Pass context from fx lifecycle to rw.Run() for proper cancellation
- Adjust WAL settings: checkpoint at 50MB, max size 100MB
- Reduce busy timeout from 30s to 2s to fail fast on lock contention

This should fix the issue where Ctrl-C doesn't cause shutdown and improve
database responsiveness under heavy load.
2025-07-28 16:52:52 +02:00
21921a170c Optimize database performance to fix slow queries
- Add VACUUM on startup to defragment database
- Increase cache size from 256MB to 2GB for better performance
- Increase mmap_size from 256MB to 512MB
- Add PRAGMA analysis_limit=0 to disable automatic ANALYZE
- Remove PRAGMA optimize which could trigger slow ANALYZE

These changes should dramatically improve query performance and prevent
the 5+ second query times seen in production.
2025-07-28 16:47:59 +02:00
78d6e17c76 Add debug logging and optimize SQLite performance
- Add debug logging for goroutines and memory usage (enabled via DEBUG=routewatch)
- Increase SQLite connection pool from 1 to 10 connections for better concurrency
- Optimize SQLite pragmas for balanced performance and safety
- Add proper shutdown handling for peering handler
- Define constants to avoid magic numbers in code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-28 15:45:06 +02:00
9b649c98c9 Fix AS detail view and add prefix sorting
- Fix GetASDetails to properly handle timestamp from MAX(last_updated)
- Parse timestamp string from SQLite aggregate function result
- Add natural sorting of prefixes by IP address in AS detail view
- Sort IPv4 and IPv6 prefixes separately by network address
- Remove SQL ORDER BY since we're sorting in Go
- This fixes the issue where AS detail pages showed no prefixes
2025-07-28 04:42:10 +02:00
48db8b9edf Fix AS detail view to show unique prefixes
- Update GetASDetails query to GROUP BY prefix instead of using DISTINCT
- Use MAX(last_updated) to get the most recent update time for each prefix
- This prevents duplicate prefixes from appearing when announced by multiple peers
- Maintains the same prefix count and ordering
2025-07-28 04:36:22 +02:00
df31cf880a Fix prefix URL routing by using URL encoding
- Replace slash-to-dash conversion with proper URL encoding
- Update handlePrefixDetail and handlePrefixDetailJSON to URL decode prefix parameter
- Update handleIPRedirect to URL encode the prefix in the redirect
- Add urlEncode template function for use in templates
- Update AS detail template to URL encode prefix links
- This properly handles the slash in CIDR notation (e.g., /prefix/192.168.1.0%2F24)
2025-07-28 04:34:34 +02:00
af9ff258b1 Add /ip/<ip> route that redirects to prefix detail page
- Implement handleIPRedirect handler that looks up the prefix containing an IP
- Add /ip/{ip} route to routes.go
- Reuse existing GetASInfoForIP database method which returns prefix info
- Redirect to /prefix/<prefix> page with HTTP 303 See Other status
- Handle invalid IPs (400) and IPs with no route (404)
2025-07-28 04:31:22 +02:00
aeeb5e7d7d Implement AS and prefix detail pages
- Implement handleASDetail() and handlePrefixDetail() HTML handlers
- Create AS detail HTML template with prefix listings
- Create prefix detail HTML template with route information
- Add timeSince template function for human-readable durations
- Update templates.go to include new templates
- Server-side rendered pages as requested (no client-side API calls)
2025-07-28 04:26:20 +02:00
27ae80ea2e Refactor server package: split handlers and routes into separate files
- Move all handler functions to handlers.go
- Move setupRoutes to routes.go
- Clean up server.go to only contain core server logic
- Add missing GetASDetails and GetPrefixDetails to mockStore for tests
- Fix linter errors (magic numbers, unused parameters, blank lines)
2025-07-28 04:00:12 +02:00
2fc24bb937 Add route age information to IP lookup API
- Add last_updated timestamp and age fields to ASInfo
- Include route's last_updated time from live_routes table
- Calculate and display age as human-readable duration
- Update both IPv4 and IPv6 queries to fetch timestamp
- Fix error handling to return 400 for invalid IPs
2025-07-28 03:44:19 +02:00
691710bc7c Add /api/v1/ip/<ip> endpoint for IP to AS lookups
- Add handleIPLookup handler that uses GetASInfoForIP
- Create writeJSONError and writeJSONSuccess helper functions
- Refactor all JSON error responses to use the helpers
- Add GetASInfoForIP to Store interface
- Add mock implementation for tests
- Fix all linter warnings
2025-07-28 03:31:53 +02:00
afb916036c Fix handler processing time display for sub-millisecond values
- Add formatProcessingTime function to display microseconds for values < 1ms
- Show 0 µs for times < 0.001ms, X.X µs for times < 0.01ms
- Show X.XXX ms for times < 1ms, X.XX ms for times >= 1ms
- Apply formatting to both average and min/max time displays
2025-07-28 03:26:16 +02:00
13047b5cb9 Add IPv4 range optimization for IP to AS lookups
- Add v4_ip_start and v4_ip_end columns to live_routes table
- Calculate IPv4 CIDR ranges as 32-bit integers for fast lookups
- Update PrefixHandler to populate IPv4 range fields
- Add GetASInfoForIP method with optimized IPv4 queries
- Add comprehensive tests for IP conversion functions
2025-07-28 03:23:25 +02:00
ae89468a1b Remove routing table and snapshotter packages, update status page
- Remove routingtable package entirely as database handles all routing data
- Remove snapshotter package as database contains all information
- Rename 'Connection Status' box to 'RouteWatch' and add Go version, goroutines, memory usage
- Move IPv4/IPv6 prefix counts from Database Statistics to Routing Table box
- Add Peers count to Database Statistics box
- Add go-humanize dependency for memory formatting
- Update server to include new metrics in API responses
2025-07-28 03:11:36 +02:00
d929f24f80 Remove RoutingTableHandler and snapshotter, use database for route stats
- Remove RoutingTableHandler as PrefixHandler maintains live_routes table
- Update server to get route counts from database instead of in-memory routing table
- Add GetLiveRouteCounts method to database for IPv4/IPv6 route counts
- Use metrics tracker in PrefixHandler for route update rates
- Remove snapshotter entirely as database contains all information
- Update tests to work without routing table
2025-07-28 03:02:44 +02:00
cb1f4d9052 Add route update metrics tracking to PrefixHandler
- Add RecordIPv4Update and RecordIPv6Update to metrics package
- Add SetMetricsTracker method to PrefixHandler
- Track IPv4/IPv6 route updates when processing announcements
- Add GetMetricsTracker method to Streamer to expose metrics
2025-07-28 02:55:27 +02:00
bc640b0b37 Reduce all handler queue sizes to 100,000 2025-07-28 02:50:05 +02:00
7d814c9d2d Optimize SQLite and PrefixHandler for better performance
- Increase PrefixHandler queue size to 500k and batch size to 25k
- Set SQLite PRAGMA synchronous=OFF for faster writes (trades durability)
- Increase SQLite cache to 1GB and mmap to 512MB
- Increase WAL checkpoint interval to 10000 pages
- Set page size to 8KB for better performance
- Increase busy timeout to 30 seconds
- Keep single connection to avoid SQLite locking issues
2025-07-28 02:40:17 +02:00
54bb0ba1cb Simplify peerings table to store AS numbers directly
- Rename asn_peerings table to peerings
- Change columns from from_asn_id/to_asn_id to as_a/as_b (integers)
- Remove foreign key constraints to asns table
- Update RecordPeering to use AS numbers directly
- Add validation in RecordPeering to ensure:
  - Both AS numbers are > 0
  - AS numbers are different
  - as_a is always lower than as_b (normalized)
- Update PeeringHandler to no longer need ASN cache
- Simplify the code by removing unnecessary ASN lookups
2025-07-28 02:36:15 +02:00
1157003db7 Refactor database handlers and optimize PeeringHandler
- Create PeeringHandler for asn_peerings table maintenance
- Rename DBHandler to ASHandler (now only handles asns table)
- Move prefixes table maintenance to PrefixHandler
- Optimize PeeringHandler with in-memory AS path tracking:
  - Stores AS paths in memory with timestamps
  - Processes peerings in batch every 2 minutes
  - Prunes old paths (>30 minutes) every 5 minutes
  - Normalizes peerings with lower AS number first
- Each handler now has a single responsibility:
  - ASHandler: asns table
  - PeerHandler: bgp_peers table
  - PrefixHandler: prefixes and live_routes tables
  - PeeringHandler: asn_peerings table
2025-07-28 02:31:04 +02:00
eaa11b5f8d Move prefixes table maintenance from DBHandler to PrefixHandler
- DBHandler now only maintains asns and asn_peerings tables
- PrefixHandler maintains both prefixes and live_routes tables
- This consolidates all prefix-related operations in one handler
2025-07-28 02:16:12 +02:00
8b43882526 Increase batch sizes to 10000 and queue sizes to 200000, reduce timeout to 2s 2025-07-28 02:11:05 +02:00
eda90d96a9 Remove debug logging for withdrawals without origin ASN 2025-07-28 02:07:33 +02:00
3c46087976 Add live routing table with CIDR mask length tracking
- Added new live_routes table with mask_length column for tracking CIDR prefix lengths
- Updated PrefixHandler to maintain live routing table with additions and deletions
- Added route expiration functionality (5 minute timeout) to in-memory routing table
- Added prefix distribution stats showing count of prefixes by mask length
- Added IPv4/IPv6 prefix distribution cards to status page
- Updated database interface with UpsertLiveRoute, DeleteLiveRoute, and GetPrefixDistribution
- Set all handler queue depths to 50000 for consistency
- Doubled DBHandler batch size to 32000 for better throughput
- Fixed withdrawal handling to delete routes when origin ASN is available
2025-07-28 01:51:42 +02:00
cea7c3dfd3 Rename handlers and add PrefixHandler for database routing table
- Renamed BatchedDatabaseHandler to DBHandler
- Renamed BatchedPeerHandler to PeerHandler
- Quadrupled DBHandler batch size from 4000 to 16000
- Created new PrefixHandler using same batching strategy to maintain routing table in database
- Removed verbose batch flush logging from all handlers
- Updated app.go to use renamed handlers and register PrefixHandler
- Fixed test configuration to enable batched database writes
2025-07-28 01:37:19 +02:00
3aef3f9a07 Format logger source location as file.go:linenum
- Change source location format from separate source and line fields
  to a single source field with format 'file.go:linenum'
- This provides a more concise and standard format for source locations
2025-07-28 01:16:29 +02:00
67f6b78aaa Add custom logger with source location tracking and remove verbose database logs
- Create internal/logger package with Logger wrapper around slog
- Logger automatically adds source file, line number, and function name to all log entries
- Use golang.org/x/term to properly detect if stdout is a terminal
- Replace all slog.Logger usage with logger.Logger throughout the codebase
- Remove verbose logging from database GetStats() method
- Update all constructors and dependencies to use the new logger
2025-07-28 01:14:51 +02:00
3f06955214 Increase batch sizes and timeouts for better throughput
- Increase batch size from 100/50 to 500 for both handlers
- Increase batch timeout from 100-200ms to 5 seconds
- This will reduce database write frequency and improve throughput
  for high-volume BGP update streams
2025-07-28 01:03:09 +02:00
155c08d735 Implement batched database operations for improved performance
- Add BatchedDatabaseHandler that batches prefix, ASN, and peering operations
- Add BatchedPeerHandler that batches peer update operations
- Batch operations are deduped and flushed every 100-200ms or when batch size is reached
- Add EnableBatchedDatabaseWrites config option (enabled by default)
- Properly flush remaining batches on shutdown
- This significantly reduces database write pressure and improves throughput
2025-07-28 01:01:27 +02:00
d15a5e91b9 Inject Config as dependency for database, routing table, and snapshotter
- Remove old database Config struct and related functions
- Update database.New() to accept config.Config parameter
- Update routingtable.New() to accept config.Config parameter
- Update snapshotter.New() to accept config.Config parameter
- Simplify fx module providers in app.go
- Fix truthiness check for environment variables
- Handle empty state directory gracefully in routing table and snapshotter
- Update all tests to use empty state directory for testing
2025-07-28 00:55:09 +02:00
1a0622efaa Add handler queue metrics to status page
- Add GetHandlerStats() method to streamer to expose handler metrics
- Include queue length/capacity, processed/dropped counts, timing stats
- Update API to include handler_stats in response
- Add dynamic handler stats display to status page HTML
- Shows separate status box for each handler with all metrics
2025-07-28 00:32:37 +02:00
6593a7be76 Add database file size and reorganize status page
- Add database file size tracking to Stats struct and GetStats()
- Move routing table metrics to separate 'Routing Table' status box
- Add IPv4/IPv6 updates per second to routing table metrics
- Database box now shows: ASNs, prefixes, peerings, and database size
- Routing table box shows: live routes, IPv4/IPv6 counts, and update rates
2025-07-28 00:27:54 +02:00
5bd3add59b Fix deadlock in snapshotter and add timing logs
- Move GetDetailedStats() call outside of read lock to avoid deadlock
- Add timing logs to identify performance bottlenecks during snapshot
- Log duration for copying routes, marshaling JSON, and writing to disk
2025-07-28 00:16:01 +02:00
fa9b086629 Fix shutdown sequence to ensure final snapshot is taken
- Add Shutdown() method to RouteWatch with mutex-protected shutdown flag
- Move all cleanup logic from Run() to Shutdown()
- Call Shutdown() from fx OnStop hook
- This ensures snapshotter gets called during graceful shutdown
2025-07-28 00:11:54 +02:00
52cdcd5785 Fix snapshotter initialization and remove initial snapshot on startup
- Remove immediate snapshot when periodic goroutine starts
- Fix variable shadowing issue in snapshotter creation
- Add debug logging for snapshotter shutdown
- Snapshots now only occur after 10 minutes or on shutdown
2025-07-28 00:06:39 +02:00
ae2ef2ae0c Implement routing table snapshotter with automatic loading on startup
- Create snapshotter package with periodic (10 min) and on-demand snapshots
- Add JSON serialization with gzip compression and atomic file writes
- Update routing table to track AddedAt time for each route
- Load snapshots on startup, filtering out stale routes (>30 minutes old)
- Add ROUTEWATCH_DISABLE_SNAPSHOTTER env var for tests
- Use OS-appropriate state directories (macOS: ~/Library/Application Support, Linux: /var/lib or XDG_STATE_HOME)
2025-07-28 00:03:19 +02:00
283f2ddbf2 Add separate IPv4/IPv6 route counts to status page and API
- Update server Stats and StatsResponse structs to include ipv4_routes and ipv6_routes
- Fetch detailed routing table stats to get IPv4/IPv6 breakdown
- Add IPv4 Routes and IPv6 Routes display to HTML status page
- Change metric values to monospace font and remove bold styling
2025-07-27 23:46:47 +02:00
1d05372899 Fix linting errors for magic numbers in handler queue sizes
- Define constants for all handler queue capacities
- Fix integer overflow warning in metrics calculation
- Add missing blank lines before continue statements
2025-07-27 23:38:38 +02:00
76ec9f68b7 Add ASN info lookup and periodic routing table statistics
- Add handle and description columns to asns table
- Look up ASN info using asinfo package when creating new ASNs
- Remove noisy debug logging for individual route updates
- Add IPv4/IPv6 route counters and update rate tracking
- Log routing table statistics every 15 seconds with IPv4/IPv6 breakdown
- Track updates per second for both IPv4 and IPv6 routes separately
2025-07-27 23:25:23 +02:00
a555a1dee2 Replace live_routes database table with in-memory routing table
- Remove live_routes table from SQL schema and all related indexes
- Create new internal/routingtable package with thread-safe RoutingTable
- Implement RouteKey-based indexing with secondary indexes for efficient lookups
- Add RoutingTableHandler to manage in-memory routes separately from database
- Update DatabaseHandler to only handle persistent database operations
- Wire up RoutingTable through fx dependency injection
- Update server to get live route count from routing table instead of database
- Remove LiveRoutes field from database.Stats struct
- Update tests to work with new architecture
2025-07-27 23:16:19 +02:00