Compare commits


18 Commits

SHA1 Message Date
1ec0b3e7ca Change stats fetch interval from 500ms to 2 seconds
Reduces the frequency of stats API calls from twice per second
to once every 2 seconds, cutting server load.
2025-07-29 04:22:06 +02:00
037bbfb813 Reduce slow query threshold from 50ms to 25ms
This will help identify performance issues earlier by logging
any database query that takes longer than 25 milliseconds.
2025-07-29 04:20:43 +02:00
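The diff below shows only the constant change and the helper's signature; the rest of the database internals are in the suppressed diff. A minimal sketch of how such a helper is typically written, assuming the internal logger exposes a Warn method with the same key/value style as the Error and Debug calls visible elsewhere in this diff:

```go
const slowQueryThreshold = 25 * time.Millisecond

// logSlowQuery logs queries that take longer than slowQueryThreshold.
// Only the constant and the signature appear in the visible diff; this
// body is illustrative.
func logSlowQuery(logger *logger.Logger, query string, start time.Time) {
	elapsed := time.Since(start)
	if elapsed > slowQueryThreshold {
		logger.Warn("slow query", "query", query, "duration", elapsed)
	}
}
```

A typical call site would be `defer logSlowQuery(d.logger, query, time.Now())` at the top of each query method, so the elapsed time is measured when the method returns.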
1fded42651 Quadruple all HTTP timeouts to prevent timeout errors
- HTTP request timeout: 2s -> 8s
- Stats collection context timeout: 1s -> 4s
- HTTP read header timeout: 10s -> 40s

This should prevent timeout errors when the database is under load
or when complex queries take longer than expected (slow query
threshold is 50ms).
2025-07-29 04:18:07 +02:00
3338e92785 Add JSON validation middleware to ensure valid API responses
- Created JSONValidationMiddleware that validates all JSON responses
- Ensures that even on timeout or internal errors, a valid JSON error response is returned
- Applied to all API endpoints including /status.json
- Prevents client-side JSON parse errors when server encounters issues
2025-07-29 04:13:01 +02:00
7aec01c499 Add AS peers display to AS detail page
- Added GetASPeers method to database to fetch all peering relationships
- Updated AS detail handler to fetch and pass peers to template
- Added peers section to AS detail page showing all peer ASNs with their info
- Added peer count to the info cards at the top of the page
- Shows handle, description, and first/last seen dates for each peer
2025-07-29 03:58:09 +02:00
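The database side of this change is in the suppressed diff (presumably database.go, too large to render). A minimal sketch under stated assumptions: the peerings(as_a, as_b) layout implied by the indexes in schema.sql below, first_seen/last_seen columns as the ASNPeering model suggests, a store wrapping *sql.DB, and an ASPeer struct mirroring the fields the template renders (ASN, Handle, Description, FirstSeen, LastSeen):

```go
// GetASPeers returns every AS that peers with the given ASN, matching
// either direction of the peerings relationship.
func (d *Database) GetASPeers(asn int) ([]ASPeer, error) {
	rows, err := d.db.Query(`
		SELECT a.asn, COALESCE(a.handle, ''), COALESCE(a.description, ''),
		       p.first_seen, p.last_seen
		FROM peerings p
		JOIN asns a ON a.asn = CASE WHEN p.as_a = ? THEN p.as_b ELSE p.as_a END
		WHERE p.as_a = ? OR p.as_b = ?`, asn, asn, asn)
	if err != nil {
		return nil, err
	}
	defer func() { _ = rows.Close() }()

	var peers []ASPeer
	for rows.Next() {
		var p ASPeer
		if err := rows.Scan(&p.ASN, &p.Handle, &p.Description,
			&p.FirstSeen, &p.LastSeen); err != nil {
			return nil, err
		}
		peers = append(peers, p)
	}

	return peers, rows.Err()
}
```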
deeedae180 Fix template references to renamed ASN fields
Updated templates to use the new field names after renaming:
- ASN.Number -> ASN.ASN in as_detail.html
- Fixed references to ASN field in prefix_detail.html for ASNInfo and ASPathEntry structs
2025-07-29 03:37:07 +02:00
d3966f2320 Fix SQL query to use renamed asn column
Fixed remaining references to a.number that should be a.asn after
the column rename in the ASNs table.
2025-07-29 02:52:47 +02:00
23127b86e9 Add queue high water marks to handler statistics
- Track the maximum queue length seen for each handler
- Display high water marks on the status page with percentage
- Helps identify which handlers are experiencing queue pressure
2025-07-29 02:46:53 +02:00
2cfca78464 Reduce peering processing interval from 2 minutes to 30 seconds
The 2-minute interval was causing a noticeable delay before peerings
appeared in the database. Reducing it to 30 seconds provides a better
user experience while still maintaining efficient batch processing.
2025-07-28 23:05:58 +02:00
c9da20e630 Major schema refactoring: simplify ASN and prefix tracking
- Remove UUID primary keys from ASNs table, use ASN number as primary key
- Update announcements table to reference ASN numbers directly
- Rename asns.number column to asns.asn for consistency
- Add prefix tracking to PrefixHandler to populate prefixes_v4/v6 tables
- Add UpdatePrefixesBatch method for efficient batch updates
- Update all database methods and models to use new schema
- Fix all references in code to use ASN field instead of Number
- Update test mocks to match new interfaces
2025-07-28 22:58:55 +02:00
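UpdatePrefixesBatch itself is in the suppressed database diff; only its interface declaration appears below. A sketch of a batched upsert consistent with that declaration, assuming SQLite ON CONFLICT semantics, uuid-generated row ids, and the same colon test for IP version used in the test mock:

```go
// UpdatePrefixesBatch upserts first_seen/last_seen for a batch of
// prefixes in a single transaction, picking the table by IP version.
func (d *Database) UpdatePrefixesBatch(prefixes map[string]time.Time) error {
	tx, err := d.db.Begin()
	if err != nil {
		return err
	}
	defer func() { _ = tx.Rollback() }() // no-op after a successful Commit

	for prefix, ts := range prefixes {
		table := "prefixes_v4"
		if strings.Contains(prefix, ":") {
			table = "prefixes_v6"
		}
		if _, err := tx.Exec(`
			INSERT INTO `+table+` (id, prefix, first_seen, last_seen)
			VALUES (?, ?, ?, ?)
			ON CONFLICT(prefix) DO UPDATE SET last_seen = excluded.last_seen`,
			uuid.New().String(), prefix, ts, ts); err != nil {
			return err
		}
	}

	return tx.Commit()
}
```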
a165ecf759 Fix prefix stats by counting from live routes tables
The prefixes_v4 and prefixes_v6 tables were never being populated
because GetOrCreatePrefix was not being called anywhere. Since we
already track all prefixes in live_routes_v4 and live_routes_v6,
update stats queries to count distinct prefixes from those tables.
2025-07-28 22:44:44 +02:00
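The updated stats queries are in the suppressed database diff; given the live_routes_v4/v6 split visible in schema.sql below, the prefix counts presumably reduce to something like this fragment (variable names assumed):

```go
// Sketch: count distinct prefixes per address family from the live
// routes tables instead of the never-populated prefixes tables.
var ipv4Prefixes, ipv6Prefixes int
if err := d.db.QueryRowContext(ctx,
	`SELECT COUNT(DISTINCT prefix) FROM live_routes_v4`).Scan(&ipv4Prefixes); err != nil {
	return nil, err
}
if err := d.db.QueryRowContext(ctx,
	`SELECT COUNT(DISTINCT prefix) FROM live_routes_v6`).Scan(&ipv6Prefixes); err != nil {
	return nil, err
}
```

The per-table prefix indexes added in schema.sql below should let SQLite answer these counts from an index scan rather than the full table.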
725d04ffa8 Split prefixes table into prefixes_v4 and prefixes_v6
- Create separate tables for IPv4 and IPv6 prefixes in schema.sql
- Update indexes for new prefix tables
- Update getOrCreatePrefix to use appropriate table based on IP version
- Update GetStatsContext to count prefixes from both tables
- Remove ip_version column since it's implicit in the table name
2025-07-28 22:41:42 +02:00
fc32090483 Fix JavaScript UI and complete database table migration
- Update status page JavaScript to reset all fields to '-' on error
- Fix status page to not show 'Connected' when API returns error
- Update remaining database methods to use new live_routes_v4/v6 tables
- Fix GetStatsContext to count routes from both IPv4 and IPv6 tables
- Fix UpsertLiveRoute to insert into correct table based on IP version
- Fix DeleteLiveRoute to determine table from prefix IP version
2025-07-28 22:39:01 +02:00
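The UpsertLiveRoute and DeleteLiveRoute changes live in the suppressed database diff; the table dispatch both need is just an IP-version test on the prefix. A sketch for the deletion side (method signature assumed):

```go
// DeleteLiveRoute removes a route, choosing the table from the prefix's
// IP version (any ":" means IPv6). Illustrative only; the real method
// is in the suppressed diff.
func (d *Database) DeleteLiveRoute(prefix string, originASN int, peerIP string) error {
	table := "live_routes_v4"
	if strings.Contains(prefix, ":") {
		table = "live_routes_v6"
	}
	_, err := d.db.Exec(
		`DELETE FROM `+table+` WHERE prefix = ? AND origin_asn = ? AND peer_ip = ?`,
		prefix, originASN, peerIP)

	return err
}
```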
3673264552 Separate IPv4 and IPv6 routes into different tables
- Create live_routes_v4 and live_routes_v6 tables
- Update all database methods to use appropriate table
- Add IP version detection in database queries
- Remove filtering by ip_version column for better performance
- Fix route count queries that were timing out
- Update PrefixHandler to include IP version in deletions
2025-07-28 22:29:15 +02:00
8e12c07396 Implement queue backpressure with gradual message dropping
- Add gradual message dropping based on queue utilization
- Start dropping messages at 50% queue capacity
- Drop rate increases linearly from 0% at 50% to 100% at full
- Uses random drops to maintain fair distribution
- Helps prevent queue overflow under high load
2025-07-28 22:17:00 +02:00
b6ad50f23f Add warnings about schema changes and remove ad-hoc index creation
- Remove ad-hoc index creation from database.go Initialize method
- Add clear comments to both database.go and schema.sql warning that
  ALL schema changes must be made in schema.sql only
- We do not support migrations; schema changes outside schema.sql are forbidden
2025-07-28 22:09:19 +02:00
c35b76deb8 Optimize AS detail queries and increase PrefixHandler batch size
- Increase PrefixHandler batch size from 20k to 25k (25% increase)
- Add missing index on origin_asn for live_routes table
- This index significantly speeds up AS detail page queries
- Add code to create missing indexes on existing databases
2025-07-28 22:07:27 +02:00
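The startup index creation (removed again in the very next commit, b6ad50f23f) is in the suppressed database diff; it presumably amounted to an idempotent statement in Initialize, roughly like this sketch (index name assumed; at this point the table was still the unified live_routes):

```go
// Sketch of the since-removed ad-hoc index creation in Initialize;
// b6ad50f23f deletes this in favor of schema.sql-only schema changes.
if _, err := db.Exec(
	`CREATE INDEX IF NOT EXISTS idx_live_routes_origin_asn
	 ON live_routes(origin_asn)`); err != nil {
	return fmt.Errorf("create origin_asn index: %w", err)
}
```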
6d46bbad5b Fix nil pointer dereference in GetPrefixDistributionContext
- Use separate variables for IPv4 and IPv6 query results
- Add nil checks before closing rows to prevent panic
- Prevents crash when database queries timeout or fail
2025-07-28 22:04:22 +02:00
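The patch itself is in the suppressed database diff; the pattern it describes looks roughly like this fragment (query names assumed):

```go
// Separate row variables keep a failed IPv6 query from clobbering the
// IPv4 result, and the nil checks make the deferred Close safe when a
// query times out before returning any rows.
var v4Rows, v6Rows *sql.Rows
var err error
defer func() {
	if v4Rows != nil {
		_ = v4Rows.Close()
	}
	if v6Rows != nil {
		_ = v6Rows.Close()
	}
}()

v4Rows, err = d.db.QueryContext(ctx, ipv4DistributionQuery)
if err != nil {
	return nil, err
}
v6Rows, err = d.db.QueryContext(ctx, ipv6DistributionQuery)
if err != nil {
	return nil, err
}
```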
17 changed files with 207609 additions and 178684 deletions

File diff suppressed because it is too large.


@ -27,6 +27,7 @@ type Store interface {
// Prefix operations
GetOrCreatePrefix(prefix string, timestamp time.Time) (*Prefix, error)
UpdatePrefixesBatch(prefixes map[string]time.Time) error
// Announcement operations
RecordAnnouncement(announcement *Announcement) error
@ -59,6 +60,8 @@ type Store interface {
// AS and prefix detail operations
GetASDetails(asn int) (*ASN, []LiveRoute, error)
GetASDetailsContext(ctx context.Context, asn int) (*ASN, []LiveRoute, error)
GetASPeers(asn int) ([]ASPeer, error)
GetASPeersContext(ctx context.Context, asn int) ([]ASPeer, error)
GetPrefixDetails(prefix string) ([]LiveRoute, error)
GetPrefixDetailsContext(ctx context.Context, prefix string) ([]LiveRoute, error)
GetRandomPrefixesByLength(maskLength, ipVersion, limit int) ([]LiveRoute, error)


@ -8,8 +8,7 @@ import (
// ASN represents an Autonomous System Number
type ASN struct {
ID uuid.UUID `json:"id"`
Number int `json:"number"`
ASN int `json:"asn"`
Handle string `json:"handle"`
Description string `json:"description"`
FirstSeen time.Time `json:"first_seen"`
@ -29,8 +28,8 @@ type Prefix struct {
type Announcement struct {
ID uuid.UUID `json:"id"`
PrefixID uuid.UUID `json:"prefix_id"`
ASNID uuid.UUID `json:"asn_id"`
OriginASNID uuid.UUID `json:"origin_asn_id"`
PeerASN int `json:"peer_asn"`
OriginASN int `json:"origin_asn"`
Path string `json:"path"` // JSON-encoded AS path
NextHop string `json:"next_hop"`
Timestamp time.Time `json:"timestamp"`
@ -40,8 +39,8 @@ type Announcement struct {
// ASNPeering represents a peering relationship between two ASNs
type ASNPeering struct {
ID uuid.UUID `json:"id"`
FromASNID uuid.UUID `json:"from_asn_id"`
ToASNID uuid.UUID `json:"to_asn_id"`
ASA int `json:"as_a"`
ASB int `json:"as_b"`
FirstSeen time.Time `json:"first_seen"`
LastSeen time.Time `json:"last_seen"`
}
@ -83,6 +82,7 @@ type LiveRouteDeletion struct {
Prefix string
OriginASN int
PeerIP string
IPVersion int
}
// PeerUpdate represents parameters for updating a peer


@ -1,16 +1,27 @@
-- IMPORTANT: This is the ONLY place where schema changes should be made.
-- We do NOT support migrations. All schema changes MUST be in this file.
-- DO NOT make schema changes anywhere else in the codebase.
CREATE TABLE IF NOT EXISTS asns (
id TEXT PRIMARY KEY,
number INTEGER UNIQUE NOT NULL,
asn INTEGER PRIMARY KEY,
handle TEXT,
description TEXT,
first_seen DATETIME NOT NULL,
last_seen DATETIME NOT NULL
);
CREATE TABLE IF NOT EXISTS prefixes (
-- IPv4 prefixes table
CREATE TABLE IF NOT EXISTS prefixes_v4 (
id TEXT PRIMARY KEY,
prefix TEXT UNIQUE NOT NULL,
first_seen DATETIME NOT NULL,
last_seen DATETIME NOT NULL
);
-- IPv6 prefixes table
CREATE TABLE IF NOT EXISTS prefixes_v6 (
id TEXT PRIMARY KEY,
prefix TEXT UNIQUE NOT NULL,
ip_version INTEGER NOT NULL, -- 4 for IPv4, 6 for IPv6
first_seen DATETIME NOT NULL,
last_seen DATETIME NOT NULL
);
@ -18,15 +29,14 @@ CREATE TABLE IF NOT EXISTS prefixes (
CREATE TABLE IF NOT EXISTS announcements (
id TEXT PRIMARY KEY,
prefix_id TEXT NOT NULL,
asn_id TEXT NOT NULL,
origin_asn_id TEXT NOT NULL,
peer_asn INTEGER NOT NULL,
origin_asn INTEGER NOT NULL,
path TEXT NOT NULL,
next_hop TEXT,
timestamp DATETIME NOT NULL,
is_withdrawal BOOLEAN NOT NULL DEFAULT 0,
FOREIGN KEY (prefix_id) REFERENCES prefixes(id),
FOREIGN KEY (asn_id) REFERENCES asns(id),
FOREIGN KEY (origin_asn_id) REFERENCES asns(id)
FOREIGN KEY (peer_asn) REFERENCES asns(asn),
FOREIGN KEY (origin_asn) REFERENCES asns(asn)
);
CREATE TABLE IF NOT EXISTS peerings (
@ -48,49 +58,71 @@ CREATE TABLE IF NOT EXISTS bgp_peers (
last_message_type TEXT
);
CREATE INDEX IF NOT EXISTS idx_prefixes_ip_version ON prefixes(ip_version);
CREATE INDEX IF NOT EXISTS idx_prefixes_version_prefix ON prefixes(ip_version, prefix);
-- Indexes for prefixes_v4 table
CREATE INDEX IF NOT EXISTS idx_prefixes_v4_prefix ON prefixes_v4(prefix);
-- Indexes for prefixes_v6 table
CREATE INDEX IF NOT EXISTS idx_prefixes_v6_prefix ON prefixes_v6(prefix);
CREATE INDEX IF NOT EXISTS idx_announcements_timestamp ON announcements(timestamp);
CREATE INDEX IF NOT EXISTS idx_announcements_prefix_id ON announcements(prefix_id);
CREATE INDEX IF NOT EXISTS idx_announcements_asn_id ON announcements(asn_id);
CREATE INDEX IF NOT EXISTS idx_announcements_peer_asn ON announcements(peer_asn);
CREATE INDEX IF NOT EXISTS idx_announcements_origin_asn ON announcements(origin_asn);
CREATE INDEX IF NOT EXISTS idx_peerings_as_a ON peerings(as_a);
CREATE INDEX IF NOT EXISTS idx_peerings_as_b ON peerings(as_b);
CREATE INDEX IF NOT EXISTS idx_peerings_lookup ON peerings(as_a, as_b);
-- Additional indexes for prefixes table
CREATE INDEX IF NOT EXISTS idx_prefixes_prefix ON prefixes(prefix);
-- Indexes for asns table
CREATE INDEX IF NOT EXISTS idx_asns_number ON asns(number);
CREATE INDEX IF NOT EXISTS idx_asns_asn ON asns(asn);
-- Indexes for bgp_peers table
CREATE INDEX IF NOT EXISTS idx_bgp_peers_asn ON bgp_peers(peer_asn);
CREATE INDEX IF NOT EXISTS idx_bgp_peers_last_seen ON bgp_peers(last_seen);
CREATE INDEX IF NOT EXISTS idx_bgp_peers_ip ON bgp_peers(peer_ip);
-- Live routing table maintained by PrefixHandler
CREATE TABLE IF NOT EXISTS live_routes (
-- IPv4 routing table maintained by PrefixHandler
CREATE TABLE IF NOT EXISTS live_routes_v4 (
id TEXT PRIMARY KEY,
prefix TEXT NOT NULL,
mask_length INTEGER NOT NULL, -- CIDR mask length (0-32 for IPv4, 0-128 for IPv6)
ip_version INTEGER NOT NULL, -- 4 or 6
mask_length INTEGER NOT NULL, -- CIDR mask length (0-32)
origin_asn INTEGER NOT NULL,
peer_ip TEXT NOT NULL,
as_path TEXT NOT NULL, -- JSON array
next_hop TEXT NOT NULL,
last_updated DATETIME NOT NULL,
-- IPv4 range columns for fast lookups (NULL for IPv6)
v4_ip_start INTEGER, -- Start of IPv4 range as 32-bit unsigned int
v4_ip_end INTEGER, -- End of IPv4 range as 32-bit unsigned int
-- IPv4 range columns for fast lookups
ip_start INTEGER NOT NULL, -- Start of IPv4 range as 32-bit unsigned int
ip_end INTEGER NOT NULL, -- End of IPv4 range as 32-bit unsigned int
UNIQUE(prefix, origin_asn, peer_ip)
);
-- Indexes for live_routes table
CREATE INDEX IF NOT EXISTS idx_live_routes_prefix ON live_routes(prefix);
CREATE INDEX IF NOT EXISTS idx_live_routes_mask_length ON live_routes(mask_length);
CREATE INDEX IF NOT EXISTS idx_live_routes_ip_version_mask ON live_routes(ip_version, mask_length);
CREATE INDEX IF NOT EXISTS idx_live_routes_last_updated ON live_routes(last_updated);
-- IPv6 routing table maintained by PrefixHandler
CREATE TABLE IF NOT EXISTS live_routes_v6 (
id TEXT PRIMARY KEY,
prefix TEXT NOT NULL,
mask_length INTEGER NOT NULL, -- CIDR mask length (0-128)
origin_asn INTEGER NOT NULL,
peer_ip TEXT NOT NULL,
as_path TEXT NOT NULL, -- JSON array
next_hop TEXT NOT NULL,
last_updated DATETIME NOT NULL,
-- Note: IPv6 doesn't use integer range columns
UNIQUE(prefix, origin_asn, peer_ip)
);
-- Indexes for live_routes_v4 table
CREATE INDEX IF NOT EXISTS idx_live_routes_v4_prefix ON live_routes_v4(prefix);
CREATE INDEX IF NOT EXISTS idx_live_routes_v4_mask_length ON live_routes_v4(mask_length);
CREATE INDEX IF NOT EXISTS idx_live_routes_v4_origin_asn ON live_routes_v4(origin_asn);
CREATE INDEX IF NOT EXISTS idx_live_routes_v4_last_updated ON live_routes_v4(last_updated);
-- Indexes for IPv4 range queries
CREATE INDEX IF NOT EXISTS idx_live_routes_ipv4_range ON live_routes(v4_ip_start, v4_ip_end) WHERE ip_version = 4;
-- Index to optimize COUNT(DISTINCT prefix) queries
CREATE INDEX IF NOT EXISTS idx_live_routes_ip_mask_prefix ON live_routes(ip_version, mask_length, prefix);
CREATE INDEX IF NOT EXISTS idx_live_routes_v4_ip_range ON live_routes_v4(ip_start, ip_end);
-- Index to optimize prefix distribution queries
CREATE INDEX IF NOT EXISTS idx_live_routes_v4_mask_prefix ON live_routes_v4(mask_length, prefix);
-- Indexes for live_routes_v6 table
CREATE INDEX IF NOT EXISTS idx_live_routes_v6_prefix ON live_routes_v6(prefix);
CREATE INDEX IF NOT EXISTS idx_live_routes_v6_mask_length ON live_routes_v6(mask_length);
CREATE INDEX IF NOT EXISTS idx_live_routes_v6_origin_asn ON live_routes_v6(origin_asn);
CREATE INDEX IF NOT EXISTS idx_live_routes_v6_last_updated ON live_routes_v6(last_updated);
-- Index to optimize prefix distribution queries
CREATE INDEX IF NOT EXISTS idx_live_routes_v6_mask_prefix ON live_routes_v6(mask_length, prefix);
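The ip_start/ip_end columns above are populated when routes are written (in the suppressed database diff); computing the inclusive 32-bit range from a CIDR string might look like the following sketch (helper name assumed):

```go
import (
	"encoding/binary"
	"fmt"
	"net"
)

// v4Range computes the inclusive 32-bit address range covered by an
// IPv4 CIDR, matching the ip_start/ip_end columns above.
func v4Range(cidr string) (start, end uint32, err error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return 0, 0, err
	}
	ip4 := ipnet.IP.To4()
	if ip4 == nil {
		return 0, 0, fmt.Errorf("not an IPv4 prefix: %s", cidr)
	}
	start = binary.BigEndian.Uint32(ip4)
	ones, _ := ipnet.Mask.Size()
	hostBits := uint(32 - ones)
	end = start | ((uint32(1) << hostBits) - 1)

	return start, end, nil
}
```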


@ -8,7 +8,7 @@ import (
"git.eeqj.de/sneak/routewatch/internal/logger"
)
const slowQueryThreshold = 50 * time.Millisecond
const slowQueryThreshold = 25 * time.Millisecond
// logSlowQuery logs queries that take longer than slowQueryThreshold
func logSlowQuery(logger *logger.Logger, query string, start time.Time) {


@ -61,8 +61,7 @@ func (m *mockStore) GetOrCreateASN(number int, timestamp time.Time) (*database.A
}
asn := &database.ASN{
ID: uuid.New(),
Number: number,
ASN: number,
FirstSeen: timestamp,
LastSeen: timestamp,
}
@ -72,6 +71,37 @@ func (m *mockStore) GetOrCreateASN(number int, timestamp time.Time) (*database.A
return asn, nil
}
// UpdatePrefixesBatch mock implementation
func (m *mockStore) UpdatePrefixesBatch(prefixes map[string]time.Time) error {
m.mu.Lock()
defer m.mu.Unlock()
for prefix, timestamp := range prefixes {
if p, exists := m.Prefixes[prefix]; exists {
p.LastSeen = timestamp
} else {
const (
ipVersionV4 = 4
ipVersionV6 = 6
)
ipVersion := ipVersionV4
if strings.Contains(prefix, ":") {
ipVersion = ipVersionV6
}
m.Prefixes[prefix] = &database.Prefix{
ID: uuid.New(),
Prefix: prefix,
IPVersion: ipVersion,
FirstSeen: timestamp,
LastSeen: timestamp,
}
}
}
return nil
}
// GetOrCreatePrefix mock implementation
func (m *mockStore) GetOrCreatePrefix(prefix string, timestamp time.Time) (*database.Prefix, error) {
m.mu.Lock()
@ -261,6 +291,17 @@ func (m *mockStore) GetRandomPrefixesByLengthContext(ctx context.Context, maskLe
return m.GetRandomPrefixesByLength(maskLength, ipVersion, limit)
}
// GetASPeers mock implementation
func (m *mockStore) GetASPeers(asn int) ([]database.ASPeer, error) {
// Return empty peers for now
return []database.ASPeer{}, nil
}
// GetASPeersContext mock implementation with context support
func (m *mockStore) GetASPeersContext(ctx context.Context, asn int) ([]database.ASPeer, error) {
return m.GetASPeers(asn)
}
// UpsertLiveRouteBatch mock implementation
func (m *mockStore) UpsertLiveRouteBatch(routes []*database.LiveRoute) error {
m.mu.Lock()
@ -302,8 +343,7 @@ func (m *mockStore) GetOrCreateASNBatch(asns map[int]time.Time) error {
for number, timestamp := range asns {
if _, exists := m.ASNs[number]; !exists {
m.ASNs[number] = &database.ASN{
ID: uuid.New(),
Number: number,
ASN: number,
FirstSeen: timestamp,
LastSeen: timestamp,
}


@ -21,7 +21,7 @@ const (
pathExpirationTime = 30 * time.Minute
// peeringProcessInterval is how often to process AS paths into peerings
peeringProcessInterval = 2 * time.Minute
peeringProcessInterval = 30 * time.Second
// pathPruneInterval is how often to prune old AS paths
pathPruneInterval = 5 * time.Minute


@ -19,7 +19,7 @@ const (
prefixHandlerQueueSize = 100000
// prefixBatchSize is the number of prefix updates to batch together
prefixBatchSize = 20000
prefixBatchSize = 25000
// prefixBatchTimeout is the maximum time to wait before flushing a batch
// DO NOT reduce this timeout - larger batches are more efficient
@ -182,9 +182,15 @@ func (h *PrefixHandler) flushBatchLocked() {
var routesToUpsert []*database.LiveRoute
var routesToDelete []database.LiveRouteDeletion
// Skip the prefix table updates entirely - just update live_routes
// The prefix table is not critical for routing lookups
// Collect unique prefixes to update
prefixesToUpdate := make(map[string]time.Time)
for _, update := range prefixMap {
// Track prefix for both announcements and withdrawals
if _, exists := prefixesToUpdate[update.prefix]; !exists || update.timestamp.After(prefixesToUpdate[update.prefix]) {
prefixesToUpdate[update.prefix] = update.timestamp
}
if update.messageType == "announcement" && update.originASN > 0 {
// Create live route for batch upsert
route := h.createLiveRoute(update)
@ -192,11 +198,20 @@ func (h *PrefixHandler) flushBatchLocked() {
routesToUpsert = append(routesToUpsert, route)
}
} else if update.messageType == "withdrawal" {
// Parse CIDR to get IP version
_, ipVersion, err := parseCIDR(update.prefix)
if err != nil {
h.logger.Error("Failed to parse CIDR for withdrawal", "prefix", update.prefix, "error", err)
continue
}
// Create deletion record for batch delete
routesToDelete = append(routesToDelete, database.LiveRouteDeletion{
Prefix: update.prefix,
OriginASN: update.originASN,
PeerIP: update.peer,
IPVersion: ipVersion,
})
}
}
@ -219,6 +234,13 @@ func (h *PrefixHandler) flushBatchLocked() {
}
}
// Update prefix tables
if len(prefixesToUpdate) > 0 {
if err := h.db.UpdatePrefixesBatch(prefixesToUpdate); err != nil {
h.logger.Error("Failed to update prefix batch", "error", err, "count", len(prefixesToUpdate))
}
}
elapsed := time.Since(startTime)
h.logger.Debug("Flushed prefix batch",
"batch_size", batchSize,


@ -80,8 +80,8 @@ func (s *Server) handleStatusJSON() http.HandlerFunc {
}
return func(w http.ResponseWriter, r *http.Request) {
// Create a 1 second timeout context for this request
ctx, cancel := context.WithTimeout(r.Context(), 1*time.Second)
// Create a 4 second timeout context for this request
ctx, cancel := context.WithTimeout(r.Context(), 4*time.Second)
defer cancel()
metrics := s.streamer.GetMetrics()
@ -174,14 +174,15 @@ func (s *Server) handleStatusJSON() http.HandlerFunc {
func (s *Server) handleStats() http.HandlerFunc {
// HandlerStatsInfo represents handler statistics in the API response
type HandlerStatsInfo struct {
Name string `json:"name"`
QueueLength int `json:"queue_length"`
QueueCapacity int `json:"queue_capacity"`
ProcessedCount uint64 `json:"processed_count"`
DroppedCount uint64 `json:"dropped_count"`
AvgProcessTimeMs float64 `json:"avg_process_time_ms"`
MinProcessTimeMs float64 `json:"min_process_time_ms"`
MaxProcessTimeMs float64 `json:"max_process_time_ms"`
Name string `json:"name"`
QueueLength int `json:"queue_length"`
QueueCapacity int `json:"queue_capacity"`
QueueHighWaterMark int `json:"queue_high_water_mark"`
ProcessedCount uint64 `json:"processed_count"`
DroppedCount uint64 `json:"dropped_count"`
AvgProcessTimeMs float64 `json:"avg_process_time_ms"`
MinProcessTimeMs float64 `json:"min_process_time_ms"`
MaxProcessTimeMs float64 `json:"max_process_time_ms"`
}
// StatsResponse represents the API statistics response
@ -213,8 +214,8 @@ func (s *Server) handleStats() http.HandlerFunc {
}
return func(w http.ResponseWriter, r *http.Request) {
// Create a 1 second timeout context for this request
ctx, cancel := context.WithTimeout(r.Context(), 1*time.Second)
// Create a 4 second timeout context for this request
ctx, cancel := context.WithTimeout(r.Context(), 4*time.Second)
defer cancel()
// Check if context is already cancelled
@ -251,7 +252,7 @@ func (s *Server) handleStats() http.HandlerFunc {
return
case err := <-errChan:
s.logger.Error("Failed to get database stats", "error", err)
http.Error(w, err.Error(), http.StatusInternalServerError)
writeJSONError(w, http.StatusInternalServerError, err.Error())
return
case dbStats = <-statsChan:
@ -281,14 +282,15 @@ func (s *Server) handleStats() http.HandlerFunc {
const microsecondsPerMillisecond = 1000.0
for _, hs := range handlerStats {
handlerStatsInfo = append(handlerStatsInfo, HandlerStatsInfo{
Name: hs.Name,
QueueLength: hs.QueueLength,
QueueCapacity: hs.QueueCapacity,
ProcessedCount: hs.ProcessedCount,
DroppedCount: hs.DroppedCount,
AvgProcessTimeMs: float64(hs.AvgProcessTime.Microseconds()) / microsecondsPerMillisecond,
MinProcessTimeMs: float64(hs.MinProcessTime.Microseconds()) / microsecondsPerMillisecond,
MaxProcessTimeMs: float64(hs.MaxProcessTime.Microseconds()) / microsecondsPerMillisecond,
Name: hs.Name,
QueueLength: hs.QueueLength,
QueueCapacity: hs.QueueCapacity,
QueueHighWaterMark: hs.QueueHighWaterMark,
ProcessedCount: hs.ProcessedCount,
DroppedCount: hs.DroppedCount,
AvgProcessTimeMs: float64(hs.AvgProcessTime.Microseconds()) / microsecondsPerMillisecond,
MinProcessTimeMs: float64(hs.MinProcessTime.Microseconds()) / microsecondsPerMillisecond,
MaxProcessTimeMs: float64(hs.MaxProcessTime.Microseconds()) / microsecondsPerMillisecond,
})
}
@ -491,6 +493,14 @@ func (s *Server) handleASDetail() http.HandlerFunc {
return
}
// Get peers
peers, err := s.db.GetASPeersContext(r.Context(), asn)
if err != nil {
s.logger.Error("Failed to get AS peers", "error", err)
// Continue without peers rather than failing the whole request
peers = []database.ASPeer{}
}
// Group prefixes by IP version
const ipVersionV4 = 4
var ipv4Prefixes, ipv6Prefixes []database.LiveRoute
@ -547,6 +557,8 @@ func (s *Server) handleASDetail() http.HandlerFunc {
TotalCount int
IPv4Count int
IPv6Count int
Peers []database.ASPeer
PeerCount int
}{
ASN: asInfo,
IPv4Prefixes: ipv4Prefixes,
@ -554,6 +566,8 @@ func (s *Server) handleASDetail() http.HandlerFunc {
TotalCount: len(prefixes),
IPv4Count: len(ipv4Prefixes),
IPv6Count: len(ipv6Prefixes),
Peers: peers,
PeerCount: len(peers),
}
// Check if context is still valid before writing response
@ -605,7 +619,7 @@ func (s *Server) handlePrefixDetail() http.HandlerFunc {
// Group by origin AS and collect unique AS info
type ASNInfo struct {
Number int
ASN int
Handle string
Description string
PeerCount int
@ -622,7 +636,7 @@ func (s *Server) handlePrefixDetail() http.HandlerFunc {
description = asInfo.Description
}
originMap[route.OriginASN] = &ASNInfo{
Number: route.OriginASN,
ASN: route.OriginASN,
Handle: handle,
Description: description,
PeerCount: 0,
@ -655,7 +669,7 @@ func (s *Server) handlePrefixDetail() http.HandlerFunc {
// Create enhanced routes with AS path handles
type ASPathEntry struct {
Number int
ASN int
Handle string
}
type EnhancedRoute struct {
@ -674,7 +688,7 @@ func (s *Server) handlePrefixDetail() http.HandlerFunc {
for j, asn := range route.ASPath {
handle := asinfo.GetHandle(asn)
enhancedRoute.ASPathWithHandle[j] = ASPathEntry{
Number: asn,
ASN: asn,
Handle: handle,
}
}


@ -217,3 +217,68 @@ func TimeoutMiddleware(timeout time.Duration) func(http.Handler) http.Handler {
})
}
}
// JSONValidationMiddleware ensures all JSON API responses are valid JSON
func JSONValidationMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Create a custom response writer to capture the response
rw := &responseWriter{
ResponseWriter: w,
body: &bytes.Buffer{},
statusCode: http.StatusOK,
}
// Serve the request
next.ServeHTTP(rw, r)
// Check if it's meant to be a JSON response
contentType := rw.Header().Get("Content-Type")
isJSON := contentType == "application/json" || contentType == ""
// If it's not JSON and has a body, pass it through unchanged
if !isJSON && rw.body.Len() > 0 {
w.WriteHeader(rw.statusCode)
_, _ = w.Write(rw.body.Bytes())
return
}
// For JSON responses, validate the JSON
if rw.body.Len() > 0 {
var testParse interface{}
if err := json.Unmarshal(rw.body.Bytes(), &testParse); err == nil {
// Valid JSON, write it out
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(rw.statusCode)
_, _ = w.Write(rw.body.Bytes())
return
}
}
// If we get here, either there's no body or invalid JSON
// Write a proper error response
w.Header().Set("Content-Type", "application/json")
// Determine appropriate status code
statusCode := rw.statusCode
if statusCode == http.StatusOK {
statusCode = http.StatusInternalServerError
}
w.WriteHeader(statusCode)
errorMsg := "Internal server error"
if statusCode == http.StatusRequestTimeout {
errorMsg = "Request timeout"
}
response := map[string]interface{}{
"status": "error",
"error": map[string]interface{}{
"msg": errorMsg,
"code": statusCode,
},
}
_ = json.NewEncoder(w).Encode(response)
})
}
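The responseWriter used above is defined outside this hunk; a buffering implementation consistent with how the middleware uses it (statusCode and body fields, writes captured rather than sent) would be roughly:

```go
import (
	"bytes"
	"net/http"
)

// responseWriter buffers status and body so JSONValidationMiddleware
// can inspect the response before anything reaches the client. Sketch;
// the real type lives elsewhere in this package.
type responseWriter struct {
	http.ResponseWriter
	body       *bytes.Buffer
	statusCode int
}

// WriteHeader records the status code instead of sending it.
func (rw *responseWriter) WriteHeader(code int) {
	rw.statusCode = code
}

// Write captures the body for later validation.
func (rw *responseWriter) Write(b []byte) (int, error) {
	return rw.body.Write(b)
}
```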


@ -16,14 +16,14 @@ func (s *Server) setupRoutes() {
r.Use(middleware.RealIP)
r.Use(middleware.Logger)
r.Use(middleware.Recoverer)
const requestTimeout = 2 * time.Second
const requestTimeout = 8 * time.Second
r.Use(TimeoutMiddleware(requestTimeout))
r.Use(JSONResponseMiddleware)
// Routes
r.Get("/", s.handleRoot())
r.Get("/status", s.handleStatusHTML())
r.Get("/status.json", s.handleStatusJSON())
r.Get("/status.json", JSONValidationMiddleware(s.handleStatusJSON()).ServeHTTP)
// AS and prefix detail pages
r.Get("/as/{asn}", s.handleASDetail())
@ -33,6 +33,7 @@ func (s *Server) setupRoutes() {
// API routes
r.Route("/api/v1", func(r chi.Router) {
r.Use(JSONValidationMiddleware)
r.Get("/stats", s.handleStats())
r.Get("/ip/{ip}", s.handleIPLookup())
r.Get("/as/{asn}", s.handleASDetailJSON())


@ -42,7 +42,7 @@ func (s *Server) Start() error {
port = "8080"
}
const readHeaderTimeout = 10 * time.Second
const readHeaderTimeout = 40 * time.Second
s.srv = &http.Server{
Addr: ":" + port,
Handler: s.router,


@ -8,6 +8,7 @@ import (
"encoding/json"
"fmt"
"math"
"math/rand"
"net/http"
"sync"
"sync/atomic"
@ -29,6 +30,10 @@ const (
bytesPerKB = 1024
bytesPerMB = 1024 * 1024
maxConcurrentHandlers = 800 // Maximum number of concurrent message handlers
// Backpressure constants
backpressureThreshold = 0.5 // Start dropping at 50% queue utilization
backpressureSlope = 2.0 // Slope for linear drop probability increase
)
// MessageHandler is an interface for handling RIS messages
@ -49,12 +54,13 @@ type RawMessageHandler func(line string)
// handlerMetrics tracks performance metrics for a handler
type handlerMetrics struct {
processedCount uint64 // Total messages processed
droppedCount uint64 // Total messages dropped
totalTime time.Duration // Total processing time (for average calculation)
minTime time.Duration // Minimum processing time
maxTime time.Duration // Maximum processing time
mu sync.Mutex // Protects the metrics
processedCount uint64 // Total messages processed
droppedCount uint64 // Total messages dropped
totalTime time.Duration // Total processing time (for average calculation)
minTime time.Duration // Minimum processing time
maxTime time.Duration // Maximum processing time
queueHighWaterMark int // Maximum queue length seen
mu sync.Mutex // Protects the metrics
}
// handlerInfo wraps a handler with its queue and metrics
@ -74,7 +80,8 @@ type Streamer struct {
cancel context.CancelFunc
running bool
metrics *metrics.Tracker
totalDropped uint64 // Total dropped messages across all handlers
totalDropped uint64 // Total dropped messages across all handlers
random *rand.Rand // Random number generator for backpressure drops
}
// New creates a new RIS streamer
@ -86,6 +93,8 @@ func New(logger *logger.Logger, metrics *metrics.Tracker) *Streamer {
},
handlers: make([]*handlerInfo, 0),
metrics: metrics,
//nolint:gosec // Non-cryptographic randomness is fine for backpressure
random: rand.New(rand.NewSource(time.Now().UnixNano())),
}
}
@ -206,14 +215,15 @@ func (s *Streamer) GetMetricsTracker() *metrics.Tracker {
// HandlerStats represents metrics for a single handler
type HandlerStats struct {
Name string
QueueLength int
QueueCapacity int
ProcessedCount uint64
DroppedCount uint64
AvgProcessTime time.Duration
MinProcessTime time.Duration
MaxProcessTime time.Duration
Name string
QueueLength int
QueueCapacity int
QueueHighWaterMark int
ProcessedCount uint64
DroppedCount uint64
AvgProcessTime time.Duration
MinProcessTime time.Duration
MaxProcessTime time.Duration
}
// GetHandlerStats returns current handler statistics
@ -227,13 +237,14 @@ func (s *Streamer) GetHandlerStats() []HandlerStats {
info.metrics.mu.Lock()
hs := HandlerStats{
Name: fmt.Sprintf("%T", info.handler),
QueueLength: len(info.queue),
QueueCapacity: cap(info.queue),
ProcessedCount: info.metrics.processedCount,
DroppedCount: info.metrics.droppedCount,
MinProcessTime: info.metrics.minTime,
MaxProcessTime: info.metrics.maxTime,
Name: fmt.Sprintf("%T", info.handler),
QueueLength: len(info.queue),
QueueCapacity: cap(info.queue),
QueueHighWaterMark: info.metrics.queueHighWaterMark,
ProcessedCount: info.metrics.processedCount,
DroppedCount: info.metrics.droppedCount,
MinProcessTime: info.metrics.minTime,
MaxProcessTime: info.metrics.maxTime,
}
// Calculate average time
@ -526,15 +537,33 @@ func (s *Streamer) stream(ctx context.Context) error {
// Dispatch to interested handlers
s.mu.RLock()
for _, info := range s.handlers {
if info.handler.WantsMessage(msg.Type) {
select {
case info.queue <- &msg:
// Message queued successfully
default:
// Queue is full, drop the message
atomic.AddUint64(&info.metrics.droppedCount, 1)
atomic.AddUint64(&s.totalDropped, 1)
if !info.handler.WantsMessage(msg.Type) {
continue
}
// Check if we should drop due to backpressure
if s.shouldDropForBackpressure(info) {
atomic.AddUint64(&info.metrics.droppedCount, 1)
atomic.AddUint64(&s.totalDropped, 1)
continue
}
// Try to queue the message
select {
case info.queue <- &msg:
// Message queued successfully
// Update high water mark if needed
queueLen := len(info.queue)
info.metrics.mu.Lock()
if queueLen > info.metrics.queueHighWaterMark {
info.metrics.queueHighWaterMark = queueLen
}
info.metrics.mu.Unlock()
default:
// Queue is full, drop the message
atomic.AddUint64(&info.metrics.droppedCount, 1)
atomic.AddUint64(&s.totalDropped, 1)
}
}
s.mu.RUnlock()
@ -546,3 +575,25 @@ func (s *Streamer) stream(ctx context.Context) error {
return nil
}
// shouldDropForBackpressure determines if a message should be dropped based on queue utilization
func (s *Streamer) shouldDropForBackpressure(info *handlerInfo) bool {
// Calculate queue utilization
queueLen := len(info.queue)
queueCap := cap(info.queue)
utilization := float64(queueLen) / float64(queueCap)
// No drops below threshold
if utilization < backpressureThreshold {
return false
}
// Calculate drop probability (0.0 at threshold, 1.0 at 100% full)
dropProbability := (utilization - backpressureThreshold) * backpressureSlope
if dropProbability > 1.0 {
dropProbability = 1.0
}
// Random drop based on probability
return s.random.Float64() < dropProbability
}


@ -3,7 +3,7 @@
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AS{{.ASN.Number}} - {{.ASN.Handle}} - RouteWatch</title>
<title>AS{{.ASN.ASN}} - {{.ASN.Handle}} - RouteWatch</title>
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
@ -136,7 +136,7 @@
<div class="container">
<a href="/status" class="nav-link">← Back to Status</a>
<h1>AS{{.ASN.Number}}{{if .ASN.Handle}} - {{.ASN.Handle}}{{end}}</h1>
<h1>AS{{.ASN.ASN}}{{if .ASN.Handle}} - {{.ASN.Handle}}{{end}}</h1>
{{if .ASN.Description}}
<p class="subtitle">{{.ASN.Description}}</p>
{{end}}
@ -154,6 +154,10 @@
<div class="info-label">IPv6 Prefixes</div>
<div class="info-value">{{.IPv6Count}}</div>
</div>
<div class="info-card">
<div class="info-label">Peer ASNs</div>
<div class="info-value">{{.PeerCount}}</div>
</div>
<div class="info-card">
<div class="info-label">First Seen</div>
<div class="info-value">{{.ASN.FirstSeen.Format "2006-01-02"}}</div>
@ -223,6 +227,44 @@
<p>No prefixes announced by this AS</p>
</div>
{{end}}
{{if .Peers}}
<div class="prefix-section">
<div class="prefix-header">
<h2>Peer ASNs</h2>
<span class="prefix-count">{{.PeerCount}}</span>
</div>
<table class="prefix-table">
<thead>
<tr>
<th>ASN</th>
<th>Handle</th>
<th>Description</th>
<th>First Seen</th>
<th>Last Seen</th>
</tr>
</thead>
<tbody>
{{range .Peers}}
<tr>
<td><a href="/as/{{.ASN}}" class="prefix-link">AS{{.ASN}}</a></td>
<td>{{if .Handle}}{{.Handle}}{{else}}-{{end}}</td>
<td>{{if .Description}}{{.Description}}{{else}}-{{end}}</td>
<td>{{.FirstSeen.Format "2006-01-02"}}</td>
<td>{{.LastSeen.Format "2006-01-02"}}</td>
</tr>
{{end}}
</tbody>
</table>
</div>
{{else}}
<div class="prefix-section">
<h2>Peer ASNs</h2>
<div class="empty-state">
<p>No peering relationships found for this AS</p>
</div>
</div>
{{end}}
</div>
</body>
</html>


@ -207,7 +207,7 @@
<div class="origin-list">
{{range .Origins}}
<div class="origin-item">
<a href="/as/{{.Number}}" class="as-link">AS{{.Number}}</a>
<a href="/as/{{.ASN}}" class="as-link">AS{{.ASN}}</a>
{{if .Handle}} ({{.Handle}}){{end}}
<span style="color: #7f8c8d; margin-left: 10px;">{{.PeerCount}} peer{{if ne .PeerCount 1}}s{{end}}</span>
</div>
@ -240,7 +240,7 @@
<a href="/as/{{.OriginASN}}" class="as-link">AS{{.OriginASN}}</a>
</td>
<td class="peer-ip">{{.PeerIP}}</td>
<td class="as-path">{{range $i, $as := .ASPathWithHandle}}{{if $i}} → {{end}}<a href="/as/{{$as.Number}}" class="as-link">{{if $as.Handle}}{{$as.Handle}}{{else}}AS{{$as.Number}}{{end}}</a>{{end}}</td>
<td class="as-path">{{range $i, $as := .ASPathWithHandle}}{{if $i}} → {{end}}<a href="/as/{{$as.ASN}}" class="as-link">{{if $as.Handle}}{{$as.Handle}}{{else}}AS{{$as.ASN}}{{end}}</a>{{end}}</td>
<td class="peer-ip">{{.NextHop}}</td>
<td>{{.LastUpdated.Format "2006-01-02 15:04:05"}}</td>
<td class="age">{{.LastUpdated | timeSince}}</td>


@ -264,6 +264,10 @@
<span class="metric-label">Queue</span>
<span class="metric-value">${handler.queue_length}/${handler.queue_capacity}</span>
</div>
<div class="metric">
<span class="metric-label">High Water Mark</span>
<span class="metric-value">${handler.queue_high_water_mark}/${handler.queue_capacity} (${Math.round(handler.queue_high_water_mark * 100 / handler.queue_capacity)}%)</span>
</div>
<div class="metric">
<span class="metric-label">Processed</span>
<span class="metric-value">${formatNumber(handler.processed_count)}</span>
@ -286,6 +290,39 @@
});
}
function resetAllFields() {
// Reset all metric fields to '-'
document.getElementById('connected').textContent = '-';
document.getElementById('connected').className = 'metric-value';
document.getElementById('uptime').textContent = '-';
document.getElementById('go_version').textContent = '-';
document.getElementById('goroutines').textContent = '-';
document.getElementById('memory_usage').textContent = '-';
document.getElementById('total_messages').textContent = '-';
document.getElementById('messages_per_sec').textContent = '-';
document.getElementById('total_bytes').textContent = '-';
document.getElementById('mbits_per_sec').textContent = '-';
document.getElementById('asns').textContent = '-';
document.getElementById('prefixes').textContent = '-';
document.getElementById('ipv4_prefixes').textContent = '-';
document.getElementById('ipv6_prefixes').textContent = '-';
document.getElementById('peerings').textContent = '-';
document.getElementById('peers').textContent = '-';
document.getElementById('database_size').textContent = '-';
document.getElementById('live_routes').textContent = '-';
document.getElementById('ipv4_routes').textContent = '-';
document.getElementById('ipv6_routes').textContent = '-';
document.getElementById('ipv4_updates_per_sec').textContent = '-';
document.getElementById('ipv6_updates_per_sec').textContent = '-';
// Clear handler stats
document.getElementById('handler-stats-container').innerHTML = '';
// Clear prefix distributions
document.getElementById('ipv4-prefix-distribution').innerHTML = '<div class="metric"><span class="metric-label">No data</span></div>';
document.getElementById('ipv6-prefix-distribution').innerHTML = '<div class="metric"><span class="metric-label">No data</span></div>';
}
function updateStatus() {
fetch('/api/v1/stats')
.then(response => response.json())
@ -294,6 +331,7 @@
if (response.status === 'error') {
document.getElementById('error').textContent = 'Error: ' + response.error.msg;
document.getElementById('error').style.display = 'block';
resetAllFields();
return;
}
@ -340,12 +378,13 @@
.catch(error => {
document.getElementById('error').textContent = 'Error fetching status: ' + error;
document.getElementById('error').style.display = 'block';
resetAllFields();
});
}
// Update immediately and then every 500ms
// Update immediately and then every 2 seconds
updateStatus();
setInterval(updateStatus, 500);
setInterval(updateStatus, 2000);
</script>
</body>
</html>

385054 log.txt

File diff suppressed because it is too large.