SYSTEM_METRICS

Monitor infrastructure performance, latency, and error rates

[ TOTAL_QPS ]

104.7

Queries per second across all routes

[ AVG_ERROR_RATE ]

2.53%

Average error rate across all routes

[ AVG_LATENCY ]

451ms

Average response time

ROUTES_PERFORMANCE

Detailed metrics for each API endpoint

Route	QPS	Avg Latency	Error Rate
/api/classify	27.7	367ms	2.50%
/api/qa	26.8	515ms	2.39%
/api/extract	25.7	479ms	2.86%
/api/summarize	24.5	444ms	2.37%
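The headline cards are consistent with simple aggregates over this table: TOTAL_QPS matches the sum of the per-route QPS values, and the two averages match unweighted means of the per-route columns. A minimal Python sketch that reproduces them follows. The `ROUTES` structure and the `summarize` function are illustrative names, not part of the dashboard, and the aggregation method is inferred from the numbers rather than documented.

```python
from statistics import mean

# Per-route figures copied from the table above:
# route -> (QPS, average latency in ms, error rate in %).
ROUTES = {
    "/api/classify": (27.7, 367, 2.50),
    "/api/qa": (26.8, 515, 2.39),
    "/api/extract": (25.7, 479, 2.86),
    "/api/summarize": (24.5, 444, 2.37),
}


def summarize(routes):
    """Return total QPS plus unweighted mean latency and error rate."""
    qps, latency, errors = zip(*routes.values())
    return {
        "total_qps": round(sum(qps), 1),
        "avg_latency_ms": round(mean(latency)),
        "avg_error_rate_pct": round(mean(errors), 2),
    }


print(summarize(ROUTES))
# {'total_qps': 104.7, 'avg_latency_ms': 451, 'avg_error_rate_pct': 2.53}
```

Weighting latency by QPS instead would give roughly 450ms, so the displayed 451ms suggests an unweighted mean; this is an inference from the figures shown.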

ERROR_RATE_BY_ROUTE

Lower is better - fewer failures

LATENCY_BY_ROUTE

Lower is better - faster response times

THROUGHPUT_BY_ROUTE

Queries per second for each endpoint

SYSTEM_HEALTH_INSIGHTS

Highest Traffic Route

Route: /api/classify

QPS: 27.7

Latency: 367ms

Slowest Route

Route: /api/qa

Latency: 515ms

QPS: 26.8

Most Errors

Route: /api/extract

Error Rate: 2.86%

Status: Warning
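The three insight cards above can be derived from the route table by taking the maximum of each metric. The sketch below shows one way to do that in Python; the record layout, the `top_route` helper, and the `insights` keys are hypothetical names chosen for illustration, not the dashboard's actual implementation.

```python
# Per-route figures copied from the ROUTES_PERFORMANCE table.
ROUTES = [
    {"route": "/api/classify", "qps": 27.7, "latency_ms": 367, "error_rate_pct": 2.50},
    {"route": "/api/qa", "qps": 26.8, "latency_ms": 515, "error_rate_pct": 2.39},
    {"route": "/api/extract", "qps": 25.7, "latency_ms": 479, "error_rate_pct": 2.86},
    {"route": "/api/summarize", "qps": 24.5, "latency_ms": 444, "error_rate_pct": 2.37},
]


def top_route(routes, metric):
    """Return the name of the route with the largest value for `metric`."""
    return max(routes, key=lambda r: r[metric])["route"]


insights = {
    "highest_traffic": top_route(ROUTES, "qps"),
    "slowest": top_route(ROUTES, "latency_ms"),
    "most_errors": top_route(ROUTES, "error_rate_pct"),
}

for label, route in insights.items():
    print(f"{label}: {route}")
```

With the table values, this selects /api/classify, /api/qa, and /api/extract respectively, matching the cards.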

System Recommendations

Consider scaling for high traffic
Monitor trends over longer time periods

SYSTEM_UPTIME

Service availability and status monitoring

API GATEWAY

99.9%

24h uptime

MODEL SERVICE

99.8%

24h uptime

DATABASE

100%

24h uptime

ANALYTICS

99.7%

24h uptime
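To make the 24h uptime percentages easier to interpret, they can be converted into minutes of downtime. The sketch below assumes uptime is measured over a full 1,440-minute window and that the percentages are exact; both are assumptions, since the dashboard does not state how uptime is computed or rounded. The `downtime_minutes` helper is a hypothetical name.

```python
WINDOW_MINUTES = 24 * 60  # assumed 24h measurement window

# Uptime percentages copied from the cards above.
UPTIME_PCT = {
    "API GATEWAY": 99.9,
    "MODEL SERVICE": 99.8,
    "DATABASE": 100.0,
    "ANALYTICS": 99.7,
}


def downtime_minutes(uptime_pct, window_minutes=WINDOW_MINUTES):
    """Convert an uptime percentage into minutes of downtime in the window."""
    return round((100.0 - uptime_pct) / 100.0 * window_minutes, 2)


for service, pct in UPTIME_PCT.items():
    print(f"{service}: {downtime_minutes(pct)} min down in 24h")
```

Under these assumptions, 99.9% corresponds to about 1.4 minutes of downtime per day and 99.7% to about 4.3 minutes.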

LAST 30 DAYS

Status chart legend: orange = up, gray = incidents

RECENT INCIDENTS

Model Service Timeout

Nov 28, 2024 14:23 UTC • Resolved in 4m

2.1% downtime

API Gateway Latency Spike

Nov 25, 2024 09:15 UTC • Auto-resolved in 12m

Performance degraded

Scheduled maintenance completed

Nov 20, 2024 02:00 UTC • 30m planned downtime

Maintenance

METRIC_DEFINITIONS

QPS (Queries Per Second)

Number of requests handled per second. Higher values indicate more traffic. Use for capacity planning.

Average Latency

Mean response time for requests. Lower is better. Spikes indicate performance degradation or bottlenecks.

Error Rate

Percentage of failed requests. Target <2% for production systems. Investigate routes with >5% error rate.
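The error-rate guidance above can be expressed as a simple threshold check. In the sketch below, the 2% and 5% cut-offs come from the definition; the "Warning" label matches the status shown for /api/extract at 2.86%, while the "OK" and "Investigate" labels and the treatment of the exact boundary values are assumptions made for illustration.

```python
TARGET_PCT = 2.0       # "Target <2% for production systems"
INVESTIGATE_PCT = 5.0  # "Investigate routes with >5% error rate"


def error_status(error_rate_pct):
    """Map an error rate (in percent) to a status label.

    "OK" and "Investigate" are hypothetical labels; only "Warning"
    appears on the dashboard itself.
    """
    if error_rate_pct < TARGET_PCT:
        return "OK"
    if error_rate_pct > INVESTIGATE_PCT:
        return "Investigate"
    return "Warning"


# Error rates copied from the ROUTES_PERFORMANCE table.
for route, rate in [
    ("/api/classify", 2.50),
    ("/api/qa", 2.39),
    ("/api/extract", 2.86),
    ("/api/summarize", 2.37),
]:
    print(f"{route}: {rate:.2f}% -> {error_status(rate)}")
```

Under this reading, all four routes currently sit between the target and the investigation threshold, so each would be labeled "Warning".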