Core Signals
Request timing
All API responses include an X-Response-Time header.
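A minimal sketch of reading this value on the client side. The value format is not specified here, so the helper assumes a number with an optional `ms` or `s` suffix and treats a bare number as milliseconds; `response_time_ms` is a hypothetical name, not part of the API.

```python
import re
from typing import Mapping, Optional


def response_time_ms(headers: Mapping[str, str]) -> Optional[float]:
    """Return X-Response-Time in milliseconds, or None if absent or unparseable.

    Assumption: the value looks like "12.5", "12.5ms", or "0.0125s".
    A bare number is treated as milliseconds.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-response-time")
    if raw is None:
        return None

    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(ms|s)?\s*", raw)
    if match is None:
        return None

    value = float(match.group(1))
    # Convert seconds to milliseconds; everything else is already milliseconds.
    return value * 1000.0 if match.group(2) == "s" else value
```

Returning `None` instead of raising keeps the helper safe to call on responses from proxies or error pages that may drop the header.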
Provider state
Inspect provider and circuit status via GET /api/stats/providers.
Cost and usage
Inspect LLM usage and cost telemetry via GET /api/stats/costs.
Health status
- GET /health
- GET /health?deep=true (detailed dependency checks)
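The two health variants can be probed with a small standard-library client, sketched below. The base URL is an assumption, `health_url` and `fetch_health` are hypothetical helpers, and the body is returned as raw text because the response schema is not described here.

```python
import urllib.error
import urllib.request
from typing import Tuple
from urllib.parse import urlencode


def health_url(base_url: str, deep: bool = False) -> str:
    """Build the /health URL, adding ?deep=true for dependency checks."""
    url = base_url.rstrip("/") + "/health"
    return f"{url}?{urlencode({'deep': 'true'})}" if deep else url


def fetch_health(base_url: str, deep: bool = False, timeout: float = 5.0) -> Tuple[int, str]:
    """Return (status_code, body_text) for the health endpoint."""
    try:
        with urllib.request.urlopen(health_url(base_url, deep), timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; an unhealthy service is still a useful answer.
        return err.code, err.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Assumed local address; adjust to wherever the API is served.
    print(fetch_health("http://localhost:8000", deep=True))
```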
Eval and quality telemetry
- GET /api/evals/runs
- GET /api/evals/runs/{run_id}
- GET /api/evals/compare
Streaming Timing Fields
The terminal payload of a chat stream can include:
- timing.search_ms
- timing.rerank_ms
- timing.llm_ms
- timing.total_ms
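One way to use these fields is to compare the per-stage timings against the total, as in the sketch below. It assumes the fields are nested under a `timing` object in the terminal payload and that the stages do not overlap; `timing_breakdown` and the derived `other_ms` value are illustrative, not part of the API.

```python
from typing import Any, Dict, Mapping

STAGES = ("search_ms", "rerank_ms", "llm_ms")


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def timing_breakdown(payload: Mapping[str, Any]) -> Dict[str, float]:
    """Extract the timing fields that are present and numeric.

    When total_ms is available, also report other_ms: the part of the total
    not attributed to search, rerank, or the LLM call.
    """
    timing = payload.get("timing") or {}
    result: Dict[str, float] = {}

    # Every field is optional, so copy only what is actually there.
    for name in STAGES:
        if _is_number(timing.get(name)):
            result[name] = float(timing[name])

    total = timing.get("total_ms")
    if _is_number(total):
        result["total_ms"] = float(total)
        result["other_ms"] = float(total) - sum(result.get(s, 0.0) for s in STAGES)

    return result
```

A large `other_ms` relative to `total_ms` suggests time spent outside the three measured stages, such as queuing or serialization.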
Logging and Tracing
The runtime uses structured logs and optional tracing integrations. If optional tracing backends are unavailable, core API behavior should remain operational.
Services Tab Mapping
The /monitor surface maps to observability data like this:
| Services tab | Observability inputs |
|---|---|
| Health | /health, /health?deep=true |
| Architecture | architecture docs + runtime status metadata |
| Eval Runs | /api/evals/runs, /api/evals/compare |
| Databases | local store topology (data/*.db, data/chroma/) + data-store docs |
| Exploration | local curated monitor queue for external tools to evaluate |
| Tracing | provider stats + tracing integration state |
Practical Debug Loop
- confirm health endpoint
- check provider status and circuit state
- inspect response timing and stream timing fields
- inspect usage/cost endpoint for provider error spikes
- reproduce with a minimal request and capture logs
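The endpoint checks in this loop can be scripted, as in the sketch below. It covers only the health, provider, and cost steps; inspecting timing fields and reproducing a request remain manual. The base URL, `run_checks`, and `http_get` are assumptions for illustration, and only status codes are examined because response schemas are not described here.

```python
import urllib.error
import urllib.request
from typing import Any, Callable, Dict, List, Tuple

# Endpoints in the order the loop visits them.
CHECKS = [
    ("health", "/health"),
    ("deep health", "/health?deep=true"),
    ("provider status", "/api/stats/providers"),
    ("usage and cost", "/api/stats/costs"),
]


def http_get(url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Return (status_code, body_text); HTTP errors are reported, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")


def run_checks(
    base_url: str,
    fetch: Callable[[str], Tuple[int, str]] = http_get,
) -> List[Dict[str, Any]]:
    """Call each endpoint in order and record whether it answered with 2xx."""
    results: List[Dict[str, Any]] = []
    for label, path in CHECKS:
        url = base_url.rstrip("/") + path
        try:
            status, _body = fetch(url)
            results.append({"check": label, "url": url, "status": status,
                            "ok": 200 <= status < 300})
        except OSError as exc:
            # Connection failures (URLError is an OSError) count as a failed check.
            results.append({"check": label, "url": url, "status": None,
                            "ok": False, "error": str(exc)})
    return results


if __name__ == "__main__":
    # Assumed local address; adjust to wherever the API is served.
    for row in run_checks("http://localhost:8000"):
        print(row)
```

Passing `fetch` as a parameter lets the same loop run against a stub in tests or against a client that adds authentication.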