Endpoint

  • POST /api/chat/stream
  • response content-type: text/event-stream
  • frames are emitted as data: {json}\n\n
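Since each frame is a `data: {json}` line followed by a blank line, splitting the body on blank lines recovers the individual JSON payloads. A minimal parsing sketch (the sample payloads below are taken from the examples in this document):

```python
import json

def parse_sse_frames(raw: str) -> list[dict]:
    """Split a text/event-stream body into JSON event payloads.

    Frames arrive as `data: {json}` lines separated by blank lines,
    as described for POST /api/chat/stream above.
    """
    events = []
    for frame in raw.split("\n\n"):
        frame = frame.strip()
        if frame.startswith("data: "):
            events.append(json.loads(frame[len("data: "):]))
    return events

# Example stream body containing two frames:
body = (
    'data: {"type": "meta", "conversation_id": "c1"}\n\n'
    'data: {"type": "token", "content": "Hello"}\n\n'
)
events = parse_sse_frames(body)
```

In a real client you would feed decoded chunks from the response body into a buffer and parse complete frames as they arrive, rather than waiting for the whole body.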

Event Types

  • meta (conversation metadata, e.g. conversation_id)
  • status (progress update, e.g. "searching")
  • clear (discard any partially streamed output)
  • token (incremental answer text)
  • done (terminal success)
  • error (terminal failure)

Minimal Event Examples

{ "type": "meta", "conversation_id": "..." }
{ "type": "status", "status": "searching" }
{ "type": "token", "content": "Hello" }
{
  "type": "done",
  "sources": [],
  "answer_origin": "library_rag",
  "provenance_note": "No sources used for this answer.",
  "question_type": "content",
  "message_id": 123,
  "max_rerank_score": 0.71,
  "model_name": "llama-3.3-70b-versatile",
  "provider": "groq",
  "timing": {
    "search_ms": 10,
    "rerank_ms": 12,
    "llm_ms": 320,
    "total_ms": 380
  }
}
{ "type": "error", "message": "I hit an internal error while streaming. Please retry." }

Ordering Contract

Typical order:
  1. meta
  2. status (searching)
  3. branch-dependent status/clear/token
  4. terminal done or terminal error
Terminal rule:
  • the stream ends with exactly one terminal event (done or error)
  • clients should ignore any events received after the terminal event
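The terminal rule can be enforced with a small guard that stops consuming once the first terminal event is seen, so late frames are dropped. A minimal sketch:

```python
TERMINAL = {"done", "error"}

def consume(events):
    """Yield events up to and including the first terminal event.

    Anything emitted after done/error is ignored, per the
    terminal rule above.
    """
    for ev in events:
        yield ev
        if ev.get("type") in TERMINAL:
            return

# A stream that (incorrectly) emits a frame after the terminal event:
stream = [
    {"type": "meta"},
    {"type": "status", "status": "searching"},
    {"type": "token", "content": "Hi"},
    {"type": "done"},
    {"type": "token", "content": "late, dropped"},
]
seen = [ev["type"] for ev in consume(stream)]
```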

Frontend Contract Notes

Clients should:
  • ignore unknown event types safely
  • treat suggested_sources as discovery actions (not citations)
  • use message_id from done for feedback binding
  • treat model_name and provider as optional metadata (present when an LLM path executed)
  • when provenance_note contains a degraded-service warning (e.g., “reranker temporarily unavailable”), surface it to the user but do not treat the response as failed; the answer is still usable, just at lower confidence
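The notes above amount to a small client-side reducer: accumulate token content, reset on clear, capture message_id and provenance_note from done, and skip anything unrecognized. A minimal sketch (field names come from the event examples in this document; the overall shape is illustrative, not prescriptive):

```python
def reduce_stream(events):
    """Fold a parsed event stream into displayable client state.

    Follows the contract notes: unknown event types are ignored,
    clear discards partial output, and message_id from done is
    retained for feedback binding.
    """
    state = {"text": "", "message_id": None, "provenance_note": None}
    for ev in events:
        t = ev.get("type")
        if t == "token":
            state["text"] += ev.get("content", "")
        elif t == "clear":
            state["text"] = ""  # discard partially streamed output
        elif t == "done":
            state["message_id"] = ev.get("message_id")
            state["provenance_note"] = ev.get("provenance_note")
            break  # terminal success
        elif t == "error":
            break  # terminal failure; state holds whatever streamed
        # meta, status, and unknown types are ignored safely
    return state

# Example: a branch that clears its first draft, plus an unknown event type.
final = reduce_stream([
    {"type": "meta", "conversation_id": "c1"},
    {"type": "token", "content": "draft"},
    {"type": "clear"},
    {"type": "future_event"},  # unknown type, ignored
    {"type": "token", "content": "Hello"},
    {"type": "done", "message_id": 123, "provenance_note": "No sources used for this answer."},
])
```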