Chat Internals

Pipeline Summary

Entry Points

POST /api/chat (single JSON response)
POST /api/chat/stream (SSE stream)

Both paths share core orchestration and persist user messages before downstream retrieval/generation.
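
As a rough illustration of the shared orchestration, the sketch below has both entry points call one core generator that persists the user message before any generation work starts. All names here (orchestrate, handle_chat, handle_chat_stream, the list-based store) are hypothetical stand-ins, not the project's actual functions.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Store = List[Tuple[str, str, str]]  # (conversation_id, role, text); stand-in for the messages table


def orchestrate(store: Store, conversation_id: str, text: str,
                generate: Callable[[str], Iterable[str]]) -> Iterator[str]:
    """Shared core: persist the user message first, then yield answer chunks."""
    store.append((conversation_id, "user", text))  # happens before retrieval/generation
    for chunk in generate(text):
        yield chunk


def handle_chat(store: Store, conversation_id: str, text: str,
                generate: Callable[[str], Iterable[str]]) -> Dict[str, str]:
    """JSON-style path: collect all chunks into a single response body."""
    return {"answer": "".join(orchestrate(store, conversation_id, text, generate))}


def handle_chat_stream(store: Store, conversation_id: str, text: str,
                       generate: Callable[[str], Iterable[str]]) -> Iterator[str]:
    """Streaming-style path: forward chunks as they are produced."""
    yield from orchestrate(store, conversation_id, text, generate)
```

Because persistence happens first, a failure in the downstream step still leaves the user message on record in this sketch.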

Routing Buckets

Deterministic buckets:

clear_meta
ambiguous
general
skill
content
web_search

Ambiguity follow-up state is persisted in conversation state for deterministic multi-turn resolution. Skill dispatch runs before web/content when a registered pattern or slash command matches.
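
The precedence rule for skill dispatch can be sketched as follows. This is a minimal illustration under assumptions: the route and matches_skill functions, the idea of a "candidate" bucket produced by an earlier classifier, and the example pattern are all hypothetical. The sketch only encodes what is stated above (skill wins over web_search and content) and leaves the ordering of the other buckets untouched.

```python
import re
from typing import Pattern, Sequence

BUCKETS = ("clear_meta", "ambiguous", "general", "skill", "content", "web_search")


def matches_skill(message: str, patterns: Sequence[Pattern[str]]) -> bool:
    """True for a slash command or any registered pattern (patterns are hypothetical)."""
    text = message.strip()
    return text.startswith("/") or any(p.search(text) for p in patterns)


def route(message: str, candidate: str, patterns: Sequence[Pattern[str]]) -> str:
    """Return the final bucket, letting a skill match pre-empt web/content."""
    if candidate not in BUCKETS:
        raise ValueError(f"unknown bucket: {candidate}")
    if candidate in ("web_search", "content") and matches_skill(message, patterns):
        return "skill"
    return candidate
```

The function has no randomness or model call in it, so the same message and candidate always yield the same bucket.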

Multi-Turn Query Rewriting

Before retrieval, the orchestrator rewrites vague follow-ups into standalone search queries using conversation history. This is the rewrite-retrieve-read pattern: the rewritten query flows end-to-end through both retrieval and generation. The rewriter returns one of three actions:

Action	When	Example
RETRIEVE (default)	Any message that could benefit from library content	“tell me more”, “yes”, “for the data”
CONVERSE	Pure meta-commentary about the conversation itself	“be more specific”, “elaborate on point 2”
CLARIFY	Last resort - message is truly unintelligible	Extremely rare by design

Key behaviors:

RETRIEVE is the default. A slightly off search is better than asking the user to repeat themselves.
CLARIFY never fires twice in a row. A hard-coded breaker forces RETRIEVE if the previous response was already a clarification.
CONVERSE never makes factual claims about sources. Questions like “did she mention X?” always trigger RETRIEVE.
The rewritten query anchors to the specific topic of the previous exchange, not the general source.
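
The "never twice in a row" rule can be sketched as a small post-processing step on the rewriter's output. The names below (Action, RewriteResult, apply_clarify_breaker, fallback_query) are assumptions made for illustration and are not taken from the codebase.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RETRIEVE = "RETRIEVE"
    CONVERSE = "CONVERSE"
    CLARIFY = "CLARIFY"


@dataclass
class RewriteResult:
    action: Action
    query: str  # standalone search query; may be empty for non-RETRIEVE actions


def apply_clarify_breaker(result: RewriteResult,
                          previous_was_clarification: bool,
                          fallback_query: str) -> RewriteResult:
    """Force RETRIEVE when the rewriter asks to clarify twice in a row.

    fallback_query (hypothetical) is used if the rewriter produced no query,
    for example the raw user message.
    """
    if result.action is Action.CLARIFY and previous_was_clarification:
        return RewriteResult(Action.RETRIEVE, result.query or fallback_query)
    return result
```

A first CLARIFY passes through unchanged; only the repeat is converted, which matches the stated preference for a slightly off search over asking the user again.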

Content Path

Rewrite query for multi-turn context (see above)
Run hybrid retrieval (FTS + vector + rerank) using the rewritten query
Apply feedback-weighted score adjustment
Generate final answer via fallback-aware LLM path, using the rewritten query as the question
Persist assistant message + provenance metadata
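
The five steps above can be sketched as one function with each stage injected as a callable. This is only an outline under stated assumptions: Hit, run_content_path, and the stage signatures are hypothetical, and re-sorting after the feedback adjustment is an assumption about how the adjusted scores are used.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Hit:
    doc_id: str
    score: float


def run_content_path(
    message: str,
    history: Sequence[str],
    *,
    rewrite: Callable[[str, Sequence[str]], str],
    retrieve: Callable[[str], List[Hit]],
    adjust: Callable[[List[Hit]], List[Hit]],
    generate: Callable[[str, List[Hit]], str],
    persist: Callable[[str, List[Hit]], None],
) -> str:
    query = rewrite(message, history)           # 1. standalone query from multi-turn context
    hits = retrieve(query)                      # 2. hybrid retrieval on the rewritten query
    hits = sorted(adjust(hits),                 # 3. feedback-weighted adjustment, then re-rank
                  key=lambda h: h.score, reverse=True)
    answer = generate(query, hits)              # 4. the rewritten query is the question
    persist(answer, hits)                       # 5. assistant message + provenance
    return answer
```

Passing the stages in as parameters keeps the sketch runnable with stubs and makes it visible that the same rewritten query reaches both retrieval and generation.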

Streaming Path Notes

The SSE stream emits typed events, and every event targets the same single assistant slot in frontend state. The terminal event should be either done or error.
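
A minimal sketch of the typed-event framing, assuming the standard SSE wire format (an event: line, a data: line, then a blank line). The token event name, the payload shapes, and the choice to append an error frame when the source ends without a terminal event are assumptions for illustration; only done and error come from the description above.

```python
import json
from typing import Dict, Iterable, Iterator, Tuple

TERMINAL_EVENTS = {"done", "error"}


def format_sse(event: str, data: Dict) -> str:
    """Serialize one typed event as an SSE frame."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_events(events: Iterable[Tuple[str, Dict]]) -> Iterator[str]:
    """Yield frames and stop after the first terminal event.

    If the source runs out without a terminal event, emit an error frame so
    the client always sees a clear end of stream (an assumed policy).
    """
    for name, payload in events:
        yield format_sse(name, payload)
        if name in TERMINAL_EVENTS:
            return
    yield format_sse("error", {"message": "stream ended without a terminal event"})
```

Stopping at the first terminal event means anything produced afterwards is dropped rather than written into an assistant slot that the frontend already considers finished.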

Persistence Touchpoints

conversations
messages (includes max_rerank_score, answer_origin)
feedback (separate endpoint; impacts later ranking)
