Retrieval Stack
- query rewrite: multi-turn follow-ups are rewritten to standalone queries (see Chat Internals)
- lexical candidates from FTS5
- semantic candidates from vector search
- merge + dedupe
- rerank top candidates
- apply feedback-weighted adjustment
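The merge + dedupe step above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the Candidate type, its fields, and the interleaving strategy are all assumptions made for the example. It alternates between the lexical and semantic lists and keeps the first occurrence of each document id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical candidate shape; field names are assumptions.
    doc_id: str
    source: str  # "lexical" or "semantic"


def merge_dedupe(lexical, semantic):
    """Interleave two ranked lists, keeping the first occurrence of each doc_id."""
    merged = []
    seen = set()
    # Alternate between the lists so neither source dominates the head.
    for i in range(max(len(lexical), len(semantic))):
        for ranked in (lexical, semantic):
            if i < len(ranked) and ranked[i].doc_id not in seen:
                seen.add(ranked[i].doc_id)
                merged.append(ranked[i])
    return merged


lexical = [Candidate("a", "lexical"), Candidate("b", "lexical")]
semantic = [Candidate("b", "semantic"), Candidate("c", "semantic")]
merged = merge_dedupe(lexical, semantic)
print([c.doc_id for c in merged])
```

Interleaving is only one possible merge policy; a score-based fusion would fit the same slot, since the reranker decides the final order either way.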
Default Retrieval Shape
- high-recall candidate generation before rerank
- final top-k context for generation
- rerank resilience: retry transient errors 2x with backoff, then fall back to vector similarity scores (L2-to-similarity conversion); the reranker_available flag is threaded through timing and response for observability
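One way the rerank resilience behavior could look is sketched below. Several details are assumptions rather than facts from this document: "2x" is read as one initial attempt plus two retries, the backoff is exponential, TransientRerankError is a hypothetical exception type, and the L2-to-similarity conversion is taken to be 1 / (1 + distance). The real conversion and retry policy may differ.

```python
import time


class TransientRerankError(Exception):
    """Hypothetical marker for retryable reranker failures."""


def l2_to_similarity(distance):
    # Assumed conversion: distance 0 maps to 1.0 and larger distances approach 0.
    return 1.0 / (1.0 + distance)


def rerank_with_fallback(rerank, query, candidates, retries=2,
                         base_delay=0.1, sleep=time.sleep):
    """candidates is a list of (doc_id, l2_distance) pairs.

    Returns (scored, reranker_available), where scored is a list of
    (doc_id, score) pairs sorted by descending score.
    """
    doc_ids = [doc_id for doc_id, _ in candidates]
    for attempt in range(retries + 1):
        try:
            scores = rerank(query, doc_ids)
            scored = sorted(zip(doc_ids, scores), key=lambda p: p[1], reverse=True)
            return scored, True
        except TransientRerankError:
            if attempt < retries:
                # Exponential backoff before the next attempt.
                sleep(base_delay * 2 ** attempt)
    # All attempts failed: fall back to vector similarity derived from L2 distance.
    fallback = [(doc_id, l2_to_similarity(dist)) for doc_id, dist in candidates]
    fallback.sort(key=lambda p: p[1], reverse=True)
    return fallback, False
```

The second return value plays the role of the reranker_available flag, so a caller can record in timing data and in the response whether the scores came from the reranker or from the fallback.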