SELECT * FROM contracts REASON 'What are the late payment penalties?' LIMIT 5;

SELECT * FROM contracts WHERE tags CONTAINS ANY ('nda') SEARCH 'termination' REASON 'What are the exit conditions and notice periods?' LIMIT 5;
- Split the document into chunks
- Embed each chunk as a vector
- At query time, find the top-k chunks by cosine similarity
- Pass them to the LLM
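
The four steps above can be sketched in a few lines. This is a minimal illustration, not any particular library's pipeline: `split_into_chunks`, `embed`, `cosine`, and `top_k` are hypothetical helper names, and the bag-of-words `embed` is a toy stand-in for a real embedding model.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, size=8):
    """Split text into fixed-size word windows (a deliberately naive chunker)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: a bag-of-words count vector.

    Assumption: a real pipeline would call an embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k (score, chunk) pairs most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


doc = (
    "Payment is due within thirty days of invoice. "
    "Late payment incurs a penalty of two percent per month. "
    "Either party may terminate with ninety days notice."
)
chunks = split_into_chunks(doc)
results = top_k("late payment penalty", chunks)

# The selected chunks would then be concatenated into the LLM prompt.
for score, chunk in results:
    print(round(score, 3), chunk)
```

With this sample text, the fixed-size splitter cuts the penalty sentence in two ("...two percent" ends one chunk and "per month..." starts the next), so the best-scoring chunk carries only part of the answer.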
- Read the root summary — which top-level branches are relevant to this query?
- Traverse into relevant branches — read section summaries
- Drill into leaf nodes where the answer actually lives
- Return the exact passage with its full path and a confidence score
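
As a rough sketch of the traversal described above, assuming a tree of summarized nodes: `Node`, `relevance`, and `reason` are hypothetical names, the word-overlap `relevance` function stands in for an LLM judgment, and the confidence formula (mean relevance along the path) is an illustrative assumption rather than the actual scoring.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    summary: str  # for a leaf node, this holds the passage text itself
    children: list = field(default_factory=list)


def relevance(query, text):
    """Stand-in for an LLM relevance judgment: share of query words found in text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def reason(root, query, beam_width=2, threshold=0.2):
    """Descend level by level, keeping at most beam_width relevant branches."""
    frontier = [((root.title,), root, ())]  # (path, node, scores along the path)
    explored, skipped, hits = [], [], []
    while frontier:
        candidates = []
        for path, node, scores in frontier:
            if not node.children:
                # Leaf reached: record the passage with its full path.
                confidence = sum(scores) / len(scores) if scores else 0.0
                hits.append({"path": " > ".join(path),
                             "passage": node.summary,
                             "confidence": confidence})
                continue
            for child in node.children:
                score = relevance(query, child.summary)
                candidates.append((score, path + (child.title,), child, scores + (score,)))
        candidates.sort(key=lambda c: c[0], reverse=True)
        frontier = []
        for rank, (score, path, child, scores) in enumerate(candidates):
            if rank < beam_width and score >= threshold:
                explored.append(" > ".join(path))
                frontier.append((path, child, scores))
            else:
                skipped.append(" > ".join(path))
    best = max(hits, key=lambda h: h["confidence"], default=None)
    return best, {"explored": explored, "skipped": skipped}


tree = Node("Contract", "services agreement covering payment and termination", [
    Node("Payment", "payment terms invoices and late payment penalties", [
        Node("Due dates", "invoices are due within thirty days"),
        Node("Late fees", "late payment penalties accrue at two percent per month"),
    ]),
    Node("Termination", "termination notice periods and exit conditions", [
        Node("Notice", "either party may terminate with ninety days notice"),
    ]),
])

answer, trace = reason(tree, "late payment penalties")
print(answer)
print(trace)
```

The returned `trace` records which branches were explored and which were skipped, so the path to the answer can be inspected after the fact.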
- The precise answer, not a ranked list of chunks
- The full context of where it sits in the document
- A traceable reasoning path (which branches were explored, which were skipped)
- A confidence score based on structural fit, not similarity score

- redb: embedded ACID-compliant storage for the document tree
- tantivy: BM25 full-text search for candidate pre-filtering
- tokio: async parallel beam search across tree branches
- rig-core: multi-provider LLM abstraction (OpenAI, Anthropic, Gemini, and more)