Tools: Verdict — When Policies Collide

Source: Dev.to

## What I Built

This is a submission for the Algolia Agent Studio Challenge: Consumer-Facing Non-Conversational Experiences.

Most support tools search for a policy and hand it to an agent. But real support tickets don't map to a single policy — they sit at the intersection of multiple, often contradictory ones. A customer's treadmill motor dies at 6 weeks. The 30-day return window says no. The 2-year motor warranty says yes. Which one wins?

Verdict is a decision engine that resolves these conflicts. A support agent clicks a ticket, and Verdict retrieves the relevant policy clauses from Algolia, detects where they contradict each other, and applies a resolution hierarchy — product-specific overrides general, situational overrides everything — to produce a structured verdict with full citations. No chatbot, no back-and-forth. Click a ticket, get a ruling.

The interesting case is a denial. Try the earbuds scenario: a customer wants to return opened wireless earbuds within the 30-day window. It sounds like a standard approval, but Verdict pulls a hygiene exception for in-ear audio products and denies the return. The UI renders the two policies side by side with a "VS" comparison — General Return Policy (overridden) vs. Hygiene Exception (prevails) — so the agent immediately sees why the return was blocked and can explain it to the customer.

I built this for the non-conversational track because the whole point is that there's no conversation needed. The agent's workflow is: see ticket, click, read verdict, act. The decision is proactive and fully structured — verdict cards, policy comparison panels, conflict traces — not a chat bubble.
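To make the hierarchy concrete, here is a minimal sketch of how layer-based resolution could work, assuming clause records carry the layer and priority metadata described later in the post; the `Clause` shape and the sample values are my own illustration, not Verdict's actual code:

```typescript
// Layer-based resolution sketch: among clauses that apply to a ticket,
// the highest policy_layer wins; priority breaks ties within a layer.
// (Field names follow the post; this Clause type is hypothetical.)
interface Clause {
  clause_id: string;
  policy_layer: number; // 1 = general policy … 4 = situational override
  priority: number;
  effect: string;
}

function resolve(applicable: Clause[]): Clause | undefined {
  return [...applicable].sort(
    (a, b) => b.policy_layer - a.policy_layer || b.priority - a.priority
  )[0];
}

// Treadmill scenario: expired 30-day return (layer 1) vs. 2-year motor
// warranty (layer 3). The more specific warranty clause should prevail.
const winner = resolve([
  { clause_id: "RET-1.1", policy_layer: 1, priority: 50, effect: "return_denied" },
  { clause_id: "WAR-3.1", policy_layer: 3, priority: 80, effect: "warranty_approved" },
]);
// winner?.clause_id === "WAR-3.1"
```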
Live URL: https://algoliahack.vercel.app/
Video: https://youtu.be/waHa-nvvmbA

## How I Used Algolia Agent Studio

### The index: 26 clause-level records

I indexed 26 policy clause records for a fictional retailer (Apex Gear) in a single index, `apex_gear_policies`. Each record is one clause — not a whole policy document — with structured metadata: a `policy_layer` (1-4), a priority score, a `policy_type`, `product_tags`, `conditions`, and an `effect`. I split them this way because conflict resolution requires comparing individual clauses, not whole documents, and it keeps records under Algolia's 10KB free-tier limit.

Three of the 26 records are deliberate red herrings — an expired holiday return extension, a loyalty member perk, and a bulk discount policy. They share product categories with the demo scenarios but shouldn't be cited. The agent consistently ignores them.

### Retrieval-time intelligence: ranking, Rules, and Synonyms

This is where Algolia does more than store data. The index is configured with custom ranking: `desc(policy_layer)`, `desc(priority)`, `desc(specificity_score)`. Every search result arrives pre-sorted by authority — the most specific, highest-priority clause first. The LLM receives policies in the right order before it starts reasoning, which naturally guides correct conflict resolution.

On top of custom ranking, I added 3 index-level Rules and 6 Synonym groups. The Rules promote critical override policies when the query contains trigger words — a query mentioning "hygiene" promotes HYG-4.1 (the in-ear audio hygiene exception) to position 1, ensuring the agent can't miss it even in a noisy result set. Without that Rule, a broad search for "earbuds return" surfaces general return policies first, and the agent might approve a return it should deny. The Synonyms expand the search vocabulary — "defective" matches "broken", "malfunction", and "stopped working"; "earbuds" expands to "in-ear audio" and "personal audio" — so the agent retrieves relevant clauses even when the customer's language doesn't match the index terminology.

These are all retrieval-time features.
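As an illustration of this setup, here is what one clause record and the ranking configuration might look like. The field names and ranking strings come from the description above; the concrete values, and the commented client calls (algoliasearch v4 style), are assumptions rather than Verdict's actual data:

```typescript
// A hypothetical clause-level record for the apex_gear_policies index.
// Field names follow the post; the values are illustrative only.
const clauseRecord = {
  objectID: "WAR-3.1",
  policy_layer: 3,       // 1 = general … 4 = situational override
  priority: 80,
  specificity_score: 90, // assumed field used by the third ranking criterion
  policy_type: "warranty",
  product_tags: ["Pro-Treadmill X500", "fitness"],
  conditions: "Motor failure within 24 months of purchase",
  effect: "warranty_approved",
};

// The custom ranking from the post: every result arrives pre-sorted
// by authority before the LLM sees it.
const indexSettings = {
  customRanking: [
    "desc(policy_layer)",
    "desc(priority)",
    "desc(specificity_score)",
  ],
};

// Applied with the Algolia JS client, this would look roughly like:
//   const index = algoliasearch(appId, adminKey).initIndex("apex_gear_policies");
//   await index.setSettings(indexSettings);
//   await index.saveObject(clauseRecord);
```

One clause per record keeps each payload tiny, which is why the 10KB free-tier record limit never comes into play.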
These features shape what the LLM sees before it starts reasoning — Algolia handles authority ranking and vocabulary normalization at query time; the LLM handles condition matching and explanation at reasoning time. They're complementary layers of intelligence.

### Agent Studio as orchestrator

Agent Studio runs the agentic loop. The system prompt defines a multi-step protocol: extract key information from the ticket, then perform 3 targeted Algolia searches (general return policies, product-specific warranties, situational overrides), then analyze all retrieved policies, detect conflicts, resolve them using the layer hierarchy, and output a structured XML verdict.

The agent decides which searches to run based on the ticket content. For the earbuds ticket, it searches for "hygiene earbuds in-ear audio return" because the ticket mentions opened in-ear products. For the treadmill ticket, it searches "Pro-Treadmill X500 warranty" because the issue is a mechanical failure. For the hiking boots, it searches "shipping damage carrier report override" because the package arrived crushed. The system prompt guides this decision-making, but Agent Studio executes the tool calls autonomously — I don't hardcode which searches run for which ticket.

The system prompt also enforces anti-hallucination rules: every `clause_id` in the verdict must come verbatim from search results. The agent can't invent a policy, and it can't ask the customer for more information — it either decides or escalates.

### Capturing the pipeline: SSE stream parsing

Agent Studio's `/completions` endpoint returns a Server-Sent Events stream. I built a custom SSE parser in the API route that captures every event in the agent's reasoning chain — not just the final text output. The parser correlates `tool-input-start` events (which carry the search query and a `toolCallId`) with `tool-output-available` events (which carry the actual Algolia hits for that `toolCallId`). This gives me the full pipeline: what the agent searched for, what Algolia returned, and which records the LLM ultimately cited.
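A stripped-down sketch of that correlation step, assuming events arrive as JSON on `data:` lines and carry `type`, `toolCallId`, `query`, and `hits` fields (the event names follow the post; the exact Agent Studio payload shape may differ):

```typescript
// One entry per tool call: the query the agent issued and the Algolia
// hits that came back for it, joined on toolCallId.
interface ToolTrace {
  toolCallId: string;
  query?: string;
  hits?: unknown[];
}

// Pull JSON events out of the "data: {...}" lines of a raw SSE chunk.
function parseSseEvents(raw: string): any[] {
  return raw
    .split("\n")
    .filter((line) => line.startsWith("data: "))
    .map((line) => JSON.parse(line.slice("data: ".length)));
}

// Correlate each search's input event with its output event via toolCallId,
// so every trace pairs a query with the hits it produced.
function correlateToolEvents(events: any[]): ToolTrace[] {
  const traces = new Map<string, ToolTrace>();
  for (const ev of events) {
    if (ev.type === "tool-input-start") {
      traces.set(ev.toolCallId, { toolCallId: ev.toolCallId, query: ev.query });
    } else if (ev.type === "tool-output-available") {
      const trace = traces.get(ev.toolCallId);
      if (trace) trace.hits = ev.hits; // attach the Algolia results
    }
  }
  return Array.from(traces.values());
}
```

Keying on `toolCallId` rather than event order means the trace stays correct even if the stream interleaves events from concurrent searches.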
The frontend renders this as a visible pipeline trace — each Algolia search step shows the query text, hit count, and the individual policy records returned. Records that ended up cited in the final verdict get a "Cited in verdict" badge, so you can see exactly which retrieved clauses influenced the decision. This makes Algolia's contribution transparent instead of hiding it behind the LLM's output.

### Structured output: XML over free-form text

The system prompt instructs the agent to produce XML-tagged output rather than free-form text:

```xml
<analysis>
  <verdict>APPROVED</verdict>
  <verdict_type>warranty_claim</verdict_type>
  <summary>Motor warranty overrides expired return window...</summary>
  <policies>
    <policy>
      <clause_id>WAR-3.1</clause_id>
      <policy_name>Pro-Treadmill Motor Warranty</policy_name>
      <applies>true</applies>
      <effect>warranty_approved</effect>
      <reason>Motor failed within 2-year warranty period</reason>
    </policy>
  </policies>
  <conflict><exists>true</exists>...</conflict>
  <resolution>
    <winning_policy>WAR-3.1</winning_policy>
    <rule_applied>Product-specific warranty overrides general return</rule_applied>
  </resolution>
</analysis>
```

The frontend parses this into typed components with a regex-based tag extractor. If the LLM deviates from the format, the UI falls back to displaying the raw response with a warning banner rather than crashing. Across 20+ test runs per scenario at temperature=0, the XML has been well-formed every time.

## Why Fast Retrieval Matters

Custom ranking means every search result arrives pre-sorted by policy authority. Index-level Rules promote critical override clauses to the top of the result set when trigger conditions are met. Synonyms normalize vocabulary so "motor stopped working" matches "mechanical defect" without the LLM needing to guess. These features shape the LLM's reasoning context before it processes a single token — and they're all configured in the Algolia dashboard, working transparently through Agent Studio without extra API code.

A vector database would retrieve "semantically similar" policies, which isn't what I need. When a customer reports a treadmill motor failure, Verdict gets the exact motor warranty clause for that product model (`policy_layer:3`, `applies_to:Pro-Treadmill X500`), not five vaguely related fitness equipment policies ranked by embedding distance. Structured metadata with precise filtering is the right retrieval model for compliance data.

The total analysis time (shown in the UI after each verdict) is typically 5-10 seconds — almost entirely LLM reasoning. Algolia's retrieval completes in under 50ms across all 3 searches. In a support workflow where agents triage dozens of tickets, that retrieval speed keeps the bottleneck on reasoning, not on waiting for data.

Here's what to look at:

- Treadmill X500 (Warranty vs. Return) — Green APPROVED. The motor warranty overrides the expired return window. This sets the pattern: Verdict finds conflicts and resolves them.
- SoundPro Earbuds (Hygiene Override) — Red DENIED. This is the one to watch. The earbuds are within the return window, but the hygiene exception blocks it. The VS comparison makes the conflict immediately visible.
- Alpine Hiking Boots (Damage Override) — Green APPROVED. The return window expired at 39 days, but shipping damage was reported within 48 hours. The situational override wins.
- TrailBlazer Daypack (Standard Return) — Green APPROVED, no conflict. Shows the system doesn't over-complicate simple cases.
- Custom ticket — Paste any text and watch it analyze live. This proves the verdicts aren't pre-computed.
- Policy Index — Browse all 26 Algolia records grouped by policy layer, including the 3 red herring decoy policies that the agent correctly ignores.
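As a closing implementation note, the regex-based tag extractor mentioned in the structured-output section could look roughly like this; the function names and the fallback shape are my own sketch under the assumptions in the post, not the actual frontend code:

```typescript
// Pull the text between <tag> and </tag>; null if the tag is absent.
function extractTag(xml: string, tag: string): string | null {
  const match = xml.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
  return match ? match[1].trim() : null;
}

// Parse the agent's XML-tagged response into a typed result. If the LLM
// deviates from the format, return the raw text so the UI can show it
// with a warning banner instead of crashing.
function parseVerdict(raw: string) {
  const verdict = extractTag(raw, "verdict");
  if (!verdict) {
    return { ok: false as const, raw };
  }
  return {
    ok: true as const,
    verdict,
    summary: extractTag(raw, "summary"),
    winningPolicy: extractTag(raw, "winning_policy"),
  };
}
```

Note that the pattern matches `<verdict>` literally, so it does not accidentally capture the sibling `<verdict_type>` tag.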