# Tools: AI Agents Are Making Decisions Nobody Can Audit
2026-02-21
## The problem nobody wants to talk about

Last month, a developer posted on Reddit about an AI agent that got stuck in a loop and fired off 50,000 API requests before anyone noticed. Production was down. The bill was ugly. And the worst part? Nobody could tell exactly what the agent had been doing, or why.

This isn't an edge case anymore. It's Tuesday.

AI agents are everywhere now. They're calling APIs, querying databases, executing code, and in some cases spending real money — all autonomously. The frameworks for building them are incredible: CrewAI, LangChain, AutoGen, OpenAI's Agents SDK — they make it shockingly easy to stand up an agent that can do real work.

But here's what none of these frameworks gives you: visibility into what your agent actually did. No audit trail. No kill switch. No way to replay what happened after something goes wrong. No policy enforcement before a dangerous action executes. And perhaps most concerning — no PII redaction. Every prompt and completion your agent generates ships directly to your observability backend with customer data, API keys, and internal information fully intact.

Every team I've talked to handles this differently. Most don't handle it at all.

## Why this is an infrastructure problem, not an application problem

Think about TLS. Nobody implements TLS differently in every microservice. It's a standardized layer that sits below application code and handles encryption for everything above it. Agent safety needs to work the same way. If every team builds its own logging, its own kill switches, and its own policy checks, you get inconsistency, gaps, and the kind of "we'll deal with it later" approach that leads to the 50,000-request incident above.

The safety layer needs to be:

- Framework-agnostic — works whether you're using CrewAI, LangChain, AutoGen, or something custom
- Infrastructure-level — operates in the network path and telemetry pipeline, not inside agent code
- Standardized — uses OpenTelemetry so it plugs into whatever observability stack you already have

## What I built

I've been working on an open-source project called AIR Blackbox — think of it as a flight recorder for AI agents.
It sits between your agents and your LLM providers and captures everything. The architecture is straightforward: a one-line change — swap your base_url — and every agent call flows through it. No SDK changes. No code refactoring.

Here's what each piece does:

- **Gateway** — an OpenAI-compatible reverse proxy written in Go. It intercepts all LLM traffic, emits structured OpenTelemetry traces, and checks policies before forwarding requests. Any OpenAI-compatible client works without modification.
- **Policy Engine** — evaluates requests against YAML-defined policies in real time: risk tiers (low, medium, high, critical), trust scoring, programmable kill switches, and human-in-the-loop gates for high-risk operations. This isn't monitoring after the fact — it's governance before the action happens.
- **OTel Collector** — a custom processor for gen_ai telemetry. PII redaction uses hash-and-preview (a 48-character preview plus a hash, so you can debug without exposing full data), alongside cost metrics and loop detection — the thing that would have caught that 50,000-request incident before it became a disaster.
- **Episode Store** — groups individual traces into task-level episodes you can replay. When something goes wrong, you don't sift through raw logs — you replay the episode like rewinding a tape.

## The part I didn't expect

When I started building this, I thought the hard problem would be the technical architecture. It wasn't. OpenTelemetry gives you a solid foundation, Go is great for proxies, and the plumbing turned out to be the straightforward part.

The hard problem is convincing people they need this before the incident happens. Every team I talk to says some version of "we're being careful," "our agents are simple," or "we'll add monitoring later." And then later arrives as a production incident, a leaked API key, or an auditor asking questions nobody prepared for.

The companies that are thinking about this — the ones deploying agents in regulated industries, in healthcare, in finance — already know.
They're the ones asking: "Can we prove what our agent did? Can we shut it down instantly? Can we guarantee PII doesn't leak into our trace backend?"

These aren't hypothetical questions. ISO 27001 auditors are starting to ask them. SOC 2 reviewers are starting to ask them. And if your answer is "we log stuff to CloudWatch," that's not going to cut it.

## What's next

AIR Blackbox is fully open source under Apache 2.0. It spans 21 repositories and is fully modular — you can use the whole stack or just the pieces you need. There are trust plugins for CrewAI, LangChain, AutoGen, and OpenAI's Agents SDK, and a five-minute quickstart gets the full stack running locally with `make up`.

If you're deploying AI agents in production — or planning to — I'd genuinely appreciate your feedback. What gaps are you seeing? What keeps you up at night?

GitHub: github.com/airblackbox — there are interactive demos in the README if you want to explore without installing anything.

I'm building AIR Blackbox because I think agent safety shouldn't be an afterthought bolted on after the first incident. It should be infrastructure — boring, reliable, and already running when the 50,001st request tries to fire.
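For reference, the "swap your base_url" integration described earlier can be as small as one setting. This is a sketch, not project documentation: the gateway address below is a hypothetical placeholder, and the exact host and port depend on your deployment.

```python
import os

# Official OpenAI SDKs read OPENAI_BASE_URL when a client is constructed,
# so an existing agent can be routed through the gateway without code changes.
# "localhost:8080" is an invented example address.
os.environ["OPENAI_BASE_URL"] = "http://localhost:8080/v1"

# Equivalent for clients constructed explicitly in code:
#   client = OpenAI(base_url="http://localhost:8080/v1")
```

Because the gateway speaks the OpenAI wire protocol, the same trick should work for any OpenAI-compatible client, which is what makes the layer framework-agnostic.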
The architecture at a glance:

```
Your Agent ──→ Gateway ──→ Policy Engine ──→ LLM Provider
                  │              │
                  ▼              ▼
           OTel Collector   Kill Switches
                  │         Trust Scoring
                  ▼         Risk Tiers
            Episode Store
        Jaeger · Prometheus
```
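To make the Policy Engine's pre-flight governance concrete, here is an illustrative pure-Python sketch of a risk-tier check with a kill switch. The policy structure, names, and decisions are invented for illustration and are not the project's actual API; they mirror what a YAML policy file might express once parsed.

```python
from dataclasses import dataclass

# Hypothetical policy, roughly what a parsed YAML policy might look like.
POLICY = {
    "kill_switch": False,  # flip to True to halt all agent traffic instantly
    "tiers": {
        "low": "allow",
        "medium": "allow",
        "high": "require_approval",   # human-in-the-loop gate
        "critical": "deny",
    },
}

@dataclass
class Request:
    tool: str
    risk_tier: str  # "low" | "medium" | "high" | "critical"

def evaluate(req: Request, policy=POLICY) -> str:
    """Decide before the action executes: allow, require_approval, or deny."""
    if policy["kill_switch"]:
        return "deny"
    # Unknown tiers fail closed rather than open.
    return policy["tiers"].get(req.risk_tier, "deny")

print(evaluate(Request("db.query", "medium")))       # allow
print(evaluate(Request("payments.charge", "high")))  # require_approval
```

The point of the sketch is the ordering: the decision happens before the request is forwarded to the provider, which is what distinguishes governance from after-the-fact monitoring.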
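Loop detection, the check that would have flagged the 50,000-request incident, can be approximated by counting near-identical requests inside a sliding time window. The threshold and window below are invented for illustration, not AIR Blackbox defaults.

```python
import hashlib
import time
from collections import deque

class LoopDetector:
    """Flag an agent that repeats near-identical requests too quickly."""

    def __init__(self, window_s=60.0, max_repeats=20):
        self.window_s = window_s        # how far back to look, in seconds
        self.max_repeats = max_repeats  # repeats tolerated inside the window
        self.seen = {}                  # request fingerprint -> timestamps

    def check(self, prompt, now=None):
        """Return True when the request looks like part of a loop."""
        now = time.monotonic() if now is None else now
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        times = self.seen.setdefault(key, deque())
        times.append(now)
        # Drop timestamps that have fallen out of the window.
        while times and now - times[0] > self.window_s:
            times.popleft()
        return len(times) > self.max_repeats

detector = LoopDetector(window_s=60, max_repeats=3)
for i in range(5):
    looping = detector.check("GET /orders/123", now=float(i))
print(looping)  # True: 5 identical requests inside the window
```

A real collector would key on more than the raw prompt (tool name, arguments, caller identity), but the window-and-threshold shape is the core of the idea.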
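The collector's hash-and-preview redaction (a 48-character preview plus a hash, so traces stay debuggable without exposing full payloads) can be sketched as follows. The output format here is invented; only the preview-plus-hash idea comes from the post.

```python
import hashlib

PREVIEW_LEN = 48  # matches the 48-character preview described above

def redact(value: str) -> str:
    """Replace a sensitive span attribute with a truncated preview plus a digest.

    The digest lets you correlate identical values across traces
    without ever storing the full text.
    """
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    preview = value[:PREVIEW_LEN]
    suffix = "…" if len(value) > PREVIEW_LEN else ""
    return f"{preview}{suffix} [sha256:{digest[:16]}]"

prompt = "Customer SSN 123-45-6789 asked to update their billing address on file"
print(redact(prompt))
```

Applied in the telemetry pipeline, every `gen_ai` prompt and completion attribute would pass through something like this before reaching the backend, so raw customer data never lands in Jaeger or Prometheus.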