# How to Tell If Your AI Agent Is Stuck (With Real Data From 220 Loops)


Source: Dev.to

## The problem

How do you know if your autonomous agent is making progress or just spinning? I've been running an AI agent in an autonomous loop (15-minute intervals, 220+ iterations), and I built a diagnostic tool to answer that question with data instead of guesswork.

Autonomous agents generate activity: commits, files, logs. It looks like work. But after 100+ loops, I discovered my agent had been:

- Declaring success on empty achievements
- Generating artifacts nobody used
- Repeating the same patterns across dozens of loops

I only caught it because an external audit reviewed the raw data. The agent's own summaries said everything was fine.

## What the diagnostic tool does

`diagnose.py` reads three files from an `improve/` directory:

- `signals.jsonl` - append-only log of friction, failures, waste, and stagnation
- `patterns.json` - aggregated fingerprints with counts and statuses
- `scoreboard.json` - response effectiveness tracking

From that, it computes:

- **Regime classification.** Each loop gets classified as productive, stagnating, stuck, failing, or recovering based on its signal distribution.
- **Feedback loop detection.** Finds cases where a response (a script meant to fix a problem) is actually amplifying the signals it should suppress. I had one generating 13x more signals than it suppressed.
- **Response effectiveness.** Which automated fixes are actually working? In my data, only 50% of responses reduced their target signal rate.
- **Chronic issues.** What keeps recurring? My top chronic issue: `zero-users-zero-revenue` at 29 occurrences across 40 loops. Honest.

## What the output looks like

```
============================================================
BOUCLE DIAGNOSTICS
============================================================
Current regime: productive
Loops analyzed: 41
Loop efficiency: 55.0% productive, 45.0% problematic
Breakdown: productive: 22, stagnating: 12, stuck: 4, failing: 2
Feedback loops: 5 detected, all resolved ✓
Response effectiveness: 6/12 responses reducing signals
Top recurring issues:
  [ 29x] zero-users-zero-revenue (active)
  [  8x] loop-silence (resolved)

RECOMMENDATIONS:
🟠 [HIGH] 'zero-users-zero-revenue' occurred 29x and remains active.
```

## The signal format

Each signal is a single JSON line:

```json
{"ts":"2026-03-08T06:00:00Z","loop":222,"type":"friction","source":"manual","summary":"DEV.to API returned 404","fingerprint":"devto-api-404"}
```

Types: friction, failure, waste, stagnation, silence, surprise.

The fingerprint is a short slug that groups related signals. The engine counts occurrences, detects patterns, and promotes the top unaddressed pattern for action.

## What I learned from the data

**45% of loops had problems.** Not catastrophic failures, mostly stagnation and getting stuck on the same issues. The agent was active but not productive.

**Feedback loops are real.** I built a "loop silence" detector that fired when the agent hadn't committed in 60+ minutes. The detector itself generated signals, which triggered more detection, which generated more signals: a 13.3x amplification loop. The fix: remove the detector entirely.

**Responses have a 50% hit rate.** Of the 12 automated responses I built, 6 actually reduced their target signal rate. The other 6 either did nothing or made things worse. Without measurement, I would have assumed they all worked.

**The biggest chronic issue can't be fixed by automation.** `zero-users-zero-revenue` occurred 29 times. No script fixes that. It's a distribution and product-market-fit problem, not an engineering problem. The tool correctly surfaced it as unresolved, and correctly stopped trying to generate automated fixes for it.

## How to use it

Zero dependencies, stdlib Python only:

```bash
# Clone the tool
git clone https://github.com/Bande-a-Bonnot/Boucle-framework.git
cd Boucle-framework/tools/diagnose

# Run against your improve/ directory
python3 diagnose.py --improve-dir /path/to/your/improve/

# JSON output for programmatic use
python3 diagnose.py --improve-dir /path/to/improve/ --json
```

Or as a Boucle framework plugin:

```bash
cp tools/diagnose/diagnose.py plugins/diagnose.py
boucle diagnose
```

## Who this is for

Anyone running an AI agent in a loop (cron jobs, scheduled tasks, autonomous coding agents) who wants to know whether the agent is actually making progress or just generating noise.

The signal/pattern/scoreboard format is generic. You don't need the Boucle framework. You just need to log signals in JSONL and aggregate them into patterns.

Source: Boucle framework / tools/diagnose. 15 tests, zero dependencies.
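If you want to adopt the signal/pattern format without the framework, the core of it is small. Here's a minimal sketch of logging signals as JSONL and aggregating them by fingerprint; the function names are mine, not the tool's actual API:

```python
import json
from collections import Counter
from pathlib import Path

def log_signal(path, loop, sig_type, summary, fingerprint, ts):
    """Append one signal as a single JSON line (the signals.jsonl shape
    from the article). The field names match the example signal above."""
    entry = {"ts": ts, "loop": loop, "type": sig_type,
             "source": "manual", "summary": summary,
             "fingerprint": fingerprint}
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

def aggregate(path):
    """Count signals per fingerprint: the raw material for patterns.json.
    The top unaddressed fingerprint is the candidate to act on."""
    counts = Counter()
    for line in Path(path).read_text().splitlines():
        if line.strip():
            counts[json.loads(line)["fingerprint"]] += 1
    return counts
```

Because the log is append-only JSONL, aggregation is a straight scan; there is no state to corrupt if a loop dies mid-write except at most the final line.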
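The regime classification can be approximated with simple thresholds on each loop's signal mix. This is a toy version; the labels match the article, but the cutoffs here are illustrative assumptions, not the tool's actual rules:

```python
from collections import Counter

def classify_loop(signal_types, made_progress):
    """Assign one loop a regime label from its signals.
    signal_types: list of signal type strings emitted during the loop.
    made_progress: whether the loop produced a real artifact (e.g. a commit).
    Thresholds are made up for illustration."""
    counts = Counter(signal_types)
    if counts["failure"] >= 2:
        return "failing"
    if counts["stagnation"] >= 2 and not made_progress:
        return "stuck"
    if counts["stagnation"] >= 1:
        return "stagnating"
    if made_progress and counts["failure"] == 0:
        return "productive"
    return "recovering"
```

Run over a window of recent loops, the label distribution gives you the efficiency breakdown shown in the sample output (22 productive, 12 stagnating, and so on).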
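Finally, the feedback-loop check reduces to a ratio: signals a response generates versus signals it suppresses. A hypothetical sketch (not the tool's implementation) of the 13.3x amplification case:

```python
def amplification_ratio(generated, suppressed):
    """Ratio of signals a response generated to signals it suppressed.
    Above 1.0, the 'fix' is feeding the problem it was meant to solve."""
    if suppressed == 0:
        return float("inf") if generated else 0.0
    return generated / suppressed

def is_feedback_loop(generated, suppressed, threshold=1.0):
    """Flag a response whose amplification ratio exceeds the threshold."""
    return amplification_ratio(generated, suppressed) > threshold
```

A response with a ratio well above 1.0 is a candidate for removal rather than tuning, which is exactly what happened to the loop-silence detector.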