Tools: AI Governance Leaderboard: We Scanned 21 Top Repos Before RSA 2026

Source: Dev.to

RSA Conference 2026 starts March 23. Every AI security vendor will be on stage talking about governance, compliance, and responsible AI. We wanted to see what governance actually looks like in the repos people are shipping.

So we scanned 21 of the most popular AI/ML repositories using the same governance scanner anyone can run for free. No manual review. No subjective scoring. Just structural analysis of what each repo enforces automatically.

The results are not great.

View the full interactive leaderboard

The Numbers

- 21 repos scanned across AI agent frameworks, ML libraries, web frameworks, and AI SDKs
- Average score: 53/100 (grade C)
- Only 2 repos (10%) score 70+ and are on track for EU AI Act readiness
- 6 repos (29%) have any AI governance configuration (CLAUDE.md or .cursorrules)
- 1 repo scored an F

vLLM leads the pack at 78/100 with pre-commit hooks, 7 CI/CD workflows, a security policy, and Dependabot. Its one critical finding: 2 .env files committed to source control.

Bottom 3

BabyAGI's 17/100 is the lowest score in the set. No CI/CD pipeline, no enforcement hooks, no security policy, no governance config. It scores points only for having a test directory and basic project hygiene.

The Pattern: CI/CD Without Enforcement

The most striking finding across all 21 repos: nearly every project has CI/CD, but almost none enforce rules structurally.

Most repos scored 15/15 on CI/CD. They have GitHub Actions. They run tests in the pipeline. That part of modern software development is well-adopted.

But enforcement -- pre-commit hooks, commit-lint, CODEOWNERS, branch protection -- averages only 11/30 across all repos. This is the gap: rules exist in documentation but are not structurally enforced before code enters the pipeline.

This is exactly what we call the "detection gap" in the enforcement ladder framework. You can detect violations in CI, but by then the code is already committed. Structural enforcement catches problems before they enter the system.

AI Governance Is Nearly Absent

Only 6 of 21 repos (29%) have any AI governance configuration -- a CLAUDE.md file or .cursorrules. This means that in 71% of the most popular AI/ML repos, AI coding tools operate with zero structural guidance. When a developer uses Cursor, Claude Code, or GitHub Copilot on these repos, the AI has no project-specific rules to follow. No constraints on what it can modify. No enforced patterns. The governance score for these repos on this dimension: 0/15.

The repos that do have governance configs: vLLM, LiteLLM, AutoGPT, LangChain, Transformers, and LocalAI.

What the Scores Mean

Our scanner evaluates 6 dimensions (100 points total):

- Enforcement (30 pts): Pre-commit hooks, commit-lint, CODEOWNERS, branch protection
- CI/CD (15 pts): GitHub Actions, Travis CI, CircleCI workflows
- Security (20 pts): Security policy, .gitignore, no committed .env files, Dependabot/Renovate
- Testing (10 pts): Test configuration files, test directories
- Governance (15 pts): CLAUDE.md, .cursorrules, governance directories
- Hygiene (10 pts): README, CONTRIBUTING, LICENSE, CHANGELOG, lockfiles

Grades: A (80+), B (60-79), C (40-59), D (20-39), F (below 20).

Category Breakdown

AI Agent Frameworks (8 repos, avg 47/100)

The agent frameworks -- the repos building autonomous AI systems -- scored the lowest as a category. AutoGPT leads at 68, but BabyAGI (17), Autogen (30), and SuperAGI (41) drag the average down. These are the repos building systems that make autonomous decisions, and they have the least governance infrastructure.

ML Libraries (3 repos, avg 62/100)

vLLM (78) lifts this category. scikit-learn and Transformers both score 54 -- solid CI/CD and testing, but weak on enforcement and governance.

Web Frameworks (3 repos, avg 58/100)

FastAPI (62), Pydantic (59), Django (54). These established projects have mature CI/CD but mostly lack AI governance configs and full enforcement tooling.

AI SDKs (4 repos, avg 56/100)

The Anthropic SDK (55), OpenAI SDK (53), LlamaIndex (58), and DSPy (56) cluster tightly in the C range. The Anthropic SDK notably has no pre-commit hooks despite being from the company that makes Claude.

Local AI / Inference (3 repos, avg 53/100)

LiteLLM (72) stands out. Ollama (36) is the weakest -- no enforcement hooks, no test infrastructure detected, and no governance config.

Methodology

All scans were run on March 16, 2026 using the Walseth AI Governance Scanner -- the same tool available for free at walseth.ai/scan. Scores are point-in-time snapshots based on the default branch at scan time.

The scanner analyzes the file tree of each repository via the GitHub API. It checks for the presence of specific files and directories that indicate structural governance. It does not read file contents beyond filenames and paths. Repos that fail to scan (private, rate-limited, or not found) are excluded. All 21 repos in this leaderboard scanned successfully.

What Would It Take to Score an A?

No repo in this scan scored an A (80+). To get there, a project would need:

- Pre-commit hooks AND commit-lint AND CODEOWNERS (25/30 enforcement)
- 3+ CI/CD workflows (15/15)
- Security policy + Dependabot + no committed .env files (17-20/20)
- Test config + test directories (10/10)
- CLAUDE.md or .cursorrules + governance directory (15/15)
- README + CONTRIBUTING + LICENSE + lockfile (8-10/10)

The tooling exists. The patterns are well-understood. Most projects just have not prioritized structural enforcement alongside their CI/CD pipelines.

Scan Your Own Repo

Every score in this leaderboard was generated by the same free scanner you can run right now: Scan your repo free at walseth.ai/scan

Want a deeper analysis? Our $497 Full Governance Report covers 30+ dimensions with specific remediation steps and a compliance roadmap.

View the full interactive leaderboard with sortable columns

Last scanned: March 16, 2026. Scores are point-in-time snapshots. Run the scanner to get the latest score for any repo.

Originally published at walseth.ai
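Of the "Score an A" checklist items, the enforcement rung is the one most scanned repos are missing, and it is the cheapest to add. A starting-point `.pre-commit-config.yaml` sketch using standard public hooks (pin each `rev` to the current tag before adopting; the hook selection here is a suggestion, not the scanner's requirement):

```yaml
# .pre-commit-config.yaml -- structural enforcement before code is committed.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0                    # pin to the current tag in your repo
    hooks:
      - id: check-yaml             # reject malformed YAML
      - id: end-of-file-fixer
      - id: check-added-large-files
      - id: detect-private-key     # block committed credentials
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4
    hooks:
      - id: gitleaks               # scan staged changes for secrets
```

Pairing this with a CODEOWNERS file and commit-lint is what the checklist above counts as 25/30 on enforcement, and the secret-scanning hooks directly target the committed-.env-file finding.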
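The dimension caps and grade bands described above are easy to encode. Here is a minimal sketch; the scanner's actual implementation is not public, so the dimension names and clamping behavior below are illustrative assumptions, with only the point caps and grade cutoffs taken from the article:

```python
# Dimension caps from the leaderboard's rubric (sums to 100 points).
DIMENSION_CAPS = {
    "enforcement": 30,
    "cicd": 15,
    "security": 20,
    "testing": 10,
    "governance": 15,
    "hygiene": 10,
}


def total_score(dimension_points: dict) -> int:
    """Clamp each dimension to its cap and sum into a 0-100 score."""
    return sum(
        min(dimension_points.get(name, 0), cap)
        for name, cap in DIMENSION_CAPS.items()
    )


def grade(score: int) -> str:
    """Map a 0-100 score to the article's grade bands."""
    if score >= 80:
        return "A"
    if score >= 60:
        return "B"
    if score >= 40:
        return "C"
    if score >= 20:
        return "D"
    return "F"
```

Under these bands, vLLM's 78 grades as a B, the 53-point average as a C, and BabyAGI's 17 as the set's lone F, matching the leaderboard.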
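The methodology section says the scanner does presence checks over a repo's file tree via the GitHub API, without reading file contents. That approach can be reproduced with the public "git trees" endpoint. A sketch, assuming a small hypothetical indicator set; the real scanner's file list and weights are not published:

```python
import json
import urllib.request

# Hypothetical indicator paths -- the real scanner's list is not public.
INDICATORS = {
    "pre_commit": {".pre-commit-config.yaml"},
    "codeowners": {"CODEOWNERS", ".github/CODEOWNERS", "docs/CODEOWNERS"},
    "security_policy": {"SECURITY.md", ".github/SECURITY.md"},
    "dependabot": {".github/dependabot.yml"},
    "ai_governance": {"CLAUDE.md", ".cursorrules"},
}


def governance_indicators(paths: list) -> dict:
    """Presence checks over file paths only -- no file contents are read."""
    found = set(paths)
    report = {name: bool(found & candidates) for name, candidates in INDICATORS.items()}
    # Committed .env files count as a negative finding (vLLM's one critical hit).
    report["committed_env_files"] = any(
        p == ".env" or p.endswith("/.env") for p in paths
    )
    return report


def repo_paths(owner: str, repo: str, branch: str = "main") -> list:
    """Fetch a branch's full file tree via the GitHub REST API."""
    url = f"https://api.github.com/repos/{owner}/{repo}/git/trees/{branch}?recursive=1"
    with urllib.request.urlopen(url) as resp:
        tree = json.load(resp)["tree"]
    return [entry["path"] for entry in tree if entry["type"] == "blob"]
```

For example, `governance_indicators(repo_paths("vllm-project", "vllm"))` would report which indicators that repo's default branch exposes at the moment you run it (note the endpoint is rate-limited without an auth token).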