Your DevOps automation is invisible to AI. That's AI-Debt. And it's compounding.

The future is agent-orchestration. MCP is its language.

What vendors have built — and why AWS is leading this race

AI-Debt

What's locked: Interface-Debt

What's dark: Context-Debt

The AI-Debt audit: two questions

How custom MCP tools pay down both debts: three examples

Example 1: Context-Debt (same function, different quality)

Example 2: Interface-Debt (wrapping an existing script)

Example 3: The purely contextual tool (no vendor equivalent)

A note on diminishing AI-Debt: why vendors won't close this gap

The window is closing

Where to start

A new concept for platform and DevOps engineers, and why the window to act is narrower than you think.

A few months ago I set out to build an internal DevOps agent. The goal was straightforward: diagnose pipeline failures and surface root causes faster than any engineer could manually. I was writing Python functions, connecting to the ADO REST API, the Kubernetes client, and the Azure SDK, building the integration layer from scratch. Then a senior colleague asked one question that changed everything: "Have you looked at the Azure MCP Server?"

I hadn't. That question opened a window into an entire vendor ecosystem being assembled at speed, and into a far more important question about what it had not yet built. That gap has a name. This article is about it.

The future is agent-orchestration. MCP is its language.

We are moving from a world where automation meant writing explicit instructions for machines to one where autonomous agents receive a goal and reason their way to achieving it. For every DevOps and platform team, the question is whether their existing automation will be visible to those agents, or invisible.

The interface that makes automation agent-visible is the tool: a callable function an agent can discover, invoke, and reason over. The open standard governing how tools are described, discovered, and called is the Model Context Protocol (MCP).

MCP has three components. The MCP Host is the environment where the agent runs (an IDE like Cursor or Kiro, a platform like GitHub Copilot, or a custom agent you build); it contains the LLM doing the reasoning. The MCP Client lives inside the host and handles protocol communication. The MCP Server is where tools live, exposing callable functions and responding to invocations.

In Python, a tool on an MCP Server is a decorated async function. The @mcp.tool() decorator is the registration contract. The docstring (the text in triple quotes) is what the agent reads when deciding whether and how to call this function. Not optional documentation.
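Concretely, registration follows this shape. Below is a minimal stdlib-only sketch with a stand-in decorator; in the official Python `mcp` SDK the decorator comes from the FastMCP class, and the registry here only illustrates the contract:

```python
# Stand-in sketch of @mcp.tool() registration. The real decorator comes
# from an MCP SDK (e.g. FastMCP in the official Python "mcp" package).
# It must do two things: make the function discoverable by name, and
# expose its signature + docstring as the agent's reasoning interface.
from typing import Any, Callable

TOOL_REGISTRY: dict[str, Callable[..., Any]] = {}

def tool() -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOL_REGISTRY[fn.__name__] = fn  # agent can now discover this tool
        return fn                        # function remains directly usable
    return register

@tool()
async def get_build_logs(organization: str, build_id: int) -> str:
    """Retrieve the full log output for a specific ADO pipeline build."""
    ...  # real implementation would call the ADO REST API

# The server would now advertise "get_build_logs" together with its docstring.
```

The point of the sketch: the function body never reaches the model. Only the name, signature, and docstring do.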
The docstring is the agent's primary reasoning interface into your tool. More on this shortly.

SDK downloads grew from 100,000 at launch to over 8 million by April 2025. In December 2025, Anthropic donated MCP to the Linux Foundation, co-governed by OpenAI, Google, Microsoft, AWS, and Salesforce. The same governance move Kubernetes made with the CNCF. When something enters Linux Foundation governance, it stops being one vendor's experiment and becomes shared infrastructure.

What vendors have built — and why AWS is leading this race

Every major cloud vendor has now built production-grade MCP servers for their own ecosystems. What matters for this article is what they have built, what they haven't, and what that gap means for every DevOps platform already running.

- Azure MCP Server (GA) exposes 40+ Azure services as agent-callable tools; the ADO MCP Server covers pipelines, builds, pull requests, and repositories.
- AWS embedded MCP into Bedrock AgentCore, with IAM permissions and CloudTrail audit logging per tool call.
- Google released fully managed MCP servers for GKE, BigQuery, AlloyDB, and Spanner.
- Microsoft shipped a dedicated SQL MCP Server covering SQL Server, PostgreSQL, and Cosmos DB. Zero code, open source, free.

Beyond their own services, the vendors have gone further, building tools to make your existing automation agent-ready without custom code. This is where the comparison gets interesting.

AWS is the clear pioneer. At AWS Summit New York in July 2025, they announced a $100 million investment in their Generative AI Innovation Center, with agentic AI as the centrepiece. At re:Invent 2025, they shipped domain-specific autonomous agents (a DevOps Agent and a Security Agent) alongside Kiro, an agentic IDE. Their open-source Strands Agents SDK introduced a model-first design philosophy: instead of developers hardcoding every workflow path, the LLM reasons over available tools and decides the path itself. AWS has made agent development a first-class engineering discipline, with tooling, documentation, and production infrastructure to match.
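Mechanically, the "auto-wrap" idea the vendors are converging on can be sketched in a few lines: derive an agent-callable tool manifest from an API description. The OpenAPI fragment and field names below are hypothetical; real services add authentication, protocol translation, and hosting on top:

```python
# Illustrative sketch of what vendor auto-wrap services do mechanically:
# turn an OpenAPI operation into an MCP-style tool manifest.
# The spec fragment is hypothetical.
openapi_op = {
    "operationId": "listFailedBuilds",
    "summary": "List failed pipeline runs for a project.",
    "parameters": [
        {"name": "project", "schema": {"type": "string"}, "required": True},
        {"name": "days", "schema": {"type": "integer"}, "required": False},
    ],
}

def wrap_operation(op: dict) -> dict:
    """Convert one OpenAPI operation into an MCP-style tool manifest."""
    return {
        "name": op["operationId"],
        "description": op.get("summary", ""),  # all the context an agent gets
        "inputSchema": {
            "type": "object",
            "properties": {p["name"]: p["schema"] for p in op["parameters"]},
            "required": [p["name"] for p in op["parameters"] if p.get("required")],
        },
    }

manifest = wrap_operation(openapi_op)
# Note: the description is only as good as the spec's summary.
# Auto-wrap pays down Interface-Debt, not Context-Debt.
```

The closing comment is the crux of this article's argument: a wrapped API is callable, but the agent still only knows whatever the spec's summary happened to say.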
The centrepiece for existing automation is AgentCore Gateway: a fully managed service that converts your existing APIs and Lambda functions into MCP-compatible tools automatically. You provide an OpenAPI specification or a Lambda ARN, and the Gateway handles protocol translation, authentication, semantic tool discovery, and observability. No custom code required.

Azure has equivalent capability but spread across three services. Azure APIM can expose any REST API as an MCP server: import your OpenAPI spec, click "Create MCP server", done. Azure Functions has a native mcpToolTrigger binding. Microsoft Foundry provides governance across 1,400+ connectors alongside custom tools, authenticated through Entra. The capability is there, but it requires more coordination across services compared to AWS's single-surface approach.

Google's Apigee converts any managed API to an MCP server without changing the underlying service. Powerful for GCP-native APIs, but Apigee has historically been a complex enterprise product and lacks the seamless function-wrapping simplicity of AgentCore Gateway.

The green rows are real, meaningful progress. AWS leads on developer experience and unification. The red rows tell the more important story. And they have a name.

AI-Debt

AI-Debt is the human automation your team built that AI agents cannot reach. Human is the key word. This is automation written by engineers, for engineers: scripts run manually, pipelines triggered by people, YAML files committed to repos and executed on build agents with specific tooling installed. It works perfectly for the people who use it today. The debt only becomes visible the day an agent arrives and has nothing to call.

AI-Debt has two distinct components: Interface-Debt and Context-Debt. Both matter, and neither is solved by any vendor.

What's locked: Interface-Debt

Interface-Debt is automation that exists but cannot be called by agents.
It has no discoverable interface: no function signature, no API endpoint, no callable handle that an agent can find and invoke.

- Your ADO pipeline YAML that runs a Helmfile deployment: not callable by agents.
- Your PowerShell script that creates Azure resources: not callable by agents.
- Your Bash script that validates secrets before a deployment: not callable by agents.
- Your kubectl wrapper that diagnoses stuck pods: not callable by agents.
- Bicep templates, ARM parameters, Makefile targets, cron scripts: none of them are discoverable or invocable.

Vendors are reducing Interface-Debt for API-surface automation (code that already has a callable interface: Lambda functions, REST APIs, Azure Functions, managed cloud endpoints). Valuable progress. But it only covers automation that already exposes a typed, invocable surface. A Bash script running on an ADO pipeline agent has no Lambda ARN. A Helmfile task has no OpenAPI spec. The auto-wrap tools have nothing to target.

For Azure/ADO environments specifically, the gap is significant. Years of YAML. Hundreds of pipeline tasks. Thousands of shell functions. All invisible to agents.

What's dark: Context-Debt

Context-Debt is callable automation that agents cannot use intelligently, because the tools carry no description of when to use them, what they do, or how they behave on your specific platform.

When an AI agent is given a set of tools, it decides which tool to call, and how, entirely based on the Python docstring attached to each tool. Not the code. The description.

Research published in 2026 quantified this directly: editing tool docstrings can yield up to 10 times more usage of the same underlying function in production agents. A 2026 benchmark called OpaqueToolsBench studied what happens when tools have incomplete documentation and found that LLMs consistently struggle with tools that lack clear best practices or documented failure modes.
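The mechanics are easy to see with plain introspection. Frameworks derive the tool description an agent receives from the function's signature and docstring, so a thin docstring is, literally, an empty reasoning surface. A stdlib-only sketch (tool_manifest is illustrative, not an SDK function):

```python
# Illustrative sketch of what an agent actually "sees" when a tool is
# registered: frameworks derive the tool's name, input schema, and
# description from the signature and docstring. Nothing else reaches
# the model.
import inspect
from typing import get_type_hints

def tool_manifest(fn) -> dict:
    """Build a minimal tool description of the kind sent to the model."""
    hints = get_type_hints(fn)
    hints.pop("return", None)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",  # thin docstring -> Context-Debt
        "inputs": {name: t.__name__ for name, t in hints.items()},
    }

def get_pod_logs(namespace: str, pod_name: str) -> str:
    """Get logs for a pod."""  # the agent reads only this line
    ...

manifest = tool_manifest(get_pod_logs)
```

Here `manifest["description"]` is the single sentence "Get logs for a pod." — no namespace conventions, no companion tools, no failure modes. That is Context-Debt expressed as data.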
Anthropic's own engineering team documented this from building Claude Code: when they launched the web search tool, they discovered Claude was needlessly appending "2025" to every query, biasing results. The fix was not a model change. It was improving the tool's docstring.

AgentCore Gateway can wrap your Lambda, but it cannot write the docstring that tells the agent when this tool is relevant, what your platform's naming conventions are, or why a particular failure pattern should trigger it first. That knowledge exists only in your engineers' heads, your incident history, and the habits your team has built over years. A Lambda with an empty docstring is callable. It is not agent-ready. That gap is Context-Debt.

The AI-Debt audit: two questions

Before paying down AI-Debt, you need to know how much you have. Most teams have never measured it. Two questions: how much of your automation has no callable interface at all (what's locked), and how much of what is callable carries no platform context (what's dark)? Count the numbers and you have your baseline.

Most enterprise DevOps platforms, when audited this way, find 70-90% of their automation sitting in Interface-Debt, with significant Context-Debt on whatever is callable. That number is your starting point.

How custom MCP tools pay down both debts: three examples

Custom MCP tools are new Python functions decorated with @mcp.tool(). They do two things at once: give your existing automation a callable interface (addressing Interface-Debt), and encode your platform knowledge in their docstrings (addressing Context-Debt). One new function, two debts addressed.

Example 1: Context-Debt (same function, different quality)

This tool retrieves Kubernetes pod logs, a core diagnostic step in any deployment failure; the before-and-after listings appear at the end of this article. The engineer already has a working kubectl logs call. The question is whether an agent can use it intelligently.

The key is the Python docstring. This is what the agent reads when deciding whether and how to call this function. Not documentation for your colleagues, but the agent's only reasoning interface into your tool. The code is nearly identical. The agent's ability to use it correctly is not.
The second version tells the agent when to call it, what namespace naming convention to use, which companion tool to call when it doesn't have a pod name, and what an empty response means. Without that docstring, the agent either skips the tool, calls it with the wrong parameters, or hallucinates a response.

Example 2: Interface-Debt (wrapping an existing script)

Your team has a Bash script, /scripts/get-failed-builds.sh, that queries the ADO REST API for recent pipeline failures. It has been running for two years. Developers trigger it manually or reference it in ADO pipeline tasks running on a private agent pool. No AI agent can call it. It lives on a file system, not behind an API, with no discoverable interface.

Here is how you pay down Interface-Debt: write a new MCP tool that calls the script, giving it a callable interface, while encoding platform knowledge in the docstring (the wrapper listing appears at the end of the article).

The Bash script has not changed. It still runs where it always ran. The new MCP tool is a thin wrapper that converts it from invisible to callable, and the docstring converts it from callable to agent-ready. That is what paying down Interface-Debt looks like in practice.

Example 3: The purely contextual tool (no vendor equivalent)

This tool (the final listing at the end of the article) has no script to wrap and no API to call. It queries an internal incident database built from 18 months of real platform failures. But the real value is not the database call. It is the docstring that encodes the diagnostic patterns a senior engineer applies instinctively. Think of it as team knowledge, made callable and permanent.

No vendor MCP server can build this. AgentCore Gateway has no OpenAPI spec to import. This tool exists only because someone encoded real incident history into a docstring. This is Context-Debt resolved, and the only category of tool that truly differentiates your platform's agent capability from every other organisation using the same vendor tooling.

A note on diminishing AI-Debt: why vendors won't close this gap

Could vendors extend their auto-wrap to cover scripts and pipeline tasks? Theoretically yes.
AWS could build mechanisms to execute arbitrary scripts via Lambda wrappers; Azure could auto-instrument ADO pipeline tasks. But there is a structural reason why they are unlikely to prioritise this. Vendors have no incentive to solve your platform's knowledge problem. Their investment goes into making their own services and managed resources accessible to agents: Lambda, REST APIs, their cloud-native tooling. The script-and-pipeline estate your team has accumulated over five years is yours, not theirs.

Even if a vendor shipped a zero-code script wrapper tomorrow, it would only address Interface-Debt. Context-Debt (the knowledge of when to use each tool, how your platform behaves, what your failure patterns are) remains yours to encode. No vendor will ship that for you. That is what makes the custom MCP layer valuable and defensible. It is automation that only you can build.

The window is closing

Gartner predicts 40% of enterprise applications will integrate task-specific AI agents by end of 2026, up from less than 5% in 2025. At the same time, they predict over 40% of those agentic AI projects will be cancelled by end of 2027 due to inadequate technical foundations.

Read those two together. Agents are arriving. Nearly half the projects will fail — not because the models are poor, but because the platforms were not ready. The ones that succeed will have paid down AI-Debt before the agents arrived.

Where to start

Run the audit. Count your automation across two dimensions: what's locked and what's dark. That number is your baseline.

Start with the highest-value workflows. Incident diagnosis, deployment validation, environment setup. Build custom MCP tools for those five or ten scenarios first.

Treat docstrings as engineering work. The quality of your agent's decisions is directly proportional to the quality of your docstrings. Not documentation overhead — the core of what makes a platform agent-ready.

AI-Debt is silent. Your scripts still run. Your pipelines still deploy.
Everything works perfectly for humans. The debt only becomes visible the day an agent arrives and has nothing to call. That day is closer than most platform teams realise.

Platform and DevOps engineering at a large UK financial institution. Views are my own.

I write about AI agents, cloud architecture, and occasionally things that have nothing to do with technology.

Building in this space or thinking about AI-Debt on your platform? I would be glad to hear from you.

Minimal MCP tool registration (referenced in the body):

```python
@mcp.tool()
async def get_build_logs(organization: str, build_id: int) -> str:
    """Retrieve the full log output for a specific ADO pipeline build."""
    ...
```

Example 1: get_pod_logs before and after encoding platform context.

```python
import subprocess

# Context-Debt: callable, but the agent has nothing to reason with
@mcp.tool()
async def get_pod_logs(namespace: str, pod_name: str) -> str:
    """Get logs for a pod."""
    result = subprocess.run(
        ["kubectl", "logs", pod_name, "-n", namespace],
        capture_output=True, text=True
    )
    return result.stdout
```

```python
import subprocess

# Context-Debt resolved: the docstring tells the agent
# when to call this, how to call it, and what to do next
@mcp.tool()
async def get_pod_logs(
    namespace: str,
    pod_name: str,
    tail_lines: int = 100,
    previous: bool = False
) -> str:
    """
    Retrieve recent logs from a Kubernetes pod in the AKS cluster.

    Use when diagnosing:
    - CrashLoopBackOff pods — set previous=True to see the crash reason
    - Init container failures — include init container name in pod_name
    - Startup failures during helmfile atomic deployments

    Namespace naming on this platform: {service}-{env}
    e.g. payments-dev, payments-staging, auth-prod

    If pod_name is unknown, call get_pods_in_namespace() first.

    Returns last {tail_lines} log lines. Increase for deeper history.
    Returns empty string if pod has not started emitting logs yet.
    In that case, call describe_pod() to check events instead.
    """
    cmd = ["kubectl", "logs", pod_name, "-n", namespace, f"--tail={tail_lines}"]
    if previous:
        cmd.append("--previous")
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout or result.stderr
```

Example 2: the existing Bash script, and the MCP wrapper that makes it callable.

```bash
# /scripts/get-failed-builds.sh
# Runs on ADO private agent pool, triggered manually or via pipeline task
# Usage: ./get-failed-builds.sh <project> <pipeline-name> <days>
# Returns: JSON array of failed runs with build_id, stage, duration
# No agent can reach this — it has no callable interface
```

```python
import json
import subprocess

# New MCP tool: Interface-Debt + Context-Debt resolved together
@mcp.tool()
async def get_recent_pipeline_failures(
    project: str,
    pipeline_name: str,
    days_back: int = 7
) -> list[dict]:
    """
    Get recent failed pipeline runs for a given ADO project and pipeline.
    Wraps the internal ADO query script and returns structured failure data.

    Call this as the first step in any pipeline diagnosis workflow.
    It gives you the build IDs needed for deeper analysis.

    Pipeline naming on this platform: {service}-{env}-deploy
    e.g. payments-dev-deploy, auth-staging-deploy, gateway-prod-deploy

    Returns list of failures with fields: build_id, start_time,
    failed_stage, duration_seconds, triggered_by, branch, retry_count.

    Most common failed stages on this platform:
    - helmfile-apply -> missing secrets (79%) or image pull (15%)
    - integration-tests -> environment config or dependency issues
    - security-scan -> new CVE in base image (check monthly patch cycle)

    After calling this, pass build_id to diagnose_build_failure()
    for root cause analysis.
    """
    result = subprocess.run(
        ["/scripts/get-failed-builds.sh", project, pipeline_name, str(days_back)],
        capture_output=True, text=True
    )
    return json.loads(result.stdout)
```

Example 3: the purely contextual tool.

```python
@mcp.tool()
async def get_platform_failure_pattern(
    error_signature: str,
    pipeline_stage: str,
    service_name: str | None = None
) -> dict:
    """
    Look up known failure patterns on this platform from real incident history.

    CALL THIS FIRST in any diagnosis before running other tools.
    It encodes 18 months of incident data and directs you to the
    highest-probability root cause, skipping diagnostic dead ends.

    Known patterns (error_signature -> likely cause):
    - "timed out waiting for condition" + helmfile-apply
      -> missing secret in namespace (79% of cases)
      -> next: call check_keyvault_secret_exists()
    - "ImagePullBackOff"
      -> ACR authentication failure or incorrect image tag (92%)
      -> next: call check_acr_image_exists()
    - "CrashLoopBackOff" shortly after deployment
      -> application ConfigMap missing or malformed (71%)
      -> next: call get_pod_logs(previous=True) then check_configmap()
    - "503 Service Unavailable" post-deployment with healthy pods
      -> stale Istio VirtualService conflict in namespace (58%)
      -> next: call get_all_virtualservices_for_host()

    Returns: likely_cause, confidence_percent, recommended_tools,
    similar_past_incidents, avg_resolution_minutes.

    If confidence < 50%, this is a new pattern not yet seen.
    Document it via create_incident_record() and use the generic path.
    """
    return await query_incident_database(
        error_signature, pipeline_stage, service_name
    )
```
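To see how these docstrings are meant to steer an agent at run time, here is a stubbed sketch of the diagnosis flow: static pattern data stands in for the incident database, and the tool names follow the listings above. Everything here is illustrative, not a real agent loop:

```python
# Stubbed sketch of the diagnosis flow the docstrings above steer.
# Real tools query the incident database, ADO, and the cluster.
def get_platform_failure_pattern(error_signature: str, pipeline_stage: str) -> dict:
    KNOWN = {
        ("timed out waiting for condition", "helmfile-apply"): {
            "likely_cause": "missing secret in namespace",
            "confidence_percent": 79,
            "recommended_tools": ["check_keyvault_secret_exists"],
        },
    }
    return KNOWN.get(
        (error_signature, pipeline_stage),
        {"likely_cause": "unknown", "confidence_percent": 0, "recommended_tools": []},
    )

def diagnose(error_signature: str, stage: str) -> str:
    # Pattern lookup first, exactly as the docstring instructs the agent
    pattern = get_platform_failure_pattern(error_signature, stage)
    if pattern["confidence_percent"] >= 50:
        # High-confidence known pattern: go straight to the recommended tool
        return f"{pattern['likely_cause']} -> run {pattern['recommended_tools'][0]}"
    # Low confidence: new pattern, record it and fall back to generic diagnosis
    return "new pattern -> create_incident_record() and use the generic path"

print(diagnose("timed out waiting for condition", "helmfile-apply"))
```

In a real deployment the branching lives in the model's reasoning, not in code; the docstrings are what make the model's path match this one.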