
Tools: Stop Credentialing Your AI Agents Like It's 2019 (2026)

2026-05-07 0 views admin

The Problem Nobody Talks About

The Math on Credential Exposure

Broker vs. Registry: Two Philosophies

What It Looks Like in Code

Multi-Agent Delegation: The Attack Vector Nobody Is Talking About

Why Now

The Argument, Not the Pitch

What I Built

Try It in 10 Minutes

The Question

TL;DR: Your agent lives for 2 minutes. Its credential lives for 60. That mismatch is your attack surface. A broker that issues task-scoped, short-lived credentials closes the gap before the sprawl starts.

AI agents are still new. Most teams are just now deploying their first agents at scale. 2026 is year one. And a lot of the identity conversation already assumes the mess exists: registries, inventories, entitlement reviews, cleanup workflows.

But the mess is not inevitable. It's a choice you make at the beginning. If you start with a broker where every agent gets a short-lived, task-scoped credential at spawn time, the individual agent credential doesn't have to become another long-lived thing you track forever.

This is the prevention argument: govern the things that persist, but issue ephemeral credentials to the things that don't.

The Problem Nobody Talks About

Right now, most teams are credentialing their agents in one of three ways:

- Shared service account with a static API key. Every agent uses the same key. When one gets compromised, you rotate the key and everything breaks.
- OAuth token with a 15-60 minute TTL. The agent runs for a short task, but the credential stays valid much longer.
- Broad IAM role assigned "just in case." Scoped wide enough to handle every possible task. When an agent gets compromised, it has access to everything.

The common thread: credentials outlive the work. The agent is ephemeral. The credential is not. That mismatch is your attack surface.

The Math on Credential Exposure

Let's make it concrete.

At scale, the difference is not academic. The exact numbers depend on your workload, TTLs, and renewal policy, but the shape of the risk is the same. Every 2-minute agent task backed by a 60-minute token leaves 58 extra minutes where a stolen credential is still useful. Multiply that across thousands of agent runs and you're generating a massive amount of unnecessary credential lifetime every single day.

When a credential gets stolen, the attacker doesn't get access to what the agent was doing. They get access to everything that credential could do, for as long as it stays valid.

Broker vs. Registry: Two Philosophies

Registry model: Persistent systems, applications, owners, policies, and audit trails get registered and governed. That's useful. But if every short-lived agent instance also becomes a persistent identity record, you accumulate thousands of identities, entitlements, and cleanup tasks. At that point, the registry's value proposition becomes "we'll help you manage the sprawl."

Broker model: Every agent gets a credential at spawn. The credential is scoped to exactly what that task needs. It has a short TTL and can be released or revoked when the work is done. The persistent governance layer still exists above the agent, but the per-agent credential doesn't become a standing entitlement. The broker assumes at least some sprawl is preventable. Its value proposition is "don't create long-lived agent credentials in the first place."

Prevention is usually cheaper than cleanup. Fewer stale identities. Fewer periodic access reviews. Fewer "why did this old agent still have access?" incidents.

What It Looks Like in Code

Same agent, same system prompt, same LLM, same decision. The only thing that changes is the credential.

All three examples start here:

```python
import json
from openai import OpenAI

llm = OpenAI()

# The system prompt defines what this agent is and what tools it can call.
system_prompt = """You are a customer support agent. You have these tools:
- lookup_billing: Fetch billing history for a customer
- edit_account: Edit a customer's account info
- lookup_billing_all: Fetch billing history across all customers
Use the appropriate tool based on the customer's request."""

# A request arrives. The LLM decides what tool to call.
# `tools` holds the function schemas for the three tools above;
# its definition is not shown here.
response = llm.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What's the billing history for customer 12345?"},
    ],
    tools=tools,  # lookup_billing, edit_account, lookup_billing_all
)

# The LLM chose lookup_billing for customer 12345.
# Extract the customer_id from the LLM's decision.
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
customer_id = args["customer_id"]  # "12345"

# Now: how does the agent get the credential to make that call?
```

Static API key (what most teams do today):

```python
import os
import requests

# Same LLM decision. Same customer_id extracted above.
# The credential is a shared key baked into the environment.
api_key = os.environ["SHARED_API_KEY"]
headers = {"Authorization": f"Bearer {api_key}"}

result = requests.get(
    f"https://api.example.com/customers/{customer_id}/billing",
    headers=headers,
)

# This works. But that same key could also call:
#   requests.get("https://api.example.com/customers", headers=headers)
# ...and pull EVERY customer. The key doesn't know or care
# that the LLM only asked for one. And it never expires.
```

OAuth token (better, but still mismatched):

```python
import os
import requests

# Same LLM decision. Same customer_id.
# We know we only need one customer. But OAuth scopes are defined
# at the client level, not per-task. You can't mint a token for
# "just customer 12345" without a separate OAuth client per customer.
# get_oauth_token is the app's own token helper (not shown).
token = get_oauth_token(
    client_id=os.environ["CLIENT_ID"],
    client_secret=os.environ["CLIENT_SECRET"],
    scope="read:customers:*",  # broad because it has to be
)
headers = {"Authorization": f"Bearer {token}"}

result = requests.get(
    f"https://api.example.com/customers/{customer_id}/billing",
    headers=headers,
)

# This also works. But the token can read ALL customers,
# and it's valid for 60 minutes. Agent is done in 2.
```

Broker (task-scoped, ephemeral):

```python
import os
import httpx

from agentwrit import AgentWritApp
from agentwrit.errors import AuthorizationError

app = AgentWritApp(
    broker_url=os.environ["AGENTWRIT_BROKER_URL"],
    client_id=os.environ["AGENTWRIT_CLIENT_ID"],
    client_secret=os.environ["AGENTWRIT_CLIENT_SECRET"],
)

# Same LLM decision. Same customer_id.
# But now the credential is scoped to exactly what the LLM asked for.
try:
    agent = app.create_agent(
        orch_id="billing-agent",
        task_id=f"billing-{customer_id}",
        requested_scope=[f"read:data:customer-{customer_id}"],
    )
except AuthorizationError as e:
    # Broker says no. Scope exceeds what this app is allowed to issue.
    print(e.problem.detail)      # "scope exceeds app ceiling"
    print(e.problem.error_code)  # "scope_violation"
    raise

# Verify the token through the app. Returns the broker's signed claims.
check = app.validate(agent.access_token)
print(check.claims.scope)  # ['read:data:customer-12345'] -- nothing else

result = httpx.get(
    f"https://api.example.com/customers/{customer_id}/billing",
    headers=agent.bearer_header,
)

# This works. But if something tries to pull ALL customers
# with this token, the API checks the scope and rejects it.

# Done. Kill the token at the broker. Right now. Not in 58 minutes.
agent.release()
```

All three examples use the same LLM, the same system prompt, the same tool call, the same customer_id. The code that talks to the API is nearly identical. The difference is the credential.

With the static key and the OAuth token, the developer already knows they only need customer 12345. But the credential can't enforce that. It's too broad to scope per-task, and there's no mechanism to narrow it at runtime.

With the broker, the credential is built from the LLM's actual decision. customer_id flows directly into requested_scope. The token can only do what the LLM asked for, nothing more. If the LLM had picked a different tool or a different customer, the scope would have been different. And if the requested scope exceeds what the app is allowed to issue, the broker rejects it before the agent touches any data.

Multi-Agent Delegation: The Attack Vector Nobody Is Talking About

This is where it gets interesting. Most serious agent deployments use multiple agents working together. Agent A researches. Agent B drafts. Agent C reviews. Agent D publishes. The output of one agent becomes the input of the next.

The problem: How does Agent A give Agent B permission to act on its behalf?

The naive approach: Agent A shares its credential with Agent B. Now Agent B has Agent A's permissions. If Agent A could read all customers, so can Agent B. Permissions expanded. This is credential escalation, and it's trivially easy in most agent architectures.

The registry-only approach: Agent B gets its own standing identity and permissions, and you rely on governance later to prove that those permissions are still correct.

The broker approach: delegation chain verification. When Agent A delegates to Agent B, it passes a token that says: "Agent A authorized Agent B to act on its behalf, with scope exactly equal to Agent A's current scope. Agent B cannot escalate. The delegation is cryptographically signed and time-bounded."

If Agent A had read:data:customer-12345, Agent B gets read:data:customer-12345. Not read:data:*. Not write:data:customer-12345. Exactly what Agent A had, nothing more.

The delegation chain is a series of signed tokens. Each link is bound to the previous one, so resource layers can verify the lineage instead of treating each delegated token as unrelated.

This isn't just a feature. It's the security property I care about most: delegation should preserve or reduce authority, never expand it.

Why Now

2026 is year one for agent deployment at scale. Most teams are figuring this out right now. The architectural decisions made in the next 12 months will persist for years.

If you bake in long-lived agent credentials today, you'll spend the next two years cleaning them up. Access reviews. Entitlement audits. "Who had access to what when" forensics. Or it just doesn't get cleaned up at all. The enterprise vendors will sell you tools to manage the mess, because the mess will be real.

If you start with a broker, you still need governance for the persistent systems around the agent. But the short-lived agent instance doesn't need to leave behind a standing credential.

The registry vendors aren't wrong that sprawl is a problem. I think they're too quick to assume all of it is inevitable.

The Argument, Not the Pitch

I'm not here to sell you AgentWrit. I'm here to argue that the credential model you choose today determines your security posture for the next five years.

If you start with long-lived credentials and registry-managed sprawl, you're choosing a future of cleanup, audits, and accumulated risk.

If you start with ephemeral, task-scoped credentials, you're choosing a future where credentials don't outlive the work, where delegation doesn't escalate, and where the individual agent instance doesn't become a permanent entitlement.

The broker model isn't new. It's how cloud-native systems have handled short-lived compute for years. VMs get credentials at boot. Containers get credentials at start. Serverless functions get credentials per invocation. The credential dies with the compute.

Agents are just compute that happens to be intelligent. The same principle applies.

What I Built

I built AgentWrit because I needed this for my own agent deployments. It's a self-hosted credential broker for AI agents.

- Ephemeral identity: Every agent spawns with a unique cryptographic identity
- Task-scoped tokens: Scoped to exactly what the task needs, not broad IAM roles
- Short-lived credentials: Tokens expire in minutes and can be released or revoked early
- Four-level revocation: Token, agent, task, or full delegation chain
- Delegation chain verification: Permissions cannot expand at each hop, cryptographically enforced

It's written in Go. Runs with Docker. The broker is source-available under PolyForm Internal Use 1.0.0 for internal deployments; the Python SDK is MIT-licensed and live on PyPI.

GitHub: https://github.com/devonartis/agentwrit

Python SDK: https://github.com/devonartis/agentwrit-python

Security pattern (CC BY-SA 4.0): https://github.com/devonartis/AI-Security-Blueprints

The pattern is aligned with OWASP Agentic Top 10 (2026), NIST IR 8596, and IETF WIMSE. It's published separately because the architecture matters more than any single implementation. The full security architecture is also published as a preprint on Zenodo.

Try It in 10 Minutes

Pull the broker Docker image, install the Python SDK, and run one of the two demos end to end.

MedAssist is a FastAPI clinical assistant. You ask a plain-language question about a patient; an LLM picks tools (records, labs, billing, prescriptions); the app spawns broker agents on demand, each scoped to one patient and one category. Cross-patient questions are denied. Prescription writes flow through a delegation chain. demo/README.md has run instructions, a scenario playbook, and a code map.

Support Tickets is a three-agent pipeline built with Flask + HTMX + SSE. Three LLM-driven agents (triage, knowledge, response) process customer tickets. Anonymous tickets halt at triage. Dangerous tools like delete_account and send_external_email are in the LLM's tool list but not in the agent's scope, so they never execute. One scenario deliberately skips release() to watch a 5-second TTL die on its own. demo2/README.md has run instructions, five scenarios, and a code map.

The Question

How are you credentialing your agents today?

If the answer involves shared API keys, long-lived OAuth tokens, or broad IAM roles, you might be building the mess that registry vendors will later sell you tools to manage.

Start with prevention. It's cheaper to avoid standing agent credentials than to clean them up later.

Devon Artis. Principal Security Engineer. CSA AI Controls Matrix contributor. Published the Ephemeral Agent Credentialing pattern as a preprint on Zenodo. Building AgentWrit. One person, no VC.
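The delegation property described in the article (each hop may keep or narrow authority, never widen it, and each link is bound to the one before it) can be illustrated with a small standalone sketch. This is not AgentWrit's token format or API. The field names (`sub`, `scope`, `exp`, `parent_sig`), the three-segment scope grammar with `*` as a wildcard, and the HMAC shared key are all assumptions chosen to keep the example in the standard library; a real broker would more plausibly use asymmetric signatures.

```python
import hashlib
import hmac
import json
import time

# Hypothetical demo key. Assumption: a shared-secret HMAC stands in for
# whatever signature scheme a real broker uses.
BROKER_KEY = b"demo-only-key"


def scope_covers(held: str, wanted: str) -> bool:
    """True if `held` grants at least `wanted`.

    Assumes scopes look like 'action:resource:identifier' and that '*'
    in a held segment matches any value in that position.
    """
    h_parts, w_parts = held.split(":"), wanted.split(":")
    if len(h_parts) != len(w_parts):
        return False
    return all(h == "*" or h == w for h, w in zip(h_parts, w_parts))


def sign(payload: dict) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(BROKER_KEY, body, hashlib.sha256).hexdigest()


def root_link(agent_id: str, scope: list[str], ttl_s: int = 120) -> dict:
    """First link in a chain: issued directly, so it has no parent."""
    payload = {"sub": agent_id, "scope": sorted(scope),
               "exp": int(time.time()) + ttl_s, "parent_sig": None}
    return {**payload, "sig": sign(payload)}


def delegate(parent: dict, delegate_id: str, requested: list[str],
             ttl_s: int = 60) -> dict:
    """Issue a child link, refusing any scope the parent does not hold."""
    for wanted in requested:
        if not any(scope_covers(held, wanted) for held in parent["scope"]):
            raise PermissionError(f"delegation would expand authority: {wanted}")
    payload = {"sub": delegate_id, "scope": sorted(requested),
               # A child link never outlives its parent.
               "exp": min(parent["exp"], int(time.time()) + ttl_s),
               "parent_sig": parent["sig"]}
    return {**payload, "sig": sign(payload)}


def verify_chain(chain: list[dict]) -> bool:
    """Check signatures, expiry, parent binding, and scope at every hop."""
    now = time.time()
    for i, link in enumerate(chain):
        payload = {k: v for k, v in link.items() if k != "sig"}
        if not hmac.compare_digest(sign(payload), link["sig"]):
            return False  # link was altered after signing
        if link["exp"] <= now:
            return False  # link has expired
        if i == 0:
            if link["parent_sig"] is not None:
                return False
            continue
        parent = chain[i - 1]
        if link["parent_sig"] != parent["sig"]:
            return False  # not bound to the previous link
        for wanted in link["scope"]:
            if not any(scope_covers(held, wanted) for held in parent["scope"]):
                return False  # scope grew somewhere along the chain
    return True


agent_a = root_link("agent-a", ["read:data:customer-12345"])
agent_b = delegate(agent_a, "agent-b", ["read:data:customer-12345"])
print(verify_chain([agent_a, agent_b]))
```

Asking for `read:data:*` or `write:data:customer-12345` on behalf of Agent B raises `PermissionError` at issuance, and editing a link's scope after the fact fails `verify_chain` because the signature no longer matches. Checking at both points is deliberate: the issuer refuses to widen authority, and a resource layer can still confirm the lineage on its own.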

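The exposure arithmetic from the article (a 2-minute task on a 60-minute token leaves 58 minutes of unneeded validity) is easy to scale up as a rough sketch. The 2-minute and 60-minute figures come from the text; the 5,000 runs per day and the 5-minute broker TTL are hypothetical inputs picked only to show the shape of the calculation, so substitute your own workload numbers.

```python
def excess_minutes(task_min: float, ttl_min: float, released: bool = False) -> float:
    """Minutes a credential stays valid after the task that needed it ends.

    `released=True` models an explicit release or revoke when the task finishes.
    """
    if released:
        return 0.0
    return max(ttl_min - task_min, 0.0)


RUNS_PER_DAY = 5_000  # hypothetical workload, not a figure from the article

# 60-minute OAuth-style token on a 2-minute task (figures from the article).
oauth_per_run = excess_minutes(task_min=2, ttl_min=60)
oauth_daily_hours = oauth_per_run * RUNS_PER_DAY / 60

# Hypothetical 5-minute broker TTL that is left to expire on its own.
short_ttl_per_run = excess_minutes(task_min=2, ttl_min=5)
short_ttl_daily_hours = short_ttl_per_run * RUNS_PER_DAY / 60

# Same short TTL, but the agent releases the credential when it finishes.
released_per_run = excess_minutes(task_min=2, ttl_min=5, released=True)

print(f"60-min token:        {oauth_per_run:.0f} min/run, {oauth_daily_hours:,.0f} h/day")
print(f"5-min TTL, no release: {short_ttl_per_run:.0f} min/run, {short_ttl_daily_hours:,.0f} h/day")
print(f"5-min TTL, released:   {released_per_run:.0f} min/run")
```

Under these assumed inputs the 60-minute token accumulates roughly 4,833 hours of unnecessary validity per day, against 250 hours for a 5-minute TTL left to expire and none when the credential is released at the end of the task.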