# Tools: Building Autonomous AI Agents That Actually Do Work
2026-03-08
*How I built an autonomous SEO agent in Python that perceives data, reasons through strategy, and executes tasks — no babysitting required.*

Most LLMs just sit there waiting for your next prompt. You type something, it responds, you type again. It's a glorified autocomplete loop. I wanted something that actually does work while I sleep. So I built an autonomous agent system in Python. Here's how it works and why the architecture matters more than the model you pick.

## The Problem With "AI-Powered" Tools

Every SaaS tool slaps "AI-powered" on its landing page now. But 99% of them are just wrapping an LLM call in a nice UI. You still have to:

- Decide what to do
- Write the prompt
- Review the output
- Decide what to do next
- Repeat forever

That's not automation. That's you doing all the thinking while a machine does the typing.

## What an Agent Actually Looks Like

An agent follows a loop. Not a single prompt-response — a continuous cycle of perceiving, reasoning, and acting. The pattern I use is based on the ReAct framework:
```
┌─────────────────────────────────────┐
│             AGENT LOOP              │
│                                     │
│   ┌──────────┐                      │
│   │ PERCEIVE │◄── APIs, databases,  │
│   └────┬─────┘    live data sources │
│        │                            │
│   ┌────▼─────┐                      │
│   │  REASON  │◄── LLM + context     │
│   └────┬─────┘    from RAG store    │
│        │                            │
│   ┌────▼─────┐                      │
│   │   ACT    │──► API calls, DB     │
│   └────┬─────┘    writes, alerts    │
│        │                            │
│        └──────────► loop back ◄─────│
└─────────────────────────────────────┘
```
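The loop above can be sketched in a few lines of Python. This is a minimal skeleton, not the article's implementation: `perceive`, `reason`, and `act` are hypothetical stubs standing in for real API calls and LLM calls.

```python
def perceive(state: dict) -> dict:
    # Stub for "pull fresh data from APIs": drain a queue of pending signals.
    pending = state.get("pending", [])
    obs = pending.pop(0) if pending else None
    return {**state, "observation": obs}

def reason(state: dict) -> str:
    # Stub for an LLM call: decide the next action from the observation.
    return "refresh_page" if state["observation"] == "ctr_drop_detected" else "stop"

def act(action: str, state: dict) -> dict:
    # Stub for a side effect (API call, DB write, alert): record the action.
    return {**state, "actions": state.get("actions", []) + [action]}

def run_agent(state: dict, max_steps: int = 10) -> dict:
    for _ in range(max_steps):
        state = perceive(state)
        action = reason(state)
        if action == "stop":  # the agent, not the caller, decides when to stop
            break
        state = act(action, state)
    return state

result = run_agent({"pending": ["ctr_drop_detected", "ctr_drop_detected"]})
# result["actions"] == ["refresh_page", "refresh_page"]
```

The termination check lives inside the loop, which is the whole point: the caller sets a budget (`max_steps`), but the agent decides when it is finished.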
Each iteration, the agent observes its environment, thinks about what to do, does it, then observes the results. The key insight: the agent decides when to stop, not you.

## The Architecture: Multi-Agent, Not Monolithic

One big agent that does everything will hallucinate and lose context. Instead, I split responsibilities across four specialized sub-agents: a researcher, an auditor, a strategist, and an executor. Each agent has a narrow scope. The researcher never writes content. The executor never decides strategy. This separation keeps hallucinations contained. An orchestrator wires them together:
```python
from __future__ import annotations  # "Agent" is defined in the next snippet

from dataclasses import dataclass
from enum import Enum


class AgentRole(Enum):
    RESEARCHER = "researcher"
    AUDITOR = "auditor"
    STRATEGIST = "strategist"
    EXECUTOR = "executor"


@dataclass
class AgentTask:
    role: AgentRole
    objective: str
    constraints: list[str]
    context: dict


class AgentOrchestrator:
    def __init__(self, agents: dict[AgentRole, Agent]):
        self.agents = agents
        self.shared_state = {}

    async def run_pipeline(self, trigger: dict):
        # Researcher gathers data
        research = await self.agents[AgentRole.RESEARCHER].execute(
            AgentTask(
                role=AgentRole.RESEARCHER,
                objective="Analyze SERP changes for target keywords",
                constraints=["Use DataForSEO API", "Max 500 queries"],
                context=trigger,
            )
        )
        # Auditor validates against guidelines
        audit = await self.agents[AgentRole.AUDITOR].execute(
            AgentTask(
                role=AgentRole.AUDITOR,
                objective="Check findings against brand guidelines",
                constraints=["Flag confidence < 0.7"],
                context={"research": research, **self.shared_state},
            )
        )
        # Strategist formulates the plan
        strategy = await self.agents[AgentRole.STRATEGIST].execute(
            AgentTask(
                role=AgentRole.STRATEGIST,
                objective="Create action plan from research + audit",
                constraints=["Prioritize by estimated impact"],
                context={"research": research, "audit": audit},
            )
        )
        # Executor carries it out, gated on the strategist's confidence
        if strategy.confidence > 0.8:
            await self.agents[AgentRole.EXECUTOR].execute(
                AgentTask(
                    role=AgentRole.EXECUTOR,
                    objective="Implement approved changes",
                    constraints=["Dry-run first", "Log all mutations"],
                    context={"strategy": strategy},
                )
            )
```
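The hand-off pattern reduces to something very small: each stage consumes the previous stage's output, and a confidence gate guards the one side-effecting step. Here is a self-contained miniature; the stage names mirror the orchestrator, but the function bodies are illustrative stubs, not the real agents.

```python
import asyncio

async def researcher(trigger: dict) -> dict:
    # Gather data (stubbed): in the real pipeline, SERP/API queries happen here.
    return {"finding": "serp_format_changed", "trigger": trigger}

async def auditor(research: dict) -> dict:
    # Validate findings and attach a confidence score (stubbed heuristic).
    return {"approved": True, "confidence": 0.9, **research}

async def strategist(audit: dict) -> dict:
    # Turn validated findings into an action plan.
    return {"plan": ["update_title_tags"], "confidence": audit["confidence"]}

async def executor(strategy: dict) -> list[str]:
    # Side-effecting step: only reached when the confidence gate passes.
    return [f"executed:{step}" for step in strategy["plan"]]

async def run_pipeline(trigger: dict) -> list[str]:
    research = await researcher(trigger)
    audit = await auditor(research)
    strategy = await strategist(audit)
    if strategy["confidence"] > 0.8:  # confidence gate before any mutation
        return await executor(strategy)
    return []  # below threshold: no side effects

actions = asyncio.run(run_pipeline({"keywords": ["python agents"]}))
# actions == ["executed:update_title_tags"]
```

Note that only the last stage mutates anything; everything before it is pure data flow, which is what makes the gate effective.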
## The ReAct Loop Inside Each Agent

Every sub-agent runs its own reasoning loop using the ReAct pattern — generating a "Thought" before each "Action":
```python
import json

import openai


class Agent:
    def __init__(self, role: AgentRole, tools: list[callable]):
        self.role = role
        self.tools = {t.__name__: t for t in tools}
        self.client = openai.AsyncOpenAI()

    async def execute(self, task: AgentTask, max_steps: int = 10):
        messages = [
            {"role": "system", "content": self._build_system_prompt(task)},
        ]
        for step in range(max_steps):
            response = await self.client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
                tools=self._tool_schemas(),
            )
            message = response.choices[0].message
            if message.tool_calls:
                # Agent decided to use a tool; the assistant message with the
                # tool calls must be appended before the tool results
                messages.append(message)
                for call in message.tool_calls:
                    result = await self.tools[call.function.name](
                        **json.loads(call.function.arguments)
                    )
                    messages.append(
                        {"role": "tool", "content": str(result), "tool_call_id": call.id}
                    )
            else:
                # Agent decided it's done
                return self._parse_final_answer(message.content)
        raise TimeoutError(f"Agent {self.role} exceeded {max_steps} steps")
```
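`_tool_schemas` and `_build_system_prompt` are left out of the snippet above. One plausible way to build the tool schemas is to derive an OpenAI function-calling schema from each tool's Python signature. This is a sketch under narrow assumptions: only `str`/`int`/`float`/`bool` parameters, mapped naively, with unknown annotations falling back to `"string"`.

```python
import inspect

PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn) -> dict:
    # Build an OpenAI-style tool schema from a function's signature:
    # parameters without defaults become required.
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": (fn.__doc__ or "").strip(),
            "parameters": {"type": "object", "properties": props, "required": required},
        },
    }

def fetch_gsc_data(site: str, days: int = 28):
    """Pull Search Console metrics for a property."""
    ...

schema = tool_schema(fetch_gsc_data)
# schema["function"]["name"] == "fetch_gsc_data"
# schema["function"]["parameters"]["required"] == ["site"]
```

Deriving schemas from signatures keeps the tool list and the schema list from drifting apart, since both come from the same function objects.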
The trick is setting `max_steps`. Too low and the agent can't finish complex tasks. Too high and you're burning tokens on loops that lost the plot. I start at 10 and tune per agent.

## Real Workflow: Content Decay Detection

Here's a concrete workflow I run on a cron schedule via n8n:

- **Perceive:** Pull the last 28 days of GSC data. Flag pages where CTR dropped >15% while impressions stayed flat.
- **Reason:** For each flagged page, fetch the current SERP. Ask the LLM: "Has the search intent shifted? Are competitors using a different content format?"
- **Act:** Generate a refresh brief with specific recommendations. If confidence is high enough, push title tag updates directly via the WordPress API.

The whole thing runs every 24 hours. I get a Slack notification with what it found and what it did. Most days it finds nothing. Some days it catches a decay pattern weeks before I'd have noticed manually.

## What Makes This Different From a Cron Script

A cron script does the same thing every time. An agent reasons about what it sees. When the SERP shifts from listicles to video carousels, a cron script keeps optimizing listicles. An agent notices the format change and adjusts its recommendations. The distinction matters: you define objectives and constraints, not step-by-step instructions. The agent figures out the steps.

## Guardrails That Actually Matter

Autonomy without guardrails is a liability. Here's what I enforce:

- **Dry-run mode:** Every mutation gets logged before execution. The agent proposes; a human (or a second agent) approves.
- **Confidence thresholds:** Actions below 0.7 confidence get queued for human review instead of auto-executing.
- **Audit trail:** Every thought, tool call, and result is logged. No black boxes.
- **Budget caps:** Max API calls per run, max tokens per agent, max mutations per cycle.

The goal isn't "human out of the loop" — it's "human on the loop." You set the boundaries, the agent operates within them, and you review the dashboard.
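The confidence-threshold guardrail reduces to a small dispatch rule. A sketch (the `dispatch` helper and the action shape are hypothetical, not from the article's codebase):

```python
REVIEW_THRESHOLD = 0.7  # below this, a human reviews instead of auto-executing

def dispatch(action: dict, executed: list, review_queue: list) -> str:
    # Route an action by its confidence score.
    if action["confidence"] >= REVIEW_THRESHOLD:
        executed.append(action)  # stand-in for actually calling the mutation API
        return "executed"
    review_queue.append(action)  # parked for human review
    return "queued_for_review"

executed, review_queue = [], []
first = dispatch({"type": "update_title", "confidence": 0.92}, executed, review_queue)
second = dispatch({"type": "rewrite_intro", "confidence": 0.55}, executed, review_queue)
# first == "executed", second == "queued_for_review"
```

The same shape extends to budget caps: count executions per run and start queuing everything once the cap is hit.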
## Getting Started

You don't need a complex multi-agent system on day one. Start with a single agent that does one thing:

1. Pick one repetitive task you do weekly
2. Write a Python script that does the "perceive" step (fetch data)
3. Add an LLM call for the "reason" step (analyze the data)
4. Add an API call for the "act" step (do something with the analysis)
5. Wrap it in a loop with a termination condition

That's an agent. Everything else is optimization.

Deep dive with full architecture details: *Agentic AI Workflows: Beyond Basic Content Generation*
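Those five steps fit in one small file. In this skeleton every external call is a stubbed placeholder (the data source, the LLM call, and the alert are hypothetical; swap in real APIs):

```python
# Queued "perceive" results; stands in for a real data source such as GSC.
PENDING = [[{"page": "/blog/agents", "ctr_delta": -0.18}]]

def perceive():
    # Step 2: fetch data (stub for a real API call).
    return PENDING.pop(0) if PENDING else []

def reason(rows):
    # Step 3: analyze (stub for an LLM call): flag pages with a big CTR drop.
    return [r["page"] for r in rows if r["ctr_delta"] < -0.15]

def act(flagged):
    # Step 4: do something with the analysis (stub for e.g. a Slack alert).
    return [f"alert:{page}" for page in flagged]

def run(max_steps: int = 5):
    # Step 5: loop with a termination condition.
    alerts = []
    for _ in range(max_steps):
        flagged = reason(perceive())
        if not flagged:  # nothing left to do: terminate
            break
        alerts.extend(act(flagged))
    return alerts

result = run()
# result == ["alert:/blog/agents"]
```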
Tags: how-to, tutorial, guide, ai, openai, llm, gpt, cron, python, database