# Building AI Agents with Python: A Practical, Open-Source-First Guide

AI agents are more than “LLM + prompt.” A useful agent can plan, use tools, remember context, and act safely in the real world (files, APIs, databases). In this post, we’ll build a small but capable agent in Python using an open-source stack:

- A minimal agent loop (think/plan → tool call → observe → repeat)
- A tool registry with typed inputs
- Lightweight memory (conversation + notes)
- Basic guardrails (tool allowlist + timeouts + validation)
- A working example: an agent that can search docs (locally), summarize, and draft a response

This post is aimed at intermediate Python developers who want to understand the moving parts and keep the architecture flexible.

## What is an “AI agent” (in practice)?

A practical agent typically includes:

- Model: an LLM that can reason over text and choose actions.
- Tools: functions the model can call (HTTP requests, DB queries, file I/O).
- Memory: state across turns (chat history, scratchpad, retrieved notes).
- Policy/Loop: logic that decides when to call tools and when to stop.
- Safety: constraints to avoid dangerous actions.

A key design choice: don’t hide the loop. You’ll debug and extend agents more easily when the control flow is visible.

## Project setup

You’ll need:

- Python 3.11+
- pydantic for tool input validation
- httpx (optional) for web calls
- An LLM client (examples include OpenAI-compatible APIs or local models)

I’ll show an OpenAI-compatible interface, but the agent architecture is model-agnostic.

Install dependencies:

```bash
pip install pydantic httpx
```

If you’re using an OpenAI-compatible endpoint:

```bash
pip install openai
```

## Step 1: Define tools (the agent’s capabilities)

Tools are just Python callables plus metadata:

- Description (for the model)
- Input schema
- Function to execute

We’ll implement a tiny tool framework.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Type

from pydantic import BaseModel, ValidationError


@dataclass
class Tool:
    name: str
    description: str
    input_model: Type[BaseModel]
    fn: Callable[..., Any]

    def run(self, raw_args: Dict[str, Any]) -> Any:
        args = self.input_model(**raw_args)
        return self.fn(**args.model_dump())


class ToolRegistry:
    def __init__(self):
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        if tool.name in self._tools:
            raise ValueError(f"Tool already registered: {tool.name}")
        self._tools[tool.name] = tool

    def get(self, name: str) -> Tool:
        return self._tools[name]

    def list(self) -> Dict[str, Tool]:
        return dict(self._tools)
```

## Example tools

- `search_local_docs`: search a local folder of markdown/text files
- `summarize_text`: a non-LLM “tool” (simple chunking + truncation) to show that tools can be deterministic

```python
import re
from pathlib import Path
from typing import List

from pydantic import BaseModel, Field


class SearchLocalDocsInput(BaseModel):
    query: str = Field(..., min_length=2)
    folder: str = Field(..., description="Folder containing .md/.txt files")
    max_results: int = Field(5, ge=1, le=20)


def search_local_docs(query: str, folder: str, max_results: int = 5) -> List[dict]:
    q = query.lower().strip()
    folder_path = Path(folder)
    results = []
    for path in folder_path.rglob("*"):
        if path.suffix.lower() not in {".md", ".txt"}:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue
        if q in text.lower():
            # Grab a small snippet around the first match
            m = re.search(re.escape(q), text, re.IGNORECASE)
            start = max(0, m.start() - 120) if m else 0
            end = min(len(text), (m.end() + 120) if m else 240)
            snippet = text[start:end].replace("\n", " ")
            results.append({"file": str(path), "snippet": snippet})
    return results[:max_results]


class SummarizeTextInput(BaseModel):
    text: str = Field(..., min_length=1)
    max_chars: int = Field(600, ge=100, le=5000)


def summarize_text(text: str, max_chars: int = 600) -> str:
    text = re.sub(r"\s+", " ", text).strip()
    if len(text) <= max_chars:
        return text
    return text[: max_chars - 3] + "..."
```

Register them:

```python
registry = ToolRegistry()

registry.register(
    Tool(
        name="search_local_docs",
        description="Search local markdown/text files for a query and return file snippets.",
        input_model=SearchLocalDocsInput,
        fn=search_local_docs,
    )
)

registry.register(
    Tool(
        name="summarize_text",
        description="Summarize text by truncating to a max character length.",
        input_model=SummarizeTextInput,
        fn=summarize_text,
    )
)
```

## Step 2: Define messages + memory

We’ll store a basic conversation history plus a “notes” field the agent can update.

```python
from dataclasses import dataclass, field
from typing import List, Literal

Role = Literal["system", "user", "assistant", "tool"]


@dataclass
class Message:
    role: Role
    content: str
    name: str | None = None  # used for tool name


@dataclass
class AgentState:
    messages: List[Message] = field(default_factory=list)
    notes: str = ""

    def add(self, role: Role, content: str, name: str | None = None) -> None:
        self.messages.append(Message(role=role, content=content, name=name))
```

## Step 3: The model interface (OpenAI-compatible)

Many providers (and local gateways) implement an OpenAI-compatible Chat Completions API. We’ll keep this thin so you can swap it out.

```python
import json


class LLMClient:
    def __init__(self, model: str = "gpt-4o-mini"):
        from openai import OpenAI

        self.client = OpenAI()
        self.model = model

    def chat(self, messages: list[dict]) -> str:
        resp = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            temperature=0.2,
        )
        return resp.choices[0].message.content


def to_openai_messages(state: AgentState) -> list[dict]:
    msgs = []
    for m in state.messages:
        d = {"role": m.role, "content": m.content}
        if m.name:
            d["name"] = m.name
        msgs.append(d)
    return msgs
```

## Step 4: Build the agent loop

We’ll ask the model to respond in a structured JSON format:

- Either: `{ "type": "final", "answer": "..." }`
- Or: `{ "type": "tool", "name": "...", "args": { ... } }`

The loop:

- Send system prompt + history + notes
- Parse model output
- If tool call: validate args, run tool, append tool result
- If final: return answer
- Stop after N steps

We’ll also add basic guardrails:

- Tool allowlist: only registered tools can run
- Validation: Pydantic schemas
- Step limit: prevents infinite loops

```python
SYSTEM_PROMPT = """
You are a helpful AI agent.

You can either:
1) Call a tool, by responding with strict JSON:
   {"type":"tool","name":"...","args":{...}}
2) Or answer the user, by responding with strict JSON:
   {"type":"final","answer":"..."}

Rules:
- Only call tools that are available.
- If you call a tool, keep args minimal and valid.
- Use the agent notes when helpful.
- Output MUST be valid JSON and nothing else.
""".strip()


class Agent:
    def __init__(self, llm: LLMClient, tools: ToolRegistry):
        self.llm = llm
        self.tools = tools

    def run(self, user_input: str, state: Optional[AgentState] = None, max_steps: int = 8) -> str:
        state = state or AgentState()

        # Add system prompt once at the start
        if not state.messages or state.messages[0].role != "system":
            state.messages.insert(0, Message("system", SYSTEM_PROMPT))

        state.add("user", user_input)

        for step in range(max_steps):
            # Provide notes as context (simple approach)
            if state.notes:
                state.add("system", f"Agent notes: {state.notes}")

            raw = self.llm.chat(to_openai_messages(state))

            try:
                payload = json.loads(raw)
            except json.JSONDecodeError:
                # If the model misbehaves, force a final response
                return "Model returned non-JSON output. Try again with a stricter prompt."

            if payload.get("type") == "final":
                answer = payload.get("answer", "")
                state.add("assistant", answer)
                return answer

            if payload.get("type") == "tool":
                name = payload.get("name")
                args = payload.get("args") or {}

                if name not in self.tools.list():
                    state.add("tool", f"ERROR: tool not allowed: {name}", name=name)
                    continue

                tool = self.tools.get(name)
                try:
                    result = tool.run(args)
                    state.add("tool", json.dumps(result, ensure_ascii=False), name=name)
                except ValidationError as ve:
                    state.add("tool", f"VALIDATION_ERROR: {ve}", name=name)
                except Exception as e:
                    state.add("tool", f"TOOL_ERROR: {e}", name=name)
                continue

            # Unknown response type
            state.add("assistant", "I couldn't determine the next action.")
            return "I couldn't determine the next action."

        return "Max steps reached without a final answer."
```

## Step 5: Try it end-to-end

Create a docs/ folder with a couple of .md files (project notes, API docs, etc.). Then run:

```python
if __name__ == "__main__":
    llm = LLMClient(model="gpt-4o-mini")
    agent = Agent(llm=llm, tools=registry)

    question = "Search my docs for 'rate limit' and explain what it says in 3 bullet points. Folder is docs."
    print(agent.run(question))
```

A typical interaction looks like:

- Model calls `search_local_docs` with `{query: "rate limit", folder: "docs"}`
- Tool returns snippets
- Model calls `summarize_text` (optional)
- Model returns a final bullet list

## Making it more agentic (without making it fragile)

Once the basics work, here are practical upgrades.

## 1) Add a “planner” step

Instead of letting the model decide everything in one shot, add an explicit planning phase:

- Step A: produce a plan (no tools)
- Step B: execute the next tool call

This reduces randomness and improves debuggability.

## 2) Add retrieval (RAG) properly

Our `search_local_docs` is naive substring matching. For real projects, use embeddings:

- sentence-transformers for local embeddings
- A vector store like FAISS, Chroma, or SQLite-based solutions

Then create a tool like `retrieve_context(query) -> passages`.

## 3) Add tool timeouts and cancellation

Tools that hit networks should use timeouts:

```python
import httpx


def fetch_url(url: str) -> str:
    with httpx.Client(timeout=10.0, follow_redirects=True) as client:
        return client.get(url).text
```

## 4) Add a strict allowlist and “capabilities” policy

A common mistake is giving agents broad file/network access. Prefer:

- A small set of tools
- Explicit path sandboxing (only within a workspace directory)
- Read-only tools by default

## 5) Add structured tool outputs

Returning JSON strings is fine for demos, but you’ll want consistent schemas. Consider:

- Tool output models (Pydantic)
- A standardized envelope: `{ ok: bool, data: ..., error: ... }`

## Open-source note: keep the architecture swappable

If you later adopt a framework (LangGraph, LlamaIndex, Haystack, Semantic Kernel), you’ll still benefit from understanding:

- How tools are validated
- Where memory lives
- How the loop terminates
- How errors are handled

A good rule: frameworks should reduce boilerplate, not hide control flow.

## Summary

You now have a minimal, extensible Python AI agent with:

- A clear agent loop
- Typed tools with validation
- Basic memory
- Guardrails (allowlist, step limit)

From here, the biggest improvements come from:

- Better retrieval (embeddings)
- Better planning (explicit plan/execute)
- Better safety (sandboxing + permissions)

If you want, I can follow up with a second post that adds:

- Embeddings + FAISS for retrieval
- A planner/executor split
- Streaming outputs and better tracing/logging
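To see the JSON action protocol from the agent loop in isolation, you can drive a stripped-down loop with a scripted stand-in for the model, so no API key is needed. Everything here (`mini_loop`, `ScriptedLLM`, the two-turn transcript) is a minimal test fixture I'm adding for illustration, not part of the agent code above:

```python
import json
from typing import Callable, Dict, List


def mini_loop(
    llm: Callable[[List[dict]], str],
    tools: Dict[str, Callable[..., object]],
    user_input: str,
    max_steps: int = 4,
) -> str:
    # Stripped-down version of the agent loop: parse JSON, dispatch tools,
    # stop when the model emits a "final" payload.
    messages: List[dict] = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        payload = json.loads(llm(messages))
        if payload["type"] == "final":
            return payload["answer"]
        if payload["type"] == "tool" and payload["name"] in tools:
            result = tools[payload["name"]](**payload.get("args", {}))
            messages.append({"role": "tool", "content": json.dumps(result)})
    return "max steps reached"


class ScriptedLLM:
    """Returns canned JSON responses in order, simulating model turns."""

    def __init__(self, script: List[str]):
        self.script = list(script)

    def __call__(self, messages: List[dict]) -> str:
        return self.script.pop(0)
```

Driving it with a two-turn script (one tool call, then a final answer) is a quick way to unit-test the loop's control flow before wiring in a real model.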
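To make the `retrieve_context(query) -> passages` idea concrete without pulling in heavy dependencies, here is a minimal sketch. It substitutes a stdlib bag-of-words cosine similarity for real embedding vectors — for production you'd swap in sentence-transformers plus a vector store as suggested above. The `PASSAGES` corpus and the helper names are illustrative, not a library API:

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def _vectorize(text: str) -> Counter:
    # Bag-of-words term counts; a real system would use embedding vectors.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Illustrative corpus; in practice these would be chunks of your docs/ files.
PASSAGES = [
    "The API enforces a rate limit of 100 requests per minute.",
    "Authentication uses a bearer token in the Authorization header.",
    "Webhooks retry failed deliveries up to five times.",
]


def retrieve_context(query: str, top_k: int = 2) -> List[Tuple[float, str]]:
    # Score every passage against the query and return the best matches.
    q = _vectorize(query)
    scored = [(_cosine(q, _vectorize(p)), p) for p in PASSAGES]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [s for s in scored[:top_k] if s[0] > 0]
```

The interface is the part that matters: once the tool returns ranked `(score, passage)` pairs, the scoring backend can be upgraded from word overlap to embeddings without touching the agent loop.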
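Beyond the per-request timeout that httpx provides, the agent loop can enforce a wall-clock budget on any tool call. A minimal sketch using `concurrent.futures` (the `run_with_timeout` helper is my own illustration; note that a running worker thread cannot be forcibly killed, which is a known limitation of thread-based timeouts):

```python
import concurrent.futures
import time
from typing import Any, Callable


def run_with_timeout(fn: Callable[..., Any], timeout_s: float, *args: Any, **kwargs: Any) -> Any:
    """Run fn(*args, **kwargs) in a worker thread; raise TimeoutError past the budget."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Best effort: the worker thread keeps running in the background,
        # but the agent loop stops waiting and can report a tool error.
        raise TimeoutError(f"tool call exceeded {timeout_s}s budget") from None
    finally:
        pool.shutdown(wait=False)
```

If you need hard cancellation (e.g. for CPU-bound tools), a process pool or a subprocess per tool call is the sturdier choice, at the cost of serialization overhead.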
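The "explicit path sandboxing" guardrail can be as small as one check before any file tool touches disk. A sketch assuming a `WORKSPACE` root directory (`is_inside_workspace` is a hypothetical helper name; `Path.is_relative_to` requires Python 3.9+):

```python
from pathlib import Path

# Illustrative sandbox root; point this at your agent's workspace directory.
WORKSPACE = Path("workspace").resolve()


def is_inside_workspace(candidate: str) -> bool:
    # Resolve symlinks and ".." segments BEFORE comparing against the root,
    # otherwise "../../etc/passwd" would slip through a naive prefix check.
    resolved = (WORKSPACE / candidate).resolve()
    return resolved.is_relative_to(WORKSPACE)
```

Wiring this into `search_local_docs` (reject any `folder` outside the workspace) turns the read-only-by-default policy into an enforced invariant rather than a convention.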
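The standardized `{ ok, data, error }` envelope for tool outputs can be sketched with a plain dataclass, so the loop branches on `ok` instead of string-matching `"TOOL_ERROR:"` prefixes. The names `ToolResult` and `safe_run` are illustrative (in the Pydantic-based registry above you would use a `BaseModel` instead):

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToolResult:
    ok: bool
    data: Any = None
    error: Optional[str] = None


def safe_run(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> ToolResult:
    # Every tool call yields the same envelope, success or failure,
    # so downstream code never has to guess at the result shape.
    try:
        return ToolResult(ok=True, data=fn(*args, **kwargs))
    except Exception as e:
        return ToolResult(ok=False, error=f"{type(e).__name__}: {e}")
```

A consistent envelope also makes logging and tracing simpler: every tool step serializes the same way regardless of which tool ran.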