Tools: Why AI Agents Keep Forgetting Your Project (And How I Fixed It)
2026-02-23
admin
Every time I start a new session with an AI coding agent, the same thing happens: it has no idea what I was working on yesterday. It doesn't know which tasks are done. It doesn't know we decided to use Redis for caching. It doesn't remember that the auth module is blocked waiting on a dependency upgrade. It just... starts fresh.

## The Markdown Problem

So I'd do what most people do: maintain a `PROGRESS.md` file. After every session, I'd ask the agent to update it. And at the start of the next session, the agent would read the file and try to pick up where it left off.

This worked fine for about a week.

As the project grew, so did the file. 50 lines became 200. Status updates from three weeks ago sat next to current blockers. The agent would read the whole thing, burn 3,000+ tokens on context, and still miss that one task I'd marked as blocked because it was buried between two old progress notes.

The fundamental issue: I was using a text file as a database. I needed queries ("what's blocked?"), not full-file reads. I needed structure (projects → epics → tasks), not flat bullet points. And I needed an audit trail that wouldn't bloat the context window.

## Building a Project Tracker for AI Agents

So I built one. Saga is an MCP server that gives AI agents a local SQLite database for project tracking. Think Jira, but designed for the way agents actually work.

If you're not familiar with MCP (Model Context Protocol), it's a standard from Anthropic that lets AI tools talk to external services through typed tool calls. Claude Code, Claude Desktop, Cursor, and Windsurf all support it. Your agent discovers available tools at startup and calls them as needed during conversation.

Saga exposes 23 tools through MCP:

- CRUD for a full hierarchy: Projects → Epics → Tasks → Subtasks
- A notes system: For decisions, context, meeting notes, blockers, all typed and searchable
- A dashboard: One call returns your entire project status, including completion percentages, blocked tasks, and recent activity
- An activity log: Every change is automatically recorded with old/new values
- A session diff: "Show me what changed since yesterday", in one call

Everything lives in a single `.tracker.db` SQLite file. No servers, no API keys, no accounts.

## How It Actually Works

Here's what a typical session looks like now.

Starting a new project:

```
Me: "Set up tracking for the e-commerce API"

Agent calls: tracker_init
  → epic_create (Auth)
  → epic_create (Catalog)
  → task_create (JWT auth)
  → subtask_create ([setup lib, create endpoint, add middleware])
```

Five tool calls. Project is structured with epics, tasks, and subtasks. All persisted to SQLite.

Resuming the next day:

```
Me: "What's the status?"

Agent calls: tracker_dashboard
```

```json
{
  "stats": {
    "total_tasks": 12,
    "tasks_done": 4,
    "tasks_blocked": 1,
    "completion_pct": 33.3
  },
  "blocked_tasks": [{ "title": "Add rate limiting", "epic": "Authentication" }],
  "recent_activity": ["Task 'JWT auth' status: in_progress → done", ...]
}
```

The agent immediately knows: 33% done, one task is blocked, auth epic is ahead of catalog. It can prioritize without me having to explain anything.

Recording a decision:

```
Me: "We're going with Redis for caching. Mark the research tasks as done."

Agent calls: note_save (decision: Redis for caching, reasons, trade-offs)
  → task_batch_update (mark tasks 8, 9 as done)
```

The decision is stored as a typed note, linked to the relevant epic. Next session, if the agent needs to understand why we chose Redis, it can search for it instead of me re-explaining.

## The Token Math

This is the part that surprised me.

Saga's 23 tool definitions cost about 1,500 tokens in the system prompt. That's fixed: it doesn't grow with your project. A `tracker_dashboard` call returns ~800 tokens of structured data. A filtered query like "show me blocked tasks" returns ~200 tokens.

Compare that to a `PROGRESS.md` file for a medium project: 3,000–5,000 tokens, loaded in full every session, growing over time.

The crossover happens at about 15–20 tasks. Beyond that, the structured approach scales better because the agent only retrieves what it asks for, not everything.

And unlike a markdown file, the data is queryable. "What did we decide about caching?" is a `note_search` call, not a full-file scan.

## What's Under the Hood

For the technically curious:

- SQLite with WAL mode: concurrent reads during writes, busy timeout of 5 seconds for lock contention
- Foreign keys enforced: no orphaned tasks when you delete an epic (cascading deletes)
- Append-only activity log: every create, update, and delete is recorded with field-level granularity
- Parameterized queries with column allowlists: no SQL injection surface
- MCP safety annotations on every tool: clients know which tools are read-only, destructive, or idempotent

The whole thing is ~1,400 lines of TypeScript. Two dependencies: the MCP SDK and better-sqlite3.

## Session Diff: The Feature I Didn't Plan

After launching, someone on Reddit asked: "Can it show what changed between sessions?"

Good idea. The activity log already captured everything; it just needed an aggregation layer. So I added `tracker_session_diff`. You give it a timestamp, and it returns:

```json
{
  "total_changes": 14,
  "summary": { "created": 3, "status_changed": 4, "updated": 5, "deleted": 2 },
  "highlights": [
    "Task 'Fix auth bug' status: in_progress → done",
    "Created epic 'API v2'",
    "Note 'Sprint retro' deleted"
  ]
}
```

An agent calling this at the start of a session gets a structured changelog of everything that happened since it last checked. No parsing markdown diffs. No re-reading files.

## Getting Started

Add this to your project's `.mcp.json`:

```json
{
  "mcpServers": {
    "saga": {
      "command": "npx",
      "args": ["-y", "saga-mcp"],
      "env": {
        "DB_PATH": "/absolute/path/to/your/project/.tracker.db"
      }
    }
  }
}
```

That's it. Works with Claude Code, Claude Desktop, or any MCP-compatible client. The database file is created automatically on first use.

## What I Learned

Building Saga taught me something about how agents actually consume information: they're better with structure than with prose.

A markdown file is optimized for humans scanning a document. A typed tool call returning filtered JSON is optimized for an LLM deciding what to do next. The agent doesn't need to "read" your project status; it needs to query it.

MCP makes this practical. The protocol handles tool discovery, typed schemas, and transport. All I had to do was put a database behind it.

If you're building multi-session agent workflows and finding yourself maintaining growing context files, consider whether that context should be a database instead.

Saga is open-source (MIT) and available on npm:

- GitHub: https://github.com/spranab/saga-mcp
- Install: npx -y saga-mcp
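
For anyone curious what "an aggregation layer" over an activity log might look like, here's a rough TypeScript sketch. It is not Saga's actual code: the `ActivityEntry` shape, its field names, and the `sessionDiff` function are all assumptions made up for illustration. The only thing taken from the post is the output shape (`total_changes`, `summary`, `highlights`).

```typescript
// Hypothetical activity-log row. Field names are assumptions, not Saga's schema.
type Action = "created" | "status_changed" | "updated" | "deleted";

interface ActivityEntry {
  timestamp: string; // ISO 8601, e.g. "2026-02-22T09:00:00Z"
  action: Action;
  summary: string; // human-readable description of the change
}

interface SessionDiff {
  total_changes: number;
  summary: Record<Action, number>;
  highlights: string[];
}

// Aggregate every log entry strictly newer than `since` into a compact changelog.
function sessionDiff(
  log: ActivityEntry[],
  since: string,
  maxHighlights = 3
): SessionDiff {
  const cutoff = Date.parse(since);
  const recent = log.filter((e) => Date.parse(e.timestamp) > cutoff);

  // Count changes per action type.
  const summary: Record<Action, number> = {
    created: 0,
    status_changed: 0,
    updated: 0,
    deleted: 0,
  };
  for (const entry of recent) {
    summary[entry.action] += 1;
  }

  // Keep only the newest few entries as highlights (newest first).
  const highlights = [...recent]
    .sort((a, b) => Date.parse(b.timestamp) - Date.parse(a.timestamp))
    .slice(0, maxHighlights)
    .map((e) => e.summary);

  return { total_changes: recent.length, summary, highlights };
}

// Usage: everything that changed since the agent last checked in.
const log: ActivityEntry[] = [
  { timestamp: "2026-02-20T10:00:00Z", action: "created", summary: "Created epic 'Auth'" },
  { timestamp: "2026-02-22T09:00:00Z", action: "status_changed", summary: "Task 'Fix auth bug' status: in_progress → done" },
  { timestamp: "2026-02-22T11:30:00Z", action: "created", summary: "Created epic 'API v2'" },
];

const diff = sessionDiff(log, "2026-02-21T00:00:00Z");
console.log(JSON.stringify(diff, null, 2));
```

The point of the design is that the agent receives counts plus a handful of highlights instead of the raw log, so the response size stays roughly constant no matter how much happened between sessions.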
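
The crossover claim from the token-math section can be sanity-checked with a few lines of arithmetic. In the sketch below, the 1,500 and 800 token figures come from the post; the per-task cost of a markdown progress file is purely an assumption (I did not measure it), and the function names are hypothetical. The model is deliberately crude: markdown cost grows linearly with task count, while the structured cost is a flat tool-definition overhead plus one dashboard call.

```typescript
// Figures quoted in the post.
const TOOL_DEFINITIONS_TOKENS = 1500; // fixed system-prompt cost of the 23 tools
const DASHBOARD_CALL_TOKENS = 800; // one tracker_dashboard response

// Structured approach: flat cost per session under this simplified model.
const structuredCost = TOOL_DEFINITIONS_TOKENS + DASHBOARD_CALL_TOKENS;

// Markdown approach: ASSUMED to grow linearly with the number of tasks.
// tokensPerTask is a guess at how much text each task accumulates in the file.
function markdownCost(tasks: number, tokensPerTask: number): number {
  return tasks * tokensPerTask;
}

// Smallest task count at which the markdown file costs more than the
// structured approach.
function crossoverTasks(tokensPerTask: number): number {
  if (tokensPerTask <= 0) {
    throw new RangeError("tokensPerTask must be positive");
  }
  return Math.floor(structuredCost / tokensPerTask) + 1;
}

// Try a few assumed per-task costs.
for (const perTask of [130, 140, 150]) {
  const n = crossoverTasks(perTask);
  console.log(
    `${perTask} tokens/task: crossover at ${n} tasks ` +
      `(markdown ${markdownCost(n, perTask)} vs structured ${structuredCost})`
  );
}
```

Under these assumptions, a per-task cost somewhere between 130 and 150 tokens puts the crossover at 16 to 18 tasks, which sits inside the 15–20 range quoted above. A leaner or more verbose progress file would shift that number accordingly.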