CI/CD in the Era of AI and Platform Engineering: A Deep Dive into Dagger CI (Part 4)
Part 4: The AI-Native CI/CD Stack: Agents, Modules, and Spec-Driven Development
What Is a Dagger Agent?
The Problem With Setting Up CI
Generating the Setup With Daggie
Meet the Agents
The Generated Setup
Running the Checks
When a Check Fails
Integrating Into CI/CD
The Developer Experience Shift
Before: Pipeline Specialists
After: Agents Configure and Fix CI
From Developer Platform to Agent Factory
Spec-Driven Development With Speck
The Pipeline Pattern
Setting Up the Workflow
Running the Workflow
What Just Happened
Agents That Learn: Self-Improvement Across Runs
Three modes
Would I Recommend Dagger CI in Production Right Now?
What the Agent Layer Needs
What's Coming, and Why It Matters
Conclusion
## Key Takeaways

Fixed pipelines for speed and reliability. AI agents to write them and to fix them when they break.

In Part 1 we built pipelines as real code. In Part 2 we decoupled them from infrastructure. In Part 3 we built AcmeCorp's private module library (acme-backend, acme-frontend, and acme-deploy), which wraps public daggerverse modules with organization-specific compliance, naming, and security. Now let's talk about where AI actually belongs in CI/CD, and where it doesn't.

The thesis is simple: AI doesn't replace the pipeline. It writes the pipeline and fixes it when it breaks. The pipeline itself stays fixed, deterministic, and fast.

## What Is a Dagger Agent?

Just as container primitives allow us to build CI pipelines, Dagger introduces an LLM() primitive that lets you create agents the same way you'd call any other pipeline function. Under the hood, dag.llm() connects to any supported model (Claude, GPT, Gemini) and gives you a composable builder to layer on system prompts, environment bindings, and tool access.

What makes this powerful is the tool story. Any Dagger module, including the same private modules we built in Part 3, can be exposed as MCP tools that the agent calls at runtime. Your acme-deploy module becomes a cloud_run tool. Your acme-backend module becomes build and test tools. You can also attach any local MCP server (a language linter, a CLI wrapper, a documentation server) alongside those module tools, giving the agent both your custom CI abstractions and third-party capabilities in a single environment.

The result is a tight synergy between modules and agents: modules are the typed, testable building blocks; agents are the orchestration layer that composes them through natural language. You don't choose between writing pipelines and using AI. You write modules once, and agents compose them for you.

LLM provider required. The dag.llm() primitive needs access to a language model.
Dagger detects your provider from environment variables. In CI, add the key as a repository secret and pass it via env:. Locally, export the variable in your shell before running dagger call. If no provider is detected, agent functions fail at runtime with a clear error message.

## The Problem With Setting Up CI

We solved the YAML problem in Part 1. Pipelines are real code now. And in Part 3, we went further: toolchains let you install AcmeCorp's private modules as zero-code CI, with dagger check running all your checks from a single dagger.json. No SDK, no .dagger/ directory, no pipeline code.

But there's still a bottleneck: configuring that setup requires knowing the module library. AcmeCorp's platform team maintains a growing set of private modules: acme-backend, acme-frontend, acme-deploy. Each module has its own @check functions, its own parameters, its own DefaultPath conventions. Knowing which modules to install as toolchains, which customizations to add for a monorepo layout, and how to wire the deployment step in GitHub Actions still requires familiarity with the internal module library.

What if you could point an AI agent at your private modules and your source code, and have it generate the complete toolchain setup and CI workflow for you?

## Generating the Setup With Daggie

Daggie is a Dagger CI specialist agent. It reads module source code, understands each module's API, and generates the right toolchain configuration for your project. You give it your source directory and the Git URL of your module repository. Daggie discovers all available modules inside it and picks the ones relevant to the assignment.

Let's pick up from where we left off in Part 3. We're in the dagger-ci-demo monorepo (FastAPI backend + Angular frontend), and AcmeCorp's private modules live at github.com/telchak/acme-dagger-modules. AcmeCorp's coding agents (Monty, Angie, Daggie) live at github.com/telchak/daggerverse.
If you still have local changes from Part 3, you can stash them with git stash -u or simply delete the repo and clone it fresh — we want a clean starting point with no existing Dagger configuration.

First, initialize Dagger in the project and write the assignment file. Then point Daggie at both repositories — the module library and the daggerverse (so it can discover Monty and Angie's real URLs and versions).

Daggie clones both repositories and auto-discovers all Dagger modules within them by finding dagger.json files. It reads each module's source code and @check-decorated functions — acme-backend (test, lint), acme-frontend (test, lint, audit), acme-deploy (scan) — detects the monorepo layout, and finds the coding agents (Monty, Angie) with their version tags. It also fetches the latest dagger/dagger-for-github action version automatically. The export --path=. writes the generated dagger.json and .github/workflows/ci.yml to your project root, ready to review, test with dagger check, and commit.

## Meet the Agents

Before we look at what Daggie generates, let's introduce the three agents that work together in this setup. They're all Dagger modules, and you call them the same way you call any other module:

Daggie: the CI specialist. It reads your source code and the available modules, then generates the toolchain configuration and CI workflow. You've just seen it in action. Daggie writes the setup; it doesn't run in the pipeline.

Monty: the Python coding agent. When a check fails on Python code (a test failure, a lint error, a broken import), Monty reads the error output and the source code, analyzes the root cause, and posts an inline code fix suggestion directly on the pull request.

Angie: the Angular/TypeScript coding agent. Same role as Monty, but for the frontend stack. When an Angular build or test fails, Angie diagnoses the issue and suggests the fix.

The key design: Daggie generates the toolchain setup once. Monty and Angie are called from the CI workflow only when something fails.
The happy path (dagger check: lint, test, audit, scan) is pure deterministic module execution with no LLM involved. AI only enters the picture when a human needs help.

## The Generated Setup

Here's what Daggie generates. No .dagger/ directory, no SDK, no Python pipeline code. Just a dagger.json with toolchains and a CI workflow. The code blocks below are what Daggie consistently produced after 10+ runs with gemini-2.5-pro.

Notice what Daggie understood from the project structure and the module library. No GCP credentials are needed for the checks — they run entirely in containers.

## Running the Checks

Six checks, three toolchains, zero lines of code. All six run in parallel. No tokens consumed. The private modules handled base images, cache volumes, coverage thresholds, and vulnerability scanning — all invisible to the project.

## When a Check Fails

Let's say a developer pushes a PR and the backend tests fail. The CI workflow's failure step kicks in. Monty reads the error output and the source code, analyzes the root cause, and posts an inline code suggestion directly on the PR:

🐍 Monty suggested a fix for backend/auth.py: The test expects a 401 when the token is expired, but validate_token doesn't check the exp claim. This adds the expiry check before returning.

The developer gets actionable fix suggestions, with code they can accept in one click, instead of a wall of logs to interpret.

## Integrating Into CI/CD

Daggie also generates the GitHub Actions workflow. Here's what it produces: dagger check for PRs, deployment on main, and a failure handler that calls Monty or Angie directly.

The check job uses zero LLM tokens. It's pure dagger check — six deterministic checks from three toolchains. The suggest-fix steps only run on failure, calling Monty and Angie directly as Dagger modules (not pipeline functions). The deploy job calls acme-deploy's functions via dagger call on the installed toolchain. You get deterministic, fast CI with intelligent failure handling.

The platform team builds the modules and agents.
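For orientation, a minimal sketch of what such a toolchain dagger.json can look like — the module sources are inferred from Part 3 of this series, and the exact schema is an assumption that may differ by Dagger version:

```json
{
  "name": "dagger-ci-demo",
  "toolchains": [
    { "name": "acme-backend", "source": "github.com/telchak/acme-dagger-modules/acme-backend" },
    { "name": "acme-frontend", "source": "github.com/telchak/acme-dagger-modules/acme-frontend" },
    { "name": "acme-deploy", "source": "github.com/telchak/acme-dagger-modules/acme-deploy" }
  ]
}
```

With a file shaped like this in place, dagger check discovers and runs every @check function from all three installed toolchains.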
Daggie configures toolchains and generates the CI workflow. dagger check runs fast and deterministic. When things break, coding agents step in with targeted fixes.

## The Developer Experience Shift

So far we've seen how Dagger improves CI performance, maintainability, and developer experience. But there's a larger shift happening. As coding agents become more capable, the developer's core role is evolving from pure coder to agent orchestrator. You still need to understand the code, review the output, and make architectural decisions. But more and more of the mechanical work (implementing a well-specified feature, writing tests for existing code, fixing a lint error) can be delegated to agents that understand your codebase.

## From Developer Platform to Agent Factory

Follow this evolution to its conclusion, and an Internal Developer Platform starts looking like an Internal Agent Factory: a system that manages not just infrastructure and deployments, but how coding agents are built, composed, and deployed — which agents run on which tasks, with what models, under what constraints, producing what artifacts.

The building blocks are already here. What's missing is the orchestration layer: something that takes a feature request, breaks it into agent-assignable tasks, and dispatches them through CI. That's Speck.

## Spec-Driven Development With Speck

Speck is a Dagger agent that implements spec-driven development, inspired by GitHub's spec-kit methodology. The idea is simple: specifications first, code second. Given a feature request (either a prompt or a GitHub issue), Speck runs a three-step pipeline. The output is a structured JSON object designed for GitHub Actions fromJson() + matrix strategy consumption. Each task includes a suggested_agent (which Dagger agent should execute it), a suggested_model (which LLM complexity tier it needs), and an order field that defines the execution sequence.
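To make the output shape and the suggested_model assignment concrete, here's an illustrative Python sketch. It is not Speck's actual code: the task structure and mapping function are assumptions based on the description here, and the concrete model IDs are the ones from the example run later in this article.

```python
# Illustrative sketch — not Speck's implementation. It shows the
# complexity → model mapping and one task from a decomposition.

TIERS = ["simple", "standard", "complex"]

# Model IDs per tier for the Claude family (IDs taken from the
# article's example run).
CLAUDE_MODELS = {
    "simple": "claude-haiku-4-5",
    "standard": "claude-sonnet-4-6",
    "complex": "claude-opus-4-6",
}

def suggested_model(complexity: str, is_test_task: bool = False) -> str:
    """Map a task's complexity tier to a concrete model ID.

    Test tasks get one tier above their implementation task's
    complexity, capped at the top tier.
    """
    idx = TIERS.index(complexity)
    if is_test_task:
        idx = min(idx + 1, len(TIERS) - 1)
    return CLAUDE_MODELS[TIERS[idx]]

# One task in the decomposition output. The three field names
# (suggested_agent, suggested_model, order) come from the article;
# the overall shape is an assumption.
task = {
    "description": "Add favorites list endpoint with pagination",
    "suggested_agent": "monty",
    "suggested_model": suggested_model("standard"),
    "order": 1,
}

print(task["suggested_model"])  # → claude-sonnet-4-6
```

Because each task carries its own model ID, a cheap model handles trivial edits while expensive models are reserved for the work that needs them.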
## The Pipeline Pattern

When --include-tests and --include-review are enabled, Speck organizes tasks into phases that follow an implement → test → review pipeline. Phases run in parallel (each on its own CI runner). Tasks within a phase use the prompt chaining pattern: a workflow where the output of one agent becomes the input of the next, forming a sequential pipeline. Concretely, each agent receives a source Directory, modifies it, and exports the result back to the workspace. The next agent in the chain picks up that modified workspace as its input. This is different from running agents independently: the test agent sees the code the implementation agent wrote, and the review agent sees both the implementation and the tests. One PR is created per phase from the accumulated changes.

The model assignment is automatic: Speck maps task complexity to concrete model IDs based on the chosen provider family. Simple config changes get Haiku. Standard feature implementations get Sonnet. Cross-cutting architectural changes get Opus. Test tasks get one tier above their implementation task's complexity, since understanding the implementation requires more context.

## Setting Up the Workflow

Let's see this in action. We'll fork a real-world application, the FastAPI RealWorld Example App (a production-like REST API with authentication, articles, comments, and favorites), and turn GitHub Actions into a spec-driven development platform.

Step 1: Fork the repository

Step 2: Add the Speck workflow

Create .github/workflows/speck.yml. A few things to note in this workflow:

Step 3: Configure secrets

The workflow needs an LLM API key. Add it as a repository secret. The GITHUB_TOKEN is provided automatically by GitHub Actions with the permissions declared in the workflow.
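For reference, the permission and secret wiring for a workflow like this typically looks like the fragment below. This is a sketch, not the post's verbatim speck.yml: the secret name and the exact permission set are assumptions.

```yaml
# Illustrative fragment only — not the full speck.yml.
permissions:
  contents: write        # push branches for the generated PRs
  pull-requests: write   # open one PR per phase
  issues: write          # comment the decomposition on the issue

env:
  # LLM provider key, stored as a repository secret (name assumed).
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  # Provided automatically by GitHub Actions with the permissions above.
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

The declared permissions matter: without pull-requests and issues write access, the automatic GITHUB_TOKEN cannot open PRs or post the decomposition comment.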
## Running the Workflow

Step 4: Commit, push, and create a test issue

Now create a GitHub issue with a feature request (see issue #1):

Title: Add article bookmarking/favorites list endpoint

Add the ability for authenticated users to retrieve their list of favorited articles with pagination and optional filtering.

Add the speck label to the issue. This triggers the workflow.

Step 1, Decomposition (Opus): Speck reads the issue, explores the FastAPI codebase (models, routes, repositories, existing test patterns), and produces a structured decomposition. It posts the result as a comment on the issue. In this case, Speck decomposed the feature into 3 phases with 9 tasks. Each task has a suggested_model based on complexity: simple schema additions get claude-haiku-4-5, standard implementations get claude-sonnet-4-6, and comprehensive test writing gets claude-opus-4-6 (since tests need to understand the full implementation context).

Step 2, Execution (parallel phases, sequential tasks): GitHub Actions fans out a matrix by phase. Each phase runs on its own runner. Within each phase, tasks are chained sequentially: Monty implements the feature, then writes tests on top of the implementation, then reviews the accumulated changes.

Step 3, Pull Requests: Each phase produced one PR with all accumulated changes, linked to the original issue. Each PR includes implementation, tests, and a review pass, all generated by Monty working sequentially on the same codebase within the phase.

## What Just Happened

A developer wrote a feature request with acceptance criteria. The system decomposed it, executed it, and opened pull requests. No pipeline code was written. No agent was invoked manually. The developer's job is now to review the PRs: read the code, check the tests, verify the approach. The mechanical work of translating a spec into code, tests, and PRs happened automatically.

This is the shift from Internal Developer Platform to Internal Agent Factory: the platform doesn't just run your CI.
It runs your agents, manages their model costs, chains their outputs, and produces reviewable artifacts from natural language specifications.

Image generated with Google's Gemini "Nano Banana Pro"

## Agents That Learn: Self-Improvement Across Runs

There's one more capability worth covering. Every agent (Monty, Angie, Daggie, and Goose, a GCP deployment orchestrator) reads per-repo context files to understand project conventions. But until now, the context was static: the developer wrote it once and maintained it by hand. With --self-improve, the agents can update those files themselves.

As Monty works through the codebase (reading models, tracing routes, checking existing patterns), it discovers things: "This project uses Pydantic v2 field validators, not v1-style @validator." "Tests use httpx.AsyncClient, not the sync test client." "Custom exceptions live in app/errors.py." Instead of those discoveries dying with the session, Monty records them in two files: MONTY.md for Python-specific knowledge, and AGENTS.md for general project knowledge shared across all agents.

The next time any agent runs on this repo — whether it's Monty, Angie, or a different developer running one of them — it reads both the agent-specific file and the shared AGENTS.md, starting with better knowledge. Python patterns stay in MONTY.md where only Monty reads them; project-wide conventions go in AGENTS.md where every agent benefits. No one had to write documentation. The agents documented the project by working on it.

## Three modes

The commit mode is useful for automation. When combined with develop-github-issue, the context file updates get included in the PR, so the PR carries both the code changes and a commit with the learned context. Over time, the context files become living documents: a compressed summary of the project's architecture, conventions, and gotchas, maintained by the agents that work on it.

## Would I Recommend Dagger CI in Production Right Now?

I've been following the Dagger project for several years now. And I can say with confidence: it has never been closer to production-ready than it is today.
The core primitives (typed functions, composable modules, containerized execution, deterministic caching) are solid. The dagger call experience is genuinely portable across local development and CI. The module ecosystem is growing. And as we've seen throughout this series, the LLM integration through the dag.llm() primitive opens up a category of workflows that simply didn't exist before.

That said, there are areas where the platform still needs to mature. Here's what I'd like to see, and what's already on the roadmap.

## What the Agent Layer Needs

The current LLM primitive is functional but minimal. To build truly capable agents in Dagger, a few key features would make a significant difference.

## What's Coming, and Why It Matters

Some of the most exciting changes are already in active development:

Cloud Engines: Fully managed Dagger execution environments with auto-scaling and distributed caching built in. Run dagger --cloud and your pipeline executes on managed infrastructure, with secrets and local context securely streamed to the cloud. No more managing Kubernetes daemonsets or custom cache layers.

Cloud Checks: This is the big one. Cloud Checks connects directly to your Git provider and triggers dagger check on every change, running on Cloud Engines. No YAML. No vendor syntax. No orchestration layer. Just your Dagger modules.

These two features are welcome because the more complex our Dagger workflows get, the more trying to fit them into GitHub Actions or GitLab CI feels like forcing circles into squares. Our Speck-driven development workflow is a perfect example: a decompose job that outputs dynamic JSON, a matrix strategy that fans out phases, shell scripts converting snake_case to kebab-case, environment variables carrying JSON between steps, conditional export commands based on return types... All of this ceremony exists because GitHub Actions was designed for static, declarative workflows, not for the kind of dynamic, graph-shaped execution that Dagger naturally produces.
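One tiny example of that ceremony is the snake_case-to-kebab-case conversion just mentioned: the decomposition emits snake_case task names, but CLI flags and branch names want kebab-case. The workflow does this in shell; here it is sketched in Python for readability (the function name is hypothetical):

```python
def to_kebab(name: str) -> str:
    """Convert a snake_case identifier to kebab-case,
    e.g. for CLI flags or branch names."""
    return name.replace("_", "-")

print(to_kebab("suggested_model"))  # → suggested-model
```

Trivial in isolation — but multiply it by every boundary where JSON crosses between jobs, and you get the translation-layer tax the next section is about.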
Cloud Checks would eliminate that entire translation layer. Your Dagger module is the CI platform. Add a native Graph core type on top, and you could have a fully native multi-agent workflow, completely independent of GitHub Actions or any other CI engine. Dagger CI would go from a "CI development toolkit" to a fully operational CI/CD platform.

Modules V2: A fundamental redesign of how modules interact with projects. Today, modules can't see your project structure unless you thread it through manually with --source flags, custom boilerplate, and static path patterns. Modules V2 introduces a typed Workspace API that lets modules parse configuration files, traverse directory trees, and adapt to any project layout, all through executable code rather than rigid pragmas. A new .dagger/config.toml file declares which modules a project uses in a human-editable format, and a lockfile ensures reproducible resolution across teams. This shifts complexity from users to module authors, which is exactly where it belongs.

These three features together (managed compute, native CI triggering, and smarter module integration) would close the gap between "Dagger as a portable pipeline SDK" and "Dagger as a complete CI platform." And from everything I've seen in the project's trajectory, that gap is closing fast.

## Conclusion

This is where the whole series comes together. The key insight: CI checks need to be fast, reliable, and deterministic. AI belongs at the edges — generating the configuration, diagnosing failures, decomposing specs into tasks, and learning from every run. Never in the hot path.

The example apps and Dagger module are at github.com/telchak/dagger-ci-demo. The AcmeCorp private modules from Part 3 are at github.com/telchak/acme-dagger-modules.

This concludes the 4-part series. Thanks for reading.

Tags: #cicd #dagger #ai-agents #platform-engineering #mcp #cloudrun #firebase #spec-driven-development
MONTY.md — Learned Context:

- Pydantic v2 with field validators (`field_validator`), not v1 `@validator`
- All route handlers are async; tests use `httpx.AsyncClient` with `pytest-asyncio`
- Input validation pattern: Pydantic model as request body, raises `ValidationError` → 422
AGENTS.md — Learned Context:

- Custom exception hierarchy in `app/errors.py`, handlers in `app/middleware.py`
- Project uses src layout with `app/` as the main package
- CI runs pytest with coverage; minimum threshold is 80%