Tools: Building a 1,056-Test Rust CLI Without Writing Rust — Claude Code Did It

2026-03-19 0 views admin

The Subagent Pattern

Week 1: Fork and Rename

Week 2: The 6 New Filters

Week 3: Benchmarks and Honest Failures

What I Actually Did vs. What Claude Did

Final Stats

I don't write Rust. I can read it well enough to catch obvious bugs, but I've never typed impl or fn main() from scratch. Yet I shipped a 40-module Rust CLI with 1,056 tests in 3 weeks.

Claude Code wrote every line of Rust. I wrote prompts, reviewed diffs, and made architecture decisions.

The tool, ContextZip, compresses Claude Code's own context window. So the AI built a tool to make itself work better. That irony wasn't lost on me.

Here's exactly how the process worked, including the parts that went wrong.

The Subagent Pattern

I never gave Claude Code a vague instruction like "build a context compressor." Every task was a subagent dispatch: a scoped prompt with clear inputs, expected outputs, and test requirements.

"Implement an error stacktrace filter for Node.js. Input: raw stderr with Express middleware frames. Output: error message + user code frames only. Write 20+ test cases covering nested errors, empty traces, and mixed stdout/stderr. Put the filter in src/filters/error_stacktrace.rs."

The subagent implements, writes tests, runs them. Then I dispatch a second subagent to review:

"Review the error_stacktrace filter. Check edge cases: what happens with zero frames? Frames with no file path? Stack traces inside JSON output?"

This two-agent cycle (implement, then review) caught 80% of bugs before I even looked at the code.

Week 1: Fork and Rename

The foundation was RTK (Rust Token Killer), an open-source CLI with 34 command modules, 60+ TOML filters, and 950 tests. I forked it and dispatched a subagent to rename every reference from "rtk" to "contextzip" across 70 files. 1,544 insertions, 1,182 deletions. All 950 tests still passing.

Then three agents worked in parallel: one on the install script, one on GitHub Actions CI/CD for 5 platforms, one extending the SQLite tracking system.

By Friday: curl | bash installs the binary on Linux or macOS, and contextzip gain --by-feature shows per-filter savings.

Week 2: The 6 New Filters

This is where ContextZip stops being a rename and starts being a product.
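To make the stacktrace filter spec from the subagent prompt concrete, here is a minimal sketch of the idea. It is not ContextZip's actual code. The function names, the "frames hidden" summary line, and the rule that a framework frame is any `at ...` line mentioning node_modules or node:internal are all assumptions made for illustration.

```rust
/// Returns true when a Node.js stack frame points into dependency or
/// runtime code rather than the user's own files.
/// Assumption: "framework" means node_modules or node:internal paths.
fn is_framework_frame(frame: &str) -> bool {
    frame.contains("node_modules") || frame.contains("node:internal")
}

/// Keeps the error message and user-code frames, and replaces the
/// dropped frames with a single summary line (hypothetical format).
pub fn filter_node_stacktrace(raw: &str) -> String {
    let mut kept: Vec<String> = Vec::new();
    let mut hidden = 0usize;

    for line in raw.lines() {
        let trimmed = line.trim_start();
        // Node.js prints stack frames as indented lines starting with "at ".
        let is_frame = trimmed.starts_with("at ");
        if is_frame && is_framework_frame(trimmed) {
            hidden += 1;
        } else {
            // Error messages, user frames, and any other output pass through.
            kept.push(line.to_string());
        }
    }

    if hidden > 0 {
        kept.push(format!("    ... {} framework frames hidden", hidden));
    }
    kept.join("\n")
}
```

Input with zero frames, one of the edge cases from the review prompt, passes through unchanged because nothing matches the frame check.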
Six new compression filters, each built by a subagent cycle:

- Error stacktraces: strips framework frames from Node.js, Python, Rust, Go, Java
- ANSI preprocessor: removes escape codes, spinners, progress bars
- Web page extraction: strips nav, footer, ads, keeps article content
- Build error grouping: collapses 40 identical TypeScript errors into one group
- Package install compression: removes deprecated warnings, keeps security alerts
- Docker build compression: success = 1 line, failure = full context

Each filter got 15-20 dedicated test cases. The error stacktrace filter alone has 20 tests covering 5 languages.

Week 3: Benchmarks and Honest Failures

I ran 102 benchmark tests with production-scale inputs. The results were not uniformly impressive.

Rust panic compression started at 2%. The subagent's first implementation only stripped the backtrace header line. I rewrote the prompt with explicit examples of Rust panic output and dispatched again. It landed at 80%.

Java stacktrace compression went negative (-12%) on short traces. The formatted output was longer than the raw input. I added a threshold: if compression ratio is below 10%, pass through the original output unchanged. Final result: 20% savings on Java, no negative cases.

Build error grouping hit -10% on single-error inputs. Same fix: threshold passthrough.

Lying about benchmarks is worse than imperfect numbers. The README shows every result, including the weak spots.

What I Actually Did vs. What Claude Did

Me: Architecture decisions, prompt design, review, quality gates, benchmark analysis, bug triage.

Claude Code: All Rust implementation, test writing, CI/CD configuration, README generation, install script.

The split was roughly 20% me (thinking, reviewing, deciding) and 80% Claude (typing, testing, building). But that 20% was the difference between shipping and not shipping. Without review cycles, the Rust panic filter would still be at 2%.

Final Stats

- 1,056 tests, 0 failures
- 102 benchmark cases
- 40+ command modules (34 inherited + 6 new)
- 5-platform CI/CD (Linux x86/musl, macOS arm64/x86, Windows)
- 3 install methods (curl, Homebrew, cargo)
- README in 4 languages

The tool works. I use it daily. My Claude Code sessions last 40-60% longer before hitting context limits.

The AI built a tool to extend its own memory, and the humans reviewing it are the reason it actually works.

- GitHub: jee599/contextzip
- 102-test benchmark results
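The threshold passthrough from Week 3 (use a filter's output only when it saves at least 10%) can be sketched as follows. This is an illustration under assumptions, not the code in the repository. The names are hypothetical, and savings are measured in bytes here, whereas the real tool may count tokens or something else.

```rust
/// Minimum savings, in percent, for a filter's output to be used.
/// The 10% figure comes from the article; the constant name is hypothetical.
const MIN_SAVINGS_PERCENT: usize = 10;

/// True when the filtered text is at least MIN_SAVINGS_PERCENT smaller
/// than the raw text. Integer arithmetic avoids floating-point rounding
/// at the exact 10% boundary.
pub fn clears_threshold(raw_len: usize, filtered_len: usize) -> bool {
    // saved / raw >= 10%  is equivalent to  saved * 100 >= raw * 10
    filtered_len < raw_len
        && (raw_len - filtered_len) * 100 >= raw_len * MIN_SAVINGS_PERCENT
}

/// Returns the filtered text only when it clears the threshold,
/// otherwise passes the raw input through unchanged.
pub fn apply_with_passthrough<'a>(raw: &'a str, filtered: &'a str) -> &'a str {
    if clears_threshold(raw.len(), filtered.len()) {
        filtered
    } else {
        raw
    }
}
```

With this check in place, a case like the -12% Java result (formatted output longer than the input) returns the raw text, so a filter can never make the output larger than what it was given.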
