Tools: Stop babysitting your AI agent! - Guide

Permission fatigue is real. You approve so many legitimate actions that the one dangerous action looks identical. Approve file write, approve bash command, approve another file write; twenty approvals in, and it’s muscle memory. You’re not reviewing anymore. You’re the bot.

Auto mode, permission hooks, allow/deny lists: they’re all behavioral controls running inside the environment they’re supposed to protect. The classifier and the thing it’s classifying share the same filesystem, the same network, the same credentials. If something goes wrong, the damage surface is your entire machine and potentially the production systems your session has access to.

The more you configure permissions, the more you end up debugging the permission system instead of doing actual work. Allow this tool, deny that command, hook into pre-execution, handle the escape hatch when the sandbox blocks something legitimate. It starts feeling like a second project just to safely use the first one.

The issue isn’t permissions. It’s the blast radius.

Containment over constraint

This isn’t a new idea. We already run untrusted code this way in CI. We already run containers this way in production. The security boundary is the environment, not the application’s self-restraint.

The shift is: don’t restrict what the agent does. Restrict what it has. Give it only the filesystem it needs, only the network destinations it needs, only the credentials it needs. Everything else is unreachable by default.

Once you have that, you stop caring about what happens inside. The agent can rm -rf / all day.

NVIDIA seems to agree. Their OpenShell project takes the same approach: sandboxed execution environments with declarative YAML policies governing egress, filesystem access, and credentials. It uses containers (K3s under the hood) as the isolation boundary.

Containers are a good start. But they share the host kernel, and the breakout surface is well-documented. For a truly untrusted agent running --dangerously-skip-permissions, a KVM boundary is a categorically different isolation tier. And with microVMs, the performance cost of that stronger boundary has largely disappeared.

What I've been trying

I’ve been experimenting with this idea in a little project called nixbox (a NixOS microVM sandbox). I set out to achieve the following:

- KVM isolation: a compromised agent cannot reach the host. Period.
- Egress filtering: DNS allowlist. Only approved domains resolve. Three modes: off, filtered, open.
- Explicit mounts: virtiofs bind mounts, write access is opt-in. Mount ~/workspace, nothing else.
- Scoped secrets: credentials passed via env, not inherited from the host shell.
- Reproducible: Nix-built image. Same config, same guest, every time.

The usage is something like this:

```sh
nixbox up
nixbox run "cd ~/workspace/myproject && claude --dangerously-skip-permissions -p 'fix tests'"
nixbox down
```

No permission prompts. No classifier. Full autonomy inside a box the agent can’t escape from. --dangerously-skip-permissions stops being dangerous when there’s nothing dangerous to reach.

“Isn’t a VM overkill?”

That was my assumption too. It’s not. Cloud-hypervisor boots in seconds. Balloon memory means the guest only uses what it needs and returns the rest. virtiofs gives shared filesystem access without the overhead of network mounts. It feels like opening a terminal, not spinning up a second machine.

Compare the real costs: configuring permissions, hooks, allow/deny lists, trust levels, debugging why the agent got blocked mid-task... vs. nixbox run ….

The way to think about it

When you hire a contractor, you don’t hand them a 47-page list of forbidden tools and stand behind them checking every move. You put them in the right room, with the right materials, and let them work. Auto mode is the 47-page list. A VM is the room.
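
To make the egress filtering goal above more concrete, here is a minimal shell sketch of an allowlist decision. It is not nixbox's implementation: the function name egress_allows, the suffix-matching rule, the example domains, and the reading of the three modes (off as nothing resolves, filtered as allowlist only, open as everything resolves) are all assumptions for illustration.

```sh
# Hypothetical sketch of a DNS allowlist decision; not taken from nixbox.
# Assumed mode semantics: off = nothing resolves, filtered = allowlist only,
# open = everything resolves.
# Usage: egress_allows MODE HOSTNAME [ALLOWED_DOMAIN...]
# Returns 0 if the name may resolve, 1 if not, 2 on an unknown mode.
egress_allows() {
  mode="$1"
  host="$2"
  shift 2
  case "$mode" in
    off)  return 1 ;;
    open) return 0 ;;
    filtered)
      for allowed in "$@"; do
        case "$host" in
          # Match the approved domain itself or any subdomain of it.
          "$allowed" | *."$allowed") return 0 ;;
        esac
      done
      return 1 ;;
    *)
      echo "unknown egress mode: $mode" >&2
      return 2 ;;
  esac
}

# Example with two made-up approved domains.
egress_allows filtered api.example.com example.com example.org && echo "api.example.com: allowed"
egress_allows filtered evil.test example.com example.org || echo "evil.test: blocked"
```

The point of the sketch is the default: in filtered mode a name that is not explicitly approved simply does not resolve, so nothing has to guess whether a given request looks dangerous.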
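
The scoped secrets goal has a rough host-side analogue in plain shell: start the child process from an empty environment and pass in only the variables it needs. The sketch below uses env -i for that. The variable names API_TOKEN and OTHER_SECRET are made up, and this says nothing about how nixbox itself hands credentials to the guest.

```sh
# Illustrative only: API_TOKEN and OTHER_SECRET are hypothetical host secrets.
export API_TOKEN="token-for-the-agent"
export OTHER_SECRET="must-not-leak"

# env -i starts the child with an empty environment; only PATH and
# API_TOKEN are passed through explicitly.
scoped_view="$(env -i PATH="$PATH" API_TOKEN="$API_TOKEN" \
  sh -c 'printf "%s / %s\n" "${API_TOKEN:-unset}" "${OTHER_SECRET:-unset}"')"

# For comparison, a child that inherits the full host environment.
inherited_view="$(sh -c 'printf "%s / %s\n" "${API_TOKEN:-unset}" "${OTHER_SECRET:-unset}"')"

echo "scoped:    $scoped_view"
echo "inherited: $inherited_view"
```

The scoped child sees the token it was given and nothing else, while the inheriting child sees every exported secret in the host shell. That is the difference between restricting what the agent does and restricting what it has.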