I spent 6 hours debugging a VM that didn't need to exist
The setup
The rabbit hole
The fix was 3 lines of thinking
The pattern
What I should have done
The takeaway

The setup

I'm trying to get Anthropic's Cowork feature running on Linux. Cowork is officially Mac-only. It works by booting a sandboxed Ubuntu VM inside Claude Desktop, then running Claude Code inside that VM so it can safely read and write your project files.

Claude Desktop for Linux exists as a community build: I maintain a deb package that repackages the official Windows MSIX for Ubuntu. The desktop app works fine. But Cowork doesn't start. The VM service socket never appears.

So I open Claude Code on my terminal and say: fix this.

The rabbit hole

What follows is roughly 6 hours of two problem-solvers, me and Claude, locked in a feedback loop where we're both hyperfixated on the same wrong question: how do we make this VM boot?

Here's an abbreviated tour of the damage:

Hour 1: We extract the Electron app's app.asar, find the minified JavaScript, locate the platform check. The Windows version uses a named pipe \\.\pipe\cowork-vm-service. We patch it to use a Unix socket. We write a custom Node.js service that listens on that socket and spawns QEMU. Claude Desktop connects. It sends startVM. QEMU boots. The guest agent connects. Progress.

Hour 2: The installSdk call loops forever. We make it return success locally. Startup completes in 2.5 seconds. Then the isProcessRunning keepalives time out because the guest agent isn't responding. We discover the guest's sdk-daemon binary uses vsock, not the virtio serial port we set up.

Hour 3: The sdk-daemon inside the VM can't find the "smol-bin" disk. It scans for NVMe devices. QEMU's NVMe emulation produces "bogus Namespace Identifiers" that the Ubuntu 22.04 kernel rejects. nvme1n1 never appears as a block device. We try uuid=auto, nguid=auto, eui64-default=on. None of it works.

Hour 4: We try SCSI instead of NVMe. The device appears as /dev/sda but the sdk-daemon only scans NVMe paths. We try virtio-blk. Same. We discover the "smol-bin" disk is actually just an empty ext4 sessions disk, not the binary distribution we thought.
We patch the rootfs with a fake tmpfs mount. The sdk-daemon gets past the smol-bin check.

Hour 5: New error: VirtioFS mount fails. We add virtiofsd and memory-backend-memfd to QEMU. The sdk-daemon gets further. It mounts the sessions disk. It starts a MITM proxy. Then: [rpc] connecting to host CID=2 port=51234 — connection refused. We write a Python vsock bridge. It connects.

Hour 6: The bridge works but the sdk-daemon crashes on restart because the fake mounts get torn down. We patch the rootfs systemd units. We patch the wrapper script. We fight with rootfs journal persistence.

We're deep in the weeds of a QEMU VM running Ubuntu 22.04 inside Ubuntu 26.04, bridging vsock through Python to Node.js through Unix sockets to Electron, trying to make a Go binary that was compiled for Hyper-V happy inside KVM.

The fix was 3 lines of thinking

Then I find this repo. The entire README of claude-cowork-linux by @johnzfitch can be summarized as:

Why does Cowork use a VM? Because it needs a Linux environment to run Claude Code. On macOS, that means spinning up an Ubuntu VM. On Windows, same thing: Hyper-V boots a Linux guest.

But you're already on Linux. So just... don't boot the VM.

Stub out the macOS native module. Run Claude Code directly on the host. Translate the VM paths to real paths. Done.

The install took 30 seconds. Cowork started immediately. The spawned Claude Code process ran for 23ms to confirm, then the full session worked.

Every single thing I spent 6 hours debugging (NVMe namespace identifiers, vsock port discovery, VirtioFS permissions, rootfs systemd patches, Python-to-Node socket bridges) was a consequence of not asking why the VM existed in the first place.

The pattern

This is a pattern I've seen before but never this clearly:

Hyperfixation on the error, not the cause. The error said "VM service not running." So we made the VM service run. Then the error said "guest agent not connected." So we connected the guest agent. Then "smol-bin device not found." So we tried to make the device appear.
Every error led to the next error, and we solved each one, and we were making "progress" the entire time. But the actual cause was upstream of all of it: the VM shouldn't exist on Linux in the first place. We never questioned the premise. We just tried to make the wrong thing work really, really well.

And here's the uncomfortable part: Claude Code was the perfect partner for this kind of mistake. It's incredible at "make X work" tasks. Give it an error, it'll fix it. Give it the next error, it'll fix that too. It will reverse-engineer a binary's vsock port by reading Go strings out of an ELF executable at 2 AM and not complain. It's a relentless debugger. But relentless debugging is exactly what you don't need when the whole approach is wrong. What you need is someone to say: wait, why are we doing this?

What I should have done

Before touching any code:

- Ask what the VM is for. It runs Linux so Claude Code has a sandbox. We're on Linux. We don't need the sandbox VM.
- Search for prior art. Someone already solved this. 30 seconds of GitHub search would have found it.
- Question the frame. The logs said "VM service not running" and I read that as "I need to make the VM service run." A better read: "this component assumes a VM is needed. Is it?"

The takeaway

AI coding assistants are force multipliers. If you're pointed in the right direction, they're unbelievable. If you're pointed in the wrong direction, they'll help you dig the most elaborate, well-engineered hole you've ever seen.

The fix for Cowork on Linux was a JavaScript stub and a path translation layer. We built a QEMU VM orchestrator with NVMe disk emulation, vsock bridging, VirtioFS shared filesystems, and rootfs systemd patching.

Next time the debugging gets complicated, I'm going to stop and ask: am I solving the right problem, or am I just solving the next error?

Credit to @johnzfitch for claude-cowork-linux, which solved in an afternoon what I couldn't solve in a night.

The deb package with Cowork support is at johnohhh1/claude-desktop-ubuntu.
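
To make the "translate the VM paths to real paths" part of the fix concrete, here is a minimal sketch of what a path translation layer might look like. It is written in Python for brevity, while the actual fix in claude-cowork-linux is described as a JavaScript stub, and I haven't copied anything from that project. The function name translate_vm_path and every path in VM_TO_HOST are made up for illustration; the real prefixes Cowork uses aren't covered here.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from guest (VM) mount points to host directories.
# These prefixes are invented for the example, not taken from Cowork.
VM_TO_HOST = {
    "/sessions": "/home/user/.local/share/cowork/sessions",
    "/mnt/project": "/home/user/projects/demo",
}


def translate_vm_path(vm_path: str, mapping: dict[str, str] = VM_TO_HOST) -> str:
    """Rewrite a path that assumes the VM's filesystem into a host path.

    The longest matching prefix wins, and paths outside every known
    prefix are returned unchanged.
    """
    path = PurePosixPath(vm_path)
    # Try longer prefixes first so nested mount points map correctly.
    for prefix in sorted(mapping, key=len, reverse=True):
        try:
            # relative_to compares whole path components, so
            # "/mnt/projectx" does not match the "/mnt/project" prefix.
            relative = path.relative_to(prefix)
        except ValueError:
            continue  # vm_path is not under this prefix
        return str(PurePosixPath(mapping[prefix]) / relative)
    return vm_path
```

With this mapping, translate_vm_path("/mnt/project/src/main.py") gives "/home/user/projects/demo/src/main.py", and a path like "/etc/hosts" passes through untouched. Compare that with the QEMU orchestrator: the whole idea fits in one function once the VM is out of the picture.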