Tools: Why File Systems Are Hard to Debug (2026)
I’m building a file system from scratch. Not because I need one, but because debugging what I can’t see is guesswork. Understanding this at the file system level is my first step toward kernel-level observability with eBPF.

Most file systems work fine, until they don’t. When something slows down or behaves unexpectedly, you don’t really know why. You just see symptoms: high disk usage, latency spikes, random slowdowns. What you can’t tell is:

- which file caused the issue
- which process triggered it
- where the latency actually happened

The problem is simple. The file system is a black box. You can monitor CPU. You can track memory. You can inspect processes. But what actually happens inside the file system, between a read, a write, and the disk, is mostly invisible. That’s where things break. Debugging turns into guessing. And that’s not a tooling problem. It’s a visibility problem.

So instead of just studying file systems, I decided to build one. Not for performance. Not for production. The aim is a file system that can:

- track every operation
- measure latency
- connect file activity to what caused it

In short: make the file system explain itself.

This is where I start. This is part of a larger series where I’ll be building useful low-level system tools from scratch, step by step, as I work toward understanding how an operating system really comes together. The file system is just the beginning.

In this series, I’ll explore:

- how data is stored and managed
- how processes interact with the system
- how system behavior can be observed and debugged
- how to make these internals visible instead of opaque

The goal isn’t to build a production-ready OS. The goal is to understand systems deeply and make them observable. Along the way, I’ll connect these ideas to kernel-level observability using eBPF.

Next: starting with the disk layer.
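
To make "track every operation, measure latency, connect file activity to what caused it" a bit more concrete, here is a minimal user-space sketch in Python. It is not the file system from this series and not an eBPF program; it only wraps the standard `os.open`, `os.read`, `os.write`, and `os.close` calls. The names `OpRecord`, `TracedFile`, and `LOG` are hypothetical, made up purely for illustration.

```python
import os
import tempfile
import time
from dataclasses import dataclass


@dataclass
class OpRecord:
    """One traced file operation (hypothetical record layout)."""
    op: str           # "open", "read", "write", or "close"
    path: str         # file the operation touched
    pid: int          # process that issued the call
    nbytes: int       # bytes transferred (0 for open/close)
    latency_s: float  # wall-clock time spent inside the call


LOG: list = []  # hypothetical in-memory trace buffer


class TracedFile:
    """Wraps os-level file calls and appends one OpRecord per call."""

    def __init__(self, path: str, flags: int, mode: int = 0o644):
        self.path = path
        self.fd = self._timed("open", lambda: os.open(path, flags, mode))

    def _timed(self, op, call, size_of=lambda result: 0):
        # Time the underlying call and log who did what, to which file.
        start = time.perf_counter()
        result = call()
        elapsed = time.perf_counter() - start
        LOG.append(OpRecord(op, self.path, os.getpid(), size_of(result), elapsed))
        return result

    def write(self, data: bytes) -> int:
        # os.write returns the number of bytes actually written.
        return self._timed("write", lambda: os.write(self.fd, data),
                           size_of=lambda n: n)

    def read(self, n: int) -> bytes:
        # os.read returns the bytes read, so its length is the transfer size.
        return self._timed("read", lambda: os.read(self.fd, n), size_of=len)

    def close(self) -> None:
        self._timed("close", lambda: os.close(self.fd))


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    f = TracedFile(path, os.O_RDWR | os.O_CREAT)
    f.write(b"hello")
    os.lseek(f.fd, 0, os.SEEK_SET)  # rewind (not traced in this sketch)
    f.read(5)
    f.close()
    for rec in LOG:
        print(f"{rec.op:5} pid={rec.pid} bytes={rec.nbytes} "
              f"{rec.latency_s * 1e6:.1f}us {rec.path}")
```

Each log line answers the three questions from earlier: which file, which process, and how long the call took. The obvious limitation is that a wrapper like this only sees calls made through it, which is exactly why the same idea is more useful inside the file system itself or at the kernel level.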