Tools: Report: Disk Has Space But Can't Create Files? (Linux Inode Exhaustion)

The Setup

The Investigation

The Key Insight

The Actual Diagnosis

The Fix

Prevention

What Interviewers Look For

One of the most confusing Linux errors I've debugged: a production server reporting "No space left on device" while df -h clearly showed 50GB free. I lost an hour to it the first time. Here's what was actually going on.

I turned scenarios like this into an interactive practice tool at scenar.site - you debug simulated servers by talking to an AI interviewer. More at the end.

The Setup

I was on-call for a logging pipeline. Rsyslog kept crashing, and the logs were full of this:

```
rsyslog[8421]: cannot create '/var/log/syslog.1': No space left on device
systemd[1]: rsyslog.service: Main process exited, code=exited, status=1/FAILURE
```

The Investigation

First instinct: the disk is full. Easy fix, right?

```
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       100G   45G   50G  48% /
/dev/sda2        20G  8.0G   11G  42% /var/log
```

Wait, what? 50GB free on root, 11GB free on /var/log. The disk isn't full. But the error clearly said "No space left on device". So what's going on?

This is the moment where a lot of people (including past me) start doing random things: restarting services, clearing caches, rebooting the machine. None of it works.

The Key Insight

A Linux filesystem tracks two resources, not one:

- Disk space (what df -h shows) - how many bytes are used
- Inodes - how many files can exist

Every file, directory, and symlink on the filesystem consumes exactly one inode. When you run out of inodes, you can't create new files even if you have terabytes of free space. The kernel returns ENOSPC, which userspace reports as "No space left on device" - the same error as actually being out of space. That's where the confusion comes from.

```
$ df -i
Filesystem      Inodes   IUsed IFree IUse% Mounted on
/dev/sda1      6553600 6553598     2  100% /
/dev/sda2      1310720 1310718     2  100% /var/log
```

There it is. 100% inodes used. Effectively zero free. The filesystem literally cannot create another file.

Now: where are all these inodes going?

The Actual Diagnosis

Inode exhaustion almost always means "a lot of small files in one place". Time to find them:

```
$ find /var/log -type f | wc -l
1310715
```

Over a million files in /var/log. That's the culprit. Let me see what they look like:

```
$ ls /var/log/ | head
session_000001.log
session_000002.log
session_000003.log
session_000004.log
session_000005.log
...
```

Session logs. Let me check the sizes:

```
$ find /var/log -name 'session_*' -printf '%s\n' | sort -u
0
```

Every single one is 0 bytes. Over a million empty files. Someone wrote a debug script, forgot to clean up, and it's been creating empty session logs for months. Each file takes 0 bytes of disk space but consumes exactly one inode.

The Fix

Delete the empty files. Don't do it with rm and a glob - the argument list will be too long. Use find:

```
$ find /var/log -type f -name 'session_*' -delete
```

This took about 30 seconds on that machine. Then verify:

```
$ df -i
Filesystem      Inodes IUsed   IFree IUse% Mounted on
/dev/sda1      6553600  2883 6550717    1% /
/dev/sda2      1310720  1003 1309717    1% /var/log

$ systemctl restart rsyslog
$ systemctl status rsyslog
   Active: active (running)
```

Prevention

A few things I put in place after this:

- Monitor inode usage, not just disk space. Most monitoring setups check df -h but forget df -i. Add an alert at 85% inode usage.
- Set up logrotate for any directory that accumulates log files. The default logrotate config handles most system logs but custom paths need their own config.
- Code review any script that creates files in production. The script that caused this was "just a debug helper" that was never removed.
- Use find ... -delete for cleanup, not rm with glob patterns. Glob expansion will hit the ARG_MAX limit with millions of files.

What Interviewers Look For

If this comes up in an SRE interview, the interviewer isn't just checking if you know df -i. They want to see:

- Do you check the actual error message carefully? ("No space left" has two possible causes)
- Do you form a hypothesis before running commands? (Running df -h, df -i, find, each answering a specific question)
- Can you explain the underlying concept? (inodes as a separate resource)
- Do you think about prevention, not just the immediate fix?

Practice This Interactively

I built scenar.site to practice exactly these kinds of scenarios. You describe your debugging approach in plain English, an AI simulates a broken server and returns realistic command output, and tracks your reasoning. This scenario is one of 18 built-in ones. Free tier gets you started, no credit card.
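The "alert at 85% inode usage" idea from the prevention list can be sketched as a small shell filter. This is a minimal sketch, not the monitoring I actually run: the function name inode_alerts is made up for illustration, and it assumes GNU-style df -i output with six columns (IUse% in column 5, the mount point in column 6, and no spaces in mount paths). It reads from stdin, so you can try it against saved output before wiring it into cron or a monitoring agent.

```shell
#!/bin/sh
# Hypothetical helper (name and column layout are assumptions, see above):
# print every mount point whose inode usage is at or above a threshold.
# Reads `df -i`-style output on stdin.
inode_alerts() {
    threshold="${1:-85}"                 # default threshold: 85%
    awk -v limit="$threshold" '
        NR == 1 { next }                 # skip the header row
        {
            pct = $5
            sub(/%$/, "", pct)           # "100%" -> "100"
            # Some filesystems report "-" instead of a percentage; skip those.
            if (pct ~ /^[0-9]+$/ && pct + 0 >= limit + 0)
                printf "%s %s%%\n", $6, pct
        }
    '
}

# Typical use (prints nothing when every mount is below the threshold):
#   df -i | inode_alerts 85
```

Any output means a filesystem needs attention, so a cron wrapper could mail or page whenever the result is non-empty.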