Tools: I Spent a Weekend Hunting Through Linux's File System and Here's What I Found - Guide

Before We Start: The Map You Need

1: /etc/passwd Was Never Really About Passwords

2: /etc/shadow Is Where the Real Secrets Live

3: /etc/hosts Is the Original DNS

4: /etc/resolv.conf Is Your Gateway to the Internet

5: /proc Is Not a Real Directory

6: /proc/[PID] Lets You Inspect Any Running Process

7: /proc/net/route Shows Your Routing Table as a File

8: /dev/null, /dev/zero, and /dev/urandom Are Not Real Files Either

9: /etc/fstab Decides What Gets Mounted at Boot

10: /etc/systemd/system/ Is Where Services Live

11: /var/log/auth.log Is a Security Timeline

12: /etc/sudoers Controls Who Can Do What as Root

The Bigger Picture

Most people learn Linux by memorizing commands: ls, cd, mkdir. That barely scratches the surface of what's going on underneath. The more you dig, the more you realize the OS is largely a structured collection of files that configure and expose almost everything. I decided to stop running commands blindly and go hunting, and what I found was genuinely interesting.

This post covers my top discoveries from exploring the Linux file system, the things I wish someone had explained to me when I was starting out. It's not a command list, just a deep dive into what's really going on.

Before We Start: The Map You Need

Here's how the top-level Linux file system is roughly laid out.

[Figure: top-level layout of the Linux file system]

1: /etc/passwd Was Never Really About Passwords

When you first hear "the password file," you assume it holds passwords. The name is right there. But run cat /etc/passwd on any modern Linux machine and you'll see one line per user, something like this:

john:x:1001:1001:John Doe:/home/john:/bin/bash

That x in the second field? That's where the password hash used to live on early Unix systems, in a file every user could read. Which is terrifying. System designers eventually realized that world-readable password hashes were a bad idea, since anyone could copy them and crack them offline, so the hashes moved into /etc/shadow, which only root (and, on some distributions, a dedicated shadow group) can read.

The seven colon-separated fields in /etc/passwd are: username, password placeholder, UID (user ID), GID (group ID), comment (usually the full name), home directory, and default shell. This file is the backbone of user identity on Linux. Every time you log in, programs such as login and sshd look up your entry to figure out who you are, where your home folder is, and which shell to launch.

What I found interesting: system accounts like www-data, nobody, and daemon are also listed here. These are not human users. They're service accounts created so that web servers and background processes don't have to run as root. It's a simple but clever way to limit damage if something gets compromised.

2: /etc/shadow Is Where the Real Secrets Live

With root privileges, run sudo cat /etc/shadow.
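Each record in that file is a single colon-separated line. As a minimal sketch, the snippet below splits one record without touching the real file. The sample string is the placeholder entry from this post's examples (not a real hash), and describe_shadow_entry and hash_scheme are hypothetical helper names chosen for illustration.

```shell
#!/bin/sh
# Placeholder entry from this post's examples; not a real hash.
sample='john:$6$randomsalt$longhashstring...:19800:0:99999:7:::'

# hash_scheme ID (hypothetical helper): map a crypt prefix id to a scheme name.
hash_scheme() {
    case "$1" in
        1) echo "MD5-crypt" ;;
        5) echo "SHA-256-crypt" ;;
        6) echo "SHA-512-crypt" ;;
        y) echo "yescrypt" ;;
        *) echo "unknown" ;;
    esac
}

# describe_shadow_entry RECORD (hypothetical helper): print the main fields of one record.
describe_shadow_entry() {
    # Split on ":"; the last three fields (inactive, expire, reserved) land in "rest".
    IFS=: read -r user hash last_change min_days max_days warn_days rest <<EOF
$1
EOF
    # The hash field looks like $id$salt$hash, so "$"-separated field 2 is the id.
    id=$(printf '%s' "$hash" | cut -d'$' -f2)
    salt=$(printf '%s' "$hash" | cut -d'$' -f3)
    echo "user=$user"
    echo "scheme=$(hash_scheme "$id")"
    echo "salt=$salt"
    echo "last_change_day=$last_change"
    echo "min_days=$min_days max_days=$max_days warn_days=$warn_days"
}

describe_shadow_entry "$sample"
```

Working on a sample string keeps the example runnable without root; pointing the same function at real records would require reading /etc/shadow with elevated privileges.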
Each line looks something like:

john:$6$randomsalt$longhashstring...:19800:0:99999:7:::

Breaking that down: the second field starts with $6$, which identifies the SHA-512 crypt scheme (older systems used $1$ for MD5-crypt, now considered weak, and many current distributions default to yescrypt, marked $y$). After that come the salt and then the hash itself. The numbers that follow are the date of the last password change (counted in days since 1 January 1970), the minimum days before a change is allowed, the maximum days before a change is forced, and the warning days before expiry. The trailing empty fields hold the inactivity period, the account expiry date, and a reserved field.

This file exists because /etc/passwd has to be readable by everyone (programs need to look up usernames), but password hashes obviously shouldn't be. So Linux splits the concern: public info in one file, secret info in another, with strict permissions on the second one.

The insight here is that Linux uses file permissions as a security boundary. There is no extra encryption layer around the file, just strict ownership rules. Simple, elegant, effective.

3: /etc/hosts Is the Original DNS

Before DNS existed, hostnames on the early internet were resolved from a manually maintained table that was copied to every machine. On Unix systems that table was /etc/hosts. Today we have massive distributed DNS infrastructure, but this file still exists and, with the default configuration on most systems, is still consulted before DNS.

Add an entry such as 192.168.1.50 dev-server.local, run ping dev-server.local, and it works instantly, with no DNS query needed. Developers use this trick all the time to redirect domains locally during development. Security researchers use it to block known malicious domains by pointing them to 127.0.0.1.

The order in which Linux resolves names is controlled by another file, /etc/nsswitch.conf. Look for the line:

hosts: files dns

"files" means check /etc/hosts first. "dns" means then ask the DNS server. Change that order and you change how your entire system resolves hostnames. That one line has enormous consequences.

4: /etc/resolv.conf Is Your Gateway to the Internet

When /etc/hosts doesn't have an answer, Linux goes to a DNS resolver. The configuration for which DNS server to use lives in /etc/resolv.conf. The example in the code listings has three lines: nameserver 8.8.8.8, nameserver 1.1.1.1, and search local.lan.

Simple file, massive implications. The nameserver lines tell the system where to send DNS queries, tried in the order listed.
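As a small illustration of how simple that format is, here is a sketch that pulls the resolver addresses out of a resolv.conf-style file with awk. The function name list_nameservers is a hypothetical helper, and the demo runs on a temporary copy of the sample configuration rather than on the real /etc/resolv.conf.

```shell
#!/bin/sh
# list_nameservers [FILE] (hypothetical helper): print the address on each
# "nameserver" line, in file order. Defaults to /etc/resolv.conf.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Demo on a temporary copy of the sample config, so the real file is left alone.
sample=$(mktemp)
printf '%s\n' 'nameserver 8.8.8.8' 'nameserver 1.1.1.1' 'search local.lan' > "$sample"
list_nameservers "$sample"
rm -f "$sample"
```

Calling list_nameservers with no argument reads the live file, which on systemd-resolved machines will usually show a local stub address instead of the upstream servers.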
The search line appends a domain suffix to short hostnames, so typing ssh webserver might actually try webserver.local.lan first.

On modern systems using systemd-resolved or NetworkManager, this file is often a symlink. Running ls -la /etc/resolv.conf may show it pointing to /run/systemd/resolve/stub-resolv.conf. That means it's dynamically managed: the actual DNS logic lives elsewhere, in systemd-resolved, and the file is just a compatibility shim. I found this genuinely surprising. A file I assumed was simple and static is actually a managed symlink pointing to a runtime-generated config.

5: /proc Is Not a Real Directory

This one blew my mind a little. The /proc directory looks like a folder full of files, but none of those files exist on your disk. /proc is a virtual filesystem, generated on the fly by the Linux kernel every time you read from it.

Run cat /proc/cpuinfo and you'll get detailed info about your CPU: model name, cores, cache size, and flags showing which features it supports. That data isn't stored anywhere. The kernel just synthesizes it when you ask.

Same with /proc/meminfo, which gives you a live breakdown of your memory usage: MemTotal, MemFree, Cached, SwapTotal, SwapFree. Tools like free -h and htop read this file and format the output nicely.

The design philosophy here is beautiful: instead of a special system call for every piece of system information, Linux exposes much of it as files. Want CPU info? Read a file. Want memory stats? Read a file. This makes it easy to access kernel internals from any language that can open a file.

6: /proc/[PID] Lets You Inspect Any Running Process

Every running process on your system has a numbered folder inside /proc. Find your shell's PID with echo $$, then explore that folder with ls /proc/$$/. You'll see entries such as cmdline, environ, fd/, maps, and status.

The fd/ folder is particularly useful. Every file the process has open (log files, network sockets, config files) shows up here as a symlink. This is largely how tools like lsof work on Linux: they walk through /proc/[PID]/fd/ for every process and report what they find.
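That walk is easy to reproduce for a single process. The following is a minimal sketch, assuming a Linux system with /proc mounted and the readlink utility available; list_open_files is a hypothetical helper name, and real tools such as lsof do considerably more than this.

```shell
#!/bin/sh
# list_open_files [PID] (hypothetical helper): print "fd -> target" for every
# descriptor the process has open. Defaults to the current shell.
list_open_files() {
    pid=${1:-$$}
    for fd in /proc/"$pid"/fd/*; do
        # A descriptor can close between the directory listing and readlink;
        # skip anything that can no longer be resolved.
        target=$(readlink "$fd") || continue
        printf '%s -> %s\n' "${fd##*/}" "$target"
    done
}

# Inspect the shell running this script.
list_open_files "$$"
```

Inspecting your own processes needs no special privileges; looking at processes owned by other users generally requires root.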
One practical insight: if a program deletes a file that another process still has open, the data isn't gone yet. It stays on disk until the last file descriptor is closed. You can even recover the data by copying from /proc/[PID]/fd/[number] while the process is still running. That's a useful recovery trick.

7: /proc/net/route Shows Your Routing Table as a File

Most people use ip route or netstat -r to see the routing table. The legacy netstat -r and route commands read /proc/net/route; ip route asks the kernel over a netlink socket instead, but it reports the same IPv4 table. Open the raw file (see the example in the code listings) and you get a header row followed by one line per route.

The values are hexadecimal and stored in the machine's native byte order, so on a little-endian machine (x86, most ARM) the bytes appear reversed, which looks confusing at first. But once you decode them it's straightforward: 0101A8C0 split into bytes is 01 01 A8 C0; reversed, that is C0 A8 01 01, which in decimal is 192.168.1.1. That's your default gateway. Flags 0003 combines 0x1 (the route is up) and 0x2 (the route goes through a gateway).

This is the actual kernel routing table. Every IPv4 packet your machine sends is checked against it to decide where to go next. Understanding this means you understand why changing your default gateway works the way it does, why VPNs can reroute all traffic (they add routes more specific than the default one), and why split tunneling is possible (some traffic goes through the VPN and some doesn't, based on which route is more specific).

8: /dev/null, /dev/zero, and /dev/urandom Are Not Real Files Either

The /dev directory is full of device files that let user-space programs interact with hardware or kernel features. Many are interfaces to physical devices, but three stand out as being almost philosophical.

/dev/null is a black hole. Anything written to it disappears, and any read from it returns end-of-file immediately. You'll see it everywhere, for example command 2>/dev/null to silence error output.

/dev/zero is an infinite source of null bytes. Read from it and you get zeros forever. It's used to create empty files of a specific size, wipe disks, or pre-allocate space.

/dev/urandom is an infinite source of random bytes, produced by the kernel's cryptographic random number generator, which is seeded from an entropy pool (keyboard timings, disk interrupts, network packets, and so on). It's how secure random numbers are generated on Linux. When your system generates an SSH key or your browser generates a TLS session key, the randomness eventually traces back to this generator, whether it is read through /dev/urandom or through the getrandom() system call.
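All three devices can be exercised with ordinary file tools. The sketch below assumes a Linux system where head supports the -c byte-count option (GNU coreutils and BusyBox both do); make_zero_file and random_hex are hypothetical helper names.

```shell
#!/bin/sh
# Writes to /dev/null succeed and the data is simply dropped.
echo "this goes nowhere" > /dev/null

# make_zero_file PATH BYTES (hypothetical helper): create a file of BYTES zero bytes.
make_zero_file() {
    head -c "$2" /dev/zero > "$1"
}

# random_hex N (hypothetical helper): print N random bytes as 2*N hex digits.
random_hex() {
    head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

tmpfile=$(mktemp)
make_zero_file "$tmpfile" 1024
wc -c < "$tmpfile"      # size of the zero-filled file, in bytes
random_hex 16           # a different 32-digit string on every run
rm -f "$tmpfile"
```

Because both sources are endless, the reader decides how much to take; head -c does that here by stopping after the requested number of bytes.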
Representing these as files means any program can use them without needing special APIs. A shell script can generate random data just as easily as a compiled C program.

9: /etc/fstab Decides What Gets Mounted at Boot

Every time Linux boots, /etc/fstab tells it which filesystems to mount and where (the code listings include a four-line example). Each line has six fields: device, mount point, filesystem type, mount options, dump flag, and fsck order.

Using UUIDs instead of device names like /dev/sda1 is a modern improvement. Device names can change if you add a new disk, but a UUID stays with the filesystem.

The last line of the example is interesting: tmpfs on /tmp means your temp folder is held in memory (RAM, spilling into swap under pressure), not on disk. Everything in /tmp is gone on reboot, and reads and writes happen at memory speed. Many systems don't do this by default, but it can be a performance and privacy improvement when you add it.

If you mess up /etc/fstab, the system can fail to boot. It's one of those files that has enormous power with very little protection against mistakes.

10: /etc/systemd/system/ Is Where Services Live

On modern Linux systems using systemd, services are defined as unit files. The ones shipped by packages live in /lib/systemd/system/ (or /usr/lib/systemd/system/, depending on the distribution), but anything in /etc/systemd/system/ takes priority and overrides them. This is where you put custom services or modifications to existing ones.

Open any .service file, like ssh.service, and you'll find it surprisingly readable: a [Unit] section, a [Service] section, and an [Install] section, each a list of Key=Value lines.

The After= directive controls ordering: sshd is started only after network.target and auditd.service, if those are being started at all. Requires= and Wants= are what actually pull other units in. systemd reads the unit files, builds a dependency graph, and starts services in parallel wherever those directives allow. This is a large part of why systemd boots faster than older, sequential init systems.

Restart=on-failure is particularly useful. If your web server crashes, systemd restarts it automatically. No separate watchdog process is needed to babysit it.

11: /var/log/auth.log Is a Security Timeline

If you want to know what's been happening on your machine security-wise, read /var/log/auth.log (or /var/log/secure on Red Hat-based systems). Every login attempt, sudo usage, SSH connection, and authentication failure is logged here.
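A quick way to summarize the failures is to count failed SSH password attempts per source address. The sketch below assumes OpenSSH's usual "Failed password for USER from ADDRESS port N" message and picks the word after "from" rather than a fixed column, since the column shifts on "invalid user" lines. The function name count_failed_by_ip and the sample log lines (using documentation-only addresses) are made up for illustration.

```shell
#!/bin/sh
# count_failed_by_ip LOGFILE (hypothetical helper): failed SSH password
# attempts per source address, most frequent first.
count_failed_by_ip() {
    grep "Failed password" "$1" |
        awk '{ for (i = 1; i < NF; i++) if ($i == "from") { print $(i + 1); break } }' |
        sort | uniq -c | sort -rn
}

# Demo on invented sample lines; on a real machine, pass /var/log/auth.log
# (reading it usually requires root).
log=$(mktemp)
cat > "$log" <<'EOF'
Mar 18 10:00:01 host sshd[101]: Failed password for root from 203.0.113.5 port 4242 ssh2
Mar 18 10:00:03 host sshd[102]: Failed password for invalid user admin from 203.0.113.5 port 4243 ssh2
Mar 18 10:00:09 host sshd[103]: Failed password for root from 198.51.100.7 port 5151 ssh2
Mar 18 10:01:00 host sshd[104]: Accepted password for john from 192.0.2.10 port 6000 ssh2
EOF
count_failed_by_ip "$log"
rm -f "$log"
```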
That command (the grep, awk, sort, and uniq pipeline in the code listings) pulls out every failed password attempt, groups by IP address, and shows you who's been hammering your machine. One caveat: awk's field 11 is the source address only with the traditional syslog timestamp and an existing username; on "invalid user" lines it holds the attempted username instead. On any internet-facing server, you'll see hundreds or thousands of attempts from random IPs. This is constant and automated: bots scanning the entire internet looking for weak passwords.

Looking at the timestamps and patterns in this log teaches you more about real-world security threats than almost any textbook. You can see brute force patterns (same IP, rapid attempts), credential stuffing (many different usernames tried), and distributed attacks (same pattern from many IPs). Tools like fail2ban can watch exactly this file and auto-block attacking IPs.

12: /etc/sudoers Controls Who Can Do What as Root

Running sudo visudo (the safe way to edit this file) reveals the rules governing privilege escalation. The defaults you'll typically see on Debian and Ubuntu are:

root ALL=(ALL:ALL) ALL
%sudo ALL=(ALL:ALL) ALL

This means user root can run all commands on all hosts as any user and group. Members of the sudo group (the % prefix means group) can do the same. But you can get much more specific:

john ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart nginx

This lets john restart nginx without a password, and grants nothing else. This principle of least privilege is a core security concept: give people only the permissions they actually need. If john's account gets compromised, the attacker can restart nginx but can't do much else.

The file uses a strict syntax, and if you corrupt it, you can lock yourself out of sudo entirely. That's why visudo exists: it locks the file, opens it in your editor, and validates the syntax before saving. It's one of those files where Linux strongly steers you toward a specific tool to protect you from yourself.

The Bigger Picture

What surprised me most wasn't any single file. It was the consistent philosophy running through all of it. Linux treats nearly everything as a file: hardware, processes, random number generators, network state, running services. This "everything is a file" design means that the same tools (reading, writing, piping, permissions) work across completely different concerns.
A web developer reading a text file and a tool reading kernel network state are using the same fundamental operation. That's why Linux is so composable. You can pipe /dev/urandom through a text processor, read live CPU stats with a simple cat, or redirect a program's output into nothing using a file called null. The whole system is built on one unifying abstraction, and once that clicks, Linux stops feeling like a collection of magic commands and starts feeling like a coherent, logical system you can actually reason about.

Go hunt around in these directories yourself. You'll find things that aren't in this post. That's kind of the whole point.

Hope you found this helpful! If you spot any mistakes or have suggestions, let me know. You can find me on LinkedIn and X, where I post more about web development.

Code listings, by section

Section 1, sample /etc/passwd entries:

```
root:x:0:0:root:/root:/bin/bash
john:x:1001:1001:John Doe:/home/john:/bin/bash
```

Section 2, sample /etc/shadow entry:

```
john:$6$randomsalt$longhashstring...:19800:0:99999:7:::
```

Section 3, sample /etc/hosts:

```
127.0.0.1 localhost
127.0.1.1 mymachine
192.168.1.50 dev-server.local
```

Section 3, the hosts line in /etc/nsswitch.conf:

```
hosts: files dns
```

Section 4, sample /etc/resolv.conf:

```
nameserver 8.8.8.8
nameserver 1.1.1.1
search local.lan
```

Section 4, /etc/resolv.conf as a symlink:

```
ls -la /etc/resolv.conf
# → /etc/resolv.conf -> /run/systemd/resolve/stub-resolv.conf
```

Section 5, reading CPU info:

```
cat /proc/cpuinfo
```

Section 6, listing your shell's entry in /proc:

```
ls /proc/$$/
```

Section 7, sample /proc/net/route:

```
Iface  Destination  Gateway   Flags  RefCnt  Use  Metric  Mask      MTU  Window  IRTT
eth0   00000000     0101A8C0  0003   0       0    100     00000000  0    0       0
```

Section 8, discarding error output:

```
command 2>/dev/null   # silence error output
```

Section 9, sample /etc/fstab:

```
UUID=abc123  /      ext4   defaults           0  1
UUID=def456  /home  ext4   defaults           0  2
UUID=ghi789  swap   swap   sw                 0  0
tmpfs        /tmp   tmpfs  mode=1777,size=2G  0  0
```

Section 10, ssh.service:

```
[Unit]
Description=OpenBSD Secure Shell server
After=network.target auditd.service

[Service]
ExecStart=/usr/sbin/sshd -D
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Section 11, failed password attempts by IP:

```
sudo grep "Failed password" /var/log/auth.log | awk '{print $11}' | sort | uniq -c | sort -rn | head
```

Section 12, default sudoers rules:

```
root ALL=(ALL:ALL) ALL
%sudo ALL=(ALL:ALL) ALL
```

Section 12, a narrow sudoers rule:

```
john ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart nginx
```

Section 6, entries you'll see in /proc/[PID]/:

- cmdline: the exact command that launched this process
- environ: all environment variables it was started with
- fd/: a folder containing symlinks to every open file descriptor
- maps: the memory map, showing which libraries and files are loaded into memory
- status: current state, memory usage, which user it runs as
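The entries in that list can be read like any other file. Here is a minimal sketch, assuming a Linux system with /proc mounted, that pulls single fields out of status and prints cmdline in readable form; proc_field and proc_cmdline are hypothetical helper names.

```shell
#!/bin/sh
# proc_field PID KEY (hypothetical helper): print the value of one "KEY:" line
# from /proc/PID/status, for example Name, State, Pid, or VmRSS.
proc_field() {
    awk -v key="$2:" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); print; exit }' "/proc/$1/status"
}

# proc_cmdline PID (hypothetical helper): cmdline separates arguments with NUL
# bytes, so turn them into spaces before printing.
proc_cmdline() {
    tr '\0' ' ' < "/proc/$1/cmdline"
    echo
}

# Inspect the shell running this script.
proc_field "$$" Name
proc_field "$$" State
proc_cmdline "$$"
```

The environ entry uses the same NUL-separated layout as cmdline, so the tr approach carries over to it, replacing '\0' with '\n' to get one variable per line.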