## 1. Netcat (nc): The Swiss Army Knife
## Core Modes
## 2. Tcpdump: The CLI Microscope
## Key Flags to Memorize
## The Filter Syntax (BPF)
## 3. Dig (dig): The DNS Scalpel
## Understanding the Output
## Power User Commands
## 4. Nmap: The Cartographer
## Scan Types
## 5. Debugging: Latency vs. Bandwidth
## A. Latency (The "Distance")
## B. Bandwidth (The "Width")
## C. The Hidden Trap: Throughput & Window Size
## Step 1: The "Is it Alive?" Check (Layer 3 - Network)
## Step 2: The "Address Book" Check (Layer 7 - DNS)
## Step 3: The "Is the Door Open?" Check (Layer 4 - Transport)
## Step 4: The "Deep Dive" (Packet Analysis)
## Summary Checklist

Netcat reads and writes data across network connections using TCP or UDP. It is the rawest form of network communication.

When you can't use Wireshark (because there is no GUI), you use tcpdump. It captures packets directly from the kernel. Its filters use the same BPF syntax as Wireshark's capture filters (Wireshark's display filters are a separate language).

nslookup is the older tool and shows less detail. dig (Domain Information Groper) is the usual choice because it shows the exact query and response structure. Running dig google.com gives you:

Nmap scans a network to map "live" hosts and open ports. It works by sending packets and analyzing the subtle differences in responses.

In DevOps, "The network is slow" is a vague complaint. You must distinguish between two completely different bottlenecks.

You can have huge Bandwidth (10Gbps) and low Throughput if Latency is high.

Here is a Real-World Troubleshooting Cheat Sheet. The Scenario:
You are a DevOps Engineer. A developer complains: "The Web App can't connect to the Database (PostgreSQL), or it's extremely slow." Your Mission: Isolate the root cause using the tools we just discussed.

Step 1. Goal: Determine if the Database server is reachable network-wise. Tool: mtr (or ping). Run the mtr command from the Web Server.

Step 2. Goal: Ensure the application is trying to connect to the correct IP address.

Step 3. Goal: The server is up, and the IP is right. Is the Database software listening on Port 5432, or is a Firewall blocking us? Tool: nc (Netcat) or telnet.

Step 4. Goal: The connection is "flaky" or "slow," but netcat works intermittently. We need to see the handshake. Tool: tcpdump. Run the tcpdump command on the Web Server while triggering the database connection.

COMMAND_BLOCK:
nc -vz 192.168.1.5 80
# -v: Verbose (tells you what happened)
# -z: Zero-I/O mode (scans for listening daemons, doesn't send data)
COMMAND_BLOCK:
# On Server B (Receiver):
nc -l 9090
# -l: Listen mode
COMMAND_BLOCK:
# Receiver:
nc -l 9090 > received_file.txt
# Sender:
nc [Receiver_IP] 9090 < original_file.txt
COMMAND_BLOCK:
# Capture only traffic from a specific IP on port 80
sudo tcpdump -i eth0 -n src 192.168.1.5 and dst port 80
# Capture everything EXCEPT SSH (so your own SSH session doesn't flood the output)
sudo tcpdump -i eth0 not port 22
CODE_BLOCK:
dig +trace google.com
CODE_BLOCK:
dig +short google.com
CODE_BLOCK:
dig @8.8.8.8 google.com
COMMAND_BLOCK:
# The "Aggressive" Scan (OS detection, Version detection, Script scanning, Traceroute)
nmap -A 192.168.1.5
COMMAND_BLOCK:
# Server side
iperf3 -s
# Client side
iperf3 -c [Server_IP]
CODE_BLOCK:
mtr -r -c 10 db.prod.internal
CODE_BLOCK:
dig +short db.prod.internal
CODE_BLOCK:
nc -zv 10.0.1.50 5432
COMMAND_BLOCK:
# Capture traffic to the DB IP on port 5432, don't resolve names (-n)
sudo tcpdump -i eth0 -n host 10.0.1.50 and port 5432
COMMAND_BLOCK:
12:01:01 IP WebServer > DBServer: Flags [S], seq 123...
12:01:02 IP WebServer > DBServer: Flags [S], seq 123... (Retransmission)
12:01:04 IP WebServer > DBServer: Flags [S], seq 123... (Retransmission)
COMMAND_BLOCK:
12:01:01 IP WebServer > DBServer: Flags [S]
12:01:01 IP DBServer > WebServer: Flags [R.], seq 0
COMMAND_BLOCK:
12:01:01 IP DBServer > WebServer: Flags [.], win 0

- Client Mode (Connect): Acts like Telnet. Used to test if a port is open and accepting traffic.
- Server Mode (Listen): Creates a temporary server. Great for testing firewall rules (e.g., "Can Server A reach Server B on port 9090?").
- File Transfer (The "Hack"): If scp or rsync aren't available, you can pipe files through raw sockets.

- -i eth0: Listen on interface eth0 (or any for all interfaces).
- -n: Crucial. Don't resolve Hostnames or Ports. (Shows 1.2.3.4:80 instead of google.com:http). This speeds up output significantly.
- -w capture.pcap: Write output to a file (so you can open it in Wireshark later).
- -v: Verbose (show more header details like TTL, ID).

- HEADER: Status (e.g., NOERROR or NXDOMAIN). If you see NXDOMAIN, the domain doesn't exist.
- QUESTION SECTION: What you asked for.
- ANSWER SECTION: The result (IPs).
- AUTHORITY SECTION: Who owns the domain (Nameservers).
- ADDITIONAL SECTION: IPs of the nameservers.

- Trace the Recursion: See the full path from Root (.) to TLD (.com) to Auth Server.
- Short Mode: Great for scripting. Returns only the answer data (here, the IPs).
- Direct Query: Bypass your local DNS and ask a specific server (e.g., ask Google's 8.8.8.8 directly).

- SYN Scan (-sS): The "Stealth" scan. It sends a SYN packet. If the server replies SYN-ACK, Nmap knows the port is open but sends a RST (Reset) immediately. It never completes the 3-way handshake, so it often doesn't show up in application logs.
  - Requires sudo.
- Version Detection (-sV): Connects to the port and listens to the "Banner" to guess the software version (e.g., "Apache 2.4.41").
- OS Detection (-O): Analyzes IP TTLs, TCP Window sizes, and other TCP/IP stack differences to guess the Operating System (e.g., Linux vs. Windows).

- Definition: The time it takes for a single packet to travel from Source to Destination.
- Analogy: The speed limit of the road. Even if the road is empty, it takes time to drive from New York to London.
- The Cause: Physical distance (fiber optic length), number of router hops, congested queues.
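To put a number on the "physical distance" cause, here is a minimal sketch of the one-way propagation delay over fiber. It assumes light in fiber covers roughly 200 km per millisecond (about two thirds of its speed in a vacuum); the function name `propagation_ms` and the 5,600 km figure for a New York to London path are illustrative assumptions, and real cable routes are longer than the straight-line distance.

```shell
# One-way propagation delay in milliseconds for a fiber path.
# Assumption: roughly 200 km of fiber per millisecond.
# Argument: path length in km.
propagation_ms() {
  awk -v d="$1" 'BEGIN { printf "%.1f\n", d / 200 }'
}

# Hypothetical New York to London path of about 5,600 km.
propagation_ms 5600   # prints 28.0
```

Doubling that figure gives a floor for the RTT that ping can report on such a path; router hops and queuing only add to it.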
- ping: Measures RTT (Round Trip Time).
- mtr (My Traceroute): Combines ping and traceroute. Shows packet loss at each hop.
- Tip: If loss starts at Hop 3 and continues to the end, Hop 3 is the problem. If loss is only at Hop 3 but Hop 4 is 0%, Hop 3 is just de-prioritizing ICMP (rate-limiting its replies to probes), which is fine.

- Definition: The maximum amount of data that can be transmitted in a fixed amount of time.
- Analogy: The number of lanes on the highway.
- The Cause: Link capacity (1Gbps cable vs 100Mbps cable).
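Link capacity is only an upper bound. As the section on throughput and window size explains, latency decides how much of that capacity a single TCP connection can use. The sketch below works through the arithmetic under stated assumptions: the function names are hypothetical, the 80 ms RTT is an illustrative long-haul value, and 65535 bytes is the largest TCP window possible without window scaling.

```shell
# Bandwidth-delay product in bytes: how much data must be in flight to fill the link.
# Arguments: link speed in Mbit/s, round-trip time in ms.
bdp_bytes() {
  echo $(( $1 * $2 * 125 ))   # Mbit/s * ms * (1e6 / 8 / 1e3)
}

# Throughput ceiling in Mbit/s when only one window can be in flight per round trip.
# Arguments: window size in bytes, round-trip time in ms.
window_limited_mbps() {
  awk -v w="$1" -v rtt="$2" 'BEGIN { printf "%.2f\n", w * 8 / (rtt * 1000) }'
}

bdp_bytes 10000 80            # prints 100000000 (about 100 MB must be in flight)
window_limited_mbps 65535 80  # prints 6.55
```

Under these assumptions, a 10Gbps link with an unscaled window would carry only about 6.5 Mbit/s per connection, which is why a high-bandwidth, high-latency path can still feel slow.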
- iperf3: The gold standard. Requires installation on both ends (client and server). It floods the link with data to test pure capacity.

- TCP Window Size: TCP can only keep one window of unacknowledged data in flight; once the window is full, the sender must wait for an acknowledgment (ACK) before sending more. If the Latency (RTT) is high, the sender spends most of its time waiting, not sending.
- Bandwidth-Delay Product (BDP): In "Long Fat Networks" (High Bandwidth + High Latency, like Trans-Atlantic cables), you must tune the TCP Window Size to keep the pipe full.
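- DevOps Fix: Tune Linux Kernel parameters. Confirm net.ipv4.tcp_window_scaling is enabled (the default on modern kernels) and raise the socket buffer limits (net.ipv4.tcp_rmem, net.ipv4.tcp_wmem) so the window can grow to the BDP.

- Scenario A (Good): 0% Packet Loss, Low Latency (<1ms for LAN).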
- Verdict: Network path is fine. Proceed to Step 2.
- Scenario B (Bad - 100% Loss): "Destination Host Unreachable."
- Verdict: The server is down, or there is no route (Routing Table issue).
- Scenario C (Bad - High Loss): Loss starts at Hop 2 and persists through to the final hop.
- Verdict: A specific router/switch in the path is failing.

- Output: 10.0.1.50
- Action: Compare this IP with your AWS Console/Inventory. Is it the correct DB server?
- Trap: Sometimes a developer hardcodes an old IP in /etc/hosts. Check that file too!
- Trap: If you get NXDOMAIN, the DNS record is missing entirely.

- Scenario A (Success): Connection to 10.0.1.50 5432 port [tcp/postgresql] succeeded!
- Verdict: Firewall is open, DB is listening. The issue is likely Application Layer (wrong password, DB overload).
- Scenario B (Connection Refused): Ncat: Connection refused.
- Verdict: Packet reached the server, but the Server said "Go Away." The DB service is likely crashed/stopped.
- Scenario C (Timeout): It hangs forever...
- Verdict: Firewall Drop. The packet hit a black hole (Security Group/UFW). It never got a reply.

- Case 1: The "Unanswered SYN" (Firewall/Packet Loss) - Diagnosis: You see only [S] (SYN) packets going out, retransmitted with no reply. The other side is ignoring you. Confirm Firewall/Security Groups.
- Case 2: The "Reset" (Service Down) - Diagnosis: You see an [R] (RST) flag immediately. The server OS received the request but no application was bound to that port to handle it. Check if Postgres Service is running.
- Case 3: The "Zero Window" (Overload) - Diagnosis: win 0 means the Database Server is screaming "STOP! My buffer is full." It cannot process data fast enough. The DB is CPU/Memory starved.
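The three capture patterns above can be checked mechanically. The following is a minimal sketch, not a replacement for reading the capture: `diagnose_capture` is a hypothetical helper that reads tcpdump's text output on stdin and assumes the flag notation shown in the samples (`Flags [S]`, `Flags [R.]`, `win 0`).

```shell
# Classify tcpdump text output (read from stdin) into the three cases.
diagnose_capture() {
  capture=$(cat)
  if printf '%s\n' "$capture" | grep -qF 'Flags [R'; then
    echo "reset"            # RST seen: host reachable, nothing listening on the port
  elif printf '%s\n' "$capture" | grep -Eq 'win 0(,|$)'; then
    echo "zero-window"      # receiver advertised an empty receive window
  elif printf '%s\n' "$capture" | grep -qF 'Flags [S]' &&
       ! printf '%s\n' "$capture" | grep -qF 'Flags [S.]'; then
    echo "unanswered-syn"   # SYNs sent, no SYN-ACK came back
  else
    echo "no-known-pattern"
  fi
}

printf '%s\n' \
  '12:01:01 IP WebServer > DBServer: Flags [S]' \
  '12:01:01 IP DBServer > WebServer: Flags [R.], seq 0' | diagnose_capture   # prints reset
```

The reset check runs first because a refused connection also contains an outgoing SYN. Any capture that matches none of the patterns still needs a manual look.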