Tools: Your Server Is at 97% CPU Right Now. Would You Know?

The Script

How the CPU Check Actually Works

The Thing About tee -a

Schedule It

Variations That Are Worth Adding

Here's how it usually goes: You deploy something. Traffic is light. Server load sits at 15% and you move on to the next thing. Then traffic grows, or a cron job stacks on itself, or a memory leak slowly eats through your RAM over 72 hours. By the time you notice, the server is thrashing, responses take 8 seconds, and your app is effectively dead.

The frustrating part is that the tools to catch this have been on your server the entire time. top and free ship with practically every mainstream Linux distribution. Nobody installs them. They're just... there. Waiting for someone to actually ask.

So I wrote a script that asks every hour and logs a warning when the answer is bad.

The Script

Runs in under a second. Zero dependencies. Works on Ubuntu, Debian, CentOS, RHEL, Arch: anything with the procps versions of top and free. Minimal images such as Alpine ship BusyBox variants whose output is laid out differently, so the parsing needs adjusting there. The full script is the first listing under Command at the end.

How the CPU Check Actually Works

The CPU line looks intimidating, so let me walk through it:

top -bn1: runs top in batch mode (-b) for exactly one iteration (-n1). Batch mode dumps the full output to stdout instead of opening the interactive TUI, which is what makes top usable in a script.

grep "Cpu(s)": grabs the line that shows aggregate CPU stats.

awk '{print $2}': pulls the second field, which is the user CPU percentage. This is user time only; system and I/O wait time are separate fields on the same line and are not counted.

cut and xargs printf: drop the percent sign (and anything after a comma), then round to an integer. The [ ... -gt ... ] test in bash only compares integers, so a reading like 2.7 has to become 3 first.

The RAM check is simpler: free prints total and used memory on the Mem: line, and awk divides used by total and multiplies by 100.

The Thing About tee -a

You'll notice the script uses echo ... | tee -a "$LOG_FILE" for warnings but plain echo for healthy checks. This is intentional. tee -a writes to standard output AND appends to the log file at the same time. When everything is fine, there's nothing to log. You don't want a log file full of "CPU OK" lines every hour for three years. You only want entries when something is actually wrong.
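To make the tee -a behavior concrete, here is a minimal sketch you can run without root. The report helper is a hypothetical wrapper around the same if/else the script uses, and LOG_FILE points at a temporary file instead of /var/log so nothing on the system is touched.

```shell
#!/bin/bash
# Sketch: only warnings reach the log file; healthy readings go to stdout only.
# LOG_FILE is a temporary file here (an assumption for the demo, not the real path).
LOG_FILE=$(mktemp)
THRESHOLD=80

# report NAME VALUE: hypothetical helper wrapping the script's if/else
report() {
    local name=$1 value=$2
    if [ "$value" -gt "$THRESHOLD" ]; then
        # tee -a: print to stdout AND append to the log file
        echo "WARNING: $name at ${value}% (threshold: ${THRESHOLD}%)" | tee -a "$LOG_FILE"
    else
        # plain echo: stdout only, nothing is written to the log file
        echo "$name OK: ${value}%"
    fi
}

report CPU 91   # above the threshold, so this line is logged
report RAM 42   # below the threshold, so this line is not logged

echo "Log file contents:"
cat "$LOG_FILE"
```

After both calls, the log file holds the CPU warning and nothing else, which is the whole point of the split.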
So the log file becomes a clean history of every resource spike your server has had, with timestamps. When something breaks at 2 AM and you're debugging at 9 AM, you can cat /var/log/resource-monitor.log and see exactly when resources started climbing.

Schedule It

Hourly checks are what I use for most servers. Every 5 minutes is for production servers where you need tighter visibility. Both crontab lines are listed under Command at the end. Note that they redirect the script's standard output to /var/log/monitor-cron.log, so the routine "OK" lines end up there, while /var/log/resource-monitor.log holds only the warnings.

Not sure about the cron syntax? I have a cron job builder tool that generates the line visually.

Variations That Are Worth Adding

Add an email alert when thresholds are breached. I have a full email alert script that covers the mail setup if you haven't configured it before.

Check disk space in the same script. Now you've got CPU, RAM, and disk in one pass. I keep disk in a separate script because I use a different threshold for it (90% vs 80%), but combining them works fine if you want fewer cron entries.

Log to a CSV for trending. Run this for a week and you'll see patterns. Maybe your app spikes every day at 2 PM when a batch job runs. Maybe RAM creeps up 1% per day, which means you have a memory leak that'll hit the wall within a few months. You can't see these patterns without historical data.

When This Isn't Enough

This script is a notification system, not a monitoring platform. It tells you "something is wrong right now" but doesn't give you graphs, dashboards, or historical trending out of the box. If you need that level of visibility, tools like Netdata (free, runs locally) or Grafana + Prometheus are the next step.

But for a single VPS or a handful of servers, a cron script that logs warnings and optionally emails you is 90% of what you need, and it takes 2 minutes to deploy instead of 2 hours.

Full script, line-by-line breakdown, cron setup, and more variations: bashsnippets.xyz/snippets/monitor-cpu-ram-usage.html
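Once the CSV variation has collected a few days of rows, a short awk pass can surface the "spikes every day at 2 PM" kind of pattern. This is a rough sketch, not part of the original script: hourly_averages is a hypothetical helper, and it assumes rows shaped like "YYYY-MM-DD HH:MM:SS,CPU,RAM", which is what the echo "$DATE,$CPU,$RAM" line produces.

```shell
#!/bin/bash
# Sketch: average CPU and RAM per hour of day from the history CSV.
# Assumes rows of the form "YYYY-MM-DD HH:MM:SS,CPU,RAM".

# hourly_averages FILE: hypothetical helper, prints one line per hour of day
hourly_averages() {
    awk -F',' '
        NF == 3 {
            hour = substr($1, 12, 2)   # the "HH" part of the timestamp
            cpu[hour] += $2
            ram[hour] += $3
            n[hour]++
        }
        END {
            for (h in n)
                printf "%s:00 cpu=%.1f ram=%.1f samples=%d\n", h, cpu[h] / n[h], ram[h] / n[h], n[h]
        }
    ' "$1" | sort   # awk iterates in arbitrary order, so sort by hour
}

# Usage (path taken from the CSV variation):
#   hourly_averages /var/log/resource-history.csv
```

An hour whose CPU average sits well above its neighbors is a good candidate for a scheduled job worth investigating.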

Command

The full script:

```bash
#!/bin/bash
CHECK="✓"
CROSS="✗"

# --- Configuration ---
THRESHOLD=80  # Alert when usage exceeds this %
LOG_FILE="/var/log/resource-monitor.log"
DATE=$(date '+%Y-%m-%d %H:%M:%S')

# --- CPU Usage ---
CPU=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1 | cut -d',' -f1 | xargs printf "%.0f")

# --- RAM Usage ---
RAM=$(free | awk '/Mem:/ {printf "%.0f", $3/$2*100}')

echo "[$DATE] CPU: ${CPU}% | RAM: ${RAM}%"

# --- CPU Alert ---
if [ "$CPU" -gt "$THRESHOLD" ]; then
    echo "$CROSS [$DATE] WARNING: CPU at ${CPU}% (threshold: ${THRESHOLD}%)" | tee -a "$LOG_FILE"
else
    echo "$CHECK CPU OK: ${CPU}%"
fi

# --- RAM Alert ---
if [ "$RAM" -gt "$THRESHOLD" ]; then
    echo "$CROSS [$DATE] WARNING: RAM at ${RAM}% (threshold: ${THRESHOLD}%)" | tee -a "$LOG_FILE"
else
    echo "$CHECK RAM OK: ${RAM}%"
fi
```

The CPU pipeline on its own:

```bash
top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1 | cut -d',' -f1 | xargs printf "%.0f"
```

Hourly cron entry:

```
0 * * * * /home/user/monitor.sh >> /var/log/monitor-cron.log 2>&1
```

Every-5-minutes cron entry:

```
*/5 * * * * /home/user/monitor.sh >> /var/log/monitor-cron.log 2>&1
```

Email alert variation:

```bash
ALERT_EMAIL="you@example.com"  # placeholder, replace with your own address

if [ "$CPU" -gt "$THRESHOLD" ]; then
    MSG="$CROSS WARNING: CPU at ${CPU}% on $(hostname) at $DATE"
    echo "$MSG" | tee -a "$LOG_FILE"
    echo "$MSG" | mail -s "[ALERT] High CPU on $(hostname)" "$ALERT_EMAIL"
fi
```

Disk space variation:

```bash
# Use% column of the root filesystem, with the percent sign stripped
DISK=$(df / | awk 'NR==2 {print $5}' | tr -d '%')
if [ "$DISK" -gt "$THRESHOLD" ]; then
    echo "$CROSS [$DATE] WARNING: Disk at ${DISK}%" | tee -a "$LOG_FILE"
fi
```

CSV logging variation:

```bash
echo "$DATE,$CPU,$RAM" >> /var/log/resource-history.csv
```
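One caveat on the CPU pipeline in the listings: field 2 of the Cpu(s) line is user time only, so a server pinned on system time or I/O wait would still read as healthy. If you want total busy time instead, one option is to take two samples of the first line of /proc/stat and compare them. The sketch below is an alternative I'm suggesting, not what the original script does; busy_pct and cpu_now are hypothetical helper names, and it assumes the standard Linux /proc/stat layout (cpu user nice system idle iowait irq softirq steal ...).

```shell
#!/bin/bash
# Sketch: total CPU busy % from two samples of the "cpu" line in /proc/stat.
# Assumption: standard Linux field order
#   cpu user nice system idle iowait irq softirq steal guest guest_nice

# busy_pct "<first cpu line>" "<second cpu line>": prints an integer percent
busy_pct() {
    local -a a b
    read -r -a a <<< "$1"
    read -r -a b <<< "$2"
    local i total_a=0 total_b=0
    # Sum user..steal (fields 1-8); guest time is already included in user
    for ((i = 1; i < ${#a[@]} && i <= 8; i++)); do total_a=$((total_a + a[i])); done
    for ((i = 1; i < ${#b[@]} && i <= 8; i++)); do total_b=$((total_b + b[i])); done
    # Treat idle (field 4) plus iowait (field 5) as "not busy"
    local idle_a=$((a[4] + a[5])) idle_b=$((b[4] + b[5]))
    local dt=$((total_b - total_a)) di=$((idle_b - idle_a))
    if ((dt <= 0)); then
        echo 0
        return
    fi
    # Rounded integer percentage of non-idle time over the interval
    echo $(( (100 * (dt - di) + dt / 2) / dt ))
}

# cpu_now: sample /proc/stat twice, one second apart
cpu_now() {
    local s1 s2
    s1=$(head -n 1 /proc/stat)
    sleep 1
    s2=$(head -n 1 /proc/stat)
    busy_pct "$s1" "$s2"
}

# Usage, in place of the top pipeline:
#   CPU=$(cpu_now)
```

The trade-off is a one-second sleep per run, which is negligible for an hourly or 5-minute cron job. The result is already an integer, so it drops straight into the existing -gt comparison.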