Your Cron Jobs Are Failing Silently (Here's a 50-Line Fix)

The Silent Killer

Last month, my backup cron job failed at 3 AM on a Saturday. I didn't notice until Monday morning when I needed to restore data. Three days of backups, gone.

The job had been failing with a disk space error, but cron ignores exit codes by default. It just runs the command and moves on.

Here's what most cron setups look like:

```
# crontab
0 3 * * * /path/to/backup.sh
0 6 * * * /path/to/report.sh
0 */2 * * * /path/to/cleanup.sh
```

No monitoring. No alerts. No logging. If any of these fail, you won't know until the damage is done.

The Fix: Wrap Every Job

I built a simple Python wrapper that:

- Logs start/end time and exit code
- Sends a Telegram/Slack alert on failure
- Detects missed runs

```python
import subprocess
import json
from datetime import datetime
from pathlib import Path
import urllib.request

# Job history is kept in a JSON file in the home directory
DB = Path.home() / '.cron-monitor.json'

def load_db():
    return json.loads(DB.read_text()) if DB.exists() else {'jobs': {}}

def save_db(db):
    DB.write_text(json.dumps(db, indent=2, default=str))

def alert(message, bot_token, chat_id):
    # Send the message through the Telegram Bot API
    url = f'https://api.telegram.org/bot{bot_token}/sendMessage'
    data = json.dumps({'chat_id': chat_id, 'text': message}).encode()
    req = urllib.request.Request(url, data=data, headers={'Content-Type': 'application/json'})
    urllib.request.urlopen(req, timeout=10)

def run_job(name, command, bot_token=None, chat_id=None):
    db = load_db()
    start = datetime.now()
    # Capture output so stderr can be quoted in the alert
    result = subprocess.run(command, capture_output=True, text=True)
    db['jobs'][name] = {
        'last_run': start.isoformat(),
        'duration': (datetime.now() - start).total_seconds(),
        'exit_code': result.returncode,
        'status': 'success' if result.returncode == 0 else 'failed'
    }
    save_db(db)
    # Alert only on failure, and only when a bot token is configured
    if result.returncode != 0 and bot_token:
        alert(
            f"🔴 CRON FAILED: {name}\n"
            f"Exit code: {result.returncode}\n"
            f"Error: {result.stderr[:200]}",
            bot_token, chat_id
        )

# Usage
run_job("daily-backup", ["bash", "/path/to/backup.sh"],
        bot_token="YOUR_BOT_TOKEN", chat_id="YOUR_CHAT_ID")
```

Updated Crontab

```
# Before (silent failures)
0 3 * * * /path/to/backup.sh

# After (monitored)
0 3 * * * python3 /path/to/monitor.py --name "daily-backup" -- bash /path/to/backup.sh
```

What You Get

Every job is now tracked:

```
=== Cron Job Monitor ===

Job: daily-backup
Last run: 2026-03-25 03:00:01
Status: ✅ Success (exit code 0)
Duration: 4m 23s

Job: db-cleanup
Last run: 2026-03-25 02:00:00
Status: 🔴 Failed (exit code 1)
Error: "connection refused"
Alert sent: Telegram ✅
```

Why Not Use Existing Tools?

- Healthchecks.io: great service, but it's external. I want self-hosted.
- Cronitor: $20/month. This is free.
- systemd timers: powerful but complex to set up.
- Dead Man's Snitch: SaaS, costs money.

My solution: 50 lines of Python, zero dependencies, instant setup.

3 Bonus Tips for Cron

1. Always redirect output

```
0 3 * * * /path/to/backup.sh >> /var/log/backup.log 2>&1
```

2. Use flock to prevent overlapping runs

```
0 3 * * * flock -n /tmp/backup.lock /path/to/backup.sh
```

3. Set PATH explicitly

Cron has a minimal PATH. Your script works in terminal but fails in cron? This is why.

```
PATH=/usr/local/bin:/usr/bin:/bin
0 3 * * * /path/to/backup.sh
```

The full monitor with Telegram, Slack, timeout detection, and missed run alerts is on GitHub.

What's the worst cron failure you've had? I know I'm not the only one who lost backups.

Follow for more DevOps and automation content.
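A note on the monitored crontab entry: it calls monitor.py with `--name "daily-backup" -- bash /path/to/backup.sh`, while the wrapper listing ends in a hard-coded `run_job(...)` call. One way to bridge the two is a small command-line entry point. The sketch below is an assumption, not the script from the GitHub repository; `parse_cli` and `main` are hypothetical names, and a plain `subprocess.run` stands in for `run_job` so the block runs on its own.

```python
import argparse
import subprocess
import sys


def parse_cli(argv):
    """Split '--name NAME -- CMD ARGS...' into (name, command list)."""
    parser = argparse.ArgumentParser(description="Run a cron job under the monitor")
    parser.add_argument("--name", required=True, help="label stored for this job")
    if "--" not in argv:
        parser.error("expected '--' followed by the command to run")
    split = argv.index("--")
    # Everything before '--' is monitor options, everything after is the job
    options, command = argv[:split], argv[split + 1:]
    if not command:
        parser.error("no command given after '--'")
    args = parser.parse_args(options)
    return args.name, command


def main(argv=None):
    name, command = parse_cli(sys.argv[1:] if argv is None else argv)
    # In the wrapper this is where run_job(name, command, ...) would be called;
    # subprocess.run is a stand-in so the sketch is self-contained.
    result = subprocess.run(command)
    # Pass the job's exit code through so cron and shells can still see it
    return result.returncode
```

Ending monitor.py with `sys.exit(main())` would let the crontab line above invoke it directly. Splitting on `--` by hand keeps any flags meant for the wrapped command (for example `-v`) out of the monitor's own option parsing.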
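The feature list mentions missed-run detection, which the 50-line listing does not show. One possible approach, sketched here under assumptions, is to compare each job's stored `last_run` against how often it is expected to run. The `find_missed_runs` function, the `expected_intervals` mapping, and the 10-minute grace period are all hypothetical; the record layout matches what `run_job` writes to `.cron-monitor.json`.

```python
from datetime import datetime, timedelta


def find_missed_runs(jobs, expected_intervals, now=None, grace=timedelta(minutes=10)):
    """Return (job name, last run or None) for every job that is overdue.

    jobs: the 'jobs' mapping from the monitor's JSON file.
    expected_intervals: job name -> timedelta between scheduled runs (assumed config).
    """
    now = now or datetime.now()
    missed = []
    for name, interval in expected_intervals.items():
        record = jobs.get(name)
        if record is None:
            # The job has never reported in at all
            missed.append((name, None))
            continue
        last_run = datetime.fromisoformat(record["last_run"])
        # Overdue when more than one interval plus the grace period has passed
        if now - last_run > interval + grace:
            missed.append((name, last_run))
    return missed
```

A check like this has to run on its own schedule (for instance from a separate cron entry), since a job that never starts cannot report its own absence.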
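The "Cron Job Monitor" status report can be produced from the same JSON file. The following is a minimal sketch of how such a report might be rendered, assuming the record fields written by `run_job` (`last_run`, `duration`, `exit_code`, `status`). `format_duration` and `render_report` are hypothetical helpers, and the "Error" and "Alert sent" lines are left out because the 50-line wrapper does not store them.

```python
from datetime import datetime


def format_duration(seconds):
    """Render a duration in seconds as minutes and seconds, e.g. '4m 23s'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs}s"


def render_report(jobs):
    """Build a plain-text status report from the monitor's 'jobs' mapping."""
    lines = ["=== Cron Job Monitor ==="]
    for name, record in jobs.items():
        label = "✅ Success" if record["status"] == "success" else "🔴 Failed"
        last_run = datetime.fromisoformat(record["last_run"])
        lines += [
            "",
            f"Job: {name}",
            f"Last run: {last_run.strftime('%Y-%m-%d %H:%M:%S')}",
            f"Status: {label} (exit code {record['exit_code']})",
            f"Duration: {format_duration(record['duration'])}",
        ]
    return "\n".join(lines)
```

Passing `load_db()['jobs']` to `render_report` and printing the result would give a report in the same shape as the sample output.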