
Tools: Type I vs Type II Errors: The Fire Alarm That Cried Wolf vs The Fire Alarm That Slept Through Arson

2026-01-21 0 views admin


Source: Dev.to

## Two Fire Alarms, Two Disasters
## Vendor A: "The Paranoid" (Type I Error Specialist)
## Vendor B: "The Relaxed" (Type II Error Specialist)
## The Dilemma
## The Formal Definitions
## The 2×2 Reality
## The Courtroom Analogy
## Why You Can't Eliminate Both
## Code: Visualizing the Tradeoff
## Real-World Examples
## Example 1: Medical Testing
## Example 2: Spam Filter
## Example 3: Airport Security
## Example 4: Criminal Justice
## The Decision Framework
## Alpha (α) and Beta (β)
## The Relationship to ML Metrics
## Code: Controlling Error Types
## The Memory Tricks
## Trick 1: "I Before II, Positive Before Negative"
## Trick 2: The Alarm Analogy
## Trick 3: The Court Analogy
## Trick 4: Alpha and Beta Placement
## Common Mistakes
## Mistake 1: Thinking You Can Minimize Both
## Mistake 2: Forgetting Context
## Mistake 3: Confusing the Null Hypothesis
## Mistake 4: Ignoring Base Rates
## Quick Reference
## Definitions
## When Each Is Worse
## Formulas
## The Tradeoff
## Key Takeaways
## The One-Sentence Summary
## What's Next?
## Let's Connect!

The One-Line Summary: Type I error is a false alarm: saying something exists when it doesn't. Type II error is a miss: saying something doesn't exist when it does. Reducing one usually increases the other. Your job is to decide which mistake is worse for YOUR problem.

The Greenwood apartment building had a problem. They needed a new fire alarm system. Two vendors made their pitch.

Vendor A, "The Paranoid": "Our alarm has NEVER missed a real fire! It's so sensitive that if there's even a hint of smoke, it triggers."

The failure: so many FALSE ALARMS that when a REAL fire happened, everyone ignored it.

Vendor B, "The Relaxed": "Our alarm will NEVER bother you with false alarms! It only triggers when it's 100% certain there's a real fire."

The failure: so afraid of false alarms that it MISSED THE ACTUAL FIRE.

Both buildings burned down. Different reasons. Different errors.
Let's translate to statistics.

The justice system was DESIGNED around these errors. The principles "innocent until proven guilty" and "beyond reasonable doubt" exist specifically to minimize Type I errors (convicting innocents), even if it means more Type II errors (guilty people going free).

Famous quote: "Better that ten guilty persons escape than that one innocent suffer." (William Blackstone)

Here's the cruel truth: for a given detector, reducing one type of error usually increases the other. Turn sensitivity UP and you miss fewer fires but trigger more false alarms. Turn sensitivity DOWN and you trigger fewer false alarms but miss more fires. You're always trading one for the other!

The Greek letters α and β are shorthand for the two error rates.

Key takeaways:

- Type I = False Alarm: saying yes when it's no
- Type II = Miss: saying no when it's yes
- You can't minimize both: reducing one increases the other
- Context determines which is worse: no universal answer
- α (alpha) = Type I rate, β (beta) = Type II rate: standard notation
- Power = 1 - β = Recall: ability to detect true positives
- Threshold controls the tradeoff: lower = fewer Type II, more Type I
- Base rates matter: a low error RATE can still mean a high error COUNT

Type I error is the fire alarm screaming at your burnt toast (false alarm), Type II error is the fire alarm sleeping through an actual fire (miss); you can turn the sensitivity dial to reduce one, but you'll increase the other, so your job is to decide which mistake would be more catastrophic for YOUR specific building.

Now that you understand Type I and Type II errors, you're ready for what comes next. Follow me for the next article in this series!

If Type I and Type II finally click now, drop a heart! Questions? Ask in the comments. I read and respond to every one.

What's the worst Type I or Type II error you've encountered? I once saw a fraud model with a 0.1% Type I rate that still flagged 10,000 legitimate transactions per day because of volume!

The difference between a fire alarm that's annoying and one that's deadly? Understanding that false alarms make people ignore real alarms, while missed alarms kill directly. Both failures. Different failures.
Your threshold decides which one you're willing to accept.

Share this with someone who keeps confusing false positives with false negatives. After the fire alarm story, they'll never forget.

Happy hypothesis testing! 🔥

CODE_BLOCK:
Day 1: 3:00 AM  ALARM! → Burnt microwave popcorn
Day 2: 7:30 AM  ALARM! → Shower steam
Day 3: 6:15 PM  ALARM! → Someone lit a candle
Day 4: 2:00 AM  ALARM! → Dust in the sensor
Day 5: 8:00 AM  ALARM! → Toast
Day 6: 4:00 AM  ALARM! → Humidity
Day 7: Actual fire... ALARM!
       → "Ugh, probably just toast again"
       → Nobody evacuates
       → Building burns down
CODE_BLOCK:
Day 1: Peaceful. No alarms.
Day 2: Peaceful. No alarms.
Day 3: Small electrical fire starts...
       Alarm: [silent] "Hmm, still building confidence..."
Day 4: Fire spreads to walls...
       Alarm: [silent] "Not quite certain yet..."
Day 5: Building engulfed...
       Alarm: "FIRE! FIRE!"
       → Too late
       → Building gone

CODE_BLOCK:
THE NULL HYPOTHESIS (H₀): "There is NO fire"

TYPE I ERROR (α - Alpha):
- Rejecting H₀ when it's actually TRUE
- Saying "FIRE!" when there's no fire
- False Positive
- False Alarm
- "Crying Wolf"

TYPE II ERROR (β - Beta):
- Failing to reject H₀ when it's actually FALSE
- Saying "No fire" when there IS a fire
- False Negative
- Miss
- "Sleeping Through Danger"
CODE_BLOCK:
                       REALITY
                  No Fire      Fire
               ┌──────────┬──────────┐
               │          │          │
   "No Fire"   │ Correct  │ TYPE II  │
               │    ✓     │  ERROR   │
 ALARM         │   (TN)   │ (Miss!)  │
 SAYS:         ├──────────┼──────────┤
               │          │          │
   "FIRE!"     │  TYPE I  │ Correct  │
               │  ERROR   │    ✓     │
               │(F.Alarm!)│   (TP)   │
               └──────────┴──────────┘

CODE_BLOCK:
NULL HYPOTHESIS: "Defendant is INNOCENT"

TYPE I ERROR (Convict Innocent):
- Jury says "GUILTY"
- Person is actually INNOCENT
- Innocent person goes to prison
- Devastating! Lives ruined.

TYPE II ERROR (Acquit Guilty):
- Jury says "NOT GUILTY"
- Person is actually GUILTY
- Criminal walks free
- Bad, but fixable (can catch them later)
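The 2×2 table and the courtroom mapping share one rule: what matters is whether H₀ is actually true and whether the decision rejected it. Below is a minimal sketch of that rule. The names `Outcome` and `classify_outcome` are made up for illustration and do not come from the original article.

```python
from enum import Enum


class Outcome(Enum):
    """The four cells of the 2x2 table (illustrative names)."""
    TRUE_NEGATIVE = "correct: H0 true, not rejected"
    TYPE_I = "Type I error: H0 true, but rejected (false alarm)"
    TYPE_II = "Type II error: H0 false, but not rejected (miss)"
    TRUE_POSITIVE = "correct: H0 false, rejected"


def classify_outcome(h0_is_true: bool, rejected_h0: bool) -> Outcome:
    """Map (reality, decision) to one cell of the 2x2 table."""
    if h0_is_true:
        return Outcome.TYPE_I if rejected_h0 else Outcome.TRUE_NEGATIVE
    return Outcome.TRUE_POSITIVE if rejected_h0 else Outcome.TYPE_II


# Fire alarm: H0 = "there is no fire"; rejecting H0 = sounding the alarm.
print(classify_outcome(h0_is_true=True, rejected_h0=True).value)

# Courtroom: H0 = "defendant is innocent"; not rejecting H0 = acquittal.
print(classify_outcome(h0_is_true=False, rejected_h0=False).value)
```

The first call is the burnt-toast alarm (Type I); the second is the guilty defendant who is acquitted (Type II).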
CODE_BLOCK:
FIRE ALARM SENSITIVITY DIAL:

    TYPE I                                   TYPE II
 (False Alarms)                           (Missed Fires)

     HIGH ←─────────────────────────────────→ LOW
      │                                        │
      │          ┌──────────┐                  │
      │◄─────────│ Paranoid │                  │
      │          │  Alarm   │                  │
      │          └──────────┘                  │
      │                                        │
      │                   ┌─────────┐          │
      │                   │ Relaxed │─────────►│
      │                   │  Alarm  │          │
      │                   └─────────┘          │
      │                                        │
      │                   🎯                   │
      │             (Sweet Spot?)              │
      │                                        │
     LOW ←──────────────────────────────────→ HIGH
COMMAND_BLOCK:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Create dataset: detecting fires
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1],  # 10% are actual fires
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Train model
model = LogisticRegression()
model.fit(X_train, y_train)

# Get probabilities of the positive class ("fire")
probas = model.predict_proba(X_test)[:, 1]

# Try different thresholds
thresholds = [0.1, 0.3, 0.5, 0.7, 0.9]

print("Threshold | Type I (FP) | Type II (FN) | Total Errors")
print("-" * 55)

results = []
for thresh in thresholds:
    y_pred = (probas >= thresh).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
    type_i = fp   # False alarm
    type_ii = fn  # Missed fire
    results.append((thresh, type_i, type_ii))
    print(f"   {thresh:.1f}    |     {type_i:2d}      |      {type_ii:2d}      |     {type_i + type_ii:2d}")

# Visualize the tradeoff
# zip(*results) unpacks the list of (threshold, FP, FN) tuples into three columns
threshs, type_is, type_iis = zip(*results)

plt.figure(figsize=(10, 6))
plt.plot(threshs, type_is, 'r-o', linewidth=2, markersize=8, label='Type I (False Alarms)')
plt.plot(threshs, type_iis, 'b-s', linewidth=2, markersize=8, label='Type II (Missed Fires)')
plt.xlabel('Detection Threshold', fontsize=12)
plt.ylabel('Number of Errors', fontsize=12)
plt.title('The Type I vs Type II Tradeoff', fontsize=14)
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)

# Add annotations
plt.annotate('Paranoid\n(catches all fires,\nmany false alarms)',
             xy=(0.1, type_is[0]), xytext=(0.2, type_is[0]+10),
             fontsize=9, arrowprops=dict(arrowstyle='->'))
plt.annotate('Relaxed\n(no false alarms,\nmisses fires)',
             xy=(0.9, type_iis[-1]), xytext=(0.7, type_iis[-1]+10),
             fontsize=9, arrowprops=dict(arrowstyle='->'))

plt.tight_layout()
plt.savefig('type_i_vs_type_ii.png', dpi=150)
plt.show()
CODE_BLOCK:
Threshold | Type I (FP) | Type II (FN) | Total Errors
-------------------------------------------------------
   0.1    |     45      |       2      |     47
   0.3    |     23      |       5      |     28
   0.5    |     12      |       8      |     20
   0.7    |      5      |      14      |     19
   0.9    |      1      |      21      |     22

CODE_BLOCK:
H₀: Patient does NOT have cancer

TYPE I ERROR (False Positive):
  Test says: "CANCER!"
  Reality: No cancer
  Consequence:
  - Unnecessary surgery
  - Emotional trauma
  - Financial burden
  - But... patient lives

TYPE II ERROR (False Negative):
  Test says: "All clear!"
  Reality: Has cancer
  Consequence:
  - Cancer spreads untreated
  - Patient might die
  - Devastating

WHICH IS WORSE? Type II! Missing cancer can be fatal.

STRATEGY: Accept more false positives to minimize missed cancers.
CODE_BLOCK:
H₀: Email is NOT spam

TYPE I ERROR (False Positive):
  Filter says: "SPAM!"
  Reality: Important email from client
  Consequence:
  - Missed business opportunity
  - Lost client
  - Potentially career-ending

TYPE II ERROR (False Negative):
  Filter says: "Not spam"
  Reality: Nigerian prince scam
  Consequence:
  - Annoying email in inbox
  - User deletes it manually
  - Minor inconvenience

WHICH IS WORSE? Type I! Losing important emails is devastating.

STRATEGY: Accept more spam in inbox to never miss real emails.
CODE_BLOCK:
H₀: Passenger is NOT a threat

TYPE I ERROR (False Positive):
  Screening says: "THREAT!"
  Reality: Just a belt buckle
  Consequence:
  - Passenger delayed
  - Extra screening
  - Annoying but manageable

TYPE II ERROR (False Negative):
  Screening says: "Clear"
  Reality: Actual weapon
  Consequence:
  - Potential catastrophe
  - Lives at risk
  - Unacceptable

WHICH IS WORSE? Type II! Missing a threat is catastrophic.

STRATEGY: Accept many false alarms (pat-downs) to never miss a threat.

CODE_BLOCK:
H₀: Defendant is INNOCENT

TYPE I ERROR (False Positive):
  Jury says: "GUILTY!"
  Reality: Person is innocent
  Consequence:
  - Innocent person imprisoned
  - Life destroyed
  - Irreversible injustice

TYPE II ERROR (False Negative):
  Jury says: "Not guilty"
  Reality: Person is guilty
  Consequence:
  - Criminal walks free
  - Might reoffend
  - Bad, but can potentially catch later

WHICH IS WORSE? Type I! Imprisoning innocents is unacceptable.

STRATEGY: "Beyond reasonable doubt" (accept guilty going free).
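The four examples differ mainly in how costly each kind of error is. One way to make that concrete, as a rough sketch and not a method from the article, is to weight the error counts from the article's threshold table by a cost per error and pick the cheapest threshold. The cost ratios below are hypothetical, and `cheapest_threshold` is an illustrative helper name.

```python
# Error counts copied from the article's threshold table: (threshold, FP, FN)
ERROR_TABLE = [
    (0.1, 45, 2),
    (0.3, 23, 5),
    (0.5, 12, 8),
    (0.7, 5, 14),
    (0.9, 1, 21),
]


def cheapest_threshold(cost_fp: float, cost_fn: float) -> float:
    """Return the threshold whose weighted error cost is lowest."""
    best_row = min(ERROR_TABLE, key=lambda row: cost_fp * row[1] + cost_fn * row[2])
    return best_row[0]


# Hypothetical cost ratios, chosen only to illustrate the effect:
print(cheapest_threshold(cost_fp=1, cost_fn=50))  # misses 50x worse      -> 0.1
print(cheapest_threshold(cost_fp=50, cost_fn=1))  # false alarms 50x worse -> 0.9
print(cheapest_threshold(cost_fp=1, cost_fn=1))   # equal costs            -> 0.7
```

When misses dominate the cost (cancer screening, airport security), the lowest threshold wins. When false alarms dominate (spam filtering, criminal justice), the highest one does.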
CODE_BLOCK:
DECIDING WHICH ERROR IS WORSE:

Ask yourself:

1. WHAT HAPPENS if I say "YES" when reality is "NO"? (Type I)
   └─ False alarm, unnecessary action, wasted resources

2. WHAT HAPPENS if I say "NO" when reality is "YES"? (Type II)
   └─ Missed detection, inaction when action was needed

3. WHICH CONSEQUENCE IS MORE SEVERE?

   TYPE I WORSE?               TYPE II WORSE?
   (False alarms costly)       (Misses are catastrophic)
          │                            │
          ▼                            ▼
   Raise threshold             Lower threshold
   (Be more conservative)      (Be more aggressive)
   Accept more Type II         Accept more Type I
          │                            │
          ▼                            ▼
   Examples:                   Examples:
   • Spam filter               • Cancer screening
   • Criminal justice          • Airport security
   • Pregnancy tests           • Fraud detection
   • Drug approval (FDA)       • Fire alarms
CODE_BLOCK:
α (Alpha) = P(Type I Error) = P(False Positive)
          = Probability of rejecting H₀ when H₀ is true
          = "Significance level" in hypothesis testing
          = Common values: 0.05, 0.01

β (Beta)  = P(Type II Error) = P(False Negative)
          = Probability of failing to reject H₀ when H₀ is false

Power     = 1 - β
          = Probability of correctly rejecting false H₀
          = "Sensitivity" or "Recall"
          = Ability to detect a real effect
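The definitions of α, β, and power can be checked by simulation. The sketch below is not from the original article. It assumes a two-sided z-test of H₀: mean = 0 with known standard deviation 1, 30 samples per experiment, and a hypothetical true effect of 0.5 for the power estimate. `rejection_rate` is an illustrative helper name.

```python
import numpy as np


def rejection_rate(true_mean, n_trials=20_000, n_samples=30, z_crit=1.96, seed=0):
    """Fraction of simulated experiments in which H0 (mean = 0) is rejected."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(loc=true_mean, scale=1.0, size=(n_trials, n_samples))
    # z statistic for H0: mean = 0, with known sigma = 1
    z = samples.mean(axis=1) * np.sqrt(n_samples)
    # Two-sided test at roughly the 5% level (critical value 1.96)
    return float(np.mean(np.abs(z) > z_crit))


# H0 is true: every rejection is a Type I error, so this estimates alpha.
alpha_hat = rejection_rate(true_mean=0.0)

# H0 is false (assumed true mean 0.5): every rejection is a correct detection.
power_hat = rejection_rate(true_mean=0.5)
beta_hat = 1.0 - power_hat

print(f"Estimated alpha (Type I rate):  {alpha_hat:.3f}")
print(f"Estimated power (1 - beta):     {power_hat:.3f}")
print(f"Estimated beta  (Type II rate): {beta_hat:.3f}")
```

Under these assumptions the estimated α should land near 0.05 and the power near 0.78. Raising `z_crit` lowers α but also lowers power, which is the same tradeoff as the sensitivity dial.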
COMMAND_BLOCK:
# In hypothesis testing:
alpha = 0.05  # Willing to accept a 5% false positive rate
# This means: when H₀ is true, there is a 5% chance of "discovering" something that isn't real

# In machine learning terms:
# (y_true and y_pred are assumed to be existing arrays of true and predicted 0/1 labels)
from sklearn.metrics import confusion_matrix

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Type I Error Rate (False Positive Rate)
alpha = fp / (fp + tn)  # Of all negatives, how many false alarms?

# Type II Error Rate (False Negative Rate)
beta = fn / (fn + tp)   # Of all positives, how many missed?

# Power (Recall/Sensitivity)
power = tp / (tp + fn)  # = 1 - beta
CODE_BLOCK:
CONFUSION MATRIX MAPPING:
─────────────────────────────────────────────────────────

                          ACTUAL
                   Negative     Positive
                ┌────────────┬────────────┐
      Negative  │     TN     │     FN     │
PREDICTED       │            │ (Type II)  │
                ├────────────┼────────────┤
      Positive  │     FP     │     TP     │
                │  (Type I)  │            │
                └────────────┴────────────┘

METRIC TRANSLATIONS:
─────────────────────────────────────────────────────────

Type I Error Rate  = FP / (FP + TN)
                   = 1 - Specificity
                   = False Positive Rate (FPR)

Type II Error Rate = FN / (FN + TP)
                   = 1 - Recall
                   = False Negative Rate (FNR)

Precision          = TP / (TP + FP)
                   = "When I said positive, was I right?"
                   = Falls as Type I errors (FP) grow

Recall             = TP / (TP + FN)
                   = 1 - β = Power
                   = "Did I catch all the positives?"
                   = Falls as Type II errors (FN) grow
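The "base rates matter" takeaway follows from these formulas: the false positive count is the false positive rate times the number of negatives, and precision also depends on how rare positives are. The sketch below is a back-of-the-envelope illustration. The daily volume of 10 million legitimate transactions is an assumption implied by the article's fraud anecdote (0.1% giving 10,000 flags), and the recall and prevalence values are hypothetical.

```python
def expected_false_alarms(false_positive_rate: float, n_negatives: int) -> int:
    """Expected number of Type I errors among n_negatives true negatives."""
    return round(false_positive_rate * n_negatives)


def precision_at(prevalence: float, recall: float, false_positive_rate: float) -> float:
    """Precision implied by recall, FPR, and the share of true positives."""
    tp = prevalence * recall                      # share of cases that are true positives
    fp = (1 - prevalence) * false_positive_rate   # share of cases that are false positives
    return tp / (tp + fp)


# Assumed volume: 10 million legitimate transactions per day at a 0.1% Type I rate.
print(expected_false_alarms(0.001, 10_000_000))  # 10000

# Same hypothetical detector (recall 0.90, FPR 0.1%) at two different base rates:
print(round(precision_at(0.10, 0.90, 0.001), 3))    # 0.99  (positives are common)
print(round(precision_at(0.0001, 0.90, 0.001), 3))  # 0.083 (positives are rare)
```

With rare positives, most alerts are false alarms even though the Type I rate is tiny, which is how a low error rate can still mean a high error count.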
Code: controlling error types

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score


def analyze_errors(y_true, y_proba, threshold, context=""):
    """Analyze Type I and Type II errors at a given threshold."""
    y_pred = (y_proba >= threshold).astype(int)
    # labels=[0, 1] keeps the matrix 2x2 even if one class is absent
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

    # Error rates
    type_i_rate = fp / (fp + tn) if (fp + tn) > 0 else 0
    type_ii_rate = fn / (fn + tp) if (fn + tp) > 0 else 0

    print(f"\n{'=' * 50}")
    print(f"Threshold: {threshold} {context}")
    print(f"{'=' * 50}")
    print("Confusion Matrix:")
    print(f"  TN={tn}, FP={fp} (Type I Errors)")
    print(f"  FN={fn} (Type II Errors), TP={tp}")
    print("\nError Rates:")
    print(f"  Type I (α): {type_i_rate:.1%} - False Alarm Rate")
    print(f"  Type II (β): {type_ii_rate:.1%} - Miss Rate")
    print("\nML Metrics:")
    print(f"  Precision: {precision_score(y_true, y_pred):.1%}")
    print(f"  Recall: {recall_score(y_true, y_pred):.1%} (= 1 - β = Power)")

    return type_i_rate, type_ii_rate


# Simulate a fire detection scenario
np.random.seed(42)
n = 1000

# True labels: 5% are actual fires
y_true = np.random.binomial(1, 0.05, n)

# Model probabilities (higher for actual fires, with noise)
y_proba = np.where(y_true == 1,
                   np.random.beta(8, 2, n),   # Fires: mostly high probability
                   np.random.beta(2, 8, n))   # No fire: mostly low probability

# Analyze different thresholds for different priorities

# Paranoid: "Never miss a fire!" (minimize Type II)
analyze_errors(y_true, y_proba, 0.2, "(Paranoid - Never miss a fire)")

# Balanced: "Try to balance both errors"
analyze_errors(y_true, y_proba, 0.5, "(Balanced)")

# Relaxed: "Avoid false alarms!" (minimize Type I)
analyze_errors(y_true, y_proba, 0.8, "(Relaxed - Avoid false alarms)")
```
Illustrative output:

```
==================================================
Threshold: 0.2 (Paranoid - Never miss a fire)
==================================================
Confusion Matrix:
  TN=812, FP=138 (Type I Errors)
  FN=2 (Type II Errors), TP=48

Error Rates:
  Type I (α): 14.5% - False Alarm Rate
  Type II (β): 4.0% - Miss Rate

ML Metrics:
  Precision: 25.8%
  Recall: 96.0% (= 1 - β = Power)

==================================================
Threshold: 0.5 (Balanced)
==================================================
Confusion Matrix:
  TN=920, FP=30 (Type I Errors)
  FN=8 (Type II Errors), TP=42

Error Rates:
  Type I (α): 3.2% - False Alarm Rate
  Type II (β): 16.0% - Miss Rate

ML Metrics:
  Precision: 58.3%
  Recall: 84.0% (= 1 - β = Power)

==================================================
Threshold: 0.8 (Relaxed - Avoid false alarms)
==================================================
Confusion Matrix:
  TN=945, FP=5 (Type I Errors)
  FN=18 (Type II Errors), TP=32

Error Rates:
  Type I (α): 0.5% - False Alarm Rate
  Type II (β): 36.0% - Miss Rate

ML Metrics:
  Precision: 86.5%
  Recall: 64.0% (= 1 - β = Power)
```
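The thresholds 0.2, 0.5, and 0.8 above were picked by hand. As a minimal sketch (not part of the original code), the search below scans a grid of thresholds and returns the lowest one whose Type I rate stays within a chosen budget. The helper names `error_rates` and `lowest_threshold_within_budget`, the 101-point grid, and the simulated scores are all illustrative assumptions.

```python
import numpy as np


def error_rates(y_true, y_score, threshold):
    """Return (type_i_rate, type_ii_rate), predicting positive when score >= threshold."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_score) >= threshold
    n_neg = int(np.sum(~y_true))
    n_pos = int(np.sum(y_true))
    fp = int(np.sum(y_pred & ~y_true))  # false alarms
    fn = int(np.sum(~y_pred & y_true))  # misses
    type_i = fp / n_neg if n_neg else 0.0
    type_ii = fn / n_pos if n_pos else 0.0
    return type_i, type_ii


def lowest_threshold_within_budget(y_true, y_score, max_type_i, grid=None):
    """Lowest grid threshold whose Type I rate is <= max_type_i, or None.

    The Type I rate can only fall as the threshold rises, so the first
    admissible threshold is also the one with the fewest misses on this grid.
    """
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)  # hypothetical grid, step 0.01
    for t in np.sort(grid):
        type_i, type_ii = error_rates(y_true, y_score, t)
        if type_i <= max_type_i:
            return float(t), type_i, type_ii
    return None


# Simulated scores (an assumption for illustration, not real alarm data)
rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.05, 1000)
y_score = np.where(y_true == 1, rng.beta(8, 2, 1000), rng.beta(2, 8, 1000))

loose = lowest_threshold_within_budget(y_true, y_score, max_type_i=0.10)
tight = lowest_threshold_within_budget(y_true, y_score, max_type_i=0.01)
print("10% false-alarm budget:", loose)
print(" 1% false-alarm budget:", tight)
```

Tightening the false-alarm budget can only push the chosen threshold up, which in turn can only raise the miss rate. Fixing α first and accepting whatever β follows is the same logic hypothesis testing uses.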
Trick 1: "I Before II, Positive Before Negative"

```
Type I  = First  = False Positive = False Alarm
Type II = Second = False Negative = Miss
```

Trick 2: The Alarm Analogy

```
Type I  = Alarm goes off, nothing's wrong (FALSE ALARM)
Type II = Something's wrong, alarm doesn't go off (SILENT FAILURE)
```
Trick 3: The Court Analogy

```
Type I  = Convicting the INNOCENT (False Positive for guilt)
Type II = Acquitting the GUILTY (False Negative for guilt)
```

Trick 4: Alpha and Beta Placement

```
α (Alpha) comes FIRST in alphabet  → Type I
β (Beta) comes SECOND in alphabet  → Type II
```

Mistake 1: Thinking You Can Minimize Both

```python
# ❌ WRONG thinking
"I want zero false alarms AND zero missed detections!"

# ✅ RIGHT understanding
# There's always a tradeoff
# Decide which error is MORE COSTLY for your specific problem
# Then optimize accordingly
```
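Mistake 1 ends with "decide which error is more costly, then optimize accordingly". One way to read that, sketched below under stated assumptions, is to attach a cost to each error type and pick the threshold with the lowest total cost. The function names, the toy scores, the candidate thresholds, and the cost values are all hypothetical.

```python
def total_cost(y_true, y_score, threshold, cost_fp, cost_fn):
    """Total cost of Type I and Type II errors when predicting positive at score >= threshold."""
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < threshold)
    return fp * cost_fp + fn * cost_fn


def cheapest_threshold(y_true, y_score, cost_fp, cost_fn, candidates):
    """Candidate threshold with the smallest total error cost."""
    return min(candidates,
               key=lambda t: total_cost(y_true, y_score, t, cost_fp, cost_fn))


# Toy data: six negatives followed by four positives (illustrative only)
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.35, 0.55, 0.70, 0.90]
candidates = [0.25, 0.50, 0.75]

# Fire alarm: a miss is assumed to cost 10x a false alarm
fire_alarm_choice = cheapest_threshold(y_true, y_score, cost_fp=1, cost_fn=10,
                                       candidates=candidates)
# Spam filter: a false alarm (lost email) is assumed to cost 10x a miss
spam_filter_choice = cheapest_threshold(y_true, y_score, cost_fp=10, cost_fn=1,
                                        candidates=candidates)
# Equal costs
balanced_choice = cheapest_threshold(y_true, y_score, cost_fp=1, cost_fn=1,
                                     candidates=candidates)

print(fire_alarm_choice, spam_filter_choice, balanced_choice)  # 0.25 0.75 0.5
```

The same scores give three different "best" thresholds. Only the cost ratio changed, which is the point of the mistake: the model does not tell you which error matters more.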
Mistake 2: Forgetting Context

```python
# ❌ WRONG
"Type I errors are always worse than Type II"

# ✅ RIGHT
# It depends on the problem!
# Cancer screening: Type II worse (missing cancer)
# Spam filter: Type I worse (losing important email)
```

Mistake 3: Confusing the Null Hypothesis

```python
# The error TYPE depends on what H₀ is!

# If H₀ = "No cancer"
# Type I  = Saying cancer when no cancer (false alarm)
# Type II = Saying no cancer when cancer (miss)

# If H₀ = "Has cancer" (different framing!)
# Type I  = Saying no cancer when has cancer
# Type II = Saying cancer when no cancer
# Now the labels are SWAPPED!

# Always be clear about what H₀ is!
```
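The swap described in Mistake 3 can be checked by counting errors on the same decisions under both choices of H₀. This is a small sketch, assuming exactly two labels; the function `count_errors` and the six example cases are made up for illustration.

```python
def count_errors(truth, decision, null_label):
    """Count (type_i, type_ii) errors relative to the label that H0 asserts.

    Type I:  H0 is true (truth == null_label) but we rejected it.
    Type II: H0 is false (truth != null_label) but we kept it.
    Assumes a binary problem, so "not null_label" means the other label.
    """
    type_i = sum(1 for t, d in zip(truth, decision)
                 if t == null_label and d != null_label)
    type_ii = sum(1 for t, d in zip(truth, decision)
                  if t != null_label and d == null_label)
    return type_i, type_ii


# Hypothetical screening results for six patients
truth    = ["healthy", "healthy", "healthy", "cancer",  "cancer", "healthy"]
decision = ["healthy", "cancer",  "healthy", "healthy", "cancer", "cancer"]

h0_healthy = count_errors(truth, decision, null_label="healthy")
h0_cancer = count_errors(truth, decision, null_label="cancer")

print("H0 = no cancer: ", h0_healthy)  # (2, 1)
print("H0 = has cancer:", h0_cancer)   # (1, 2)
```

The decisions and the mistakes are identical in both runs. Only the names attached to them change, so a report of "Type I rate" means little unless H₀ is stated alongside it.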
Mistake 4: Ignoring Base Rates

```python
# With rare events, Type I errors can FLOOD you even with low rates

# 1 million emails, 0.1% are spam (1,000 spam)
# Spam filter with 1% false positive rate
false_positives = 999_000 * 0.01  # 9,990 good emails marked spam!
true_positives = 1_000 * 0.90     # 900 spam caught (90% recall)

# You have roughly 11x more false positives than true positives!
# Low Type I RATE can still mean HIGH Type I COUNT with rare events
```

Formulas

```
Type I Rate (α)  = FP / (FP + TN) = 1 - Specificity
Type II Rate (β) = FN / (FN + TP) = 1 - Recall
Power            = 1 - β = Recall = Sensitivity
```

The Tradeoff

```
↑ Threshold → ↓ Type I (fewer false alarms) → ↑ Type II (more misses)
↓ Threshold → ↑ Type I (more false alarms) → ↓ Type II (fewer misses)
```
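The spam arithmetic in Mistake 4 generalizes to any base rate. The sketch below is not from the original article, and the function name `precision_from_rates` is hypothetical. It combines prevalence with the two error rates to get precision, the share of alarms that are real.

```python
def precision_from_rates(prevalence, type_i_rate, type_ii_rate):
    """Precision implied by the base rate and the two error rates.

    prevalence:   fraction of cases that are truly positive
    type_i_rate:  alpha, false positive rate among true negatives
    type_ii_rate: beta, miss rate among true positives
    """
    recall = 1.0 - type_ii_rate
    true_pos = prevalence * recall
    false_pos = (1.0 - prevalence) * type_i_rate
    if true_pos + false_pos == 0:
        return 0.0  # nothing was flagged at all
    return true_pos / (true_pos + false_pos)


# Same numbers as the spam example: 0.1% prevalence, alpha = 1%, beta = 10%
rare = precision_from_rates(prevalence=0.001, type_i_rate=0.01, type_ii_rate=0.10)

# Identical error rates, but half of all cases are positive
common = precision_from_rates(prevalence=0.5, type_i_rate=0.01, type_ii_rate=0.10)

print(f"Rare positives:   {rare:.1%}")    # 8.3%
print(f"Common positives: {common:.1%}")  # 98.9%
```

With α and β held fixed, precision moves from about 8% to about 99% purely because the base rate changed. That is why an error rate alone says little about how many alarms are worth acting on.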
- Type I = First column problem = Said YES, reality was NO
- Type II = Second column problem = Said NO, reality was YES

More sensitive alarm (lower threshold):

- Fewer missed fires (Type II ↓)
- More false alarms (Type I ↑)

Less sensitive alarm (higher threshold):

- Fewer false alarms (Type I ↓)
- More missed fires (Type II ↑)

Threshold extremes:

- Threshold 0.1: Only 2 missed fires, but 45 false alarms!
- Threshold 0.9: Only 1 false alarm, but 21 missed fires!

Key takeaways:

- Type I = False Alarm: saying yes when it's no
- Type II = Miss: saying no when it's yes
- You can't minimize both at once: for a fixed model, reducing one usually increases the other
- Context determines which is worse: there is no universal answer
- α (alpha) = Type I rate, β (beta) = Type II rate: standard notation
- Power = 1 - β = Recall: the ability to detect true positives
- Threshold controls the tradeoff: lower = fewer Type II, more Type I
- Base rates matter: a low error RATE can still mean a high error COUNT

What's next:

- Statistical Power: how to design experiments that detect real effects
- P-Values: the (often misunderstood) Type I error controller
- ROC Curves Deep Dive: visualizing the Type I/II tradeoff
- Cost-Sensitive Learning: when errors have different price tags
