LRU vs TinyLFU: Choosing the Right Cache Eviction Strategy πŸš€


Source: Dev.to

## The Fundamentals πŸ“š

When building high-performance applications, choosing the right cache eviction policy can make or break your system's performance. While both LRU and TinyLFU aim to keep your hot data cached, they take fundamentally different approaches. Understanding when to use each can save you from performance disasters and wasted CPU cycles.

## What is LRU?

LRU (Least Recently Used) is one of the most intuitive cache eviction policies. The principle is simple: when the cache is full and you need to make room for new data, evict the item that was accessed longest ago.

Think of it like organizing books on a small shelf. Every time you read a book, you put it at the front. When the shelf fills up, you remove the book from the back that you haven't touched in the longest time.

```
LRU Cache Operations:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Get(key)                  β”‚
β”‚  └─ Move to front         β”‚
β”‚                           β”‚
β”‚ Set(key, value)           β”‚
β”‚  └─ Add to front          β”‚
β”‚  └─ Evict oldest if full  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

## What is TinyLFU?

TinyLFU (Least Frequently Used with a frequency sketch) takes a different approach. Instead of just tracking when items were accessed, it tracks how often they've been accessed. Before admitting a new item, TinyLFU asks: "Is this item more valuable than what I'd have to evict?"

```
TinyLFU Cache Operations:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Get(key)                      β”‚
β”‚  └─ Increment frequency       β”‚
β”‚  └─ Return if cached          β”‚
β”‚                               β”‚
β”‚ Set(key, value)               β”‚
β”‚  └─ Compare frequencies       β”‚
β”‚  └─ Admit if worthy           β”‚
β”‚  └─ Reject if not valuable    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

## The Critical Difference: Recency vs Frequency 🎯

The fundamental distinction between these algorithms is what they optimize for:

```
LRU optimizes for:     RECENCY   (when was it last used?)
TinyLFU optimizes for: FREQUENCY (how often is it used?)
```

This difference seems subtle but has massive implications for real-world systems.
## Visual Comparison πŸ“Š

## LRU Behavior

```
Timeline of Cache Operations:

Initial:       [A] ← β†’ [B] ← β†’ [C]
                ↑               ↑
               MRU             LRU

Access B:      [B] ← β†’ [A] ← β†’ [C]

Add D (full):  [D] ← β†’ [B] ← β†’ [A]
                ↑               ↑
               MRU        Evicted β†’ C

Result: C removed (oldest), regardless of access count
```

## TinyLFU Behavior

```
Cache State:
  Item A: frequency = 500
  Item B: frequency = 300
  Item C: frequency = 100  ← LFU victim

New Item D arrives (frequency = 5)
        ↓
  Compare: D(5) vs C(100)
        ↓
  Decision: REJECT ❌
        ↓
  Result: D never enters cache
          C stays (more valuable)

New Item E arrives (frequency = 150)
        ↓
  Compare: E(150) vs C(100)
        ↓
  Decision: ADMIT βœ…
        ↓
  Result: E enters cache
          C evicted
```

## The Real-World Problem: JWT Authentication πŸ”

Let's examine a production scenario that perfectly illustrates why the choice matters. This example is drawn from real JWT authentication systems handling 100,000 requests per second.

## The Setup

You're running an auth gateway. Every incoming request carries a JWT. Verifying that JWT costs 200 microseconds. At 100,000 requests per second, that's 20 full CPU cores doing nothing but crypto.

So you add caching. Problem solved... until it isn't.

## Your Traffic Pattern

```
πŸ”₯ Hot Tokens (70% of traffic)
 └─ Long-lived sessions
 └─ Service accounts: 20 req/sec each
 └─ Browser sessions: 50 req/min each
 └─ Thousands of hits per 20-min lifetime

❄️ Cold Tokens (25% of traffic)
 └─ One-off API calls
 └─ Batch jobs with unique tokens
 └─ Each token: 1-2 hits, never again

🎲 Noise Tokens (5% of traffic)
 └─ Scanners
 └─ Malformed requests
 └─ Expired tokens from broken clients
 └─ Pure garbage
```
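To make the setup concrete, here is a minimal sketch of the kind of verification cache the rest of this article assumes. The `Cache` interface, the `verifyJWT` stub, and all names are illustrative placeholders, not a specific library's API; the only point is that a hit skips the ~200 Β΅s of crypto, and the eviction policy decides what each miss displaces.

```go
package gateway

import "time"

// Claims stands in for whatever your JWT library returns after verification.
type Claims struct {
	Subject string
	Expiry  time.Time
}

// Cache is policy-agnostic: it could be backed by an LRU or a TinyLFU cache.
type Cache interface {
	Get(token string) (Claims, bool)
	Set(token string, claims Claims)
}

// verifyJWT is a placeholder for the expensive signature check (~200 Β΅s).
func verifyJWT(token string) (Claims, error) {
	// ... parse, verify signature, validate expiry ...
	return Claims{Subject: "user", Expiry: time.Now().Add(20 * time.Minute)}, nil
}

// authenticate consults the cache before paying the verification cost.
func authenticate(cache Cache, token string) (Claims, error) {
	if claims, ok := cache.Get(token); ok && time.Now().Before(claims.Expiry) {
		return claims, nil // cache hit: no crypto work
	}
	claims, err := verifyJWT(token)
	if err != nil {
		return Claims{}, err
	}
	cache.Set(token, claims) // the eviction policy decides what this displaces
	return claims, nil
}
```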
## The LRU Failure Mode

```
Time: 3:00 AM
Event: Batch job spawns 1000 workers
Each worker: unique JWT, single request

LRU Decision Tree:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ New token arrives           β”‚
β”‚ Cache is full               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           β”‚
           β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚ Admit token  β”‚ ← Always admits
   β”‚ Evict oldest β”‚ ← Evicts hot tokens
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           β”‚
           β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Cache now full of 1000      β”‚
β”‚ tokens that will NEVER      β”‚
β”‚ be seen again               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Impact:
β”œβ”€ Service account tokens evicted
β”œβ”€ Active browser sessions evicted
β”œβ”€ Hit rate: 92% β†’ 61%
└─ CPU usage doubles
```

As the author of the TinyLFU JWT authentication analysis notes, this scenario transforms a stable caching system into an unpredictable performance disaster. Hot tokens that should remain cached for their entire 20-minute lifetime get evicted by cold tokens that will never be seen again.
## The TinyLFU Solution

TinyLFU's admission gate prevents cache pollution. Batch tokens never achieve sufficient frequency to compete with legitimate hot tokens.

```
Same scenario: 1000 batch job tokens arrive

TinyLFU Decision for Each Token:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ New batch token             β”‚
β”‚ Frequency: 1                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           β”‚
           β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚ Compare with β”‚
   β”‚ LFU victim   β”‚
   β”‚ (freq: 500)  β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
           β”‚
           β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚ 1 < 500      β”‚
   β”‚ REJECT ❌    β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
           β”‚
           β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Token never enters cache    β”‚
β”‚ Hot tokens remain cached    β”‚
β”‚ Hit rate stays: 92%         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

## Data Structure Comparison πŸ—οΈ

## LRU Implementation

```
Components:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ HashMap: key β†’ node             β”‚
β”‚ (O(1) lookups)                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Doubly Linked List              β”‚
β”‚ (maintains access order)        β”‚
β”‚                                 β”‚
β”‚ [Head] ← β†’ [Node] ← β†’ [Tail]    β”‚
β”‚ (MRU)                 (LRU)     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Operations:
  Get: HashMap lookup + move to head = O(1)
  Set: HashMap insert + add to head  = O(1)
       (evict tail if full)

Memory: ~100 bytes/entry overhead
```
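A minimal, illustrative Go version of that structure looks like the sketch below, assuming string keys and values. Production implementations (such as golang-lru from the references) add locking, TTLs, and memory accounting on top of the same map-plus-list core.

```go
package lru

import "container/list"

type entry struct {
	key, value string
}

// Cache is a minimal LRU: a map for O(1) lookup plus a doubly linked list
// ordered from most recently used (front) to least recently used (back).
type Cache struct {
	capacity int
	items    map[string]*list.Element
	order    *list.List
}

func New(capacity int) *Cache {
	return &Cache{
		capacity: capacity,
		items:    make(map[string]*list.Element, capacity),
		order:    list.New(),
	}
}

// Get returns the value and moves the entry to the front (MRU position).
func (c *Cache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Set inserts or updates a key at the front, evicting the back (LRU) if full.
func (c *Cache) Set(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
}
```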
## TinyLFU Implementation

```
Components:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Main Cache (LRU)                β”‚
β”‚ 99% of capacity                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Admission Window (tiny LRU)     β”‚
β”‚ 1% of capacity                  β”‚
β”‚ (for cold-start protection)     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β”‚
                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Count-Min Sketch                β”‚
β”‚ (frequency tracking)            β”‚
β”‚ 10x cache size                  β”‚
β”‚ ~4 bits per counter             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Operations:
  Get: Increment sketch + cache lookup         = O(1)
  Set: Compare frequencies + conditional admit = O(1)

Memory: ~500 KB for 100k cache with 1M counters
```
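The admission step is the part that differs from plain LRU. Below is a simplified illustration of that decision in Go, using an exact map as a stand-in for the Count-Min Sketch described in the next section; all names are hypothetical and real implementations (Ristretto, Caffeine) combine this check with the admission window shown above.

```go
package tinylfu

// Frequencies is a stand-in for the Count-Min Sketch: the admission policy
// only needs Increment and Estimate, here backed by an exact map.
type Frequencies struct {
	counts map[string]int
}

func NewFrequencies() *Frequencies {
	return &Frequencies{counts: make(map[string]int)}
}

func (f *Frequencies) Increment(key string) { f.counts[key]++ }

func (f *Frequencies) Estimate(key string) int { return f.counts[key] }

// admit is the TinyLFU admission policy in one function: a candidate only
// displaces the eviction victim if it has been seen more often.
func admit(freq *Frequencies, candidate, victim string) bool {
	return freq.Estimate(candidate) > freq.Estimate(victim)
}

// Sketch of the Set path on a full cache (mainCache is assumed to expose
// the usual LRU operations Victim, Evict, and Add):
//
//	freq.Increment(key)
//	if admit(freq, key, mainCache.Victim()) {
//	    mainCache.Evict()
//	    mainCache.Add(key, value)
//	} // otherwise the candidate is rejected and the cache is untouched
```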
## Count-Min Sketch: The Magic Behind TinyLFU 🎩

The Count-Min Sketch is the secret sauce that makes TinyLFU practical. It's a probabilistic data structure that tracks frequencies for millions of items using only a tiny amount of memory.

## The Problem It Solves

Imagine you need to track access frequencies for every JWT token you've ever seen:

```
Naive Approach:
  HashMap: token β†’ counter

Problems:
β”œβ”€ Need to store every unique token
β”œβ”€ 1 million tokens Γ— 64 bytes = 64 MB just for keys
β”œβ”€ Plus counter storage
└─ Total: ~100 MB+ for frequency tracking alone ❌
```

Count-Min Sketch solves this with a clever tradeoff: instead of exact counts, it provides approximate counts with strong guarantees.

## How Count-Min Sketch Works

Think of it as a 2D grid of counters with multiple hash functions:

```
Count-Min Sketch Structure:

Hash1 β†’ [ 3][ 1][ 5][ 2][ 8][ 0][ 4]...
Hash2 β†’ [ 2][ 4][ 1][ 7][ 3][ 1][ 2]...
Hash3 β†’ [ 1][ 2][ 6][ 3][ 1][ 4][ 5]...
Hash4 β†’ [ 4][ 1][ 3][ 5][ 2][ 1][ 8]...
         ↑
    Counters (4 bits each)

Dimensions:
β”œβ”€ Width (w): Number of counters per row
β”œβ”€ Depth (d): Number of hash functions
└─ Total memory: w Γ— d Γ— 4 bits
```
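As a concrete reference, here is a compact Go sketch of such a structure. It is a simplification of what the diagrams describe: it uses 8-bit counters and a single FNV hash stretched into several row indexes via double hashing, rather than the packed 4-bit counters and independent hash functions a production sketch would use.

```go
package cms

import "hash/fnv"

// Sketch is a Count-Min Sketch: depth rows of width counters each.
type Sketch struct {
	width uint64
	rows  [][]uint8
}

func New(width, depth int) *Sketch {
	rows := make([][]uint8, depth)
	for i := range rows {
		rows[i] = make([]uint8, width)
	}
	return &Sketch{width: uint64(width), rows: rows}
}

// index picks a counter in row i using double hashing (h1 + i*h2),
// a standard trick to simulate several hash functions from one.
func (s *Sketch) index(key string, i int) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	h1 := h.Sum64()
	h2 := h1>>32 | h1<<32
	return (h1 + uint64(i)*h2) % s.width
}

// Increment bumps one counter per row, saturating instead of overflowing.
func (s *Sketch) Increment(key string) {
	for i := range s.rows {
		if p := s.index(key, i); s.rows[i][p] < 255 {
			s.rows[i][p]++
		}
	}
}

// Estimate returns the minimum counter across rows: collisions only ever
// inflate counters, so the minimum is the tightest estimate available.
func (s *Sketch) Estimate(key string) uint8 {
	min := uint8(255)
	for i := range s.rows {
		if v := s.rows[i][s.index(key, i)]; v < min {
			min = v
		}
	}
	return min
}
```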
## Operations Visualized

Incrementing a token:

```
Token: "jwt_user_123"
        ↓
Apply 4 hash functions:
β”œβ”€ Hash1(token) =  42 β†’ Increment row1[42]
β”œβ”€ Hash2(token) = 157 β†’ Increment row2[157]
β”œβ”€ Hash3(token) = 891 β†’ Increment row3[891]
└─ Hash4(token) =  23 β†’ Increment row4[23]

Before:
  Row1: ...[ 3][ 5]...  (position 42)
  Row2: ...[10][ 2]...  (position 157)
  Row3: ...[ 7][ 1]...  (position 891)
  Row4: ...[ 2][ 8]...  (position 23)

After:
  Row1: ...[ 4][ 5]...  ← Incremented
  Row2: ...[11][ 2]...  ← Incremented
  Row3: ...[ 8][ 1]...  ← Incremented
  Row4: ...[ 3][ 8]...  ← Incremented

Time: O(1) - Just 4 increments
```

Querying a token's frequency:

```
Token: "jwt_user_123"
        ↓
Apply same 4 hash functions:
β”œβ”€ Hash1(token) =  42 β†’ Read row1[42]  =  4
β”œβ”€ Hash2(token) = 157 β†’ Read row2[157] = 11
β”œβ”€ Hash3(token) = 891 β†’ Read row3[891] =  8
└─ Hash4(token) =  23 β†’ Read row4[23]  =  3

Estimated frequency: min(4, 11, 8, 3) = 3 βœ…

Why minimum?
└─ Counters can be inflated by collisions
   Take the minimum = most conservative estimate

Time: O(1) - Just 4 lookups + min operation
```
## Why It Works: Hash Collisions

Different tokens can hash to the same positions (collisions), but the Count-Min Sketch handles this elegantly:

```
Collision Example:

Token A and Token B both hash to position 42 in Row1
Row1[42] stores: countA + countB = 15
        ↓
When querying Token A:
β”œβ”€ Row1: 15 (includes Token B)  ← Overestimate
β”œβ”€ Row2: 12 (includes Token C)  ← Overestimate
β”œβ”€ Row3:  9 (no collision)      ← True count
└─ Row4: 11 (includes Token D)  ← Overestimate

Result: min(15, 12, 9, 11) = 9

Key Property:
β”œβ”€ Estimates can be HIGHER than true count
β”œβ”€ Estimates are NEVER LOWER than true count
└─ Using multiple rows reduces overestimation
```

## The Guarantee

Count-Min Sketch provides mathematical guarantees:

```
True frequency:      f
Estimated frequency: f'

Guarantee: f ≀ f' ≀ f + Ξ΅ Γ— N

Where:
β”œβ”€ Ξ΅ (epsilon): Error parameter (e.g., 0.01 = 1%)
β”œβ”€ N: Total number of items seen
└─ f' never underestimates

Example with Ξ΅ = 0.01:
β”œβ”€ True count: 100
β”œβ”€ Total items: 1,000,000
β”œβ”€ Max error: 0.01 Γ— 1,000,000 = 10,000
└─ Estimated: 100 to 10,100
```

For TinyLFU caching decisions, this is perfect because we're making relative comparisons:

```
Admission Decision:
  Token A frequency: ~500 (might be 500-600)
  Token B frequency: ~50  (might be 50-100)

Decision: A > B is correct regardless of small errors βœ…
```
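If you want to size a sketch for a target error, the usual rule from the Count-Min Sketch literature is width β‰ˆ ⌈e/Ρ⌉ and depth β‰ˆ ⌈ln(1/Ξ΄)βŒ‰, where Ξ΄ is the probability of the estimate exceeding the Ξ΅ Γ— N bound (Ξ΄ is not discussed above; it comes from the original paper). A small helper with hypothetical naming:

```go
package cms

import "math"

// Dimensions returns the width and depth needed so that, with probability at
// least 1-delta, an estimate exceeds the true count by at most epsilon * N.
func Dimensions(epsilon, delta float64) (width, depth int) {
	width = int(math.Ceil(math.E / epsilon))
	depth = int(math.Ceil(math.Log(1 / delta)))
	return width, depth
}

// Example: Dimensions(0.0001, 0.01) gives width 27183 and depth 5,
// i.e. roughly 136k counters in total to track millions of items.
```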
## Memory Efficiency

Compare memory usage for tracking 1 million tokens:

```
Exact HashMap:
β”œβ”€ 1M entries Γ— 64 bytes (key)
β”œβ”€ 1M entries Γ— 8 bytes (counter)
└─ Total: ~72 MB

Count-Min Sketch (w=10M, d=4):
β”œβ”€ 10M counters Γ— 4 rows
β”œβ”€ 40M counters Γ— 4 bits each
β”œβ”€ 160M bits = 20 MB
└─ 72% memory savings βœ…

Even Better - 4-bit Counters:
β”œβ”€ Each counter: 0 to 15
β”œβ”€ When counter reaches 15, halve all counters
β”œβ”€ Maintains relative frequencies
└─ Prevents overflow
```

## Counter Decay in Action

TinyLFU uses periodic decay to keep frequencies fresh:

```
Before Decay:
  Row1: [15][12][ 8][15]...
  Row2: [13][15][ 6][ 9]...
  Row3: [15][ 7][11][15]...
  Row4: [10][15][15][ 4]...

Decay Event (right shift by 1 bit = divide by 2):
  Row1: [ 7][ 6][ 4][ 7]...
  Row2: [ 6][ 7][ 3][ 4]...
  Row3: [ 7][ 3][ 5][ 7]...
  Row4: [ 5][ 7][ 7][ 2]...

Benefits:
β”œβ”€ Recent activity matters more
β”œβ”€ Old patterns fade away
β”œβ”€ Prevents counter overflow
└─ Extremely fast (bit shift operation)
```
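In code, a decay event is just a halving sweep over the counter matrix. Here is a minimal version operating on row slices like the ones in the sketch above; the trigger policy (typically "every N increments") is left out and varies between implementations.

```go
package cms

// decay halves every counter in place. TinyLFU implementations typically run
// this after a fixed number of increments (a "sample period") so that stale
// popularity fades and counters cannot stay saturated forever.
func decay(rows [][]uint8) {
	for _, row := range rows {
		for i := range row {
			row[i] >>= 1 // divide by two with a single bit shift
		}
	}
}
```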
## Real-World Example: JWT Cache

```
Scenario: 100k req/sec, tracking 5M unique tokens

Configuration:
β”œβ”€ Width: 10M counters
β”œβ”€ Depth: 4 hash functions
β”œβ”€ Counter size: 4 bits
└─ Total memory: 20 MB

Token Types:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Service Account (Token A)            β”‚
β”‚ True frequency: 2000                 β”‚
β”‚ Sketch estimate: 2003                β”‚
β”‚ Error: 0.15%                         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Batch Job Token (Token B)            β”‚
β”‚ True frequency: 1                    β”‚
β”‚ Sketch estimate: 1                   β”‚
β”‚ Error: 0%                            β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Random Attack Token (Token C)        β”‚
β”‚ True frequency: 1                    β”‚
β”‚ Sketch estimate: 2 (collision)       β”‚
β”‚ Error: 100% (but still tiny!)        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Admission Decision:
β”œβ”€ Token A (2003) vs Token B (1) β†’ A wins βœ…
β”œβ”€ Token A (2003) vs Token C (2) β†’ A wins βœ…
└─ Small errors don't affect decisions
```
## Why This Matters for TinyLFU

The Count-Min Sketch enables TinyLFU to:

- Track millions of items with minimal memory
- Make instant decisions with O(1) operations
- Handle high throughput without becoming a bottleneck
- Adapt to changing patterns through decay
- Tolerate approximation because relative comparisons still work

```
The Trade-off:
β”œβ”€ Give up: Exact counts
β”œβ”€ Gain: 70-90% memory savings
β”œβ”€ Guarantee: Never underestimate
└─ Result: Practical frequency-based caching βœ…
```

Without Count-Min Sketch, TinyLFU would need exact frequency counters for every item ever seen, making it impractical for high-scale systems. The sketch transforms TinyLFU from a theoretical idea into a production-ready cache algorithm.
## Attack Resilience πŸ›‘οΈ

## LRU Under Attack

```
Attacker Strategy: 50k random JWTs/sec

LRU Response:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Random Token 1     β†’ ADMIT       β”‚
β”‚ Random Token 2     β†’ ADMIT       β”‚
β”‚ Random Token 3     β†’ ADMIT       β”‚
β”‚ ... (evicting legitimate tokens) β”‚
β”‚ Random Token 50000 β†’ ADMIT       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Result:
β”œβ”€ Cache fills with garbage
β”œβ”€ Legitimate tokens evicted
β”œβ”€ Hit rate collapses
└─ CPU usage spikes
```

## TinyLFU Under Attack

```
Same Attack: 50k random JWTs/sec

TinyLFU Response:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Random Token 1 (freq=1) β†’ REJECT   β”‚
β”‚ Random Token 2 (freq=1) β†’ REJECT   β”‚
β”‚ Random Token 3 (freq=1) β†’ REJECT   β”‚
β”‚ ... (legitimate tokens safe)       β”‚
β”‚ Random Token 50000      β†’ REJECT   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Meanwhile:
β”œβ”€ Legitimate Token A (freq=2000) β†’ stays cached
β”œβ”€ Service Account B (freq=1500)  β†’ stays cached
└─ Hit rate remains: 92%
```
As demonstrated in the JWT authentication case study, the attack actually makes TinyLFU's cache more resilient because random tokens can never achieve the frequency needed to displace legitimate hot tokens.

## Configuration Deep Dive βš™οΈ

## LRU Configuration

```
Simple Parameters:
  cache := NewLRU(100000)  // Just capacity

Optional:
β”œβ”€ TTL per entry
β”œβ”€ Expiration callback
└─ Memory limit (bytes)

Rule of Thumb:
  Size = ActiveUsers Γ— AverageSessionRequests Γ— SafetyMargin
       = 10,000 Γ— 50 Γ— 2
       = 1,000,000 entries
```

## TinyLFU Configuration

```
Critical Parameters:
  cache := NewTinyLFU(TinyLFUConfig{
      MaxSize:     100000,   // Main cache capacity
      NumCounters: 1000000,  // Frequency sketch size (10x)
      BufferItems: 1000,     // Admission window (1%)
  })

Parameter Breakdown:

NumCounters:
β”œβ”€ Rule: 10x cache size
β”œβ”€ Why: Reduces hash collisions
β”œβ”€ Memory: counters Γ— 4 bits
└─ Example: 1M counters = 500 KB

BufferItems:
β”œβ”€ Rule: 1% of MaxSize
β”œβ”€ Purpose: Cold-start protection
β”œβ”€ Behavior: New items land here first
└─ Graduation: Build frequency β†’ main cache

MaxCost (optional):
β”œβ”€ For variable-sized entries
β”œβ”€ Memory-bounded eviction
└─ Example: 50 MB budget for mixed sizes
```
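The `NewTinyLFU` call above is pseudocode. In Go, the closest widely used implementation is Ristretto (listed in the further reading), whose configuration maps onto the same knobs. A minimal usage sketch based on Ristretto's documented v1 API (field names and semantics may differ slightly across versions):

```go
package main

import (
	"fmt"

	"github.com/dgraph-io/ristretto"
)

func main() {
	cache, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1_000_000, // frequency sketch size, ~10x expected entries
		MaxCost:     100_000,   // capacity (here: max entries, each with cost 1)
		BufferItems: 64,        // per-Get buffer size recommended by the docs
	})
	if err != nil {
		panic(err)
	}

	// Store a verified token's claims with cost 1; admission is decided by
	// the TinyLFU policy, so a Set for a cold key may be dropped.
	cache.Set("jwt_user_123", "claims-go-here", 1)
	cache.Wait() // Sets are asynchronous; Wait flushes buffers for the example

	if v, found := cache.Get("jwt_user_123"); found {
		fmt.Println("cache hit:", v)
	}
}
```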
## Conclusion 🎯

Both LRU and TinyLFU have their place in modern systems:

LRU is your default choice. It's simple, fast, and works well for most caching scenarios. Use it when recency correlates with value in your workload.

TinyLFU is your specialist tool. It shines when you have frequency-skewed workloads, face burst traffic, or need resilience against cache pollution. The JWT authentication scenario demonstrates how critical this becomes at scale, where poor cache decisions directly translate to wasted CPU cores and degraded latency.

The right choice depends on understanding your access patterns. Monitor your cache behavior, measure hit rates under realistic load, and choose the algorithm that matches your workload characteristics.

Choose wisely, cache efficiently, and may your hit rates stay high! πŸš€

Further reading:

- πŸ“„ TinyLFU: A Highly Efficient Cache Admission Policy
- πŸ” TinyLFU in JWT Authentication Systems
- πŸ”§ Ristretto: A fast, concurrent TinyLFU cache
- πŸ“š golang-lru: Simple LRU implementation
- πŸ“Š Count-Min Sketch: The probabilistic data structure powering TinyLFU