# LRU vs TinyLFU: Choosing the Right Cache Eviction Strategy

2026-01-03
When building high-performance applications, choosing the right cache eviction policy can make or break your system's performance. While both LRU and TinyLFU aim to keep your hot data cached, they take fundamentally different approaches. Understanding when to use each can save you from performance disasters and wasted CPU cycles.

## The Fundamentals

### What is LRU?

LRU (Least Recently Used) is one of the most intuitive cache eviction policies. The principle is simple: when the cache is full and you need to make room for new data, evict the item that was accessed longest ago.

Think of it like organizing books on a small shelf. Every time you read a book, you put it at the front. When the shelf fills up, you remove the book from the back that you haven't touched in the longest time.

### What is TinyLFU?

TinyLFU (Least Frequently Used with a frequency sketch) takes a different approach. Instead of just tracking when items were accessed, it tracks how often they've been accessed. Before admitting a new item, TinyLFU asks: "Is this item more valuable than what I'd have to evict?"
## The Critical Difference: Recency vs Frequency

The fundamental distinction between these algorithms is what they optimize for. This difference seems subtle but has massive implications for real-world systems.

## The Real-World Problem: JWT Authentication

Let's examine a production scenario that perfectly illustrates why the choice matters. This example is drawn from real JWT authentication systems handling 100,000 requests per second.

### The Setup

You're running an auth gateway. Every incoming request carries a JWT. Verifying that JWT costs 200 microseconds. At 100,000 requests per second, that's 20 full CPU cores doing nothing but crypto. So you add caching. Problem solved... until it isn't.

### The LRU Failure Mode

As the author of the TinyLFU JWT authentication analysis notes, this scenario transforms a stable caching system into an unpredictable performance disaster. Hot tokens that should remain cached for their entire 20-minute lifetime get evicted by cold tokens that will never be seen again.

### The TinyLFU Solution

TinyLFU's admission gate prevents cache pollution. Batch tokens never achieve sufficient frequency to compete with legitimate hot tokens.

## Count-Min Sketch: The Magic Behind TinyLFU

The Count-Min Sketch is the secret sauce that makes TinyLFU practical. It's a probabilistic data structure that tracks frequencies for millions of items using only a tiny amount of memory.

### The Problem It Solves

Imagine you need to track access frequencies for every JWT token you've ever seen. Count-Min Sketch solves this with a clever tradeoff: instead of exact counts, it provides approximate counts with strong guarantees.
### How Count-Min Sketch Works

Think of it as a 2D grid of counters with multiple hash functions. Different tokens can hash to the same positions (collisions), but the Count-Min Sketch handles this elegantly.

### The Guarantee

Count-Min Sketch provides mathematical guarantees. For TinyLFU's caching decisions this is perfect, because we're making relative comparisons rather than relying on exact counts. Compare the memory usage for tracking 1 million tokens with an exact map versus a sketch, and the savings become obvious.

### Counter Decay in Action

TinyLFU uses periodic decay to keep frequencies fresh.

### Why This Matters for TinyLFU

Without the Count-Min Sketch, TinyLFU would need exact frequency counters for every item ever seen, making it impractical for high-scale systems. The sketch transforms TinyLFU from a theoretical idea into a production-ready cache algorithm.

## Attack Resilience

As demonstrated in the JWT authentication case study, a flood of random tokens cannot hurt TinyLFU's cache: random tokens never achieve the frequency needed to displace legitimate hot tokens.

## Conclusion

Both LRU and TinyLFU have their place in modern systems:

- **LRU** is your default choice. It's simple, fast, and works well for most caching scenarios. Use it when recency correlates with value in your workload.
- **TinyLFU** is your specialist tool. It shines when you have frequency-skewed workloads, face burst traffic, or need resilience against cache pollution.

The JWT authentication scenario demonstrates how critical this becomes at scale, where poor cache decisions directly translate to wasted CPU cores and degraded latency. The right choice depends on understanding your access patterns: monitor your cache behavior, measure hit rates under realistic load, and choose the algorithm that matches your workload characteristics.

Choose wisely, cache efficiently, and may your hit rates stay high!
```
LRU Cache Operations:
┌─────────────────────────┐
│ Get(key)                │
│  └─ Move to front       │
│                         │
│ Set(key, value)         │
│  ├─ Add to front        │
│  └─ Evict oldest if full│
└─────────────────────────┘
```
```
TinyLFU Cache Operations:
┌─────────────────────────────┐
│ Get(key)                    │
│  ├─ Increment frequency     │
│  └─ Return if cached        │
│                             │
│ Set(key, value)             │
│  ├─ Compare frequencies     │
│  ├─ Admit if worthy         │
│  └─ Reject if not valuable  │
└─────────────────────────────┘
```
```
LRU     optimizes for: RECENCY   (when was it last used?)
TinyLFU optimizes for: FREQUENCY (how often is it used?)
```
```
Timeline of Cache Operations:

Initial:       [A] ←→ [B] ←→ [C]
               MRU           LRU

Access B:      [B] ←→ [A] ←→ [C]

Add D (full):  [D] ←→ [B] ←→ [A]
               MRU           Evicted → C

Result: C removed (oldest), regardless of access count
```
```
Cache State:
  Item A: frequency = 500
  Item B: frequency = 300
  Item C: frequency = 100  ← LFU victim

New Item D arrives (frequency = 5)
  Compare:  D(5) vs C(100)
  Decision: REJECT ❌
  Result:   D never enters cache, C stays (more valuable)

New Item E arrives (frequency = 150)
  Compare:  E(150) vs C(100)
  Decision: ADMIT ✅
  Result:   E enters cache, C evicted
```
### Your Traffic Pattern

```
🔥 Hot Tokens (70% of traffic)
├─ Long-lived sessions
├─ Service accounts: 20 req/sec each
├─ Browser sessions: 50 req/min each
└─ Thousands of hits per 20-min lifetime

❄️ Cold Tokens (25% of traffic)
├─ One-off API calls
├─ Batch jobs with unique tokens
└─ Each token: 1-2 hits, never again

🎲 Noise Tokens (5% of traffic)
├─ Scanners
├─ Malformed requests
├─ Expired tokens from broken clients
└─ Pure garbage
```
```
Time: 3:00 AM
Event: Batch job spawns 1000 workers
Each worker: unique JWT, single request

LRU Decision Tree:
┌─────────────────────────────┐
│ New token arrives           │
│ Cache is full               │
└──────────┬──────────────────┘
           ▼
   ┌──────────────┐
   │ Admit token  │ ← Always admits
   │ Evict oldest │ ← Evicts hot tokens
   └──────┬───────┘
          ▼
┌─────────────────────────────┐
│ Cache now full of 1000      │
│ tokens that will NEVER      │
│ be seen again               │
└─────────────────────────────┘

Impact:
├─ Service account tokens evicted
├─ Active browser sessions evicted
├─ Hit rate: 92% → 61%
└─ CPU usage doubles
```
```
Same scenario: 1000 batch job tokens arrive

TinyLFU Decision for Each Token:
┌─────────────────────────────┐
│ New batch token             │
│ Frequency: 1                │
└──────────┬──────────────────┘
           ▼
   ┌──────────────┐
   │ Compare with │
   │ LFU victim   │
   │ (freq: 500)  │
   └──────┬───────┘
          ▼
   ┌──────────────┐
   │ 1 < 500      │
   │ REJECT ❌    │
   └──────┬───────┘
          ▼
┌─────────────────────────────┐
│ Token never enters cache    │
│ Hot tokens remain cached    │
│ Hit rate stays: 92%         │
└─────────────────────────────┘
```
## Data Structure Comparison

### LRU Implementation

```
Components:
┌─────────────────────────────────┐
│ HashMap: key → node             │
│ (O(1) lookups)                  │
└───────────────┬─────────────────┘
                ▼
┌─────────────────────────────────┐
│ Doubly Linked List              │
│ (maintains access order)        │
│                                 │
│ [Head] ←→ [Node] ←→ [Tail]      │
│ (MRU)              (LRU)        │
└─────────────────────────────────┘

Operations:
  Get: HashMap lookup + move to head = O(1)
  Set: HashMap insert + add to head  = O(1) (evict tail if full)

Memory: ~100 bytes/entry overhead
```
### TinyLFU Implementation

```
Components:
┌─────────────────────────────────┐
│ Main Cache (LRU)                │
│ 99% of capacity                 │
└───────────────┬─────────────────┘
                ▼
┌─────────────────────────────────┐
│ Admission Window (tiny LRU)     │
│ 1% of capacity                  │
│ (for cold-start protection)     │
└───────────────┬─────────────────┘
                ▼
┌─────────────────────────────────┐
│ Count-Min Sketch                │
│ (frequency tracking)            │
│ 10x cache size                  │
│ ~4 bits per counter             │
└─────────────────────────────────┘

Operations:
  Get: Increment sketch + cache lookup         = O(1)
  Set: Compare frequencies + conditional admit = O(1)

Memory: ~500 KB for 100k cache with 1M counters
```
```
Naive Approach:
  HashMap: token → counter

Problems:
├─ Need to store every unique token
├─ 1 million tokens × 64 bytes = 64 MB just for keys
├─ Plus counter storage
└─ Total: ~100 MB+ for frequency tracking alone ❌
```
```
Count-Min Sketch Structure:

Hash1 → [ 3][ 1][ 5][ 2][ 8][ 0][ 4]...
Hash2 → [ 2][ 4][ 1][ 7][ 3][ 1][ 2]...
Hash3 → [ 1][ 2][ 6][ 3][ 1][ 4][ 5]...
Hash4 → [ 4][ 1][ 3][ 5][ 2][ 1][ 8]...
         ↑ Counters (4 bits each)

Dimensions:
├─ Width (w): Number of counters per row
├─ Depth (d): Number of hash functions
└─ Total memory: w × d × 4 bits
```
```
Token: "jwt_user_123"

Apply 4 hash functions:
├─ Hash1(token) = 42  → Increment row1[42]
├─ Hash2(token) = 157 → Increment row2[157]
├─ Hash3(token) = 891 → Increment row3[891]
└─ Hash4(token) = 23  → Increment row4[23]

Before:
  Row1: ...[ 3][ 5]...  (position 42)
  Row2: ...[10][ 2]...  (position 157)
  Row3: ...[ 7][ 1]...  (position 891)
  Row4: ...[ 2][ 8]...  (position 23)

After:
  Row1: ...[ 4][ 5]...  ← Incremented
  Row2: ...[11][ 2]...  ← Incremented
  Row3: ...[ 8][ 1]...  ← Incremented
  Row4: ...[ 3][ 8]...  ← Incremented

Time: O(1) - Just 4 increments
```
```
Token: "jwt_user_123"

Apply same 4 hash functions:
├─ Hash1(token) = 42  → Read row1[42]  = 4
├─ Hash2(token) = 157 → Read row2[157] = 11
├─ Hash3(token) = 891 → Read row3[891] = 8
└─ Hash4(token) = 23  → Read row4[23]  = 3

Estimated frequency: min(4, 11, 8, 3) = 3

Why minimum?
└─ Counters can be inflated by collisions.
   Taking the minimum gives the most conservative estimate.

Time: O(1) - Just 4 lookups + min operation
```
```
Collision Example:
Token A and Token B both hash to position 42 in Row1
Row1[42] stores: countA + countB = 15

When querying Token A:
├─ Row1: 15 (includes Token B) ← Overestimate
├─ Row2: 12 (includes Token C) ← Overestimate
├─ Row3: 9  (no collision)     ← True count
└─ Row4: 11 (includes Token D) ← Overestimate

Result: min(15, 12, 9, 11) = 9

Key Property:
├─ Estimates can be HIGHER than true count
├─ Estimates are NEVER LOWER than true count
└─ Using multiple rows reduces overestimation
```
```
True frequency:      f
Estimated frequency: f'

Guarantee: f ≤ f' ≤ f + ε × N

Where:
├─ ε (epsilon): Error parameter (e.g., 0.01 = 1%)
├─ N: Total number of items seen
└─ f' never underestimates

Example with ε = 0.01:
├─ True count: 100
├─ Total items: 1,000,000
├─ Max error: 0.01 × 1,000,000 = 10,000
└─ Estimated: 100 to 10,100
```
```
Admission Decision:
  Token A frequency: ~500 (might be 500-600)
  Token B frequency: ~50  (might be 50-100)

Decision: A > B is correct regardless of small errors ✅
```
```
Exact HashMap:
├─ 1M entries × 64 bytes (key)
├─ 1M entries × 8 bytes (counter)
└─ Total: ~72 MB

Count-Min Sketch (w=10M, d=4):
├─ 10M counters × 4 rows
├─ 40M counters × 4 bits each
├─ 160M bits = 20 MB
└─ 72% memory savings ✅

Even Better - 4-bit Counters:
├─ Each counter: 0 to 15
├─ When a counter reaches 15, halve all counters
├─ Maintains relative frequencies
└─ Prevents overflow
```
```
Before Decay:
  Row1: [15][12][ 8][15]...
  Row2: [13][15][ 6][ 9]...
  Row3: [15][ 7][11][15]...
  Row4: [10][15][15][ 4]...

Decay Event (right shift by 1 bit = divide by 2):
  Row1: [ 7][ 6][ 4][ 7]...
  Row2: [ 6][ 7][ 3][ 4]...
  Row3: [ 7][ 3][ 5][ 7]...
  Row4: [ 5][ 7][ 7][ 2]...

Benefits:
├─ Recent activity matters more
├─ Old patterns fade away
├─ Prevents counter overflow
└─ Extremely fast (bit shift operation)
```
```
Scenario: 100k req/sec, tracking 5M unique tokens

Configuration:
├─ Width: 10M counters
├─ Depth: 4 hash functions
├─ Counter size: 4 bits
└─ Total memory: 20 MB

Token Types:
┌─────────────────────────────────────┐
│ Service Account (Token A)           │
│ True frequency: 2000                │
│ Sketch estimate: 2003               │
│ Error: 0.15%                        │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ Batch Job Token (Token B)           │
│ True frequency: 1                   │
│ Sketch estimate: 1                  │
│ Error: 0%                           │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│ Random Attack Token (Token C)       │
│ True frequency: 1                   │
│ Sketch estimate: 2 (collision)      │
│ Error: 100% (but still tiny!)       │
└─────────────────────────────────────┘

Admission Decision:
├─ Token A (2003) vs Token B (1) → A wins ✅
├─ Token A (2003) vs Token C (2) → A wins ✅
└─ Small errors don't affect decisions
```
```
The Trade-off:
├─ Give up: Exact counts
├─ Gain: 70-90% memory savings
├─ Guarantee: Never underestimate
└─ Result: Practical frequency-based caching ✅
```
### LRU Under Attack

```
Attacker Strategy: 50k random JWTs/sec

LRU Response:
┌─────────────────────────────────┐
│ Random Token 1     → ADMIT      │
│ Random Token 2     → ADMIT      │
│ Random Token 3     → ADMIT      │
│ ... (evicting legitimate tokens)│
│ Random Token 50000 → ADMIT      │
└─────────────────────────────────┘

Result:
├─ Cache fills with garbage
├─ Legitimate tokens evicted
├─ Hit rate collapses
└─ CPU usage spikes
```
```
Same Attack: 50k random JWTs/sec

TinyLFU Response:
┌───────────────────────────────────┐
│ Random Token 1 (freq=1) → REJECT  │
│ Random Token 2 (freq=1) → REJECT  │
│ Random Token 3 (freq=1) → REJECT  │
│ ... (legitimate tokens safe)      │
│ Random Token 50000      → REJECT  │
└───────────────────────────────────┘

Meanwhile:
├─ Legitimate Token A (freq=2000) → stays cached
├─ Service Account B (freq=1500)  → stays cached
└─ Hit rate remains: 92%
```
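The rejection behavior comes down to a single frequency comparison at admission time. A minimal sketch of that gate, with hard-coded frequencies standing in for Count-Min Sketch lookups (the function name is illustrative, not a library API):

```go
package main

import "fmt"

// admit models TinyLFU's admission gate: a candidate enters the cache
// only if its estimated frequency beats that of the eviction victim.
// In a real cache both numbers come from the frequency sketch.
func admit(candidateFreq, victimFreq uint32) bool {
	return candidateFreq > victimFreq
}

func main() {
	// An attack token seen once vs a hot legitimate token:
	fmt.Println(admit(1, 2000)) // false — rejected, victim stays cached

	// A warming service-account token vs a cooling victim:
	fmt.Println(admit(1500, 40)) // true — admitted
}
```

Because an attacker's random tokens can never accumulate frequency before they are discarded, the gate holds no matter how many unique tokens are thrown at it.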
```
Simple Parameters:

cache := NewLRU(100000)  // Just capacity

Optional:
├─ TTL per entry
├─ Expiration callback
└─ Memory limit (bytes)

Rule of Thumb:
Size = ActiveUsers × AverageSessionRequests × SafetyMargin
     = 10,000 × 50 × 2
     = 1,000,000 entries
```
```
Critical Parameters:

cache := NewTinyLFU(TinyLFUConfig{
    MaxSize:     100000,   // Main cache capacity
    NumCounters: 1000000,  // Frequency sketch size (10x)
    BufferItems: 1000,     // Admission window (1%)
})

Parameter Breakdown:

NumCounters:
├─ Rule: 10x cache size
├─ Why: Reduces hash collisions
├─ Memory: counters × 4 bits
└─ Example: 1M counters = 500 KB

BufferItems:
├─ Rule: 1% of MaxSize
├─ Purpose: Cold-start protection
├─ Behavior: New items land here first
└─ Graduation: Build frequency → main cache

MaxCost (optional):
├─ For variable-sized entries
├─ Memory-bounded eviction
└─ Example: 50 MB budget for mixed sizes
```
- Track millions of items with minimal memory
- Make instant decisions with O(1) operations
- Handle high throughput without becoming a bottleneck
- Adapt to changing patterns through decay
- Tolerate approximation because relative comparisons still work

**Further Reading:**

- TinyLFU: A Highly Efficient Cache Admission Policy
- TinyLFU in JWT Authentication Systems
- Ristretto: A fast, concurrent TinyLFU cache
- golang-lru: Simple LRU implementation
- Count-Min Sketch: The probabilistic data structure powering TinyLFU