Fanout at Scale: Push vs. Pull Strategies in Distributed Systems
2025-12-21
Modern systems don't fail because they can't store data; they fail because they can't deliver the right data to the right consumers at the right time.

The fanout problem is one of the most fundamental challenges in distributed systems, and it is especially visible in social media feeds, notification systems, messaging platforms, and event-driven architectures.

At its core, fanout answers one question: when a single event occurs, how do we efficiently deliver it to millions of consumers?

There are two dominant strategies:
- Fanout-on-Write (Push)
- Fanout-on-Read (Pull)

Both are valid. Both are widely used. And both come with serious trade-offs.

## 1. The Actual Fanout Problem

Imagine a social media post:

- A user posts one update
- That user has 10 followers… or 100 million followers
Each follower expects:

- Low latency
- Personalized feeds
- High availability

The naive solution ("just send the post to everyone") collapses under:

- Write amplification
- Storage explosion
- Hot partitions
- Latency spikes

The shape of the problem is one write → many reads. The challenge is deciding when and where that fanout happens.
## 2. Fanout-on-Write (Push Model)

### How It Works

When a user creates content:

- The system immediately distributes (pushes) the content to followers
- Each follower gets a precomputed feed entry
- Reading the feed becomes a simple lookup

In short: `User posts → System pushes post to N follower feeds`.

### What Problem It Solves

Reads massively outnumber writes in social systems. Fanout-on-write moves complexity to write time, making reads cheap.

### Why It's Powerful

- Ultra-fast feed reads
- Predictable latency
- Minimal computation during reads

### Core Trade-Offs

A celebrity with 100M followers can generate 100M feed writes for one post. This is not theoretical; it is an operational nightmare.
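The push flow above can be sketched in a few lines of Python. The in-memory `feeds` and `followers_of` stores and the function names are illustrative stand-ins for real feed and graph services, not anything from a specific platform:

```python
from collections import defaultdict

# In-memory stand-ins for the feed store and the social graph.
feeds = defaultdict(list)                    # user -> precomputed feed (post ids)
followers_of = {"alice": ["bob", "carol"]}   # author -> followers

def publish(author: str, post_id: str) -> None:
    """Fanout-on-write: one feed write per follower at publish time."""
    for follower in followers_of.get(author, []):
        feeds[follower].insert(0, post_id)   # newest first

def read_feed(user: str) -> list[str]:
    """Reading is a cheap key lookup; no computation at read time."""
    return feeds[user]

publish("alice", "post-1")
print(read_feed("bob"))    # ['post-1']
```

Note how all the cost sits in `publish`: with N followers, one post means N writes, which is exactly the write amplification discussed above.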
## 3. Fanout-on-Read (Pull Model)

### How It Works

- Content is written once and stored centrally
- When a user opens their feed, the system fetches recent posts from the accounts they follow
- It merges, ranks, and filters them at read time

In short: `User opens feed → System pulls content from many sources`.

### What Problem It Solves

- Eliminates write amplification
- Handles massive fanout safely
- Simplifies writes

### Trade-Offs

This model pushes complexity to query time: every feed open pays the cost of fetching, merging, and ranking.
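A minimal sketch of the pull path, assuming an in-memory central post store and reverse-chronological ordering standing in for real ranking (all names are illustrative):

```python
import heapq

# Central post store: author -> list of (timestamp, post_id).
posts = {
    "alice": [(1, "a1"), (5, "a2")],
    "bob":   [(3, "b1")],
}
following = {"carol": ["alice", "bob"]}

def read_feed(user: str, limit: int = 10) -> list[str]:
    """Fanout-on-read: fetch, merge, and rank at query time."""
    candidates = []
    for author in following.get(user, []):
        candidates.extend(posts.get(author, []))        # fetch per followed account
    top = heapq.nlargest(limit, candidates)             # "rank" = newest first
    return [post_id for _, post_id in top]

print(read_feed("carol"))   # ['a2', 'b1', 'a1']
```

Writes are now trivial (append to one list), but every read fans out across all followed accounts, which is why this model favors users who follow a bounded number of sources.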
## 4. Hybrid Fanout (The Real-World Solution)

Almost no large platform uses pure push or pull. The hybrid fanout strategy starts from three observations:

- Not all users are equal
- Not all content deserves precomputation
- Not all reads need the same latency

The classic split, called selective fanout:

- Push for normal users
- Pull for celebrities

## 5. How Major Social Media Platforms Solve Fanout

### Twitter (X)

- The top ~0.01% of users generate extreme fanout
- Pushing their tweets would melt storage and queues

So tweets from celebrities are not pushed; they are fetched dynamically at read time.
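The celebrity split can be sketched as follows. The 10,000-follower cutoff and all store names are invented for illustration; real systems tune this threshold operationally:

```python
CELEBRITY_THRESHOLD = 10_000   # illustrative cutoff, not a real platform's value

follower_counts = {"alice": 120, "superstar": 80_000_000}
followers_of = {"alice": ["carol"], "superstar": ["carol"]}
following = {"carol": ["alice", "superstar"]}

pushed_feeds = {}       # user -> posts pushed at write time (push path)
celebrity_posts = {}    # author -> posts stored centrally (pull path)

def on_publish(author, post_id):
    if follower_counts[author] >= CELEBRITY_THRESHOLD:
        celebrity_posts.setdefault(author, []).append(post_id)   # store once, pull later
    else:
        for f in followers_of[author]:
            pushed_feeds.setdefault(f, []).append(post_id)       # fan out now

def read_feed(user):
    feed = list(pushed_feeds.get(user, []))                      # cheap precomputed part
    for author in following.get(user, []):
        if follower_counts[author] >= CELEBRITY_THRESHOLD:
            feed.extend(celebrity_posts.get(author, []))         # merge celebrity posts
    return feed

on_publish("alice", "a1")
on_publish("superstar", "s1")
print(read_feed("carol"))   # ['a1', 's1']
```

The write path stays bounded (a celebrity post is one write, not 80 million), while normal users keep the fast precomputed read path.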
### Facebook

Primarily fanout-on-write:

- Posts are pushed into followers' feeds
- Heavy use of precomputation, ranking models, and feed materialization

Facebook prioritizes low-latency scrolling and invests heavily in storage and background processing. It mitigates fanout costs with:

- Asynchronous workers
- Backpressure controls
- Partial fanout (not all posts go to all feeds)
### Instagram

- Push for normal users
- Pull for celebrities, suggested content, ads, and Reels

The feed is composed from both precomputed feed entries and dynamically fetched content.
### LinkedIn

Mostly fanout-on-write:

- Connections are limited
- The professional graph is smaller and denser
- Feeds are easier to precompute

### TikTok / YouTube Shorts

Primarily fanout-on-read:

- The feed is interest-based, not follower-based
- Content is pulled from recommendation pools and ML-ranked candidate sets

This model would be impossible with push.
## 6. Fanout Beyond Social Media (Critical Non-Obvious Use Cases)

Fanout strategies solve problems far beyond feeds.

### 1. Notification Systems

- Send alerts to millions of devices
- Use message queues, topic-based pub/sub, and rate limiting

Examples: mobile push notifications, emergency alerts, trading alerts.
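A toy topic-based pub/sub, the core of notification fanout. The names are illustrative, and a production system would put queues and rate limiting between `publish` and the handlers:

```python
from collections import defaultdict

# topic -> list of subscriber callbacks
subscribers = defaultdict(list)

def subscribe(topic, handler):
    subscribers[topic].append(handler)

def publish(topic, message):
    """Fanout: one publish triggers N deliveries, one per subscriber."""
    for handler in subscribers[topic]:
        handler(message)

received = []
subscribe("trading-alerts", received.append)
subscribe("trading-alerts", lambda m: received.append(m.upper()))
publish("trading-alerts", "AAPL crossed $200")
print(received)   # ['AAPL crossed $200', 'AAPL CROSSED $200']
```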
### 2. Event-Driven Architectures

Kafka is fanout-on-read by design: consumers pull from partitions at their own pace. This enables:

- Backpressure
- Replayability
- Fault isolation
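Why pull-based logs give these properties can be shown with a simulated single partition. This is not the real Kafka client API, just the offset mechanics:

```python
# Append-only log standing in for one partition.
partition = ["evt-1", "evt-2", "evt-3"]

class Consumer:
    def __init__(self):
        self.offset = 0               # each consumer tracks its own position

    def poll(self, max_records=1):
        """The consumer pulls at its own pace -> natural backpressure."""
        batch = partition[self.offset : self.offset + max_records]
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        """Rewinding the offset replays history -> replayability."""
        self.offset = offset

c = Consumer()
print(c.poll(2))   # ['evt-1', 'evt-2']
c.seek(0)
print(c.poll(2))   # ['evt-1', 'evt-2'] again, replayed
```

Because the broker never pushes, a slow or crashed consumer only affects its own offset, which is where fault isolation comes from.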
### 3. CDN Cache Invalidation

Push: purge or update caches globally, used for security patches and breaking content changes.

Pull: lazy fetching of rarely accessed assets.

### 4. Observability & Metrics

- Prometheus: pull fanout (scraping)
- Datadog: push fanout (agents)

The trade-offs are control versus latency, and system stability during failures.
### 5. Multiplayer Gaming

- Push state updates for nearby players and active sessions
- Pull world state on reconnect

Hybrid fanout prevents bandwidth explosion.
### 6. Financial Market Data

- Push for price ticks and trade executions
- Pull for historical data and analytics queries

Latency requirements dictate the fanout model.
### 7. Configuration & Feature Flags

- Push updates to services for kill switches and emergency rollbacks
- Pull periodically for consistency
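One common shape for this, sketched under assumed names (`FlagClient`, a 30-second poll interval): pushed changes apply immediately, while the periodic pull reconciles anything missed:

```python
import time

class FlagClient:
    """Hybrid flag delivery: push for urgent changes, periodic pull to reconcile."""

    def __init__(self, server_flags, poll_interval=30.0):
        self.server_flags = server_flags        # stands in for the flag service
        self.local = dict(server_flags)         # local cache served to callers
        self.poll_interval = poll_interval
        self.last_poll = time.monotonic()

    def on_push(self, name, value):
        """Push path: kill switches take effect immediately."""
        self.local[name] = value

    def get(self, name):
        """Pull path: lazily re-sync once the poll interval elapses."""
        if time.monotonic() - self.last_poll >= self.poll_interval:
            self.local = dict(self.server_flags)    # reconcile missed updates
            self.last_poll = time.monotonic()
        return self.local.get(name)

client = FlagClient({"new_checkout": False})
client.on_push("new_checkout", True)    # emergency enable, no polling delay
print(client.get("new_checkout"))       # True
```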
## 7. How to Choose the Right Fanout Strategy

### Use Fanout-on-Write (Push) When:

- Read latency must be minimal
- Fanout size is bounded
- Storage is cheap
- Predictable performance is critical

### Use Fanout-on-Read (Pull) When:

- Fanout size is unbounded
- Writes must be cheap
- Consumers vary wildly in demand
- Backpressure is required

### Use Hybrid When:

- You want to survive real-world traffic 😄
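The checklists above can be compressed into a toy decision helper. This is purely illustrative; real choices weigh many more factors:

```python
def choose_fanout(fanout_bounded, read_latency_critical, needs_backpressure):
    """Map the checklist criteria above onto push / pull / hybrid."""
    if needs_backpressure or not fanout_bounded:
        return "pull"                        # unbounded fanout must not be pushed
    if read_latency_critical and fanout_bounded:
        return "push"                        # precompute, reads become lookups
    return "hybrid"                          # mixed workloads get mixed strategies

print(choose_fanout(True, True, False))      # 'push'
print(choose_fanout(False, True, True))      # 'pull'
```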
## 8. Final Mental Model

Think of fanout like logistics:

- Push = home delivery (fast, expensive, predictable)
- Pull = warehouse pickup (cheap, flexible, slower)
- Hybrid = Amazon Prime 😉