[ G | P ] ← root (internal node) / | \ [B | D | F] [H | L] [R | T | W] ← internal nodes / | | \ ... ... [A-B][C-D][E-F][G-H]... ← leaf nodes (actual data) CODE_BLOCK: [ G | P ] ← root (internal node) / | \ [B | D | F] [H | L] [R | T | W] ← internal nodes / | | \ ... ... [A-B][C-D][E-F][G-H]... ← leaf nodes (actual data) CODE_BLOCK: [ G | P ] ← root (internal node) / | \ [B | D | F] [H | L] [R | T | W] ← internal nodes / | | \ ... ... [A-B][C-D][E-F][G-H]... ← leaf nodes (actual data) CODE_BLOCK: [A-B] ↔ [C-D] ↔ [E-F] ↔ [G-H] ↔ [I-J] → ... CODE_BLOCK: [A-B] ↔ [C-D] ↔ [E-F] ↔ [G-H] ↔ [I-J] → ... CODE_BLOCK: [A-B] ↔ [C-D] ↔ [E-F] ↔ [G-H] ↔ [I-J] → ... CODE_BLOCK: SELECT * FROM orders WHERE created_at BETWEEN '2026-01-01' AND '2026-03-31'; CODE_BLOCK: SELECT * FROM orders WHERE created_at BETWEEN '2026-01-01' AND '2026-03-31'; CODE_BLOCK: SELECT * FROM orders WHERE created_at BETWEEN '2026-01-01' AND '2026-03-31'; CODE_BLOCK: Before split — leaf page is full: ┌─────────────────────────────┐ │ A | C | E | G | J | L | N │ ← full └─────────────────────────────┘ Insert "K" → triggers split: ┌───────────────┐ ┌───────────────┐ │ A | C | E | G │ │ J | K | L | N │ └───────────────┘ └───────────────┘ ↑ parent node gets new key "J" pointing to right sibling CODE_BLOCK: Before split — leaf page is full: ┌─────────────────────────────┐ │ A | C | E | G | J | L | N │ ← full └─────────────────────────────┘ Insert "K" → triggers split: ┌───────────────┐ ┌───────────────┐ │ A | C | E | G │ │ J | K | L | N │ └───────────────┘ └───────────────┘ ↑ parent node gets new key "J" pointing to right sibling CODE_BLOCK: Before split — leaf page is full: ┌─────────────────────────────┐ │ A | C | E | G | J | L | N │ ← full └─────────────────────────────┘ Insert "K" → triggers split: ┌───────────────┐ ┌───────────────┐ │ A | C | E | G │ │ J | K | L | N │ └───────────────┘ └───────────────┘ ↑ parent node gets new key "J" pointing to right sibling CODE_BLOCK: Memtable (in memory): ┌──────────────────────────────────────────────┐ │ order: aaa-111, user: x, status: pending ... │ │ order: bbb-222, user: y, status: shipped ... │ │ order: ccc-333, user: x, status: delivered.. │ │ order: ddd-444, user: z, status: pending ... │ │ ↑ sorted by order_id │ └──────────────────────────────────────────────┘ CODE_BLOCK: Memtable (in memory): ┌──────────────────────────────────────────────┐ │ order: aaa-111, user: x, status: pending ... │ │ order: bbb-222, user: y, status: shipped ... │ │ order: ccc-333, user: x, status: delivered.. │ │ order: ddd-444, user: z, status: pending ... │ │ ↑ sorted by order_id │ └──────────────────────────────────────────────┘ CODE_BLOCK: Memtable (in memory): ┌──────────────────────────────────────────────┐ │ order: aaa-111, user: x, status: pending ... │ │ order: bbb-222, user: y, status: shipped ... │ │ order: ccc-333, user: x, status: delivered.. │ │ order: ddd-444, user: z, status: pending ... │ │ ↑ sorted by order_id │ └──────────────────────────────────────────────┘ CODE_BLOCK: Disk state after several Memtable flushes: SSTable-1 (oldest): [aaa-111] [ccc-333] [eee-555] [ggg-777] SSTable-2: [bbb-222] [ddd-444] [fff-666] SSTable-3: [aaa-111] [hhh-888] ← updated version of aaa-111 SSTable-4 (newest): [iii-999] [jjj-000] CODE_BLOCK: Disk state after several Memtable flushes: SSTable-1 (oldest): [aaa-111] [ccc-333] [eee-555] [ggg-777] SSTable-2: [bbb-222] [ddd-444] [fff-666] SSTable-3: [aaa-111] [hhh-888] ← updated version of aaa-111 SSTable-4 (newest): [iii-999] [jjj-000] CODE_BLOCK: Disk state after several Memtable flushes: SSTable-1 (oldest): [aaa-111] [ccc-333] [eee-555] [ggg-777] SSTable-2: [bbb-222] [ddd-444] [fff-666] SSTable-3: [aaa-111] [hhh-888] ← updated version of aaa-111 SSTable-4 (newest): [iii-999] [jjj-000] CODE_BLOCK: Before compaction: SSTable-1: [aaa-111 v1] [ccc-333] SSTable-3: [aaa-111 v2] [hhh-888] After compaction: SSTable-merged: [aaa-111 v2] [ccc-333] [hhh-888] CODE_BLOCK: Before compaction: SSTable-1: [aaa-111 v1] [ccc-333] SSTable-3: [aaa-111 v2] [hhh-888] After compaction: SSTable-merged: [aaa-111 v2] [ccc-333] [hhh-888] CODE_BLOCK: Before compaction: SSTable-1: [aaa-111 v1] [ccc-333] SSTable-3: [aaa-111 v2] [hhh-888] After compaction: SSTable-merged: [aaa-111 v2] [ccc-333] [hhh-888]
- Fast writes. When you insert an order, the index should update quickly.
- Fast reads. When you query by order_id, finding the right row should take as few disk operations as possible.
- Efficient range scans. When you query orders between two dates, the engine should be able to find the start of the range and read forward — not scatter-gather across random disk locations.
- Reads: 3–4 page reads to find any row in a million-row table. Fast, predictable.
- Range reads: Follow the leaf linked list sequentially. Very fast.
- Writes: Usually fast, but page splits cause write amplification and latency spikes on random keys. The variance shows up at p99.