Comparing Log-Structured and B-Tree Storage Engines
A database's storage engine largely determines its read and write performance. Two dominant designs are B-trees and log-structured merge (LSM) trees, and each is optimized for a different workload.
B-Tree Storage
- Writes update in place.
- Reads are fast for point lookups.
- Range queries are efficient.
Trade-off: changing a few bytes means rewriting a whole page, so write amplification and random I/O grow under heavy write load.
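To make the in-place update and the cheap lookups concrete, here is a minimal sketch in Python. It models a single sorted page standing in for one B-tree leaf, not a full tree with splits and internal nodes. The class name `LeafPage` and its methods are hypothetical, chosen only for illustration.

```python
import bisect


class LeafPage:
    """A single sorted page, standing in for one B-tree leaf (hypothetical)."""

    def __init__(self):
        self.keys = []    # kept sorted at all times
        self.values = []  # values[i] belongs to keys[i]

    def put(self, key, value):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value        # existing key: update in place
        else:
            self.keys.insert(i, key)      # new key: insert, keeping sort order
            self.values.insert(i, value)

    def get(self, key):
        # Point lookup: one binary search over the sorted keys.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None

    def range(self, lo, hi):
        """Return (key, value) pairs with lo <= key <= hi."""
        start = bisect.bisect_left(self.keys, lo)
        end = bisect.bisect_right(self.keys, hi)
        return list(zip(self.keys[start:end], self.values[start:end]))


page = LeafPage()
for k, v in [(30, "c"), (10, "a"), (20, "b")]:
    page.put(k, v)
page.put(20, "B")  # overwrites the existing entry; no second version is kept
```

Because the keys stay sorted, a range query is two binary searches plus a contiguous slice. In an on-disk engine, the `put` above would typically dirty a whole page even for a one-row change, which is where the write amplification comes from.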
LSM Tree Storage
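- Writes are appended to a write-ahead log and inserted into an in-memory memtable.
- Full memtables are flushed to immutable sorted files (SSTables), which background compaction later merges.
- Reads may have to check several files unless a Bloom filter rules them out.
Trade-off: higher read amplification and background compaction cost, in exchange for high write throughput.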
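The write and read paths above can be sketched as a toy, in-memory LSM tree in Python. Everything here is an assumption made for illustration: the class name `TinyLSM`, the `memtable_limit` parameter, and the use of Python lists in place of on-disk files. `get` also returns how many sorted files it had to probe, as a rough stand-in for read amplification.

```python
class TinyLSM:
    """Toy LSM tree: log + memtable + sorted runs (hypothetical, in memory)."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.log = []        # append-only log of unflushed writes
        self.memtable = {}   # most recent writes, held in memory
        self.sstables = []   # flushed sorted runs, newest last

    def put(self, key, value):
        self.log.append((key, value))   # sequential append, never a seek
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}
            self.log = []    # flushed entries no longer need the log

    def get(self, key):
        """Return (value, files_probed), checking newest data first."""
        if key in self.memtable:
            return self.memtable[key], 0
        probed = 0
        for table in reversed(self.sstables):
            probed += 1
            for k, v in table:   # linear scan for brevity
                if k == key:
                    return v, probed
        return None, probed

    def compact(self):
        merged = {}
        for table in self.sstables:   # oldest first, so newer values win
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
for k, v in [("a", 1), ("b", 2), ("c", 3), ("a", 10)]:
    db.put(k, v)
before = db.get("b")   # has to look past the newest file
db.compact()
after = db.get("b")    # one merged file left to check
```

Writes never touch existing files, which is why throughput is high. The price shows up on reads: before compaction the lookup for `"b"` probes two files, and after compaction only one.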
When Each Wins
- B-tree: read-heavy OLTP systems.
- LSM: write-heavy or append-heavy systems.
Key Concepts
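- Compaction cost: the background I/O and CPU an LSM tree spends merging sorted files.
- Bloom filters: compact probabilistic structures that answer "definitely absent" or "possibly present", letting reads skip files.
- Read amplification: the number of pages or files read to serve one logical read.
- Write amplification: the ratio of bytes physically written to storage to bytes the application asked to write.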
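Of these concepts, the Bloom filter is the easiest to show in a few lines. The following Python sketch is a minimal version under stated assumptions: the class name `BloomFilter`, the default sizes, and the choice of seeded SHA-256 as the hash family are all illustrative, and production engines typically use faster non-cryptographic hashes and a packed bit array.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter (hypothetical sizes and hash choice)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)   # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(key))


file_keys = ["user:1", "user:2", "user:3"]
bloom = BloomFilter()
for key in file_keys:
    bloom.add(key)

empty = BloomFilter()
```

An LSM read would consult `might_contain` for each sorted file and open only the files that answer `True`. The filter never produces a false negative, so skipping a file on `False` is always safe; a false positive only costs one unnecessary file read.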
Final Thought
There is no universally better storage engine. Pick based on workload, not hype.