| HN Mirror

I don't know intricate details of technology that TokuDB was using. These things are patented and I prefer to stay away.

My experience indicates that region allocation (inside DB file or other space) algorithm can be a big win. I devised an algorithm that produced O(log^2N) regions for log-structured merge trees backend and that resulted in larger regions allocated and more read/write bandwidth available (larger pages, adjusted to size of data). The overhead was not all that big - slightly less than 2 times over actual data size, BerkeleyDB log was much bigger (100x as big).

The SQLite's implementation of LSM trees used O(NlogN) regions allocations - it managed space using 1M bytes pages. Most of other databases does something like that - bitmaps in MS SQL Server are exactly this.

Also, you can trade space over speed and vice versa. Most of databases can utilize cover indices - index that contains all fields used in query will not trigger lookup operation into the actual table. With B+-trees, you cannot have many indices, they are all random inserts, basically, and B+-trees will degrade. With TokuDB or LSM trees you can have many indices, with different coverage, because you can tolerate random inserts much, much better (several orders of magnitudes better).

Indices are often highly compressible due to their redundancy because of sorting.

Thus, to get most of TokuDB, you have to change how you approach problems. You can do things that are not quite possible with B+-trees and that is what you have to get used to.

This may be second things why TokuDB is faded - thinking inertia.