We are doing all of the above. Microbenchmarks that show only the performance of the DB engine still serve a purpose, and the difference between the microbenches and the full benches shows you how much overhead is eaten by the rest of the system. For example, when we tested the read performance of SQLite3 adapted to LMDB, we measured only a 1-2% improvement, because the majority of SQLite3's read execution time is spent in SQL processing, not in its Btree code.
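To put rough numbers on that point, here is a minimal sketch using Amdahl's law. The function name, the 2% engine share, and the 10x engine speedup are illustrative assumptions, not measurements from the SQLite3 test; they only show how a small engine share caps the end-to-end gain.

```python
def overall_speedup(engine_fraction, engine_speedup):
    """Amdahl's law: end-to-end speedup when only the storage-engine
    share of the run time gets faster.

    engine_fraction: share of total time spent in the engine (0..1).
    engine_speedup:  factor by which the engine itself gets faster.
    Both arguments are hypothetical inputs for illustration.
    """
    if not 0.0 <= engine_fraction <= 1.0:
        raise ValueError("engine_fraction must be between 0 and 1")
    if engine_speedup <= 0:
        raise ValueError("engine_speedup must be positive")
    return 1.0 / ((1.0 - engine_fraction) + engine_fraction / engine_speedup)


# Assumed engine shares, each with an assumed 10x faster engine.
for fraction in (0.02, 0.5, 0.9):
    gain = (overall_speedup(fraction, 10.0) - 1.0) * 100.0
    print(f"engine share {fraction:.0%}: end-to-end gain {gain:.1f}%")
```

With an engine share of only a few percent, even a much faster engine yields an end-to-end gain of a percent or two, which is the pattern described above.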
Nice! Do you know why that benchmark shows almost no read improvement but a massive write improvement? The slides here (http://symas.com/mdb/20120322-UKUUG-MDB.pdf) suggest the opposite. Or am I reading them incorrectly?
As I understand it, the Zimbra test used something like 32 server threads on a machine with only 4 cores. The UKUUG presentation used 16 server threads on a machine with 16 cores, which allowed all of the reads to run concurrently. The smaller number of CPU cores in the Zimbra test naturally reduces the concurrency advantage of LMDB.
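A back-of-the-envelope sketch of that comparison, assuming only that a machine can execute at most one thread per core at any instant. The function name is hypothetical, and the thread and core counts are the approximate figures recalled above, not verified test parameters.

```python
def max_concurrent_readers(server_threads, cpu_cores):
    """Upper bound on reads that can truly run at the same time:
    no more threads can execute simultaneously than there are cores."""
    return min(server_threads, cpu_cores)


# (label, server threads, CPU cores), figures as recalled in this thread.
setups = [
    ("Zimbra test", 32, 4),
    ("UKUUG presentation", 16, 16),
]

for label, threads, cores in setups:
    concurrent = max_concurrent_readers(threads, cores)
    oversubscription = threads / cores
    print(f"{label}: up to {concurrent} reads in parallel, "
          f"{oversubscription:g} threads per core")
```

Under this simple bound, the 4-core machine can run only a quarter as many reads in parallel as the 16-core one, so there is much less room for a concurrency advantage to show up.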