by ricmo 5925 days ago
I'm not sure I get the point of this. Two of SQLite's strong points (among many others) are: (a) a short dependency list and (b) platform-independent filesystem storage.

Doesn't this negate both of those for what seems like little gain?


Bad comparison. The point is that you interact with BerkeleyDB via an SQLite interface. It's just an interface layer in this case.
BerkeleyDB doesn't have many dependencies, uses only plain files as storage, and is itself highly portable open source (with a quasi-copyleft condition).

So you don't give up much to get the claimed benefits of this combination -- only the ability to use public-domain SQLite in proprietary distributed software.

SQLite outperforms it by far. That could also weigh in a little :)
I've used BerkeleyDB for hundreds of concurrent queries -- it is quite good in those situations. SQLite is not -- it's just not designed for them.

Combining the two gives you the easy interface that SQLite provides and the concurrent performance that BDB has, which definitely provides value.

Do you have any benchmarks to support that claim?
I do. I wrote a tool to help me understand what I could do on various storage tiers running as safely as possible. Here's one of my results from linode:

http://skitch.com/dlsspy/nh2qb/kvtest-results-on-linode

Here's the code: http://github.com/dustin/kvtest

Depending on what you're doing and how safe you want it, you can adjust inputs to select a different winner.
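
To make the "how safe you want it" trade-off concrete, here is a rough sketch of the kind of measurement involved. It is not kvtest's code (that is C++); it uses Python's standard sqlite3 module, and the function name `run_write_test`, the `kv` table, and the short time window are all made up for illustration. It counts how many individually committed writes SQLite completes in a fixed window under a given `PRAGMA synchronous` setting.

```python
import os
import sqlite3
import tempfile
import time


def run_write_test(synchronous, seconds=0.2):
    """Count single-row committed writes completed within `seconds`.

    `synchronous` is a value for SQLite's PRAGMA synchronous,
    e.g. "FULL" (safer, slower) or "OFF" (faster, less durable).
    Returns (operations, operations_per_second).
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "kv.db")
        # isolation_level=None means autocommit: every INSERT below
        # is its own transaction, so each one pays the commit cost.
        conn = sqlite3.connect(path, isolation_level=None)
        try:
            conn.execute("PRAGMA synchronous = %s" % synchronous)
            conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
            ops = 0
            deadline = time.monotonic() + seconds
            while time.monotonic() < deadline:
                conn.execute(
                    "INSERT OR REPLACE INTO kv VALUES (?, ?)",
                    ("key%d" % ops, "value"),
                )
                ops += 1
        finally:
            conn.close()
    return ops, ops / seconds


if __name__ == "__main__":
    for mode in ("FULL", "OFF"):
        ops, rate = run_write_test(mode, seconds=1.0)
        print("synchronous=%s: ran %d operations (%.0f ops/s)" % (mode, ops, rate))
```

The numbers depend heavily on the disk and filesystem, so treat any output as relative, not absolute. The point is only that the durability setting is one of the inputs that can change which store comes out ahead.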

This is an awesome shootout! Be careful with BDB: its default tuning is geared more towards an embedded environment than a modern web app. That's not to say it can't go fast!

I took a look at your BDB demo and saw that you weren't creating the DBs in an environment, which meant every operation was straight to disk, and also meant that you weren't running with write-ahead logs (which will give you durability). I didn't look too closely to see if the other databases had caching enabled or not (or what their defaults were).

On my MacBook Air, configuring with caching (and no logging) yielded this from your benchmark:

  Air:kvtest jamie$ ./bdb-test
  Running test ``test test'' PASS
  Running test ``write test'' Ran 284669 operations in 5s (56933 ops/s) PASS

Of course, page caching makes all the difference :-)

BDB has some great sample code, I'd recommend taking a look at: examples_c/ex_env.c (http://www.fiveanddime.net/berkeley-db/db-4.3.28/examples_c/...)

I really wish BDB had more sensible defaults. I think it unfairly gets a black eye in performance shootouts.

What does the 'auditable' qualifier (which seems to make the difference between SQLite beating BDB, or BDB beating SQLite) mean?
Auditable means every revision of every change is saved in a way that lets you go back in time and so on.

Specifically, it means this: http://github.com/dustin/kvtest/blob/master/sqlite-base.cc#L...
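
For anyone who doesn't want to click through, here is a minimal sketch of the general idea of keeping every revision. This is not the schema from the linked file; the table name `kv_history` and the helpers `open_store`, `put`, and `get` are hypothetical, and it uses Python's sqlite3 module for brevity. Writes only ever append a row, so any earlier state can be read back by revision number.

```python
import sqlite3


def open_store():
    """Create an in-memory store with an append-only history table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE kv_history ("
        " rev INTEGER PRIMARY KEY AUTOINCREMENT,"
        " k TEXT NOT NULL,"
        " v TEXT)"
    )
    return conn


def put(conn, key, value):
    """Append a new revision for `key` and return its revision number."""
    # Never UPDATE in place: each write is a new row, so old values survive.
    with conn:
        cur = conn.execute(
            "INSERT INTO kv_history (k, v) VALUES (?, ?)", (key, value)
        )
    return cur.lastrowid


def get(conn, key, as_of=None):
    """Return the value of `key` at revision `as_of` (latest if None)."""
    if as_of is None:
        row = conn.execute(
            "SELECT v FROM kv_history WHERE k = ? ORDER BY rev DESC LIMIT 1",
            (key,),
        ).fetchone()
    else:
        row = conn.execute(
            "SELECT v FROM kv_history WHERE k = ? AND rev <= ?"
            " ORDER BY rev DESC LIMIT 1",
            (key, as_of),
        ).fetchone()
    return row[0] if row else None
```

The cost is that the table grows with every write and reads need an ORDER BY, which is one plausible reason an auditable configuration would benchmark differently from a plain overwrite-in-place one.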