| > How does KDB handle replication and failover?
With -r and (in my case) SO_REUSEPORT. Most people use a dedicated gateway (have seen custom tomcat stuff and haproxy). Meanwhile, MongoDB doesn't actually replicate reliably (acking then losing anyway) and failover can crash cascade in the naïve configuration.
> Or even high insert/update rates to datasets that exceed the size of memory?
This is literally the KDB tickerplant model. Have an RDB that flushes out regularly (daily) to an HDB. You can also just write to a log `:log upsert ...
> How do you shard KDB?
Same way you shard anything else? By picking a key and directing the query to the appropriate server. h[(first md5 k) mod count h] "query..."
> KDB doesn't support unicode text.
UTF8 is fine. The number of times I've needed the first 5 code points (and not the first 5 bytes or the first 5 characters) in my life is zero. All that half-baked Unicode support in various languages (like MongoDB) just makes people think that they've solved a problem that they really haven't.
> Yes, KDB excels at its relatively well defined niche of transforming and aggregating "smallish" (say 10 TB or less) numerical time series data. It would be a horrible choice for the backing store of a high throughput CRUD application...
I use it in one of those big CRUD databases (digital marketing and tele-lead tracking).
> What is it with KDB zealots thinking that KDB is the best database for every task? I swear, KDB is the Scientology of databases.
Because it solves problems they have. Even when I don't use KDB I use a similar architecture because it's the correct architecture, because I've had these problems for a lot longer than I've had KDB. If it doesn't solve every problem I have, that's because I have work to do, not because it isn't great at the problems it does solve, and I don't shout at my hammer because it isn't a spoon.
However MongoDB doesn't solve any problem I've ever had: I've never needed a bag of objects/filesystem that loses data, or a binary blob that I cannot query. It's so famously "web scalable" it has made a joke of the very idea of being scalable. |
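For anyone who doesn't read q: the sharding one-liner quoted above takes the first byte of the key's MD5 digest, reduces it modulo the number of open handles, and sends the query to that handle. A rough Python rendering of the same idea follows. `SHARDS` and `shard_for` are names made up for illustration, and the host strings are placeholders rather than real servers.

```python
import hashlib

# Hypothetical shard list; in the q snippet this is the list of handles `h`.
SHARDS = ["shard-0:5000", "shard-1:5000", "shard-2:5000"]

def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard from the first byte of the key's MD5 digest."""
    first_byte = hashlib.md5(key.encode("utf-8")).digest()[0]
    return shards[first_byte % len(shards)]

# The same key always routes to the same shard.
target = shard_for("customer-42")
```

Because only one byte of the digest is used, this scheme cannot spread keys over more than 256 shards, and changing the shard count remaps most keys.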
So as a KDB user you need to implement your own HA solution. That is strictly worse than MongoDB replication, even with its now-fixed bugs. Do you really think your homemade multi-master KDB system would pass Jepsen?
> This is literally the KDB tickerplant model. Have an RDB that flushes out regularly (daily) to an HDB.
Wat? That only works if data is immutable once written. Tweets are liked/deleted/etc. You could store an immutable log of user actions, but then you would have to reconstruct the current snapshot every time someone loads a timeline. It's entirely possible for someone to like/delete/RT an old tweet. Financial data is naturally partitioned because the order book clears at the end of every trading day - this doesn't apply to CRUD apps.
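To make the cost concrete, here is a minimal Python sketch of the "immutable log of user actions" approach. The action names, the `LOG` contents, and the `snapshot` function are all hypothetical; this is not how any particular system stores tweets.

```python
# Hypothetical append-only log of user actions: (tweet_id, action).
LOG = [
    (1, "create"),
    (2, "create"),
    (1, "like"),
    (1, "like"),
    (2, "delete"),
    (1, "unlike"),
]

def snapshot(log):
    """Replay the whole log to rebuild the current state of every tweet."""
    state = {}
    for tweet_id, action in log:
        if action == "create":
            state[tweet_id] = {"likes": 0, "deleted": False}
        elif action == "like":
            state[tweet_id]["likes"] += 1
        elif action == "unlike":
            state[tweet_id]["likes"] -= 1
        elif action == "delete":
            state[tweet_id]["deleted"] = True
    return state

current = snapshot(LOG)
```

Every read walks the full log unless you also maintain a mutable, materialized view, and an action on an old tweet can land in any partition's worth of history.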
> UTF8 is fine
I think you misunderstand what I mean by Unicode support. Does KDB support locale-specific collations? Does it support normalization/canonicalization? Being able to index by code point is about 1% of what you need to build an i18n-proof product. Obviously that doesn't matter when you are dealing with normal KDB datasets like market data, where e.g. Asian names are represented with numbers.
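A small Python illustration of the normalization point, using only the standard `unicodedata` module. The example strings are mine; the takeaway is that two byte sequences a user would consider the same word compare unequal until they are normalized.

```python
import unicodedata

composed = "caf\u00e9"     # 'é' as a single code point (U+00E9)
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent (U+0301)

# Raw UTF-8 bytes differ, so a byte-wise comparison or index lookup misses.
same_bytes = composed.encode("utf-8") == decomposed.encode("utf-8")

# After NFC normalization the two strings are identical.
same_after_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)
```

Locale-specific collation is a separate problem again: it depends on locale data outside the string itself, so it is not shown here.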
> I use it in one of those big CRUD databases (digital marketing and tele-lead tracking).
Were you using it to store clickstream data? Or some other kind of immutable stream of events? That isn't really applicable to general CRUD applications.
Like I said - KDB is great for analyzing immutable streams of events. It's not a general purpose database for building CRUD applications. MongoDB tries to be a reasonable enough solution for many use cases, while KDB focuses on excelling at a small number. Both are valid approaches to building a database...