by blowski 1427 days ago
Whenever I see quotes like “182x faster than MySQL! 29x faster than Elasticsearch!” as the intro to a project, I’m immediately sceptical of the quality of the entire project.
2 comments

Me too! That's why, when you say things like that, it's very important to provide enough evidence. In this case there is https://db-benchmarks.com/ where you can see all the details, and it's 100% open source - https://github.com/db-benchmarks/db-benchmarks including the UI https://github.com/db-benchmarks/ui . The results themselves are open source too - https://github.com/db-benchmarks/db-benchmarks/tree/main/res...

So anyone can reproduce the results or at least look carefully into them, understand the testing methodology etc.

Full-text search engines are of course slow on queries that aren't full-text search.

It would be interesting to see queries benchmarked for their intended workload (logs/sorting/full-text etc).

Fastest Avg: https://db-benchmarks.com/?cache=fast_avg&engines=elasticsea...

Slowest: https://db-benchmarks.com/?cache=slowest&engines=elasticsear...

Elasticsearch is used not only for full-text search. It's widely used for analytics (aggregations) and filtering too, e.g. when you do log analytics.

Comparing Elasticsearch / Manticore with MySQL may not be the fairest thing since they are too different, but comparing them with each other and using not only full-text queries seems fine to me.

It all seems too good to be true, so I'd like to understand more about the limitations.
Well, for one, it's written in C++, which means it is more likely to have memory safety bugs, which could potentially be security vulnerabilities.
While that's generally true, I would argue that for the use cases where full-text search is mostly used (e.g. searching a public database or, quite the opposite, an internal system that searches logs collected from various sources), security vulnerabilities are less of a concern in practice. Even if you can use such a vulnerability to expose data stored in the full-text index, it would usually only be data you could already find through that search engine and that's already accessible to you :).
I'd say (full-text) search is one of the least interesting features of ES, and the aggregations are its USP. E.g. moving averages on (biggish) datasets can be calculated on very cheap hardware.
If you're not interested in fast full-text search, then you're wasting a ton of resources on a problem that can be served far more easily by specially tuned analytics databases. The entire storage and retrieval methodology of these engines is built largely around doing Lucene-style searches at extreme speeds.
What would be a good self-hostable solution to replace ES aggregations? I'm also quite fond of ClickHouse, which is a lot faster still, but the sheer number of products that have popped up in the last decade always makes me wonder if there are still faster solutions out there.
I'm not even entirely sure what it means for one DBMS to be faster than another.

In practice, such claims boil down to "this query is 182x faster than that query in MySQL".

Using the same type of comparison, you can come to the conclusion that MySQL is 10x faster than MySQL because one query was constructed one way and the other another. (Applying induction, we can thus prove that MySQL is infinitely fast ;-)
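
To make that concrete, here is a rough sketch of the "engine X is faster than engine X" effect. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for convenience; it is not the setup from db-benchmarks), and the `hits` table, `url` column, and row count are made up for illustration. Both queries return the same row; the second just hides the indexed column behind an expression, so SQLite typically falls back to a full scan.

```python
import sqlite3
import time

# SQLite stands in for MySQL here; table, column and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (id INTEGER PRIMARY KEY, url TEXT)")
conn.executemany(
    "INSERT INTO hits (url) VALUES (?)",
    ((f"/page/{i}",) for i in range(200_000)),
)
conn.execute("CREATE INDEX idx_hits_url ON hits (url)")
conn.commit()


def timed(sql, params):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start


target = ("/page/199999",)

# Variant 1: compare the indexed column directly, so the index can be used.
fast_rows, fast_s = timed("SELECT id FROM hits WHERE url = ?", target)

# Variant 2: same predicate, but the column is wrapped in an expression
# (url || ''), which normally prevents the plain index on url from being used.
slow_rows, slow_s = timed("SELECT id FROM hits WHERE url || '' = ?", target)

# Same engine, same data, same answer; only the query construction differs.
assert fast_rows == slow_rows
print(f"direct comparison: {fast_s:.6f}s, wrapped column: {slow_s:.6f}s")
```

The exact ratio will vary by machine and data size, which is rather the point: a single "Nx faster" number says as much about how the query was written as about the engine.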