by porker
2612 days ago
This. The best benchmark for a search engine is:
1. Does it return relevant results?
2. Can it handle complex queries?
(2) is only required in specific use cases, but when it's needed it's _really needed_. (1) is the main measure users care about, and in my experience it is best evaluated by building a search in each system with the same corpus and giving it to subject-matter experts.
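To make the expert comparison concrete, here is a rough sketch of how the ratings could be tallied. Everything in it is an assumption for illustration: the engine names, the queries, and the 0-3 rating scale are made up, and a real evaluation would probably want a rank-aware metric rather than a plain average.

```python
from statistics import mean

def mean_rating_per_engine(judgments):
    """Average expert rating per engine.

    judgments[engine][query] is a list of ratings (hypothetical 0-3 scale)
    that experts gave to that engine's top results for the query.
    """
    return {
        engine: mean(mean(ratings) for ratings in per_query.values())
        for engine, per_query in judgments.items()
    }

# Hypothetical ratings: same corpus, same queries, two engines.
judgments = {
    "engine_a": {"q1": [3, 2, 1], "q2": [2, 2, 2]},
    "engine_b": {"q1": [1, 0, 2], "q2": [3, 0, 0]},
}

scores = mean_rating_per_engine(judgments)
```

The point is only that both systems get the same corpus and the same queries, so the scores are comparable.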
|
Without a good search engine you might have the results you needed plus lots of other results. You'd have to scroll to page 20 of your results to actually see the result that you wanted, which means it wasn't very precise.
Think of internet search engines pre-Google. With e.g. AltaVista you had great recall but extremely poor precision. You'd often be scrolling through multiple pages of results. Google turned that around by having great precision and similar recall. They made it so good that they implemented the "I'm Feeling Lucky" button.
The trick with search is to have great precision and still good enough recall. That's super hard because what counts as precise is very subjective and highly dependent on your use cases, data, languages, etc.
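For anyone who wants the two terms pinned down, a minimal sketch of precision and recall over a ranked result list follows. The document ids and the "relevant" set are invented for the example; in practice that set has to come from human judgments, which is where the subjectivity comes in.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for doc in top if doc in relevant) / len(top)

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)

# Hypothetical data: two engines return the same 10 documents in different order.
relevant = {"d3", "d7"}
noisy = ["d1", "d2", "d4", "d5", "d6", "d8", "d9", "d10", "d3", "d7"]
precise = ["d3", "d7", "d1", "d2", "d4", "d5", "d6", "d8", "d9", "d10"]

noisy_p2 = precision_at_k(noisy, relevant, 2)      # relevant docs buried at the end
precise_p2 = precision_at_k(precise, relevant, 2)  # relevant docs ranked first
noisy_r10 = recall_at_k(noisy, relevant, 10)
precise_r10 = recall_at_k(precise, relevant, 10)
```

Both lists have the same recall over ten results, but only the second one puts what you wanted at the top.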
This is why Elasticsearch is such a hugely complicated product: it includes a lot of solutions for essentially any use case you can imagine around search.
I have no experience with RediSearch, so I'll reserve my judgment. But this article is not doing it any favors.
There are competitors to Elasticsearch out there. Most of the serious ones also use Apache Lucene (e.g. Solr). Some of the upcoming ones are attempting to rebuild what Lucene does and may or may not be good enough depending on your use case. There have been some Lucene ports over the years, including a C port. Most of those have fallen behind or are no longer maintained. The Java implementation is actually pretty good as is and has had a lot of performance and optimization work done on it over the years. You'd be hard-pressed to build something as good and as fast without essentially using the same algorithms and reinventing a lot of the same wheels.
IMHO the current effort to build a search engine in Rust makes a lot of sense. The language is uniquely suited to doing the kinds of things Lucene does and they seem to be pretty serious about doing things properly.