Hacker News
by petulla 1778 days ago
no BERT?
3 comments

Neural search, in combination with learning algorithms and traditional keyword search, is clearly the future. The combination vastly outperforms traditional search engines.

At sajari.com we have been working on an experiment that uses a 1-CPU machine on Cloud Run to serve a neural-network-generated, hash-based index of an old BestBuy catalog (25k products). Retrieval uses an approximate nearest neighbour (ANN) lookup, which typically takes ~1 ms. Speed and relevancy are already pretty good.
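
To make the hash-based lookup idea concrete, here is a minimal sketch. It assumes each product has already been mapped to a short binary code by some model, and it ranks products by Hamming distance to the query's code. The product IDs, the 8-bit codes, and the brute-force scan are all made up for illustration; the comment above does not say how the real index is built or searched.

```python
import heapq
from typing import Dict, List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions where two binary codes differ."""
    return bin(a ^ b).count("1")


def nearest(query_code: int, index: Dict[str, int], k: int = 3) -> List[Tuple[str, int]]:
    """Return the k products whose codes are closest to the query code.

    Ties on distance are broken alphabetically by product ID.
    """
    scored = ((hamming(query_code, code), pid) for pid, code in index.items())
    return [(pid, dist) for dist, pid in heapq.nsmallest(k, scored)]


# Hypothetical 8-bit codes; a real system would use longer codes from a trained model.
index = {
    "tv-55in": 0b1011_0010,
    "tv-65in": 0b1011_0110,
    "laptop-13in": 0b0100_1101,
    "hdmi-cable": 0b1001_0011,
}

results = nearest(0b1011_0011, index, k=2)
print(results)
```

XOR plus a bit count is cheap enough that a plain scan over a small catalog stays fast on one CPU; larger indexes would typically bucket the codes instead of scanning them all.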

But we have also learned that there is no single silver bullet: we have seen the best results when combining neural search with traditional keyword search and reinforcement learning.
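
One common way to combine a keyword ranking with a neural ranking is reciprocal rank fusion (RRF). The sketch below is only an illustration of that general technique, not a description of what Sajari actually does, and it leaves out the reinforcement learning part entirely. The document IDs are invented, and `k=60` is the conventional default constant rather than a tuned value.

```python
from collections import defaultdict
from typing import Dict, List


def rrf(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["a", "b", "c"]  # hypothetical keyword-search ranking
neural_hits = ["b", "d", "a"]   # hypothetical neural-search ranking

fused = rrf([keyword_hits, neural_hits])
print(fused)
```

Documents that appear near the top of both lists rise above documents that only one retriever found, which is the basic intuition behind combining the two approaches.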

You can take a look at the demo here: http://neural-hashes.sajari.com

Be gentle, this is an experiment and not a production scale implementation.

This is easily the most fun thing I’ve been involved with for years. Can’t wait to see it ship.
I've scaled large transformer-based models that supplement a Lucene-based search engine. The architecture supports an ensemble approach where Lucene results are first-class and then we tailor similarity rankings with the models.

It looks a lot like this: https://huggingface.co/blog/bert-cpu-scaling-part-1

We have to store large "index" embeddings on SSDs and use LevelDB for value retrievals of the Lucene results.
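
A rough sketch of the retrieve-then-rerank pattern described above, under several assumptions: the candidate list stands in for Lucene results, a plain dict stands in for the LevelDB lookup of stored embeddings, and cosine similarity stands in for whatever the models actually compute. All names and vectors here are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank(query_vec: Sequence[float], candidates: List[str],
           store: Dict[str, Sequence[float]]) -> List[str]:
    """Reorder keyword-search candidates by embedding similarity to the query.

    Candidates with no stored embedding sink to the end, keeping their
    original relative order.
    """
    scored = []
    for pos, doc_id in enumerate(candidates):
        vec = store.get(doc_id)  # stand-in for a key-value store lookup
        sim = cosine(query_vec, vec) if vec is not None else float("-inf")
        scored.append((-sim, pos, doc_id))
    return [doc_id for _, _, doc_id in sorted(scored)]


store = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.7, 0.7]}  # toy embeddings
candidates = ["d2", "d3", "d1", "d4"]  # pretend keyword-search order; "d4" has no embedding

reranked = rerank([1.0, 0.2], candidates, store)
print(reranked)
```

Because only the keyword engine's candidates are looked up, the embedding store sees a handful of point reads per query rather than a full scan.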

Yep, I was surprised -- Google and others have long since moved to neural search, afaict, where we are seeing things like Faiss for embedding-based indexes, and all sorts of deployment pain around training + inference. I knew Elastic was still pre-neural, but hadn't realized the same was true of its replacements. So this article is about clustering for pre-neural search, and I guess enterprise search is still getting there.