by ObserverEffect 2468 days ago
Disclaimer: I work at Algolia (https://www.algolia.com/), a hosted search-as-a-service API.

While I agree that building a great, relevant search experience with a Lucene-based engine takes a lot of extra time and effort to get right, there are other, non-TF-IDF-based solutions that provide a much faster path to great relevance with far less effort (https://blog.algolia.com/inside-the-algolia-engine-part-1-in...), and it's possible to have semantic ranking without too much machine learning (https://blog.algolia.com/promote-search-results-query-rules/). That's not to discount the value of machine learning: we're finding that for specific use cases, ML can be a very valuable way to surface more pertinent content for individuals based on their profile, preferences, etc. (https://blog.algolia.com/personalization-announcement/).

This may be along the lines of what you mentioned about "commoditizing" complex traditional search workflows. I'd be curious to hear more about which use cases you think are trickiest without SS.

1 comment

Although I'm a big fan of Algolia search (because it's freakin' fast), I happen to know little to nothing about your search model other than what I have learned from Algolians chipping in right here on HN.

I used to be quite impressed with Lucene, even at version 1.0 (when a fuzzy search meant a full table scan), and watched with joy as it conquered the search market. Then I realized how much it struggled (and still does) with, well, I hate to say it (because I'm usually ridiculed when I bring this up), y'know, big(-ish) data. The popular proposed solution: sharding the data across a cluster of machines.
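To make "sharding" concrete, here is a minimal sketch of the general idea, not of how Lucene or any product built on it does it. The names `shard_for`, `build_shards` and `search` are made up for illustration, and the "shards" are plain dicts in one process rather than separate machines.

```python
import zlib

def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard from a stable hash of the document id."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

def build_shards(docs: dict, num_shards: int) -> list:
    """Distribute documents over num_shards in-memory 'shards' (dicts)."""
    shards = [dict() for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shards[shard_for(doc_id, num_shards)][doc_id] = text
    return shards

def search(shards: list, term: str) -> list:
    """Scatter the query to every shard, then gather and merge the hits."""
    hits = []
    for shard in shards:  # on a real cluster these lookups would run in parallel
        hits.extend(
            doc_id for doc_id, text in shard.items()
            if term.lower() in text.lower().split()
        )
    return sorted(hits)
```

Each shard only has to scan its own slice of the data, which is the whole appeal. The cost is that every query now fans out to all shards and the results have to be merged.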

Algolia seems to be a focused, streamlined and more efficient Elasticsearch, at least in the FTS use case.

I've worked almost exclusively in e-com for ~20 years. Algolia FTS+personalization seems to fit the e-commerce use case pretty darn well.

I wonder, regarding "Algolia Query Rules" (which also seems like a real killer feature for e-commerce):

>> automatically transforming a query word into its equivalent filter (“cheap” would become a filter price< 400...

How do you translate "cheap" into "price<400"? By maintaining a dictionary? Also, what if some people think 400 is quite expensive?
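To illustrate the two options behind that question, here is a rough sketch. It is not Algolia's Query Rules API or implementation, and every name in it (`STATIC_RULES`, `apply_rules`, `percentile_threshold`, `rules_for_catalog`) is hypothetical. The first part is the plain dictionary approach, with the 400 taken from the quoted example; the second derives "cheap" from the catalog's own price distribution, so the cutoff is relative instead of hard-coded.

```python
# Hypothetical rule table: query word -> (field, operator, threshold).
STATIC_RULES = {"cheap": ("price", "<", 400)}

def apply_rules(query, rules=STATIC_RULES):
    """Split a query into the remaining search words and structured filters."""
    words, filters = [], []
    for word in query.lower().split():
        if word in rules:
            filters.append(rules[word])   # word becomes a filter
        else:
            words.append(word)            # word stays in the text query
    return " ".join(words), filters

def percentile_threshold(prices, fraction=0.25):
    """Nearest-rank price at or below which about `fraction` of items fall."""
    if not prices:
        raise ValueError("prices must not be empty")
    ordered = sorted(prices)
    index = max(0, int(len(ordered) * fraction) - 1)
    return ordered[index]

def rules_for_catalog(prices, fraction=0.25):
    """Build a 'cheap' rule relative to this catalog (an assumed heuristic)."""
    return {"cheap": ("price", "<=", percentile_threshold(prices, fraction))}
```

For example, `apply_rules("cheap laptop")` yields the text query `"laptop"` plus a `price < 400` filter, while `rules_for_catalog` would set a different cutoff for a catalog of phones than for one of cars. Neither handles the per-person case (what is cheap to me may be expensive to you); that would need a threshold per user or segment.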

I want to build or implement a search engine that is inherently self-maintained in the same way you and I are self-maintained. As humans, however, we do have a serious flaw: in order to maintain an index of our knowledge, we need to sleep. To start with, I'd like to try to mimic that construct, then move past it.