| I previously worked at a company that used Solr to scale the business, and after a while it was no longer performant enough for us, so we wrote our own search engine. You are right that there are a lot of little “devil in the details” issues, but overall it was a fun experience. We needed it to support some specific machine learning workflows in the search ranking process, which we could not use if we first paid the high latency cost of getting preliminary results from Solr. So we took a “create your own index data structures” approach to the index data (both the normalized bag-of-words vectors and companion data like boolean filters), which let us heavily optimize the initial broad ranking query. Latency was low enough to leave time for calling the follow-on machine learning services. This was for a fairly high-traffic product search engine at an online retailer. It ended up working very well: over about two years we rolled all search traffic onto the in-house platform, even the parts that did not need the machine learning services, query latencies went down across all our traffic, and we retired the original Solr implementation. It wouldn’t be the right choice for everyone, but it strongly informs my opinion about whether building an in-house search engine specifically to replace Solr is worthwhile. I’d suspect a lot of medium-sized or large companies running Solr should seriously consider it. |
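
For readers wondering what a “create your own index data structures” approach might look like at toy scale, here is a minimal sketch. It is not the system described above: the class `TinyIndex`, the helper `normalize`, the filter names, and the sample products are all hypothetical, and cosine scoring over an inverted index is only an assumption about what a broad first-pass ranking could be. It stores L2-normalized bag-of-words weights in postings lists and keeps boolean filters as one integer bitmask per document, so filtering is a single bitwise check during scoring.

```python
import heapq
import math
from collections import Counter, defaultdict


def normalize(tokens):
    """Turn a token list into an L2-normalized bag-of-words vector (term -> weight)."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()} if norm else {}


class TinyIndex:
    """Hypothetical in-memory index: inverted postings plus per-document filter bitmasks."""

    def __init__(self, filter_names):
        # Each boolean filter gets one bit position.
        self.filter_bit = {name: 1 << i for i, name in enumerate(filter_names)}
        self.postings = defaultdict(list)  # term -> [(doc_id, weight), ...]
        self.doc_flags = {}                # doc_id -> int bitmask of boolean filters

    def add(self, doc_id, text, flags=()):
        for term, weight in normalize(text.lower().split()).items():
            self.postings[term].append((doc_id, weight))
        mask = 0
        for name in flags:
            mask |= self.filter_bit[name]
        self.doc_flags[doc_id] = mask

    def search(self, query, required=(), k=10):
        """Broad first pass: cosine score over documents that pass every required filter."""
        need = 0
        for name in required:
            need |= self.filter_bit[name]
        scores = defaultdict(float)
        for term, q_weight in normalize(query.lower().split()).items():
            for doc_id, d_weight in self.postings.get(term, ()):
                if (self.doc_flags[doc_id] & need) == need:
                    scores[doc_id] += q_weight * d_weight
        return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Made-up sample data, purely for illustration.
index = TinyIndex(["in_stock", "free_shipping"])
index.add(1, "red running shoes", flags=["in_stock"])
index.add(2, "red dress shoes", flags=["in_stock", "free_shipping"])
index.add(3, "blue running jacket", flags=["free_shipping"])

# Document 3 is dropped by the filter; document 1 outranks document 2.
candidates = index.search("red running shoes", required=["in_stock"], k=2)
```

In a setup like the one described, the short `candidates` list is what would be handed to the slower follow-on machine learning rankers. A production index would presumably use compact arrays and precomputed structures rather than Python dicts, but the split between a cheap broad pass and an expensive re-rank is the same idea.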