Hacker News new | ask | show | jobs
by nzadrozny 4515 days ago
I host and support websolr.com and bonsai.io and have seen a lot of search implementations.

The main thing for good stability and performance is to be very good at batching your updates. You don't want to sling a ton of highly-parallel single-document updates at Lucene, lest you thrash the JVM and start garbage collecting like crazy.

From there, on the query side, you'll want to get a good working knowledge of the different tokenization and analysis options. There are a lot of subtle and interesting combinations to be had in there that influence performance and relevance of your search results.

1 comments

Do you have a demo on either of those sites where I can input terms into a search box and look at results? What explanation do you give to users as to the options available when formatting the query?
We've got a free Heroku addon that's pretty easy to spin up and play with. Elasticsearch also has an analyze[1] API that can be helpful to play around with.

It's also possible to download and install ES locally and run any number of front-end interfaces, some of which include query builders. ElasticHQ seems like a decent option for that. The venerable Elasticsearch-head is another.

I think now that ES 1.0 has shipped, more experimental tools will start to emerge that help people learn and interact with ES itself. (If anyone out there is a front-end whiz and wants to help me build something like that, please email nz@bonsai.io!)

1. http://www.elasticsearch.org/guide/en/elasticsearch/referenc...