by SanFranManDan 3543 days ago
For me it was:

sudo apt-get install elasticsearch

pip install django-haystack

add haystack to INSTALLED_APPS, add the Elasticsearch backend and set the endpoint.

use a prebuilt haystack index class that hooks into all model change signals.
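
A minimal sketch of the settings those two steps describe, assuming a haystack 2.x release and a local Elasticsearch node on port 9200. The URL and index name are placeholders, the engine path differs for newer Elasticsearch versions, and in the haystack versions I know the signal hookup is a setting rather than something on the index class itself.

```python
# settings.py (fragment) -- values below are illustrative assumptions

INSTALLED_APPS = [
    # ... your existing apps ...
    "haystack",
]

HAYSTACK_CONNECTIONS = {
    "default": {
        # Backend for Elasticsearch 1.x; later versions use a different module path.
        "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
        "URL": "http://127.0.0.1:9200/",  # assumed local endpoint
        "INDEX_NAME": "haystack",         # placeholder index name
    },
}

# Re-index a model instance whenever it is saved or deleted.
HAYSTACK_SIGNAL_PROCESSOR = "haystack.signals.RealtimeSignalProcessor"
```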

done. No weird configurations, everything was vanilla.

Maybe I am missing a step, but it took me less than an hour to get everything going and it hasn't needed any maintenance since.

Once you add ngram support and all the indexes and views in Postgres needed to replicate the behavior, ES looks less complicated. At the very least, I don't see a reduction in complexity doing it the Postgres way, just that you have one less dependency to worry about.


With this solution, it looks like Django is keeping the Elasticsearch index in sync with the SQL database. What happens if the Django process crashes after having committed data to the SQL database but just before having updated the Elasticsearch index? How do you reconcile the index with the database after the fact?

Three options, depending on how demanding your users are:

1.) Don't. So what? That entry will never show up in search results, which is probably exactly what would've happened if you used a search engine with poor ranking, and exactly what will happen if you don't provide search at all.

2.) Blow away your index periodically and re-create it at off-peak times, or upon crash. Works as long as your data set is small enough to read it all off-peak.

3.) When you read your search results back, check them against the source of truth and re-index anything that's inconsistent. Relatively easy if there's a 1:1 correspondence between Elasticsearch documents and RDBMS rows; it gets more difficult with complex joins.
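
Option 3 could look roughly like the sketch below. It is not haystack code: `reconcile_hits`, the `version` field, and the dict-backed `database` and `index` are all hypothetical stand-ins, assuming each document maps to exactly one row and carries something comparable such as a version or modified timestamp.

```python
def reconcile_hits(hits, load_row, reindex, remove):
    """Check search hits against the source of truth and repair stale ones."""
    fresh = []
    for hit in hits:
        row = load_row(hit["id"])
        if row is None:
            remove(hit["id"])   # row was deleted; drop the stale document
            continue
        if row["version"] != hit["version"]:
            reindex(row)        # document is out of date; refresh it
        fresh.append(row)       # always return the database's copy
    return fresh


# Toy stand-ins for the SQL database and the search index.
database = {
    1: {"id": 1, "title": "new title", "version": 2},
    2: {"id": 2, "title": "unchanged", "version": 1},
}
index = {
    1: {"id": 1, "title": "old title", "version": 1},    # stale
    2: {"id": 2, "title": "unchanged", "version": 1},    # in sync
    3: {"id": 3, "title": "deleted row", "version": 1},  # orphaned
}

results = reconcile_hits(
    list(index.values()),                                  # pretend every document matched
    database.get,
    lambda row: index.__setitem__(row["id"], dict(row)),
    lambda doc_id: index.pop(doc_id, None),
)
```

Note that this only repairs documents that come back as hits. A row that never made it into the index (the crash case in the question) is never returned, so it still needs option 2 or a periodic catch-up job.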

From memory, haystack gives you a couple of manage.py commands for synchronisation (rebuild_index and update_index). So you'd run one of those when spinning things back up.

All the usual caveats about the inherent complexity of distributed systems apply, but it's still pretty convenient.