Static Sites with Elasticsearch | HN Mirror


	Static Sites with Elasticsearch (gizra.com)
	79 points by amitaibu 2245 days ago

3 comments

simonw 2245 days ago

I've been experimenting with SQLite FTS as a way of adding search to an otherwise static site.

The big advantage of SQLite FTS is that it's really cheap to run. The index is a single static file on disk, then you add a Python process (I'm using https://github.com/simonw/datasette ) to run queries against it. Much less resource intensive than running Solr or Elasticsearch.

It also works surprisingly well - I've run FTS queries against tables that are up to around 10GB on disk and performance is great.
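
A minimal sketch of what such a query could look like using Python's built-in `sqlite3` module. It assumes the underlying SQLite library was compiled with the FTS5 extension (common, but not guaranteed); the `posts` table and its sample rows are made up for illustration, and an in-memory database stands in for the static `.db` file.

```python
import sqlite3

# In-memory database for illustration; a real site would open a .db file on disk.
conn = sqlite3.connect(":memory:")

# FTS5 virtual table. This fails if SQLite was built without the FTS5 extension.
conn.execute("CREATE VIRTUAL TABLE posts USING fts5(title, body)")
conn.executemany(
    "INSERT INTO posts (title, body) VALUES (?, ?)",
    [
        ("Static sites", "Adding search to an otherwise static site"),
        ("Docker notes", "Bundling a database file inside a container"),
        ("Gardening", "Tomatoes need plenty of sun"),
    ],
)


def search(query):
    """Return the titles of matching posts, best match first."""
    rows = conn.execute(
        "SELECT title FROM posts WHERE posts MATCH ? ORDER BY rank",
        (query,),
    )
    return [title for (title,) in rows]


results = search("static")
```

Datasette wraps this kind of query in an HTTP API; the sketch only shows the SQLite side.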

It's nowhere near as featureful as Lucene, but for small to medium-sized projects it's easily good enough.

As for deployment: if the SQLite .db index file is small enough you can bundle it up as part of a static deployment, e.g. bundled in a Docker container. I've done this using Heroku, Google Cloud Run, https://fly.io/ and Zeit Now (aka Vercel).

If the content lives in a git repository you can hook up CI (or a GitHub Action) to build and publish a new copy of the SQLite index on every change.
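
The build step a CI job runs could be as small as the following sketch. It is not taken from any of the linked repositories: the `build_index` function, the `pages` table, and the assumption that content is stored as `.md` files are all hypothetical, and it again assumes SQLite has FTS5 available.

```python
import sqlite3
from pathlib import Path


def build_index(content_dir, db_path):
    """Rebuild the search index from every .md file under content_dir.

    Returns the number of files indexed.
    """
    content_root = Path(content_dir)
    db_file = Path(db_path)
    if db_file.exists():
        db_file.unlink()  # start clean so deleted pages drop out of search

    conn = sqlite3.connect(db_file)
    # path is stored but not searched; only body is full-text indexed.
    conn.execute("CREATE VIRTUAL TABLE pages USING fts5(path UNINDEXED, body)")
    count = 0
    for md in sorted(content_root.rglob("*.md")):
        conn.execute(
            "INSERT INTO pages (path, body) VALUES (?, ?)",
            (md.relative_to(content_root).as_posix(), md.read_text(encoding="utf-8")),
        )
        count += 1
    conn.commit()
    conn.close()
    return count
```

Rebuilding from scratch on every change keeps the script simple and guarantees the index never contains pages that were removed from the repository.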

I've started thinking of this pattern as a kind of static-dynamic site: there's dynamic server-side code but it's running in read-only containers, so you can scale it up by running more copies and if anything goes wrong you just restart the container.

https://til.simonwillison.net/ is my most recent site to use this pattern, see https://github.com/simonw/til for how it works.

I also wrote this tutorial describing the pattern a while ago: https://24ways.org/2018/fast-autocomplete-search-for-your-we...

tootie 2245 days ago

It's mentioned in this post but lunr.js is a clever option if you don't have much content. The idea is that your search page has to create a json object with all your content and lunr will build an index out of it client-side. This sounds like terrible architecture but you could stuff about 100 blog posts into an object the size of one big jpeg. For a few dozen articles it's pretty snappy.
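
As a rough sketch of the build side of this idea, a static site generator could emit one compact JSON file containing every post, which the search page then fetches and hands to lunr. The field names (`id`, `title`, `body`) and the `build_search_payload` helper are assumptions for illustration, not something lunr.js prescribes.

```python
import json


def build_search_payload(posts):
    """Serialize posts into a compact JSON string for a client-side search page."""
    docs = [
        {"id": post["url"], "title": post["title"], "body": post["body"]}
        for post in posts
    ]
    # Compact separators avoid spending bytes on whitespace.
    return json.dumps(docs, separators=(",", ":"), ensure_ascii=False)
```

Only the fields needed for searching and for rendering a result are kept, which keeps the download small.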

simonw 2245 days ago

Yeah that's a pretty useful trick.

Sphinx, the Python documentation engine, does something like that - e.g. https://datasette.readthedocs.io/en/stable/search.html?q=fts... which runs off this generated JavaScript index file: https://datasette.readthedocs.io/en/stable/searchindex.js

techntoke 2245 days ago

A few dozen articles? I've seen examples with Fuse.js and Hugo searching over 10,000 articles and it is fast. No server-side components required.

contravariant 2245 days ago

I think most people (me included) underestimate how ridiculously tiny raw text is compared to other common file types.

chopraaa 2245 days ago

Great way to botch user experience since the user has to download the index.

techntoke 2245 days ago

There are ways to optimize this per section, alphabetically, etc. Otherwise Xapian is very easy to set up and would be my go-to over Elasticsearch.

evandrofisico 2245 days ago

You can also do the lunr index creation server side, with a library in PHP or Python, which generates a considerably smaller file and also reduces processing time on the client side.

I've been developing a podcast hosting solution that parses the XML feed, generates static pages and prebuilds a lunr index. For ~100 posts, it takes around 30 seconds to build server side and less than a second to download and load on the client.
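
To illustrate why prebuilding helps, here is a toy inverted index built at generation time with only the Python standard library. It is deliberately not lunr's serialized index format (a real setup would use a lunr port to produce that); the `episodes` data and both function names are invented for the sketch. The point is that tokenizing happens once on the server, so the client only loads the finished mapping.

```python
import json
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase token to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc in documents:
        for token in re.findall(r"[a-z0-9]+", doc["text"].lower()):
            index[token].add(doc["id"])
    return {token: sorted(ids) for token, ids in index.items()}


def lookup(index, query):
    """Return the ids of documents that contain every token in the query."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    if not tokens:
        return []
    hits = set(index.get(tokens[0], []))
    for token in tokens[1:]:
        hits &= set(index.get(token, []))
    return sorted(hits)


# Hypothetical sample data standing in for parsed feed entries.
episodes = [
    {"id": 1, "text": "Static sites and search"},
    {"id": 2, "text": "Search with SQLite"},
    {"id": 3, "text": "Podcast feeds in XML"},
]

# This string is what would be written to disk and served as a static file.
index_json = json.dumps(build_inverted_index(episodes))
```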

tootie 2245 days ago

What's the file size of the index? Is it just encoded in JSON?

kevinastone 2245 days ago

Yes, it's encoded JSON for lunr.js. Here's the index for my static-site blog: https://blog.kevinastone.com/search_index.json. It's 305 KB uncompressed at the moment (38 KB gzipped).
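
To check how a given index compares, the raw and gzipped sizes can be measured with the standard library. This is a small sketch with a hypothetical `payload_sizes` helper; actual ratios depend entirely on the content.

```python
import gzip
import json


def payload_sizes(documents):
    """Return (raw_bytes, gzipped_bytes) for the JSON encoding of documents."""
    raw = json.dumps(documents, separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))
```

Most static hosts serve JSON with gzip or similar compression, so the second number is usually the one that matters for download time.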

arkadiyt 2245 days ago

Philosophically building website search on top of Elasticsearch seems fine, but don't call it a static site then - you're deploying a backend.

"Static site search" to me is something like adding a `<form action="https://duckduckgo.com" method="get">` text box to your site.

amitaibu 2245 days ago

@arkadiyt Thanks - I see your point. However, I believe you can also think about it as a service - similar to how Disqus can be added to your static site. That is, the site is static, but the results for the search are handled with a service. In this case the "service" - Elasticsearch - is tightly coupled to your static site's revision.

turnipla 2245 days ago

You’re confusing static with serverless.

Your content is indeed static but your search is not: It’s handled by a service that parses your requests and produces output.

Static content, on the other hand, is files being served straight from the filesystem.

As for Disqus, they’re not static either, they’re just “a service for static websites”

amitaibu 2245 days ago

@turnipla Indeed, Elasticsearch is a completely typical request-response service. However, the point of the post was to show how we could make sure the search is fully in sync with the static site, even if we, for example, rolled back deploys. That is, even if we roll back to a revision with less content than what we have in the "default" index, search will not show the extra content.
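
One way to get this property, sketched under assumptions and not necessarily how the post implements it, is to give each deployed revision its own index and have the site's search code query the index named after the revision it was built from. The naming scheme and both helpers below are hypothetical; only the `/<index>/_search` path is Elasticsearch's standard search endpoint.

```python
def index_name_for(site, revision):
    """Derive an index name from a site slug and a git revision.

    Elasticsearch index names must be lowercase, hence the lower().
    """
    return f"{site}-{revision[:12].lower()}"


def search_url(base_url, site, revision):
    """Build the search endpoint URL for the index tied to one revision."""
    return f"{base_url.rstrip('/')}/{index_name_for(site, revision)}/_search"
```

With this scheme, rolling the static files back to an older revision also points search at the older index, so results cannot reference pages that the rolled-back site does not have.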

juriansluiman 2245 days ago

Yes. I have a Google Custom Search engine which I dispatch via javascript. It just displays the results formatted from a json list. It is just a matter of preference, since DDG can't integrate search with an API as far as I know.

karterk 2245 days ago

If you're looking to do the same on something that's easier to run and manage than ES, consider using Typesense: https://github.com/typesense/typesense

The primary benefits are simplicity, typo tolerance and the ability to expose the search engine directly to the front end without having to put it behind an ELB as described in this post for Elasticsearch.

P.S: I work on this.

kvz 2244 days ago

Typesense looks nice for this, but I feel it could really use an officially supported browser integration so folks can onboard more easily: https://github.com/typesense/typesense/issues/85

karterk 2244 days ago

Agree 100%, very close to launching that (within 2 weeks!).

kvz 2244 days ago

Looking forward! <3