by mattb314 1335 days ago
Wonder if this has anything to do with the sliding window:

> Sonic only keeps the N most recently pushed results for a given word, in a sliding window way (the sliding window width can be configured)

The default window looks like 1k documents. I read this as saying that very common words are effectively truncated in the index (only the newest 1k out of many thousands of docs are retained), but I don’t know enough about the internals to be sure. I'm not sure whether this actually hurts search results in practice; it seems like an OK trade-off for help docs at least.
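
To make that reading concrete, here is a toy sketch of a per-word sliding window. This is not Sonic's actual code; the `WindowedIndex` class, its method names, and the tiny window of 3 are all made up for illustration, and it assumes the window simply evicts the oldest doc ID for a word once that word's list is full.

```python
from collections import defaultdict, deque


class WindowedIndex:
    """Toy inverted index that keeps only the N most recently pushed doc IDs per word.

    Hypothetical illustration only; not Sonic's implementation.
    """

    def __init__(self, window=1000):
        self.window = window
        # Each word maps to a bounded deque; appending past maxlen drops the oldest ID.
        self.postings = defaultdict(lambda: deque(maxlen=window))

    def push(self, doc_id, text):
        # Index each distinct word of the document once.
        for word in set(text.lower().split()):
            self.postings[word].append(doc_id)

    def query(self, word):
        # .get() avoids creating empty entries for unknown words.
        return list(self.postings.get(word.lower(), ()))


index = WindowedIndex(window=3)
for doc_id in range(10):
    index.push(doc_id, "the doc")   # "the" appears in every document
index.push(10, "the rare")

common_hits = index.query("the")    # only the 3 newest docs survive
rare_hits = index.query("rare")     # rare words are unaffected
```

Under these assumptions a common word only ever returns its newest few documents, while a rare word keeps everything, which matches the "fine for help docs" intuition.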


It's definitely a great trade-off to make for efficiency, but it makes it inherently unusable for most of Elasticsearch's use cases.

Take a practical example such as log search (almost everyone I know has used Kibana/Logstash/Elasticsearch at some point): you'd be able to search for things like tracingId/requestId, but adding more filters such as logLevel, requestType or serviceName would be impossible.

It has its niche, but calling it an Elasticsearch alternative really is a stretch.
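
A rough sketch of the log-search problem, assuming filters such as service name and log level would be indexed as ordinary terms under the same per-term window. The term format (`service:billing`), the helper names, and the window of 3 are hypothetical; this is not how Sonic or Elasticsearch actually model fields.

```python
from collections import defaultdict, deque

WINDOW = 3  # hypothetical, kept tiny so the effect is visible

# term -> the WINDOW most recently pushed log IDs containing that term
postings = defaultdict(lambda: deque(maxlen=WINDOW))


def push(doc_id, terms):
    for term in terms:
        postings[term].append(doc_id)


def search_all(*terms):
    """AND query: log IDs still retained for every one of the given terms."""
    sets = [set(postings.get(term, ())) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []


# Log 0 is the only billing error; later logs reuse both common terms separately.
logs = (
    [(0, ["req:0", "service:billing", "level:error"])]
    + [(i, [f"req:{i}", "service:billing", "level:info"]) for i in range(1, 5)]
    + [(i, [f"req:{i}", "service:api", "level:error"]) for i in range(5, 10)]
)
for doc_id, terms in logs:
    push(doc_id, terms)

by_request = search_all("req:0")                           # unique ID: still found
by_filters = search_all("service:billing", "level:error")  # log 0 matches, but was evicted
```

In this toy setup the unique request ID is still searchable, but the combination of two common filters comes back empty even though a matching log was indexed, because each term's window has already pushed it out.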

There's also the ability to weight fields when fetching results to boost relevancy, which I need for a lot of my use cases.
I wonder how easy it would be to change "most recently pushed" to something like a Redis sorted set, where each document has a score and only the top N results are retained when sorted by that separate score value? That would allow you to sort by pageviews / popularity in a more useful way. But the windowed approach still fails entirely when looking for uncommon intersections of common words, which feels like it makes it useless for most actual full-text search use cases :(
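
For what it's worth, the "keep the top N by score" idea can be sketched in a few lines with a bounded min-heap per word instead of an actual Redis sorted set. The `TopNIndex` class, the `score` argument, and the sample pageview numbers are all invented for illustration; nothing here reflects an existing Sonic feature.

```python
import heapq
from collections import defaultdict


class TopNIndex:
    """Toy index that retains, per word, only the N highest-scoring documents.

    Hypothetical sketch of the sorted-set idea; not part of Sonic.
    """

    def __init__(self, n=1000):
        self.n = n
        # word -> min-heap of (score, doc_id); heap[0] is the weakest retained entry
        self.heaps = defaultdict(list)

    def push(self, doc_id, text, score):
        for word in set(text.lower().split()):
            heap = self.heaps[word]
            if len(heap) < self.n:
                heapq.heappush(heap, (score, doc_id))
            elif (score, doc_id) > heap[0]:
                # Evict the lowest-scoring entry instead of the oldest one.
                heapq.heapreplace(heap, (score, doc_id))

    def query(self, word):
        # Highest score first.
        entries = sorted(self.heaps.get(word.lower(), []), reverse=True)
        return [doc_id for _, doc_id in entries]


index = TopNIndex(n=2)
index.push("guide", "the guide", score=500)          # e.g. pageviews
index.push("faq", "the faq", score=20)
index.push("changelog", "the changelog", score=90)

top_for_the = index.query("the")   # "faq" was evicted as the lowest-scoring doc
```

This only changes which documents survive for a common word. Intersections of two common words can still come back empty, since each word's retained set is truncated independently.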