SIREn – Enhanced Structured Data Search for Solr and Elasticsearch


	SIREn – Enhanced Structured Data Search for Solr and Elasticsearch (siren.solutions)
	44 points by rendel 4350 days ago


techtalsky 4350 days ago

My takeaways from the site for people who would like a summary:

SIREn encodes [...] the index using a completely different model than Lucene, [...] it uses its own disk format, low-level compression algorithm and query operator implementation. A comparison between the regular method (Blockjoin) and SIREn can be found in this blog post: http://siren.solutions/24-times-less-memory-11-times-faster-...

In SIREn, parent-child relationships of the nested elements are materialised and indexed into the same document. This is at the core of SIREn's very high performance and scalability. It also means that changes in the nested part of the document will require reindexing of the full document.
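
To make that trade-off concrete, here is a toy sketch in Python. It is not SIREn's actual encoding or on-disk format; the names `flatten` and `index_document` are made up for illustration. It only shows the general idea of storing every nested value, together with its position in the tree, under a single document entry, and why touching one nested value means rebuilding the whole entry.

```python
from typing import Any, Iterator


def flatten(node: Any, path: tuple = ()) -> Iterator[tuple]:
    """Yield (path, value) pairs for every leaf of a nested JSON-like value.

    The path records the parent-child chain, so the relationship between a
    nested object and its ancestors travels with the value itself.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, path + (key,))
    elif isinstance(node, list):
        for position, child in enumerate(node):
            yield from flatten(child, path + (position,))
    else:
        yield path, node


def index_document(doc_id: int, doc: dict) -> dict:
    """Build ONE index entry holding all nested values (hypothetical layout)."""
    return {"doc_id": doc_id, "entries": list(flatten(doc))}


order = {
    "customer": "acme",
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

entry_v1 = index_document(1, order)

# Changing a single nested value: there is no separate child document to
# update, so the whole entry for doc 1 is rebuilt from scratch.
order["items"][1]["qty"] = 5
entry_v2 = index_document(1, order)
```

By contrast, Lucene's block join indexes each nested object as its own document stored next to its parent, and the join between them is resolved at query time.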

SIREn is Free and Open Source under the same licensing model as MongoDB: http://www.mongodb.org/about/licensing/


rakoo 4350 days ago

It seems like SIREn is optimized for highly nested documents.

One of the best use cases for Elasticsearch is log analysis, where one log event has little to no nesting (at most a few tags). How does SIREn perform in these cases?


harishkm 4350 days ago

The standard Lucene indexing model will be faster for simple flat documents.

SIREn makes sense for any document with 1..* nested relations. The performance boost is proportional to the number of nested objects.

Further, SIREn is also truly schemaless, which means that the type of a property can differ from one document to another. That is likely to happen in complex scenarios.

PS: I work for siren.solutions.
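
As a rough illustration of what "the type can differ per document" means, here is a toy in-memory index in Python. This is only a sketch under my own assumptions, not how SIREn is implemented: the class `ToyIndex` and its methods are hypothetical, and it handles flat documents with hashable values only. The point is that the value's type is kept next to the value in the index key, so no per-field schema has to be fixed up front.

```python
from collections import defaultdict


class ToyIndex:
    """Toy inverted index keyed by (field, type name, value).

    Illustrative only: because the type is part of the key, the same field
    can hold an int in one document and a string in another.
    """

    def __init__(self) -> None:
        self.postings = defaultdict(set)

    def add(self, doc_id: int, doc: dict) -> None:
        # Flat documents with hashable values only, to keep the sketch short.
        for field, value in doc.items():
            self.postings[(field, type(value).__name__, value)].add(doc_id)

    def search(self, field: str, value) -> set:
        # .get avoids creating empty posting lists for misses.
        return self.postings.get((field, type(value).__name__, value), set())


index = ToyIndex()
index.add(1, {"price": 10})      # integer
index.add(2, {"price": "ten"})   # string, same field name
index.add(3, {"price": 10.0})    # float, same field name

ints = index.search("price", 10)
strings = index.search("price", "ten")
floats = index.search("price", 10.0)
```

Each of the three lookups matches only the document whose `price` has that exact type and value, and all three documents were accepted without declaring a type for `price`.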


PaulHoule 4350 days ago

I know Giovanni and some other SIREn people and this is a product that has amazing computer science and software engineering built in.


dholowiski 4350 days ago

Me too! (I'm a customer). These guys are brilliant, and the science behind what they do is (usually) way over my head.


collyw 4350 days ago

Is there any good resource that compares the various NoSQL databases and what they are best suited to (and what they are not)?

I am involved in a project at work where the other team is insisting on using NoSQL despite it looking like a relational problem to me. It is unlikely that the database will grow beyond a few terabytes (it should stay on one server).


jkbyc 4350 days ago

A comprehensive list of them is at http://nosql-database.org/ and a comparison matrix of Mongo, Couch, Dynamo, HBase, Cassandra, Accumulo, Redis, Riak, Neo4j is at http://www.infoivy.com/2013/07/nosql-database-comparison-cha...

There are so many of them that I doubt there is a good resource comparing them all. I think it's best to start by comparing the various models (document store vs column store vs key-value store vs graph db vs rdbms, ...) and then look for detailed comparisons of the implementations of the model you pick.

There is a lot of hype around many of them, but a lot has already been debunked too, esp. here.

For example, I personally would have picked PostgreSQL rather than Mongo in some cases in the past when Mongo was pushed on me by people whose understanding didn't go beyond the hype.

Maybe describe your problem in more detail here and perhaps you'll get useful feedback? Maybe Ask HN? Feel free to drop me a line.
