by krishna2
3253 days ago
That's a big claim, and kudos if you really pulled it off. Besides speed, there is also the aspect of relevance. I think the best way to showcase both is to build a few sample proof-of-concept search engines. For example, how about a search engine for Wikipedia? Project Gutenberg? StackOverflow? All these datasets are freely available. You can set up a search engine for each of them and let anyone verify your search engine's speed and relevance. Lastly, in addition to speed and relevance, there is the question of how easy it is to install, customize and extend. Hope that helps!
|
I'm confident the relevance is as good as or better than Lucene's. I especially like my phrase queries and how they seem more relevant than Lucene's phrase queries. The scoring is a half-way implementation of word2vec (in a lot of ways similar to the scoring mechanics of Lucene's tf-idf scheme). I'm aiming for a full word2vec implementation in vNext.
I have only my own benchmark tests to tell me I'm faster than Lucene, which is why I'm contemplating writing a formal proof both of ResinDB's performance and of its relevance.
My test data has been the English version of Wikipedia plus Project Gutenberg. I suppose I could publish those indices to the world as a demo search engine. I don't think a soul would care about a proper searchable Project Gutenberg though. Looking into Common Crawl now.
I'm a part-time father of two, employed doing tedious, unmotivating work, focusing completely on my spare-time project. I need some advice as to what the next step should be if I want to make this into a business that I can spend all of my time on, not only nights and weekends. Formal proof? Demo?
Side note: one of the most approachable people in the database building community is Oren Eini, creator of RavenDB. He's reviewing ResinDB on his blog. I've read a preview of the entire series of posts, implemented solutions for the best parts of the critique and just released v2. Blog is here: http://ayende.com/blog