Hacker News
by michael_dorfman 6200 days ago
There are some ACM/IEEE journals that have relevant papers, but you have to ask yourself: is reinventing the wheel what you really want to be doing? Given that there are lots of available COTS solutions, shouldn't you be focusing on things that are unique to your app?

(Needless to say, if the search engine needs are unique to your app, and a COTS solution isn't viable, you might want to bring in someone with relevant expertise.)

Spot on. OP: are you asking how basic tf-idf works, or is there something you can't get Lucene / Solr / Sphinx / tsearch to do easily?
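In case it is the former, here is a rough sketch of basic tf-idf in Python. This is only an illustration: the function name `tf_idf`, the toy documents, and the unsmoothed `log(N / df)` weighting are my own assumptions, and real engines such as Lucene or Sphinx use their own, more refined scoring formulas.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Term frequency is the raw count divided by
    document length; inverse document frequency is log(N / df), unsmoothed.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus (made up for illustration), tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
scores = tf_idf(docs)
```

A term that appears in every document (like "the" here) gets weight zero, while a term that appears in only one document (like "mat") scores highest within it. That is the whole idea: rank documents by terms that are frequent locally but rare globally.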

Nevertheless, here are some good background materials (search Amazon for "data mining"):

http://www.amazon.com/gp/product/1584504609

http://www.amazon.com/Data-Mining-Practical-Techniques-Manag...

Also, Collective Intelligence in Action by Satnam Alag is quite good (a lot of Java code to wade through, though).

To be honest, I hadn't even heard of tf-idf before you mentioned it. I am definitely not stepping beyond the bounds of something like Sphinx.

I basically want to lay a bit of foundation before I start mucking around with something I have no idea about.

I have a couple of e-books on data mining, but I didn't think they were applicable. Are data mining and search closely intertwined?