by alok-g 4780 days ago
Thanks. This helps. This was a great idea. I would like to know what your larger project is when you are ready.

A natural-language search engine (hence my earlier comments on NLP). I've been using it as a news reader for a while now. Should be able to open invites for the reader part in a month or so.

The crawler collects lots of metadata from content. The vocabulary engine is part of a planned down-the-road feature that will provide additional metadata for crawled content, as well as eventually help users find other users who write comments they want to read. (It has an "anti-social" aspect planned, where user interaction will be allowed on reader content, but the software will encourage users to form loose-knit groups of around 20.)

I think the number one problem of social networks right now is that they try to grow without restraint and force hundreds (or thousands) of people to all interact. But humans aren't wired like that; we don't do that well. What does seem to work well is LiveJournal-style communities, or the BB communities, where people get partitioned off into smaller groups by common interests, with other people crossing between interest groups.

I'm not super excited about the community stuff though. The NLP work has been a blast so far, and the interpreter I wrote seems to be working out well. I took a somewhat naive, very clean approach to NLP, and I think it'll support up to a few thousand different types of metadata (I can't even imagine that many yet) and at least as many different phrases. I have a little more work to do on the reader, mostly front-end, and after that I'll start working on providing direct access to the search engine behind the reader. (The reader isn't an RSS reader; it's a search engine interface that lets you use the results from searches as news feeds. For example, right now I have a "front page of HN" news feed. Reddit content is also being crawled; I just need to edit the parser to accept a query like "front page of HN and r/technology and r/startups".) Eventually I'll mess around with the community part.
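
To make the combined-feed idea concrete, here is a rough sketch of how a query like the one quoted above could be split into its sources. Everything in it is an assumption for illustration: the name `parse_feed_query`, the fixed "front page of" prefix, and the plain list it returns are hypothetical and say nothing about how the real parser works.

```python
import re


def parse_feed_query(query: str) -> list[str]:
    """Split a 'front page of A and B and C' query into source names.

    Hypothetical grammar: a fixed prefix followed by sources joined by 'and'.
    """
    prefix = "front page of "
    text = query.strip()
    if not text.lower().startswith(prefix):
        raise ValueError(f"unsupported query: {query!r}")
    rest = text[len(prefix):]
    # Split on the word "and" when it is surrounded by whitespace.
    sources = [s.strip() for s in re.split(r"\s+and\s+", rest) if s.strip()]
    if not sources:
        raise ValueError("query names no sources")
    return sources


if __name__ == "__main__":
    print(parse_feed_query("front page of HN and r/technology and r/startups"))
```

Each returned name would then be looked up as its own feed and the results merged, which is the part this sketch leaves out.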

So, for example, users with an account on the reader (and, later, the search engine) might see articles closer to their own reading level get a mild ranking boost.
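
One way such a mild boost could look, purely as a sketch: the function name `boosted_score`, the grade-level scale, the 10% cap, and the linear falloff are all assumptions made up for this example, not a description of the actual ranking.

```python
def boosted_score(
    base_score: float,
    article_level: float,
    user_level: float,
    max_boost: float = 0.1,   # assumed cap: at most a 10% boost
    tolerance: float = 4.0,   # assumed: no boost once levels differ by this much
) -> float:
    """Scale base_score up slightly when the article's reading level is near the user's."""
    gap = abs(article_level - user_level)
    # closeness is 1.0 for an exact match and falls linearly to 0.0 at `tolerance`.
    closeness = max(0.0, 1.0 - gap / tolerance)
    return base_score * (1.0 + max_boost * closeness)


if __name__ == "__main__":
    # Same base relevance, different distance from a user reading at level 8.
    for level in (8.0, 10.0, 14.0):
        print(level, boosted_score(1.0, level, 8.0))
```

Keeping the boost multiplicative and capped means reading level can only reorder results that were already close in relevance.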