Hacker News
by chx 2513 days ago
NowPublic did a tool like that in 2008 intended for the use of newsrooms, but we never got a single subscriber: every newsroom we showed immediately wanted to acquire the company :D Eventually one did. It was a relatively primitive affair, extracting information from the Twitter gardenhose with Stanford NLP and storing / retrieving with Xapian. There was nothing like that at the time as far as I am aware.
2 comments

Very cool! And yes, all the data is out there and is public, it's just a problem of filtering through the noise to get the signal. Because < 0.01% of topics discussed online are sustainable trends, it can take a bit of engineering work to do this all...

1. at a non-exorbitant cost
2. in less time than the cycle of information takes, otherwise it's useless by the time the trend is surfaced.

Yeah, as I said, this was primitive: Stanford NLP was used to extract keywords out of tweets, and then it was down to measuring the velocity of keywords.
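
For anyone wondering what "velocity of keywords" might look like in practice, here is a rough Python sketch. It is not the original NowPublic code. The regex tokenizer is only a crude stand-in for Stanford NLP keyword extraction, and the function names, the stopword list, the five-minute window, and the "current window count minus previous window count" definition of velocity are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list; a real system would use a proper one.
STOPWORDS = {"the", "a", "an", "in", "on", "of", "to", "is", "and", "at", "for"}


def extract_keywords(text):
    """Crude stand-in for an NLP keyword extractor: lowercase tokens minus stopwords."""
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]


def keyword_velocity(tweets, now, window=300):
    """Compare keyword mentions in the current window against the previous one.

    tweets: iterable of (timestamp_in_seconds, text) pairs.
    Returns {keyword: tweets mentioning it now - tweets mentioning it before}.
    """
    current, previous = Counter(), Counter()
    for ts, text in tweets:
        age = now - ts
        # set() so that a keyword counts once per tweet
        if 0 <= age < window:
            current.update(set(extract_keywords(text)))
        elif window <= age < 2 * window:
            previous.update(set(extract_keywords(text)))
    return {kw: current[kw] - previous[kw] for kw in current.keys() | previous.keys()}


def top_rising(tweets, now, window=300, n=5):
    """Return the n keywords with the highest velocity, ties broken alphabetically."""
    velocity = keyword_velocity(tweets, now, window)
    return sorted(velocity.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == "__main__":
    sample = [
        (100, "earthquake drill today"),
        (310, "earthquake downtown"),
        (320, "huge earthquake just now"),
        (400, "earthquake felt here"),
    ]
    print(top_rising(sample, now=600, window=300, n=3))
```

A plain difference of counts favors keywords that are already popular; dividing by the previous window's count, or smoothing over several windows, would be an equally reasonable choice.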
> every newsroom we showed immediately wanted to acquire the company :D

That's an amazing story. Where can I read more about this?

Ask away. Here's what the public knew at the time of Scan: https://mashable.com/2008/09/19/nowpublic-blog-scan/ and here's the acquisition: https://venturebeat.com/2009/09/01/examinercom-snaps-up-citi... I will try to answer, but if there's real interest, I haven't lost contact with the old team, so we could perhaps do a Reddit AMA. Maybe? It's not so interesting after so long, really. Well, now that I've Googled it, I remember we used MongoDB for initial storage. That was quite an early use for it and frankly ideal, because we just needed to store quickly, and if a few tweets got lost, big deal...

Here's a Scan example: https://web.archive.org/web/20081015004600/http://www.nowpub...