Hacker News
Show HN: News geolocation website
41 points by zack2018 2888 days ago
The idea:

The idea is to create a news aggregator that geolocates the news: it analyses each article and displays it on a street/city/region (a location in general) if that location is mentioned in the article. The website will later give you notifications about what is mentioned in the news nearby.

The current situation of the project:

For now I have a minimum viable product: the website is up and running and shows how the news will be displayed on a map.

What I am asking:

It would be really nice of you to give me feedback; any feedback, even negative, is more than appreciated. I know there are many bugs, a bad design and a lack of content, but the question I am asking is: would you use such a website/mobile app if it existed? Do you like the idea? Do you think it is worth finishing such a website?

Here is the link to the website:

http://www.toperudite.com/

Here is the link to the newsmap:

https://www.toperudite.com/pages/news/newsmap

Please don't hesitate to fill in the following survey (it takes less than 3 minutes):

https://docs.google.com/forms/d/e/1FAIpQLSe9q_0roqwtipe4KLyR...

Here is a link to a Slack channel: https://join.slack.com/t/toperuditebetatesters/shared_invite...

Here is a quick YouTube video: https://www.youtube.com/watch?v=CrwvY049ipE

Thank you very much for your time :)

24 comments

I had a startup that built almost exactly the same thing in 2009 (at the peak of 'hyperlocal' hype), so I have some advice. We ended up creating branded widgets for partners and attempted to monetize with advertising.

1. Your biggest problem is going to be acquiring audience; 'if you build it, they will come' is probably not going to play out unless you offer something above and beyond a typical aggregation web site.

2. Consider the utility of granular location-specific news. For many users what you will be showing is either very little content (because there isn't any) or content from a larger geographic region. At that point you have to ask who your audience is and what you provide to them over a bigger regional news source.

3. Keeping up with available sources will be a big challenge. Remember, if you're just aggregating from the big sources, you're not providing much; the appeal of an aggregator is that it gets and filters everything. If you have to go to another source as a user, there's no point in using this site.

4. The site looks pretty bare-bones. It could use some design help across the board.

5. The 'secret sauce' of a site like this is actually not aggregation and automation ... it's curation. If you read 100,000 RSS feeds (if and where you can even find them these days!) you still need to separate the wheat from the chaff. That's why Reddit works - it's effectively a user-curated aggregate site that leverages a scoring algorithm to (ostensibly) bring the best stuff to the top.

Edit: Forgot one of the biggest challenges. I don't know how you're doing your address extraction, but one of the toughest things we ran into was false positives: articles that reference a location which we geocode but that really weren't about that location. So you'll analyze a whole story (which can get very computationally expensive) and find 4 locations:

1600 Pennsylvania Avenue
Washington, D.C.
Washington
Zimbabwe

Now this story is about the Zimbabwe ambassador having a meeting with the president. Two of those addresses are where the story is, two probably aren't (Washington would probably not be geocoded to D.C.). So you file the story in:

Washington, D.C.
Washington, USA
Zimbabwe

Now you've just got noise. NLP can help but it's not enough - you will mis-geolocate this story. When that happens a lot the results are not reliable and, again, your potential audience does not see the utility.
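
A rough sketch of how that noise arises, assuming the simplest possible matcher: look up every name from a gazetteer in the story text. The tiny gazetteer, the sample story and the function name `naive_locations` are all made up for illustration; no real geocoder is being reproduced here.

```python
# Toy gazetteer: place name -> what a naive geocoder might resolve it to.
# All entries are illustrative assumptions, not real geocoder output.
GAZETTEER = {
    "1600 Pennsylvania Avenue": "Washington, D.C.",
    "Washington, D.C.": "Washington, D.C.",
    "Washington": "Washington, USA",  # ambiguous: state, city or person
    "Zimbabwe": "Zimbabwe",
}


def naive_locations(text):
    """Return every gazetteer name found in text, longest names first."""
    found = []
    remaining = text
    for name in sorted(GAZETTEER, key=len, reverse=True):
        if name in remaining:
            found.append(name)
            # Blank out the match so "Washington" is not counted again
            # inside "Washington, D.C.".
            remaining = remaining.replace(name, " " * len(name))
    return found


story = (
    "The ambassador of Zimbabwe met the president at "
    "1600 Pennsylvania Avenue in Washington, D.C. on Tuesday. "
    "Washington has not commented on the talks."
)

mentions = naive_locations(story)
# The story ends up filed under every resolved place, relevant or not.
filed_under = sorted({GAZETTEER[name] for name in mentions})
print(filed_under)
```

The matcher has no notion of what the story is about, so the metonymic "Washington" gets its own pin next to the places that actually matter.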

Anyway, best of luck.

As somebody coming from a geocoding background: parsing places from full text will be a huge challenge. There's hardly a word that isn't also a location. (There's a province in Turkey called Batman.)
Indeed, it's one of the most difficult challenges I am facing. One solution would be to give users the power to discard bad parses. But first I have to make sure there are people who might be interested in using the solution; that's the phase I am in right now (trying to validate the concept). Thanks for the feedback :)
There is a town in Austria called Fucking: https://www.youtube.com/watch?v=bbNzPnvkn-A
haha, true, indeed
> built almost exactly the same thing in 2009 (at the peak of 'hyperlocal' hype)

In 2009, there were around 25 million smartphone users. We will close out this decade close to 3 billion. That is half the world with a geolocation-based device in their pocket which grabs their attention for hours each day.

The peak may have been Pokemon Go, or maybe we aren't even there yet.

We probably aren't there yet in practice, but the hype far preceded things like Pokemon Go. Foursquare really kicked it off.
true
True, there are many things to do to improve the quality of services/products using the device's geolocation, not only in the gaming business.
Thanks a lot for all the feedback. I agree with you: the more I work on the project, the more I realize how difficult it is to make everything fully automated without hurting the quality of the geolocation (NLP is not 100% reliable) and the quality of the sources. I am still working on that.
Many things don't happen at a point; they happen in an area. E.g. if something is relevant to France as a whole, don't show it as happening in Paris; use something to indicate a bounding box or radius which the story is centred on.

If something happens in Paris, show it in Paris, not in some random particular location in the middle of Paris, unless you know that it really happened at that exact location.

See this: https://splinternews.com/how-an-internet-mapping-glitch-turn... for the kind of issue you can cause.
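
One way to picture the "area, not point" suggestion is to store each story with a centre and a radius and test whether a reader's position falls inside it. This is only a sketch: the `StoryArea` class, the `covers` helper and the radii (500 km for France, 10 km for Paris) are rough assumptions chosen for the example, not values from the site.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass
class StoryArea:
    label: str
    lat: float        # centre latitude in degrees
    lon: float        # centre longitude in degrees
    radius_km: float  # how far the story's relevance extends


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def covers(area, lat, lon):
    """True if the point lies within the story's radius."""
    return haversine_km(area.lat, area.lon, lat, lon) <= area.radius_km


# Radii are rough guesses for illustration only.
france = StoryArea("France", 46.6, 2.4, 500.0)
paris = StoryArea("Paris", 48.8566, 2.3522, 10.0)

lyon = (45.76, 4.84)
print(covers(france, *lyon))  # a France-wide story reaches a reader in Lyon
print(covers(paris, *lyon))   # a Paris-only story does not
```

A bounding box would work just as well; the point is that the stored geometry says how precise the location really is, so a country-level story never ends up as a pin on one arbitrary spot.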

I see your point, it's on my list of bugs to fix. For now I am still in an idea-validation phase where I want to validate the concept before committing more time to it, but I agree, this should definitely be fixed first :)
I built something like this a few years ago. We ultimately did not really succeed, but I think this could still potentially be very interesting.

Location is interesting when you combine it with time. News archives contain a lot of valuable content that would be interesting in the context of a location. For example, I live in Berlin and when we started digging around in newspaper archives, we found all these gems about David Bowie visiting certain bars, being on certain streets, etc. This is interesting to people in that area years after the fact, but not necessarily to people outside that area. Just having a historical view of a place via the things that people published about it is interesting.

Our problem at the time was coming up with enough of an MVP to convince users and investors. One thing we explored was using NLP to extract clues about location references from the text. This is surprisingly hard but not impossible. People use a lot of ambiguous language to refer to locations, but taken together you can sometimes deduce correctly that people are referring to a street in Prenzlauer Berg (a neighbourhood) in Berlin (the capital of Germany, not the village near Bremen). This is of course flaky. The good news is some content is actually geotagged, which makes this easier. However, we found a lot of low-quality geotagging as well.
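
The "taken together" idea can be sketched as picking, among several places with the same name, the one whose surrounding context also shows up in the article. Everything below is hypothetical: the `CANDIDATES` table, its context words and the `resolve` function are invented for the example and are not the approach described above, which is not spelled out in any detail.

```python
import re

# Hypothetical candidate table: each place sharing a name comes with a few
# context words (parent regions, neighbourhoods) that hint at it.
CANDIDATES = {
    "Berlin": [
        {"id": "berlin-capital", "context": {"germany", "prenzlauer", "kreuzberg"}},
        {"id": "berlin-village", "context": {"bremen"}},
    ],
}


def resolve(name, text):
    """Pick the candidate with the most context words in text, or None."""
    tokens = set(re.findall(r"[a-zäöüß]+", text.lower()))
    scored = [(len(c["context"] & tokens), c["id"])
              for c in CANDIDATES.get(name, [])]
    scored.sort(reverse=True)
    if not scored or scored[0][0] == 0:
        return None  # no supporting evidence at all
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None  # tie: better to stay silent than to guess
    return scored[0][1]


print(resolve("Berlin", "Bowie drank at a bar in Prenzlauer Berg, Berlin."))
print(resolve("Berlin", "The road from Bremen to Berlin is closed."))
print(resolve("Berlin", "A fire broke out in Berlin last night."))
```

Returning `None` when the evidence is missing or tied is a deliberate choice in this sketch: an unplaced story is less harmful than a confidently misplaced one.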

It is definitely something I will explore if I continue the project, combining time and location. I agree it is difficult to find locations for text but nowadays the NLP algorithms are more powerful than ever before so it is feasible :)
The problem is that references to locations in text are ambiguous. There are many places called Paris (most of them outside France), many streets called Main Street, etc. Also, lots of articles mention several locations. Then there are lots of informal names for neighborhoods, people being a bit loose with boundaries, etc. You can usually guess the city, but getting from there to e.g. street level or neighborhood level is a lot harder. Anyway, good luck.
I think it's an interesting idea, but I'm rarely interested in news about a specific location (except when a particular incident / tragedy has happened). For local news, I tend to go read my local newspaper websites, and Google News already aggregates local articles anyway for me in their Local News section.

However - I am looking for a Google News replacement, as it keeps showing me gossip, entertainment and snarky news, no matter how many times I click "fewer stories like this". They used to have a feature where you could block news providers (eg never show me news from TMZ), and also choose my preferred news sources (eg I could choose The Verge & Bloomberg) and it would rank their version of the story highest. That was very useful while it worked. You could also add topics that you were interested in, so if I added "Nine Inch Nails, Trent Reznor", it would rank news stories about Nine Inch Nails more highly so that I would see them. (I would really like to be able to blacklist topics/keywords too.)

If somebody made a news aggregator like this, I'd love to switch away from Google News.
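
The wish list above (blocked providers, preferred sources, boosted topics) boils down to a filter plus a sort. Here is a minimal sketch of that, with a made-up `rank` function and made-up sample headlines; it says nothing about how Google News implemented these features.

```python
def rank(stories, blocked=(), preferred=(), topics=()):
    """Drop blocked sources, then sort by topic hits and preferred source."""
    visible = [s for s in stories if s["source"] not in blocked]

    def score(story):
        title = story["title"].lower()
        topic_hits = sum(t.lower() in title for t in topics)
        # Tuples compare element by element: topic hits first,
        # preferred source as the tie-breaker.
        return (topic_hits, story["source"] in preferred)

    return sorted(visible, key=score, reverse=True)


# Sample data invented for the example.
stories = [
    {"source": "TMZ", "title": "Celebrity gossip roundup"},
    {"source": "Reuters", "title": "Markets close higher"},
    {"source": "Bloomberg", "title": "Markets close higher on tech rally"},
    {"source": "Example Music Blog", "title": "Nine Inch Nails announce tour"},
]

feed = rank(
    stories,
    blocked={"TMZ"},
    preferred={"The Verge", "Bloomberg"},
    topics=["Nine Inch Nails", "Trent Reznor"],
)
for story in feed:
    print(story["source"], "|", story["title"])
```

A keyword blacklist would be one more filter on `visible`, applied to the title instead of the source.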

Hey, I'm a little late but I've been working on something that you might find interesting. I've been using RSS readers for quite some time but wanted an app that would combine RSS feeds and Google News-like news aggregation. So I built https://aktu.io. It's a mix between an RSS reader and a news aggregator, so you can think of it as an all-in-one Google Reader + Google News web app. It's still an early version and doesn't have all the features of Google News (yet), but I would love to have your feedback!
Thanks for the feedback, those are all good suggestions, much appreciated :)
Very cool idea. I haven't seen it mentioned in here, but the GDELT project is something you might be able to use:

https://www.gdeltproject.org/

I didn't know about the GDELT project; this could be very useful for my project, much appreciated :)
Nice work. I also started a very very similar news aggregation service a few months ago, which extracts locations from news articles and shows the headlines on a map. http://mapflare.com

As other comments already mentioned, there are many false positive locations, such as names of persons, organizations, chemical elements or even normal verbs and nouns. There exist for example places named "Robin Hood".

It took me some time to realize that I should not limit the text extraction to locations, but also focus on recognizing other entities (persons, organizations, ...) in order to filter out the ambiguous names.
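
As an illustration of that filtering step, here is a small sketch that assumes some earlier NER stage has already produced `(text, label)` pairs. The labels, the hand-written sample entities and the `filter_locations` function are assumptions for the example, not the output or API of any particular library or of mapflare.

```python
def filter_locations(entities):
    """Keep location names that were never also tagged as a person or org.

    entities: list of (surface_text, label) pairs from some NER step.
    """
    non_places = {text for text, label in entities
                  if label in {"PERSON", "ORG"}}
    kept = []
    for text, label in entities:
        # Drop names that are ambiguous within this article, and duplicates.
        if label == "LOC" and text not in non_places and text not in kept:
            kept.append(text)
    return kept


# Hand-written stand-in for NER output on a single article.
entities = [
    ("Robin Hood", "PERSON"),
    ("Nottingham", "LOC"),
    ("Robin Hood", "LOC"),
    ("Nottingham", "LOC"),
]
print(filter_locations(entities))
```

Here "Robin Hood" is dropped as a place because the same article also uses it as a person, while "Nottingham" survives.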

The idea is a very good one. I thought about it just the other day. The news in Sweden right now is overwhelmed with reports about forest fires, but they are very hard to follow as most of the fires are in "obscure" locations. People who are concerned about fires in familiar neighbourhoods need to search for the locations to see if they are close to the places they care about, etc. I think there is much promise in the idea of providing a map interface to news of this kind. The regular media is in any case not doing a great job of it.

But there are also many challenges. I have noticed in the past few days that the accuracy of locations in the news is sorely lacking. Location names are not unique, so an automatic approach is likely to fail.

I am yet another person who built something like this some years ago (2006, I think).

The idea is OK, but you're missing something important. People don't care about local news, they care about relevant news. Some news is relevant to me even though it's happening on the other side of the world. Some news is only relevant to me because it affects my next door neighbour (who knew he was into that, huh?).

You're going to need to spend a lot of time fixing up the design as it is right now, and audience building is going to be so, so hard in this climate.

Good luck, but you've a way to go.

Thanks for the feedback, it's much appreciated :)
It doesn't really seem to work. The newsmap doesn't have any content.

Also it's a little strange that you're only able to search for streets, not general areas or cities.

Thanks for the feedback, I will add cities/general areas, it's necessary indeed. And sorry for the inconvenience, there was a problem with the database; everything is back to normal now.
I think small visual changes in text size and formatting can make a big difference. Like just making the titles of articles bold, and putting them above their sources.

There is a bug for me where if I switch the language to English, then set the location to English, the "hours ago" section still shows French.

This is a really cool project! Hope you post more updates on HN as you go.

Thank you for the feedback, much appreciated :) :)
Great start.

I had a similar MVP in 2010 with the intent of building a curated news aggregator for sales territories. Once you define your geographic area (zipcodes/states/drawn on a map, etc.) and choose keywords and/or topics, your news feed will include any local or AP story that includes these keywords and is geotagged within your territory.
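
The territory feed described above is essentially a geographic test combined with a keyword test. A minimal sketch, assuming stories already carry a state and zipcode from geotagging; the `Territory` class, the `in_feed` function and the sample stories are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Territory:
    states: set = field(default_factory=set)
    zipcodes: set = field(default_factory=set)
    keywords: set = field(default_factory=set)


def in_feed(story, territory):
    """True if the story is geotagged inside the territory and on topic."""
    inside = (story["state"] in territory.states
              or story["zipcode"] in territory.zipcodes)
    headline = story["headline"].lower()
    on_topic = any(k.lower() in headline for k in territory.keywords)
    return inside and on_topic


# Sample territory and stories invented for the example.
territory = Territory(states={"OH"}, zipcodes={"15213"},
                      keywords={"hospital", "clinic"})
stories = [
    {"headline": "New hospital wing opens", "state": "OH", "zipcode": "43004"},
    {"headline": "Clinic expands hours", "state": "PA", "zipcode": "15213"},
    {"headline": "Hospital merger announced", "state": "TX", "zipcode": "75001"},
    {"headline": "High school wins title", "state": "OH", "zipcode": "43004"},
]
feed = [s["headline"] for s in stories if in_feed(s, territory)]
print(feed)
```

A hand-drawn territory would swap the state/zipcode lookup for a point-in-polygon test, but the overall shape of the filter stays the same.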

Very similar idea to what I was developing some time ago:

http://www.theheretimes.com

about: http://www.theheretimes.com/about

very similar indeed
A lot of news sources already devote a large part of their coverage to local news. I don't see this providing much value on top of that. I think your engineering work here is great, but business-wise I think you should pivot.
Thanks for the feedback, much appreciated :)
> would you use such a website/mobile app if it existed?

Geolocated "news" is used in larger Brazilian cities, but it's about stray bullets and robberies (e.g. Fogo Cruzado, Onde Tem Tiroteio, Onde Fui Roubado). So there are some emergency use cases, but I'm unable to think of another use case where I might want to see the news mapped as the core idea.

https://www.reuters.com/article/us-brazil-security-app/brazi...

Thanks for the feedback, it can be very useful in case of an emergency indeed.
Related: http://eventregistry.org/ (B2B SaaS)
True, thanks :)
The idea is interesting. B2B: I think that historical trends, especially on a seasonal basis might be quite useful for news and travel agencies, real estate offices, and general economic analysis.
Thanks for the feedback, those are very good ideas, much appreciated :)
I like the idea. It would be great if a user could help the site improve its results for that user by, say, removing news about local hockey games that I don't and never will be interested in.
True, personalising the news according to the user's profile is on my list of things to do :)
I like this idea a lot, but I'm not able to check it out due to 502 errors. Would like to see it when it's a bit more built out. What are you using for address extraction?
I use machine learning for address extraction. Very sorry for the inconvenience, the website is back online.
The address extraction piece would make an awesome API that you could charge for if it's effective. I imagine a lot of aggregators would pay for something like that.
True, thanks for the feedback :)
I don't see any content, is the backend dead?

https://imgur.com/a/BWpvBdOq

I don't see any content either, but just wanted to point out that your link gives me a 404.
Meh, probably I didn't save it.
Very sorry for the inconvenience, the website is back online
I'm getting a 502.5 process failure. HN death hug?
Very sorry for the inconvenience, the website is back online
Hey Zack, Can we talk a little further about this offline? You can reach me via email at Firegarden.

- Rob

> England

Union Jack Flag

Site's broken for me
Very sorry for the inconvenience, the website is back online
Rather than news aggregation on a map/location, I'd rather have news aggregation per event across countries/regions/etc.

For example, the Trump-Putin meeting is an event. I'd like news aggregation on this event covering what every corporate and independent news publication across the world is saying. Currently, I am only able to get US/British viewpoints or angles, and I have to work ridiculously hard (Google search is terrible now) to find reporting from other nations/viewpoints. I was curious how the Chinese, Arabs, Israelis, Indians, South Americans, etc. were reporting on the story because, having been to Europe and Asia, I know that news is not the same everywhere. One good thing about traveling.

Google News used to do something like this until they were forced to limit it to corporate news and regions.

That was the original idea. It seemed too complicated to do at the beginning, but now, after working on the idea for some months, I think it's feasible. I will add this feature to my future surveys to see who might be interested. Thanks for the feedback, much appreciated :)