by querious 3154 days ago
What I'm psyched about is OpenStreetMap data being queryable with Athena. It's traditionally kind of a pain to convert PBFs to a queryable format.
3 comments

Have you looked at Overpass API?

(it provides direct access to OSM data using a DSL: http://wiki.openstreetmap.org/wiki/Overpass_API/Overpass_QL )

For small-scale use the public servers are sufficient, and quite a few people seem to be running private servers.

In case you missed it, we just added support for 2D Geospatial Queries in Amazon Athena:

https://docs.aws.amazon.com/athena/latest/ug/geospatial-exam...

PS: I'm on the Athena team

Out of pure curiosity, how so? I deal with Protobuf regularly, and as long as a decent library exists to dump it to JSON in a shape that fits your use case, it is trivial. Is that the only thing missing here?

For starters, the OSM PBF file format is not a single protobuf file! Instead, it's a collection of protobuf messages nested inside each other!

You can read more in the file format description: https://wiki.openstreetmap.org/wiki/PBF_Format
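
To make the nesting concrete, here is a rough sketch of walking the outer framing only, based on my reading of that wiki page: each chunk is a 4-byte big-endian length, a serialized BlobHeader of that length (field 1 = type, field 3 = datasize), then a Blob of datasize bytes. The function names are mine, the varint parsing is hand-rolled just to avoid a protobuf dependency, and the Blob payload (which may be zlib-compressed and holds yet another protobuf message) is left undecoded.

```python
import io
import struct


def read_varint(buf, pos):
    """Decode one protobuf varint from buf starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_blob_header(buf):
    """Return (type, datasize) from a serialized BlobHeader message."""
    pos = 0
    blob_type = None
    datasize = None
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire_type = key >> 3, key & 0x7
        if wire_type == 0:  # varint field
            value, pos = read_varint(buf, pos)
            if field == 3:
                datasize = value
        elif wire_type == 2:  # length-delimited field
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
            if field == 1:
                blob_type = value.decode("utf-8")
        else:
            raise ValueError(f"unexpected wire type {wire_type}")
    return blob_type, datasize


def iter_blobs(stream):
    """Yield (type, raw_blob_bytes) for each chunk in a PBF-style stream."""
    while True:
        prefix = stream.read(4)
        if not prefix:
            return  # clean end of file
        if len(prefix) < 4:
            raise ValueError("truncated length prefix")
        (header_len,) = struct.unpack(">I", prefix)
        blob_type, datasize = parse_blob_header(stream.read(header_len))
        # The Blob is itself a protobuf message; decoding it is out of scope here.
        yield blob_type, stream.read(datasize)
```

On a real extract you would open the file in binary mode and pass it to iter_blobs; the first chunk should have type OSMHeader and the rest OSMData.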

There are other problems, specific to OSM and not PBF/protobuf, like needing to store the locations of nodes until the end of the file, because ways refer to nodes by ID and those references could appear anywhere in the file.
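
A toy illustration of that node-location problem, assuming elements arrive as hypothetical (kind, id, payload) tuples rather than real PBF blocks: every node location has to be kept around, and ways can only be turned into coordinates once the whole input has been seen.

```python
def resolve_way_geometries(elements):
    """Turn ways (lists of node IDs) into lists of (lat, lon) pairs.

    `elements` is an iterable of hypothetical tuples:
      ("node", node_id, (lat, lon)) or ("way", way_id, [node_id, ...]).
    """
    node_locations = {}  # grows with every node in the input
    pending_ways = []    # ways are deferred until all nodes are known

    for kind, ident, payload in elements:
        if kind == "node":
            node_locations[ident] = payload
        elif kind == "way":
            pending_ways.append((ident, payload))

    # Only now is it safe to look up node references.
    # References to nodes missing from the input are skipped.
    return {
        way_id: [node_locations[ref] for ref in refs if ref in node_locations]
        for way_id, refs in pending_ways
    }
```

For a planet-sized file the plain dict is the part that hurts, which is why real tools tend to use compact arrays or on-disk indexes for this cache.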

Global OSM is 40 GB or so. There are various libraries to translate it, but as you can imagine, the sheer size of the dataset causes challenges. You also have to make choices about how you translate the attributes: for example, whether you want to pull certain tags from the key:value field into separate columns in a table. Yet another issue involves source and target geometries. There can be inconsistencies in how features of the same type are recorded in OSM in terms of geometry, so getting disparate input types translated into a single output type involves choices. Yes, you can easily (after a wait!) get global OSM translated into something else, but making that something else exactly what you need can take effort.
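
As a small sketch of the tag-to-column choice, assuming tags arrive as a plain dict: the promoted keys below (name, highway, amenity) and the other_tags column name are purely illustrative, and picking that list is exactly the decision described above.

```python
def tags_to_row(tags, promoted=("name", "highway", "amenity")):
    """Split an OSM tag dict into fixed columns plus a catch-all.

    `promoted` is an illustrative choice of keys to turn into columns;
    everything else is kept in a single "other_tags" mapping.
    """
    row = {key: tags.get(key) for key in promoted}
    row["other_tags"] = {k: v for k, v in tags.items() if k not in promoted}
    return row
```

Calling it on {"name": "Main St", "highway": "residential", "surface": "asphalt"} gives a row with name and highway filled in, amenity set to None, and surface left in other_tags.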