HN Mirror



	by TuringTest 1479 days ago
	> OSM stores maps as graphs, in flat files where each line is either a node, an ordered list of nodes, or metadata. The graph nodes can be arbitrarily ordered in OSM files, which leads to computational complexity when parsing them. This is not a bad thing, since it means that the spec for OSM files can be extremely simple, which makes it easy for people to contribute to OSM.

	That's actually a sensible design. Treat user-facing stored data as user interface. If you need efficient processing of that data, such as fast parsing, you can always build it elsewhere, such as by caching that data into an intermediate structure that is recompiled whenever the user data changes.
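
A minimal sketch of the caching idea described in the comment above, assuming the source is an OSM XML file with `<node>` and `<way>` elements. The names `parse_osm` and `load_osm`, the pickle cache format, and the `.cache.pickle` suffix are hypothetical choices for illustration, not part of any OSM tooling. The cache is rebuilt whenever the source file's modification time or size changes.

```python
import os
import pickle
import xml.etree.ElementTree as ET


def parse_osm(path):
    """Slow path: stream the XML and build plain dicts of nodes and ways."""
    nodes, ways = {}, {}
    for _, elem in ET.iterparse(path, events=("end",)):
        if elem.tag == "node":
            nodes[int(elem.get("id"))] = (float(elem.get("lat")), float(elem.get("lon")))
            elem.clear()  # release the element's contents once it is recorded
        elif elem.tag == "way":
            # A way is an ordered list of node references.
            ways[int(elem.get("id"))] = [int(nd.get("ref")) for nd in elem.findall("nd")]
            elem.clear()
    return {"nodes": nodes, "ways": ways}


def load_osm(path, cache_path=None):
    """Return the parsed graph, reusing a cache while the source is unchanged."""
    cache_path = cache_path or path + ".cache.pickle"  # hypothetical naming
    stat = os.stat(path)
    stamp = (stat.st_mtime_ns, stat.st_size)
    try:
        with open(cache_path, "rb") as f:
            cached = pickle.load(f)
        if cached["stamp"] == stamp:
            return cached["data"]  # fast path: source file has not changed
    except (OSError, pickle.UnpicklingError, EOFError, KeyError, TypeError):
        pass  # missing or unreadable cache: fall through and rebuild
    data = parse_osm(path)
    with open(cache_path, "wb") as f:
        pickle.dump({"stamp": stamp, "data": data}, f)
    return data
```

Calling `load_osm("map.osm")` twice parses the XML only the first time. This only pays off when the same file is read repeatedly, which is the trade-off debated in the replies below.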

3 comments

xg15 1478 days ago

> Treat user-facing stored data as user interface.

Are you telling me, the main mode of contributing to OSM should be to edit XML files and put in GPS coordinates by hand?

That would be about the most user-hostile UI for map editing I could think of.


SteveCoast 1479 days ago

Someone gets it :-)


seoaeu 1479 days ago

Wait, the proposed solution to a data format being slow to parse is to work around the bad performance by caching the already parsed representation? That seems like it has a clear flaw if you’re only accessing the data once…


TuringTest 1478 days ago

Where's the flaw in that? If you're only accessing the data once, why does it matter how fast or slow it is?

And, you're suggesting that user-facing data should be harder to work with only to make it faster to parse?


seoaeu 1477 days ago

I meant accessing any given piece of data once. When the total dataset is in the tens to hundreds of gigabytes, having to download any significant fraction of it to do data processing is really unfortunate.

But seriously, what's up with this total disdain for anyone trying to build applications with OSM data? You don't seem to care whether parsing is near instant or, as other commenters have mentioned, literally a majority of total processing time for certain compute jobs.
