by sdenton4 37 days ago
The problem is simplification. Suppose two regions share a border with some non-collinear points a, b, c, d. Simplifying the polygon for the first region might yield a, b, d, while simplifying the second might yield a, c, d. This creates gaps or overlaps between the two regions.
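To make the a, b, c, d case concrete, here is a minimal Python sketch. The coordinates and the `north`/`south` region names are made up for illustration, and the two simplified borders are hard-coded to the outcome described above rather than produced by any particular simplification algorithm.

```python
def shoelace_area(ring):
    """Absolute area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical shared border between a northern and a southern region.
a, b, c, d = (0, 0), (1, 1), (2, -1), (3, 0)

# Both regions use the full border: together they tile the 3 x 6 rectangle.
north_full = [a, b, c, d, (3, 3), (0, 3)]
south_full = [a, b, c, d, (3, -3), (0, -3)]

# Each region simplified on its own: one keeps b, the other keeps c.
north_simplified = [a, b, d, (3, 3), (0, 3)]
south_simplified = [a, c, d, (3, -3), (0, -3)]

area_full = shoelace_area(north_full) + shoelace_area(south_full)
area_simplified = shoelace_area(north_simplified) + shoelace_area(south_simplified)

# The sliver between the two mismatched borders belongs to neither region.
sliver = shoelace_area([a, b, d, c])
gap = area_full - area_simplified
```

With these coordinates the sliver is pure gap; swap which region keeps b and which keeps c and the same sliver becomes an overlap instead.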

But what is the border? Set the border to what it actually is, not a simplification of it. The state of Colorado is formally a 697-sided polygon; don't simplify it to a rectangle.
This is not what OP is describing. It is very common to simplify geometries to reduce the number of boundary vertices by orders of magnitude. GeoJSON has no way to correlate shared boundaries when you do that. Simplifying country objects from a GeoJSON source could leave gaps between the country borders. So you either have a poor representation or a longer pipeline to convert the objects into a more amenable object set. It also breaks idempotency in some regards.
To do the simplification, you detect shared borders, simplify, and generate the polygons again. That doesn’t make TopoJSON inherently superior. You can convert back and forth, and for many applications GeoJSON is easier to process.
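A rough Python sketch of those three steps, under some loud assumptions: rings are lists of (x, y) tuples without a repeated closing vertex, shared vertices match exactly, and Douglas-Peucker stands in for whatever simplifier you actually use. `simplify_shared` and the other names are hypothetical, and this is not how TopoJSON or mapshaper implement it.

```python
from collections import defaultdict
from math import hypot


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(line, tol):
    """Simplify an open polyline, always keeping both endpoints."""
    if len(line) < 3:
        return list(line)
    dists = [point_segment_distance(p, line[0], line[-1]) for p in line[1:-1]]
    worst = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[worst - 1] <= tol:
        return [line[0], line[-1]]
    left = douglas_peucker(line[: worst + 1], tol)
    right = douglas_peucker(line[worst:], tol)
    return left[:-1] + right


def simplify_shared(rings, tol):
    """Cut rings into arcs at junctions, simplify each distinct arc once, rebuild."""
    # Step 1: detect shared borders. A vertex whose neighbours differ between
    # rings is a junction, i.e. a point where a shared border starts or ends.
    neighbours = defaultdict(set)
    for ring in rings:
        for i, p in enumerate(ring):
            neighbours[p].update((ring[i - 1], ring[(i + 1) % len(ring)]))
    junctions = {p for p, near in neighbours.items() if len(near) != 2}

    cache = {}  # canonical arc -> simplified arc

    def simplify_arc(arc):
        # Step 2: simplify each arc once, in a direction-independent form.
        forward, backward = tuple(arc), tuple(reversed(arc))
        key = min(forward, backward)
        if key not in cache:
            cache[key] = douglas_peucker(list(key), tol)
        result = cache[key]
        return result if key == forward else result[::-1]

    # Step 3: generate the polygons again from the simplified arcs.
    out = []
    for ring in rings:
        cuts = [i for i, p in enumerate(ring) if p in junctions]
        if not cuts:  # ring touches nothing else; treat it as one closed loop
            out.append(douglas_peucker(ring + ring[:1], tol)[:-1])
            continue
        rotated = ring[cuts[0]:] + ring[:cuts[0]] + [ring[cuts[0]]]
        new_ring, arc = [], [rotated[0]]
        for p in rotated[1:]:
            arc.append(p)
            if p in junctions:
                new_ring += simplify_arc(arc)[:-1]
                arc = [p]
        out.append(new_ring)
    return out


# Hypothetical example: two regions sharing the border a, b, c, d,
# traversed in opposite directions by the two rings.
a, b, c, d = (0, 0), (1, 1), (2, -1), (3, 0)
north = [a, b, c, d, (3, 3), (0, 3)]
south = [d, c, b, a, (0, -3), (3, -3)]
simplified_north, simplified_south = simplify_shared([north, south], tol=1.2)
```

Because the shared arc is simplified once and reused (reversed where needed), both output rings get the same border, so no gap or overlap can appear along it.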
Yes, you could write code to do that. Or use the utilities provided in the TopoJSON GitHub and let them do it for you: convert to TopoJSON, simplify, convert back to GeoJSON. They have already written all the code for you.
Yeah, or you could use GeoJSON and use https://mapshaper.org/
It depends on what purpose you are using the polygons for. In an online map you need to simplify way down. Consider these Colorado maps at two different zoom levels:

https://maps.app.goo.gl/JH93ko96QcoLXuBJ9

https://maps.app.goo.gl/au53iTnsmNdFuEZV8

Even the one zoomed in on the state appears to use maybe 15-20 vertices max.

In the second one, if I squint real hard I can just barely make out one slight dogleg on the western border and one on the south. And that is partly because I knew to look for them in the zoomed-in map.

If we use, say, the Census TIGER/Line boundary definitions for the states, we are probably talking about hundreds of thousands of vertices, perhaps millions. You won't be using those in an online map without simplifying.

The Texas border with Mexico is formally down the centerline of the Rio Grande, even as the river moves (ignoring fiddly complications). Even if you could somehow take a perfect snapshot of it at a given time, you'd run into the coastline paradox when sampling it.
So don’t simplify the shapes on their own. GeoJSON is a storage and exchange format; you can still convert it to other formats if you want to modify it.
I think what the original comment is pointing out is that GeoJSON lacks a concept of a shared boundary. Shared boundaries expressed in GeoJSON get around that by duplicating data. Whenever data is duplicated, there's a risk that the copies will not be exactly the same. That makes the task of modification more challenging given that the real world is full of messy data, like duplicates not matching.

20-25 years ago I worked a lot with map data from otherwise high quality, and sometimes authoritative, sources like the USGS and NOAA that had this non-identical shared boundaries problem (in formats other than GeoJSON). If the format doesn't allow such mistakes to be expressed, then they have to fix their data to publish it in said format.
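For a sense of what non-identical shared boundaries look like in practice, here is a small Python sketch that flags vertices in two rings that are almost, but not exactly, the same point. The `near_misses` helper, the coordinates, and the tolerance are all hypothetical, and the brute-force pairwise comparison would need a spatial index for real data sets.

```python
from math import hypot


def near_misses(ring_a, ring_b, eps):
    """Pairs of vertices that lie closer than eps but are not identical."""
    return [
        (p, q)
        for p in ring_a
        for q in ring_b
        if p != q and hypot(p[0] - q[0], p[1] - q[1]) < eps
    ]


# Two copies of the same border; the second copy of b has drifted slightly.
north = [(0.0, 0.0), (1.0, 1.0), (2.0, -1.0), (3.0, 0.0), (3.0, 3.0), (0.0, 3.0)]
south = [(0.0, 0.0), (1.0, 1.0000001), (2.0, -1.0), (3.0, 0.0), (3.0, -3.0), (0.0, -3.0)]

suspects = near_misses(north, south, eps=1e-3)
```

A format that stores the border once has no second copy to drift, which is the point about not being able to express the mistake at all.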

Sure, but not every format is useful for everything. GeoJSON is great if you want a simple way to express a shape to show on a map. It’s like criticizing CSV because people put strings in choice value fields instead of doing a foreign key to another table. That’s just not what the format is used for.
I'd take your point further... No format is useful for everything. But we have to be aware of the trade-offs of each format (or language or tool or ...) in order to make the right choice of what to use for a given use case. We do that by sharing knowledge of where a given tool succeeds and where it falls down. Pointing out something a format doesn't handle well is not condemning that format for all use cases (I happily choose GeoJSON over other formats for many things).