by dalbin 3595 days ago
It's very common to return the centroid of the matched location (city, state or country).

It's a terrible approach, as it throws off all sorts of algorithms and has the effect seen above. If you have to return a long/lat, make sure that your "Unknown" is something that nobody would ever confuse with a valid address - the North Pole, the South Pole, the middle of some ocean, desert, or mountain range - whichever is least likely to be valid for your particular application.

This makes it much easier for downstream application developers to filter out "Invalid" addresses, and simply eyeballing them on a map makes it clear what the "Invalid" value is.
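A minimal sketch of the downstream filtering described above, assuming the provider picks the North Pole as its "Unknown" value. The sentinel choice and the names `UNKNOWN_LAT`, `is_unknown` and `drop_unknown` are hypothetical, not anything a real provider documents:

```python
from typing import List, Tuple

LatLon = Tuple[float, float]

# Hypothetical sentinel: latitude 90.0 (the North Pole), which no street
# address can have. Longitude is meaningless at the pole, so only the
# latitude is checked.
UNKNOWN_LAT = 90.0


def is_unknown(point: LatLon) -> bool:
    """Return True if the point is the agreed 'Unknown' sentinel."""
    lat, _lon = point
    return lat == UNKNOWN_LAT


def drop_unknown(points: List[LatLon]) -> List[LatLon]:
    """Keep only points that are real location estimates."""
    return [p for p in points if not is_unknown(p)]
```

The point of the sentinel is that the check is a one-liner, and any record that slips through still shows up in an obviously wrong place on a map.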

And of course, nobody has thought of the possibility of including an accuracy radius with the position, so instead of having a point saying "It's here", you have "It's anywhere within this area".
Anyone that's driving out to a farm without checking other data sources will probably just ignore the accuracy radius.
It's the same with site metrics, or sensors in IoT - your measurement isn't X, it's X +/- Y, with some measurement error distribution.
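To make the "X +/- Y" idea concrete, here is a rough sketch of a position that carries its own accuracy radius. The `Fix` type, its field names and the 1 km threshold are assumptions chosen for illustration, and the coordinates in the usage note are arbitrary:

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class Fix:
    """A location estimate: a center point plus an accuracy radius."""
    lat: float
    lon: float
    radius_km: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def could_be_at(fix: Fix, lat: float, lon: float) -> bool:
    """True if the given point lies inside the fix's accuracy circle."""
    return haversine_km(fix.lat, fix.lon, lat, lon) <= fix.radius_km


def is_actionable(fix: Fix, max_radius_km: float = 1.0) -> bool:
    """True only if the fix is precise enough to treat as 'it's here'."""
    return fix.radius_km <= max_radius_km
```

With something like `Fix(38.0, -97.0, 1000.0)`, `is_actionable` is false: the point is only the middle of a very large circle, and the caller has to look at other data before acting on it.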
I have not used MaxMind, but another geo data provider I used returned, in addition to the centroid, the shape and the type of match (exact address, city, county, state, country, etc.). So you know whether the IP has an exact address match or not. But many application developers probably choose to ignore that complexity, as doing so might be easier.
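A sketch of how a consumer might use such a match type instead of ignoring it. This is not MaxMind's or any other provider's actual API; `MatchType`, `GeoResult` and `point_is_meaningful` are hypothetical names, and the ordering from finest to coarsest is an assumption:

```python
from dataclasses import dataclass
from enum import IntEnum


class MatchType(IntEnum):
    """Granularity of a lookup result, ordered from finest to coarsest."""
    EXACT_ADDRESS = 0
    CITY = 1
    COUNTY = 2
    STATE = 3
    COUNTRY = 4


@dataclass(frozen=True)
class GeoResult:
    lat: float
    lon: float
    match_type: MatchType


def point_is_meaningful(result: GeoResult,
                        coarsest_allowed: MatchType = MatchType.EXACT_ADDRESS) -> bool:
    """True if the match is at least as fine as the caller requires.

    A COUNTRY-level match still carries a lat/lon (the centroid), but it
    should not be read as 'the device is at this point'.
    """
    return result.match_type <= coarsest_allowed
```

A map of rough regional traffic could pass `coarsest_allowed=MatchType.STATE`, while anything that sends a person to a physical address would keep the default.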
It may be a terrible approach in your view, but it's essentially universal and is often the best one. If I ask just about any database for the lat/long of some city, it's going to return some approximation of the center of the city, which may be a park, a house, or the middle of a river. It's not going to, nor should it, tell me "invalid question" because the city (or really any address within that city) isn't really represented by a single lat/long point.
I've worked on a number of applications where we needed to decide what the long/lat of "Unknown" would be, and we always picked something that is guaranteed to throw a flag for downstream consuming applications.

Recognizing how your data will be used, and taking some precaution to ensure that it doesn't result in the scenarios described in the article, is quite often fairly straightforward. (As evidence: the article itself made it clear that when they don't know the actual location, they have changed the long/lat to return a value in the middle of a lake to avoid this sort of problem in the future.)

Not when this has consequences for whoever lives there.

"Just send the drones to bomb the centroid of city X" doesn't seem so smart

Or they should have a giant disclaimer on their results saying that their information is subject to errors and should not be used for legal/law-enforcement purposes.