by verdverm 2224 days ago
I recall seeing a paper that showed how closely you can geolocate a target with various numbers of peers, using network latency alone.
3 comments

That's funny, I was just working on a POC like this today[0]. It's accurate most of the time for my location, but I haven't tested from other locations. You'd need to toggle the 'known' servers on and off to find the optimal arrangement, because you need to be somewhere inside the polygon they form. I was planning to find a way for it to discover these itself, plus other tweaks (like trying multiple times and then averaging the results).

Edit: the paper I found related to this is here[1]

[0]: https://github.com/stagas/http-geolocate

[1]: https://homes.cs.washington.edu/~tom/support/geoloc.pdf
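
To make the "inside the polygon" point concrete, here is a minimal sketch. It assumes a weighted-centroid estimate, which may or may not be what the linked POC does; the landmark names, coordinates, and RTT samples are made up for illustration. A weighted average of landmark positions always falls inside the polygon they span, so a target outside it can never be located correctly by this kind of estimate.

```python
from statistics import median

# Hypothetical landmark servers: (latitude, longitude) in degrees.
# These are made-up illustration values, not the POC's server list.
LANDMARKS = {
    "north": (50.0, 10.0),
    "south": (40.0, 10.0),
    "east": (45.0, 18.0),
    "west": (45.0, 2.0),
}

def robust_rtt(samples_ms):
    """Collapse repeated pings into one value; the median resists outliers."""
    return median(samples_ms)

def weighted_centroid(rtts_ms):
    """Average the landmark positions, weighting each by 1/RTT.

    Averaging raw lat/lon is an assumption that only holds for a small
    region that does not cross the antimeridian.
    """
    weights = {name: 1.0 / rtt for name, rtt in rtts_ms.items()}
    total = sum(weights.values())
    lat = sum(LANDMARKS[n][0] * w for n, w in weights.items()) / total
    lon = sum(LANDMARKS[n][1] * w for n, w in weights.items()) / total
    return lat, lon

# Made-up ping samples in milliseconds; "north" contains one outlier (40.0).
samples = {
    "north": [12.0, 11.0, 40.0],
    "south": [30.0, 29.0, 31.0],
    "east": [20.0, 21.0, 19.0],
    "west": [22.0, 20.0, 21.0],
}
rtts = {name: robust_rtt(s) for name, s in samples.items()}
estimate = weighted_centroid(rtts)
print(estimate)
```

The estimate is pulled toward the lowest-latency landmark but stays within the bounding box of the four servers, which is why the server set has to surround the target.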

https://vercel.com/edge-network

This page contains websocket addresses of a CDN that returns ping pong from a huge number of locations. It works surprisingly well for working out a very fine-grained location, just from playing with it.

Oh wow, they're actually doing the same thing, triangulating on latency. Hive mind, I guess. I used universities because I figured a) they won't mind, and b) they're more likely to have their servers on premises, then used a public reverse geoip to get their locations. I'll try to see if I can integrate with Vercel's edge network, which seems more ideal.

Edit: no, I misunderstood, the location dot that's being displayed isn't the product of triangulation, they're just doing reverse geoip lookup. So, I wonder now if the edge network would perform better.

Update: it doesn't perform better. Either there is some kind of proxy redirecting their traffic or these servers aren't where they say they are; the center is completely skewed. The universities win so far, being correct and accurate most of the time.

Unimpressed with yours. Why? Because my ISP got some IPv4 space recently, which was formerly allocated to .ua, .ru, .iq, & .ir. While being physically next to or in Hamburg, .de. That led to all sorts of inconveniences for some people, suddenly barred from logging on, or using the sites they frequent, because they used outdated geo-ip data. I didn't even notice, except for the outrage in their customer forum.

Now yours consistently puts me somewhere in, or on the shores of, the Black Sea, while Vercel's doesn't. So that makes me suspicious of your claim of using latency alone.

Edit: There seems to be outdated geo-ip information factored in somewhere. Why else would it put me IN the Black Sea?

I mentioned I just started working on it. To try for yourself you'd need to clone and tweak the servers list to find an optimal arrangement for you. The problem as I see it is that jumping continents incurs really artificial delays, which skew the result significantly, so it first needs to identify your relative whereabouts, then decide on an optimal set of servers. If you clone and tweak the servers to place yourself inside the polygon you'd see it does locate you. Vercel is doing a reverse geoip lookup, so your location is preconfigured in some database based on your ip.
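
A rough sketch of that two-stage idea, under stated assumptions: the server names, region labels, and RTT samples below are hypothetical, and picking the region by lowest median RTT is just one plausible heuristic, not something the POC is known to implement.

```python
from statistics import median

# Hypothetical probe results: server name -> (region, RTT samples in ms).
PROBES = {
    "eu-1": ("europe", [18.0, 20.0, 19.0]),
    "eu-2": ("europe", [25.0, 24.0, 90.0]),
    "us-1": ("north-america", [110.0, 108.0, 112.0]),
    "as-1": ("asia", [240.0, 250.0, 245.0]),
}

def nearest_region(probes):
    """Coarse stage: pick the region whose best server has the lowest median RTT."""
    best = {}
    for region, samples in probes.values():
        rtt = median(samples)
        best[region] = min(rtt, best.get(region, float("inf")))
    return min(best, key=best.get)

def servers_in_region(probes, region):
    """Fine stage: keep only the servers located in the chosen region."""
    return sorted(name for name, (r, _) in probes.items() if r == region)

region = nearest_region(PROBES)
selected = servers_in_region(PROBES, region)
print(region, selected)
```

Only the selected servers would then feed the fine-grained estimate, so intercontinental hops never enter the calculation.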
At least they are current ;-)
This is commonly referred to as "ping triangulation" and seems to be reinvented every few years. It sounds good in theory but in practice performs poorly unless run consistently for days, which is why few researchers end up writing it up or publishing POCs for the next person to find. :)

You should take a look at network path convergence for geolocation instead. You can see a demo of the tool I wrote at https://traceroute.guru/tr/209.216.230.240

I think the key thing to keep in mind with this approach is that physical distance and network distance are only loosely related.

You can easily be next door to someone, and your packets go all the way across the country to get there.

I've definitely seen cases where my latencies to two servers in the same building were wildly different, depending on the paths my ISP and their ISP(s) routed the traffic over.

I'm about 22 ms roundtrip away from my ISP's facility at the local internet exchange in Seattle. If a server is connected to that exchange, and traffic flows through the exchange both ways, I see ping times of about 22 ms. Sometimes, my ISP will send traffic through San Jose instead, but the server returns the traffic in Seattle, and that adds about 26 ms, so I get a 48 ms ping. If both sides route through San Jose for whatever reason, I'll get 74 ms, which is close to what I'd get if the routing was sensible and the server was in the Washington, DC area. (You normally can't see the routing back from the server, but sometimes you control the server, too).
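
The arithmetic in that example, written out. The constant and function names are mine; the numbers are the ones quoted above, treating each direction that detours via San Jose as adding the same fixed cost.

```python
BASE_RTT_MS = 22   # round trip when both directions use the Seattle exchange
DETOUR_MS = 26     # extra time for each direction routed via San Jose

def rtt_ms(directions_via_san_jose):
    """Round-trip time for 0, 1, or 2 directions taking the San Jose detour."""
    return BASE_RTT_MS + DETOUR_MS * directions_via_san_jose

for n in (0, 1, 2):
    print(n, rtt_ms(n))
```

The same server in the same building shows up at 22, 48, or 74 ms depending purely on routing, which is the core problem for any latency-only estimate.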

If I were building something like this, I'd want to try to determine how much of the latency comes from the user getting to where their ISP interconnects with other networks (or with multiple routes within their own network), and then where that interconnection location is. I'd guess you can get a pretty good idea of the interconnection location, based on reasonable network paths from there, but the distance from the interconnection point is going to be tricky. Most residential networking technologies add much more latency than speed-of-light propagation does, so your upper bound on distance is going to be pretty far off. Of the roughly 22 ms I see to Seattle, about 20 ms comes just from the DSL termination; when I had AT&T GPON, it added about 4 ms to my pings. That's a lot of distance.
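
To put numbers on "a lot of distance", here is a small sketch. It assumes signals in fiber travel at roughly two thirds of the speed of light in vacuum, about 200 km per millisecond; that constant and the function name are my assumptions, while the 22 ms, 20 ms, and 4 ms figures come from the comment above.

```python
FIBER_KM_PER_MS = 200.0  # assumption: roughly 2/3 of c, one-way, in fiber

def max_distance_km(rtt_ms, access_ms=0.0):
    """Upper bound on distance implied by a round-trip time.

    access_ms is the part of the RTT attributed to the last-mile link
    (e.g. DSL termination); it is subtracted before converting to distance.
    """
    one_way_ms = max(rtt_ms - access_ms, 0.0) / 2.0
    return one_way_ms * FIBER_KM_PER_MS

print(max_distance_km(22.0))         # naive bound from the full 22 ms
print(max_distance_km(22.0, 20.0))   # bound after removing 20 ms of DSL latency
print(max_distance_km(4.0))          # distance "hidden" by a 4 ms GPON overhead
```

Under these assumptions the naive bound is 2200 km, while the bound after removing the DSL overhead is 200 km, so not knowing the access latency inflates the search radius by an order of magnitude.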

Weird. Your POC seems to be placing me almost directly opposite, in the Caucasus.
Yeah, it doesn't work really well :) Work in progress
Would be interesting to see a link! That doesn't sound like it would get an accurate guess to me, given facts like "light travels slower in copper than fiber" and "your packets have to enter a country through specific large hubs" etc etc.
If you have enough machines you can just use machines after those large hubs. The copper/glass light speed difference doesn't matter much because the delays added by hops in the middle are usually orders of magnitude higher.

In the end, unless you've got machines next door pinging your target you won't be able to differentiate between houses or city blocks.

That's pretty interesting. So, effectively, a triangulation based on latency times?
Hate to be a pedant, but it's technically trilateration.
I didn't know this term, thank you for being such a pedant :) knowing the right term makes it easier to find information on the subject.
> So, effectively, a triangulation based on latency times?

Yes.
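
On the terminology point above: triangulation works from angles, trilateration from distances, and latency gives you (noisy) distances. A minimal planar sketch of trilateration from three anchors follows; it assumes flat x/y coordinates in km and exact distances, neither of which holds for real latency data, and the anchor positions are made up.

```python
import math

def trilaterate(anchors, distances):
    """Locate a point in the plane from its distances to three known anchors.

    Subtracting the first circle equation from the other two leaves two
    linear equations in x and y, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = distances
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    if math.isclose(det, 0.0, abs_tol=1e-12):
        raise ValueError("anchors are collinear; position is not unique")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# Hypothetical anchors and a known target, used to generate exact distances.
anchors = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
target = (30.0, 40.0)
distances = [math.dist(a, target) for a in anchors]
x, y = trilaterate(anchors, distances)
print(x, y)
```

With real measurements the three circles will not meet at a single point, so a practical version would use more anchors and a least-squares fit rather than an exact solve.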