by skuhn 2158 days ago
Since most CDNs are already doing GeoIP lookups (for request headers or log entries), you can leverage that to provide the data back in the response body via origin, worker or even CDN edge config.

Programmatically populating the response body, as in the Cloudflare worker example from the post, is better than going to the origin just to echo some headers back in the response. To me, something like Fastly's VCL config language is even simpler: it executes directly on every CDN edge node worldwide, on each request.

For example, I just whipped this up on Fastly using VCL. It returns GeoIP data for your IP as JSON at the root path:

http://geo.zombe.es

Or if you want a particular IP, just append it to the path:

http://geo.zombe.es/2a04:4e42:600::313

You could do the same via query params, headers, etc. Have URL endpoints that only return some of the data, and so forth.
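
To make the path handling concrete, here is a rough Python sketch of the same idea (not the actual VCL). The function name `ip_for_lookup` and its return convention are made up for illustration: the root path means "use the caller's address", and anything else is treated as an IP literal to look up.

```python
import ipaddress
from typing import Optional


def ip_for_lookup(path: str, client_ip: str) -> Optional[str]:
    """Pick the address to geolocate from a request path.

    "/" means the caller's own address; "/<ip>" means that address.
    Returns None when the path segment is not a valid IPv4/IPv6 literal.
    """
    candidate = path.strip("/")
    if not candidate:
        return client_ip
    try:
        # ip_address() accepts both IPv4 and IPv6 literals.
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        return None
```

A query-param or header variant would only change where `candidate` comes from; the validation step stays the same.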

The VCL syntax gets a little gross when you handle quoting strings, assembling JSON, and testing whether a string is empty, but it gets the job done.
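
For contrast, this is roughly what the "skip empty strings, quote everything else" logic looks like in a language with a JSON library. It is only a Python sketch of the idea, not a translation of the VCL; `geo_json` and the field names passed to it are hypothetical.

```python
import json


def geo_json(fields: dict) -> str:
    """Serialize geo fields, dropping keys whose value is empty or missing."""
    present = {k: v for k, v in fields.items() if v not in ("", None)}
    # json.dumps takes care of quoting and escaping string values.
    return json.dumps(present, sort_keys=True)
```

In VCL the same thing has to be spelled out by hand, one field at a time, which is where the grossness comes from.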

Of course, what you want from GeoIP data may not be what you get. It's a useful kludge that sometimes gets treated as a panacea.

This dataset currently thinks that I'm about 5 miles east of my actual location, but when subnets are repurposed the error could be much larger. And the data sources are always changing, so who knows what it will think tomorrow.


Your service correctly returned the following data about my IP:

    "client": {
      "conn_speed": "broadband",
      "conn_type": "wifi",
      "proxy_description": "vpn",
      "proxy_type": "hosting"

Is this what Fastly is thinking about my IP?

Yeah, that's coming from these four Fastly VCL variables:

conn_speed: https://developer.fastly.com/reference/vcl/variables/geoloca...

conn_type: https://developer.fastly.com/reference/vcl/variables/geoloca...

proxy_description: https://developer.fastly.com/reference/vcl/variables/geoloca...

proxy_type: https://developer.fastly.com/reference/vcl/variables/geoloca...

conn_type is interesting to me; I'm not sure how you would distinguish wifi vs. wired based on HTTP header data.

I haven’t worked in the space in a while, but I doubt it’s via anything like HTTP headers. As a total guess, I’d look at packet inter-frame gaps and jitter to infer client CSMA/CD or L2 behavior. Maaaaybe MTU and TTLs to infer intermediate routed networks or devices like a tunnel. And of course various TCP options and behavior, such as timestamps and D-SACK, to fingerprint the client or intervening IP proxies.

It used to be, for years. That's the older stuff in the "geoip.<key>" namespace.

The stuff in the "client.geo.<key>" namespace is from a newer/better/higher-tier service (so they say; I forget the name). I also think some of it is mixed in with other sources, and some of the info is self-sourced.

Whatever it's using for conn_type, it's not accurate. I get "wifi" on all of my computers, wired or wireless.

What does the code for this look like?

The VCL for it is available at https://gist.github.com/simonkuhn/a380a6fa205db87a3625f26ad0...

vcl_recv and vcl_error are the important bits; the rest is VCL boilerplate from https://developer.fastly.com/learning/vcl/using/ for unused subroutines.