Hacker News
by jraph 1322 days ago
When I got started with the web 15 years ago, it was advised everywhere not to rely on user agent strings and rely on feature detection instead, and that using the user agent string should be a last resort solution.

Today, we are still seeing issues "solved" by switching one's user agent. And here we are reading Akamai whining about user agents getting unreliable. And we are talking unreliable at the minor version and specific platform version level.

It's not like we weren't warned ahead of time.

I'm sure problems will be sorted by proper http headers, data in handshakes or other things. And they should. Nobody should have to read user agent strings to optimize things, because things should also be optimized for a new, unknown user agent that would support these optimizations.

6 comments

I'm not sure how you came to the conclusion that "Akamai is whining" about this. It's an informational blog post about what's happening and what's changing.

User Agent strings aren't used for feature detection, they're used for classification. As a developer, when you're trying to fix a bug reported by a customer, it helps to know exactly which browser right down to the patch version that bug shows up in so that you can try and reproduce the bug in the same environment.

Odd response—especially the (perversely ironic!) dig in your first sentence. The blog post states:

> At Akamai, we use the User-Agent header at the edge and as part of many Akamai products for business logic

The post then goes on to describe several things that are expected to break (or would be breaking, if Akamai weren't taking steps on their end) since they rely on the value of the client's User-Agent header, and it affects how they respond. It's definitely not just used at Akamai to help reproduce bugs in the same environments...

Then you can ask the customer. You have a relationship with them.

Akamai uses user-agent strings in its Bot Manager. They see what specific version of what browser you're running, then check certain characteristics of the request (eg header order) against a database. That isn't going to work anymore.

And good riddance. It makes the internet brittle and isn't especially hard to work around anyway.

> I'm not sure how you came to the conclusion that "Akamai is whining" about this

I largely overstated that. Of course. However, my feeling throughout the entire post was that they were announcing bad news that they were not too happy about, because they based their optimization strategy on this. I was like "the world told you so".

As the author of the post... :) We don't see this as bad news, quite the opposite in fact. Feature detection (if you're running JavaScript), Client Hints (if you're running code at the edge), etc are all a lot better to use for logic decisions. We've been making changes to rely less on the UA header directly for our Akamai products.

Our goal of the post was more around educating our customers, who may not be aware of these changes, and these changes can affect their custom logic, if they depend on the UA.
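As a rough illustration of the Client Hints flow mentioned above (a sketch, not Akamai's implementation): the server opts in to extra hints with an `Accept-CH` response header, and the browser sends a brand list in `Sec-CH-UA`. The helper names `clientHintOptIn` and `parseBrands` are hypothetical, and the parser is a simplification that does not unescape backslashes inside brand names.

```javascript
// Hypothetical helper: response headers a server or edge function might
// send to ask the browser for additional UA Client Hints.
function clientHintOptIn(hints) {
  const list = hints.join(', ');
  return {
    'Accept-CH': list, // which hints the server would like to receive
    Vary: list,        // tell caches the response depends on those hints
  };
}

// Hypothetical helper: parse a Sec-CH-UA value such as
//   "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"
// into [{ brand, version }]. Simplified: escapes are matched but not decoded.
function parseBrands(secChUa) {
  const brands = [];
  const re = /"((?:[^"\\]|\\.)*)"\s*;\s*v="((?:[^"\\]|\\.)*)"/g;
  let m;
  while ((m = re.exec(secChUa)) !== null) {
    brands.push({ brand: m[1], version: m[2] });
  }
  return brands;
}

const optIn = clientHintOptIn(['Sec-CH-UA-Platform-Version', 'Sec-CH-UA-Model']);
const brands = parseBrands('"Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"');
console.log(optIn['Accept-CH'], brands.map((b) => b.brand));
```

Custom logic that previously matched on the raw UA string would then branch on the parsed brands and on whichever requested hints the browser chose to send.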

This is yet another example of good advice that over time gets oversimplified to an 'always' rule, and just becomes silly.

"Don't rely on User-Agent" was in response to things like:

  if (isIE())
      useIEThing();
  else if (isNetscape())
      useNetscapeThing();
  else
      alert('unsupported');

And that is usually a bad thing to do, since you can almost always replace that with a simple "if ('foo' in window) useFoo()" test or the like.
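A minimal sketch of that kind of feature test, assuming a hypothetical `pickStorage` helper. The environment object is passed in (in a browser it would be `window`) so the snippet also runs outside a browser:

```javascript
// Hypothetical helper: choose an implementation by probing for the
// capability itself instead of guessing from the browser's name.
function pickStorage(env) {
  if ('indexedDB' in env) return 'indexedDB';
  if ('localStorage' in env) return 'localStorage';
  // Degrade gracefully rather than alert('unsupported').
  return 'memory';
}

// In a browser this would be pickStorage(window).
console.log(pickStorage({ localStorage: {} }));
```

A new, unknown browser that supports the feature gets the good path automatically, which is the whole point of the advice.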

But there are also things that can't be done like this. What if I want to serve the best possible image or video format for a platform? The Accept header isn't really enough for this: on most platforms it doesn't advertise all supported formats, and it also doesn't tell you things like "Firefox 86 enabled AVIF decoding support, and Firefox 100 enabled hardware decoding, but only on Windows". So if someone is using Firefox 101 on Windows: let's serve them an AVIF video, it will work great for them. If they're using Firefox 101 on macOS: maybe use another format because AVIF will eat all their CPU.

There are lots of little cases like this where you can't really rely on feature detection. There's a reason User-Agent got replaced by another system which gives the same information: the information is useful (IMHO Client Hints are worse by the way and not an improvement at all).

For this example I see two possible answers:

- The Accept header has a q-factor weight that can be attached to the types. Browsers should use this correctly to hint to the server their preferred format(s). It should be considered a bug if not. What if I actually tweaked and recompiled my Firefox with a better decoder? Or if the browser uses a framework like gstreamer / ffmpeg and I installed the right packages for this decoder? You can't know from the UA. The Accept header cannot possibly list all the supported codecs when the browser accepts a large number of them, but it should at least list the most widespread ones with correct weights. But this leads to my second answer (especially as a very reliable Accept header is bad for privacy):

- Actually, just use srcset and provide an entry for each format you support and let the browser pick the right one. No need for the Accept header and CDN magic. It should be the browser's responsibility to know what's supported, what's not, what's best. If not, again, it should be considered a browser bug; the web developer has done their work at this point.

The server can't know the best format, only the browser can.
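To make the first answer concrete, here is a hedged sketch of server-side negotiation on the standard `Accept` request header with q-factor weights. `parseAccept` and `pickFormat` are hypothetical names, and the parser is simplified: it splits on commas and ignores quoted parameter values.

```javascript
// Parse an Accept header into [{ type, q }]. A missing q defaults to 1.
function parseAccept(header) {
  return header
    .split(',')
    .map((part) => {
      const [type, ...params] = part.split(';').map((s) => s.trim());
      let q = 1;
      for (const p of params) {
        const [k, v] = p.split('=');
        if (k.trim().toLowerCase() === 'q') {
          const parsed = parseFloat(v);
          if (!Number.isNaN(parsed)) q = parsed;
        }
      }
      return { type: type.toLowerCase(), q };
    })
    .filter((e) => e.type !== '');
}

// Return the server-side candidate with the highest client weight.
// `available` is listed in server preference order; ties go to the earlier one.
function pickFormat(header, available) {
  const accepted = parseAccept(header);
  let best = null;
  for (const candidate of available) {
    const major = candidate.split('/')[0];
    // Most specific match wins: exact type, then type/*, then */*.
    const match =
      accepted.find((a) => a.type === candidate) ||
      accepted.find((a) => a.type === `${major}/*`) ||
      accepted.find((a) => a.type === '*/*');
    // q=0 means "explicitly not acceptable".
    if (match && match.q > 0 && (best === null || match.q > best.q)) {
      best = { type: candidate, q: match.q };
    }
  }
  return best ? best.type : null;
}

// Illustrative header value, not taken from any particular browser.
console.log(pickFormat('image/webp;q=0.9, image/jpeg;q=0.5', ['image/avif', 'image/jpeg', 'image/webp']));
```

This only works as well as the weights the browser sends, which is exactly the caveat raised above; the srcset approach in the second answer sidesteps the negotiation entirely.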

I understand that reality is different, but Akamai and YouTube both have leverage on browser vendors to make them fix their bugs / to build standards for this shit. Smaller developers can report bugs too, and I understand that using the UA string is a "worse is better" solution that works around the issues, but we've had years to fix this.

A large entity like Akamai should not have relied on UA without preparing cleaner solutions built with the browser vendors, and should not have been caught by surprise by such a change. Something is wrong in this story.

Note that I didn't outright reject using the UA string for workarounds, but that's still a last resort which is error prone. Building business logic using the UA string is asking for trouble, and we've known this for a long time now. The proof of this is in the very existence of this article from Akamai. They are "screwed" [1] because they can't really do what they are doing the way they are doing it.

[1] I'm sure they are smart and will find solutions. I hope they'll find that building standards with browser vendors is a good solution.

> A large entity like Akamai should not have relied on UA without preparing cleaner solutions built with the browser vendors

The current solution works; there is no problem here.

> The proof of this is in the very existence of this article from Akamai. They are "screwed" [1] because they can't really do what they are doing the way they are doing it.

This is a rather odd interpretation of the article; they just switched from User-Agent header to UA Client Hints. UA Client Hints are the same as the User-Agent string, except delivered through a different mechanism. It's little more than s/one-thing/other-thing-thats-basically-the-same-but-different/

I resent how the Chrome team is handling this because it's forcibly creating work for a large number of developers for no good reason in particular other than "this other interface is a little bit nicer".

Ah yes, you are right, I misread this a bit.

Well, then if the UA string is reduced for privacy reasons but the server can still ask for the information, the benefits are quite unclear.

UA strings need to be solved but I also agree that Chrome single-handedly deciding the standard is annoying.

The biggest privacy impact is just vendors putting bonkers things in the User-Agent string by choice. Mobile browser vendors in particular put all sorts of things in there: the device model is pretty much standard for no good reason in particular, but you also see very specific OS build versions, device settings, and the like. Some Android vendors in particular are real bad about this.

I have no expectation that will stop no matter what we do with the tech. They made the decision to stuff it in there for whatever reason and will find somewhere else to stuff it.

> UA strings need to be solved

At this point, just getting the browser name and system name out of a User-Agent string is not as hard as it's sometimes made out to be; things could probably be simplified a bit (does Chrome really need to send "KHTML like Gecko"? Probably not), and vendors could choose to send less information (like Firefox already does, and has for many years).

I mean, it's a bit more messy than it needs to be; removing the "Mozilla/5.0" that every browser sends probably will break some things as it's a bad but quick and surprisingly effective way to check if a browser is a bot, but ... is it really that big of a deal that every browser sends that? Is it really worth the effort replacing that?
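As a rough sketch of how little is needed for just the browser and system name: the `parseUserAgent` helper and its token tables below are hypothetical and deliberately incomplete (for example, they do not recognise the CriOS or FxiOS tokens used on iOS). Check order matters because browsers include each other's tokens for compatibility.

```javascript
// Checked top to bottom: Edge and Opera UAs also contain "Chrome/",
// and Chrome UAs also contain "Safari/".
const BROWSERS = [
  ['Edge', /Edg\/(\d+)/],
  ['Opera', /OPR\/(\d+)/],
  ['Firefox', /Firefox\/(\d+)/],
  ['Chrome', /Chrome\/(\d+)/],
  ['Safari', /Version\/(\d+).*Safari\//],
];

// Android UAs contain "Linux" and iOS UAs contain "Mac OS X",
// so the more specific tokens are tested first.
const SYSTEMS = [
  ['Android', /Android/],
  ['iOS', /iPhone|iPad|iPod/],
  ['Windows', /Windows NT/],
  ['macOS', /Mac OS X/],
  ['Linux', /Linux/],
];

function parseUserAgent(ua) {
  let browser = 'unknown';
  let major = null;
  for (const [name, re] of BROWSERS) {
    const m = ua.match(re);
    if (m) {
      browser = name;
      major = Number(m[1]);
      break;
    }
  }
  const sys = SYSTEMS.find(([, re]) => re.test(ua));
  return { browser, major, system: sys ? sys[0] : 'unknown' };
}

console.log(parseUserAgent(
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:101.0) Gecko/20100101 Firefox/101.0'
));
```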

The privacy impact is that the server asks for what it wants, and the client (browser) can decide what to send.

> and that using the user agent string should be a last resort solution.

In fairness, “last resort solution” means sometimes it is your only solution, when a specific browser fucks up on specific content and you need to work around that specifically.

Sure, that belongs to the very few valid use cases.

I got an iPad 2 from a relative, and I detect its user agent on my private Invidious instance to send it transpiled/polyfilled JS instead of the original one.

Of course it would not be the correct solution if Apple did not forbid other browsers on its hardware; the correct solution would then be to install a recent Firefox version on it. It would also allow a shitload of other stuff to work, like subtitles on fullscreen videos and autoplay on the next video, playback of videos protected by HTTP basic auth, as well as Let's Encrypt SSL certificates.

The device's browser should send a "X-I-m-dumb-and-my-manufacturer-likes-to-piss-everybody-off: true" HTTP header to avoid relying on its user-agent though.

FWIW you can get Let's Encrypt working on outdated Apple hardware by manually loading the CA certificate.

I could import the root certificate but the instructions for activating it didn't work; they seem to apply to more recent versions of iOS.

I've done it on iOS 9.3.5 and it worked in Safari at the very least. You need to import it first and then activate it from the settings afterwards, I believe.

iOS changed the exact procedure a few times so you may need to Google around for the exact steps you need to follow.

The level of detail that will be available to servers will not be reduced at all, but rather repackaged and split up into separate headers that the server can individually request. The information contained in these headers will likely be more accurate because it's claimed to be safer this way.

Whereas today your browser sends the messy but relatively detailed user agent string automatically with each request, after this change it will still send the messy user agent string with each request but with a tiny bit less detail.

Google's writers are pretty good at polishing turds, got to give them that!

My biggest issue with being a solo learner is that unless I specifically look for some information, I don’t know it’s there. Who is we for you? Who was warning everyone? Was I supposed to get a memo?

> When I got started with the web 15 years ago, it was advised everywhere not to rely on user agent strings and rely on feature detection instead,

Which is reasonable advice for code running in a browser, not for a proxy/CDN (and you don't want a proxy inserting its own JS).

UA detection in the backend has also been frowned upon; it's not limited to code running in the browser.

And a proxy/CDN should not be doing anything other than proxying requests and serving files.

Workarounds are fine, but that's what they are.

To get it working client-side you'd need to change the app, while the whole premise here is that you don't have to (and often can't). A single point that tracks which UA strings are out there and what they need is better than expecting every app author to handle this properly.