by computerfriend 1322 days ago
User agent strings are such a train wreck. I wish Chrome was braver and changed it to something like "Chromium (Blink, V8); Linux (Android)".

Or just "chromium 99"

Every once in a while I rebel and change my user agent to "firefox 103", but in the end I get sad about how much breaks when you do that, and come crawling back to the default user agent string.

I think the thing that bugs me the most is not the complexity of it, but how everybody is spoofing everybody else's user agent string. It is just this stupid circle jerk of spoofing.

How would CDNs cache both a mobile optimized and desktop optimized version of a site on the edge?

I suppose this can still (kind of) be done, but on the client side using the viewport size (combined with JavaScript or CSS @media) rather than on the backend.

For a simple website, the user agent should be able to decide what to download and display. It should not be a backend application.

HTML is responsive by default, just don't break this, and yes, you can use media queries if needed.

For images, we have srcset to tell the browser what to download depending on the screen size [1]. You should not try to optimize the bandwidth just because I'm on mobile. I might be on a Wi-Fi connection with a mobile device, or on my laptop tethered to a mobile connection. Just optimize for everything anyway.

The backend should not be involved in how the site is presented, and the CDN should be as dumb as possible, or should not be used at all.

For apps, you have JavaScript to do whatever you want.

Mobile / desktop detection is yet another user agent detection in disguise anyway. Just detect my screen size, my DPI, my touchscreen, my mouse, possibly my bandwidth if it's really necessary (videoconferencing for instance). I could be using a mouse on a mobile device. Both the mouse and the touchscreen need to work. You might not need to do feature detection, just bind these events unconditionally. I could plug in a secondary touchscreen and move my browser window to this screen.

Many devices are hybrid now. A tablet with a keyboard is not that weird today. What should isMobile return? "YesAndNo"?

I've not seen a really convincing use of isMobile yet. But I've seen harmful ones. They are full of assumptions that are correct most of the time, but still have exceptions.

[1] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/so...

> What should isMobile return?

There was a joke (real story maybe) about soldiers being allowed to carry up to 25kg of gear, and therefore a device weighing 104kg, supposed to be carried by 4 people, was deemed not to be portable.

As the author of a popular User Agent parser: they are indeed a train wreck, but they were at least a largely solved, managed and contained train wreck. The average person could just grab a library, pass it a single string and know what browser someone was using.

UA hints, Sec- headers and all that stuff they’re pushing to “replace” it really just complicate the problem. Getting accurate data server side has been made a total PITA.

Yeah, the problem with "just use feature detection" is that most of it only works on the frontend, or by having the frontend send additional data to the server after the initial page load.

Sometimes you need to optimize things for certain browsers or bots before a single byte of JS has been sent, relying only on the first few request headers. Akamai probably needs to.
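A rough sketch of what "relying only on the first few request headers" could look like when choosing a mobile or desktop cache variant. It assumes the request headers arrive as a plain dict; the function name `cache_variant` is made up for illustration. It prefers the `Sec-CH-UA-Mobile` client hint (`?1` / `?0`) and falls back to the common "look for `Mobi` in the UA" heuristic, which is only a heuristic and will misclassify hybrid devices.

```python
def cache_variant(headers: dict) -> str:
    """Return "mobile" or "desktop" using request headers only (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}

    # Sec-CH-UA-Mobile is a structured boolean: "?1" means mobile, "?0" means not.
    hint = normalized.get("sec-ch-ua-mobile")
    if hint == "?1":
        return "mobile"
    if hint == "?0":
        return "desktop"

    # Fallback heuristic: many mobile browsers include "Mobi" in the UA string.
    if "Mobi" in normalized.get("user-agent", ""):
        return "mobile"
    return "desktop"
```

A CDN doing this would also need to send a matching `Vary` header (or fold the result into its cache key) so the two variants are not served to the wrong clients.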

Deleting the cruft but retaining the highest bits (product name and major version) like Chrome is doing seems like a reasonable compromise.
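To make "retaining the highest bits" concrete, here is a minimal sketch that keeps the product name and major version and zeroes the rest of the version. The regex and the function name `freeze_minor_version` are assumptions for illustration, not Chrome's actual implementation.

```python
import re

# Matches a four-part Chrome version such as "Chrome/100.0.4896.127".
_CHROME_VERSION = re.compile(r"Chrome/(\d+)\.\d+\.\d+\.\d+")


def freeze_minor_version(user_agent: str) -> str:
    """Keep the Chrome major version and replace the remaining parts with zeros."""
    return _CHROME_VERSION.sub(r"Chrome/\1.0.0.0", user_agent)
```

Strings without a four-part `Chrome/` token pass through unchanged.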

On the other hand, they are amazing at catching bots. Almost all bots (obviously excluding disguised ones, but those were never an issue for us) have identifiable user agents. By blocking bots via UAs, we became better than Google Ads at blocking bots. They are probably doing some kind of complicated ML thing that works far better for edge cases, but our simple solution works better for normal cases…
Even today? UA strings are easily fakeable (so fakeable that it surprises me that people still use them for anything).

If a bot still gets caught by UA strings then it's just a poorly written bot?

DoubleVerify is a Googley company that does bot detection. It uses the UA and IP address to find them.

I’m talking about actual, legit bots. Facebook, Instagram, all those search engine crawlers. Those follow all kinds of links, including ads, and then go and annoy us and the advertisers by counting as "fake traffic".

Google is/was not only happy letting those through, they even send their own. (We wrote our own simplified ad server and only use AdManager for the agencies that require it, so I’m not sure how much things have changed in the last two years.) It was so bad that we redirected all links through our site, where we filtered every Google IP range we could find (because, of course, whatever they used did not have a proper bot UA), to stop them sending thousands of fake visits to the advertiser every day.
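For the self-identifying crawlers described above, a UA check can be very small. This is only a sketch: the token list is a short, incomplete sample of well-known crawler names, the generic pattern is a guess that will produce some false positives, and `is_declared_bot` is a made-up name. It does nothing against bots that send a normal browser UA, which is where the IP range filtering comes in.

```python
import re

# A few well-known crawler tokens; deliberately incomplete.
BOT_TOKENS = ("googlebot", "bingbot", "facebookexternalhit", "duckduckbot", "yandexbot")

# Generic catch-all for UAs that announce themselves as bots (assumed pattern).
GENERIC_BOT = re.compile(r"bot\b|crawler|spider", re.IGNORECASE)


def is_declared_bot(user_agent: str) -> bool:
    """Return True if the UA string openly identifies itself as a bot."""
    lowered = user_agent.lower()
    if any(token in lowered for token in BOT_TOKENS):
        return True
    return GENERIC_BOT.search(user_agent) is not None
```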

I wonder if these bots would respect robots.txt files for the ads?
Well, you’d have to get Google to host those robots.txt as the ads are running iframed on their servers ;)
Easily fakeable and abusers still use bad ones. The bar is barely above the floor.
Right? The short UA string mentioned in the article is

    Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.0.0 Safari/537.36
but as long as we're breaking backwards compatibility anyway, it seems to me it could just say Chrome/100.0.0.0 Android
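As a sketch of how little information that shorter form carries, the following derives it from the legacy string. The platform list and the name `compact_ua` are assumptions for illustration; no browser actually sends this format.

```python
import re
from typing import Optional

# Checked in order, so "Android" wins over the "Linux" token that precedes it.
_PLATFORMS = ("Android", "Windows", "Macintosh", "CrOS", "Linux")


def compact_ua(user_agent: str) -> Optional[str]:
    """Collapse a legacy Chrome UA string to "Chrome/<version> <platform>"."""
    chrome = re.search(r"Chrome/[\d.]+", user_agent)
    if chrome is None:
        return None  # Not a Chrome-style UA; nothing to compact.

    # The first parenthesized comment holds the platform details.
    comment = re.search(r"\(([^)]*)\)", user_agent)
    details = comment.group(1) if comment else ""
    platform = next((name for name in _PLATFORMS if name in details), "Unknown")
    return f"{chrome.group(0)} {platform}"
```

Fed the string from the article, it returns `Chrome/100.0.0.0 Android`.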