Hacker News
by sdeyerle 1234 days ago
qualcomm.com

This website is about a product or company that wants you to enable JavaScript in order to use their services and get access to the app they provide. JavaScript is a type of computer language that helps websites work better. By enabling it, you will be able to use the product or company's services and get the most out of their app.

Not quite there...

3 comments

Unintentionally hilarious though. I could see that being a punchline in The IT Crowd or Futurama.
Parks and Recreation did it:

> Leslie is sick and Andy tries to help her out by looking up her symptoms on the internet.

> Andy: "Leslie, I typed your symptoms into the thing up here and it says you could have 'network connectivity problems.'"

https://www.youtube.com/watch?v=LinpRhB4aWU

Apparently improvised, like many of the show's best punchlines.

I used to run Autosummarized HN, which would summarize Twitter submissions as something along the lines of: "Twitter has detected that JavaScript is disabled in the browser, and asks the user to enable JavaScript or switch to a supported browser."

Example: https://danieljanus.pl/autosummarized-hn/previously/2023-01-...

Requested Twitter and got basically the same:

> This website is about Twitter, a website that allows people to communicate with each other by posting messages. It requires users to have JavaScript enabled in order to access the website and use its features. The website also provides information on its Terms of Service, Privacy Policy, Cookie Policy, and Imprint. In case something goes wrong, users can try again to fix the issue.

Basically, when GPT can’t parse your landing page, it means Google can’t either…
Google were processing JavaScript for crawling websites more than a decade ago (a quick search suggests since ~2008).

It's not GPT that's having the issue; it's what's feeding GPT the website.

Is there any literature out there on how they do it at large scale?

From my understanding, it’s tough and expensive (Selenium and rotating residential IPs). Am I misinformed?

Google doesn't need residential IPs, since websites tend to treat Googlebot specially.
Google purposefully obfuscate the details, AFAICT. I've not done SEO for ~5 years, I'm not sure I know anything useful on the subject any more.
GPT doesn't browse or make HTTP requests at all. This bug is because the author of wtfdoesthiscompanydo.vercel.app isn't executing JS when they scrape the content, before they submit it to GPT. They're probably just making an HTTP request to the provided URL rather than loading it in headless chromium.
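
A rough sketch of that failure mode, for illustration only. The thread doesn't show the site's actual code, so everything here is an assumption: `SAMPLE` is a made-up single-page-app shell, and `visible_text` and `looks_like_js_placeholder` are hypothetical helpers. It only shows that stripping tags from un-rendered HTML leaves little more than the "enable JavaScript" notice, which is roughly what the summarizer would then be fed.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect the text a non-JS scraper would see, skipping <script>/<style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(html):
    """Return the tag-stripped text of an HTML string, without running any JS."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def looks_like_js_placeholder(text, min_words=30):
    """Crude heuristic (an assumption, not a standard): very short text that
    mentions JavaScript is probably just the no-JS fallback notice."""
    lowered = text.lower()
    return "javascript" in lowered and len(lowered.split()) < min_words


# Hypothetical app shell: the real content would only appear after JS runs.
SAMPLE = """<html><head><title>Example App</title>
<script>window.boot()</script></head>
<body><noscript>You need to enable JavaScript to run this app.</noscript>
<div id="root"></div></body></html>"""

if __name__ == "__main__":
    text = visible_text(SAMPLE)
    print(text)
    print(looks_like_js_placeholder(text))
```

If a check like this fires, the scraper could fall back to rendering the page in a headless browser before passing the text to the model, at the cost of a much heavier fetch.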