by hmontazeri 1234 days ago
Basically, when GPT can’t parse your landing page, it means Google can’t either…
3 comments

Google were processing JavaScript for crawling websites more than a decade ago (a quick search suggests since ~2008).

It's not GPT that's having the issue; it's whatever is feeding GPT the website.

Is there any literature out there on how they do it at large scale?

From my understanding it’s tough and expensive (Selenium and rotating residential IPs). Am I misinformed?

Google doesn't need residential IPs, since websites tend to treat Googlebot specially.
Google purposefully obfuscate the details, AFAICT. I've not done SEO for ~5 years, so I'm not sure I know anything useful on the subject any more.
GPT doesn't browse or make HTTP requests at all. This bug is because the author of wtfdoesthiscompanydo.vercel.app isn't executing JS when they scrape the content, before they submit it to GPT. They're probably just making a plain HTTP request to the provided URL rather than loading it in headless Chromium.
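
To illustrate the point, here is a rough sketch (not the site's actual code, which I haven't seen) of what a plain HTTP fetch "sees" on a client-rendered page. It uses only Python's standard-library `html.parser`; the helper names `visible_text` and `looks_client_rendered` and the 200-character threshold are made up for this example. If the raw HTML has almost no visible text, the content is probably injected by JS and the page would need to be rendered in a headless browser before being handed to GPT.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside <script>, <style> and <noscript>."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many skipped elements we are currently inside
        self.chunks = []    # non-empty text fragments found so far

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text a non-JS scraper would extract from raw HTML."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_client_rendered(html, min_chars=200):
    """Heuristic only: very little visible text suggests a JS-rendered shell.

    min_chars is an arbitrary threshold chosen for illustration.
    """
    return len(visible_text(html)) < min_chars


# A typical single-page-app shell: the real content only appears after JS runs.
spa_shell = (
    "<html><head><title>Acme</title>"
    "<script>window.__DATA__ = {big: 'blob'};</script></head>"
    "<body><div id='root'></div></body></html>"
)

# A server-rendered page: the text is already in the HTML response.
static_page = "<html><body><p>" + "We sell widgets. " * 20 + "</p></body></html>"
```

For the shell above, a scraper that doesn't run JS has nothing but the page title to submit to GPT, which matches the behaviour described in the post.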