by BiteCode_dev 2306 days ago
Yes, but in doing so, they will lose the ability to be indexed, searched, archived and linked to. Like during the Flash era.

Which in turn will start a new trend of websites that are "open".

It's a cycle.

3 comments

> they will lose the ability to be indexed, searched, archived and linked to

Sounds like an API that will be implemented.

If it's not standard, it will be useless. If it is, the browser will eventually put it in the user's hands.
Crawler bots are already doing site rendering and OCR. This isn't that big of a step for them.
TIL. Wow. I guess I missed that the web is not HTML anymore.
Plenty of sites already give Google something different from what they show everyone else, right?
No, that's actually a great way to make Google penalize your site. It's explicitly against their Webmaster Guidelines: https://support.google.com/webmasters/answer/66355
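For anyone who wants to check a page for this themselves, here is a rough sketch (not how Google detects cloaking, just an illustration): fetch the same URL with a crawler-style and a browser-style User-Agent and compare the visible words. The browser user-agent string, the helper names and the tag-stripping regexes are my own assumptions, and sites that verify crawlers by IP address will not be fooled by a spoofed header.

```python
import difflib
import re
import urllib.request

# Googlebot's published desktop user-agent string.
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Made-up browser string, for illustration only.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


def fetch(url, user_agent, timeout=10):
    """Download a page while presenting the given User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def visible_words(html):
    """Crude text extraction: drop script/style blocks, then all tags."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return text.split()


def similarity(html_a, html_b):
    """Return a 0..1 ratio of how similar the visible words of two pages are."""
    matcher = difflib.SequenceMatcher(None, visible_words(html_a), visible_words(html_b))
    return matcher.ratio()


def looks_cloaked(url, threshold=0.8):
    """Heuristic only: True if the two fetches differ more than the threshold allows."""
    return similarity(fetch(url, CRAWLER_UA), fetch(url, BROWSER_UA)) < threshold
```

The 0.8 threshold is arbitrary. Pages with rotating ads or timestamps will never score exactly 1.0, so treat a low score as a hint to look closer rather than proof.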
Tell that to LinkedIn, Pinterest, and dozens of news sites.
LinkedIn used to trick people into thinking you have to pay to see someone's profile when you can see it just fine by logging out. That is not against Google's policy. Not sure if they still do that, but I used the logout trick a lot a few years ago.
Google does document a way to mark paywalled content and have it indexed without it being considered cloaking: https://developers.google.com/search/docs/data-types/paywall...
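As a rough illustration of what that markup looks like, the sketch below builds schema.org JSON-LD of the kind that documentation describes: an article flagged with `isAccessibleForFree` and a `hasPart` entry whose `cssSelector` points at the gated section. The function names, the headline and the `.paywall` class are placeholders I made up; check the linked page for the exact fields Google currently expects.

```python
import json


def paywall_jsonld(headline, css_selector):
    """Build JSON-LD marking the element matched by css_selector as paywalled."""
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "isAccessibleForFree": False,
        "hasPart": {
            "@type": "WebPageElement",
            "isAccessibleForFree": False,
            # Must match the class on the element wrapping the gated text.
            "cssSelector": css_selector,
        },
    }


def as_script_tag(data):
    """Serialize the JSON-LD into a script tag suitable for the page head."""
    body = json.dumps(data, indent=2)
    # "\/" is a valid JSON escape; this keeps "</script>" out of the payload.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">' + body + "</script>"


if __name__ == "__main__":
    print(as_script_tag(paywall_jsonld("Example headline", ".paywall")))
```

The point of the markup is that the crawler and the reader get the same HTML, and the publisher declares which part is gated instead of hiding it from one audience.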
Big properties do that and Google doesn't care.
You sure about that? How big qualifies as big in your eyes? Medium implemented homepage cloaking in November of 2019 and within a month they lost 40% of their overall search visibility:

https://www.onely.com/blog/medium-lost-half-visibility/

Pinterest? Google is somehow able to crawl images on their site, but I certainly can't view these images without creating an account and logging in. Their site ranks highly enough that I gave up on Google image search last year.

Additionally, off the top of my head Bloomberg and NYT both won't allow me to view more than a few articles but let the Google crawler index their articles.

Yes, Pinterest is awful. Certain categories of images are totally borked on Google and other search engines because of them.
I thought that was an offense that would get one either severely de-ranked or delisted from search outright?
What do paywalled sites do, then? They don't seem to have a problem ranking just fine.
They show the abstract or the same bit of the headline and maybe a paragraph you see when hitting the page. Googlebot doesn't get the full text.
I've definitely seen article text on Google that was not presented to me when I clicked the link, and not because the article was updated or something, but because I was blocked from seeing the article.
It could be in a hidden element or something. I find that pages that do this fall apart in the face of things like Safari reader mode.
Many "paywalled" sites lose their paywall if you disable JS.
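To make the hidden-element and no-JS points concrete, here is a minimal sketch that reads the text straight out of the raw HTML, skipping scripts and styles, roughly what a reader mode or a JS-less client sees. The sample markup, the `.paywall` class and the helper names are invented for illustration; real sites vary, and many do not ship the full text at all.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of script and style tags."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # CSS-hidden elements are still plain markup here, so their text is kept.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def text_without_js(html):
    """Return the text present in the markup before any script runs."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Invented example of a "soft" paywall: the text is shipped, then hidden client-side.
SAMPLE = """<article><p>First paragraph.</p>
<div class="paywall" style="display:none"><p>Rest of the story.</p></div>
<script>document.querySelector('.paywall').remove()</script></article>"""

if __name__ == "__main__":
    print(text_without_js(SAMPLE))  # First paragraph. Rest of the story.
```

If the article body shows up in this output, the paywall is being enforced in the browser rather than on the server, which is why disabling JS gets around it.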