Defeating AI scraping by rethinking webpage rendering
2 points
by exodys
147 days ago
Consider a project that renders webpages to images on the server and sends those over the web, updating them whenever input is received, much like a video game input loop. If everything were rendered server side as images, how difficult would scraping become? The idea of an un-copyable webpage is enticing, assuming you do not want your data scraped. I know computer vision is a thing, but might its error rate be high enough to deter scrapers?
|
Web pages can render in pieces; images not so much, at least not the way web pages can. What is the resolution of a web page? It really depends on the browser and the OS. Some pages render at very high pixel density because the device allows it (Macs, for example), and many browsers nowadays support color spaces beyond plain sRGB. So if your site uses more advanced color spaces, are you going to render to a standard RGB image, meaning your customers get less vivid designs from your solution than from the browser? Or are you going to render at the highest resolution and widest color space possible, meaning the images get even bigger and even slower to download?
Are you going to render multiple resolutions so that each user agent gets the right one? That saves bandwidth, but only by doing more renders on the server, which your customer ends up paying for.
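To put rough numbers on the resolution question, here is a minimal sketch of the uncompressed size of a single rendered frame at different device pixel ratios. The 1280x720 viewport and the 4 bytes per pixel (8-bit RGBA) are assumptions picked for illustration, and real image compression would shrink these figures considerably.

```python
def raw_frame_bytes(css_width, css_height, device_pixel_ratio, bytes_per_pixel=4):
    """Uncompressed size in bytes of one rendered frame.

    css_width / css_height: viewport size in CSS pixels (hypothetical values below).
    device_pixel_ratio: physical pixels per CSS pixel (e.g. 2 on many high-density displays).
    bytes_per_pixel: 4 assumes 8-bit RGBA; a wider-gamut format would need more.
    """
    width_px = round(css_width * device_pixel_ratio)
    height_px = round(css_height * device_pixel_ratio)
    return width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    # Assumed viewport of 1280x720 CSS pixels, rendered at three pixel ratios.
    for dpr in (1, 2, 3):
        size = raw_frame_bytes(1280, 720, dpr)
        print(f"DPR {dpr}: {size / 1_000_000:.1f} MB uncompressed")
```

The size grows with the square of the pixel ratio, so serving one "best possible" render to everyone is the expensive option, while rendering per device multiplies the server work instead.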
What is the caching behavior here?
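One way to frame the caching question: if a rendered image depends on the page state plus everything about the client's display, the cache key has to include all of it. The sketch below just enumerates the combinations; the function name and every value in it are hypothetical, not taken from any real system.

```python
from itertools import product


def cache_keys(page_states, viewports, pixel_ratios, color_spaces):
    """Enumerate every (state, viewport, pixel ratio, color space) combination
    a server-side image renderer would have to cache separately."""
    return list(product(page_states, viewports, pixel_ratios, color_spaces))


if __name__ == "__main__":
    # All values are made up for illustration.
    keys = cache_keys(
        page_states=["menu-closed", "menu-open"],
        viewports=[(360, 640), (1280, 720), (1920, 1080)],
        pixel_ratios=[1, 2],
        color_spaces=["srgb", "display-p3"],
    )
    print(len(keys), "distinct renders for a single page")
```

With HTML, the server caches one document and the browser handles the display differences; here every combination is a separate cached image, and any interactive state multiplies the count again.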
I believe the performance of this solution would by necessity be sub-optimal. Nobody likes sub-optimal performance on the web, because almost all of the web is built for entertainment, and people won't accept poor performance from their entertainment.
https://medium.com/luminasticity/on-premature-optimization-i...