Hacker News
by gonzo41 1690 days ago
I think there's an opportunity for a new JS framework to have something like a randomly generated DOM that always displays the page and its elements the same way to a human but constantly breaks selector paths for computers.

Like displaying a table with semantic elements one time, then with divs the next, then with an iframe using CSS grid and values floated over the top.
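To make the idea concrete, here is a rough sketch of what such a renderer might look like. It is not taken from any real framework: `renderTable`, `makeRng`, `randomName`, and `visibleText` are hypothetical names, it only covers the table-vs-grid switch (no iframe or floated values), and it assumes every row has the same number of cells.

```typescript
// Hypothetical sketch: none of these names come from a real framework.
type Row = string[];

// Tiny seeded LCG so the same seed always reproduces the same markup.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Throwaway class name, so a selector like ".price" never stays valid.
function randomName(rng: () => number): string {
  return "c" + Math.floor(rng() * 1e9).toString(36);
}

function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Render the same rows either as a semantic table or as a flat CSS grid of divs.
function renderTable(rows: Row[], seed: number): string {
  const rng = makeRng(seed);
  const outer = randomName(rng);
  const inner = randomName(rng);
  if (rng() < 0.5) {
    const body = rows
      .map((r) => "<tr>" + r.map((c) => `<td class="${inner}">${escapeHtml(c)}</td>`).join("") + "</tr>")
      .join("");
    return `<table class="${outer}">${body}</table>`;
  }
  // Assumption: all rows have the same length, so the grid lines up like the table.
  const columns = rows.reduce((max, r) => Math.max(max, r.length), 0);
  const cells = rows
    .map((r) => r.map((c) => `<div class="${inner}">${escapeHtml(c)}</div>`).join(""))
    .join("");
  return `<div class="${outer}" style="display:grid;grid-template-columns:repeat(${columns},auto)">${cells}</div>`;
}

// Text left after stripping tags (entities are not decoded).
function visibleText(html: string): string {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}
```

Calling `renderTable(rows, seed)` with a fresh seed per request changes the element names and class names, while `visibleText` of the result stays the same for every seed.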

This almost seems like a problem for AI to solve.

4 comments

Even if your DOM is obfuscated, the rendered page remains vulnerable to OCR. Obfuscate the rendered pixels and you’ll annoy your humans and eventually find that the scrapers’ OCR is superhuman.

Still, maybe AI comes into it. Maybe poisoning the data is the right way to do it, conditioned on ML-juiced anomaly detection.

PDFs and print newspapers are still a massive pain in the ass to OCR accurately.
To some extent those already exist and I get annoyed by them when they cause 1Password to be useless on their login page. But it probably would help with algorithmic scraping.
This is already common. It's mildly annoying for scrapers but generally a waste of time since you can usually still orient yourself based on the content of the nodes.
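A minimal sketch of that "orient yourself by content" approach, for illustration only: the helper names `textNodes` and `valueAfterLabel` and both sample snippets are made up, and a real scraper would use a proper HTML parser rather than a regex split.

```typescript
// Hypothetical sketch: ignore tag names, classes, and nesting; keep only text in document order.
function textNodes(html: string): string[] {
  return html
    .split(/<[^>]*>/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Find a known label and return whatever text follows it, regardless of the path to it.
function valueAfterLabel(html: string, label: string): string | undefined {
  const nodes = textNodes(html);
  const i = nodes.indexOf(label);
  return i >= 0 && i + 1 < nodes.length ? nodes[i + 1] : undefined;
}

// The same content under two unrelated structures (made-up examples).
const asTable = '<table class="k3f"><tr><td>Price</td><td>9.99</td></tr></table>';
const asDivs = '<div class="q9z"><div><span>Price</span></div><div><b>9.99</b></div></div>';
```

Both `valueAfterLabel(asTable, "Price")` and `valueAfterLabel(asDivs, "Price")` return the same string, because nothing in the lookup depends on the element path or the class names.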
This would have huge accessibility issues, breaking screen readers and the like.
We already have react-native-web (<3), so we have that covered.