by ryan14975 101 days ago
This makes a lot of sense. Cloudflare already has the rendered content at the edge, so serving a structured snapshot from cache would eliminate redundant crawling entirely.

What I'd love to see is site owners being able to opt in and control the format. Something like a /cdn-cgi/structured endpoint that respects your robots.txt directives but gives crawlers clean markdown or JSON instead of making them parse raw HTML. The site owner wins (less bot traffic), the crawler wins (structured data), and Cloudflare wins (less load on origin).
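
To make the idea concrete, here is a rough sketch of what such an endpoint's handler might do. Everything in it is made up for illustration: /cdn-cgi/structured is not a real Cloudflare endpoint, and `serve_structured`, `SNAPSHOTS`, `ROBOTS_TXT`, and the "ExampleBot" user agent are hypothetical names. It also assumes the edge already holds an extracted title and text for each page, and that a robots.txt disallow is answered with a 403, which is a design choice rather than anything robots.txt itself requires. The only real API used is Python's standard `urllib.robotparser`.

```python
import json
from urllib.robotparser import RobotFileParser

# Hypothetical edge cache: path -> snapshot already extracted from rendered HTML.
SNAPSHOTS = {
    "/blog/post": {"title": "Hello", "text": "First paragraph."},
}

# Hypothetical robots.txt published by the site owner.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def serve_structured(path, user_agent, fmt="markdown",
                     robots_txt=ROBOTS_TXT, snapshots=SNAPSHOTS):
    """Return (status, content_type, body) for a structured-snapshot request."""
    # Apply the site's robots.txt rules before touching the cache.
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, path):
        return 403, "text/plain", "Disallowed by robots.txt"

    # Serve only what is already cached; never fall through to the origin.
    page = snapshots.get(path)
    if page is None:
        return 404, "text/plain", "No cached snapshot"

    # The caller picks the format: JSON, or markdown by default.
    if fmt == "json":
        return 200, "application/json", json.dumps(page)
    return 200, "text/markdown", f"# {page['title']}\n\n{page['text']}\n"
```

A crawler would call something like `serve_structured("/blog/post", "ExampleBot", fmt="json")` and get the cached snapshot back without the origin seeing a request. A real version would also need the opt-in switch and cache invalidation, which this skips.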