by ffgjgf1 860 days ago
But GitHub is not a content company, and they don’t really own the copyright to almost anything hosted there.
2 comments

That's true, but there's an interesting parallel with GitHub's corporate parent, Microsoft, and Microsoft's other platform company, LinkedIn[1]. LinkedIn went to court against a scraper, hiQ Labs, over retrieving data from the site.

LinkedIn isn't a content company either, nor do they really own any content posted there (they don't, right?), but a large part of their business moat comes from the network of people posting content there. Scrapers and bots undermine this, which is something the AI boom facilitates.

1: https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn

There is a cost to serving up all that content, and if hundreds of AI startups are all trying to pull in data, that can add up fast. It’s not typical user behavior.
If it’s just static content, it wouldn’t really be that expensive. In reality, egress traffic is extremely cheap compared to what Azure, AWS, etc. charge for it.