by olmo23 16 days ago
> Funnily enough, people on HN often do not consider this an issue, like at all...

I didn't have a problem with it when it was Aaron Swartz, not sure why I should have a problem with it when others do it.

Aaron Swartz never did whatever it was he was going to do. He was caught and hounded to death before that.

But he was working with scientific papers, the outputs of public institutions, and his likely goal was releasing them to the public. What proprietary AI companies have done in training LLMs on every book in existence is nothing like that.

A lot of what they have done is the reverse. They have used a lot of such publicly funded information (and a lot of other freely available information) to train LLMs that are proprietary.

The strange thing is he picked a fight with a store of humanities papers rather than scientific ones.

JSTOR holds content from lots of journals, including in the sciences. It's not only humanities papers.

1) those were scientific papers; the authors weren't getting paid either way (unlike book authors, who make a living from their books)

2) more importantly, Swartz wasn't building a business empire on the pirated data and charging for access

I don't see how the two are even remotely similar