by isityettime 22 days ago
Aaron Swartz never did whatever it was he was going to do. He was caught and hounded to death before that.

But he was working with scientific papers, the outputs of public institutions, and his likely goal was releasing them to the public. What proprietary AI companies have done in training LLMs on every book in existence is nothing like that.

2 comments

A lot of what they have done is the reverse. They have used a lot of such publicly funded information (and a lot of other freely available information) to train LLMs that are proprietary.
The strange thing is he picked a fight with a store of humanities papers rather than scientific ones.
JSTOR holds content from lots of journals including in the sciences. It's not only humanities papers.