by CWuestefeld 37 days ago
I assume that the repository of books was used as training data, but not by way of the annas-archive domain. Instead, it would make a lot more sense for them to download the whole pile via BitTorrent, which has nothing at all to do with the domain. In other words, the legal solution here wouldn't have prevented the problem.

> We’re able to provide high-speed access to our full collections, as well as to unreleased collections.

> This is enterprise-level access that we can provide for donations in the range of tens of thousands USD. We’re also willing to trade this for high-quality collections that we don’t have yet.

https://annas-archive.gl/llm

Which is interesting. What if they had proof of US AI companies paying them (AA) for sourcing “high-quality collections that we don’t have yet”? Procurement of an illegal act is an illegal act. Might this be enough to garner some legal cover from their presumably well-heeled customers?
We at least have proof of US AI companies buying access to AA in general. [0]

[0] - https://torrentfreak.com/nvidia-contacted-annas-archive-to-s...