by rainytuesday 1154 days ago
Data is going to get locked up tight and only released for a king's ransom. Must be some investment opportunity here? Who has the shovels in this gold rush? LexisNexis? Elsevier? Suggestions welcome.
2 comments

Seems like the easiest thing to do is just start creating large torrent files with data to train on.

Wikipedia already has torrents. A Usenet archive might be a good addition, maybe some public medical journals, and so on.

OpenAI has enough tweets, they don't need more. There are other much better data sources.