by bayindirh 533 days ago
> I can read books and then earn money from applying what I learned in them.

How many books can you read, understand, and memorize in time T, and how many books can an AI ingest in the same time?

If we're down to paraphrasing, watch this video [1], and think again.

Many models, if you ask the right questions, reproduce their training set with great accuracy, and IIUC this is only prevented by monkey patching.

So, it's still a big mess, even if we don't add a copyrighted corpus to the mix. Oh, BTW, datasets like "The Stack" are not as clean as they claim. I have seen at least two non-permissively licensed code repositories inside that dataset.

[1]: https://youtu.be/LrkAORPiaEA

1 comment

I agree it's a big mess; that was kind of my point.

I am curious about the video, but am not compelled to spend 24 min watching it when you haven't summarized its thesis for me. The title of the video makes it seem adjacent at best to the points I was making. (Some automated flagging system =/= actual law)