by altruios 846 days ago
Asking what the training data was is valid. Knowing what these AIs are trained on is to everyone's benefit.

Your comment is correct but misdirected.

I didn't say anything about OP's first question. Your comment is about that.

As a contrived example: if you train exclusively on AGPL source code, the probability of generating something identical to AGPL-licensed code is likely non-zero.

This is a very important question.