Hacker News
by blibble 1818 days ago
surely that depends on the size of the training set?

I could feed the Linux kernel one function at a time into an ML model, then coerce its output to be exactly the same as the input

this is obviously copyright infringement

whereas in the GitHub case, where they've trained it on millions of projects, maybe it isn't?

does the training set size become relevant legally?