by preseinger 1075 days ago
how could someone "prove" which inputs and outputs of a large ML model leveraged any specific data?

> That's her and her legal team's homework assignment.

no, it isn't

it's an unsatisfiable requirement, and unnecessary to substantiate the legal claims

it's dumb to talk about

> it's an unsatisfiable requirement,

There's a wealth of primary literature describing ways to probe models for their training data. There's also discovery and a whole host of other legal processes to answer this question.

The plaintiff who filed the lawsuit should prove what they allege.

> unnecessary to substantiate the legal claims

Why?

What if the model has absolutely zero of her data in it? Should she even be allowed to bring this case to court?

> it's dumb to talk about

Absolutely not! It's central to the entire case.

Even if her data is in the model, there's still the question of whether she should be compensated. I'd argue no, for the same reason that babies who grow up watching Disney don't owe their entire intellectual output to the company.