by somat 932 days ago
Speaking of remembering training data, I see that as a big problem with chat-based systems. They swallow a bunch of data, then generate something when prompted. My worry is not so much copyright infringement as something more like "citation needed".

Has anyone done any work to produce citations for the generated data?

1 comment

Some work, yeah. It's still an open problem to do it well, but I think the folks at Anthropic have made a reasonable start with their work[1] on influence functions ("tracing model outputs to the training data"). Basically, their work attempts to answer the question "which training data most strongly influenced the model to give the answer it did?" They do it with some fancy math that I think is equivalent to this: take the gradient produced by each piece of training data, compute the derivative of the loss on the output of interest as that gradient is applied to the model, and use that derivative as the answer.

Though it sounds like even their much cheaper clever approach is still very expensive.
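
To make the gradient description above concrete, here is a toy sketch of the first-order version of that idea: score each training example by the dot product of its loss gradient with the loss gradient of the output of interest. This is not the method from [1]; as far as I understand it, the paper's influence functions also include an inverse-Hessian term (which they approximate), and that term is omitted here. The linear model, the squared-error loss, and the names grad_loss and influence_scores are all assumptions made up for illustration.

```python
# Toy sketch only: first-order "influence" via gradient dot products.
# Assumed model: y_hat = w . x, with loss 0.5 * (y_hat - y)^2.
# The inverse-Hessian term used by real influence functions is left out.

def grad_loss(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def influence_scores(w, train_set, query):
    """Score each training example against one query example.

    A step w <- w - lr * g_train changes the query loss at rate
    -(g_query . g_train) at lr = 0. We return g_query . g_train, so a
    positive score means a step on that example lowers the query loss.
    """
    qx, qy = query
    g_query = grad_loss(w, qx, qy)
    scores = []
    for x, y in train_set:
        g_train = grad_loss(w, x, y)
        scores.append(sum(a * b for a, b in zip(g_query, g_train)))
    return scores

# Hypothetical data: three training examples and one output of interest.
w = [0.0, 0.0]
train_set = [
    ([1.0, 0.0], 1.0),   # same input and label as the query
    ([0.0, 1.0], 1.0),   # unrelated input direction
    ([1.0, 0.0], -1.0),  # same input, opposite label
]
query = ([1.0, 0.0], 1.0)

scores = influence_scores(w, train_set, query)
# Training examples ranked from most helpful to most harmful for the query.
ranked = sorted(range(len(train_set)), key=lambda i: scores[i], reverse=True)
```

Even in this stripped-down form you need one gradient per training example for every output you want to explain, which gives a feel for why doing it at LLM scale gets expensive.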

[1] paper at https://arxiv.org/abs/2308.03296, post at https://www.anthropic.com/index/influence-functions