by Shamar 1751 days ago
Beyond the global output and error of each sample from the last epoch, the log also includes the weight update of one single (fully connected) node for each layer.

During the compilation phase, the training dataset is projected on a complex vector space that is constituted by both the "model" of the "neural network" and these logs.

It's just like projecting a shadow onto a two-dimensional surface: if you discard the data pertaining to one dimension, you have no hope of guessing what projected it. You need both dimensions.

The logs that are preserved in the compilation process are the part of the vector space that is usually discarded during the "training".

But discarding the "model" would have exactly the same effect: you cannot get back the source dataset from those logs alone. That's why this does not "smuggle the training dataset back".

Indeed, the fact that the source dataset is obtainable from the pair "these logs" + "final model", but neither from "these logs" alone nor from the final model alone, proves that a substantial portion of the source dataset is always embedded in the "model", which thereby becomes a derivative work of the sources.

2 comments

The last iteration (or epoch) of SGD is not shipped with the trained model. The point just does not stand. There are other (better) arguments for why such models are derivative works.

Basically the argument starts with a claim (you can reconstruct the training set of model X from its weights alone) and then shows something totally different. Of course you can reconstruct from the gradient updates plus the weights—that's not interesting, nor does it support the claim.
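
To make "reconstruct from the gradient updates plus the weights" concrete, here is a minimal sketch under strong assumptions that are not taken from the post: a two-layer purely linear network, squared-error loss, per-sample SGD with a known learning rate, and a log that keeps the output error plus the weight update of one hidden node. The names `forward`, `logged_step` and `reconstruct_input` are hypothetical, and this is not the logging format the post describes.

```python
import random

def forward(W1, w2, x):
    """Toy two-layer linear net: hidden = W1 x, output = w2 . hidden."""
    hidden = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    return sum(v * h for v, h in zip(w2, hidden))

def logged_step(W1, w2, x, target, lr, node):
    """Return what the assumed log keeps for one sample:
    the output error and the weight update of a single hidden node."""
    error = forward(W1, w2, x) - target
    # For loss 0.5 * error**2, dLoss/dW1[node] = error * w2[node] * x
    update = [-lr * error * w2[node] * xi for xi in x]
    return error, update

def reconstruct_input(update, error, w2_node, lr):
    """Invert the update rule using the log (update, error) and one weight."""
    scale = -lr * error * w2_node
    if scale == 0:
        raise ValueError("update carries no information about the input")
    return [u / scale for u in update]

random.seed(0)
W1 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
w2 = [random.uniform(0.5, 1.5) for _ in range(3)]
x = [0.2, -0.7, 1.0, 0.4]  # the "training sample" to recover

error, update = logged_step(W1, w2, x, target=1.0, lr=0.1, node=1)
recovered = reconstruct_input(update, error, w2[1], lr=0.1)
print(recovered)
```

In this toy case, passing a different value for `w2_node` returns the input multiplied by a constant factor rather than the exact input. Whether anything similar holds for deeper nonlinear networks is not something this sketch shows.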

This does not prove that the source dataset is embedded in the model. You could do this with a random model and get the same result...
I strongly encourage you to prove your statement with a script that uses the saved logs and a random "model" and gets back the exact source dataset.