|
|
|
|
|
by msdz
98 days ago
|
|
They’d probably get the farthest, but they won’t pursue that because they don’t want to end up leaking the original data from training.
It is possible in regular language/text subsets of models to reconstruct massive consecutive parts of the training data [1], so it ought to be possible for their internal code, too. [1] https://arxiv.org/abs/2601.02671 |
|