by famouswaffles
608 days ago
>Synthetic data can never contain more information than the statistical model from which it is derived: it is simply the evaluation of a non-deterministic function on the model parameters. And the model parameters are simply a function of the training data.

The information in the data isn't just about the outputs but also their rate of occurrence, i.e. the distribution. If what your base model has learnt is only enough to produce the occasional flash of brilliance, say 1 out of 40 responses, and you are able to filter for those responses and generate as many as you like, then you can very much 'bootstrap a better model' by training on the filtered results. You are only getting a function of the model's parameters if you train on its unfiltered, unaltered output.
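To make the 1-in-40 argument concrete, here's a toy sketch, not a real training loop. Everything in it is an assumption for illustration: a response is reduced to a boolean ("brilliant" or not), `noisy_filter` is a hypothetical imperfect verifier that lets 1% of ordinary responses slip through, and `fit_rate` stands in for "training" by just taking the fraction of brilliant examples in the data.

```python
import random

def sample_responses(p_good, n, rng):
    """Draw n responses; True marks a 'brilliant' one, False an ordinary one."""
    return [rng.random() < p_good for _ in range(n)]

def noisy_filter(responses, false_positive_rate, rng):
    """Hypothetical imperfect verifier: keeps every brilliant response and
    lets an ordinary one through with probability false_positive_rate."""
    return [r for r in responses if r or rng.random() < false_positive_rate]

def fit_rate(responses):
    """Toy stand-in for training: the new model's hit rate is simply the
    fraction of brilliant examples in its training set."""
    return sum(responses) / len(responses)

rng = random.Random(0)   # fixed seed so the run is repeatable
base_rate = 1 / 40       # the "1 out of 40" from the comment

raw = sample_responses(base_rate, 100_000, rng)

# Training on unfiltered output just reproduces the base distribution.
unfiltered_rate = fit_rate(raw)

# Training on filtered output shifts the distribution toward the good responses.
filtered_rate = fit_rate(noisy_filter(raw, 0.01, rng))

print(f"base rate:           {base_rate:.4f}")
print(f"fit on unfiltered:   {unfiltered_rate:.4f}")
print(f"fit on filtered:     {filtered_rate:.4f}")
```

In this sketch the unfiltered fit lands back near the base rate, while the filtered fit ends up far above it even though the filter is imperfect. The gain comes entirely from the selection step changing the distribution, not from the samples themselves.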