by ethbr1
357 days ago
Wouldn't a model that can recite training data verbatim be larger than necessary? Exact text isn't coming from nowhere, no matter how efficiently the bits are encoded, and the same effectiveness should be achievable by compressing those portions of the model.