by SubiculumCode
2 hours ago
For comparison, what do those compress to with conventional approaches? I'm curious. A classic machine learning ensemble approach is to overfit a collection of small models and then bag them (e.g. by voting), which lets the ensemble generalize. I'm sure someone has tried overfitting a bunch of transformers for compression like this and then bagging them to see how well it does?
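To make the bagging idea concrete, here's a rough toy sketch. It has nothing to do with transformers or compression; it only shows the "overfit small models, then vote" pattern. The data, the 1-nearest-neighbour "model", and all function names (`fit_1nn`, `train_bagged`, `bagged_predict`) are made up for illustration.

```python
import random
from collections import Counter

def fit_1nn(sample):
    """Return a 1-nearest-neighbour predictor.

    It simply memorises its training sample, so it is deliberately overfit.
    """
    def predict(x):
        nearest = min(sample, key=lambda point: abs(point[0] - x))
        return nearest[1]
    return predict

def train_bagged(data, n_models=25, seed=0):
    """Train each model on its own bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(data) points with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(fit_1nn(sample))
    return models

def bagged_predict(models, x):
    """Majority vote across the individual models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: label 0 below 5, label 1 from 5 up,
# with one deliberately mislabeled point at x = 2.
data = [(float(i), 0 if i < 5 else 1) for i in range(10)]
data[2] = (2.0, 1)

models = train_bagged(data)
print(bagged_predict(models, 0.0), bagged_predict(models, 9.0))
```

Each individual model sees a different resample, so their mistakes differ, and the vote averages over them. Whether that carries over to overfit transformers used as compressors is exactly the open question.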