by aesthesia
57 days ago
I notice the experiments are all run with Gaussian token embeddings and weight matrices, which is a very different scenario than you would get in a real model. It shouldn't be much more difficult to try this with an actual model and data and get a much better sense of how well it compresses.
I've started trying this out with actual models, but I'm currently running everything on CPU, so it's pretty slow. Ideally I'd run this properly on a GPU, but that gets expensive quickly.
So yeah, this is still very much a research prototype, but validating it on real models and data is definitely the next step.