by ffast-math
1468 days ago
This master's thesis sort of does it for individual layers, but it doesn't have any fine-tuning yet so it completely wrecks the accuracy: https://github.com/joennlae/halutmatmul.

If someone worked on contributing this functionality to Composer [1] I'd be down to help out. I can't justify building it all on my own right now since we're 100% focused on training speedup, but I could definitely meet and talk through it, help code tricky parts, review PRs, etc.

[1] https://github.com/mosaicml/composer