Hacker News
by 0xdeadbeefbabe 145 days ago
Is anyone excited to do ablative testing on it?
1 comment

With throughput this high thanks to sparsity, I'm particularly interested in distilling it into other architectures. I'd like to try a recurrent transformer when I have the time.