Hacker News
by
0xdeadbeefbabe
145 days ago
Is anyone excited to do ablative testing on it?
1 comment
manbitesdog
145 days ago
With such high throughput thanks to sparsity, I'm particularly interested in distilling it into other architectures. I'd like to try a recurrent transformer when I have the time.