Hacker News
by limapedro 624 days ago
This is such an interesting paper. Sadly, they don't have big models. I'd like to see a model trained on TinyStories or even C4, since it should be faster than the transformer variant, and see how it compares.