Hacker News
pizza 517 days ago
Seems like we should just gradually anneal the tokenization over the course of training, moving from coarse multi-character tokens toward fine-grained single-character tokens.
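To make the idea concrete, here is a rough sketch of one way such a schedule could look. Everything in it is an assumption for illustration: the linear schedule, the per-token random split, and the names `char_split_prob` and `anneal_tokens` are hypothetical and do not come from any particular library or paper.

```python
import random


def char_split_prob(step, total_steps):
    """Hypothetical linear schedule: 0.0 at the start of training, 1.0 at the end."""
    return min(1.0, max(0.0, step / total_steps))


def anneal_tokens(tokens, step, total_steps, rng):
    """Split each multi-character token into single characters with probability p(step).

    `tokens` is a list of strings from some coarse tokenizer (assumed, not specified here).
    """
    p = char_split_prob(step, total_steps)
    out = []
    for tok in tokens:
        if len(tok) > 1 and rng.random() < p:
            out.extend(tok)  # break the token into its individual characters
        else:
            out.append(tok)  # keep the coarse token as-is
    return out


rng = random.Random(0)
coarse = ["hello", " world"]
early = anneal_tokens(coarse, 0, 100, rng)    # coarse tokens kept
late = anneal_tokens(coarse, 100, 100, rng)   # fully character-level
```

Early in training the model would see the coarse tokens unchanged, and by the end every token is broken into characters. In between, the text is the same but the segmentation is a mix of both. A real setup would also need a vocabulary that contains every single character, which this sketch does not address.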
1 comment
kevmo314 517 days ago
I believe that's similar to the idea behind BLT (Byte Latent Transformer): https://github.com/facebookresearch/blt