Hacker News
by larodi, 498 days ago
My thought on the same guess: do all tokens live in the same latent space, or in many spaces, with each logical unit trained separately from the others…?