Hacker News
by jacobn 762 days ago
> Goes to show just how much is in the training data.
And in the scale (num_layers, embed_dim, num_heads) of the model of course ;)
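
To put rough numbers on "scale", here is a minimal sketch of how those three knobs map to parameter count, assuming a standard transformer block with a 4x MLP width and ignoring embeddings, biases and layer norms. The function name and the 4x ratio are my own assumptions, not anything from a specific model's code.

```python
def approx_transformer_params(num_layers: int, embed_dim: int, num_heads: int) -> int:
    """Rough weight count for a stack of standard transformer blocks.

    Assumptions (hypothetical, for illustration): MLP hidden width is
    4 * embed_dim; embeddings, biases and layer norms are ignored.
    """
    if embed_dim % num_heads != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    # Q, K, V and output projections, each embed_dim x embed_dim.
    attn = 4 * embed_dim * embed_dim
    # Two linear layers: embed_dim -> 4*embed_dim -> embed_dim.
    mlp = 2 * embed_dim * (4 * embed_dim)
    # num_heads only splits embed_dim into heads, so it adds no weights here.
    return num_layers * (attn + mlp)


if __name__ == "__main__":
    print(approx_transformer_params(num_layers=12, embed_dim=768, num_heads=12))
```

Under these assumptions the count is 12 * num_layers * embed_dim^2, so a 12-layer, 768-wide model comes out around 85M block weights. Note that num_heads changes how attention is split up but not the count in this approximation.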