Hacker News
by
minimaxir
1172 days ago
A larger vocab takes longer to train but has no (practical) impact at inference time, since an embedding lookup is just a key-value store: the token ID is the key and its vector is the value. That's very helpful as GPT starts hitting scaling laws.
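To make the key-value point concrete, here's a minimal sketch in plain Python (not how any real GPT implementation stores its weights). The names `make_embedding_table` and `embed` and the sizes are made up for illustration, and it only covers the input-side lookup described above.

```python
import random


def make_embedding_table(vocab_size, dim, seed=0):
    """Build a toy embedding table: one row of `dim` floats per token ID."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(vocab_size)]


def embed(table, token_ids):
    """Fetch one row per token ID by direct indexing; no scan over the vocab."""
    return [table[t] for t in token_ids]


# Hypothetical vocab sizes, chosen only to contrast small vs. large.
small = make_embedding_table(vocab_size=1_000, dim=8)
large = make_embedding_table(vocab_size=50_000, dim=8)

token_ids = [3, 17, 999]
small_vecs = embed(small, token_ids)
large_vecs = embed(large, token_ids)
```

Either table does the same work per token: one index operation that returns a `dim`-sized row. The bigger vocab costs more memory and more parameters to train, but the lookup itself doesn't get slower.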