Y
Hacker News
new
|
ask
|
show
|
jobs
by
cowsaymoo
709 days ago
The vocabulary used here doesn't have sufficient intrinsic dimension to partition the input into a low loss prediction. Improvement is promising with larger context or denser attention.