Hacker News
by buildbot 301 days ago
This paper's claim holds, but only for 4-layer models. At larger scales, models improve dramatically on out-of-context examples.