Hacker News
by h2odragon, 931 days ago
You should see my rants about why normalizing weights is a bad idea and how a limited context window is effectively random interpolation.