by whiplash451, 184 days ago
Not just hyperparameter tweaking. Not foundational research either. But rather engineering improvements that compound with each other (conswiglu layers, Muon optimizer).