Y
Hacker News
new
|
ask
|
show
|
jobs
by
bear3r
112 days ago
tradeoff worth naming: you avoid the autodiff graph overhead (hence the speedup), but any architecture change means rewriting every gradient by hand. fine for a pedagogical project, but that's exactly why autodiff exists.