Hacker News
by heavymemory, 193 days ago
This still seems like gradient descent wrapped in new terminology. If all learning happens through weight updates, it's just rearranging where the forgetting happens.