by homerowilson, 1712 days ago
A nice, short, recent paper along these lines (not mentioned in the article) is "Every Model Learned by Gradient Descent Is Approximately a Kernel Machine":
https://arxiv.org/abs/2012.00152