by cs702
824 days ago
Yes, that sounds right, but it doesn't make the work less worthwhile or less interesting. The authors do interesting things with the NFM, including explaining why pruning should even be possible and why we see grokking during learning. They also train a kernel machine iteratively, at each step alternating between (1) fitting the model's kernel matrix to the data and (2) computing the average gradient outer product of the model and replacing the kernel matrix with it. The motivation is to induce the kernel machine to "learn to identify features." The approach seems to work well, outperforming all previous approaches on tabular data. PS. I've updated my comment to add these additional points.
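
A rough NumPy sketch of that alternating loop, for anyone who wants to see it concretely. This is an illustration under several assumptions, not the authors' code: it assumes the matrix being replaced is a feature matrix `M` that sits inside the kernel's distance, it uses a Gaussian kernel because the gradient is easy to write down, and it fits with plain ridge regression. The names `mahalanobis_gaussian_kernel` and `fit_rfm_sketch`, and the `reg` and `bandwidth` parameters, are made up for this example.

```python
import numpy as np


def mahalanobis_gaussian_kernel(X, Z, M, bandwidth=1.0):
    """K[i, j] = exp(-(x_i - z_j)^T M (x_i - z_j) / (2 * bandwidth**2)).

    Assumes M is symmetric positive semidefinite.
    """
    XM = X @ M
    ZM = Z @ M
    sq_dists = (
        np.sum(XM * X, axis=1)[:, None]
        - 2.0 * XM @ Z.T
        + np.sum(ZM * Z, axis=1)[None, :]
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def fit_rfm_sketch(X, y, n_iters=5, reg=1e-3, bandwidth=1.0):
    """Alternate between a kernel ridge fit and an AGOP update of M.

    Hypothetical helper: the kernel choice, `reg`, and `bandwidth` are
    assumptions made for illustration only.
    """
    n, d = X.shape
    M = np.eye(d)  # start from the ordinary (isotropic) kernel
    for _ in range(n_iters):
        # Step 1: fit the kernel machine to the data with the current M.
        K = mahalanobis_gaussian_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + reg * np.eye(n), y)

        # Step 2: gradient of f(x) = sum_j alpha_j K(x, x_j) at each x_i.
        # For this kernel: grad f(x_i) = -(1/s^2) M sum_j alpha_j K_ij (x_i - x_j).
        W = K * alpha[None, :]
        diffs = W.sum(axis=1, keepdims=True) * X - W @ X
        grads = -(diffs @ M) / bandwidth**2  # shape (n, d); M is symmetric

        # Average gradient outer product becomes the new M.
        M = grads.T @ grads / n

    # Refit once more so alpha matches the final M.
    K = mahalanobis_gaussian_kernel(X, X, M, bandwidth)
    alpha = np.linalg.solve(K + reg * np.eye(n), y)
    return M, alpha
```

If the target depends on only a few input coordinates, the hope is that `M` ends up concentrating its weight on those coordinates, which is one way to read "learning to identify features."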