Hacker News
by thesz, 642 days ago
KANs have an O(N^(-4)) scaling law for the loss, where N is the number of parameters. MLPs scale as O(N^(-1)) or worse.
So where you would need an MLP with tens of billions of parameters, you may need a KAN with only thousands.
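A rough back-of-the-envelope sketch of that arithmetic, assuming the loss follows a pure power law N^(-alpha) with the same constant factor for both architectures (a simplification; real constants differ and would shift the result). The function name `equivalent_params` is made up for illustration:

```python
def equivalent_params(n_ref: float, alpha_ref: float, alpha_new: float) -> float:
    """Solve n_new ** (-alpha_new) == n_ref ** (-alpha_ref) for n_new.

    Assumes loss ~ N ** (-alpha) with identical constant factors,
    which is an idealization rather than a measured result.
    """
    return n_ref ** (alpha_ref / alpha_new)


if __name__ == "__main__":
    # MLP exponent 1 and KAN exponent 4, as quoted in the comment above.
    for n_mlp in (1e10, 5e10, 1e11):
        n_kan = equivalent_params(n_mlp, alpha_ref=1.0, alpha_new=4.0)
        print(f"MLP N = {n_mlp:.0e} -> KAN N ~ {n_kan:,.0f}")
```

With equal constants the fourth root of 10^10 is about 316, so the matching KAN size lands in the hundreds. A less favorable constant factor on the KAN side moves it into the thousands, which is still many orders of magnitude below the MLP.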