by ag2718
1 day ago
Ah I see, that's an interesting point about higher depth potentially having other benefits. For our work on smaller models (generally <5 layers), this might not have been as relevant, but I would definitely be interested to see the implications for much deeper networks. As to your point about KANs performing better or worse depending on the specific task, we definitely did notice this to some extent (symbolic tasks were the best, non-symbolic tasks such as image recognition were the worst).
|
I wonder how much of that is less about the overall task and more about the need to build up to a complex state where KANs can excel. Consider the classic neural net edge detector example: it's hard to imagine a KAN doing that task more efficiently. It seems like a necessary step in the overall process, but delegating a more capable system to a menial task is probably a waste of resources.
One layer of conv2d might be enough to turn pixels into something that KANs manage better.
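
To make the idea concrete, here is a rough sketch in plain Python, not anything from the paper. It assumes a fixed Sobel kernel standing in for a learned conv2d layer, and piecewise-linear edge functions standing in for the B-splines that KANs normally use. The names `conv2d`, `piecewise_linear`, `kan_layer`, and all the knot values are made up for illustration.

```python
# Sketch only: a fixed Sobel filter stands in for a learned conv2d layer, and
# piecewise-linear edge functions stand in for the B-splines used in KANs.

def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def piecewise_linear(x, knots, values):
    """Evaluate a 1D function given by values at sorted knots, clamping outside."""
    if x <= knots[0]:
        return values[0]
    if x >= knots[-1]:
        return values[-1]
    for k in range(len(knots) - 1):
        if knots[k] <= x <= knots[k + 1]:
            t = (x - knots[k]) / (knots[k + 1] - knots[k])
            return values[k] + t * (values[k + 1] - values[k])

def kan_layer(inputs, knots, edge_values):
    """KAN-style layer: output_j = sum_i phi_ij(input_i).

    edge_values[j][i] holds the knot values of the univariate function phi_ij.
    """
    return [
        sum(piecewise_linear(x, knots, phi) for x, phi in zip(inputs, row))
        for row in edge_values
    ]

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

# 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
edges = conv2d(image, SOBEL_X)               # 2x2 feature map from the "cheap" layer
features = [v for row in edges for v in row]

knots = [-4.0, 0.0, 4.0]
# One output unit; every phi is |x| sampled at the knots (hypothetical values).
edge_values = [[[4.0, 0.0, 4.0]] * len(features)]
output = kan_layer(features, knots, edge_values)
```

The point of the split is that the convolution does the menial pixel-level work with a handful of weights, and the KAN-style layer only ever sees edge responses rather than raw pixels. Whether that actually helps on image tasks is an open question here, not a result.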