Update 2: got it to 100% training accuracy and 99% test accuracy with the (2, 2, 2) shape.

Changes:

1. Increased the training set from 1000 to 100k samples. This solved the overfitting.
2. In the dataset generation, slightly reduced the noise (0.1 -> 0.07) so that the classes don't overlap. With an overlap, naturally, it's impossible to hit 100%.
3. Most important and specific to KANs: train for 30 steps with grid=5 (5 segments for each activation function), then 30 steps with grid=10 (initializing from the previous model), and then 30 steps with grid=20. This is idiomatic to KANs and is covered in Example_1_function_fitting.ipynb: https://github.com/KindXiaoming/pykan/blob/master/tutorials/...

Overall, my impressions are:

- it works!
- the reference implementation is very slow. A GPU implementation is dearly needed.
- it feels a bit too non-linear, and training is not as stable as it is with MLP + ReLU.
- scaling is not guaranteed to work well. Really need to see whether MNIST can be solved with this approach.

I will definitely keep an eye on this development.
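To illustrate why the grid schedule in change 3 can start each stage from the previous model, here is a minimal sketch in plain Python. It is not pykan code: it assumes a simplified activation that is piecewise-linear between evenly spaced knots, whereas pykan's activations are B-splines. The helper names (`make_grid`, `evaluate`, `refine`) and the coarse values are made up for the example. The point is only that a finer grid can be initialized to reproduce the coarse function, so training continues from where it left off instead of restarting.

```python
from bisect import bisect_right


def make_grid(lo, hi, segments):
    """Return segments + 1 evenly spaced knots on [lo, hi]."""
    step = (hi - lo) / segments
    return [lo + i * step for i in range(segments + 1)]


def evaluate(knots, values, x):
    """Piecewise-linear interpolation at x, clamped to the grid range."""
    if x <= knots[0]:
        return values[0]
    if x >= knots[-1]:
        return values[-1]
    i = bisect_right(knots, x) - 1  # index of the segment containing x
    t = (x - knots[i]) / (knots[i + 1] - knots[i])
    return (1 - t) * values[i] + t * values[i + 1]


def refine(knots, values, segments):
    """Build a finer grid whose values reproduce the coarse function."""
    new_knots = make_grid(knots[0], knots[-1], segments)
    new_values = [evaluate(knots, values, x) for x in new_knots]
    return new_knots, new_values


# Hypothetical coarse activation: 5 segments on [-1, 1], made-up values
# standing in for what 30 training steps at grid=5 might have produced.
coarse_knots = make_grid(-1.0, 1.0, 5)
coarse_values = [0.0, 0.8, -0.3, 0.5, 1.2, -0.7]

# Schedule from the comment: 5 -> 10 -> 20 segments.
knots, values = coarse_knots, coarse_values
for segments in (10, 20):
    knots, values = refine(knots, values, segments)
    # (training steps on the finer grid would go here)

# Without any training in between, the refined activation still matches
# the coarse one; only its capacity to change has grown.
samples = [i / 100 - 1 for i in range(201)]
max_gap = max(
    abs(evaluate(coarse_knots, coarse_values, x) - evaluate(knots, values, x))
    for x in samples
)
print(len(knots) - 1, "segments, max gap", max_gap)
```

Because each stage doubles the number of segments, every coarse knot is also a fine knot, so in this piecewise-linear setting the function is preserved up to floating-point error. With higher-order splines the new coefficients generally have to be fitted to the old function instead.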