Hacker News new | ask | show | jobs
by zygy 681 days ago
Naive question: what's the intuition for how this is different from increasing the number of learnable parameters on a regular MLP?
1 comments

Orthogonality ensures that each weight has its own, individual importance. In a regular MLP, the weights are naturally correlated.