by esafak
1039 days ago
I haven't read the latest literature, but my understanding is that "grokking" is a phase transition: islands of understanding (increasingly abstract features) coalesce until they form a pathway to generalization, analogous to percolation (https://en.wikipedia.org/wiki/Percolation_theory). I believe this is associated with over-parameterized models, which have the capacity to learn multiple paths (explanations). A relevant recent paper I found from a quick search: The semantic landscape paradigm for neural networks (https://arxiv.org/abs/2307.09550)
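
To make the percolation analogy concrete, here is a minimal toy sketch of site percolation on a square grid. It is only an illustration of the kind of phase transition I mean, not a model of grokking and not taken from the paper. The function names (`percolates`, `spanning_fraction`), the grid size, and the trial count are all my own assumptions. Each cell is "open" with probability p (think of it loosely as an island of understanding), and we ask how often open cells connect the top row to the bottom row.

```python
import random
from collections import deque


def percolates(grid):
    """Return True if open cells connect the top row to the bottom row."""
    n = len(grid)
    seen = [[False] * n for _ in range(n)]
    queue = deque()
    # Start a breadth-first search from every open cell in the top row.
    for c in range(n):
        if grid[0][c]:
            seen[0][c] = True
            queue.append((0, c))
    while queue:
        r, c = queue.popleft()
        if r == n - 1:
            return True  # reached the bottom row: a spanning path exists
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] and not seen[nr][nc]:
                seen[nr][nc] = True
                queue.append((nr, nc))
    return False


def spanning_fraction(p, n=20, trials=200, seed=0):
    """Estimate how often a random n x n grid with open-probability p percolates."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        grid = [[rng.random() < p for _ in range(n)] for _ in range(n)]
        hits += percolates(grid)
    return hits / trials


if __name__ == "__main__":
    # Sweep p and print the estimated probability of a spanning cluster.
    for p in (0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
        print(f"p={p:.1f}  spanning fraction={spanning_fraction(p):.2f}")
```

Sweeping p this way should show the spanning fraction staying near zero for low p and then rising sharply over a narrow range, which is the "sudden pathway" behavior I have in mind when small islands merge into one connected cluster.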