by dontwearitout 747 days ago
I haven't heard the term "shallow basin hypothesis" before, but I know what it refers to. These two papers spring to mind:

1) Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs https://arxiv.org/abs/1802.10026

2) Visualizing the Loss Landscape of Neural Nets https://arxiv.org/abs/1712.09913

There's also a very interesting body of work on merging trained models, such as by interpolating between points in weight space, which relates to the concept of "basins" of similar solutions. Skim the intro of this if you're interested in learning more: https://arxiv.org/abs/2211.08403
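To make the interpolation idea concrete, here is a minimal sketch on a made-up two-parameter loss, not on a real network. The `loss` function, the example points, and the `loss_barrier` helper are all hypothetical and chosen only for illustration. The sketch walks a straight line between two points in weight space and reports how far the loss rises above the worse endpoint; a rise near zero is one informal sign that both points sit in the same basin.

```python
# Toy loss (an assumption for illustration): minima at w[0] = -1 and w[0] = +1,
# separated by a bump at w[0] = 0, with a simple bowl along w[1].
def loss(w):
    return (w[0] ** 2 - 1.0) ** 2 + w[1] ** 2


def interpolate(w_a, w_b, alpha):
    """Point on the straight line from w_a (alpha=0) to w_b (alpha=1)."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(w_a, w_b)]


def loss_barrier(w_a, w_b, steps=11):
    """Highest loss along the straight path, minus the worse endpoint loss."""
    alphas = [i / (steps - 1) for i in range(steps)]
    path = [loss(interpolate(w_a, w_b, t)) for t in alphas]
    return max(path) - max(loss(w_a), loss(w_b))


# Two points in different basins: the straight path crosses the bump.
different_basins = loss_barrier([-1.0, 0.0], [1.0, 0.0])

# Two points in the same basin: the loss never rises above the worse endpoint.
same_basin = loss_barrier([1.0, 0.0], [1.0, 0.5])

print(different_basins, same_basin)
```

For real models the same check is done on flattened parameter tensors, and a nonzero barrier on the straight line does not rule out a low-loss curved path, which is the point of the mode connectivity paper above.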

2 comments

Yes, you both understood what I meant. I just coined the term, having in mind illustrations like Fig. 1 in "Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape" (https://proceedings.mlr.press/v151/bisla22a.html).

Reviewing the literature, I see the concept is more commonly referred to as "flat/wide minima"; e.g., https://www.pnas.org/doi/10.1073/pnas.1908636117
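As a rough illustration of what "flat" versus "sharp" means here, the sketch below compares two made-up 1-D quadratic losses that share the same minimum. The functions, the `sharpness` helper, and the perturbation size `eps` are all assumptions for illustration and are not taken from the linked papers. It measures how much the loss rises, on average, when the weight is nudged by `eps` in each direction.

```python
# Two hypothetical 1-D losses, both minimized at w = 0 with loss 0.
def sharp_loss(w):
    return 50.0 * w ** 2   # narrow, steep-walled minimum


def flat_loss(w):
    return 0.5 * w ** 2    # wide, shallow minimum


def sharpness(loss_fn, w, eps=0.1):
    """Average loss increase when w is perturbed by +eps and -eps."""
    return (loss_fn(w + eps) + loss_fn(w - eps)) / 2 - loss_fn(w)


sharp_value = sharpness(sharp_loss, 0.0)
flat_value = sharpness(flat_loss, 0.0)

# The same perturbation costs far more loss at the sharp minimum.
print(sharp_value, flat_value)
```

In high dimensions the perturbation is usually random or adversarial rather than a fixed step along one axis, so treat this as intuition only.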

Cheers! I'm familiar with those first two papers, just not with the specific term. My intuition was more of relatively deep points connected by tunnels than of a shallow basin, but that might just be the difficulty of describing high-dimensional spaces.