| How does a genetic approach compare to a greedy one? With a greedy approach you always take the next lower energy value no matter what, right? But if you end up in a local basin, you'll never escape, because you never look further than your immediate surroundings. So instead of just sampling in a tight 'circle' around your current point looking for a 'down', how about we spread that out a bit? You could sample a 'circle' in a regular pattern, but what does that even look like in high-dimensional space? It seems best to use some random distribution centered on your current position. (LLMs actually have a 'temperature' setting which introduces randomness for a similar reason.) Some of GA's claims to fame are: A) it descends using nothing but samples from this distribution, and B) it can find multiple optima. The way I think of it, the simplest GA is basically greedy optimization with spread. Greedy is like shooting a rifle, which is great for sniping, but you'll miss if the target is moving fast or doing things you can't quite keep up with. A GA, like a shotgun, introduces spread: multiple chances to hit, multiple chances to escape local optima and rough patches in the landscape. (A really good, if slightly morbid, modern example in the wild is COVID, which managed to outwit human civilization rather handily. "Not bad for a bit of encapsulated RNA," you'd think, until you realize it was running trillions of attempts in parallel. Really, the poor governments had no chance.) |
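
To make the rifle-versus-shotgun picture concrete, here is a minimal sketch in Python. Everything in it is an assumption chosen for illustration: the 1-D `energy` landscape (a shallow basin near x = +1 and a deeper one near x = -1), the step size, the population size, and the mutation width `sigma`. The "GA" is deliberately the simplest possible variant, with Gaussian mutation and keep-the-best selection only and no crossover, so it is best read as "greedy with spread" and not as a full genetic algorithm.

```python
import random


def energy(x):
    # Hypothetical landscape: local minimum near x = +0.96,
    # deeper (global) minimum near x = -1.04, barrier around x = 0.
    return (x * x - 1.0) ** 2 + 0.3 * x


def greedy(x, step=0.01, max_iters=10_000):
    # Rifle: only inspects the two immediate neighbours and moves
    # to the lower one; stops as soon as neither neighbour is lower.
    for _ in range(max_iters):
        candidate = min((x - step, x + step), key=energy)
        if energy(candidate) >= energy(x):
            break
        x = candidate
    return x


def simple_ga(x0, pop_size=20, sigma=1.0, generations=50, seed=0):
    # Shotgun: every individual throws one randomly mutated offspring
    # per generation; the lowest-energy pop_size individuals survive.
    rng = random.Random(seed)
    population = [x0] * pop_size
    for _ in range(generations):
        offspring = [p + rng.gauss(0.0, sigma) for p in population]
        population = sorted(population + offspring, key=energy)[:pop_size]
    return population[0]  # best individual found


start = 1.5  # both searches begin on the slope of the shallow basin
x_greedy = greedy(start)
x_ga = simple_ga(start)

print(f"greedy: x = {x_greedy:.3f}, energy = {energy(x_greedy):.3f}")
print(f"GA:     x = {x_ga:.3f}, energy = {energy(x_ga):.3f}")
```

Started on the same slope, the greedy search settles in the shallow basin on the right, while some of the GA's wide mutations land beyond the barrier and the population migrates to the deeper basin on the left. Shrinking `sigma` toward the greedy step size should make the two behave more and more alike, which is the sense in which the simplest GA is greedy optimization with spread.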