

by dartos 663 days ago

> how many people do you think "introspect and continually learn" on a daily basis?

At the very least, every single person who plays sports or video games, or who tries to find a way around traffic, a faster route home, a way to do less work or take a longer break, or a way to save some extra money on food.

Literally any optimization task requires observation, analysis (read: introspection), and adjustment. That’s why we model training loops as optimization problems.
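To make the observe/analyze/adjust loop concrete, here is a minimal sketch as plain gradient descent. The quadratic loss, the learning rate, and the function names are all my own assumptions, picked only for illustration; real training loops are far bigger but have the same shape.

```python
def loss(x: float) -> float:
    """Toy objective (an assumption for illustration): minimized at x = 3."""
    return (x - 3.0) ** 2


def grad(x: float) -> float:
    """Derivative of the toy loss with respect to x."""
    return 2.0 * (x - 3.0)


def optimize(x: float = 0.0, lr: float = 0.1, steps: int = 100):
    """Run a basic gradient descent loop and record the loss at each step."""
    history = []
    for _ in range(steps):
        history.append(loss(x))   # observation: how badly are we doing?
        direction = grad(x)       # analysis: which way is "better"?
        x -= lr * direction       # adjustment: change the state that persists
    return x, history


x_final, losses = optimize()
```

The point is that the adjustment is written back into `x`, so the next iteration starts from an improved state rather than from scratch.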

We spoof that with ReAct prompts in LLMs, but it becomes clear after a few iterations that there’s no real optimization going on, just guessing at tokens (a gross oversimplification, as this guessing has real uses). It’s doing what it was trained to do: complete text. Not to mention that those steps all disappear when the prompt is changed.
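A toy sketch of what I mean, under heavy assumptions: `frozen_model` is a hypothetical stand-in for an LLM (a fixed function of its prompt, not a real API), and the loop only imitates the Thought/Action/Observation shape of a ReAct prompt. The only thing that changes between iterations is the transcript.

```python
def frozen_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: a fixed function of the prompt.

    Nothing in here is updated between calls, which mirrors fixed weights.
    """
    step = prompt.count("Observation:") + 1
    return f"Thought: try option {step}\nAction: test option {step}"


def react_loop(task: str, iterations: int = 3) -> str:
    """Imitate a ReAct-style loop: all 'progress' lives in the transcript."""
    transcript = task
    for i in range(iterations):
        output = frozen_model(transcript)
        observation = f"Observation: option {i + 1} failed"  # made-up feedback
        transcript += "\n" + output + "\n" + observation
    return transcript


task = "Task: find a faster route home"
transcript = react_loop(task)

# Start over with the bare task: every earlier step is gone,
# and the model answers exactly as it did on the first iteration.
fresh = frozen_model(task)
```

Compare that with a training loop, where the adjustment is stored in the parameters and survives whatever input comes next.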