by reaperman 1006 days ago
LLMs can integrate with a sandbox to deploy and test their code and iterate on a solution until it appears to be valid. They can also integrate with web search to find presumably-valid sudoku puzzles to use as test cases if they're unable (as seems likely) to generate valid sudoku puzzles themselves. I know it's expanding the definition of "LLM" to include a sandbox or web search, but I think it's fair because it's a reasonably practical and obvious application environment for an LLM that you plan to ask to do things like this, and I think LLMs with both these integrations will be commonplace in the next 1-2 years.
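To make the "test and iterate" loop concrete, here's a rough sketch of what the harness around the model might look like. Everything in it is an assumption for illustration: `fake_llm` is a hypothetical stand-in for the model call (a real setup would prompt an LLM and run its output in an actual sandbox), and the test puzzle is built from a known arithmetic pattern rather than fetched via web search.

```python
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]            # 9x9 grid, 0 marks an empty cell
Solver = Callable[[Grid], Optional[Grid]]

def is_valid_solution(puzzle: Grid, solution) -> bool:
    """True if `solution` is a complete legal grid that keeps the puzzle's givens."""
    digits = set(range(1, 10))
    try:
        if len(solution) != 9 or any(len(row) != 9 for row in solution):
            return False
        rows = [set(row) for row in solution]
        cols = [{solution[r][c] for r in range(9)} for c in range(9)]
        boxes = [{solution[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
                 for br in (0, 3, 6) for bc in (0, 3, 6)]
        givens_kept = all(puzzle[r][c] in (0, solution[r][c])
                          for r in range(9) for c in range(9))
    except (TypeError, IndexError):
        return False  # e.g. the candidate returned None or a malformed grid
    return givens_kept and all(group == digits for group in rows + cols + boxes)

def iterate_until_valid(propose: Callable[[int, str], Solver],
                        puzzles: List[Grid],
                        max_attempts: int = 5) -> Tuple[Optional[Solver], int]:
    """Ask for a candidate solver, run it on the test puzzles, feed failures back."""
    feedback = ""
    for attempt in range(max_attempts):
        solver = propose(attempt, feedback)
        failures = []
        for i, puzzle in enumerate(puzzles):
            try:
                result = solver([row[:] for row in puzzle])  # copy: keep the test case intact
            except Exception as exc:  # a crashing candidate is just another failure
                failures.append(f"puzzle {i}: raised {exc!r}")
                continue
            if not is_valid_solution(puzzle, result):
                failures.append(f"puzzle {i}: invalid solution")
        if not failures:
            return solver, attempt + 1
        feedback = "; ".join(failures)
    return None, max_attempts

# --- hypothetical stand-ins for what the model would produce ---

def buggy_solver(grid: Grid) -> Grid:
    return grid  # "first draft": leaves the blanks unfilled

def backtracking_solver(grid: Grid) -> Optional[Grid]:
    def fits(r: int, c: int, d: int) -> bool:
        br, bc = 3 * (r // 3), 3 * (c // 3)
        return (d not in grid[r]
                and all(grid[i][c] != d for i in range(9))
                and all(grid[i][j] != d for i in range(br, br + 3) for j in range(bc, bc + 3)))

    def solve() -> bool:
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    for d in range(1, 10):
                        if fits(r, c, d):
                            grid[r][c] = d
                            if solve():
                                return True
                            grid[r][c] = 0
                    return False
        return True

    return grid if solve() else None

def fake_llm(attempt: int, feedback: str) -> Solver:
    # Assumption: the first attempt is wrong, and the "model" fixes it after feedback.
    return buggy_solver if attempt == 0 else backtracking_solver

# A known-valid grid from a shifting pattern, with roughly half the cells blanked out.
SOLVED: Grid = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
PUZZLE: Grid = [[0 if (r + c) % 2 == 0 else SOLVED[r][c] for c in range(9)] for r in range(9)]

solver, attempts = iterate_until_valid(fake_llm, [PUZZLE])
print(solver.__name__ if solver else None, attempts)
```

Note the loop only ever establishes that a solution "appears to be valid" against the test puzzles it was given, which is why the quality of those test cases matters so much.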

No, I don't think LLMs can "reason and plan". But I do think they can effectively mimic (fake) "reasoning and planning" and still arrive at the same result that actual reasoning and planning would yield, for reasonably common problems of greater than trivial but less than moderate complexity.

I think pretty much all of our production AI models today are limited by their inability to self-assess, "goal-seek", and mutate themselves to "excel". I'm not 100% sure what this would look like, but I'm sure they don't have any real "drive to excel beyond". Perhaps improvements in reinforcement learning will uncover something like this, but I think there may need to be a paradigm shift before we invent something like that.