
by ActorNightly 620 days ago

>and recognize patterns.

Not quite.

They map certain patterns in the input data onto output data in a fundamentally statistical way, which is why they can't really do math problems.

That's not to say that you can't train a model to do math, but to do that, you would need three things that are fundamentally different from current LLMs.

1. Map the tokens from the input representing some math to a hyperspace of conceptual math things with defined operations that you can do on them, and a way to represent the application of those operations. I.e., not just tokens "3" "+" "3" statistically mapping to "6", but "3" maps to some point in that space with "branching" options, "+" maps to one of those branches, and the output is run through a deterministic process.

2. Figure out how to make the models recurse in ideas, which involves some inner state of being wrong, and the ability to rewind the processing steps and try new things. I.e., search.

3. Figure out how to do all of that through training.
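To make points 1 and 2 a bit more concrete, here is a toy sketch in plain Python. It is not how any LLM works and says nothing about point 3 (getting this through training). The `OPS` table, `evaluate`, and `search` are all made-up names for illustration, and evaluation is strictly left to right with no operator precedence, which is an assumption to keep it short.

```python
import operator

# Hypothetical lookup: each operator token maps to a deterministic operation,
# instead of "3 + 3" being statistically associated with "6".
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tokens):
    """Evaluate a flat token list such as ["3", "+", "3"].

    Assumption: strictly left to right, no operator precedence.
    """
    result = int(tokens[0])
    for i in range(1, len(tokens), 2):
        op = OPS[tokens[i]]
        result = op(result, int(tokens[i + 1]))
    return result


def search(numbers, target, tokens=None):
    """Depth-first search with backtracking.

    Tries each operator between consecutive numbers. When a branch turns
    out to be wrong, it rewinds the last step and tries the next option.
    Returns the token list that hits the target, or None.
    """
    if tokens is None:
        tokens = [str(numbers[0])]
        numbers = numbers[1:]
    if not numbers:
        # All numbers placed: check whether this candidate is right.
        return list(tokens) if evaluate(tokens) == target else None
    for symbol in OPS:
        tokens += [symbol, str(numbers[0])]      # take a step
        found = search(numbers[1:], target, tokens)
        if found is not None:
            return found
        del tokens[-2:]                          # wrong: rewind and retry
    return None
```

For example, `evaluate(["3", "+", "3"])` always gives 6 because the result comes from applying `operator.add`, not from a learned association. `search([2, 3, 4], 20)` tries `2 + 3 + 4` and `2 + 3 - 4` first, rejects both, and only then lands on `2 + 3 * 4` (which is 20 under left-to-right evaluation).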

All of that is basically teaching LLMs how to do logic, which is basically what AGI is. An AGI model will essentially function by mapping a piece of information onto a knowledge graph and traversing that knowledge graph.
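As a rough picture of what "mapping to a knowledge graph and traversing it" could look like, here is a minimal sketch using a dict as the graph and breadth-first search as the traversal. Every node and edge in `GRAPH` is invented for illustration, and `find_path` is a hypothetical helper, not part of any real system.

```python
from collections import deque

# Toy knowledge graph: each concept points to related concepts.
# All nodes and edges are made up for this example.
GRAPH = {
    "3": ["integer"],
    "integer": ["number"],
    "number": ["addition", "multiplication"],
    "addition": ["arithmetic"],
    "multiplication": ["arithmetic"],
    "arithmetic": [],
}


def find_path(graph, start, goal):
    """Breadth-first traversal; returns a shortest path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

Here `find_path(GRAPH, "3", "arithmetic")` walks from the token "3" through "integer", "number", and "addition" to reach "arithmetic". Edges are one-directional in this sketch, so the reverse query returns `None`.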