| HN Mirror

	by kartoffelsaft 313 days ago
	I don't think what we have now fits that definition. LLMs are still narrowly good at language generation, and the "many" things they're good at are things that have canonical textual / linguistic representations (code, chess notation, etc.). Much of the existing AI that appears more general is really more specific models hooked up together; for example, taking the output of an LLM and piping it into a TTS. Since these pieces are easily replaceable, I struggle to call it one AI that can do many tasks. Consider the human equivalent of that LLM->TTS example: when you're talking, you naturally emphasize certain words, and part of that is knowing not just what you want to say but why you want to say it. If you had a machine learning model where the speech module had insight into why the language model picked the words it did, and also vision so it knows who it's talking to and can pick the right tone, and the motor system had access to all that too for gesturing, etc., then at that point you'd have a single AI that was indeed generally solving a large variety of tasks. We have a little bit of that for some domains, but as it stands most of what we have is lots of specific models that we've got talking to each other, falling a little short of human level when the interface between them is incomplete.