by torginus 235 days ago

It can?

If you say 'multimodal transformer' instead of LLM (which most SOTA models are), I don't think there's any reason why a transformer arch couldn't be trained to drive a car. In fact, I'm sure that's what Tesla and co. are using in their cars right now.

I'm sure self-driving will become good enough to be commercially viable in the next couple of years (with some limitations), but that doesn't mean it's AGI.

2 comments

tsimionescu 235 days ago

There is a vast gulf between "GPT-5 can drive a car" and "a neural network using the transformer architecture can be trained to drive a car". And I see no proof whatsoever that we can, today, train a single model that can both write a play and drive a car. Even less so one that could do both at the same time, as a generally intelligent being should be able to.

If someone wants to claim that, say, GPT-5 is AGI, then it is on them to connect GPT-5 to a car control system and inputs and show that it can drive a car decently well. After all, it has consumed all of the literature on driving and physics ever produced, plus untold numbers of hours of video of people driving.

famouswaffles 235 days ago

>There is a vast gulf between "GPT-5 can drive a car" and "a neural network using the transformer architecture can be trained to drive a car".

The only difference between the two is training data that the former lacks and the latter has, so it's not a 'vast gulf'.

>And I see no proof whatsoever that we can, today, train a single model that can both write a play and drive a car.

You are not making a lot of sense here. You can have a model that does both. It's not some herculean task. It's literally just additional data in the training run. There are vision-language-action models tested on public roads.

https://wayve.ai/thinking/lingo-2-driving-with-language/

torginus 235 days ago

> single model that can both write a play and drive a car.

It would be a really silly thing to do, and there are probably engineering subtleties as to why this would be a bad idea, but I don't see why you couldn't train a single model to do both.

tsimionescu 235 days ago

It's not silly; it is in fact a clear necessity to have both of these for something to be even close to AGI. And you additionally need it trained on many other tasks. If you believe that each task requires additional parameters and additional training data, then it becomes very clear that we are nowhere near a general intelligence system, and it should also be pretty clear that this will not scale to 100 tasks with anything similar to the current hardware and training algorithms.

oldestofsports 235 days ago

Okay, but then can a multimodal transformer do everything an LLM can?

torginus 235 days ago

Most SOTA LLMs are multimodal transformers.
