HN Mirror

by ps747 998 days ago

> Unlike widely used Reinforcement Learning (RL)-based and Search-based approaches, GPT-3.5 not only interprets scenarios and actions but also utilizes common sense to optimize its decision-making process

Relying on LLMs for reasoning seems dangerous due to the risk of hallucinations, especially in a safety-critical setting like self-driving. I have some other problems with this paper: for example, the comparison to RL is limited to zero-shot, and this technique will struggle to run in real time due to the slow inference speeds of LLMs.

Maybe there is some potential for LLMs to work as a fall-back mechanism in new situations or to help predict the behavior of humans and other cars, but I doubt that LLMs will become central to decision making in self-driving cars.

kromem 998 days ago

Hallucinations have become a bit too much of a boogeyman.

One should not rely on LLMs as any sort of authoritative representation of training data where data integrity is critical.

But there's generally very little propensity for hallucination about in-context information that you are feeding into them live.

Additionally, even just a second pass with a fine-tuned classifier that checks the output against the provided data for hallucinations can significantly reduce how often they occur.

The low-hanging fruit when the models were first released, summarizing massive sets of training data, is definitely an area where hallucinations have been a problem, but arguably the greater value in models moving forward is having turned them into informal logic engines of increasing caliber.

In that application, hallucinations are far less of a concern unless the context extensively overlaps with training data, and in those cases the hiccups can generally be broken by replacing tokens with representative placeholders. For example, if an LLM working on a variation of the goat, wolf, and cabbage problem keeps hallucinating details from the normal form, use different nouns, or emojis in place of the goat, wolf, and cabbage.
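A minimal sketch of that placeholder idea, in Python. The function names (`mask_terms`, `unmask`) and the `ITEM_A`-style placeholders are hypothetical, chosen only for illustration; nothing here calls a real LLM API, it just shows the text substitution before and after the model call.

```python
import re

def mask_terms(prompt, terms):
    """Swap each term for a neutral placeholder (ITEM_A, ITEM_B, ...).

    Returns the masked prompt and a placeholder -> original term mapping.
    Assumption: at most 26 terms, matched as whole words, case-insensitively.
    """
    mapping = {}
    masked = prompt
    for i, term in enumerate(terms):
        placeholder = f"ITEM_{chr(ord('A') + i)}"
        mapping[placeholder] = term
        # \b keeps "goat" from matching inside a longer word such as "goatee".
        masked = re.sub(rf"\b{re.escape(term)}\b", placeholder, masked,
                        flags=re.IGNORECASE)
    return masked, mapping

def unmask(text, mapping):
    """Put the original terms back into the model's answer."""
    for placeholder, term in mapping.items():
        text = text.replace(placeholder, term)
    return text

puzzle = "The wolf eats the goat; the goat eats the cabbage."
masked, mapping = mask_terms(puzzle, ["goat", "wolf", "cabbage"])
restored = unmask(masked, mapping)
```

The masked prompt is what would be sent to the model, and `unmask` is applied to whatever comes back. Plurals and other inflections are not handled in this sketch.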

The issue of speed is much more salient. Still, I could definitely see LLMs, combined with the generative tech stacks coming up in 3D generation, being used to create large swaths of synthetic scenario data for edge cases that are less likely to occur and be captured in real-world driving conditions, which would in turn help train faster and more comprehensive in-vehicle self-driving models.

wokwokwok 998 days ago

It’s fair though, to say that in safety critical situations where human lives are at stake…

> can reduce the degree to which they occur significantly

May not be good enough, unless you can quantify the degree to which they happen.

1/10? 1/1000? 1/1000000? More in situations like fog or rain? Perfectly safe in normal conditions?

The problem here isn’t hallucinations, correct. All systems are unsafe to some degree.

The problem is that the degree to which it is a problem is (afaik) quite difficult to quantify.

It’s not ok to just vaguely wave your hand and say it doesn’t happen that much, or that you can mitigate it to some degree by doing such and such. How much?

You have to actually be able to articulate the degree of risk involved.

There is a risk. Fact.

Is it acceptable? That’s the question, and no one seems to be able to really answer it clearly.
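As a rough illustration of what "articulating the degree of risk" could look like, here is a sketch in Python that turns a test campaign into a one-sided upper confidence bound on the failure rate (the Clopper-Pearson bound, found by bisection). The trial counts are made up, and it assumes independent trials under identical conditions, which is exactly the assumption that fog or rain would break.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p). Intended for small k only."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def failure_rate_upper_bound(failures, trials, confidence=0.95):
    """One-sided Clopper-Pearson upper bound on the true failure rate.

    Finds the p at which seeing `failures` or fewer failures in `trials`
    trials has probability exactly 1 - confidence.
    """
    if failures >= trials:
        return 1.0
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(100):  # bisection; the CDF decreases as p grows
        mid = (lo + hi) / 2
        if binom_cdf(failures, trials, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical campaign: 1000 scenarios, zero observed hallucinations.
bound_zero = failure_rate_upper_bound(0, 1000)
# Same campaign with one observed hallucination.
bound_one = failure_rate_upper_bound(1, 1000)
```

With zero failures the bound has the closed form `1 - (1 - confidence) ** (1 / trials)`, close to the familiar "rule of three" value of 3/n. In other words, even a clean run of 1000 scenarios only supports a claim of roughly "fewer than 3 in 1000" at 95% confidence, and only for the conditions that were actually tested.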

transformi 998 days ago

I think the valuable parts are xAI (explanation) and mediating the interaction between the REAL model and the human.
