Hacker News new | ask | show | jobs
by Imustaskforhelp 2 hours ago
Could you give me some example how in robotics it can be an absolute revolution?

My understanding is that robotics doesn't really rely much on LLM's in the first place but rather other things.

Is the thing that you are suggesting that it would ingest all real time data and then reason through it at an incredibly fast speed and then act on it and re-iterate? I might imagine some problems with this though I am not a robotics engineer and perhaps someone who deeply understands this topic can give more information.

2 comments

LLM are very good at looking at images and reasoning about them. much more than just object recognition/segmentation, they can explain the physics in the image, the intents, plan actions, ...
That's because of posttraining optimizing for benchmarks that test that.

They tend to collapse into nonsense and hallucinations pretty quickly if you move slightly out of the envelope of the current visual reasoning benchmaxxing.

Disclaimer: I'm a robotics noob, but I've been working on robotics for a few months now.

I'd say virtually all robots you've seen in the real world today rely on classical approaches - you build a rudimentary map, then use classical algorithms to find paths/do area coverage. The robots do no reason or understand what they're looking for, they're more like in-game units. At most there's some bounded, lightweight image classification going on.

LLMs can understand and reason about the world natively. nVidia has a Tesla FSD/Waymo competitor which simply their 7B reasoning LLM but instead of outputting tokens directly, its outputs are fed to a 2B diffusion model that outputs 1.6 second long trajectory for the car, and this is enough for an L2 system. But to make this work, they need the model to run at 10Hz, so they use super high-end hardware to do it (Jetson Thor) and the car is still "blind" for 100ms at a time (they have a parallel classical safety system).

With on-chip LLMs you could run this loop at like 100Hz on a chip that costs a few hundred bucks, rather than 10Hz on a board that costs several thousand.