Hacker News
by martythemaniak 4 days ago
I come from a regular SWE background, but I've spent the last few months getting into robotics and trying to build a snow-clearing robot, so here are my noob notes:

First, very much expected. Both Google and Qwen have been building explicit spatial reasoning and spatial output capabilities into their models since last fall; Gemini 3 was released with support for outputting trajectories, for example. I only took a look at RoboNav (more relevant for my needs), and its architecture and capabilities are in line with other similar models (e.g. NVIDIA's Alpamayo).

Second, the overall architecture they describe mirrors what I've been working on: you have a general-purpose LLM that takes a look at the world and the task in front of it and reasons to break it down into subtasks and tool calls, and you can think of RoboNav and RoboManip as tool calls here. The harness keeps a memory, manages the context of the LLM and tools, and keeps looping until the objective is complete.

Consider the task of clearing snow off a driveway using this suite: an LLM (Qwen 3.7 plus) takes a look at the driveway and decides which areas to clear. The harness then tells RoboNav to go to a certain location, then RoboNav takes over and runs in a loop until the robot is at that location. Then the harness tells RoboManip to use the plow to clear a strip of snow. The harness then calls the planner LLM to plan and execute the next clearing, and repeats until the driveway is clear.
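Here's a minimal sketch of the loop described above, in Python. Everything in it is a stand-in: `Harness`, `plan_next`, `navigate_to` and `clear_strip` are hypothetical names, the "planner" just picks the next uncleared strip instead of calling an LLM, and the nav and manip steps only record that they ran. It's meant to show the control flow (plan, navigate, clear, repeat), not any real RoboNav or RoboManip API.

```python
from dataclasses import dataclass, field


@dataclass
class Harness:
    """Hypothetical harness: keeps a memory and loops planner -> nav -> manip."""
    strips: list                                 # stand-in world state: strips still under snow
    memory: list = field(default_factory=list)   # log of tool calls made so far

    def plan_next(self):
        # Stand-in for the planner LLM: pick the next uncleared strip, or None if done.
        return self.strips[0] if self.strips else None

    def navigate_to(self, strip):
        # Stand-in for the navigation model, which would run its own loop until arrival.
        self.memory.append(("nav", strip))

    def clear_strip(self, strip):
        # Stand-in for the manipulation model driving the plow over one strip.
        self.strips.remove(strip)
        self.memory.append(("manip", strip))

    def run(self, max_steps=100):
        # Outer loop: ask the planner, dispatch the two tool calls, repeat.
        for _ in range(max_steps):
            target = self.plan_next()
            if target is None:
                break
            self.navigate_to(target)
            self.clear_strip(target)
        # Report whether the objective (no strips left) was reached.
        return self.plan_next() is None


harness = Harness(strips=["left", "middle", "right"])
done = harness.run()
```

The `max_steps` cap is there so a planner that never declares the job finished can't spin forever.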

So what are the issues? Well, they didn't release the weights or the training scripts, so you can't actually use it. But also, it's all still very research-y: the models are "small" but still huge/expensive for current edge hardware. You'd still need lots of data collection, HITL, fine-tuning, and evals to make it work for your task. You'd also need a secondary safety system to make sure the models don't wreck something. But overall, I do expect robots to use an agent/model combo like this in prod in a few years.
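To make the "secondary safety system" point concrete, here's a rough sketch of one possible form: a small rule-based gate that sits between the model's output and the motors and doesn't depend on the model at all. The limits (`MAX_LINEAR`, `MAX_ANGULAR`, `STOP_DISTANCE`) and the `Command` type are made-up values for illustration, not anything from a real robot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    linear: float   # forward speed in m/s
    angular: float  # turn rate in rad/s


# Assumed limits, chosen only for illustration.
MAX_LINEAR = 0.5      # m/s
MAX_ANGULAR = 1.0     # rad/s
STOP_DISTANCE = 0.6   # metres; closer than this means stop


def clamp(value, limit):
    # Restrict value to the range [-limit, limit].
    return max(-limit, min(limit, value))


def safety_gate(cmd, nearest_obstacle_m):
    # Hard stop if anything is too close, no matter what the model asked for.
    if nearest_obstacle_m < STOP_DISTANCE:
        return Command(0.0, 0.0)
    # Otherwise pass the command through, capped to the allowed speeds.
    return Command(clamp(cmd.linear, MAX_LINEAR), clamp(cmd.angular, MAX_ANGULAR))
```

The idea is that the gate only ever reduces what the model requested, so a bad model output can slow the robot down or stop it but can't make it go faster.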

2 comments

This is bananas to me. There's been successful entries to snow plow competitions for ages. What a world that people now expect networks to handhold through it. Irresistible to all parties I suppose.

Well I guess I'll have to have a look!

Yeah, there are commercially available snow plow robots; you can buy a Yarbo for your house today. As far as I can tell, they all operate on a classical robotics stack: for the Yarbo you install an RTK antenna to give the robot cm-level precision, define a map and a routine, then the Yarbo can execute that routine by itself.

But can it deal with arbitrary lots without extensive premapping, manage piles, handle obstacles intelligently, correct itself (i.e. a spot needs a second clearing), tackle windrows, etc.? It can't, and my hunch is that LLMs are the first tech we have that can plausibly handle all the various cases that a proper robot would need to handle.

My hunch is that some kind of planning stack with environmental awareness at a network level is a good solution to this. My hunch is that LLMs aren't really it. Maybe VLA but I'd bet lower.

Robotics will probably absorb a lot of RL/diffusion-based tech, with an LLM as a high-level interface at best.

Yeah, afaik the approach people take today is always some form of bi- or tri-level hierarchical control, with a slow LLM doing planning and subtask management and diffusion or VLA doing the motor control at higher frequencies. The major differences seem to be where and how you draw the boundaries. For my project I'm personally trying to use ROS2 as a low-level tool call (instead of diffusion), with an agent/LLM making the main decisions.
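A toy version of the two-level split, assuming a 1-D robot and made-up numbers: the slow level (standing in for the LLM) only runs once every `plan_every` ticks and picks the next waypoint, while the fast level (standing in for the low-level controller) runs every tick and takes a proportional step toward the current target. `run_hierarchy` and all its parameters are hypothetical.

```python
def run_hierarchy(waypoints, ticks=200, plan_every=20, gain=0.2, tol=0.05):
    """Two-level loop: a slow planner picks targets, a fast controller tracks them."""
    position = 0.0
    target = 0.0
    remaining = list(waypoints)
    planner_calls = 0
    for tick in range(ticks):
        # Slow level: invoked once every `plan_every` ticks.
        if tick % plan_every == 0:
            planner_calls += 1
            # Hand out the next waypoint only once the current target is reached.
            if remaining and abs(target - position) < tol:
                target = remaining.pop(0)
        # Fast level: runs every tick, proportional step toward the target.
        position += gain * (target - position)
    return position, planner_calls


final_position, calls = run_hierarchy([1.0, 2.0])
```

With these defaults the fast loop runs 200 times while the planner is only consulted 10 times, which is the whole point of the split: the expensive model sets goals, the cheap loop does the tracking.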

Having said that, this scheme seems like it might just be a reaction to current hardware limitations. When I saw Taalas demonstrate an 8B model running on a custom chip at 17k tok/sec, the first thing I thought was "wow, you can just run an LLM in a control loop".
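Back-of-envelope for the control loop idea: if the model emits a fixed number of tokens per action, the loop rate is just the decode rate divided by tokens per action. The tokens-per-action figures below are assumptions for illustration, and this ignores prompt processing, sensor latency and everything else in the loop.

```python
def control_rate_hz(tokens_per_second, tokens_per_action):
    # Upper bound on decisions per second when decode speed is the only cost.
    return tokens_per_second / tokens_per_action


short_actions = control_rate_hz(17_000, 100)    # 170.0 Hz at 100 tokens per action
long_actions = control_rate_hz(17_000, 1_000)   # 17.0 Hz at 1000 tokens per action
```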

> This is bananas to me. There's been successful entries to snow plow competitions for ages.

Why do you hate subscriptions? What if you get a summertime snow storm?

I wonder if there is hope for clearing ice off asphalt and concrete. It's a real problem in Scando, where temps can hover around freezing for a long time, with repeated thaw/freeze cycles.