

by cowteriyaki 251 days ago

Coming with a Lidar out of the box seems nice.

Does the MARS hardware really remove the hidden extras (a computer with a GPU) mentioned as the downside of the HF SO-101 or LeKiwi? While a Jetson is good for inference, I feel like you would need access to a powerful GPU to train VLAs regardless. For LeRobot-based hardware, training ACT was relatively lightweight if you used low resolution for the camera feeds, but with increased resolution or more than one camera I already saw it needing more than 8GB of VRAM. If a VLA is on the table, fine-tuning something like the open-sourced version of pi0 should already require access to more than one 4090 or better, I think.

Also, do you have plans for community-level datasets? I think LeRobot sort of does this with their data recording pipeline and HF integration.


apeytavin 251 days ago

One of our objectives here was to fix everything that we don't like about the SO-101 and the Kiwi, which have several hardware and software flaws in our view. Including, yes, the constant need for a computer to simply run your robot.

Training does require external GPUs (but we provide that infra for free, straight from the app!). The onboard Jetson can run the trained models, though, as you can see in the examples. Everything you see in the vids is running onboard when it comes to manipulation, because we use a special version of ACT that we made specifically for this robot and that also includes a reward model (like DYNA does).

We also designed this system to run the other components smoothly, so it does SLAM as well and still has room for more processing even while running our ACT.

Now, indeed, this cannot run Pi-0, but in our experience, and that of the community in general, VLAs are not particularly better than ACT in the low-data regime, and they need a lot more compute.

As for community-level datasets, yes, this is the plan. Anything you train can already be shared with others: just share the files. We haven't developed a centralized place for sharing datasets and behaviors yet, but it is planned.


greggsy 251 days ago

If these are intended to be single-dwelling or single-workplace, is there a need to have any onboard processing greater than a few watts?

You could simply host the raw grunt in a base station somewhere else in the premises, keeping the device lighter and lower power.


apeytavin 251 days ago

That would not make it a complete product and would always require a complex setup whenever and wherever you want to use it.

This one is really, really convenient and intuitive. Turn it on anywhere, even outside, it just works. Even when I want to dev on it, it's super convenient.

On some level I truly believe robotics has to become more "complete". We can't always just piece things together; it makes it very hard to have a beautiful product.

I realize this is more of a philosophical answer, but I also think it is the right one to take this field to the next level.


greggsy 251 days ago

Aren’t you literally selling this with a cloud-based subscription service?

How does that fit into your ‘complete’ ethos?


apeytavin 250 days ago

If we could sell it without one we would for sure, but this is a current technological limitation. And we still make it extra easy to connect to it from anywhere, from your phone. Several components of the robot do not need this cloud service, and because the OS is accessible to you, you could even replace it with your own way of doing things.

For this one, it's just the only feasible way we found to bring the kind of experience we created to folks.


dimatura 250 days ago

Hi, I am currently considering a LeKiwi build but I am intrigued by MARS. Outside of the need for external compute, what issues did you find with the SO-101 and Kiwi?

Also I am curious about a couple of the parts, if you don't mind sharing: are those wheels the direct drive wheels from Waveshare? And what is the RGBD camera? (Fwiw, even if it's hefty, the MARS price tag seems fair to me.)


apeytavin 250 days ago

There are several things, but for example, there is no LiDAR on it, nor even a good place to put one. If you're going to navigate around without a LiDAR or good compute for VSLAM (which is very hard to set up and VERY demanding in compute), you will very quickly get lost. At this point the Kiwi is only for very local navigation (and you will still have IMU drift).

The base can also tip if the arm is fully extended. And the SO-101 has quite poor repeatability.

The base is also slow to move, and depending on which surface you are on, the omniwheels can pick up dirt quickly.

Finally, external compute means you have to teleoperate from your computer, so you end up far from the robot and not necessarily in the same orientation as it, which is very, very uncomfortable. This app system we made is one of the things people love the most about MARS.

Ah, and RGBD really does matter for navigation AND for learning (augmenting ACT with depth yields better results).

The wheels are indeed these ones, and the camera in the video is a Luxonis OAK-D Wide, pretty expensive but comfortable to work with. However, the version we're shipping includes a much cheaper stereo-depth camera that we calibrate ourselves. I can't get you the reference right now because it's late at night, but feel free to reach out on Discord.


dimatura 250 days ago

Ah, so that's why the camera seemed familiar, I have a couple of the Luxonis cameras around the office :). Re: Kiwi, those are good points. Thank you for the answer!
