by dasbo 2538 days ago
It's best not to think in GB terms when talking about AD datasets. E.g., when you record raw data of a multisensor setup (lidars, radars, cameras), the data rate can reach 10+ TB/h. Camera-only datasets are in comparison much smaller.
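To put the 10 TB/h figure in perspective, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 TB = 1000 GB, 1 PB = 1000 TB) and a constant recording rate. The function names are made up for illustration, and the 1000-hour input in the usage note is a hypothetical fleet, not a claim about any particular dataset.

```python
def sustained_rate_gb_per_s(tb_per_hour: float) -> float:
    """Convert a recording rate in TB/h to a sustained write rate in GB/s."""
    return tb_per_hour * 1000.0 / 3600.0


def total_volume_pb(tb_per_hour: float, hours: float) -> float:
    """Total raw volume in PB for a given rate and number of recording hours."""
    return tb_per_hour * hours / 1000.0


if __name__ == "__main__":
    print(f"{sustained_rate_gb_per_s(10):.2f} GB/s sustained")
    print(f"{total_volume_pb(10, 1000):.0f} PB for 1000 hours")
```

At that rate, storage write bandwidth becomes a constraint long before the total volume does, which is why GB is the wrong unit to reason in.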

Taken out of the argoverse dataset description:
- One dataset with 3D tracking annotations for 113 scenes
- One dataset with 327,793 interesting vehicle trajectories extracted from over 1000 driving hours
- Two high-definition (HD) maps with lane centerlines, traffic direction, ground height, and more

1000 driving hours is ok'ish for research (imho).


Right, the dataset is not remotely useful for commercial use. For one thing, the license prohibits such use.

Yes, it's certainly not yet clear if the dataset is large enough to capture useful variance. But unlike kitti or cityscapes, it's large enough to present a computational challenge to most of the machines & budgets in research use today, so there's a pretty good chance it'll help push the state of the art... Perhaps for more than one art. The API code itself has a lot of low-hanging fruit: https://github.com/argoai/argoverse-api

nuscenes is good too https://www.nuscenes.org/

Waymo will have one as well some day https://waymo.com/open/

One attractive aspect of these datasets is that they help open up the question of safety for public discussion. For example, now anybody can throw off-the-shelf object detection at these datasets and see what a realistic F1-score looks like for objects at 30m, 50m, 100m, etc...
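A minimal sketch of what a per-range F1 breakdown could look like. It assumes the detections have already been matched against ground truth by some other step, so each record is just a distance plus an outcome ("tp", "fp" or "fn"). The function names, the record format and the bin edges are illustrative assumptions and not part of any dataset API.

```python
from bisect import bisect_right


def range_label(idx, edges):
    """Human-readable label for distance bin number idx."""
    if idx == 0:
        return f"<{edges[0]:g}m"
    if idx == len(edges):
        return f">={edges[-1]:g}m"
    return f"{edges[idx - 1]:g}-{edges[idx]:g}m"


def f1_by_range(outcomes, edges=(30.0, 50.0, 100.0)):
    """outcomes: iterable of (distance_m, kind), kind in {"tp", "fp", "fn"}.

    Returns {bin label: F1} for every bin that contains at least one record.
    """
    counts = [{"tp": 0, "fp": 0, "fn": 0} for _ in range(len(edges) + 1)]
    for distance_m, kind in outcomes:
        # bisect_right puts a distance equal to an edge into the farther bin.
        counts[bisect_right(edges, distance_m)][kind] += 1
    scores = {}
    for idx, c in enumerate(counts):
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        if denom:
            # F1 = 2TP / (2TP + FP + FN)
            scores[range_label(idx, edges)] = 2 * c["tp"] / denom
    return scores


if __name__ == "__main__":
    demo = [(10, "tp"), (12, "tp"), (20, "fn"), (40, "tp"),
            (45, "fp"), (45, "fn"), (80, "fn")]
    print(f1_by_range(demo))
```

Reporting one score per range, rather than a single number for the whole dataset, is what makes the far-range behavior visible.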

In argoverse and nuscenes, you have track labels, so you can furthermore factor the velocity difference into how you weigh the error. Have you ever been hit by an Uber? Even 5mph can cause a lot of damage.
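One possible way to fold velocity into the metric, as a sketch only: weight each missed object by the squared speed difference to the ego vehicle, on the assumption that harm scales roughly with kinetic energy. The weighting, the 1D treatment of speed and the function name are all assumptions made here, not something either dataset prescribes.

```python
def speed_weighted_miss_rate(objects, ego_speed_mps):
    """objects: iterable of (object_speed_mps, detected), speeds along one axis.

    Returns the fraction of total weight that belongs to missed objects,
    where each object's weight is (object speed - ego speed) squared.
    """
    total = 0.0
    missed = 0.0
    for speed_mps, detected in objects:
        weight = (speed_mps - ego_speed_mps) ** 2
        total += weight
        if not detected:
            missed += weight
    return missed / total if total else 0.0


if __name__ == "__main__":
    # A missed stationary object vs. a detected car moving with the ego vehicle.
    print(speed_weighted_miss_rate([(0.0, False), (10.0, True)], ego_speed_mps=10.0))
```

In the usage example the plain miss rate would be 50%, but all of the weight sits on the stationary object, so the weighted score treats that miss as the whole problem.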

> 1000 driving hours is ok'ish for research

Wouldn't it be better to use a simulated environment first? Maybe something using pybullet[1] and a script which maps the real world to STL or OBJ files[2].

1. https://pybullet.org/wordpress/
2. https://github.com/mkagenius/osm2maya

Depends on your goals.

Today's virtual worlds are not accurate enough to develop perception algorithms in simulation. To develop sensor fusion, you also need to simulate the output of all sensors, including their specific characteristics. Apart from model quality, there is another challenge: simulation runtime, which is substantial (!) and, to the best of my knowledge, not even close to real time.

If you want to develop driving algorithms that sit on top of the perception stack, then this becomes simpler. You can work on the object level (objects being simulated cars, pedestrians, ...) and statistically model perception errors. This is a lot faster, which matters if, e.g., you want to run large-scale reinforcement learning to develop your driving strategy.
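A toy sketch of what "statistically model perception errors" on the object level might look like: take ground-truth object positions from the simulator, drop some of them with a probability that grows with range, and add Gaussian noise to the rest. The noise model, the parameter values and the function name are illustrative assumptions. A real error model would be fitted to the actual sensor stack.

```python
import math
import random


def perceive(objects, rng, sigma_m=0.3, miss_prob_per_100m=0.2):
    """objects: list of (x_m, y_m) ground-truth positions relative to the ego car.

    Returns the noisy subset of objects that the simulated perception "sees".
    """
    seen = []
    for x, y in objects:
        distance_m = math.hypot(x, y)
        # Assumed: miss probability grows linearly with range, capped at 1.
        p_miss = min(1.0, miss_prob_per_100m * distance_m / 100.0)
        if rng.random() < p_miss:
            continue  # object not detected in this frame
        seen.append((x + rng.gauss(0.0, sigma_m), y + rng.gauss(0.0, sigma_m)))
    return seen


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so runs are repeatable
    print(perceive([(10.0, 0.0), (50.0, 5.0), (120.0, -3.0)], rng))
```

Because this is a few arithmetic operations per object instead of rendering camera or lidar frames, it can run many times faster than real time, which is the point for reinforcement learning.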

In any case, I would say that we will see a lot more simulation in the future (don't forget, all major players build heavily on simulation - just look at the numbers on how many miles Waymo simulates every day), potentially going all the way down to the sensing level, because it allows you to develop and especially debug along the whole sense-plan-act stack.

There are also hybrid approaches today. You take real recordings and abstract them into a simulatable format that you can then use, e.g., to vary parameters and derive artificial scenarios for simulation. This can be used to analyse the influence of different situational parameters on the behavior of a function, to pinpoint which parameter(s) caused troublesome behavior observed in real drives.
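As a rough illustration of the parameter-variation idea: treat an abstracted recording as a dictionary of scenario parameters, vary one parameter at a time, and note which values make the function under test fail. Everything here is hypothetical: the parameter names, the baseline values and the toy constant-deceleration braking check stand in for a real scenario format and a real driving function.

```python
def one_at_a_time(baseline, variations, passes):
    """Vary one parameter at a time around a baseline scenario.

    Returns {parameter name: [values for which passes(scenario) is False]}.
    """
    failures = {}
    for name, values in variations.items():
        bad = [v for v in values if not passes({**baseline, name: v})]
        if bad:
            failures[name] = bad
    return failures


def stops_in_time(scenario):
    """Toy function under test: stopping distance v^2 / (2a) must be below the gap."""
    stopping_m = scenario["ego_speed_mps"] ** 2 / (2.0 * scenario["brake_mps2"])
    return stopping_m < scenario["gap_m"]


if __name__ == "__main__":
    baseline = {"ego_speed_mps": 15.0, "brake_mps2": 6.0, "gap_m": 30.0}
    variations = {
        "ego_speed_mps": [15.0, 20.0, 25.0],
        "gap_m": [30.0, 20.0, 10.0],
        "brake_mps2": [6.0, 4.0],
    }
    print(one_at_a_time(baseline, variations, stops_in_time))
```

A one-at-a-time sweep is the simplest option and misses interactions between parameters. A real setup would more likely sample combinations.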

How would something like Waymo's daily simulation training work? Is it just feeding it a larger set of random obstacles every day?

Wouldn't it quickly have negligible returns once it optimizes for the current simulation capabilities? Or are they constantly tweaking both the car model and the simulation data set?