This doesn't sound like much of a barrier to me. If you're a human training the LiDAR system, couldn't you just consult the image or video to help label whatever the LiDAR is seeing?
https://dataloop.ai/platform/lidar/
A good supervised learning process requires teaching humans to label consistently.
Imagine trying to write down precise instructions to train hundreds or thousands of humans to label many different types of objects using a tool like the above. Now hire, train, and manage those humans.
Compare that to having the humans draw rectangles around cars in 2D color pictures.
Also note that such tools need to be built and improved.