Hacker News
by visioninmyblood 211 days ago
The problem is the data. LLM data is self-supervised. Vision data is very sparsely annotated in the real world. Going a step further, robotics data is much sparser still. So getting these models to improve on this long-tail distribution will take time.