Hacker News
by zone411 858 days ago
Video will be especially important for language models to grasp physical actions that are instinctive and obvious to humans but not explicitly detailed in text or video captions. I mentioned this in 2022:

https://twitter.com/LechMazur/status/1607929403421462528

https://twitter.com/LechMazur/status/1619032477951213568