Hacker News
by kadushka 251 days ago
Mainly because the global video data corpus is more than 100k times larger than the global text corpus, so you would need to train much larger models for much longer than current LLMs.