by nee1r
111 days ago
Hey guys! I'm Neel. I've been holed up in our South Park office for the past year working on model training, and I'm excited to share our research! This is a preview of a very different type of computer-use model: we train on the internet. Specifically, we have 11 million hours of computer video stored on our storage cluster (previously shared: https://news.ycombinator.com/item?id=45438496) and the model can run at 30 FPS. Since we match the fundamental form factor of computer use, we can get our model to do CAD, browse websites, and even drive a car using arrow keys. I'm super excited to see what our model can do as we scale further; it's a fun frontier to work on (not language models :) ). The team and I will be online responding to comments, so drop any questions.
Any benchmark comparisons to Fara-7B, Sonnet 4.6, Qwen 3.5, etc.?