I feel that getting anywhere in the neighborhood of “kind of working” for a project like this is noteworthy and a huge milestone. A better headline, however, might be: “Agents almost create a working browser.”
Yes, if Cursor had claimed "We let autonomous agents run for weeks, and they produced millions of lines of code, and it kind of looks like a browser, and it kind of runs", then I wouldn't have written and published TFA.
But their claim wasn't so nuanced; it was "hundreds of agents can work on a single codebase autonomously for weeks and build an entire browser from scratch that works (kinda)". Considering the hand-holding that seems to have been required to get it to compile, this claim doesn't seem to hold up to scrutiny.
At this point, it's 1.5M LOC without the vendored crates (so basically excluding the JS engine etc.). Compare that to Servo and Ladybird, which are 300k LOC each and actually happen to work: agents do love slinging slop.