Browser Agent Benchmark: Comparing LLM models for web automation

	Browser Agent Benchmark: Comparing LLM models for web automation (browser-use.com)
	13 points by MagMueller 141 days ago

2 comments

Since we're on this topic, can anyone suggest a good AI-based tool for exploratory (fuzzy?) web testing?

It's lacking the best model (Opus 4.5) on the benchmark tho.

Yeah but then their own product might not score the highest.

Exactly why I'm pointing it out, which feels a bit corrupt, but understandable.

tbh i was a bit cranky yesterday - even if they are #2 on a legit benchmark that would be impressive