| HN Mirror



	by jph00 456 days ago
	The linked page only compares to very old and very small models. But the pricing is higher even than the latest Gemini Flash 2.5 model, which performs far better than anything they compare to.

3 comments

freeqaz 455 days ago

Their pockets are probably not as deep as Google's in terms of willingness to burn cash for market share.

If speed is your most important metric, I could still see there being a niche for this.

From a pure VC perspective though, I wonder if they'd be better off open-sourcing their model to get faster innovation + centralization like Llama has done. (Or like Mistral, keeping some models private and some public.)

Use it as marketing, get your name out there, and have people use your API when they realize they don't want to deal with scaling AI compute themselves lol.


vineyardmike 455 days ago

> The linked page only compares to very old and very small models.

They're comparing against the fastest models. That's why smaller models are shown.


jbellis 455 days ago

Sort of. The benchmarks showing Flash 2.5 doing really well are benchmarking its thinking mode, which is 4x more expensive than Mercury here.


NitpickLawyer 455 days ago

Is cost really the main differentiator here, tho? "Solving" coding seems like the holy grail atm (and I agree, it can enable a bunch of things once that's done) and "traditional, organic, human fed code" is pretty expensive atm, so does cost really matter now?

Put another way, how much would company X be willing to spend on "here's a repo, here are the tests, here is the speed now, make this faster while still passing all the tests"? If it "solves" something in cuDNN that makes it 10% faster, how much would NVIDIA pay for this? $1M? $10M?


jph00 455 days ago

Flash 2.5 without thinking mode is also exceptionally good fwiw.
