| HN Mirror


	by scbenet 605 days ago
	Technical report is available here https://www.amazon.science/publications/the-amazon-nova-fami...

kajecounterhack 605 days ago

TL;DR comparison of models vs frontier models on public benchmarks here https://imgur.com/a/CKMIhmm

SparkyMcUnicorn 604 days ago

This doesn't include all the benchmarks.

The one that really stands out is GroundUI-1K, where it beats the competition by 46%.

Nova Pro looks like it could be a SOTA-comparable model at a lower price point.

maeil 604 days ago

Just means it's better at one specific task than the others, which has always been the case. For each of Sonnet, GPT and Gemini I can readily name a task they are individually the best at. At the same time, the consensus that Sonnet 3.5 is currently the strongest model overall remains correct, and that's what most people care about. Additionally, most people do tasks that all of the models perform similarly at, or they can't be bothered to optimize every task by using the best model for that one task, which makes sense since not a single cloud provider has all three of them. Now this one will likely be AWS-exclusive too.

int_19h 603 days ago

Benchmarks are way too easy to game. There's no shortage of models that "beat GPT-4" according to some benchmark or another, that are obviously nowhere even close when you try them on novel tasks.

attentive 603 days ago

On https://aider.chat/docs/leaderboards/ Nova Pro is on par with Yi Coder 9B Chat, which is not very inspiring.

retinaros 604 days ago

In the Berkeley function-calling benchmark it is similar to GPT-4o for multi-turn, while being way faster.

oblio 604 days ago

SOTA?

camel_Snake 604 days ago

"State of the Art", if that's what you were asking.