Hacker News
by
gyudin
393 days ago
Super weird benchmarks
1 comment
avereveard
393 days ago
From what I gather, it's fine-tuned to be used with OpenHands specifically, so it shows value on those benchmarks that target a whole system as a black box (i.e. agent + LLM) more than on ones that directly target the LLM's inputs/outputs.
amarcheschi
392 days ago
Yup, the first comment there says this:
https://www.reddit.com/r/LocalLLaMA/comments/1kryybf/mistral...