HN Mirror


by crakhamster01 48 days ago

I realize this post is about the pelican test, but with regard to coding, has anyone tried the advisor strategy with V4? [0]

E.g., have V4 call out to Opus only when it's uncertain, and otherwise handle execution itself.

The results with Sonnet/Haiku in the blog post seemed promising, so I'm curious how it would go with these latest open models.
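Roughly, the control flow I have in mind looks like the sketch below. This is not the blog post's implementation: the self-reported confidence score, the 0.7 threshold, and the `stub_executor`/`stub_advisor` functions are all assumptions made up for illustration, standing in for real calls to V4 and Opus.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Draft:
    text: str
    confidence: float  # hypothetical self-reported score in [0, 1]


# Both model types are plain callables so any client can be plugged in.
Executor = Callable[[str], Draft]  # cheap model that does the work
Advisor = Callable[[str], str]     # stronger model that only gives guidance


def run_with_advisor(
    task: str,
    executor: Executor,
    advisor: Advisor,
    threshold: float = 0.7,  # assumed cutoff, not taken from the blog post
) -> Tuple[str, bool]:
    """Return (result, advisor_was_consulted)."""
    draft = executor(task)
    if draft.confidence >= threshold:
        # Executor is confident enough: no call to the expensive model.
        return draft.text, False
    # Executor is uncertain: ask the advisor, then let the executor retry.
    advice = advisor(task)
    revised = executor(f"{task}\n\nAdvisor guidance:\n{advice}")
    return revised.text, True


# Stub models so the sketch runs without any API access.
def stub_executor(prompt: str) -> Draft:
    if "Advisor guidance" in prompt:
        return Draft("patched with guidance", 0.9)
    if "hard" in prompt:
        return Draft("unsure patch", 0.3)
    return Draft("simple patch", 0.95)


def stub_advisor(prompt: str) -> str:
    return "check the edge case"


easy = run_with_advisor("easy fix", stub_executor, stub_advisor)
hard = run_with_advisor("hard refactor", stub_executor, stub_advisor)
```

The advisor never executes anything here; it only returns guidance that gets appended to the executor's prompt, so the expensive model is called only on the uncertain cases.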

1 comment

That first graph (SWE-bench Multilingual) is a crime