HN Mirror


by ignoramous 883 days ago

Nice.

> Some of the popular LLMs that we recommend are: Mistral, CodeLLama

1. Surprised Mistral (Mixtral?) is recommended for code generation / explanation alongside a fine-tuned CodeLlama?

2. Recent human evals put Microsoft's WaveCoder-Ultra-6.7B (SoTA w/ GPT4) at the top of the rankings with WizardCoder-33B, Magicoder-S-DS-6.7B trailing: https://twitter.com/TeamCodeLLM_AI/status/174755128687745064...

2 comments

ParetoOptimal 882 days ago

In my experience, when a 6.7B model or similar is said to beat a 33B model, it's typically not really true. At the least, I have a very high burden of proof before believing it.


austhrow743 882 days ago

Are you able to explain what the charts mean? Only one of the three has WaveCoder at the top.


ignoramous 882 days ago

Those charts show the pass@k metric (the probability that at least one of k samples, drawn from the n generated per problem, passes the tests) on the OpenAI and Octopack problem evals for code.
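For anyone who wants to see how pass@k is computed, here is a minimal sketch of the unbiased estimator described in OpenAI's Codex paper: with n samples generated for a problem and c of them passing the tests, pass@k = 1 - C(n-c, k) / C(n, k). The function name `pass_at_k` and the example numbers are mine, and I'm assuming (not confirming) that the linked charts use this same estimator.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: total samples generated for the problem
    c: number of those samples that pass the tests
    k: sample budget being scored (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # Probability that a random size-k subset contains no passing sample;
    # comb() returns 0 when k > n - c, so the result is 1.0 in that case.
    all_fail = comb(n - c, k) / comb(n, k)
    return 1.0 - all_fail


# Hypothetical numbers: 10 samples generated, 3 of them pass.
print(pass_at_k(10, 3, 1))
print(pass_at_k(10, 3, 5))
```

A benchmark score is then the mean of this value over all problems, which is why a model can lead on pass@1 in one chart and trail on another eval.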

WaveCoder: https://arxiv.org/abs/2312.14187 (section 3.2)

Octopack: https://github.com/bigcode-project/octopack


srikanth235 882 days ago

While testing internally, Mistral worked well. But these models are just starting points. We will add support for WaveCoder-Ultra-6.7B, WizardCoder-33B, Magicoder-S-DS-6.7B, etc. soon.
