Ask HN: How are you getting reliable code-gen performance out of LLMs?
2 points by _false 617 days ago
I'm particularly interested in hearing from people using LLM APIs, where the generated code is consumed programmatically. I've been using LLMs a lot lately to generate code, and the quality is a mixed bag. Sometimes it runs straight out of the box or with a few manual tweaks; other times it just straight up won't compile. Keen to hear what workarounds others have used to solve this (e.g. re-prompting, constraining generations, etc.).
o1-preview seems to be a step up from Claude 3.5 Sonnet.
Many open-source coding LLMs are a joke on complex tasks compared to the SOTA closed ones.
I think there are two strategies that can work: 1) constrain the domain to a particular framework and provide good documentation and examples for it in the prompts, and 2) create an error-correcting feedback loop where compiler and static-analysis errors, runtime errors, and failed tests are fed back to the model automatically.