by whbrown
669 days ago
There are even some benchmarks that have caught 'lazy coding' regressions in their latest models[1].

I recall my best experience with their models was last year, with an early version of the advanced data analysis feature: it would write a script, write tests for it, run the tests, update the code and/or tests, and re-run them. Presumably that was too expensive, and now it feels like pulling teeth to get the same result.

[1] https://aider.chat/2024/04/09/gpt-4-turbo.html