|
|
|
|
|
by vadansky
358 days ago
|
|
I had a particularly hard parsing problem so I setup a bunch of tests and let the LLM churn for a while and did something else. When I came back all the tests were passing! But as I ran it live a lot of cases were still failing. Turns out the LLM hardcoded the test values as “if (‘test value’) return ‘correct value’;”! |
|
https://github.com/auchenberg/volkswagen