by iamflimflam1 329 days ago
I would expect most developers to fail at this challenge. Here’s the doc - you’ve got one chance to get the API to do this.

I can’t tell from the description if the LLMs are allowed to try and then correct based on any errors received.

Though it would be surprising if that helped. Most APIs don’t tell you what you’ve done wrong…

1 comment

We had assumed the LLMs would be much better at writing working code, since these aren't random APIs but established API patterns (e.g. Stripe) that they should be able to one-shot. Bad error messages are indeed a problem. We will release another one with retries very soon.
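
For anyone wondering what "one chance" versus "try and then correct" could look like in practice, here's a rough sketch. It is not the benchmark's actual harness: `attempt_with_retries`, `generate`, and `execute` are hypothetical names, and the model and API are just passed in as plain callables. With `max_attempts=1` you get the one-shot setup; anything higher feeds the previous error message back into the next attempt.

```python
from typing import Callable, Optional, Tuple


def attempt_with_retries(
    generate: Callable[[str, Optional[str]], str],
    execute: Callable[[str], Tuple[bool, str]],
    task: str,
    max_attempts: int = 3,
) -> Tuple[bool, int]:
    """Run a generate/execute loop; return (succeeded, attempts_used).

    generate(task, feedback) is a hypothetical stand-in for the model call:
    it gets the task plus the last error message (None on the first try).
    execute(candidate) is a hypothetical stand-in for calling the API:
    it returns (ok, error_message).
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(task, feedback)
        ok, error = execute(candidate)
        if ok:
            return True, attempt
        # Only the error text is carried forward, so a vague error
        # gives the next attempt very little to work with.
        feedback = error
    return False, max_attempts
```

Note that retries only help as much as the error text does: if `execute` returns nothing more specific than "400 Bad Request", the second attempt is close to another blind guess.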