by cornholio 1189 days ago
> which is ‘good enough’ compared to a 25-150$/hr human (it does for the lower part of that scale),

I'm not exactly sure how you define that scale; perhaps Minecraft bots or the like, where the damage after a complete failure is contained to perhaps a few dollars or a few hours of human annoyance. I'm sure there are many niches where a 50% success rate of mass-generated programs can earn you big bucks.

But in my experience Codex does very limited reasoning about code paths. With the current state of the art, you are almost guaranteed to have catastrophic bugs in any non-trivial program engineered by prompt.

2 comments

When a bad senior or a junior/medior delivers their PR, I (or some other skilled senior) review it, explain the issues it has, and we go into a fix loop. Often I just approve it and fix it myself, as the loop takes too long. That is good enough, as that's what goes on in all companies. GPT is the same, only many times faster: the loop is instant, and sometimes I give up and fix it myself because it is not going to get it, just like some (depressingly many) humans.
Have you worked with ChatGPT to generate the edge cases/test data, then fed that back in to get a function or whatever out?

I tried it with phone number formats and it came up with more than I could.
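
For what it's worth, here is a rough sketch of what that workflow could look like for the phone number case. Everything in it is an assumption for illustration: the `CASES` list stands in for whatever edge cases the model generates (it is not the list I actually got back), and `normalize_us_phone` and `failing_cases` are hypothetical names for the function you would ask it to write and the check you would run before trusting it.

```python
import re

# Hypothetical edge cases, standing in for a model-generated list.
# Each pair is (raw input, expected 10-digit string, or None if invalid).
CASES = [
    ("555-867-5309", "5558675309"),
    ("(555) 867-5309", "5558675309"),
    ("555.867.5309", "5558675309"),
    ("+1 555 867 5309", "5558675309"),
    ("1-555-867-5309", "5558675309"),
    ("867-5309", None),       # too short: no area code
    ("555-867-53090", None),  # too long: 11 digits without a leading 1
    ("", None),               # empty input
]


def normalize_us_phone(raw):
    """Candidate function (assumed): reduce a US number to 10 digits, or None."""
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the country code
    return digits if len(digits) == 10 else None


def failing_cases(func, cases):
    """Return (input, expected, actual) for every case the function gets wrong."""
    return [
        (raw, expected, func(raw))
        for raw, expected in cases
        if func(raw) != expected
    ]


if __name__ == "__main__":
    for raw, expected, actual in failing_cases(normalize_us_phone, CASES):
        print(f"{raw!r}: expected {expected!r}, got {actual!r}")
```

Anything `failing_cases` reports is what you paste back into the next prompt, which is the same fix loop as a code review, just faster.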