Hacker News
by tedsanders 1251 days ago
Seems pretty clear that this question was in its training set and it's regurgitating the answer for part (b). Seems far too coincidental to accidentally get the correct answer to the wrong question.

For me, it solved part (a) perfectly when I told it: "To solve this, write a Python 3 function that takes a string like `"R4, R3, R5, L3, ..."` and outputs the number of blocks to Easter Bunny HQ." The original question on its own was a bit ambiguous, in my opinion, because it doesn't explicitly contain the input, which the user reads on a second page.
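For reference, here is a minimal sketch of the kind of function that prompt asks for. It is not ChatGPT's actual output. It assumes the puzzle rules are: start at the origin facing north, each instruction turns 90 degrees left (`L`) or right (`R`) and then walks that many blocks, and the answer is the taxicab distance from the start. The name `blocks_to_hq` is made up for illustration.

```python
def blocks_to_hq(instructions: str) -> int:
    """Return the taxicab distance from the start after following the instructions.

    Assumed rules (not spelled out in the comment above): start at (0, 0)
    facing north; "R4" means turn right 90 degrees, then walk 4 blocks.
    """
    x, y = 0, 0
    dx, dy = 0, 1  # unit vector for the current heading; (0, 1) is north

    for step in instructions.split(","):
        step = step.strip()
        if not step:
            continue  # tolerate trailing commas or empty input

        turn, distance = step[0], int(step[1:])
        if turn == "R":
            dx, dy = dy, -dx  # rotate heading 90 degrees clockwise
        elif turn == "L":
            dx, dy = -dy, dx  # rotate heading 90 degrees counterclockwise
        else:
            raise ValueError(f"unknown turn {turn!r} in step {step!r}")

        x += dx * distance
        y += dy * distance

    return abs(x) + abs(y)


if __name__ == "__main__":
    print(blocks_to_hq("R2, L3"))  # 2 east then 3 north: prints 5
```

Passing the input as an explicit string argument, as in the prompt, sidesteps the ambiguity about where the puzzle input lives.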

In any case, neither is strong evidence for or against its ability to solve problems like these. First, it's N=1. Second, it's a problem from its training set.

For me, Copilot/ChatGPT adds value not by replacing my programming but by (a) writing simple code for me and (b) answering my questions about things I don't understand. I operate in a supervisory role where I have to double check everything it says. But, critically, it's faster for me to double check its work than to do everything myself.

1 comment

I mean, it's not N=1 though. It fails day 2 as well, and a bunch of other tasks I've tried to give it. It's weird how some of you are responding as if I've cherry-picked a single example. I've done a ton of stuff with ChatGPT; you can check my comments on prior experimentation with stuff like mathematics and basic problem solving too. Probably spent like 20 hours with it, total?

It genuinely fails 100% of the time at coding anything non-trivial for me, and about half the time for simple stuff. Glad you've been having success though. Maybe some people are just better at getting it to work, or it has certain domains it excels in, or your tasks are fairly simple.