Hacker News
by lgrapenthin 1295 days ago
Today I asked it some not-too-fundamental questions about Clojure, and it was able to give impressive, accurate answers with correct code examples. However, if you continue the dialogue and ask it to do more advanced stuff, it will just make things up out of thin air. For instance, it will use functions that don't exist and claim that they can be imported from packages that don't exist or don't contain them. Once you point out these mistakes, it will admit them and come up with different changes, which can be even worse but are sometimes better and save the whole thing. Overall I'm not sure how useful this will turn out to be, given that it's not reliable. It may be useful for getting some initial intuitions and information (non-specific stuff it usually gets right), but it can also mislead you badly. I asked it how it can make these mistakes, only to understand and admit them once I point them out. It has no answer beyond the usual "I'm a language model". It also told me that it is capable of logical inference, but denied that the next day. Then it told me that its answers would always be consistent, which is a lie. The whole thing is really weird, because it's somewhat very smart and capable and incredibly stupid and dishonest at the same time.
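As a rough illustration of how one might catch the "function that doesn't exist" kind of mistake mechanically, here is a minimal sketch. It is in Python rather than Clojure, and the helper name `name_exists` is made up for this example; it only checks that a module can be found and that it exposes a given name, nothing more.

```python
import importlib
import importlib.util


def name_exists(module_name: str, attr: str) -> bool:
    """Return True if `module_name` is importable and exposes `attr`.

    Hypothetical helper for illustration. Note that it imports the
    module, which runs the module's top-level code.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised for a missing parent package or a malformed spec.
        return False
    if spec is None:
        # No such top-level module on the import path.
        return False
    module = importlib.import_module(module_name)
    return hasattr(module, attr)
```

Calling `name_exists("math", "sqrt")` succeeds, while a made-up package or a made-up function name in a real package fails the check. It says nothing about whether the function does what the model claims.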
7 comments

That’s what I’m seeing too. I had a problem with some HashiCorp Packer scripts and posed it to ChatGPT. It did have an idea of the shape of the problem, but to solve it the bot just hallucinated syntax. It stated with great authority that this was the solution and provided a beautifully syntax-colored excerpt of something that wouldn’t even have compiled.

This was perhaps a very hard problem for an LLM, as the Packer tool’s nature is to manage layers of context: environment variables are passed through templates and then passed to scripts, which themselves might be in other frameworks. So in this case it seemed to be confused about what was Ansible syntax and what was Packer syntax.

So the bot seems to have different failure modes than humans. Distinguishing context layers seems to be a weak point. And an answer that is a wild guess looks as authoritative as a solid answer. But it’s still extremely impressive.

I think it's just not great at Clojure: the language is less popular, so there is less of it in the training data. Clojure also seems to be kind of hard to get right.

I started trying to learn Clojure with this year's Advent of Code, but got stuck and first tried to use ChatGPT to solve it. My impression matches yours in that it consistently produced non-working code, and even when told about the error it was unable to fix it.

Then I decided instead to let it use any tool or language it knows, and I'm now documenting how it does at solving the puzzles.

If you're interested, here is Day 1, where I first tried to use it to help me solve the puzzle in Clojure, but then gave up and asked for any concise solution. I got a working solution with `awk`: https://blog.nyman.re/2022/12/02/chatgpt-does-advent.html

On the second day I just let it pick anything, and it successfully solved the Day 2 puzzle using Python, which seems to be its go-to language. https://blog.nyman.re/2022/12/03/chatgpt-does-advent.html

This is how I felt speaking to some people in India. They could speak, but there was zero understanding, as evidenced by their actions. Personally, when learning a language I develop the ability to understand it years before I consider myself able to speak it, but it is clear that not everybody does that.
>The whole thing is really weird, because it's somewhat very smart and capable and incredibly stupid and dishonest at the same time.

Not so weird: this judgment could apply to a lot of humans and to whole fields of human activities, if not to the essence of life itself.

In particular it reminds me of a con man whom every expert could see through, but who mesmerized management with his buzzword talk, causing an exodus of competent people and high turnover for a few years, and most likely many millions in damage.

With advanced enough AIs handling full remote jobs, this could be done on steroids, getting you a lot of income while wreaking havoc in the companies.

Guess it actually figured out how humans work.
Rather just how superficial and stubborn imitation and arguing work.
Some men see things as they are and say why, I dream things that never were and say, why not?
It would be interesting to see whether meaningful refinement training could be done by hooking the model up to a language interpreter/compiler, so the model can learn for itself what is valid output.
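To make the idea a bit more concrete, here is a minimal sketch of the kind of signal a compiler could provide, assuming Python as the target language and using only the built-in `compile()`. The names `validity_reward` and `rank_candidates` are hypothetical, and this is not how any actual model is trained; a real setup would also need sandboxed execution and tests, since code that parses can still be wrong.

```python
def validity_reward(source: str) -> float:
    """Return 1.0 if `source` is syntactically valid Python, else 0.0.

    Hypothetical helper: it only compiles the text, it never runs it.
    """
    try:
        compile(source, "<model-output>", "exec")
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as source containing null bytes
        # on older Python versions.
        return 0.0
    return 1.0


def rank_candidates(candidates: list[str]) -> list[tuple[float, str]]:
    """Score each candidate output and sort valid ones first."""
    scored = [(validity_reward(c), c) for c in candidates]
    # sorted() is stable, so ties keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Given two candidates such as `"def f(:"` and `"print(1)"`, the second one ranks first because only it compiles. The score is deliberately crude: it rewards valid syntax, not correct behavior.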
How much could you charge for packages that the AI says should exist?