Hacker News
by AlexAltea 1203 days ago
That looks cool, though I'd be very concerned about the possibility of ChatGPT "hallucinating" a `rm -rf /` or equivalent.

Given that it is doing actual operations under the hood, it would be interesting to have it just output the "native" command for the operation rather than executing it. People could write wrappers around that output to execute it after prompting the user for confirmation or something like that so that users would have the choice to "opt into" the level of automatic execution they're comfortable with.
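A minimal sketch of what such a wrapper could look like, assuming the tool only emits the command as a string. The names `Mode` and `handle` are hypothetical and not part of any existing tool; the point is just that the level of automatic execution becomes an explicit choice.

```python
import shlex
import subprocess
from enum import Enum
from typing import Callable


class Mode(Enum):
    PRINT_ONLY = "print"  # never execute, only show the command
    CONFIRM = "confirm"   # execute only after an explicit "y"
    AUTO = "auto"         # execute without asking


def handle(
    command: str,
    mode: Mode,
    ask: Callable[[str], str] = input,
    run: Callable = subprocess.run,
    show: Callable[[str], None] = print,
) -> bool:
    """Show a generated command and run it only if the mode allows it.

    Returns True if the command was executed, False otherwise.
    """
    show(command)
    if mode is Mode.PRINT_ONLY:
        return False
    if mode is Mode.CONFIRM:
        answer = ask("Run this command? [y/N] ").strip().lower()
        if answer != "y":
            return False
    # shlex.split avoids passing the string through a shell.
    run(shlex.split(command), check=True)
    return True
```

Something like `handle("git status", Mode.CONFIRM)` would print the command, wait for a "y", and only then run it. Defaulting to anything other than `AUTO` keeps the human in the loop.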
This was my approach when I created a similar tool called git-genie[0]. It's an educational tool first: it explains the generated git command in detail.

[0] https://github.com/danthelion/git-genie

Just run all the git commands by another GPT bot that determines whether it would be harmful first. GPT bots all the way down.
Can someone make a bot to verify the GPT bots are up, and another to verify the first verifier is up?
It's the same GPT but with a different prompt.
When LLMs “hallucinate” they don’t do so in random ways, like somehow deciding to “rm -rf /”. They do so in predictable ways.
Sure, perhaps my example was too extreme. What about:

  $ gitgpt commit files with msg cleaning repo files and push

  git commit -m msg
  git clean -fdx
  git push
Without `git add`, the first command is a no-op, the second is a destructive action, and the third is a no-op. All three are vaguely related to the topic at hand, so "hallucinating" such a destructive action seems at least plausible.
Yup, that is something more realistic.
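For cases like the `git clean -fdx` one, a cheap static check before execution might catch the obvious offenders without needing a second model. This is only a sketch: `is_destructive` is a hypothetical helper, and the handful of subcommands it knows about is illustrative, not a complete list of dangerous git operations.

```python
import shlex


def is_destructive(command: str) -> bool:
    """Conservatively flag a few git invocations that can discard work.

    Assumption: only `git clean`, `git reset --hard` and forced pushes
    are checked here. Anything else is reported as not destructive,
    which does not mean it is safe.
    """
    args = shlex.split(command)
    if len(args) < 2 or args[0] != "git":
        return False
    subcommand, flags = args[1], args[2:]

    if subcommand == "clean":
        # Short flags can be bundled, so "-fdx" contains "-f".
        return any(
            flag == "--force"
            or (flag.startswith("-") and not flag.startswith("--") and "f" in flag)
            for flag in flags
        )
    if subcommand == "reset":
        return "--hard" in flags
    if subcommand == "push":
        return "--force" in flags or "-f" in flags
    return False


generated = ["git commit -m msg", "git clean -fdx", "git push"]
flagged = [cmd for cmd in generated if is_destructive(cmd)]
```

Here `flagged` ends up holding only the `git clean -fdx` line. The check errs on the side of flagging (it would also flag a dry run that happens to include `-f`), which seems like the right trade-off for something that gates execution.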

The GitHub Copilot CLI tool that is in beta, along with other tools, will show you the command and then make you manually choose to run, redo or abort.

In practice I have yet to accidentally perform any destructive operations. This makes intuitive sense because those seem like unlikely completions for a model that has been fine-tuned on “helpful cli recommendations”!

Extraordinary claims require extraordinary evidence. Given that GPT-3 has 175 billion parameters, how would you even begin to support the claim that it never (or sufficiently rarely) does things that are surprising to humans?
If you're looking for academic research, this is a great place to start:

https://arxiv.org/abs/2202.03629

But you can get a general feel by using ChatGPT. Open up a new conversation and ask it something like, "What is the capital of France?". Note the response. Open up another new conversation, ask the same question, and note the response again. Soon enough you should be able to see that the responses are far from random.

You can use the OpenAI APIs directly and have it run 10,000 or so iterations to see what kind of "hallucinations" it makes! They are not random!
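A rough sketch of that experiment, with the model call left abstract. `tally` takes any zero-argument function that returns one response; `stub_model` is a made-up stand-in with invented weights so the example runs offline, and it says nothing about how a real model behaves. You would swap it for a real API call.

```python
import random
from collections import Counter
from typing import Callable


def tally(sample: Callable[[], str], n: int) -> Counter:
    """Call `sample` n times and count how often each response appears."""
    return Counter(sample() for _ in range(n))


def stub_model(rng: random.Random) -> str:
    # Hypothetical stand-in for a model call; the answers and weights
    # are invented purely for illustration.
    return rng.choices(["Paris", "Paris.", "Lyon"], weights=[90, 9, 1])[0]


rng = random.Random(0)  # seeded so the run is repeatable
counts = tally(lambda: stub_model(rng), 10_000)
for response, count in counts.most_common():
    print(f"{count:6d}  {response}")
```

Looking at `counts.most_common()` for a real model is what would show whether the wrong answers cluster around a few plausible alternatives or are scattered.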

Ask it for details about a little-documented event, however, and it'll happily tell you plausible but utterly false things.

Apparently the "early 2011 Bougainville earthquake" was magnitude 6.3, at a depth of 21.7km, on the 20th January and caused "widespread damage to buildings and infrastructure in the region, and triggered landslides that blocked roads and hampered rescue efforts".

It was actually on the 7th Feb, a 6.4 and at a depth of 415km. There were "no immediate reports of damage or injuries".

None of this is remotely surprising, considering it's a turbocharged statistical model and it probably ingested a few words about it at most, out of billions and billions, but somewhere along the line from "famous" to "footnote" subjects, it will segue into complete fiction.

Sure, but that is a predictable kind of result, not “rm -rf /”.
Some flavour of "git gc" after your reset is far more likely to crop up and ruin your day, that's true.

As long as you stay on the statistical beaten path (i.e. you're asking about Paris), you will probably be fine, indeed. Probably. Stochastic bugs are always the most fun anyway.

Thanks for the link! I don't think that really addresses my concern, though.

My point is that these LLMs are basically incredibly large programs that defy analysis with our current tools. Sure, I can poke it a few times and see that it usually does what I want, but that's not the same as saying it never goes off the rails.

If it does something crazy like post my bank login online, even only once in a billion times, that's still orders of magnitude higher than I'm willing to accept.
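To put rough numbers on that: if you run n trials and see zero failures, the best you can claim at 95% confidence is a failure rate below roughly 3/n (the "rule of three"). The sketch below assumes the trials are independent and identically distributed, which is a strong assumption for an LLM across different prompts; the function names are mine.

```python
import math


def upper_bound_failure_rate(clean_trials: int, confidence: float = 0.95) -> float:
    """Upper confidence bound on the failure rate after zero failures.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    return -math.expm1(math.log(1.0 - confidence) / clean_trials)


def trials_needed(max_rate: float, confidence: float = 0.95) -> int:
    """Failure-free trials needed to claim a rate below `max_rate`."""
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-max_rate))


bound_after_10k = upper_bound_failure_rate(10_000)  # about 3 in 10,000
needed_for_one_in_a_billion = trials_needed(1e-9)   # about 3 billion
```

So 10,000 clean runs only bound the rate at around 3 in 10,000, and a one-in-a-billion claim would need on the order of 3 billion failure-free runs. Poking it a few times doesn't get anywhere near that.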

You’re basically asking me to prove to you that I can’t fly.

I will say it like this: it is highly improbable that I can fly. I cannot come up with a way to prove it to you. There is some sort of epistemic miscalculation going on if you operate under the assumption that I might be able to fly.