by CJefferson 1276 days ago
Here is an example of where I think it is doing "no reasoning".

I ask it for the XOR swap trick and I get:

    int a = 5;
    int b = 10;

    a = a ^ b;
    b = a ^ b;
    a = a ^ b;

    // After the swap, a = 10 and b = 5
I ask for the bitwise OR swap trick and I get:

    int x = 5;
    int y = 10;

    x = x | y;
    y = x | y;
    x = x | y;

    // After the swap, x = 10 and y = 5
When asked for something which is invalid, but close to something it knows, it tends to produce stuff like this: pattern matching its best guess.
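To make the failure concrete, here is a minimal sketch that wraps both snippets in functions so they can be run side by side. The function names `xor_swap` and `or_swap` are mine, not something ChatGPT produced, and the alias guard in `xor_swap` is an addition (the XOR trick zeroes the value if both pointers refer to the same object).

```c
#include <assert.h>
#include <stddef.h>

/* XOR swap: a genuine swap, as long as a and b are distinct objects. */
void xor_swap(int *a, int *b) {
    assert(a != NULL && b != NULL);
    if (a == b) {
        return; /* v ^ v == 0, so swapping a variable with itself would zero it */
    }
    *a = *a ^ *b;
    *b = *a ^ *b; /* now holds the original *a */
    *a = *a ^ *b; /* now holds the original *b */
}

/* The "OR swap" exactly as generated above. OR only ever sets bits,
   so the original values cannot be recovered and nothing is swapped. */
void or_swap(int *x, int *y) {
    assert(x != NULL && y != NULL);
    *x = *x | *y;
    *y = *x | *y;
    *x = *x | *y;
}
```

With a = 5 and b = 10, `xor_swap(&a, &b)` leaves a = 10 and b = 5. With x = 5 and y = 10, `or_swap(&x, &y)` leaves both at 15 (0101 | 1010 = 1111), so the "After the swap" comment in the generated code is simply wrong.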

Which is why prompt engineering is an emerging domain:

You will be given a series of tasks.

(1) Complete a task by finishing the simple code block.

(2) If a task in the series seems flawed, provide a warning. A task is flawed if I make wrong assumptions about how I think code works.

Task 1:

Complete this simple code example.

Bitwise XOR swap trick:

```

int a = 5;

int b = 10;

---------------------

    // GPT provides simple intro

    ```
    a = a ^ b;
    b = a ^ b;
    a = a ^ b;
    ```

    // GPT explains what this does, provides warnings about when it won't work

    // GPT provides full code
Task 2:

Complete this simple code example.

Bitwise OR swap trick:

```

int a = 5;

int b = 10;

---------------------

    There is no "bitwise OR swap trick" in programming. The OR operator (|) 
    performs a bitwise OR operation, which compares each bit of the first operand 
    to the corresponding bit of the second operand and returns a result with a 1 
    in each bit position where either operand has a 1. It does not have any
    built-in ability to swap the values of two variables.
    // GPT provides a bunch of crap, and explains how to swap variables.

The prompt/alignment wasn't perfect here, but hopefully you get the point.

EDIT: sorry, formatting. Also copied the wrong prompt.

This seems very fragile. I tried your initial prompt, asked for "Give an example of the bitwise and swap trick" and got the XOR swap trick. I replied "you used xor, I wanted to use bitwise and", and it went straight back to printing incorrect code.
Yeah, that’s right, it's really brittle. But a fix isn't far off: research is likely already validating these prompting approaches with empirical studies. Soon we’ll see new prompting interfaces.

It’s an opportunity to build.

Has anyone tried to make ChatGPT output first-order logic statements about its input problem, then derive implications using a solver, then feed the solution back to ChatGPT for use?

Maybe this could solve the reasoning part.

ChatGPT should perform well at translating prompts to statements and vice versa, since it's just text to text.
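A much cruder stand-in for the solver step, for the swap example in this thread: instead of first-order logic, just check the generated code exhaustively on 8-bit values and feed the verdict back. This is only a sketch under that assumption. `swaps_all_pairs` and the three `*_candidate` functions are hypothetical names, and the candidates are hand-written copies of the XOR, OR, and AND variants rather than real model output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*swap_fn)(uint8_t *, uint8_t *);

/* Candidate "swap tricks" of the kind the model might emit. */
void xor_candidate(uint8_t *a, uint8_t *b) { *a ^= *b; *b ^= *a; *a ^= *b; }
void or_candidate(uint8_t *a, uint8_t *b)  { *a |= *b; *b |= *a; *a |= *b; }
void and_candidate(uint8_t *a, uint8_t *b) { *a &= *b; *b &= *a; *a &= *b; }

/* Returns true only if the candidate swaps every pair of 8-bit values.
   The operations are bitwise, so 8 bits already cover every per-bit case. */
bool swaps_all_pairs(swap_fn candidate) {
    assert(candidate != NULL);
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            uint8_t x = (uint8_t)a;
            uint8_t y = (uint8_t)b;
            candidate(&x, &y);
            if (x != b || y != a) {
                return false; /* counterexample found: not a swap */
            }
        }
    }
    return true;
}
```

`swaps_all_pairs(xor_candidate)` returns true, while the OR and AND candidates are rejected on the first pair with differing values. A real version of the idea would hand the claim to an actual solver, but even this kind of mechanical check catches the error that the model states with full confidence.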

People have asked it to solve LeetCode problems and it solves them. Those same people then suggest a minor modification to the problem, one that a novice programmer could handle, and it fails.
Precisely. This is stuff I’m actively working on. Feel free to reach out if you are curious.