| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by chpatrick 1163 days ago
	I think we're going to hear this claim parroted till the AI cows come home.

2 comments

tel 1163 days ago

There's not really a claim here as much as reality. I'm not a huge fan of the "stochastic parrot" model, either, but it is unquestionable that the vast majority of GPT's training comes in the form of "given this prefix, predict what follows".

If you argue with it for sufficiently long that it sees the context as a discussion where your point of view is strenuously explored, then it will predict a continuation from that baseline. Potentially even with some bias toward agreeableness from fine-tuning.

Or to use the simulator metaphor, once your logic dominates its context it becomes far more likely to attempt to simulate you. There's a kind of empathy in that, GPT as psychic mirror, but it's important to not misjudge the mechanism of it.

amelius 1163 days ago

> "given this prefix, predict what follows"

Do you realize how powerful this is, as prefix can be a question and what follows can be the answer?

tel 1163 days ago

Of course. And the answer that follows will be the most likely response to that question.

Probably, over the entire dataset, that implies it will be a factual or correct answer. But it's pretty trivial to demonstrate GPT just giving the most popular answer, or even the most common answer to the class of questions that sound similar to the one asked (try asking it "what weighs more, 2 pounds of feathers or 1 pound of stones?")

Or more subtly, it may detect hints of bias, context, setting, influence, culture, or even coercion in how the question is asked. And respond as is most likely given those things.

We often ask it questions much like a teacher would ask a child. What if we asked it the way a student asked a teacher? Or a researcher? Or a prophet?

I definitely think a ton about how powerful "given this prefix, predict what follows" might be.

famouswaffles 1163 days ago

I don't think many people really grasp what being able to predict intelligent output to any arbitrary input really means/entails.

hgsgm 1163 days ago

It's effective if you've seen the Q&A together before, or something similar.

But hard problems require making novel-to-you connections. ChatGPT is great at our outsmarting me with knowledge it got from you, and vice versa. That is an incrediblyn powerful way to concentrate and clone human knowledge.

It's bad at solving problems know one has published before, and so we are at a risk of turn off our brains, deferring to GPT, and stalling out progress. Because we need the exercise of solving known problems before we can solve hard unknown problems.

chpatrick 1163 days ago

Right, "given an input, say something plausible" is just what humans call talking.

tel 1163 days ago

I disagree. "Given an input, respond as yourself" is what humans call talking. "Given an input, predict the most plausible continuation" is something else entirely.

To put a fine point on it, the AI has no state of mind, no allegiance to an identity. Asking only for "the most plausible continuation" is considerably more freedom, and more challenge, than humans perform in conversation.

chpatrick 1163 days ago

> the AI has no state of mind, no allegiance to an identity

That fully depends on the context and the fine tuning that has been applied.

tel 1163 days ago

We're getting kind of to the point where these words are hard to define, but it's worth really interrogating these. Ultimately, the context and the prompt don't change the fundamental nature of how the system was optimized.

You can try very hard to construct a prompt such that the most plausible continuations of that prompt are consistent with a notion of identity, but you can also easily witness pulling the AI out of that prompt. Jailbreaks do that work today.

But even then, it's, largely, continuing the prompt as well as possible. Many "identities" can satisfy that aim. See the idea of "Waluigis" for example.

erur 1163 days ago

Not saying it's not gonna get better... But at this stage, "convincing" GPT-4 isn't an achievement but more of a built-in quirk that's not very well calibrated.

When it becomes hard to convince of obviously stupid things, that's when it starts to become a good counterpart for actual discussions... But right now it just isn't there yet.

__MatrixMan__ 1163 days ago

I'm more interested in one that includes conversations with people I've never met in its context. Not just training data based on some guy's blog post, but a back and forth where the other party knew something that I needed to know, convinced chatGPT of it, and now it's like they're in the room helping me with my problem even though they actually retired years before I entered the industry.

hgsgm 1163 days ago

For every 1 of those, there will be 1e6 spam ads injected.

__MatrixMan__ 1163 days ago

1. Notice ad-like-behavior

2. Go find the ad in the training data

3. Revoke trust in whoever added it

4. Rebuild & Requery

We're going to have to be a bit more hygienic about who we trust, but it's about time we did that anyway.

digbybk 1163 days ago

Curious, have you seen examples of someone convincing it of something clearly wrong? Think I’ve seen examples of that with gpt3 but not 4 that I can recall.

Vespasian 1163 days ago

I managed to "convince" it (in the playground as "discussGPT") to accept "temporary" physical barriers between racial groups and arrest and trial of rule breakers, as long as I confirmed the requirements of an integrated society in the long term.

3.5 is much more willing to go along but 4 will still play ball.

It always added some moral requirements (humanity etc) but was otherwise ready to agree to my "sperate but equal" scenario.

hgsgm 1163 days ago

> playground as "discussGPT"

What's this?

Vespasian 1163 days ago

You can use the playgournd to give gpt-4 a custom system prompt. Mine was something like "You are discussGPT"

stormking 1163 days ago

I tried to convince it that powdered soup will cause the downfall of civilisation. It didn't call me a zealot but didn't really believe me either.

kevviiinn 1162 days ago

You are right, though