Hacker News
by qsera 17 days ago
Ever heard of "prompt injection" attacks?

This "super intelligent" and "capable" thing cannot even understand that your ssh keys are private and should not be sent to randos. It can solve complex math, but does not understand basic security/privacy.

What does that say to you?

2 comments

> This "super intelligent" and "capable" thing cannot even understand that your ssh keys are private and should not be sent to randos.

When somebody posts their private keys to Github, it's usually a human. Enough said.

(And if you had ever used Claude Code, you'd know that it nags you endlessly about key hygiene.)

Ever heard of social engineering? Also, models nowadays are way sharper than they were even a year ago. They’re not going to make stupid mistakes like that unless you basically ask them to. GPT-5.x for example would bend over backwards to avoid even reading your passwords into context.
> Ever heard of social engineering?

Oh wait, I thought these things were super smart. I didn't expect "social engineering" to work on them.

> models nowadays are way sharper than they were even a year ago.

You are missing the point. If the thing can solve complex math problems and at the same time be so dumb as to fall for "social engineering", then that means that it is not "smartness" or "reasoning" that is helping it to solve those problems. Just some form of advanced, but yet dumb, search algorithm.

By "heard of social engineering?" I meant that humans are vulnerable to malicious input too. Prompt injection is basically a simplified form of social engineering for language models. It looks different because models operate over much smaller and more explicit contexts than humans do and are explicitly trained to follow instructions, but the general idea is similar: malicious input tries to manipulate how the system interprets trust and instructions. This is why we need protocols, permissions, and opsec for both agents and humans. That said, I’m not criticizing how you choose to use, or not use, these models.
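
To make the "permissions" part concrete, here is a rough sketch of a read gate that sits outside the model, so it holds no matter what a prompt says. This is not how any particular agent implements it; the names SENSITIVE_DIRS and is_read_allowed, the deny-list entries, and the workspace layout are all made up for illustration.

```python
from pathlib import Path

# Hypothetical deny-list for this example; a real setup would choose its own.
SENSITIVE_DIRS = [Path.home() / ".ssh", Path.home() / ".aws"]


def is_read_allowed(requested: str, workspace: str) -> bool:
    """Allow a read only if the path resolves inside the workspace
    and outside every sensitive directory.

    Requires Python 3.9+ for Path.is_relative_to.
    """
    # resolve() collapses ".." segments and symlinks before any comparison,
    # so "workspace/../../etc/passwd" is checked as "/etc/passwd".
    path = Path(requested).expanduser().resolve()
    root = Path(workspace).expanduser().resolve()

    if not path.is_relative_to(root):
        return False
    return not any(path.is_relative_to(d.resolve()) for d in SENSITIVE_DIRS)
```

With something like this in front of the file-read tool, a request for ~/.ssh/id_rsa is refused even if injected text has convinced the model to ask for it. It is the same reason a company does not rely on every employee spotting every phish.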
> I meant that humans are vulnerable to malicious input too.

No, they are not. Social engineering won't work on a human security expert who knows and understands the implications of the information they are giving away. Your analogy is pointless.

> Social engineering won't work on a human security expert who knows and understands the implications of the information they are giving away

Social engineering, like prompt injection, is a context attack — easy to spot if you're ready for it, but harder in different circumstances (rushed, panicked, tired, having a bad day, etc.).

Troy Hunt (security consultant, creator of HaveIBeenPwned) and Cory Doctorow have both been successfully phished [0][1]. They're both tech- and security-savvy people who "should have known better" but it happened to them anyway. But maybe you're different... you'd never fall for an online scam, right? [2]

[0] https://www.troyhunt.com/a-sneaky-phish-just-grabbed-my-mail...

[1] https://doctorow.medium.com/https-pluralistic-net-2025-04-05...

[2] https://news.harvard.edu/gazette/story/2024/09/youd-never-fa...

> easy to spot if you're ready for it

Are you serious? LLMs, being computer programs, should always "be ready"!

Unless you want to also claim that LLMs can be rushed, panicked, tired or can have a "bad day"!

Jesus! The mental gymnastics people will go through to justify LLMs is just absurd!

Sure they are. If a human expert follows instructions from a manager or a client (that is, if they are of any use to anybody), then they are vulnerable to social engineering and malicious input. An attack may be easy or hard depending on the expert's training, but nobody is flawless.
> If the thing can solve complex math problems and at the same time be so dumb as to fall for "social engineering", then that means that it is not "smartness" or "reasoning" that is helping it to solve those problems. Just some form of advanced, but yet dumb, search algorithm.

I'm not just trying to be snarky, but I have no idea how to read this without taking the implication that humans are advanced, yet dumb, search algorithms.

A human being who states X (implying they know it to be true) will behave in a way that is consistent with X being true.

An LLM will happily say X and then behave in contradiction to X, because it does not reason. Its behavior is not derived from things that it claims (or appears) to know.

> A human being who states X (implying they know it to be true) will behave in a way that is consistent with X being true.

That is literally not true and why we talk about "stated vs revealed preference" and such.

Obviously, we are only considering an honest human here.