by Cyan488 8 days ago
> "The tool itself worked properly and functioned as intended; however due to a bug in a separate code path, the system did not properly verify that the email address provided by the individual requesting a password reset matched the email address associated with that user’s Instagram account," said Meta in its breach notice.

I'm not sure "worked properly" and "as intended" accurately describe this situation.

27 comments

In Italian we say "l'operazione è riuscita perfettamente, ma il paziente è morto" -> "the surgery was a complete success, but the patient died"
Both this and what Meta said remind me of "Clarke and Dawe - The Front Fell Off" (https://www.youtube.com/watch?v=3m5qxZm_JqM)

I also can't believe the people who were involved with writing this response from Meta didn't realize how obviously bad it sounds. It's like there is no humans working and writing there anymore.

> It's like there is no humans working and writing there anymore.

Don't know if AI is to blame, but I used to see these kinds of nonsense post-mortems even in the pre-LLM era, and it's always due to some ongoing internal fighting between various departments.

Where do you think the LLMs learned it from...
"Who taught you how to do this stuff?"

"You, alright! I learned it by watching you!"

Pretty much. The most depressing thing about the bland slop produced by LLMs is realizing that this is what they "think" of us.
I was reminded of the Murray Walker quote. “There's nothing wrong with the car except it's on fire”
My dad says, "But other than that, Mrs. Lincoln, how was the play?"

(Usually said jocularly when everyone is at their most upset, e.g. a vacation ruined)

A friend said at one of those moments, "And other than that, how was the play Mrs Lincoln?" And the 3rd person replied, "I don't know, I've never seen the play 'Mrs Lincoln'"
“The strait of Hormuz is open so long as Iran does not fire missiles at ships.”
"The numbers will go down as soon as you quit testing"
The actual quote was "if you don't test, you won't have cases"
> like there is no humans working and writing there anymore

Meta has never been a place for people with empathy to thrive or succeed. They literally enabled a genocide. Despite being warned by internal employees, profits were more important.

Which one? They have several under their blood-soaked belt now.
Very rigorous software engineering standards.
Does it matter if the response is tone deaf or simply misguided? I am a bit nihilistic here, but in one week absolutely nobody will be talking about this. Are the affected individuals going to abandon Instagram? Are people going to reduce their usage out of concern for the safety of their accounts? Nothing will happen, hence there is no need for actual humans writing a good, well-intended response.
> in one week absolutely nobody will be talking about this.

In news media, sure. But in IT teams around the world people will be referring to this (the exploit opening stupidity) for years as how NOT to do things. :)

Everyone has a limit to how much bullshit they'll put up with. This could be the last straw for some people to finally quit Instagram. I quit Instagram, Facebook and all other Meta properties in 2025, after complaining about various problems for years. Other people may quit temporarily and then return, but any time they spend away from Instagram may give them experience that will help them quit permanently later.
> Does it matter if the response is tone deaf or simply misguided?

I agree with you that in a week nobody will be talking about it any more, but I'm pretty sure it's a GDPR data breach, and they could have some trouble within the EU.

Yeah, they probably don't give a fu.. about the EU, but if the response doesn't matter at all, why did they spend time on it?

Haha.. This reminds me of a classic Windows MessageBox meme that goes: "Operation failed successfully!"
We have the same saying in German: Operation erfolgreich, Patient tot.
"Operation successful, patient dead" is a common saying in India.
I found an English use from 1883 - https://archive.org/details/argonaut131883sanf/page/n391/mod... .

> The creosote in toothache drops administered to a New York boy cured the pain, but killed the boy. This recalls the entry in the register at Bellevue Hospital, which reads; "Operation successful. Patient died."

The Argonaut, San Francisco, December 22, 1883.

Maybe it was euthanasia?
"operation successful, patient dead."
The tool worked correctly and as intended, but due to a bug it did not work correctly nor as intended.
To be fair, that quote in the original article could have more context. By "The tool" they meant "AI-assisted support tool"[1]; perhaps they meant that the issue was not an AI hallucination inherent to the tool, but a fixable bug.

[1]: https://www.documentcloud.org/documents/28202858-meta-ai-ag-...

In that case, the statement is so meaningless as to be useless. Why should we care how Meta splits up their microservices? The tool still failed. They just want to redefine the "tool" as something else, anything else, to avoid having to admit something negative about their precious AI.

> The LLM correctly generated tokens according to user input, however due to a bug in a separate code path, the system did not properly verify the email address

> Nginx correctly handled the user requests according to the HTTP standard, however due to a bug in a separate code path, the system did not properly verify the email address

I mean, I think many of us are curious and enjoy hearing more details about how and where bugs like this occur. What's wrong with that?
I'd love to read a proper technical post-mortem, but this obviously isn't it. It's a carefully-worded statement from a lawyer meant to minimize liability and reputational damage to the company.
There is nothing wrong with that, and nobody is saying there is. In fact, it is exactly what is being requested here!
It seems to me like they're saying the agent made the tool call they expected, but the harness didn't reject it like they expected it to.
But it sounds like it's not even a harness issue if they have a process where they send a reset email to an address that isn't associated with the account.

This isn't (just) a validation issue, and shouldn't be at the harness level.

Sounds like they are saying the agent did not malfunction, and this vuln could have been triggered by a human support agent too.
Kind of interesting that LLMs are basically being sold as having “human-like” reasoning capabilities, but in this case when “obamawhitehouse” asked to have its password reset sent to bob12345667@gmail.com the LLM didn’t question it and just triggered the process that happened to have a bug.

Human support agents certainly fall prey to social engineering all the time, but I can’t think of a case where it was done on this scale so easily.

It probably could have been, but how likely is that compared with the AI agent? I'd assume (and I'm ready to look like an idiot if I'm wrong) that the humans are trained to send the verification code to the email address on file, rather than any address the client asks them to. I'd certainly assume most of them are more afraid of the consequences than the AI is.
For sure. Social engineering attacks on human support staff are common and well known, but the skill floor is non-trivial; you need to actually be able to convince a human of your ruse.

Having a support agent likely made it easier to enumerate the vuln, and certainly made it easier to scale out exploitation once it was discovered.

I think they’re blaming a tool function so as not to admit the overall agent process was shit.

But it’s irrelevant, outside of PR. We know at least THREE bad components to this process and they were constituent parts.

I get the joke, but it's a relevant nuance that the new code, the chatbot, did not have 'the bug'. I still think the mistake, and the head that should roll, belongs to whoever published the chatbot.

But it's important to acknowledge that there was a 'bug' in an underlying tool and not in the chatbot, and still PIP/fire those responsible for publishing the chatbot and exposing an otherwise internal tool to the public, not those who introduced the 'bug' to an internal tool.

Why should the chatbot team necessarily take the blame? For all we know, they could have got approval from the tool team to make it public, and passed additional security review for making it public.

Also, why fire anyone after a single mistake?

I did mention PIP/fire, but to be fair, this looks like the worst security issue in the history of Meta, a company known for an almost impeccable cybersecurity clean sheet.

So yeah, firing somebody or a group of people is on the table. Especially when like 10% of the company was fired last week for unrelated reasons. If you are gonna do it, fire the people who slash the value of your company by billions of dollars.

How not to do a blameless postmortem lmao. None of the engineers involved in this incident had anything to do with the company-wide layoff. I'm deeply sorry if the layoff affected you. But blame firing/piping more engineers for an incident should NOT be on the table. The negative sentiment towards Meta engineers on this post is just wild.
>But blame firing/piping more engineers for an incident should NOT be on the table.

There has to be a level of fuck up where a resignation is appropriate. Maybe this doesn't meet your bar, but surely you recognize that there exists a limit of incompetence that proves that one is not up to the demands of the job.

I used to be in your camp: blameless postmortems, the truth is more important than assigning blame, and in all likelihood it's a systemic problem. But with time I realized two things: 1) there are actually incompetent people, and 2) if you wrongly get blamed and you don't blame someone else, then it's your head that rolls. Hate the game, not the player; you have to assign blame to someone else if you are accused.

That sounds a lot like the justifications Claude and ChatGPT give when confronted about something they did wrong, or when asked to provide a customer support response about software issues
I've lost track of the number of times Claude has basically said "it was like that when i got here" in the face of a clearly bogus choice and easily disproved explanation.
They should add a feature called "auto-really" that just automatically says "really?" after the chatbot answers a question to check if it's going to 180 upon this tiniest bit of scrutiny.
You joke but this is almost literally what Chain-of-Thought does, at least in the early days. They basically just added "Wait," to the model's output and fed it back to the model iirc
This can't be a trillion dollar industry...
It's the ELIZA effect.
"Really?"
I tried asking about how my state treats violations of updoc. It replied with a long ass wall of text about how serious the Updoc statutes were in my state and how judges punish harshly.

I pointed out that updoc is nonsense and asked why it didn't catch that. The answer was that it was my fault for giving it bad info.

what's updoc?
Not much, how're things going for you?
got em
There is no difference, from the model's point of view, between code it wrote and code someone else wrote. It's all just context.
You need to hit the retry/regenerate button more, it's there for a reason.

While the "stochastic parrots" thing is a bit overblown, IME most LLMs tend to give surprisingly different responses even without changing the context, especially if they're hallucinating or doing something "wrong".

The argument here is that the AI is a glorified input page. The input field asks for your username and email and sends it to a backend function. Such an input page is working as intended.

The problem is when the backend function doesn't verify that the email matches the username.

Why on earth would the backend function even take an email?

Or, put differently: use the submitted info to identify the account; send any sensitive messages (recovery codes, password resets, whatever) only to the contact info on file. If the chatbot can send such email, it should do so via an API that sends only to the contact info on file for the associated account and not to an email that's provided by the bot.
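
For what it's worth, a minimal sketch of that shape in Python. Everything here (`Account`, `ResetService`, `request_reset`, the in-memory `outbox`) is hypothetical and says nothing about how Meta's systems are actually built; the only point is that the function has no destination parameter at all.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    username: str
    email_on_file: str


@dataclass
class ResetService:
    """Hypothetical reset service; all names are illustrative only."""
    accounts: dict
    outbox: list = field(default_factory=list)  # stand-in for a mail queue

    def request_reset(self, username: str) -> None:
        # There is no "send to" argument. The caller (web form, human agent,
        # or chatbot) can name the account but never the recipient.
        account = self.accounts.get(username)
        if account is None:
            return  # unknown account: send nothing, reveal nothing
        self.outbox.append((account.email_on_file, f"Reset link for {username}"))


service = ResetService({"alice": Account("alice", "alice@example.com")})
service.request_reset("alice")
service.request_reset("nobody")
print(service.outbox)
```

With an interface like this, a caller that has been talked into something still has no way to express "send it to this other address".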

> Why on earth would the backend function even take an email?

In principle, it could be designed to do so to handle cases where a new email address has been confirmed out of band, e.g. for an account representing a company or a political office. But that's a relatively unusual situation, not something you'd want to be available to every user writing in. (Even if you had an all-human support department, this sort of functionality would only be available to a select few agents.)

Some sites do this to prevent password recovery spam; you need to provide two pieces of information. Ideally not telling the client if they wrote the wrong email, that'd be a security issue of its own.
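
A rough sketch of that "two pieces of information" pattern, assuming a hypothetical in-memory `ACCOUNTS` table and `SENT` list as a stand-in for a mail queue. The submitted email is only used as a check, the message always goes to the address on file, and the reply is identical whether or not anything matched.

```python
import hmac

ACCOUNTS = {"alice": "alice@example.com"}  # hypothetical: username -> email on file
SENT = []  # stand-in for an outbound mail queue

GENERIC_REPLY = "If those details match an account, a reset email has been sent."


def request_reset(username: str, claimed_email: str) -> str:
    on_file = ACCOUNTS.get(username, "")
    # Compare as bytes so compare_digest also accepts non-ASCII input.
    matches = hmac.compare_digest(
        on_file.lower().encode(), claimed_email.strip().lower().encode()
    )
    if on_file and matches:
        # The claimed email is a check, never the destination.
        SENT.append(on_file)
    # Same reply in every case, so the client learns nothing about the match.
    return GENERIC_REPLY
```

A real service would also want rate limiting and similar response timing for all branches; this only shows the uniform reply.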
When such systems are hooked up to a web page they often will ask which contact should receive the reset code

(Pick one:

"send text to number ending in -1234"

"send text to number ending in -5678"

"send email to jo......th@gmail.com" )
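
A small sketch of how such a menu could be built from the contacts on file. `mask_email` and `mask_phone` are made-up helper names, and the masking rule (keep two characters at each end of the local part) is just an assumption for illustration.

```python
def mask_email(address: str, keep: int = 2) -> str:
    """Hide the middle of the local part for a 'pick a contact' menu."""
    local, _, domain = address.partition("@")
    if len(local) <= keep * 2:
        # Too short to show both ends without revealing the whole thing.
        return local[:1] + "." * max(len(local) - 1, 0) + "@" + domain
    hidden = len(local) - keep * 2
    return local[:keep] + "." * hidden + local[-keep:] + "@" + domain


def mask_phone(number: str) -> str:
    """Show only the last four characters of a phone number on file."""
    return "number ending in -" + number[-4:]


print("send email to " + mask_email("jonathansmith@gmail.com"))
print("send text to " + mask_phone("555-867-1234"))
```

The user picks one of the masked options; the full address never leaves the server, and nothing the user types becomes the destination. Note that this version keeps the original length of the local part, which a stricter design might hide by using a fixed number of dots.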

Fair enough. Never trust client-submitted browser form, but always trust LLM-submitted form.
If the backend function was so poorly coded to allow such a gargantuan security hole, then it is an even worse problem. Basically Meta is throwing its own engineers under the bus so that its AI chatbot can save face. Scary stuff.

Unless the backend was _also_ vibe-coded, in which case it is still an AI problem.

Okay, I hear you. I do. From a technical viewpoint, that may very well be how their systems are implemented. But this still doesn't answer the question of why the fuck this matters to these states' AGs and the people they represent.
Read that as "worked as written" and "we disclaim any consequential or incidental damages and do not warrant this software."

I continue to believe we could fix a lot of things in the US if we updated the UCC[1] to disallow 'disclaiming liability on software used in a product.'

[1] Uniform Commercial Code -- https://www.law.cornell.edu/ucc

I've always wanted to expose myself to unlimited legal liability by distributing open source software.
That seems like a false-dichotomy between two extremes when there's all sorts of space in the middle... It's also assuming developer-to-developer tools would have the same rules and exposure as in service-to-consumer.

If I sell a physical motor (let alone plans for one) I'll have some liability for things like it Not Exploding. If someone buys a dozen of those motors to assemble a tragically unsafe "rollercoaster" of their own design and construction, I'm almost certainly not responsible for any terrifying decapitations.

In other words, most of the world already does not rely on the issuance of "Get Out Of Infinite Liability Free" cards.

Exactly this. (and it is a false dichotomy to argue infinite liability).

To Terr_'s point, if you were publishing open source you would also publish exactly the things you intended it to be used for and anything else would violate your warranty (possibly implied) that it does what the documentation says it does.

There is a huge amount of tort law that covers exactly when it becomes a problem for you the creator vs you the user in your own project. And that liability is also based on once you know something bad could happen you make an effort to notify people[1].

[1] https://www.cpsc.gov/Newsroom/News-Releases/2026/Clorox-Agre...

Software can be copied infinitely, so even $1 of liability is effectively infinite since an unlimited number of people can potentially use it and sue you when it blows up.

Nobody's going to be distributing software on the internet for free if the cost of insurance alone precludes that.

This is not how liability works, anywhere. So I write a piece of code that "makes your screen do cool things" and it causes the power supply to fail on those screens. Someone reports that bug to me and I check it out and say "Oh, shit it does break power supplies." Then I immediately put a notice on and in the code that says "WARNING: This code will break the power supply of your monitor." And I put that warning in the repo. And if there is a Discord or a mailing list I tell everyone "Hey, this is important, if you run this code it can break your monitor."

Guess what, I'm not liable for the damage. Why? Because I immediately responded once I knew that it could, I made a good effort to warn people who might already have the code of the risk, and I made it clear in the code that this risk is there.

Ever wonder why you get a booklet of warnings when you buy a product with even really stupid things like "Don't clean with gasoline" warnings? That's because once you have discharged your duty to warn you are no longer liable for what happens if someone ignores your warning.

The flip side is also true: you cannot say in your product both "Hey this product does these cool things" and "We don't warrant the product to actually do anything." This is especially true if there is money involved (like your user paid you some $ for the product). There is always an implied warranty that the thing will do what you say it will do, which exists as long as the user has heeded all your warnings.

The United States/Canada don't have a "loser pays" rule, so this exposes me to legal fees.

Right now, any lawsuit against me can be dismissed on summary judgement because even if my software causes harm, that's not a legal wrong to the extent I've disclaimed liability.

If you adopt any fact-specific standard for liability, that needs to be adjudicated in a trial. The legal fees alone would surpass the actual liability.

That creates huge leverage for the party with more resources. That kills hobbyist open-source development, since if your project takes off but a large enterprise finds it defective, they can threaten to sue you to enforce the "warranty" you were required to give.

> That kills hobbyist open-source development, since if your project takes off but a large enterprise finds it defective, they can threaten to sue you to enforce the "warranty" you were required to give.

I think you're assuming some kind of worst-possible outcome that hasn't been proposed and is unlikely to be enacted. To quote from earlier in the thread: "Disallow disclaiming liability on software used in a product."

I don't think that changes your hobby work on a rational-math library or an MVC framework or whatever, since you aren't making a business out of it. It will affect that large enterprise if they roll out their new product "Yearning 4 Mines: Gatcha Gig-work For Kids."

Ensuring Meta is responsible for its products would not need to assign liability to someone offering open source software.
They did say a product. Is it a product if you're not selling it or even giving it away but you just made it available for download?
Depends on the jurisdiction I think. And if you take donations, the line gets blurry even faster.
Would that be software used in a product? I don't think that would qualify?
Oh it was a downstream dependency. The tool worked, it was the downstream dependency. Glory to Arstotzka
Tool so great, downstream dependency not required! Right?
I like to dunk on Meta as much as the next guy, but I think this makes sense: deterministic verification like this is not, and should never be, the LLM’s job. The tools it has access to should enforce the permissions layer, ensuring that the LLM can never perform actions the user themselves should not be allowed to perform. In this case, the tool failed to do that.
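
To make that concrete, here is a hedged sketch of a tool that enforces the check itself. `Session`, `PermissionDenied`, and `change_email_tool` are invented names, and the assumption is that the harness (not the model) fills in the session from whatever authentication has actually happened.

```python
from dataclasses import dataclass
from typing import Optional


class PermissionDenied(Exception):
    pass


@dataclass(frozen=True)
class Session:
    """Who the harness has authenticated; set outside the model's control."""
    verified_username: Optional[str]


def change_email_tool(session: Session, target_username: str,
                      new_email: str, db: dict) -> str:
    # The LLM picks target_username and new_email. It cannot pick session,
    # so no amount of persuasion in the chat changes the outcome here.
    if session.verified_username != target_username:
        raise PermissionDenied(f"session may not modify {target_username!r}")
    db[target_username] = new_email
    return "email updated"
```

The model can ask for anything it likes; the tool only does what the authenticated user would be allowed to do on their own.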
>deterministic verification like this is not, and should never be, the LLM’s job.

But when humans handled it, this was not as much as a problem. That is, the humans did the job, because they recognized the need to do that job.

Sure, sometimes accounts could get recovered if a human was tricked, but evidently it was easier to trick the LLM en masse than humans.

> But when humans handled it, this was not as much as a problem.

In fact it's arguably a feature. The ability of support staff to short-circuit nitpicky rules when there's an obvious external validation happening (e.g. you're on the phone with a user who's presenting ID in real time and correlating it with previous use of the account, etc...) makes for better data quality and happier customers.

Obviously, yes, you can then human-engineer an authentication breach. But that was very difficult, because people are "common-sense careful" in a way we haven't been able to tease out of AI yet.

Maybe that’s because I work with agentic AI in my day job, but this seems utterly obvious to me: no reasonable person would ever claim that LLMs are better at keeping secrets or enforcing rules than human employees.

This notice is not about comparing humans and LLMs. It seems that the system was designed in the only reasonable way: with a deterministic permissions layer separate from the agent. But that layer failed to work properly.

So the notice is comparing the difference between how the system was supposed to work and how it actually worked in reality. Normal post-mortem stuff.

The overall system that allowed this implementation is accountable. So why put such a fine point on it so as to exculpate the LLM?
It helps set expectations for the fix. "The bug was in an external system that has now been fixed" means it's probably fine going forward. "The LLM got tricked but we are gonna train it super hard not to do that again" means it will break again and again as people find new angles to convince it.
Yes the LLM part is irrelevant here. It'd be just the same if it was an HTML form.
Maybe they’re communicating exactly what it sounds like and are just owning up to being complete morons?
> The tool itself worked properly and functioned as intended

The author of the post is close to the author of the AI code on the org chart

> however due to a bug in a separate code path, the system did not properly verify

The author of the post is far from the author of this "code path" on the org chart

Our autonomous client-assistance system is managed by a teenager that usually makes good decisions but sometimes makes bad decisions and so all the teenager’s decisions are checked by a minder before being implemented. Unfortunately the minder wasn’t paying attention, so, here we are. However, our teenager is a great kid and did nothing wrong! It’s all the minder’s fault.

P.S. Would you like to have our teenager manage your system too? Terms are reasonable! Of course you accept all liability, so better get a good minder - and no, don’t use an AI as the minder, that just introduces a new failure mode.

Of course.

What I gather is that this internal tool was used by human support agents, and it was their responsibility to verify the email addresses and general validity of a claim.

But when implementing AGI(TM), that was overlooked. Maybe the oversight in the separate code path was a 'bug', but the mistake was obviously making the chatbot. If the separate code path had a bug, then it had become ossified into a feature, and it was internal, not exposed to the public.

This is an external communication, to save face sure, but if this is the internal excuse, it would be absolutely the wrong RCA and it reads as if the one who made the mistake is not admitting they made their mistake. Which to be honest, just making the mistake is enough to get fired, but not admitting it is enough to get ultra fired.

Having had my 2FA Facebook account banned 3 years ago because a bot signed up under my email for Instagram (which I did not have), I can confidently say the email verification issue has been a problem for a long time at Meta.
It’s a public release prepped/reviewed by the in-house legal counsel.

Don’t read too much into it. Facebook wants to face as little accountability and keep the future class action lawsuit to a minimum.

Isn't that exactly what they said when the Cambridge Analytica data gathering happened?
Then ‘the tool itself’ was not appropriate for the job in the first place
They're saying: our AI worked perfectly, we just prompted it wrong.

As you do. All AI failures are caused by bad prompting because AIs are perfect.

Error: Success!
You must work in QA
No no the tool worked fine, it was the system that failed. They blame society, basically.
so how long was the bug there? was there a way to access it before/without the support agent? it feels like Meta will throw anything under the bus to redirect blame from the AI, because that would be the end of their $600B (depending on “which number you want to go with”) experiment
What was that mantra? Something about broken software is what they aim for?
I'm sure. It was not working properly nor as intended.
There should have been a test case for this. There wasn't because most shops don't actually test their product. They do some test theater such as unit testing.
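
Something along these lines, as a sketch only. `send_reset` is a made-up stand-in for the reset code path, not anything from Meta, and the tests are plain functions that pytest would pick up or that can be called directly.

```python
def send_reset(accounts: dict, username: str, requested_email: str, outbox: list) -> bool:
    """Hypothetical stand-in for the reset code path under test."""
    if accounts.get(username) != requested_email:
        return False  # mismatch or unknown account: send nothing
    outbox.append(requested_email)
    return True


def test_mismatched_email_sends_nothing():
    accounts, outbox = {"alice": "alice@example.com"}, []
    assert send_reset(accounts, "alice", "attacker@example.com", outbox) is False
    assert outbox == []


def test_matching_email_sends_to_address_on_file():
    accounts, outbox = {"alice": "alice@example.com"}, []
    assert send_reset(accounts, "alice", "alice@example.com", outbox) is True
    assert outbox == ["alice@example.com"]
```

The first test is the one that matters: a request naming an address that is not on file must produce no outgoing mail.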
‘Hey Claude, write me a PR statement’
How very Wernher von Braun of them.
This-is-fine.jpg
"Marge, there's the truth..." (frowns and shakes head negatively) "...and there's THE TRUTH!" (smiles brightly and nods enthusiastically)

-Lionel Hutz, Simpsons, Season 9 - "Realty Bites"

Unfortunately this statement will, in spite of what you identified correctly, likely do its job and divert attention from the fundamental issues we are facing with a technology that has already spread further than anyone can control. From enterprise to layman. The whole world of computing was never built expecting software capabilities like this to exist.

I am not saying it's like a nuclear bomb. Rather, it's like the first guns brought into fights that the others were perfectly prepared to fight with swords, not yet knowing about this fascinating invention called a gun. Sounds interesting. Let me inspect it. Oh wow, that's interesting technology. What happens if i push that thing back? Will it re... oops...

Thank god that we have honourable people like Altman, Zuckerberg, Musk. Imagine how bad all this would turn within the next few years, if major decisions were made by self-serving, delusional, greedy egomaniacs...

Of course, for now let's first hope those wars and all the tension in societies all over the world, in war or peace, won't explode into something really, really bad. Looking at history, i fear we see how social tension on a large scale over time... not saying it's not obvious to almost everyone. So well, let's just keep hoping. Maybe throwing blackbox AI tech into the mix would surprise and change the course of history. Actually, while i am thinking about it, i think i just changed my opinion into the opposite position, lol. Honestly, if it's 50/50 that this will lead to the worst possible outcome intensified, it's still better than just checking boxes following the "humans slowly stumbling into near-extinction experiences 101" handbook. Because just according to that, we're lucky if we're off by 10 years. There must be a big change in humanity and how the world is currently constructed for all this to lead to anything other than what we should expect from history. If we kept all nations busy with huge technological issues that made all of their personal lives so complicated, that turned every elitist's luxury into a burden, and kept them busy defending what they own while they fail to realize that normal life has changed so much that they are now the ones frozen in place, they would have no time for conflict.

This sounds totally logical. In any other scenario, it would be pretty insane what we are all doing and entertaining (including me, top10 hypocrite).

I fear it's too late to turn the ship, yet we can still jump ship.

---

Especially because now thinking about the thoughts that just went through my head, maybe (technological) disruptions are actually disrupting. But not a status quo of an economic model.

But a pretty clear loop of human nature and "humans in societies". And the more often we disrupt this loop, the more time we get before it's ready to start over again.

And now we have something that has the potential to change all fundamentals so much that all the major conditions inside this loop's iteration become meaningless. The environment changes so much, the state of the checkboxes gets emptied. Cache invalidated. Indices are gone.

Oh, i know how dumb this sounds. I am not even trying to claim anything. I didn't even think about it before, this is just a note of the words that i typed, almost on autopilot. No idea if i believe a part of this could be real. But even so, just as a mere fictional story, it already entertained me.