Hacker News new | ask | show | jobs
by tptacek 171 days ago
These are words, but I don't understand how they respond to the preceding comment, which observes that binary legibility is an operational requirement for real security given that almost nobody uses reproducible builds. In reality, people meaningfully depend on work done at the binary level to ensure lack of backdoors, not on work done at the source level.

The preceding comment is saying that source security is insufficient, not that transparency is irrelevant.

1 comments

Source availability is what makes a chain of trust possible that simply isn't meaningfully possible with closed source software, even with dynamic analysis, decompilation, reverse engineering, runtime network analysis with TLS decryption, etc.

Both you and the preceding commenter are correct that just running binaries signed and distributed by Alphabet (Google) and/or Apple presents room for additional risks beyond those observable in the source code, but the solution to this problem isn't to say "and therefore source availability doesn't matter at all for anyone", it's to choose to build from source or to obtain and install APKs built and signed by the developers, such as via Accrescent or Obtanium (pulls directly from github, gitlab, etc releases).

There's a known-good path. Most people do not take the known-good path. Their choice to do so does not invalidate or eliminate the desirable properties of known-good path (verifiability, trustworthiness).

I genuinely do not understand the argument you and the other user are making. It reads to me like an argument that goes "Yes, there's a known, accurate, and publicly documented recipe to produce a cure for cancer, but it requires prerequisite knowledge to understand that most people lack, and it's burdensome to follow the recipe, so most people just buy their vials from the untrustworthy CancerCureCorporation, who has the ability to give customers a modified formula that keeps them sick rather than giving them the actual cure, and almost nobody makes the cure themselves without going through this untrustworthy but ultimately optional intermediary, so the public documentation of the cure doesn't matter at all, and there's no discernable difference between having the cure recipe and not having the cure recipe."

No, you're completely off the rails from the first sentence. It is absolutely possible --- in some ways more possible[†] --- to make a chain of trust without source availability. Your premise is that "reverse engineering" is somehow incomplete or lossy with respect to uncovering software behavior, and that simply isn't true.

[†] Source is always good to have, but it's insufficient.

Never once anywhere in this thread have I claimed that source code alone is sufficient by itself to establish a chain of trust, merely that it is a necessary prerequisite to establish a chain of trust.

That said, you seem to be refuting even that idea. While your reputation precedes you, and while I haven't been in the field quite as long as you, I do have a few dozen CVEs, I've written surreptitious side channel backdoors and broken production cryptographic schemes in closed-source software doing binary analysis as part of a red team alongside former NCC folks. I don't know a single one of them who would say that lacking access to source code increases your ability to establish a chain of trust.

Can you please explain how lacking access to source code, being ONLY able to perform dynamic analysis, rather than dynamic analysis AND source code analysis, can ever possibly lead to an increase in the maximum possible confidence in the behavior of a given binary? That sounds like a completely absurd claim to me.

I see what's happening. You're working under the misapprehension that static analysis is only possible with source code. That's not true. In fact: a great deal of real-world vulnerability research is performed statically in a binary setting.

There's a lot of background material I'd have to bring in to attempt to bring you up to speed here, but my favorite simple citation here is just: Google [binary lifter].

This assumption about me is not accurate at all, I've done static analysis professionally on CIL, on compiled bytecode, and on source code. Instead of being condescending and patronizing to someone you don't know that you've made factually inaccurate assumptions about, can you please explain how having just a binary and no access to source code gives you more information about, greater confidence in, and a stronger basis for trust in the behavior of a binary than having access to the binary AND the source code used to build it?
I have no idea who you are and can only work from what you write here, and with this comment, what you've written no longer makes sense. The binary (or the lifted IR form of the binary or the control flow graph of the binary or whatever form you're evaluating) is the source of truth about what a program actually does, not the source code.

The source code is just a set of hints about what the binary does. You don't need the hints to discern what a binary is doing.

> but the solution to this problem isn't to say "and therefore source availability doesn't matter at all for anyone"

Thankfully, I didn’t say that.

Great, then it sounds like we agree: your original equivalence of Signal and WhatsApp was misguided, since one offers a verifiable chain of trust that starts with source availability and the other doesn't, to say nothing of the lengthy history of untrustworthiness and extensive, deliberate privacy violations of the company that owns and maintains WhatsApp, right?
No, we don’t agree. There are things that source code is good for, but validating the presence or absence of illicit data stealing code in apps delivered to consumers is not one of those things. For that, source code can show you obvious malfeasance, but since it’s not enough to rule out obvious malfeasance, you’re stuck going to analysis of the compiled app in both cases.

The population of users who have a verifiable path from an open source repo to an app on their device is a rounding error in the set of humans using messaging apps.

I think we've both made our positions clear. From my perspective, you're continuing to heavily cite user statistics that are irrelevant to the properties of verifiability or trustworthiness of the applications themselves, the goalposts I am discussing keep being moved, and there is a repeated pattern of neglect to address the points I'm raising. Readers can judge for themselves. Curious readers should also read about the history of Meta's Onavo VPN software and resulting lawsuits and settlements in evaluating the credibility of Meta's privacy marketing.
Just to be crystal clear about the goalposts: I said at the start of this chain that if somebody wants secure messaging, they should use Signal or WhatsApp.

You raised concerns about lack of source availability, and I’ve been consistent in my replies that source availability is not the way that somebody wants secure messaging is going to know they’re getting it. They’re going to get it because they’re using a popular platform with robust primitives, whose compiled/distributed apps receive constant scrutiny from security researchers.

Signal and WhatsApp are that. Concerns about Meta’s other work are just noise, in part because analysis of the WhatsApp distributed binaries doesn’t rely on promises from Meta.