The CIA's “Development Tradecraft Do's and Dont's” (schneier.com)
219 points by loop22 3386 days ago
11 comments

"DO NOT have "dirty words" (see dirty word list - TBD) in the binary. Rationale: Dirty words, such as hacker terms, may cause unwarranted scrutiny of the binary file in question."

I'm not even kidding. I've found so much bad code over the years through searching for profanity. It's actually a bit embarrassing to be honest.

"unwarranted scrutiny" is a funny term here.
I want the CIA's dirty word list on a t-shirt!
I hope you'll enjoy "unwarranted scrutiny" that follows.
From the NOD crypto document, the following advice is given:

> (S//NF) Tools should perform key exchange exactly once per connection. Many algorithms have weaknesses during key exchange and the volume of data expected during a given connection does not meet the threshold where a re-key is required. xiii To reiterate, re-keying is not recommended.

With the footnote:

> xiii (S//NF) The exact nature of which algorithms are weak at this stage is highly classified. In the absence of those facts this guidance is still relevant; the utility inherent in re-keying derives from minimizing key exposure when performing bulk encryption of large amounts of data. Even the most data-intensive NOD operations involve several fewer orders of magnitude of data per session key. Consequently, re-keying introduces unnecessary complexity (and therefore opportunities for bugs or other unexpected behavior) without delivering value in return.

Which key exchange algorithms have key exchange vulnerabilities when keys are frequently exchanged, I wonder.

It would be really interesting if that kind of technical detail leaked. With all of these leaks, I've never heard of a technical / mathematical document discussing the actual encryption algorithms leaking.

Makes me wonder if either

- This whole leak is a "fake" or at least no big deal for the TLAs (because there is not much surprising inside)

and / or:

- Most encryption is broken in a fundamental way. I would never be able to find out, because the four or five influential security experts I know and trust, and who tell me it is safe, are bought by the TLAs. Who knows, maybe all PGP does effectively is to mark my mails as really really interesting. "They" can trivially decrypt them, and then they employ thousands of analysts who just do parallel construction on everything they find out (so they don't leak their exploit).

A boring but more likely explanation is that the juicier the information, the more heavily protected it is. Fewer people know, writing it down is more strongly discouraged, more paranoid procedures are used to prevent leaks, etc.
DH?
Probably stands for Diffie–Hellman, a well-known key exchange algorithm.

https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exc...
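
For anyone who has not seen it, here is a rough sketch of a Diffie-Hellman exchange in Python. The prime, generator and private values are textbook-sized numbers picked purely for illustration (they are assumptions, not anything from the leaked documents), and a real exchange would use vetted parameters from an established library.

```python
# Toy Diffie-Hellman key exchange with tiny numbers (NOT secure).
# All values below are illustrative assumptions.
p, g = 23, 5          # public prime modulus and generator

a = 6                 # Alice's private value, never transmitted
b = 15                # Bob's private value, never transmitted

A = pow(g, a, p)      # Alice sends A over the wire
B = pow(g, b, p)      # Bob sends B over the wire

# Each side combines its own private value with the other's public value.
shared_alice = pow(B, a, p)
shared_bob = pow(A, b, p)

# Both arrive at the same secret without ever sending it.
assert shared_alice == shared_bob
```

An eavesdropper sees only p, g, A and B. Every exchange repeats this handshake, so fewer exchanges means fewer places for an implementation to slip.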

    DO explicitly remove sensitive data (encryption keys, raw collection data, 
    shellcode, uploaded modules, etc) from memory as soon as the data is no 
    longer needed in plain-text form. DO NOT RELY ON THE OPERATING SYSTEM TO DO 
    THIS UPON TERMINATION OF EXECUTION.

    DO NOT explicitly import/call functions that is not consistent with a tool's 
    overt functionality

    DO NOT perform operations that will cause the target computer to be 
    unresponsive to the user

    DO make all reasonable efforts to minimize binary file size for all binaries 
    that will be uploaded to a remote target

    DO provide a means to completely "uninstall"/"remove" implants, function 
    hooks, injected threads, dropped files, registry keys, services, forked 
    processes, etc whenever possible.

    DO use end-to-end encryption for all network communications. NEVER use 
    networking protocols which break the end-to-end principle with respect to 
    encryption of payloads.

    DO NOT break compliance of an RFC protocol that is being used as a blending 
    layer.

    DO NOT read, write and/or cache data to disk unnecessarily. Be cognizant of 
    3rd party code that may implicitly write/cache data to disk.

    DO NOT use hard-coded filenames or filepaths when writing files to disk. 
    This must be configurable at deployment time by the operator.
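
The first item above (explicitly clearing sensitive data) can be sketched in Python roughly like this. The helper name `wipe` and the 32-byte key are hypothetical. This only clears the one mutable buffer: the interpreter may still hold other copies (for example the immutable `bytes` object returned by `secrets.token_bytes`), so treat it as an illustration of the idea rather than a guarantee.

```python
import secrets


def wipe(buf: bytearray) -> None:
    """Overwrite a mutable buffer in place with zero bytes."""
    for i in range(len(buf)):
        buf[i] = 0


# Hypothetical session key held in a mutable buffer so it can be cleared.
key = bytearray(secrets.token_bytes(32))
try:
    key_length = len(key)   # stand-in for real work that uses the key
finally:
    wipe(key)               # clear it explicitly instead of waiting for process exit
```

In C you would reach for something like `explicit_bzero` or `SecureZeroMemory` so the compiler cannot optimize the wipe away.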
It's remarkable how many of these guidelines are just good software development guidelines and have nothing to do with malware, necessarily.
It's funny when you think about it. Many of these guidelines are there to prevent the program from being detected; in other words, it's to give the impression that it's not there.

You would think that good software which performs background tasks should be just as discreet, and follow similar guidelines: do your task, don't bother the user too much, definitely try not to block UI when the user is doing another thing. But a lot of software definitely does not. The classical example is AV software, but a lot of mobile applications are very guilty too. There is so much software that fails so hard at being efficient at background tasks...

It used to be even more prevalent about 10-15 years ago, but I suspect that has a lot more to do with increased computational capabilities than with higher-quality software.

It's also interesting how many of these guidelines have malware-related rationales, though.
I like this one,

> DO NOT solely rely on SSL/TLS to secure data in transit. Rationale: Numerous man-in-middle attack vectors and publicly disclosed flaws in the protocol.

> man-in-middle attack vectors

This is not too surprising.

In the CIA's use case (data exfiltration), this rationale is likely due to target organizations using a firewall which utilizes TLS interception to capture and inspect data, requiring the computer or mobile device to have a custom trusted root CA added in order to properly send traffic through their firewall box.

So the issue would be that TLS is going to be useless for protecting any data that is being exfiltrated, as the firewall box would obviously perform its DLP duties and block their exfiltration attempt. Custom additional cryptography or added obfuscation makes sense in this case because they only need to get past the automated inspection, not an actual human. The data has already been sent to the LP by the time anyone has a chance to crack the additional layer of crypto/obfuscation and see the data.

But "don't roll your own crypto..."

If your attacker is breaking your TLS implementation, surely the next step is to break your shitty custom crypto protocol wrapped inside of it.

See the bottom of the page where he talks about the link to their internal (previously top secret) CIA crypto standards, which is probably one of the few cryptos that is actually any good (most of it was done with the NSA and just talks about which protocols are secure).
So probably the only ones who can break the CIA's crypto are the NSA.
Pretty sure that's the plot of Sneakers.
No, the plot of Sneakers is that, at the end, the NSA thinks they're the only ones who can break the CIA's encryption, but really the only one who can do it is Robert Redford!

Postscript: Redford of course then goes ahead and basically announces it to the NSA by stealing all the Republican party's money (and someone else - can't remember) and donating it to causes like Greenpeace and Amnesty International.

You can supplement TLS without rolling your own crypto. Sending a GPG message, for example.
Haha it's like CIA knew this would get leaked, and wrote this to troll HN in advance...
probably true
>DO NOT use US-centric timestamp formats such as MM-DD-YYYY. YYYYMMDD is generally preferred.

It will be funny to use this argument next time I see a discussion on Imperial vs SI units and formats :)

I always preferred YYYY-MM-DD. Sorting filenames etc. makes more sense this way.
doing everything with dates makes more sense this way! :)
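
A quick way to see the sorting point, as a small Python sketch. The dates and the `.log` suffix are made up for illustration.

```python
from datetime import date

# Three arbitrary example dates, deliberately out of order.
days = [date(2017, 3, 9), date(2016, 12, 25), date(2017, 1, 31)]

iso_names = [d.strftime("%Y-%m-%d") + ".log" for d in days]
us_names = [d.strftime("%m-%d-%Y") + ".log" for d in days]

# Ground truth: sort the actual dates, then format them.
chronological = [d.strftime("%Y-%m-%d") + ".log" for d in sorted(days)]

# A plain string sort, which is what a file listing gives you.
iso_sorted = sorted(iso_names)
us_sorted = sorted(us_names)

print(iso_sorted == chronological)  # year-first names sort in date order
print(us_sorted)                    # month-first names put December 2016 last
```

With the year first, a dumb string sort is already a date sort. With the month first, the 2016 file lands after both 2017 files.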
How easy is it for automated tools to spot encryption routines in a binary? If my notepad replacement had calls to encryption algorithms in it I'd be a bit surprised...
This is equivalent to asking, "How good are virus scanners?"

Many crypto routines have identifiable constants but there are myriad ways to obfuscate code too so I'm not sure there's an answer other than: It depends on how hard they're trying to avoid you and how hard you're trying to find them.
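
To make "identifiable constants" concrete, here is a naive sketch of the kind of signature scan I mean. The signature table and the function name `find_crypto_constants` are my own illustration, not any real scanner's rule set. It only catches constants stored verbatim, so any obfuscation defeats it.

```python
import struct

# First 16 bytes of the AES S-box and the first SHA-256 initial hash word.
AES_SBOX_PREFIX = bytes.fromhex("637c777bf26b6fc53001672bfed7ab76")
SHA256_H0 = 0x6A09E667

# Hypothetical signature table; a real tool would carry far more entries.
SIGNATURES = {
    "AES S-box (first row)": AES_SBOX_PREFIX,
    "SHA-256 H0 (big-endian)": struct.pack(">I", SHA256_H0),
    "SHA-256 H0 (little-endian)": struct.pack("<I", SHA256_H0),
}


def find_crypto_constants(blob: bytes) -> dict:
    """Return {signature name: first offset} for each signature found in blob."""
    hits = {}
    for name, pattern in SIGNATURES.items():
        offset = blob.find(pattern)
        if offset != -1:
            hits[name] = offset
    return hits


# Fake "binary": 8 bytes of padding, the S-box prefix, then some filler.
sample = b"\x00" * 8 + AES_SBOX_PREFIX + b"\x90" * 4
hits = find_crypto_constants(sample)
print(hits)
```

A 4-byte pattern like the SHA-256 word will also turn up by chance in large files, so short signatures need corroboration before you call it a hit.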

The presence of AES instructions would be easy just from static analysis. Also loops of shifts and xors that are accumulating, etc. You could easily map the structure of various encryption algorithms into a static analysis tool. (I feel like the dynamic analysis proposed below about watching outputs being random is clever, but much more difficult).
> Also loops of shifts and xors that are accumulating, etc.

I think the problem is then to avoid punishing binaries for, e.g., using a hashmap :)

Statistically, any routine which results in randomized output from non-random input (and especially if it is repeatably deterministic) would be a good candidate.
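
A minimal sketch of that statistical test, assuming byte-level Shannon entropy as the measure of "randomized" (my choice of metric, with `os.urandom` standing in for cipher output). Compressed data scores just as high, so this flags candidates rather than proving encryption.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input, at most 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Non-random input: repetitive English text.
text = b"the quick brown fox jumps over the lazy dog " * 200

# Stand-in for a routine's output that looks random (assumption: urandom ~ ciphertext).
random_like = os.urandom(len(text))

print(round(shannon_entropy(text), 2))         # well below 8 bits per byte
print(round(shannon_entropy(random_like), 2))  # close to 8 bits per byte
```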
Good symbolic analysis can identify simpler (e.g. RC4) encryption algorithms, but the run time is currently too expensive to run on every binary.
COM will use crypto routines as a matter of course - so practically every windows program will have these instructions.
When will the WIFOM start?

"This malware has consistent US EST timestamps, so it's probably not the CIA since the CIA knows to change timestamps. Probably some Iran masquerading as a US malware."

> Asymmetric cryptography must not be used directly for bulk encryption. It must only be used to negotiate or exchange secrets used for symmetric encryption and for digital signatures and their verification.

Which asymmetric encryption would that be?

Asymmetric encryption is generally synonymous with public key cryptography[1], as there are few processes which are asymmetric and useful.

In terms of definitions, with an asymmetric system, knowing the key with which something was encoded doesn't allow you to decrypt it.

With symmetric cryptography, if you know the key, you can decode whatever was encrypted.

[1] https://en.wikipedia.org/wiki/Public-key_cryptography
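
To illustrate the asymmetric half of that definition, here is textbook RSA with tiny numbers. The primes, exponent and message are the usual classroom-sized values, chosen by me for illustration. There is no padding and the key is trivially factorable, so this shows the shape of the idea only.

```python
# Textbook RSA with toy parameters (NOT secure; illustration only).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)        # kept secret along with p and q

e = 17                         # public (encryption) exponent
d = pow(e, -1, phi)            # private (decryption) exponent, Python 3.8+

message = 65
ciphertext = pow(message, e, n)

# The private exponent recovers the message.
recovered = pow(ciphertext, d, n)

# Reapplying the public exponent that did the encrypting does not.
wrong = pow(ciphertext, e, n)

print(recovered == message, wrong == message)
```

Each of these operations is a modular exponentiation on a number smaller than n, which is one practical reason the quoted guidance reserves asymmetric crypto for exchanging secrets and signing rather than bulk data.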

I don't think the question was what it is. They probably meant "which" rather than "what".
Yes, that's what I meant, thanks!

I edited the parent for clarity.

As an aside, I really like that they include the rationale with each item.
tl;dr don't be sloppy
So much information is lost from generation to generation because of how "obvious" it was.
Isn't it great? Now every low level malware author in the world will start to follow these tips, and the US has once again shot themselves massively in the foot, and fucked over the whole world in the process...

Thanks CIA

These tips are not very new or novel. Even low-level malware already follows some of them, although they likely do not care as much about attribution for example.
A lot of malware authors are not highly trained. Having a good resource helps a lot.