by ekr 4006 days ago
Does this mean that it will no longer be possible to do things like return-oriented programming?

Edit: indeed, it's quite clear from the mentioned article (http://dslab.epfl.ch/pubs/cpi.pdf). So this provides great exploit protection.

5 comments

According to https://github.com/Microsoft/clang/blob/master/docs/SafeStac... SafeStack alone doesn't fully protect against ROP:

> With SafeStack alone, an attacker can overwrite a function pointer on the heap or the unsafe stack and cause a program to call arbitrary location, which in turn might enable stack pivoting and return-oriented programming.

And you need additional features (such as CPI from the paper you and the commit message link to) for full protection.

ROP is an exploit technique. Stack corruption is a class of vulnerabilities. There are other memory corruption techniques that can be exploited with ROP.
It is NOT clear from that article. ROP can occur on the heap and CPI is bypassable (although safestack is a great contribution, and frankly it's about time). There is great literature on this issue already (see many forms of the Control Flow Integrity defense), and many solutions that exist come close to full security without providing it (CPI only protects code pointers, and side-channel attacks that work through data pointers can still achieve arbitrary memory reads and writes). In particular, use-after-free vulnerabilities still exist. Without full memory safety, exploits of these types will always be possible.
What is return-oriented programming? Recursion?
Return-oriented programming is an exploit technique that relies on reusing snippets of existing code (called gadgets) in a program in order to carry out attacker code. Each gadget generally ends with a return instruction, which causes it to read the address of the next gadget off the stack and jump to it. In this way, arbitrarily complex code can be built up by chaining together sequences of gadgets controlled by an initial set of return addresses on the stack.
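
The chaining described above can be sketched as a toy model. This is not real machine code: the gadgets are plain Python functions, and every name and address here (GADGETS, run_chain, 0x1000 and so on) is hypothetical, chosen only to illustrate how return addresses on the stack drive execution.

```python
# Toy model of a ROP chain. Gadgets are Python stand-ins for short
# instruction sequences that end in "ret". All names and addresses
# are hypothetical and exist only for illustration.

def pop_r0(state):
    # Models "pop r0; ret": take the next stack word as data for r0.
    state["r0"] = state["stack"].pop()

def pop_r1(state):
    # Models "pop r1; ret".
    state["r1"] = state["stack"].pop()

def add_r0_r1(state):
    # Models "add r0, r1; ret".
    state["r0"] += state["r1"]

# Pretend these are addresses of gadgets found in already-loaded code.
GADGETS = {0x1000: pop_r0, 0x2000: pop_r1, 0x3000: add_r0_r1}

def run_chain(stack_words):
    # Reverse the list so that list.pop() returns the words in order,
    # the way successive pops walk up a real stack.
    state = {"r0": 0, "r1": 0, "stack": list(reversed(stack_words))}
    while state["stack"]:
        address = state["stack"].pop()  # "ret": pop the next address...
        GADGETS[address](state)         # ...and jump to that gadget
    return state

# Attacker-controlled stack contents: gadget addresses mixed with data.
chain = [0x1000, 40, 0x2000, 2, 0x3000]
result = run_chain(chain)
print(result["r0"])  # 42
```

Nothing new is injected as code here: the attacker only supplies the list of addresses and data words, and the computation is assembled from pieces that were already present.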

It's used as a way to defeat DEP (Data Execution Prevention); with DEP the attacker can no longer write code into memory and then execute it, so instead they just set up the stack cleverly so they can carry out a return-oriented payload (most commonly, these payloads just disable DEP and then move on to a more traditional second stage).

More info:

The paper that introduced the name ROP (though some would argue that the techniques existed before this paper): https://cseweb.ucsd.edu/~hovav/dist/geometry.pdf

Wikipedia: https://en.wikipedia.org/wiki/Return-oriented_programming

If you're interested in learning about exploit writing, you might want to check this page : https://www.corelan.be/index.php/articles/ .
It looks like it's an exploit technique where the stack is modified to set up malicious calls to functions: https://en.wikipedia.org/wiki/Return-oriented_programming
That's not really the essence of ROP because other attack techniques often also need to manipulate the stack. The key novel idea in ROP is to use data in unintended ways. This is based on the insight that memory often contains short sequences of bytes (e.g. in a .jpg image) that can be interpreted as machine instructions. For example, an mp3 file might contain the sequence 99 19933, 16 which translates to

    increment register 16
    return
in the ambient machine language. Call that "dual use data". ROP searches the memory for sufficient "dual use data" and then builds an ad hoc compiler with "dual use data" as target language. Then the attack software compiles to "dual use data" and then runs the compiled code.

Of course one may ask: can we always find enough "dual use data" to build a Turing-complete set of instructions as a compilation target? It turns out that, with high probability, we can.

ROP gadgets are usually harvested from libraries loaded into the program, not MP3 files.

The key novel idea in ROP is to use instruction sequences in unintended ways. ROP is a refinement of ret2libc, improving on it by returning into arbitrary locations in functions rather than their entry points. That, and chaining gadgets together with returns. Hence the name.

It is true that "ROP gadgets are usually harvested from libraries loaded into the program, not MP3 files", but that's not because there's something intrinsically wrong with mp3s as a source of gadgets; it's just that mp3s are often not executable. I have emphasised mp3s and jpgs precisely to highlight what's novel about ROP, namely that any data can be used as machine language.
Yeah this just doesn't seem like an illuminating example in practice. In practice, gadgets for ROP chains are harvested from program text. It's for that reason that so much effort is expended in many exploits on memory leaks that reveal the locations of libraries loaded into memory.
None of that should be marked executable, though. The real risk in ROP is using little bits of legitimate functions to bypass DEP.
That is true. I should have been clearer about this, but you don't use the "legitimate function"'s intended functionality; you only use the fact that it can be executed (and the byte string that is its code).

I used mp3s and jpgs as extreme examples of data that was never intended to be executed, but still can be interpreted as code. In ROP, you don't care about the intended meaning of the bytes that make up "legitimate functions" (or any other data you may use) for it's unlikely to have the sought functionality. Instead you search for "dual use code" too, and piece together the functionality you need.

Well, but you missed that data cannot be executed.

Unless you store your MP3s and jpegs in .text, the memory pages all that stuff is in are marked not executable and will only cause a crash if you jump to it. Regardless of whether the bytes make useful instructions.

You may however be able to get different legitimate instructions than originally intended by jumping to an address in the middle of a multi-byte instruction that happens to decode into a useful series of operations. It follows that there are more usable "returns" in a body of code than just those written in the original source.
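
A rough sketch of that last point, assuming x86 encodings: the five bytes B8 58 C3 00 00 are intended as a single "mov eax, 0xC358", but starting one byte later they decode as "pop eax; ret". The helper name candidate_gadgets and the max_len parameter are made up for this example. It only scans for byte windows ending in the ret opcode 0xC3; a real tool would run a disassembler over each window to discard the ones that do not decode cleanly.

```python
RET = 0xC3  # x86 near-return opcode

def candidate_gadgets(code: bytes, max_len: int = 3):
    """Return (offset, window) for every byte window that ends in a RET byte.

    This is only a byte scan (hypothetical helper, not a disassembler):
    some windows it reports will not decode into valid instructions.
    """
    found = []
    for end, byte in enumerate(code):
        if byte != RET:
            continue
        # Try every start position up to max_len bytes before the RET.
        for start in range(max(0, end - max_len + 1), end + 1):
            found.append((start, code[start:end + 1]))
    return found

# Intended decoding: B8 58 C3 00 00  ->  mov eax, 0x0000C358
# From offset 1:     58 C3           ->  pop eax ; ret
code = bytes([0xB8, 0x58, 0xC3, 0x00, 0x00])
gadgets = candidate_gadgets(code)
for offset, window in gadgets:
    print(offset, window.hex())
```

The original code contains no return instruction at all, yet the scan finds one inside the immediate operand, which is why there are more usable returns than the source suggests.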
There are multiple usable "returns" in functions because "return" is just "pop" then "jump", so a block that has the effect of manipulating the stack and then transferring control can often be used to the same effect as a return.
Accidentally downvoted you. My apologies.
If only there was a way to space those two tiny arrows further apart. Let's hope science comes up with a way some day..
HN is a mind hack to exercise your jquery skills.
The best explanation I've heard for HN's lack of some features is that any HN power user is expected to find a solution for these problems. That doesn't necessarily mean making a solution, it may just entail searching for a third party solution, or discovering a bit of specific HN behavior that isn't well documented.

I imagine any user with over 1000 karma probably has at least a few tricks, and could mention off the top of their head a few things that help quite a bit when using the site. At the same time, I think it's best they aren't mentioned too often, as they offer a sort of natural advantage to those that have been around a while and contributed to discussions (it's the nature of many of these helpers that they actually help discussion), and putting those tools into the hands of a novice user may actually be detrimental to the site as a whole.

This sounds a lot like rationalising poor UI.
Upvoted for you.