10/20/2008

We'Re Moving!

That's right, after some successful feedback from some readers (it meant a lot, thank you guys) and in order to provide a richer knowledge sharing experience I decided to upgrade MoreNops to a full-blown domain known from now on as www.morenops.com.

As I already mention in the new site, I've moved all the content already available here to the new site to avoid cross-site referencing issues but I'll leave this site online and with all the content just in case someone drops by following some cached content from sites like Google and such; anyway be advised that this site will no longer be updated.

I encourage you to update your bookmark (If you where one of the few that had one) and grant a warm welcome to the new MoreNops, now gone 2.0!

10/19/2008

Advanced Buffer Overflows: Defeating #9

After the fun-and-go! session in the previous post it didn't take too long to create a exploit for this kind of situation. Today I present the solution for the 9th level of the Advanced Buffer Overflows challenge which deals with free() and a dlmalloc implementation. First of all, lets take a look at the C source:
int main(int argv,char **argc) {
char *pbuf1=(char*)malloc(256);
char *pbuf2=(char*)malloc(256);

gets(pbuf1);
free(pbuf2);
free(pbuf1);
}

In my particular case I changed gets() for strpy() using the call parameters as input vector to ease data input and exploitment, which doesn't alter the bug in any way. This level is textbook example of a Heap Buffer Overflow situation in which we fool the implementation and the unlink() macro to overwrite arbitrary bytes in memory.

As explained in our last post, we need to create a fake chunk header in one of our buffers and fool free() to unlink() it. For this purpose we will overwrite the chunk header of buf2 so that calculations will lead the implementation to our fake chunk header. We will overwrite the prev_size field so that when _int_free() calculates the address of the previous chunk, it gets to our fake chunk header. Instead of making a step-by-step debugging session, I will explain the key instructions where data flow gets manipulated. Remember you can grab the disassembly here.

First let's begin by showing how the call will be so that we can identify the data we're analizing. Note in this call that the sub-sequence in blue is where our overflow begins and this 8 bytes will eventually fill the chunk header for buf2:

[infi@localhost insecure]$ ./abo9 `python -c 'print "\xeb\x0e"+"A"*14+"\xeb\x1a\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68"+"A"*188+"\xff\xff\xff\xff"+"A"*8+"\xf8\xff\xff\xff"+"\xf0\xff\xff\xff"+"\xff\xff\xff\xff"*2+"\x9c\x95\x04\x08"+"\x08\x96\x04\x08"'`
sh-2.05b$


In 0x4207446b the address of \xf8\xff\xff\xff gets loaded into ecx and right after that \xf0\xff\xff\xff is copied into eax. At address 0x42074477 eax is copied into esi. One of the critical points comes in 0x420744ad because esi=0xfffffff0 and ecx=0x08049708 get added and the result is stored in eax, which now contains 0x08049718; the theoretical start address of our fake chunk where the theoretical fd and bk pointers are (in red in the call sequence).

Anyway all of that is there just to success some checks, the real magic begins now. At 0x420744c8 edi-8 (the address of buf2) will be stored in eax, thus saving the prev_size field of buf2's chunk header in it. In the next instruction and remembering that ecx holds the address of the prev_size of buf2, ecx = ecx - eax is executed. Now remember that we manipulated the prev_size of buf2 to read 0xfffffff8 (-8), so this will effectively makes ecx point to an address inside buf2, which we can manipulate due to the overflow. ecx now points into 0x08049710, the chunk header of our fake chunk. Since the first 8 bytes of this "unused" buffer are fd and bk pointers our job is kindly completed by unlink() which using offsets 0x8 and 0xc will write 0x08049608 (address of our shellcode, stored in buf1 itself) in 0x0804959c+12=0x080495a8 (address of free@GOT).

That's basically it but a couple of notes here. First of all, shellcode+8 will get clobbered by 0x0804959c , thats how unlink() works. To make the shellcode work properly, we place a unconditional jump in the beginning and jump over the clobbered area, thats why we include "\xeb\x0e"+"A"*14 in the beginning of the shellcode (eb is the opcode for "jmp" and 0x0e=14 is the offset). Also, we need to make sure we overwrite the GOT entry for free() because right after we trick free() into doing this, it gets called once again over buf1, and after manipulating headers this way, the program crashes with a segment violation signal.

I hope this little walktrough enlightened somebody as just did with me. For any suggestion, question or whatever don't hesitate to drop a comment or write an email. See you next time and keep adding NOPs!

10/17/2008

Fandango on free()

I apologize for the delay in keeping this blog updated. As I was working in the solution for the 9th level of the Advanced Buffer Overflows challenge I had some serious troubles with the different implementations of Doug Lea's Malloc (dlmalloc from now on) management system which I thought would do for a good article. Althought the solution for abo9 is in its final stages, I'll elaborate on the heap implementation this time.

Lots of good articles have been already written in this subject (Which I'll link at the end of this post for anyone that's interested) but with the versions I was using in my VMs as lab I had more than one headache(In fact, vulnerable versions of glibc seem to be already patched in SuSE versions 9.X. In my case switching to Red Hat 9.0 did the job). None of the popular papers seemed to work in my case, hence, I decided to dig myself in the inner workings of this dynamic memory allocation system. Instead of explaining a C implementation I went for a Reverse Code Engineering approach which turned as follows. In this post I won't explain the inner workings of dlmalloc, that's already been nicely documented in other papers such as Vudo - An object superstitiously believed to embody magical powers by Michel "MaXX" Kaempf, hence I'll avoid any reference to that.

The function in question is free() which takes care of freeing a used chunk once we have finished our work with it. free() is actually a wrapper around _int_free() which does the real job. _int_free() makes some checks and then makes use of the unlink() macro to unlink two adjacent chunks and join them to create a larger one. You can get the interesting part of the disassembly of _int_free() here.

At address 0x42074464 it loads the pointer to the chunk we want to free into the edi register. In 0x4207446b loads the address of the prev_size field of the buf2 buffer into ecx. Right in the next intruction the contents of size field of buf2 are copied into eax. After check int the arena struct we jump to 0x420744a0 at 0x42074481. In the next two instructions it checks wether or not the 2nd least significant bit of the size field is enabled and if it is(which is not our case) it jumps to another region within _int_free().

Now is where the fun part for exploitation begins. At 0x420744ad, we calculate the address of the next chunk using our address + the prev_size field of buf2 (which we can control overflowing the first buf1) and store it in eax, which can now contain the address of the prev_size of a fake chunk we can create. In the next instruction we load the size field of our fake chunk into edx. After some saving into local variables, some 8 byte boundary roundings and a PREV_INUSE checks, we arrive to 0x420744c8. Here we load in eax the prev_size of buf2 and in the next instruction we store in ecx the address of buf2's prev_size minus the prev_size of buf2 itself. This hypothetically would get us the previous chunk's prev_size address but since we manipulated buf2's prev_size we can make it go wherever we want.

So, since ecx now points to the prev_size of our fake chunk, let's talk a little bit about the unlink() macro. This macro basically swaps the fd and bk fields of unused chunks to make the use of free space in memory more efficient and to avoid fragmentation. Being a macro as it is, is compiled inline with the rest of the code and, since doesn't provide any sanity checking it'll swap some addresses we control with information we control allowing us to overwrite 4 bytes wherever we want in memory with the information we want. You can see unlink() in action in address 0x420744cd, 0x420744d2, 0x420744d5 and 0x420744d8. From then on the function continues his normal path as the damage has already been done.

Well that's it. I'll talk about exploitability and it's techniques in the next post. If anybody felt this was quite messy to follow I apologize; I understand it's a difficult matter (at least it was for me :P) but I hope somebody else learned something from this. I also hope I can bring a final solution for abo9 in the upcoming days. As promised I include some references to previous works in the topic here: