Call Stack - buffer overflow vulnerability
Buffer overflows are a kind of call stack vulnerability that occurs when buffers are created on the stack but accessed improperly. Buffer underruns are typically less dangerous, because writing within the current stack frame or beyond the stack pointer only affects local variables of that frame. Buffer overruns, on the other hand, can allow an attacker to overwrite the return address and thus change the program’s control flow.
Buffer overflow
C programmers often allocate buffers on the stack to handle user input. If the input-reading logic is implemented incorrectly and does not check the buffer length, an underflow or overflow can happen. If the user input is long enough, it will overwrite the saved ebp register of the previous stack frame and, most importantly, the return address.
Example
#include <stdio.h>
#include <stdlib.h>
void __attribute__((noinline)) fun(int a, int b, int c) {
char buffer[16] = {0};
int* prevEbp = &a - 2;
int* ret = &a - 1;
printf("Buffer start: %p Buffer start pointer address: %p\n", buffer, &buffer);
printf("Previous EBP: %p Value: %d Value as hex: %x\n", prevEbp, *prevEbp, *prevEbp);
printf("Return address: %p Value: %x\n", ret, *ret);
printf("Buffer end: %p\n", buffer + 16);
fflush(stdout);
}
int main() {
printf("Ptr size: %d bytes\n", (int)sizeof(void*)); /* sizeof yields size_t, so cast for %d */
fun(1, 2, 3);
return 0;
}
We can calculate the position of the return address from the address of a function argument. We only need a pointer to the first argument: with the cdecl calling convention used here, arguments are pushed right to left, so the first argument is pushed last and sits directly above the return address. The saved base pointer and the return address are both 4 bytes, the same size as an int, so subtracting 1 from the int pointer (4 bytes) gives the location of the return address and subtracting 2 (8 bytes) gives the location of the saved base pointer.
We can now compile the program with the -fno-stack-protector flag, which disables the stack-protecting canary that many gcc builds enable by default:
$ gcc main.c -o viewret -fno-stack-protector
By running the program I got:
Ptr size: 4 bytes
Buffer start: 0061FEE8 Buffer start pointer address: 0061FEE8
Previous EBP: 0061FF08 Value: 6422312 Value as hex: 61ff28
Return address: 0061FF0C Value: 401508
Buffer end: 0061FEF8
We can easily alter the return address value now:
void __attribute__((noinline)) fun(int a, int b, int c) {
char buffer[16] = {0};
int* prevEbp = &a - 2;
int* ret = &a - 1;
printf("Buffer start: %p Buffer start pointer address: %p\n", buffer, &buffer);
printf("Previous EBP: %p Value: %d Value as hex: %x\n", prevEbp, *prevEbp, *prevEbp);
printf("Return address: %p Value: %x\n", ret, *ret);
printf("Buffer end: %p\n", buffer + 16);
fflush(stdout);
*ret = 0xcafeefac;
}
Now, if we run the program, we get a segmentation fault because the function tries to return to its caller through an invalid address.
We can examine exactly how this happens by running the program under the GDB debugger:
$ gdb viewret.exe
First we set a breakpoint at the fun function; once the program is started, execution stops there:
(gdb) $ br fun
[New Thread 3388.0x3368]
[New Thread 3388.0x1a2c]
Ptr size: 4 bytes
Breakpoint 1, 0x00401416 in fun ()
With the info frame command we can view the saved registers of the current stack frame:
(gdb) $ info frame
Stack level 0, frame at 0x61ff10:
eip = 0x401416 in fun; saved eip 0x401508
called by frame at 0x61ff30
Arglist at 0x61ff08, args:
Locals at 0x61ff08, Previous frame's sp is 0x61ff10
Saved registers:
ebp at 0x61ff08, eip at 0x61ff0c
The ebp register of the previous stack frame is saved at address 0x61ff08 and the return address at 0x61ff0c. These addresses match those printed by the program above, and the saved eip value 0x401508 matches the return address it printed.
(gdb) $ c
Continuing.
Buffer start: 0061FEE8 Buffer start pointer address: 0061FEE8
Previous EBP: 0061FF08 Value: 6422312 Value as hex: 61ff28
Return address: 0061FF0C Value: 401508
Buffer end: 0061FEF8
Program received signal SIGSEGV, Segmentation fault.
0xcafeefac in ?? ()
After continuing past the breakpoint, we can see the invalid return address that caused the segmentation fault.
Altering variables
Let’s examine another program that reads data from the standard input stream:
#include <stdio.h>
#include <stdlib.h>
int main() {
volatile int zero;
char buffer[64];
zero = 0;
gets(buffer);
if (zero) {
printf("You changed the zero variable to %d (hex: %x)!", zero, zero);
}
else {
puts("Variable not changed.");
}
return 0;
}
The zero variable is marked as volatile to prevent the compiler from optimizing away accesses to it, e.g. by caching its value in one of the general-purpose registers.
By disassembling the program with gdb, we get:
0x00401410 <+0>: push ebp ; save the previous ebp register
0x00401411 <+1>: mov ebp,esp ; initializing ebp of the new stack frame
0x00401413 <+3>: and esp,0xfffffff0 ; memory aligning
0x00401416 <+6>: sub esp,0x60 ; memory allocation on the stack
0x00401419 <+9>: call 0x401980 <__main>
0x0040141e <+14>: mov DWORD PTR [esp+0x5c],0x0 ; assign to zero
; eax = esp + 0x1c
0x00401426 <+22>: lea eax,[esp+0x1c]
; the address calculated with the previous instruction gets saved on the stack
0x0040142a <+26>: mov DWORD PTR [esp],eax
0x0040142d <+29>: call 0x403ae8 <gets> ; gets() call
; load the value from the memory for comparison
0x00401432 <+34>: mov eax,DWORD PTR [esp+0x5c]
0x00401436 <+38>: test eax,eax ; test if it is zero
0x00401438 <+40>: je 0x401458 <main+72>
0x0040143a <+42>: mov edx,DWORD PTR [esp+0x5c]
; commands needed for printf
0x0040143e <+46>: mov eax,DWORD PTR [esp+0x5c]
0x00401442 <+50>: mov DWORD PTR [esp+0x8],edx
0x00401446 <+54>: mov DWORD PTR [esp+0x4],eax
0x0040144a <+58>: mov DWORD PTR [esp],0x405044
0x00401451 <+65>: call 0x403ac8 <printf> ; success print
0x00401456 <+70>: jmp 0x401464 <main+84> ; jump over the else branch
0x00401458 <+72>: mov DWORD PTR [esp],0x405073
0x0040145f <+79>: call 0x403ac0 <puts> ; error print
; return with exit code 0
0x00401464 <+84>: mov eax,0x0
0x00401469 <+89>: leave
0x0040146a <+90>: ret
0x0040146b <+91>: nop
0x0040146c <+92>: xchg ax,ax
0x0040146e <+94>: xchg ax,ax
We can set two breakpoints: one just before and one just after the gets() call.
(gdb) $ br *0x0040142d
(gdb) $ br *0x00401432
With gdb, we can define which commands to run whenever execution stops, for example at these breakpoints:
(gdb) $ define hook-stop
>info registers
>x/24wx $esp
>x/2i $eip
>end
With this hook in place, every stop shows the register state, 24 machine words from the top of the stack, and the next two instructions starting at the instruction pointer:
eax 0x61fedc 6422236
ecx 0x4018f0 4200688
edx 0x50000018 1342177304
ebx 0x2d2000 2957312
esp 0x61fec0 0x61fec0
ebp 0x61ff28 0x61ff28
esi 0x4012d0 4199120
edi 0x4012d0 4199120
eip 0x40142d 0x40142d <main+29>
eflags 0x202 [ IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x53 83
gs 0x2b 43
0x61fec0: 0x0061fedc 0x00000008 0x772c8023 0x772c801a
0x61fed0: 0xb3b6879d 0x004012d0 0x004012d0 0x00000000
0x61fee0: 0x004018f0 0x0061fed0 0x0061ff08 0x0061ffcc
0x61fef0: 0x772cdd70 0xc4e6dd59 0xfffffffe 0x772c801a
0x61ff00: 0x772c810d 0x004018f0 0x0061ff50 0x0040195b
0x61ff10: 0x004018f0 0x00000000 0x002d2000 0x00000000
=> 0x40142d <main+29>: call 0x403ae8 <gets>
0x401432 <main+34>: mov eax,DWORD PTR [esp+0x5c]
Breakpoint 1, 0x0040142d in main ()
Now we can examine how the input affects the stack:
(gdb) $ c
Continuing.
0000000000000000000000000000000000000000000
eax 0x61fedc 6422236
ecx 0x772eb098 1999548568
edx 0xa 10
ebx 0x2d2000 2957312
esp 0x61fec0 0x61fec0
ebp 0x61ff28 0x61ff28
esi 0x4012d0 4199120
edi 0x4012d0 4199120
eip 0x401432 0x401432 <main+34>
eflags 0x216 [ PF AF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x53 83
gs 0x2b 43
0x61fec0: 0x0061fedc 0x00000008 0x772c8023 0x772c801a
0x61fed0: 0xb3b6879d 0x004012d0 0x004012d0 0x30303030
0x61fee0: 0x30303030 0x30303030 0x30303030 0x30303030
0x61fef0: 0x30303030 0x30303030 0x30303030 0x30303030
0x61ff00: 0x30303030 0x00303030 0x0061ff50 0x0040195b
0x61ff10: 0x004018f0 0x00000000 0x002d2000 0x00000000
=> 0x401432 <main+34>: mov eax,DWORD PTR [esp+0x5c]
0x401436 <main+38>: test eax,eax
Breakpoint 2, 0x00401432 in main ()
As we can see, 43 zero characters (ASCII code 0x30) were not enough to reach the zero variable that we want to overwrite. To get past the end of the buffer, we need more than 64 bytes (because the buffer size is 64). For demonstration purposes, we will use the following string as the input:
000011111111111111112222222222222222333333333333333344444444444456
This string contains 66 characters, so the last two characters, 5 and 6, should overwrite the 2 least significant bytes of the variable (x86 is little-endian, so the lowest-addressed bytes are the least significant).
(gdb) $ c
Continuing.
000011111111111111112222222222222222333333333333333344444444444456
eax 0x61fedc 6422236
ecx 0x772eb098 1999548568
edx 0xa 10
ebx 0x3f9000 4165632
esp 0x61fec0 0x61fec0
ebp 0x61ff28 0x61ff28
esi 0x4012d0 4199120
edi 0x4012d0 4199120
eip 0x401432 0x401432 <main+34>
eflags 0x216 [ PF AF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x53 83
gs 0x2b 43
0x61fec0: 0x0061fedc 0x00000008 0x772c8023 0x772c801a
0x61fed0: 0xe53b01b1 0x004012d0 0x004012d0 0x30303030
0x61fee0: 0x31313131 0x31313131 0x31313131 0x31313131
0x61fef0: 0x32323232 0x32323232 0x32323232 0x32323232
0x61ff00: 0x33333333 0x33333333 0x33333333 0x33333333
0x61ff10: 0x34343434 0x34343434 0x34343434 0x00003635
=> 0x401432 <main+34>: mov eax,DWORD PTR [esp+0x5c]
0x401436 <main+38>: test eax,eax
Breakpoint 2, 0x00401432 in main ()
By continuing we see that the variable now contains 0x3635, or 13877 in decimal.
(gdb) $ c
Continuing.
You changed the zero variable to 13877 (hex: 3635)![Inferior 1 (process 4848) exited normally]
Error while running hook_stop:
The program has no registers now.
In order to set the zero variable to a chosen value, we need to represent the number in little-endian form and write the corresponding bytes to the 65th, 66th, 67th and 68th positions of the input (offsets 64 to 67 from the start of the buffer).
Protection against buffer overflows
Compilers and operating systems have some techniques to prevent such stack exploits. In gcc, for example, when stack protection is enabled and a function allocates a buffer on the stack, an additional value, the so-called stack canary, is placed between the local variables and the saved registers. The canary is a random integer chosen when the program starts and copied into the stack frame when the function is called. Before returning, the function makes sure that the canary still has the same value. If the canary has been altered, the program is terminated with a fatal "stack smashing detected" error.
Another technique used by operating systems is to forbid code execution on the stack. When a buffer overflow is exploited, attackers will try to overwrite the return address so that it points at the buffer location where malicious code has been injected. Even if the exact address is not known, it is possible to construct a slide of NO-OP instructions in the stack buffer so that a jump to any address within this slide leads to execution of the malicious code. Exact stack addresses typically differ between program runs, because operating systems push environment variables onto the stack and many of them also randomize the stack location (ASLR).