Call Stack - buffer overflow vulnerability

Buffer overflows are a kind of call stack vulnerability that occurs when buffers are created on the stack but accessed improperly. Buffer underruns are typically less dangerous, because writing within the current stack frame or beyond the stack pointer only affects local variables of that stack frame. Buffer overruns, on the other hand, can allow an attacker to overwrite the return address and thus change the program’s behavior.

Buffer overflow

C programmers often allocate buffers on the stack to handle user input. If the input reading logic is implemented incorrectly and has no buffer length checks, an underflow or overflow can happen. If the user input is long enough, it will overwrite the saved ebp register of the previous stack frame and, most importantly, the return address.
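To make the missing length check concrete, here is a minimal sketch of a bounded line reader built on the standard fgets function. The helper name read_line_bounded and its return codes are assumptions made for this illustration; they do not come from the programs discussed in this article.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the article's programs): reads one line
 * from `in` into `buf` and never writes more than `size` bytes, including
 * the terminating NUL.
 * Returns 0 if the whole line fit, 1 if it had to be truncated,
 * and -1 on end of input or a read error.
 */
int read_line_bounded(FILE *in, char *buf, size_t size) {
    assert(in != NULL && buf != NULL);

    if (size == 0 || fgets(buf, (int)size, in) == NULL) {
        return -1;
    }

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* the whole line fit: strip the newline */
        return 0;
    }

    /* No newline was stored: discard the rest of the line so that it is
     * not returned as the next line, and remember whether any real
     * characters were dropped. */
    int truncated = 0;
    int ch;
    while ((ch = fgetc(in)) != EOF && ch != '\n') {
        truncated = 1;
    }
    return truncated;
}
```

A caller would declare something like char buffer[16] and call read_line_bounded(stdin, buffer, sizeof buffer). However long the input line is, at most 16 bytes are written, so the saved ebp and the return address stay untouched.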

Stack buffer overflow

Example

#include <stdio.h>
#include <stdlib.h>

void __attribute__((noinline)) fun(int a, int b, int c) {

    char buffer[16] = {0};

    int* prevEbp = &a - 2;
    int* ret = &a - 1;

    printf("Buffer start: %p Buffer start pointer address: %p\n", buffer, &buffer);
    printf("Previous EBP: %p Value: %d Value as hex: %x\n", prevEbp, *prevEbp, *prevEbp);
    printf("Return address: %p Value: %x\n", ret, *ret);
    printf("Buffer end: %p\n", buffer + 16);

    fflush(stdout);

}

int main() {
    printf("Ptr size: %d bytes\n", (int)sizeof(void*));
    fun(1, 2, 3);
    return 0;
}

We can calculate the position of the return address from the addresses of the buffer and the function arguments. In this case we only take a pointer to the first argument, because it is pushed onto the stack last and therefore sits directly above the return address. On this 32-bit target the saved base pointer and the return address are 4 bytes each, the same size as an int, so we can subtract 1 (4 bytes) from the pointer to get the return address and 2 (8 bytes) to get the saved base pointer.

We can now compile the program with the -fno-stack-protector flag to disable the stack-protecting canary that gcc adds by default in many configurations:

$ gcc main.c -o viewret -fno-stack-protector

By running the program I got:

Ptr size: 4 bytes
Buffer start: 0061FEE8 Buffer start pointer address: 0061FEE8
Previous EBP: 0061FF08 Value: 6422312 Value as hex: 61ff28
Return address: 0061FF0C Value: 401508
Buffer end: 0061FEF8
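The printed addresses can be checked against this pointer arithmetic. The sketch below redoes the calculation on plain 32-bit integers. The first-argument address 0x0061FF10 is inferred from the output (it is the printed previous EBP address plus 8) and is not printed by the program itself. The function names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one stack slot on the 32-bit x86 target used in this article. */
#define SLOT_SIZE 4u

/* Address of the saved return address, given the address of the first
 * argument: the integer equivalent of `&a - 1` for a 4-byte int. */
uint32_t return_address_slot(uint32_t first_arg_addr) {
    assert(first_arg_addr >= 2u * SLOT_SIZE);
    return first_arg_addr - 1u * SLOT_SIZE;
}

/* Address of the saved base pointer: the equivalent of `&a - 2`. */
uint32_t saved_ebp_slot(uint32_t first_arg_addr) {
    assert(first_arg_addr >= 2u * SLOT_SIZE);
    return first_arg_addr - 2u * SLOT_SIZE;
}

/* Number of bytes between the start of a buffer and a slot located at a
 * higher address on the stack. */
uint32_t bytes_until(uint32_t buffer_start, uint32_t slot_addr) {
    assert(slot_addr >= buffer_start);
    return slot_addr - buffer_start;
}
```

With the values from the output, return_address_slot(0x0061FF10) gives 0x0061FF0C and saved_ebp_slot(0x0061FF10) gives 0x0061FF08. bytes_until(0x0061FEE8, 0x0061FF0C) gives 36, so in this particular build an unchecked write into the buffer would reach the return address after 36 bytes.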

We can easily alter the return address value now:

void __attribute__((noinline)) fun(int a, int b, int c) {

    char buffer[16] = {0};

    int* prevEbp = &a - 2;
    int* ret = &a - 1;

    printf("Buffer start: %p Buffer start pointer address: %p\n", buffer, &buffer);
    printf("Previous EBP: %p Value: %d Value as hex: %x\n", prevEbp, *prevEbp, *prevEbp);
    printf("Return address: %p Value: %x\n", ret, *ret);
    printf("Buffer end: %p\n", buffer + 16);

    fflush(stdout);

    *ret = 0xcafeefac;

}

Now, if we run the program, we will get a segmentation fault because the function will try to return to the caller using an invalid address.

We can examine exactly how it works by running the GDB debugger:

$ gdb viewret.exe

Of course, we need to set a breakpoint at the fun function and run the program until it is hit:

(gdb) $ br fun
[New Thread 3388.0x3368]
[New Thread 3388.0x1a2c]
Ptr size: 4 bytes

Breakpoint 1, 0x00401416 in fun ()

By using the info frame command we can view the saved registers of the current stack frame.

(gdb) $ info frame
Stack level 0, frame at 0x61ff10:
 eip = 0x401416 in fun; saved eip 0x401508
 called by frame at 0x61ff30
 Arglist at 0x61ff08, args:
 Locals at 0x61ff08, Previous frame's sp is 0x61ff10
 Saved registers:
  ebp at 0x61ff08, eip at 0x61ff0c

The ebp register of the previous stack frame is saved at address 0x61ff08 and the return address at 0x61ff0c. These are the same addresses that the program printed above.

(gdb) $ c
Continuing.
Buffer start: 0061FEE8 Buffer start pointer address: 0061FEE8
Previous EBP: 0061FF08 Value: 6422312 Value as hex: 61ff28
Return address: 0061FF0C Value: 401508
Buffer end: 0061FEF8

Program received signal SIGSEGV, Segmentation fault.
0xcafeefac in ?? ()

By continuing past the breakpoint we can see the invalid return address that caused the segmentation fault.

Altering variables

Let’s examine another program that reads data from the standard input stream:

#include <stdio.h>
#include <stdlib.h>

int main() {
	
    volatile int zero;

    char buffer[64];

    zero = 0;

    gets(buffer);

    if (zero) {
        printf("You changed the zero variable to %d (hex: %x)!", zero, zero);
    }
    else {
        puts("Variable not changed.");
    }

    return 0;
}

The zero variable is marked as volatile to prevent the compiler from optimizing its usage, e.g. by caching its value in one of the general-purpose registers.

By disassembling the program with gdb, we get:

0x00401410 <+0>:     push   ebp ; save the previous ebp register
0x00401411 <+1>:     mov    ebp,esp ; initializing ebp of the new stack frame
0x00401413 <+3>:     and    esp,0xfffffff0 ; memory aligning
0x00401416 <+6>:     sub    esp,0x60 ; memory allocation on the stack
0x00401419 <+9>:     call   0x401980 <__main>
0x0040141e <+14>:    mov    DWORD PTR [esp+0x5c],0x0 ; assign to zero
; eax = esp + 0x1c
0x00401426 <+22>:    lea    eax,[esp+0x1c]
; the address calculated with the previous instruction gets saved on the stack
0x0040142a <+26>:    mov    DWORD PTR [esp],eax
0x0040142d <+29>:    call   0x403ae8 <gets> ; gets() call
; load the value from the memory for comparison
0x00401432 <+34>:    mov    eax,DWORD PTR [esp+0x5c]
0x00401436 <+38>:    test   eax,eax ; test if it is zero
0x00401438 <+40>:    je     0x401458 <main+72> 
0x0040143a <+42>:    mov    edx,DWORD PTR [esp+0x5c]
; commands needed for printf
0x0040143e <+46>:    mov    eax,DWORD PTR [esp+0x5c]
0x00401442 <+50>:    mov    DWORD PTR [esp+0x8],edx
0x00401446 <+54>:    mov    DWORD PTR [esp+0x4],eax
0x0040144a <+58>:    mov    DWORD PTR [esp],0x405044
0x00401451 <+65>:    call   0x403ac8 <printf> ; success print
0x00401456 <+70>:    jmp    0x401464 <main+84> ; jump over the else branch
0x00401458 <+72>:    mov    DWORD PTR [esp],0x405073
0x0040145f <+79>:    call   0x403ac0 <puts> ; error print
; return with exit code 0
0x00401464 <+84>:    mov    eax,0x0
0x00401469 <+89>:    leave
0x0040146a <+90>:    ret
0x0040146b <+91>:    nop
0x0040146c <+92>:    xchg   ax,ax
0x0040146e <+94>:    xchg   ax,ax

We can set two breakpoints: one before and one after the gets() call.

(gdb) $ br *0x0040142d
(gdb) $ br *0x00401432

With gdb, we can define what commands to run when these breakpoints are reached:

(gdb) $ define hook-stop
>info registers
>x/24wx $esp
>x/2i $eip
>end

With this hook defined, every stop shows the register state, 24 machine words from the top of the stack, and the next two instructions starting at the instruction pointer:

eax            0x61fedc 6422236
ecx            0x4018f0 4200688
edx            0x50000018       1342177304
ebx            0x2d2000 2957312
esp            0x61fec0 0x61fec0
ebp            0x61ff28 0x61ff28
esi            0x4012d0 4199120
edi            0x4012d0 4199120
eip            0x40142d 0x40142d <main+29>
eflags         0x202    [ IF ]
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x53     83
gs             0x2b     43
0x61fec0:       0x0061fedc      0x00000008      0x772c8023      0x772c801a
0x61fed0:       0xb3b6879d      0x004012d0      0x004012d0      0x00000000
0x61fee0:       0x004018f0      0x0061fed0      0x0061ff08      0x0061ffcc
0x61fef0:       0x772cdd70      0xc4e6dd59      0xfffffffe      0x772c801a
0x61ff00:       0x772c810d      0x004018f0      0x0061ff50      0x0040195b
0x61ff10:       0x004018f0      0x00000000      0x002d2000      0x00000000
=> 0x40142d <main+29>:  call   0x403ae8 <gets>
   0x401432 <main+34>:  mov    eax,DWORD PTR [esp+0x5c]

Breakpoint 1, 0x0040142d in main ()
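The register dump above is consistent with the function prologue in the disassembly. As a worked check, the sketch below replays the three prologue instructions on plain integers and then applies the two stack offsets used by the code. The function names are chosen for this illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Replays `mov ebp,esp; and esp,0xfffffff0; sub esp,0x60` starting from
 * the value that ebp holds after the prologue. */
uint32_t esp_after_prologue(uint32_t ebp) {
    assert(ebp >= 0x70u);        /* keep the subtraction from wrapping */
    uint32_t esp = ebp;          /* after `mov ebp,esp` both are equal */
    esp &= 0xfffffff0u;          /* round down to a 16-byte boundary */
    esp -= 0x60u;                /* reserve 96 bytes on the stack */
    return esp;
}

/* `lea eax,[esp+0x1c]` computes the address passed to gets(). */
uint32_t buffer_address(uint32_t esp) {
    return esp + 0x1cu;
}

/* `mov DWORD PTR [esp+0x5c],0x0` stores to the zero variable. */
uint32_t zero_address(uint32_t esp) {
    return esp + 0x5cu;
}
```

For ebp = 0x61ff28 this yields esp = 0x61fec0, the value shown in the dump. The buffer then starts at 0x61fedc, which matches eax and the first word on the stack, and the zero variable lives at 0x61ff1c, the last word of the 24 displayed. The two addresses are 0x40 = 64 bytes apart, so the variable sits directly after the 64-byte buffer.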

Now we can examine how the input affects the stack:

(gdb) $ c
Continuing.
0000000000000000000000000000000000000000000
eax            0x61fedc 6422236
ecx            0x772eb098       1999548568
edx            0xa      10
ebx            0x2d2000 2957312
esp            0x61fec0 0x61fec0
ebp            0x61ff28 0x61ff28
esi            0x4012d0 4199120
edi            0x4012d0 4199120
eip            0x401432 0x401432 <main+34>
eflags         0x216    [ PF AF IF ]
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x53     83
gs             0x2b     43
0x61fec0:       0x0061fedc      0x00000008      0x772c8023      0x772c801a
0x61fed0:       0xb3b6879d      0x004012d0      0x004012d0      0x30303030
0x61fee0:       0x30303030      0x30303030      0x30303030      0x30303030
0x61fef0:       0x30303030      0x30303030      0x30303030      0x30303030
0x61ff00:       0x30303030      0x00303030      0x0061ff50      0x0040195b
0x61ff10:       0x004018f0      0x00000000      0x002d2000      0x00000000
=> 0x401432 <main+34>:  mov    eax,DWORD PTR [esp+0x5c]
   0x401436 <main+38>:  test   eax,eax

Breakpoint 2, 0x00401432 in main ()

As we can see, 43 '0' characters (ASCII code 0x30) were not enough to reach the zero variable that we want to overwrite. In order to get past the end of the buffer, we need more than 64 bytes (because the buffer size is 64). For demonstration purposes, we will use the following string as the input:

000011111111111111112222222222222222333333333333333344444444444456

This string contains 66 characters, so the last two characters, 5 and 6, should overwrite the 2 least significant bytes of the variable (on a little-endian architecture the least significant bytes are stored at the lowest addresses).
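The expected value can be worked out in advance. The sketch below assembles four bytes stored at increasing addresses into a 32-bit value the way a little-endian CPU reads them. The name load_le32 is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interprets four bytes stored at increasing addresses as one
 * little-endian 32-bit value: the first byte is the least significant. */
uint32_t load_le32(const unsigned char bytes[4]) {
    assert(bytes != NULL);
    return  (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

After the overflow, the four bytes of the variable are '5' (0x35), '6' (0x36), the terminating NUL byte that gets() appends, and the untouched most significant byte, which is still 0. load_le32 applied to { '5', '6', 0, 0 } therefore gives 0x00003635, which is 13877 in decimal.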

(gdb) $ c
Continuing.
000011111111111111112222222222222222333333333333333344444444444456
eax            0x61fedc 6422236
ecx            0x772eb098       1999548568
edx            0xa      10
ebx            0x3f9000 4165632
esp            0x61fec0 0x61fec0
ebp            0x61ff28 0x61ff28
esi            0x4012d0 4199120
edi            0x4012d0 4199120
eip            0x401432 0x401432 <main+34>
eflags         0x216    [ PF AF IF ]
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x53     83
gs             0x2b     43
0x61fec0:       0x0061fedc      0x00000008      0x772c8023      0x772c801a
0x61fed0:       0xe53b01b1      0x004012d0      0x004012d0      0x30303030
0x61fee0:       0x31313131      0x31313131      0x31313131      0x31313131
0x61fef0:       0x32323232      0x32323232      0x32323232      0x32323232
0x61ff00:       0x33333333      0x33333333      0x33333333      0x33333333
0x61ff10:       0x34343434      0x34343434      0x34343434      0x00003635
=> 0x401432 <main+34>:  mov    eax,DWORD PTR [esp+0x5c]
   0x401436 <main+38>:  test   eax,eax

Breakpoint 2, 0x00401432 in main ()

By continuing we see that the variable now contains 0x3635 or 13877 in decimal.

(gdb) $ c
Continuing.
You changed the zero variable to 13877 (hex: 3635)![Inferior 1 (process 4848) exited normally]
Error while running hook_stop:
The program has no registers now.

In order to set the zero variable to a chosen value, we need to represent the number in little-endian form and write the corresponding bytes as the 65th, 66th, 67th and 68th bytes of the input, that is, at offsets 64 to 67 from the start of the buffer.
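A minimal sketch of how such an input could be prepared is shown below. It assumes the layout observed in this walkthrough (a 64-byte buffer immediately followed by the 4-byte variable). The name build_payload and the filler byte 'A' are arbitrary choices for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUFFER_SIZE  64u                  /* size of `buffer` in the program */
#define PAYLOAD_SIZE (BUFFER_SIZE + 4u)   /* filler plus the 4-byte value */

/*
 * Fills `out` with 64 filler bytes followed by `value` in little-endian
 * byte order. Returns 0 on success, or -1 if one of the value bytes is a
 * newline, because gets() would treat that byte as the end of the input.
 */
int build_payload(uint32_t value, unsigned char out[PAYLOAD_SIZE]) {
    assert(out != NULL);

    memset(out, 'A', BUFFER_SIZE);
    for (size_t i = 0; i < 4; i++) {
        /* Least significant byte first. */
        out[BUFFER_SIZE + i] = (unsigned char)(value >> (8 * i));
    }

    for (size_t i = 0; i < 4; i++) {
        if (out[BUFFER_SIZE + i] == '\n') {
            return -1;
        }
    }
    return 0;
}
```

For the value 0xcafeefac the last four bytes of the payload are 0xac, 0xef, 0xfe and 0xca. Note that gets() also appends a NUL byte after the input, so with a full 68-byte payload that terminator lands in the byte right after the variable.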

Protection against buffer overflows

Compilers and operating systems use several techniques to prevent such stack exploits. In gcc, for example, when stack protection is enabled and a function allocates a buffer on the stack, an additional value, the so-called stack canary, is placed between the local variables and the saved return address. The canary is a random value chosen when the program starts and copied into the stack frame when the function is entered. Before returning, the function makes sure that the canary still has the same value. If the canary has been altered, the program is terminated with a fatal "stack smashing detected" error.
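The idea behind the canary check can be simulated in portable C. The sketch below is not how gcc implements stack protection. It models a frame as a plain byte array, with a 16-byte buffer region followed by a 4-byte canary, so that writing past the buffer region stays inside one object and remains well-defined. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE    16u
#define CANARY_SIZE sizeof(uint32_t)
#define FRAME_SIZE  (BUF_SIZE + CANARY_SIZE)

/* Simulated frame: bytes [0, 16) are the buffer, bytes [16, 20) the canary. */
typedef struct {
    unsigned char bytes[FRAME_SIZE];
} sim_frame;

/* "Function prologue": clear the buffer and place the canary after it. */
void frame_enter(sim_frame *f, uint32_t canary) {
    memset(f->bytes, 0, BUF_SIZE);
    memcpy(f->bytes + BUF_SIZE, &canary, CANARY_SIZE);
}

/* Copy that, like gets(), ignores the buffer size. It is bounded only by
 * the simulated frame so that the example itself stays memory-safe. */
void frame_unchecked_write(sim_frame *f, const unsigned char *src, size_t n) {
    assert(n <= FRAME_SIZE);
    memcpy(f->bytes, src, n);
}

/* "Function epilogue": returns 1 if the canary is unchanged, 0 otherwise. */
int frame_canary_intact(const sim_frame *f, uint32_t canary) {
    return memcmp(f->bytes + BUF_SIZE, &canary, CANARY_SIZE) == 0;
}
```

Writing 16 bytes leaves the canary intact, while writing 18 bytes changes two of its bytes and makes frame_canary_intact return 0. A real implementation would abort the program at that point instead of returning through a possibly corrupted return address.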

Another technique used by operating systems is preventing code execution on the stack by marking stack memory as non-executable. When a stack overflow is exploited, attackers typically try to overwrite the return address so that it points into the buffer, where they have injected malicious code. Even if the exact address is not known, it is possible to fill the buffer with a slide of NOP instructions so that a jump to any address within the slide leads to execution of the malicious code. Exact stack addresses typically differ from one program run to the next, among other reasons because the operating system places environment variables on the stack.


Copyright © 2019 — 2023 Alexander Mayorov. All rights reserved.