DarunGrim 3
A Tool For Binary Diffing And Automatic Vulnerabilities Pattern Matching
Jeongwook Oh
mat@monkey.org, @ohjeongwook, Security Researcher, WebSense
EUSecWest 2010, Amsterdam, NL
Why?
I have worked on a security product for the last 5 years.
The IPS and vulnerability scanner needed signatures.
We needed technical details on the patches.
That information was not provided by the vendors.
In recent years Microsoft introduced a program called MAPP, but it is often not enough.
You have two options in this case:
  Use your own eyeballs to compare disassemblies
  Use binary diffing tools
Patch analysis using binary diffing tools is the only healthy way to obtain valuable information from the patches.
How?
I'll show you the whole process of a typical binary diffing session.
You should get an idea of what binary diffing is.
The example that follows walks through a typical binary diffing process.
The patch (MS10-018) is for the CVE-2010-0806 vulnerability.
Example: CVE-2010-0806 Patch
Description from CVE Web Page
http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-0806
Use-after-free vulnerability in the Peer Objects component (aka iepeers.dll) in Microsoft Internet Explorer 6, 6 SP1, and 7 allows remote attackers to execute arbitrary code via vectors involving access to an invalid pointer after the deletion of an object, as exploited in the wild in March 2010, aka "Uninitialized Memory Corruption Vulnerability."
CVE-2010-0806 Patch Analysis: Acquire Patches
Download the patch by visiting the patch page (MS10-018) and following the link for your OS and IE version.
For IE 7 on XP, I used the following link from the main patch page to download the patch file.
( http://www.microsoft.com/downloads/details.aspx?FamilyID=167ed896-d383-4dc0-9183-cd4cb73e17e7&displaylang=en )
CVE-2010-0806 Patch Analysis: Extract Patches
C:\> IE7-WindowsXP-KB980182-x86-ENU.exe /x:out
CVE-2010-0806 Patch Analysis: Acquire Unpatched Files
You need to collect the unpatched files from the operating system on which the patch is supposed to be installed.
I used SortExecutables.exe from the DarunGrim2 package to consolidate the files. Each file ends up inside a directory named with its version number string.
CVE-2010-0806 Patch Analysis: Load the Binaries from DarunGrim2
Launch DarunGrim2.exe and select "File > New Diffing from IDA" from the menu.
You need to wait from a few seconds to a few minutes, depending on the binary size and disassembly complexity.
CVE-2010-0806 Patch Analysis: Binary Level Analysis
Now you have the list of functions.
Look for any eye-catching functions.
As in the following list, match rates (the last column) of 86% and 88% strongly indicate a minor code change, which may be a security patch.
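This triage step can be sketched in a few lines of Python 3. The record layout, the sample function names and the 80% lower bound are illustrative assumptions, not DarunGrim's output format; the point is only that functions with a high but not perfect match rate are the first ones to look at.

```python
# Hypothetical records: (source function, target function, match rate in percent).
matches = [
    ("sub_1000", "sub_1000", 100),
    ("sub_2000", "sub_2010", 86),
    ("sub_3000", "sub_3040", 88),
    ("sub_4000", "sub_4400", 12),
]

def interesting(records, low=80, high=100):
    """Return records with low <= match rate < high, lowest rate first."""
    picked = [r for r in records if low <= r[2] < high]
    return sorted(picked, key=lambda r: r[2])

for src, dst, rate in interesting(matches):
    print(f"{src} -> {dst}: {rate}%")
```

Functions with a 100% match are unchanged, and very low rates usually mean a poor match rather than a small patch, so both ends are filtered out here.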
CVE-2010-0806 Patch Analysis: Function Level Analysis
If you click a function match row, you get the matching graphs.
Color codes
  The white blocks are matched blocks
  The yellow blocks are modified blocks
  The red blocks are unmatched blocks
An unmatched block is one that has been inserted or removed.
In this case the red block is on the patched side, which means the block was inserted by the patch.
So we just follow the control flow from the red block, and we can see that esi is eventually used as the return value (eax).
We can guess that the patch sanitizes the return value when some condition is not met.
The Problems with Current Binary Diffing Tools
Managing files is a boring job.
  Downloading patches
  Storing old binaries
  Loading the files manually
How do we know which functions have security updates rather than feature updates?
  Just go through every modified function?
  What if there are too many modified functions?
The Solution = DarunGrim 3
Bin Collector
  Binary managing functionality
  Automatic patch download and extraction
  Supports Microsoft binaries
  Will support other major vendors soon
Security Implication Score
  Shows you which functions contain more security-related changes
Web Interface
  User friendly
  Click through and you get the diffing results
Architecture Comparison: DarunGrim 2
Components: Windows GUI, Diffing Engine, Database (sqlite), IDA
Architecture Comparison: DarunGrim 3
Components: Windows GUI, Web Console, Bin Collector, Diffing Engine Python Interface, Diffing Engine, Database Python Interface, IDA, Database (sqlite), Binary Storage
Performing Diffing
Interactive
Non-Interactive
Performing Diffing: Interactive
Using DarunGrim2.exe UI
Just put the path for each binary and DarunGrim2.exe will do the rest of the job.
DarunGrim2.exe + Two IDA sessions
First launch DarunGrim2.exe
Launch two IDA sessions
First run the DarunGrim2 plugin from the original binary
Then run the DarunGrim2 plugin from the patched binary
DarunGrim2.exe will analyze the data that is collected through shared memory
Using DarunGrim Web Console: a DarunGrim 3 Way
User-friendly interface
Includes "Bin Collector"/Security Implication Score support
Performing Diffing: Non-Interactive
Using DarunGrim2C.exe command line tool
Handy, Batch-able, Quick
Using DarunGrim Python Interface: a DarunGrim 3 Way
Handy, Batch-able, Quick, Really Scriptable
Diffing Engine Python Interface
    import DarunGrimEngine

    DarunGrimEngine.DiffFile(
        unpatched_filename,
        patched_filename,
        output_filename,
        log_filename,
        ida_path
    )

Performs disassembly using IDA
Runs as a background process
Runs the DarunGrim IDA plugin automatically
Runs the DiffEngine automatically on the files
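Because the call takes plain file names, batching is a simple loop. The sketch below (Python 3) passes the diffing function in as a parameter so it runs without IDA; the helper name batch_diff, the output naming scheme, the .dgf extension and the IDA path are all assumptions for illustration, and only the five-argument call shape comes from the slide.

```python
def batch_diff(pairs, diff_file, ida_path, out_dir="out"):
    """Call diff_file once per (unpatched, patched) pair; return the output paths."""
    results = []
    for index, (unpatched, patched) in enumerate(pairs):
        output = f"{out_dir}/diff_{index}.dgf"   # assumed naming scheme
        log = f"{out_dir}/diff_{index}.log"
        # Same argument order as DarunGrimEngine.DiffFile on the slide.
        diff_file(unpatched, patched, output, log, ida_path)
        results.append(output)
    return results

# Stand-in for DarunGrimEngine.DiffFile so the sketch runs anywhere.
calls = []
def fake_diff_file(unpatched, patched, output, log, ida_path):
    calls.append((unpatched, patched, output, log, ida_path))

outputs = batch_diff(
    [("old/iepeers.dll", "new/iepeers.dll")],
    fake_diff_file,
    ida_path="C:/IDA/idag.exe",  # assumed install location
)
print(outputs)
```

In real use, fake_diff_file would be replaced by DarunGrimEngine.DiffFile.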
Database Python Interface
    # Python 2 syntax, as in the original slide.
    import DarunGrimDatabaseWrapper

    database = DarunGrimDatabaseWrapper.Database( filename )

    for function_match_info in database.GetFunctionMatchInfo():
        # Only report functions that contain unmatched blocks on either side.
        if function_match_info.non_match_count_for_the_source > 0 or function_match_info.non_match_count_for_the_target > 0:
            print function_match_info.source_function_name + hex(function_match_info.source_address) + '\t',
            print function_match_info.target_function_name + hex(function_match_info.target_address) + '\t',
            print str(function_match_info.block_type) + '\t',
            print str(function_match_info.type) + '\t',
            print str( function_match_info.match_rate ) + "%" + '\t',
            # Disassembly lines and block match map for the source function.
            print database.GetFunctionDisasmLinesMap( function_match_info.source_file_id, function_match_info.source_address )
            print database.GetMatchMapForFunction( function_match_info.source_file_id, function_match_info.source_address )
Bin Collector
Binary collection & consolidation system
  Toolkit for constructing a binary library
  It is managed through the Web Console
  It exposes a Python interface, so it is scriptable if you want
  The whole code is written in Python
It maintains indexes and version information on the binary files from the vendors.
It downloads and extracts patches automatically
Currently limited functionality
  Currently it supports Microsoft binaries
  Adobe and Oracle binaries will be supported soon
Bin Collector
Collecting Binaries Automagically
It visits each vendor's patch pages
  Uses the mechanize Python package to scrape MS patch pages
  Uses BeautifulSoup to parse the HTML pages
It extracts and archives binary files
  Uses sqlalchemy to index the files
  Uses PE version information to determine the storage location: <Company Name>\<File Name>\<Version Name>
You can build your own archive of binaries in a more organized way
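The storage layout can be illustrated with a short Python 3 sketch. It assumes the company name, file name and version string have already been read from the PE version resource; the function name storage_path, the root directory and the sample values are placeholders, not Bin Collector's actual code.

```python
from pathlib import PureWindowsPath

def storage_path(root, company, file_name, version):
    """Build <root>\\<Company Name>\\<File Name>\\<Version Name>."""
    parts = [part.strip() for part in (company, file_name, version)]
    if not all(parts):
        raise ValueError("company, file name and version are all required")
    return PureWindowsPath(root, *parts)

# Placeholder values; a real run would take them from the PE version information.
path = storage_path(r"C:\BinaryStore", "Microsoft Corporation", "iepeers.dll", "1.2.3.4")
print(path)
```

Keeping the version as the last path component means every release of the same file sits side by side, which makes it easy to pick a pair to diff.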
Web Console Work Flow: Select Vendor
We only support Microsoft right now. We are going to support Oracle and Adobe soon.
Web Console Work Flow: Select Patch Name
Web Console Work Flow: Select OS
Web Console Work Flow: Select a File
GDR (General Distribution Release): a binary marked as GDR contains only the security-related changes that have been made to the binary.
QFE (Quick Fix Engineering)/LDR (Limited Distribution Release): a binary marked as QFE/LDR contains both the security-related changes and any functionality changes that have been made to it.
Web Console Work Flow: Initiate Diffing
The unpatched file is automagically guessed based on the file name and version string.
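One plausible way to make such a guess is sketched below in Python 3: among archived files with the same name, take the newest version that is still older than the patched one. This is an assumption about the approach rather than DarunGrim's actual logic, it ignores GDR/QFE branches, and the file names and version strings are made up.

```python
def parse_version(text):
    """Turn '1.0.0.10' into (1, 0, 0, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def guess_unpatched(patched, candidates):
    """Pick the newest candidate with the same file name and a lower version."""
    name, version = patched
    older = [
        (n, v) for n, v in candidates
        if n.lower() == name.lower() and parse_version(v) < parse_version(version)
    ]
    if not older:
        return None
    return max(older, key=lambda item: parse_version(item[1]))

# Made-up archive contents.
archive = [
    ("iepeers.dll", "1.0.0.9"),
    ("iepeers.dll", "1.0.0.10"),
    ("iepeers.dll", "1.0.0.12"),
    ("mshtml.dll", "1.0.0.11"),
]
print(guess_unpatched(("iepeers.dll", "1.0.0.12"), archive))
```

Comparing versions as integer tuples matters: as plain strings, "1.0.0.9" would sort after "1.0.0.10".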
Web Console Work Flow: Check the Results
Reading Results
Locate security patches as quickly as possible
Sometimes the diff results are not clear because of a lot of noise.
The noise is caused by
  Feature updates
  Code cleanup
  Refactoring
  Compiler option changes
  Compiler changes
Identifying Security Patches
Not all patches are security patches
Sometimes it's like finding needles in the sand
We need a way to locate patches with strong security implications
Security Implication Score
DarunGrim 3 provides a script interface to the Diffing Engine
DarunGrim 3 provides a basic set of pattern matching rules
We calculate the Security Implication Score using this Python interface
The pattern matching should be easy to extend as researchers learn new patterns
You can add new patterns if you want.
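The idea of a pattern-based score can be sketched as follows (Python 3). The patterns, weights and function name are hypothetical and are not DarunGrim's real rule set; the sketch only shows how a list of regular expressions over the instructions added by a patch can be summed into a score and extended with new entries.

```python
import re

# Hypothetical patterns and weights; extend the list as new patch patterns are learned.
PATTERNS = [
    (re.compile(r"\bcall\b.*StringCch\w+"), 3),  # safe string API introduced
    (re.compile(r"\bcall\b.*ReplacePtr"), 2),    # pointer replacement helper introduced
    (re.compile(r"^\s*cmp\b"), 1),               # new comparison, possibly a bounds check
]

def security_implication_score(added_lines):
    """Sum the weights of all patterns found in instructions present only in the patched binary."""
    score = 0
    for line in added_lines:
        for pattern, weight in PATTERNS:
            if pattern.search(line):
                score += weight
    return score

added = [
    "cmp edi, eax",
    "jnb short loc_6D244B2B",
    "push ecx",
]
print(security_implication_score(added))
```

A function whose added blocks score higher would be listed first, so likely security fixes surface above feature updates.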
Examples
Examples for each vulnerability class. Both DarunGrim2 and DarunGrim3 examples are shown. Security Implication Scores are shown for some examples.
Stack Based Buffer Overflow: MS06-070
Stack Based Buffer Overflow: MS06-070/_NetpManageIPCConnect@16
Stack Based Buffer Overflow(Logic Error): MS08-067
The Conficker worm exploited this vulnerability to propagate through internal networks.
Easy target for binary diffing
  Only 2 functions changed.
  One is a change in calling convention.
  The other is the function that has the vulnerability.
StringCchCopyW
http://msdn.microsoft.com/en-us/library/ms647527%28VS.85%29.aspx
Integer Overflow
MS10-030
Integer Overflow
MS10-030 Integer Comparison Routine
Integer Overflow
JRE Font Manager Buffer Overflow(Sun Alert 254571)
Original
.text:6D2C4A75 mov edi, [esp+10h]
.text:6D2C4A79 lea eax, [edi+0Ah]
.text:6D2C4A7C cmp eax, 2000000h
.text:6D2C4A81 jnb short loc_6D2C4A8D
.text:6D2C4A83 push eax ; size_t
.text:6D2C4A84 call ds:malloc
Patched
.text:6D244B06 push edi
.text:6D244B07 mov edi, [esp+10h]
.text:6D244B0B mov eax, 2000000h
.text:6D244B10 cmp edi, eax ; additional check
.text:6D244B12 jnb short loc_6D244B2B ; additional check
.text:6D244B14 lea ecx, [edi+0Ah]
.text:6D244B17 cmp ecx, eax
.text:6D244B19 jnb short loc_6D244B25
.text:6D244B1B push ecx ; size_t
.text:6D244B1C call ds:malloc
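The effect of the additional check can be reproduced with 32-bit arithmetic. The Python 3 sketch below models the two checks; the function names and the sample size are illustrative assumptions, and None stands for the rejected path.

```python
MASK32 = 0xFFFFFFFF
LIMIT = 0x2000000

def original_check(size):
    """lea eax, [edi+0Ah]; cmp eax, 2000000h; jnb -> reject."""
    total = (size + 0x0A) & MASK32          # lea wraps modulo 2**32
    return total if total < LIMIT else None

def patched_check(size):
    """Adds cmp edi, eax; jnb before the addition."""
    if size >= LIMIT:                        # additional check on the raw size
        return None
    total = (size + 0x0A) & MASK32
    return total if total < LIMIT else None

size = 0xFFFFFFF8                            # attacker-controlled size near 2**32
print(original_check(size))                  # value that would reach malloc
print(patched_check(size))                   # rejected
```

With size 0xFFFFFFF8 the addition wraps to 2, so the original code passes the limit check and allocates a 2-byte buffer for a huge declared size; the patched code rejects the size before adding to it.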
Insufficient Validation of Parameters
Java Deployment Toolkit
The unpatched function has a whole lot of red and yellow blocks.
All of the function's basic blocks have been removed. This is the quick fix for @taviso's 0-day.
The function is responsible for querying the registry key for the JNLPFile Shell Open key and launching it using the CreateProcessA API.
Invalid Argument
MS09-020:WebDav case
Patched
Original
The flags have changed
Original
Patched
What does flag 8 mean?
MSDN (http://msdn.microsoft.com/en-us/library/dd319072(VS.85).aspx) describes it as follows:
MB_ERR_INVALID_CHARS Windows Vista and later: The function does not drop illegal code points if the application does not set this flag. Windows 2000 Service Pack 4, Windows XP: Fail if an invalid input character is encountered. If this flag is not set, the function silently drops illegal code points. A call to GetLastError returns ERROR_NO_UNICODE_TRANSLATION.
Broken UTF8 Heuristics?
6F0695EA mov esi, 0FDE9h
...
6F069641 call ?FIsUTF8Url@@YIHPBD@Z ; FIsUTF8Url(char const *)
6F069646 test eax, eax
if(!eax) {
    6F0695C3 xor edi, edi
    6F06964A mov [ebp-124h], edi
} else {
    6F069650 cmp [ebp-124h], esi
}
...
6F0696C9 mov eax, [ebp-124h]
6F0696D5 sub eax, esi
6F0696DE neg eax
6F0696E0 sbb eax, eax
6F0696E2 and eax, 8
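The arithmetic at the end of the listing is a branch-free way of turning "is the value different from esi?" into the flag value 8. A minimal Python 3 sketch of that instruction sequence follows; treating the value at [ebp-124h] as a code page is an assumption based on 0FDE9h being 65001 (CP_UTF8), and the function name flags_for is made up for illustration.

```python
MASK32 = 0xFFFFFFFF
CP_UTF8 = 0xFDE9               # 65001, the constant loaded into esi
MB_ERR_INVALID_CHARS = 8

def flags_for(code_page):
    """Emulate: mov eax, [ebp-124h]; sub eax, esi; neg eax; sbb eax, eax; and eax, 8."""
    eax = (code_page - CP_UTF8) & MASK32    # sub eax, esi
    carry = 1 if eax != 0 else 0            # neg sets CF when its operand is non-zero
    eax = (-eax) & MASK32                   # neg eax
    eax = (eax - eax - carry) & MASK32      # sbb eax, eax -> 0 or 0xFFFFFFFF
    return eax & MB_ERR_INVALID_CHARS       # and eax, 8

print(flags_for(CP_UTF8))   # code page is UTF-8
print(flags_for(0))         # any other code page
```

Read this way, the sequence yields flag 8 only when the value is not 65001, and 0 when it is.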
Use-After-Free: CVE-2010-0249-Vulnerability in Internet Explorer Could Allow Remote Code Execution
Unpatched
Patched
Diagram labels: CTreeNode *orig_obj, CTreeNode *arg_0, CTreeNode *arg_4
  1. Add reference counter (NodeAddRef)
  2. Remove ptr
  3. Add ptr
  4. Release reference counter (NodeRelease)
The original binary failed to replace the pointer for the tree node.
The freed node was used accidentally.
Calling ReplacePtr in the appropriate places fixed the problem.
We might use the ReplacePtr pattern to spot use-after-free bugs in IE.
Adding the pattern will help find the same kind of issue in later binary diffing.
Conclusion
Binary diffing can benefit IPS rule writers and security researchers
Locating security vulnerabilities from binaries can help further binary auditing
There are typical patterns in patches according to their bug classes.
The Security Implication Score in DarunGrim3 helps separate security patches from feature updates
The Security Implication Score logic is written in Python and customizable on demand.
DarunGrim3
http://www.darungrim.org
All the source code and latest binaries will be uploaded within 2 weeks
Questions?