Practical Malware Analysis
Ch 13: Data Encoding
Revised 4-25-16
The Goal of Analyzing
Encoding Algorithms
Reasons Malware Uses Encoding
• Hide configuration information
– Such as C&C domains
• Save information to a staging file
– Before stealing it
• Store strings needed by malware
– Decode them just before they are needed
• Disguise malware as a legitimate tool
– Hide suspicious strings
Simple Ciphers
Why Use Simple Ciphers?
• They are easily broken, but
– They are small, so they fit into space-constrained
environments like exploit shellcode
– Less obvious than more complex ciphers
– Low overhead, little impact on performance
• These are obfuscation, not encryption
– They make it difficult to recognize the data,
but can't stop a skilled analyst
Caesar Cipher
• Move each letter forward 3 spaces in the
alphabet
ABCDEFGHIJKLMNOPQRSTUVWXYZ
DEFGHIJKLMNOPQRSTUVWXYZABC
• Example
ATTACK AT NOON
DWWDFN DW QRRQ
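The shift above can be sketched in a few lines of Python. This is a minimal illustration that assumes uppercase ASCII letters only; the function name caesar and its shift parameter are illustrative and do not come from the book.

```python
def caesar(text: str, shift: int = 3) -> str:
    """Shift each uppercase letter forward by `shift`, wrapping Z back to A."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # leave spaces and other characters unchanged
    return "".join(out)

print(caesar("ATTACK AT NOON"))      # DWWDFN DW QRRQ
print(caesar("DWWDFN DW QRRQ", -3))  # ATTACK AT NOON
```

Decoding is the same routine with the opposite shift, which is one reason this cipher is so easily broken.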
XOR
• Uses a key to encrypt data
• Uses one bit of data and one bit of the
key at a time
• Example: Encode HI with a key of 0x3c
HI = 0x48 0x49 (ASCII encoding)
Data: 0100 1000 0100 1001
Key: 0011 1100 0011 1100
Result: 0111 0100 0111 0101
0 xor 0 = 0
0 xor 1 = 1
1 xor 0 = 1
1 xor 1 = 0
XOR Reverses Itself
• Example: Encode HI with a key of 0x3c
HI = 0x48 0x49 (ASCII encoding)
Data: 0100 1000 0100 1001
Key: 0011 1100 0011 1100
• Encode it again
Result: 0111 0100 0111 0101
Key: 0011 1100 0011 1100
Data: 0100 1000 0100 1001
0 xor 0 = 0
0 xor 1 = 1
1 xor 0 = 1
1 xor 1 = 0
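The HI example can be reproduced with a short Python sketch. The helper name xor_bytes is an assumption made for illustration; the data and key are the ones used on the slide.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of `data` with the single-byte `key`."""
    return bytes(b ^ key for b in data)

encoded = xor_bytes(b"HI", 0x3C)
print(encoded.hex())             # 7475
print(xor_bytes(encoded, 0x3C))  # b'HI'
```

Because the same call both encodes and decodes, malware only needs to carry one small routine.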
Brute-Forcing XOR Encoding
• If the key is a single byte, there are only
256 possible keys
– Error in book; this should be "a.exe"
– PE files begin with MZ
MZ = 0x4d 0x5a
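A brute-force search over all 256 keys might look like the sketch below. It assumes the encoded data is a PE file, so the decoded output must start with MZ; the function name find_xor_key and the sample bytes are hypothetical.

```python
def find_xor_key(data: bytes, known: bytes = b"MZ"):
    """Return the single-byte key under which `data` starts with `known`, or None."""
    for key in range(256):
        if bytes(b ^ key for b in data[:len(known)]) == known:
            return key
    return None

# Hypothetical sample: a fake PE header encoded with key 0x12.
sample = bytes(b ^ 0x12 for b in b"MZ\x90\x00\x03")
print(hex(find_xor_key(sample)))  # 0x12
```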
Link Ch 13a
Brute-Forcing Many Files
• Look for a common string, like "This Program"
XOR and Nulls
• A null byte reveals the key, because
– 0x00 xor KEY = KEY
• Obviously the key here is 0x12
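Since nulls are usually the most common byte in a binary, the most frequent encoded byte is a good guess for the key. The sketch below assumes null bytes dominate the plaintext; the function name and the sample data are made up for illustration.

```python
from collections import Counter

def guess_key_from_nulls(data: bytes) -> int:
    """Guess the key as the most frequent byte, assuming nulls dominate the plaintext."""
    return Counter(data).most_common(1)[0][0]

# Hypothetical plaintext: a short string followed by null padding.
plaintext = b"config" + b"\x00" * 40
encoded = bytes(b ^ 0x12 for b in plaintext)
print(hex(guess_key_from_nulls(encoded)))  # 0x12
```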
NULL-Preserving Single-Byte XOR
Encoding
• Algorithm:
– Use XOR encoding, EXCEPT
– If the plaintext is NULL or the key itself, skip
the byte
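The algorithm above can be sketched as follows. The function name is illustrative; the test bytes were chosen to exercise all three cases (null, key byte, ordinary byte).

```python
def null_preserving_xor(data: bytes, key: int) -> bytes:
    """XOR each byte with `key`, but leave 0x00 and the key byte itself untouched."""
    return bytes(b if b in (0, key) else b ^ key for b in data)

encoded = null_preserving_xor(b"\x00\x12A", 0x12)
print(encoded)                             # b'\x00\x12S'
print(null_preserving_xor(encoded, 0x12))  # b'\x00\x12A'
```

Skipping both values keeps nulls from exposing the key and keeps the routine its own inverse: an ordinary byte never encodes to 0x00 or to the key, so the second pass handles every byte the same way the first did.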
Identifying XOR Loops in IDA Pro
• Small loops with an XOR instruction inside
1. Start in "IDA View" (seeing code)
2. Click Search, Text
3. Enter xor and Find all occurrences
Three Forms of XOR
• XOR a register with itself, like xor edx, edx
– Innocent, a common way to zero a register
• XOR a register or memory reference with a
constant
– May be an encoding loop, and key is the
constant
• XOR a register or memory reference with a
different register or memory reference
– May be an encoding loop, key less obvious
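The three forms can be told apart mechanically when triaging a long list of search hits. The sketch below is a rough heuristic, not an IDA Pro feature: it assumes each line is plain text of the form "xor dst, src" with comments already stripped, and the function name classify_xor is hypothetical.

```python
import re

def classify_xor(line: str) -> str:
    """Classify one disassembly line of the form 'xor dst, src'."""
    match = re.match(r"\s*xor\s+([^,]+),\s*(.+?)\s*$", line, re.IGNORECASE)
    if not match:
        return "not xor"
    dst, src = match.group(1).strip().lower(), match.group(2).lower()
    if dst == src:
        return "zeroing"          # e.g. xor edx, edx
    if re.fullmatch(r"(0x[0-9a-f]+|[0-9][0-9a-f]*h|[0-9]+)", src):
        return "constant key"     # e.g. xor eax, 12h
    return "register or memory"   # e.g. xor al, [ebx+ecx]

for line in ("xor edx, edx", "xor eax, 12h", "xor al, [ebx+ecx]"):
    print(line, "->", classify_xor(line))
```

Only the last two categories are worth inspecting for an encoding loop.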
Base64
• Converts 6 bits into one character in a
64-character alphabet
• There are a few versions, but all use these
62 characters:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
0123456789
• MIME uses + and /
– Also = to indicate padding
Transforming Data to Base64
• Use 3-byte chunks (24 bits)
• Break into four 6-bit fields
• Convert each to Base64
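The three steps above can be sketched for a single chunk. This uses the standard MIME alphabet and handles exactly 3 bytes with no padding; the function name encode_chunk is illustrative.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_chunk(chunk: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (no padding handled)."""
    if len(chunk) != 3:
        raise ValueError("chunk must be exactly 3 bytes")
    bits = int.from_bytes(chunk, "big")  # 24 bits
    # Split the 24 bits into four 6-bit fields, most significant first.
    fields = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[f] for f in fields)

print(encode_chunk(b"Man"))  # TWFu
```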
base64encode.org

base64decode.org
• 3 bytes encode to 4
Base64 characters
Padding
• If the final chunk of input has only 2
bytes, a single = is appended
• If the final chunk has only 1 byte, ==
is appended
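Python's standard base64 module shows both padding cases. The helper name below is an assumption made for illustration.

```python
import base64

def encode_and_count_padding(data: bytes):
    """Return the Base64 text and the number of '=' padding characters."""
    text = base64.b64encode(data).decode("ascii")
    return text, text.count("=")

for data in (b"abc", b"ab", b"a"):
    print(data, encode_and_count_padding(data))
# b'abc' ('YWJj', 0)
# b'ab' ('YWI=', 1)
# b'a' ('YQ==', 2)
```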
Example
• URL and cookie are Base64-encoded
Cookie: Ym90NTQxNjQ
• This has 11 characters; padding is omitted
• Some Base64 decoders will fail, but this one just
automatically adds the missing padding
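A strict decoder can be made tolerant by restoring the padding first. The sketch below decodes the cookie from the slide; the function name b64decode_unpadded is illustrative.

```python
import base64

def b64decode_unpadded(text: str) -> bytes:
    """Restore any omitted '=' padding, then decode."""
    return base64.b64decode(text + "=" * (-len(text) % 4))

print(b64decode_unpadded("Ym90NTQxNjQ"))  # b'bot54164'
```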
Finding the Base64 Function
• Look for this "indexing string"
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
• Look for a lone padding character
(typically =) hard-coded into the encoding
function
Decoding the URLs
• Custom indexing string
aABCDEFGHIJKLMNOPQRSTUVWXYZbcdefghijklmnopqrstuvwxyz0123456789+/
• Look for a lone padding character (typically
=) hard-coded into the encoding function
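Once the custom indexing string is known, decoding reduces to translating each character back to the standard alphabet and using an ordinary decoder. This is a minimal sketch; the function name custom_b64decode and the example URL are made up for illustration.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
CUSTOM = "aABCDEFGHIJKLMNOPQRSTUVWXYZbcdefghijklmnopqrstuvwxyz0123456789+/"

def custom_b64decode(text: str) -> bytes:
    """Map custom-alphabet characters back to standard ones, then decode."""
    translated = text.translate(str.maketrans(CUSTOM, STANDARD))
    return base64.b64decode(translated + "=" * (-len(translated) % 4))

# Round trip with a made-up URL to show the mapping works in both directions.
standard_text = base64.b64encode(b"http://example.com/").decode("ascii")
custom_text = standard_text.translate(str.maketrans(STANDARD, CUSTOM))
print(custom_b64decode(custom_text))  # b'http://example.com/'
```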
Common Cryptographic
Algorithms
Strong Cryptography
• Strong enough to resist brute-force attacks
– Ex: SSL, AES, etc.
• Disadvantages of strong encryption
– Large cryptographic libraries required
– May make code less portable
– Standard cryptographic libraries are easily detected
• Via function imports, function matching, or identification of
cryptographic constants
– Symmetric encryption requires a way to hide the key
Recognizing Strings and Imports
• Strings found in malware encrypted with
OpenSSL
Recognizing Strings and Imports
• Microsoft crypto functions usually start
with Crypt or CP or Cert
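A quick filter over an import list can apply this naming rule. The sketch assumes the import names have already been extracted by some other tool; the list below is hypothetical, and prefix matching is only a heuristic that can flag unrelated functions.

```python
CRYPTO_PREFIXES = ("Crypt", "CP", "Cert")

def crypto_imports(import_names):
    """Return the imported function names that start with a crypto-related prefix."""
    return [name for name in import_names if name.startswith(CRYPTO_PREFIXES)]

# Hypothetical import list, e.g. as dumped by a PE viewer.
imports = ["CreateFileA", "CryptAcquireContextA", "CryptEncrypt", "CertOpenStore", "WriteFile"]
print(crypto_imports(imports))  # ['CryptAcquireContextA', 'CryptEncrypt', 'CertOpenStore']
```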
Searching for Cryptographic Constants
• IDA Pro's FindCrypt2 Plug-in (Link Ch 13c)
– Finds magic constants (binary signatures of
crypto routines)
– Cannot find RC4 or IDEA routines because
they don't use a magic constant
– RC4 is commonly used in malware because it's
small and easy to implement
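The idea behind a constant search can be sketched with two well-known values. This is not FindCrypt2's signature set, only a small illustration; the dictionary and function names are assumptions, and a 4-byte signature such as the second one will produce false positives on real files.

```python
# A few well-known constants; real tools ship far larger signature sets.
SIGNATURES = {
    "AES S-box (first 8 bytes)": bytes.fromhex("637c777bf26b6fc5"),
    "MD5/SHA-1 init word 0x67452301 (little-endian)": bytes.fromhex("01234567"),
}

def find_crypto_constants(data: bytes):
    """Return (name, offset) for the first occurrence of each signature in `data`."""
    hits = []
    for name, sig in SIGNATURES.items():
        offset = data.find(sig)
        if offset != -1:
            hits.append((name, offset))
    return hits

# Hypothetical blob with the start of the AES S-box at offset 10.
blob = b"\x00" * 10 + bytes.fromhex("637c777bf26b6fc5") + b"\x00" * 4
print(find_crypto_constants(blob))  # [('AES S-box (first 8 bytes)', 10)]
```

RC4 builds its table at run time from the key, so there is no fixed byte pattern for a search like this to match.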
FindCrypt2
• Runs automatically on any new analysis
• Can be run manually from the Plug-In
Menu
Krypto ANALyzer (PEiD Plug-in)
• Download from link Ch 13d
• Has wider range of constants than FindCrypt2
– More false positives
• Also finds Base64 tables and crypto function
imports
Entropy
• Entropy measures disorder
• To calculate it, just count the number of
occurrences of each byte from 0 to 255
– Calculate Pi = Probability of value i
– Then entropy H = -sum of Pi × log2(Pi) for i = 0 to 255 (Link 13e)
• If all the bytes are equally likely, the
entropy is 8 (maximum disorder)
• If all the bytes are the same, the entropy is
zero
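The calculation can be sketched directly from the byte counts. The code uses p × log2(1/p), which is the same quantity as -p × log2(p); the function name entropy is illustrative.

```python
import math
from collections import Counter

def entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # p * log2(1/p) is equivalent to -p * log2(p); byte values that never occur add nothing.
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())

print(entropy(bytes(range(256))))  # 8.0
print(entropy(b"A" * 1000))        # 0.0
```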
Entropy Demo
• Put output in a file
• Use binwalk -E to analyze the file
• Multiply vertical axis by 8
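#!/usr/bin/env python3
# Emit five blocks of decreasing entropy: random bytes, Base64, Base32,
# Base16, and a run of one repeated character.
import base64, random, sys
a = bytes(random.randint(0, 255) for _ in range(10000))
b = base64.b64encode(a)
c = base64.b32encode(a)
d = base64.b16encode(a)
e = b'A' * 10000
# Write raw bytes so the random block is not altered by text encoding
sys.stdout.buffer.write(a + b + c + d + e)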
Entropy Demo
• Concatenate three images in different
formats
Searching for High-Entropy Content
• IDA Pro Entropy Plugin
• Finds regions of high entropy, indicating
encryption (or compression)
Recommended Parameters
• Chunk size: 64 Max. Entropy: 5.95
– Good for finding many constants,
– Including Base64-encoded strings (entropy 6)
• Chunk size: 256 Max. Entropy: 7.9
– Finds very random regions
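A chunked scan with these two parameters might look like the sketch below. It is not the plugin's implementation: it assumes non-overlapping chunks, and the function names are hypothetical. A 64-byte chunk can hold at most 64 distinct values, so its entropy cannot exceed log2(64) = 6, which is why the smaller chunk size is paired with a threshold just under 6.

```python
import math
from collections import Counter

def entropy(chunk: bytes) -> float:
    """Shannon entropy of a non-empty chunk in bits per byte."""
    total = len(chunk)
    return sum((n / total) * math.log2(total / n) for n in Counter(chunk).values())

def high_entropy_offsets(data: bytes, chunk_size: int = 256, threshold: float = 7.9):
    """Return start offsets of full chunks whose entropy is at least `threshold`."""
    return [
        off
        for off in range(0, len(data) - chunk_size + 1, chunk_size)
        if entropy(data[off:off + chunk_size]) >= threshold
    ]

# Hypothetical input: one constant chunk followed by one maximally varied chunk.
data = b"A" * 256 + bytes(range(256))
print(high_entropy_offsets(data))  # [256]
```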
Entropy Graph
• IDA Pro Entropy Plugin
– Download from link Ch 13g
– Use StandAlone version
– Double-click region, then Calculate, Draw
– Lighter regions have high entropy
– Hover over graph to see numerical value
Custom Encoding
Homegrown Encoding Schemes
• Examples
– One round of XOR, then Base64
– Custom algorithm, possibly similar to a
published cryptographic algorithm
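The first example, one round of XOR followed by Base64, is decoded by undoing the layers in reverse order. This sketch assumes a single-byte key and the standard alphabet; the key 0x3C and the sample string are arbitrary.

```python
import base64

def encode(data: bytes, key: int) -> str:
    """One round of single-byte XOR, then standard Base64."""
    return base64.b64encode(bytes(b ^ key for b in data)).decode("ascii")

def decode(text: str, key: int) -> bytes:
    """Undo the layers in reverse order: Base64 first, then XOR."""
    return bytes(b ^ key for b in base64.b64decode(text))

token = encode(b"cmd=sleep", 0x3C)  # 0x3C is an arbitrary example key
print(decode(token, 0x3C))          # b'cmd=sleep'
```

The intermediate Base64 output decodes to bytes that look random, which can mislead an analyst who stops after the first layer.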
Identifying Custom Encoding
• This sample makes a bunch of 700 KB files
• Figure out the encoding from the code
• Find CreateFileA and WriteFile
– In function sub_4011A9
• Uses XOR with a pseudorandom stream
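The general shape of XOR with a pseudorandom stream can be sketched as below. This is not the sample's algorithm: the generator is a toy linear congruential generator with illustrative constants, and the seed and function names are hypothetical. It only shows why knowing the generator and its seed is enough to decode.

```python
def keystream(seed: int):
    """Toy LCG yielding one pseudorandom byte per step (constants are illustrative)."""
    state = seed & 0xFFFFFFFF
    while True:
        state = (state * 1103515245 + 12345) & 0xFFFFFFFF
        yield (state >> 16) & 0xFF

def stream_xor(data: bytes, seed: int) -> bytes:
    """XOR `data` with the keystream; the same call encodes and decodes."""
    return bytes(b ^ k for b, k in zip(data, keystream(seed)))

encoded = stream_xor(b"screenshot data", 1234)
print(stream_xor(encoded, 1234))  # b'screenshot data'
```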
Advantages of Custom Encoding to the
Attacker
• Can be small and nonobvious
• Harder to reverse-engineer
Decoding
Two Methods
• Reprogram the functions
• Use the functions in the malware itself
Self-Decoding
• Stop the malware in a debugger with data
decoded
• Isolate the decryption function and set a
breakpoint directly after it
• BUT sometimes you can't figure out how
to stop it with the data you need decoded
Manual Programming of Decoding
Functions
• Standard functions may be available
PyCrypto Library
• Good for standard algorithms
How to Decrypt Using Malware