Anti exploitation and Control Flow
Integrity with Processor Trace
Brought to you by
Shlomi Oberman
independent security researcher
Ron Shina
independent security researcher
 Tracing – what executed and
when?
 Code optimization and profiling
◦ Sampling
◦ Instrumentation
Intel Processor Trace (PT)
Intel PT
 Processor feature enabling instruction tracing with
low overhead – documentation says about 5%
◦ Tens of times faster than the previous option
 Available on Intel Broadwell and Skylake processors
 A similar feature, Real Time Instruction Trace, exists
on certain Intel Atom processors
Intel PT
Packets
 Processor writes trace to memory as packets
 Packet Types
◦ Taken / Not Taken packets for conditional branches
◦ IP packets for indirect branches
◦ Timestamp packets
◦ …
 Binary is needed to recreate the instruction trace
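The point that the binary is needed can be illustrated with a toy replay. This is only a sketch: the block table, the addresses and the tuple-shaped packets are invented for illustration and are not the real Intel PT encoding. It shows why direct jumps need no packet, while conditional and indirect branches each consume one.

```python
# Toy "binary": every basic block ends in one kind of branch.
# Addresses and block kinds are hypothetical, chosen only for this example.
BLOCKS = {
    0x10: ("cond", 0x20, 0x30),   # conditional: (target if taken, fall-through)
    0x20: ("jmp", 0x40),          # direct jump: target is known from the binary
    0x30: ("indirect",),          # indirect branch: target comes from an IP packet
    0x40: ("end",),
    0x50: ("end",),
}

def replay(start, packets):
    """Rebuild the executed block sequence from a simplified packet stream."""
    packets = iter(packets)
    path, addr = [], start
    while True:
        path.append(addr)
        kind, *targets = BLOCKS[addr]
        if kind == "end":
            return path
        if kind == "jmp":
            addr = targets[0]                      # no packet consumed
        elif kind == "cond":
            name, taken = next(packets)            # one taken / not-taken bit
            assert name == "TNT"
            addr = targets[0] if taken else targets[1]
        else:
            name, target = next(packets)           # target address packet
            assert name == "TIP"
            addr = target
```

Without `BLOCKS` (the stand-in for the binary), the packet stream alone would not say where each taken or not-taken decision leads.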
[Figure: decoded packet listing, annotated with “call to foo” and “branch taken / not taken”]
Decoded Trace Packets
 User and/or kernel tracing
 Filter by process
 Starting or stopping the trace based on address
ranges (only in later processors)
Configuration options
 Atom processors supporting RTIT – tracing guests
possible, but not the hypervisor
 Broadwell – no support at all
 Skylake – full support
Tracing VM guests and hypervisors
[Figure: Intel PT output + traced program’s binary = instruction trace]
Intel PT output
 Linux kernel 4.1 comes with integrated PT support
 Linux kernel 4.3 supports tracing using perf user tools
 An open source PT decoding library – libipt
 GDB 7.10 supports using PT for tracing
 simple-pt – an open source implementation of PT on Linux
(used to create the trace pictures on the previous slide)
* processor supporting PT included separately ;)
Want to use Processor Trace right now? *
Exploitation and the NX Bit
[Figure: a pdf file that displays “Hi!” and carries embedded shellcode]
 When pdf is opened, the shellcode will be
in memory that isn’t executable – NX bit
 How do attackers run the code to make
their shellcode executable?
◦ Use code that is already executable (the
program’s code)
 This exploitation technique comes in
many forms, most notably, ROP – Return
Oriented Programming
 Using executable memory already in the
program usually involves moving
around the process rather strangely,
for example:
◦ Not returning to a function’s caller
◦ Calling addresses in the middle of functions,
instead of at the beginning
◦ …
“Jump Around, Jump around…” / House of Pain
[Figure: the same pdf file with embedded shellcode]
 Establish rules for how the code flows in the process
◦ Functions return to their callers
◦ Calls are made to the beginning of functions
◦ …
 How can those rules be enforced?
◦ Add rule checking to the program’s binary
◦ Trace the program while running and go over the log (this work)
◦ Use other CPU features to detect “surprising” branches
“Control Flow Integrity Principles, Implementations, and Applications”, Abadi,
Budiu, Erlingsson, Ligatti, 2005
Control Flow Integrity (CFI)
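As a rough illustration of the rule "calls are made to the beginning of functions", the sketch below checks a list of observed calls against a set of known function entry points. The addresses and the `(call_site, target)` event shape are assumptions made up for this example.

```python
# Hypothetical function entry addresses, e.g. recovered from symbols or a CFG.
FUNCTION_ENTRIES = {0x1000, 0x1400, 0x2000}

def check_calls(call_events):
    """Return the (call_site, target) pairs whose target is not a function entry."""
    return [(src, dst) for src, dst in call_events if dst not in FUNCTION_ENTRIES]
```

A call landing at 0x1404, four bytes into a function, would be reported, while a call to 0x1400 would pass.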
 “Security Breaches as PMU Deviation”, Yuan, Xing, Chen, Zang 2011
 “kBouncer: Efficient and Transparent ROP Mitigation” – Pappas, Winner of Microsoft
BlueHat competition 2012, uses previous CPU branch tracing capabilities
 “CFIMon: Detecting Violation of Control Flow Integrity using Performance Counters” –
Xia, Liu, Chen, Zang 2012
 “Taming ROP on Sandy Bridge”, Wicherski of Crowdstrike, 2013
 “Transparent ROP Detection using CPU Performance Counters”, Li, Crouse, THREADS
2014
 and more…
Prior Work
 Anti exploitation system to scan files based on CFI
(think pdf on Adobe Reader)
 Detects whether “illegal” returns were made, like in
ROP
◦ Easy to add other CFI mitigations, such as checking the
targets of calls (no calls to the middle of functions, …)
 (Soon to be) Open Source
 Developed in 2015
Our Implementation
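Detecting "illegal" returns from a trace can be sketched with a shadow stack kept by the verifier. This is a simplified illustration, not the authors' code: the event tuples are an assumed format, and the special cases in which functions legitimately do not return to their callers are ignored here.

```python
def find_illegal_returns(events):
    """events: ("call", return_address) or ("ret", target), in trace order.

    Returns a list of (event_index, expected_target, actual_target) for every
    return that does not go back to the most recent unmatched call.
    """
    shadow, violations = [], []
    for index, (kind, addr) in enumerate(events):
        if kind == "call":
            shadow.append(addr)                 # remember where this call should return
        elif kind == "ret":
            expected = shadow.pop() if shadow else None
            if addr != expected:
                violations.append((index, expected, addr))
    return violations
```

A ROP chain shows up as a run of returns with no matching calls, so the shadow stack empties and every further return is flagged.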
Verifying CFI via Processor Trace
 Was the flow OK?
 Just follow the arrows and
calls using the PT generated
packets
What information is needed to follow the
execution and verify it?
 Control Flow Graph (CFG)
◦ Location of functions
◦ Location of basic blocks
◦ …
 Need this for all the libraries loaded
by the process – Adobe Reader dlls,
Windows dlls
◦ If not – false positives
 All we have is debugging symbols,
pdb files, for the Windows binaries
 We used IDA to recover the CFG
 IDA didn’t do a good enough job
◦ Part of the functions and basic blocks in Adobe Reader
/ Windows binaries weren’t detected
Static Analysis
 When supporting a new version of Adobe Reader, IDA is used
to get the initial CFG (static analysis)
 Afterwards, many pdf files are traced with PT
◦ When a new basic block or function is discovered while following the
trace – the CFG is updated
 Repeat
◦ Run IDA again, seeded with the updated CFG
◦ Trace the pdf files again and follow them on IDA’s output
◦ If the CFG was updated in the last iteration, repeat
Dynamic Analysis
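The iterate-until-stable loop can be sketched as a fixpoint computation. Everything here is a stand-in: `static_closure` plays the role of the disassembler (it only follows direct edges from known seeds), and a trace is just a list of block addresses. The real pipeline uses IDA and PT traces.

```python
def static_closure(seeds, static_edges):
    """Stand-in for static analysis: all blocks reachable over direct edges."""
    known, todo = set(), list(seeds)
    while todo:
        addr = todo.pop()
        if addr not in known:
            known.add(addr)
            todo.extend(static_edges.get(addr, []))
    return known

def refine_cfg(seeds, static_edges, traces):
    """Alternate static analysis and trace following until nothing new appears."""
    seeds = set(seeds)
    rounds = 0
    while True:
        rounds += 1
        known = static_closure(seeds, static_edges)
        new = set()
        for trace in traces:
            for addr in trace:
                if addr not in known:
                    new.add(addr)       # first block the current CFG cannot explain
                    break               # stop following this trace for this round
        if not new:
            return known, rounds
        seeds |= new                    # feed the discoveries back to static analysis
```

Each round either discovers at least one new block or terminates, so the loop ends once the traces are fully explained by the CFG.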
 Most of the edges in the CFG are:
◦ Calls relative to the current IP (no
packet for those)
◦ Conditional branches
 When traversing the CFG during
trace verification, fetching the next
node in these cases has to be (very)
fast
 Since the CFG is fixed and built in
preprocessing, this isn’t a problem
Optimization
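One way to make "fetching the next node" cheap is to precompute a flat successor table, so each conditional branch costs one indexed lookup during verification. The table layout below is an assumption for illustration, not the layout used by the implementation described in these slides.

```python
# node index -> (successor if taken, successor if not taken).
# Built once in preprocessing. An edge with a single possible target
# (such as a direct call) stores the same successor in both slots.
SUCCESSORS = [(1, 2), (3, 3), (0, 3), (3, 3)]

def walk(start, tnt_bits):
    """Follow taken / not-taken bits through the precomputed table."""
    node = start
    for taken in tnt_bits:
        node = SUCCESSORS[node][0 if taken else 1]
    return node
```

No disassembly happens inside `walk`; all of that cost was paid when the table was built.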
 Ideally, no disassembly and CFG modification (slow)
would be done during verification
 However, some of the code analyzed is created
dynamically – as long as it doesn’t change, this can be
dealt with in preprocessing
 In cases where it changes every time “Adobe Reader” is
run to open a file, preprocessing isn’t enough
◦ code is disassembled and CFG is updated
Optimization
 Following the execution trace is done on a per
thread basis
 How to know which thread was executing at
each part of the trace?
◦ PT packets carry timing information, but identify
only the current process, not the current thread
Thread information
 Event Tracing for Windows (ETW)
◦ It should be possible to get the thread context switch
times, as TSC values, from the CSwitch events provided
by ETW
◦ These timestamps could then be synchronized with the TSC
packets from PT to determine which thread was
running in different parts of the trace
Thread Information
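Once context switch times and trace timestamps share a clock, attributing a point in the trace to a thread is a sorted lookup. The sketch below assumes the switch events were already reduced to `(tsc, thread_id)` pairs sorted by time; that shape is an illustration, not the ETW record format.

```python
import bisect

def thread_at(cswitch_events, tsc):
    """Return the thread running at `tsc`, or None before the first switch.

    cswitch_events: list of (tsc, thread_id) pairs, sorted by tsc, where each
    entry means "thread_id was switched in at this timestamp".
    """
    times = [t for t, _ in cswitch_events]
    i = bisect.bisect_right(times, tsc) - 1     # last switch at or before tsc
    return cswitch_events[i][1] if i >= 0 else None
```

For many lookups the `times` list would be built once rather than on every call.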
 What about getting a callback every time a thread in the
traced process is switched in?
◦ As far as we know, there is no direct way
◦ We hooked the Windows context switch function - don’t do that
◦ Endgame presented a way to achieve this via Asynchronous
Procedure Calls (Black Hat 2016)
Thread Information
 Need to know the executable memory ranges at all
points in the trace – what modules are loaded
 Knowing when the PT trace reached ntdll!LdrLoadDll
and ntdll!LdrUnloadDll isn’t enough
◦ Module name is needed to update the current memory map
 ETW was used to retrieve the name and time (TSC) of each
module load / unload; these are then synchronized with the
times of the load / unload functions in the trace
Module load / unload
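Keeping the memory map in step with the trace can be sketched by replaying load / unload events up to a given timestamp. The event tuple layout, module names and addresses are assumptions for this example.

```python
def modules_at(events, tsc):
    """Replay load / unload events up to `tsc` and return {name: (start, end)}.

    events: (tsc, "load" | "unload", name, base, size) tuples, sorted by tsc.
    """
    loaded = {}
    for when, kind, name, base, size in events:
        if when > tsc:
            break
        if kind == "load":
            loaded[name] = (base, base + size)
        else:
            loaded.pop(name, None)
    return loaded

def module_for(events, tsc, addr):
    """Name of the module containing `addr` at time `tsc`, or None."""
    for name, (start, end) in modules_at(events, tsc).items():
        if start <= addr < end:
            return name
    return None
```

The same address can belong to different modules, or to none, at different points in the trace, which is why the lookup takes a timestamp.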
 For example:
◦ Exception dispatching code
◦ User mode callbacks
◦ …
 While going over the trace, whenever a suspected mismatch
occurs, the above special cases are checked via binary
signatures
 This mostly needs to be done per operating system, not
per-application
Still not done – functions don’t always return
to their callers
 (almost entirely) Not dealt with by our implementation
 To decode a PT trace, the code that was executed is needed
 One obvious problem is pages that get written to and
executed from simultaneously
 (maybe) One could remove the write permission every
time a page becomes writable and executable and handle
the access violation when it gets written to, in order to
obtain the code’s new version
Dynamically generated code
 A case of dynamically generated code that was dealt with:
 Applications that hook themselves… with identical
hooks, at the same locations and same time
 To the trace verifier, the code is essentially static
Dynamically generated code
 Benign, non malicious files
◦ Run on 10000 pdf, 3000 ppt/x, 3000 doc/x
without false positives
 Malicious files containing a ROP chain
◦ Run on 5 such files, detecting the exploit and
displaying the CFI violation
Scanning Results
 you’d still need
◦ Module load / unload information
◦ Thread context switch times
 but could somewhat do without
◦ The CFG – a partial CFG can be built from the trace (it
doesn’t need to be built in advance)
Forget CFI and anti-exploitation…
What if I just want to trace a process quickly
with Processor Trace?
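Building a partial CFG from the trace itself can be as simple as recording every observed transition between blocks. This sketch assumes the trace has already been decoded into a sequence of block addresses.

```python
from collections import defaultdict

def cfg_from_trace(block_sequence):
    """Collect the edges actually taken: {source_block: {destination_blocks}}."""
    edges = defaultdict(set)
    for src, dst in zip(block_sequence, block_sequence[1:]):
        edges[src].add(dst)
    return dict(edges)
```

The result only contains edges that were executed, so it is partial by construction.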
 Control-flow Enforcement Technology announced
by Intel June 2016. Release date ?
 Processors will directly support:
◦ Shadow (call) stack tracking – mismatched return
→ control protection exception
◦ Indirect branch tracking – an indirect branch to a
target containing an instruction other than
ENDBRANCH → control protection fault
Coming soon to a motherboard near you
 ARM has a feature similar to Processor Trace called
CoreSight
 Tracing on Linux has been integrated with perf
 Open source decoding library exists – OpenCSD
http://www.linaro.org/blog/core-dump/coresight-perf-and-the-opencsd-library/
What about tracing quickly on ARM?
 “Control Jujutsu” – Evans, Long, Otgonbaatar, Shrobe, Rinard,
Okhravi, Sidiroglou-Douskos, CCS 2015
 Uses indirect call sites with controllable targets and
arguments (via vulnerability) to achieve arbitrary code
execution (e.g., call exec or system)
 Bypasses CFI because the target functions are legal in the
CFG
Bypassing CFI
 “Write Once, Pwn Anywhere”, Yu, Black Hat USA 2014
◦ Sometimes applications have security critical
information in one variable
◦ Pseudo-code from Internet Explorer’s JavaScript engine:
if ((safemode & 0xB) == 0) {
    turn_on_god_mode();
}
Bypassing CFI with “data attacks”
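Why a CFI verifier cannot see this kind of attack can be shown with a small model. The function and step names below are invented for illustration. Both outcomes follow edges that exist in the program, so the only difference between the benign and the attacked run is the value of one variable.

```python
def handle_script(safemode):
    """Return the sequence of steps executed for a given safemode value."""
    steps = ["check_safemode"]
    if (safemode & 0xB) == 0:            # the intended check from the pseudo-code
        steps.append("turn_on_god_mode")
    steps.append("run_script")
    return steps

# Normal run: restricted bits are set, the privileged branch is skipped.
normal = handle_script(0xB)
# After a memory-corruption bug zeroes the variable, the privileged branch
# runs, yet every edge taken is still a legal edge of the CFG.
attacked = handle_script(0x0)
```

No return address or call target was touched, so neither a shadow stack nor a call-target check would report a violation.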
 “Control Flow Bending”, Carlini, Barresi, Payer, Wagner, Gross,
USENIX 2015
◦ printf-oriented-programming – if you control the
arguments, printf can do arbitrary computation
Bypassing CFI with “data attacks”
 “Data oriented programming” – Hu, Shinde, Sendroiu,
Chua, Saxena, Liang, S&P 2016
 goal: perform arbitrary computation while adhering
to the CFG
 Similar to ROP in spirit – use parts of the original
program as “instructions” of a “VM” controlled by
the attacker
 “data gadgets” are used to perform computation on
data
Bypassing CFI with “data attacks”
 gadgets are executed one after the other by using
constructs already in the vulnerable program – such
as loops
 the vulnerability being exploited is used to determine
which data gadget gets run and on what data
“data oriented programming” (cont)
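The idea of chaining data gadgets through a loop that already exists in the program can be modelled in a few lines. This is a toy under stated assumptions: the request format and the two operations are made up, and a real attack would steer existing code through corrupted memory rather than through a clean argument.

```python
def vulnerable_loop(requests):
    """A benign-looking dispatch loop; every branch is a legal CFG path.

    requests stands for data the attacker controls through the vulnerability:
    it selects which "data gadget" runs on each iteration and on what data.
    """
    mem = {"a": 0, "b": 0}
    for op, dst, src in requests:
        if op == "load":                 # data gadget: assignment
            mem[dst] = src
        elif op == "add":                # data gadget: addition
            mem[dst] += mem[src]
    return mem

# The attacker's "program": three requests that compute 5 + 7 in mem["a"].
result = vulnerable_loop([("load", "a", 5), ("load", "b", 7), ("add", "a", "b")])
```

The loop is the dispatcher of the attacker's "VM" and the branches are its instructions; control flow never leaves the CFG.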
any questions?


[CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman


Editor's Notes

  • #20: Optimization? What has it run on? To be released during 2016