Schizophrenic 
files 
Ange Albertini 
MetaRheinMainConstructionDays 
MRMCD 
5-7 September 2014 
HS Darmstadt 
www.mrmcd.net 
2014/09/05
This talk is a collaboration with: 
Gynvael Coldwind 
Security researcher, Google 
Dragon Sector captain 
likes hamburgers 
http://gynvael.coldwind.pl/
Ange Albertini 
reverse engineering & 
visual documentation 
@angealbertini 
ange@corkami.com 
http://www.corkami.com
Schizophrenic files v2
1 file, 2 programs 
⇒ 2 different contents 
No active detection of the program in the file
Fooling, 
not failing 
Both programs will load the file correctly: 
No reported warning or error, no exploitation.
Abusing parsers for 
● fun 
● bypassing security 
○ same-origin policy 
○ evade detection 
○ exfiltration 
○ signing 
■ Android Master Key
ZIP
excerpt from Gynvael's talk: 
"Dziesięć tysięcy pułapek: ZIP, RAR, etc." 
(http://gynvael.coldwind.pl/?id=523)
ZIP archives
ZIP structures are parsed from the end.
File names are actually duplicated.
Why this weird structure?
ZIP archives were commonly read & written on the fly over multiple floppies.
Minimize floppy swaps 
● creation 
a. create one LFH per file 
Floppy full ⇒ start a new LFH on the next floppy 
b. when all files are finished, write CDs sequence 
(1/file) 
c. when all CDs are written, write the EoCD 
● extraction 
a. insert last floppy (contains the EoCD) 
b. insert the floppy with 1st CD 
(often, the last floppy contains EoCD + all CDs) 
c. insert the corresponding LFH’s first floppy 
insert next floppies if required
ZIP was very useful, 
but now it’s awkward. 
Newer archive formats are parsed top-down.
Position in the file
Prepended and appended data is tolerated...
...but not too much! 
(for obvious performance reasons)
Duplicating the (relatively small) EoCD increases compatibility.
Scanning direction
If you concatenate 2 archives...
...if you parse bottom-up (standard), you find the 2nd one...
...but you will get the other archive if you parse top-down.
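The effect of the scanning direction can be reproduced with a minimal sketch. It assumes Python's standard zipfile module stands in for a bottom-up parser, and a hypothetical helper, first_lfh_name, stands in for a top-down one; the file names are made up for the demo.

```python
import io
import struct
import zipfile

def make_zip(name, data):
    """Build a one-file ZIP archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, data)
    return buf.getvalue()

def first_lfh_name(blob):
    """Top-down: read the file name of the Local File Header at offset 0."""
    if blob[:4] != b"PK\x03\x04":
        raise ValueError("no Local File Header at offset 0")
    (name_len,) = struct.unpack_from("<H", blob, 26)  # name length field
    return blob[30:30 + name_len].decode("ascii")

def bottom_up_names(blob):
    """Bottom-up: zipfile starts from the End of Central Directory."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.namelist()

# Two complete archives, simply concatenated.
blob = make_zip("first.txt", b"1") + make_zip("second.txt", b"2")

print(first_lfh_name(blob))   # name taken from the first archive
print(bottom_up_names(blob))  # names taken from the second archive
```

Both helpers accept the same bytes without error, yet they report different contents.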
Superfluous headers
You could parse everything nicely...
...but in the end, only the Local File Headers matter.
1 file = 1 Local File Header
Since most ZIP archives start with a sequence of Local File Headers...
You can parse them top-down (until a break) and ignore the CD and EoCD.
Standard parsing: 
bottom-up + all headers
“Efficient” parsing: 
top-down + LFHs only 
Not standard, but good enough in most cases.
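As a rough illustration of the "efficient" approach, the sketch below walks Local File Headers from offset 0 until the signature stops matching, and never looks at the CD or EoCD. The helper name walk_lfhs and the sample file names are assumptions for the demo; the header layout (30 fixed bytes, then name, extra field and data) follows the ZIP format.

```python
import io
import struct
import zipfile

LFH_SIG = b"PK\x03\x04"
# signature, version, flags, method, time, date, crc, csize, usize, name_len, extra_len
LFH = struct.Struct("<4s5H3I2H")

def walk_lfhs(blob):
    """Top-down, LFHs only: stop at the first thing that is not an LFH."""
    names, pos = [], 0
    while pos + LFH.size <= len(blob) and blob[pos:pos + 4] == LFH_SIG:
        fields = LFH.unpack_from(blob, pos)
        csize, name_len, extra_len = fields[7], fields[9], fields[10]
        start = pos + LFH.size
        names.append(blob[start:start + name_len].decode("ascii"))
        # Skip name, extra field and compressed data to reach the next header.
        pos = start + name_len + extra_len + csize
    return names

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", b"alpha")
    zf.writestr("b.txt", b"beta")
blob = buf.getvalue()

print(walk_lfhs(blob))
print(zipfile.ZipFile(io.BytesIO(blob)).namelist())
```

On an ordinary archive both listings agree. The shortcut breaks when the sizes are not stored in the LFH (data descriptor flag set), which is one reason it is only "good enough in most cases".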
Nowadays, most ZIPs are 
a sequence of LFHs 
from the start
ZIP Archive comment
The EoCD contains an optional comment field...
...that can contain a complete archive!
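A small sketch of the idea, assuming Python's zipfile is used only to build the files: an inner archive is stored verbatim as the comment of an outer one. The names inner.txt and outer.txt are made up. The resulting file contains two EoCD records, so a parser that trusts the first one and a parser that trusts the last one may disagree.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"

def make_zip(name, data, comment=b""):
    """Build a one-file ZIP archive in memory, with an optional archive comment."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, data)
        zf.comment = comment
    return buf.getvalue()

inner = make_zip("inner.txt", b"hidden")
outer = make_zip("outer.txt", b"visible", comment=inner)

first_eocd = outer.find(EOCD_SIG)    # EoCD of the outer archive
last_eocd = outer.rfind(EOCD_SIG)    # EoCD of the archive inside the comment
(comment_len,) = struct.unpack_from("<H", outer, first_eocd + 20)

print(comment_len == len(inner))           # the comment field holds the inner archive
print(outer[first_eocd + 22:] == inner)    # ...byte for byte
print(last_eocd > first_eocd)              # two EoCDs in one file
```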
Recap 
● Parsing direction: 
○ standard is bottom-up 
○ parsing LFHs from the start would work in most cases 
● ZIP should be located near the end of the file 
○ or at least, its EoCD 
● An archive comment can contain another 
complete archive
Let's test the parsers! 
abstract.zip
4 LFHs, 4 ways to parse this archive:
1/ you parse it bottom-up
2/ you parse it top-down
3/ look for LFHs from the start (until a break)
4/ scan for LFHs aggressively (you get all four)
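The aggressive strategy can be sketched as a plain signature search. The test file below is not abstract.zip: it is a hypothetical stand-in made of two concatenated two-file archives, and scan_lfh_names is an assumed helper name.

```python
import io
import struct
import zipfile

LFH_SIG = b"PK\x03\x04"

def make_zip(files):
    """Build an in-memory ZIP archive from a list of (name, data) pairs."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files:
            zf.writestr(name, data)
    return buf.getvalue()

def scan_lfh_names(blob):
    """Report every Local File Header signature found anywhere in the file."""
    names, pos = [], blob.find(LFH_SIG)
    while pos != -1:
        (name_len,) = struct.unpack_from("<H", blob, pos + 26)
        names.append(blob[pos + 30:pos + 30 + name_len].decode("ascii"))
        pos = blob.find(LFH_SIG, pos + 4)
    return names

blob = make_zip([("a.txt", b"1"), ("b.txt", b"2")]) + \
       make_zip([("c.txt", b"3"), ("d.txt", b"4")])

print(scan_lfh_names(blob))                            # all four entries
print(zipfile.ZipFile(io.BytesIO(blob)).namelist())    # bottom-up: last archive only
```

A real scanner would also have to deal with false positives, since the signature bytes can occur inside file data.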
Portable Document Format
http://youtu.be/JQrBgVRgqtc?t=11m15s 
https://speakerdeck.com/ange/pdf-secrets-hiding-and-revealing-secrets-in-pdf-documents?slide=44
PDF Trick #1 
trailers
trailer ⇒ root object ⇒ complete document
… 
… 
… 
a line comment - a correct trailer - a corrupted trailer
Each reader sees a different trailer.
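A toy model of the trick, not the behavior of any specific reader: the three lookup strategies below are hypothetical, and the document is reduced to three trailer lines (one inside a comment, one correct, one missing its closing >>). Each strategy ends up with a different root object.

```python
import re

DOC = (
    b"%PDF-1.4\n"
    b"% trailer << /Root 1 0 R >>\n"   # trailer hidden in a line comment
    b"trailer << /Root 2 0 R >>\n"     # correct trailer
    b"trailer << /Root 3 0 R\n"        # corrupted trailer (no closing >>)
    b"%%EOF\n"
)

ANY_TRAILER = re.compile(rb"trailer\s*<<\s*/Root\s+(\d+)\s+0\s+R")
STRICT_TRAILER = re.compile(rb"trailer\s*<<\s*/Root\s+(\d+)\s+0\s+R\s*>>")

def first_anywhere(doc):
    """Naive search: first 'trailer' keyword, comments included."""
    return int(ANY_TRAILER.search(doc).group(1))

def strict_outside_comments(doc):
    """Skip comment lines and require a well-formed dictionary."""
    lines = [ln for ln in doc.splitlines() if not ln.startswith(b"%")]
    return int(STRICT_TRAILER.search(b"\n".join(lines)).group(1))

def last_tolerant(doc):
    """Take the last trailer and tolerate the missing '>>'."""
    return int(ANY_TRAILER.findall(doc)[-1])

print(first_anywhere(DOC), strict_outside_comments(DOC), last_tolerant(DOC))
```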
PDF parsing 
Each reader sees a completely different 
document 
3 co-existing documents, all parsed through 
Viewers' tolerance means foreign elements are 
ignored
Also available in PDF/A flavor 
(OK for Adobe Reader, but not for Preflight)
sometimes, 
it’s in the specs... 
...but who knows all of them? 
(obscurity via over-specification)
Notice anything unusual?
This document contains layers. 
(an advanced feature)
What you see is not what you’ll get...
PDF Layers 1/2 
“Optional Content Configuration” 
● principles 
○ define layered content via various /Forms 
○ enable/disable layers on viewing/printing 
● no warning when printing 
● “you can see the preview!” 
○ bypass preview by keeping page 1 unchanged 
○ just do a minor change in the file
PDF Layers 2/2 
● it’s Adobe only 
○ what’s displayed varies with readers 
○ could be hidden via previous schizophrenic trick 
● it was in the specs all along 
○ very rarely used 
○ can be abused with no warning
BMP
BMP 
A pointer to some information that usually comes next… 
What could go wrong...
BMP Trick #1: ignoring the data pointer 
getting data right after 
the header 
getting data 
via the pointer 
(standard)
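A minimal sketch of the two readers, assuming a hand-built 1x1, 24-bit BMP in which a decoy pixel sits right after the headers and the data pointer (bfOffBits) skips over it. The function names and the pixel colors are illustrative.

```python
import struct

FILE_HEADER = struct.Struct("<2sIHHI")        # 14 bytes, last field is the data pointer
INFO_HEADER = struct.Struct("<IiiHHIIiiII")   # 40 bytes (BITMAPINFOHEADER)

def build_bmp():
    decoy = bytes([255, 0, 0, 0])   # blue pixel (B, G, R) + 1 padding byte
    real = bytes([0, 0, 255, 0])    # red pixel (B, G, R) + 1 padding byte
    off_bits = FILE_HEADER.size + INFO_HEADER.size + len(decoy)
    file_header = FILE_HEADER.pack(b"BM", off_bits + len(real), 0, 0, off_bits)
    info_header = INFO_HEADER.pack(40, 1, 1, 1, 24, 0, len(real), 2835, 2835, 0, 0)
    return file_header + info_header + decoy + real

def pixel_via_pointer(bmp):
    """Standard reader: follow the data pointer from the file header."""
    off_bits = FILE_HEADER.unpack_from(bmp)[4]
    b, g, r = bmp[off_bits:off_bits + 3]
    return (r, g, b)

def pixel_after_headers(bmp):
    """Shortcut reader: assume pixel data starts right after the headers."""
    (info_size,) = struct.unpack_from("<I", bmp, FILE_HEADER.size)
    start = FILE_HEADER.size + info_size
    b, g, r = bmp[start:start + 3]
    return (r, g, b)

bmp = build_bmp()
print(pixel_via_pointer(bmp))     # (255, 0, 0): red
print(pixel_after_headers(bmp))   # (0, 0, 255): blue
```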
Trick #2: 
Run-Length Encoding
BMP RLE trick 
RLE structure (each box is 1 byte) 
  Length >0 | Palette Index (color) 
  Length 0  | End of Line 0 
  Length 0  | End of Bitmap 1 
  Length 0  | Move Cursor 2 | X offset | Y offset 
  Length 0  | RAW Length >2 | Palette Index (color) | Palette Index (color) | ...
BMP RLE trick 
If you just skip pixels, what is their color? 
  Length 0  | End of Line 0 
  Length 0  | End of Bitmap 1 
  Length 0  | Move Cursor 2 | X offset | Y offset
Option 1 
The missing data will be filled with background color. 
(palette index 0)
Option 2 
The missing data will be black.
Option 3 
The missing data will be transparent.
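The ambiguity can be shown with a simplified RLE8 decoder sketch in which the color of skipped pixels is an explicit parameter. The function name decode_rle8 and the fill parameter are assumptions; rows are kept in stream order for simplicity, and black is not modeled separately because it depends on the palette.

```python
def decode_rle8(stream, width, height, fill):
    """Decode BMP RLE8 data; pixels never written keep the value 'fill'."""
    canvas = [[fill] * width for _ in range(height)]
    x = y = i = 0
    while i + 1 < len(stream):
        count, value = stream[i], stream[i + 1]
        i += 2
        if count > 0:                       # run: 'count' pixels of color 'value'
            for _ in range(count):
                if x < width and y < height:
                    canvas[y][x] = value
                x += 1
        elif value == 0:                    # End of Line
            x, y = 0, y + 1
        elif value == 1:                    # End of Bitmap
            break
        elif value == 2:                    # Move Cursor: X offset, Y offset
            x += stream[i]
            y += stream[i + 1]
            i += 2
        else:                               # RAW: 'value' literal palette indices
            for index in stream[i:i + value]:
                if x < width and y < height:
                    canvas[y][x] = index
                x += 1
            i += value + (value & 1)        # literal runs are padded to 2 bytes
    return canvas

# 2 pixels of color 5, skip 2 pixels, 1 pixel of color 7, End of Bitmap.
stream = bytes([2, 5, 0, 2, 2, 0, 1, 7, 0, 1])

print(decode_rle8(stream, 5, 1, fill=0))     # option 1: palette index 0
print(decode_rle8(stream, 5, 1, fill=None))  # option 3: transparent (no value)
```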
PNG 
Portable Network Graphics
Combined data + 2 palettes 
Same data chunk combining 2 images via 2 palettes 
cute PoC by @reversity 
“There shall not be more than one PLTE chunk”
Different images depending on which PLTE chunk is used
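A sketch of the palette choice, assuming a hand-built 1x1 indexed PNG with two PLTE chunks (which the specification forbids). The helpers chunk, iter_chunks and palettes are illustrative names; a decoder that keeps the first PLTE and one that keeps the last would color the same pixel differently.

```python
import struct
import zlib

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def chunk(ctype, data):
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)

def iter_chunks(png):
    """Yield (type, data) for every chunk after the signature."""
    pos = len(PNG_SIG)
    while pos + 8 <= len(png):
        length, ctype = struct.unpack_from(">I4s", png, pos)
        yield ctype, png[pos + 8:pos + 8 + length]
        pos += 12 + length

def palettes(png):
    """Return the first RGB entry of every PLTE chunk, in file order."""
    return [tuple(data[:3]) for ctype, data in iter_chunks(png) if ctype == b"PLTE"]

ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 3, 0, 0, 0)   # 1x1, 8-bit, indexed color
png = (
    PNG_SIG
    + chunk(b"IHDR", ihdr)
    + chunk(b"PLTE", bytes([255, 0, 0]))              # palette A: index 0 is red
    + chunk(b"PLTE", bytes([0, 0, 255]))              # palette B: index 0 is blue
    + chunk(b"IDAT", zlib.compress(b"\x00\x00"))      # filter byte + pixel index 0
    + chunk(b"IEND", b"")
)

found = palettes(png)
print(found[0])    # color of the pixel if the first PLTE wins
print(found[-1])   # color of the pixel if the last PLTE wins
```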
Portable Executable
the PE Loader 
PE = complex + badly documented 
● fail or fool external tools? too easy... 
● fooling Windows is much harder: 
○ Windows’ loader usually closes holes 
⇒ older PEs just stop working
PE Trick #1 
Data directory loading order
Pointing the TLS AddressOfIndex (AoI) to an Import descriptor
W7: TLS is loaded first ⇒ the value at AoI is set to 0 
⇒ the Import descriptor sequence is truncated before imports are loaded
XP: Imports are loaded first - all descriptors are parsed 
TLS is then parsed - the descriptors are not relevant anymore
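The ordering effect can be modeled with a toy loader, which is not the Windows implementation. The assumptions here: the import table is a run of 20-byte descriptors ended by one whose Name field is 0, AddressOfIndex points at the Name field of the second descriptor, and the TLS index written is 0. All function names and RVAs are hypothetical.

```python
import struct

# OriginalFirstThunk, TimeDateStamp, ForwarderChain, Name, FirstThunk
DESCRIPTOR = struct.Struct("<5I")
NAME_FIELD = 12   # byte offset of Name inside a descriptor

def build_import_table(name_rvas):
    """Toy import table: one descriptor per DLL name RVA, then a zero terminator."""
    table = bytearray()
    for rva in name_rvas:
        table += DESCRIPTOR.pack(0, 0, 0, rva, 0)
    table += bytes(DESCRIPTOR.size)
    return table

def parse_imports(table):
    """Toy import loader: stop at the first descriptor whose Name is 0."""
    names = []
    for off in range(0, len(table), DESCRIPTOR.size):
        name_rva = DESCRIPTOR.unpack_from(table, off)[3]
        if name_rva == 0:
            break
        names.append(name_rva)
    return names

def write_tls_index(table, address_of_index, index=0):
    """Toy TLS loader: store the TLS index at AddressOfIndex."""
    struct.pack_into("<I", table, address_of_index, index)

def load(order):
    table = build_import_table([0x2000, 0x2010, 0x2020])
    aoi = 1 * DESCRIPTOR.size + NAME_FIELD   # Name field of the 2nd descriptor
    if order == "tls_first":
        write_tls_index(table, aoi)
        return parse_imports(table)
    imports = parse_imports(table)
    write_tls_index(table, aoi)
    return imports

print([hex(n) for n in load("tls_first")])       # truncated after the 1st descriptor
print([hex(n) for n in load("imports_first")])   # all three descriptors
```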
PE Trick #2 
Relocations
Relocations: 
patching absolute addresses 
to solve address space conflicts
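As a minimal sketch of what a relocation does, the code below applies the common 32-bit type (HIGHLOW, type 3, which the slides do not discuss) to a toy image: the difference between the actual and the preferred base is added to each listed absolute address. The function name, offsets and addresses are made up for the example.

```python
import struct

def apply_highlow(image, reloc_offsets, preferred_base, actual_base):
    """Add the load delta to every 32-bit absolute address listed for relocation."""
    delta = (actual_base - preferred_base) & 0xFFFFFFFF
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)
    return image

# Toy image: one absolute address stored at offset 4.
image = bytearray(8)
struct.pack_into("<I", image, 4, 0x00401000)

# The preferred base 0x400000 is taken, so the image is mapped at 0x500000.
apply_highlow(image, [4], preferred_base=0x00400000, actual_base=0x00500000)
(patched,) = struct.unpack_from("<I", image, 4)
print(hex(patched))   # 0x501000
```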
Relocations types 
                                   XP       Vista    W8 
Type 4   HIGH_ADJ                  --       --       ✓ 
Type 9   MIPS_JMPADDR16 
         IA64_IMM64 
         MACHINE_SPEC_9            32 bit   64 bit   ✗
Relocations on relocations 
                                   XP       Vista    W8 
Type 4   HIGH_ADJ                  --       --       ✓ 
Type 9   MIPS_JMPADDR16 
         IA64_IMM64 
         MACHINE_SPEC_9            32 bit   64 bit   ✗ 
Type 10  DIR64                     ✓        ✓        ✓ 
as seen in 
PoC||GTFO #1
Relocation-based PE Schizophrenia
Julian Bangert, Sergey Bratus -- ELF Eccentricities 
https://www.youtube.com/watch?v=4LU6N6THh2U
GIF
GIF 
A GIF is made of blocks. 
if no animation speed is defined, 
they should all be displayed at once.
GIF 
If a frame speed is defined, then: 
first block = background 
next blocks = animation frames 
Background 
(from block 1) 
Frame 1 
(with block 2) 
Frame 2 
(with block 3)
GIF 
Frame 1 Frames 2-10001 
1x1 px 
Frame 10002 
1 complete pic + 10,000 pixels + 1 complete pic
Displaying all blocks at once (standard) 
Forcing animation (even if no frame speed is defined)
same-tool schizophrenia 
1 file + 1 tool = 2 behaviors 
(in different sub-components)
Because it was too simple... 
● WinRAR: viewing ⇔ extracting 
○ opening/failing 
○ opening/’nothing’ 
● Adobe: viewing ⇔ printing 
○ well, it’s a feature
Failures & Ideas
Failures & Ideas 
● screen ⇔ printer 
○ embedded color profiles? 
● JPG 
○ IrfanView vs the world 
● Video 
○ FLV: early data pointer, like BMP 
PoC: video fails but plays sound
PNG 
Various ancillary chunks (rendering level) 
● partially supported: 
○ gamma 
○ transparency (for palettes) 
● never supported? 
○ significant bits 
○ chromaticities 
● always supported? 
○ physical size
Conclusion
We tend to take our own shortcuts.
Conclusion 
● such a mess 
○ specs are messy 
■ unclear 
■ historical reasons 
○ parsers don’t even respect them 
(particularly when there is an easy shortcut) 
○ official tools “forced” to be tolerant 
■ They’re even trying to repair corrupted files (!) 
● no CVE/blaming for parsing errors? 
○ no security bug if no crash or exploit :(
Schizophrenia symptoms 
● different parsing (seeing different data) 
○ BMP: ignoring data pointer 
○ ZIP: different parsing algorithm & directions 
○ PE: different data directory loading order 
○ PDF: different trailer parsing 
● different interpretation (same data) 
○ GIF: ignoring animation speed 
○ BMP RLE: using different default color 
○ PE: different relocations implementation 
○ PNG: using different palette 
○ PDF: conditional layers
ACK 
@gynvael 
@reversity @travisgoodspeed @sergeybratus 
qkumba @internot @pdfkungfoo 
@j00ru ise ds vx, Mulander 
Felix Groebert, Salvation
@angealbertini 
corkami.com 
Damn, that's the second time those alien bastards shot up my ride!
