2014/06/02
Zürich, Switzerland
Schizophrenic files
Ange Albertini
Gynvael Coldwind
Area41
Gynvael Coldwind
Security researcher, Google
Dragon Sector captain
likes hamburgers
http://gynvael.coldwind.pl/
All opinions expressed during this presentation are mine and mine alone.
They are not opinions of my lawyer, barber and especially not my employer.
Ange Albertini
Reverse engineering & Visual Documentations
http://corkami.com
1 file + 2 tools
⇒ 2 different documents
No active detection in the file.
abusing parsers for
● fun
● bypassing security
○ same-origin policy
○ evade detection
○ exfiltration
○ signing
■ Android Master Key
ZIP archives
excerpt from Gynvael's talk:
"Dziesięć tysięcy pułapek: ZIP, RAR, etc." ("Ten thousand traps: ZIP, RAR, etc.")
(http://gynvael.coldwind.pl/?id=523)
ZIP
trick 1
a glitch in the matrix
file names in ZIP
a couple of files with the same name?
update:
for an awesome example see:
Android: One Root to Own Them All
Jeff Forristal / Bluebox
(https://media.blackhat.com/us-13/US-13-Forristal-Android-One-Root-to-Own-Them-All-Slides.pdf)
ZIP
trick 2
abstract kitty
Let's start with simple stuff -
the ZIP format
A ZIP file begins with letters PK.
WRONG
ZIP - second attempt :)
.zip file
last 65557 bytes of the file
the "header" is "somewhere" here
PK56...
ZIP - "somewhere" ?!
4.3.16 End of central directory record:
  end of central dir signature                                    4 bytes  (0x06054b50)
  number of this disk                                             2 bytes
  number of the disk with the start of the central directory      2 bytes
  total number of entries in the central directory on this disk   2 bytes
  total number of entries in the central directory                2 bytes
  size of the central directory                                   4 bytes
  offset of start of central directory with respect to
    the starting disk number                                      4 bytes
  .ZIP file comment length                                        2 bytes
  .ZIP file comment                                               (variable size)
You begin ZIP parsing from this record; it MUST be at the end of the file.
Fixed fields: 22 bytes
Comment: 0-65535 bytes ($0000-$FFFF)
Total: from 22 to 65557 bytes
(aka: PK56 magic will be somewhere between EOF-65557 and EOF-22)
ZIP - looking for the "header"?
"From the START"
Begin at EOF-65557,
and move forward.
"From the END"
(ZIPs usually don't have comments)
Begin at EOF-22,
and move backward.
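The two search directions can be sketched in a few lines. This is only an illustration under simple assumptions: the whole file is already in memory as `data`, the function names are made up for this example, and no candidate is validated beyond its 4-byte signature (real parsers check more, each in its own way).

```python
EOCD_SIG = b"PK\x05\x06"        # 0x06054b50 as stored on disk (little-endian)
EOCD_MIN = 22                   # EOCD with an empty comment
EOCD_MAX = 22 + 0xFFFF          # EOCD with the longest possible comment: 65557


def find_eocd_start_first(data):
    """Begin at EOF-65557 and move forward: the first signature wins."""
    if len(data) < EOCD_MIN:
        return None
    lo = max(0, len(data) - EOCD_MAX)
    pos = data.find(EOCD_SIG, lo, len(data) - EOCD_MIN + 4)
    return pos if pos != -1 else None


def find_eocd_end_first(data):
    """Begin at EOF-22 and move backward: the last signature wins."""
    if len(data) < EOCD_MIN:
        return None
    lo = max(0, len(data) - EOCD_MAX)
    pos = data.rfind(EOCD_SIG, lo, len(data) - EOCD_MIN + 4)
    return pos if pos != -1 else None
```

If a second EOCD is stored inside the comment of the first one, the two functions return different offsets, so the two strategies go on to read different central directories.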
The show will continue in a moment.
Larch
Something completely different
ZIP Format - LFH
4.3.7 Local file header:
local file header signature 4 bytes (0x04034b50)
version needed to extract 2 bytes
general purpose bit flag 2 bytes
compression method 2 bytes
last mod file time 2 bytes
last mod file date 2 bytes
crc-32 4 bytes
compressed size 4 bytes
uncompressed size 4 bytes
file name length 2 bytes
extra field length 2 bytes
file name (variable size)
extra field (variable size)
file data (variable size)
random stuff
PK34... LFH + data
Each file/directory in a ZIP has LFH + data.
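As a rough sketch, the local file header layout listed above maps directly onto a `struct` format string. The names `LocalFileHeader` and `parse_lfh` are invented for this example, and the sketch assumes the sizes are stored in the header itself (no data descriptor, no ZIP64).

```python
import struct
from typing import NamedTuple

LFH_SIG = b"PK\x03\x04"               # 0x04034b50 as stored on disk
# signature, 5 two-byte fields, 3 four-byte fields, 2 two-byte lengths
LFH_FMT = "<4s5H3I2H"
LFH_SIZE = struct.calcsize(LFH_FMT)   # 30 bytes of fixed fields


class LocalFileHeader(NamedTuple):
    name: bytes
    method: int
    crc32: int
    compressed_size: int
    uncompressed_size: int
    data_offset: int                  # where the file data starts


def parse_lfh(data, offset=0):
    """Decode the fixed LFH fields at `offset` and locate the file data."""
    (sig, _version, _flags, method, _time, _date,
     crc, csize, usize, name_len, extra_len) = struct.unpack_from(LFH_FMT, data, offset)
    if sig != LFH_SIG:
        raise ValueError("no local file header at this offset")
    name_start = offset + LFH_SIZE
    name = data[name_start:name_start + name_len]
    return LocalFileHeader(name, method, crc, csize, usize,
                           name_start + name_len + extra_len)
```

The file name and extra field lengths are what a parser needs to find where the data begins; everything else in the header is repeated in the central directory.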
ZIP Format - CDH
[central directory header n]
central file header signature 4 bytes (0x02014b50)
version made by 2 bytes
version needed to extract 2 bytes
general purpose bit flag 2 bytes
compression method 2 bytes
last mod file time 2 bytes
last mod file date 2 bytes
crc-32 4 bytes
compressed size 4 bytes
uncompressed size 4 bytes
file name length 2 bytes
extra field length 2 bytes
file comment length 2 bytes
disk number start 2 bytes
internal file attributes 2 bytes
external file attributes 4 bytes
relative offset of local header 4 bytes
file name (variable size)
extra field (variable size)
file comment (variable size)
similar stuff to LFH
PK21... CDH
Each file/directory has a CDH entry in the Central Directory
Thanks to the redundancy, you can recover LFH using CDH, or CDH using LFH.
ZIP - a complete file
PK34... LFH + data   PK21... CDH   PK56... EOCD
Files (header+data)   List of files (and pointers)
ZIP - a complete file (continued)
PK34... LFH + data   PK21... CDH   PK56... EOCD
If the list of the files has pointers to files...
... the ZIP structure can be more relaxed.
ZIP - a complete file (continued)
PK56... EOCD   PK21... CDH   PK34... LFH + data
file comment (variable size)
You can even do an "inception"
(some parsers may allow EOCD(CDH(LFH)))
And now back to our show!
(we were looking for the EOCD)
Larch
Something completely different
ZIP - looking for the "header"?
"stream"
Let's ignore EOCD!
(it's sometimes faster)
(99.9% of ZIPs out there can be parsed this way)
PK34... LFH + data PK34... LFH + data PK34... LFH + data
(single "files" in an archive)
PK56...
(who cares...)
ZIP - looking for the "header"?
"aggressive stream"
We ignore the "garbage"!
(forensics)
PK34... LFH + data PK34... LFH + data PK34... LFH + data
(single "files" in an archive)
PK56...
(who cares...)
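The difference between the two EOCD-less strategies can be sketched as follows. The function names are made up for this example, and the sketch assumes every local file header carries its real compressed size (no data descriptors), which is what lets a strict stream parser hop from one entry to the next.

```python
import struct

LFH_SIG = b"PK\x03\x04"
LFH_FMT = "<4s5H3I2H"
LFH_SIZE = struct.calcsize(LFH_FMT)  # 30 bytes


def _entry_at(data, pos):
    """Return (file name, offset just past the entry's data) for the LFH at pos."""
    fields = struct.unpack_from(LFH_FMT, data, pos)
    csize, name_len, extra_len = fields[7], fields[9], fields[10]
    name = data[pos + LFH_SIZE:pos + LFH_SIZE + name_len]
    return name, pos + LFH_SIZE + name_len + extra_len + csize


def list_stream(data):
    """'stream': walk back-to-back entries from offset 0, stop at the first non-LFH."""
    names, pos = [], 0
    while data[pos:pos + 4] == LFH_SIG and pos + LFH_SIZE <= len(data):
        name, pos = _entry_at(data, pos)
        names.append(name)
    return names


def list_aggressive_stream(data):
    """'aggressive stream': every LFH signature counts, garbage in between is skipped."""
    names = []
    pos = data.find(LFH_SIG)
    while pos != -1 and pos + LFH_SIZE <= len(data):
        name, _ = _entry_at(data, pos)
        names.append(name)
        pos = data.find(LFH_SIG, pos + 4)
    return names
```

Any bytes placed between two entries are enough to make the two listings differ: the strict walk stops at the garbage, the aggressive scan carries on past it.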
Let's test the parsers!
abstract.zip
[Diagram: abstract.zip contains four LFH+data entries, one for each parsing strategy (start-first, end-first, stream, aggressive stream), two CDH/EOCD pairs and a "syntax breaker". The yellow archive is a comment of the green archive.]
abstract.zip
abstract.zip
from zipfile import ZipFile
ZipFile("abstract.zip", "r").printdir()
abstract.zip
<?php
$za = new ZipArchive();
$za->open('abstract.zip');
for ($i = 0; $i < $za->numFiles; $i++) {
    echo "index: $i\n";
    print_r($za->statIndex($i));
}
echo "numFile:" . $za->numFiles . "\n";
abstract.zip
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class zip {
    public static void main(String[] args) throws IOException {
        // ZipInputStream reads entries sequentially from the stream.
        try (InputStream f = new FileInputStream("abstract.zip");
             ZipInputStream z = new ZipInputStream(f)) {
            ZipEntry e;
            while ((e = z.getNextEntry()) != null) {
                System.out.println(e.getName());
            }
        }
    }
}
abstract.zip
abstract.zip
[Figure: layout of abstract.zip, with regions labeled EOCD, CDH, syntax breaker, readme_StartFirst.txt, readme_EndFirst.txt, readme_Stream.txt and readme_AggressiveStream.txt]
Total Commander 8.01
UnZip 6.00 (Debian)
Midnight Commander
Windows 7 Explorer
ALZip
KGB Archiver
7-zip
b1.org
Python zipfile
JSZip
C# DotNetZip
perl Archive::Zip
Jeffrey's Exif Viewer
WOBZIP
GNOME File Roller
WinRAR
OSX UnZip
zip.vim v25
Emacs Zip-Archive mode
Ada Zip-Ada v45
Go archive/zip
Pharo smalltalk 2.0 ZipArchive
Ubuntu less
Java ZipFile
abstract.zip
PHP ZipArchive
PHP zip_open ...
PHP zip:// wrapper
tcl + tclvfs + tclunzip
abstract.zip
Ruby rubyzip2
Java ZipArchiveInputStream
java.util.zip.ZipInputStream
abstract.zip
binwalk (found all)
abstract.zip - result summary
Thanks!
● Mulander
● Felix Groebert
● Salvation
● j00ru
abstract.zip - who cares?
● verify files via End-First
● unpack via Stream
Oops.
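A tiny reproduction of this mismatch, as a sketch only: it is not the authors' abstract.zip, just a well-formed archive with one extra entry prepended that the central directory does not list. The helper names are invented for the example and the file names are borrowed from the slides. Python's `zipfile` plays the verifier that trusts the central directory; a few lines of `struct` play the unpacker that reads from offset 0.

```python
import io
import struct
import zipfile
import zlib


def stored_lfh(name, payload):
    """A bare local file header plus stored data, with no central directory entry."""
    return struct.pack("<4s5H3I2H", b"PK\x03\x04", 20, 0, 0, 0, 0,
                       zlib.crc32(payload), len(payload), len(payload),
                       len(name), 0) + name + payload


def first_stream_name(data):
    """What a parser that starts at offset 0 and ignores the EOCD sees first."""
    (name_len,) = struct.unpack_from("<H", data, 26)
    return data[30:30 + name_len]


# A well-formed archive, as the verifier expects it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as z:
    z.writestr("readme_EndFirst.txt", "checked by the verifier")

# Prepend an entry that the central directory knows nothing about.
poc = stored_lfh(b"readme_Stream.txt", b"seen by the unpacker") + buf.getvalue()

verified = zipfile.ZipFile(io.BytesIO(poc)).namelist()   # central-directory view
unpacked = first_stream_name(poc)                        # stream view
```

Here `verified` only ever mentions `readme_EndFirst.txt`, while `unpacked` is `readme_Stream.txt`: the check and the extraction looked at different files inside the same archive.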
abstract.zip - AV
EICAR test results (using VT):
● most End-First
● some Aggressive
● Stream-only:
○ VBA32
○ NANO-Antivirus
○ Norman
○ F-Prot
○ Agnitum
○ Commtouch
https://docs.google.com/spreadsheet/ccc?key=0Apy5AGVPzpIOdDRPTFNJQXpqNkdjUzl4SE80c1kwdkE&usp=sharing
Portable Document File
http://youtu.be/JQrBgVRgqtc?t=11m15s
https://speakerdeck.com/ange/pdf-secrets-hiding-and-revealing-secrets-in-pdf-documents?slide=44
Ange Albertini and Gynvael Coldwind: Schizophrenic Files – A file that thinks it's many
% trailer <</Root …>>
trailer <</Root …>>
<</Root …>>
sometimes, it’s in the specs
obscurity via over-specification?
notice anything unusual?
WYSIWYG
PDF Layers 1/2
“Optional Content Configuration”
● principles
○ define layered content via various /Forms
○ enable/disable layers on viewing/printing
● no warning when printing
● “you can see the preview!”
○ bypass preview by keeping page 1 unchanged
○ just do a minor change in the file
PDF Layers 2/2
● it’s Adobe only
○ what’s displayed varies with readers
○ could be hidden via previous schizophrenic trick
● it was in the specs all along
○ very rarely used
○ can be abused
BMP
Trick 1
(originally published in Gynvael's "Format BMP okiem hakera" ("The BMP format through a hacker's eyes") article in 2008)
[Diagram: FILE HEADER at offset 0, followed by INFO HEADER; PIXEL DATA at offset N, where N = bfOffBits]
bfOffBits: "Specifies the offset, in bytes, from the BITMAPFILEHEADER structure to the bitmap bits" (MSDN)
[Diagram: FILE HEADER at offset 0, INFO HEADER, then PIXEL DATA (secondary) right after the headers; PIXEL DATA at offset N, where N = bfOffBits]
● Some image viewers ignore bfOffBits and look for data immediately after the headers.
Different images, depending on which pixel data is used.
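The two candidate starting points can be computed side by side. A minimal sketch, assuming a BMP without a palette (so nothing sits between the info header and the pixels); the function name is made up for this example.

```python
import struct


def pixel_data_offsets(bmp):
    """Return (offset according to bfOffBits, offset right after the headers)."""
    # BITMAPFILEHEADER: bfType, bfSize, bfReserved1, bfReserved2, bfOffBits (14 bytes)
    magic, _bf_size, _res1, _res2, bf_off_bits = struct.unpack_from("<2sIHHI", bmp, 0)
    if magic != b"BM":
        raise ValueError("not a BMP file")
    # The info header starts with its own size (40 for BITMAPINFOHEADER).
    (bi_size,) = struct.unpack_from("<I", bmp, 14)
    return bf_off_bits, 14 + bi_size
```

When the two values differ and both regions hold plausible pixel data, a viewer that honours bfOffBits and one that reads straight after the headers show different pictures.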
BMP
Trick 2
Something I've learnt about because it spoiled my steg100
task for a CTF (thankfully during testing).
BMP compression & palette
Run-Length Encoding (each box is 1 byte):
  Length: >0 | Palette Index (color)
  Length: 0  | End of Line: 0
  Length: 0  | End of Bitmap: 1
  Length: 0  | Move Cursor: 2 | X offset | Y offset
  Length: 0  | RAW Length: >2 | Palette Index (color) | Palette Index (color) | ...
BMP compression & palette
Question: If the opcodes below allow jumping over pixels without setting
any data, what will those pixels look like?
Hint: Please take a look at the presentation title :)
  Length: 0 | End of Line: 0
  Length: 0 | End of Bitmap: 1
  Length: 0 | Move Cursor: 2 | X offset | Y offset
Option 1
The missing data will be filled with background color.
(index 0 in the palette)
Option 2
The missing data will be black.
Option 3
The missing data will be transparent.
(pink represents transparency)
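A small RLE8 decoder makes the gap explicit. This is a sketch with an invented function name: it follows the opcode table from the slides, pads raw runs to an even length as BMP RLE8 does, and marks every pixel the stream never writes as `None`. Rows are returned in stream order, which for a BMP means bottom row first.

```python
def decode_rle8(stream, width, height):
    """Decode BMP RLE8 into rows of palette indices; None marks pixels never written."""
    rows = [[None] * width for _ in range(height)]
    x = y = 0
    i = 0
    while i + 1 < len(stream):
        count, value = stream[i], stream[i + 1]
        i += 2
        if count > 0:                       # run: repeat one palette index
            for _ in range(count):
                if x < width and y < height:
                    rows[y][x] = value
                x += 1
        elif value == 0:                    # end of line
            x, y = 0, y + 1
        elif value == 1:                    # end of bitmap
            break
        elif value == 2:                    # move cursor by (X offset, Y offset)
            if i + 1 >= len(stream):
                break
            x += stream[i]
            y += stream[i + 1]
            i += 2
        else:                               # raw run of `value` literal indices
            for index in stream[i:i + value]:
                if x < width and y < height:
                    rows[y][x] = index
                x += 1
            i += value + (value & 1)        # raw runs are padded to an even length
    return rows
```

What a viewer does with the `None` pixels is exactly the choice between the three options: palette index 0, black, or transparent.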
PNG
a data schizophren
image data combining
● 2 images
● via 2 palettes
cute PoC by @reversity
“There shall not be more than one PLTE chunk”
different images depending on which PLTE chunk is used
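Spotting such a file is straightforward once the chunks are listed. A minimal sketch with made-up function names; it walks the chunk headers only and does not verify CRCs.

```python
import struct

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def chunk_types(png):
    """List chunk types in file order (CRCs are not checked)."""
    if not png.startswith(PNG_MAGIC):
        raise ValueError("not a PNG file")
    types, pos = [], len(PNG_MAGIC)
    while pos + 8 <= len(png):
        length, ctype = struct.unpack_from(">I4s", png, pos)
        types.append(ctype)
        pos += 8 + length + 4               # chunk header, data, CRC
    return types


def has_multiple_palettes(png):
    """True when the file breaks the 'not more than one PLTE chunk' rule."""
    return chunk_types(png).count(b"PLTE") > 1
```

A decoder that keeps the first PLTE and one that keeps the last will colour the same image data differently.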
Portable Executable
W8
Vista
XP
Relocations types
Type 4
HIGH_ADJ -- -- ✓
Type 9
MIPS_JMPADDR16
IA64_IMM64
MACHINE_SPEC_9
32 bit 64 bit ✗
Relocations on relocations
Type 4
HIGH_ADJ -- -- ✓
Type 9
MIPS_JMPADDR16
IA64_IMM64
MACHINE_SPEC_9
32 bit 64 bit ✗
Type 10
DIR64
✓ ✓ ✓
as seen in PoC||GTFO
Relocation-based PE Schizophren
Julian Bangert, Sergey Bratus -- ELF Eccentricities
https://www.youtube.com/watch?v=4LU6N6THh2U
GIF
Something Gynvael stumbled on in 2008,
but never made a PoC... until now.
(with great input from Ange)
GIF
GIF can be made of many small images.
If "frame speed" is defined, these are frames instead
(and the first frame is treated as background).
GIF
Certain parsers (e.g. browsers) treat "images" as "frames"
regardless of "frame speed" not being defined.
Frame 1 Frame 2 Frame 3
GIF
Schizophrenic PoC:
Frame 1 Frames 2-10001
1x1 px
Frame 10002
These apps try to force animation.
These apps render the GIF by the specs.
GIMP says "frames", but allows one to see
all the frames, which is nice.
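To see which GIFs fall into this grey area, it is enough to count the images and collect the delay values ("frame speed") from the graphic control extensions. A rough sketch with invented function names, assuming a well-formed file; it performs no bounds checking.

```python
import struct


def _skip_sub_blocks(data, pos):
    """Skip a chain of length-prefixed sub-blocks; return the offset after the terminator."""
    while data[pos] != 0:
        pos += 1 + data[pos]
    return pos + 1


def gif_summary(data):
    """Return (number of images, delay values found in graphic control extensions)."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    packed = data[10]                       # logical screen descriptor flags
    pos = 13
    if packed & 0x80:                       # global color table present
        pos += 3 * (2 << (packed & 7))
    images, delays = 0, []
    while pos < len(data) and data[pos] != 0x3B:    # 0x3B is the trailer
        if data[pos] == 0x21:               # extension block
            if data[pos + 1] == 0xF9:       # graphic control extension
                delays.append(struct.unpack_from("<H", data, pos + 4)[0])
            pos = _skip_sub_blocks(data, pos + 2)
        elif data[pos] == 0x2C:             # image descriptor
            images += 1
            local = data[pos + 9]
            pos += 10
            if local & 0x80:                # local color table present
                pos += 3 * (2 << (local & 7))
            pos = _skip_sub_blocks(data, pos + 1)   # skip LZW code size, then image data
        else:
            raise ValueError("unexpected block")
    return images, delays
```

A file that reports several images but no delays is the interesting case: a renderer that follows the specs composes one picture, while one that forces animation plays the images as frames.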
same-tool schizophrenia
1 file + 1 tool = 2 behaviors
it was too simple
● WinRAR: different behavior when viewing or extracting
○ opening/failing
○ opening/’nothing’
● Adobe: viewing ⇔ printing
○ well, it’s a feature
Failures / Ideas / WIP
Screen ⇔ Printer schizophren
via color profiles?
Failures / Ideas / WIP
● screen ⇔ printer
○ embedded color profiles?
● JPG
○ IrfanView vs the world
● Video
○ FLV: video fails but still plays sound ?
PNG
Various ancillary chunks (rendering level)
● partially supported:
○ gamma
○ transparency (for palettes)
● never supported?
○ significant bits
○ chromaticities
● always supported?
○ physical size
Conclusion
● such a mess
○ specs are messy
○ parsers don’t even respect them
● no CVE/blaming for parsing errors?
○ no security bug if no crash or exploit :(
PoCs and slides: http://goo.gl/Sfjfo4
ACK
@reversity @travisgoodspeed @sergeybratus
qkumba @internot @pdfkungfoo
@j00ru ise ds vx
questions?
Ange Albertini
Gynvael Coldwind
thank you
It's time to kick ass and chew bubble gum... and I'm all outta gum.
@angealbertini
@gynvael
Bonus Round
Flash (SWF) vs Prezi
(not a fully schizophrenic problem in popular parsers, that's why it's here)
Prezi SWF sanitizer
Prezi allows embedding SWF files.
But it first sanitizes them.
It uses one of two built-in SWF parsers.
There was a problem in one of them:
● It allowed huge chunk sizes.
● It just "jumped" (seeked) over these chunks...
● ...which resulted in an integer overflow...
● ...and this led to schizophrenia.
● As the sanitizer saw a good SWF...
● ...Adobe Flash got its evil twin brother.
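The general mechanism can be shown with a toy example. Everything here is an assumption made for illustration (the 32-bit width, the 6-byte header, the offsets, the function names); it is not the actual Prezi code or the exact bug, only a sketch of how wrapping offset arithmetic lets two parsers continue at different places in the same file.

```python
def next_offset_wrapping(pos, header_len, chunk_len):
    """Next chunk offset when the sum is truncated to 32 bits (hypothetical sanitizer)."""
    return (pos + header_len + chunk_len) & 0xFFFFFFFF


def next_offset_exact(pos, header_len, chunk_len):
    """Next chunk offset with arbitrary-precision arithmetic (hypothetical player)."""
    return pos + header_len + chunk_len


pos, header_len = 0x100, 6
# A huge chunk length picked so that the truncated sum lands back at 0x40.
huge = 0x100000040 - pos - header_len

sanitizer_sees = next_offset_wrapping(pos, header_len, huge)
player_sees = next_offset_exact(pos, header_len, huge)
```

With these made-up numbers the wrapping parser resumes at offset 0x40, inside data it already passed, while the exact one lands far beyond it, so each goes on to interpret a different sequence of chunks.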
Prezi SWF sanitizer
"good" SWF sent to sanitizer
and its evil twin brother
kudos to the sanitizer!
Fixed in Q1 2014. For details see:
"Integer overflow into XSS and other fun stuff - a case study of a bug bounty"
http://gynvael.coldwind.pl/?id=533
