Automatically Evading Classifiers
NDSS '16, 21-24 February 2016, San Diego, CA, USA. Copyright 2016 Internet Society, ISBN 1-891562-41-X. http://dx.doi.org/10.14722/ndss.2016.23115

Abstract—Machine learning is widely used to develop classifiers for security tasks. However, the robustness of these methods against motivated adversaries is uncertain. In this work, we propose a generic method to evaluate the robustness of classifiers under attack. The key idea is to stochastically manipulate a malicious sample to find a variant that preserves the malicious behavior but is classified as benign by the classifier. We present a general approach to search for evasive variants and report on results from experiments using our techniques against two PDF malware classifiers, PDFrate and Hidost. Our method is able to automatically find evasive variants for both classifiers for all of the 500 malicious seeds in our study. Our results suggest a general method for evaluating classifiers used in security applications, and raise serious doubts about the effectiveness of classifiers based on superficial features in the presence of adversaries.

I. INTRODUCTION

Machine learning models are popular in security tasks such as malware detection, network intrusion detection and spam detection. From the data scientists' perspective, these models are effective since they achieve extremely high accuracy on test datasets. For example, Dahl et al. reported achieving 99.58% accuracy in classifying Win32 malware using an ensemble deep neural network with dynamic features [9]. Šrndic et al. achieved over 99.9% accuracy in a PDF malware classification task using an SVM-RBF model with structural path features [28].

However, it is important to realize that these results are for particular test datasets. Unlike when machine learning is used in other fields, security tasks involve adversaries responding to the classifier. For example, attackers may try to generate new malware deliberately designed to evade existing classifiers. This breaks the assumption of machine learning models that the training data and the operational data share the same data distribution. As a result, it is important to be skeptical of machine learning results in security contexts that do not consider attackers' efforts to evade the generated models.

The risk of evasion attacks against machine learning models under adversarial settings has been discussed in the machine learning community, mainly focused on simple models for spam detection (e.g., [10, 18]). However, evasion attacks against malware classification can be much more complex in terms of the classification algorithm and the feature extraction, as well as the mutability of highly-structured samples. Consequently, though evading malware classifiers has been partially explored by classifier authors as well as security researchers, previous studies significantly under-estimate the attackers' ability to manipulate samples. For example, previous studies may mistakenly assume the attackers can only insert new contents because removing existing contents would easily corrupt maliciousness [4, 20, 28]. In addition, previous works are ad hoc and limited to particular target classifiers or specific types of samples [20, 29]. Other than suggesting point solutions, they do not provide methods to automatically evaluate the effectiveness of a classifier against adaptive adversaries.

We present a generic method to assess the robustness of a classifier by simulating attackers' efforts to evade the classifier. We do not assume the adversary has any detailed knowledge of the classifier or the features it uses, or can use targeted expert knowledge to manually direct the search for an evasive sample. Instead, drawing ideas from genetic programming (GP) [11, 15], we perform stochastic manipulations and then evaluate the generated variants to select promising ones. By repeating this procedure iteratively, we aim to generate evasive variants. A sophisticated attacker, of course, can do manipulations that would not be found by a stochastic search, so we cannot claim that a classifier that resists such an attack is necessarily robust. On the other hand, if the automated approach finds evasive samples for a given classifier, it is a clear sign that the classifier is not robust against a motivated adversary.

We evaluated the proposed method on two PDF malware classifiers, and found that it could automatically find evasive variants for all the 500 sample seeds selected from the Contagio PDF malware archive [5]. The evasive variants exhibit the same malicious behaviors as the original samples, but are sufficiently different in the classifier's feature space to be classified as benign by the machine learning-based models.

Our analysis of the discovered evasive variants reveals that both classifiers are vulnerable because they employ non-robust features, which can be manipulated without disrupting the desired malicious behavior. Superficial features may work well on test datasets, but if the features used to classify malware are shallow artifacts of the training data rather than intrinsic properties of malicious content, it is possible to find ways to preserve the malicious behavior while disrupting the features.

Contributions. Our primary contributions involve developing and evaluating a general method for automatically finding variants that evade classifiers. In particular:

• We propose a general method to automatically find evasive variants for target classifiers. The method does not rely on any specific classification algorithms or assume detailed knowledge of feature extraction, but only needs the classification score feedback on generated variants and rough knowledge of the likely features used by the classifier (Section II).

• We implement a prototype system that automatically finds variants that can evade structural feature-based PDF malware classifiers. This involves designing operators that perform stochastic manipulations on PDF files, an oracle that determines if a generated variant preserves maliciousness, a selection mechanism that promotes promising variants during the evolutionary process, and a fitness function for each target classifier (Section IV).

• We evaluate the effectiveness of our system in evading two recent PDF malware classifiers: PDFrate [25] and Hidost [28], a classifier designed with the explicit goal of resisting evasion attempts. Our system achieves 100% success rates in finding evasive variants against both classifiers in an experiment with 500 malware sample seeds. An analysis of the discovered evasive variants in the feature space of each classifier shows that many non-robust features employed in the classification facilitate evasion attacks (Sections V and VI).

We provide background on machine learning classifiers in Section II and on PDF malware in Section III. Section VIII discusses related work on evasion attacks.

II. OVERVIEW

We propose an automated method to simulate an attacker attempting to find an evasive variant for a desired malware sample which is detected by a target classifier. The attacker's goal is to find a malware variant that preserves the malicious behavior of the original sample, but that is misclassified as benign by the target classifier. In addition to improving our understanding of how classifiers work in the presence of adaptive adversaries, we hope our results will lead to strategies for constructing classifiers that are more robust to adversaries, but in this work we focus on assessing evadability.

A. Machine Learning Classifiers

Machine learning learns from and makes predictions on data. A machine learning-based classifier attempts to find a hypothesis function f that maps data points into different classes. For example, a malware classification system would find a hypothesis function f that maps a data point (a malware sample) into either benign or malicious.

The effort to train a machine learning system starts with feature extraction. As most machine learning algorithms cannot operate on highly-structured data, the data samples are usually represented in a specially-designed feature space. For example, a malware classifier may extract the file size and the function call traces as features. Each feature is a dimension in the feature space; consequently, every sample is represented as a vector. An extra step of feature selection may be performed to reduce the number of features when the number of features is too large for the classification algorithm.

The most widely used machine learning algorithms in security tasks use supervised learning, in which the training dataset comes with labels identifying the class of every training sample. The hypothesis function f is trained to minimize the prediction error on the training set. This function usually results in a low error rate on the operational data under the stationarity assumption that the distribution over data points encountered in the future will be the same as the distribution over the training set.

Machine learning has produced impressive results and is widely deployed for specific security tasks including malware classification. Without examining the behavior of suspicious malware in a real system, malware classifiers often employ static properties to predict maliciousness, such as the file structure, file size, metadata, grams of tokens or system calls. Although this approach often achieves high accuracy in validation tests, the classifier may learn properties that are superficial artifacts of the training data, rather than properties that are inherently associated with malware. This is because malware samples in the training data are likely to differ from the benign samples in many ways that are not essential to their malicious behavior.

B. Threat Model

We assume an attacker starts with a desired malicious sample that is (correctly) classified by a target classifier as malicious, and wants to create a sample with the same malicious behavior, but that is misclassified as benign. The attacker is capable of manipulating the malicious sample in many ways, and is likely to have knowledge of samples that are (correctly) classified as benign.

We assume the attacker has black-box access to the target classifier, and can submit many variants to that classifier. For each submitted variant, the attacker learns its classification score. The classification score is a number (typically a real number between 0 and 1) that indicates the classifier's prediction of maliciousness, where values above some threshold (say 0.5) are considered malicious and samples with lower classification scores are considered benign. We do not assume the attacker has any internal information about the classifier, only that it can use it as a black box that outputs the classification score for an input sample. We assume the classifier operator does not adapt the classifier to submitted variants (which must be the case if the attacker has offline access to the classifier).

C. Finding Evasive Samples

Our method uses genetic programming techniques to perform a directed search of the space of possible samples to find ones that evade the classifier while retaining the desired malicious behavior.

Genetic programming (GP) is a type of evolutionary algorithm, originally developed for automatically generating computer programs tailored to a particular task [11, 15]. It is essentially a stochastic search method using computational analogs of biological mutation and crossover to generate variants, and modeling Darwinian selection using a user-defined fitness function. Variants with higher fitness are selected for continued evolution, and the process continues over multiple generations until a variant with desired properties is found (or the search is terminated after exceeding a resource bound).
Genetic programming has been shown to be effective in many tasks including fixing legacy software bugs [17], software reverse engineering [13], and software re-engineering [23].

Fig. 1. Overview of the method: a population is initialized by mutating the malicious seed (drawing objects from benign samples); each variant is scored by the fitness function using the target classifier and the oracle; promising variants are selected and mutated for the next generation, until an evasive variant is found (success) or the maximum generation is reached (failure).

Method. Our procedure is illustrated in Figure 1. It starts with a seed sample that exhibits malicious behavior and is classified as malicious by the target classifier. Our method aims to find an evasive sample that preserves the malicious behavior but is misclassified as benign by the target classifier.

First, we initialize a population of variants by performing random manipulations on the malicious seed. Then, each variant is evaluated by a target classifier as well as an oracle. The target classifier is a black box that outputs a number that is a measure of predicted maliciousness of an input sample. There is a prescribed threshold used to decide if it is malicious or benign. The oracle is used to determine if a given sample exhibits particular malicious behavior. In most instantiations, the oracle will involve expensive dynamic tests.

A variant that is classified as benign by the target classifier, but found to be malicious by the oracle, is a successful evasive sample. If no evasive samples are found in the population, a subset of the generated variants is selected for the next generation based on a fitness measure designed to reflect progress towards finding an evasive sample. Since it is unlikely that the transformations will re-introduce malicious behaviors into a variant, corrupted variants that have lost the malicious behavior are replaced with other variants or the original seed.

Next, the selected variants are randomly manipulated by mutation operators to produce the next generation of the population. The process continues until an evasive sample is found or a threshold number of generations is reached.

To improve the efficiency of the search, we collect traces of the mutation operations used and reuse effective traces. If a search ends up finding any evasive variants, the mutation traces on the evasive variants will be stored as successful traces. Otherwise, the mutation trace of the variant with the highest fitness score is stored. These traces are then applied to other malware seeds to generate variants for their population initialization. Because of the structure of PDFs and the nature of the mutation operators, the same sequence of mutations can often be applied effectively to many initial seeds.
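To make the search loop concrete, the sketch below expresses the procedure of Figure 1 in Python. The function names (mutate, fitness) and the selection details are simplifications introduced here for illustration; they are not taken from the released implementation.

    import random

    LOW_SCORE = float("-inf")     # fitness assigned to corrupted variants

    def gp_search(seed, mutate, fitness, pop_size=48, max_gen=20):
        """Search for evasive variants of a malicious seed (see Figure 1).

        mutate(x)  -> a randomly manipulated copy of x
        fitness(x) -> combines the oracle and the classifier score; a positive
                      value means evasive, LOW_SCORE means the variant is corrupted
        """
        population = [mutate(seed) for _ in range(pop_size)]
        for _ in range(max_gen):
            scored = sorted(((fitness(v), v) for v in population),
                            key=lambda sv: sv[0], reverse=True)
            evasive = [v for s, v in scored if s > 0]
            if evasive:
                return evasive                                  # success
            # Keep the more promising half; corrupted variants fall back to the seed.
            survivors = [seed if s == LOW_SCORE else v
                         for s, v in scored[:pop_size // 2]]
            population = [mutate(random.choice(survivors)) for _ in range(pop_size)]
        return []                                               # failure within the budget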
III. PDF MALWARE AND CLASSIFIERS

This section provides background on PDF malware and the two target PDF malware classifiers.

A. PDF Malware

The Portable Document Format (PDF) is a popular document format designed to enable consistent content and layout in rendering and printing on different platforms. Although it was not openly standardized until 2008 [1], and there are various non-standard extensions supported by different PDF reader products, all PDF files roughly share the same basic structure depicted in Figure 2.

A PDF file consists of four parts: header, body, cross-reference table (CRT) and trailer. The header contains the PDF magic number and a format version indicator. The body is a set of PDF objects that comprise the content of the file, while the CRT indexes the objects in the body. The trailer specifies how to find the CRT and other special objects such as the root object. Thus, PDF readers typically start reading a PDF from the end of the file for efficiency.

The body is the most important part of a PDF since it holds almost all the visible document data. It contains eight basic types of objects, namely Booleans, numbers, strings, names, arrays, dictionaries, streams and the null object. The objects can be labeled with a pair of integer identifiers as indirect objects so that they can be referenced by other objects. The inter-referencing objects form a tree-like logical structure, as shown in the right of Figure 2. This tree-like structure is ideally suited to genetic programming techniques since it is easy to alter and move sub-trees to generate new variants.

PDF malware is becoming increasingly prevalent because PDF is a widely accepted document format and victims are more willing to open PDFs than other files. According to a recent Internet security threat report [30], PDF was among the top 7 attachment types in spear-phishing emails in 2014. We expect there will be continuing opportunities for PDF malware attacks because 128 new vulnerabilities in Acrobat readers have been reported in CVE so far in 2015 (through 8 December), which is almost three times the total number in 2014 [8].

PDF malware typically contains exploits in JavaScript objects or other objects that take advantage of vulnerabilities of particular PDF readers (most commonly, Adobe Acrobat). PDF malware may also carry other encoded payloads in stream objects which will be triggered after exploits [25].

Fig. 2. The physical and logical structure of a PDF file (left: header, body, cross-reference table, and trailer; right: the tree of inter-referencing objects, e.g. a /Catalog root with /Pages, /Page, and /OpenAction JavaScript children).

B. Target Classifiers

Several projects have built PDF malware classifiers using machine learning techniques. Earlier works, such as Wepawet [7] and PJScan [16], focused on the embedded malicious JavaScript in PDF malware. These tools consist of a JavaScript code extractor and a dynamic or static malicious JavaScript classifier.

Since not all PDF malware involves embedded JavaScript, and PDF malware authors have found many tricks for hiding JavaScript code [24], recent PDF malware classifiers have focused on structural features of PDF files. In this work, we target state-of-the-art structural feature-based classifiers.

Structural feature-based classifiers assume that malicious PDFs have different patterns in their internal object structures than those found in benign PDFs. For example, the PDF Malware Slayer tool uses the object keywords as features, where each feature corresponds to the occurrences of a given keyword [19]. For our experiments, we selected PDFrate [25, 26] and Hidost [28] as the target classifiers. They are representatives of recent PDF malware classifiers, and Hidost was developed with a particular goal of being resilient to evasion attacks. Both classifiers achieve extremely high accuracy in malware detection on their testing datasets. The other reason for choosing these classifiers as our targets is the availability of open source implementations. Although our method only requires black-box access to the classifier, having access to the internal feature space is beneficial for understanding our results (Section VI).

PDFrate. PDFrate is a random forest classifier that uses an ensemble learning model consisting of a large number of decision trees designed to reduce variance in predictions. With a random subset of training data and a random subset of features, each decision tree is trained to minimize the prediction error on its training subset. After training, the output score of PDFrate is the fraction of trees that output "malicious", ranging from 0 to 1. The threshold value is typically 0.5, although the PDFrate authors claim that adjusting the threshold from 0.2 to 0.8 has little impact on accuracy because most samples have scores very close to either 0 or 1.

Besides object keywords, PDFrate also employs the PDF metadata and several properties of objects as classification features. The PDF metadata includes the author, title, and creation date. The object properties include positions, counts, and lengths.

PDFrate was trained with a random subset of the Contagio dataset [5] with 5,000 benign and 5,000 malicious PDFs. The two parameters are respectively the number of trees (ntree = 1,000) and the number of features in each tree (mtry = 43). The feature set is a total of 202 integer, floating point, and Boolean features, but only 135 of the features are described in the PDFrate documentation.

What we use in this work is an open-source re-implementation of PDFrate named Mimicus [27], implemented by Nedim Šrndic and Pavel Laskov to mimic PDFrate for malware evasion experiments [29]. Mimicus was trained with the 135 documented PDFrate features and the same training set as PDFrate.¹ Mimicus has been shown to have classification performance nearly equivalent to PDFrate [29].

¹ The Mimicus authors were unable to locate one malicious file with the MD5 hash 35b621f1065b7c6ebebacb9a785b6d69 in Contagio.

Hidost. Hidost is a support vector machine (SVM) classification model. SVM is an optimal margin classifier that tries to find a small number of support vectors (data points) that separate all data points of two classes with a hyperplane of a high-dimensional space. With kernel tricks, it can be extended as a nonlinear classifier to fit more complex classification problems. Hidost uses a radial basis function (RBF) kernel to map data points into an infinite dimensional space. At testing time, the (positive or negative) distance of a data point to the hyperplane is output as the prediction result. A positive distance is interpreted as malicious, and negative as benign.

Hidost uses the structural paths of objects as classification features. For example, the structural path of a typical Pages object is /Root/Pages. If that object appears in the PDF file, its feature value is 1; if not, its feature value is 0. Since the number of possible structural paths of PDF objects is infinite, Hidost uses 6,087 selected paths as features. The selected paths are those which appeared in at least 1,000 of the files in a pool of 658,763 benign and malicious PDFs collected from VirusTotal [31] and a Google search. The resulting model provided by the authors of Hidost was trained using the randomly-sampled 5,000 malicious and 5,000 benign files. It is reported to be robust against adversaries, where the number of false negatives on another 5,000 random malicious files only increased from 28 to 30 under what the authors claim is the "strongest conceivable mimicry attack" [28].
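As a rough illustration of this kind of feature set (not Hidost's actual implementation), the sketch below collects structural paths from a parsed PDF object tree, represented here as nested Python dicts and lists, and builds the binary feature vector over a list of selected paths.

    def structural_paths(obj, prefix="", paths=None):
        """Collect structural paths such as /Root/Pages from a PDF object tree."""
        if paths is None:
            paths = set()
        if isinstance(obj, dict):
            for key, child in obj.items():
                path = prefix + "/" + key.lstrip("/")
                paths.add(path)
                structural_paths(child, path, paths)
        elif isinstance(obj, list):
            for child in obj:            # array elements inherit their parent's path
                structural_paths(child, prefix, paths)
        return paths

    def binary_features(pdf_tree, selected_paths):
        """1 if a selected structural path occurs in the file, 0 otherwise."""
        present = structural_paths(pdf_tree)
        return [1 if p in present else 0 for p in selected_paths]

    # Example: a minimal catalog with one Pages node.
    tree = {"Root": {"Pages": {"Count": 1, "Kids": [{"Type": "Page"}]}}}
    print(binary_features(tree, ["/Root/Pages", "/Root/OpenAction/JS"]))   # [1, 0]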
IV. EVADING PDF MALWARE CLASSIFIERS

The proposed method could be applied to any security classifier, although its effectiveness depends on being able to find good genetic programming operators to search the feature space efficiently and an appropriate fitness function to direct the search. In this section, we show how to instantiate our design to find evasive PDF malware.

A. PDF Parser and Repacker
The first step is to parse the PDF file into a tree-like representation. We will also need to regenerate a PDF file from the tree representation, after it has been manipulated to produce a new variant. For this, we use pdfrw [21], a Python-based open source library for parsing PDF files into the tree-like structure and serializing that structure into an output PDF file.

It is important to note that pdfrw is not a perfect PDF parser and repacker, and a number of PDF malware samples have been malformed intentionally to bypass or confuse PDF parsers used in malware detectors (while still being processed by target PDF readers due to parser quirks). This means we cannot test our method on PDF seed samples that cannot be parsed by pdfrw, or that no longer exhibit malicious behavior when they are unpacked and packed using pdfrw.

To avoid losing too many samples because of PDF parsing issues, we modified pdfrw to loosen its grammar checking. This significantly increased the success rate of repacking PDF malware samples. The modified version of pdfrw is available at https://github.com/mzweilin/pdfrw.

In our modified pdfrw, we ignore several potentially corrupted, malformed, or misleading auxiliary elements. The EOF marks in the PDF raw bytes are ignored; instead, the parser reads in all bytes of a file. The cross-reference tables are ignored; instead, the parser reads objects in the body directly without any index. Stream length indicators are ignored; instead, the parser detects the stream length with the endstream token. Unpaired keys or values are also ignored when parsing a dictionary. Ignoring these auxiliary elements significantly decreases parsing efficiency, so it is only suitable for repacking seed malware samples; all seeds are repacked with correct auxiliary elements for efficient parsing later. In addition, we added support for parsing empty objects, which do exist in the malware samples, and modified the dictionary data structure to enable deep copies when duplicating variants from seeds.
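For reference, the basic parse-and-repack round trip with the stock pdfrw API looks roughly like the following; our experiments use the modified pdfrw linked above, and the file names here are placeholders.

    from pdfrw import PdfReader, PdfWriter

    pdf = PdfReader("seed.pdf")        # parse into a tree of PdfDict/PdfArray objects
    catalog = pdf.Root                 # the document catalog; children are reachable
                                       # by attribute access, e.g. catalog.Pages.Kids

    # ... manipulate the object tree here (Section IV-B) ...

    PdfWriter("variant.pdf", trailer=pdf).write()   # repack the tree into an output PDF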
B. Genetic Operators

Since both of the classifiers we target employ the object structure of the PDF file as features, we need to generate variants by manipulating the PDF files at that level. (If we were targeting JavaScript-based classifiers instead, we would need to generate variants by manipulating the embedded JavaScript code.) Due to the limited number of possible static features, we believe it is reasonable to assume the attackers have knowledge of the manipulation level.

We use computational analogs of mutation in biological evolution to generate evasive PDF malware variants. The mutation operator changes any object in a PDF file's tree-like structure with low probability. An object is mutated with probability given by the mutation rate, typically a number smaller than 0.5. The mutation is either a deletion (the object is removed), an insertion (another object is inserted after it), or a replacement (the object is replaced with some other object). We choose among these options with uniform random probability. In the case of an insertion or replacement, a second object is chosen uniformly at random from a large pool of objects segmented from benign PDFs. This external genome helps to generate a more diverse population.

The other well-known operator, crossover, commonly used in genetic algorithms, is not used in this work. We found it was possible to achieve a 100% evasion rate using only the simple mutation operations.
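A minimal sketch of the mutation operator is shown below, applied to the list of child objects under one node. The representation of the children and of the benign object pool is simplified for illustration, and the 0.1 rate matches the setting used later in Section V.

    import copy
    import random

    def mutate_children(children, benign_pool, rate=0.1):
        """Apply delete/insert/replace to each child object with probability `rate`.

        `children` is a list of sub-objects of one PDF node; `benign_pool` holds
        objects segmented from benign PDFs (the external genome).
        Returns a new, possibly longer or shorter, list of children.
        """
        result = []
        for child in children:
            if random.random() >= rate:
                result.append(child)                      # unchanged
                continue
            op = random.choice(("delete", "insert", "replace"))
            if op == "delete":
                continue                                  # drop the object
            donor = copy.deepcopy(random.choice(benign_pool))
            if op == "insert":
                result.append(child)                      # keep it, then insert after it
                result.append(donor)
            else:                                         # replace with a benign object
                result.append(donor)
        return result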
C. Oracle

We need an oracle to determine if a variant preserves the seed's malicious behavior. There is no perfectly accurate malware detection technique that works universally (indeed, if such a technique existed our work would not be necessary). In this case we have one advantage that enables a highly-accurate oracle for testing variants: we do not need an oracle that can test for arbitrary malicious behavior, but instead only need to verify that a particular known malicious action is performed by the variant.

To do this, we use the Cuckoo sandbox [12]. Cuckoo runs a submitted sample in a virtual machine installed with a PDF reader and reports the behavior of the sample, including the network APIs called and their parameters. Figure 3 shows an example of malware detection results from Cuckoo. The malware sample opened in a virtual machine exploited a disclosed buffer overflow vulnerability in Acrobat Reader (CVE-2007-5659). The injected shellcode downloads four additional pieces of malware from the Internet and executes them. Since the execution of Cuckoo was isolated from the Internet to avoid spreading malware, the shellcode just received malformed executable files provided by INetSim, a network service simulator [14]. However, the downloading and execution behaviors detected by Cuckoo are enough to show that the shellcode has been executed. By comparing the behavioral signature of the original PDF malware and the manipulated variant, we determine if the original malicious behavior is preserved. The details on how we select and compare behavioral signatures are deferred to Section V-A.

Fig. 3. A PDF malware detection result given by the Cuckoo sandbox. The left side shows the key API execution trace; the right is a screenshot captured from the virtual machine.

We only focus on the network behaviors of malware samples in this work. Although this setting prevents our method from working on malware samples without network activity, we believe it is not a real constraint in practice since malware authors could always develop a way to verify the desired malicious behaviors.

The Cuckoo sandbox works well as an oracle, but is computationally expensive. We experimented with other possible oracles, including using Wepawet.
Wepawet and similar detection techniques only detect the malicious payloads, but do not verify that the payload is actually executed in a real PDF reader. Because many of the genetic mutations will disrupt that execution, oracles that do not actually dynamically observe the variant exhibiting the malicious behavior result in many false positives (apparently evasive variants that would not actually work as malware). Hence, it is important to use an oracle that confirms the malicious behavior is preserved through actual execution. This limits the samples we can use in our experiments to ones for which we can produce the malicious behavior in our oracle's test environment (Section V-A).
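A sketch of the resulting oracle check is given below, assuming the Cuckoo JSON report has been loaded into a Python dict. The report key names are assumptions (they differ across Cuckoo versions), and the signature definition (HTTP URL requests plus queried hosts) follows the choice made in Section V.

    def network_signature(report):
        """Union of HTTP URL requests and queried hosts from a Cuckoo report dict."""
        net = report.get("network", {})
        urls = {h.get("uri") for h in net.get("http", []) if h.get("uri")}
        hosts = {h for h in net.get("hosts", []) if isinstance(h, str)}
        return urls | hosts

    def oracle(variant_report, reference_signature):
        """A variant preserves maliciousness if its signature still covers the
        reference signature recorded for the original seed."""
        return reference_signature <= network_signature(variant_report)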
D. Fitness Function

A fitness function gives the fitness score of each generated variant. Higher scores are better. Given 0 as a threshold value, a variant with a positive fitness score is evasive: it is classified as benign and retains the malicious behavior.

In our case, the fitness function captures both the output of the oracle and the predicted result of the target classifier. The oracle is modeled as a binary function: oracle(x) = 1 if x exhibits the malicious signature; otherwise, oracle(x) = 0. In order to eliminate corrupted variants, we always assign the lowest possible fitness score to variants with oracle(x) = 0.

Based on the different scoring methods used by the target classifiers, the fitness functions are defined separately. PDFrate, as a random forest classifier, outputs a confidence value of maliciousness from 0 to 1, typically with a threshold of 0.5. Thus, we define its fitness function as

    fitness_pdfrate(x) = 0.5 - pdfrate(x)   if oracle(x) = 1
    fitness_pdfrate(x) = LOW_SCORE          if oracle(x) = 0

with an evasive range of (0, 0.5].

The SVM model of Hidost outputs the negative (positive) distance to the hyperplane for a benign (malicious) sample. Therefore, for Hidost the fitness function is defined as

    fitness_hidost(x) = -hidost(x)   if oracle(x) = 1
    fitness_hidost(x) = LOW_SCORE    if oracle(x) = 0

with an evasive range of (0, +∞).
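The two fitness functions translate directly into code; here pdfrate, hidost, and oracle are assumed to be callables wrapping the two classifiers and the sandbox oracle.

    LOW_SCORE = float("-inf")   # eliminates corrupted variants from selection

    def fitness_pdfrate(x, pdfrate, oracle):
        # pdfrate(x) is a maliciousness score in [0, 1] with a 0.5 threshold,
        # so a positive fitness means the variant is already classified as benign.
        return 0.5 - pdfrate(x) if oracle(x) else LOW_SCORE

    def fitness_hidost(x, hidost, oracle):
        # hidost(x) is the signed distance to the hyperplane (positive = malicious);
        # negating it makes benign-looking, still-malicious variants positive.
        return -hidost(x) if oracle(x) else LOW_SCORE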
F. Trace Collection and Replay

The most common way to initialize a population is duplicating the original seed and performing a random mutation operation on each copy. Considering the potentially common properties across evasive variants, we accelerate the search by reusing mutation traces that successfully led to evasive or promising variants.

A mutation trace consists of a series of mutations, each defined by a 3-tuple (mutation operator, target object path, file id: source object path). For example,

    (insert, /Root/Pages/Kids/1, 1: /Root/Pages/Kids/4)

inserts an external Page object from benign file 1 into the targeted PDF file. The three possible mutation operators are defined in Section IV-B. Though the target object path has the same format as the source object path, they are paths in different PDF files. The target object path refers to an object in the variant, while the source object path points to an object in an external benign file with the specified file id.

Mutation traces are added to two pools at the end of each GP search. If a GP search successfully generates evasive variants, all of the corresponding mutation traces are added to the success trace pool. Otherwise, the mutation trace that generates the variant with the highest fitness score is added to the promising trace pool.

The traces in the two pools are replayed during population initialization to produce some variants for the first generation. If the number of usable traces is smaller than the population size, additional variants are generated in the conventional way. If the number is larger than the population size, the selection process described in Section IV-E shrinks the population to the specified size.
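A mutation trace and its replay can be represented as below; apply_mutation stands in for the operator of Section IV-B and is an assumption of this sketch.

    # A mutation trace is a list of 3-tuples:
    #   (operator, target_object_path, "file_id: source_object_path")
    example_trace = [
        ("insert", "/Root/Pages/Kids/1", "1: /Root/Pages/Kids/4"),
    ]

    def replay_trace(seed, trace, apply_mutation):
        """Re-apply a stored trace to a new seed to initialize its population."""
        variant = seed
        for op, target_path, source in trace:
            variant = apply_mutation(variant, op, target_path, source)
        return variant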
V. EXPERIMENT

We evaluate the effectiveness of the proposed method by conducting experiments on the two target PDF malware classifiers.

First, we filtered out the samples that do not have any network API calls according to the shellcode analysis of Wepawet, leaving 9,688 out of 10,980 samples. This is not necessary for our method, but useful since we use Wepawet to obtain additional information about the samples.

Second, the remaining samples were tested in the Cuckoo sandbox. According to the vulnerability information of each sample provided by Wepawet, Adobe Acrobat Reader 8.1.1 is the most common target PDF reader, except for CVE-2009-9837 which targets Foxit readers. Thus, these samples were loaded with Acrobat Reader 8.1.1. However, not all network behaviors indicated by the static analysis of the shellcode can be observed in Cuckoo, even with a targeted PDF reader, due to the imperfect network simulation in virtual machines as well as potential sandbox detection features in malware. As a result, only 1,414 out of the 9,688 samples were observed to have malicious network activities running on Acrobat Reader 8.1.1 inside the Cuckoo sandbox.

Next, the 1,414 samples were repacked by the modified pdfrw with less strict grammar checking, then re-tested by Wepawet and Cuckoo. This resulted in 1,384 unique samples. Eleven of the samples were corrupted during repacking and no longer behaved maliciously in Wepawet or Cuckoo. The other 19 samples were found to be duplicates after being repacked. This is a clear sign that malware authors have attempted to evade detection through parsing obfuscation.

Since our goal is to evaluate the effectiveness of an evasion attack, we need to filter out the false negative samples of the target classifiers. PDFrate correctly classified 1,378 out of the 1,384 samples as malicious, while Hidost only correctly classified 502 of them. The intersection of the true positives from both classifiers left a suitable evaluation set of 500 unique PDF malware samples.

According to results from Wepawet, these 500 malware samples exploit two different vulnerabilities in Acrobat Readers: 333 of them exploit multiple buffer overflows reported in CVE-2007-5659, and the other 167 exploit a stack-based buffer overflow reported in CVE-2009-0927. Both vulnerabilities can be exploited to execute arbitrary code. In summary, the payloads in the 500 samples access 255 different hosts to download additional malware from the Internet.

The selection process leaves us with 500 samples from the original 10,980 malware samples in the Contagio archive. Although this selects less than 5% of the original samples, it does not have implications for the success rate of a malware author attempting to find an evasive sample so long as the selection criteria have no biases which would impact our results. Many of the down-selects are due to artifacts of the experiment, not reflective of what an actual malware producer would observe. For example, the most significant reduction is because of the particular dynamic environment we selected to verify the malicious behaviors. Malware authors can easily design an oracle that verifies the presence of the particular malicious behaviors they intend to inflict.

Reliable Malware Signatures. Since the dynamic behavior of malware samples may vary across executions, we need to select a reliable malware signature from a group of candidates. Even though the malware is executing in the same virtual environment, its behavior may be affected by the timing of events, service failures, and other sources of non-determinism.

Focusing on the network behaviors of malware samples, we may extract various network behaviors reported by Cuckoo as signatures, such as DNS queries, HTTP URL requests, and network destinations. Cuckoo generates these reports from the network-related API execution traces and the captured network traffic. Table II compares the effectiveness of six different types of signatures extracted from Cuckoo reports.

We tested the 500 malware seeds in Cuckoo virtual machines, running each seed ten times. Our goal is to determine which type of signature will have the best precision in capturing observed malicious behavior, while being consistent across multiple executions of the same sample.

If a signature extracts any relevant behavior for a seed in any of the ten tests, we count the signature as effective on that seed. Obviously, an ideal signature would be effective on all 500 seeds. We also measure the consistency of a signature over the 10 repeated tests. We designate the extracted behavior observed most frequently over the ten tests as the reference signature for a seed. The consistency on a seed is calculated as mode/10 (that is, the fraction of times the reference signature occurred across the 10 trials).

The average and the minimum consistency of each type of signature over the ten executions for each of the 500 seeds are listed in Table II. In general, the signatures extracted from API traces are more consistent than those extracted from network traffic. We choose the union of the HTTP URL requests and host queries extracted from API traces as the signature for our experiments. By combining those two behavioral signatures, we obtain a signature that is effective on all 500 malware seeds and has the highest average and minimum consistency.
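The consistency measure can be computed as follows; the URL and host in the example are placeholders, not values from our dataset.

    from collections import Counter

    def reference_and_consistency(signatures):
        """Given signatures from 10 runs of one seed, the reference signature is
        the most frequently observed one, and consistency is mode/10."""
        counts = Counter(frozenset(sig) for sig in signatures)
        reference, mode = counts.most_common(1)[0]
        return set(reference), mode / len(signatures)

    # e.g. ten runs where the same URL+host signature appears in nine of them
    runs = [{"http://example.test/exe.php", "example.test"}] * 9 + [set()]
    ref, consistency = reference_and_consistency(runs)   # consistency == 0.9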
Benign PDF Dataset. We collected a set of 179 benign PDF documents using a Google search with filetype:pdf and no keywords. All files were confirmed to be benign by both VirusTotal [31] and Wepawet [7]. We only included files smaller than 1 MB to avoid introducing unnecessary computation costs manipulating extremely large PDF files. We picked the 3 benign samples with the lowest scores (that is, most benign) from the target classifiers as the source of external objects in the experiment. Our results show that just a few benign samples are sufficient for generating successful evasion attacks.

GP Parameters. Several GP parameters were chosen arbitrarily, without any parameter fine-tuning other than one obvious constraint: we want the experiment to finish in a reasonable time. The population size is 48 and the maximum generation is 20. The mutation rate is 0.1. The fitness stop threshold is 0.0, which indicates that an evasive variant has been found.
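These settings can be summarized in a small configuration structure (hypothetical, for reference; the released code may organize them differently).

    GP_PARAMS = {
        "population_size": 48,
        "max_generations": 20,
        "mutation_rate": 0.1,
        "fitness_stop_threshold": 0.0,   # stop once a variant's fitness exceeds 0
    }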
Target Classifiers. Since we don't want to abuse the online deployed malware classification systems by submitting too many automatically generated malware variants, we always prefer locally executable code. We used the Mimicus re-implementation of PDFrate and the Hidost classifier, configured and trained as described in Section III-B.

Machine. We used one typical desktop PC in the experiment (Intel Core i7-2600 CPU @ 3.40GHz and 32GB of physical memory).
TABLE II. COMPARISON OF NETWORK-BASED MALWARE SIGNATURES.

Source          | Description                                                                 | Example                                                                           | Effective | Avg. Consistency | Min. Consistency
API traces      | Combination of HTTP URL requests and host queries                           | [http://stortfordaircadets.org.uk/flash/exe.php?x=pdf, stortfordaircadets.org.uk]   | 500       | 0.95             | 0.50
API traces      | Hosts queried through getaddrinfo()                                          | [stortfordaircadets.org.uk]                                                       | 497       | 0.95             | 0.50
Network traffic | Transport layer destination IP addresses                                     | (udp: [192.168.57.2:53], tcp: [192.168.57.2:80])                                  | 476       | 0.85             | 0.10
API traces      | URLs requested through raw socket, URLDownloadToFileW(), InternetOpenUrlA()  | [http://stortfordaircadets.org.uk/flash/exe.php?x=pdf]                              | 473       | 0.95             | 0.50
Network traffic | DNS queries                                                                  | [stortfordaircadets.org.uk]                                                       | 462       | 0.93             | 0.10
Network traffic | HTTP URL requests                                                            | [http://stortfordaircadets.org.uk/flash/exe.php?x=pdf]                              | 460       | 0.93             | 0.10

(The Consistency columns report the average and minimum of the per-seed consistency over ten executions.)
VI. RESULTS

The GP-based method achieves surprisingly good results in evading the two target classifiers. For both of the classifiers, it is able to generate a variant that preserves the malicious behavior but is classified as benign for all 500 seeds in our test set. Our code and data are available under an open source license from http://www.evadeML.org.

A. PDFrate

After approximately one week of execution, the algorithm found 72 effective mutation traces that generated 16,985 total evasive variants for the 500 malware seeds (34.0 evasive variants per seed on average), achieving a 100% evasion rate in attacking PDFrate.

Fig. 4. The length and efficacy of mutation traces for evading PDFrate.
Feature Analysis. To understand the evasion attacks, we examine the impact of the changes on the feature space used by PDFrate.

We first look at the two simplest mutation traces, of length 1, that are effective for 162 seeds:

    (insert, /Root/Pages/Kids, 3: /Root/Pages/Kids/4/Kids/5/)
    (replace, /Root/Type, 3: /Root/Pages/Kids/1/Kids/3)

Even though they are different operations, the common effect of the two mutations is that they both introduce new Page objects from external benign PDFs, resulting in significant changes in the feature space of PDFrate.

Table III lists one example of the feature changes produced by simply inserting several Page objects. The classification score of the original seed is 0.998, approaching the maximum malicious score of 1.0. After inserting the new Page objects, the classification score decreases to 0.43, which is below the normal malware threshold of 0.5. The simple insert resulted in a large number of changes in the feature space. The counters of some objects like pages, fonts and streams, as well as the file size, directly increase due to the newly introduced objects. The object length statistics are decreased or increased due to the change of the object population. Some other features on object positions are also changed due to the relocation of objects at the raw byte level. All feature values are in their raw formats because feature normalization is not required with random forests. Even though the feature changes are so significant that PDFrate classifies the new variant as benign, the malicious behavior of the original seed does not change at all; the change just added some pages to the PDF file.

TABLE III. IMPACT OF PDFRATE FEATURES.

Feature           | Original | Evasive   | ∆score1 | ∆score2 | Impact
count_font        | 0.0      | 70.0      | 0.114   | 0.392   | 0.506
count_obj         | 11.0     | 230.0     | 0.067   | 0.110   | 0.177
count_endobj      | 11.0     | 230.0     | 0.056   | 0.069   | 0.125
count_box_other   | 3.0      | 140.0     | 0.038   | 0.043   | 0.081
count_endstream   | 4.0      | 74.0      | 0.011   | 0.054   | 0.065
pos_box_max       | 0.0      | 0.8       | 0.052   | 0.013   | 0.065
count_stream      | 4.0      | 74.0      | 0.021   | 0.041   | 0.062
pos_box_avg       | 0.0      | 0.5       | 0.022   | 0.022   | 0.044
pos_eof_avg       | 1.0      | 1.0       | 0.000   | 0.032   | 0.032
pos_eof_min       | 1.0      | 1.0       | -0.002  | 0.029   | 0.027
pos_page_max      | 0.0      | 0.8       | 0.003   | 0.018   | 0.021
pos_eof_max       | 1.0      | 1.0       | 0.002   | 0.016   | 0.018
pos_page_avg      | 0.0      | 0.5       | 0.002   | 0.010   | 0.012
size              | 36,028.0 | 503,739.0 | -0.001  | 0.005   | 0.004
ratio_size_page   | 36,028.0 | 7,407.9   | 0.001   | 0.002   | 0.003
ratio_size_obj    | 3,275.3  | 2,190.2   | 0.000   | 0.002   | 0.002
ratio_size_stream | 9,007.0  | 6,807.3   | 0.002   | 0.000   | 0.002
len_obj_avg       | 3,234.9  | 2,157.2   | 0.001   | 0.000   | 0.001
count_page        | 1.0      | 68.0      | -0.004  | 0.004   | 0.000
len_obj_max       | 27,455.0 | 34,314.0  | 0.000   | 0.000   | 0.000
len_obj_min       | 44.0     | 49.0      | 0.000   | 0.000   | 0.000
len_stream_avg    | 8,700.3  | 6,390.2   | 0.000   | 0.000   | 0.000
len_stream_max    | 27,392.0 | 34,246.0  | 0.000   | 0.000   | 0.000
pos_page_min      | 0.0      | 0.0       | -0.002  | 0.000   | -0.002
pos_box_min       | 0.0      | 0.0       | -0.003  | 0.000   | -0.003

The table shows the difference in the feature space of PDFrate for a selected seed sample (with score 0.998) and the corresponding evasive variant (with score 0.43). Original is the feature value of the original seed malware sample; Evasive is the feature value of the evasive variant. Features with the same value for both samples are not included. ∆score1 is the original score minus the score after that feature is changed to the Evasive value; ∆score2 is the score after that feature is reset to the Original value minus the evasive score. Impact is ∆score1 + ∆score2.

One simple manipulation introduces many feature changes, but the impact of changing each feature is not equivalent due to the varying importance of features in the classification. Though a random forest is a complex non-linear model that is difficult to interpret, we estimate the impact of altering each feature independently. Intuitively, changing a high-impact feature should significantly affect the classification scores. For an original malware sample, if the feature is changed to a different value in evasive variants, the classification score should decrease (indicating it appears more like benign samples to the classifier). On the other hand, for an evasive variant, if the feature value is reset to the original value, the classification score should increase (appearing more like malicious samples to the classifier). Therefore, we model the impact with two factors. The decrease of the classification score of a malware seed when a feature is changed to the evasive value is reported as ∆score1. The increase of the classification score of an evasive variant when the feature is reset to the original value is reported as ∆score2. The impact is the sum of the two scores. Table III lists the impact ranking of the affected features, which roughly matches the feature importance ranking in PDFrate [26].
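The impact estimate can be expressed as a short procedure over PDFrate feature vectors; classify is assumed to be a callable that maps a feature dict to a classification score.

    def feature_impact(classify, seed_features, evasive_features, feature):
        """Estimate a feature's impact following the ∆score1/∆score2 method."""
        # ∆score1: drop in the seed's score when only this feature takes the
        # evasive value.
        modified_seed = dict(seed_features)
        modified_seed[feature] = evasive_features[feature]
        delta1 = classify(seed_features) - classify(modified_seed)

        # ∆score2: rise in the evasive variant's score when the feature is reset
        # to its original value.
        reset_variant = dict(evasive_features)
        reset_variant[feature] = seed_features[feature]
        delta2 = classify(reset_variant) - classify(evasive_features)

        return delta1 + delta2      # the Impact column of Table III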
The most critical feature change for this example is count_font. The original malware sample does not have any font objects, as fonts are not needed for the exploit. The classifier learns that this feature is important because most of the malware samples in the training set do not contain any font objects (the malware authors are too lazy to insert any text), but it is unlikely that any benign PDF file has no font objects. However, this is an artifact of the malware samples in the training set, not an inherent property of malicious PDFs. It is trivial to add font objects to an existing PDF malware sample to alter the value of this feature.

There are longer traces, which contain up to 354 mutations and influence more features in PDFrate. Table IV lists the features that were most frequently increased and decreased across all 16,985 evasive variants found. (The full list of all 68 mutable features of PDFrate found in evasion attacks is given in Appendix A.) The count is how many times the value of the feature differs between the evasive variant found and the original seed. High counts imply these features are not robust and should not be used in malware classification, because they are easy to change without corrupting the malicious properties for many malware seeds.

Most non-robust features are unsurprising, because a PDF malware author can always change the visible contents (such as pages, text, images and metadata) in PDF malware samples without corrupting the malicious payloads. The only surprising feature is count_javascript. Since PDF malware heavily relies on JavaScript to carry exploits and shellcode, it seems surprising that it is possible to decrease count_javascript without disrupting the malicious behavior. However, the count_javascript feature is not an accurate count of the number of embedded JavaScript code pieces in a PDF. It just counts the number of JavaScript keywords, but these keywords are optional in script execution. The targeted PDF reader will execute the JavaScript even without the /JavaScript keyword.
Fig. 6. The distribution of the original classification score of seeds.

B. Hidost

The experiment of evading Hidost took around two days to execute. Although Hidost was designed specifically to resist evasion attempts,² our method achieves a 100% evasion rate, generating 2,859 evasive samples in total for 500 seeds (5.7 evasive samples per seed on average).

² Specifically, the Hidost authors claim, "The most aggressive evasion strategy we could conceive was successful for only 0.025% of malicious examples tested against an off-the-shelf nonlinear SVM classifier with the RBF kernel using the binary embedding. Currently, we do not have a rigorous mathematical explanation for such a surprising robustness. Our intuition suggests that the main difficulty on attacker's part lies in the fact that the input features under his control, i.e., the structural elements of a PDF document, are only loosely related to the true features used by a classifier. The space of true features is hidden behind a complex nonlinear transformation which is mathematically hard to invert." [28]

Trace Analysis. We analyze the efficacy of each mutation trace, examined in the same way as for PDFrate. The length and efficacy of each mutation trace are shown in Figure 7. In general, it required shorter mutation traces to achieve a 100% evasion rate in attacking Hidost than it did for PDFrate.

Fig. 7. The length and efficacy of mutation traces for evading Hidost.

We observed two major differences compared to PDFrate. First, there is no increasing trace length trend for newly found mutation traces, unlike for PDFrate where the trace length increases with the trace ID. Second, the trace length is more correlated with the efficacy: longer traces tend to be more effective in generating evasive variants. Several short mutation traces with fewer than 5 mutations are only effective on 1 or 2 malware seeds. In contrast, a long mutation trace containing 61 mutations is effective on 334 malware seeds.

The accumulated number of evasions found, sorted by the length of mutation traces, is given in Figure 5. The plot is closer to linear, suggesting that, in contrast to PDFrate, there is little variation in the difficulty of finding evasive variants for different seeds. We believe the differences from PDFrate stem from the different feature set in Hidost. The mutation operations have more direct influence on the structural path features in Hidost. For example, an object deletion operation just deletes the corresponding path of a feature (along with those of its descendants). In contrast, feature changes in PDFrate resulting from the same operation are less tangible. Besides decreasing the counts of specific objects, as we can expect, other positional features may also change due to the relocation of objects in repacking the modified variant.
TABLE V. F EATURE CHANGES PRODUCED BY LONGEST H IDOST
required to reach the evasion threshold. The box plot of the MUTATION TRACE .
original classification score in Hidost of each seed shown in
Added Features Deleted Features
the right side of Figure 6 suggests that it usually requires more Threads AcroForm
mutations to find an evasive variant for seeds that appear to ViewerPreferences/Direction Names/JavaScript/Names/S
be more clearly malicious to the classifier. Metadata AcroForm/DR/Encoding/PDFDocEncoding
Metadata/Length AcroForm/.../PDFDocEncoding/Differences
Metadata/Subtype AcroForm/.../PDFDocEncoding/Type
Feature Analysis. The binary features used in Hidost are much Metadata/Type Pages/Rotate
OpenAction/Contents AcroForm/Fields
easier to interpret than the variety of features used by PDFrate. OpenAction/Contents/Filter AcroForm/DA
OpenAction/Contents/Length Outlines/Type
We first look at the simplest mutation traces. There are 5 Pages/MediaBox Outlines
mutation traces in length 1, which are only effective on 1 or Outlines/Count
2 malware seeds. They are: Pages/Resources/ProcSet
Pages/Resources
(delete, /Root/OpenAction/JS/Length)
(delete, /Root/Names)
(delete, /Root/AcroForm/DR) to develop techniques to automatically pare down a trace to
(replace, /Root/AcroForm/DR, its essential operations if desired. The yellow triangle plot in
3: /Root/OpenAction/D/0/.../FontBBox/3) Figure 7 shows the number of affected features for each trace.
(replace, /Root/AcroForm/DR,
3: /Root/Pages/Kids/3/.../DescendantFonts/0/DW) Although its authors claimed that Hidost was robust against
evasion attacks involving just feature addition, we found many
evasive variants that only added features. Among the 2,859
The first three mutations each delete a node from the original evasive variants, 761 are pure feature addition attacks, 21 of
malware seeds, changing the value of the corresponding Hidost them are pure feature deletion attacks, and the other 2,077
feature from 1 to 0. The first deleted object similar to the involved both feature addition and deletion. It is already
count javascript feature in PDFrate. Both capture properties unrealistic to assume attackers can only insert features, and,
that frequently exist in malware samples but not in benign as shown in the claims about non-evadability of Hidost,
files. However, they are optional in malicious code execution. dangerous to assume a technique cannot be evaded because
The other deleted objects are artifacts in the training dataset particular manual techniques fail.
that are not closely tied to malicious behavior. Although the
last two traces use replace operations, the important effects of A complete list of mutated features in evading Hidost is
the replacements are to remove the features extracted from the given in Appendix B. These non-robust features should not be
children objects of the original /Root/AcroForm/DR node. used in a malware classifier, as they can be easily changed
while preserving the original malicious properties.
Simply deleting some objects is not sufficient to evade
Hidost (it is only effective on 1 or 2 malware seeds in
our experiment), but additional mutations are enough to find C. Cross-Evasion Effects
evasive variants for all of the seeds. The longest mutation trace
contains 85 operations, which is effective on 198 malware Even though the classifiers are designed very differently
seeds for generating evasive variants to bypass Hidost. Table V and trained with different training datasets, we suspected they
lists the all of feature changes observed over the 198 malware must share some properties in the same classification task.
seeds when executing that mutation trace. Unsurprisingly, Therefore, we conducted a cross-evasion experiment by feed-
several auxiliary objects are added or deleted to fool Hidost. ing one classifier with the evasive variants found in evading
For example, several metadata objects are inserted. Metadata the other classifier.
widely exists in benign PDFs when users generate PDF docu-
ments with popular PDF writers. On the other hand, it is rare in For 388 of the malware seeds, the evasive variants found
PDF malware because malware authors did not add metadata by evading Hidost are also effective in evading PDFrate. That
in hand-crafting PDF exploits. However, this is just an artifact is to say, without any access to PDFrate, a malware author
in the training dataset and not an essential difference between with access to Hidost could find evasive variants for 77.6% of
PDF documents and PDF malware. Inserting metadata into a the seeds. In contrast, the evasive variants found by evading
PDF malware sample increases the likelihood of the sample PDFrate are only effective against Hidost for two of the
A complete list of mutated features in evading Hidost is given in Appendix B. These non-robust features should not be used in a malware classifier, as they can be easily changed while preserving the original malicious properties.

C. Cross-Evasion Effects

Even though the classifiers are designed very differently and trained with different training datasets, we suspected they must share some properties since they perform the same classification task. Therefore, we conducted a cross-evasion experiment by feeding one classifier with the evasive variants found in evading the other classifier.

For 388 of the malware seeds, the evasive variants found by evading Hidost are also effective in evading PDFrate. That is to say, without any access to PDFrate, a malware author with access to Hidost could find evasive variants for 77.6% of the seeds. In contrast, the evasive variants found by evading PDFrate are only effective against Hidost for two of the malware seeds.

The significant difference in the cross-evasion effects is due to the different feature sets in the two classifiers. Indeed, the primary design goal for Hidost was to be less easily evaded than other classifiers by using features based on structural properties. The evasive variants generated by the algorithm in evading PDFrate do change the measured features significantly; however, they have little effect on the structural features used by Hidost. In the reverse direction, the evasive variants targeting Hidost directly alter structural features (necessary to evade Hidost) and thereby incidentally impact the features used by PDFrate.
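The cross-evasion measurement itself is simple. As a rough sketch (with hypothetical helpers: evasive_variants returns the variants found while attacking a given classifier, and score returns the target classifier's maliciousness score), it amounts to:

# Rough sketch of the cross-evasion measurement; `evasive_variants` and
# `score` are hypothetical helpers, and `threshold` is the target
# classifier's benign/malicious cutoff (higher score = more malicious).

def cross_evasion_rate(seeds, source_clf, target_clf,
                       evasive_variants, score, threshold=0.5):
    """Fraction of seeds whose variants found against source_clf also
    evade target_clf."""
    evaded = sum(
        1 for seed in seeds
        if any(score(target_clf, v) < threshold
               for v in evasive_variants(seed, source_clf))
    )
    return evaded / len(seeds)

In our results, this rate is 77.6% for variants found against Hidost and tested on PDFrate, but only 0.4% (two seeds) in the reverse direction.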
Fig. 8. Time required to find evasive variants for 500 malware samples. (The figure compares PDFrate and Hidost, with time broken down into Oracle, Mutation, Classifier, and Others.)

D. Execution Cost

One drawback of evolutionary algorithms is that they provide no guarantees about generating good results within a specific duration. For many problems, these methods can require a huge amount of computing resources before a desired result is found. Further, failing to find the desired result may be a sign that it doesn't exist, or just that more computing resources are required. Our experiments show that the resources required for this instantiation are very reasonable.

For each classifier target, the experiment was run in several rounds. The first round started with empty trace pools, so the search for evasive variants relied solely on the stochastic search directed by the fitness function. In later rounds, the successful and promising mutation traces found previously were used to accelerate the search. All the failed jobs in a round were re-run in the next round with all the mutation traces stored in previous rounds.
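A minimal sketch of this round structure follows; attack is a hypothetical wrapper around a single evasion job that is seeded with the accumulated trace pool and returns any evasive traces it finds (an empty list on failure).

# Minimal sketch of the multi-round search, assuming a hypothetical
# `attack(seed, trace_pool)` job that returns the evasive traces it found.

def run_rounds(seeds, attack, max_rounds=5):
    trace_pool, pending = [], list(seeds)
    for _ in range(max_rounds):
        failed = []
        for seed in pending:
            found = attack(seed, trace_pool)   # reuse traces from earlier jobs
            if found:
                trace_pool.extend(found)       # promising traces help later jobs
            else:
                failed.append(seed)
        if not failed:                         # every seed has an evasive variant
            break
        pending = failed                       # re-run only the failed jobs
    return trace_pool, pending                 # pending = seeds still unevaded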
For PDFrate, it took four rounds to reach 100% evasion. The evasion rate on Hidost reached 100% in the second round. Figure 8 shows the total time needed to find evasive variants for all 500 malware seeds in evading the two classifiers. The duration of each individual job is not meaningful because it mainly depends on the job sequence: later jobs are usually much faster because they may benefit from mutation traces found earlier, but the order of trying the seeds is arbitrary.

It took less than 6 days to achieve a 100% evasion rate on PDFrate. In other words, our method found an evasive variant for each seed in 16 minutes on average running on a commodity desktop. Evasive variants were found against Hidost three times faster, taking 5 minutes per seed on average.

The main computation time is spent running the generated variants in the Cuckoo sandbox, which we use as the oracle in our process. With 16 virtual machines running in parallel, the machine is able to test 1,000 samples per hour. This could easily be accelerated by using more machines, since there are no dependencies between the executions.
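Because the oracle queries are independent, scaling them out is trivial. As an illustrative sketch (not our actual harness), run_in_sandbox stands in for a blocking wrapper around a Cuckoo submission that returns whether malicious behavior was observed:

# Illustrative sketch of parallel oracle queries.  `run_in_sandbox` is a
# hypothetical blocking wrapper around a Cuckoo submission that returns
# True if the variant exhibited malicious behavior.

from concurrent.futures import ThreadPoolExecutor

def test_variants(variant_paths, run_in_sandbox, max_workers=16):
    """Query the oracle for many variants concurrently (one worker per VM)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_in_sandbox, variant_paths))
    return dict(zip(variant_paths, results))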
We also observed that the time spent on other tasks (including mutation) in attacking PDFrate takes a larger proportion of the total duration than for Hidost (8.3% vs. 4.1%). This is because the benign files used as the external object genome are larger than those used in attacking Hidost. Hence, the search produced larger variants, increasing the computational burden for parsing, manipulating, and repacking.

VII. DISCUSSION

In this section we discuss the potential defenses and future directions suggested by our results.

A. Defense

Information Hiding and Randomization. One of the most direct solutions to protect classifiers is hiding the classification scores from the users or adding random noise to the scores [2]. Another proposed method is the multiple classifier system, in which the classification scores are somewhat randomly picked from different models trained with disjoint features [3]. As our method heavily relies on the classification scores of variants to calculate fitness scores that direct the evolution, the lack of accurate score feedback makes the search for evasive variants much harder and may make our approach infeasible.
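To make the idea concrete, the following sketch shows one way such a defense could be wrapped around trained models. It is only an illustration of the general idea, not the designs of [2] or [3], and the models are assumed to expose a hypothetical score method.

# Illustration only (not the designs of [2, 3]).  `models` is assumed to be
# an ensemble trained on disjoint feature sets, each with a hypothetical
# `score(sample)` method returning a maliciousness score.

import random

class RandomizedScoreWrapper:
    def __init__(self, models, noise=0.1):
        self.models = models
        self.noise = noise

    def query(self, sample):
        model = random.choice(self.models)            # pick one model at random
        score = model.score(sample)
        return score + random.gauss(0.0, self.noise)  # blur the score feedback

    def label(self, sample, threshold=0.5):
        return self.query(sample) >= threshold        # or expose only a hard label

An attacker who only sees noisy scores, or only hard labels, receives a much weaker fitness signal for directing the search.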
However, the intrinsic non-robustness of superficial features should not be simply ignored. Considering the potential cross-evasion effects (Section VI-C), hiding or randomizing the information may not help much against an adversary who can infer something about the types of features used by the target classifier. Moreover, previous work has shown that accurately re-implementing a similar classifier with a surrogate training set is possible (indeed, this is what the authors of Mimicus did to experiment with the evadability of PDFrate [26, 29]).

Adapting to Evasive Variants. Our experiments assume that the adversary can test samples without exposing them to the classifier operator. In an on-line scenario, the classifier may be able to adapt to attempted variants. Note, however, that retraining is expensive and opens up the classifier to alternate evasion strategies such as poisoning attacks.

Chinavle et al. proposed a method that automatically retrains the classifier with pseudo labels once evasive variants are detected by a mutual agreement measure on the ensemble model, which was shown to be effective on a spam detection task [6]. However, adapting to users' input without true labels introduces a new risk of poisoning attacks.

Defeating Overfitting. The evadability of classifiers we demonstrate could be just an issue of overfitting, in which case well-known machine learning practices should work to defeat overfitting, for example, collecting a much larger dataset for training the model or using model averaging to lower the variance.

We don't expect these conventional methods will help, however. It is impossible to collect a complete dataset of future malware, and none of these techniques anticipate an adversary who is actively attempting to evade the classifier.
Selecting Robust Features. We found many non-robust features in the two classifiers in the evasion experiments. Obviously, they should be removed from the feature set, as they can be easily manipulated by the attacker without corrupting the malicious properties. The problem with the features used by both Hidost and PDFrate, however, is that all of the features are likely non-robust. The superficial features used by these classifiers do not capture any intrinsic difference between benign and malicious PDFs, and it would be very surprising if superficial features were found that could be used for robust classification. Instead, it seems necessary to use deeper features to build classifiers that can resist evasion attempts by sophisticated adversaries. Such features will depend on higher-level semantic analysis of the input file, in ways that are difficult to change without disrupting the malicious behavior.

B. Improving Automatic Evasion

Our automatic evasion method provides a general way to evaluate the robustness of classifiers for security tasks. Its ability to find evasive variants against a target classifier demonstrates clear weaknesses, but if our method fails to find evasive variants against a particular classifier, this is certainly not enough to be confident that other techniques (including manual effort) would not be able to find evasive variants. Hence, it is valuable to improve the method to enable more efficient searching and to target more challenging classifiers.

Parameter Tuning. In this work, we chose the search parameters arbitrarily. Tuning the parameters, or even trying dynamic mechanisms like parameter decay, could make the search algorithm more efficient.

Learnable GP. The current method we use to generate evasive variants is essentially a random search algorithm. Hence, it often generates corrupted variants that lose the malicious behavior. A probabilistic model could learn which mutations are more effective for generating evasive variants and direct the search more efficiently.
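One simple instance of this idea (a sketch, not something we have implemented) is to keep per-mutation success statistics and sample the next mutation in proportion to them:

# Sketch of success-weighted mutation selection; mutation identifiers and
# the fitness feedback are abstract placeholders.

import random
from collections import defaultdict

class MutationSelector:
    def __init__(self, mutations, smoothing=1.0):
        self.mutations = list(mutations)
        self.success = defaultdict(lambda: smoothing)  # Laplace-smoothed counts

    def pick(self):
        weights = [self.success[m] for m in self.mutations]
        return random.choices(self.mutations, weights=weights, k=1)[0]

    def feedback(self, mutation, improved_fitness):
        if improved_fitness:                 # reward mutations that raised fitness
            self.success[mutation] += 1.0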
VIII. RELATED WORK

Evasion of PDF malware classifiers has been studied before, but prior attacks generally assumed that an attacker can only add new content to a malicious file, which kept them from generating actual evasive PDF malware. In fact, our experiments show that attackers can also delete features while preserving maliciousness, and we verified that the resulting evasive variants preserved malicious behavior through dynamic execution in a test environment.
Šrndic et al. demonstrated how PDFrate could be evaded by exploiting an implementation flaw in the feature extraction [29]. Our method does not rely on any particular implementation flaw in a target classifier. Instead, it exploits the weak spots in a classifier model's feature space and employs a stochastic method to manipulate samples in diverse ways.

Maiorca et al. proposed reverse-mimicry attacks against PDF malware classifiers [20]. In a reverse-mimicry attack, a benign sample is manipulated into a malicious one by inserting malicious payloads into its structure. The attack is generic to a class of classifiers based on structural features. However, the hand-crafted attack only works on malware with simple payloads. In contrast, our GP-based method is automatic and does not have this limitation.

Evolutionary algorithms have also recently been used to fool deep learning-based computer vision models [22]. In contrast, our work uses genetic programming, an important branch of evolutionary algorithms for generating highly-structured data like computer programs.

IX. CONCLUSIONS

Our experiments show how the traditional approach to building machine learning classifiers can fail against determined adversaries. We argue that it is essential for designers of classifiers used in security applications to consider how adversaries will adapt to those classifiers, and important for the research community to develop better ways of predicting the actual effectiveness of a classifier in deployment.
REFERENCES

[1] Adobe, Inc. PDF Reference and Adobe Extensions to the PDF Specification. http://www.adobe.com/devnet/pdf/pdf_reference.html.
[2] Marco Barreno, Blaine Nelson, Russell Sears, Anthony D Joseph, and J Doug Tygar. Can Machine Learning Be Secure? In First ACM Symposium on Information, Computer and Communications Security (ASIACCS), 2006.
[3] Battista Biggio, Giorgio Fumera, and Fabio Roli. Multiple Classifier Systems for Adversarial Classification Tasks. In Multiple Classifier Systems. Springer, 2009.
[4] Battista Biggio, Igino Corona, Davide Maiorca, Blaine Nelson, Nedim Šrndić, Pavel Laskov, Giorgio Giacinto, and Fabio Roli. Evasion Attacks against Machine Learning at Test Time. In 6th European Machine Learning and Data Mining Conference (ECML/PKDD), 2013.
[5] Stephan Chenette. Malicious Documents Archive for Signature Testing and Research - Contagio Malware Dump. http://contagiodump.blogspot.de/2010/08/malicious-documents-archive-for.html.
[6] Deepak Chinavle, Pranam Kolari, Tim Oates, and Tim Finin. Ensembles in Adversarial Classification for Spam. In 18th ACM Conference on Information and Knowledge Management (CIKM), 2009.
[7] Marco Cova, Christopher Kruegel, and Giovanni Vigna. Detection and Analysis of Drive-By-Download Attacks and Malicious JavaScript Code. In 19th International World Wide Web Conference (WWW), 2010.
[8] CVE Details. Adobe Acrobat Reader — CVE Security Vulnerabilities, Versions and Detailed Reports. http://www.cvedetails.com/product/497.
[9] George E Dahl, Jack W Stokes, Li Deng, and Dong Yu. Large-Scale Malware Classification Using Random Projections and Neural Networks. In 38th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013.
[10] Nilesh Dalvi, Pedro Domingos, Sumit Sanghai, and Deepak Verma. Adversarial Classification. In 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2004.
[11] Stephanie Forrest. Genetic Algorithms: Principles of Natural Selection Applied to Computation. Science, 261(5123), 1993.
[12] Claudio Guarnieri, Alessandro Tanasi, Jurriaan Bremer, and Mark Schloesser. Cuckoo Sandbox: A Malware Analysis System. http://www.cuckoosandbox.org/.
[13] Mark Harman, William B Langdon, and Westley Weimer. Genetic Programming for Reverse Engineering. In 20th IEEE Working Conference on Reverse Engineering (WCRE), 2013.
[14] Thomas Hungenberg and Matthias Eckert. INetSim: Internet Services Simulation Suite. http://www.inetsim.org/.
[15] John R Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection, volume 1. MIT Press, 1992.
[16] Pavel Laskov and Nedim Šrndić. Static Detection of Malicious JavaScript-Bearing PDF Documents. In 27th ACM Annual Computer Security Applications Conference (ACSAC), 2011.
[17] Claire Le Goues, ThanhVu Nguyen, Stephanie Forrest, and Westley Weimer. GenProg: A Generic Method for Automatic Software Repair. IEEE Transactions on Software Engineering, 2012.
[18] Daniel Lowd and Christopher Meek. Adversarial Learning. In 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2005.
[19] Davide Maiorca, Giorgio Giacinto, and Igino Corona. A Pattern Recognition System for Malicious PDF Files Detection. In 8th International Conference on Machine Learning and Data Mining in Pattern Recognition, 2012.
[20] Davide Maiorca, Igino Corona, and Giorgio Giacinto. Looking at the Bag Is Not Enough to Find the Bomb: An Evasion of Structural Methods for Malicious PDF Files Detection. In 8th ACM Symposium on Information, Computer and Communications Security (ASIACCS), 2013.
[21] Patrick Maupin. PDFRW: A Pure Python Library That Reads and Writes PDFs. https://github.com/pmaupin/pdfrw.
[22] Anh Nguyen, Jason Yosinski, and Jeff Clune. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In 28th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[23] Conor Ryan. Automatic Re-Engineering of Software Using Genetic Programming, volume 2. Springer Science & Business Media, 2012.
[24] Karthik Selvaraj and Nino Fred Gutierrez. The Rise of PDF Malware. https://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/the_rise_of_pdf_malware.pdf, March 2010.
[25] Charles Smutz and Angelos Stavrou. Malicious PDF Detection Using Metadata and Structural Features. Technical report, 2012.
[26] Charles Smutz and Angelos Stavrou. Malicious PDF Detection Using Metadata and Structural Features. In 28th ACM Annual Computer Security Applications Conference (ACSAC), 2012.
[27] Nedim Šrndić and Pavel Laskov. Mimicus: A Library for Adversarial Classifier Evasion. https://github.com/srndic/mimicus.
[28] Nedim Šrndić and Pavel Laskov. Detection of Malicious PDF Files Based on Hierarchical Document Structure. In 20th Network and Distributed System Security Symposium (NDSS), 2013.
[29] Nedim Šrndić and Pavel Laskov. Practical Evasion of a Learning-Based Classifier: A Case Study. In 35th IEEE Symposium on Security and Privacy (Oakland), 2014.
[30] Symantec Corporation. Symantec Internet Security Threat Report, 2015.
[31] VirusTotal. Free Online Virus, Malware and URL Scanner. https://www.virustotal.com/.
APPENDIX

A. Mutated Features in PDFrate

The 68 features mutated in our experiments evading PDFrate are listed in Table VI. It is important to note, however, that just because a feature does not appear here does not mean it is robust to evasion: the features listed are those that were sufficient for achieving a 100% evasion rate in our experiment. Similarly, the mutation directions shown are simply those observed in the evasion attack experiment; it does not necessarily mean that these features cannot also be mutated in the reverse direction without corrupting malware samples in practice.
TABLE VI. 68 FEATURES MODIFIED EVADING PDFRATE

Feature Name               Mutability     Feature Name               Mutability
box_nonother_types         ↓              image_totalpx              ↑
box_other_only             ↑              len_obj_avg                ↑↓
count_acroform             ↑↓             len_obj_max                ↑↓
count_action               ↑↓             len_obj_min                ↑↓
count_box_letter           ↓              len_stream_avg             ↑↓
count_box_other            ↑↓             len_stream_max             ↑
count_endobj               ↑↓             len_stream_min             ↑↓
count_endstream            ↑↓             pos_acroform_avg           ↑↓
count_font                 ↑↓             pos_acroform_max           ↑↓
count_image_med            ↑              pos_acroform_min           ↑↓
count_image_small          ↑              pos_box_avg                ↑↓
count_image_total          ↑              pos_box_max                ↑↓
count_image_xsmall         ↑              pos_box_min                ↑↓
count_javascript           ↓              pos_eof_avg                ↑
count_js                   ↓              pos_eof_max                ↑
count_obj                  ↑↓             pos_eof_min                ↑
count_objstm               ↓              pos_image_avg              ↑
count_page                 ↑↓             pos_image_max              ↑
count_page_obs             ↓              pos_image_min              ↑
count_stream               ↑↓             pos_page_avg               ↑↓
createdate_mismatch        ↑              pos_page_max               ↑↓
createdate_ts              ↑              pos_page_min               ↑↓
createdate_tz              ↓              producer_dot               ↑↓
createdate_version_ratio   ↑              producer_lc                ↑↓
creator_dot                ↑↓             producer_len               ↑↓
creator_lc                 ↑↓             producer_mismatch          ↑
creator_len                ↑↓             producer_num               ↑↓
creator_mismatch           ↑              producer_oth               ↑↓
creator_num                ↑↓             producer_uc                ↑↓
creator_oth                ↑↓             ratio_imagepx_size         ↑↓
creator_uc                 ↑↓             ratio_size_obj             ↑↓
delta_ts                   ↑              ratio_size_page            ↑↓
delta_tz                   ↓              ratio_size_stream          ↑↓
image_mismatch             ↑              size                       ↑

B. Mutated Features in Hidost

The 24 inserted features and the 19 deleted features involved in finding the 2,859 evasive variants against Hidost are listed in Table VII. As with PDFrate, the features that are not listed are not necessarily robust features.

The counts give the number of evasive variants that mutated that feature. Note that some features are hierarchically dependent in the PDF object structure, so one insertion or deletion may impact many features. For example, inserting a complete Metadata object (as is done in 2,507 of the variants) also introduces several child objects: Metadata/Length, Metadata/Subtype, and Metadata/Type.
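This hierarchical expansion is easy to see with a toy example. The sketch below is only an illustration, not our feature extractor; the nested dict is a hypothetical stand-in for a parsed PDF object tree, and the function enumerates Hidost-style structural paths for a node and all of its descendants.

# Toy illustration of hierarchical feature paths; the nested dict is a
# hypothetical stand-in for a parsed PDF object tree.

def structural_paths(obj, prefix=""):
    paths = []
    if isinstance(obj, dict):
        for key, child in obj.items():
            path = f"{prefix}/{key}" if prefix else key
            paths.append(path)
            paths.extend(structural_paths(child, path))
    return paths

inserted = {"Metadata": {"Length": 210, "Subtype": "XML", "Type": "Metadata"}}
print(structural_paths(inserted))
# ['Metadata', 'Metadata/Length', 'Metadata/Subtype', 'Metadata/Type']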
TABLE VII. FEATURES ALTERED EVADING HIDOST

Counts    Inserted Feature
2,507     Metadata
2,507     Metadata/Length
2,507     Metadata/Subtype
2,507     Metadata/Type
2,454     PageLabels
2,363     ViewerPreferences/Direction
1,991     Pages/Resources/ProcSet
1,968     Pages/Resources
1,702     Pages/Rotate
1,382     Pages/MediaBox
825       Threads
718       OpenAction/MediaBox
385       OpenAction/Contents/Filter
385       OpenAction/Contents/Length
369       OpenAction/Contents
319       OpenAction/Resources
319       OpenAction/Resources/ProcSet
158       OpenAction/Rotate
158       OpenAction/CropBox
51        OpenAction/Type
51        OpenAction
41        PageLabels/Nums
41        PageLabels/Nums/S
40        PageLayout

Counts    Deleted Feature
1,345     Names/JavaScript/Names/S
865       PageLayout
615       Outlines/Type
615       Outlines
615       Outlines/Count
502       AcroForm/Fields
500       AcroForm
330       OpenAction/JS/Length
54        Pages/Rotate
14        Pages/Resources/ProcSet
12        AcroForm/DR/Encoding/PDFDocEncoding
12        AcroForm/DR/Encoding/PDFDocEncoding/Differences
12        AcroForm/DR/Encoding/PDFDocEncoding/Type
11        Pages/Resources
9         AcroForm/DA
8         Pages/MediaBox
4         OpenAction/S
3         Names/EmbeddedFiles
2         Names