Python and Malware: Developing Stealth and Evasive Malware Without Obfuscation
Abstract: With the continuous rise of malicious campaigns and the exploitation of new attack vectors, it is necessary to
assess the efficacy of the defensive mechanisms used to detect them. To this end, the contribution of our work
is twofold. First, it introduces a new method for obfuscating malicious code to bypass all static checks of
multi-engine scanners, such as VirusTotal. Interestingly, our approach to generating the malicious executables
is not based on introducing a new packer but on augmenting the capabilities of an existing and widely
used tool for packaging Python, PyInstaller, although it can be applied to all similar packaging tools. As we prove, the
problem is deeper and inherent in almost all antivirus engines and is not PyInstaller-specific. Second, our work
exposes significant issues of well-known sandboxes that allow malware to evade their checks. As a result,
we show that stealth and evasive malware can be efficiently developed, bypassing with ease state of the art
malware detection tools without raising any alert.
Koutsokostas, V. and Patsakis, C.
Python and Malware: Developing Stealth and Evasive Malware without Obfuscation.
DOI: 10.5220/0010541501250136
In Proceedings of the 18th International Conference on Security and Cryptography (SECRYPT 2021), pages 125-136
ISBN: 978-989-758-524-1
Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
SECRYPT 2021 - 18th International Conference on Security and Cryptography
namic analysis. Thus, they perform specific checks in the system to determine whether they are being executed in a virtual environment and whether known protection mechanisms typical of sandboxes are running, and they can assess their execution mode to tell whether they are being debugged (Issa, 2012). If any of these checks is positive, the malware typically changes its behaviour to harden and slow down its analysis. All the above can come under one umbrella to facilitate malware evasion by simultaneously packing the binary and armouring it with a myriad of evasion methods (Liţă et al., 2018).

The main goal of this work is to assess the effort and methods needed to create stealth malware. We define this stealth concept in an objective and repeatable way. More precisely, we consider that a malware sample is stealth if (i) it achieves a “clean sheet” after inspection by multi-engine scanners, such as VirusTotal (VT), and (ii) malware sandbox environments do not consider it malicious per se. VT and other similar services statically examine the file with several dozen antiviruses (AVs). Therefore, even if an AV may detect the malware on execution, VT’s verdict might classify it as benign. Note that a clean sheet verdict from VT, which has around 70 AVs, clearly shows the trend of the market, meaning that the rest of the AVs, which hold a minor share of the market, are not expected to behave differently. Practically, our work starts from understanding why some AVs erroneously flag some executables as malicious and uncovers an inherent problem of AV engines when handling Python files. This can be easily escalated to develop undetectable malware. What is even more alarming is that, while one may argue that there are several tricks to bypass static AV tests by hiding the payload, we illustrate that a threat actor does not need to hide the payload. Widely used payloads can simply be embedded in Python and escape detection.

Nevertheless, it is clear that once the user is lured to execute malware, it might be too late to block its actions. Moreover, we consider the malware as stealth if it escapes detection from the state of the art malware sandboxes. To this end, we experimented with the most well-known sandboxes publicly available on the Internet. Our analysis and experiments have uncovered significant issues in these sandbox environments that allow malware to bypass them. Based on the above, our work illustrates critical issues in detecting malware that affect the whole ecosystem, spanning from how AVs statically recognise malware to the evasion of sandboxed environments. Practically, using our methods, one may efficiently develop malware or armour an existing one so that it is not detected by a wide range of state of the art tools used for detecting malware.

In what follows, we provide a brief overview of the related work. Then, we discuss the conceptual approach for the development of stealth malware. In Section 5, we analyse our experiments and the extracted results. Then, in Section 6, we discuss our findings and their impact. The article concludes by summarising the contributions of our work and outlining future work.

Ethical Compliance. Our work complies with the standards for conducting offensive security in an ethical way. To this end, we have responsibly disclosed our findings to each sandbox provider individually prior to submitting this work. Moreover, we have not published nor communicated our methods to prevent them from being used in the wild.

2 RELATED WORK

Similar to the use of sandboxes for cats, a malware sandbox is a controlled virtualised environment to which a potentially dangerous file is submitted for inspection, so that it does not “litter” the rest of the system. This environment will automatically execute/open the file and analyse its behaviour, such as filesystem interaction, network connections, registry changes and access, API calls, memory access, etc. The virtualised and isolated nature of the environment prevents the malware from causing any harm to the system performing the analysis. Another approach would be to actually debug the suspicious file, examine it in detail command by command, and even alter its behaviour.

Clearly, the above is not ideal for the adversary, so almost all modern malware comes equipped with evasion methods leveraging, for instance, sandbox and debugger detection. For sandbox evasion, the malware performs a broad range of checks to assess the environment in which it is being executed. In essence, the malware will look for environmental artifacts (Bulazel and Yener, 2017), which include but are not limited to hardware identifiers, presence of user interaction, sensor readings, uptime, usernames, timing discrepancies, registry values, and hardware specifications (Martignoni et al., 2009; Shi et al., 2014), see Figure 1. Therefore, such malware would resort to i) calls to the registry and checks of the process list and filesystem to perform pattern matching against a predefined set of strings, ii) time measurements to determine whether the elapsed time is aligned with the expected processing time, and iii) detection of possible deviations from the outcome of specific commands. The above indicates that minor details, for instance, the MAC address of the network interface, may easily reveal the virtualised environment, as may the list of running processes or inconsistencies in CPU/GPU specifications. Some malware may also use logic bombs to deliver their payload. For instance, the execution can be delayed based on time constraints or enabled only after proper packet receipt from a specific domain. In fact, the time that a honeypot devotes to the execution of a sample introduces many differences in what data is collected. As recently reported by Küchler et al. (Küchler et al., 2021), the bulk of the malware behavior is observed during the first two minutes of execution, while further actions may take up to ten minutes.

It must be highlighted at this point that, due to the monetisation model (discussed later on), a sandbox will not execute and inspect a binary for an arbitrary amount of time. Additionally, to analyse as many samples as possible, it cannot provide all the available system resources. Therefore, by delaying the execution or allocating a lot of space and memory, malware may evade detection. Moreover, the sharing of the processing resources may easily expose the virtualised environment, as the VM could report the host’s processor with only a fraction of the available cores. Recently, Huang et al. (Huang et al., 2020) introduced PiDicators, which do not use API calls but pure assembly code and far fewer checks to determine whether a binary is being executed in a VM, triggering far fewer alerts. It has to be noted that, given the wide adoption of virtualised environments in, e.g. cloud computing, some malware is even more targeted, trying to detect sandboxed environments and not simply virtualised ones (Yokoyama et al., 2016). For more on evasion methods, the interested reader may refer to (Chen et al., 2008; Issa, 2012; Petsas et al., 2014; Uitto et al., 2017; Veerappan et al., 2018; Afianian et al., 2019; Checkpoint Research, 2020; Apostolopoulos et al., 2021).

Figure 1: Sandbox evasion methods overview.

These countermeasures from the malware have resulted in the introduction of anti-evasion methods. For instance, MALGENE (Kirat and Vigna, 2015) performs data flow analysis and data mining on the system calls to determine whether the inspected binary’s actions could be a result of an evasion method. VM Cloak (Shi et al., 2017) checks the environment for misconfigurations and differences in execution environments that could reveal that the execution is done in a VM, while Leguesse et al. (Leguesse et al., 2017) harden Android sandboxes, which have more sensors to cover. A widely used project for hiding Windows VMs is A. Ortega’s pafish¹, which focuses on the checks that are performed by malware. Recently, D’Elia et al. (Cono D’Elia et al., 2020) introduced a dynamic binary instrumentation based method, called BluePill, which allows analysts to instrument the evasive malware binaries they are dissecting in a stealthy way, so that the malware cannot determine that it is being debugged. Nevertheless, this is another part of the continuous battle, bringing, for instance, anti-anti-evasion methods into this fight (D’Elia et al., 2019).

Finally, it should be noted that bare-metal malware execution environments, in which the execution is performed in an actual and not virtualised environment so that there is no VM or sandbox stain to cover, are also considered in the literature (Kirat et al., 2011; Guan et al., 2017; Kirat et al., 2014; Mutti et al., 2015; Deng and Mirkovic, 2018). Nevertheless, they cannot be considered a practical solution for assessing malware samples at the desired rate, as they cannot scale efficiently.

¹ https://github.com/a0rtega/pafish
² https://unit42.paloaltonetworks.com/unit-42-technical-analysis-seaduke/, https://blog.talosintelligence.com/2020/04/poetrat-covid-19-lures.html, https://blog.netlab.360.com/not-really-new-pyhton-ddos-bot-n3cr0m0rph-necromorph/, https://www.crowdstrike.com/blog/bears-midst-intrusion-democratic-national-committee/

3 PYTHON & PYINSTALLER

Python is an interpreted programming language with continuously increasing popularity. Despite its readability and simplicity, it has accumulated several features over the years, making it very attractive for scripting and Rapid Application Development. Currently, it is widely used for server-side web development, machine learning, system scripting and security-related software engineering, especially offensive security. The fact that Python can be used on all major platforms, that it is easy to write, and that many exploits and offensive security tools have been written in Python has pushed a lot of malware authors to write their malware in this programming language². However, we argue that there is another more
important issue with Python that makes it more attractive for malware authors. AVs have not properly integrated this attack vector in their scope, as we will show in the next paragraphs.

While Python is preinstalled by default in most Unix-like operating systems, this is not the case for Windows. Moreover, Python, as an interpreted language, does not compile to an executable. To create an executable from a Python script, there are several options, with the most popular one being PyInstaller. PyInstaller takes as input a Python script and tries to discover all the module and library dependencies that are needed to properly execute it. To do this, PyInstaller recursively looks for imports of the necessary files until it reaches native Python modules and libraries. Once the dependencies are identified, instead of keeping the Python scripts, PyInstaller keeps the compiled Python scripts (.pyc files), usually referred to as Python bytecode. These files, along with an active Python interpreter and environment in the form of what is called the bootloader, are copied into a folder. Thus, PyInstaller allows the packaging of applications in folders and single executable files without the need to have Python preinstalled.

The bootloader is the core component of PyInstaller, as it prepares the environment for executing the Python code and actually executes it. The bootloader is different for each architecture and highly customizable. Once someone launches a bundled Python application, the bootloader is initiated and spawns a child process of itself. The parent bootloader process handles the signals for the two processes and uncompresses all the .pyc files into a folder named _MEIxxxxxx in the temp folder of the host, where xxxxxx is a random number. The child process loads the temporary Python environment, so that all the modules and libraries needed by the script can be imported, and executes the script. Once the child process terminates, the parent process will clean up and terminate as well.

To compress the files and create a single executable, PyInstaller uses two archive formats: ZlibArchive for compiled Python files (executable Python zip archives) and CArchive for all other files. In this work, we deliberately study PyInstaller because, beyond being the most widely used solution for creating executables from Python, many other installers are based on it. Therefore, the issues reported in this case can be escalated to other installers.

4 CONCEPTUAL APPROACH

Our work’s conceptual approach is to progressively determine what triggers detection of a malicious binary in static and dynamic analysis and create patches to remove it. We argue that if VirusTotal and other similar engines consider a binary benign and the dynamic analysis from a sandbox does not trigger an alert, the binary is deemed benign, even by security-savvy users. In this regard, a suspicious indication from a sandbox would be considered simply suspicious. Therefore, the binary will fall below the detection radar and would be executed by a typical user. While we understand that an anti-malware mechanism may detect it upon execution, this is clearly too late in most cases.

Two individual streams emerged from this basic concept, each targeting the evasion of one type of analysis. Once we developed the measures that bypassed each one of them individually, we merged them into a single binary. Therefore, we will present the approach and experiments individually. As we will detail in the next section, for the static analysis, we uploaded our samples to VirusTotal and used the detection output and classification of each antivirus, the reported YARA rules, as well as the community comments to determine which static properties lead to the detection of the malware. To further validate our results, we submitted our samples to two more similar engines. For the dynamic analysis with sandboxes, we initially submitted some binaries that collected data from each sandbox environment and then used this as input to armour our binary with evasion measures. Notably, as discussed later in the article, we identified several important issues for many of the sandboxes, which were responsibly communicated to them.

4.1 Bypassing Static Analysis

The methodology behind the technique to bypass the static analysis stems from observations on PyInstaller (https://www.pyinstaller.org/) 4.0 binaries. To generate an executable, PyInstaller adds a lot of “noise” to the generated binaries, from, e.g. the libraries that are appended, and even if the code is not malicious, many AVs falsely treat the executable as malware. In fact, as reported by the community, on numerous occasions even simple “Hello world” Python scripts are flagged as malicious by several AVs, as they consider binaries generated by PyInstaller as malicious by default.

The latter exhibits an erroneous policy applied by almost all AVs, or at least the ones used in VT, when handling binaries produced by PyInstaller. In practice, none of them understands its output; probably
because of its overblown added libraries. Therefore, on the one hand, we have most antiviruses, for which PyInstaller acts like an efficient packer, so one can hide arbitrary code in its binaries. On the other hand, other AVs have understood this capacity and immediately flag the binaries as malicious.

In what follows, we dig a bit deeper into the problem with PyInstaller to understand the nature of the noise that makes it act like a packer. We start with a simple reverse shell in a PowerShell script, which is typically flagged by AVs. The one-line script is provided in Listing 1. Note that similar backdoor mechanisms, e.g. malicious PowerShell execution, are widely used by malware in the wild. Two scripts, one in JavaScript and one in Python, were written, appending the exact same PowerShell code snippet to their body; therefore, no obfuscation is applied. While both of them are plain ASCII files, with minimal differences in their contents and the malicious string in plain sight, there are significant deviations in their detection by AVs, see Figure 2, which are rather alarming. More precisely, one may observe that the JavaScript file is flagged as malicious by four times more AVs than its Python peer. Notably, neither of them was identified correctly: the JavaScript file is considered as text and the Python file as Java. While the inconsistency in the detection rate of AVs for almost the same plaintext file cannot be easily understood, the compiled Python file (pyc), and Python bytecode in general, illustrates a more catastrophic result. None of the AVs is able to recognise it as malicious; therefore, it shows that none of the AVs understands what is inside a pyc file, as the conversion to the compiled Python file efficiently obfuscates the contents of the script to bypass the static analysis.

The above illustrates a clear strategy to bypass static analysis for an executable. One has to write a Python script which does all the “dirty job” and compile it using PyInstaller to hide its malicious content. Then, if we masquerade the PyInstaller output enough so that it is not recognised as such, we may pass any executable without any detection from the AVs.

Based on the above, our strategy is to exploit these inefficiencies in handling binaries generated by PyInstaller. Thus, the plan is to use PyInstaller to create the binaries out of malicious scripts, but then remove from the binary all the possible static features that it appends. The general outline of the method is illustrated in Algorithm 1.

4.2 Bypassing Dynamic Analysis

The dynamic analysis bypass is solely targeted towards bypassing the checks performed by executing the binary in a set of well-known and widely used sandboxes. To this end, we first created a set of reconnaissance executables that simply collected environmental data from each sandbox and performed some checks with pafish, a standard tool for assessing the sandboxes’ quality for malware analysis. Once collected, the data was sent to a server that we controlled to gather and analyse it.

Beyond the output of pafish, which identified several misconfigurations, and our own findings, one has to consider some particular inherent issues that such services have. The environmental findings have to be considered in the scope of a service offered in a virtualised environment, for a limited amount of time and with the minimum amount of resources to allow for scaling. As a result, a VM cannot always meet a typical computer’s specifications in terms of, e.g. memory, disk, etc.

Finally, one has also to consider that most samples in such a sandbox originate from users without paid plans, so these are tested in VMs that are more limited. Based on the market model (see Section 2), if a file is considered benign by the static analysis, and the sandboxes have not identified it as malicious, the chances of the file being rescanned in a “better” VM drop dramatically.

5 EXPERIMENTAL RESULTS

5.1 Static Malware Analysis

Following our findings for the handling of Python bytecode, the main goal of the experiments is to alter the executable in a way that it does not look generated by PyInstaller. In our experiments, we opted to use some standard malicious payloads as a codebase that were executed through Python, create an executable

powershell - NoP - NonI -W Hidden - Exec
Bypass - Command New-Object System .
Net . Sockets . TCPClient (" 10.0.0.1 "
,4242) ; $stream = $client . GetStream
() ;[ byte []] $bytes = 0..65535|%{0};
while (($i = $stream . Read ( $bytes , 0,
$bytes . Length )) - ne 0) {; $data = (
New-Object - TypeName System . Text .
ASCIIEncoding ). GetString ( $bytes ,0 ,
$i); $sendback = ( iex $data 2 >&1 |
Out-String ); $sendback2 =
$sendback + " PS " + ( pwd ). Path + " >
"; $sendbyte = ([ text . encoding ]::
ASCII ). GetBytes ( $sendback2 ); $stream
. Write ( $sendbyte ,0 , $sendbyte . Length
); $stream . Flush () }; $client . Close ()
Listing 1: A typical Powershell reverse shell.
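The observation in Section 4.1, that a string in plain sight in a script is no longer matched once the script is compiled to a pyc file, can also be looked at from the scanner's side. The following is a minimal sketch and not part of the authors' tooling: it assumes that a scanner is willing to deserialise Python bytecode, and it uses a harmless marker string in place of any flagged content. The names `iter_string_constants`, `SOURCE` and the marker itself are hypothetical and only serve the illustration.

```python
import marshal
import types


def iter_string_constants(code):
    """Yield every str constant of a code object, descending into nested code."""
    for const in code.co_consts:
        if isinstance(const, str):
            yield const
        elif isinstance(const, types.CodeType):
            # Functions, classes and comprehensions are stored as nested code objects.
            yield from iter_string_constants(const)


# Harmless stand-in for a script that embeds a literal a scanner cares about.
SOURCE = '''
def run():
    command = "EXAMPLE-MARKER-STRING"
    return command
'''

code = compile(SOURCE, "<sample>", "exec")
blob = marshal.dumps(code)       # the serialised form stored after the .pyc header
recovered = marshal.loads(blob)  # what a bytecode-aware scanner would reconstruct
strings = set(iter_string_constants(recovered))
print("EXAMPLE-MARKER-STRING" in strings)
```

Under these assumptions, the literal survives compilation and can be recovered by walking the constants of the code object, which suggests that the gap described above lies in whether engines parse the format at all rather than in the format hiding its contents. Note that `marshal` is tied to the Python version that produced the bytecode, so a real scanner would need a version-aware parser.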
with the corresponding bootloader of PyInstaller, and then make the necessary changes to the bootloader and the executable to prevent AVs from detecting it. Initially, we wrote a script with a known malicious shellcode payload from msfvenom and a PowerShell command that downloads the EICAR anti-malware test file; we XORed that PowerShell command with a random hard-coded string and converted it to base64. The reason for these choices is that both of them are well known to trigger AVs; therefore, if any of them is identified by an AV or a sandbox, it will immediately flag the file as malicious in both static and dynamic analysis. We compiled this script with PyInstaller and submitted the executable to VT. As shown in Figure 4a, multiple AV engines reported our executable as malicious. Moreover, we scanned a simple “hello world” Python script compiled with PyInstaller in VT, and it was also reported as malicious by the same antivirus engines (Figure 4b), verifying again the issues described in the previous section. To further validate our results, we created some binaries with the exact same functionality using C++, Rust, and Go and submitted them for analysis to VT, see Figures 4c, 4d and 4e, respectively. It is important to highlight in the latter figures that, contrary to the ones for Python, the AVs have correctly identified the presence of shellcode and Meterpreter, as shown by the names that they attribute to our binaries. The difference is rather important since the shellcode is not encoded in any of the implementations, showing that PyInstaller has efficiently hidden it from the AVs once again.

Based on the above, it is apparent that by altering the PyInstaller fingerprint on the executable, we may evade the static analyses of many AVs. Thus, to bypass PyInstaller identification by AVs, we initially made some clear “static” changes. These changes were i) substitution of strings and files from “pyi” to a random short string, ii) renaming of “PyInstaller” strings to another random short string, iii) replacement of the default icons, and iv) addition of flags to the linker in WScript, see Table 1. After these changes, we built the new bootloader. We then compiled the malicious script with the modified PyInstaller bootloader, managing to reduce the AVs that reported our executable as malicious to four (Figure 5a). Note that the aforementioned actions bypass several checks with YARA rules that some AVs

Figure 2: Scan results for reverse shell scripts using JavaScript and Python. (a) JavaScript using PowerShell reverse shell. (b) Python using PowerShell reverse shell. (c) Compiled Python script (pyc) of the Python using PowerShell reverse shell script.

Algorithm 1: Bypassing static analysis.
procedure OBFUSCATEPAYLOAD(x)
    Select proper payload;
    Parametrise the payload;
    XOR the payload with a random key;
    Convert the XORed payload to base64;
procedure PATCHBOOTLOADER(exe)
    Rename PyInstaller references to a random string
    Rename files and their calls with pyi prefix to a random prefix.
    Replace default icons
    Update linker’s flags in WScript
procedure PATCHBINARY(exe)
    Add version to the binary
    Remove rich header
    Rename RDATA header to .bss
    Recalculate PE32 checksum.
Select payload P
P’ = ObfuscatePayload(P)
Use a Python script S to call P’;
Build a PE32 executable from S to a single file to generate the bootloader B
B’ = PatchBootloader(B)
Build the PE32 executable PE32 from S to a single file with bootloader B’
PE32’ = PatchBinary(PE32)
might perform, see Figure 3.

Since our binary did not have any version information, we added one and recompiled it. While this is a trivial action, after scanning this executable on VT, the number of AVs reporting our binary as malicious was further reduced to two (Figure 5b). Finally, we opened the last build of our executable with PEtools (https://github.com/petoolse/petools), cleared the rich header, renamed the RDATA header to .bss and recalculated the checksum. The removal of the rich header was made to prevent the detection of the binary through the signature of this header (Webster et al., 2017). This final executable achieved zero detections from VT, see Figure 5c. The result was also cross-validated with other custom and multi-engine scanners, e.g. Kaspersky Threat Intelligence Portal (https://opentip.kaspersky.com/), Gatewatcher (https://intelligence.gatewatcher.com/), MetaDefender (https://metadefender.opswat.com/), see Figures 5f, 5d and 5e, respectively.

5.2 Dynamic Analysis with Sandboxes

To assess the sandboxes and create a proper evasion method, we first need to establish a ground truth baseline for the environment that the sandboxes use. Therefore, the strategy is to initially create a binary that collects intelligence and then aggregate it to make a binary that exploits it to bypass the detection.

To this end, we first created some reconnaissance binaries that were submitted to the Intezer, Any.run, Triage, Hybrid Analysis, the public Cuckoo installation of the Estonian CERT (https://cuckoo.cert.ee/), Cape, and Threat Grid sandboxes. However, not all of them allowed Internet connections to the binaries. Therefore, we used a machine with a public IP to collect the input from the reconnaissance binaries when the Internet connection was available. When this was not the case, we manually inspected the logs that were generated by the sandboxes, as we wrote the corresponding logs to the disk and registry.

To avoid executing our malicious code in a sandbox environment, we analysed the collected data to identify common deficiencies. The most significant misconfiguration in almost all sandboxes concerned the CPU specifications. More precisely, there were obvious contradictions regarding the threads and cores of the reported CPU. For instance, a sandbox was reporting an AMD EPYC 7371 16-Core Processor, but at the same time it was also reporting two cores and two threads. Therefore, we collected all available CPU specifications from Intel and AMD and added them as dictionaries in our final evasive malware. An aggregated table of the issues that we identified in each sandbox is reported in Table 2 and will be further discussed in the following paragraphs.

Despite the identified deficiencies, bypassing all of them in a binary is not straightforward. The reason is that continuous calls to read registry values or WMI trigger alerts in the sandboxes. Thus, one needs to unify these checks and prioritise them according to the “noise” they introduce to the sandbox. Therefore, in our malicious binary, we introduced several conditions before executing the payload.

Firstly, we check whether any known sandbox or VM process is running in the background. Afterwards, we check whether the threads of the system are more than four and whether the available RAM is more than 1 GB, which is the bare minimum for most modern 64-bit computers. Then, we check whether the system has been powered on for more than a threshold, e.g. 2-3 minutes. Next, we examine the foreground applications and the parent of the process of our binary. The reason for this check is the execution process of a sandbox. In most cases, there is a dropper script which opens the file and exits. However, in a real-world execution environment, one would expect that the user would have some other open programs, whether this is the Explorer, Word, or a terminal that would initiate the execution of the binary. Clearly,

Table 1: Linker flags for PyInstaller.
Flag                Description
/BASE:0x00400000    Set base to default Windows PE image base
/DYNAMICBASE:NO     Disable dynamic base
/VERSION:5.2        Set image version
/RELEASE            Set the checksum of the file

Figure 3: A common YARA rule for detecting PyInstaller. Source: https://github.com/bartblaze/Yara-rules/blob/5f4961049d0d510b11250d5628383398889fc881/rules/generic/PyInstaller.yar.
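The CPU contradiction reported in Section 5.2 (a guest advertising an AMD EPYC 7371 16-Core Processor while exposing two cores and two threads) is something a sandbox operator can test for before deploying a guest image. The following is a minimal sketch of such a self-check, written under stated assumptions: it only handles brand strings that embed an "N-Core" token, as in the example from the text, and it does not consult any vendor specification table. The function names `advertised_cores` and `cpu_profile_is_consistent` are hypothetical.

```python
import re

# Matches the "<N>-Core" token that some CPU brand strings carry,
# e.g. "AMD EPYC 7371 16-Core Processor".
_CORE_TOKEN = re.compile(r"(\d+)-Core", re.IGNORECASE)


def advertised_cores(brand_string: str):
    """Return the core count embedded in the brand string, or None if absent."""
    match = _CORE_TOKEN.search(brand_string)
    return int(match.group(1)) if match else None


def cpu_profile_is_consistent(brand_string: str, cores: int, threads: int) -> bool:
    """Check a guest CPU profile for contradictions with its advertised model."""
    if threads < cores:
        return False  # logical processors cannot be fewer than cores
    expected = advertised_cores(brand_string)
    if expected is None:
        return True   # nothing to compare against in this simplified sketch
    return cores == expected


# The configuration described in the text is flagged as contradictory.
print(cpu_profile_is_consistent("AMD EPYC 7371 16-Core Processor", 2, 2))
```

An operator who finds such a contradiction could either expose a CPU model that matches the allocated resources or adjust the resources to match the model; which option is practical depends on the hypervisor and is outside the scope of this sketch.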
SECRYPT 2021 - 18th International Conference on Security and Cryptography
(a) Original binary with PyInstaller (b) Hello world binary with PyInstaller. (c) C++ compiled executable with the
malicious payload.
(d) Rust compiled executable with the (e) Go compiled executable with the
malicious payload. malicious payload.
Figure 4: VT detection results for binaries from various languages.
if this is not the case, then some automated script opened the binary for inspection.

Notably, up to this point, no flag is raised in the sandbox, as the checks do not perform any blacklisted operation and are considered benign by most of them. If all these checks are passed, then we start the registry checks via WMIC for the CPU model name. We validate with our dictionary the existence of the model and the consistency of the reported threads and cores with the manufacturer's actual ones. Usually, this query to the registry is logged by the sandbox, but without any significant alert. Lastly, we query the registry, again via WMIC, to access system information and find known VM strings in the system model or system manufacturer. Clearly, this is also logged by the sandboxes, though without a score high enough to issue a malicious verdict. Moreover, not all sandboxes managed to reach this point of execution, so in many instances, these logs were not complete in all of the reports.

If any of these checks fails, we exit gracefully, after first performing some arbitrary computations to add noise to the analysis. However, after the successful pass of the aforementioned checks, the malicious binary is executed. Quite alarmingly, in all tested sandboxes, our evasion methods succeeded, achieving low scores in both the dynamic analysis and the static analysis offered by the sandboxes. In fact, all of them considered the samples suspicious for spawning another process of themselves, which can be considered a false-positive indication, but the malicious payload was not delivered because the binary recognised that it was being executed in a sandbox.

6 DISCUSSION

Given the inherent static analysis restrictions, a low detection rate from AVs in VT can be considered expected up to a point, as our approach is unique and creates an unknown pattern. Nevertheless, the fact that our samples do not simply have few detections, but actually zero, is very alarming. It becomes even more worrying because PyInstaller is a widely used tool that is poorly handled. Even the slight changes introduced by us significantly reduced the AVs' detection rate. Notably, these methods can be applied to other languages' packaging, e.g. for Go, which has been increasingly used by malware in the past few years (https://unit42.paloaltonetworks.com/the-gopher-in-the-room-analysis-of-golang-malware-in-the-wild/).

It is worth noting that the above results indicate
Figure 5: Screenshots of the results of our patched samples from multi-engine scanners. (a) Patched binary with PyInstaller strings removed. (b) Patched binary with version fix. (c) Final binary with all patches. (d) GATEWATCHER scan results. (e) OPSWAT scan results. (f) Kaspersky static and dynamic analysis results.
that AVs do not efficiently handle large executables. For instance, using the UPX feature of PyInstaller to shrink the executable resulted in further detections of the binary. Nevertheless, this can be attributed to the UPX signature. However, the same behaviour was noticed with, e.g. Nuitka (https://nuitka.net/), which created far larger executables.

The results of the dynamic sandbox analysis can be considered, in many cases, catastrophic. The reason is that our analysis showcases significant issues in the configuration of the sandboxes that allow the malware to fly under their radar. For instance, the vast majority of sandboxes expose inconsistent CPU specifications (processor name vs cores/CPU), while we also noticed the use of non-existent CPU names in one of them. Similar issues were also detected for GPUs.

Differences between CPU timestamp counters may be more challenging to patch; therefore, they were encountered in most sandboxes. Quite interestingly, the listing of well-known VM processes and obvious VM-related strings in the BIOS and system manufacturer (e.g. QEMU, KVM), small uptime, MAC address vendor and low RAM trivially exposed the virtualisation environment, indicating a poor configuration of the sandbox environment. Moreover, we argue that the limited set of Windows product IDs that we noticed can also be used to fingerprint sandboxes and bypass them. Therefore, the further randomisation of these IDs is necessary, as the purchase of more licences does not solve the problem completely.

Finally, we should also stress the complete absence of foreground processes in all sandboxes. On all occasions, the binary started without any other window opened, clearly showing that a dropper initiated the execution. While one may argue that malware may consider this as part of its persistence, e.g. via registry autorun, it would be relatively easy for the malware to verify the claim and correlate it with the uptime. Therefore, sandboxes must open a couple of windows, e.g. Explorer, to denote some user-initiated action for the binary execution and hide the dropper's existence.

7 CONCLUSIONS

Many issues arise from misclassifications, and it is essential to understand which features are the ones that resulted in, e.g. a false positive. Motivated by this problem, we studied the case of PyInstaller, a widely used packaging tool for Python scripts. The generated executables are erroneously flagged as malicious regardless of their content, as repeatedly reported online by developers. While many malware authors have
recently switched to the use of PyInstaller to write their malware, this does not justify why every executable of PyInstaller should be treated as malicious. On the contrary, it implies that AVs do not understand the content of these files and treat them as malicious. Starting from this problem, we have shown that it is inherent, as AVs cannot efficiently process the Python bytecode which is included in PyInstaller executables. As a result, we may develop malware which escapes static analysis of all AVs by simply changing some characteristics of PyInstaller binaries. Clearly, Python bytecode decompilation is essential to prevent similar attacks in the near future.

Based on our analysis, it is evident that apart from clear misconfigurations, resource-wise limitations in the sandboxes impose significant constraints that enable their identification. More precisely, to address the numerous requests for scanning binaries, many of the sandboxes resort to using a limited set of resources (CPU/RAM), which, especially for the CPU, is not properly handled. As illustrated, many of them report contradictory configurations which can be easily detected and bypassed without issuing any significant alert. The analysis of a binary in a virtualised environment which resembles a traditional, modern PC system is very costly, let alone bare-metal analysis. Nevertheless, with the continuous increase of samples that have to be checked, the balance is going to be significantly tipped at the expense of sandboxes. The latter denotes a definite need to improve our existing sandboxes' capabilities to, e.g. enable them to report more realistic configurations without exposing them. Moreover, we should further explore the analysis using symbolic execution of the binary to offer a cost-efficient alternative. Finally, despite the recent advances in malware analysis and the numerous academic works and products touting almost absolute detection rates, we illustrate that undetectable malware might even be in plain sight and evade detection in real-world experiments.

We argue that one can deploy even stealthier malware by minimising the filesystem footprint. To this end, in future work we plan to rewrite the bootloader to extract all the necessary files in memory or use PyOxidizer (https://github.com/indygreg/PyOxidizer), randomising file names in each compilation, further reducing the pattern that one could use to trace it. Fileless approaches (Kumar et al., 2020), in which all the content is loaded in memory through the use of, e.g. Living Off The Land Binaries And Scripts (LOLBins and LOLScripts, https://github.com/LOLBAS-Project/LOLBAS), can further decrease the detectability. In parallel, we plan to investigate other packaging and distribution tools for other languages beyond Python to assess their obfuscation abilities.

ACKNOWLEDGEMENTS

This work was supported by the European Commission under the Horizon 2020 Programme (H2020), as part of the projects CyberSec4Europe (https://www.cybersec4europe.eu) (Grant Agreement no. 830929) and LOCARD (https://locard.eu) (Grant Agreement no. 832735).

The content of this article does not reflect the official opinion of the European Union. Responsibility for the information and views expressed therein lies entirely with the authors.

REFERENCES

Afianian, A., Niksefat, S., Sadeghiyan, B., and Baptiste, D. (2019). Malware dynamic analysis evasion techniques: A survey. ACM Computing Surveys (CSUR), 52(6):1–28.

Apostolopoulos, T., Katos, V., Choo, K. R., and Patsakis, C. (2021). Resurrecting anti-virtualization and anti-debugging: Unhooking your hooks. Future Generation Computer Systems, 116:393–405.

Branco, R. R., Barbosa, G. N., and Neto, P. D. (2012). Scientific but not academical overview of malware anti-debugging, anti-disassembly and anti-vm technologies. In Blackhat USA.

Bulazel, A. and Yener, B. (2017). A survey on automated dynamic malware analysis evasion and counter-evasion: PC, mobile, and web. In Proceedings of the 1st Reversing and Offensive-oriented Trends Symposium, page 2, New York, NY, USA. ACM.

Checkpoint Research (2020). Evasion techniques. https://evasions.checkpoint.com/.

Chen, X., Andersen, J., Mao, Z. M., Bailey, M., and Nazario, J. (2008). Towards an understanding of anti-virtualization and anti-debugging behavior in modern malware. In 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN), pages 177–186. IEEE.

Cono D'Elia, D., Coppa, E., Palmaro, F., and Cavallaro, L. (2020). On the dissection of evasive malware. IEEE Transactions on Information Forensics and Security, 15:2750–2765.

D'Elia, D. C., Coppa, E., Nicchi, S., Palmaro, F., and Cavallaro, L. (2019). Sok: Using dynamic binary instrumentation for security (and how you may get caught red handed). In Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security, pages 15–27.

Deng, X. and Mirkovic, J. (2018). Malware analysis through high-level behavior. In 11th USENIX Workshop on Cyber Security Experimentation and Test (CSET 18), Baltimore, MD. USENIX Association.

World Economic Forum (2020). Wild wide web consequences of digital fragmentation. https://reports.weforum.org/global-risks-report-2020/wild-wide-web/.

Gandotra, E., Bansal, D., and Sofat, S. (2014). Malware analysis and classification: A survey. Journal of Information Security, 2014.

Guan, L., Jia, S., Chen, B., Zhang, F., Luo, B., Lin, J., Liu, P., Xing, X., and Xia, L. (2017). Supporting transparent snapshot for bare-metal malware analysis on mobile devices. In Proceedings of the 33rd Annual Computer Security Applications Conference, pages 339–349, New York, NY, USA. ACM.

Huang, Q., Li, H., He, Y., Tai, J., and Jia, X. (2020). Pidicators: An efficient artifact to detect various vms. In International Conference on Information and Communications Security, pages 259–275. Springer.

Internet Crime Complaint Center (IC3) (2019). 2019 internet crime report. https://pdf.ic3.gov/2019_IC3Report.pdf.

Issa, A. (2012). Anti-virtual machines and emulations. Journal in Computer Virology, 8(4):141–149.

Kirat, D. and Vigna, G. (2015). Malgene: Automatic extraction of malware analysis evasion signature. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 769–780, New York, NY, USA. ACM.

Kirat, D., Vigna, G., and Kruegel, C. (2011). Barebox: efficient malware analysis on bare-metal. In Proceedings of the 27th Annual Computer Security Applications Conference, pages 403–412, New York, NY, USA. ACM.

Kirat, D., Vigna, G., and Kruegel, C. (2014). Barecloud: Bare-metal analysis-based evasive malware detection. In USENIX Security Symposium, pages 287–301, Berkeley, CA, USA. USENIX Association.

Küchler, A., Mantovani, A., Han, Y., Bilge, L., and Balzarotti, D. (2021). Does every second count? time-based evolution of malware behavior in sandboxes. In Proceedings of the Network and Distributed System Security Symposium, NDSS. The Internet Society.

Kumar, S. et al. (2020). An emerging threat fileless malware: a survey and research challenges. Cybersecurity, 3(1):1–12.

Leguesse, Y., Vella, M., and Ellul, J. (2017). Androneo: Hardening android malware sandboxes by predicting evasion heuristics. In IFIP International Conference on Information Security Theory and Practice, pages 140–152, Cham. Springer International Publishing.

Liţă, C. V., Cosovan, D., and Gavriluţ, D. (2018). Anti-emulation trends in modern packers: a survey on the evolution of anti-emulation techniques in upa packers. Journal of Computer Virology and Hacking Techniques, 14(2):107–126.

Martignoni, L., Paleari, R., Roglia, G. F., and Bruschi, D. (2009). Testing CPU emulators. In Proceedings of the Eighteenth International Symposium on Software Testing and Analysis, ISSTA '09, pages 261–272, New York, NY, USA. ACM.

Mutti, S., Fratantonio, Y., Bianchi, A., Invernizzi, L., Corbetta, J., Kirat, D., Kruegel, C., and Vigna, G. (2015). Baredroid: Large-scale analysis of android apps on real devices. In Proceedings of the 31st Annual Computer Security Applications Conference, pages 71–80, New York, NY, USA. ACM.

Or-Meir, O., Nissim, N., Elovici, Y., and Rokach, L. (2019). Dynamic malware analysis in the modern era—a state of the art survey. ACM Computing Surveys (CSUR), 52(5):1–48.

Petsas, T., Voyatzis, G., Athanasopoulos, E., Polychronakis, M., and Ioannidis, S. (2014). Rage against the virtual machine: Hindering dynamic analysis of android malware. In Proceedings of the Seventh European Workshop on System Security, EuroSec '14, pages 5:1–5:6, New York, NY, USA. ACM.

Shi, H., Alwabel, A., and Mirkovic, J. (2014). Cardinal pill testing of system virtual machines. In 23rd USENIX Security Symposium (USENIX Security 14), pages 271–285, San Diego, CA.

Shi, H., Mirkovic, J., and Alwabel, A. (2017). Handling anti-virtual machine techniques in malicious software. ACM Transactions on Privacy and Security (TOPS), 21(1):2:1–2:31.

Thomas, D. S. (2020). Cybercrime losses: An examination of us manufacturing and the total economy.
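
As an illustration of the sandbox configuration issues raised in Section 6, the following minimal sketch shows how a sandbox maintainer might audit a guest profile for two of the inconsistencies reported there: a CPU name that does not match the exposed core and thread counts, and VM-related strings in the system manufacturer or model. It is not the authors' tooling. The function name `audit_profile`, the dictionary keys, the two-entry `KNOWN_CPUS` table, and the `VM_MARKERS` list are all assumptions made for illustration; QEMU and KVM come from the text, the other markers do not. On Windows, the input values would typically correspond to the `Name`, `NumberOfCores` and `NumberOfLogicalProcessors` properties of `Win32_Processor` and the `Manufacturer` and `Model` properties of `Win32_ComputerSystem`.

```python
# Hypothetical reference table: CPU model name -> (physical cores, threads).
# Only two illustrative entries; a real audit would load vendor data.
KNOWN_CPUS = {
    "Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz": (6, 12),
    "Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz": (4, 4),
}

# Substrings that reveal virtualisation. QEMU and KVM are named in the
# paper; the remaining entries are assumptions added for illustration.
VM_MARKERS = ("qemu", "kvm", "virtualbox", "vmware")


def audit_profile(profile):
    """Return a list of findings that make a guest profile look artificial."""
    findings = []

    # Check 1: the CPU name must exist and match the exposed topology.
    expected = KNOWN_CPUS.get(profile.get("cpu_name", ""))
    if expected is None:
        findings.append("unknown CPU model name")
    elif (profile.get("cores"), profile.get("threads")) != expected:
        findings.append("core/thread count inconsistent with CPU model")

    # Check 2: manufacturer and model must not contain VM strings.
    for field in ("manufacturer", "model"):
        value = profile.get(field, "").lower()
        if any(marker in value for marker in VM_MARKERS):
            findings.append(f"VM string in system {field}")

    return findings


if __name__ == "__main__":
    guest = {
        "cpu_name": "Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz",
        "cores": 2,
        "threads": 2,
        "manufacturer": "QEMU",
        "model": "Standard PC (Q35 + ICH9, 2009)",
    }
    for finding in audit_profile(guest):
        print(finding)
```

An empty result only means that these two checks pass; the other indicators discussed in Section 6 (uptime, RAM size, MAC address vendor, Windows product IDs, foreground windows) would need their own checks.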