1 s2.0 S108480452300156X Main
Review
Abstract: Automated attacks allow adversaries to exploit vulnerabilities in enterprise IT systems at short notice. To identify such attacks as well as new cybersecurity threats, defenders use honeypot systems; these monitored decoy resources mimic legitimate devices to entice adversaries. The domain of enterprise IT honeypots has been an active area of development and research, especially in the open-source community. In this work, we survey open-source honeypots, honeypot frameworks, and tools that help to develop or discover honeypot deployments. In contrast to existing surveys, our work provides a detailed discussion of the honeypots’ system architecture, software architecture, and cloud-native deployment options. In addition, we cover the most recent academic research in honeypot detection and evasion techniques, and discuss how these advances impact current open-source honeypots. This work helps the reader to make an educated choice when selecting a honeypot for deployment or further development.

Keywords: Honeypot; Honeypot framework; Cybersecurity; Threat intelligence
∗ Corresponding author at: Robert Bosch GmbH, Corporate Research, Renningen, Germany.
E-mail addresses: [email protected] (N. Ilg), [email protected] (P. Duplys), [email protected] (D. Sisejkovic),
[email protected] (M. Menth).
https://doi.org/10.1016/j.jnca.2023.103737
Received 18 June 2023; Received in revised form 20 August 2023; Accepted 14 September 2023
Available online 19 September 2023
1084-8045/© 2023 Elsevier Ltd. All rights reserved.
N. Ilg et al. Journal of Network and Computer Applications 220 (2023) 103737
Table 1
Comparison of honeypot surveys based on the reviewed honeypot characteristics. Tick marks in brackets indicate that characteristics were not considered for all honeypots, or only partially.

Survey                    Focus                      Honeypot  Honeypot      Deployment  Logging  Detection  Proprietary
                                                     services  architecture  options                         solutions
Nawrocki et al. (2016)    Attack analysis            ✓         (✓)           –           –        –          ✓
Fan et al. (2017)         Taxonomy analysis          (✓)       –             ✓           –        (✓)        ✓
Dalamagkas et al. (2019)  Honeynets in smart grids   ✓         ✓             –           ✓        –          –
Franco et al. (2021)      IoT, IIoT, and CPS         ✓         –             (✓)         ✓        –          –
This work (2023)          Open-source enterprise IT  ✓         ✓             ✓           ✓        ✓          –
1.1. Comparison with prior honeypot surveys

Honeypots are well known in enterprise IT. The first honeypot was implemented more than 20 years ago (Spitzner, 2003). Since then, honeypots have evolved from simple Linux installations (Spitzner, 2003) to lightweight cloud resources that mimic all kinds of different services and devices (Nazario, 2022; Deutsche Telekom Security GmbH, 2023; Thinkst Applied Research, 2023).

Several surveys on honeypots have been published in recent years. Table 1 compares past honeypot surveys with this work. In addition to the focused domain, we also compare the honeypots’ characteristics that were examined in the respective papers. The surveys differ mainly in the depth of the honeypot review. Overviews with a shallower discussion of honeypot architecture, however, cover a larger number of honeypots and academic solutions. Although academic solutions are not proprietary, in some cases the source code is closed.

The two most recently published surveys focus on industrial production networks (Dalamagkas et al., 2019) and on IoT, IIoT, and Cyber–Physical Systems (CPS) (Franco et al., 2021). This is an area of growing interest as botnet activity and other automated attacks on these devices have increased in recent years. While some network protocols are common for enterprise IT and (I)IoT, devices in IoT, IIoT, and CPS have limited computing resources and run different services specific to these domains, two focal points that a honeypot must emulate realistically.

The authors in Nawrocki et al. (2016) and Fan et al. (2017) present solutions with a focus on enterprise IT. However, many of the surveyed honeypots are long outdated or have received major functionality updates since then. Moreover, these surveys include proprietary solutions without the possibility of a detailed discussion on their architecture and implementation.

In this survey, we take an in-depth look at enterprise IT honeypots. This choice is supported by multiple motivators. First, enterprise IT honeypots have witnessed strong momentum in research and development in recent years, especially in the open-source community. Second, the wide adoption of cloud-native technologies like Docker and Kubernetes now offers new honeypot deployment options. Finally, research on honeypot detection produced new ways to discover honeypots, which, in turn, forced honeypot developers and researchers to come up with novel ways of deception.

In contrast to existing surveys, our work covers new aspects regarding a honeypot’s deceptiveness and, in addition, includes a detailed discussion on system architecture, software architecture, logging strategy, and deployment options of the surveyed honeypots.

1.2. Structure

The remainder of this work is structured as follows. Section 2 provides preliminaries on honeypots and honeypot detection. In Section 3, we survey open-source honeypots and honeypot frameworks that are either currently maintained or provide unique capabilities. Furthermore, we discuss tools that assist in testing honeypots during development or deployment. In Section 4, we consider academic research on the detection and evasion of honeypots. The impact of honeypot detection is then discussed with regard to the surveyed honeypots in Section 5. Finally, Section 6 concludes the survey.

2. Preliminaries on honeypots

Honeypots are resources that attempt to mimic a real computer system with the intention of being compromised by an attacker. These decoy systems are constantly monitored by the defenders and incoming attacks are recorded and analyzed. Typical applications of honeypots can range from individual services (many honeypots on the Internet offer only a Secure Shell (SSH) service) to entire web servers. Fig. 1 shows a honeypot system from the attacker’s and the defender’s perspective. The adversary expects to attack a genuine system with, for example, DDoS attacks, SQL injections, or brute-force attacks. Meanwhile, defenders use the insights from captured attacks to strengthen their real deployments. Thus, honeypots are tools to gather intelligence, identify cybersecurity threats, and understand how adversaries would compromise the mimicked system (Stolfo et al., 2011). This allows defenders to make data-driven decisions on what security measures to invest in.

Depending on the expected attacks, different types of honeypots are deployed. In the following, we discuss different types and strategies of honeypots and their findings. Since honeypots rely on deception to convince an adversary to interact with the system (Sanders, 2020), another focus of this section is on honeypot detection.

2.1. Honeypot types

Honeypots are typically characterized by the interaction level a honeypot offers to the adversary. This interaction level varies from low to medium to high (Vetterl and Clayton, 2018).

Low-interaction Honeypots (LHs) provide the least amount of interaction for an adversary connecting to the system. They offer no Operating System (OS) to the attacker but rather a small number of attack paths, e.g., a log-in shell for a given service application. This kind of honeypot is predominantly used to catch credential brute-force attacks or monitor connection attempts. Fig. 2(a) illustrates such an LH. The services are only implemented superficially and allow no further access to the underlying OS. All activity received by the emulated services is logged.

Medium-interaction Honeypots (MHs) still do not provide a real OS but can simulate a system shell to run commands on. Hence, these honeypots try to present a more attractive target and catch a greater scope of attacks. Depending on the depth of the simulated system, MHs can not only run selected system commands but also collect malware samples uploaded by the adversary. As seen in Fig. 2(b), the honeypot application is still separated from the operating system, making the development of a profound shell emulation a significant effort. LHs and MHs are useful tools for defenders to collect data on automated, large-scale attacks (Vetterl and Clayton, 2018). Further, an
Fig. 2. Different types of honeypots. Dotted lines illustrate a boundary intended to be breached by adversaries.
extensively emulated system shell can even trap human adversaries for short periods of time.

High-interaction Honeypots (HHs) have the highest odds of trapping a human adversary. On this interaction level, the honeypot has a real OS looking vulnerable to the outside world. The high-interaction possibilities allow for insight into attacker movement and activities on the system. However, these insights introduce risks; a vulnerable machine can also be used for subsequent attacks. Thus, an HH also requires the highest level of maintenance effort. As shown in Fig. 2(c), the OS is at the disposal of the attacker. There is no longer any separation between the honeypot and the OS.

2.2. Deployment scenarios

The purpose of deployment can be roughly split into two categories: research and production honeypots (Spitzner and Roesch, 2001). Research honeypots are mainly used to monitor the development of existing attacks and to get an early grasp on novel attack vectors. Fig. 3(a) shows a honeypot deployed inside a Demilitarized Zone (DMZ). This gives the attackers access to the honeypot but prevents penetration into the internal network. Those honeypots can supply defenders with valuable information about the current attack landscape. In contrast, production honeypots (Fig. 3(b)) are run to mimic a valuable component in a production setting. They can add an additional layer of intrusion detection as those honeypots are not part of the production process. In this way, every connection attempt can be assumed to come from a hostile party with access to the internal production network.

Multiple, interconnected honeypots are called a honeynet (Dalamagkas et al., 2019). Honeynets are helpful to improve the cover of production honeypots. While a single production machine without incoming or outgoing communication is not plausible, multiple honeypots can interchange protocol messages.

In addition, we can classify honeypots based on whether they run on a real server or use a virtual deployment option (Dalamagkas et al., 2019). On one hand, the deployment of honeypots on a server or production hardware results in the best possible imitation of the represented system. This creates realistic response times and system environments. On the other hand, virtual deployments on Virtual Machines (VMs) or inside containers are more flexible. This results in hardware independence and a light resource footprint, and allows for cheaper deployments, e.g., in a cloud environment. Built-in monitoring of virtual deployment techniques is another advantage, as tools like Virtual Machine Monitors (VMMs) can trace the attacker’s actions (Asrigo et al., 2006).

2.3. Honeypot findings

Deploying honeypots on the Internet can yield a lot of information about the threat landscape for selected protocols. This knowledge can be used to adjust existing security solutions. Honeypots are effective in collecting information about attacks because they have no value for legitimate clients. For every connection, it can be assumed that the connecting party has malicious intentions. As a result, one does not need to differentiate between legitimate and hostile connections. In the following, we take a look at honeypot findings sorted by interaction level.

Low-interaction honeypots detect scanning attempts as well as brute-force or dictionary attacks (Brown et al., 2012). New variants of vulnerabilities like Shellshock or Log4J can also be revealed by specialized low-interaction environments. While those low-interaction
systems are useful to detect if there is interest in a target system, higher interaction levels are more suitable to reveal an attacker’s intention.

Medium-interaction honeypots receive malware samples from different kinds of botnets (Zuzčák and Zenka, 2020) and command line inputs by automated and human adversaries. While some automated attacks try to inspect the target system (the CPU’s core count or the kernel architecture provide valuable information), most botnets disregard the nature of their target and try to infect it. Malware that is used to harm vulnerable systems can range from simple viruses to rootkits and remote access trojans (Sethia and Jeyasekar, 2019).

High-interaction environments observe similar results. Malware infection and the installation of crypto miners are common findings. Moreover, attackers can try to use captured systems for further attacks; vulnerable devices are commonly turned into relays for spam e-mail (Alata et al., 2006) or used for subsequent denial-of-service attacks.

2.4. Honeypot detection

Honeypots are only useful if the attacking party is not aware of the real nature of their target system. If a honeypot is identified, adversaries simply avoid it. Apart from an unrealistic attack surface, there are multiple ways to identify a system as a possible honeypot from the outside: default configurations (Srinivasa et al., 2021a), timing attacks (Vetterl and Clayton, 2019), and service application fingerprinting (Vetterl and Clayton, 2018; Srinivasa et al., 2021a).

Default configurations. All honeypots examined in this paper come with a default configuration for instantaneous deployment. However, default configurations can cause detection. The default content of HTTP responses and static values in SSL/TLS certificates have led to the detection of honeypot instances. This unique content in default configurations or templates should be avoided by developers. As shown in Srinivasa et al. (2021a), of 21,656 detected honeypot instances, only 351 honeypots did not run on the default configuration.

Timing attacks. Depending on the represented device and its original hardware, there may be a difference in response time between the real device and the honeypot. This is particularly the case when virtual deployment options are used for the honeypot. Protocol processes such as a TLS handshake or the time until the welcome message appears can be used to compare devices (Vetterl and Clayton, 2019). These attacks are not only dependent on the efficiency of the implementation. Device and memory virtualization can also contribute to timing discrepancies (Garfinkel et al., 2007). Although timing margins are getting smaller as hardware performance increases, a honeypot still depends on similar response times to the device it is representing. In cloud deployments, there are additional factors that can affect the timing of network packets, such as network nodes between the attacker and the target.

Application fingerprinting. Most current honeypots are not based on the same programming library as the protocol service they want to mimic. Thus, the developers need to adjust the honeypots’ protocol responses manually to match the real implementation. Fingerprinting is done by searching for protocol strings that produce different response messages or response content depending on the protocol implementation in use. Distinguishable error messages, padding, or the algorithms in use can reveal distinct protocol libraries. An example of successful fingerprinting can be seen in Fig. 4. While both SSH servers (an OpenSSH server and a honeypot) return the same identification string during the handshake, they reveal differences in implementation through the key exchange methods, encryption algorithms, authentication algorithms, and compression algorithms which are exchanged at the key exchange initialization (SSH KEX_INIT) stage. Hashes differ even between different versions of the same implementation (e.g., OpenSSH 7.5 and OpenSSH 8.5) (Reardon et al., 2022). The tool used to create those hashes is HASSH (Reardon et al., 2022) and is described in Section 3.4.

Vetterl and Clayton (2018) present a breakthrough in fingerprinting that cannot be patched in the current generation of honeypots. They showcased a fingerprinting technique regarding state-of-the-art SSH, Telnet, and HTTP honeypots, identifying a large number of deployments, especially older versions. For example, they created 11,280,384 different protocol messages to observe the responses of various implementations of SSH. From these crafted messages, the authors searched for those that show the most divergence between implementations; these can later be used to test unknown clients. Passive fingerprinting (observation of protocol header fields) was added by Srinivasa et al. (2021a) to the probe-based approach of Vetterl and Clayton (2018) to further improve the findings; we discuss additional academic publications on honeypot detection in Section 4.2. Fingerprinting is not limited to academic work: the popular scanner Shodan (Shodan, 2023d) offers a honeypot detection tool (Shodan, 2023a) that is believed to use a similar technique. Once exposed, fingerprinting methods can also be found as Metasploit (Rapid7, 2023) scripts for simple, automated testing of remote hosts. Shodan and Metasploit are also explained in more detail in Section 3.4. This shows that fingerprinting is not only a question of concept in the development phase but also requires constant updates to the system by developers and users.

3. Overview of honeypots, honeypot frameworks, and tools

In this section, we give an overview of open-source honeypots and honeypot frameworks that are actively maintained or have unique
Fig. 4. Comparison of an OpenSSH server fingerprint and a honeypot fingerprint. The MD5 hash is calculated from key exchange methods, encryption algorithms, authentication
algorithms, and compression algorithms, which differ in the SSH implementations.
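The hash comparison in Fig. 4 can be illustrated in a few lines. The following is a minimal sketch, assuming the HASSH convention of joining the four server-side algorithm lists (each comma-separated, in the order advertised) with semicolons before applying MD5, and reading the "authentication algorithms" as message authentication code (MAC) algorithms. The function name and the algorithm lists are illustrative assumptions; the lists were not captured from a real server or honeypot.

```python
import hashlib


def hassh_server(kex, encryption, mac, compression):
    """Return an MD5 fingerprint over the algorithm lists of an SSH KEX_INIT message.

    Each argument is the list of algorithm names in the order the server
    advertises them. Assumption: names are joined with "," and the four
    fields with ";", following the HASSH convention.
    """
    fields = (kex, encryption, mac, compression)
    material = ";".join(",".join(names) for names in fields)
    return hashlib.md5(material.encode("ascii")).hexdigest()


# Illustrative (made-up) algorithm lists for two different SSH implementations.
real_server = hassh_server(
    ["curve25519-sha256", "diffie-hellman-group14-sha256"],
    ["chacha20-poly1305@openssh.com", "aes256-ctr"],
    ["hmac-sha2-256", "hmac-sha2-512"],
    ["none", "zlib@openssh.com"],
)
honeypot = hassh_server(
    ["diffie-hellman-group14-sha1", "diffie-hellman-group1-sha1"],
    ["aes128-ctr", "aes256-ctr", "3des-cbc"],
    ["hmac-sha1", "hmac-md5"],
    ["none"],
)

# An identical identification string would not hide this difference.
print(real_server != honeypot)
```

Because the hash covers the order of the advertised algorithms as well as their names, a honeypot built on a different SSH library would have to reproduce the complete negotiation lists of the mimicked server to obtain the same fingerprint.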
capabilities. The overview is clustered based on the interaction level of the individual honeypots from low to medium to high. Moreover, we present a selection of tools that are helpful in testing honeypots.

3.1. Methodology

We have compiled and practically deployed the honeypots from mid-2022 to early 2023. To find honeypots, we searched for popular protocols (e.g., SSH, HTTP, and FTP) and complemented them with honeypots that cover protocols not addressed by popular solutions. Another emphasis was on unique functionalities that are not offered by well-known honeypots. Valuable starting points are lists such as Nazario (2022), GitHub repositories of honeypot foundations, and academic work investigating honeypots, extending them, or analyzing their data.

The insights into the system architecture stem from practical experience with the honeypots, as well as existing documentation. For practical experience, we have not only installed the honeypots but have tested their functionality as comprehensively as possible; we have investigated the system through network and vulnerability scans and investigated the resulting logging entries. For insights into the software architecture, we also examined the source code for dependencies and structure. The logging functionality and the deployment options were also drawn from practical experience and complemented by the honeypot’s documentation. To draw conclusions about honeypot detectability, we looked at several factors: software libraries, configurability, network scans, and whether honeypots have already been fingerprinted in academic work.

Furthermore, the tools for honeypot testing were compiled in the same time frame. We included tools that were either used in academic work to analyze honeypots or that we found helpful for our experimental work with the honeypots. While libraries or platforms used to build the honeypot are included in the honeypot survey, we highlight tools for testing and comparing the honeypot’s credibility.

3.2. Honeypots

To collect practical experience and to better understand the honeypots’ features, we experimentally deployed all honeypots described in this section. Our focus was on the honeypots’ deployment options, offered protocols, their unique deception capabilities, external libraries these honeypots depend on, and the programming language they are written in. In addition, we compare the honeypots’ architectures. A summary of our findings is given in Table 2. This overview of honeypots is clustered based on the interaction level. Starting with LHs, the higher levels are briefly introduced at the first honeypot of the level.

3.2.1. Heralding

Heralding (Vestergaard, 2023) is a simple LH distributed under the GNU General Public License v3.0. Heralding is designed to mimic services typically found in enterprise IT systems, including SSH, Telnet, File Transfer Protocol (FTP), SMTP, VNC, PostgreSQL, and SOCKS5, as well as HTTP, POP3, and IMAP with their secure variants HTTPS, POP3S, and IMAPS.

System architecture. Heralding is configured using a YAML file heralding.yml. It includes settings for individual services as well as generic settings such as the IP address to listen on and the logging configuration. Fig. 5 shows Heralding’s software architecture. Services mimicked by the honeypot only allow for log-in interaction and close the connection afterwards. Listing 1 shows the result of an NMAP port scan of a deployed Heralding instance.

    Nmap scan report for Heralding-IP
    PORT      STATE  SERVICE
    21/tcp    open   ftp
    23/tcp    open   telnet
    110/tcp   open   pop3
    2222/tcp  open   ssh
    8080/tcp  open   http

Listing 1: Port scan of a Heralding honeypot.

Software architecture. Heralding is written in Python. Running Heralding’s latest release 1.0.7 requires Python version 3.7.0 or higher. Each service that the honeypot can mimic is implemented in its own Python module, and the modules can be activated in Heralding’s configuration file. As a result, it is easy to extend Heralding with additional services or adjust the existing services if needed.
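The pattern of one module per service, switched on through a configuration file, can be sketched as follows. This is a simplified illustration and not Heralding's actual code: the class names, the registry, and the configuration keys are hypothetical, and a plain dictionary stands in for the parsed YAML file. As described above for a low-interaction service, each fake service only records the log-in attempt and then rejects it.

```python
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    service: str
    username: str
    password: str


class FakeService:
    """Base class: record the credentials, then reject the log-in."""

    name = "base"

    def __init__(self, port, log):
        self.port = port
        self.log = log

    def handle_login(self, username, password):
        self.log.append(LoginAttempt(self.name, username, password))
        return False  # a low-interaction service never grants access


class FakeFtp(FakeService):
    name = "ftp"


class FakeTelnet(FakeService):
    name = "telnet"


# Hypothetical registry mapping configuration keys to service classes.
REGISTRY = {cls.name: cls for cls in (FakeFtp, FakeTelnet)}


def build_services(config, log):
    """Instantiate only the services that are enabled in the configuration."""
    services = []
    for name, options in config["services"].items():
        if options.get("enabled") and name in REGISTRY:
            services.append(REGISTRY[name](options["port"], log))
    return services


# Stand-in for a parsed YAML file; the keys are assumptions, not Heralding's schema.
config = {
    "services": {
        "ftp": {"enabled": True, "port": 21},
        "telnet": {"enabled": False, "port": 23},
    }
}
attempts = []
active = build_services(config, attempts)
active[0].handle_login("root", "123456")
print([service.name for service in active], attempts)
```

In such a design, adding a protocol amounts to writing one more subclass and registering it, which is the kind of extensibility the modular layout is meant to provide.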
feature of SSH protocol that allows the SSH endpoint to send additional
data, e.g., custom settings, before completing the SSH handshake. Based
on this feature, Endlessh sends a random SSH banner to the scanning
client and thereby detains automated attacks on SSH.
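The tarpit idea can be sketched with Python's asyncio streams. This is a minimal illustration and not Endlessh's implementation: it relies on the rule in RFC 4253 that a server may send additional lines before its identification string as long as they do not start with "SSH-". The function names, the port 2222, and the 10-second delay are assumptions chosen for the example.

```python
import asyncio
import random


def banner_line():
    """Return one random line that is valid before the SSH version string.

    The line consists of hexadecimal digits only, so it can never
    start with "SSH-".
    """
    return b"%x\r\n" % random.getrandbits(32)


async def tarpit(_reader, writer, delay=10.0):
    """Send banner lines forever, one every `delay` seconds."""
    try:
        while True:
            await asyncio.sleep(delay)
            writer.write(banner_line())
            await writer.drain()
    except (ConnectionResetError, BrokenPipeError):
        pass  # the client gave up
    finally:
        writer.close()


async def serve(host="0.0.0.0", port=2222):
    """Listen on an unprivileged port (an assumption) and trap every client."""
    server = await asyncio.start_server(tarpit, host, port)
    async with server:
        await server.serve_forever()


# To run the tarpit: asyncio.run(serve())
```

A client that waits for the identification string stays connected for as long as it tolerates the stream of meaningless lines, while each trapped connection costs the defender only one idle coroutine.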
Blacknet 2 uses Transport Layer Security (TLS) to enable secure communication between the SSH honeypot sensors and the master server. Blacknet’s authors recommend using EasyRSA, a command line tool for creating Certification Authorities (OpenVPN Inc, 2023), to generate digital certificates to be deployed on the honeypot sensors and the master server.

Software architecture. Blacknet’s honeypot sensors are low-interaction SSH honeypots. The mocked SSH service is implemented in Python; the Python versions supported by Blacknet 2 are 2.7 and 3.4. The SSH honeypot sensors use the Paramiko (Forcier and Gaynor, 2022) SSH library to establish the initial SSH connection and to present the password prompt to the attacker.

The SSH logs are transmitted using the MessagePack binary serialization format (Furuhashi, 2021). MessagePack allows a very compact representation of the log data so that Blacknet 2 can be deployed in scenarios with limited network bandwidth. The persistent logging functionality utilizes the PyMySQL (Matsubara, 2023) library.

Logging. Logs can be saved locally or on a dedicated network-connected node. In the latter case, the master server establishes a connection to a MySQL or MariaDB database through TCP/IP or a Unix socket.

Fingerprinting. Fingerprinting of the Paramiko library is a concern, but an integration of other honeypots as sensors should be possible with moderate effort. The use of different honeypots as sensors is another interesting alternative.

Deployment options. Blacknet’s GitHub repository provides no Docker images or instructions on how to build one. However, since the SSH honeypot sensors and the master server only depend on a handful of Python libraries, Blacknet can be easily dockerized. Moreover, packaging the honeypot sensors in a Docker container would further simplify the integration of other honeypots as sensors.

Popularity. Blacknet 2 was initially written in 2010 and rewritten in 2017 based on an LH design. As of the time of this writing, Blacknet’s GitHub repository has only 10 stars, but the client–server architecture and the resulting capability of distributed honeypots are noteworthy features.

3.2.5. Masscanned

Masscanned (IVRE, 2023b) is an LH specialized in capturing scanning attempts and bots. It is published under the GPL-3.0 open source license and implements protocols on different Open Systems Interconnection (OSI) model layers including HTTP, SSH, SMB, DNS, ARP, ICMP version 6, TCP, and UDP. This, in turn, allows using Masscanned to implement a wide variety of honeypots.

System architecture. Masscanned was originally written for integration into Instrument de Veille sur les Réseaux Extérieurs (IVRE), an open-source reconnaissance framework composed of tools for passive and active reconnaissance (IVRE, 2023a). However, Masscanned can also be used independently from IVRE. The architecture of the honeypot is shown in Fig. 9.

Masscanned implements the network stack in the user space. As a result, responses to the incoming network packets require no access to the operating system’s kernel. A major advantage of this architecture is Masscanned’s ability to customize IP and MAC addresses and generate responses on the different network layers.

For a stand-alone deployment, a virtual network interface is created, and the honeypot listens on all ports. Deployment and configuration can be done via the command line.

Software architecture. Masscanned is written in the Rust programming language and packaged for Cargo, Rust’s default package manager. Standard Rust (network) libraries are used to implement the honeypot’s network protocols. This is possible since their implementation contains only basic functionality. As an example, the SSH implementation contains no cryptographic key exchange. Individual modules are sorted by layer (layers 2, 3, 4, and the application layer of the OSI model), and all modules are activated by default.

Logging. Logs are by default displayed on the console, but logfmt format is also supported. All incoming network packets are saved in Packet Capture (PCAP) files which can be used for subsequent analysis. In addition, the data from the network packets can be evaluated by integrating masscanned with IVRE.

Fingerprinting. Because masscanned is tailored to detect simple scanning attempts rather than meticulously disguise itself as a real IT system, the honeypot has no dedicated measures to prevent its detection. However, since the honeypot implements a custom userland network stack, masscanned operators can implement their own countermeasures against fingerprinting.
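For readers unfamiliar with the logfmt format mentioned under Logging, the following minimal sketch shows how one event could be rendered as space-separated key=value pairs. It is not taken from masscanned (which is written in Rust); the field names and the quoting rule are assumptions for illustration.

```python
def to_logfmt(record):
    """Render a flat dictionary as one logfmt line (key=value pairs).

    Values that are empty or contain a space, a double quote, or "="
    are wrapped in double quotes with backslash escaping.
    """
    parts = []
    for key, value in record.items():
        text = str(value)
        if text == "" or any(char in text for char in ' "='):
            text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)


# Hypothetical event; the field names are not masscanned's actual log schema.
event = {"src": "198.51.100.7", "dport": 22, "msg": "syn received"}
print(to_logfmt(event))
```

Since every line is self-describing, such output can be filtered with standard text tools or parsed by log collectors without a separate schema.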
Fig. 9. System architecture of masscanned. The honeypot does implement protocols on different network layers.
Deployment options. Masscanned authors suggest a deployment on a virtual private server and provide a build manual for a stand-alone deployment in the project’s README on GitHub. While no Dockerfile is provided, packaging a Rust application in a Docker container is simple. Since masscanned listens on all network ports, a Docker container needs to use the host networking and have the capability to forge network packets.

Popularity. With 61 stars and 11 forks on GitHub, masscanned appears to be less popular in its standalone version. However, it is commonly used as part of the IVRE framework which, having over 2,800 stars and over 600 forks on GitHub, receives significant attention in the developer community. Moreover, masscanned’s commit history shows that it continues to receive regular updates to its code base.

3.2.6. Glastopf

Glastopf (MushMush Foundation, 2021a) is a low-interaction web application honeypot. By emulating a wide range of web application, database, and cross-site scripting vulnerabilities, Glastopf mimics a vulnerable web server with a large attack surface.

Glastopf is designed to be indexed by search engines such that, in combination with its large attack surface, the number of attack attempts that hit the honeypot is maximized.

System architecture. Glastopf is a pure web application honeypot designed to emulate a wide range of web application, database, and cross-site scripting vulnerabilities. The honeypot’s system architecture is shown in Fig. 10.

Technically, Glastopf emulates vulnerability types like Remote File Inclusion (RFI) or HTML injection rather than actual, specific vulnerabilities. To achieve this, the honeypot first determines the HTTP request type (GET, POST, or HEAD). In the second step, the honeypot tries to determine the type of the attempted attack by searching for specific, pre-defined patterns in the HTTP request. As an example, using the pattern =http://, Glastopf would detect an RFI attack in the HTTP request:

    GET http://example.com/vulnerable.php?color=http://evil.com/shell.php

Finally, once the attack type is identified, Glastopf calls an attack-specific handler which, in turn, generates the response simulating a successful attack.

In addition, Glastopf implements a simple parser for injected PHP files. The parser takes the injected PHP file, extracts statements that generate output, and generates a valid response corresponding to these statements. As an example, if the injected PHP file contains:

    $un = @php_uname(); echo "uname -a: $un<br>"

Glastopf replaces @php_uname() with a valid (but fake) value and generates a response like:

    uname -a: GNU/Linux","Linux my.leetserver.com 2.6.18-6-k7<br>

In addition, emulators for a login interface to catch brute-force attack attempts or attempted SQL injections can be complemented by staged phpinfo or phpmyadmin pages. Those contain hints of deficiencies such as a vulnerable PHP version. This can range up to the emulation of specific exploits like PHP remote code execution (National Vulnerability Database, 2018).

Software architecture. Glastopf is implemented in Python. The honeypot’s code base consists of individual Python modules implementing HTTP functionality, event classification, vulnerability emulators, and data processing and logging.
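The three classification steps described under System architecture (check the request type, match pre-defined patterns, call an attack-specific handler) can be sketched as follows. This is a simplified illustration rather than Glastopf's code: only the "=http://" idea comes from the text above (generalized here to https and ftp), while the SQL injection pattern, the handler responses, and all function names are hypothetical.

```python
import re
from urllib.parse import unquote, urlsplit

# Ordered list of (attack type, pattern); the first match wins.
# The "sqli" pattern is a hypothetical addition for illustration.
PATTERNS = [
    ("rfi", re.compile(r"=(?:https?|ftp)://", re.IGNORECASE)),
    ("sqli", re.compile(r"union\s+select", re.IGNORECASE)),
]

# Hypothetical handlers that return a response simulating a successful attack.
HANDLERS = {
    "rfi": lambda: "<html>file included</html>",
    "sqli": lambda: "<html>1 row returned</html>",
    "unknown": lambda: "<html>welcome</html>",
}


def classify(method, target):
    """Return the attack type for a request line, or "unknown"."""
    if method not in {"GET", "POST", "HEAD"}:
        return "unknown"
    # Decode percent-escapes so patterns see the payload as the attacker wrote it.
    query = unquote(urlsplit(target).query)
    for name, pattern in PATTERNS:
        if pattern.search(query):
            return name
    return "unknown"


def respond(method, target):
    """Dispatch to the handler that belongs to the detected attack type."""
    return HANDLERS[classify(method, target)]()


print(classify("GET", "http://example.com/vulnerable.php?color=http://evil.com/shell.php"))
```

Matching on the query string only, rather than on the full URL, keeps the scheme of the request target itself from being mistaken for an injected address.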
The vulnerability type emulators can access pre-defined HTML pages to present a web page to the attacker. In addition, a PHP sandbox is implemented that executes a set of whitelisted PHP functions. Both the emulators and the sandbox can be easily extended with custom functionality if needed.
Glastopf mostly uses modules from the Python standard library such as urllib and base HTTP server. Moreover, the well-known gevent (Bilenko, 2019) is used as a networking library.

Logging. Glastopf logs contain the source IP of the attacker as well as the request URL and the attack pattern, e.g., the string of an SQL injection. The logs can either be written into an SQLite database or sent to a remote log collector using the HPfeeds protocol.

Fingerprinting. The emulation of different vulnerability types allows for a realistic picture of a vulnerable web server. While the honeypot can be fingerprinted in principle, this would require an attacker to craft dedicated HTTP requests to which the honeypot responds differently than an actual web server.

Deployment options. Since Glastopf is not actively maintained, a host deployment is not recommended. The honeypot’s GitHub repository provides a Dockerfile based on Ubuntu 14.04 and Python 2.7.

Popularity. Glastopf received its last commit in October 2021. Since the last major changes to the code base happened in 2014, the authors recommend the use of the successor SNARE/TANNER. Nearly 500 stars and 174 forks on GitHub highlight how Glastopf brought novel approaches to the realm of web application honeypots.

3.2.7. SNARE/TANNER
The Super Next generation Advanced Reactive honEypot (SNARE) (MushMush Foundation, 2021b) and TANNER (MushMush Foundation, 2022b) are the successors of Glastopf. Thus, they build another web application honeypot system, but the separate structure provides better camouflage and performance.

System architecture. The honeypot is divided into SNARE and TANNER. SNARE clones existing web pages and converts them into attack surfaces. The honeypot then looks like an NGINX web server presenting the cloned contents. As shown in Fig. 11, incoming HTTP requests are forwarded to TANNER. TANNER analyzes and classifies the request before crafting a response. TANNER also contains the vulnerability emulators known from Glastopf.

Software architecture. SNARE and TANNER are developed in Python and mostly tested for Python 3.4+. SNARE uses Beautiful Soup (Richardson, 2020) to pull data from HTML files. For web server capabilities, asyncio (Python Standard Library, 2023), an asynchronous I/O library, and AIOHTTP (aiohttp contributors, 2023), an HTTP library built on top of it, are utilized.
TANNER provides different vulnerability emulators, e.g., for Cross-Site Scripting (XSS). Patterns for XSS or SQL injections can be expanded and additional emulators can be implemented. Incoming SNARE sessions are analyzed similarly to Glastopf; the geoip2 module (MaxMind, 2023) provides additional information about the attacker’s IP.

Logging. SNARE forwards requests to TANNER, where information about an attack is logged locally in a JSON file. Further, the honeypot implements HPfeeds and MongoDB clients via Python.

Fingerprinting. The web application cloning and the representation of an NGINX web server improve the cover of the honeypot compared to Glastopf. Fingerprinting the Python HTTP libraries is a concern, but requires actively probing the honeypot, which means additional overhead for an attacker. To an observer who only passively watches the honeypot’s HTTP traffic, it looks like an NGINX web server.

Deployment options. The documentation (MushMush Foundation, 2016, 2018) contains instructions for local deployment. Further, both components have Dockerfiles based on Alpine images. The Docker images for SNARE (279 MB) and TANNER (330 MB) allow a compact Dockerization of the architecture considering the extensive capabilities.
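
The request handling described for TANNER (and inherited from Glastopf), where a request is classified before a response is crafted, can be illustrated with a minimal Python sketch. The regular expressions, category names, and canned responses below are hypothetical and far simpler than the actual emulators; they only show the classify-then-dispatch idea under these assumptions.

```python
import re
from urllib.parse import unquote

# Hypothetical detection patterns, checked in order; real emulators use richer rules.
PATTERNS = [
    ("rfi", re.compile(r"=\s*(?:https?|ftp)://", re.IGNORECASE)),
    ("lfi", re.compile(r"\.\./|/etc/passwd", re.IGNORECASE)),
    ("sqli", re.compile(r"\bunion\b.+\bselect\b", re.IGNORECASE)),
    ("xss", re.compile(r"<\s*script\b", re.IGNORECASE)),
]

# Hypothetical emulators: each returns a staged response body for its category.
EMULATORS = {
    "rfi": lambda req: "<html>include(): remote file loaded</html>",
    "lfi": lambda req: "root:x:0:0:root:/root:/bin/bash",
    "sqli": lambda req: "You have an error in your SQL syntax",
    "xss": lambda req: "<html>" + unquote(req) + "</html>",
    "index": lambda req: "<html>Welcome</html>",
}


def classify_request(path_and_query):
    """Return the first matching category name, or 'index' for benign requests."""
    decoded = unquote(path_and_query)  # attackers often URL-encode payloads
    for name, pattern in PATTERNS:
        if pattern.search(decoded):
            return name
    return "index"


def respond(path_and_query):
    """Classify the request, then let the matching emulator craft the response."""
    kind = classify_request(path_and_query)
    return kind, EMULATORS[kind](path_and_query)
```

Because classification and response generation are separate steps, a new vulnerability type only needs one additional pattern and one additional emulator entry, which mirrors the extensibility the text attributes to the emulators.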
Table 2
Overview of open-source honeypots including notable features. A * marks the successor in the following row.
Honeypot | Last maintained | Protocols | Type | Language | Interaction level | GitHub stars
Heralding (Vestergaard, 2023) | 2023 | HTTP(s), SSH, FTP, POP3, Telnet, IMAP, SQL, PostgreSQL, SMTP, SOCKS5 | Credentials | Python | Low | 344
Glutton (MushMush Foundation, 2023) | 2023 | SSH, Telnet, ADB, FTP, HTTP, XMPP, SMTP, MQTT, TCP | Credentials/Proxy | GO | Low | 196
Endlessh (Wellons, 2021) | 2021 | SSH | Bot-trapping | C | Low | 5,811
Blacknet 2 (Mòrian, 2019) | 2019 | SSH | Client–server | Python | Low | 10
Masscanned (IVRE, 2023b) | 2023 | HTTP, SSH, STUN, SMB, DNS, ARP, ICMP(v6), TCP, UDP | Scans/Bots | RUST | Low | 61
Glastopf* (MushMush Foundation, 2021a) | 2021 | HTTP(s) | Vulnerability type emulation | Python | Low | 485
SNARE (MushMush Foundation, 2021b)/TANNER (MushMush Foundation, 2022b) | 2021/2022 | HTTP(s) | Web application cloning | Python | Low | 387/181
Dionaea (DinoTools, 2021) | 2021 | HTTP, FTP, MQTT, MySQL, UPnP | Malware extraction | C, Python | Low | 628
OpenCanary (Thinkst Applied Research, 2023) | 2023 | HTTP, FTP, MSSQL, MySQL, NTP, RDP, SSH, SNMP, Telnet, VNC, TFTP, SAMBA, Git, TCP | Server imitation | Python | Low | 1,684
Conpot (MushMush Foundation, 2022a) | 2022 | HTTP, S7comm, FTP, Modbus, EtherNet/IP | ICS imitation | Python | Low | 1,076
Honeyntp (Fyodor, 2014) | 2014 | NTP | Connection logging | Python | Low | 50
Log4Pot (Patzke, 2023) | 2022 | Log4Shell | Vulnerability pitfall | Python | Low | 84
Kippo* (Upi Tamminen, 2016) | 2016 | SSH | Shell interaction | Python | Medium | 1,457
Cowrie (Oosterhof, 2023) | 2023 | SSH, Telnet | Shell interaction | Python | Medium | 4,184
Sshd-honeypot (amv42, 2018) | 2018 | SSH, Telnet | Cowrie proxy | C | Medium | 22
Sshesame (Jakab, 2023) | 2023 | SSH | Docker environment | GO | Medium | 1,259
Wetland (ohmyadd, 2018) | 2018 | SSH, SFTP, TCP | Proxy for a Docker environment | Python | High | 120
Popularity. SNARE (387 stars and 122 forks) and TANNER (181 stars and 87 forks) are developed under the GPL-3.0 license. Although the popularity of the honeypot is inferior to its predecessor, it brings expanded capabilities that allow for even more specialized use.

3.2.8. Dionaea
Dionaea, the successor to Nepenthes (Baecher et al., 2006), emulates exploitable vulnerabilities that are exposed by services connected to the network. The honeypot’s goal is to receive a malware copy, and it offers a wide range of protocols to get there: HTTP, FTP, TFTP, MySQL, MSSQL, UPnP, SMB, SIP, PPTP, MQTT, Mirror, Memcache, Blackhole, and epmap build Dionaea’s protocol suite.

System architecture. Dionaea offers a wide range of features that make the system architecture quite complex. The honeypot consists of different modules that can be activated at wish. Fig. 12 shows a sample architecture with the Python and the Emulation (EMU) module running. The Python module deploys the services offered by the honeypot.
In addition, the EMU module receives all incoming and outgoing communication of deployed services to detect and emulate shell code from payloads. This is done to gain malware copies.

Software architecture. The core of Dionaea is implemented in C, but it embeds Python via the Python/C API (Python Software Foundation, 2023). The possibility to include services via Python is a unique feature of the honeypot. The connection handling below the application layer, as well as the modules, are written in C. OpenSSL libraries provide SSL/TLS support, and libemu (Angelo Dell’Aera, 2022) is used for the shell code detection and emulation.
The Python module allows for the deployment of services that are implemented in Python. Thus, this module contains all of the honeypot’s Python scripts. Recommended Python versions are 3.8 and 3.9. In this part of the honeypot implementation, regular Python libraries like sqlite3 and Jinja2 are in use.

Logging. Dionaea supports logging as JSON files, in a database, or via HPfeeds. The honeypot has a filter module to set rules for log filtering and allows the activation of fail2ban via the configuration file. Fail2ban is an intrusion detection and prevention module that scans log files for IPs causing malicious activity and sets firewall rules to prevent these from entering again.

Fingerprinting. Dionaea was successfully fingerprinted in Srinivasa et al. (2021a) due to two reasons: static values in the generated SSL/TLS certificates and default configurations of services. Default configurations like protocol banner strings must be avoided upon deployment and can be changed in the configuration of each service. The X509 certificate for SSL/TLS is generated at startup and is self-signed, but its parameters can be changed in the source code.

Deployment options. For host deployments, the authors recommend Ubuntu 18.04 or Debian 10. Ubuntu 20.04 and Debian 11 are not supported since those dropped libemu from the package repository (DinoTools, 2015).
For Docker deployment, an official Docker image (DinoTools, 2023) is provided, but the same image can also be built from the GitHub repository. The image is also based on Ubuntu 18.04. With a size of 182 MB, the image is comparatively small. After the honeypot is built, build packages and source code are removed from the container, which contributes to the small size.

Popularity. While a lot of the code base was written from 2010 to 2012, the honeypot was still maintained and expanded between 2016 and 2021. With 628 stars and 169 forks, Dionaea has received sizable attention in the honeypot community. The honeypot is released under the GPL-2.0 license.

3.2.9. OpenCanary
OpenCanary (Thinkst Applied Research, 2023) is an LH that is maintained by Thinkst Canary (Thinkst Canary, 2023b) and published under the BSD-3-Clause license. The honeypot can mimic typical enterprise IT servers, e.g., a Linux Web server, a Windows server, or a MySQL server. Offered services are FTP, HTTP, MSSQL, MySQL, NTP, RDP, Redis, SNMP, SSH, Telnet, VNC, TFTP, Samba, GIT, and TCP. Further, the honeypot has a port scan detection module.

System architecture. OpenCanary uses the configuration file to assemble server configurations from offered services. The official documentation (Thinkst Canary, 2023a) provides sample configurations for Linux Web server, Windows server, MySQL server, and MSSQL server. The Linux Web server configuration, for example, deploys FTP, HTTP, and SSH services looking like an Apache Ubuntu server running OpenSSH. The architecture of this sample configuration is illustrated in Fig. 13.
The Samba module requires a dedicated Samba installation. Events are then triggered as soon as somebody accesses the file share.

Software architecture. The honeypot is written in Python and supports versions 2.7 or 3.7+. The structure of the honeypot is modular and all services can be enabled through the configuration file. OpenCanary mostly uses Twisted (Twisted Matrix Labs, 2023) as a network library. The SNMP module additionally requires Scapy. A description of Scapy is provided in Section 3.4.

Logging. By default, OpenCanary logs events in JSON format but can also write events into syslog. Further, the honeypot implements HPfeeds to send logs to a remote host.

Fingerprinting. Since there are known fingerprinting probes for Twisted (Vetterl and Clayton, 2018), active probing is a concern. However, fingerprinting methods reported on the GitHub page were already fixed in the past. Further, a lot of passive fingerprinting depends on realistic system imitations. Those can be changed via the configuration file.
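
JSON event logs of the kind described above for OpenCanary and Dionaea lend themselves to simple post-processing. The following Python sketch counts events per source address and lists the sources that reach a threshold, loosely in the spirit of what fail2ban does with log files. It is not fail2ban and sets no firewall rules. It assumes one JSON object per line, and the field name `src_host` is an assumption that should be checked against the log format of the deployed honeypot.

```python
import json
from collections import Counter


def count_events_per_source(lines, source_field="src_host"):
    """Count log events per source address.

    `lines` is an iterable of strings, one JSON object per line (assumed format).
    `source_field` is an assumed field name for the attacker's address.
    """
    counts = Counter()
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the scan
        if not isinstance(event, dict):
            continue
        source = event.get(source_field)
        if source:
            counts[source] += 1
    return counts


def sources_over_threshold(counts, threshold):
    """Return the sorted list of sources with at least `threshold` events."""
    return sorted(src for src, n in counts.items() if n >= threshold)
```

A caller could feed the result into whatever blocking or alerting mechanism is in place; that step is deliberately left out here.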
Deployment options. OpenCanary was initially supposed to run as a daemon on a host system. Therefore, it provides instructions for Ubuntu and OS X deployments. But the developers also made a Dockerfile available which is based on python:3.7-buster. The resulting Docker image is 964 MB in size. This is quite sizable compared to other honeypots. By using a slimmer Python base image, for example, a lighter deployment is possible.

Popularity. OpenCanary is a maintained honeypot with over 1.6 thousand stars and over 300 forks on GitHub. The code history shows constant updates to the code base in the past three years. The development with the open-source community further allows for comprehensive testing; for example, a fingerprinting method for the SQL Server honeypot was reported on the GitHub page and subsequently fixed.

3.2.10. Conpot
Conpot (MushMush Foundation, 2022a) is an LH for Industrial Control Systems (ICS) that emulates multiple industrial control protocols and tasks, including HTTP and Supervisory Control and Data Acquisition (SCADA). The objective is a honeypot system that looks like an industrial network (The Honeynet Project, 2018). Supported protocols are HTTP, S7comm, FTP, Modbus, and EtherNet/IP. Further, there are templates for different industrial devices and interfaces that include IEC104, Kamstrup 382, Guardian AST, IPMI, and SNMP.
Although Conpot primarily aims to mimic ICS systems, it has a unique feature in inter-honeypot communication that can improve a honeypot’s cover. This approach can also be useful in enterprise IT systems.

System architecture. Conpot aims to allow the user to freely design an industrial complex. Therefore, the honeypot includes different protocols and provides templates to mimic the industrial devices of various manufacturers. Those templates are XML files that contain properties of the represented device or Human–Machine Interface (HMI). As shown in Fig. 14, configured HMIs can connect to real clients to display realistic information to an adversary. Further, pre-configured traffic between devices and an artificial delay added to the service response times (The Honeynet Project, 2018) reinforce the illusion of a real industrial complex.

Fig. 14. Conpot architecture. In addition to services, there is also the possibility of setting up a human–machine interface connected to a real client.

Software architecture. Conpot was written in Python and has a modular structure. The core contains basic honeypot functionality like logging, session handling, and protocol handlers. This is complemented by protocol and template modules. This structure allows an extension to specific parts of the honeypot.
The honeypot supports Python 3.6 or higher. For protocol implementations, the Python modules socketserver (Python, 2022) and gevent are frequently used. Gevent is a networking library.

Logging. Conpot has extensive logging capabilities. While the base output is JSON, the honeypot also provides support for syslog, HPfeeds, SQLite, as well as STIX (OASIS Open, 2022a), and TAXII (OASIS Open, 2022b). STIX and TAXII are open-source formats to serialize and exchange cyber threat intelligence. The exchanges happen via a RESTful API using HTTPS. Those protocols are published by OASIS Open (OASIS Open, 2022c).

Fingerprinting. Conpot is a very specialized environment with atypical protocols. While fingerprinting of HTTP and FTP is possible with the method described in Vetterl and Clayton (2018), other protocols need expert knowledge of industrial environments and protocols.

Deployment options. The latest release (0.6.0) provides a Docker image based on Python 3.8 as well as instructions for a host deployment of Conpot. The pre-built image comes with a size of 260 MB. Container
Literature. Conpot was extensively analyzed in Gokhale et al. (2020). The authors have focused on finding interactions that end up in a dead end or in an infinite loop. The authors of Dowling et al. (2019), Siniosoglou et al. (2020) used machine learning approaches to improve the honeypot’s concealment.

3.2.11. Honeyntp
Honeyntp (Fyodor, 2014) is a Network Time Protocol (NTP) honeypot that is based on an open-source Python NTP server (limifly, 2015). NTP is a protocol to ensure time synchronization between IT systems. Although it does not provide access to a server, NTP and comparable protocols can still be targets of exploits that cause a denial of service or worse.

System and software architecture. As illustrated in Fig. 15, Honeyntp is an NTP server listening to incoming NTP packets. The honeypot collects received information in a Redis database.
The Python-implemented honeypot is a one-script application. The Redis library for Python is used for database communication and a configuration file provides the database’s location.

Logging. As mentioned earlier, Honeyntp logs into a Redis database. However, no packets are logged; rather, first-seen/last-seen information is stored per IP/port pair.

Fingerprinting. Since the underlying NTP server was last updated in 2015, there is a high probability that the implementation is identifiable as a Python NTP server. This is a concern since there are approaches to OS fingerprinting utilizing the running NTP server implementation as a hint to the OS.

3.2.12. Log4Pot
As soon as a new protocol vulnerability becomes public, many automated scanners start looking for potential targets. Log4Pot (Patzke, 2023) is a honeypot developed solely to catch attackers trying to exploit the Log4Shell vulnerability.

Deployment options. Log4Pot’s GitHub repository currently does not contain a Dockerfile. However, considering that Log4Pot is a rather small Python project, containerization is possible. There are instructions for a host/VM deployment.

Popularity. Log4Pot was developed under the GPL-3.0 license. With 84 stars and 21 forks, it is a rather small honeypot. But the monitoring of individual vulnerabilities is a concept that is often used, especially for known CVEs.

3.2.13. Kippo
MHs combine a low-risk environment of LHs with the greater knowledge gains of HHs. While those honeypots may provide a reasonable return for the moderate risk, the amount of functionality makes them difficult to develop. SSH is once again the most prominent service offered by the honeypots.
Kippo (Upi Tamminen, 2016) is an SSH honeypot with the purpose of logging brute force attacks and shell interactions. The honeypot was inspired by Kojoney but is no longer maintained. It is recommended to use the fork Cowrie.

System architecture. Kippo allows the attacker to gain access to the system via SSH login. After successful login, the attacker is presented with a shell and access to a file system. The general architecture is very similar to the successor Cowrie and comparable with the illustration in Fig. 17. The file system resembles a Debian 5.0 installation and allows the defender to place custom files. The attacker can execute commands in the shell to modify the file system or to download files with wget.
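
The idea of an emulated shell on top of a fake file system, as described for Kippo, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions and not Kippo's implementation: the class name `FakeShell`, the flat path-to-content dictionary, and the handful of commands are hypothetical, and the emulated wget only records the requested URL instead of downloading anything.

```python
import shlex


class FakeShell:
    """Tiny emulated shell: staged outputs, a fake file system, full command logging."""

    def __init__(self, files):
        self.files = dict(files)    # fake file system: path -> content
        self.session_log = []       # every command line the attacker entered
        self.requested_urls = []    # URLs passed to the emulated wget

    def run(self, command_line):
        """Log the command line, then dispatch to a cmd_<name> handler if one exists."""
        self.session_log.append(command_line)
        try:
            parts = shlex.split(command_line)
        except ValueError:
            return "bash: syntax error"
        if not parts:
            return ""
        handler = getattr(self, "cmd_" + parts[0], None)
        if handler is None:
            return "bash: " + parts[0] + ": command not found"
        return handler(parts[1:])

    def cmd_ls(self, args):
        return "\n".join(sorted(self.files))

    def cmd_cat(self, args):
        out = []
        for path in args:
            if path in self.files:
                out.append(self.files[path])
            else:
                out.append("cat: " + path + ": No such file or directory")
        return "\n".join(out)

    def cmd_touch(self, args):
        for path in args:
            self.files.setdefault(path, "")  # modifies only the fake file system
        return ""

    def cmd_wget(self, args):
        # Only record the URL; this sketch never opens a network connection.
        self.requested_urls.extend(a for a in args if not a.startswith("-"))
        return ""
```

Adding a new command amounts to adding one `cmd_<name>` method, which is one simple way to obtain the kind of extensible command set the honeypot descriptions emphasize.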
Fig. 16. Architecture of Log4Pot. Logs can be stored in AWS S3 Buckets or Azure Blob.
Fig. 17. System architecture of the honeypots Cowrie (emulated shell mode) and Kippo.
Fig. 18. Architecture of the sshd-honeypot. The modified OpenSSH server utilizes a Cowrie back-end for command interpretation.
Software architecture. The code base consists of the file system contents, the staged output of shell commands like ifconfig, and the Python modules. Kippo requires Python 2.5+; Python modules are divided into core functionality, command implementations, and log functionality and are modular in structure. This allows the user to add new command functionality to the shell interaction.
Kippo depends on the Twisted library for networking, especially Twisted.Conch for SSH and Telnet capability. Further, PyCrypto is used as a cryptography toolkit and MySQLdb for database access.

Logging. Kippo has different logging modes to choose from: text logs and database logging in the form of MySQL or XMPP. Session logs contain every command that is executed by the attacker and all downloaded files are stored for subsequent review.

Fingerprinting. Fingerprinting will be discussed for Cowrie since it is the most current version.

Deployment options. Kippo was tested on Debian, CentOS, FreeBSD, and Windows 7. Although there is no Dockerfile in Kippo’s repository, Cowrie provides official Docker images.

Popularity. Although Kippo’s last update dates back to 2016, the honeypot was quite popular. Over 1.4 thousand stars on GitHub, as well as 274 forks, illustrate the popularity of Kippo’s approach.

3.2.14. Cowrie
The Kippo successor Cowrie (Oosterhof, 2023) is a medium to high interaction honeypot expanding the capabilities of its predecessor. In addition to Telnet support, the developers added curl, SFTP, and SCP functionality to allow the attacker to perform file download and upload operations.

System architecture. The general architecture (see Fig. 17) remained similar to Kippo. The attacker can connect to the system by SSH or Telnet and is presented with an emulated shell and file system structure. The attacker can download files with wget or curl, as well as upload files with SCP or SFTP.
In addition to Kippo features, Cowrie supports a high interaction proxy mode. The honeypot can function as a proxy to a honeypot system to monitor activity. Further, Cowrie can use QEMU-emulated servers as honeypot systems. In both modes, Cowrie can be configured to forward SMTP connections to an SMTP honeypot of choice.

Software architecture. The software architecture has not changed significantly compared to Kippo’s. However, the individual parts have increased in scope. For example, the list of command functionalities has tripled.
The Python version for Cowrie is 3.8+. Network functionality still depends on Twisted libraries, and Python libraries bcrypt, cryptography, and pyOpenSSL are used for hashes, cryptography, and SSL/TLS functionality.
A feature that was added after the release of Vetterl and Clayton (2018) is the possibility to adjust ciphers, MAC, and compression methods used by the SSH implementation. Next to the already existing list of SSH banners, this allows for deeper customization to prevent the honeypot’s detection.

Logging. Logging functionality was improved compared to the predecessor. Text logs in the form of JSON, and a database connection are still available. Possibilities to send Cowrie’s output to an ELK stack, Graylog, or Azure Sentinel were added. Files that are downloaded from or uploaded to Cowrie are still saved for later review.

Fingerprinting. While Cowrie deployments were fingerprinted in Vetterl and Clayton (2018) and Srinivasa et al. (2021a), this involves a very high level of effort and time investment, especially for the up-to-date version. Since the developers adapted the responses of the underlying Twisted library (at least for SSH), fingerprinting the implementation requires a lot of overhead in SSH messages.
More than half the identified Cowrie instances detected by Srinivasa et al. (2021a) were only labeled with a Shodan honeyscore of 0.0 or 0.3. Thus, they are difficult to identify with moderate effort. Honeyscore is described in Section 3.4.

Deployment options. Cowrie comes with a Dockerfile and an official image. The size of 382 MB as well as the availability of different architectures allow for versatile application. The image is based on a slim version of Debian Bullseye and offers options for AMD, ARM, and MIPS, among others.

Popularity. Cowrie is one of the most well-known honeypots for SSH. It is not only mentioned in every honeypot survey or comparison, but its 4.1 thousand stars and 782 forks on GitHub also show the open-source community’s recognition.

3.2.15. Sshd-honeypot
Sshd-honeypot (amv42, 2018) is a modified OpenSSH implementation using Cowrie as back-end (amv42 and Oosterhof, 2018) for command interpretation and logging. The additional middleman was implemented to circumvent fingerprinting of the SSH library (Twisted.Conch) used by Cowrie (Vetterl and Clayton, 2018).

System architecture. Sshd-honeypot has a straightforward system architecture which is illustrated in Fig. 18. A modified OpenSSH 7.3p1 daemon builds the honeypot. To ensure Cowrie functionalities, a modified Cowrie serves as the back-end. The SSH daemon listens on port 65222 by default. Thus, it requires an iptables rule to forward incoming SSH traffic from port 22. The port can be changed in the sshd_config file.

Software architecture. Since this honeypot depends on a real OpenSSH implementation, sshd-honeypot was written in C. It depends on libssh as an SSH library. Release 7.3p1 is the version of OpenSSH that has been modified.

Logging. The back-end Cowrie is only modified to interpret commands for sshd-honeypot; logging capabilities persist.

Fingerprinting. The authors mention that this architecture was created to circumvent fingerprinting presented in Vetterl and Clayton (2018). This SSH honeypot cannot be distinguished from an OpenSSH 7.3p1 server from the outside. Sshd-honeypot nullifies fingerprinting but requires a custom-modified client for each represented OpenSSH version.
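
The SSH banner mentioned in the Cowrie and sshd-honeypot discussions is the identification string that a server sends right after the TCP connection is established, in the form `SSH-protoversion-softwareversion SP comments CR LF` (RFC 4253, Section 4.2). The following Python sketch reads and parses that string. It only illustrates the most superficial check a scanner can perform: a matching banner says nothing about deeper protocol behavior. The function names and the limits on line count and line length are choices made for this example.

```python
import socket


def parse_ssh_identification(line):
    """Split an SSH identification line into (protoversion, softwareversion, comments)."""
    text = line.decode("ascii", errors="replace").rstrip("\r\n")
    if not text.startswith("SSH-"):
        raise ValueError("not an SSH identification string")
    head, _, comments = text.partition(" ")   # comments are optional
    parts = head.split("-", 2)                # "SSH", protoversion, softwareversion
    if len(parts) != 3 or not parts[1] or not parts[2]:
        raise ValueError("malformed SSH identification string")
    return parts[1], parts[2], comments


def read_ssh_identification(host, port=22, timeout=5.0):
    """Connect to host:port and return the raw identification line sent by the server."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rb") as stream:
            # A server may send other lines before the identification string.
            for _ in range(20):
                line = stream.readline(256)
                if not line:
                    break
                if line.startswith(b"SSH-"):
                    return line
    raise ConnectionError("no SSH identification string received")


def reports_software(line, expected):
    """Check whether the banner claims the expected software version, e.g. 'OpenSSH_7.3p1'."""
    _, software, _ = parse_ssh_identification(line)
    return software == expected
```

A defender can use the same check against an own deployment to confirm that the configured banner is actually the one presented to clients.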
Deployment options. The honeypot runs as a daemon like a real OpenSSH server. Since there are container environments running OpenSSH servers, a Docker deployment of sshd-honeypot is possible.

Popularity. With 22 stars on GitHub, sshd-honeypot is a rather small side project of Cowrie. Since the authors of Vetterl and Clayton (2018) presented a breakthrough in honeypot detection, it is worth mentioning possibilities to circumvent those attacks. Developed under the BSD-3-Clause license, it was last updated in 2018.

3.2.16. Sshesame
Sshesame (Jakab, 2023) is another medium-interaction SSH honeypot that focuses on the attacker’s behavior once they have access to the system. While the depth of the emulated system shell does not reach the likes of Cowrie, it has comprehensive configuration capabilities.

System architecture. The honeypot accepts incoming SSH connections and allows the adversary to log in to the system. As shown in Fig. 19, the attacker is then presented with a system shell. The shell is emulated and executes a limited list of commands, e.g., cat, su, or echo.
A configuration can be passed to modify the offered SSH service; this includes the protocol banner as well as the welcome message. Sshesame was written with an emphasis on an adjustable attack surface. For this reason, the honeypot allows adjusting key exchange algorithms, ciphers, and MACs used by SSH.

Software architecture. Sshesame is written in GO and the latest release v0.0.27 depends on GO-lang 1.14.0. Every functionality is implemented in an individual script which allows for an expansion of functionalities. The implementation depends on standard GO crypto libraries (Golang.org, 2023) like crypto/ssh or crypto/x509 for certificates, signatures, and SSH. Configurations are passed as a YAML file.

Logging. By default, sshesame logs connections and shell commands to the system console. The honeypot allows configuring JSON logging as well as a connection to Prometheus. Prometheus (2023) is an open-source tool for system monitoring.

Fingerprinting. Sshesame is a unique case because the honeypot allows configuring every parameter used in the SSH handshake. This does not prevent extensive probing of the implementation, but it does prevent passive fingerprinting by recording only the SSH handshake. Fingerprinting is therefore a significant overhead for the adversary.

Deployment options. Sshesame provides instructions to run as a daemon or as a Docker deployment. The Docker image is set up in a multi-stage build process: the code is first compiled in a GO-lang base image and then deployed in a slim Alpine or distroless image. This results in a comparatively very small image of 33.8 MB. Furthermore, sshesame provides builds for linux-amd64, linux-arm64, linux-armv6, and linux-armv7 architectures in its GitHub repository.

Popularity. Sshesame has gained a lot of popularity considering it is a one-person project. The honeypot has received 1.2 thousand stars and has 74 forks on GitHub. It is released and maintained under the Apache-2.0 license. Most of the code base is from 2021, but sshesame still receives constant updates.

3.2.17. Wetland
High-interaction environments can be built through every communication protocol granting remote access to a system. Notable functionalities are different approaches to logging and the usage of Docker containers as high-interaction environments. These honeypots are associated with the highest maintenance effort because a complete system is given to an attacker. While caused damage is minimized by using Docker environments, the risk of systems being weaponized against other targets is still an issue.
Wetland (ohmyadd, 2018) is a high-interaction SSH honeypot. It uses Docker containers to minimize the maintenance effort and the risk of high-interaction deployments. Wetland uses HonSSH’s (Nicholson, 2022) networking to be a transparent proxy. The honeypot combines the low-maintenance effort of MHs with the insights of high-interaction systems. While containers are a useful tool for this, they should not be left entirely unsupervised.

System architecture. As shown in Fig. 20, the honeypot utilizes container environments. Wetland only acts as a proxy and forwards incoming SSH connections to a Linux container. The container is not managed by Wetland; rather, the address of the container is set in the configuration file. As the container environment is not part of the Wetland implementation, different distributions can be chosen as the base image for the honeypot.

Software architecture. Wetland depends on Python 2.7, Paramiko for SSH, and IPy for IPv4 and IPv6 capabilities. The main.py script initializes the Wetland servers and services according to the configuration file wetland.cfg.default. Python servers for SFTP, SSH, and TCP are included.
Further, a Docker image running an SSH daemon is necessary to provide a high-interaction environment.

Logging. Wetland logs shell interaction, SCP and SFTP files, exec-commands, and reverse/direct-forward interaction of the adversary. Logs are either written to text/JSON files or can be sent via E-Mail, MQTT, or bearychat. Since logs are stored on the proxy, there is no risk of exposing logs on the honeypot machine to an adversary.

Fingerprinting. The dedicated high-interaction environment inside a container impedes fingerprinting. Since the Wetland proxy is transparent, an adversary only receives the container’s SSH fingerprint; the SSH implementation can be chosen by the defender.

Deployment options. While the honeypot itself does not provide a Docker image, it relies on a Docker environment to function. As Wetland depends entirely on Python, it is possible to set it up as a container and deploy the entire honeypot system as containers.

Popularity. Wetland is a rather small project, with 120 stars and 25 forks on GitHub. Although the project has not received any commits since 2018, its early use of container technology is worth noting.

3.3. Honeypot frameworks

Honeypot frameworks combine a collection of honeypots to offer a wider range of possibilities. A central configuration and deployment of honeypots for different protocols and services as well as a joint data collection are reasons for a framework deployment. This section provides an overview of state-of-the-art honeypot frameworks or ones with unique capabilities. Rather than focusing on the features of individual honeypots, we want to highlight data collection, presentation, as well as deployment possibilities. A comprehensive overview of all surveyed open-source frameworks can be found in Table 3.

Table 3
Overview of open-source honeypot frameworks.
Framework | Last maintained | Type
Honeyd (Provos, 2004) | 2007 | Honeynet
T-Pot (Deutsche Telekom Security GmbH, 2023) | 2023 | Collection of 23 honeypots
Chameleon (QeeqBox, 2022) | 2022 | Collection of 19 honeypots
OWASP Honeypot Framework (The OWASP Foundation, 2023) | 2023 | Deployment of modules
HoneyDB (Deception Logic, Inc, 2022a) | 2022 | Distributed honeypot network

Honeyd (Provos, 2004) not only was one of the first honeypot frameworks but implements functionalities rarely seen in other honeypot implementations. The framework is able to deploy multiple fake hosts inside the local network and simulate their network stack. TCP,
UDP, and ICMP packets as well as scans of closed ports are answered 3.4.1. Network scans
correctly. The user can add arbitrary latency between hosts into the Network scans are an important first step when investigating a tar-
configuration to deceive traceroute tools. In addition, the framework get system. Identifying running services can reveal critical information
supports the use of different TCP/IP fingerprints – these are used when about the target system.
remotely identifying OSs – for each host to simulate a real system. Nmap (Lyon, 2022), short for ‘‘Network Mapper’’, is an open-source
T-Pot (Deutsche Telekom Security GmbH, 2023) is a honeypot tool for network reconnaissance. While the prevalent use is in port
framework developed by Deutsche Telekom Security GmbH looking scanning, Nmap has a wide range of other functionalities to offer. Scans
can be used to detect the OS running on a remote host by comparing
to be an all-in-one solution for honeypot deployment. The framework
the TCP/IP fingerprint to the Nmap OS database. Further, Nmap is able
contains more than 20 honeypots that can be deployed via a Docker
to do version detection of running service applications by inspecting
container; including well-known implementations for different services,
IP packets sent by the remote host. This tool can be used for broad
for example, SSH (Cowrie), HTTP (SNARE/TANNER), and SMTP. To
network scans as well as closer reconnaissance of single targets and is
complement a large number of deployable honeypots, the framework
an important tool for attackers and defenders to get a first impression
also contains a web interface to analyze the traffic seen by the deployed of the device in front of them.
instances. Tools like p0fv3 (Zalewski, 2014), fatt (Karimi, 2022) and Shodan (2023d) is a search engine for devices connected to the
geoip-attack-map (eddie4 and May, 2020) are used to gather data Internet. It allows the user to search for arbitrary terms and IP ad-
and visualize it afterwards. T-Pot is designed for deployment on local dresses to get general information like the autonomous system number,
machines and virtual machines in a cloud setup. Internet service provider, and organization. Further, open ports are
Chameleon (QeeqBox, 2022) contains customizable honeypots for displayed as well as basic headers returned by the service. More func-
19 different protocols and services, including SSH, HTTP, E-Mail, and tionalities include monitoring of owned devices and visualization of
database protocols to analyze network traffic, scanning attempts, and search results on a geo map interface (Shodan, 2023c). Shodan im-
credentials. The results are presented on a Grafana Labs (2023) web ages (Shodan, 2023b) allows you to browse all screenshots collected by
interface. The individual honeypots are written in Python and de- Shodan. Known alternatives for Shodan include Cencys (Censys, 2023)
ployed into Chameleon via a Docker container. For most protocol and ZoomEye (Knownsec, 2023).
implementations Twisted is used as a protocol library; protocol identi-
fication banners are customizable. Incoming traffic payloads are parsed 3.4.2. Fingerprinting operating systems
and compared with typical patterns. The simple automation process While tools capable of fingerprinting OSs are also network scanners,
supports cloud (AWS EC2) deployment. it is useful to know not only how these OS scanning features work, but
also how to deceive them.
OWASP Python Honeypot (The OWASP Foundation, 2023) is a
Xprobe2 (binarytrails, 2021) is a tool designed for TCP/IP finger-
honeypot framework supporting Python 3.x. The framework currently
printing remote operating systems. While other OS scanners rely on
implements modules for SSH, FTP, HTTP, ICS, and SMTP for demon-
static database comparisons to identify the remote OS, Xprobe2 values
stration purposes. Further, the Docker-based modules come with a
the results of each probe sent with a statistical approach to find the best
weak password version – these grant easy access for any adversary match possible. This procedure is necessary because TCP/IP packets can
to monitor attacker behavior – as well as a strong password ver- be manipulated by network devices between the own device and the
sion to monitor brute-force attempts. The SSH module is implemented remote host (Arkin and Yarochkin, 2002). In this case, exact matches
through the Paramiko library and reveals that when fingerprinted. are difficult to obtain, therefore, a valuation of the individual results
Configuration possibilities include port configuration as well as a reset can be more accurate.
feature to automatically reset containers after a given time period in p0fv3 (Zalewski, 2014) is a passive fingerprinting tool to identify
case of corruption. An API server and Elasticsearch (Elastic, 2023) are devices that communicate via TCP/IP without any active probing.
used to log data collected by honeypots and a Grafana Web UI for In contrast to, for example, Nmap or Xprobe2, p0fv3 does not send
representation. Elasticsearch as well as the API server do not need to any packets to the remote host. While this method allows one to do
run on the same system as a honeypot instance and can be dockerized. reconnaissance without attracting attention, it relies on being able to
HoneyDB-Agent (Deception Logic, Inc, 2022a) is a honeypot frame- receive network traffic from the target device.
work for several different services, including SSH, Telnet, FTP, and OSfooler-ng (Sanchez, 2019) is an open-source project that was
HTTP. Even though the agent can log attack activity locally and func- developed to prevent the fingerprinting tools described above, from
tion as a standalone honeypot, the ultimate goal is to contribute hon- successfully identifying the OS running on the local machine; this is
eypot findings to the HoneyDB website (Deception Logic, Inc, 2022a). done without the need to modify the system kernel. For this to work,
OSfooler-ng is using the databases provided by p0fv3 and Nmap to
Thus, creating a network of honeypots that covers large parts of the
adopt the chosen TCP/IP fingerprint and fool remote network scanners.
globe. Incoming data is accumulated and a general overview is pro-
vided about the top attacked hosts and services. All data can be
3.4.3. Service fingerprinting and vulnerabilities
accessed via HoneyDB REST API and there is no need for local log man-
Once again, specialized network scanners can be used to get a better
agement (Deception Logic, Inc, 2022b). The interaction level provided
understanding of the target. What makes those tools especially useful is
by the individual service honeypots depends on the functionalities of
the capability to reveal honeypots running on a target system if those
the emulation plugin.
are susceptible to service fingerprinting or use default configurations
known to scanners.
3.4. Tools for honeypot testing Hassh (Reardon et al., 2022) can be used to fingerprint SSH client
and server implementations. After the initial identification message,
server and client exchange their supported suites for key exchange,
In this section, we discuss tools that provide useful functionality for encryption, and authentication. Those lists are combined and hashed
honeypot development. In particular, these tools help the developer into an MD5 hash which can be stored and used for comparison.
verify that the honeypot is correctly mimicking the target system. This tool further includes a hasshGen script to automatically gen-
Another subset of these tools supports the implementation of packet erate Docker containers with a specific SSH implementation version
inspection and deception techniques. to extract the given fingerprint. Hassh’s repository includes a list of
Table 4
Summary of honeypot evasion techniques from academic research.
Evasion mechanism | Description | References
Manipulate network-facing attack surface | Honeypots adapt packets and header fields at the protocol level to prevent detection. | Albanese et al., Naik et al., Siniosoglou et al., López-Morales et al., Miah et al.
Learn detection mechanisms | Honeypots learn and prevent automated detection mechanisms. | Dowling et al.
Hide monitoring | Honeypots hide that the system is monitoring the adversary's approach. | Asrigo et al., Du et al., Mohammadzad and Karimpour
Defuse environment | Honeypots provide seemingly real environments that are protected from abuse. | Vetterl and Clayton, Srinivasa et al., Touch and Colin
OpenSSH, Paramiko, and Dropbear SSH clients, for which the Hassh fingerprint database was already generated.
Shodan honeyscore (Shodan, 2023a) is an API offered by Shodan to evaluate whether a given host might be a honeypot. The score, ranging from 0 to 1.0, expresses the probability of the host being a honeypot. As the source code is not openly available, we have no certainty about the way it operates; the authors of Srinivasa et al. (2021a) believe that there are overlaps with their technique.
Open Vulnerability Assessment Scanner (OpenVAS) (Greenbone, 2023) is a comprehensive vulnerability scanner that combines the capabilities of many of the above tools. It creates scan reports for the entire system including lower levels of the OSI model. Scan operations can be customized and scheduled, thereby allowing automated testing of the network's systems and honeypots.
Honeyscanner (Koufakos et al., 2023) is an open-source tool that is developed to identify vulnerabilities in honeypots. As we discuss in Sections 2.4 and 4.2, attacks on honeypots are an increasing part of academic research. Honeyscanner supports honeypot developers and maintainers in keeping their honeypots secure. The covered attacks range from passive attacks to active probing, denial-of-service attacks, and software library exploitation. To our knowledge, Honeyscanner is one of the first tools that specifically targets honeypots.

3.4.4. Network packet inspection
Network packet inspection focuses on gathering and analyzing incoming network traffic to find interesting and new attack patterns and packets. Further, the handling of network-level packets can be useful to deceive reconnaissance tools.
Wireshark (Gerald Combs, 2022) is an open-source tool and the predominant choice for network packet inspection that can be used to analyze network traffic from a variety of different protocols. It supports live captures of traffic as well as offline analysis and decryption support for, among others, IPsec, TLS, and WPA2. The capture export possibilities allow automated analysis of the entire network traffic.
Snort (Cisco, 2022), an open-source Intrusion Prevention System (IPS), has many options for configuration and can also be used for packet sniffing and logging. Furthermore, it implements preprocessors for a range of popular protocols like SSH or HTTP, which inspect incoming packets for known protocol exploits.
Scapy (SecDev, 2022) allows the user to inspect, manipulate, and craft network packets for a range of protocols. The Python library supports not only the manipulation of incoming and outgoing packets but also, for example, network scans, tracerouting, and attacks. It enables the specialization of capabilities of common network tools like arpspoof, tcpdump, and p0fv3 (SecDev, 2022).

4. Honeypot research
This section provides an overview of academic research regarding honeypots, honeypot detection strategies, and honeypot evasion techniques. We found that, especially in the realm of honeypot evasion, academic research provides innovative methods that are yet to be seen in open-source solutions. To highlight more new approaches to honeypots, we present academic work independent of the target domain. An overview of honeypot evasion methods from the presented publications is shown in Table 4.

4.1. Honeypots from academic research
Honware (Vetterl and Clayton, 2019) is a Customer Premise Equipment (CPE) and IoT honeypot framework allowing the user to upload their own firmware image and deploy it as a honeypot. This solution, based on a custom kernel, is chosen by the authors to circumvent fingerprinting attacks and to improve existing emulation strategies regarding scalability.
The authors in López-Morales et al. (2020) present a honeypot for Programmable Logic Controllers (PLCs). HoneyPLC emphasizes the credibility of the system and the automation of honeypot generation. The included PLC profiler tool scans a specified target to automatically create a PLC profile. These PLC profiles are afterwards deployed on the honeypot. To prevent an attacker's attempt at fingerprinting, Honeyd's approach of modifying TCP/IP fingerprints is used. HoneyPLC is intended to track the rapid development of malware in the ICS domain. The honeypot is available on GitHub (López and Doupe, 2023).
In Srinivasa et al. (2021b) the authors present RIoTPot, a honeypot for IoT and Operational Technology. The honeypot – which is available on GitHub (Network Security Group, Wireless Communication Systems at Aalborg University, 2023) – is modular in design to allow versatility and expandability. RIoTPot can be deployed in low- and high-interaction mode; however, these modes can also be combined for different services, resulting in a hybrid-interaction level. Low-interaction modules are integrated as Go packages and, thus, allow the addition of custom packages. The high-interaction mode depends on container images to provide a seemingly realistic environment to the adversary. RIoTPot's currently implemented protocols are SSH, Telnet, HTTP, Modbus, MQTT, and Constrained Application Protocol (CoAP), but further protocols from the ICS domain are planned.
In Touch and Colin (2022) the authors compare the findings of conventional honeypots like Cowrie to those of their own honeypot Asguard (Touch and Colin, 2021). Asguard is a Reinforcement Learning (RL) agent that is deployed inside an SSH proxy to monitor attacker behavior on an HH. The RL agent is trained to prevent the high-interaction environment from being compromised by intercepting and substituting malicious command line inputs. Besides analyzing gathered honeypot data, the authors also highlight the advantage of HHs: as long as an attacker is not able to compromise the system, high-interaction environments are the best source of attacker data. While Asguard is an approach capable of deceiving adversaries for a longer period of time compared to conventional solutions, the risk of fingerprinting attacks is still an ongoing concern discussed in the paper.
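The interception step that Asguard performs inside the SSH proxy can be pictured with a minimal sketch. Asguard learns its substitution policy with RL; the sketch below deliberately swaps that learned policy for a fixed lookup table, so it illustrates only how a proxy might intercept and substitute command lines, not the learning itself. The table contents, the replacement strings, and the function name filter_command are hypothetical and are not taken from Touch and Colin (2022).

```python
import shlex

# Hypothetical substitution table: risky command name -> harmless replacement.
# A learned policy (as in Asguard) would take the place of this static mapping.
SUBSTITUTIONS = {
    "wget": "echo 'wget: unable to resolve host address'",
    "curl": "echo 'curl: (6) Could not resolve host'",
}


def filter_command(line: str) -> str:
    """Return the command line that the proxy forwards to the backend."""
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes: forward a no-op instead of guessing the intent.
        return "true"
    for token in tokens:
        # Drop shell separators and any directory prefix before the lookup.
        name = token.strip(";&|").rsplit("/", 1)[-1]
        if name in SUBSTITUTIONS:
            return SUBSTITUTIONS[name]
    # Nothing suspicious found: forward the input unchanged.
    return line


print(filter_command("ls -la /tmp"))
print(filter_command("cd /tmp; wget http://198.51.100.7/x.sh"))
```

The sketch is intentionally conservative: it replaces the whole line as soon as any token matches, which also catches chained commands but can produce false positives (for example, listing a file that happens to be named wget).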
4.2. Detecting honeypots
In addition to the methods that we discuss in Section 2.4, there are other publications on honeypot detection. In Huang et al. (2019) the authors present an approach to automatically identify honeypot deployment. They use a machine learning model to predict a system's nature based on various input features: application layer header fields (e.g., HTTP options), network layer header fields, and other system characteristics (e.g., system fingerprint). Shodan and other scanners for Internet-connected devices are used to gather these input features. Although the authors limit their approach to HTTP, FTP, and SMTP, the successful results make this approach interesting for other protocols. In contrast to Vetterl and Clayton (2018) and Srinivasa et al. (2021a), this work does not disclose which honeypots were identified.
A fuzzing-based technique to identify honeypots is presented in Sun et al. (2020). As in traditional fuzzing, the authors aim to provoke error handling or unintended use cases to distinguish between honeypots and genuine systems. They probe systems on the Internet and evaluate the outputs supported by machine learning. Since Internet probing cannot be executed as aggressively as regular fuzzing (e.g., causing system crashes), network packets and mutation strategies are constrained.
Fingerprinting adaptive honeypots – honeypots that try to automatically identify threats – is discussed in Obaidat et al. (2021). The authors present a concept to confuse adaptive honeypot solutions. By repeatedly attacking honeypots with non-malicious network packets or command line inputs, those adaptive systems learn to identify them as a threat. Those same inputs can subsequently be used to force a reaction of honeypot instances that deploy the identical threat model.

4.3. Evasion strategies
The authors in Albanese et al. (2014) discuss an algorithmic approach to obscure an attacker's view of the network. The algorithm assigns each device a changing state primitive. A primitive consists of the definition of an OS fingerprint and methods for protocol scrubbing – the change of service-defined fields in network protocol headers (Watson et al., 2004) – that a client adopts temporarily. Presenting a manipulated view of the systems within the network prevents an attacker from reliably identifying OSs and services running on each device. Attackers would otherwise use this information to target specific hosts with known vulnerability exploits. Although the method is developed to further secure real network hosts, the idea of manipulating service or OS fingerprints to provide a different image of a device is a useful approach in the context of honeypot evasion.
The authors in Naik et al. (2018) address fingerprinting of LHs by doing in-depth research about TCP, UDP, and ICMP probing. Further, they develop an approach to detect fingerprinting of honeypots. The system evaluates incoming TCP options and flags as well as UDP and ICMP packets to detect an ongoing fingerprinting attempt. It is tested against Nmap OS and service fingerprinting attacks and is able to classify the majority of attacks with a medium to high attack probability rating. Understanding the procedure of active fingerprinting attacks is important to prevent honeypots from being easily identified. Further, the identification of such attacks is helpful when analyzing scanning attempts in honeypot-acquired data.
Honeypot detection and evasion is an ongoing race between attackers and defenders. For example, malware infections are accompanied by initial checks to identify honeypots. In Dowling et al. (2019), the authors present a honeypot that adapts to honeypot detection methods using RL. The approach specifically targets automated attacks and attempts to find optimal responses to the attacker's command line inputs. Evaluation results show that bots can use a large command sequence to test systems for authenticity, and that the presented honeypot is able to learn and withstand these sequences in case of repeated attacks.
The ability to send decoy traffic between honeypots is examined in Siniosoglou et al. (2020). But rather than making use of pre-configured network traffic, a deep neural network generates realistic-looking network traffic in a generative adversarial network architecture. This aims to improve Conpot's effectiveness in imitating a production device inside an ICS environment by introducing authentic network traffic to the honeypot.
Another approach to decoy traffic is presented in Miah et al. (2022). In addition to adversarial learning, the authors used game theoretic models to optimize the decoy traffic. This supports decisions on how much traffic of which kind should be deployed. Furthermore, the authors create their models with software-defined networking in mind: decoy flows are designed to mislead attackers who passively monitor the network.
In Asrigo et al. (2006) the authors implement three different monitoring designs on virtualization platforms to monitor honeypots. The sensors monitor kernel execution and can interrupt kernel instructions. This not only allows for event logging but, in theory, can also be used to limit an attacker's actions. In the evaluation, the authors show that VMMs are capable of efficiently logging an adversary's activity on the honeypot. Nevertheless, research has also shown that fully transparent monitoring is almost impossible (Garfinkel et al., 2007). Due to the increasing performance of hardware and virtualization techniques, however, timing discrepancies are difficult to detect, especially in a network where latency occurs.
In Du et al. (2021) the authors present a logging approach based on the record and replay functionalities of VMs. The VMM captures activities on the honeypot, and defenders can retrace all the attacker's steps in replay afterwards. To avoid recording every trivial brute-force attack, the framework has an on-demand record functionality that starts the recording only if adversaries take notable actions. These are defined by static rules, e.g., certain command executions. Replays can also be automatically scanned for interesting events to provide a useful entry point.
The authors in Mohammadzad and Karimpour (2023) present a local monitoring technique that does not require virtualization. They utilize Direct Kernel Object Manipulation (DKOM), a method used by modern rootkits to hide their existence on infected systems; DKOM manipulates kernel data structures to hide data and processes. Usually, adversaries use rootkits to install backdoors, key loggers, and other monitors onto a victim's machine. In this case, the same technique is used to monitor the adversary's behavior on a honeypot. Within the evaluation, the authors show that their honeypot implementation leaves no traces in memory and requires only minor kernel modifications.

5. Discussion of honeypot detection
During our work with the various honeypots and our research regarding the topic, we encountered different types of honeypots and frameworks. Whether the honeypot is a generic solution trying to detain bots or a specialized environment that targets human attackers, it can only operate as intended while it is not revealed as such. Recent research in fingerprinting has not only posed new challenges to honeypots but also brought new approaches to deception using container deployments.
In Vetterl and Clayton (2018), fingerprinting was described as not fixable for current honeypot architectures. While we agree that the capability to reveal honeypots is severe, we believe that differentiation by use case is necessary. The ease of fingerprinting must match the desired attacker profile. Crafting various protocol strings to detect different honeypots is an additional expense as well as a lot of overhead for botnets or automated scans which are targeting millions of devices. Thus, LHs are probably at a lower risk of being detected, simply because the attack is not worth the additional cost. This being said, fingerprints taken through the protocol handshake must be considered even for low-interaction devices. This kind of detection is trivial and done without a lot of overhead in computing or time investment.
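As a concrete illustration of how cheap a handshake-level fingerprint is, the Hassh construction from Section 3.4.3 can be sketched in a few lines. The text only states that the exchanged algorithm lists are combined and hashed with MD5; the field order, the comma and semicolon separators, and the inclusion of the compression list follow our understanding of the Hassh project's description and should be read as assumptions. The function name hassh_fingerprint and the sample algorithm lists are hypothetical.

```python
import hashlib


def hassh_fingerprint(kex, encryption, mac, compression):
    """Hash the advertised SSH algorithm lists into one MD5 hex digest.

    Assumed layout: names within a list are joined by commas,
    and the four lists are joined by semicolons.
    """
    fields = (kex, encryption, mac, compression)
    material = ";".join(",".join(names) for names in fields)
    return hashlib.md5(material.encode("ascii")).hexdigest()


# Two hypothetical clients offering the same algorithms in a different
# preference order; the order is part of the hashed material.
client_a = hassh_fingerprint(
    ["curve25519-sha256", "diffie-hellman-group14-sha256"],
    ["aes128-ctr", "aes256-ctr"],
    ["hmac-sha2-256"],
    ["none"],
)
client_b = hassh_fingerprint(
    ["diffie-hellman-group14-sha256", "curve25519-sha256"],
    ["aes128-ctr", "aes256-ctr"],
    ["hmac-sha2-256"],
    ["none"],
)
print(client_a, client_b, client_a == client_b)
```

Because the preference order is hashed along with the names, an SSH library with a characteristic default ordering yields a stable digest, which is why a honeypot built on such a library can be recognized from a single key-exchange message.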
For specialized systems and higher interaction levels, the effort and cost of fingerprinting must increase. Since high-interaction environments target human attackers, the time invested into individual attacks rises. Thus, it is expected that the adversary will take the time to investigate the system. The same holds true for specialized environments: adversaries that are targeting uncommon systems with extraordinary services or operating systems are probably familiar with the target environment and will detect superficial replicas.
Kippo, Cowrie, Glastopf, Dionaea, and Conpot were all fingerprinted in Vetterl and Clayton (2018) and Srinivasa et al. (2021a). While in Srinivasa et al. (2021a) default configurations were a major reason for the detection of many deployments, the authors of Vetterl and Clayton (2018) used active probing. Both discovered a particularly large number of deployments of outdated versions. As a result of the approach in Vetterl and Clayton (2018), the developers of Cowrie customized the protocol responses of the utilized SSH library and developed the proxy sshd-honeypot. Although this reaction is optimal and desirable, it is very time-consuming, especially as the number of supported protocols and protocol versions is increasing. For highly customizable environments with a lot of different protocols, the value of findings does not justify the effort. This has also been observed in our work with the surveyed honeypots. Versatility in the offered protocols and generic use cases are mostly coupled with easier fingerprinting. Honeypots that are specialized on one protocol or have a very specific use case are considerably more difficult to detect. An example of such a specialized honeypot is SNARE/TANNER, which is used for web-application cloning. While this allows only HTTP usage, the quality of the impression is more convincing. Another approach is the use of containers as a proxy, replacing the fingerprint of the honeypot service with a real service implementation. This technology is mostly used by MHs or HHs like (amv42, 2018) or (ohmyadd, 2018) to justify the additional expense.
Despite the benefits of container environments, they should not be used carelessly. Misconfigurations of the Docker environment can lead to reduced security or even vulnerabilities (The OWASP Foundation, 2022). For example, exposed Docker daemons have already caused the takeover of containers (Palo Alto Networks, 2021) and honeypots (Walla, 2022).
In Sections 4 and 3.2.10, we mention attempts to use machine learning to improve the deception of honeypots. While those approaches seem promising to improve a honeypot's cover, we have not yet seen such approaches in current open-source software.

6. Conclusion
Honeypots are valuable decoy resources that can mimic various types of systems. In this survey, we covered open-source honeypots with different focuses and levels of detail. Whether one seeks a web application honeypot, an SSH honeypot, or a honeypot mimicking a mail server, there are solutions for most use cases. Especially bigger open-source projects offer high quality due to the great number of contributors.
During the interaction with deployed honeypots, we observed that customization for different use cases comes with a trade-off in deceptiveness. This is confirmed by recent research in honeypot discovery: honeypots that specialize in mimicking a single system or service are more difficult to detect than their customizable counterparts. But depending on the type of attacker that is targeted by the honeypot, discovery methods do not nullify the honeypot's value. In particular, automated attacks targeting a large number of devices cannot invest the additional overhead that is needed to detect the presence of such a decoy system.
Another observation is the widespread utilization of container technology. Almost all honeypots offer Docker images or even rely on containers as a honeypot or proxy. Although we see great opportunities in container deployments, especially for automation purposes, they certainly introduce new challenges regarding the security of the honeypot.
During the course of this paper, we worked with open-source solutions for many use cases. Ideally, these are not deployed out of the box without preparation. While most are deployment-ready, default configurations should be changed, and there might be a need to further customize protocol responses if specialized environments are wanted. However, this is the advantage of open-source solutions, as they allow both: near-instant deployment and protocol-level customization for experienced users.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
All reviewed honeypots are open-source and thus available on the Internet (and cited in the paper).

Acknowledgments
We thank all our colleagues who joined us to discuss honeypots and their applications.

References
aiohttp contributors, 2023. Welcome to AIOHTTP. https://docs.aiohttp.org/en/stable/. (Last Accessed 25 January 2023).
Alata, E., Nicomette, V., Kaaniche, M., Dacier, M., Herrb, M., 2006. Lessons learned from the deployment of a high-interaction honeypot. In: Sixth European Dependable Computing Conference. pp. 39–46. http://dx.doi.org/10.1109/EDCC.2006.17.
Albanese, M., Battista, E., Jajodia, S., Casola, V., 2014. Manipulating the Attacker's view of a system's attack surface. In: 2014 IEEE Conference on Communications and Network Security. pp. 472–480. http://dx.doi.org/10.1109/CNS.2014.6997517.
amv42, 2018. sshd-honeypot. https://github.com/amv42/sshd-honeypot. (Last Commit: f588301 on 20 Dec 2018).
amv42, Oosterhof, M., 2018. cowrie-sshd. https://github.com/amv42/cowrie-sshd. (Last Commit: c53f4c9 on 20 Dec 2018).
Angelo Dell'Aera, 2022. libemu - x86 emulation and shellcode detection. https://github.com/buffer/libemu. (Last Commit: eb727eb on 14 June 2022).
Arkin, O., Yarochkin, F., 2002. A fuzzy approach to remote active operating system fingerprinting. https://github.com/binarytrails/xprobe2/blob/master/docs/xprobe2-defcon10.pdf.
Asrigo, K., Litty, L., Lie, D., 2006. Using VMM-based sensors to monitor honeypots. In: Proceedings of the 2nd International Conference on Virtual Execution Environments. pp. 13–23.
Baecher, P., Koetter, M., Holz, T., Dornseif, M., Freiling, F., 2006. The Nepenthes platform: An efficient approach to collect malware. In: International Workshop on Recent Advances in Intrusion Detection. Springer, pp. 165–184.
Bajpai, P., Enbody, R., Cheng, B.H., 2020. Ransomware targeting automobiles. In: Proceedings of the Second ACM Workshop on Automotive and Aerial Vehicle Security. AutoSec '20, Association for Computing Machinery, pp. 23–29. http://dx.doi.org/10.1145/3375706.3380558.
Bilenko, D., 2019. gevent - A Python networking library. http://www.gevent.org/index.html. (Last Accessed 23 January 2023).
binarytrails, 2021. Xprobe2. https://github.com/binarytrails/xprobe2. (Last Commit: f14af2e on 22 Mar 2021).
Brown, S., Lam, R., Prasad, S., Ramasubramanian, S., Slauson, J., 2012. Honeypots in the Cloud.
Carr, J., Schloesser, M., Lombardo, D., Huanjie, Z., Durechova, K., Werner, T., 2021. hpfeeds. https://hpfeeds.org/. (Last Accessed 12 January 2023).
Censys, 2023. Censys Internet Scanning Intro. https://support.censys.io/hc/en-us/articles/360059603231-Censys-Internet-Scanning-Intro. (Last Accessed 02 February 2023).
Cisco, 2022. Snort. https://www.snort.org/. (Last Accessed 06 July 2022).
Dalamagkas, C., Sarigiannidis, P., Ioannidis, D., Iturbe, E., Nikolis, O., Ramos, F., Rios, E., Sarigiannidis, A., Tzovaras, D., 2019. A survey on honeypots, honeynets and their applications on smart grid. In: IEEE Conference on Network Softwarization. NetSoft, IEEE, pp. 93–100.
Deception Logic, Inc, 2022a. HoneyDB. https://honeydb.io/. (Last Accessed 13 July 2022).
Deception Logic, Inc, 2022b. HoneyDB Agent Docs. https://honeydb-agent-docs.readthedocs.io/en/latest/. (Last Accessed 13 July 2022).
Deutsche Telekom Security GmbH, 2023. T-Pot - The all in one multi honeypot plattform. https://github.com/telekom-security/tpotce. (Last Commit: 9941818 on 12 May 2023).
DinoTools, 2015. dionaea Docs - Introduction. https://dionaea.readthedocs.io/en/latest/introduction.html. (Last Accessed: 16 June 2022).
DinoTools, 2021. dionaea honeypot. https://github.com/DinoTools/dionaea. (Last Commit: 4e459f1 on 8 February 2021).
DinoTools, 2023. Official image for dionaea a low interaction honeypot. https://hub.docker.com/r/dinotools/dionaea. (Last Updated: 01 February 2023).
Dowling, S., Schukat, M., Barrett, E., 2019. Using reinforcement learning to conceal honeypot functionality. In: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2018, Dublin, Ireland, September 10–14, 2018, Proceedings, Part III 18. Springer, pp. 341–355.
Du, C., Zhao, S., Wang, W., 2021. RRPOT: A record and replay based honeypot system. J. Phys. Conf. Ser. 1757 (1), 012183. http://dx.doi.org/10.1088/1742-6596/1757/1/012183.
eddie4, May, M., 2020. Cyber security geoip attack map. https://github.com/eddie4/geoip-attack-map. (Last Commit: 7d34b27 on 26 Jun 2020).
Elastic, N., 2023. Elasticsearch - The heart of the free and open elastic stack. https://www.elastic.co/elasticsearch/. (Last Accessed 08 August 2023).
European Union Agency for Cybersecurity, 2022. In: Lella, I., Tsekmezoglou, E., Naydenov, R.S., Ciobanu, C., Malatras, A., Theocharidou, M. (Eds.), ENISA Threat Landscape 2022: July 2021 To July 2022. European Network and Information Security Agency, http://dx.doi.org/10.2824/764318.
Fan, W., Du, Z., Fernández, D., Villagra, V.A., 2017. Enabling an anatomic view to investigate honeypot systems: A survey. IEEE Syst. J. 12 (4), 3906–3919.
Forcier, J., Gaynor, A., 2022. Paramiko - the leading native Python SSHv2 protocol library. https://www.paramiko.org/. (Last Accessed 06 July 2022).
Franco, J., Aris, A., Canberk, B., Uluagac, A.S., 2021. A survey of honeypots and honeynets for internet of things, industrial internet of things, and cyber-physical systems. IEEE Commun. Surv. Tutor. 23 (4), 2351–2383.
Furuhashi, S., 2021. MessagePack: Efficient Binary Serialization Format. https://msgpack.org. (Last Accessed 29 March 2023).
Fyodor, Y., 2014. honeyntp - NTP logger/honeypot. https://github.com/fygrave/honeyntp. (Last Commit: 0c9d938 on 27 March 2014).
Garfinkel, T., Adams, K., Warfield, A., Franklin, J., 2007. Compatibility is not transparency: VMM detection myths and realities. In: HotOS.
Gerald Combs, 2022. Wireshark - Go Deep. https://www.wireshark.org/. (Last Accessed 06 July 2022).
Gokhale, S., Dalvi, A., Siddavatam, I., 2020. Industrial Control Systems Honeypot: A Formal Analysis of Conpot. Int. J. Comput. Netw. Inf. Secur. 12 (6).
Golang.org, 2023. Go cryptography. https://pkg.go.dev/golang.org/x/crypto/. (Last Accessed 31 January 2023).
Grafana Labs, 2023. Grafana. https://github.com/grafana/grafana. (Last Commit: 98f3b5f on 16 August 2023).
Greenbone, 2023. openvas-scanner. https://github.com/greenbone/openvas-scanner. (Last Commit: cab4d7c on 07 August 2023).
Huang, C., Han, J., Zhang, X., Liu, J., 2019. Automatic identification of honeypot server using machine learning techniques. Secur. Commun. Netw. 2019, 1–8.
IVRE, 2023a. IVRE - Network recon framework. https://github.com/ivre/ivre. (Last Commit: 5f67435 on 16 January 2023).
IVRE, 2023b. masscanned. https://github.com/ivre/masscanned. (Last Commit: 511c816 on 30 March 2023).
Jakab, K., 2023. sshesame. https://github.com/jaksi/sshesame. (Last Commit: 2036179 on 18 January 2023).
Karimi, A., 2022. FATT /fingerprintAllTheThings. https://github.com/0x4D31/fatt. (Last Commit: c29e553 on 23 March 2022).
Knownsec, 2023. ZoomEye. https://www.zoomeye.org/. (Last Accessed: 04 April 2023).
Koufakos, A.C., Vasilomanolakis, E., Srinivasa, S., Yaben, R., 2023. Honeyscanner: A vulnerability analyzer for honeypots. https://github.com/honeynet/honeyscanner. (Last Commit: 7adf647 on 09 August 2023).
limifly, 2015. A Python based ntp server. https://github.com/limifly/ntpserver. (Last Commit: 69ec28b on 06 April 2015).
López, E., Doupe, A., 2023. honeyplc - High-interaction Honeypot for PLCs and Industrial Control Systems. https://github.com/sefcom/honeyplc. (Last Commit: 6188234 on 16 May 2023).
López-Morales, E., Rubio-Medrano, C., Doupé, A., Shoshitaishvili, Y., Wang, R., Bao, T., Ahn, G.-J., 2020. HoneyPLC: A next-generation honeypot for industrial control systems. In: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security. CCS ’20, Association for Computing Machinery, New York, NY, USA, pp. 279–291. http://dx.doi.org/10.1145/3372297.3423356.
Lyon, G., 2022. Nmap: Discover your network. https://nmap.org/. (Last Accessed 06 July 2022).
Matsubara, Y., 2023. PyMySQL - Pure Python MySQL client. https://github.com/PyMySQL/PyMySQL. (Last Commit: e91d097 on 09 January 2023).
MaxMind, 2023. Python code for GeoIP2 webservice client and database reader. https://github.com/maxmind/GeoIP2-python. (Last Commit: cf2d16f on 17 January 2023).
Miah, M.S., Zhu, M., Granados, A., Sharmin, N., Anjum, I., Ortiz, A., Kiekintveld, C., Enck, W., Singh, M.P., 2022. Optimizing honey traffic using game theory and adversarial learning. In: Cyber Deception: Techniques, Strategies, and Human Aspects. Springer, pp. 97–124.
Mohammadzad, M., Karimpour, J., 2023. Using rootkits hiding techniques to conceal honeypot functionality. J. Netw. Comput. Appl. 214, 103606. http://dx.doi.org/10.1016/j.jnca.2023.103606, URL https://www.sciencedirect.com/science/article/pii/S1084804523000255.
Mòrian, 2019. Blacknet 2. https://github.com/morian/blacknet. (Last Commit: d0a5730 on 20 December 2019).
MushMush Foundation, 2016. Welcome to TANNER’s documentation! https://tanner.readthedocs.io/en/latest/. (Last Accessed 01 February 2023).
MushMush Foundation, 2018. Welcome to SNARE’s documentation! https://snare.readthedocs.io/en/latest/. (Last Accessed 01 January 2023).
MushMush Foundation, 2021a. Glastopf - Web application honeypot. https://github.com/mushorg/glastopf. (Last Commit: d17fcb6 on 16 October 2021).
MushMush Foundation, 2021b. SNARE - Super next generation advanced reactive honEypot. https://github.com/mushorg/snare. (Last Commit: 0919a80 on 13 June 2021).
MushMush Foundation, 2022a. Conpot - ICS/SCADA honeypot. https://github.com/mushorg/conpot. (Last Commit: f0e6925 on 29 June 2022).
MushMush Foundation, 2022b. TANNER - He who flays the hide. https://github.com/mushorg/tanner. (Last Commit: 2fdce2e on 16 January 2022).
MushMush Foundation, 2023. Glutton - Generic low interaction honeypot. https://github.com/mushorg/glutton. (Last Commit: c896cd5 on 02 April 2023).
Naik, N., Jenkins, P., Cooke, R., Yang, L., 2018. Honeypots that bite back: A fuzzy technique for identifying and inhibiting fingerprinting attacks on low interaction honeypots. In: IEEE International Conference on Fuzzy Systems. FUZZ-IEEE, pp. 1–8. http://dx.doi.org/10.1109/FUZZ-IEEE.2018.8491456.
National Vulnerability Database, 2018. CVE-2012-1823 Detail. https://nvd.nist.gov/vuln/detail/cve-2012-1823. (Last Accessed 25 January 2023).
Nawrocki, M., Wählisch, M., Schmidt, T.C., Keil, C., Schönfelder, J., 2016. A survey on honeypot software and data analysis. ArXiv e-prints: 1608.06249.
Nazario, J., 2022. An awesome list of honeypot resources. https://github.com/paralax/awesome-honeypots. (Last Commit: a87cce9 on 29 November 2022).
netfilter.org, 2021a. The netfilter.org “iptables” project. https://www.netfilter.org/projects/iptables/index.html. (Last Accessed: 02 February 2023).
netfilter.org, 2021b. The netfilter.org “libnetfilter_queue” project. https://www.netfilter.org/projects/libnetfilter_queue/index.html. (Last Accessed 02 February 2023).
Network Security Group, Wireless Communication Systems at Aalborg University, 2023. RIoTPot - IoT and operational technology honeypot. https://github.com/aau-network-security/riotpot. (Last Commit: 7475a3a on 07 June 2023).
Nicholson, T., 2022. HonSSH. https://github.com/tnich/honssh. (Last Commit: 821ce87 on 02 January 2022).
NIST, 2023. National vulnerability database. https://nvd.nist.gov/. (Last Accessed 07 August 2023).
OASIS Open, 2022a. Introduction to STIX. https://oasis-open.github.io/cti-documentation/stix/intro. (Last Accessed 23 January 2023).
OASIS Open, 2022b. Introduction to TAXII. https://oasis-open.github.io/cti-documentation/taxii/intro.html. (Last Accessed 23 January 2023).
OASIS Open, 2022c. OASIS Open - Setting the standard for open collaboration. https://www.oasis-open.org/. (Last Accessed 23 January 2023).
Obaidat, M., Brown, J., Alnusair, A., 2021. Blind attack flaws in adaptive honeypot strategies. In: IEEE World AI IoT Congress. AIIoT, pp. 0491–0496. http://dx.doi.org/10.1109/AIIoT52608.2021.9454206.
ohmyadd, 2018. wetland - A high interaction SSH honeypot. https://github.com/ohmyadd/wetland. (Last Commit: 76d296e on 26 December 2018).
Oosterhof, M., 2023. Cowrie SSH/Telnet Honeypot. https://github.com/cowrie/cowrie. (Last Commit: 65fc49e on 09 January 2023).
OpenVPN Inc, 2023. easy-rsa - Simple shell based CA utility. https://github.com/OpenVPN/easy-rsa. (Last Commit: 2cadb05 on 28 March 2023).
Palo Alto Networks, 2021. Docker honeypot reveals cryptojacking as most common cloud threat. https://unit42.paloaltonetworks.com/docker-honeypot/. (Last Accessed 14 December 2022).
Palo Alto Networks, 2022. Attackers move quickly to exploit high-profile zero days: Insights From the 2022 unit 42 incident response report. https://unit42.paloaltonetworks.com/incident-response-report/. (Last Accessed: 22 November 2022).
Patzke, T., 2023. Log4Pot - A honeypot for the Log4Shell vulnerability (CVE-2021-44228). https://github.com/thomaspatzke/Log4Pot. (Last Commit: e224c0f on 26 April 2022).
Prometheus, 2023. The Prometheus monitoring system and time series database. https://github.com/prometheus/prometheus. (Last Commit: e023d89 on 31 January 2023).
Provos, N., 2004. A virtual honeypot framework. In: Proceedings of the 13th Conference on USENIX Security Symposium - Volume 13. SSYM ’04, USENIX Association, USA, p. 1.
Python, 2022. The Python programming language - socketserver. https://github.com/python/cpython/blob/3.11/Lib/socketserver.py. (Last Commit: ecfff63 on 11 March 2022).
Python Software Foundation, 2023. Python/C API reference manual. https://docs.python.org/3/c-api/index.html. (Last Accessed 01 February 2023).
Python Standard Library, 2023. asyncio - Asynchronous I/O. https://docs.python.org/3/library/asyncio.html. (Last Accessed 25 January 2023).
QeeqBox, 2022. Chameleon. https://github.com/qeeqbox/chameleon. (Last Commit: 0aad988 on 18 April 2022).
Rapid7, 2023. Metasploit framework. https://github.com/rapid7/metasploit-framework. (Last Commit: 253290d on 15 August 2023).
Reardon, B., Karimi, A., Salesforce, 2022. “HASSH” - A profiling method for SSH clients and servers. https://github.com/salesforce/hassh. (Last Commit: 9a6c29a on 29 April 2022).
Richardson, L., 2020. Beautiful soup documentation. https://www.crummy.com/software/BeautifulSoup/bs4/doc/. (Last Accessed: 25 January 2023).
Sanchez, J., 2019. OSfooler-ng. https://github.com/segofensiva/OSfooler-ng. (Last Commit: c0b20d6 on 26 May 2019).
Sanders, C., 2020. Intrusion Detection Honeypots: Detection Through Deception. Applied Network Defense, URL https://books.google.de/books?id=suubzQEACAAJ.
SecDev, 2022. Scapy - Packet crafting for Python2 and Python3. https://scapy.net/. (Last Accessed 08 July 2022).
Sethia, V., Jeyasekar, A., 2019. Malware capturing and analysis using dionaea honeypot. In: International Carnahan Conference on Security Technology. ICCST, pp. 1–4. http://dx.doi.org/10.1109/CCST.2019.8888409.
Shodan, 2023a. Honeypot or not? https://honeyscore.shodan.io/. (Last Accessed 03 August 2023).
Shodan, 2023b. Images. https://images.shodan.io/. (Last Accessed 03 August 2023).
Shodan, 2023c. Maps. https://maps.shodan.io/. (Last Accessed 13 August 2023).
Shodan, 2023d. Search engine. https://www.shodan.io/. (Last Accessed 03 August 2023).
Siniosoglou, I., Efstathopoulos, G., Pliatsios, D., Moscholios, I.D., Sarigiannidis, A., Sakellari, G., Loukas, G., Sarigiannidis, P., 2020. NeuralPot: An industrial honeypot implementation based on deep neural networks. In: ISCC. pp. 1–7. http://dx.doi.org/10.1109/ISCC50000.2020.9219712.
Spitzner, L., 2003. Honeypots: Tracking Hackers, Vol. 1. Addison-Wesley Reading.
Spitzner, L., Roesch, M., 2001. The value of honeypots, part one: Definitions and values of honeypots. Secur. Focus.
Srinivasa, S., Pedersen, J.M., Vasilomanolakis, E., 2021a. Gotta catch ’em all: A multistage framework for honeypot fingerprinting. ArXiv e-prints arXiv:2109.10652, URL https://arxiv.org/abs/2109.10652.
Srinivasa, S., Pedersen, J.M., Vasilomanolakis, E., 2021b. RIoTPot: A modular hybrid-interaction IoT/OT honeypot. In: Computer Security–ESORICS 2021. Springer, pp. 745–751.
Stolfo, S.J., Bowen, B.M., Ben Salem, M., 2011. Insider threat defense. In: van Tilborg, H.C.A., Jajodia, S. (Eds.), Encyclopedia of Cryptography and Security. Springer US, Boston, MA, pp. 609–611. http://dx.doi.org/10.1007/978-1-4419-5906-5_904.
Sun, Y., Tian, Z., Li, M., Su, S., Du, X., Guizani, M., 2020. Honeypot identification in softwarized industrial cyber–physical systems. IEEE Trans. Ind. Inform. 17 (8), 5542–5551.
The Honeynet Project, 2018. Conpot - Low interaction server side ICS honeypot. https://conpot.readthedocs.io/en/latest/. (Last Accessed 05 July 2022).
The OWASP Foundation, 2022. OWASP docker security cheat sheet. https://cheatsheetseries.owasp.org/cheatsheets/Docker_Security_Cheat_Sheet.html.
The OWASP Foundation, 2023. OWASP honeypot, automated deception framework. https://github.com/OWASP/Python-Honeypot. (Last Commit: 4235acf on 09 May 2023).
The Shadowserver Foundation, 2023. ShadowServer - Network reporting. https://www.shadowserver.org/what-we-do/network-reporting/. (Last Accessed: 02 January 2023).
Thinkst Applied Research, 2023. Modular and decentralised honeypot. https://github.com/thinkst/opencanary. (Last Commit: e074f15 on 24 March 2023).
Thinkst Canary, 2023a. OpenCanary - Welcome to the OpenCanary guide. https://opencanary.readthedocs.io/en/latest/. (Last Accessed 01 February 2023).
Thinkst Canary, 2023b. Thinkst canary - Know. When it Matters! https://canary.tools/. (Last Accessed 01 February 2023).
Tier, R., 2022. How to set up an endlessh tarpit on Ubuntu 22.04. https://www.digitalocean.com/community/tutorials/how-to-set-up-an-endlessh-tarpit-on-ubuntu-22-04. (Last Accessed 20 January 2023).
Touch, S., Colin, J.-N., 2021. Asguard: Adaptive self-guarded honeypot. In: 17th International Conference on Web Information Systems and Technologies-Volume 1: DMMLACS. SciTePress, pp. 565–574.
Touch, S., Colin, J.-N., 2022. A comparison of an adaptive self-guarded honeypot with conventional honeypots. Appl. Sci. 12 (10), http://dx.doi.org/10.3390/app12105224, URL https://www.mdpi.com/2076-3417/12/10/5224.
Twisted Matrix Labs, 2023. Twisted. https://twisted.org/. (Last Accessed 01 February 2023).
Upi Tamminen, 2016. Kippo - SSH honeypot. https://github.com/desaster/kippo. (Last Commit: 0d03635 on 30 September 2016).
Valeros, V., 2022. Installing glutton honeypot in the cloud. https://www.stratosphereips.org/blog/2022/5/3/installing-glutton-honeypot-in-the-cloud. (Last Accessed 16 January 2023).
Vestergaard, J., 2023. Heralding - Credentials catching honeypot. https://github.com/johnnykv/heralding. (Last Commit: 6437605 on 02 Aug 2023).
Vetterl, A., Clayton, R., 2018. Bitter harvest: Systematically fingerprinting low- and medium-interaction honeypots at internet scale. In: WOOT @ USENIX Security Symposium.
Vetterl, A., Clayton, R., 2019. Honware: A Virtual honeypot framework for capturing CPE and IoT Zero Days. In: APWG Symposium on Electronic Crime Research. ECrime, pp. 1–13. http://dx.doi.org/10.1109/eCrime47957.2019.9037501.
Walla, S., 2022. Compromised docker honeypots used for Pro-Ukrainian DoS attack. https://www.crowdstrike.com/blog/compromised-docker-honeypots-used-for-pro-ukrainian-dos-attack/. (Last Accessed 14 December 2022).
Wallen, J., 2015. An Introduction to Uncomplicated Firewall (UFW). https://www.linux.com/training-tutorials/introduction-uncomplicated-firewall-ufw/. (Last Accessed 20 January 2023).
Watson, D., Smart, M., Malan, G.R., Jahanian, F., 2004. Protocol scrubbing: Network security through transparent flow modification. IEEE/ACM Trans. Netw. 12 (2), 261–273.
Wellons, C., 2021. Endlessh: an SSH tarpit. https://github.com/skeeto/endlessh. (Last Commit: dfe44eb on 30 April 2021).
Zalewski, M., 2014. p0f v3. https://lcamtuf.coredump.cx/p0f3/. (Last Accessed 06 July 2022).
Zuzčák, M., Zenka, M., 2020. Expert system assessing threat level of attacks on a hybrid SSH honeynet. Comput. Secur. 92, 101784.

Niclas Ilg received the B.Sc. and M.Sc. in Computer Science from the University of Tuebingen, Germany, in 2020 and 2022. Currently, he is a Ph.D. student at Bosch Research, Germany. His research focuses on security automation and honeypot systems. Further interests are AI in cybersecurity and connectivity in the automotive domain.

Paul Duplys is a chief expert for cybersecurity at the Technical Strategies and Enabling department within the Mobility sector of Robert Bosch GmbH, a Tier-1 automotive supplier and manufacturer of industrial, residential, and consumer goods. Prior to this position, he spent over 12 years with Bosch Corporate Research where he led the security, privacy & safety research program and conducted applied research in various fields of information security. Paul holds a Ph.D. degree from the University of Tuebingen on side channel evaluation for the automotive domain.

Dominik Sisejkovic received the B.Sc. and M.Sc. degrees in computing from the University of Zagreb, Croatia, in 2014 and 2016, respectively, and the Ph.D. (Dr.-Ing.) degree (with honors) from RWTH Aachen University, Germany, in 2022. During his Ph.D., he focused on software design for trustworthy microelectronics, leading to award-winning publications and successful transfer of technology to industry. Currently, he is a team lead for cybersecurity research at Bosch Research, Germany.

Michael Menth is a professor at the Department of Computer Science at the University of Tuebingen, Germany, and chairholder of Communication Networks since 2010. His special interests are performance analysis and optimization of communication networks, network resilience, resource management, and security aspects. His research focus is on network softwarization, in particular P4-based data plane programming, Time-Sensitive Networking (TSN), Internet of Things, and Internet protocols. Dr. Menth contributes to standardization bodies, mainly to the IETF.