Empirical Study of Passive 802.11 Device Fingerprinting
arXiv:1404.6457v1 [cs.CR] 25 Apr 2014

Abstract: 802.11 device fingerprinting is the action of characterizing a target device through its wireless traffic. This results in a signature that may be used for identification, network monitoring or intrusion detection. The fingerprinting method can be active, by sending traffic to the target device, or passive, by just observing the traffic sent by the target device. Many passive fingerprinting methods rely on the observation of one particular network feature, such as the rate switching behavior or the transmission pattern of probe requests. In this work, we evaluate a set of global wireless network parameters with respect to their ability to identify 802.11 devices. We restrict ourselves to parameters that can be observed passively using a standard wireless card. We evaluate these parameters for two different tests: i) the identification test, which returns one single result being the closest match for the target device, and ii) the similarity test, which returns a set of devices that are close to the target device. We find that the network parameters transmission time and frame inter-arrival time perform best in comparison to the other network parameters considered. Finally, we focus on inter-arrival times, the most promising parameter for device identification, and show its dependence on several device characteristics such as the wireless card and driver but also running applications.

I. INTRODUCTION

Device fingerprinting is the action of gathering device information in order to characterize it. This process generates a signature, also called a fingerprint, that describes the observed features of a device in a compact form. If the generated signature is distinctive enough, it may be used to identify a device.

One application of 802.11 device fingerprinting is intrusion detection and more precisely the prevention of Medium Access Control (MAC) address spoofing. MAC address spoofing is the action of taking the MAC address of another machine in order to benefit from its authorization. The prevention of MAC address spoofing is of importance in various scenarios. Open wireless networks such as hot-spots often implement MAC address based access control in order to guarantee that only legitimate client stations connect, e.g. the devices that purchased Internet access on an airport hot-spot. An attacker can steal a legitimate device's session by spoofing the latter's MAC address.

Another application of fingerprinting is the detection of rogue access points (APs). Tools like AirSnarf or RawFakeAP enable an attacker to set up a rogue AP, which results in client stations connecting to the fake AP instead of the genuine one.

Fingerprinting is also useful in key protected wireless networks (e.g. WPA2). It can be used after a key-based authentication mechanism in order to check that only authorized devices are in the network. Indeed, wireless keys may leak, as there are several normal situations where e.g. home users voluntarily give out their wireless key without renewing it afterwards, for instance when allowing a guest's laptop to access the home network. While this scenario is both common and simple, it also endangers the home network; the guest may abusively reconnect or the key may eventually leak from the guest's laptop. Tools like aircrack-ng exist that allow non-professional hackers to crack the (known to be insecure) WEP protocol. Finally, commercial services, like wpacracker.com, exist which try to recover WPA keys.

Contributions: We investigate new passive fingerprinting candidates for 802.11 devices. More precisely, we measure five network parameters that can be captured with a standard wireless card. Using a generic method to calculate a signature of a device, we compare the ability of each network parameter to characterize 802.11 devices. We perform the similarity and the identification tests, which we evaluate using our own measurement data as well as public data obtained from large conference settings. We use detection window sizes of 5 minutes. We find that the network parameters transmission time and frame inter-arrival time perform best in comparison to the other network parameters considered. Our evaluation renders a probability of correct classification that ranges from 79.4% to 95.0% for the transmission time and from 62.7% to 93.1% for frame inter-arrival times. In the most difficult testing conditions, the wireless traffic of a conference, the inter-arrival time renders the best identification ratios. Up to 56.6% of the devices could be uniquely identified with a false positive rate of 0.1. Finally, we focus on inter-arrival times, the most promising parameter for device identification, and show its dependence on several device characteristics such as the device's wireless card and driver or running applications.

The remainder of this paper is organized as follows. We present related work in Section II. Section III presents the different network parameters considered, and Section IV presents the proposed fingerprinting method. Section V evaluates our method against several traces. We then present in Section VI the 802.11 features the generated signature depends on, and discuss possible attacks and applications in Section VII. Finally, we conclude with Section VIII.
II. RELATED WORK

A first class of related work fingerprints wireless client stations by analyzing implementation specificities of the network card and/or driver. Franklin et al. [9] characterize the drivers during the "active scanning period". This process is underspecified in the IEEE 802.11 standard regarding the frequency and order of sending probe requests. Therefore, each manufacturer implements its own algorithm and timers. Gopinath et al. [11] show that 802.11 cards exhibit very heterogeneous behavior due to implementation specificities. They test a set of 802.11 features such as Random Backoff timers or Virtual Carrier Sensing and present their experimental results. The observed heterogeneity in behavior may be used to fingerprint a card's vendor and model, but the authors do not further analyze this aspect.

Bratus et al. [6] propose an active method to fingerprint client stations as well as APs. They send malformed or non-standard stimulus frames to the fingerprintee and apply decision trees on the observed response or behavior. This yields a signature of the vendor/manufacturer. Because it is active, an attacker can easily detect this technique.

Cache [7] proposes two methods for fingerprinting a device's network card and driver. The first one is active and uses the 802.11 association redirection mechanism. Even if well specified in the IEEE 802.11 standard, it is very loosely implemented in the tested wireless cards. As a consequence, each wireless card behaves differently during this phase, which allows characterizing them. The second fingerprinting method of Cache is passive. It analyses the duration field values in 802.11 data and management frames. Each wireless card computes the duration field in a slightly different way, which allows characterizing the network card.

Common to all above approaches is that they cannot differentiate between two devices using the same network card and driver. Therefore, those approaches may not be used for identifying individual devices.

Another class of related work allows fingerprinting individual APs. Jana et al. [12] calculate the clock skews of APs in order to identify them. The authors calculate clock skews by using the timestamps contained in Beacon frames emitted by the AP. Arackaparambil et al. [3] refine the above work and propose a new method yielding more precise clock measures. They also successfully spoof an AP, making it indistinguishable by the methods used by Jana et al.

Loh et al. [14] fingerprint client stations by observing probe requests. Stations send probe requests according to characteristic periodic patterns (see [9]). The period itself is subject to slight variations. Far from being uniform, these variations can be clustered. With enough observation time, each cluster slowly drifts, with a slope proportional to the time skew. This work is capable of uniquely identifying client stations; however, the method requires more than one hour of traffic and is only applicable to client stations.

In contrast to the above papers, Pang et al. [15] discuss privacy implications in 802.11. Their paper highlights that users are not anonymous when using 802.11, as the protocol uses globally unique identifiers (the MAC address), which allow user tracking. Even if we suppose that this identifier is masked (e.g. by temporarily changing addresses), it is possible to track users by observing a set of parameters in the 802.11 protocol. The observed parameters are network destinations, network names advertised in 802.11 probes, 802.11 configuration options and broadcast frames' sizes. With encrypted traffic, three out of the four parameters still apply. The presented identification problem is close to our identification test. To answer this test successfully, their fingerprinting technique requires traffic samples for each user that last at least one hour.

III. 802.11 NETWORK PARAMETERS

This section describes the network parameters we consider for fingerprinting. We focus on network parameters that we can easily extract using a standard wireless card. We do not require the usage of expensive equipment such as software defined radios. Thus, the routine monitoring setup consists of a monitoring device that captures all 802.11 frames using a standard wireless card in monitor mode on a specified 802.11 channel.

We seek a fingerprinting method applicable to encrypted 802.11 traffic, making the fingerprinting method more universal and enabling fingerprinting devices from networks the monitoring device is not part of.

The fingerprinting method should not perturb the network and should be hard for an attacker to detect. As a result, our fingerprinting method is passive, i.e. the monitoring device does not generate any additional traffic.

We also require that the method relies on global network parameters, thus representing the traffic generated by a sender in general, rather than focusing on specific frames or features. In particular, it should be difficult for an adversary to deactivate or forge the considered network parameters. These considerations eliminate the option of extracting information from the 802.11 headers generated by the emitting station. The sender fills the fields in these headers, and the headers can thus be spoofed (e.g. using tools such as Scapy).

Finally, the fingerprinting method should be accurate, a property that we evaluate in the rest of this paper.

In light of the preceding requirements, we focus on information that we can extract solely from Radiotap [1] or Prism headers. The receiving wireless card driver generates these headers. An adversary that would like to change fields in these headers actually needs to change its behavior.

We consider the following network parameters, all being candidates for a fingerprinting method with the aforementioned requirements:

• Transmission rate: The 802.11 standard [2] allows transmitting frames using a set of predefined rates. Each sending wireless card chooses to transmit a given frame at a given rate. Gopinath et al. [11] highlight
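As an illustration of the receiver-side information discussed in Section III, the following minimal sketch reads the receive timestamp and the transmission rate from a Radiotap header using only the Python standard library. It is not the authors' tool: the function name `parse_radiotap` is hypothetical, and the sketch assumes that only the first three Radiotap fields (TSFT, Flags, Rate) are of interest and that they appear in the standard order with TSFT aligned on 8 bytes.

```python
import struct


def parse_radiotap(buf):
    """Return (header_length, tsft_usec or None, rate_mbps or None).

    Hypothetical helper: decodes only the TSFT, Flags and Rate fields.
    """
    version, _pad, length, present = struct.unpack_from("<BBHI", buf, 0)
    if version != 0:
        raise ValueError("unsupported Radiotap version")
    offset = 8
    # Bit 31 of a 'present' word signals that another 'present' word follows.
    word = present
    while word & (1 << 31):
        (word,) = struct.unpack_from("<I", buf, offset)
        offset += 4
    tsft = rate = None
    if present & 0x1:  # bit 0: TSFT, 64-bit microsecond counter, 8-byte aligned
        offset = (offset + 7) & ~7
        (tsft,) = struct.unpack_from("<Q", buf, offset)
        offset += 8
    if present & 0x2:  # bit 1: Flags, one byte (skipped here)
        offset += 1
    if present & 0x4:  # bit 2: Rate, one byte in units of 500 kbps
        rate = buf[offset] / 2.0
    return length, tsft, rate


# Synthetic header: TSFT = 123456 usec, Flags = 0, Rate = 108 * 0.5 = 54 Mbps.
example = struct.pack("<BBHIQBB", 0, 0, 18, 0x7, 123456, 0, 108)
print(parse_radiotap(example))
```

Because the receiving driver writes these fields, a sender cannot alter them directly, which is the property the section relies on. Whether a given driver reports the timestamp at the start or at the end of reception is driver-dependent and is not handled by this sketch.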
Pang et al. are able to detect 12% to 52% of users with an FPR of 0.1 and 5% to 23% of users with an FPR of 0.01. In comparison, we could identify 27.1% to 32.2% and 6.8% to 22.7% of the devices with an FPR of 0.1 and 0.01 respectively.

In light of the different evaluation results, we consider only inter-arrival times in the rest of the paper. This network parameter always appears in the top 3 network parameters (which are the transmission time, the inter-arrival time and the medium access time). The inter-arrival time performs well in most settings. Even in the difficult setting of the conference trace, the inter-arrival time yields good identification ratios. In contrast, the transmission time performs well in most scenarios but poorly in the most difficult setting of a conference. Finally, the medium access time behaves similarly to the inter-arrival time but slightly underperforms in "easy" settings such as the office traces.

C. Implementation

We have developed a tool in Python based on the pcap library. It analyses standard pcap files as well as live traffic and extracts the different network parameters as described in Section IV. The tool also implements the fingerprinting methodology, i.e. the calculation of the device signatures, reference database, similarity measures and the calculation of accuracy metrics.

In our implementation we require that each training and candidate signature uses a minimum number of 50 observations. Table I shows the resulting reference database sizes. 50 observations correspond roughly to 50 transmitted frames for the observed device. The corresponding minimum observation time, i.e. the minimum time required to generate the signature, ranges from several seconds to several minutes. It depends on the number of frames per second transmitted by the observed device. For instance, for one of the devices of office 2 that did not generate much traffic, this corresponds to 30 seconds of traffic. We also evaluated the performance with smaller thresholds, but we came to the conclusion that a minimum of 50 observations is a good compromise between the minimum time required to generate a signature and matching accuracy.

VI. FACTORS IMPACTING THE INTER-ARRIVAL HISTOGRAM SHAPE

Previous sections have shown that the frame inter-arrival time is the most promising network parameter under difficult monitoring conditions (typically in a conference setting) and for the more difficult test of unique device identification. This section discusses and demonstrates the different factors at various levels of the observed device that impact the frame inter-arrival time. Thus, this section gives an intuition why this network parameter performs better than the other network parameters proposed in Section III. This section also shows that inter-arrival times depend on other network parameters such as the medium access time and the transmission rate. Thus, this section also gives insights on the behavior of network devices regarding these parameters.

The inter-arrival time is composed of i) the transmission period and ii) the emitting client station's idle period. Both periods have an impact on the signature value. We discuss different wireless device behaviors that impact either one or the other period.

[Figure 4: two histogram panels; x-axis: inter-arrival time [µsec], y-axis: density.]
Figure 4. Example inter-arrival histogram of two different wireless devices using different backoff implementations. Only data frames transmitted the first time (no retries) and sent at 54 Mbps are shown.

A. Wireless medium access methods

The 802.11 standard [2] specifies mechanisms to avoid collisions among multiple devices competing for the same wireless medium. The wireless card or driver implements these mechanisms. Their effect is essentially expressed in the medium access wait time. Our evaluation (Section V) shows that the medium access wait time performs well for both the identification and the similarity test.

1) Impact of random backoff: The random backoff avoids frame transmission right after the medium is sensed idle. Instead, all client stations that would like to transmit frames should ensure that the medium is idle for a specified period, called DIFS, plus an additional random time, called backoff time, before sending data. Gopinath et al. [11] note that some device manufacturers implement the 802.11 standard very loosely. Berger et al. [5] also note differences such as devices that systematically send frames during the first slot.

To evaluate the impact of these differences on the inter-arrival time histograms, we conduct the following experiment: We send a continuous UDP stream (using iperf) from one wireless device placed within a Faraday cage. The Faraday cage minimizes the impact of external factors on the random backoff procedure. In the second experiment, we just replace the sending device with a model from another manufacturer and send the same UDP stream again. We only analyze frames transmitted at 54 Mbps. Figure 4 shows the resulting histograms. We can notice that the first graph adds one small additional slot before the 16 slots defined by the
standard. In addition, the distribution for the different slots is slightly different on both devices. This indicates that the two devices implement the backoff mechanism differently.

2) Impact of virtual carrier sensing: Virtual carrier sensing enables a client station to reserve the medium for a given amount of time (the contention-free period). The client station sends a Request To Send frame (RTS) and specifies the expected data transmission duration in this frame. The destination device replies with a Clear To Send frame (CTS). The client station can then transmit data frames during the reserved duration, and all other stations are supposed to remain silent during the time specified. The idle period between two frames sent during the contention-free period is fixed to SIFS. Wireless cards and drivers handle a so-called RTS threshold, which is a value between 0 and 2347 bytes.

If the size of the data to be sent is greater than the RTS threshold, the virtual carrier sensing mechanism will be triggered. Otherwise, the data frame will be sent using the random backoff mechanism. In some wireless card driver implementations, the RTS threshold can be changed manually. In other ones, this threshold is hard-coded into the driver. Some devices do not implement this mechanism at all and exclusively rely on the random backoff procedure.

To evaluate the impact of this mechanism on the inter-arrival time histograms, we conduct the following experiments: In a busy wireless network environment (our lab), a client station running Linux sends a continuous UDP stream to a device connected by wire to an AP. We use iperf to generate the UDP stream. We conduct the experiment twice using two distinct RTS settings on the same sending client station: a) virtual carrier sensing turned off and b) RTS threshold set to 2000 bytes. Figure 5 shows the resulting histograms. In Figure 5 a), all frames are sent after a random backoff mechanism. In Figure 5 b), only RTS frames are sent after a random backoff mechanism, while data frames are sent during the contention-free period.

[Figure 5: two histogram panels; x-axis: inter-arrival time [µsec], y-axis: density. (a) RTS mechanism deactivated. (b) RTS mechanism activated, RTS threshold set to 2000 bytes.]
Figure 5. Example inter-arrival histogram for the same device with different RTS settings.

B. Transmission rates

The transmission rates impact the time needed to transfer a frame. With inter-arrival times, we measure the frames' end-of-reception times; the effect of varying transmission rates is thus directly observable in our inter-arrival histograms.

Our evaluation of the different network parameters (Section V) showed that transmission rates alone are not discriminative enough to measure device similarity and perform even more poorly when identifying a device in a unique manner. The inter-arrival time between two frames depends on the transmission rate of the frame. Thus, it is possible that the transmission rate has a negative impact on the performance of the inter-arrival time based fingerprints.

Transmission rates can be quite discriminative in a controlled environment. Indeed, Gopinath et al. [11] highlight that the distribution of data transfer rates of wireless cards depends on the card's vendor. Similarly, [10] shows that the rate switching behavior might be used to characterize a wireless access point. The latter two papers conducted their experiments in a very controlled environment.

[Figure 6: four panels. (a) Device 1 signature and (b) Device 2 signature; x-axis: inter-arrival time [µsec], y-axis: density. (c) Device 1 transmission rate distribution and (d) Device 2 transmission rate distribution; x-axis: rate [Mbps] (1, 2, 5.5, 11, 12, 18, 24, 36, 48, 54), y-axis: density.]
Figure 6. Example signatures and transmission rate distributions of two different wireless devices using different transmission rates.

We can illustrate the above behavior using the same Faraday cage experiment done for the random backoff timers. This time we include all frames sent at various transmission rates in our measurements. Figure 6 shows the resulting inter-arrival histograms and the distribution of used transmission rates. We see that the second device changes its transmission rate more frequently. This yields a completely different histogram.
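To make the dependence of inter-arrival times on transmission rate and backoff concrete, the following sketch estimates the airtime of one frame and adds the idle period that precedes it. It is an illustration under stated assumptions, not a model used in the paper: it assumes 802.11g ERP-OFDM timing (20 µsec preamble and SIGNAL field, 4 µsec symbols, 6 µsec signal extension, 28 µsec DIFS, 9 µsec slots), and it ignores ACK frames, RTS/CTS exchanges and retransmissions. The function names are hypothetical.

```python
import math

# Data bits carried per OFDM symbol for each 802.11a/g rate (Mbps -> bits).
N_DBPS = {6: 24, 9: 36, 12: 48, 18: 72, 24: 96, 36: 144, 48: 192, 54: 216}


def ofdm_tx_time_us(frame_bytes, rate_mbps, signal_extension_us=6):
    """Airtime of one OFDM frame in microseconds (assumed ERP-OFDM timing)."""
    # 16 SERVICE bits + payload bits + 6 tail bits, rounded up to whole symbols.
    symbols = math.ceil((16 + 8 * frame_bytes + 6) / N_DBPS[rate_mbps])
    return 20 + 4 * symbols + signal_extension_us


def expected_inter_arrival_us(frame_bytes, rate_mbps, backoff_slots,
                              difs_us=28, slot_us=9):
    """Idle period (DIFS plus backoff) followed by the transmission period."""
    idle = difs_us + backoff_slots * slot_us
    return idle + ofdm_tx_time_us(frame_bytes, rate_mbps)


for rate in (6, 24, 54):
    print(rate, "Mbps:", ofdm_tx_time_us(1500, rate), "usec airtime")
for slots in (0, 15):
    print(slots, "slots:", expected_inter_arrival_us(1500, 54, slots), "usec")
```

Under these assumptions, a device that changes its rate often spreads its inter-arrival times over a much wider range than the 16 backoff slots alone would produce.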
[Figure 7: two histogram panels; x-axis: inter-arrival time [µsec], y-axis: density. (a) Netbook instance 1. (b) Netbook instance 2.]
Figure 7. Example histogram based solely on data broadcast frames for two different devices with same model and same OS.

[Figure 8: two histogram panels; x-axis: inter-arrival time [µsec], y-axis: density.]
Figure 8. Example histograms based solely on "Data null function" frames for two different wireless cards.
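The inter-arrival histograms shown in this section can be approximated with the short sketch below. It is a simplified illustration, not the authors' implementation: the bin width of 10 µsec, the 2500 µsec cut-off and the function names are assumptions, and the input is taken to be the receive timestamps of the frames already selected for one device. Only the minimum of 50 observations comes from Section V-C.

```python
from collections import Counter

MIN_OBSERVATIONS = 50  # minimum number of observations stated in Section V-C


def inter_arrival_times(timestamps_us):
    """Differences between consecutive receive timestamps (microseconds)."""
    ts = sorted(timestamps_us)
    return [b - a for a, b in zip(ts, ts[1:])]


def histogram_signature(values_us, bin_width_us=10, max_us=2500):
    """Relative-frequency histogram, or None if there are too few observations.

    bin_width_us and max_us are illustrative choices, not values from the paper.
    """
    kept = [v for v in values_us if 0 <= v < max_us]
    if len(kept) < MIN_OBSERVATIONS:
        return None
    counts = Counter(int(v // bin_width_us) for v in kept)
    total = len(kept)
    # Map the lower edge of each bin to the share of observations in that bin.
    return {b * bin_width_us: c / total for b, c in sorted(counts.items())}


# Synthetic example: 60 frames spaced exactly 300 usec apart.
signature = histogram_signature(inter_arrival_times([i * 300 for i in range(60)]))
print(signature)
```

Restricting the input to one frame type, as in Figures 7 and 8, would amount to filtering the timestamps before calling `inter_arrival_times`.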