
An Empirical Study of Passive 802.11 Device Fingerprinting

Christoph Neumann, Olivier Heen, Stéphane Onno


Technicolor Security and Content Protection Labs, Rennes, France
Email: {christoph.neumann, olivier.heen, stephane.onno}@technicolor.com

arXiv:1404.6457v1 [cs.CR] 25 Apr 2014

Abstract—802.11 device fingerprinting is the action of characterizing a target device through its wireless traffic. This results in a signature that may be used for identification, network monitoring or intrusion detection. The fingerprinting method can be active, by sending traffic to the target device, or passive, by just observing the traffic sent by the target device. Many passive fingerprinting methods rely on the observation of one particular network feature, such as the rate switching behavior or the transmission pattern of probe requests. In this work, we evaluate a set of global wireless network parameters with respect to their ability to identify 802.11 devices. We restrict ourselves to parameters that can be observed passively using a standard wireless card. We evaluate these parameters for two different tests: i) the identification test, which returns one single result being the closest match for the target device, and ii) the similarity test, which returns a set of devices that are close to the target device. We find that the network parameters transmission time and frame inter-arrival time perform best in comparison to the other network parameters considered. Finally, we focus on inter-arrival times, the most promising parameter for device identification, and show its dependency on several device characteristics such as the wireless card and driver, but also on the running applications.

I. INTRODUCTION

Device fingerprinting is the action of gathering device information in order to characterize it. This process generates a signature, also called a fingerprint, that describes the observed features of a device in a compact form. If the generated signature is distinctive enough, it may be used to identify a device.

One application of 802.11 device fingerprinting is intrusion detection, and more precisely the prevention of Medium Access Control (MAC) address spoofing. MAC address spoofing is the action of taking the MAC address of another machine in order to benefit from its authorization. The prevention of MAC address spoofing is of importance in various scenarios. Open wireless networks such as hot-spots often implement MAC address based access control in order to guarantee that only legitimate client stations connect, e.g. the devices that purchased Internet access on an airport hot-spot. An attacker can steal a legitimate device's session by spoofing the latter's MAC address.

Another application of fingerprinting is the detection of rogue access points (APs). Tools like AirSnarf or RawFakeAP enable an attacker to set up a rogue AP, which results in client stations connecting to the fake AP instead of the genuine one.

Fingerprinting is also useful in key-protected wireless networks (e.g. WPA2). It can be used after a key-based authentication mechanism in order to control whether only authorized devices are in the network. Indeed, wireless keys may leak, as there are several normal situations where home users voluntarily give out their wireless key without renewing it afterwards, for instance when allowing a guest's laptop to access the home network. While this scenario is both common and simple, it also endangers the home network; the guest may abusively reconnect or the key may eventually leak from his laptop. Tools like aircrack-ng exist that allow non-professional hackers to crack the (known to be insecure) WEP protocol. Finally, commercial services, like wpacracker.com, exist which try to recover WPA keys.

Contributions: We investigate new passive fingerprinting candidates for 802.11 devices. More precisely, we measure five network parameters that can be captured with a standard wireless card. Using a generic method to calculate a signature of a device, we compare the ability of each network parameter to characterize 802.11 devices. We perform the similarity and the identification tests, which we evaluate using our own measurement data as well as public data obtained from large conference settings. We use detection window sizes of 5 minutes. We find that the network parameters transmission time and frame inter-arrival time perform best in comparison to the other network parameters considered. Our evaluation renders a probability of correct classification that ranges from 79.4% to 95.0% for the transmission time and from 62.7% to 93.1% for frame inter-arrival times. In the most difficult testing conditions, the wireless traffic of a conference, the inter-arrival time renders the best identification ratios: up to 56.6% of the devices could be uniquely identified with a false positive rate of 0.1. Finally, we focus on inter-arrival times, the most promising parameter for device identification, and show its dependency on several device characteristics such as the device's wireless card and driver or its running applications.

The remainder of this paper is organized as follows. We present related work in Section II. Section III presents the different network parameters considered, and Section IV presents the proposed fingerprinting method. Section V evaluates our method against several traces. We then present in Section VI the 802.11 features the generated signature depends on, and discuss possible attacks and applications in Section VII. Finally, we conclude with Section VIII.
II. RELATED WORK

A first class of related work fingerprints wireless client stations by analyzing implementation specificities of the network card and/or driver. Franklin et al. [9] characterize the drivers during the "active scanning period". This process is underspecified in the IEEE 802.11 standard regarding the frequency and order of sending probe requests; therefore, each manufacturer implements its own algorithm and timers. Gopinath et al. [11] show that 802.11 cards exhibit very heterogeneous behavior due to implementation specificities. They test a set of 802.11 features such as random backoff timers or virtual carrier sensing and present their experimental results. The observed heterogeneity in behavior may be used to fingerprint a card's vendor and model, but their work does not further analyze this aspect.

Bratus et al. [6] propose an active method to fingerprint client stations as well as APs. They send malformed or non-standard stimulus frames to the fingerprintee and apply decision trees on the observed response or behavior. This yields a signature of the vendor/manufacturer. Because it is active, an attacker can easily detect this technique.

Cache [7] proposes two methods for fingerprinting a device's network card and driver. The first one is active and uses the 802.11 association redirection mechanism. Even though this mechanism is well specified in the IEEE 802.11 standard, it is very loosely implemented in the tested wireless cards. As a consequence, each wireless card behaves differently during this phase, which allows characterizing them. The second fingerprinting method of Cache is passive. It analyses the duration field values in 802.11 data and management frames. Each wireless card computes the duration field in a slightly different way, which allows characterizing the network card.

Common to all above approaches is that they cannot differentiate between two devices using the same network card and driver. Therefore, those approaches may not be used for identifying individual devices.

Another class of related work allows fingerprinting individual APs. Jana et al. [12] calculate the clock skews of APs in order to identify them. The authors calculate clock skews by using the timestamps contained in Beacon frames emitted by the AP. Arackaparambil et al. [3] refine the above work and propose a new method yielding more precise clock measures. They also successfully spoof an AP, making it indistinguishable by the methods used by Jana et al.

Loh et al. [14] fingerprint client stations by observing probe requests. Stations send probe requests according to characteristic periodic patterns (see [9]). The period itself is subject to slight variations. Far from being uniform, these variations can be clustered. With enough observation time, each cluster slowly drifts, with a slope proportional to the time skew. This work is capable of uniquely identifying client stations; however, the method requires more than one hour of traffic and is only applicable to client stations.

In contrast to the above papers, Pang et al. [15] discuss privacy implications in 802.11. Their paper highlights that users are not anonymous when using 802.11, as the protocol uses globally unique identifiers (the MAC address), which allow user tracking. Even if we suppose that this identifier is masked (e.g. by temporarily changing addresses), it is possible to track users by observing a set of parameters in the 802.11 protocol. The observed parameters are network destinations, network names advertised in 802.11 probes, 802.11 configuration options and broadcast frames' sizes. With encrypted traffic, three out of the four parameters still apply. The presented identification problem is close to our identification test. To answer this test successfully, their fingerprinting technique requires traffic samples for each user that last at least one hour.

III. 802.11 NETWORK PARAMETERS

This section describes the network parameters we consider for fingerprinting. We focus on network parameters that we can easily extract using a standard wireless card; we do not require the usage of expensive equipment such as software defined radios. Thus, the routine monitoring setup consists in a monitoring device that captures all 802.11 frames using a standard wireless card in monitoring mode on a specified 802.11 channel.

We seek a fingerprinting method applicable to encrypted 802.11 traffic, making the fingerprinting method more universal and enabling the fingerprinting of devices from networks the monitoring device is not part of.

The fingerprinting method should not perturb the network and should be hardly detectable by an attacker. As a result, our fingerprinting method is passive, i.e. the monitoring device does not generate any additional traffic.

We also require that the method relies on global network parameters, thus representing the traffic generated by a sender in general, rather than focusing on specific frames or features. In particular, it should be difficult for an adversary to deactivate or forge the considered network parameters. These considerations eliminate the option of extracting information from the 802.11 headers generated by the emitting station: the sender fills the fields in these headers, and the headers can thus be spoofed (e.g. using tools such as Scapy).

Finally, the fingerprinting method should be accurate, a property that we evaluate in the rest of this paper.

In light of the preceding requirements, we focus on information that we can extract solely from Radiotap [1] or Prism headers. These headers are generated by the receiving wireless card driver; an adversary that would like to change fields in these headers needs to actually change its behavior.

We consider the following network parameters, all being candidates for a fingerprinting method with the aforementioned requirements:
• Transmission rate: The 802.11 standard [2] allows transmitting frames using a set of predefined rates. Each sending wireless card chooses to transmit a given frame at a given rate. Gopinath et al. [11] highlight that the transmission rate distribution of a wireless card depends on the card's vendor.
• Frame size: The size of an 802.11 frame depends on the type of the frame, the fragmentation threshold, the version of IP (IPv4 addresses use 32 bits while IPv6 addresses use 128 bits; IP addresses are transported in 802.11 frames, thus changing the size of the frame) or the applications generating the traffic. Pang et al. [15] use broadcast frame sizes as an implicit identifier for wireless devices.
• Medium access time: The time a wireless device waits after the medium has become idle and before sending its own frame. Gopinath et al. [11] already noted that some device manufacturers implement the IEEE 802.11 standard very loosely with respect to the random backoff algorithm, which is one of the medium access mechanisms of 802.11.
• Transmission time: The transmission time is the time required to transmit a frame, thus the time between the start of reception and the end of reception of a frame.
• Frame inter-arrival time: The frame inter-arrival time is the time interval between the ends of reception of two consecutive frames.

IV. METHODOLOGY

This section explains how we extract and evaluate signatures from the five network parameters considered in the previous section.

A. Signature construction

Signature calculation consists in generating several histograms, one histogram per frame type (e.g. Data frames, Probe Requests, ...). A histogram represents the frequencies of the values measured. Each histogram is weighted, which gives more or less importance to certain types of frames. We define the signature of a device as the set of histograms generated by the device and their weights.

We lose part of the information contained in a network trace by choosing histograms during signature calculation. Histograms may for instance eliminate characteristic patterns or periodic behaviors. Signal processing methods such as n-dimensional histograms, correlation functions or frequency analysis using Fourier or Wavelet transformations may capture these behaviors. However, the subject of this work is not to find the most adequate signal processing method, but rather to highlight that some high level network parameters can achieve good detection ratios, even with a simple signature calculation method. We suspect that the network parameters that yield good performance with histograms will also yield good results with more advanced signal processing methods. Section VI provides an intuition on this latter assertion.

Figure 1. Measurement method example. [Timelines of client stations A to D and of the monitoring device: the monitoring device receives the frame sequence DATA, ACK, DATA, ACK, RTS, CTS at times t_0 ... t_5; the intervals t_2 - t_1 and t_4 - t_3 are retained, while frames without a known sender are dropped.]

We now describe the signature generation process more formally. The sequence f_0, ..., f_{n-1} of frames represents the network trace captured by the monitoring device. t_i denotes the time of end of reception of the frame f_i (where 0 <= i <= n-1), and frames are ordered by increasing reception time (i.e. for all i: t_{i-1} < t_i). The sender s_i sends the frame f_i. For frames like ACK or clear-to-send (CTS) frames [2] the sender is unknown (ACK and CTS frames do not include a transmitter address field), thus s_i = null.

We calculate or extract the network parameter p_i from the Radiotap or Prism header for each frame f_i. Depending on the network parameter considered, p_i may have different meanings. Radiotap or Prism headers include the size size_i, the transmission rate rate_i and the end of reception or the start of reception t_i of a frame f_i. If the considered network parameter is the transmission rate, we have p_i = rate_i. Similarly, p_i = size_i if we consider frame sizes. We can also calculate the inter-arrival time i_i = t_i - t_{i-1}, the transmission time tt_i = size_i / rate_i and the medium access time mtime_i = i_i - tt_i (the idle period between the end of frame f_{i-1} and the start of frame f_i).

We add the measured or calculated parameter to the set P^ftype(s_i). P^ftype(s) denotes the set of values measured or calculated for frames of type ftype for the sending device s. We denote |P^ftype(s)| the number of observations for frames of type ftype for device s.
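As an illustration, the following Python sketch computes these per-frame parameters. It assumes that the capture front end has already parsed the Radiotap or Prism header of each frame into a simple record (time of end of reception, size, rate, transmitter address, frame type); the record format and function name are illustrative, not part of our actual tool.

from collections import defaultdict

def per_frame_parameters(frames):
    """Compute the five per-frame parameters of Section III for every frame
    with a known sender. `frames` is a list of dicts ordered by end of
    reception, e.g. {"t": 1.002345, "size": 1500, "rate": 54.0,
    "sender": "aa:bb:cc:dd:ee:ff" or None, "ftype": "DATA"}.
    Returns a dict: (sender, ftype) -> list of parameter dicts."""
    observations = defaultdict(list)
    for prev, cur in zip(frames, frames[1:]):
        if cur["sender"] is None:        # ACK/CTS: no transmitter address, drop
            continue
        inter_arrival = cur["t"] - prev["t"]            # i_i = t_i - t_{i-1}
        transmission = cur["size"] / cur["rate"]        # tt_i = size_i / rate_i
        medium_access = inter_arrival - transmission    # mtime_i = i_i - tt_i
        observations[(cur["sender"], cur["ftype"])].append({
            "rate": cur["rate"],
            "size": cur["size"],
            "inter_arrival": inter_arrival,
            "transmission": transmission,
            "medium_access": medium_access,
        })
    return observations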
Figure 1 illustrates our method. Client stations A, B, C and D use the same channel and send the frames as depicted. The monitoring device listens on the same channel and receives all frames of the emitting client stations. Thus, the sequence of frames f_0, ..., f_5 corresponds to the frame sequence DATA, ACK, DATA, ACK, RTS, CTS. The first ACK frame f_1 has no explicit sender, s_1 = null; thus, we drop the associated value p_1. Similarly, we drop the frames f_3 and f_5. If we use the transmission rate as a parameter, we associate the value rate_2 to client station A, as frame f_2 is sent by station A, and we associate rate_4 to client station C. Thus, P^DATA(A) = {rate_2} and P^RTS(C) = {rate_4}. Similarly, if we use inter-arrival times as a parameter, we associate the interval i_2 = t_2 - t_1 to client station A and the interval i_4 = t_4 - t_3 to client station C. Thus, P^DATA(A) = {t_2 - t_1} and P^RTS(C) = {t_4 - t_3}.

Based on the above measurements we generate a histogram for each frame type and each emitting client station. It is composed of bins b_0, ..., b_{k-1}. We denote o_j^ftype (where 0 <= j <= k-1) the number of observations in bin b_j. We convert the histogram into a percentage frequency distribution, where the percentage frequency of bin b_j is P_j^ftype = o_j^ftype / |P^ftype(s)|. The resulting histogram for a given frame type is hist^ftype(s) = {P_j^ftype | 0 <= j <= k-1}. Figure 2 shows an example of a resulting histogram using inter-arrival times.

Figure 2. Example inter-arrival time histogram. [Density of inter-arrival times between 0 and 2500 microseconds.]

Finally, we define the signature Sig of device s as follows.

Definition 1 (Device signature):
Sig(s) = {(weight^ftype(s), hist^ftype(s)) | for all ftype}

The variable weight^ftype weights the importance of a histogram for a given frame type. For reference signatures we choose
weight^ftype(s) = |P^ftype(s)| / Sum_ftype |P^ftype(s)|,
thus the weight given to each frame type follows the distribution of frame types.
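A small sketch of this signature construction, under the same assumptions as the previous listing (pre-parsed frame records); the fixed-width binning and the default bin width and bin count are illustrative choices not fixed by the text.

def build_signature(observations_for_device, parameter="inter_arrival",
                    bin_width=50e-6, k=64):
    """Build Sig(s) as in Definition 1: one percentage-frequency histogram per
    frame type plus its weight. `observations_for_device` maps a frame type to
    the list of parameter dicts produced by per_frame_parameters() for one
    sender."""
    total = sum(len(obs) for obs in observations_for_device.values())
    signature = {}
    for ftype, obs in observations_for_device.items():
        counts = [0] * k
        for o in obs:
            j = min(int(o[parameter] / bin_width), k - 1)   # clamp overflow into the last bin
            counts[j] += 1
        hist = [c / len(obs) for c in counts]               # P_j = o_j / |P^ftype(s)|
        weight = len(obs) / total                           # weight^ftype(s)
        signature[ftype] = (weight, hist)
    return signature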
B. Detection methodology and accuracy metrics

In general, fingerprinting methods have two phases: a learning phase, which populates a reference database, and a detection phase, which matches wireless devices against the reference database.

The reference database is built using a training dataset, which is in our case a wireless trace. The reference database stores the signature Sig(r_i) of each wireless device r_i, i.e. of each source address, appearing in the trace. We suppose that no attacker polluted the training data; the validity of this hypothesis is discussed in Section VII.

In the detection phase, we analyze a second wireless trace, which we call the validation dataset, and extract the signature Sig(c_i) of every device appearing in this trace. We call the devices of the validation dataset candidate devices. We compare each candidate device's signature with all the signatures of the reference database using the algorithm described in Section IV-C. This yields a vector of similarities <sim_1, sim_2, ..., sim_N>, sim_i corresponding to the similarity of the unknown candidate device's signature compared to the reference signature of device r_i.

For each candidate device, we are interested in resolving the following two tests:
• Similarity: The fingerprint algorithm returns the set of reference devices whose signature similarity sim_i is greater than a threshold T. Thus we are interested in knowing which reference devices are similar to the candidate one.
• Identification: The second and more difficult test is interested in actually and uniquely identifying the candidate device. To do so, we pick the reference device with the greatest similarity from the vector of similarities returned by the previous test.

Regarding the similarity test, we are interested in the True Positive Rate (TPR) and the False Positive Rate (FPR). The TPR is the fraction of candidate wireless devices known to the reference database for which the returned set contains the actual device. The FPR is the fraction of returned reference devices that do not match the actual candidate device. In Section V, we calculate the FPR and TPR as a function of the threshold T. We plot a similarity curve, which draws the TPR as a function of the FPR (see Section V). We do not apply classical receiver operating characteristic (ROC) curves in this context, since we handle multiple classes (one class per reference device); in particular, with the similarity curve it is possible to have results in the lower right triangle of the plot. Similarly to ROC curves, we also calculate the Area Under the Curve (AUC), which measures the global probability of correct classification.

We express the accuracy metric for the identification test as an identification ratio. The identification ratio is the fraction of candidate wireless devices known to the reference database that the fingerprinting method correctly identifies. As with the similarity test, a false positive rate can be calculated for the identification test: the FPR is then the fraction of candidate wireless devices that the fingerprinting method mistakenly identifies as another device.

C. Matching algorithm and similarity measures

The algorithm below depicts how to match the signature Sig(c) of a candidate c against the reference database. The algorithm returns a vector of similarities <sim_1, sim_2, ..., sim_N>, sim_i corresponding to the similarity of the unknown candidate device's signature with the reference signature of device r_i. We use the Cosine-similarity, as defined below, to calculate the similarity between two histograms. We weight the resulting score with the frame type weight weight^ftype(r_i) of the reference signature.

Algorithm 1 Match: match the signature Sig(c) of candidate c against the reference database
  for all ftype in Sig(c) do
    extract hist^ftype(c) from Sig(c)
    for all references r_i in the reference database do
      extract hist^ftype(r_i) from Sig(r_i)
      sim_i^ftype = sim_Cos(hist^ftype(c), hist^ftype(r_i))
      sim_i = sim_i + weight^ftype(r_i) * sim_i^ftype
    end for
  end for
  return <sim_1, sim_2, ..., sim_N>

Let hist(r) = {P_{r,j} | for all j} be a reference histogram for device r and hist(c) = {P_{c,j} | for all j} a candidate histogram for device c.

Definition 2 (Cosine-similarity):
sim_Cos(hist(c), hist(r)) = ( Sum_{j=0}^{k-1} P_{c,j} P_{r,j} ) / ( sqrt(Sum_{j=0}^{k-1} P_{r,j}^2) * sqrt(Sum_{j=0}^{k-1} P_{c,j}^2) )

The Cosine-similarity is based on the Cosine-distance [8]. The similarity equals 1 if two signatures are exactly the same. It yields 0 when signatures have no intersection.
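A compact Python rendering of Algorithm 1 and Definition 2, written against the signature dictionaries produced by the sketch in Section IV-A; this is an illustrative transcription, not the exact code of our tool.

import math

def cosine_similarity(hist_c, hist_r):
    """Definition 2: cosine similarity of two percentage-frequency histograms."""
    dot = sum(pc * pr for pc, pr in zip(hist_c, hist_r))
    norm_c = math.sqrt(sum(pc * pc for pc in hist_c))
    norm_r = math.sqrt(sum(pr * pr for pr in hist_r))
    if norm_c == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_c * norm_r)

def match(sig_c, reference_db):
    """Algorithm 1: match the candidate signature sig_c against every reference
    signature. reference_db maps a device identifier to its signature
    {ftype: (weight, hist)}. Returns {device: sim_i}."""
    similarities = {device: 0.0 for device in reference_db}
    for ftype, (_, hist_c) in sig_c.items():
        for device, sig_r in reference_db.items():
            if ftype not in sig_r:
                continue                      # frame type unseen for this reference device
            weight_r, hist_r = sig_r[ftype]
            similarities[device] += weight_r * cosine_similarity(hist_c, hist_r)
    return similarities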
V. EVALUATION AND IMPLEMENTATION

A. Wireless traces

We evaluate our methodology against four different wireless traces. We use a publicly available 7 hour trace collected on August 19th 2008 at 11am on one monitoring device during the 2008 Sigcomm conference [16]. We consider two subsets of the Sigcomm trace: i) the entire 7 hour trace, which we call conference 1, and ii) the first hour of that trace, which we call conference 2. We also generated two wireless traces ourselves. We generated the first one, called office 1, by capturing all wireless traffic on channel 6 during 7 hours in our office. We recorded the second one, called office 2, during one hour another day in the same setting. The conference traces are not encrypted (i.e. no WEP or WPA). The office traces are encrypted (WPA).

We evaluate our fingerprinting method by splitting each of the wireless traces described above into two sets: i) a training dataset and ii) a validation dataset. For the conference 1 and office 1 traces, the training dataset corresponds to the first hour of the 7 hour trace; the 6 remaining hours compose the validation dataset. For the two one hour traces, conference 2 and office 2, we use the first 20 minutes as a reference trace and the remaining 40 minutes as a validation trace. We use a detection window size of 5 minutes for the validation dataset and match all candidate devices against the reference database for each detection window. Using a minimum number of 50 frames for generating the signatures (see Section V-C), we obtain the reference database sizes and numbers of candidates shown in Table I.

Table I. Evaluation traces features.
                    Conf. 1   Conf. 2   Office 1   Office 2
  Total duration    7 hours   1 hour    7 hours    1 hour
  Ref. duration     1 hour    20 min    1 hour     20 min
  Cand. duration    6 hours   40 min    6 hours    40 min
  Encryption        None      None      WPA        WPA
  # ref. devices    188       97        158        120

B. Evaluation

1) Similarity: We first discuss the results for the similarity test as defined in Section IV-B. We evaluate each of the network parameters using the classifier presented in Section IV-C across several thresholds T. As T decreases, the tolerated similarity between reference and candidate signatures decreases, and thus the FPR and TPR increase.

Figure 3 shows the similarity curves for the traces office 1, office 2, conference 1 and conference 2; each curve draws the TPR as a function of the FPR. Table II shows the Area Under the Curve (AUC) of these curves.

Figure 3. Similarity curves (TPR vs. FPR) for the office and conference traces, one panel per trace, comparing frame size, inter-arrival time, transmission rate, transmission time and medium access wait time. We do not apply classical receiver operating characteristic (ROC) curves in this context, since we handle multiple classes (one class per reference device).

Table II. Area Under the Curve (AUC) for the similarity test.
  Network parameter    Conf. 1   Conf. 2   Office 1   Office 2
  Transmission rate    4.0%      33.5%     83.7%      70.6%
  Frame size           53.4%     78.2%     85.7%      70.0%
  Medium access time   63.4%     61.5%     86.4%      68.8%
  Transmission time    80.7%     79.4%     95.0%      82.9%
  Inter-arrival time   62.7%     72.5%     93.7%      80.1%

We observe that the transmission time generally outperforms all other network parameters, independently of the considered network trace. If we consider the AUC, we can rank the network parameters in decreasing fingerprint accuracy as follows: transmission time, inter-arrival time, medium access time and transmission rate. The transmission time achieves an AUC between 79.4% and 95.0%. The inter-arrival time has similar results (with the exception of the conference 1 trace), with an AUC between 62.7% and 93.7%. The medium access time, the frame size and the transmission rate achieve an AUC between 61.5% and 86.4%, 53.4% and 86.7%, and 4.0% and 83.7% respectively.
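For reference, the following sketch shows how the similarity curve and its AUC can be obtained from the match() output by sweeping the threshold T. It is a straightforward interpretation of the metrics defined in Section IV-B, building on the match() sketch above; it is not our evaluation code, and the threshold grid is left to the caller.

def similarity_curve(candidates, reference_db, thresholds):
    """candidates: list of (true_identity, candidate_signature) pairs for
    devices known to the reference database. Returns one (FPR, TPR) point per
    threshold T."""
    points = []
    for T in thresholds:
        tp_devices = 0
        returned = 0
        false_returned = 0
        for identity, sig_c in candidates:
            sims = match(sig_c, reference_db)
            selected = {dev for dev, s in sims.items() if s > T}
            if identity in selected:
                tp_devices += 1                       # returned set contains the actual device
            returned += len(selected)
            false_returned += len(selected - {identity})
        tpr = tp_devices / len(candidates)
        fpr = false_returned / returned if returned else 0.0
        points.append((fpr, tpr))
    return points

def auc(points):
    """Trapezoidal area under the (FPR, TPR) curve."""
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))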
A notable exception to the above ranking is the behavior of the TPR for small FPRs in the conference setting. The two network parameters inter-arrival time and medium access time clearly outperform all other parameters. With an FPR of 0.01, they yield a TPR between 7.8% and 8.3% for the short conference trace and between 41% and 45% for the longer conference trace. The transmission time only yields a TPR of 0.2% for the short conference trace and of 13.6% for the longer conference trace (FPR = 0.01).

The conference setting is a more difficult setting for device fingerprinting than the office setting. The conference traces systematically yield lower AUCs and TPRs, even with comparable reference database sizes and numbers of candidates. In addition, the relative difference of performance between the different network parameters becomes more important. Particularly the transmission rate has a poor fingerprint accuracy, due to the changing wireless conditions: in a conference setting, devices often change location, which impacts the quality of the wireless signal and thus the detection ratio.

2) Identification: We now present the results for the identification test, shown in Table III.

Table III. Identification ratios.
  Network parameter, FPR      Conf. 1   Conf. 2   Office 1   Office 2
  Transmission rate, 0.01     0%        0.6%      7.0%       3.0%
  Transmission rate, 0.1      0%        7.5%      12.9%      7.0%
  Frame size, 0.01            0%        0.2%      18.4%      13.8%
  Frame size, 0.1             4.5%      2.5%      33.9%      20.4%
  Medium access time, 0.01    22.7%     6.8%      34.0%      18.4%
  Medium access time, 0.1     27.2%     28.1%     41.0%      21.1%
  Transmission time, 0.01     0%        0%        56.1%      43.4%
  Transmission time, 0.1      6.8%      5.8%      60.5%      50.5%
  Inter-arrival time, 0.01    15.9%     6.4%      48.0%      21.5%
  Inter-arrival time, 0.1     20.4%     32.2%     56.7%      27.5%

As with the similarity test, the transmission time outperforms the other network parameters in the office setting. With an FPR of 0.1, between 50.5% and 60.5% of the devices could be identified; with an FPR of 0.01, between 43.4% and 56.1% of the devices could be identified. The remaining ranking in decreasing order of TPR is the inter-arrival time, the medium access time and, far behind with very poor results, the frame size and the transmission rate.

In contrast to the office setting, the transmission time performs quite poorly in the conference traces. Instead, the inter-arrival time and the medium access time outperform all other metrics. Using inter-arrival times, between 6.4% and 15.9% and between 20.4% and 32.2% of the devices could be identified with an FPR of 0.01 and 0.1 respectively. Using medium access times, between 6.8% and 22.7% and between 27.1% and 28.1% of the devices could be identified with an FPR of 0.01 and 0.1 respectively.

Such identification ratios might appear small.
However, if we compare our results to the results obtained by Pang et al. [15], who analyzed a problem similar to our identification test, we achieve comparable results. For similar settings, Pang et al. are able to detect 12% to 52% of users with an FPR of 0.1 and 5% to 23% of users with an FPR of 0.01. In comparison, we could identify 27.1% to 32.2% and 6.8% to 22.7% of the devices with an FPR of 0.1 and 0.01 respectively.

In light of the different evaluation results, we consider only inter-arrival times in the rest of the paper. This network parameter always appears in the top 3 network parameters (which are the transmission time, the inter-arrival time and the medium access time). The inter-arrival time performs well in most settings; even in the difficult setting of the conference traces, it yields good identification ratios. In contrast, the transmission time performs well in most scenarios but poorly in the most difficult setting of a conference. Finally, the medium access time has a behavior similar to the inter-arrival time but slightly underperforms in "easy" settings such as the office traces.

C. Implementation

We have developed a tool in Python based on the pcap library. It analyses standard pcap files as well as live traffic and extracts the different network parameters as described in Section IV. The tool also implements the fingerprinting methodology, i.e. the calculation of the device signatures, the reference database, the similarity measures and the calculation of the accuracy metrics.

In our implementation we require that each training and candidate signature uses a minimum number of 50 observations. Table I shows the resulting reference database sizes. 50 observations correspond roughly to 50 transmitted frames for the observed device. The corresponding minimum observation time, i.e. the minimum time required to generate the signature, ranges from several seconds to several minutes; it depends on the number of frames per second transmitted by the observed device. For instance, for one of the devices of office 2 that did not generate much traffic, this corresponds to 30 seconds of traffic. We also evaluated the performance with smaller thresholds, but we came to the conclusion that a minimum of 50 observations is a good compromise between the minimum time required to generate a signature and the matching accuracy.

VI. FACTORS IMPACTING THE INTER-ARRIVAL HISTOGRAM SHAPE

Previous sections have shown that the frame inter-arrival time is the most promising network parameter under difficult monitoring conditions (typically in a conference setting) and for the more difficult test of unique device identification. This section discusses and demonstrates the different factors, at various levels of the observed device, that impact the frame inter-arrival time. It thus gives an intuition why this network parameter performs better than the other network parameters proposed in Section III. This section also shows that inter-arrival times depend on other network parameters such as the medium access time and the transmission rate, and thereby gives insights on the behavior of network devices regarding these parameters.

The inter-arrival time is composed of i) the transmission period and ii) the emitting client station's idle period. Both periods have an impact on the signature value. We discuss different wireless device behaviors that impact either one or the other period.

A. Wireless medium access methods

The 802.11 standard [2] specifies mechanisms to avoid collisions among multiple devices competing for the same wireless medium. The wireless card or driver implements these mechanisms, and their effect is essentially expressed in the medium access wait time. Our evaluation (Section V) shows that the medium access wait time performs well for both the identification and the similarity test.

1) Impact of random backoff: The random backoff avoids frame transmission right after the medium is sensed idle. Instead, all client stations that would like to transmit frames should ensure that the medium is idle for a specified period, called DIFS, plus an additional random time, called the backoff time, before sending data. Gopinath et al. [11] note that some device manufacturers implement the 802.11 standard very loosely in this respect. Berger et al. [5] also note differences, such as devices that systematically send frames during the first slot.
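Since the idle period is approximately DIFS plus an integer number of backoff slots, one way to compare backoff implementations is to quantize measured medium access times into slot indices. The sketch below does this; the default DIFS and slot durations are only illustrative assumptions (they correspond to the 802.11a OFDM PHY and depend in general on the PHY in use), and the helper is not part of our tool.

from collections import Counter

def backoff_slot_histogram(medium_access_times_us, difs_us=34.0, slot_us=9.0):
    """Quantize observed medium access times (in microseconds) into backoff
    slot indices, assuming idle time ~= DIFS + k * slot. Returns the fraction
    of observations falling into each slot index k."""
    slots = Counter()
    for m in medium_access_times_us:
        k = round((m - difs_us) / slot_us)
        if k >= 0:
            slots[k] += 1
    total = sum(slots.values()) or 1
    return {k: n / total for k, n in sorted(slots.items())}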
To evaluate the impact of these differences on the inter-arrival time histograms, we conduct the following experiment: we send a continuous UDP stream (using iperf) from one wireless device placed within a Faraday cage. The Faraday cage minimizes the impact of external factors on the random backoff procedure. In the second experiment, we just replace the sending device with a model from another manufacturer and send the same UDP stream again. We only analyze frames transmitted at 54 Mbps. Figure 4 shows the resulting histograms. We can notice that the first graph shows one small additional slot before the 16 slots defined by the standard. In addition, the distribution over the different slots is slightly different on the two devices. This indicates that the two devices implement the backoff mechanism differently.

Figure 4. Example inter-arrival histograms of two different wireless devices using different backoff implementations. Only data frames transmitted for the first time (no retries) and sent at 54 Mbps are shown. [Two density plots of inter-arrival times between 250 and 450 microseconds.]

2) Impact of virtual carrier sensing: Virtual carrier sensing enables a client station to reserve the medium for a given amount of time (the contention-free period). The client station sends a Request to Send (RTS) frame and specifies the expected data transmission duration in this frame. The destination device replies with a Clear To Send (CTS) frame. The client station can then transmit data frames during the reserved duration, and all other stations are supposed to mute during the time specified. The idle period between two frames sent during the contention-free period is fixed to SIFS. Wireless cards and drivers handle a so-called RTS threshold, which is a value between 0 and 2347 bytes. If the size of the data to be sent is greater than the RTS threshold, the virtual carrier sensing mechanism is triggered; otherwise, the data frame is sent using the random backoff mechanism. In some wireless card driver implementations, the RTS threshold can be changed manually; in others, this threshold is hard-coded into the driver. Some devices do not implement this mechanism at all and exclusively rely on the random backoff procedure.

To evaluate the impact of this mechanism on the inter-arrival time histograms, we conduct the following experiments: in a busy wireless network environment (our lab), a client station running under Linux sends a continuous UDP stream to a device connected by wire to an AP. We use iperf to generate the UDP stream. We conduct the experiment twice, using two distinct RTS settings on the same sending client station: a) virtual carrier sensing turned off and b) RTS threshold set to 2000 bytes. Figure 5 shows the resulting histograms. In Figure 5 a), all frames are sent after a random backoff. In Figure 5 b), only RTS frames are sent after a random backoff, while data frames are sent during the contention-free period.

Figure 5. Example inter-arrival histograms for the same device with different RTS settings: (a) RTS mechanism deactivated, (b) RTS mechanism activated with the RTS threshold set to 2000 bytes. [Density of inter-arrival times between 0 and 2000 microseconds.]

B. Transmission rates

The transmission rate impacts the time needed to transfer a frame. Since we measure inter-arrival times at the frames' end-of-reception times, the effect of varying transmission rates is directly observable in our inter-arrival histograms. Our evaluation of the different network parameters (Section V) showed that transmission rates alone are not discriminant enough to measure device similarity and perform even more poorly when identifying a device in a unique manner. The inter-arrival time between two frames depends on the transmission rate of the frame; thus, it is possible that the transmission rate has a negative impact on the performance of inter-arrival time based fingerprints.

Transmission rates can be quite discriminant in a controlled environment. Indeed, Gopinath et al. [11] highlight that the data transfer rate distribution of a wireless card depends on the card's vendor. Similarly, [10] shows that the rate switching behavior might be used to characterize a wireless access point. The latter two papers made their experiments in a very controlled environment.

We can illustrate the above behavior using the same Faraday cage experiment done for the random backoff timers. This time we include all frames, sent at various transmission rates, in our measurements. Figure 6 shows the resulting inter-arrival histograms and the distributions of the used transmission rates. We see that the second device changes its transmission rate more frequently, which yields a completely different histogram.

Figure 6. Example signatures and transmission rate distributions of two different wireless devices using different transmission rates: (a) Device 1 signature, (b) Device 2 signature, (c) Device 1 transmission rate distribution, (d) Device 2 transmission rate distribution over the rates 1 to 54 Mbps.
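The filtering used for the Figure 4 style analysis (data frames only, one fixed rate, no retransmissions) can be sketched as follows on the same frame-record representation as before. The 'retry' boolean is an assumption about the capture front end: it would be decoded from the retry bit of the 802.11 frame control field.

def isolate_backoff_samples(frames, rate_mbps=54.0):
    """Keep the inter-arrival times of first-transmission data frames sent at
    one fixed rate, so that rate changes and retransmissions do not blur the
    backoff-induced structure of the histogram."""
    kept = []
    for prev, cur in zip(frames, frames[1:]):
        if cur["ftype"] != "DATA" or cur["rate"] != rate_mbps or cur["retry"]:
            continue
        kept.append(cur["t"] - prev["t"])   # inter-arrival time of the kept frame
    return kept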
C. Impact of network services

Services and applications installed on a device influence the generated histograms. Applications generate the actual data traffic transferred over 802.11 and thus the device's traffic load. In addition, applications possibly generate very distinctive frames with specific frame sizes. This is typically the case with network services such as the Simple Service Discovery Protocol or Link-Local Multicast Name Resolution running on the operating system. Our evaluation of the different network parameters (Section V) shows that frame sizes yield acceptable results for the similarity test, but perform very poorly for the identification test.

Figure 7 shows the impact of network services on inter-arrival histograms. The figure only shows data broadcast frames. We generated the two histograms using two netbooks of the same manufacturer and the same model. Both netbooks ran the same operating system with the same updates, and both devices were active at the same time in the same wireless environment. Each device generates very distinctive peaks, even though both devices share the characteristics listed before. For instance, in Figure 7b the protocols IGMPv3 and Link-Local Multicast Name Resolution generate the peaks at approx. 950 microseconds and 1200 microseconds respectively. The device of Figure 7a has another set of services running, yielding a different histogram. Note that the latter figure also illustrates the results of Pang et al. [15], who use broadcast packet sizes as an implicit identifier.

Figure 7. Example histograms based solely on data broadcast frames for two different devices with the same model and the same OS: (a) Netbook instance 1, (b) Netbook instance 2. [Density of inter-arrival times between 0 and 2500 microseconds.]

D. Other factors

Each wireless card supports a different 802.11 feature set or exhibits a specific behavior regarding some 802.11 features. For instance, Gopinath et al. [11] highlight that devices implement the power save mode differently. We observe in our measurements that the power management feature generates additional traffic in the histogram. We can isolate this traffic by observing the "Data null function" frames, which in most cases implement the power management feature. Figure 8 shows two example histograms using two different wireless cards in the same wireless environment. As we can see, the frequency distribution of this type of frame depends on the wireless card used. Finally, several cards deactivate the power management feature under Linux; in this case, the traffic generated by this feature disappears. Other similar features might be mentioned, such as Probe Request frames, for which the literature already noted that driver specificities can be observed [9], or similarly Probe Responses in the case of APs.

Figure 8. Example histograms based solely on "Data null function" frames for two different wireless cards. [Density of inter-arrival times between 0 and 2500 microseconds.]

VII. ATTACKS AND APPLICATIONS

In this section we discuss possible attacks against our fingerprinting method. We also discuss the applicability of our fingerprinting method in various contexts. We suppose that the signatures rely on inter-arrival times.

A. Attacks

1) Forging a signature: An attacker may try to fake the signature of a genuine device. As many factors impact the signature (see Section VI), this can be a difficult task. A casual attacker needs the same wireless equipment and a very similar driver and software configuration as the genuine device. A more powerful attacker may record traffic of a genuine device and replay it, possibly live as in a relay attack. The sensitivity to drivers and 802.11 parameters complicates the attacker's task: the attacker must insert its own attacking traffic within the replayed traffic without modifying the signature, which restricts the nature and quantity of the attacking traffic.

Another way of forging the signature is by learning and then trying to mimic the signature of a genuine device. The attacker may send traffic at a constant transmission rate and vary the frame sizes for each frame type to reproduce the distribution of the histogram. Some frame types, such as RTS or Probe Request frames, depend on the driver or chipset and thus require the attacker to change these features, which are more difficult to forge than application level data.
2) Attacking the learning stage: It is important that the learning stage is not polluted by the attacker. While this can never be guaranteed in real life situations, acceptable security is obtained if the learning stage starts at a moment chosen by the user or the administrator and not by the attacker. Suppose that an AP would like to build a reference database. The learning stage might be initiated upon a user command (like pressing a button). Then the AP learns the signatures of the allowed client stations during e.g. two minutes, which is sufficient with our fingerprinting method.

3) Preventing fingerprinting: 802.11 communications are highly sensitive to denial of service attacks. An attacker that is capable of performing a denial of service attack is also capable of preventing any fingerprinting activity. A more subtle attacker may complicate the fingerprinting activity without blocking all the traffic. Classically this consists in injecting fake frames using the MAC addresses of the genuine fingerprintees. To the best of our knowledge, all wireless fingerprinting methods can be degraded by this attack. Our fingerprinting method is no exception.

B. Applications

1) Detecting fake client stations: Access control based on the MAC address of a client station occurs in various contexts: enterprise networks, hot-spots and home networks. An attacker may want to spoof the MAC address of an authorized station in order to connect. Our fingerprinting method is applicable in such a context because forging an inter-arrival time signature is more difficult than just changing a MAC address. An AP that routinely fingerprints its client stations against a reference database of allowed client stations would end up noticing a non-matching signature. It can then warn the user or administrator, who will react accordingly.

2) Detecting rogue APs: Fingerprinting can be used for detecting impersonation of an AP: a hot-spot operator may record and publish the signatures of valid APs. The client station accessing the hot-spot routinely fingerprints the AP and warns the user about mismatches. In this case, the learning stage must be performed during a safe period: when receiving the AP from the vendor or during the installation of the hot-spot. Our method is applicable to APs if we ignore the data frames that the AP forwards in lieu of another device; otherwise, applicative data generated by other devices would pollute the AP's signature. This reduces the number of fingerprintable frames for an AP, but it is sufficient to generate significant signatures.

3) Localizing and tracking devices: Several authors propose fingerprinting as an approximate location mechanism [13], [4]. In [13], a mobile device fingerprints its wireless environment and takes security decisions accordingly, like asking for a password or not. Our method is applicable in this case, because it produces signatures for both client stations and APs, and because it requires only a few frames to generate signatures with a moving mobile device.

Finally, similar to [15], our work raises the question of privacy. Indeed, the generated signature may be used to trace a user's locations, even in cases where the device regularly changes its MAC address in order to stay anonymous.

VIII. CONCLUSIONS

We evaluated a set of global wireless network parameters with respect to their ability to identify 802.11 devices. To do so, we defined a passive fingerprinting method that can be implemented with standard equipment. We found that the network parameter frame inter-arrival time performs best in comparison to the other network parameters considered, in particular in the most difficult scenario of a conference setting. Using this network parameter we are able to accurately identify 802.11 client stations and access points in a reasonable amount of time.

Whilst we gave some intuition about the ability to attack our approach and explained to a certain extent which factors impact the shape of the histogram, future work should study these aspects in more detail. Especially the impact of applications and device updates on our fingerprinting method needs to be studied further. Finally, future work should also investigate whether the fingerprinting method can be improved by combining several network parameters.

REFERENCES

[1] Radiotap. http://www.radiotap.org/.
[2] Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. IEEE 802.11 Standard, 1999.
[3] C. Arackaparambil, S. Bratus, A. Shubina, and D. Kotz. On the Reliability of Wireless Fingerprinting using Clock Skews. In ACM WiSec, 2010.
[4] M. Azizyan, I. Constandache, and R. R. Choudhury. SurroundSense: Mobile Phone Localization via Ambience Fingerprinting. In ACM MobiCom, 2009.
[5] G. Berger-Sabbatel, Y. Grunenberger, M. Heusse, F. Rousseau, and A. Duda. Interarrival Histograms: A Method for Measuring Transmission Delays in 802.11 WLANs. Research report, LIG lab, Grenoble, France, 2007.
[6] S. Bratus, C. Cornelius, D. Kotz, and D. Peebles. Active Behavioral Fingerprinting of Wireless Devices. In ACM WiSec, 2008.
[7] J. Cache. Fingerprinting 802.11 Devices. Master's thesis, 2006.
[8] S.-H. Cha. Taxonomy of Nominal Type Histogram Distance Measures. In MATH, 2008.
[9] J. Franklin, D. McCoy, P. Tabriz, V. Neagoe, J. V. Randwyk, and D. Sicker. Passive Data Link Layer 802.11 Wireless Device Driver Fingerprinting. In USENIX Security, 2006.
[10] K. Gao, C. L. Corbett, and R. A. Beyah. A Passive Approach to Wireless Device Fingerprinting. In DSN, IEEE, 2010.
[11] K. Gopinath, P. Bhagwat, and K. Gopinath. An Empirical Analysis of Heterogeneity in IEEE 802.11 MAC Protocol Implementations and its Implications. In ACM WiNTECH, 2006.
[12] S. Jana and S. K. Kasera. On Fast and Accurate Detection of Unauthorized Wireless Access Points Using Clock Skews. In ACM MobiCom, 2008.
[13] N. Kasuya, T. Miyaki, and J. Rekimoto. Activity-based Authentication by Ambient Wi-Fi Fingerprint Sensing. In ACM MobiCom, 2009.
[14] D. C. C. Loh, C. Y. Cho, C. P. Tan, and R. S. Lee. Identifying Unique Devices through Wireless Fingerprinting. In ACM WiSec, 2008.
[15] J. Pang, B. Greenstein, R. Gummadi, S. Seshan, and D. Wetherall. 802.11 User Fingerprinting. In ACM MobiCom, 2007.
[16] A. Schulman, D. Levin, and N. Spring. CRAWDAD data set umd/sigcomm2008 (v. 2009-03-02). Downloaded from http://crawdad.cs.dartmouth.edu/umd/sigcomm2008, 2009.
