VTC v2

This document summarizes a study on LTE network deployment and usage based on crowd-sourced measurement data from multiple mobile operators and cities. The key findings are: 1) LTE is frequently used to improve network coverage rather than capacity, suggesting coverage is a higher priority than speed for deployment. 2) There is no evidence that video traffic is a primary driver for LTE deployment, contrary to common assumptions. 3) Factors like existing network infrastructure and commercial policies have a deeper impact on deployment decisions than purely technical considerations like user demand.

Uploaded by

dudi barash
What is LTE actually used for? An answer through multi-operator, crowd-sourced measurement
Francesco Malandrino Scott Kirkpatrick Danny Bickson
The Hebrew University of Jerusalem The Hebrew University of Jerusalem The Hebrew University of Jerusalem
Email: [email protected] Email: [email protected] Email: [email protected]

Abstract—LTE networks are commonplace nowadays; however, comparatively little is known about where (and why) they are deployed, and the demand they serve. We shed some light on these issues through large-scale, crowd-sourced measurement. Our data, collected by users of the WeFi app, spans multiple operators and multiple cities, allowing us to observe a wide variety of deployment patterns. Surprisingly, we find that LTE is frequently used to improve the coverage of the network rather than its capacity, and that there is no evidence that video traffic is a primary driver for its deployment. Our insights suggest that such factors as pre-existing networks and commercial policies have a deeper impact on deployment decisions than purely technical considerations.

I. INTRODUCTION

Cellular networks are remarkable among human-made entities in that they are both ubiquitous and inaccessible. Virtually every human being is covered by one or more cellular networks; yet, it is surprisingly difficult to ascertain how exactly such networks are planned, deployed, and utilized.

In most cases, such information is simply unavailable, being a closely guarded commercial secret. When some information is available, it only concerns the deployment, i.e., the location of cellular base stations [1], not the demand they serve. When demand information is available, only the aggregate traffic is known, not the applications it is made of [2]. When detailed traffic information is available, it typically includes only one mobile operator and a limited geographical scope [3].

The heart of the matter is that mobile network operators are not necessarily the best-suited observers to collect information on mobile networks. A powerful alternative is represented by crowd-sourced network measurement, where volunteers run an application on their mobile devices, and the logs generated by such applications are combined together. Crowd-sourced measurements have existed for years [4], [5]; however, only recent projects based on smartphones and apps have been able to attain both a high level of detail and a large scale.

There are three main aspects of crowd-sourced traces that cannot be matched by traces collected by mobile operators. First and most obviously, crowd-sourced traces naturally include information about multiple operators, collected at the same time and location and with the same methodology. Furthermore, they include detailed information about the traffic demand and the individual app generating it, which mobile operators cannot obtain or share. Finally, they reveal what people do when they are not using a cellular network, including detailed mobility (as opposed to coarse, cell-level mobility) and information about Wi-Fi traffic. On the negative side, they are only collected by users of a certain application: the bias of such a self-selected sample and its size can be problems.

Given that crowd-sourced traces contain more information than other ones, it is natural to wonder how such information can be used. One use of our traces that is especially relevant to researchers is checking (and correcting, if need be) the assumptions made in studies on next-generation network planning. The problem is hugely relevant and widely popular; however, most studies could benefit from a better understanding of how, and for which purpose, present-day networks are designed and deployed.

Our paper is organized as follows. After reviewing related work in Sec. II, in Sec. III we introduce our dataset, highlighting how it compares to similar traces existing in the literature. Sec. IV recaps the information we need to extract from our data, and summarizes the tools and methodologies we employ to that end. In Sec. V we summarize our main findings, comparing them with widespread assumptions and common (mis)conceptions. Sec. VI concludes our paper, describing our ongoing efforts to make at least some of our information available to the community.

II. RELATED WORK

Our results connect to three main categories of prior work: works presenting real-world mobile traces and datasets; works studying the deployment of cellular networks; and works doing the latter using the former.

Many real-world traces come from volunteers, e.g., the MIT Reality Project [4] and the Nokia Mobile Challenge [6]. These traces include a great deal of valuable information; however, their main shortcoming is the limited number of participants (in the case of the Nokia Mobile Challenge, around two hundred). This scale is adequate to study, for example, user mobility or encounter patterns; however, studying a whole cellular network requires information about many more users.

Mobile operators are typically reluctant to release demand and deployment information to the scientific community. An exception is represented by the Data For Development dataset by Orange [7], including mobility information for 50,000 users in Ivory Coast, as well as CDR (call-detail record) information for phone calls and SMS messages. This trace lacks data sessions, and is severely restricted by heavy anonymization: each ID encountered gets a new coded identity for each ego site to which it is a neighbor. This makes it impossible to trace out the entire social network.
Cellular network planning is a topic of great theoretical interest and practical relevance. Existing works go to great lengths to optimize the power control [8] and noise level [9] of cellular networks, accounting for both the density of the users [10] and their trajectories [11]. Because of the limited availability of real-world information about cellular networks, however, these works cannot account for the fact that deployment decisions are based on more factors than the users and their traffic demand.

Some works [1], [2] do study network planning using real-world information. In particular, the authors of [1] find that the coverage of different cellular networks is highly redundant and that significant gains could be obtained through consolidation. [2] studies the deployment of LTE accounting for the existing 3G networks, the locations of users, and their demand. The main limit of these works is that no distinction could be made between different types of users and demand.

III. OUR DATA

Our data comes from the users of an app called WeFi [12]. The WeFi app provides its users with information on the safest and fastest Wi-Fi access points available at the user’s location. At the same time (and with their consent), it collects information about the user’s location, connectivity and activity.

WeFi reports over seven million downloads of the app worldwide, and over three billion daily records. We use two datasets, relative to the American city of Boston and borough of Brooklyn. Their main features are summarized in Tab. I.

TABLE I
THE BROOKLYN AND BOSTON DATASETS.

                     Brooklyn     Boston
Time of collection   Nov. 2014    Nov. 2014
Total traffic [TB]   39           23.2
Number of records    59 million   69 million
Unique users         19315        15629
Estimated coverage   2%           4%
Unique cells         25971        20754

Each record contains the following information:
• day, hour (a coarse-grained timestamp);
• anonymized user identifier and GPS position;
• network operator, cell ID, cell technology and location area code (LAC) the user is connected to (if any);
• Wi-Fi network (SSID) and access point (BSSID) the user is connected to (if any);
• active app and amount of downloaded/uploaded data.

If the location of the user or the networks she is connected to change within a one-hour period, multiple records are generated. Similarly, one record is generated for each app that is active during the same period.

Similar to other crowd-sourced traces, our datasets cover multiple mobile operators and multiple technologies, including Wi-Fi. Since we have detailed information about the actual cell each user is connected to, and its technology (e.g., LTE, UMTS or CDMA), our dataset is especially useful to study the deployment of cellular networks of different generations. Finally, thanks to its coverage, we can observe a representative fraction of the network traffic, and virtually all network infrastructure.

IV. PROCESSING TOOLS AND STEPS

In this section, we describe the software tools we use to process our datasets (namely, the GraphLab library and the SFrame object) and the actual processing steps we perform.

A. Processing tools: SFrame

At tens of millions of rows each, our datasets definitely qualify as “big data”, impossible to load into the memory of any workstation and even most servers. However, this does not necessarily mean that we need a cluster of (virtual) machines to process our data; indeed, we can get our job done with a single computer and scalable computing tools.

Specifically, we resort to a Python library called GraphLab Create [13], offering a scalable data type called SFrame, whose interface is similar to R’s data.frame type or Pandas’ DataFrame. The relevant difference is that SFrames are stored on disk, with portions thereof loaded into memory only when needed, i.e., when performing some operations, thus preventing memory from limiting the amount of data being processed. Thanks to the scalability of SFrames, we are able to perform all our computations on a single computer, albeit a powerful one, with 64 cores and 128 GB of memory.
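To make the record layout of Sec. III and the out-of-core idea behind SFrames more concrete, here is a minimal sketch in plain Python. It deliberately does not use the GraphLab Create or SFrame API. The column names (`cell_id`, `technology`, `app`, `bytes_down`, `bytes_up`), the CSV layout and the `VIDEO_APPS` set are assumptions made for illustration only, and the video/non-video split anticipates the classification used later in the paper. Records are read one at a time, so memory use grows with the number of distinct cells rather than with the number of records.

```python
import csv
import io
from collections import defaultdict

# Assumed list of "video" apps; the paper names YouTube, Hulu and Netflix.
VIDEO_APPS = {"youtube", "hulu", "netflix"}


def is_video(app_name):
    """Classify a record as video or non-video from its active app."""
    return app_name.strip().lower() in VIDEO_APPS


def per_cell_traffic(rows):
    """Stream over records, summing bytes per (cell, technology, traffic type).

    `rows` is any iterable of dicts, e.g. a csv.DictReader over a file on
    disk, so only one record needs to be held in memory at a time.
    """
    totals = defaultdict(int)
    for row in rows:
        kind = "video" if is_video(row["app"]) else "non-video"
        volume = int(row["bytes_down"]) + int(row["bytes_up"])
        totals[(row["cell_id"], row["technology"], kind)] += volume
    return dict(totals)


# Tiny in-memory stand-in for a large trace file (hypothetical layout).
SAMPLE = """day,hour,user_id,cell_id,technology,app,bytes_down,bytes_up
1,10,u1,c1,LTE,YouTube,900,100
1,10,u2,c1,LTE,Mail,40,10
1,11,u1,c2,UMTS,Netflix,500,0
1,11,u3,c1,LTE,youtube,200,50
"""

totals = per_cell_traffic(csv.DictReader(io.StringIO(SAMPLE)))
for key in sorted(totals):
    print(key, totals[key])
```

With a real trace, `io.StringIO(SAMPLE)` would be replaced by an open file handle; the aggregation code stays the same.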

Fig. 1. Processing steps. [Block diagram: (1) dataset records; (2) cell information; (3) user information; (4) distance statistics.]

Fig. 2. Per-cell (a) and per-app (b) traffic in the Boston and Brooklyn datasets. [Panel (a): CDF of per-cell daily traffic [GByte], for LTE and 3G cells in Boston and Brooklyn. Panel (b): CDF of per-user daily traffic [MByte], for video and non-video traffic in Boston and Brooklyn.]

Fig. 3. Number of LTE and 3G cells deployed by each operator (a); distance between each user and the cell covering her in Boston (b) and Brooklyn (c). Solid lines refer to LTE, dashed lines to 3G. [Panel (a): number of cells per operator (AT&T, Sprint, T-Mobile, Verizon) in Boston and Brooklyn. Panels (b), (c): CDF of distance [km] per operator.]

GraphLab is a commercial product, offering a free academic license. Furthermore, all the SFrame functionality we use in this paper is available as an open-source project [14].

B. Processing steps: deployment and demand

There are two main aspects of the networks we are interested in studying: demand, i.e., the traffic users want to consume, and deployment, i.e., the base stations they use to this end. Our starting point is represented by the raw records in the datasets we described in Sec. III (block 1 in Fig. 1).

The information we need about each cell (block 2 in Fig. 1) is:
• its location and coverage area;
• its technology and served traffic.

The latter can be simply observed from the records, which include information about the amount of transferred data and cell technology. We observe that each cell identifier corresponds to exactly one technology; it is indeed common to have multiple cells with different technologies on the same tower, but in this case they have different identifiers. Fig. 2(a) shows the CDF of the amount of traffic served by cells; notice how LTE cells seem to be more loaded than 3G¹ ones, and cells in Brooklyn more than those in Boston.

¹ Throughout this paper, we will designate as “3G” both UMTS/HSDPA and CDMA technologies.

Reconstructing information about the location and coverage area of cells requires some additional care. From the records, we know the users’ positions when being covered by (i.e., registered with) and/or served by (i.e., exchanging data with) a given cell. All these locations belong to the coverage area of the cell. We go a step further, and assume that the convex hull of all such locations corresponds to the cell’s coverage area, and that the base station itself is located at the barycenter of the hull. Both assumptions imply some loss of precision; however, they allow us to classify the cells according to the area they cover, and to study the distance between the base stations and the users they serve, as we will see in Sec. V.

The main aspect of user traffic (block 3 in Fig. 1) we are interested in is its type. In particular, because video is widely expected to make up most of the demand of future networks, and is indeed considered one of the main reasons why we need future networks at all, we divide the user demand into “video” and “non-video”. We obtain this information from the active app of every record: apps such as YouTube, Hulu and Netflix all belong to the first category. Fig. 2(b) shows that, indeed, video apps require much more data than others.

At first sight, Fig. 2 seems to confirm our intuitive expectations. Video applications require large amounts of data, and new, high-speed LTE base stations carry most of it. Several questions nonetheless arise concerning the matching between the traffic demand shown in Fig. 2(b) and the deployment summarized in Fig. 2(a). Are LTE networks deployed to improve the ability of cellular networks to serve video traffic? Are they better suited to video traffic than to other types of data? Do different mobile operators have different strategies in this respect?

To answer these questions we need to move to block 4 in Fig. 1, and to correlate the user demand (traffic type and location) with the cells serving it. This analysis, which is seldom performed in the literature because of the amount and variety of input data it needs, allows us to draw some unexpected conclusions, summarized next.

Fig. 4. Brooklyn: deployment of 3G (blue dots) and LTE (red dots) for AT&T (a), Sprint (b), T-Mobile (c), Verizon (d). The size of dots is proportional to the coverage area of each cell.

Fig. 5. Boston: deployment of 3G (blue dots) and LTE (red dots) for AT&T (a), Sprint (b), T-Mobile (c), Verizon (d). The size of dots is proportional to the coverage area of each cell.

V. THE ROLE OF LTE DEPLOYMENTS

Fig. 3(a) summarizes the number of 3G and LTE cells that each operator has in Boston and Brooklyn. It is very interesting to observe that the number of cells and the fraction of LTE ones depend strongly upon the operator. AT&T and T-Mobile have a predominantly 3G network, with the latter deploying a larger number of cells than its competitors. Verizon, on the other hand, deploys mostly LTE base stations, partially because after the adoption of LTE it made substantial efforts to improve its coverage.

A. LTE coverage

Fig. 3(b) and Fig. 3(c) show the distance between each 3G and LTE user (active or not) and the base station she is attached to. We can observe an unexpected and significant fact: LTE users tend to be much farther away from their base stations than 3G ones. This is consistent with the fact that LTE cells serve more traffic (Fig. 2), but it contradicts the widespread belief that LTE is primarily used to improve network capacity.

Many studies on cellular network deployment are motivated by the expected increase in mobile data (especially, but not only, video). Such an increase can disrupt network connectivity, and therefore mobile operators need to deploy additional base stations (LTE, in this case) where the demand is higher. In these scenarios, operators would opt for high-capacity, low-coverage LTE cells, including small cells and femtocells. Instead, we observe a marked preference for large cells, suggesting that coverage is the operators’ priority.

This is confirmed by Fig. 4, showing where each operator deploys LTE and 3G cells, as well as their coverage area. Verizon’s mostly-LTE deployment (Fig. 4(d)) is largely based on large cells, a sign that the operator is aiming at covering the whole area with as few cells as possible. Similarly, Sprint (Fig. 4(b)) tends to place large LTE cells in those areas where it has few or no 3G cells, again using LTE for the primary purpose of enhancing coverage.

Even AT&T (Fig. 4(a)) and T-Mobile (Fig. 4(c)), which deploy LTE cells in those populated areas where their 3G coverage is already strong, show a clear preference towards large cells. In these cases, a major reason for providing LTE coverage is, so to speak, providing LTE coverage itself, so as not to fall behind the competition: a primary motivation for deploying LTE is commercial rather than technical.

Moving to Boston in Fig. 5, we can observe similar patterns. Operators have very different pre-existing 3G networks, but they all seem to prefer large LTE cells over small ones. AT&T (Fig. 5(a)) and Sprint (Fig. 5(b)) clearly use LTE to complement 3G coverage, while T-Mobile (Fig. 5(c)) focuses on downtown areas. Verizon exhibits a stronger tendency to maximize its coverage through large LTE cells.
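The cell geometry assumed in Sec. IV-B (coverage area as the convex hull of observed user positions, base station at its barycenter) and the user-to-base-station distances of Fig. 3(b) and 3(c) can be sketched as follows. This is an illustrative sketch, not the authors' code. It assumes that positions have already been projected to planar coordinates in km, and it reads "barycenter" as the area centroid of the hull polygon; the paper does not specify either choice.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def area_and_centroid(hull):
    """Shoelace area and area centroid of a non-empty convex hull."""
    if len(hull) < 3:
        # Degenerate cell (one point or a segment): zero area, mean position.
        n = len(hull)
        return 0.0, (sum(x for x, _ in hull) / n, sum(y for _, y in hull) / n)
    twice_area = 0.0
    cx = cy = 0.0
    for i in range(len(hull)):
        x0, y0 = hull[i]
        x1, y1 = hull[(i + 1) % len(hull)]
        w = x0 * y1 - x1 * y0
        twice_area += w
        cx += (x0 + x1) * w
        cy += (y0 + y1) * w
    area = twice_area / 2.0
    return abs(area), (cx / (6.0 * area), cy / (6.0 * area))


# Hypothetical user positions (planar km) observed while attached to one cell.
positions = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(positions)
area, centroid = area_and_centroid(hull)
# Distance between each user position and the estimated base station location.
distances = [math.hypot(x - centroid[0], y - centroid[1]) for x, y in positions]
print(hull, area, centroid)
```

Collecting `distances` over all cells of one technology and one operator gives the samples from which a CDF like those in Fig. 3 could be drawn.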

Fig. 6. Distance traveled by LTE video and non-video traffic in Boston (a) and Brooklyn (b); operators serving video traffic (c). Solid lines refer to video traffic, dashed lines to non-video traffic. [Panels (a), (b): CDF of distance [km] per operator (AT&T, Sprint, T-Mobile, Verizon). Panel (c): video traffic [TByte] in Boston and Brooklyn, served by AT&T, Sprint, T-Mobile, Verizon and Wi-Fi.]
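One plausible way to obtain curves like those in Fig. 6(a) and 6(b) is a traffic-weighted empirical CDF, in which each record contributes its distance weighted by the bytes it carried. The sketch below is an assumption about the method, not a description of the authors' implementation; the function names and the sample values are hypothetical.

```python
def weighted_cdf(samples):
    """Return sorted (distance, cumulative traffic share) pairs.

    `samples` is an iterable of (distance_km, traffic_bytes) pairs.
    """
    ordered = sorted(samples)
    total = float(sum(volume for _, volume in ordered))
    if total <= 0:
        raise ValueError("total traffic must be positive")
    cdf = []
    running = 0.0
    for distance, volume in ordered:
        running += volume
        cdf.append((distance, running / total))
    return cdf


def weighted_quantile(cdf, q):
    """Smallest distance at which the cumulative traffic share reaches q."""
    for distance, share in cdf:
        if share >= q:
            return distance
    return cdf[-1][0]


# Hypothetical (distance in km, bytes) samples for one operator's LTE cells.
video = [(1.0, 300), (2.0, 100), (5.0, 600)]
non_video = [(0.5, 50), (3.0, 50)]

video_cdf = weighted_cdf(video)
non_video_cdf = weighted_cdf(non_video)
print(weighted_quantile(video_cdf, 0.5), weighted_quantile(non_video_cdf, 0.5))
```

If a network were built around video, the video curve would be expected to rise earlier than the non-video one; in this made-up sample it does the opposite.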
Interestingly, operators follow similar patterns in both cities. This suggests that their deployment decisions are the result of company-wide policies rather than only of local conditions such as customer distribution or data demand.

It is important to stress that in both areas we study LTE deployment is ongoing. We have the unique opportunity to observe the first LTE base stations that operators deploy, and infer from them which aspect of LTE (higher capacity, better coverage, etc.) they need most urgently. Furthermore, we also have the opportunity to compare LTE networks with pre-existing 3G ones, assessing whether operators tend to replicate their previous-generation deployments, to complement them, or to follow a completely different strategy.

B. The distance traveled by LTE data

The last question we seek to answer is whether, and to what extent, LTE networks are built for video traffic. We ascertain this by looking at how far away video users are from the base station serving them. Notice that, unlike in Fig. 3, we take into account the actual traffic served by LTE base stations.

Fig. 6(a) and Fig. 6(b) depict the distance that LTE video and non-video traffic travel from the user to the serving base station. A shorter distance corresponds to better quality and higher throughput; therefore, if LTE networks were designed around a certain type of traffic, we would expect such traffic to travel a shorter distance.

However, no such clear pattern can be identified: sometimes video traffic travels a longer distance, e.g., with Verizon in Fig. 6(a), other times there is no difference between the two. Also notice that sometimes the same operator has video traffic traveling a shorter distance than the rest in some cities and a longer one in others (e.g., Sprint). In summary, optimizing the service of video traffic seems not to be one of the main purposes of LTE deployments.

One compelling explanation is provided by Fig. 6(c), depicting how video traffic is served: overwhelmingly, through Wi-Fi. While it is true that mobile video (i.e., video consumed through smartphones and tablets) is rapidly growing, our evidence suggests that only a small fraction of it is cellular video: about 92% in Brooklyn and 90% in Boston is served through Wi-Fi. Indeed, cable operators are already seeking remedies to this situation (sometimes controversial ones, such as the Netflix-Comcast deal [15]), while at this stage mobile operators seem to have other priorities.

VI. CONCLUSION AND CURRENT WORK

We argued that crowd-sourced datasets, obtained from users of smartphone applications, are a very good tool to understand how mobile networks are planned and used. We presented two fine specimens of this category, obtained from WeFi users, in Sec. III, and explained how we process such potentially overwhelming information in Sec. IV.

We then set out to check some popular assumptions concerning the deployment of LTE networks against our data. Unexpectedly, we found in Sec. V that improved network capacity is not the main reason why operators deploy LTE, and video is not the type of traffic they are designed to serve. Rather, operators seem to use LTE primarily to improve their coverage, deploying low-frequency, high-range cells at strategic locations, in both downtown and suburban areas.

At a more general level, our data suggest that traffic demand is but one of the factors shaping LTE deployments, and sometimes not even the main one. Understanding the deployment decisions made by mobile operators, and foreseeing similar decisions for next-generation networks, requires accounting for such factors as pre-existing, previous-generation networks and commercial policies.

What we can observe through our traces are early deployments of LTE, which gives us the opportunity to grasp the priorities of operators, i.e., which of the multiple benefits of LTE they seek to obtain first. At the same time, this means that our results should be taken with a grain of salt: as an example, in the long term LTE networks will more likely replace 3G networks than complement them.

Current work includes more sophisticated analysis of the data demand, with the purpose of identifying positive and negative correlations between types of traffic. In parallel, we are seeking to improve our outreach: while the datasets described in Sec. III cannot be released, owing to privacy concerns and non-disclosure agreements, we do plan to make an anonymized, aggregated version thereof available to the community.

REFERENCES

[1] J. Kibilda and L. DaSilva, “Efficient coverage through inter-operator infrastructure sharing in mobile networks,” in IFIP Wireless Days, 2013.
[2] P. Di Francesco, F. Malandrino, T. Forde, and L. DaSilva, “A sharing- and competition-aware framework for cellular network evolution planning,” IEEE Trans. on Cognitive Communications and Networking, 2015.
[3] S. Hoteit, S. Secci, Z. He, C. Ziemlicki, Z. Smoreda, C. Ratti, and G. Pujolle, “Content consumption cartography of the Paris urban region using cellular probe data,” in ACM CoNEXT UrbaNe Workshop, 2012.
[4] N. Eagle and A. Pentland, “Reality mining: sensing complex social systems,” Personal and Ubiquitous Computing, 2006.
[5] J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau, “CRAWDAD dataset cambridge/haggle (v. 2009-05-29),” 2009.
[6] J. K. Laurila, D. Gatica-Perez, I. Aad, O. Bornet, T.-M.-T. Do, O. Dousse, J. Eberle, M. Miettinen et al., “The mobile data challenge: Big data for mobile computing research,” in Pervasive Computing, 2012.
[7] V. D. Blondel, M. Esch, C. Chan, F. Clérot, P. Deville, E. Huens, F. Morlot, Z. Smoreda, and C. Ziemlicki, “Data for development: the D4D challenge on mobile phone data,” arXiv preprint, 2012.
[8] E. Amaldi, A. Capone, and F. Malucelli, “Planning UMTS base station location: Optimization models with power control and algorithms,” IEEE Trans. on Wireless Communications, 2003.
[9] A. Abdel Khalek, L. Al-Kanj, Z. Dawy, and G. Turkiyyah, “Optimization models and algorithms for joint uplink/downlink UMTS radio network planning with SIR-based power control,” IEEE Trans. on Vehicular Technology, 2011.
[10] H. Ghazzai, E. Yaacoub, M.-S. Alouini, Z. Dawy, and A. Abu Dayya, “Optimized LTE cell planning with varying spatial and temporal user densities,” IEEE Trans. on Vehicular Technology, 2015.
[11] S. Mitra, S. Ranu, V. Kolar, A. Telang, A. Bhattacharya, R. Kokku, and S. Raghavan, “Trajectory aware macro-cell planning for mobile users,” in IEEE INFOCOM, 2015.
[12] “Wefi: the mobile data analytics company,” http://www.wefi.com.
[13] “GraphLab Create,” https://dato.com/products/create.
[14] “GitHub: SFrame,” https://github.com/dato-code/SFrame.
[15] E. Wyatt and N. Cohen, “Comcast and Netflix reach deal on service,” in New York Times, 2014.
