
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials

SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015

Data Center Energy Consumption Modeling: A Survey

Miyuru Dayarathna, Yonggang Wen, Senior Member, IEEE, Rui Fan

Abstract—Data centers are critical, energy-hungry infrastructures that run large-scale Internet-based services. Energy consumption models are pivotal in designing and optimizing energy-efficient operations to curb excessive energy consumption in data centers. In this paper, we survey the state-of-the-art techniques used for energy consumption modeling and prediction for data centers and their components. We conduct an in-depth study of the existing literature on data center power modeling, covering more than 200 models. We organize these models in a hierarchical structure with two main branches focusing on hardware-centric and software-centric power models. Under hardware-centric approaches we start from the digital circuit level and move on to describe higher-level energy consumption models at the hardware component level, server level, data center level, and finally systems-of-systems level. Under the software-centric approaches we investigate power models developed for operating systems, virtual machines, and software applications. This systematic approach allows us to identify multiple issues prevalent in power modeling of different levels of data center systems, including (i) few modeling efforts targeted at the power consumption of the entire data center, (ii) many state-of-the-art power models are based on a few CPU or server metrics, and (iii) the effectiveness and accuracy of these power models remain open questions. Based on these observations, we conclude the survey by describing key challenges for future research on constructing effective and accurate data center power models.

Index Terms—Data Center, Energy Consumption Modeling, Energy Efficiency, Cloud Computing

M. Dayarathna, Y. Wen, and R. Fan are with the School of Computer Engineering, Nanyang Technological University, Singapore. E-mail: {miyurud, YGWEN, FanRui}@ntu.edu.sg

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.

I. INTRODUCTION

DATA centers are large-scale, mission-critical computing infrastructures that operate around the clock [1][2] to propel the fast growth of the IT industry and transform the economy at large. The criticality of data centers has been fueled mainly by two phenomena. First, the ever-increasing demand for data computing, processing, and storage by a variety of large-scale cloud services, such as Google and Facebook, by telecommunication operators such as British Telecom [3], by banks, and by others has resulted in the proliferation of large data centers with thousands of servers (sometimes with millions of servers). Second, the requirement to support a vast variety of applications, ranging from those that run for a few seconds to those that run persistently on shared hardware platforms [1], has promoted building large-scale computing infrastructures. As a result, data centers have been touted as one of the key enabling technologies for the fast-growing IT industry, resulting in a global market size of 152 billion US dollars by 2016 [4]. Data centers, being large-scale computing infrastructures, have huge energy budgets, which have given rise to various energy efficiency issues.

Energy efficiency of data centers has attained key importance in recent years due to its (i) high economic, (ii) environmental, and (iii) performance impact. First, data centers have high economic impact for multiple reasons. A typical data center may consume as much energy as 25,000 households. Data center spaces may consume up to 100 to 200 times as much electricity as standard office space [5]. Furthermore, the energy cost of powering a typical data center doubles every five years [1]. Therefore, with such a steep increase in electricity use and rising electricity costs, power bills have become a significant expense for today's data centers [5][6]. In some cases power costs may exceed the cost of purchasing the hardware [7]. Second, data center energy usage creates a number of environmental problems [8][9]. For example, in 2005, the total data center power consumption was 1% of the total US power consumption, and created as much emissions as a mid-sized nation like Argentina [10]. In 2010 the global electricity usage by data centers was estimated to be between 1.1% and 1.5% of the total worldwide electricity usage [11], while in the US data centers consumed 1.7% to 2.2% of all US electricity [12]. A recent study by Van Heddeghem et al. [13] found that data centers worldwide consumed 270 TWh of energy in 2012, and that this consumption had a compound annual growth rate (CAGR) of 4.4% from 2007 to 2012. For these reasons, data center energy efficiency is now considered a chief concern for data center operators, ahead of the traditional considerations of availability and security. Finally, even when running in idle mode, servers consume a significant amount of energy. Large savings can be made by turning off these servers. This and other measures, such as workload consolidation, need to be taken to reduce data center electricity usage. At the same time, these power-saving techniques reduce system performance, pointing to a complex balance between energy savings and high performance.

The energy consumed by a data center can be broadly categorized into two parts [14]: energy used by IT equipment (e.g., servers, networks, storage, etc.) and energy used by infrastructure facilities (e.g., cooling and power conditioning systems). The amount of energy consumed by these two subcomponents depends on the design of the data center as well as the efficiency of the equipment. For example,

according to statistics published by the Infotech group (see Figure 1), the largest energy consumer in a typical data center is the cooling infrastructure (50%) [13][15], while servers and storage devices (26%) rank second in the energy consumption hierarchy. Note that these values may differ from data center to data center (see for example [16]). In this paper we cover a broad number of different techniques used in the modeling of different energy consuming components.

[Figure 1 is a pie chart: Cooling 50%, Server & Storage 26%, Power Conversion 11%, Network Hardware 10%, Lighting 3%.]
Fig. 1. A breakdown of energy consumption by different components of a data center [15]. The cooling infrastructure consumes a major portion of the data center energy, followed by servers and storage, and other infrastructure elements.

A general approach to managing data center energy consumption consists of four main steps (see Figure 2): feature extraction, model construction, model validation, and application of the model to a task such as prediction.

• Feature extraction: In order to reduce the energy consumption of a data center, we first need to measure the energy consumption of its components [17] and identify where most of the energy is spent. This is the task of the feature extraction phase.
• Model construction: Second, the selected input features are used to build an energy consumption model using analysis techniques such as regression, machine learning, etc. One of the key problems we face in this step is that certain important system parameters, such as the power consumption of a particular component in a data center, cannot be measured directly. Classical analysis methods may not produce accurate results in such situations, and machine learning techniques may work better. The outcome of this step is a power model.
• Model validation: Next, the model needs to be validated for fitness for its intended purposes.
• Model usage: Finally, the identified model can be used as the basis for predicting the component's or system's energy consumption. Such predictions can then be used to improve the energy efficiency of the data center, for example by incorporating the model into techniques such as temperature- or energy-aware scheduling [18], dynamic voltage and frequency scaling (DVFS) [19][20][21], resource virtualization [22], improving the algorithms used by the applications [23], switching to low-power states [24], power capping [25], or even completely shutting down unused servers [10][26], to make data centers more energy efficient. However, we note that having an energy model is not always necessary for energy consumption prediction.

[Figure 2 diagram labels: Real System, Feature Extraction, Model, Validation, Prediction, System power optimization cycle.]
Fig. 2. A systematic view of the energy consumption modeling and prediction process. The data center system optimization cycle consists of four main steps: feature extraction, model construction, model validation, and usage of the model.

A model is a formal abstraction of a real system. Models for computer systems can be represented as equations, graphical models, rules, decision trees, sets of representative examples, neural networks, etc. The choice of representation affects the accuracy of the models, as well as their interpretability by people [27]. Accurate power consumption models are very important for many energy efficiency schemes employed in computing equipment [28]. Multiple uses for power models exist, including:

• Design of data center systems: Power models are necessary in the initial design of components and systems, since it is infeasible to build physical systems to assess every design choice's effect on power consumption [28]. As an example, this approach was used for the Data Center Efficiency Building Blocks project by Berge et al. [29].
• Forecasting trends in energy efficiency: In the daily operation of computer systems, users and data center operators need to understand the power usage patterns of computer systems in order to maximize their energy efficiency. Physical power measurement alone does not provide a solution, since measurements cannot predict future power consumption, a.k.a. "what if" scenarios [30]. Measurements also do not provide a link between resource usage and power consumption [28]. Experimental verification using real test data is generally expensive and inflexible. Energy models, on the other hand, are much cheaper and more adaptive to changes in operating parameters [31].
• Energy consumption optimization: Many different power consumption optimization schemes have been developed on top of power consumption models which are represented as mathematical functions [32].

Power modeling is an active area of research, studying both linear and nonlinear correlations between system utilization and power consumption [33]. However, modeling the exact energy consumption behavior of a data center, either at the whole-system level


or the individual component level, is not straightforward. In particular, data center energy consumption patterns depend on multiple factors such as hardware specifications, workload, cooling requirements, types of applications, etc., which cannot be measured easily. The power consumed by hardware, the software that runs on the hardware, and the cooling and power infrastructure of the building in which the data center systems reside are all closely coupled [34]. Furthermore, it is impractical to perform detailed measurements of the energy consumption of all lower-level components, since the measurement infrastructure introduces overhead to the system. For these reasons, energy consumption prediction techniques have been developed which can estimate the level of energy consumed by a system for a given workload. Energy consumption prediction techniques can also be utilized for forecasting the energy utilization of a given data center operating in a specific context.

The contributions of this paper are numerous. One of the key contributions of this survey is to conduct an in-depth study of the existing work on data center power models, and to organize the models using a coherent layer-wise abstraction as shown in Figure 4. While there are many current power models for different components of a data center, the models are largely unorganized, and lack an overall framework that allows them to be used together with each other to model more sophisticated and complex systems. Furthermore, we give a more detailed taxonomy of the makeup of a data center, as shown in Figure 6, and again place and relate existing work to our taxonomy. We believe the breadth and organization of our approach makes this survey a valuable resource for both researchers and practitioners seeking to understand the complexities of data center energy consumption at all levels of the system architecture.

The rest of this paper is organized as shown in Table I.

TABLE I
CONTENTS

I Introduction
II Related Surveys
III Data Center Energy Consumption: A System Perspective
  III-A Power Consumption Optimization Cycle
  III-B An Organizational Framework for Power Models
IV Digital Circuit Level Energy Consumption Modeling
  IV-A Energy vs Power
  IV-B Dynamic vs Static Power
V Aggregate View of Server Energy Models
  V-A Additive Server Power Models
  V-B System Utilization based Server Power Models
  V-C Other Server Power Models
VI Processor Power Models
  VI-A Processor Power Modeling Approaches
  VI-B Power Consumption of Single-core CPUs
  VI-C Power Consumption of Multicore CPUs
  VI-D Power Consumption of GPUs
VII Memory and Storage Power Models
  VII-A Memory Power Models
  VII-B Hard Disk Power Models
  VII-C Solid-State Disk Power Models
  VII-D Modeling Energy Consumption of Storage Servers
VIII Data Center Level Energy Consumption Modeling
  VIII-A Modeling Energy Consumption of a Group of Servers
  VIII-B Modeling Energy Consumption of Data Center Networks
  VIII-C Modeling Energy Consumption of Power Conditioning Systems
  VIII-D Modeling Data Center Cooling Power Consumption
  VIII-E Metrics for Data Center Efficiency
  VIII-F Modeling Energy Consumption of a Data Center
IX Software Energy Models
  IX-A Energy Consumption Modeling at the OS and Virtualization Level
  IX-B Modeling Energy Consumption of Data-Intensive Applications
  IX-C Modeling Energy Consumption of Communication-Intensive Applications
  IX-D Modeling Energy Consumption of General Applications
X Energy Consumption Modeling Using Machine Learning
  X-A Machine Learning - An Overview
  X-B Supervised Learning Techniques
  X-C Unsupervised Learning Techniques
  X-D Reinforcement Learning Techniques
XI Comparison of Techniques for Energy Consumption Modeling
  XI-A Power Model Complexity
  XI-B Effectiveness of the Power Models
  XI-C Applications of the Power Models
XII Future Directions
XIII Summary
References

II. RELATED SURVEYS

While there has been a wide body of research on energy consumption modeling and energy consumption prediction for data centers, relatively few surveys have been conducted in this area. The surveys published to date can be classified under five categories: computing, storage and data management, network, infrastructure, and interdisciplinary.

The majority of existing surveys have been on the energy consumption of computing subsystems. For example, the survey by Beloglazov et al. described causes for high power/energy consumption in computer systems and presented a classification of energy-efficient computer designs [36]. However, this survey was not specifically focused on energy consumption modeling. Venkatachalam et al. conducted a survey on techniques that reduce the total power consumed by a microprocessor system over time [35]. Mittal's survey on techniques for improving energy efficiency in embedded computing systems [42] is in the same line as Venkatachalam et al.'s work. However, both of these works focused on embedded systems, whereas our focus is on data centers, a far different type of system. Mittal et al. presented a survey on GPU energy efficiency [43]. Reda et al. conducted a survey on power modeling and characterization of computing devices [38]. They reviewed techniques for power modeling and characterization for general-purpose processors, system-on-chip based embedded systems, and field programmable gate arrays. The survey conducted by Valentini et al. studied the characteristics of two main power management techniques: static power management (SPM) and dynamic power management (DPM) [53].

Several surveys have been conducted focusing on storage and data management in data centers. A survey on energy-efficient data management was conducted by Wang et al. [37]. Their focus was on the domain of energy-saving techniques for data management. Similarly, Bostoen et al. conducted a survey on power-reduction techniques for data center storage systems [40]. Their survey focused only on the storage and file-system software, whereas we focus on energy use in the entire data center.

Surveys conducted on network power consumption issues include the work done by Hammadi et al., which focused on the architectural evolution of data center networks (DCNs) and their energy efficiency [44]. Bilal et al. conducted a survey on data center networks research and described energy efficiency characteristics of DCNs [45][46]. Shuja et al. surveyed the energy efficiency of data centers, focusing on the balance between energy consumption and quality of service (QoS) requirements [52]. Rahman et al. surveyed power management methodologies based on geographic load balancing (GLB) [48]. Unlike our work, none of these surveys delve into the details of the construction of power models. Furthermore, they mostly consider only a single aspect of a data center. Another similar survey on power management techniques for data centers was presented by Mittal [49]. But again, its focus was not on modeling.

Recently, several data center infrastructure level surveys have been conducted. For example, Ebrahimi et al. conducted a survey on data center cooling technology, and discussed the power-related metrics for different components in a data center in detail [47].

The remaining related surveys are interdisciplinary, and cover multiple aspects of data center power consumption. The survey conducted by Ge et al. focused on describing power-saving techniques for data centers and content delivery networks [39]. While achieving power savings is one application of the models we survey, our goals are broader, and we seek to survey general power modeling and prediction techniques. Orgerie et al. [41] surveyed techniques to improve the energy efficiency of computing and network resources, but did not focus on modeling and prediction. Gu et al. conducted a survey on power metering for virtual machines (VMs) in clouds [50]. But their work only focused on VM power models, whereas our work is more comprehensive and structured. Kong et al. [51] conducted a survey on renewable energy and/or carbon emission in data centers, and their aim is different from the aim of this survey paper.

A chronologically ordered listing of the aforementioned surveys is shown in Table II.

TABLE II
COMPARISON OF RELATED SURVEYS
Year | Investigator(s) | Area of focus
2005 | Venkatachalam et al. [35] | Power consumption of microprocessor systems
2011 | Beloglazov et al. [36] | Energy-efficient design of data centers and cloud computing systems
2011 | Wang et al. [37] | Energy-saving techniques for data management
2012 | Reda et al. [38] | Power modeling and characterization for processors
2012 | Valentini et al. [53] | Power management techniques
2013 | Ge et al. [39] | Energy efficiency of data centers and content delivery networks
2013 | Bostoen et al. [40] | Power-reduction techniques for data-center storage systems
2014 | Orgerie et al. [41] | Energy efficiency of computing and network resources
2014 | Mittal [42] | Energy efficiency in embedded computing systems
2014 | Mittal et al. [43] | GPU energy efficiency
2014 | Hammadi et al. [44], Bilal et al. [45][46] | Data center networks and their energy efficiency
2014 | Ebrahimi et al. [47] | Data center cooling technology
2014 | Rahman et al. [48], Mittal [49] | Data center power management
2014 | Gu et al. [50] | VM power metering
2014 | Kong et al. [51] | Renewable energy usage and/or carbon emission in data centers
2014 | Shuja et al. [52] | Data center energy-efficiency

In this survey paper we study the existing literature from the bottom up, from energy consumption at the digital circuit level through to the data center systems-of-systems level. With this approach we can compare the energy consumption aspects of data centers across multiple component layers. We believe that the bottom-up decompositional approach we follow, as well as the comprehensive coverage of the literature on all components, makes our work a unique contribution to the data center and cloud computing research communities.

III. DATA CENTER ENERGY CONSUMPTION: A SYSTEM PERSPECTIVE

In this section we describe how a data center is organized and the flow of electrical power within a typical data center. Later, we present an organizational framework to help readers design effective power models.

A. Power Consumption Optimization Cycle

The power flow and chilled water flow of an example data center are shown in Figure 3 [54]. Data centers are typically energized through the electrical grid. However, there are also data centers which use diesel, solar, wind power, hydrogen (fuel cells), etc., among other power sources. The electric power from external sources (i.e., the total facility power) is divided between the IT equipment, the infrastructure facilities, and support systems by the switch gear. Computer room air conditioning (CRAC) units, a part of the cooling infrastructure, receive power through uninterrupted power supplies (UPSs) to maintain consistent cooling even during possible power failures. Note that certain power components such as flywheels or battery backup may not


[Figure 3 diagram labels: Incoming Utility/Grid Power, Diesel Power Generators, Primary Switch Gear, Chiller, Chilled Water Distribution Pumps, Economizer, Cooling Tower, Uninterrupted Power Supply (UPS), Flywheel/Battery Backup, Lighting, CRAC, Racks (Server, Storage, Network; Circuit Board/Power Supply; Cooling Fans/Blowers), Raised Floor Plenum; separate arrows mark power flow and chilled water flow.]
Fig. 3. Power flow in a typical data center [54]. Data centers are specifically designed to operate as server spaces and they have more control over their internal energy flow.

be available in many data centers. Figure 3 acts as a model data center for most of the remaining parts of this paper.

An overall view of the framework used in this survey is shown in Figure 4. In general, we can categorize the constituents of a data center as belonging to one of two layers, software and hardware. The software layer can be further divided into two subcategories: the OS/virtualization layer and the application layer. In the first half of this paper we describe the power consumption modeling work in the hardware layer. Later, we study power consumption modeling for software. Throughout, we highlight various energy consumption modeling and prediction techniques which are applied at the different levels of the data center systems of systems.

Energy consumption optimization for such complex systems takes the form of a system optimization cycle, as shown in Figure 2. Modeling and prediction are two parts of this process. Feature extraction constructs a model of the real-world system to simulate its energy consumption. Feature extraction is often performed on miniature prototypes of the real-world system (e.g., [55]), though it is also possible to conduct such extractions at the whole data center scale.

Raw power consumption measurements are one of the key inputs to this system optimization cycle. Many studies on the energy consumption of computer systems have been conducted using external power meters providing accurate measurements [56][57] or using hardware or software instrumentation [58]. However, techniques that require the use of external meters or instrumentation are less portable, because they require physical system access or invasive probing [59]. In the latest data center power metering techniques, power usage data is collected by polling power distribution units (PDUs) connected to IT equipment [60]. As mentioned earlier, PDUs are power strips created specifically for use in data center environments. High-end intelligent PDUs offer per-socket measurements, rich network connectivity, and optional temperature sensors [61][62][63]. While dedicated PDU hardware provides accurate data on power consumption, the present use of PDUs is costly and introduces system scalability issues. Hardware manufacturers are starting to deploy various sensors on HPC systems to collect power-related data, as well as to provide easier access to the gathered data. Modern internal on-board power sensors (e.g., the on-board power sensors on Tesla K20 GPUs [64]) and power/thermal information reporting software such as AMESTER (IBM Automated Measurement of Systems for Temperature and Energy Reporting) [65] and HP Integrated Lights Out (iLO) [66] are several such examples. However, such facilities might not be available on many hardware platforms. Moreover, energy consumption optimization techniques based on direct power measurement are rarely deployed in current data centers due to their usability issues. A more viable approach that has been widely used is to use hardware performance counters for energy consumption prediction.

Performance counters are the second type of input that can be used with the system optimization cycle. Performance counters are a special type of register exposed by different systems for the purpose of indicating their state of execution [67]. Performance counters can be used to monitor hundreds of different performance metrics, such as cycle count, instruction counts for fetch/decode/retire, cache misses, etc. [68]. Performance counter information is used in many different tools and frameworks, alongside predefined power consumption models, for predicting the energy usage of systems. In certain situations, performance counter based energy consumption modeling techniques can be augmented with physical power measurement. For example, Figure 5 shows an approach for system power measurement that uses a combination of sampled multimeter data for overall total power measurements and estimates based on performance counter readings to produce per-unit power breakdowns [69].

The model construction process can be done by people as well as by computers using various intelligent techniques [70]. The model then needs to be validated to determine whether or not it is useful. In most of the power models presented in this paper, this step has been performed manually. However, there are situations where automatic model validation is done with the help of computers. Once validated, the model can be used for different tasks such as prediction of the energy consumption of the data center. The experience gained by predicting the energy consumption of a real system can be utilized to improve the energy consumption model itself.

B. An Organizational Framework for Power Models

In this paper we map the energy consumption of a data center and its components to an organizational framework. We denote the instantaneous power dissipated at time t by

P_t = f(S⃗_t, A⃗_t, E⃗_t).    (1)

The parameters in this equation are as follows,

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials.
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015
Fig. 4. A holistic view of the context for energy consumption modeling and prediction in data centers. The constituents of a data center can be categorized into two main layers: the software layer (applications, middleware, OS/virtualization) and the hardware layer (servers, network/storage interconnect, cooling systems, power conversion).
Fig. 5. A hybrid approach for system component level power consumption estimation [69]. This approach integrates measurements obtained from a multimeter with the performance counter readings to produce per-unit power breakdowns.

• S_t - represents the internal system state at time t. This can be further divided into three subcategories: physical, OS, and application software. Hardware configurations such as the type of processor, the amount of memory, the disk, and the NIC structure are examples of the system state. Raw power measurements and performance counter values indicate the system status at a particular time.

• A_t - represents the input to the application at time t, including for example application parameters and input request arrival rates.

• E_t - represents the execution and scheduling strategy [71] across the data center system at time t. Examples of scheduling include control of the CPU frequency, powering the server on or off, assignment of workload to different nodes or cores, etc. Which software we use at a particular time and how we configure the software stack, along with the load balancing and scheduling algorithms, also determine the execution strategy.

The power model we use can either be additive in the power consumption of individual components, regression based, or based on machine learning. The t value associated with each parameter denotes the time aspect of these parameters. However, in certain power models, e.g., the model in Equation (7), we do not observe the time of the measurements. In such cases the power is calculated using an average value over a time window. For simplicity, in the rest of the paper we simply use the parameter name, e.g., A, instead of the time parameterized name, e.g., A_t. This organizational model can be used for a number of purposes, including data center energy consumption prediction, data center system planning, system/subsystem comparisons, and energy consumption optimization. We refer to [72] for some example uses.

When the model is used for power consumption prediction at time t, it can be represented as follows,

P_{t+1} = g(S_t, A_t, E_t),    (2)

where the function parameters are subscripted with t = 0, 1, ..., and the function g predicts the power usage of the next time step. If one knows the details of a system's physical state and the input to the system, one can schedule (i.e., adjust E) the applications to operate most efficiently for a given power budget. In this case the applications' deployment schedule is determined by the other three
Fig. 6. A taxonomy based overview of the data center energy consumption modeling literature surveyed in this paper. Roman numerals and letters near the boxes indicate the corresponding section/subsection of this paper.
parameters (P, S, and A). This general model of data center power consumption is reflected in many of the models described in this paper. However, the interpretation of the function f differs across power models, as the models are based on different techniques and focus on different components. For example, power models such as the one given in Equation (10) may use a componentwise decomposition of power, while models such as the one shown in Equation (4) use a static versus dynamic power decomposition.

A more detailed view of Figure 4 is presented in Figure 6. The latter figure provides an overview of the areas surveyed in this paper. We study models from two viewpoints: their level of abstraction and the techniques employed. The bounds of the abstractions follow the system boundaries as well as the application components of the data center system being modeled. The techniques described in this survey are of two types, either hardware centric or software centric. Software centric techniques can be further divided into performance counter based and machine learning based techniques. In the subsequent sections, we first describe hardware centric techniques, starting from energy consumption modeling at the digital circuit level.

IV. DIGITAL CIRCUIT LEVEL ENERGY CONSUMPTION MODELING

Power models play a fundamental role in energy-efficiency research, the goal of which is to improve the design of components and systems or to use the existing hardware efficiently [7]. Desirable properties of a full system energy consumption model include Accuracy (accurate enough to allow the desired energy saving), Speed (generates predictions quickly enough), Generality and portability (should be suitable for as many systems as possible), Inexpensiveness (should not require expensive or intrusive infrastructure), and Simplicity [73]. This section provides an introduction to energy consumption fundamentals in the context of electronic components.

A. Energy vs Power

Energy (E) is the total amount of work performed by a system over a time period (T), while power (P) is the rate at which the work is performed by the system. The relationship between these three quantities can be expressed as,

E = P T,    (3)

where E is the system's energy consumption measured in Joules, P is measured in Watts, and T is a period of time measured in seconds. If T is measured in unit time then the values of energy and power become equal.

The above expression can be slightly enhanced by considering energy as the integral of power over a time period starting at t1 and ending at t2. Note that we use the terms energy and power interchangeably in this paper.

B. Dynamic vs Static Power

CMOS (complementary metal-oxide-semiconductor) technology has been a driving force in the recent development of computer systems. CMOS has been popular among microprocessor designers due to its resilience to noise as well as the low heat produced during its operation compared to other semiconductor technologies. Digital CMOS circuit power consumption (Ptotal) can be divided into two main parts as,

Ptotal = Pdynamic + Pstatic,    (4)

where Pdynamic is the dynamic power dissipation while Pstatic is the static power dissipation.

Dynamic power is traditionally thought of as the primary source of power dissipation in CMOS circuits [74]. The three main sources of dynamic power (Pdynamic) consumption in digital CMOS circuits are switched capacitance power (caused by the charging and discharging of the capacitive load on each gate's output), short-circuit power (caused by short-circuit current momentarily flowing within the cell), and leakage power (caused by leakage
current irrespective of the gate's state) [75]. These power components can be represented as follows,

Pdynamic = Pswitching + Pshort-circuit + Pleakage,    (5)

where the first term, Pswitching, represents the switching component's (switching capacitance) power. The switching component is the most significant component of power dissipation in a well designed digital circuit. The second term represents the power consumption that occurs due to the direct-path short circuit current that flows when both the NMOS (N-type metal-oxide-semiconductor) and PMOS (P-type metal-oxide-semiconductor) transistors are simultaneously active, making current flow directly to ground. The leakage current creates the third component, which is primarily determined by the fabrication technology of the chip. In some types of logic styles (such as pseudo-NMOS) a fourth type of power, called static biasing power, is consumed [76]. The leakage power consists of both gate and sub-threshold leakages, which can be expressed as [77],

Pleakage = ngate Ileakage Vdd,
Ileakage = A T^2 e^(-B/T) + C e^(r1 Vdd + r2),    (6)

where ngate represents the transistor count of a circuit, Ileakage represents the leakage current, and T corresponds to the temperature. The values A, B, C, r1, and r2 are constants. Circuit activities such as transistor switches, changes of values in registers, etc. contribute to the dynamic energy consumption [78].

The primary source of the dynamic power consumption is the switched capacitance (capacitive power [79]). If we denote A as the switching activity (i.e., the number of switches per clock cycle), C as the physical capacitance, V as the supply voltage, and f as the clock frequency, the dynamic power consumption can be defined as in Equation (7) [80][81][82],

Pcapacitive = A C V^2 f.    (7)

Multiple techniques are available for easily scaling the supply voltage and frequency over a large range. Therefore, the two parameters V and f attract considerable attention in power-conscious computing research.

Static power (Pstatic) is also becoming an important issue because leakage current flows even when a transistor is switched off [78] and the number of transistors used in processors is increasing rapidly. The static power consumption of a transistor can be denoted as in Equation (8). Static power (Pstatic) is proportional to the number of devices,

Pstatic ∝ Istatic V,    (8)

where Istatic is the leakage current.

The above mentioned power models can be used to accurately model the energy consumption at the micro-architecture level of digital circuits. The validity of such models is questionable at higher levels of system abstraction. However, as mentioned in Section I, such basic power models have been proven to be useful in developing energy saving technologies. For example, the power model described in Equation (7) forms the basis of the dynamic voltage and frequency scaling (DVFS) technique, which is a state-of-the-art energy saving technique used in current computer systems.

V. AGGREGATE VIEW OF SERVER ENERGY MODELS

IT systems located in a data center are organized as components. The development of component level energy consumption models helps with multiple different activities such as new equipment procurement, system capacity planning, etc. While some of the discussed components may appear at other levels of the data center hierarchy, all of the components described in this section are specifically attributed to servers. In this section we categorize the power models which provide an aggregated view of server power as additive models, utilization based models, and queuing models.

A. Additive Server Power Models

Servers are the source of productive output of a data center system. Servers conduct most of the work in a data center and they correspond to a considerable load demand irrespective of the amount of space they occupy [83]. Furthermore, they are the most power proportional components available in a data center, which supports the implementation of various power saving techniques on servers. In this subsection we investigate additive power models, which represent the entire server's power consumption as a summation over its subcomponents. We follow an incremental approach in presenting these power models, starting from the least descriptive models and moving to the most descriptive ones. These models could be considered an improvement over linear regression, where non-parametric functions are used to fit the model locally and are combined together to create the intended power model [84].

One of the simplest power models was described by Roy et al., who represented the server power as a summation of the CPU and memory power consumption [85]. We represent their power model as,

E(A) = Ecpu(A) + Ememory(A),    (9)

where Ecpu(A) and Ememory(A) are the energy consumption of the CPU and the memory while running the algorithm A. More details of these two terms are available in Equations (54) and (87) respectively. Jain et al. have described a slightly different power model by dividing the energy consumption of the CPU and memory into two separate components: data and instructions [86].

More detailed power models have been created by considering other components of a server such as disks, network peripherals, etc. The server energy consumption model described by Tudor et al. [87] augments the above power model with I/O parameters. Their model can be shown as,

Etotal = Ecpu + Ememory + EI/O,    (10)
where the energy used by the server is expressed as a function of the energy used by the CPU, memory, and I/O devices. However, most current platforms do not allow measuring the power consumed by the three main subsystems (CPU, memory, and disk) of a server separately. Only the full system power, denoted by Etotal, can be measured [88]. Ge et al. have also described a similar power model by expressing the system power consumption as a summation of CPU, memory, and other system components [89]. The power model described in Equation (10) can be further expanded as [90],

Etotal = Ecpu + Ememory + Edisk + ENIC,    (11)

where Ecpu, Ememory, Edisk, and ENIC correspond to the energy consumed by the CPU, memory, disk, and network interface card respectively. Furthermore, this model may incorporate an additional term for the energy consumption of the motherboard, as described in [91][92] and in [93], or a baseline constant, as described in [94].

The above energy model can be further expanded considering the fact that energy can be calculated by multiplying average power with execution time as [90],

Etotal = Pcomp Tcomp + PNIC Tcomm + Pnet-dev Tnet-dev,    (12)

where Pcomp denotes the combined CPU and memory average power usage, Tcomp is the average computation time, Tcomm is the total network time, and PNIC is the average network interface card power. This energy model also takes into account the energy cost from the network devices' power Pnet-dev and their running time Tnet-dev when the devices are under load.

A slightly different version of this energy model can be constructed by considering the levels of resource utilization of the key components of a server [95] as,

Pt = Ccpu ucpu,t + Cmemory umemory,t + Cdisk udisk,t + Cnic unic,t,    (13)

where ucpu is the CPU utilization, umemory is the memory access rate, udisk is the hard disk I/O request rate, and unic is the network I/O request rate. Pt refers to the predicted power consumption of the server at time t, while Ccpu, Cmemory, Cdisk, and Cnic are the coefficients of the CPU, memory, disk, and NIC respectively. This power model is more descriptive than the previously described server power models (in Equations (10) to (12)). System resource utilization values (u) can be regarded as a reflection of the job scheduling strategy of the modeled system (the more jobs are scheduled in the system, the higher the CPU utilization).

In an almost similar power model, Lewis et al. described the entire system energy consumption using the following equation [96],

Esystem = A0 (Eproc + Emem) + A1 Eem + A2 Eboard + A3 Ehdd,    (14)

where A0, A1, A2, and A3 are unknown constants that are calculated via linear regression analysis and remain constant for a specific server architecture. The terms Eproc, Emem, Eem, Eboard, and Ehdd represent the total energy consumed by the processor, the energy consumed by the DDR and SDRAM chips, the energy consumed by the electromechanical components in the server blade, the energy consumed by the peripherals that support the operation on board, and the energy consumed by the hard disk drive (HDD). The use of a single constant factor A0 for both CPU and memory can be attributed to the close tie between CPU and memory power consumption.

CPU power consumption generally dominates server power models [97]. This domination is revisited in multiple places of this survey. One example of a detailed system power model which possesses this characteristic was described by Lent et al., where the power consumption of a server is expressed as the sum of the power drawn by its subcomponents [98]. In this power model, the power (P) consumed by a network server hosting the desired services is given by,

P = I + Σ_{i=0}^{N-1} αN ρN(i) + Σ_{j=0}^{C-1} αC ρC(j) + Σ_{k=0}^{D-1} αD ρD(k) + ψm(Σ_{j=0}^{C-1} ρC(j)) + ψM(Σ_{j=0}^{C-1} ρC(j)),    (15)

where I denotes the idle power consumption. Lent et al. assumed each of the subsystems produces linear power consumption with respect to its individual utilization. Then the power consumption of a core, disk, or port subsystem can be estimated as the product of its utilization (core utilization ρC, disk utilization ρD, network utilization ρN) times a constant factor (αC, αD, and αN). These factors do not necessarily depend on the application workload. The model shown above does not have a separate subsystem for memory because the power consumed by memory access is included in the calculation of the power incurred by the other subsystems (especially by the core); CPU instruction execution tends to correlate highly with memory accesses in most applications [98]. The two components ψm and ψM are introduced to model behaviors that could be difficult to represent otherwise.

A different type of power model, based on the type of operations conducted by a server, can be developed as follows. In this approach, which is similar to the power consumption of CMOS circuits described previously, a computer system's energy consumption (i.e., data center energy consumption) is divided into two components called static (i.e., baseline) power (Pfix) and dynamic (i.e., active) power (Pvar) [75][99][100], which can be expressed as,

Ptotal = Pfix + Pvar,    (16)

where the fraction between static and dynamic power depends on both the system under consideration and the workload itself.
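As an illustration of how a component utilization model like Equation (13) is used in practice, its coefficients can be calibrated from measured (utilization, power) samples with ordinary least squares. The following is a minimal, self-contained Python sketch; the coefficient values and utilization samples are hypothetical, not measurements from any of the surveyed systems.

```python
# Calibrate the linear utilization model of Equation (13):
#   P_t = C_cpu*u_cpu + C_mem*u_mem + C_disk*u_disk + C_nic*u_nic
# using ordinary least squares (normal equations solved by
# Gaussian elimination). All numbers here are illustrative only.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_coefficients(samples):
    """samples: list of (utilization_vector, measured_power) pairs.
    Builds the normal equations (X^T X) c = X^T p and solves for c."""
    n = len(samples[0][0])
    xtx = [[sum(u[i] * u[j] for u, _ in samples) for j in range(n)]
           for i in range(n)]
    xtp = [sum(u[i] * p for u, p in samples) for i in range(n)]
    return solve(xtx, xtp)

def predict(coeff, u):
    """Evaluate Equation (13) for one utilization vector."""
    return sum(c * x for c, x in zip(coeff, u))
```

Given enough linearly independent utilization samples, the fitted coefficients recover the per-component cost weights, after which `predict` estimates the server power for unseen utilization levels.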
Static power (Pfix) consumption in the context of a server is the power that is consumed by the system irrespective of its state of operation. This includes power wasted because of leakage currents in semiconductor components such as the CPU, memory, I/O and other motherboard components, fans, etc. [101]. This category also includes the power required to keep basic operating system processes and other idling tasks running (e.g., the power required to keep the hardware clocks, timer interrupts, network ports, and disk drives active [98]). The leakage currents need to be kept to a minimum to avoid such energy waste. However, this requires improvement of the lower level (semiconductor chip level) energy consumption [75].

Dynamic power consumption in the context of a server is caused by activities such as the operation of circuits, access to disk drives (I/O), etc. It depends mainly on the type of workload which executes on the computer as well as how the workload utilizes the system's CPU, memory, I/O, etc. [101]. Furthermore, 30-40% of the power is spent on the disk, the network, the I/O and peripherals, the regulators, and the rest of the glue circuitry in the server.

Figure 7 shows an example breakdown of the power consumption of a server [102] deployed in a Google data center. It should be noted that the percentage power consumption among different components is not a fixed entity. For example, Figure 8 shows a power consumption comparison of two servers, one of which uses a mobile processor (Atom processor) [103]. In the case of the Atom processor based server, memory consumes the largest amount of power, while in the Xeon based server, the CPUs are the main power consumers. The disk and the power supply unit are another two large contributors [104], which are specifically not shown in Figure 8.

Fig. 7. An approximate distribution of peak power usage by components of a warehouse scale computer deployed at Google in 2007 [102]. (CPUs 33%, DRAM 30%, other server components 22%, disks 10%, networking 5%.)

Fig. 8. Power breakdown across the components of two servers [103]. In the case of the Atom processor based server, memory consumes the largest amount of power, while in the Xeon based server, the CPUs are the main power consumers.

Note that most of the power models described in this subsection were based on a component wise power consumption decomposition. However, other types of energy consumption models can be developed for a server based on its phases of execution. One such example is the energy model described by Orgerie et al. [105] (though it is not specifically attributed to servers by them),

E = Eboot + Ework + Ehalt,    (17)

where Eboot and Ehalt correspond to the system booting and halting energy consumption, which is zero if the equipment does not need to boot or halt during its operational life cycle. However, the use of this type of operation phase based energy model is quite rare in the real world. On the contrary, system utilization based power models are heavily used in data center power modeling. We investigate this important area in the next subsection.

Another component wise power breakdown approach for modeling server power is the use of the VM power as a parameter in the power model. This can be considered as an extension of the power model described in Equation (37). A power model which is based on this concept is described in [106], where the server power is expressed as,

Pserver = Pbaseline + Σ_{i=1}^{n} Pvm(i),    (18)

where Pserver represents the total power of a physical server while Pbaseline is the baseline power that is empirically determined. Pvm is the power of an active VM, and n is the number of VMs held by the server. This power model can be further expanded by expressing the power usage of each and every VM. Each VM's power consumption can be expressed as in Equation (184). Then the complete server power can be expressed as,

Pserver = α Σ_{k=1}^{n} Ucpu(k) + β Σ_{k=1}^{n} Umem(k) + γ Σ_{k=1}^{n} Uio(k) + n e + Ebaseline,    (19)

where n is the number of VMs running in the physical node. A similar power model for a server was created in [107] by considering only the CPU, disk, and idle power consumption. In that power model, the CPUs and disks are considered as the major components that reflect the system activities.

B. System Utilization based Server Power Models

The second main category of server power models has been created considering the amount of system resource
utilization by its components. Traditionally the CPU has


been the largest power consumer in a server. Hence most Pu = (Pmax − Pidle )u + Pidle , (22)
of the system utilization based power models leverage
where Pidle , Pmax are the average power values when the
CPU utilization as their metric of choice in modeling the
server is idle and the average power value when the server is
entire system’s power consumption. Different from previous
fully utilized respectively. This model assumes server power
subsection, we organized the content in this subsection in a
consumption and CPU utilization has a linear relationship.
chronological order because there is no clearly observable
Certain studies have used this empirical model as the
structural relationship between these power models.
representation of the system’s total power consumption
One of the earliest server utilization based power models since Fan et al.’s study [109] have shown that the power
which appeared in year 2003 was an extension of the basic consumption of servers can be accurately described by a
digital circuit level power model described in Equation liner relationship between the power consumption and CPU
(7). This was introduced by Elnozahy et al. where the utilization [113].
fundamental dynamic power model described in Equation
The above processor utilization based power model has
(7) was extended considering a simplification made on
been highly influential in recent server power modeling
the system voltage [72][108]. They expressed voltage as
research. For example, the works by Zhang et al. and Tang
a linear function in the frequency where V = αf (α is a et al. used CPU utilization as the only parameter to estimate
constant). This results in a power model for a server running the system energy consumption [114][115]. However, there
at frequency f as,
are certain works which define slightly different utilization
metric for the power model in Equation (22). In one such
P (f ) = c0 + c1 f 3 , (20)
works [116], the power model appears in the context of
where c0 is a constant that includes power consumption modeling the energy consumption of a CDN server. Yet,
of all components except the CPU and the base power in [116] the utilization metric has been changed as the
consumption of CPU. c1 is a constant (c1 = ACα2 where percentage between the actual number of connections made
A and C are constants from Equation (7)). to a server s against the maximum number of connections
In year 2006, Economou et al. described Mantis which allowed on the server.
is a non-intrusive method for modeling full-system power In the same work, Fan et al. also have proposed
consumption and real-time power prediction. Mantis uses another empirical, non-linear power model as follows
a one-time calibration phase to generate a model by cor- [73][75][109],
relating AC power measurements with user-level system
Pu = (Pmax − Pidle )(2u − ur ) + Pidle , (23)
utilization metrics [104]. Mantis uses the component uti-
lization metrics collected through the operating system where r is a calibration parameter that minimizes the square
or standard hardware counters for construction of power error which needs to be obtained experimentally. In certain
models. Mantis has been implemented for two different literature the value of r is calculated as 1.4 [109]. Fan et al.
server systems (highly integrated blade server and a Itanium conducted an experiment which compared the accuracy of
server) for which the power models are depicted as,

    P_blade = 14.45 + 0.236 u_cpu − (4.47E−8) u_mem + 0.00281 u_disk + (3.1E−8) u_net,    (21)

    P_itanium = 635.62 + 0.1108 u_cpu + (4.05E−7) u_mem + 0.00405 u_disk + 0 u_net,

where the first term in both of the above equations is a constant which represents the system's idle power consumption. Each of u_cpu, u_mem, u_disk, and u_net corresponds to the CPU utilization, the off-chip memory access count, the hard disk I/O rate, and the network I/O rate respectively.

One of the notable processor utilization based power models is the work by Fan et al. (which appeared in the year 2007), and it has influenced recent data center power consumption modeling research significantly. Fan et al. have shown that the linear power model can track the dynamic power usage with a greater accuracy at the PDU level [6][109]. If we assume the power consumed by a server is approximately zero when it is switched off, we can model the power P_u consumed by a server at any specific processor utilization u (u is a fraction [110]) as [6][111][112] in Equation (22),

    P_u = P_idle + (P_busy − P_idle) u,    (22)

and, with an additional calibration parameter r, as the empirical model of Equation (23),

    P_u = P_idle + (P_busy − P_idle)(2u − u^r).    (23)

The power models in Equation (22) and Equation (23) were validated using a few hundred servers in one of Google's production facilities. Fan et al. mentioned that, except for a fixed offset, the model tracks the dynamic power usage extremely well. The error was below 5% for the linear model and 1% for the empirical model. Although the empirical power model in Equation (23) had a better error rate, one needs to determine the r calibration parameter, which is a disadvantage associated with the model.

Several notable works on system utilization based server energy consumption modeling appeared in the years 2010-2011. One such work was presented by Beloglazov et al. [111]. They considered the fact that CPU utilization may change over time due to the variation of the workload handled by the CPU [111]. Therefore, CPU utilization can be denoted as a function of time as [2][117] in Equation (24),

    E = ∫_{t0}^{t1} P(u(t)) dt,    (24)

where E is the total energy consumption of a physical node during a time period from t0 to t1, and u(t) corresponds to the CPU utilization, which is a function of time.
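The linear and empirical models of Equations (22) and (23), together with the energy integral of Equation (24), are straightforward to evaluate numerically. The following Python sketch is illustrative only: the calibration values (150 W idle, 250 W busy, r = 1.4) and the utilization trace are invented, and Equation (24) is approximated with a fixed-step Riemann sum.

```python
# Hedged sketch (parameter values invented): Fan et al.-style linear and
# empirical utilization-based server power models, plus the energy integral
# of Equation (24) discretized over sampled CPU utilization.

def p_linear(u, p_idle, p_busy):
    """Equation (22): P(u) = P_idle + (P_busy - P_idle) * u, with u in [0, 1]."""
    return p_idle + (p_busy - p_idle) * u

def p_empirical(u, p_idle, p_busy, r):
    """Equation (23): P(u) = P_idle + (P_busy - P_idle) * (2u - u^r);
    r is the per-platform calibration parameter the text mentions."""
    return p_idle + (p_busy - p_idle) * (2.0 * u - u ** r)

def energy(power_model, utilization_trace, dt):
    """Equation (24): E = integral of P(u(t)) dt over [t0, t1], here
    discretized with one utilization sample every dt seconds."""
    return sum(power_model(u) for u in utilization_trace) * dt

# Hypothetical five-minute trace, one CPU utilization sample every 60 s.
trace = [0.1, 0.4, 0.8, 0.6, 0.2]
e_linear = energy(lambda u: p_linear(u, 150.0, 250.0), trace, 60.0)
```

With these invented numbers the trace integrates to roughly 57.6 kJ; a real deployment would calibrate P_idle, P_busy, and r against measured wall power.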
This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 12
Multiple works have been done to model the aggregate power consumption of a server [112]. Wang et al. presented an energy model derived from experiments on a blade enclosure system [118]. They modeled server power as shown in the following equation,

    P_Bj = g_B u_j + P_B,idle,  for any blade j.    (25)

They mentioned that CPU utilization (u) is a proxy for the effect of active workload management, while the slope g_B and the intercept P_B,idle capture the effect of power status tuning.

In their work on planning and deployment of enterprise applications, Li et al. conducted power consumption modeling of a server [119]. Unlike the previously described approaches, they used a normalized power unit P_norm,

    P_norm = (P_sys − P_idle) / (P_busy − P_idle),    (26)

where P_sys is the system power consumption, P_idle is the idling power consumption (i.e., the power when the utilization is zero, U = 0), and P_busy is the power consumption when the system is completely utilized (U = 1). Furthermore, they described another model that relates normalized power (P_norm) and CPU utilization (U) as,

    P_norm(U) = 1 − h(U)^(−1),    (27)

where h(U) = c1 U^c2 + c3 U^c4 + c5, while (c1, ..., c5) are parameters to be fitted.

In an effort to build on the power model in Equation (23), Tang et al. created a somewhat sophisticated power consumption model [120] as,

    P_x(t) = P_x_idle + (P_x_full − P_x_idle) α_x U_x(t)^β_x,    (28)

where P_x_idle and P_x_full are the power consumption of a server x at the idle and fully loaded states respectively. α_x and β_x are server dependent parameters. U_x corresponds to the CPU utilization of server x at time t. The notable difference from the power model in Equation (23) is the addition of a temporal parameter to the power model and the feature of accounting for multiple different servers.

Yao et al. described the power consumption of a server as follows [121],

    P = b_i(t)^α / A + P_idle,    (29)

where A, P_idle, and α are constants determined by the data center. P_idle is the average idle power consumption of the server. b_i(t) denotes the utilization/rate of operation of the server i at time t. Yao et al. selected the values α = 3, P_idle = 150 Watts, and A such that the peak power consumption of a server is 250 Watts. Both the power models in Equations (28) and (29) were developed in the year 2011.

A similar power model for a server i was made by Tian et al. [122] in the year 2014. However, they replaced the frequency parameter with the service rate (μ_i^α_i) and the utilization u_i of server i as,

    P_i = u_i k_i μ_i^α_i + P_i*,    (30)

where P_i* represents the static power consumption of server i. This type of power model is also known as a Power-Law model in certain literature [123]. It can be observed that, when considering the one decade period from the year 2003, many of the utilization based power models appeared around the 2010-2011 period. This indicates that there is much recent work being carried out in this area.

Certain power models consider the system's CPU die temperature along with the CPU utilization to calculate the heat generated by the server. In a steady state such heat dissipated by the server can be equated to the server power dissipation. In one such work the power dissipation by a server (P_server) is given by a curve-fitting model [124],

    P_server = P_IT + P_sfan,    (31)

where P_IT represents the server heat generation excluding the heat generation by the server cooling fan (P_sfan). The P_IT component of the above model can be expanded as,

    P_IT = 1.566 × 10^(−5) + 42.29u + 0.379T + 0.03002T²,    (32)

where the R² value of the curve-fitting line was 0.9839. T is the CPU die temperature and u is the CPU utilization.

Furthermore, in certain works [125] the server's CPU usage and operating frequency are used for modeling a server's power consumption. The work by Horvath et al. is an example, where they expressed the server power consumption as,

    P_i = a_i3 f_i u_i + a_i2 f_i + a_i0,    (33)

where P_i, f_i, and u_i represent the power consumption, processor frequency, and utilization of node i respectively. The a_ij (j = 0, 1, 2, 3) are system parameters which can be determined by using system identification of the physical server. They used the steady-state result of the M/M/n queuing model. The node utilization u is described as u = x/(sn), where x is the number of concurrent tasks in the current sampling cycle. The arrival rate s is the number of served tasks and n is the server's core count. When constructing the server power model they assumed that all servers are homogeneous with parameters a_i3 = 68.4, a_i2 = 14.6, a_i1 = −14.2, a_i0 = 15.0.

C. Other Server Power Models

Additive and utilization based power models represent the majority of the server power models. However, there are multiple other power models which cannot be specifically attributed to these two categories. This subsection investigates such power models. We follow a chronological ordering of power models as done in the previous section.

In a work on operational state based power modeling, Lefurgy et al. have observed that server power consumption changes immediately (within a millisecond) as the system's performance state changes, irrespective of the previous
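Several of the utilization-based models above can be sketched directly as small functions. The following Python fragment is illustrative only: it implements the normalized power model of Equations (26)-(27), the temporal model of Equation (28), and the Power-Law model of Equation (30), but none of the coefficients a caller would pass are the fitted values from [119], [120], or [122].

```python
# Illustrative sketch of three utilization-based server power models.
# All parameter values supplied by callers are hypothetical, not fitted ones.

def p_norm(p_sys, p_idle, p_busy):
    """Equation (26): normalize a measured system power into [0, 1]."""
    return (p_sys - p_idle) / (p_busy - p_idle)

def p_norm_of_u(u, c1, c2, c3, c4, c5):
    """Equation (27): P_norm(U) = 1 - h(U)^(-1),
    with h(U) = c1*U^c2 + c3*U^c4 + c5 and (c1, ..., c5) fitted parameters."""
    h = c1 * u ** c2 + c3 * u ** c4 + c5
    return 1.0 - 1.0 / h

def p_tang(u_t, p_idle, p_full, alpha, beta):
    """Equation (28): P_x(t) = P_idle + (P_full - P_idle) * alpha * U_x(t)^beta."""
    return p_idle + (p_full - p_idle) * alpha * u_t ** beta

def p_power_law(u, k, mu, alpha, p_static):
    """Equation (30): P_i = u_i * k_i * mu_i^alpha_i + P_i* (static power)."""
    return u * k * mu ** alpha + p_static
```

Note how Equations (26)-(27) separate measurement normalization from the fitted utilization curve, while Equations (28) and (30) fold everything into one closed form.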
performance state [126]. Therefore, they concluded that the power consumption of a server for a given workload is determined solely by the performance settings and is independent of the power consumption in previous control periods. Furthermore, from the performance experiments conducted by Lefurgy et al., it was observed that a linear model fits well with an R² > 99% for all workloads. Therefore, they proposed a server power model as,

    p(k) = A t(k) + B,    (34)

where A and B are two system dependent parameters. p(k) is the power consumption of the server in the k-th control period, while t(k) is the performance state of the processors in the k-th control period. Furthermore, they created a dynamic model for power consumption as,

    p(k + 1) = p(k) + A d(k).    (35)

In a detailed work on server power modeling with the use of regression techniques, Costa et al. [127] introduced a model for describing the power consumption of a computer through a combination of system wide variables Y_i, i = 1, ..., I (i.e., system wide metrics such as host_disk.sda_disk_time_write (average time a write operation took to complete), host_cpu.X_cpu.system_value (processes executing in kernel mode), etc.) and variables X_jl, j = 1, ..., J describing individual processes P_l, l = 1, ..., L. They let the power consumption of a computer with no load be denoted by P_0, and the respective coefficients of the regression model be called α_i for the system wide variables and β_j for the per process variables. Based on the above definitions, the power consumption P of a computer can be denoted as,

    P = P_0 + Σ_{i=1}^{I} α_i Y_i + Σ_{j=1}^{J} Σ_{l=1}^{L} β_j X_jl.    (36)

Regression based power modeling has been shown to perform poorly on non-trivial workloads due to multiple reasons: the level of cross dependency present in the features fed to the model; features used by previous approaches being outdated for contemporary platforms; and modern hardware components abstracting away hardware complexity and not necessarily exposing all the power states to the OS. The changes in the power consumption are not necessarily associated with changes in their corresponding states [128].

Queuing theory has been used to construct server power models. In one such work, Gupta et al. created a model for the power consumption of a server [113]. They assumed that servers are power proportional systems (i.e., assuming server power consumption and CPU utilization have a linear relationship) [113]. They described the power consumption of a server as,

    P(λ) = (λ/μ)(P_cpu + P_other) + (1 − λ/μ) P_idle,    (37)

where P_cpu and P_other represent the power consumption of the processor and other system components, while P_idle is the idle power consumption of the server. They assumed that the processor accounts for half of the system power during active periods and that the system consumes 10% of its peak power during idle periods. They used queuing theoretic models for capturing the request processing behavior in data center servers. They used the standard M/M/1 queuing model, which assumes exponentially distributed request inter-arrival times with mean 1/λ and exponentially distributed service times with mean 1/μ.

In a work conducted in the year 2012, Enokido et al. created a Simple Power Consumption (SPC) model for a server s_t where the power consumption rate E_t(τ) at time τ is given by [129],

    E_t(τ) = { R_t,       if Q_t(τ) ≥ 1,
             { min E_t,   otherwise,    (38)

where R_t shows the maximum power consumption rate when the rotation speed of each server fan is fixed to the minimum. In the SPC model, if at least one process p_i is performed, the electric power is consumed at the fixed rate R_t on a server s_t at time τ (E_t(τ) = R_t). If not, the electric power consumption rate of the server s_t is the minimum.

Furthermore, they created an extended power model for a server considering the power consumption of cooling devices (i.e., fans) [129]. They did not consider how much electric power each hardware component of a server such as the CPU, memory, and fans consumes. They rather considered the aggregated power usage at a macro level. In their Extended Simple Power Consumption (ESPC) model, E_t(τ) shows the electric power consumption rate [W] of a server s_t at time τ (t = 1, ..., n), with min E_t ≤ E_t(τ) ≤ max E_t (see Equation (39)). Different from the model described in Equation (112), they used an additional parameter R_t in this model. Then the ESPC is stated as,

    E_t(τ) = { max E_t,              if Q_t(τ) ≥ M_t,
             { ρ_t Q_t(τ) + R_t,    if 1 ≤ Q_t(τ) ≤ M_t,
             { min E_t,             otherwise,    (39)

where ρ_t is the increasing ratio of the power consumption rate on a server s_t. ρ_t ≥ 0 if Q_t(τ) > 1 and ρ_t = 0 if Q_t(τ) = 1.

In another system utilization based energy consumption model, by Mills et al., the energy consumed by a compute node with a (single) CPU executing at speed σ is modeled as [130],

    E(σ, [t1, t2]) = ∫_{t1}^{t2} (σ³ + ρ σ_max³) dt,    (40)

where ρ stands for the overhead power which is consumed regardless of the speed of the processor. The overhead includes the power consumption of all other system components such as memory, network, etc. Although the authors mentioned the energy consumption of a socket, their power model is generalized to the entire server for this reason.

In certain works the power consumption of a server is calculated by following a top down approach, by dividing the total power consumption of the servers by the number of
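A minimal sketch of the queuing-based model of Equation (37) and the piecewise ESPC rate of Equation (39) follows. All parameter values in the example are invented; real values would come from measurement, as in [113] and [129].

```python
def p_queueing(lam, mu, p_cpu, p_other, p_idle):
    """Equation (37): with utilization rho = lambda/mu, blend the active
    power (P_cpu + P_other) and the idle power P_idle of a
    power-proportional server."""
    rho = lam / mu
    return rho * (p_cpu + p_other) + (1.0 - rho) * p_idle

def espc_rate(q, m_t, rho_t, r_t, e_min, e_max):
    """Equation (39): ESPC power rate of server s_t as a piecewise function
    of the number of active processes q = Q_t(tau)."""
    if q >= m_t:
        return e_max                 # saturated: maximum rate
    if q >= 1:
        return rho_t * q + r_t       # rate grows linearly with queue length
    return e_min                     # no active process: minimum rate

# Hypothetical server: 4 req/s arriving, service rate 8 req/s,
# 100 W CPU + 50 W other components when active, 20 W idle.
p_half_loaded = p_queueing(4.0, 8.0, 100.0, 50.0, 20.0)
```

Note that both models need the arrival and service rates (or queue lengths) as inputs, which matches the limitation listed for the queuing-based models in Table III.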
servers hosted in the data center [131]. However, such power models are based on a number of assumptions, such as uniform server profiles, homogeneous execution of servers, etc.

Certain power consumption modeling techniques construct metrics to represent the energy consumption of their target systems. In one such work, Deng et al. defined a metric called the system energy ratio to determine the best operating point of a full compute system [133][134][135]. For a memory frequency of f_mem they defined the system energy ratio (K) as,

    K(f_mem) = (T_f_mem P_f_mem) / (T_base P_base),    (41)

where T_f_mem corresponds to the performance estimate for an epoch at frequency f_mem. On the other hand, P_f_mem = P_mem(f_mem) + P_nonmem, where P_mem(f) is calculated according to the model for memory power for Micron DDR SDRAM [136]. P_nonmem accounts for all non-memory subsystem components. The corresponding values of T_f_mem and P_f_mem at a nominal frequency are denoted as T_base and P_base. Note that when considering multiple memory frequencies (f_mc1, f_mc2, ..., f_mcn) the term P_f_mem can be expanded further as P_f_mem = Σ_i P_f_mci + P_nonmem [134]. The definition of K was further expanded by Deng et al. as,

    K(f_core^1, ..., f_core^n, f_mem) = (T_{f_core^1,...,f_core^n,f_mem} P_{f_core^1,...,f_core^n,f_mem}) / (T_base P_base),    (42)

where T_base and P_base are the time and average power at a nominal frequency (e.g., the maximum frequencies). Furthermore, they expressed the full system power usage as,

    P(f_core^1, ..., f_core^n, f_mem) = P_non + P_cache + P_mem(f_mem) + Σ_{i=1}^{n} P_core^i(f_core^i),    (43)

where P_non is the power consumed by all system components except the cores, the shared L2 cache, and the memory subsystem. P_non is assumed to be fixed. The average power of the L2 cache is denoted by P_cache and is calculated by using the cache's leakage and the number of accesses during the epoch. P_mem(f) is the average power of L2 misses (which make the CPU access memory) and is calculated based on the description given in [137]. The value of P_core^i(f) is calculated based on the core's activity factor, following the same techniques used by [18] and [69]. Deng et al. used several L1 and L2 performance counters (Total L1 Miss Stalls (TMS), Total L2 Accesses (TLA), Total L2 Misses (TLM), and Total L2 Miss Stalls (TLS)) and per-core sets of four Core Activity Counters (CAC), which track committed ALU instructions, FPU instructions, branch instructions, and load/store instructions, to estimate core power consumption [135]. They found that the power usage of cores is sensitive to the memory frequency.

Power consumption states can be used for the construction of server power models. Maccio et al. described an energy consumption model for a server by mapping its operational state into one of four system states: LOW, SETUP, BUSY, and IDLE, where the power of each state is denoted by E_LOW, E_SETUP, E_BUSY, and E_IDLE. They modeled the server power consumption as a Markov chain [132] (shown in Figure 9). The state (n1, n2) means that the server is off when n1 = 0 and on when n1 = 1, and that there are n2 jobs in the system. The model allows one to determine an optimal policy for a single server system under a broad range of metrics, which consider the expected response time of a job in the system, the expected energy consumed by the system, and the expected rate at which the server switches between the two energy states (off/on).

Fig. 9. M/M/1 ◦ M, M, k queue Markov chain [132] representation of a server, with states (n1, n2) connected by arrival (λ), service (μ), and switching (γ) transitions. The model allows one to determine an optimal policy for a single server system under a broad range of metrics.

We described the research conducted on modeling the energy consumption of a data center server (as a complete unit) up to this point. Many of these models are linear, while some of them, listed in Equations (9), ..., (17), are componentwise breakdowns of the power of processors. The server power models in Equations (22), (23), (24), and (28) are examples of non-linear models, and those are based on CPU utilization. The power model in Equation (37) is different from the rest of the power models since it utilizes queuing theory. A summary of the server power consumption models is shown in Table III. Many of the aforementioned power models denote a server's energy consumption as a summation of the power drawn by its subcomponents. In the next two sections of this paper (Sections VI and VII) we conduct a detailed investigation of the attempts made to model the power consumption of these subcomponents.
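The system energy ratio of Equations (41)-(43) can be sketched as follows. This is a simplified illustration: the per-component power estimates (P_mem, P_cache, P_core^i) would in practice come from the counter-based models Deng et al. cite, and every number a caller passes here is hypothetical.

```python
def full_system_power(p_non, p_cache, p_mem, p_core_list):
    """Equation (43): fixed non-core power plus L2 cache, memory, and
    per-core power terms evaluated at the chosen core/memory frequencies."""
    return p_non + p_cache + p_mem + sum(p_core_list)

def system_energy_ratio(t_f, p_f, t_base, p_base):
    """Equations (41)-(42): K = (T_f * P_f) / (T_base * P_base).
    Since T * P is the energy of the epoch, an operating point with K < 1
    uses less energy than the nominal (e.g., maximum-frequency) one."""
    return (t_f * p_f) / (t_base * p_base)
```

For instance, lowering the memory frequency typically stretches the epoch time T while shrinking the power P; K tells whether the trade-off pays off overall.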
TABLE III
SUMMARY OF SERVER POWER CONSUMPTION MODELING APPROACHES.

Work(s)                      | Characteristics                                                                  | Limitations
-----------------------------|----------------------------------------------------------------------------------|------------------------------------------------------------
[85][86][87][92][95][96][98] | Component wise breakdown of server power.                                        | Depends on multiple assumptions.
[90]                         | Component wise breakdown of server power considering temporal features.          | Needs accurate techniques to measure the time averages.
[106]                        | Component wise breakdown of server power considering the power consumption of each VM run by the server. | Needs to measure the energy usage of each subcomponent; also has to calculate the values of α, β, γ, and e.
[75][99][100][105]           | Breakdown of server power considering the state of operation.                    | Needs accurate techniques to measure the time averages.
[6][111][112]                | Widely used linear power model.                                                  | Depends on multiple assumptions.
[73][75][109]                | Non-linear; the value of r needs to be known in advance.                         | Depends on multiple assumptions.
[2][117]                     | Non-linear power model based on mathematical integration; addresses temporal aspects of the processor power. | Depends on multiple assumptions.
[120]                        | Non-linear power model; addresses temporal aspects of the processor power.       | α_x and β_x parameters need to be known beforehand.
[113][132]                   | Non-linear; based on queuing theory; request arrival rate and service time are considered. | Request arrival rates and service times need to be known beforehand.
[118]                        | Non-linear; based on the level of system CPU utilization.                        | Depends on multiple assumptions.
[119]                        | Relates CPU utilization with system power consumption.                           | Based on multiple assumptions.
[127]                        | Regression based model; considers both system wide and per process variables.    | The α and β variables need to be known beforehand.
[129]                        | Non-linear; considers the cooling power consumption of a server.                 | Depends on multiple assumptions.
[133][134][135]              | Non-linear; defines a metric called the system energy ratio.                     | Depends on multiple assumptions.
specific details of the CPU micro-architecture and achieve high accuracy in terms of CPU power consumption modeling [69][141]. This section first provides a general overview of processor power modeling. Next, it delves into the specific details of modeling the power consumption of the three major categories of processors used in current data center systems: single-core CPUs, multicore CPUs, and GPUs.

A. Processor Power Modeling Approaches

Similar to system power consumption, processor power consumption can be modeled at a very high level as static and dynamic power consumption. When analyzing the ratio between the static power and the dynamic power, it has been observed that processors presenting a low or very low activity will present a very large static power compared to dynamic power. The importance of the leakage and static power increases in very deep submicron technologies. Hence static power has become an important concern in recent times. A real world example of this phenomenon can be observed in the power breakdown of the single-core and multicore processors shown in Figure 10 [142][143]. While the two charts are based on two types of categorizations, it can be clearly observed that the multicore processor has a significant amount of power leakage. Another observation to note is that processor caches contribute a significant percentage of processor power consumption. While not shown in Figure 10, the caches in the IBM POWER7 processor consume around 40% of the processor's power [144].

Two high level approaches for generating power models for processors are circuit-level modeling (described in Section IV) and statistical modeling (related techniques are described in Sections IX and X). Circuit-level modeling is a more accurate but computationally expensive approach (for example, it is used in the Wattch CPU power modeling framework [81]). Statistical modeling on the other hand has a high up-front cost while the model is being trained, but this technique is much faster to use [145].

Fig. 10. Power breakdown of (a) single-core [142] and (b) multicore processors [143]. Note that leakage power is a significant portion in the multicore processor.

Statistical approaches for processor power modeling [18][141][146] are based on data analysis conducted on the processor performance counters. Microarchitectural events for performance measurement purposes can be obtained from most modern microprocessors. Heuristics can be selected from the available performance counters to infer power relevant events, which can be fed to an analytical model to calculate the power. Multiple power modeling tools have been developed based on CPU performance counters. Virtual Energy Counters (vEC) is an example tool that provides fast estimation of the energy consumption of the main components of modern processors. Its power analysis is mainly based on the number of cache references, hits, misses, and capacitance values. vEC can address this problem but results in loss of coverage.

However, certain works have expressed the negative aspects of the use of performance counters for power modeling. Economou et al. have pointed out that performance monitoring using only counters can be quite inaccurate
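The statistical (counter-based) approach described above boils down to regressing measured power on counter readings. Here is a minimal single-counter sketch in Python; the calibration samples are fabricated, and production models use many counters with multivariate regression rather than this one-feature closed form.

```python
# Fit P ~ slope*c + intercept by ordinary least squares from fabricated
# (counter, power) calibration pairs, then predict power from new readings.
# Purely illustrative of the statistical modeling approach in the text.

def fit_linear(xs, ys):
    """Closed-form least squares for a single feature plus intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den
    return slope, mean_y - slope * mean_x

# Fabricated calibration run: a counter rate (e.g., instructions retired
# per second) against measured wall power in watts.
counters = [1e6, 2e6, 4e6, 8e6]
watts = [60.0, 65.0, 75.0, 95.0]
slope, intercept = fit_linear(counters, watts)

def predict_power(counter_rate):
    """Predict power from a new counter reading using the fitted model."""
    return slope * counter_rate + intercept
```

The intercept plays the role of the idle/leakage term that several of the models in this section carry explicitly, while the slope is the per-event energy weight.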
since most processors allow for the measurement of only a limited number of concurrent counter readings [104]. Processor counters provide no insight into the I/O system, such as disk and networking, which makes it difficult to create accurate processor power models. According to [141], performance counters can provide good power estimation results, and they have estimated power within 2% of the actual power. According to Jarus et al. and Costa et al., the average error lies in the less-than-10% range [127][147]. However, in general the availability of heuristics is limited by the types of the performance counters and the number of events that can be measured simultaneously [146].

B. Power Consumption of Single-core CPUs

While we are living in a multicore era, it is better to investigate the single-core power models first, because many of the current multicore processor power models are reincarnations of single-core power models. This subsection is organized in two main parts. In the first half we describe the additive power modeling techniques of single-core processors. In the second half we investigate the use of performance counters for power modeling of single-core processors.

Additive power modeling efforts are present in the context of single-core CPUs. One such power model was described by Shin et al. [148] as,

    P_cpu = P_d + P_s + P_0,    (44)

where P_d, P_s, and P_0 correspond to the dynamic, static, and always-on power consumption. The dynamic power is expressed using Equation (7). They mentioned that it is sufficient to include the two major consumers of leakage power in the static power model, which are the subthreshold leakage and the gate leakage power. Since the static power consumption is also dependent on the die temperature, they incorporated T_d as a term in their power model, which is expressed as,

    P_s(T_d) = V_dd (K1 T_d² e^((K2 V_dd + K3)/T_d) + K4 e^(K5 V_dd + K6)),    (45)

where the K_n are technology constants. They then expanded the right-hand side of the equation as a Taylor series and retained its linear terms as,

    P_s(T_d) = Σ_{n=0}^{∞} (1/n!) (d^n P_s(T_r)/dT_d^n) (T_d − T_r)^n
             ≈ P_s(T_r) + (dP_s(T_r)/dT_d)(T_d − T_r),    (46)

where T_r is a reference temperature, which is generally some average value within the operational temperature range.

Additive power models for single core CPUs can be created by aggregating the power consumption of each of the architectural power components which comprise the CPU. Following this approach, Bertran et al. expressed the total power consumption of a single core CPU [149] as,

    P_total = Σ_{i=1}^{n} (A_i × P_i) + P_static,    (47)

where the power weight of component i is represented as P_i and the activity ratio of component i is represented as A_i, while there are in total n components in the CPU. The dynamic power consumption of component i is represented by A_i × P_i, while P_static represents the overall static power consumption of all components. In their case study they used an Intel Core 2 Duo processor and they identified more than 25 microarchitectural components. For example, in their modeling approach they divided the whole memory subsystem into three power components: the L1 and L2 caches and the main memory (MEM) (which includes the front side bus (FSB)). They also defined INT, FP, and SIMD power components which are related to the out-of-order engine of the Core 2 Duo processor. Note that the definition of a power component has been a challenge faced by them, since certain microarchitectural components are tightly related to each other. Furthermore, there are certain microarchitectural components that do not expose any means of tracing their activity levels.

As mentioned in the previous subsection, complete system power consumption can be measured online through microprocessor performance counters [18][150]. Counter-based power models have attracted a lot of attention because they have become a quick approach to knowing the details of power consumption [151]. Bircher et al. showed that well known performance related events within a microprocessor (e.g., cache misses, DMA transactions, etc.) are highly correlated to power consumption happening outside the microprocessor [152]. Certain studies have used the local events generated within each subsystem to represent power consumption. Through such performance counter based energy consumption prediction, software developers are able to optimize the power behavior of an application [153].

In one of the earliest notable efforts in this area, Isci et al. described a technique for a coordinated measurement approach that combines real total power measurement with performance-counter-based, per-unit power estimation [69]. They provided details on gathering live, per-unit power estimates based on hardware performance counters. Their study was developed around strictly co-located physical components identifiable in a die photo. They selected 22 physical components and used the component access rates to weight the component power numbers. If each hardware component is represented as C_i, the power consumption of a component P(C_i) can be represented as,

    P(C_i) = A(C_i) S(C_i) M(C_i) + N(C_i),    (48)

where A(C_i) corresponds to the access counts for component C_i. M(C_i) is the maximum amount of power dissipated by each component. The M(C_i) value is estimated by multiplying the maximum power dissipation of the die by
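Equation (47) is an additive, activity-weighted model. The sketch below is a hedged illustration of that structure only: the component names, power weights, and activity ratios are invented, not Bertran et al.'s fitted values for the Core 2 Duo.

```python
def cpu_power(activity_ratios, power_weights, p_static):
    """Equation (47): P_total = sum_i A_i * P_i + P_static, where A_i is the
    activity ratio of microarchitectural component i (derived from
    performance counters) and P_i is its fitted power weight."""
    return sum(activity_ratios[c] * power_weights[c]
               for c in activity_ratios) + p_static

# Invented example: five power components over one profiling interval.
weights = {"fetch": 3.0, "int": 6.0, "l1": 2.5, "l2": 4.0, "mem": 1.5}
activity = {"fetch": 0.8, "int": 0.6, "l1": 0.7, "l2": 0.2, "mem": 0.1}
p_total = cpu_power(activity, weights, p_static=10.0)
```

The practical difficulty the text mentions shows up here directly: each key of the dictionaries must map to a traceable counter, which some microarchitectural components do not expose.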

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 17

the fraction of area occupied by the component. S(Ci ) 5) number of branch instructions retired per second (θ).
corresponds to a scaling factor introduced to scale the After assuming each access to system components such as
documented maximum processor power by the component L1, L2 caches consumes a fixed amount of energy, a power
area ratios. N (Ci ) corresponds to fixed power dissipation model for CPU can be created as follows,
made by each component. The total power of the processor
(Ptotal ) is calculated as,
P = b0 + b1 α + b2 β + b3 γ + b4 η + b5 θ + b6 f 1.5 , (52)
22
X
Ptotal = P (Ci ) + Pidle . (49) where f is the CPU frequency of which the exponent
i=1
1.5 was determined empirically. bi , i = 0, ..., 6 are task-
where the total power Ptotal is expressed as the sum of specific constants that can be determined during pre-
the power consumption of the 22 components and the idle characterization. b0 represents the system idle and leakage
power (Pidle ). Note that in Equation (48) M (Ci ) and S(Ci ) power.
terms are heuristically determined, M (Ci ) is empirically Merkel et al. [158] constructed a power model for
determined by running several training benchmarks that processors based on events happening in the processor.
stress fewer architectural components and access rates are They assumed that processor consumes a certain fixed
extracted from performance counters [154]. amount of energy for each activity and assign a weight to
Processor cache hit/miss rates can be used to construct each event counter that represents the amount of energy the
simple power model. In such power model described by processor consumes while performing the activities related
Shiue et al., the processor energy consumption is denoted to that specific activity. Next, they estimated the energy
as [155], consumption of the processor as a whole by choosing a
set of n events that can be counted at the same time, and
E = Rhit Ehit + Rmiss Emiss , (50) by weighting each event with its corresponding amount of
where Ehit is the sum of energy in the decoder and the energy αi . Therefore, they determine the amount of energy
energy in the cell arrays, while Emiss is the sum of the Ehit the processor consumes during a particular period of time
and the energy required to access data in main memory. by counting the number of events that occur during that
In another similar work Contreras et al. used instruction period of time as follows,
cache misses and data dependency delay cycles in the Intel n
XScale processor for power consumption estimation [156].
X
E= αi ci . (53)
Assuming a linear correlation between performance counter i−1
values and power consumption they use the following Another performance counter based power model for
model to predict the CPU power consumption (Pcpu ), CPUs was introduced by Roy et al. They described the
Pcpu = A1 (Bf m ) + A2 (Bdd ) + A3 (Bdtlb ) computational energy (Ecpu (A)) consumed by a CPU for
(51) an algorithm A as follows [85],
+ A4 (Bitlb ) + A5 (Bie ) + Kcpu ,
where A1 ,...,A5 are linear parameters (i.e., power weights)
Ecpu (A) = Pclk T (A) + Pw W (A), (54)
and Kcpu is constant representing idle processor power
consumption. The performance counter values of instruc- where Pclk is the leakage power drawn by the processor
tion fetch miss, number of data dependencies, data TLB clock, W (A) is the total time taken by the non-I/O oper-
misses, instructions TLB misses, number of instructions ations performed by the algorithm, T (A) is the total time
executed (i.e., InstExec) are denoted by Bf m , Bdd , Bdtlb , taken by the algorithm, and Pw is used to capture the power
Bitlb , and Bie respectively. However, they mentioned that consumption per operation for the server. Note that the
in reality non-linear relationships exist. term “operation” in their model simply corresponds to an
While modern CPUs offer number of different perfor- operation performed by the processor.
mance counters which could be used for power model-
It can be observed that the complexity of the performance
ing purposes its better to identify some key subset of
counter based power consumption models of single core
performance counters that can better represent the power
processors have increased considerably over a decade’s time
consumption of a CPU. One work in this line is done by
(from year 2003 to 2013). Furthermore, it can be observed
Chen et al. where they found that five performance counters
that a number of performance counter based power models
are sufficient to permit accurate estimation of CPU power
have appeared in recent times.
consumption after conducting experiments with different
combinations of hardware performance counters [157]. The
five performance counters are, C. Power Consumption of Multicore CPUs
1) number of L1 data cache references per second (α), Most of the current data center system servers are
2) number of L2 data cache references per second (β), equipped with multicore CPUs. Since the use of different
3) number of L2 data cache references per second (γ), cores by different threads may create varied levels of
4) number of floating point instructions executed per CPU resource consumption, it is important to model the
second (η), and energy consumption of a multicore CPU at the CPU core
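The single-core counter-based models above (Equations (51)-(53)) share one linear form: a weighted sum of event rates plus an idle constant. A minimal sketch of that general form follows; every counter name, event rate, and weight in it is a hypothetical placeholder, not a value from the cited papers:

```python
# Linear performance-counter power model, P = sum_i(A_i * B_i) + K_idle,
# the general shape of Equations (51)-(53). All numbers are hypothetical.

def counter_power(counters, weights, idle_power):
    """Estimate power (W) as a weighted sum of event rates plus idle power."""
    assert counters.keys() == weights.keys()
    return idle_power + sum(weights[k] * counters[k] for k in counters)

# Hypothetical per-second event rates read from hardware counters.
counters = {"fetch_miss": 2.0e6, "data_dep": 5.0e6, "dtlb_miss": 1.0e5,
            "itlb_miss": 4.0e4, "inst_exec": 9.0e8}
# Hypothetical energy weights (W per event/s); in practice these would be
# fitted offline, e.g. by regressing measured power against counter readings.
weights = {"fetch_miss": 1.0e-7, "data_dep": 2.0e-8, "dtlb_miss": 5.0e-7,
           "itlb_miss": 5.0e-7, "inst_exec": 1.5e-8}

p = counter_power(counters, weights, idle_power=12.0)
```

With these placeholder values the estimate comes to roughly 25.9 W; the pre-characterization step in the surveyed papers is exactly the fitting of such weights against measured power.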

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 18

level. Server's power consumption depends on the speed at which the core works. A high level architecture of a multicore processor is shown in Figure 11 [159]. It consists of several dies, with each one having several cores. Notable power consumers inside a processor include the ALU, FPU, control unit, on/off-chip caches, and buses, which are shown in Figure 11. Cores are highlighted in yellow color while dies are denoted with dashed lines. Currently there are two main approaches for modeling the power consumption of a multicore processor: queuing theory and component-wise breakdown of the power consumption. We will first describe the multicore CPU power models constructed using queuing theory. Next, we will move on to listing the power models constructed via component-wise breakdown/additive power modeling.

Fig. 11. An abstract architecture of a multicore processor [159]. All the components that lie within core-level rectangles are limited to a specific core and cannot be shared with other cores.

The work by Li et al. is an example of the use of the first approach (queuing theory) for modeling the power consumption of multicore processors. They treated a multicore CPU as an M/M/m queuing system with multiple servers [160]. They considered two types of core speed models, called the idle-speed model (a core runs at zero speed when there is no task to perform) and the constant-speed model (all cores run at the speed s even if there is no task to perform) [161]. Such a constant speed model has been assumed in several other works such as [149], since it helps to reduce the complexity of the power model. When the constant speed model is employed for modeling the power consumption of a multicore processor, the processor's total energy consumption can be expressed as [159],

P_n = Σ_{j=1}^{n} P_c(j),    (55)

where P_n denotes the power consumption of n cores and P_c(j) corresponds to the power dissipation of a core j. The power consumption of a single core, P_c(j), can be further described using a power model such as the ones described in the previous section (Section VI-B).

One of the simplest forms of denoting the maximum energy consumption of a core (E_max) is as follows,

E_max = D_max + L_core,    (56)

where D_max is the maximum dynamic energy consumption of the core and L_core is the leakage energy of the core, which can be obtained by measuring the core power when the core is in halt mode [162].

However, Li et al. improve over such a basic core power model by considering different levels of core speed. In the idle speed model, the amount of power consumed by a core in one unit of time is denoted as,

P_core = ρ s^α = (λ/m) R s^(α−1),    (57)

where the power allocated for a processor running at speed s is s^α, λ is the task arrival rate, m is the number of cores, R is the average number of instructions to be executed, and ρ is the core utilization. The power consumed (P) by server S can be denoted by,

P = m ρ s^α = λ R s^(α−1),    (58)

where mρ = λx̄ represents the average number of busy cores in S. Since a processor core consumes some power P* even when it is idling, the above equation is updated as follows,

P = m(ρ s^α + P*) = λ R s^(α−1) + m P*.    (59)

In the constant speed model the parameter ρ becomes 1 because all cores run at the speed s even if there is no task to run. Hence, the power consumption in the constant speed model can be denoted as,

P = m(s^α + P*).    (60)

However, in the constant speed model the CPU speed (i.e., the execution strategy) is kept constant. In general the idle-speed model is difficult to implement compared to the constant speed model because most of the current processors do not support running different cores at different speeds.

While the CPU core can be treated as the smallest power consumer in a multicore CPU, another approach for accounting for the power consumption of a multicore CPU is to model it as a summation of the power consumed by its threads. Although it cannot be counted as a queuing theory based power model, Shi et al. described the amount of power consumed by a workload W in a multicore computer system as [163],

P = (P_idle + C P_t) T,    (61)

where P_idle is the idle power consumption, C is the concurrency level of the workload, and P_t is the average power dissipation of a thread. T is the total time taken to complete the workload. The term C P_t accounts for the total dynamic power dissipation by all the threads executed by the multicore processor.

The second approach for multicore CPU power consumption modeling is component-wise breakdown (i.e., additive modeling), which dives into the lower level details of the processor. Note that the rest of the power models described in this


subsection are ordered based on the complexity of the power model. One notable yet simple power model in this category was introduced by Basmadjian et al. as part of a generic methodology to devise power consumption estimation models for multicore processors [159]. They stated that the previous methods for modeling the power consumption of multicore processors are based on the assumption that the power consumption of multiple cores performing parallel computations is equal to the sum of the power of each of those active cores. However, they conjectured that such an assumption leads to a lack of accuracy when applied to more recent processors such as quad-cores. They also took into consideration parameters such as power saving mechanisms and resource sharing when estimating the power consumption of multicore processors. This approach had an accuracy within a maximum error of 5%. Their power model can be denoted as,

P_proc = P_mc + P_dies + P_intd,    (62)

where P_mc, P_dies, and P_intd denote the power consumption of the chip-level mandatory components, the constituent dies, and the inter-die communication. They also described power models for each of these power components. It can be observed that the power consumed by the chip-level mandatory components and the inter-die communication is modeled by using/extending Equation (7), which indicates that even higher level system components may be modeled by adapting such lower level power models.

Another method for component-wise breakdown is to divide the power consumption into dynamic and static power, where the dynamic power is due to the power dissipated by the cores, the on-chip caches, and the memory controller (i.e., memory access). The total CPU power can then be modeled as [164],

P_proc = P_core + Σ_{i=1}^{3} g_i L_i + g_m M + P_base,    (63)

where P_base is the base/static package power consumption. The component Σ_{i=1}^{3} g_i L_i + g_m M represents the power consumption due to cache and memory access. Here, L_i is the number of accesses per second to the level i cache and g_i is the cost of a level i cache access. P_core can often be represented as P_core = Cf^3 + Df, where C and D are some constants [164]. Note that this is analogous to the server power model described in Equation (20). Furthermore, because the cache and memory access rate is proportional to the CPU frequency, the above power model can be further simplified as,

P_proc = F(f) = af^3 + bf + c,    (64)

where a, b, and c are constants. Constants a and b are application dependent because cache and memory behavior can differ across applications. The term bf corresponds to the cores' leakage power and the power consumption of the cache and memory controller, af^3 represents the dynamic power of the cores, while c = P_base is the base CPU power.

In a slightly more complicated power model, Jiménez et al. characterized the thermal behavior and power usage of an IBM POWER6TM-based system [165]. This power model consists of several components, which can be shown as,

P = N_actc P_actc + αK + βL + γM + σN,    (65)

where N_actc is the number of active cores, and the incremental power consumption for a core that exits its sleep state is represented by P_actc. The power consumption due to activity in the chip is modeled by using the IPC (Instructions Per Cycle), represented as K, and the amount of L1 load misses per cycle (L1LDMPC), represented as L. Similarly, the memory system contribution to the power consumption is modeled by using the number of L2 misses per cycle (L2LDMPC and L2STMPC), which are represented by M and N respectively. α, β, γ, and σ are regression parameters.

In another work, Bertran et al. developed a bottom-up power model for a CPU [166]. In a bottom-up processor power model the overall power consumption of the processor is represented as the sum of the power consumption of different power components. The power components are generally associated with micro-architecture components, which allows the users to derive the power breakdown across the components. They modeled the power consumption at each component level to obtain the final bottom-up power model. Their model is defined as,

P_cpu = Σ_{k=1}^{N} P_dyn_k + Σ_{k=1}^{M} S Q_k + R M + P_uncore,    (66)

which comprises the power consumption of each hardware thread enabled on the platform, the SMT effect (SMT_effect, denoted by S) of the cores with SMT enabled (SMT_enabled_k, denoted by Q_k), the CMP effect as a function of the number of cores enabled (CMP_effect, denoted by R), and the uncore power consumption (P_uncore). The total number of threads is denoted by N while the total number of cores is denoted by M. They used system performance counters to calculate parameters such as S in their power model. Furthermore, Bertran et al. presented Performance Monitoring Counter (PMC) based power models for power consumption estimation on multicore architectures [167]. Their power modeling technique provides per-component level power consumption. In another work, Bertran et al. did a comparison of various existing modeling methodologies in top-down and bottom-up fashion [168].

In another such example, Bertran et al. used an accumulative approach for modeling multicore CPU power consumption, assuming each core behaves equally [149]. Since all the cores behave equally, the single core model described in Equation (47) can be extended as follows for the multicore model,

P_total = (Σ_{j=1}^{ncore} Σ_{i=1}^{m} A_ij P_i) + P_static,    (67)
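The accumulative form of Equation (67), a sum of activity-weighted component powers over all cores plus a static term, can be sketched as follows; the activity ratios, component powers, and static power below are hypothetical placeholders rather than values from [149]:

```python
# Accumulative multicore model, P_total = sum_j sum_i A[j][i] * P[i] + P_static,
# following the shape of Equation (67). All numbers are hypothetical.

def multicore_power(activity, component_power, p_static):
    """activity[j][i]: activity ratio (0..1) of component i on core j."""
    return p_static + sum(a_ji * p_i
                          for core in activity
                          for a_ji, p_i in zip(core, component_power))

component_power = [6.0, 3.0, 1.5]   # e.g. ALU, cache, bus peak power (W)
activity = [[0.8, 0.5, 0.2],        # core 0
            [0.4, 0.9, 0.1]]        # core 1
p = multicore_power(activity, component_power, p_static=10.0)
```

Per-core accounting here is just a distinct activity row A[j] per core; the model collapses to the single-core form of Equation (47) when only one row is present.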


where P_i of each component is the same as in the single core CPU power model described previously (in Equation (47)). However, the A_ij need to be modified to perform per-core accounting.

Similarly, a core level power model can be implemented without considering the power consumption of individual components. For example, Fu et al. described the processor power consumption P(k) of a multicore processor as [169],

P(k) = P_s + Σ_{i=1}^{n} x_i(k)[P_ind^i + P_d^i(k)],    (68)

where k is a control period and P_s represents the static power of all power consuming components (except the cores). x_i represents the state of core C_i: if core i is active, x_i = 1, otherwise x_i = 0. The active power, which is dependent on the frequency of the core, is represented as P_d^i(k) = α_i f_i(k)^{β_i}, where both α_i and β_i are system dependent parameters. This definition of the active power P_d^i(k) can be linked with the dynamic power consumption of a digital circuit described in Equation (7), where α_i f_i(k)^{β_i} corresponds to the ACV^2 term of Equation (7).

A more detailed power model compared to the two works mentioned above (in Equations (67) and (68)) was described by Qi et al. [170], and is shown in Equation (69). Their power model was for multicore BP-CMP (block-partitioned chip-multiprocessor) based computing systems, and it had an additional level of abstraction made at the block level compared to the previously described multicore power models. In their work they considered a CMP with 2k processing cores, where k ≥ 1, and assumed the cores are homogeneous. Considering the fact that it is possible to provide different supply voltages to different regions on a chip using the voltage island technique, the cores on one chip were partitioned into blocks. In this model (which is shown below),

P = P_s + Σ_{i=1}^{nb} x_i (P_ind^i + Σ_{j=1}^{nc_i} y_{i,j} P_d^{i,j}),    (69)

P_s denotes the static power from all the power components, while x_i represents the state of the block B_i: x_i = 1 if any core on the block is active and the block is on; otherwise x_i = 0 and B_i is switched off. P_ind^i is the static power of core i and it does not depend on the supply voltage or frequency. y_{i,j} represents the state of the j'th core on block B_i. The frequency dependent active power for the core is defined as P_d^{i,j} = C_ef f_i^m, where both C_ef and m are system dependent parameters; all the cores on block B_i run at the same frequency f_i. However, it does not depend on any application inputs. Note that the three power models (in Equations (67), (68), and (69)) described in this subsection share many similarities.

In another processor energy consumption modeling work, which cannot be attributed to either the queuing theory based or the component-wise breakdown models, Shao et al. developed an instruction-level energy model for the Intel Xeon Phi processor, the first commercial many-core/multi-thread x86-based processor [171]. They developed the instruction-level energy model for Xeon Phi through the results obtained from an energy per instruction (E_pi) characterization made on Xeon Phi. Their model is expressed as,

E_pi = (p_1 − p_0)(c_1 − c_0) / (f N),    (70)

where N is the total number of dynamic instructions in the microbenchmark used during the energy consumption characterization. The power consumed by the microbenchmark is calculated by subtracting the initial idle power (p_0) from the average dynamic power (p_1); the initial idle power includes power for the fan, memory, operating system, and leakage. The values c_0 and c_1 correspond to the cycle before the microbenchmark starts and the cycle right after the microbenchmark finishes respectively. Therefore, (c_1 − c_0) corresponds to the total number of cycles executed by the microbenchmark, and f corresponds to the frequency at which the dynamic power is sampled.

D. Power Consumption of GPUs

Graphical Processing Units (GPUs) are becoming a common component in data centers because many modern servers are equipped with General Purpose Graphical Processing Units (GPGPUs) to allow running massive hardware-supported concurrent systems [172]. Despite the importance of GPUs to data center operations, the energy consumption measurement, modeling, and prediction research on GPUs is still in its infancy. In this section we structure the GPU power models as performance counter based power models and as additive power models. Furthermore, we categorize the additive power models as pure GPU based models and as GPU-CPU power models.

The first category of GPU power models described in this paper are the performance counter based power models. One of the examples is the work by Song et al., which combined hardware performance counter data with machine learning and advanced analytics to model power-performance efficiency for modern GPU-based computers [59]. Their approach is a performance counter based technique which does not require a detailed understanding of the underlying system. They pointed out deficiencies of regression based models [128], such as Multiple Linear Regression (MLR) for GPU power modeling [57], among them a lack of flexibility and adaptivity.

In another work on the use of GPU performance counters, sophisticated tree-based random forest methods were employed by Chen et al. to correlate and predict the power consumption using a set of performance variables [173][174]. They showed that the statistical model predicts power consumption more accurately than the contemporary regression based techniques.

The second category of GPU power models are the additive power models, and the first sub category is the pure GPU based additive power models. Multiple different software frameworks that model CPU/GPU power currently exist which can be categorized as additive power models. In their work on GPUWattch, Leng et al. described a GPU


Fig. 12. GPUWattch methodology to build power models [175]. The process shown in this figure iteratively identifies and refines the inaccuracies in the power model.

Fig. 13. Block diagram of the McPAT framework [143]. The McPAT framework uses an XML-based interface with the performance simulator. McPAT is the first integrated power, area, and timing modeling framework for multithreaded and multicore/manycore processors [143].

power model that captures all aspects of GPU power at a very high level [175]. GPUWattch can be considered as an extension of the Wattch CPU power model into the domain of GPUs. Figure 12 shows an abstract view of the process followed in GPUWattch for building robust power models. In this process a bottom-up methodology is followed to build an initial model. The simulated model is compared with the measured hardware power to identify any modeling inaccuracies by using a special suite of 80 microbenchmarks that are designed to create a system of linear equations that correspond to the total power consumption. Next, they progressively eliminate the inaccuracies by solving for the unknowns in the system. The power model built using this approach achieved an average accuracy within 9.9% error of the measured results for the GTX 480 GPU [175]. In a different work, Leng et al. have also reported an almost similar (10%) modeling error for GPUWattch [176]. The power model shown below is a very high level representation of the GPU power they model, which consists of the leakage power (P_leakage), the idle SM power (P_idlesm), and the dynamic power of all N components,

P = Σ_{i=1}^{N} (α_i P_max_i) + P_idlesm + P_leakage,    (71)

where the dynamic power of each component is calculated as its activity factor (α_i) multiplied by the component's peak power P_max_i. McPAT is another framework, similar to GPUWattch, that models the power consumption of processors [143][177]. In McPAT the total GPU power consumption is expressed as a summation of the power consumed by the different components of a GPU as [177],

P = Σ P_component = P_fpu + P_alu + P_constmem + P_others,    (72)

where P_fpu, P_alu, and P_constmem correspond to the power dissipated by the floating point unit (FPU), the arithmetic and logic unit (ALU), and the constant memory.

Similar to the digital circuit level power breakdown shown in Equation (4), Kasichayanula et al. divided the total power consumption of the GPU Streaming Multiprocessors (SMs) into two parts called idle power and runtime power [178]. They modeled the runtime power (P) of the GPU as,

P = Σ_{i=1}^{e} (N_sm P_u,i U_u,i) + B_u,i U_u,i,    (73)

where N_sm is the number of components, P_u,i is the power consumption of the active component, e is the number of architectural component types, B_u,i is the base power of the component, and U_u,i is the utilization. The block diagram of the McPAT framework is shown in Figure 13.

Fig. 14. Power breakdown for NVIDIA and AMD GPUs [179]. The off-chip DRAM accesses consume a significant portion of the total system power. (Breakdown shown in the figure: NVIDIA Quadro 6000 at 50% memory bandwidth utilization: cores 44.0%, memory controllers (MCs) 22.3%, DRAMs 33.7%; at 10%: cores 54.4%, MCs 21.4%, DRAMs 24.2%. AMD Radeon HD 7970 at 50%: cores 46.4%, MCs 22.0%, DRAMs 31.6%; at 10%: cores 57.3%, MCs 21.4%, DRAMs 21.3%.)

It should be noted that the power consumption of GPUs changes considerably with the memory bandwidth utilization; see Figure 14 for an example [179]. If the memory power can be reduced by half, it will lead to a 12.5% saving of the system power. From Figure 14 it can be observed that the GPU cores (functional units) dominate the power consumption of the GPUs. Furthermore, the off-chip memory consumes a significant portion of the power in a GPU.

In another example of additive power models, Hong et al. described modeling the power consumption of GPUs [154]. They represented the GPU power consumption (P_gpu)


TABLE IV
SUMMARY OF PROCESSOR POWER CONSUMPTION MODELING APPROACHES.

Work(s) | Type | Characteristics | Limitations
[148][149] | single core/multicore | Non-linear. Component-wise power breakdown (i.e., additive power models). | Each CPU subcomponent's power usage must be known beforehand.
[69][85][156][158] | single core | Performance counter based. | Choosing the proper performance counter(s) is a challenge, while certain performance counters may not be available across different processors.
[160][161] | multicore | Queuing theory based power models. | Power consumption of each core must be known beforehand.
[165] | multicore | Specific to IBM POWER6TM. | Need to know the α, β, and σ variables beforehand.
[149][166] | multicore | Considers power consumption of each core and their subcomponents. | Need to know the subcomponents' power consumption beforehand.
[166][169][170] | multicore | Non-linear. Accounts for each core. | Power consumption of each core must be known beforehand.
[171] | multicore | Instruction-level energy consumption model made for Intel Xeon Phi. | Need to know the initial idle power beforehand.
[59][173][174][175] | GPU | Performance counter based power consumption models. | Based on multiple assumptions.
[154][178][179] | GPU | Component-wise breakdown of GPU power (i.e., additive power models). | Power consumption of each subcomponent must be known beforehand.
[178] | GPU | Non-linear. It is almost similar to a multicore power consumption model. | Power consumption of each GPU core must be known beforehand.
[180][181] | CPU-GPU (and other peripherals) | Non-linear power model. | Based on multiple assumptions.
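Most of the additive GPU models summarized in Table IV reduce to "base power plus utilization-scaled active power" per architectural component type, as in Equation (73). A minimal sketch of that form follows; the component names and all constants are hypothetical, not measurements from the cited works:

```python
# Utilization-weighted additive GPU model in the spirit of Equation (73):
# each component type contributes SM-scaled active power plus base power,
# both weighted by its utilization. All names and numbers are hypothetical.

def gpu_runtime_power(components, n_sm):
    """components: list of (active_power_W, base_power_W, utilization 0..1)."""
    return sum(n_sm * p_active * u + p_base * u
               for (p_active, p_base, u) in components)

components = [(0.9, 0.2, 0.75),   # e.g. FPU
              (0.6, 0.1, 0.50),   # e.g. ALU
              (1.2, 0.3, 0.40)]   # e.g. shared memory
p = gpu_runtime_power(components, n_sm=14)
```

The common limitation noted in the table shows up directly here: the per-component active and base powers must be known beforehand, typically from a microbenchmark characterization.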

as a sum of runtime power (RuntimePower) and idle power (IdlePower). The runtime power is further divided among the power consumption of the Streaming Multiprocessors (SMs) and the memory. To model the runtime power of the SMs, they decomposed the SM into several physical components and accounted for the power consumption of each component. For example, RP_Const_SM is a constant runtime power component. Therefore, the complete GPU power model described by them can be denoted as,

P_gpu = N_sms Σ_{i=0}^{n} S_i + P_mem + P_idle,
Σ_{i=0}^{n} S_i = P_int + P_fp + P_sfu + P_alu + P_texture + P_cc + P_shared + P_reg + P_fds + P_csm,    (74)

where N_sms represents the number of streaming multiprocessors and a streaming component i is represented by S_i. The runtime power consumption of the memory is represented by P_mem while the idle power is represented by P_idle. The terms P_int, P_fp, P_sfu, P_alu, P_texture, P_cc, P_shared, P_reg, and P_fds correspond to the integer arithmetic unit, floating point unit, SFU, ALU, texture cache, constant cache, shared memory, register file, and FDS components of the GPU. P_csm is a constant runtime power component for each active streaming multiprocessor.

The second sub category of the additive GPU power models comprises the models that combine both the GPU and the external (CPU) power consumption aspects. Certain works which model GPU power consumption as part of a full system power model currently exist. One such example is the power model described by Ren et al., considering the CPU, GPU, and main board components [180]. Their model can be represented as,

P_system(w) = Σ_{i=1}^{n} P_gpu^i(w_i) + Σ_{j=1}^{m} P_cpu^j(w_j) + P_mainboard(w),    (75)

where P_system, P_gpu, P_cpu, and P_mainboard represent the power of the overall system, GPU, CPU, and main board respectively. The numbers of GPUs and CPUs, which take part in the computing workload w, are represented by n and m. The workloads assigned to GPU_i and CPU_j are represented by w_i and w_j respectively. In a similar work [182] that expresses the power consumption of a GPU in the context of its corresponding CPU, the energy consumption of the GPU is expressed as,

E_gpu = t_gpu (P_avg_gpu + P_idle_cpu) + E_transfer,    (76)

where t_gpu, P_avg_gpu, P_idle_cpu, and E_transfer represent the time spent on the GPU, the average power consumption of the GPU, the idle power consumption of the CPU, and the energy consumed for transfers between the CPU and the GPU respectively.

In a similar line of research, Marowka et al. presented analytical models to analyze the different performance gains and energy consumption of various architectural design choices for hybrid CPU-GPU chips [181]. For such an asymmetric CPU-GPU execution scenario, they assumed that a program's execution time is composed of a time fraction where the program runs sequentially (1 − f), a time fraction of the program's parallel execution time where the program runs in parallel on the CPU cores (α), and a time fraction of the program's parallel execution time where the program runs in parallel on the GPU cores (1 − α). They also assumed that one core is active during the sequential computation and consumes a power of 1, while the remaining (c − 1) idle CPU cores consume (c − 1)k_c


and g idle GPU-cores consume gwg kg. Therefore, during the parallel computation on the CPU-cores only, the c CPU-cores consume c power and the g idle GPU-cores consume gwg kg power. During the parallel computation on the GPU-cores only, the g GPU-cores consume gwg power and the c idle CPU-cores consume ckc power. In such a scenario, the power consumption during the sequential, CPU, and GPU processing phases can be represented as,

Ps = (1 − f){1 + (c − 1)kc + gwg kg},

Pc = (αf/c){c + gwg kg},

Pg = ((1 − α)f/(gβ)){gwg + ckc}.    (77)

This requires time (1 − f) to perform the sequential computation, and times (1 − α)f/(gβ) and αf/c to perform the parallel computations on the GPU and CPU respectively. Note that α is the time fraction of the program's parallel execution time where the program runs in parallel on the CPU cores, and (1 − α) is the time fraction where it runs in parallel on the GPU cores. Therefore, they represented the average power consumption Wa of an asymmetric processor as,

Wa = (Ps + Pc + Pg) / ((1 − f) + αf/c + (1 − α)f/(gβ)).    (78)

We list a summary of the processor power consumption modeling approaches in Table IV.

VII. MEMORY AND STORAGE POWER MODELS

Traditionally, processors have been regarded as the main contributors to server power consumption. However, in recent times the contribution made by memory and secondary storage to data center power consumption has increased considerably. In this section we first investigate memory power consumption and then move on to describing the power consumption of secondary storage such as hard disk drives (HDD) and flash based storage (SSD).

A. Memory Power Models

The second largest power consumer in a server is its memory [139]. Even in large petascale systems, main memory consumes about 30% of the total power [183]. IT equipment such as servers comprises a memory hierarchy. The rapid increase of DRAM capacity and bandwidth has led the DRAM memory subsystem to consume a significant portion of the total system power [184]. DDR3 and FB-DIMM (Fully Buffered DIMM) dual in-line memory modules (DIMMs) typically consume from 5 W up to 21 W per DIMM [139]. In current generation server systems, DRAM power consumption can be comparable to that of processors [184].

Figure 15 shows the organization of a typical DRAM [185]. Therefore, the memory subsystem's power consumption should be modeled considering the different components that make up the memory hierarchy. For example, Deng et al. divided the power consumption of the memory subsystem into three categories: DRAM, register/PLL (phase locked loop), and memory controller power [133]. However, most of the current studies on memory subsystem power consumption are based on DRAM (Dynamic Random-Access Memory) power consumption. Commodity DRAM devices have recently begun to address power concerns as low power DRAM devices.

Fig. 15. Typical 16 Megabit DRAM (4M × 4) [185]. Note that the DRAM includes refresh circuitry. Periodic refreshing requires disabling access to the DRAM while the data cells are refreshed.

In the rest of this subsection we categorize the memory power models as additive power models and performance counter based power models. Furthermore, we organize the content from the least complex power models to the most complex.

In the context of additive power modeling, one of the fundamental approaches is representing the power consumption as static and dynamic power. This can be observed in the context of DRAM power models as well. Lin et al. employed a simple power model in their work [186]. The model estimated the DRAM power (Pdm) at a given moment as follows,

Pdm = Pstatic dm + α1 µread + α2 µwrite,    (79)

where α1 and α2 are constants that depend on the processor. The static power consumption of the DRAM is denoted by Pstatic dm, while the read and write throughput values are denoted by µread and µwrite respectively.

In another additive power model, the overall energy usage of the memory system is modeled as [187],

E = EIcache + EDcache + EBuses + EPads + EMM,    (80)

where the energy consumed by the instruction cache and the data cache is denoted by EIcache and EDcache respectively. EBuses represents the energy consumed in the address and data buses between the Icache/Dcache and the datapath. EPads denotes the energy consumption of the I/O pads and the external buses from the caches to the main memory. EMM is
calculated based on the memory energy consumption model described in [155].

Another similar power modeling work was conducted by Ahn et al. [188]. Static power mainly comprises power consumed by peripheral circuits (i.e., DLL and I/O buffers) such as transistors, and by refresh operations. Since DRAM access is a two step process, dynamic power can be further categorized into two parts. First is the activate-precharge power that is discharged when bitlines in a bank of a DRAM chip are precharged (during this process data in a row of the bank is delivered to the bitlines and latched (activated) to sense amplifiers by row-level commands). The second type of power is read-write power, which is consumed when a part of the row is read or updated by column-level commands. The dynamic power consumption is proportional to the rate of each operation. Since a row can be read or written multiple times when it is activated, the rates of activate-precharge and read-write operations can be different. They modeled the total power consumed by a memory channel (i.e., total main memory power Pmem) as,

Pmem = DSRσ + Erw ρrw + DEap fap,    (81)

where D is the number of DRAM chips per subset, S is the number of subsets per rank, R is the number of ranks per channel, σ is the static power of the DRAM chip, Erw is the energy needed to read or write a bit, ρrw is the read-write bandwidth per memory channel (measured, not peak), Eap is the energy to activate and precharge a row in a DRAM chip, and fap is the frequency of the activate-precharge operation pairs in the memory channel.

In an expanded version of the power models shown in Equations (79) and (81), Rhu et al. [189] modeled DRAM power as,

P = Ppre stby + Pact stby + Pref + Pact pre + Prd bank + Prd io + Pwr bank + Pwr io,    (82)

where Ppre stby, Pact stby, Pref, Pact pre, Prd bank, Prd io, Pwr bank, and Pwr io represent precharge standby power, active standby power, refresh power, activation & precharge power, and read and write power categorized as power consumed by the DRAM bank and the IO pins respectively. This power model was based on the Hynix GDDR5 specification. Another similar power model based on background power and operation power was described by David et al. in [190]. They applied an extended version of this memory power model to Dynamic Voltage Frequency Scaling, where the memory power consumption at voltage v and frequency f is given by,

Pf,v = Pf − Pf Pvstep Nf,    (83)

where they conservatively assumed the total power reduction per voltage step to be 6%, thus Pvstep = 0.06.

Malladi et al. created a model to represent the expectation for memory energy E[E]. During this process they first modeled a stream of memory requests as a Poisson process, where they assumed that the times between requests follow an exponential distribution and the inter-arrival times are statistically independent [191]. When Ti is an exponentially distributed random variable for the idle time between two memory requests, the exponential distribution is parametrized by 1/Ta where Ta is the average inter-arrival time. They represented the power dissipation during powerdown and powerup as Pd and Pu respectively. If the idleness exceeds a threshold Tt the memory powerdown happens, and a latency of Tu has to be faced when powering up the memory. Powerdown is invoked with probability f = P(Ti > Tt) = e^(−Tt/Ta) where f is the powerdown fraction. In this scenario the DRAM dissipates Pd for (Ti − Tt) time while powered-down and dissipates Pu for (Tt + Tu) time while powered-up. Ti is the only random variable; E[Ti] = Ta,

E[E] = f × E[Pd(Ti − Tt) + Pu Tt + Pu Tu] + (1 − f)E[Pu Ti]
     = f[Pd(Ta − Tt) + Pu Tt + Pu Tu] + (1 − f)[Pu Ta]
     = Pd[f(Ta − Tt)] + Pu[f(Tt + Tu) + (1 − f)Ta],    (84)

where the expectation for memory energy is given by E[E].

Another power model for DRAM energy consumption was introduced by Lewis et al. in [92]. The model is based on the observation that the energy consumed by the DRAM bank is directly related to the number of DRAM read/write operations involved during the time interval of interest. The energy consumption of a DRAM module over the time interval between t1 and t2 is expressed as,

Emem = ∫[t1,t2] { (Σi=1..N Ci(t) + D(t)) Pdr + Pab } dt,    (85)

where Ci(t), i = 1, 2, ..., N is the last-level cache miss count for each of the N constituent cores of the server when executing jobs, and D(t) is the data amount due to disk access or OS support and due to performance improvement for peripheral devices. Pdr is the DRAM read/write power per unit data. Pab represents the activation power and DRAM background power. The value of Pab was calculated using the values mentioned in the DRAM documentation. In the case of the AMD Opteron™ server used by Lewis et al., the value of Pab amounted to 493 mW for one DRAM module.

All the above mentioned power models are additive power models. However, some power models are specifically developed using performance counter values (still, these models can be considered additive power models). In one such work, Contreras et al. parametrized memory power consumption (Pmem) using instruction fetch misses and data dependencies [156],

Pmem = α1(Bfm) + α2(Bdd) + Kmem.    (86)

Here, α1 and α2 are linear "weighting" parameters. An important point about this power model is that it re-uses performance events used for the CPU power model described
in Equation (51). Bfm and Bdd correspond to instruction fetch misses and the number of data dependencies respectively.

A completely different approach for modeling the memory power was followed by Roy et al. [85]. In their work on the energy consumption of an algorithm A, they modeled the memory power consumption as,

Emem(A) = Pcke T(A) + Pstby Tact(A)α(A) + Eact α(A) + (R(A) + W(A))Trdwr Prdwr,    (87)

where α(A), R(A), and W(A) represent the number of activation cycles (α and β pair), the number of reads, and the number of writes executed by A respectively. Tact(A) denotes the average time taken by one activation by A. The power model comprises three components: the first component Pcke captures the leakage power drawn when the memory is in standby mode, with none of the banks activated. The second component Pstby captures the incremental cost over and above the leakage power for banks to be activated and waiting for commands. The third component captures the incremental cost of various commands. Since α and β commands are always paired together, the energy cost of these two commands is represented as Eact. The energy usage of R and W commands is captured as Prdwr Trdwr.

B. Hard Disk Power Models

The Hard Disk Drive (HDD) is currently the main type of secondary storage media used in data center servers. An HDD contains disk platters on a rotating spindle and read-write heads floating above the platters. The disk is the subsystem that is hardest to model correctly [88]. This is because of the difficulty arising from the lack of visibility into the power states of the hard disk drive and the impact of disk hardware caches.

Fig. 16. Block diagram of a hard disk system [192]. The electromechanical subsystem consumes 50% of the total idle power with the remaining 50% dissipated in the electronics.

Two components of the power consumed by HDDs (disk drives in general) are called static power and dynamic power [193]. There are three sources of power consumption within an HDD: the Spindle Motor (SPM), the Voice Coil Motor (VCM), and the electronics [194] (see Figure 16 [192][195]). The power consumption of the electronics can be modeled by following the same techniques discussed in the sections on CPU and memory power consumption modeling. However, in the context of HDDs the electromechanical components such as the SPM account for most of the power consumption.

In this subsection we organize the HDD power models following a chronological ordering. One of the earliest works in this category is the work by Sato et al. [196]. The power consumption of the SPM can be expressed as [194][196],

Pspm ≈ n ωspm^2.8 (2r)^4.6,    (88)

where n is the number of platters of the HDD, ωspm is the angular velocity of the SPM (i.e., the RPM of the disk), and r is the radius of the platters. Since the platters are always rotating when the disk is powered, the above equation denotes the static power consumption of the disk irrespective of whether it is merely idling or actively performing I/O operations. The VCM power belongs to the dynamic portion of the HDD power consumption. VCM power consumption happens only when a disk seek needs to be performed, and only during specific phases of a seek operation.

Hylick et al. observed that the read energy consumption of multiple hard disk drives has a cubic relationship with Logical Block Number (LBN) [197]. Note that the number of LBNs in a hard disk indicates its capacity. The read energy (Er) consumption as a function of the LBN (L) can be indicated as,

Er ∝ L^3.    (89)

Furthermore, they modeled the total amount of energy consumed by a drive servicing a set of N requests (considering I time idle) comprising S seeks as [198],

Etotal = Σ(i=0..N) Eactive + Σ(i=0..S) Eseek + Σ(i=0..I) Eidle,    (90)

where Etotal is the total energy, Eseek is the seek energy, and Eidle is the idle energy in Joules.

Dempsey is a disk simulation environment which includes support for modeling disk power consumption [199]. To obtain a measure of the average power consumed by a specific disk stage S, Dempsey executes two workload traces which differ only in the amount of time spent in stage S. The average power consumption for disk stage S is then represented using the following equation,

P̄s = (E2 − E1) / (T2 − T1),    (91)

where Ei represents the total energy consumed by trace i and Ti is the total time taken by trace i. They referred to this method of estimating the average power consumption of an individual stage as the Two-Trace method.

Hibernator is a disk array energy management system developed by Zhu et al. (circa year 2005) that provides energy savings while meeting performance goals [200]. They assumed that the most recent observed average request
TABLE V
SUMMARY OF MEMORY POWER CONSUMPTION MODELING APPROACHES.

Work(s) | Type | Characteristics | Limitations
[186] | DRAM | Additive power model. Considers static, read, and write power of the entire DRAM. | α1 and α2 are dependent on the system's processor.
[188] | DRAM | Additive power models. Multiple DRAM chips. More elaborate compared to [186]. | The frequency of activate-precharge operation pairs depends on many factors.
[189] | DRAM | Additive power model which is more descriptive than [188]. Considers multiple subcomponents of DRAM power dissipation. | The values of the individual power dissipations need to be identified beforehand.
[191] | DRAM | Additive power model. Based on information theory. | Assumes that power up latency is exposed on the critical path.
[92] | DRAM | Performance counter based. Considers multiple factors such as disk accesses, peripheral devices, etc. | The traffic over HyperTransport buses affects the value of [92].
[85] | DRAM | Performance counter based. Energy consumption during an algorithm's execution. | Considers only a limited set of algorithms.
[156] | DRAM | Performance counter based. | The values of the weighting parameters α1 and α2 need to be known beforehand.
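The powerdown expectation model of Malladi et al. (Equation (84), entry [191] in the table above) lends itself to a direct numerical sketch. The following Python snippet is an illustrative implementation, not code from [191]; the function and parameter names are ours:

```python
import math

def expected_memory_energy(p_down, p_up, t_avg, t_threshold, t_wakeup):
    """Expected DRAM energy per request gap, following Equation (84).

    p_down, p_up : power while powered-down / powered-up (Pd, Pu)
    t_avg        : average inter-arrival time of memory requests (Ta)
    t_threshold  : idle threshold before powerdown is invoked (Tt)
    t_wakeup     : latency to power the memory back up (Tu)
    """
    # Powerdown probability: f = P(Ti > Tt) = e^(-Tt/Ta) for exponential Ti
    f = math.exp(-t_threshold / t_avg)
    # E[E] = Pd * f*(Ta - Tt) + Pu * (f*(Tt + Tu) + (1 - f)*Ta)
    return (p_down * f * (t_avg - t_threshold)
            + p_up * (f * (t_threshold + t_wakeup) + (1 - f) * t_avg))
```

With an aggressive threshold (Tt = 0) every gap triggers powerdown and the expression reduces to Pd Ta + Pu Tu, while a very large threshold degenerates to the always-on energy Pu Ta.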

arrival rate at disk i in the disk array is αi. For a disk that is spinning at speed j, the service time tij can be measured at run time. If the mean and the variance of the service time can be denoted as K(tij), the disk utilization ρij can be calculated as ρij = αi K(tij). If disk i needs to change its speed in the new epoch, the disk cannot service requests while it is changing its spin speed. If the length of this transition period, during which disk i does not service requests, is denoted as Ti, the total energy for the disk can be denoted as,

Eij = P′ij Tepoch ρij + P″ij (Tepoch − Tepoch ρij − Ti) + P‴ij Ti,    (92)

where Eij is the energy consumption of keeping disk i at speed j in the next epoch. P′ij, P″ij, and P‴ij correspond to the active power, the idle power at speed j, and the transition power. The active time during which disk i is serving requests is Tepoch × ρij, since the request arrival rate is independent of the disk speed transition. The remaining time is the idle time when disk i is spinning at speed j but does not service requests. Power components such as P‴ij are simple divisions of the entire power at the different states.

Bircher et al. estimated the dynamic events of the hard disk drive through events such as interrupts and DMA accesses [150]. In another power model, the energy consumed by the VCM for one seek operation can be denoted as [193][194][195],

Evcm = (n Jvcm ωvcm^2)/2 + (n bvcm ωvcm)/3,    (93)

where Jvcm is the inertia of the arm actuator, bvcm is the friction coefficient of the arm actuator, and ωvcm is the maximum angular velocity of the VCM [194]. When the average seek time tseek is expressed as tseek = 2 Davg/ωVCM, the power consumption of the VCM can be modeled as,

PVCM = EVCM / tseek,    (94)

where Davg is the average angular seek distance. The power models for the spindle motor (Equation (88)) and the voice coil motor (Equation (94)) can be combined as follows to model the entire hard disk power consumption [195],

E = PSPM t0 + PVCM t2 + Ec,    (95)

where t2 is the actual seek time while Ec corresponds to the energy consumption of the electronic part of the disk system (Ec ≈ 40% of total system idle power).

Similar to the other components of a server, HDD power consumption depends on low level issues such as the physical design of the drive and the dynamic behavior of the electromechanical components [194]. However, most power modeling techniques represent disk power consumption at a higher level of abstraction based on HDD power states. Each state corresponds to a category of high level activity and each state has an associated cost. Transitioning between different states may also incur power costs and extra latencies. The power consumption of an HDD can be shown as a state machine where the nodes correspond to the power states and the edges denote the state transitions (see Figure 17).

At least three modes of operation (i.e., power states) exist for a power-manageable disk: active, idle, and standby (see Figure 17) [201][202]. In the idle state the disk continues spinning, consuming power Pid. In the active state the disk consumes power Pact for transferring data, while in the standby state the disk is spun down to reduce its power consumption to Psb.

Fig. 17. Power state machine of a hard disk drive [202]. There are at least three power states in modern disk drives called active, idle, and standby.

In another HDD power modeling attempt, Bostoen et al. [201] modeled the energy dissipation in the idle mode (Eid) as a function of the idle time Tid (the time between two I/O requests) as follows,

Eid(Tid) = Pid Tid,    (96)
while in the standby mode, the disk energy consumption Esb can be represented as a function of the idle time Tid as,

Esb(Tid) = Psb(Tid − Tdu) + Edu = Psb Tid + E′du,    (97)

where E′du = Edu − Psb Tdu is the extra energy dissipated during disk spin-down and spin-up, taking the standby mode as a reference.

A summary of the HDD power models is shown in Table VI.

C. Solid-State Disk Power Models

Flash memory based solid state disks (also known as solid state drives) (SSD) are becoming a crucial component of modern data center memory hierarchies [203][204]. The SSD has become a strong candidate for primary storage due to its better energy efficiency and faster random access. The design of flash memories is closely related to the power budget within which they are allowed to operate. Flash used in consumer electronic devices has a significantly lower power budget compared to that of an SSD used in data centers.

In this subsection we organize the power models in chronological order. In a typical flash memory device, multiple silicon flash memory dies are packed in a 3D stack. These flash memory dies share the I/O signals of the device in a time-multiplexed way. Park et al. termed the I/O signals of the package the channel, while they termed a memory die a way. Park et al. expressed the per-way power consumption of the write operation (Ppw) [205] as,

Ppw = (Psw − Pidle) / (number of active flash dies at the plateau),    (98)

where the denominator of the right-hand side represents the number of flash dies that are turned on during the plateau period of the write operation. The power consumption of sequential write operations is denoted by Psw. In their work they measured the power during the plateau period, where all the flash dies and the controller are active. The per-way power consumption of the read operation can be calculated in the same manner.

Among the multiple different power consumption models for SSDs, Mohan et al. presented FlashPower, a detailed power model for the two most popular variants of NAND flash [206]: single-level cell (SLC) and 2-bit multilevel cell (MLC) based NAND flash memory chips. FlashPower uses analytical models to estimate NAND flash memory chip energy dissipation during the basic flash operations such as read, program, and erase, and when the chip is idle. While they modeled the flash power consumption in a very detailed manner, we highlight three of the highest level models described in their work below. They modeled the total energy dissipated per read operation for SLC flash and the fast page of MLC flash as,

Er = Esp,r + Eup,r + Ebl1,r + Ebl0,r + Esl,r + Erpre + Ess,r + Edec,r,    (99)

where Esp,r is the energy to bias the wordline of the selected page to ground, Eup,r is the energy to bias the unselected pages to Vread, Erpre is the energy to transit from the read to the precharge state, Edec,r is the energy for the decode operation estimated using CACTI [207], and Ebl1,r and Ebl0,r correspond to the energy dissipated during transitioning from logical "1" to "0" and vice versa. They used CACTI's DRAM sense amplifier model to find the amount of energy dissipated for sensing; the term Ess,r captures this sensing energy.

In a similar energy modeling attempt, Mohan et al. denoted the total energy dissipated when programming the SSD (i.e., the program operation) as,

Ep = Edec,p + Epgm + Eppre,    (100)

where Edec,p = Edec,r, which is estimated using CACTI, Epgm is the maximum energy for programming, and Eppre is the energy to transit from the program state to the precharge state.

In the erasure operation of a flash memory, erasure happens at the block level. The controller sends the address of the block to be erased. The controller uses only the block decoder, and the energy for block decoding (Edec,e) is calculated using CACTI. They modeled the total energy dissipated in the erase operation Eerase as,

Eerase = Edec,e + Σ(i=0..Nec) Ese(Vse,i) + Eepre,    (101)

where Ese(Vse,i) is the energy for a suberase operation with voltage Vse,i, where i denotes the iteration count of the suberase operation. The erase operation ends with a read operation. They took the energy for transitioning from the erase to the precharge state to be the same as the energy for transitioning from read to precharge (Eepre = Erpre).

The power state transitions between the different states used by FlashPower are shown in Figure 18. Note that Figures 18 (a) and (b) show the power state transitions for SLC and MLC NAND flash chips respectively. The bubbles show the individual states while the solid lines denote state transitions.

A summary of the SSD power models is shown in Table VI.

D. Modeling Energy Consumption of Storage Servers

Storage systems deployed in data centers account for a considerable amount of the energy consumed by the entire data center (ranked second in the energy consumption hierarchy as described in Section I). Some studies have shown that the energy consumption of storage may go up to 40% of the entire data center [208]. The storage portion of a data center consists of storage controllers and directly-attached storage [209]. The power consumption of storage systems is unique because they contain large amounts of data (often keeping several copies of data in higher storage tiers). In backup storage systems, most of the data is cold because backups
TABLE VI
SUMMARY OF DISK POWER CONSUMPTION MODELING APPROACHES.

Work(s) | Types | Characteristics | Limitations
[194][196] | SPM | Non-linear model. Accounts for the physical characteristics. | Simple power model.
[193][194][195] | VCM | Non-linear model. Accounts for the physical characteristics. | More complicated power model compared to [194][196].
[198] | Complete HDD | Accounts for three different states of operation. | Simple power model.
[197] | HDD | Accounts only for the read energy. | Simple power model.
[200] | HDD | Considers the transitioning time periods of the disk. | More complicated power model.
[201] | HDD | Accounts for the energy dissipated during the idle mode. | Simple model, multiple assumptions.
[199] | HDD | A simulation based model developed using Dempsey. | Needs physical metering.
[206] | SSD | Describes the SSD power consumption in detail. | Depends on CACTI.
[205] | SSD | Write power consumption of flash memory. | The model targets the plateau of the flash memory's operation.
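The idle and standby models of Equations (96) and (97) summarized above imply a classic break-even rule: spinning down only pays off when the idle period Tid exceeds E′du/(Pid − Psb). The sketch below makes this explicit; it is our own illustration of that consequence (the numeric values in the usage note are hypothetical), not code from the surveyed works:

```python
def idle_energy(p_idle, t_idle):
    """Equation (96): energy when the disk keeps spinning through the idle period."""
    return p_idle * t_idle

def standby_energy(p_standby, t_idle, e_transition):
    """Equation (97): standby energy plus the extra spin-down/spin-up energy E'du."""
    return p_standby * t_idle + e_transition

def breakeven_idle_time(p_idle, p_standby, e_transition):
    """Idle time beyond which spinning down saves energy (assumes Pid > Psb)."""
    return e_transition / (p_idle - p_standby)
```

For example, with Pid = 4 W, Psb = 1 W, and E′du = 9 J, the break-even idle time is 3 s: shorter gaps are cheaper in the idle state, longer gaps favour standby.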

are generally only accessed when there is a failure in a higher storage tier [208].

Fig. 18. Power state machines of NAND Flash [206]. (a) Power state machine for a SLC NAND Flash chip. (b) Power state machine for a 2-bit MLC NAND Flash chip.

Inoue et al. conducted power consumption modeling of a storage server [210]. They introduced a simple power consumption model for storage type application processes, where the maximum electric power of a computer is consumed if at least one storage application process is performed on the computer. They used a power meter to measure the entire storage server's power, since it is difficult to measure the power consumption at the individual component level. They measured the power consumption rate of the computer in the environment φ(m, w), where w (0 ≤ w ≤ 1) is the ratio of W processes to the total number m of concurrent processes, and measured the power consumption rate eφ(t) when ten processes are performed concurrently (m = 10). Their read (R) and write (W) energy consumption model can be represented as,

eφ(t) =
(a) max W, if m ≥ 1 and w = 1,
(b) max R, if m ≥ 1 and w = 0,
(c) (12.66w^3 − 17.89w^2 + 9.11w)(max W − max R)/3.88 + max R, if m ≥ 1 and 0 < w < 1,
(d) min E, if m = 0,    (102)

where max W and max R correspond to the maximum rates at which the write and read operations are performed. They did experiments in a real environment and obtained an equation (Equation (102) (c)) for the power consumption when concurrent processes are performed. While we list this storage server power model in this subsection, most of the data center storage power modeling attempts were described in Sections VII-A, VII-B, and VII-C.

VIII. DATA CENTER LEVEL ENERGY CONSUMPTION MODELING

The power models described in this paper up to this point have focused on modeling energy consumption at the level of individual components. When constructing higher level power models for data centers it is essential to have knowledge of the details of such lower level components, which account for the total data center power consumption. This section investigates the data center power models constructed at higher levels of abstraction. First, we describe modeling the energy consumption of a group of servers and then move on to describing the efforts on energy consumption modeling of data center networks. Modeling the data center cooling power consumption distribution and the metrics for data center energy efficiency are described next.

A. Modeling Energy Consumption of a Group of Servers

The aforementioned lower level power models can be extended to build higher level power models. While the basics remain the same, these higher level power models pose increased complexity compared to their lower level counterparts. One of the frequently sought higher level abstractions is a group of servers. In this subsection we investigate the various techniques that have been followed for modeling the power consumption of a group of

Fig. 19. K-server farm model [214]. Incoming jobs arrive with rate λ and are dispatched to k servers, where server i operates at power Pi (speed si) with queue qi. The model assumes that the jobs at a server are scheduled using the Processor-Sharing (PS) scheduling discipline.

servers. The power models developed for a group of servers can be categorized into three subcategories: queuing theory based power models, power efficiency metric based power models, and others.

First, we consider the use of queuing theory for modeling the energy consumption of a group of servers. Multiple different types of parameters need to be considered in the context of a group of servers compared to a single server's power model. For example, there is a time delay, and sometimes a power penalty, associated with the setup (turning the servers ON) cost. Gandhi et al., Artalejo et al., and Mazzucco et al. have studied server farms with setup costs specifically in the context of modeling the power consumption of such systems [211][212][213], which we elaborate next.

In their work Gandhi et al. developed a queuing model for a server farm with k servers (see Figure 19) [211][214]. They assumed that there is a fixed power budget P that can be split among the k servers in the cluster, allocating Pi power to server i, where Σ_{i=1}^{k} Pi = P. They modeled the server farm with setup costs using an M/M/k queuing system, with a Poisson arrival process with rate λ and exponentially distributed job sizes, denoted by the random variable S ∼ Exp(µ). They denoted the system load as ρ = λ/µ, where 0 ≤ ρ ≤ k; for stability they need λ < kµ. In their model a server can be in one of four states: on, idle, off, or setup. When a server is serving jobs it is in the on state and its power consumption is denoted by Pon. If there are no jobs to serve, the server can either remain idle or be turned off, where there is no time delay to turn off a server. Furthermore, if a server remains idle it consumes non-zero power Pidle, and they assumed Pidle < Pon. The server consumes zero power if it is turned off; hence 0 = Poff < Pidle < Pon. To transit a server from off to on mode it must undergo the setup mode, during which it cannot serve any requests. During the setup time, the server power consumption is taken as Pon. They denoted the mean power consumption during the ON/IDLE state as,

Pon|idle = ρPon + (k − ρ)Pidle,  (103)

where ρ is the expected number of on servers and (k − ρ) is the expected number of idle servers.

A slightly different version of the above power model was described by Lent [215]. This model assumed that the servers are homogeneous and that there exists ideal load balancing among the nodes that are running. The power consumption of the computing cluster is modeled as,

P(λ) = nk(I + Jρ) + n(1 − k)H,  (104)

where λ is the job arrival rate, k = m/n is the ratio of running to hibernating nodes, I is the idle power consumed by each server, J is the power increment for a utilization level of ρ, and H is the node power consumption while in hibernate mode. Note that the original power model described in [215] has an additional term O which describes the power consumption of other equipment in the data center facility, such as UPSs and network equipment.

In [211] servers are powered up and down one at a time. Mitrani et al. worked on the problem of analyzing and optimizing power management policies where servers are turned on and off as a group [216]. They mentioned that in a large-scale server farm it is neither desirable nor practical to micro-manage power consumption by turning isolated servers on and off. In their model the server farm consists of N servers, of which m are kept as reserve (0 ≤ m ≤ N). The jobs arrive at the farm in a Poisson stream with rate λ. The operational servers accept one job at a time and the service times are distributed exponentially with a mean of 1/µ. The availability of the reserve servers is controlled by two thresholds U and D (0 ≤ U ≤ D). If the number of jobs in the system increases from U to U + 1 and the reserves are OFF, they are powered ON as a block. They become operational together after an interval of time distributed exponentially with mean 1/ν, during which the servers consume power without being operational. Even if U ≥ N, by the time the reserves are powered on jobs may have departed, leaving some or all of them idle. Similarly, if the reserves are powered ON (or powering up) and the number of jobs in the system drops from D + 1 to D, then they are powered down as a block. When all servers are ON, the reserves are no different from the other servers, and the system behaves like an M/M/N queue.

In a slightly different work from [211], which does not allow jobs to queue if no server is available, Mazzucco et al. modeled the power consumption of a cluster of n powered-on servers as [213],

P = ne1 + m̄(e2 − e1),  (105)

where e1 is the energy consumed per unit time by idle servers and e2 is the energy drawn by each busy server, while the average number of servers running jobs is denoted by m̄ (m̄ ≤ n, m̄ = ⌈T/µ⌉, where T is the system's throughput and 1/µ is the service time). While Gandhi et al., Mitrani et al., and Mazzucco et al. described interesting observations related to the power consumption of server farms, such observations are out of the scope of this survey. More details of these observations are available from [211].
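To make the queuing based farm models concrete, the following minimal Python sketch implements Equations (103)-(105); the function names and the illustrative numbers are ours, not taken from [211], [213], or [215].

```python
def gandhi_on_idle_power(rho, k, p_on, p_idle):
    # Eq. (103): rho = lambda/mu is the expected number of 'on'
    # servers and (k - rho) is the expected number of idle servers.
    assert 0 <= rho <= k, "stability requires rho <= k"
    return rho * p_on + (k - rho) * p_idle


def lent_cluster_power(n, k, rho, idle_pwr, incr_pwr, hib_pwr):
    # Eq. (104): n homogeneous nodes, a fraction k of them running
    # at utilization rho, the remaining fraction hibernating.
    return n * k * (idle_pwr + incr_pwr * rho) + n * (1 - k) * hib_pwr


def mazzucco_cluster_power(n, m_bar, e1, e2):
    # Eq. (105): n powered-on servers, m_bar of them busy on average.
    assert m_bar <= n
    return n * e1 + m_bar * (e2 - e1)


# Illustrative only: a 10-server farm with load rho = 6,
# 200 W per busy server and 120 W per idle server.
print(gandhi_on_idle_power(6, 10, 200.0, 120.0))  # prints 1680.0
```

Note that Equation (105) has the same two-level structure as Equation (103), with m̄ playing the role of ρ and (e1, e2) that of (Pidle, Pon), which is consistent with the models being described as slight variants of each other.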


In another energy consumption modeling attempt for a group of data center servers, a system utilization based model was created by Liu et al. [217]. Different from the power model in Equation (22), Liu et al.'s work considers u as the average CPU utilization across all servers in the group being considered. They calculate the IT resource demand of interactive workload i using the M/GI/1/PS model, which gives 1/(µi − λi(t)/ai(t)) ≤ rti. Their IT power model, which covers all the IT energy consumption of a data center, is defined as,

d(t) = (Σi ai(t)/Q)(Pi + (Pb − Pi)ui) + (Σj nj(t)/Q)Pb,  (106)

where ui = 1 − 1/(µi rti) + (Σj nji(t))/ai(t). This power model has been derived using the server power model shown in Equation (22). The term ai(t) represents the minimum CPU capacity needed, which can be expressed as a linear function of the arrival rate λi(t) as ai(t) = λi(t)/(µi − 1/rti). The value of µi is estimated through real measurements, and the response time requirements rti are based on the SLAs. Each batch job is denoted by j; at time t it shares nji(t) ≥ 0 CPU resource with interactive workload i and uses additional nj(t) ≥ 0 resources by itself. Furthermore, they used an IT capacity constraint Σi ai(t) + Σj nj(t) ≤ D, assuming the data center has D CPU capacity in total.

The second clearly observable category of power models for a group of servers is the power models developed based on a data center performance metric. Unlike in the lower level power models, it can be observed that such power efficiency metrics are utilized to create power consumption models at the higher levels of the data center power hierarchy. Use of the PUE metric for calculating the non-IT power consumed by servers is a simple way to relate the non-IT power to the servers. One such example is where Qureshi et al. [218] created a power consumption model for a group of n servers deployed in an Internet scale system by merging the power model described in Equation (22) with the Power Usage Effectiveness (PUE) metric of the data center as follows,

P ≈ n(Pidle + (Ppeak − Pidle)U + (ηpue − 1)Ppeak),  (107)

where n denotes the server count, Ppeak is the server peak power in Watts, Pidle is the idle power, U is the average server utilization, and the PUE is denoted by ηpue. In this model the (ηpue − 1) term denotes the ratio between the non-IT power and the IT power in the context of a server. Hence, the term (ηpue − 1)Ppeak corresponds to the amount of non-IT power being apportioned to a server. Overall, this power model tries to unify the entire data center energy consumption under the unit of a server, which distinguishes it from Fan et al.'s work in Equation (22).

A different perspective on modeling the power consumption of a group of servers was taken by Pedram [219], who modeled the total power consumption of a data center which consists of N server chassis. The data center power model can be shown as,

Pdc = (1 + 1/η(Ts)) Σ_{i=1}^{N} Pich,  (108)

where η represents the Coefficient of Performance (COP), which is a term used to measure the efficiency of a CRAC unit. The COP is the ratio of the heat removed (Q) to the amount of work necessary (W) to remove that heat [220]. Pich denotes the total power consumption of a chassis. Note that a chassis may host multiple servers (each consuming power Pjs) and the chassis may consume a baseline power Pib (which includes fan power and switching losses). Hence, the total chassis power can be stated as,

Pich = Pib + Σ_j Pjs.  (109)

In the rest of this subsection we describe the power models of a group of servers which cannot be categorized under the queuing theory based or energy efficiency metric based power models. In one such work a parallel system's power consumption was expressed as [56][59][221],

E = Σ_{i=1}^{Ω} ∫_{t1}^{t2} Pi(t)dt = Σ_{i=1}^{Ω} P̄i Tdelay,  (110)

where the energy E specifies the total number of joules in the time interval (t1, t2), as a sum product of the average power P̄i (i ∈ set of all the nodes Ω) times the delay (Tdelay = t2 − t1), while the power Pi (i ∈ set of all the nodes Ω) describes the rate of energy consumption at a discrete point in time on node i. This is essentially an extension of the power model described in Equation (24) (however, in this case it is for the entire system). A similar extension of a single server power model to represent a server cluster's power was made by Elnozahy et al. [108], where they represented the power consumption of a cluster of n identical servers operating at frequency f as [72],

P(f) = n × (c0 + c1 f³),  (111)

where all the parameters remain the same as described in Equation (20).

Aikebaier et al. modeled the energy consumption of a group of computers [222] with two power consumption models: a simple model and a multi-level model. They considered the scenario of a system S which is composed of computers c1, ..., cn (n ≥ 1) with m ≥ 1 processes p1, ..., pm running [223]. First they measured the amount of electric power consumed by web applications in a cluster system composed of Linux Virtual Server (LVS) systems, and abstracted out the essential properties which dominate the power consumption of a server [224]. Based on the measurement results they presented models for the energy consumption of a computer. They denoted Ei(t) as the electric power consumption (Watts per time unit) of a computer ci at time t [W/time unit] (i = 1, ..., n). Furthermore, max E and min E indicate the maximum energy consumption and minimum energy consumption of a computer ci [129], respectively (min Ei ≤ Ei(t) ≤ max Ei, max E = max(max E1, ..., max En), min E = min(min E1, ..., min En)).


In the simple model, the normalized energy consumption rate ei(t) is given depending on how many processes are performed as,

ei(t) = max ei if Ni(t) ≥ 1, and ei(t) = min ei if Ni(t) < 1,  (112)

where min ei and max ei correspond to min Ei/max E and max Ei/max E respectively. This model simply says that the power consumption can vary between two levels, min ei and max ei, for computer ci. In this model they assumed that the electric power is maximally consumed even if a single process is run by the computer. They also developed a multi-level model extending the aforementioned power model [129].

While the above server group power models considered only the power consumption by servers, certain other works model the energy consumption of a group of servers together with the network devices which interconnect them. While such works could be listed under Section VIII-B or Section VIII-F, we list them here since they are more specific to the compute cluster. The work by Velasco et al. is an example of such a power modeling attempt [225]. They modeled the power consumption of cluster i, Picluster, as,

Picluster = ai((M/2)(Pagg + Pedge) + Σ_{s=1}^{M²/4} Pserver(ksi)),  (113)

where ai is a term which indicates whether the cluster is active, and Pagg and Pedge denote the power consumption of the aggregation and edge switches. When combined with a term for the power consumption of the core switches, the data center's total IT devices' power consumption can be modeled as,

PIT = (M²/4)Pcore + Σ_{i=1}^{M} Picluster.  (114)

It can be observed that multiple server group level power models (and even server level power models such as those shown in Equations (29), (30), and (25)) assume the homogeneity of the servers across the server cluster. While the energy consumption behavior might be similar across multiple servers, the models' constants might change slightly per server due to small differences in the electromechanical characteristics of the servers' hardware components.

B. Modeling Energy Consumption of Data Center Networks

In this section we discuss the power modeling work conducted on a group of servers connected via a data center network, as well as scenarios in which multiple data center systems are linked via wide area networks (i.e., distributed data centers [226]; see Figure 21 for a sample [225]). When modeling energy consumption at such higher levels of abstraction, we need to consider the energy cost of communication links and intermediate hardware (see Figure 20 for an example of a data center network [227][228]). Such energy costs are significant in the global Internet infrastructure. Most of the existing data center networks exhibit a very small dynamic range because the idle power consumption is significant relative to the power consumption when the network is fully utilized [229]. Multiple research efforts have been conducted to address such network power consumption issues [230].

Fig. 20. A typical fat-tree data center network [227]. Many data centers are built using a fat-tree network topology due to its high bisection bandwidth [231].

Next, we move on to exploring the data center network level energy consumption modeling efforts.

1) Modeling Network Link Power Consumption: Energy consumption modeling efforts focusing on data center networks consider the entire network's energy consumption, including the network devices as well as the network links. We follow a chronological ordering of power models in this subsection.

Fig. 21. A distributed data center network [225]. Each location (Ireland, California, Illinois, Florida, Spain, Taiwan, Saudi Arabia, India, Peru, Brazil, and Australia in the example) collects user traffic towards the set of federated data centers, which consists of five data centers strategically located around the globe.

Additive power consumption models are present at the data center network level. Heller et al. described the total network energy consumption based on the level of power consumption of each link. Their energy model can be expressed as [227],

Enet = Σ_{(u,v)∈E} Xu,v a(u, v) + Σ_{u∈V} Yu b(u),  (115)

where a(u, v) is the power cost for link (u, v), b(u) is the power cost for switch u, Xu,v is a binary decision variable indicating whether link (u, v) is powered ON, and Yu is a binary decision variable indicating whether switch u is powered ON.
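A minimal sketch of the cluster model of Equations (113)-(114) and the link/switch on-off model of Equation (115). The data layout (dicts pairing each fixed power cost with its binary decision variable) and all names and values are our own choices, not prescribed by [225] or [227].

```python
def velasco_cluster_power(active, m, p_agg, p_edge, server_powers):
    # Eq. (113): one cluster of an M-ary fat tree: M/2 aggregation
    # plus M/2 edge switches and up to M^2/4 servers; `active` is
    # the 0/1 activity indicator a_i.
    assert len(server_powers) <= (m * m) // 4
    return active * ((m / 2.0) * (p_agg + p_edge) + sum(server_powers))


def total_it_power(m, p_core, cluster_powers):
    # Eq. (114): M^2/4 core switches plus the per-cluster terms.
    return (m * m / 4.0) * p_core + sum(cluster_powers)


def heller_network_energy(links, switches):
    # Eq. (115): `links` maps (u, v) -> (a_uv, x_uv) and `switches`
    # maps u -> (b_u, y_u), pairing each fixed power cost with its
    # binary on/off decision variable.
    link_term = sum(a * x for (a, x) in links.values())
    switch_term = sum(b * y for (b, y) in switches.values())
    return link_term + switch_term
```

In Equation (115) both costs are fixed, so a link or switch contributes its full power cost whenever its decision variable is 1 and nothing otherwise.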


TABLE VII
SUMMARY OF POWER CONSUMPTION MODELING APPROACHES FOR A GROUP OF SERVERS.

Work(s)          Characteristics                                                 Limitations
[211][212][213]  Models server farms with setup costs.                           Depends on multiple assumptions.
[56][59][221]    An extension of the power model in Equation (24).               Relatively simple power model.
[222][223][224]  Two power models (simple and multi-level), both based on        Measuring the power consumed by each
                 process level power consumption.                                process is a challenge.
[218]            Hybrid of the power model described in Equation (22) and        Need to measure the idle and peak
                 the PUE metric of the data center.                              powers of each server.
[217]            Derived using the server power model described in Equation (22). Depends on multiple assumptions.
[219]            Considers chassis power consumption aspects.                    Depends on the server mounting architecture.
[108]            Extension of a single server power model to represent a         Assumes all the servers are homogeneous.
                 group of servers' power usage.

The power cost of a link and a switch are considered fixed (there is no such thing as a half-on Ethernet link). The same power model has been used in a work on data center network power consumption optimization by Widjaja et al. [232].

Fig. 22. Simplified model of a global network [233]. The network comprises an array of multiport switches (stages 1 through S) interconnected by optical transport systems and connected to user devices via an access network.

Tucker et al. [233] described a power model for a global network, modeling the network as a minimal array of switching devices connected using a configurable non-blocking multi-stage Clos architecture [234]. Figure 22 shows the network model used in their study. In [233] Tucker et al. modeled the energy per bit (Enet) in the network as,

Enet = Σ_{i=1}^{s} Eswitch,i + Σ_{i=1}^{s−1} Ecore,i + Eaccess,  (116)

where Enet is the energy consumed by each bit in each stage of switching and in each transport system, Eswitch,i is the energy per bit in stage i of switching, Ecore,i is the energy per bit in the transport system at the output of a switch in stage i of the network in Figure 22, Eaccess is the two-way transport energy per bit in the access network, and s is the number of stages of switching.

In the same line of research as the power model shown in Equation (116), Zhang et al. [235] modeled the total power consumption of network switches and links (Pctotal) in a core-level subgraph (such as shown in Figure 23) as,

Pctotal = Σ_{i=1}^{N} Σ_{j=1}^{N} xi,j PLi,j + Σ_{i=1}^{N} yi PNi,  (117)

where xi,j ∈ {0, 1} and yi ∈ {0, 1} represent the power status of link (i, j) ∈ E and node i ∈ V respectively: xi,j is a binary variable which is equal to 1 if the link between i and j is powered on and 0 otherwise, and similarly yi is set to 1 if node i is powered on and 0 otherwise. A somewhat similar power model has been used for the power consumption optimization of a data center network by Jin et al. in [236].

Fig. 23. A core-level subgraph of a 4-ary fat-tree data center network [235]. The diagram includes four pods, each of which contains edge switches, aggregation switches, and servers.

Another work, closely similar to that of Zhang et al., was conducted by Li et al., who described two power models for data center network traffic flows [237]. Specifically, they modeled the total amount of network energy used for transmitting a group of traffic flows in a data center as,

E = Σ_{i∈I} (Pi ti + Σ_{j∈J(i)} Qi,j t′i,j),  (118)

where I and J(i) represent the set of switches and the set of ports in switch i, respectively. The fixed power consumption of switch i is denoted by Pi, the active working duration of switch i by ti, and the active working duration of port j in switch i by t′i,j. This general power model was further extended by Li et al. by assuming that all the switches run with a fixed power P, that all the ports have equal power Q, and that each port has bandwidth capacity C. With these assumptions, they modeled the total data center network energy consumption as,


E = P Σ_{i∈I} mi/(Ui C|J(i)|) + Q Σ_{i∈I} Σ_{j∈J(i)} m′i,j/(C U′i,j),  (119)

where mi and m′i,j represent the aggregate traffic amounts traveling through switch i and its port j respectively, while Ui and U′i,j represent the average utilization ratios of switch i and its port j over the transmission duration, respectively.

While there are high level power models developed for entire data center networks, there are also multiple works focusing on the individual network devices deployed in a data center network, which we explore next.

2) Modeling Network Device Power Consumption: The data center network is the skeletal structure upon which processors, memory, and I/O devices are dynamically shared, and it is generally regarded as a critical design element in the system architecture. In this skeletal structure, network devices such as routers and switches play a pivotal role in data center operations. In most situations networking devices such as routers are provisioned for peak loads, yet they operate at low average utilization levels. A router may consume between 80-90% of its peak power when it is not forwarding any packets [238] (i.e., networking devices display almost no energy proportionality [239]). For example, one of the latest releases of Cisco data center routers, the Cisco Nexus X9536PQ 36-port 40 Gigabit Ethernet QSFP+ line card, consumes 360W as typical operational power while its maximum power of operation is 400W [240]. Therefore, measures need to be taken to model the energy consumption of networking devices such as routers, which enables the development of energy efficient operation techniques. Multiple works have been conducted to model the energy consumption of routers and switches. In this subsection we investigate the network device power models in order of their level of complexity, from least complex to most complex.

Additive (componentwise breakdown) power models represent one of the least complicated types of power models for network devices. The simplest power model which can be created for a network device divides its power consumption into static and dynamic portions. The energy consumption of a network device operating with a traffic load ρ can be expressed as [241],

E(ρ) = Estatic + Edynamic(ρ),  (120)

where Estatic is the static power consumption independent of traffic and Edynamic(ρ) is the dynamic part that is a function of the traffic load ρ.

As another way of constructing an additive power model, Vishwanath et al. [242] presented the power consumption P of an IP router/Ethernet switch as the sum of the power consumed by its three major subsystems (which can be observed clearly in Figure 24),

P = Pctrl + Penv + Pdata,  (121)

where the terms Pctrl, Penv, and Pdata represent the power consumption of the control plane, environmental units, and the data plane respectively. They further represented Pctrl, Penv, and the part of Pdata which is fixed as Pidle. The load dependent component of Pdata was expanded into two more terms based on the packet processing energy and the store & forward energy as,

P = Pidle + Ep Rpkt + Esf Rbyte,  (122)

where Ep is the per-packet processing energy and Esf is the per-byte store and forward energy, which are constants for a given router/switch configuration. Rpkt is the input packet rate and Rbyte is the input byte rate (Rpkt = ⌈Rbyte/L⌉, where L is the packet length in bytes).

The total energy used by a switch can be modeled with an additive power model as [234][243],

E = Σ_{i=1}^{j} EIi + Esupply + Econtrol − Σ_{i=1}^{k} EOi,  (123)

where the j input energies to the switch are denoted as EIi, the k output energies from the switch as EOi, and the supply and control energies for the switch as Esupply and Econtrol respectively.

A slightly extended version of the energy consumption of network elements can be made by taking the integral of the power consumed by the device [244]. The incremental energy (Einc) due to the introduction of an additional traffic flow can be stated as,

Einc = ∫_{t1}^{t2} [P(C + ∆C(t)) − P(C)]dt = ∫_{t1}^{t2} ∆P(t)dt = (∂P(C)/∂C) ∫_{t1}^{t2} ∆C(t)dt = (∂P(C)/∂C) Nbit = Eb(C)Nbit,  (124)

where Nbit is the number of transmitted bits and Eb(C) is the energy-per-bit for the network element with throughput C. The incremental power consumption increase was taken to be negligible. The authors mentioned that to use this power model in reality, they need to derive the form of Eb(C) for the given element/elements.

Vishwanath et al. presented a methodology for constructing quantitative power models based on vendor neutral measurements on commercially available routers and switches [242]. The system architecture of the router/switch they used in their study is highlighted in Figure 24.

A linear regression based energy consumption model for residential and professional switches was introduced by Hlavacs et al. [245], where the energy consumption E(P) of the device is given by,

E(P) ≈ P̂(B) = α + β log B if B > 1, and α if B ≤ 1,  (125)

where α and β correspond to the intercept and the regression coefficient respectively, and B is the bandwidth measured in kbit/s. They conducted measurements with real switches on multiple different aspects and reported the results. For example, they calculated the TCP regression model for a classical 8 port Fast Ethernet switch (Netgear FS608v2) as,

P̂tcp(B) = 2.588 − 0.0128 log B.  (126)
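The additive device models of Equations (120) and (122) and the regression fit of Equations (125)-(126) can be sketched as follows. One assumption to flag: the base of the logarithm in the Hlavacs et al. fit is not stated in the excerpt, so the natural logarithm is assumed here; all function names and numeric inputs other than the published fit coefficients are ours.

```python
import math


def device_energy(e_static, e_dynamic, load):
    # Eq. (120): static part plus a caller-supplied dynamic function
    # of the traffic load rho.
    return e_static + e_dynamic(load)


def vishwanath_router_power(p_idle, e_pkt, e_sf, r_byte, pkt_len):
    # Eq. (122): fixed Pidle plus per-packet processing and per-byte
    # store-and-forward terms, with Rpkt = ceil(Rbyte / L).
    r_pkt = math.ceil(r_byte / pkt_len)
    return p_idle + e_pkt * r_pkt + e_sf * r_byte


def hlavacs_switch_power(b_kbits, alpha=2.588, beta=-0.0128):
    # Eqs. (125)-(126): log-linear regression on bandwidth B in
    # kbit/s; the defaults reproduce the Netgear FS608v2 TCP fit.
    # Assumption: natural log (base not stated in the source).
    return alpha + beta * math.log(b_kbits) if b_kbits > 1 else alpha
```

In Equation (122) only Rpkt and Rbyte vary with load, so the model stays linear in the offered traffic, which matches the weak energy proportionality reported for routers above.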


Fig. 24. Structure of a router/switch with the control plane, data plane, and environmental units (power supply, fans, etc.) highlighted [242]. The packet processor conducts lookups using the forwarding tables.

In the same line of research on the power consumption modeling of network switches, Mahadevan et al. [246] modeled the energy consumption of a network switch as,

Pswitch = Pchassis + α Plinecard + Σ_{i=0}^{configs} βi Pconfigsi Si,  (127)

where Plinecard is the power consumed by a linecard with all ports disabled and α is the number of active cards in the switch. The variable configs is the number of configurations for the port line rate. Pconfigsi is the power for a port operating at speed i, while βi is the number of ports of that category. Here i can be 0, 10 Mbps, 100 Mbps, 1 Gbps, etc. Si is a scaling factor to account for a port's utilization.

In the line of router power modeling, Ahn et al. measured the power consumption of an edge router (Cisco 7609) [247]. They found that the power consumption is in direct proportion to the link utilization as well as the packet sizes. Based on this observation they defined a basic model for the power consumption of a router interface as Pinterface(ρ, s, c), where ρ is the link utilization, s is the packet size, and c is a computing coefficient which accounts for the power consumption overhead of routing protocols. Since the power consumption during the routing protocol exchange is negligible, they simplified their model to Pint(ρ, s), which can be stated as,

Pint(ρ, s) = Php + Ppt = Ehp × α + Ept × β,  (128)

where the header processing power consumption is denoted by Php, the packet transferring power consumption by Ppt, the packet header processing energy by Ehp (Joules), and the per bit transfer energy by Ept (Joule/bit). The data rate in packets per second is denoted by α, while that in bits per second is denoted by β.

3) Power Consumption of Network Interfaces: The network interface card (NIC) is a significant contributor to the system power consumption. Most network hardware operates constantly at maximum capacity, irrespective of the traffic load, even though its average usage lies far below the maximum [248]. Traditional Ethernet is a power-unaware standard which uses a constant amount of power independently of the actual traffic flowing through the wires. However, recent high speed Gigabit Ethernet interface cards may consume up to 20W, which makes it reasonable to introduce power saving mechanisms for such network interfaces. Furthermore, in an experiment conducted on TCP energy consumption, Bolla et al. observed for their System Under Test (SUT), a Linux workstation equipped with a 4-core Intel i5 processor, that the NIC's share of the power consumption varied between 10% and 7% as the system transitioned from idling to active mode of operation [249]. In both idle and active modes they measured a constant 7W of power consumed by the NIC. This results in considerable power consumption when running a large scale server installation. Therefore, the NIC should be taken into consideration when modeling the overall data center system power consumption.

A network interface card can be either in idle mode or in active mode at any given time [250]. If Pidle is the power of the idle interface and Pdynamic is the power when active (either receiving or transmitting packets), the total energy consumption of the interface (Enic) can be represented as,

Enic = Pidle Tidle + Pdynamic Tdynamic,  (129)

where Tidle is the total idle time and Tdynamic represents the total active time in a total observation period T. The value of T can be denoted as,

T = Tdynamic + Tidle,  (130)

and the average NIC power Pnic during the period T can be denoted as,

Pnic = [(T − Tdynamic)Pidle + Pdynamic Tdynamic]/T = Pidle + (Pdynamic − Pidle)ρ,  (131)

where ρ = Tdynamic/T is the channel utilization (i.e., the normalized link load). The time periods and the power values depend on the particular network technology employed [250]. The NIC power models described above (in Equations (129) and (131)) divide the NIC power consumption into static and dynamic portions.

The choice of network technology can affect the utilization of other computer system components (especially the CPU) [250]. For example, in serial point-to-point communications, the CPU is normally used to execute a significant number of communication-related operations, which easily increases the dynamic power consumption of the CPU. On the other hand, embedded network technologies such as InfiniBand can move much of the communication work to the embedded architecture. Such behavior can be accommodated in the CPU power models.

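The NIC model of Equations (129)-(131) above reduces to a linear interpolation between idle and active power, weighted by the channel utilization ρ. A minimal sketch of both equations (the function names and the wattage/timing figures below are illustrative, not taken from the surveyed work):

```python
def nic_energy(p_idle, p_dynamic, t_idle, t_dynamic):
    """Total NIC energy over the observation window, Eq. (129)."""
    return p_idle * t_idle + p_dynamic * t_dynamic

def nic_average_power(p_idle, p_dynamic, t_total, t_dynamic):
    """Average NIC power over the period T, Eq. (131):
    P_nic = P_idle + (P_dynamic - P_idle) * rho, with rho = T_dynamic / T."""
    rho = t_dynamic / t_total  # channel utilization (normalized link load)
    return p_idle + (p_dynamic - p_idle) * rho

# Illustrative NIC: 2 W idle, 5 W active, active for 15 s of a 60 s window.
energy = nic_energy(2.0, 5.0, t_idle=45.0, t_dynamic=15.0)             # 165 J
avg_power = nic_average_power(2.0, 5.0, t_total=60.0, t_dynamic=15.0)  # 2.75 W
```

Dividing the energy of Eq. (129) by the window length T recovers the average power of Eq. (131), which is a quick consistency check on any measured trace.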
as u′ + γρ, where u′ and γρ correspond to the non-network and network-dependent CPU load respectively. γ (γ ≥ 0) models the impact of a given network technology on the CPU load based on the network utilization ρ. Smaller γ values represent network communications that are less dependent on the CPU, while larger γ values can be used to model network interfaces with a higher CPU dependency.

4) Power Consumption of Optical Networks: Optical interconnects provide a viable solution offering high throughput, reduced latency, and reduced energy consumption compared to current networks based on commodity switches [251]. In this subsection we describe the optical network power models from the least complicated to the most complicated.

Kachris et al. [251] created a power model for a reference network architecture designed based on commodity switches. Their reference power model is represented as

P_ref = Σ_racks (P_tor + P_trans) + P_aggrsw + P_10gbps,  (132)

where P_tor is the power consumed by the Top-of-Rack switch, P_trans is the power of the 1 Gbps Ethernet transceivers, P_aggrsw is the power of the aggregate switch, and P_10gbps is the power of the 10 Gbps Ethernet transceivers. In their model, the energy dissipated by the WDM PON (Wavelength Division Multiplexing Passive Optical Network [253]) network for inter-rack communication was represented as

P_wdm = Σ_racks (P_tor + P_trans + P_sfp) + P_aggrsw + P_wa,  (133)

where P_sfp is the power of the optical WDM MMF (multimode fiber [254]) transceivers and P_wa is the power of the WDM array port in the aggregate switch. Kachris et al. also described power models for six different optical data center network architectures [228]. They described the power consumption of an Arrayed Waveguide Grating Routing (AWGR) based scheme with buffers as

P = Σ P_trx + Σ P_twc + Σ P_buffer
  = n·P_trx + n·P_twc + a·n·(P_oe + P_eo + P_sdram),  (134)

where P_trx is the power of an optical transceiver, P_twc is the power of the tunable wavelength converter, P_buffer is the power of the shared buffer, and P_oe and P_eo are the powers of the O/E and E/O converters. P_sdram, n, and a denote the power usage of the SDRAM, the number of Top-of-Rack switches, and the probability of contention respectively. This is an additive power model. A similar technique has been used for modeling the power consumption of the rest of the architectures, the details of which can be found in [228].

The power consumption of a WDM transmission system comprising m identical optically amplified stages, as shown in Figure 26, can be modeled as [243]

P_tot = m·P_A + P_TX/RX,  (135)

where P_A is the supply power to each amplifier and P_TX/RX is the supply power to each WDM transmitter/receiver pair (P_TX and P_RX being the transmitter and receiver supply powers).

Van Heddeghem et al. modeled the total power dissipation of an optical multilayer core network (P_core) as the sum of its constituting layers,

P_core = P_ip + P_ethernet + P_otn + P_wdm,  (136)

where the terms P_ip, P_ethernet, P_otn, and P_wdm represent the power consumed by the IP layer, the Ethernet layer, the optical transport network, and wavelength division multiplexing respectively [252]. Each of these parameters was expanded further as follows, with the WDM layer decomposed into optical switching, transponder, amplifier, and regeneration contributions:

P_ip = δ·(2·σ_ip·γ),  (137a)
P_ethernet = δ·(2·σ_eth·γ),  (137b)
P_otn = δ·(2·σ_otn·γ),  (137c)
P_optsw = δ·(2·σ_oxc·H),  (137d)
P_transponders = δ·(2·σ_tr·H),  (137e)
P_amplifiers = δ·(σ_ola·(α/L_amp)·H),  (137f)
P_regeneration = δ·(σ_re·(α/L_regen)·H),  (137g)

where γ = (1/η_pr) + H, of which η_pr accounts for traffic protection and equals 2 for 1+1 protection; for unprotected traffic the value remains 1. H represents the average hop count, and η_c accounts for the cooling and facilities overhead power consumption in a data center as measured by the PUE. The value of the term δ is given by δ = η_c·η_pr·N_d·DC, where N_d stands for the total number of IP/Multi-Protocol Label Switching (MPLS) demands and DC is the average demand capacity.

Next, we move on to describing one of the most important, yet non-IT, components of a data center: the power conditioning unit.

C. Modeling Energy Consumption of Power Conditioning Systems

The power conditioning system of a data center is responsible for delivering electric power to the loads of the system (IT and mechanical equipment). Maintaining adequate power quality levels and a consistent power supply is a must [256]. The power conditioning system of a data center consumes a significant amount of energy: power is wasted during the transformation process, which can be traced through its power hierarchy. Distribution of uninterrupted electrical power
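The multilayer core-network model of Equations (136)-(137) above is purely additive: every layer contributes a per-demand device power σ scaled by the common factor δ = η_c·η_pr·N_d·DC and by either γ (traffic-processing layers) or the hop count H (optical switching). A toy sketch of that additive structure, restricted to the first four layer terms of Equation (137) (all σ coefficients and traffic figures below are made up for illustration):

```python
def core_network_power(sigma, delta, gamma, hops):
    """Partial additive multilayer model in the spirit of Eqs. (136)-(137).
    sigma maps a layer name to its per-demand power coefficient."""
    p_ip = delta * 2 * sigma["ip"] * gamma         # Eq. (137a)
    p_ethernet = delta * 2 * sigma["eth"] * gamma  # Eq. (137b)
    p_otn = delta * 2 * sigma["otn"] * gamma       # Eq. (137c)
    p_optsw = delta * 2 * sigma["oxc"] * hops      # Eq. (137d)
    return p_ip + p_ethernet + p_otn + p_optsw     # partial sum of Eq. (136)

# Illustrative inputs: unprotected traffic (eta_pr = 1), facilities overhead
# eta_c = 2, 100 demands of 40 capacity units each, average hop count H = 3.
eta_c, eta_pr, n_demands, demand_capacity, hops = 2.0, 1.0, 100, 40.0, 3
delta = eta_c * eta_pr * n_demands * demand_capacity  # delta = eta_c*eta_pr*Nd*DC
gamma = 1.0 / eta_pr + hops                           # gamma = 1/eta_pr + H
sigma = {"ip": 10.0, "eth": 2.0, "otn": 1.5, "oxc": 0.5}
total = core_network_power(sigma, delta, gamma, hops)
```

The sketch makes the dominant cost visible immediately: with the coefficients above, the IP-layer term dwarfs the optical terms, matching the usual observation that IP routing is the most power-hungry layer.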
TABLE VIII
SUMMARY OF POWER CONSUMPTION MODELING APPROACHES FOR DATA CENTER NETWORKS.

Work(s) | Type(s) | Characteristics | Limitations
[227][232] | Entire network | Total network energy consumption is expressed as the summation of the energy dissipated by each network link. | Relatively simple power model.
[237] | Entire network | Represents the total network energy used for transmitting a group of traffic flows in a data center. | Depends on multiple assumptions, such as all switches and ports having fixed power dissipation.
[242] | Network device | Additive power model. | Measuring the power consumption of each different unit is a challenge.
[244] | Network device | Mathematical-integration-based power model. | Needs an additional transformation to be usable in practice.
[245] | Switch | Based on linear regression. | The two constants α and β need to be calculated.
[219] | Switch | An additive power model. | The two constants α and β need to be calculated.
[247] | Router | Based on link utilization and packet size. | The values E_pt and E_hp have to be calculated in advance.
[250] | Network interface | Based on the static and dynamic power consumption of network interfaces. | The time values T_idle and T_dynamic need to be calculated accurately.
[228][251] | Optical networks | Additive power model. | Depends on the accurate measurement of multiple parameters.
[252] | Optical networks | Additive power model (total power is calculated as the sum of its constituting layers). | Depends on the accurate measurement of multiple parameters.
[Figure omitted: power delivery diagram showing two utility feeds (13.2 kV) and a generator entering through ATS and UPS stages (480 V), then PDUs (208 V) feeding the server clusters.] Fig. 25. An example power delivery system for a data center [255]. The system is an example of a high-availability power system with redundant distribution paths.
into a data center requires considerable infrastructure (such as transformers, switchgear, PDUs, UPSs, etc. [257]). In a typical data center power hierarchy, a primary switchboard distributes power among multiple Uninterruptible Power Supply sub-stations (UPSs). Each UPS, in turn, supplies power to a collection of PDUs. A PDU is associated with a collection of server racks, and each rack has several chassis that host the individual servers. Such an arrangement forms a power supply hierarchy within a data center [258][259]. An illustration of such a power hierarchy is shown in Figure 25 [255][260][261].

[Figure omitted.] Fig. 26. A WDM transmission system of length L, which comprises an optical transmitter, m identical stages of optical gain (each of length L_stage, so that L = m·L_stage), and an optical receiver [233][243].

PDUs are responsible for providing a consistent power supply to the servers. They transform the high-voltage power distributed throughout the data center to voltage levels appropriate for servers. A PDU incurs an idle power loss plus a loss proportional to the square of the load, which can be represented as [262][263]

P_pdu_loss = P_pdu_idle + π_pdu·(Σ_N P_srv)²,  (138)

where P_pdu_loss represents the power consumed by the PDU, π_pdu represents the PDU power loss coefficient, and P_pdu_idle is the PDU's idle power consumption. The number of servers in the data center is represented by N.

UPSs, on the other hand, act as temporary power utilities during power failures [262]. Note that in different data center designs the UPS can sit before the PDU, or in between the PDU and the server(s). UPSs incur some power overhead even when operating on utility power, which can be modeled as

P_ups_loss = P_ups_idle + π_ups·(Σ_M P_pdu),  (139)

where π_ups denotes the UPS loss coefficient and M is the number of PDUs fed by the UPS. Pelley et al. mentioned that PDUs typically waste about 3% of their input power, while for UPSs the loss amounts to about 9% of the UPS input power at full load. Next, we describe the power modeling efforts related to data center cooling systems.
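Equations (138) and (139) above make the conditioning losses concrete: a PDU's loss grows with the square of its aggregate server load, while a UPS's loss grows linearly with the PDU draw it feeds. A small sketch (the loss coefficients and loads below are illustrative, chosen only to land near the roughly 3%/9% waste figures quoted from Pelley et al.):

```python
def pdu_loss(p_idle, pi_pdu, server_powers):
    """PDU power loss, Eq. (138): idle loss plus a term that is
    quadratic in the total server load."""
    return p_idle + pi_pdu * sum(server_powers) ** 2

def ups_loss(p_idle, pi_ups, pdu_powers):
    """UPS power loss, Eq. (139): idle loss plus a term that is
    linear in the total PDU draw."""
    return p_idle + pi_ups * sum(pdu_powers)

# Illustrative: 40 servers at 250 W each behind one PDU (10 kW of IT load).
servers = [250.0] * 40
loss_pdu = pdu_loss(p_idle=50.0, pi_pdu=2e-6, server_powers=servers)  # 250 W (2.5%)
loss_ups = ups_loss(p_idle=100.0, pi_ups=0.08, pdu_powers=[10000.0])  # 900 W (9%)
```

Because the PDU term is quadratic, halving the load cuts its dynamic loss by four, which is why consolidation onto fewer PDUs changes the loss picture non-linearly.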
D. Modeling Data Center Cooling Power Consumption

Even a carefully designed, racked blade system using low-voltage components can consume up to 22 kW of power [264]. This level of power consumption generates considerable heat, which has to be removed for the servers to operate within a safe temperature range. Cooling systems are used to effectively maintain the temperature of a data center [265]. Cooling power is the biggest consumer of non-computing power in a data center, followed by power conversion and other losses [266][267][268]. Figure 27 provides a breakdown of the cooling power in a data center [82]. The data center cooling power is a function of many factors, such as the layout of the data center, the air flow rate, the spatial allocation of the computing power, and the efficiency of the CRAC units. In this subsection we first investigate server (i.e., system) level fan power models. Next, we describe CRAC unit level power models, which can be considered an extension of the fan power models. Finally, we list some power modeling work that cannot be specifically categorized into either of the aforementioned two categories.

[Figure omitted.] Fig. 27. An example power breakdown of the HVAC system of a data center [82]. The three major power consumers in the HVAC are fans (39%), chillers (39%), and cooling water (CW) pumps (18%); cooling towers account for a further 4%.

1) Power Consumption of System Fans: Cooling fans located inside a system unit represent another important energy consumer of a data center server. For example, one of the studies made by Vasan et al. has shown that the fans consumed about 220.8 W (18%) while the CPU power consumption was measured at 380 W (32%) [269], with the total approximate system power consumption measured at 1203 W. In another study, conducted by Lefurgy et al., it was observed that fan power dominated the small configuration of the IBM p670 server power envelope (51%), while in the large configuration it represented a considerable portion (28%) of the server power envelope [270]. Furthermore, fan power is a cubic function of fan speed (P_fan ∝ s_fan³) [16][271]. Hence, over-provisioning of cold air into the servers can easily lead to energy inefficiencies [272]. These statistics indicate that the power consumption of system fans is of a considerable amount that needs to be accounted for when modeling the power hierarchy of a modern data center.

In one of the simplest fan power models, cooling power can be expressed as a function of the IT power as [217]

f_a(d) = k·d³,  (140)

where 0 ≤ d ≤ d_max and k > 0. The parameter k depends on the temperature difference (t_ra − t_oa), which is based on heat transfer theory. Here t_oa is the outside air temperature and t_ra is the temperature of the (hot) air exhausting from the IT racks. d_max is the maximum capacity of the cooling system, which can be modeled as d_max = C·(t_ra − t_oa).

In an extended version of the power model shown in Equation (140), Meisner et al. [273] described the Computer Room Air Handler (CRAH) power consumption as

P_crah = P_idle + P_dyn·f³,  (141)

where f is the fan speed of the CRAH (between 0 and 1.0), and P_idle and P_dyn represent the idle and dynamic power usage of the CRAH unit.

In certain literature, fan power consumption is added into an aggregated quantity called "electromechanical energy consumption" [92]. Typically, multiple fans exist in a server. The power drawn by the i-th fan at time t can be denoted by

P_fan_i(t) = P_base·(RPM_fan_i(t) / RPM_base)³,  (142)

where P_base is the base power consumption of the unloaded system without running any applications (note that RPM_base corresponds to the fan's rpm while running the base workload). This value was obtained in [92] by measuring the current drawn on the +12 V and +5 V lines using a current probe and an oscilloscope. Therefore, if there are N fans installed in the server in total, the total electromechanical energy consumption (E_em) over a given task execution period T_p is denoted by

E_em = ∫₀^Tp ( Σ_{i=1}^{N} P_fan_i(t) ) dt,  (143)

where P_fan_i(t) is given by the power model in Equation (142).

Unlike other components of a computer system, system fans have received less attention from power modeling researchers. Mämmelä et al. measured the power consumption of system fans by directly attaching cabling to the measurement device [274]. Based on the performance results they obtained, they constructed a power model for system fans as

P_fan = 8.33068×10⁻¹⁵·a⁴ + 8.51757·ω⁴ − 2.9569·d⁴
      − 1.10138×10⁻¹⁰·a³ + 54.6855·ω³ − 76.4897·d³
      + 4.85429×10⁻⁷·a² + 258.847·ω² − 1059.02·d²
      − 6.06127×10⁻⁵·a + 32.6862·ω + 67.3012·d − 5.478,  (144)
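Equations (142) and (143) above combine into a simple recipe: sample each fan's RPM, cube the ratio against the base RPM, and integrate the summed per-fan powers over the task period. A sketch with the integral discretized into fixed time steps (the RPM trace and base figures are illustrative):

```python
def fan_power(p_base, rpm, rpm_base):
    """Per-fan power, Eq. (142): cubic in the RPM ratio."""
    return p_base * (rpm / rpm_base) ** 3

def electromechanical_energy(rpm_samples, p_base, rpm_base, dt):
    """Eq. (143), discretized: each entry of rpm_samples holds the RPMs of
    all fans at one sample instant; the integral becomes a sum over steps dt."""
    return sum(
        sum(fan_power(p_base, rpm, rpm_base) for rpm in sample) * dt
        for sample in rpm_samples
    )

# Illustrative: two fans, base 10 W at 3000 RPM, sampled once per second.
trace = [(3000.0, 3000.0), (6000.0, 3000.0), (3000.0, 3000.0)]
energy_j = electromechanical_energy(trace, p_base=10.0, rpm_base=3000.0, dt=1.0)  # 130 J
```

Doubling a fan's speed multiplies its power by eight, which is why the middle sample dominates the total here and why over-provisioned airflow is so costly.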
where ω denotes the fan width (in mm), d denotes the fan depth (in mm), and a denotes the revolutions per minute. This is a fourth-order, relatively complicated polynomial power model.

2) CRAC Power Models: Typically, the largest consumer of power, and the most inefficient system, in a data center is the CRAC. Factors that affect the operation of a CRAC unit include the operational efficiency and the air distribution design of the unit [275]. The percentage of cooling power varies, but it can be up to 50% or more in a poorly designed and operated data center [276][277]. About 40% of the total energy consumption in the telecom industry is devoted to cooling equipment in data centers [278]. Most of this energy is consumed by the site chiller plant, the CRAC, and the air handlers (CRAH [279]) [280]. Heat dissipation in a data center is related to its server utilization [281]. Studies have shown that for every 1 W of power utilized during the operation of servers, an additional 0.5-1 W of power is consumed by the cooling equipment to extract the heat out of the data center [118]. The power consumption of data center cooling equipment generally depends on two parameters: first, the amount of heat generated by the equipment within the data center, and second, environmental parameters such as the temperature [282]. Here we organize the CRAC power modeling efforts from the least complicated to the most complicated power models.

One of the simplest power models for a CRAC was described by Zhan et al., who partitioned the total power budget among the cooling and computing units in a self-consistent way [283]. They modeled the power consumption c_k of a CRAC unit k as

c_k = (Σ_i p_i) / η,  (145)

where Σ_i p_i is the power consumption of the servers whose heat flow is directed towards the CRAC unit, and η is the Coefficient of Performance (CoP). Based on physical measurements, they created an empirical model for the η of a commercial water-chilled CRAC unit:

η = 0.0068·t² + 0.0008·t + 0.458,  (146)

where t is the supply air temperature of the CRAC unit in degrees Celsius.

Moore et al. [220] modeled the cooling power consumption as

C = Q / η(T = T_sup + T_adj) + P_fan,  (147)

where Q is the amount of server power consumption and η(T = T_sup + T_adj) is the η at temperature T_sup + T_adj. Note that T_sup is the temperature of the cold air supplied by the CRAC units; they assumed a uniform T_sup from each CRAC unit. T_adj is the adjusted CRAC supply temperature. η is the coefficient of performance, which gives the performance of the CRAC units: it is the ratio of the heat removed (Q) to the amount of work necessary (W) to remove that heat, i.e., η = Q/W [220]. A higher η value indicates a more efficient process, which requires less work to remove a constant amount of heat. P_fan is the total power consumed by the CRAC fans. A similar power model for the cooling power consumption is described in [284].

Additive power models can be observed among data center cooling power models as well. In one such work, Das et al. developed models for the power consumption of cooling support systems. Their model included the computer room air conditioning fans, refrigeration by chiller units, the pumps of the cooling distribution unit, lights, humidity control, and other miscellaneous items [285]. They modeled the total power dissipation of a data center (P_rf) [286] as

P_rf = P_it + P_pdu + P_crac + P_cdu + P_misc,  (148)

where P_rf corresponds to the raised floor power and P_it is the power consumed by the IT equipment. P_crac is the power consumed by the computer room air conditioning units. The power losses due to the uninterruptible power supply (UPS) systems and the losses associated with power distribution are represented as P_pdu. They used P_cdu to denote the power dissipation of the pumps in the cooling distribution unit (CDU), which provide direct cooling water for the rear-door and side-car heat exchangers mounted on a few racks. This model is almost identical to the model of raised floor power described by Hamann et al. in [287], where the latter has an additional term P_light to denote the power used for lighting. The total CRAC power consumption and the total CDU power can be denoted as follows:

P_crac = Σ_i P_crac_i  and  P_cdu = Σ_j P_cdu_j,  (149)

where i and j index the CRACs and the CDUs respectively. The CRAC system used in their study was equipped with variable frequency drives (VFDs), which showed the following empirical relationship between the fan power P_crac_i and the relative fan speed θ_i of a respective CRAC:

P_crac_i = P_crac_i,100·(θ_i)^2.75,  (150)

where P_crac_i,100 is the fan power at θ_i = 100%. Furthermore, they showed that under steady-state conditions (i.e., after thermal equilibrium is reached), the energy balance requires that the total raised floor power (P_rf) equal the total cooling power (P_cool), which is provided by both the CRACs, P_cool(crac), and the rear-door/side-car heat exchangers or CDUs, P_cool(cdu). Therefore, the raised floor power P_rf can be denoted as

P_rf = P_cool = Σ_i P_cool(crac_i) + Σ_j P_cool(cdu_j).  (151)

The cooling power of the CRACs and the CDUs can be denoted as the product of the fluid flow rate in cfm (cubic feet per minute), the temperature differential (ΔT_i) between the cold fluid emerging from the unit and the hot fluid returned to the unit, and the density and specific heat of the fluid. Therefore, these two quantities can be denoted as
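Equations (145)-(146) above imply that raising the CRAC supply air temperature improves the CoP roughly quadratically, and therefore lowers the cooling power required for a fixed IT heat load. A small sketch (the server loads and temperatures are illustrative):

```python
def crac_cop(supply_temp_c):
    """Empirical CoP of a water-chilled CRAC unit, Eq. (146)."""
    t = supply_temp_c
    return 0.0068 * t * t + 0.0008 * t + 0.458

def crac_power(server_powers, supply_temp_c):
    """CRAC unit power, Eq. (145): the heat directed at the unit
    divided by the coefficient of performance."""
    return sum(server_powers) / crac_cop(supply_temp_c)

# Ten 1 kW servers: at a 15 C supply temperature the empirical CoP works out
# to exactly 2.0, so removing 10 kW of heat costs 5 kW of cooling power.
cooling_w = crac_power([1000.0] * 10, 15.0)  # 5000.0
```

This is the quantitative basis for the common advice to run data centers warmer: the same sketch at a 25 C supply temperature yields a CoP above 4.7, less than half the cooling power.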
P_cool(crac_i) = φ_crac_i·ΔT_i / 3293 [cfm·°F/kW],  (152)

P_cool(cdu_j) = φ_cdu_j·ΔT_j / 6.817 [cfm·°F/kW],  (153)

where φ_crac_i and φ_cdu_j are the respective fluid flow rates. Furthermore, they showed that since all raised floor power needs to be cooled by the chilling system, which requires power for refrigeration (P_r), the chiller power can be approximated as

P_chiller = P_r / η,  (154)

where η is the coefficient of performance of the chiller system, described earlier in this section. They assumed a value of 4.5 for η, which they mention to be fairly typical for large-scale centrifugal chilling systems, based on the case study results of a large-scale data center described in [288].

Certain power models, such as the one described by Kaushik et al. for a data center in [289], express the power consumption of a system as a ratio. The cooling power consumption increase (from P₁ to P₂) due to a required air-flow increase within the data center from V₁ to V₂ can be represented as follows [276][290]:

V₂/V₁ = R₂/R₁ = (P₂/P₁)^(1/3),  (155)

where R₁ and R₂ correspond to the revolutions per minute (RPM) values of the fan.

Certain power models take the temporal features of data center power usage into account. In one such work, Tu et al. described the data center total power consumption [291] as the sum of the server, power conditioning system, and cooling system power draws, which can be expressed as a time-dependent function of b(t) (with b(t) = f_s(x(t), a(t))):

b(t) + f_p(b(t)) + f_ct(b(t)) ≜ g_t(x(t), a(t)),  (156)

where x(t) is the number of active servers and s(t) ∈ [0, x(t)] is the total server service capability at time t. To get the workload served in the same time slot, s(t) > a(t) is required. They also described a similar power model for a water chiller cooling system as

f_ct(b(t)) = Q_t·b²(t) + L_t·b(t) + C_t,  (157)

where Q_t, L_t, C_t ≥ 0 depend on the outside air and chilled water temperatures at time t.

In a similar power model, Zheng et al. described a power consumption model for CRAC systems [292]. They summarized the total power consumption of the cooling system as

P_cooling_j = α_j·U_j² + β_j·U_j + γ_j + θ_j  if U_j > 25%,
P_cooling_j = θ_j  otherwise,  (158)

where θ_j is the power consumption of the CRAC (simplified here as a fixed power consumption), and α_j, β_j, and γ_j correspond to the chiller power consumption coefficients in data center j. U_j is the system utilization (%) in data center j. If the total workload in data center j (U_j) is less than 25% of the total data center processing capacity, all chillers can be turned off to save cooling system energy; this is the reason why such a division of the cooling energy appears in their power model.

Similar to the use of PUE in the previous section, it can be observed that multiple power models have been developed around the CoP metric.

E. Metrics for Data Center Efficiency

High levels of energy are consumed by data centers to power the IT equipment contained within them, as well as to extract the heat produced by that equipment. The data center industry's heavy reliance on power has historically triggered the requirement for metrics for tracking the operational efficiency of data centers [293][294]. In this section we describe a few key power consumption metrics used for measuring the energy efficiency of data centers. Such a description is needed because certain data center power consumption models use these metrics. We organize the metrics based on their significance for energy efficiency measurement, from most significant to least significant.

One of the most widely used data center energy efficiency metrics is Power Usage Effectiveness (PUE) [295][296]. It is a metric to compare different data center designs in terms of their electricity consumption [297] (see the illustration in Figure 28 [263]). The PUE of a data center (η_pue) is calculated as

η_pue = Total data center annual energy / Total IT annual energy,  (159)

where the total data center annual energy is the sum of the power drawn by the cooling, lighting, and IT equipment. PUE is a value greater than or equal to 1 (η_pue ≥ 1), since data centers draw a considerable amount of power as non-IT power. Google data centers reported a PUE of 1.12 in 2013 [298]. A higher PUE translates into a greater portion of the electricity coming into the data center being spent on cooling and the rest of the infrastructure (a visual explanation is available in [299]). While PUE is widely recognized as the preferred energy efficiency metric for data centers [300], a good PUE value is not enough to guarantee the global efficiency of the data center, because the PUE metric does not consider the actual utilization (applications and workloads [301]) of the computational resources [302][303]. Furthermore, typical PUE reports communicate the minimum data center infrastructure power use and hence can only be used to determine the minimum potential energy usage of the corresponding facility [304].

Another metric used in data center energy efficiency measurement is Data Center Infrastructure Efficiency (DCiE), which is expressed as follows [41][305]:

η_dcie = 1/η_pue = (IT devices' power consumption / Total power consumption) × 100%.  (160)
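PUE and DCiE (Equations (159) and (160) above) are reciprocal views of the same measurement. A minimal sketch (the energy figures are illustrative; only the 1.12 PUE value is taken from the text):

```python
def pue(total_facility_energy, total_it_energy):
    """Power Usage Effectiveness, Eq. (159)."""
    return total_facility_energy / total_it_energy

def dcie_percent(total_facility_energy, total_it_energy):
    """Data Center infrastructure Efficiency, Eq. (160):
    the reciprocal of PUE, expressed as a percentage."""
    return 100.0 / pue(total_facility_energy, total_it_energy)

# Illustrative facility: 10 GWh/year in total, of which 8 GWh/year reaches IT.
print(pue(10.0, 8.0))           # 1.25
print(dcie_percent(10.0, 8.0))  # 80.0
# The Google fleet PUE of 1.12 reported for 2013 corresponds to a DCiE of ~89%.
print(round(100.0 / 1.12, 1))   # 89.3
```

Note that both metrics say nothing about how productively the IT energy itself is used, which is exactly the gap the work-based metrics below (DCeP, DWPE) try to fill.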
TABLE IX
SUMMARY OF DATA CENTER COOLING POWER DISTRIBUTION MODELING APPROACHES.

Work(s) | Characteristics | Limitations
[217] | Cooling power is expressed as a function of IT power. | The parameter k, which is based on heat transfer theory, needs to be calculated.
[273] | Additive power model of a CRAH unit. | Depends on multiple assumptions.
[92] | Fan power consumption is expressed in terms of RPM values. | Depends on multiple assumptions.
[274] | System fan power model based on curve fitting. | Relatively complicated polynomial.
[283] | Total power budget of a system is partitioned among cooling and computing units in a self-consistent manner. | Depends on multiple assumptions, such as the value of η, the coefficient of performance.
[220] | Focused on fan power; based on the coefficient of performance (CoP). | Assumes a uniform supply temperature from each CRAC unit.
[285][286] | Introduced a series of power models. | Depends on multiple assumptions.
[289] | Considers the air flow within a data center. | Accurate measurement of the air flow within a data center is challenging.
[291] | Considers the time factor. | Depends on multiple assumptions.
[276][289][290] | Power consumption of a system is expressed as a ratio. | Depends on multiple assumptions.
[292] | Considers the scenario where the cooling unit can be turned off. | Depends on multiple assumptions.
Data center practice, PUE and GEC are measured for the entire data
Useful
Total input Power
path to IT
UPS
PDU Power output center level where as ITEU and ITEE are measured only
Cabling
to IT IT
Power to data
Switches
Equipment Computing for some measurable IT devices. The relationship between
center Cooling
the aforementioned data center energy consumption metrics
Power to
secondary
Lights
Fire can be shown as in Figure 29.
Security
support Generator
Switchgear

NCPI*
IT 1 1
DPPE = ITEU × ITEE × ×
Measured by IT device
PUE 1 - GEC
Total Power IN Measurement covering the
under device configuration
entire data center
Data Center Efficiency = management
* NCPI – Network critical
Power to IT DPPE for computing services using IT devices
physical infrastructure

Fig. 29. Relationship between DPPE, ITEU, ITEE, PUE, and GEC
Fig. 28. An illustration of how PUE is defined [263]. Data center metrics [309]. DPPE is designed to increase as the values of these four
efficiency is defined as the fraction of input power delivered to the IT load. DCiE is the reciprocal measurement for PUE. While both these metrics are used by data center vendors, PUE has been used more commonly than DCiE [306].

Patterson et al. described two new data center energy consumption metrics called IT-power usage effectiveness (ITUE) and total-power usage effectiveness (TUE) [307]. ITUE is defined as the total IT energy divided by the computational energy. TUE is the total energy into the data center divided by the total energy to the computational components inside the IT equipment. TUE can be expressed as the product of the ITUE and PUE metrics.

Data center Performance Per Energy (DPPE) is a metric which indicates the energy efficiency of the data center as a whole, i.e., the data center productivity per unit energy [308][309][310]. DPPE (η_dppe) is defined as follows,

η_dppe = (Throughput at the data center) / (Energy consumption). (161)

Another performance metric related to data center energy consumption is the Green Energy Coefficient (GEC) [309]. In simple terms, GEC is the ratio between the green energy (e.g., wind power, solar power, etc.) consumed by the data center and the total data center energy consumption. In contrast to PUE, data center energy efficiency improves as these indicators increase (the inverse holds for PUE).

Data Center Energy Productivity (DCeP) is a metric introduced by Sego et al. for measuring the useful work performed by a data center relative to the energy consumed by the data center in performing that work [311]. Therefore, DCeP (η_dcep) can be expressed as,

η_dcep = W / E_total = (Useful work produced) / (Total energy consumed by the data center), (162)

where the useful work produced is measured through physical measurements, and the total energy consumed is gathered during an interval of time called the assessment window. The DCeP metric allows the user to define the computational tasks, transactions, or jobs that are of interest, and then assign a measure of importance or economic value to each specific unit of work completed.

Similar to the higher level data center energy consumption efficiency metrics, several power related metrics have been proposed for measuring data center cooling system efficiency. The Data Center Cooling System Efficiency (CSE) characterizes the overall efficiency of the cooling system (which includes chillers, pumps, and cooling towers) in terms of energy input per unit of cooling output [312][313]. The CSE (η_cse) can be expressed as,

η_cse = (Average cooling system power usage) / (Average cooling load). (163)
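These ratio metrics reduce to one-line computations once the corresponding facility measurements are available. The following minimal sketch (function names and numeric values are illustrative, not taken from any cited work) computes PUE, its reciprocal DCiE, DCeP from Equation (162), and CSE from Equation (163):

```python
def pue(total_facility_energy, it_energy):
    """PUE: total facility energy over IT energy (>= 1 in practice)."""
    return total_facility_energy / it_energy

def dcie(total_facility_energy, it_energy):
    """DCiE: reciprocal of PUE, the fraction of input power reaching the IT load."""
    return it_energy / total_facility_energy

def dcep(useful_work, total_energy):
    """DCeP (Eq. 162): useful work produced per unit of energy consumed
    over the assessment window."""
    return useful_work / total_energy

def cse(avg_cooling_power, avg_cooling_load):
    """CSE (Eq. 163): average cooling system power usage per unit of cooling load."""
    return avg_cooling_power / avg_cooling_load

# A facility drawing 1.5 MWh in total while its IT load consumes 1.0 MWh:
print(pue(1.5, 1.0))   # 1.5
print(dcie(1.5, 1.0))  # ~0.667
```

Note that lower values are better for PUE and CSE, while higher values are better for DCiE and DCeP, which is why the two families of metrics cannot be compared directly.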


Data Center Workload Power Efficiency (DWPE) was proposed as a complete data center energy efficiency metric by Wilde et al. [314]. DWPE was proposed as the first metric that is able to show how energy efficient an HPC system is for a particular workload when it is run in a specific data center. DWPE (η_d) is defined as,

η_d = η_p / η_u, (164)

where η_p is a performance-per-watt metric for an HPC system, while η_u is the system PUE, a metric that defines the effectiveness of the system in a specific data center.

F. Modeling Energy Consumption of a Data Center

As described in Section I, a data center is a container that holds multiple items such as groups of servers, storage, networking devices, power distribution units, and cooling systems. This section describes the energy consumption modeling work conducted at the whole data center level, which accounts for the energy consumed by all the aforementioned components. We list the power models in this section in a twofold manner. First, we investigate the general power models for data centers. Next, we investigate a specific category of power models which are developed based on the PUE of the data center.

Perhaps the simplest types of energy consumption models that can be created for an entire data center are ones such as that described by Aebischer et al. [315]. They mentioned that modeling the electricity demand of a set of devices does not require a complex model, but needs much input data, while not all of the available data are statistically significant. They modeled the energy consumption in the use phase of a collection of devices by following a bottom-up approach,

E(t) = Σ_{ijk} n_i(t) × e_ij(t) × u_ijk(t), (165)

where n is the number of devices of type i, e is the power load in functional state j, and u is the intensity of use by user k. A similar data center power model was created by LeLouët et al. [316], where they expressed the power consumption of a data center as the sum of the minimum (i.e., idle) power consumptions of its hosts and the consumptions induced by the VMs. However, both such works do not consider the non-IT power consumption of a data center.

CPU utilization based power models can be observed even at the higher levels of the data center hierarchy. In one such work, Raghavendra et al. constructed power/performance models for data centers based on CPU utilization [317]. For each system in the data center they calibrated models on the actual hardware by running workloads at different utilization levels and measuring the corresponding power and performance (in terms of the percentage of the work done).

Islam et al. described a utilization based power model [318] for the power consumption of an entire data center as,

P(u(t), x(t)) = Σ_{i=1}^{M} p_i(u_i(t), x_i(t)), (166)

where x(t) = (x_1(t), x_2(t), ..., x_M(t)) and u(t) = (u_1(t), u_2(t), ..., u_M(t)) are the vectors of speed selections and utilization of the data center servers, respectively. In their model i corresponds to a server. They have ignored toggling costs such as turning a server off, VM migration, etc. Similar work to this research can be observed in [319][320].

PUE (described in Section VIII-E) is heavily utilized for modeling the power consumption of data center systems, since PUE relates a data center's total power consumption to its IT power consumption. From here onwards we organize the PUE based data center power models in chronological order. One such power model for an entire data center was created by Masanet et al. They described an electric power consumption model for data centers as follows [321],

E^d = Σ_j [ Σ_i E^s_ij + E^s_j + E^n_j ] η_pue,j, (167)

where E^d represents the data center electricity demand (kWh/y), E^s_ij is the electricity used by servers of class i in space type j (kWh/y), E^s_j is the electricity used by external storage devices in space type j (kWh/y), E^n_j is the electricity used by network devices in space type j, and η_pue,j is the power utilization effectiveness of the infrastructure equipment in space type j. They mentioned that their model estimates data center energy demand as a function of four variables that account for the electricity use of servers, external storage devices, network devices, and infrastructure equipment. This is another example (similar to the power model by Mahmud et al. in Equation (169)) of calculating the total data center power by multiplying the total IT power by the PUE. However, this power model is an additive power model, which differentiates it from the power models described in Equations (169) and (171).

In the same way, Yao et al. extended their server level power model [121][322] for modeling the power consumption of an entire data center as,

P(N_i(t), b_i(t)) = N_i(t) ( b_i(t)^α / A + P_idle ) · U, (168)

where the A, P_idle, and α parameters have the same meaning as in Equation (29). The U term accounts for the additional power usage due to cooling, power conversion loss, etc., for having N_i(t) servers active.

Mahmud et al. [323] mathematically denoted the total power consumption of a data center during time t by p(λ(t), m(t)), which can be expressed as,

p(λ(t), m(t)) = η_pue Σ_{j=1}^{J} m_j(t) [ e_0 + e_c λ_j(t) / (m_j(t) μ_j) ], (169)


where η_pue > 1 is the PUE and e_0 is the static server power irrespective of the workload. Here e_c is the computing power incurred only when a server is processing workloads, λ_j(t) is the arrival rate of type-j jobs, m_j(t) is the number of servers for type-j jobs, and the service rate of a server for processing type-j jobs is μ_j, with λ(t) = (λ_1(t), ..., λ_J(t)) and m(t) = (m_1(t), ..., m_J(t)). Their power model can be considered an extension of a power model for a group of servers (described in Section VIII-A) to an entire data center via the use of the PUE metric.

In another similar line of research, Zhou et al. described a data center power model using the power usage efficiency metric (PUE) [324]. Their model is a hybrid of the power model for a single server (which is almost similar to the one in Equation (30)) and the data center PUE. Given the number of active servers m_j(t), the parameters α_j, β_j, ν_j, and the power usage efficiency metric PUE_j in data center j, the power consumption of data center j in time slot t can be quantified by E_j(t) as,

E_j(t) = PUE_j m_j(t) [ α_j μ_j^{ν_j}(t) + β_j ], (170)

where α is a positive factor, β is the power consumption in the idle state, and ν is an empirically determined exponent parameter (ν ≥ 1) with a typical value of ν = 2.

A similar line of research was presented by Liu et al. [325], where they created a power model by combining workload traces and the PUE (η_pue) to express the total power demand of a data center as,

ν(t) = η_pue(t) (a(t) + b(t)), (171)

where a(t) is the power demand from the inflexible workload and b(t) is the power demand from the flexible workload.

IX. SOFTWARE ENERGY MODELS

Up to now we have focused on energy consumption models based on the physical characteristics of the data center. But it is equally important to consider the types of applications and workloads a data center handles. Data center software can be broadly categorized into five categories: compute-intensive, data-intensive, and communication-intensive applications, OS and virtualization software, and general software. In computation-intensive tasks the major resource consumer is the CPU core(s). In a data-intensive task the storage resources of the cloud system are the main energy consumers. In communication-intensive tasks a large proportion of the energy is used by network resources such as network cards, routers, switches, etc. In the following subsections, we explore the energy consumption modeling efforts in the context of OS and virtualization, data-intensive, communication-intensive, and computation-intensive tasks, as well as general data center applications.

A. Energy Consumption Modeling at the OS and Virtualization Level

An operating system (OS) sits in between the two key layers of the data center stack: the physical hardware and the applications. Much of the earlier work on energy efficiency has focused on modeling the energy usage of hardware components or software applications. Applications create the demand for resources, and the physical hardware contains the components that actually consume the IT power. The OS was largely considered an intermediary between the two key layers. This section first lists the power consumption models of OSs and then moves on to describing the power models for VMs. Furthermore, we list the power models in order from simpler to more complex.

Fig. 30. Power dissipation breakdown across OS routines [146]: datapath & pipeline 50%, clock 34%, L1-cache 14%, L2-cache 1%, memory 1%. Datapath and pipeline structures that support multiple issue and out-of-order execution are found to consume 50% of the total power on the examined OS routines.

It is important to understand which OS level events give rise to power consumption before starting to model OS power usage. One such characterization was done by Li et al. [146], who characterized the behavior of a commercial OS across a large spectrum of applications to identify OS energy profiles. The OS energy consumption profiling gave a breakdown of the power dissipation of OS routines, as shown in Figure 30. According to this chart, the data-path and pipeline structures which support multiple issue and out-of-order execution are found to consume 50% of the total power on the examined OS routines. The capacitive load of the clock network, which switches on every clock tick, also causes significant power consumption (about 34% in Figure 30). The number of instructions that flow through a data-path usually determines its energy consumption. Furthermore, the ILP (Instruction Level Parallelism) performance, measured by IPC (Instructions Per Cycle), impacts the circuit switching activities in the microprocessor components and can result in significant variations in power. Based on these observations, Li et al. created the following simple linear regression model for OS routine power consumption,

P = k_0 + k_1 ψ, (172)

where k_0 and k_1 are regression model parameters and ψ denotes the ILP. This power model was extended to the entire OS energy consumption model as follows,


E_OS = Σ_i (P_osr,i T_osr,i). (173)

Here, P_osr,i is the power of the i'th OS routine invocation and T_osr,i is the execution time of that invocation. They mentioned that P_osr,i can be computed in many ways, for example by averaging the power usage of all invocations of that routine in a program.

Once the routines with high power consumption are identified, the related high level OS performance counters can be utilized to construct OS level power models. Davis et al. presented composable, highly accurate, OS-based (CHAOS) full-system power models for servers and clusters [326]. These power models were based on high-level OS performance counters. They evaluated four different modeling techniques of different conceptual and implementation complexities. In their method the full system power is represented as a function f̂() of high level OS performance counters represented by (x_1, ..., x_n). In each model they varied the number of model features, starting from CPU utilization up to the full cluster-specific and general feature sets. Their baseline linear power model can be shown as,

f̂() = a_0 + Σ_i a_i x_i, (174)

where the parameters (a_i)_0^n are fitted by minimizing the squared error. This baseline model is used to compare all other proposed power models for f̂(x_1, ..., x_n) and to evaluate the increase in accuracy of the more complex models. They created the following piecewise linear power model,

f̂() = a_0 + Σ_i Σ_j a_i,j B^s_i,j(x_i, t_i,j). (175)

This model provides an extra degree of freedom where the parameter s can be positive (+) or negative (−), and the basis functions B^s_i,j are hinge functions such that,

B^+_i,j(x, t) = 0 if x < t, and x − t otherwise, (176)
B^−_i,j(x, t) = 0 if x > t, and t − x otherwise, (177)

where the t thresholds are called knots and the j indices permit a feature to be responsible for multiple knots. The authors mentioned that fitting these models requires finding the knots t_i,j and the parameters a_i,j. They used an implementation of the Multivariate Adaptive Regression Splines (MARS) algorithm for this purpose. They mentioned that these models can express a feature such as CPU utilization which may consume different amounts of full-system power in different regions of operation. They also proposed a quadratic model which extends the piecewise linear model and introduces nonlinearity within each segment by making the basis functions interact. This quadratic power model [327] can be represented as,

f̂() = a_0 + Σ_i Σ_j a_i,j B^s_i(x_i, t_i) B^s_j(x_j, t_j), (178)

where the model restricts the interaction among the basis functions to a degree of two. They used the same MARS algorithm to select knots and fit parameters, and to select which bases would interact. The most complicated power model they introduced was a switching model, which can be given as,

f̂() = I(f)(a_0 + Σ_i a_i x_i) + (1 − I(f))(a'_0 + Σ_i a'_i x'_i), (179)

where I(f) = 1 if and only if x_i < threshold; otherwise I(f) = 0. This switching power model uses the CPU frequency in an indicator function I(f), allowing each p-state/frequency to have its own linear model. This results in a set of (possibly) different linear models depending on the clock frequency. The switching model's indicator function partitions the space for all the features, creating completely separate models for each frequency state. They also mentioned that the switching model is more rigid, even though it may require more parameters and may have discontinuities at the knots (i.e., frequency transitions) [326].

Application checkpoint-restart is an important technique used by operating systems to save the state of a running application to secondary storage so that it can later resume its execution from the state at which it was checkpointed [328]. Power models built for OS processes (especially in the context of data centers) need to consider the energy consumed by such checkpoint-restart mechanisms. Coordinated checkpointing periodically pauses tasks and writes a checkpoint to stable storage. The checkpoint is read into memory and used to restart execution if a CPU attached to a socket fails. The power model for an operating system process created by Mills et al. defined the total energy consumption of a single process which uses checkpoint and restart as,

E_cpr = E_soc(σ_max, [0, T_ω]) + E_io([0, δ]) × T_s/τ + E_io([0, R]) × T_ω/M_sys, (180)

where the first term E_soc(σ_max, [0, T_ω]) corresponds to the energy consumed by a socket (i.e., CPU) at speed σ_max (the maximum execution speed of the CPU) during a checkpoint-restart period of length T_ω. The model assumes that at any given time all processes are either working, writing a checkpoint, or restoring from a checkpoint, and that all sockets are always executing at σ_max. The second portion of the equation adds the energy required to write or restore from a checkpoint, multiplied by the number of times the process will be writing or recovering from a checkpoint. M_sys, τ, T_s, δ, and R stand for the system's MTBF, the checkpoint interval, the solve time, the checkpoint time, and the recovery time, respectively.

In the next half of this subsection we discuss works that model the power usage of virtual machines. We list the power models in order of increasing complexity. The models are of several types, e.g., power models associated with software frameworks, temporal power models, component based (additive) power models, and models based on the state of operation of a VM, such as VM live migration.
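The checkpoint-restart accounting of Equation (180) can be made concrete with a small numerical sketch. All parameter values below are hypothetical, and the terms E_soc and E_io are approximated as constant power multiplied by duration, which is an assumption rather than part of the original model:

```python
def checkpoint_restart_energy(p_soc, t_wall, t_solve, tau, delta, r, mtbf, p_io):
    """Sketch of Eq. (180): socket energy over the whole run, plus the I/O
    energy of the T_s/tau checkpoint writes and of the expected T_w/MTBF
    recoveries, assuming constant-power phases."""
    e_socket = p_soc * t_wall       # E_soc(sigma_max, [0, T_w])
    e_checkpoint = p_io * delta     # E_io([0, delta]): one checkpoint write
    e_recovery = p_io * r           # E_io([0, R]): one recovery read
    return (e_socket
            + e_checkpoint * (t_solve / tau)   # T_s / tau checkpoints
            + e_recovery * (t_wall / mtbf))    # T_w / M_sys expected failures

# Hypothetical run: 100 W socket for a 7200 s wall clock, 50 W I/O power,
# 30 s checkpoints every 600 s of a 6000 s solve time, 86400 s MTBF:
energy = checkpoint_restart_energy(100, 7200, 6000, 600, 30, 60, 86400, 50)
# roughly 735250 J with these numbers
```

Under this sketch, shortening the checkpoint interval τ increases the number of checkpoint writes T_s/τ linearly, while the recovery term depends only on the wall clock time and the MTBF.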


Software frameworks have been developed to estimate the power consumption of VMs. The work by Kansal et al. developed power models to infer power consumption from resource usage at runtime and identified the challenges which arise when using such models for VM power metering [88]. Once the power models are developed, they can be used for metering VM power by tracking each hardware resource used by a VM and converting the resource usage to power usage based on a power model for the resource. They mentioned that their approach does not assume the availability of detailed power models for each hardware component, as required by some previous works on VM energy usage measurement [329]. They proposed a mechanism for VM power metering named Joulemeter (see Figure 31 for details).

Fig. 31. Joulemeter VM power metering [88]: resource tracing on the VMs and the server feeds base model training, workload-based model refinement, and energy calculation. Joulemeter is intended to provide for VMs the same power metering functionality that hardware meters provide for physical servers.

Some of the notable components of Joulemeter include the resource tracing module, which uses the hypervisor counters to track the individual VM resource usage. Through experiments, Kansal et al. demonstrated that the linearity assumptions made in linear power models do lead to errors. They also stated that the magnitude of the errors was small compared to the full system energy, but was much larger compared to the energy used by an individual VM. They also mentioned that the errors reported were averaged across multiple different workloads and can be higher on specific workloads. Kansal et al. mitigated such errors by using built-in server power sensors that were not available in the older servers used in prior works.

Temporal aspects such as the number of events occurring in a particular time window can be used to create VM power models. In one such example work on VM power modeling, Kim et al. created a power model for VMs assuming that the power consumption of the VMs in time period t is determined by the number of events that occur during t [330]. The energy consumption of a VM in their model is represented as,

E_vmi,t ∝ C_1 N_i + C_2 M_i − C_3 S_t, (181)

where the coefficients C_1, C_2, and C_3 are obtained by conducting multi-variable linear regression over data sets sampled under diverse circumstances. N_i is the number of retired instructions during the time interval t, M_i is the number of memory accesses, and S_t is the number of active cores at time t.

Componentwise power consumption breakdowns can be obtained in the context of VMs as well. One way to do this is to break down power consumption as either static or dynamic [331]. Then the total power consumption of a VM can be expressed as,

P^total_vmi = P^idle_vmi + P^dynamic_vmi, (182)

where P^idle_vmi is the static/idle power consumption while P^dynamic_vmi is the dynamic power consumption. In [331] the main focus was on modeling the idle power consumption of a VM. The power model is expressed as,

P^idle_vmi = P^idle_serverj, if ∃k such that U^resk_vmi = 100%;
P^idle_vmi = (Σ_k α_k U^resk_vmi / Σ_k α_k) P^idle_serverj, otherwise, (183)

where the weight assigned to resource k (possibly 1) is denoted by α_k and the utilization of resource k by vm_i is represented as U^resk_vmi. The model expresses the fact that the idle power consumption of a given VM is equal to the idle power consumption of the server on which it runs if and only if the VM uses 100% of any of the server/hypervisor's resources (e.g., disk, RAM, or the number of virtual CPUs (vCPUs)), because this prevents the server from hosting another VM. In other situations, the VM's idle power consumption is correlated to its utilization of each resource. Weights are used to account for resource scarcity, since resources such as RAM, vCPUs, etc. might limit the number of VMs hosted on the server.

Another approach for VM power modeling is to break down power usage into components such as CPU, memory, IO, etc.,

P_vm = α U_cpu + β U_mem + γ U_io + e, (184)

where U_cpu, U_mem, and U_io represent the CPU utilization, memory usage, and disk IO throughput, respectively, and e is an adjustment value. The weights α, β, and γ need to be trained offline. Note that this is almost similar to the power model described in Equation (10), which also represented server power using a componentwise breakdown.

VM live migration is a technology which has attracted considerable interest from data center researchers in recent years [22]. VM live migration is a very important tool for system management in various scenarios such as VM load balancing, fault tolerance, power management, etc. VM migration involves a source host, a network switch, and a destination host. Liu et al. presented an energy consumption model for VM migration as follows [22],

E_mig = E_sour + E_dest = (α_s + α_d)V_mig + (β_s + β_d), (185)

where α_s, α_d, β_s, and β_d are model parameters to be trained. V_mig is measured in megabytes and the energy E_mig is measured in joules. The authors mentioned that the model can be used with heterogeneous physical hosts as well; in this case the model parameters need to be retrained for each of the two different platforms. In a homogeneous environment the modeling equation reduces to E_mig = αV_mig + β. They learned the energy model parameters


by using linear regression, and the model was given as E_mig = 0.512V_mig + 20.165.

B. Modeling Energy Consumption of Data-Intensive Applications

Data-intensive tasks are usually I/O bound and require processing large volumes of data [332]. Data-intensive applications can be categorized as online data-intensive [333] and offline data-intensive applications based on their type of operation. It can be observed that most of the current data-intensive application power models fall under the second type, a significant percentage of which are MapReduce power models. Therefore, in this subsection, we first delve into the details of power consumption modeling of general data-intensive applications before considering power models of MapReduce applications, which are heavily deployed in current data centers.

A data warehouse is an example of an offline data-intensive application that is frequently deployed in data center clusters. Poess et al. developed a power consumption model for enterprise data warehouses based on the TPC-H benchmark [334]. The simplified power consumption model they developed can be applied to any published TPC-H result and is representative of data warehouse systems. However, this model is intended to estimate the peak power consumption of the system [334]. They described the power consumption of the entire server as,

P_s = (C_c P_c + 9C_m + C_di P_d) × 1.3 + 100, (186)

where P_s is the power consumption of the entire server, C_c is the number of CPUs per server, P_c is the Thermal Design Power (TDP) of a CPU in watts (P_c ∈ [55, 165] for the processors used in their study), C_m is the number of memory DIMMs per server, C_di is the number of internal disks per server, and P_d is the disk power consumption. They added 30% of the power usage of the above components plus 100 watts to the model to account for the power overhead of the chassis. Furthermore, they described the power consumption of the I/O subsystem (P_io) as,

P_io = C_e × C_de × P_de × 1.2, (187)

where C_e is the number of enclosures, C_de is the number of external disks per enclosure, and P_de is the power consumption of an external disk (P_de ∈ [7.2, 19] for the external disks used in their study). They added 20% of the power as the overhead of the enclosure. They then expressed the power consumption of the entire system as P = P_s + P_io.

Many data-intensive applications such as text processing, scientific data analysis, and machine learning can be described as a set of tasks with dependencies between them [335]. These applications are called workflow applications and are designed to run on distributed computers and storage. The energy consumption of a data-intensive workflow execution was described by Gamell et al. as [336],

E = P_node,idle N t + E_computation + E_motion, (188)

where t is the execution time. The term P_node,idle N t represents the total energy consumption of the idling nodes. They modeled the energy consumption during the computation phase of the workflow as,

E_computation = (P_cpu,dynamic / C) I_s V (t_prod,v + t_cons,v / I_a), (189)

where V, I_s, and I_a represent the number of variables, the number of simulation steps, and the number of simulation steps between two analyses, respectively. The two time related parameters t_prod,v and t_cons,v represent the time taken to produce a variable v and the time taken to consume a variable v, respectively. Since the workflow involves the use of a deep memory hierarchy, the power model needs to consider the power consumption during the data loading and storage phases. Accordingly, the energy consumption for data motion was defined as,

E_datamotion = V I_s Σ_{β ∈ {mem, stg, net}} (t^st_v,β + t^ld_v,β) P_β,dyn, (190)

where the model is constructed by multiplying V I_s by the summation of three sub-terms corresponding to memory access (mem), staging (stg), and network access (net). The two terms t^ld_v,β and t^st_v,β represent the data loading and storage times, respectively.

MapReduce is one of the most frequently used data processing models in today's data centers, and there have been multiple recent works on the energy efficiency of MapReduce and Hadoop applications [337][338]. We now discuss some of this work. MapReduce is a programming model for processing and generating large data sets [339]. With the widespread adoption of the MapReduce programming model through implementations such as Hadoop [340] and Dryad [341], MapReduce systems have become one of the key contributors to modern data center workloads.

Regression techniques have been utilized to create MapReduce power models. One example of such work was done by Zhu et al. [342][343], where they developed a general power consumption model for each node i in a Hadoop cluster as follows,

p_i(k) = A_i p'_i(k − 1) + B_i Δx_i(k), (191)

where A_i and B_i are known system parameters which may vary due to the varying workloads, and p'_i is the measured power consumption [344]. The change in the arrival rate threshold for node i is given by Δx_i, and the k'th control point represents the time k. [343] used a recursive least squares (RLS) estimator with exponential forgetting to identify the system parameters A_i and B_i for all nodes i. They used this power model in a power aware scheduler which they implemented for Hadoop (see Figure 32).


Fig. 32. Workflow of the admission controller [344]: the model estimator identifies the parameters A_i, B_i for the adaptive controller, which sets the arrival rate thresholds x_i(k) for the Hadoop cluster based on the tracked power measurements x_i(k − 1), p'_i(k − 1). The model estimator component dynamically models the power consumption of each server.

The model estimator dynamically models the power consumption of each server to ensure accuracy under dynamic workloads. For managing the power peaks, the controller module makes control decisions based on the model generated by the model estimator. This is an example of an application of the energy consumption modeling and prediction process described in Section I of this paper.

Additive approaches have been used in MapReduce power models as well. In one such work, Feng et al. [345] presented an energy model for MapReduce workloads as,

E = PT = P_i T_i + P_m T_m + P_s T_s + P_r T_r, (192)

where the energy consumed by the MapReduce cluster is represented by multiplying the power P with the time T. This is modeled in more detail by summing the energy consumed for job initialization (P_i T_i), the map stage (P_m T_m), the reduce stage (P_r T_r), and the intermediate data shuffling (P_s T_s). Furthermore, they mentioned that there are four factors that affect the total energy consumption of a MapReduce job, namely the CPU intensiveness of the workload, the I/O intensiveness, the replica factor, and the block size.

Another additive power model for a MapReduce cluster was presented by Lang et al. [346]. In their model the total energy consumption E(ω, υ, η) is denoted as,

E(ω, υ, η) = (P_tr T_tr) + (P^n_ω + P^n̄_ω)T_ω + (P^m_idle + P^m̄_idle)T_idle, (193)

where P_tr is the average transitioning power, with transitioning referring to turning nodes on and off. Transitioning power can have a significant impact on the energy consumption of a MapReduce cluster. T_tr is the total transitioning time in υ, P^n_ω and P^n̄_ω are the online and offline workload power, and P^m_idle and P^m̄_idle are the online and offline idle power. The variables n and m correspond to the number of online nodes running the job, and the number

Since the workload characteristics may require the job to be run within some time limit τ, the cluster energy management problem can be cast as,

min(E(ω, υ, η)) | T_ω ≤ τ. (195)

Through Equation (193) it can be observed that energy consumption reduction can be achieved by powering down parts of the cluster. Furthermore, it was shown that reducing the power drawn by the online idle nodes, P^m_idle, can have a big impact on energy management schemes.

C. Modeling Energy Consumption of Communication-Intensive Applications

Communication-intensive applications are composed of a group of tasks which, during the course of a computation, exchange a large number of messages among themselves [347]. Communication-intensive applications are generally network bound and impose a significant burden on the data center network infrastructure. Most such applications are developed using the Message Passing Interface (MPI), a widely used API for high performance computing applications [348]. This subsection lists some of the notable efforts in communication-intensive data center application power modeling.

Message broadcasting is a typical communication pattern in which data belonging to a single process is sent to all the processors by the communicator [348]. Diouri et al. presented techniques for energy consumption estimation of MPI broadcast algorithms in large scale HPC systems. Their methods can be used to estimate the power consumption of a particular broadcast algorithm for a large variety of execution configurations [349]. In the estimator component of their proposed technique they used two models of energy consumption for modeling the MPI Scatter And Gather (MPI/SAG) and the Hybrid Scatter And Gather (Hybrid/SAG) algorithms. Since both models follow a similar structure, we show the MPI/SAG model below,

E_sag = Σ_{i=1}^{N} ξ^{node_i}_sag + Σ_{j=1}^{M} ξ^{switch_j}_sag
      = t_scatter(p, N) ( Σ_{i=1}^{N} ρ^{node_i}_scatter(p) + Σ_{j=1}^{M} ρ^{switch_j}_scatter )
      + t_allgather(p, N) ( Σ_{i=1}^{N} ρ^{node_i}_allgather(p) + Σ_{j=1}^{M} ρ^{switch_j}_allgather ), (196)

where the model represents the total energy consumption as the sum of the energy consumption of the nodes and the switches. Furthermore, the work expanded the first line of Equation (196) by incorporating the time factor and splitting
of online nodes in the idle period. Variables n̄ = N − n the energy consumption of the scatter and gather phases
and m̄ = N − m are the corresponding offline values. into two separate subterms, which are symmetric to each
Furthermore, the time components for E(ω, υ, η) must sum other. It should be noted that the variable p in the energy
to υ, where, model corresponds to the number of processes per node. N
is the number of compute nodes, and M is the number of
υ = Ttr + Tω + Tidle . (194) switches.
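The additive, per-phase structure of Equation (196) — each phase contributes its duration multiplied by the summed power draw of the participating nodes and switches — can be sketched as follows. This is a minimal illustration with invented timings and power figures, not the estimator from [349].

```python
# Sketch of an additive phase-based energy estimator in the style of
# Equation (196): total energy = sum over phases of
#   t_phase * (sum of node power draws + sum of switch power draws).
# All timings (seconds) and power figures (watts) are placeholders.

def e_sag(t_scatter, t_allgather,
          rho_scatter_nodes, rho_scatter_switches,
          rho_allgather_nodes, rho_allgather_switches):
    e_scatter = t_scatter * (sum(rho_scatter_nodes) +
                             sum(rho_scatter_switches))
    e_allgather = t_allgather * (sum(rho_allgather_nodes) +
                                 sum(rho_allgather_switches))
    return e_scatter + e_allgather

# N = 4 compute nodes, M = 2 switches (illustrative values).
energy = e_sag(
    t_scatter=0.8, t_allgather=1.2,
    rho_scatter_nodes=[90.0] * 4, rho_scatter_switches=[40.0] * 2,
    rho_allgather_nodes=[110.0] * 4, rho_allgather_switches=[55.0] * 2,
)
print(energy)  # total energy in joules
```

The two symmetric subterms mirror the scatter and allgather phases of the model; swapping in the Hybrid/SAG variant would only change the per-phase time and power functions.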

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 47

In their work on exploring data-related energy/performance tradeoffs at extreme scales, Gamell et al. used machine-independent code characteristics (e.g., data access and exchange patterns, computational profiles, and messaging profiles) to develop a power model, which was then validated empirically using an instrumented platform [350]. They followed a similar approach for dividing the energy consumption among the different components of a computer system as described in Equation (11). They approximated the dynamic processor and memory power dissipation using an activity factor/switching activity (α),

    P_system^dynamic = α_cpu P_cpu^active + α_mem P_mem^active,    (197)

where P_cpu^active corresponds to the dynamic power consumption of a CPU, as described in Equation (7). The activity factors are computed from the number of operations per second (mips), the number of memory accesses per second (mem_bw), and a normalization factor that represents the maximum capacity, as follows,

    α_cpu = mips / max_mips ;  α_mem = mem_bw / max_mem_bw.    (198)

Unlike some of the previous works, they modeled the energy consumption that occurs during communication (E_comm) between MPI processes as follows,

    E_comm = Σ_{i=1}^{M} (data_i / BW_net) P_transfer,                        if smp(src_i) ≠ smp(dest_i),
             Σ_{i=1}^{M} (data_i / BW_mem) (P_cpu^active + P_mem^active),     if smp(src_i) = smp(dest_i),    (199)

where smp(i) = smp(j) indicates that the MPI ranks i and j are mapped to cores that share memory. BW_net and BW_mem are the bandwidth values of the network and the memory, respectively. P_transfer depends on the network characteristics, e.g., whether the network is InfiniBand or Gemini.

D. Modeling Energy Consumption of General Applications

There have been a number of works on data center application power modeling which cannot be specifically attributed to one of the four types of applications described above. This section lists such power models, starting from the more general power modeling approaches such as performance counter based software power models, algorithmic power models, and the application of software architecture concepts [351] for power consumption modeling. Next, we describe more specific power models, such as web service and business process power models, which are frequently deployed in data centers.

Multiple software based techniques and APIs have been developed in recent years for inferring individual component level power consumption. Some of these software APIs utilize external power meters for measuring the system power consumption and conduct online analysis of the measured data. For example, PowerPack is a collection of software components that enables correlating system component level power profiling to application functions [56][70]. PowerPack depends on power meters connected to the hardware system to obtain measurements. PowerPack allows the user to obtain direct measurements of the major system components' power, including CPU, memory, hard disk, and motherboard power usage.

Another category of this type of software infers system power consumption using the system specification. Instruction level energy profiles can be utilized to create energy profiles for specific target platforms. In one such power modeling work, Smith et al. described "Application Profiles", a means for representing the resource utilization of a distributed application deployment. The Application Profiles they developed capture the usage of the CPU, memory, hard disk, and network. They described "CloudMonitor", a tool that infers the power consumption from software alone through the use of computationally generated power models [60]. They mentioned that since servers in a data center are normally procured in batches with the same configuration, training of the model is only required on one server per batch. The resulting model is able to predict power usage across the remaining servers without the need for dedicated power meters. They mentioned that the power model is applicable to different workloads if the hardware configuration is the same across multiple machines. Their power model can be expressed as,

    P = α + β1 P_cpu + β2 P_mem + β3 P_hdd + β4 P_net,    (200)

where the model considers each hardware subcomponent that they measured, and their approach generates the weights automatically during the training phase. Here α is the baseline power, and the coefficients β1, β2, β3, and β4 represent the power consumption contributions of the CPU (P_cpu), memory (P_mem), HDD (P_hdd), and network (P_net), respectively.

A similar additive power model for the energy consumption of data center applications was described by Aroca et al. [352]. Another work on modeling software power consumption by Wang et al. created an additive power model as [353],

    P = a1 cpu_u + a2 γ + a3 δ + a4 σ + a5 cpu_f + P_idle,    (201)

where a1, ..., a5 are a set of coefficients to be determined by a set of training benchmarks. The parameters cpu_u, γ, δ, σ, cpu_f, and P_idle represent the CPU utilization, cache miss rate, context switching rate, instructions per cycle, CPU frequency, and idle power dissipation of the system, respectively.

Similarly, the CoolEmAll project [354] (which is aimed at decreasing the energy consumption of data centers by allowing designers, planners, and administrators to model and analyze the energy efficiency of various configurations and solutions) takes into account multiple factors when creating data center application power models. They used an application level estimator based on performance counters, model specific registers (MSRs), and system information for this purpose [355]. In total they used 17 different variables of the aforementioned three categories.

While individual application power models can be constructed as described above, higher level abstractions for software systems can be created by considering their software architecture. In one such example, Seo et al. described a framework that supports early estimation of the energy consumption induced by an architectural style in a distributed system. Their framework defines a method to derive platform- and application-independent equations that characterize a style's energy consumption behavior. They derived energy cost models for five architectural styles: peer-to-peer, C2, client-server, publish-subscribe (pub-sub), and pipe-and-filter [356]. They also described a framework for estimating the energy consumption of Java-based software systems [357]. Their energy cost model consists of linear equations. Their energy consumption model for a distributed system is an additive model where the total system energy consumption (E_total) is denoted by its constituent n components and m connectors as,

    E_total = Σ_{i=1}^{n} E_i + Σ_{j=1}^{m} C_j,    (202)

where E_i denotes the energy consumption of component i and C_j denotes the energy consumption of connector j. This power model was further expanded into a generic energy consumption model, which we do not describe in this paper. Interested readers can refer to [356] for more details.

Another power model which does not consider application specific details was described by Cumming et al. [358]. They denoted the total energy consumed by the compute nodes of a job as,

    E = (E_n + N/4 × 100W × τ) / 0.95,    (203)

where N/4 × 100W × τ accounts for the 100W-per-blade contribution of the network interconnect, and τ is the wall clock time for running the application. The denominator 0.95 adjusts for AC/DC conversion.

In a different line of research, Koller et al. investigated an application-aware power model [359]. They observed that the marginal (dynamic) power of any application A_i has a linear relationship with the application throughput (λ_i). They proposed an application throughput based power model as,

    P(A_i) = α_i + β_i λ_i,    (204)

where α_i and β_i are constants for each application which need to be measured in separate calibration runs for each application and on each server type the application is placed on. These two parameters can be inferred using two calibration runs. This power model does not have any correspondence to the general power model described in this paper. Although the power model is abstract, Koller et al. state that the actual slope for more than 90% of operational systems had less than 5% error, indicating that throughput based power modeling is quite accurate.

Similar to the algorithmic power consumption models described in Equations (54), (87), and (210), Demmel et al. described a technique for modeling the total energy cost E of executing an algorithm [360]. In their model they sum the energy costs of computation (proportional to the number of flops F), communication (proportional to the number of words W and messages S sent), memory (proportional to the memory used M times the run time T), and "leakage" (proportional to the runtime T) for each processor, multiplied by the number of processors p. This power model can be expressed as,

    E = p(γ_e F + β_e W + α_e S + δ_e M T + ε_e T),    (205)

where γ_e, β_e, and α_e are the energy costs (in joules) per flop, per word transferred, and per message, respectively, and δ_e is the energy cost per stored word per second. The term δ_e M T assumes that energy is used only for the memory that is in use for the duration of the algorithm (which is a strong architectural assumption). ε_e is the energy leakage per second in the system outside the memory; it may encompass the static leakage energy from circuits as well as the energy of other devices not defined within the model (e.g., disk behavior or fan activity).

Web services power modeling is one of the more specific types of application power modeling scenarios. Bartalos et al. developed linear regression based models for the energy consumption of a computer system considering multiple aspects such as the number of instructions executed, the number of sent or received packets, CPU cycles (CPU unhalted events), IPC, the percentage of non-idle CPU time, and last level cache misses [361][362]. They estimate the computer's instantaneous power consumption while executing web service workloads using an aggregate linear instantaneous power model.

Business processes in the form of web services are frequently deployed in data center systems. Nowak et al. modeled the power consumption of such business processes [363], defining the power consumption of a complete process instance as,

    P_i = Σ_{j=1}^{m} C_i(j) + E,    (206)

where C_i(j) is the power consumption of an activity, the power consumed by the process engine performing the activity is given by E, and j = (1, ..., m) indexes the activities of a business process model. Note that this power model is only for a single process instance i of an entire business process consisting of I total business processes.

TABLE X
SUMMARY OF SOFTWARE ENERGY MODELS.

Work(s) | Category | Characteristics | Limitations
[334] | OS | Based on linear regression. | Regression model parameters k0 and k1 need to be calculated beforehand.
[326][327] | OS | Based on linear regression, but more complicated than [334]. | Regression model parameters need to be calculated beforehand.
[328] | OS | Considers the power consumed by processes during application checkpointing. | Based on multiple assumptions.
[330] | VM | Temporal aspects, such as the number of events occurring in a particular time window, are considered. | The constants C1, C2, and C3 need to be obtained by multi-variable linear regression.
[331] | VM | Componentwise power consumption breakdown. | Power values of the components need to be known beforehand.
[22] | VM | Considers the VM live migration scenario. | Multiple model parameters need to be trained.
[334] | Enterprise data warehouses | Based on the TPC-H benchmark. | Intended for estimating the peak power. Does not consider data warehouse specific aspects.
[60] | Distributed applications | Based on Application Profiles. | Based on multiple assumptions. β1...β4 need to be calculated in advance.
[356][357] | Software architectural styles | Derived cost models for five different architectural styles. | The framework is a fairly high level one.
[349] | MPI | Estimates the power consumption of a particular broadcasting algorithm. | Depends on multiple assumptions, such as the portability of the algorithm.
[350] | Extreme scale applications | Based on machine-independent code characteristics. | Depends on multiple assumptions. The value of α needs to be calculated in advance.
[336] | Workflow | Additive power model. | Depends on multiple assumptions.
[359] | General software applications | Based on application throughput. | Depends on multiple assumptions. The α and β constants need to be calculated beforehand.
[352] | General software applications | Additive power model. | a1...a5 need to be calculated in advance. Depends on the accuracy of multiple parameters such as cache miss rate, context switching rate, etc.
[358] | General software applications | The total energy consumption of the compute nodes of a job. | Depends on multiple assumptions.
[360] | Algorithms | A processor based power model. | Depends on multiple assumptions.
[342][343] | MapReduce | Regression based power model. | Multiple parameters such as Ai, Bi, and δxi need to be calculated beforehand.
[345][346][353] | MapReduce | Additive power models. [346] does not consider MapReduce specific aspects. | Depends on multiple assumptions.
[361][362] | Web services | A detailed power model. | Depends on multiple assumptions.
[363] | Business processes | An additive power model which calculates the power consumed by each process instance. | This power model is for a single process instance i of the entire business process.

X. ENERGY CONSUMPTION MODELING USING MACHINE LEARNING

Machine learning (ML) is a scientific discipline which is concerned with developing learning capabilities in com-
puter systems [27]. In this section we provide a brief introduction to machine learning. Next, we describe the use of machine learning techniques in the context of data center power modeling, covering supervised, unsupervised, reinforcement, and evolutionary learning algorithms in turn.

A. Machine Learning - An Overview

A computer system is said to learn if it improves its performance or knowledge due to experience and adapts to a changing environment. Results from machine learning take the form of information or models (i.e., functions) representing what has been learned. These results are most often used for making predictions, similar to the use of the manually created models described in the first half of this paper.

In recent years the use of machine learning techniques for power consumption modeling and prediction has been a hot topic among data center energy researchers. Different types of algorithms that are prominent in machine learning and data mining can be applied to the prediction of power consumption in a data center [364]. Such a machine learning algorithm should be computationally lightweight and should be able to produce good results when trained with various workloads.

Machine learning algorithms can generally be categorized under four themes: supervised learning, unsupervised learning, reinforcement learning, and evolutionary learning [365]. In this section of the paper we follow a similar categorization to summarize the energy consumption prediction research conducted using machine learning. However, some power prediction research constitutes the use of multiple different machine learning techniques and cannot be placed in a specific category.

B. Supervised Learning Techniques

The most common type of learning is supervised learning. In supervised learning algorithms, a training set of examples with correct responses (targets) is provided. Based on the training set, the algorithm generalizes to respond correctly to all possible inputs. Algorithms and techniques such as linear regression, nonlinear regression, tree based techniques (such as classification trees, regression trees,
etc.), support vector machines (SVM), etc. are all supervised learning techniques. Most of the work on linear regression and non-linear regression has been discussed in the previous sections. In this section we discuss some of the other supervised learning techniques.

A decision tree is a supervised learning technique using a tree of decision nodes [366]. Decision trees break classification down into a set of choices about each feature, starting from the root of the tree and progressing down to the leaves, where the classification decision is given. The M5 algorithm is the most commonly used classifier in this family. Berral et al. presented a methodology for using machine learning techniques to model the main resources of a web-service based data center from low-level information. They used the M5P algorithm [367] for calculating the expected CPU and I/O usage [364][368]. M5P is the implementation of the M5 algorithm in the Weka toolkit [369]. It uses a decision tree that performs linear regressions on its leaves. This is effective because CPU and I/O usage may differ significantly between workloads, but are reasonably linear within each. They use normal linear regression to model memory. The work first models virtual machine (VM) and physical machine (PM) behaviors (CPU, memory, and I/O) based on the amount of load received. The input data for the analysis are,
• The estimated requests per time unit.
• The average computational time per request.
• The average number of bytes exchanged per request.
Then high-level information predictors are learned to drive decision-making algorithms for virtualized service schedulers, without much expert knowledge or real-time supervision [370][371]. The information collected from system behaviors was used by the learning model to predict the power consumption levels, CPU loads, and SLA timings to improve scheduling decisions [372]. The M5P algorithm was used because simple linear regression is incapable of describing the relationship between resources and response time.

They learned the following function, which can predict the estimated effective resources required by a VM based only on its received load, without imposing stress on the VM or occupation on the PM or network,

    f(l) → E[μ_cpu, μ_mem, μ_io],    (207)

where l, μ_cpu, μ_mem, and μ_io represent the load, CPU utilization, memory utilization, and amount of I/O performed, respectively. They also learned a function that calculates the expected response time of placing a VM in a PM with a given occupation, such that the scheduler can consolidate VMs without excessively increasing the response time,

    f(s, r) → E[τ],    (208)

where τ represents the run time (RT), s represents the status, and r represents the resources. The overall decision making system is shown in Figure 33.

Fig. 33. Flow of decision making with machine learning. The process involves learning models from online system performance information as well as from empirical information.

Rule based learning is another type of supervised learning and a popular alternative to decision trees [367], which are also categorized under supervised learning [366]. Decision trees can be easily turned into a set of if-then rules suitable for use in a rule induction system [365]. During the conversion process, one rule is generated for each leaf in the decision tree. Fargo et al. applied reasoning to optimize power consumption and workload performance by mapping the current system behavior to the appropriate application template (AppFlow type [373]), defined as,

    A_type = f(u, v, n),    (209)

where the three parameters CPU utilization (u), memory utilization (v), and processor number (n) are used to determine the AppFlow type (A_type).

C. Unsupervised Learning Techniques

Unlike supervised learning, a training set of examples with correct responses is not provided in unsupervised learning [365]. Clustering algorithms (such as hierarchical clustering and k-means clustering) and Gaussian mixture models (GMM) are examples of unsupervised learning techniques.

Power models can be created to correlate power consumption to architectural metrics (such as memory access rates and instruction throughput) of the workloads running in the VMs. The metrics collection can be conducted per VM, and the metrics can be fed to the model to make the power prediction [101]. Such systems are non-intrusive because they do not need to know the internal states of the VMs and the applications running inside them. The model uses a GMM based approach, and the GMMs are trained by running a small set of benchmark applications. Dhiman et al. implemented this technique on a computer running the Xen virtualization technology [101]. They showed that their approach can perform online power prediction with an average error of less than 10% across different workloads and different utilization levels.

D. Reinforcement Learning Techniques

Reinforcement learning (RL) algorithms have a behavior which is in between supervised and unsupervised learning [365]. There is a significant amount of uncertainty and variability associated with the energy consumption model using information coming from the environment, application, and hardware. An online power management


technique based on model-free constrained reinforcement learning was presented in [374] as a solution. In this work, the power manager learns a new power control policy dynamically at runtime from the information it receives via RL.

Neural networks have been widely used for predictions about resource overbooking strategies, with the goal of achieving more efficient energy usage. In one of the earliest works applying this technique, Moreno et al. implemented a Multilayer Perceptron (MLP) neural network to predict the optimum amount of computing resources required by a customer's applications based on historical data [140]. The MLP neural network based resource predictor processes the customer's utilization data to predict the resource consumption of the currently submitted workload.

One of the more recent works on resource overbooking is iOverbook, an autonomous, online, and intelligent overbooking strategy for heterogeneous and virtualized environments [375]. A similar neural network based technique for power modeling was used by Guzek et al. [376].

When using neural network based energy consumption prediction, it is important to evaluate multiple different neural networks with different characteristics, e.g., different numbers of inputs. Tesauro et al. followed a similar approach for developing control policies for real-time management of power consumption in application servers [377]. They developed 2-input and 15-input neural networks to model a state-action value function defining the power manager's control policy in an IBM BladeCenter cluster. They observed that the 15-input neural network with preprocessing exhibits the steadiest response time, while the power cap [378] decisions of the 15-input neural network showed quite large short-term fluctuations.

Li et al. analyzed the relationship between software power consumption and software features at the algorithmic level [379]. They measured time complexity, space complexity, and input scale, and proposed an embedded software power model based on algorithm complexity. They designed and trained a back propagation artificial neural network (B-ANN) to fit the power model accurately using a sample training function set and more than 400 software power data points.

There have been works that integrate reinforcement learning with other machine learning techniques for power consumption prediction. Artificial neural networks (ANN) and linear regression have been used to develop prediction-based resource measurement and provisioning strategies to satisfy future resource demands [380]. In this work, Islam et al. used data generated from the TPC-W benchmark in the Amazon EC2 cloud for training and testing the prediction models. They validated the effectiveness of the prediction framework and claimed that it can make accurate projections of the energy consumption requirement and can also forecast resource demand before the VM instance's setup time.

XI. COMPARISON OF TECHNIQUES FOR ENERGY CONSUMPTION MODELING

The first half of this paper focused on power consumption modeling efforts made at various different levels of abstraction in data centers. A summary of how different power model equations map to different levels of a data center's component hierarchy is given in Figure 34. Most of the power modeling efforts have been conducted for lower level hardware systems. In the literature surveyed, we observed that the majority of power models are developed around processor power consumption. There have been relatively few models for other important components such as system fans or SSDs. This may be partly due to the fact that most of the power modeling was carried out as part of an energy consumption reduction mechanism, which focuses only on energy proportional systems. This may also be a reason why there are very few works on network level power modeling, since network devices are less power proportional compared to servers.

A. Power Model Complexity

Relatively few power models currently exist for the higher levels of the data center component hierarchy. Power modeling research on the OS/virtualization layer of data centers still lags behind the work on the physical hardware and application layers. There are a number of reasons for this lack of research, including the complexity of systems at higher levels of the data center hierarchy. A chronological summary of data center power modeling and prediction research is shown in Figure 35. We see that the amount of power modeling and prediction research has increased significantly in the last two years.

We observed that certain power models are built on top of others. For example, the power model described in Equation (24), describing the CPU power consumption, has been extended to the entire system's power consumption in the power model described in Equation (110). We observed that certain power models, such as the ones described in Equations (22) and (7), have been highly influential in power modeling at various levels of data centers. While there are works that assume the applications running in the cluster are bound to one type of resource such as CPU [381], the power usage of individual hardware devices may not necessarily provide an overall view of the power consumption of the entire system [382]. For example, recent work has shown that memory intensive algorithm implementations may consume more energy than CPU intensive algorithms in the context of sorting and joining algorithms, which are essential for database query processing [383][384]. Simple models that work well in the context of hardware may not necessarily work well in the context of software, since there are many components handled by a software system running in a typical data center and these components change quite frequently (e.g., certain old or malfunctioning hardware components in a server may be replaced with new ones). Most of the hardware level power models do not rely on machine learning techniques, while the work

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 52

TABLE XI
SUMMARY OF MACHINE LEARNING BASED ENERGY CONSUMPTION PREDICTION APPROACHES.

Work | Category | Algorithm(s) | Characteristics | Limitations
[364][368] | Supervised | Decision Trees (M5P) | The model predicts multiple energy related parameters. | Networking costs have not been addressed.
[101] | Unsupervised | GMM | Average error of less than 10% across different workloads. | I/O related workload metrics are not considered.
[374] | RL | Q-Learning | Based on model-free constrained reinforcement learning. | Depends on the number of heuristics for training the model.
[379] | RL | B-ANN | An embedded software power model. | Less portable.
[380] | RL | ANN, LR | Used ANN and LR for resource overbooking. | Time consuming ANN training process.
[140][375][377] | RL | MLP | Predicts overbooking ratios for cloud computing systems. | Outliers in power consumption results in low quality predictions.
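As a minimal illustration of the regression-style predictors summarized in Table XI, the sketch below fits a single-feature linear power model of the assumed form P = c0 + c1·u, relating CPU utilization to power draw. The function names and training points are illustrative only and are not taken from the surveyed frameworks, which use richer feature sets (and, in the ANN-based works, nonlinear models).

```python
def fit_linear_power_model(samples):
    """Least-squares fit of P = c0 + c1 * u from (utilization, power) pairs.

    A minimal stand-in for the regression-based predictors in Table XI;
    real frameworks train on many more workload features.
    """
    n = len(samples)
    mean_u = sum(u for u, _ in samples) / n
    mean_p = sum(p for _, p in samples) / n
    var_u = sum((u - mean_u) ** 2 for u, _ in samples)
    cov_up = sum((u - mean_u) * (p - mean_p) for u, p in samples)
    c1 = cov_up / var_u          # slope: watts per unit utilization
    c0 = mean_p - c1 * mean_u    # intercept: idle power
    return c0, c1


def predict_power(model, utilization):
    c0, c1 = model
    return c0 + c1 * utilization


# Hypothetical calibration points: ~100 W idle, ~150 W at full load.
model = fit_linear_power_model([(0.0, 100.0), (0.5, 125.0), (1.0, 150.0)])
print(round(predict_power(model, 0.8), 1))  # → 140.0
```

Such a model is only as good as its calibration data; as the surrounding discussion notes, its accuracy degrades quickly when the workload mix differs from the one it was trained on.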

[Figure 34: a tree diagram. "Data Center Energy Consumption Modeling" splits into a software-centric branch (Applications, 17; OS/Virtualization, 12; MapReduce, 3; Machine Learning, 12) and a hardware-centric branch, grouped under a "System of Systems" label at its top levels (Data center networks, 5; Data Center System, 7; Group of Servers, 13; Network devices, 8; Power Conditioning System, 2; Cooling System, 13; Optical Networks, 12; Server, 38; Network Interface, 2; Fan, 5; Memory, 9; Server Storage; Processor; Semiconductor electronics, 7; Disk, 10; SSD, 4; Single core, 11; Multi core, 16; GPU, 10). Reference numbers shown in the original figure are omitted here.]

Fig. 34. Taxonomy of power consumption models. The numbers in parentheses correspond to the references in this paper. The number following the title of each box indicates the total number of power consumption models described in this paper for that category. Note that the number of references does not necessarily correspond to the number of power models shown here.
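The per-category totals in Figure 34 can be tallied programmatically. The sketch below stores the counts legible in the figure in a flat mapping; the nesting of the taxonomy is deliberately omitted, the numbers are transcribed from the figure (categories whose counts are illegible are left out), and the names are illustrative rather than an exact reproduction of the figure's boxes.

```python
# Power-model counts per taxonomy category, transcribed from Fig. 34
# (tree structure flattened for simplicity; counts are as printed there).
MODEL_COUNTS = {
    "Applications": 17, "OS/Virtualization": 12, "MapReduce": 3,
    "Machine Learning": 12, "Data center networks": 5,
    "Data Center System": 7, "Group of Servers": 13, "Network devices": 8,
    "Power Conditioning System": 2, "Cooling System": 13,
    "Optical Networks": 12, "Server": 38, "Network Interface": 2,
    "Fan": 5, "Memory": 9, "Semiconductor electronics": 7,
    "Disk": 10, "SSD": 4, "Single core": 11, "Multi core": 16, "GPU": 10,
}

# Largest category by number of surveyed models.
top = max(MODEL_COUNTS.items(), key=lambda kv: kv[1])
print(top)  # → ('Server', 38)
```

The dominance of server-level models in this tally mirrors the survey's later observation that far more modeling work exists at the lower levels of the data center hierarchy than at the higher ones.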

conducted at the OS/virtualization or application levels relies more heavily on machine learning.

Various new hardware technologies such as FPGAs show great potential for deployment in future data centers [385][386][387]. Since such technologies are relatively new in data centers, we do not describe power models for them in detail in this paper.

Most of the current power consumption models are component/application/system centric. Hence, these models introduce multiple challenges when employed to model the energy consumption of an entire data center. These challenges are listed below:

• Portability: The power consumption models are not portable across different systems.
• Accuracy: Since workloads are diverse across different data center deployments, the accuracy of the models degrades rapidly. Furthermore, the level of support given by most modern hardware systems for measuring energy consumption is insufficient.

While some studies have solely attributed a server's power consumption to the CPU, recent studies have shown that processors contribute only a minority of many servers' power demand. Furthermore, it has been found that the chipset has become the dominant source of power consumption in modern commodity servers [388][389]. Therefore, in recent works there has been a tendency to use non-CPU-centric power models.

The granularity of power measurement and modeling is an important consideration. The appropriate measurement granularity depends on the application [128]. However, most of the available hardware infrastructures do not allow power consumption monitoring of individual applications. Therefore, the validity of most power models with multiple applications running in a data center still needs to be evaluated. Furthermore, most power modeling research does not attempt to characterize the accuracy of the proposed power models, which makes it harder to compare them.

While there have been previous efforts to model the energy consumed by software [390], it is difficult to estimate the power consumption of real-world data center software applications [127] due to their complexity. There have been a few notable efforts to alleviate this issue, such as PowerPack [56][70].

As stated earlier, it is possible to combine multiple power models corresponding to different components of a system to construct power models representing the entire system. This can go beyond the simple additive models used in, for example, Equations (148) and (76). Certain works have used more sophisticated combinations of base models. For example, Roy et al. [85] combined the processor and memory power models described in Equations (54) and (87)


respectively, into the following power model:

E(A) = W(A) + C(A) · (PB/k),   (210)

where there are P parallel disks, each block has B items, and the fast memory can hold P blocks. The number of activation cycles (ACT commands) used by algorithm A is denoted by C(A), and the work complexity of A is denoted by W(A).

[Figure 35: a timeline grid (2000-2015) of icons, one row per category (server, CPU, GPU, SPM, HDD, SSD, DRAM, network, cooling, group of servers, machine learning, software applications, OS and virtualization, entire data center), with a publication count on each icon.]

Fig. 35. Data center energy consumption modeling and prediction timeline. The number on each icon represents the number of publications pertaining to the power models of the icon's category. The timeline highlights both power consumption models and power consumption prediction techniques.

B. Effectiveness of the Power Models

The effectiveness of the power models we have surveyed is another important question. Certain power models created for hardware systems have been demonstrated to be more effective than power models constructed for software systems. Furthermore, most of the current power models assume static system behavior. Although more than 200 power models have been surveyed in this paper, most of them are applicable at different levels of granularity of the data center system. Power models which have been used in real-world application deployments are the most effective in this regard. The power model used in DVFS described in Equation (7), the Wattch CPU power modeling framework [81], the Joulemeter VM power metering framework [88], and the McPAT processor power modeling framework are examples of such initiatives.

While some of the power models used in lower-level hardware systems rely on physical and electrical principles, the power consumption models at higher levels have a less tangible physical basis. For example, the popular power model in Equation (7) represents the physics of operation of lower-level hardware and is more tangible than, for example, the neural network based power model described in [374]. Additionally, we observe that reinforcement learning has been used more extensively for power prediction than other machine learning techniques.

C. Applications of the Power Models

The power models described in this paper have a number of different applications. One of the key applications is power consumption prediction. If a model is calibrated to a particular data center environment (i.e., the system state (S) and execution strategy (E) in Equation (2)), and the relevant input parameters are known in advance (i.e., the input to the application (A) in Equation (2)), it can be used for predicting the target system or subsystem's energy consumption, as illustrated in the energy consumption modeling and prediction process shown in Figure 2 in Section III.

Some of these applications can be found in areas of energy consumption efficiency such as smart grids, sensor networks, content delivery networks, etc. The theoretical foundations of smart metering are similar to the power consumption modeling work described in this paper. Data center power models can be used in conjunction with smart meter networks to optimize the energy costs of data center systems [391].

Improving the impact of a data center's load on power systems (due to the data center's massive energy usage) and reducing the cost of energy for data centers are two of the most important objectives in energy consumption optimization of data centers. The energy consumption models surveyed in this paper help achieve the aforementioned objectives in multiple ways when considering their applications in different power related studies such as electricity market participation, renewable power integration [319][320], demand response, carbon markets, etc. Based on the availability of power plants and fuels, local fuel costs, and pricing regulations, electricity prices exhibit location diversity as well as time diversity [292]. For example, the power model for an entire data center presented by Yao et al. (shown in Equation (168)) has been used to reduce energy consumption by 18% in extensive simulation based experiments [121] on geographically distributed data centers.

XII. FUTURE DIRECTIONS

The aim of this survey paper was to create a comprehensive taxonomy of data center power modeling and prediction techniques. A number of different insights gained


through this survey were described in Section XI. It can be observed that most of the existing power modeling techniques do not consider the interactions occurring between different components of a system. For example, the power consumption analysis conducted by Zhao et al. indicates that DRAM access consumes a significant portion of the total system power. These kinds of intercomponent relationships should be considered when developing power consumption models [179].

Multiple future directions for energy consumption modeling and prediction exist in the areas of neural computing, bio-inspired computing [392][393], etc. There has been recent work on the application of bio-inspired techniques for the development of novel data center architectures [394][395]. Swarm intelligence is an AI technique and a branch of evolutionary computing that works on the collective behavior of systems having many individuals interacting locally with each other and with their environment [396]. In recent times, swarm-inspired techniques have been employed for reducing power consumption in data centers [397][398][399].

Support vector machines (SVMs) are one of the most popular algorithms in machine learning. SVMs often provide significantly better classification performance than other machine learning algorithms on reasonably sized data sets. SVMs are currently popular among researchers in electricity distribution systems, with applications such as power-quality classification, power transformer fault diagnosis, etc. However, not much work has been conducted on using SVMs for building power models.

Deep learning [400] is a novel field of research in AI that deals with deep architectures composed of multiple levels of non-linear operations. Examples of deep architectures include neural networks with many hidden layers, complicated propositional formulas re-using many sub-formulas [401], etc. Deep neural networks have been proven to produce exceptional results in classification tasks [402], which indicates their promise for creating future high-resolution energy models for data centers.

Data center operators are interested in using alternative power sources [403] (such as wind power, solar power [404], etc.) for data center operations, both for cost savings and as an environmentally friendly measure [405]. For example, it has been shown recently that UPS batteries used in data centers can help reduce peak power costs without any workload performance degradation [406][407]. Furthermore, direct current (DC) power distribution, battery-backed servers/racks [408] to improve on central UPS power backups, free cooling [409] (the practice of using outside air to cool the data center facility), thermal energy storage [410], etc. are some trends in data center energy related research [411]. Modeling the power consumption of such complex modes of data center operation is another prospective future research direction.

XIII. SUMMARY

Data centers are the backbone of today's Internet and cloud computing systems. Due to the increasing demand for electrical energy by data centers, it is necessary to account for the vast amount of energy they consume. Energy modeling and prediction of data centers plays a pivotal role in this context.

This survey paper conducted a systematic, in-depth study of existing work in power consumption modeling for data center systems. We performed a layer-wise decomposition of a data center system's power hierarchy. We first divided the components into two types, hardware and software. Next, we conducted an analysis of current power models at different layers of the data center system in a bottom-up fashion. Altogether, more than 200 power models were examined in this survey.

We observed that while there have been a large number of studies on energy consumption modeling at the lower levels of the data center hierarchy, much less work has been done at the higher levels [263]. This is a critical limitation of the current state of the art in data center power modeling research. Furthermore, the accuracy, generality, and practicality of the majority of the power consumption models remain open questions. Based on the trend observed through our study, we envision significant growth in energy modeling and prediction research for the higher layers of data center systems in the near future.

ACKNOWLEDGEMENT

This work was in part supported by the Energy Innovation Research Program (EIRP), administered by the Energy Innovation Programme Office (EIPO), Energy Market Authority, Singapore. This work was also in part supported by gift funds from Microsoft Research Asia and Cisco Systems, Inc.

REFERENCES

[1] R. Buyya, C. Vecchiola, and S. Selvi, Mastering Cloud Computing: Foundations and Applications Programming. Elsevier Inc., 2013.
[2] R. Buyya, A. Beloglazov, and J. H. Abawajy, "Energy-efficient management of data center resources for cloud computing: A vision, architectural elements, and open challenges," CoRR, vol. abs/1006.0308, 2010.
[3] L. Krug, M. Shackleton, and F. Saffre, "Understanding the environmental costs of fixed line networking," in Proceedings of the 5th International Conference on Future Energy Systems, ser. e-Energy '14. New York, NY, USA: ACM, 2014, pp. 87–95.
[4] Canalys, "Intel intelligent power technology," URL: http://www.canalys.com/newsroom/data-center-infrastructure-market-will-be-worth-152-billion-2016, 2012.
[5] M. Poess and R. O. Nambiar, "Energy cost, the key challenge of today's data centers: A power consumption analysis of TPC-C results," Proc. VLDB Endow., vol. 1, no. 2, pp. 1229–1240, Aug. 2008.
[6] Y. Gao, H. Guan, Z. Qi, B. Wang, and L. Liu, "Quality of service aware power management for virtualized data centers," Journal of Systems Architecture, vol. 59, no. 4-5, pp. 245–259, 2013.
[7] S. Rivoire, M. Shah, P. Ranganathan, C. Kozyrakis, and J. Meza, "Models and metrics to enable energy-efficiency optimizations," Computer, vol. 40, no. 12, pp. 39–48, Dec 2007.
[8] K. Bilal, S. Malik, S. Khan, and A. Zomaya, "Trends and challenges in cloud datacenters," Cloud Computing, IEEE, vol. 1, no. 1, pp. 10–20, May 2014.
[9] B. Whitehead, D. Andrews, A. Shah, and G. Maidment, "Assessing the environmental impact of data centres part 1: Background, energy use and metrics," Building and Environment, vol. 82, pp. 151–159, 2014.


[10] V. Mathew, R. K. Sitaraman, and P. J. Shenoy, "Energy-aware load balancing in content delivery networks," CoRR, vol. abs/1109.5641, 2011.
[11] P. Corcoran and A. Andrae, "Emerging trends in electricity consumption for consumer ICT," National University of Ireland, Galway, Connacht, Ireland, Tech. Rep., 2013.
[12] J. Koomey, "Growth in data center electricity use 2005 to 2010," 2011. [Online]. Available: http://www.analyticspress.com/datacenters.html
[13] W. V. Heddeghem, S. Lambert, B. Lannoo, D. Colle, M. Pickavet, and P. Demeester, "Trends in worldwide ICT electricity consumption from 2007 to 2012," Computer Communications, vol. 50, pp. 64–76, 2014.
[14] The Equipment Energy Efficiency (E3) Program, "Energy efficiency policy options for Australian and New Zealand data centres," 2014.
[15] Info-Tech, "Facts & stats: Data architecture and more data," URL: http://blog.infotech.com/facts-stats/facts-stats-data-architecture-and-more-data/, 2010.
[16] S. Yeo, M. M. Hossain, J.-C. Huang, and H.-H. S. Lee, "ATAC: Ambient temperature-aware capping for power efficient datacenters," in Proceedings of the ACM Symposium on Cloud Computing, ser. SOCC '14. New York, NY, USA: ACM, 2014, pp. 17:1–17:14.
[17] D. J. Brown and C. Reams, "Toward energy-efficient computing," Queue, vol. 8, no. 2, pp. 30:30–30:43, Feb. 2010.
[18] F. Bellosa, "The benefits of event-driven energy accounting in power-sensitive systems," in Proceedings of the 9th Workshop on ACM SIGOPS European Workshop: Beyond the PC: New Challenges for the Operating System, ser. EW 9. New York, NY, USA: ACM, 2000, pp. 37–42.
[19] S.-Y. Jing, S. Ali, K. She, and Y. Zhong, "State-of-the-art research study for green cloud computing," The Journal of Supercomputing, vol. 65, no. 1, pp. 445–468, 2013.
[20] Y. Hotta, M. Sato, H. Kimura, S. Matsuoka, T. Boku, and D. Takahashi, "Profile-based optimization of power performance by using dynamic voltage scaling on a PC cluster," in Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International, April 2006, 8 pp.
[21] M. Weiser, B. Welch, A. Demers, and S. Shenker, "Scheduling for reduced CPU energy," in Proceedings of the 1st USENIX Conference on Operating Systems Design and Implementation, ser. OSDI '94. Berkeley, CA, USA: USENIX Association, 1994.
[22] H. Liu, C.-Z. Xu, H. Jin, J. Gong, and X. Liao, "Performance and energy modeling for live migration of virtual machines," in Proceedings of the 20th International Symposium on High Performance Distributed Computing, ser. HPDC '11. New York, NY, USA: ACM, 2011, pp. 171–182.
[23] A. Beloglazov and R. Buyya, "Energy efficient resource management in virtualized cloud data centers," in Cluster, Cloud and Grid Computing (CCGrid), 2010 10th IEEE/ACM International Conference on, May 2010, pp. 826–831.
[24] E. Feller, C. Rohr, D. Margery, and C. Morin, "Energy management in IaaS clouds: A holistic approach," in Cloud Computing (CLOUD), 2012 IEEE 5th International Conference on, June 2012, pp. 204–212.
[25] C. Lefurgy, X. Wang, and M. Ware, "Power capping: a prelude to power shifting," Cluster Computing, vol. 11, no. 2, pp. 183–195, 2008.
[26] M. Lin, A. Wierman, L. Andrew, and E. Thereska, "Dynamic right-sizing for power-proportional data centers," Networking, IEEE/ACM Transactions on, vol. 21, no. 5, pp. 1378–1391, Oct 2013.
[27] "Mathematical models," in Encyclopedia of the Sciences of Learning, N. Seel, Ed. Springer US, 2012, pp. 2113–2113.
[28] S. M. Rivoire, "Models and metrics for energy-efficient computer systems," Ph.D. dissertation, Stanford, CA, USA, 2008, AAI3313649.
[29] M. vor dem Berge, G. Da Costa, A. Kopecki, A. Oleksiak, J.-M. Pierson, T. Piontek, E. Volk, and S. Wesner, "Modeling and simulation of data center energy-efficiency in CoolEmAll," in Energy Efficient Data Centers, ser. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2012, vol. 7396, pp. 25–36.
[30] A. Floratou, F. Bertsch, J. M. Patel, and G. Laskaris, "Towards building wind tunnels for data center design," Proceedings of the VLDB Endowment, vol. 7, no. 9, 2014.
[31] D. Kilper, G. Atkinson, S. Korotky, S. Goyal, P. Vetter, D. Suvakovic, and O. Blume, "Power trends in communication networks," Selected Topics in Quantum Electronics, IEEE Journal of, vol. 17, no. 2, pp. 275–284, March 2011.
[32] H. Xu and B. Li, "Reducing electricity demand charge for data centers with partial execution," in Proceedings of the 5th International Conference on Future Energy Systems, ser. e-Energy '14. New York, NY, USA: ACM, 2014, pp. 51–61.
[33] D. Wang, C. Ren, S. Govindan, A. Sivasubramaniam, B. Urgaonkar, A. Kansal, and K. Vaid, "ACE: Abstracting, characterizing and exploiting datacenter power demands," in Workload Characterization (IISWC), 2013 IEEE International Symposium on, Sept 2013, pp. 44–55.
[34] J. Evans, "On performance and energy management in high performance computing systems," in Parallel Processing Workshops (ICPPW), 2010 39th International Conference on, Sept 2010, pp. 445–452.
[35] V. Venkatachalam and M. Franz, "Power reduction techniques for microprocessor systems," ACM Comput. Surv., vol. 37, no. 3, pp. 195–237, Sep. 2005.
[36] A. Beloglazov, R. Buyya, Y. C. Lee, and A. Zomaya, "A taxonomy and survey of energy-efficient data centers and cloud computing systems," Advances in Computers, vol. 82, pp. 47–111, 2011.
[37] J. Wang, L. Feng, W. Xue, and Z. Song, "A survey on energy-efficient data management," SIGMOD Rec., vol. 40, no. 2, pp. 17–23, Sep. 2011.
[38] S. Reda and A. N. Nowroz, "Power modeling and characterization of computing devices: A survey," Found. Trends Electron. Des. Autom., vol. 6, no. 2, pp. 121–216, Feb. 2012.
[39] C. Ge, Z. Sun, and N. Wang, "A survey of power-saving techniques on data centers and content delivery networks," Communications Surveys Tutorials, IEEE, vol. 15, no. 3, pp. 1334–1354, Third 2013.
[40] T. Bostoen, S. Mullender, and Y. Berbers, "Power-reduction techniques for data-center storage systems," ACM Comput. Surv., vol. 45, no. 3, pp. 33:1–33:38, Jul. 2013.
[41] A.-C. Orgerie, M. D. d. Assuncao, and L. Lefevre, "A survey on techniques for improving the energy efficiency of large-scale distributed systems," ACM Comput. Surv., vol. 46, no. 4, pp. 47:1–47:31, Mar. 2014.
[42] S. Mittal, "A survey of techniques for improving energy efficiency in embedded computing systems," arXiv preprint arXiv:1401.0765, 2014.
[43] S. Mittal and J. S. Vetter, "A survey of methods for analyzing and improving GPU energy efficiency," ACM Comput. Surv., vol. 47, no. 2, pp. 19:1–19:23, Aug. 2014.
[44] A. Hammadi and L. Mhamdi, "A survey on architectures and energy efficiency in data center networks," Computer Communications, vol. 40, pp. 1–21, 2014.
[45] K. Bilal, S. U. R. Malik, O. Khalid, A. Hameed, E. Alvarez, V. Wijaysekara, R. Irfan, S. Shrestha, D. Dwivedy, M. Ali, U. S. Khan, A. Abbas, N. Jalil, and S. U. Khan, "A taxonomy and survey on green data center networks," Future Generation Computer Systems, vol. 36, pp. 189–208, 2014.
[46] K. Bilal, S. Khan, and A. Zomaya, "Green data center networks: Challenges and opportunities," in Frontiers of Information Technology (FIT), 2013 11th International Conference on, Dec 2013, pp. 229–234.
[47] K. Ebrahimi, G. F. Jones, and A. S. Fleischer, "A review of data center cooling technology, operating conditions and the corresponding low-grade waste heat recovery opportunities," Renewable and Sustainable Energy Reviews, vol. 31, pp. 622–638, 2014.
[48] A. Rahman, X. Liu, and F. Kong, "A survey on geographic load balancing based data center power management in the smart grid environment," Communications Surveys Tutorials, IEEE, vol. 16, no. 1, pp. 214–233, First 2014.
[49] S. Mittal, "Power management techniques for data centers: A survey," CoRR, vol. abs/1404.6681, 2014.
[50] C. Gu, H. Huang, and X. Jia, "Power metering for virtual machine in cloud computing-challenges and opportunities," Access, IEEE, vol. 2, pp. 1106–1116, 2014.
[51] F. Kong and X. Liu, "A survey on green-energy-aware power management for datacenters," ACM Comput. Surv., vol. 47, no. 2, pp. 30:1–30:38, Nov. 2014.
[52] J. Shuja, K. Bilal, S. Madani, M. Othman, R. Ranjan, P. Balaji, and S. Khan, "Survey of techniques and architectures for designing

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 56

energy-efficient data centers," Systems Journal, IEEE, vol. PP, no. 99, pp. 1–13, 2014.
[53] B. Khargharia, S. Hariri, F. Szidarovszky, M. Houri, H. El-Rewini, S. Khan, I. Ahmad, and M. Yousif, "Autonomic power performance management for large-scale data centers," in Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International, March 2007, pp. 1–8.
[54] Y. Joshi and P. Kumar, "Introduction to data center energy flow and thermal management," in Energy Efficient Thermal Management of Data Centers, Y. Joshi and P. Kumar, Eds. Springer US, 2012, pp. 1–38.
[55] Y. Kodama, S. Itoh, T. Shimizu, S. Sekiguchi, H. Nakamura, and N. Mori, "Power reduction scheme of fans in a blade system by considering the imbalance of CPU temperatures," in Green Computing and Communications (GreenCom), 2010 IEEE/ACM Int'l Conference on Int'l Conference on Cyber, Physical and Social Computing (CPSCom), Dec 2010, pp. 81–87.
[56] R. Ge, X. Feng, S. Song, H.-C. Chang, D. Li, and K. Cameron, "PowerPack: Energy profiling and analysis of high-performance systems and applications," Parallel and Distributed Systems, IEEE Transactions on, vol. 21, no. 5, pp. 658–671, May 2010.
[57] H. Nagasaka, N. Maruyama, A. Nukada, T. Endo, and S. Matsuoka, "Statistical power modeling of GPU kernels using performance counters," in Green Computing Conference, 2010 International, Aug 2010, pp. 115–122.
[58] W. Bircher and L. John, "Core-level activity prediction for multicore power management," Emerging and Selected Topics in Circuits and Systems, IEEE Journal on, vol. 1, no. 3, pp. 218–227, Sept 2011.
[59] S. Song, C. Su, B. Rountree, and K. Cameron, "A simplified and accurate model of power-performance efficiency on emergent GPU architectures," in Parallel Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on, May 2013, pp. 673–686.
[60] J. Smith, A. Khajeh-Hosseini, J. Ward, and I. Sommerville, "CloudMonitor: Profiling power usage," in Cloud Computing (CLOUD), 2012 IEEE 5th International Conference on, June 2012, pp. 947–948.
[61] M. Witkowski, A. Oleksiak, T. Piontek, and J. Weglarz, "Practical power consumption estimation for real life HPC applications," Future Generation Computer Systems, vol. 29, no. 1, pp. 208–217, 2013.
[62] Raritan, "Intelligent rack power distribution," 2010.
[63] R. Bolla, R. Bruschi, and P. Lago, "The hidden cost of network low power idle," in Communications (ICC), 2013 IEEE International Conference on, June 2013, pp. 4148–4153.
[64] M. Burtscher, I. Zecena, and Z. Zong, "Measuring GPU power with the K20 built-in sensor," in Proceedings of Workshop on General Purpose Processing Using GPUs, ser. GPGPU-7. New York, NY, USA: ACM, 2014, pp. 28:28–28:36.
[65] S. Miwa and C. R. Lefurgy, "Evaluation of core hopping on POWER7," SIGMETRICS Perform. Eval. Rev., vol. 42, no. 3, pp. 55–60, Dec. 2014.
[66] HP, "Server remote management with HP Integrated Lights-Out (iLO)," URL: http://h18013.www1.hp.com/products/servers/management/remotemgmt.html, 2014.
[67] H. Chen and W. Shi, "Power measuring and profiling: the state of art," Handbook of Energy-Aware and Green Computing, pp. 649–
hosting centers," in Proceedings of the 2005 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, ser. SIGMETRICS '05. New York, NY, USA: ACM, 2005, pp. 303–314.
[73] S. Rivoire, P. Ranganathan, and C. Kozyrakis, "A comparison of high-level full-system power models," in Proceedings of the 2008 Conference on Power Aware Computing and Systems, ser. HotPower'08. Berkeley, CA, USA: USENIX Association, 2008, pp. 3–3.
[74] G.-Y. Wei, M. Horowitz, and J. Kim, "Energy-efficient design of high-speed links," in Power Aware Design Methodologies, M. Pedram and J. Rabaey, Eds. Springer US, 2002, pp. 201–239.
[75] A. Beloglazov, R. Buyya, Y. C. Lee, and A. Y. Zomaya, "A taxonomy and survey of energy-efficient data centers and cloud computing systems," CoRR, 2010.
[76] T. Burd and R. Brodersen, Energy Efficient Microprocessor Design. Springer US, 2002.
[77] W. Wu, L. Jin, J. Yang, P. Liu, and S. X.-D. Tan, "Efficient power modeling and software thermal sensing for runtime temperature monitoring," ACM Trans. Des. Autom. Electron. Syst., vol. 12, no. 3, pp. 25:1–25:29, May 2008.
[78] J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, ser. The Morgan Kaufmann Series in Computer Architecture and Design. Morgan Kaufmann, 2011.
[79] Y. C. Lee and A. Zomaya, "Energy conscious scheduling for distributed computing systems under different operating conditions," Parallel and Distributed Systems, IEEE Transactions on, vol. 22, no. 8, pp. 1374–1381, 2011.
[80] R. Ge, X. Feng, and K. W. Cameron, "Performance-constrained distributed DVS scheduling for scientific applications on power-aware clusters," in Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, ser. SC '05. Washington, DC, USA: IEEE Computer Society, 2005, pp. 34–.
[81] D. Brooks, V. Tiwari, and M. Martonosi, "Wattch: A framework for architectural-level power analysis and optimizations," in Proceedings of the 27th Annual International Symposium on Computer Architecture, ser. ISCA '00. New York, NY, USA: ACM, 2000, pp. 83–94.
[82] S. Yeo and H.-H. Lee, "Peeling the power onion of data centers," in Energy Efficient Thermal Management of Data Centers, Y. Joshi and P. Kumar, Eds. Springer US, 2012, pp. 137–168.
[83] J. Esch, "Prolog to "Estimating the energy use and efficiency potential of U.S. data centers"," Proceedings of the IEEE, vol. 99, no. 8, pp. 1437–1439, Aug 2011.
[84] S. Ghosh, S. Chandrasekaran, and B. Chapman, "Statistical modeling of power/energy of scientific kernels on a multi-GPU system," in Green Computing Conference (IGCC), 2013 International, June 2013, pp. 1–6.
[85] S. Roy, A. Rudra, and A. Verma, "An energy complexity model for algorithms," in Proceedings of the 4th Conference on Innovations in Theoretical Computer Science, ser. ITCS '13. New York, NY, USA: ACM, 2013, pp. 283–304.
[86] R. Jain, D. Molnar, and Z. Ramzan, "Towards understanding algorithmic factors affecting energy consumption: Switching complexity, randomness, and preliminary experiments," in Proceedings of the 2005 Joint Workshop on Foundations of Mobile Computing, ser. DIALM-POMC '05. New York, NY, USA: ACM, 2005, pp. 70–79.
674. [87] B. M. Tudor and Y. M. Teo, “On understanding the energy con-
[68] L. John and L. Eeckhout, Performance Evaluation and Benchmark- sumption of arm-based multicore servers,” SIGMETRICS Perform.
ing. Taylor & Francis, 2005. Eval. Rev., vol. 41, no. 1, pp. 267–278, Jun. 2013.
[69] C. Isci and M. Martonosi, “Runtime power monitoring in high-end [88] A. Kansal, F. Zhao, J. Liu, N. Kothari, and A. A. Bhattacharya,
processors: Methodology and empirical data,” in Proceedings of the “Virtual machine power metering and provisioning,” in Proceedings
36th Annual IEEE/ACM International Symposium on Microarchi- of the 1st ACM Symposium on Cloud Computing, ser. SoCC ’10.
tecture, ser. MICRO 36. Washington, DC, USA: IEEE Computer New York, NY, USA: ACM, 2010, pp. 39–50.
Society, 2003, pp. 93–. [89] R. Ge, X. Feng, and K. Cameron, “Modeling and evaluating energy-
[70] X. Wu, H.-C. Chang, S. Moore, V. Taylor, C.-Y. Su, D. Terpstra, performance efficiency of parallel processing on multicore based
C. Lively, K. Cameron, and C. W. Lee, “Mummi: Multiple metrics power aware systems,” in Parallel Distributed Processing, 2009.
modeling infrastructure for exploring performance and power mod- IPDPS 2009. IEEE International Symposium on, May 2009, pp.
eling,” in Proceedings of the Conference on Extreme Science and 1–8.
Engineering Discovery Environment: Gateway to Discovery, ser. [90] S. L. Song, K. Barker, and D. Kerbyson, “Unified performance and
XSEDE ’13. New York, NY, USA: ACM, 2013, pp. 36:1–36:8. power modeling of scientific workloads,” in Proceedings of the 1st
[71] J. Shuja, K. Bilal, S. Madani, and S. Khan, “Data center energy International Workshop on Energy Efficient Supercomputing, ser.
efficient resource scheduling,” Cluster Computing, vol. 17, no. 4, E2SC ’13. New York, NY, USA: ACM, 2013, pp. 4:1–4:8.
pp. 1265–1277, 2014. [91] A. Lewis, J. Simon, and N.-F. Tzeng, “Chaotic attractor prediction
[72] Y. Chen, A. Das, W. Qin, A. Sivasubramaniam, Q. Wang, and for server run-time energy consumption,” in Proceedings of the
N. Gautam, “Managing server energy and operational costs in 2010 International Conference on Power Aware Computing and

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 57

Systems, ser. HotPower’10. Berkeley, CA, USA: USENIX Asso- [110] Y. Jin, Y. Wen, Q. Chen, and Z. Zhu, “An empirical investigation
ciation, 2010, pp. 1–16. of the impact of server virtualization on energy efficiency for green
[92] A. W. Lewis, N.-F. Tzeng, and S. Ghosh, “Runtime energy con- data center,” The Computer Journal, vol. 56, no. 8, pp. 977–990,
sumption estimation for server workloads based on chaotic time- 2013.
series approximation,” ACM Trans. Archit. Code Optim., vol. 9, [111] A. Beloglazov, J. Abawajy, and R. Buyya, “Energy-aware resource
no. 3, pp. 15:1–15:26, Oct. 2012. allocation heuristics for efficient management of data centers for
[93] V. Perumal and S. Subbiah, “Power-conservative server consolida- cloud computing,” Future Generation Computer Systems, vol. 28,
tion based resource management in cloud,” International Journal no. 5, pp. 755 – 768, 2012.
of Network Management, pp. n/a–n/a, 2014. [112] R. Basmadjian, N. Ali, F. Niedermeier, H. de Meer, and G. Giuliani,
[94] A. Chatzipapas, D. Pediaditakis, C. Rotsos, V. Mancuso, “A methodology to predict the power consumption of servers in
J. Crowcroft, and A. Moore, “Challenge: Resolving data data centres,” in Proceedings of the 2Nd International Conference
center power bill disputes: The energy-performance trade-offs of on Energy-Efficient Computing and Networking, ser. e-Energy ’11.
consolidation,” in Proceedings of the 2015 ACM Sixth International New York, NY, USA: ACM, 2011, pp. 1–10.
Conference on Future Energy Systems, ser. e-Energy ’15. New [113] V. Gupta, R. Nathuji, and K. Schwan, “An analysis of power
York, NY, USA: ACM, 2015, pp. 89–94. [Online]. Available: reduction in datacenters using heterogeneous chip multiprocessors,”
http://doi.acm.org/10.1145/2768510.2770933 SIGMETRICS Perform. Eval. Rev., vol. 39, no. 3, pp. 87–91, Dec.
[95] I. Alan, E. Arslan, and T. Kosar, “Energy-aware data transfer 2011.
tuning,” in Cluster, Cloud and Grid Computing (CCGrid), 2014 [114] X. Zhang, J.-J. Lu, X. Qin, and X.-N. Zhao, “A high-level energy
14th IEEE/ACM International Symposium on, May 2014, pp. 626– consumption model for heterogeneous data centers,” Simulation
634. Modelling Practice and Theory, vol. 39, no. 0, pp. 41 – 55, 2013,
[96] A. Lewis, S. Ghosh, and N.-F. Tzeng, “Run-time energy con- s.I.Energy efficiency in grids and clouds.
sumption estimation based on workload in server systems,” in [115] M. Tang and S. Pan, “A hybrid genetic algorithm for the energy-
Proceedings of the 2008 Conference on Power Aware Computing efficient virtual machine placement problem in data centers,” Neural
and Systems, ser. HotPower’08. Berkeley, CA, USA: USENIX Processing Letters, pp. 1–11, 2014.
Association, 2008, pp. 4–4. [116] S. ul Islam and J.-M. Pierson, “Evaluating energy consumption in
[97] P. Bohrer, E. N. Elnozahy, T. Keller, M. Kistler, C. Lefurgy, C. Mc- cdn servers,” in ICT as Key Technology against Global Warming,
Dowell, and R. Rajamony, “Power aware computing,” R. Graybill ser. Lecture Notes in Computer Science, A. Auweter, D. Kranzlm-
and R. Melhem, Eds. Norwell, MA, USA: Kluwer Academic ller, A. Tahamtan, and A. Tjoa, Eds. Springer Berlin Heidelberg,
Publishers, 2002, ch. The Case for Power Management in Web 2012, vol. 7453, pp. 64–78.
Servers, pp. 261–289. [117] S.-H. Lim, B. Sharma, B. C. Tak, and C. Das, “A dynamic energy
[98] R. Lent, “A model for network server performance and power management scheme for multi-tier data centers,” in Performance
consumption,” Sustainable Computing: Informatics and Systems, Analysis of Systems and Software (ISPASS), 2011 IEEE Interna-
vol. 3, no. 2, pp. 80 – 93, 2013. tional Symposium on, 2011, pp. 257–266.
[99] F. Chen, J. Grundy, Y. Yang, J.-G. Schneider, and Q. He, “Experi- [118] Z. Wang, N. Tolia, and C. Bash, “Opportunities and challenges to
mental analysis of task-based energy consumption in cloud comput- unify workload, power, and cooling management in data centers,”
ing systems,” in Proceedings of the 4th ACM/SPEC International SIGOPS Oper. Syst. Rev., vol. 44, no. 3, pp. 41–46, Aug. 2010.
Conference on Performance Engineering, ser. ICPE ’13. New [119] H. Li, G. Casale, and T. Ellahi, “Sla-driven planning and optimiza-
York, NY, USA: ACM, 2013, pp. 295–306. tion of enterprise applications,” in Proceedings of the First Joint
[100] P. Xiao, Z. Hu, D. Liu, G. Yan, and X. Qu, “Virtual machine power WOSP/SIPEW International Conference on Performance Engineer-
measuring technique with bounded error in cloud environments,” ing, ser. WOSP/SIPEW ’10. New York, NY, USA: ACM, 2010,
Journal of Network and Computer Applications, vol. 36, no. 2, pp. pp. 117–128.
818 – 828, 2013. [120] C.-J. Tang and M.-R. Dai, “Dynamic computing resource ad-
[101] G. Dhiman, K. Mihic, and T. Rosing, “A system for online power justment for enhancing energy efficiency of cloud service data
prediction in virtualized environments using gaussian mixture mod- centers,” in System Integration (SII), 2011 IEEE/SICE International
els,” in Proceedings of the 47th Design Automation Conference, ser. Symposium on, Dec 2011, pp. 1159–1164.
DAC ’10. New York, NY, USA: ACM, 2010, pp. 807–812. [121] Y. Yao, L. Huang, A. Sharma, L. Golubchik, and M. Neely, “Data
[102] L. A. Barroso and U. Hölzle, “The datacenter as a computer: An centers power reduction: A two time scale approach for delay
introduction to the design of warehouse-scale machines,” Synthesis tolerant workloads,” in INFOCOM, 2012 Proceedings IEEE, March
lectures on computer architecture, vol. 4, no. 1, pp. 1–108, 2009. 2012, pp. 1431–1439.
[103] K. T. Malladi, B. C. Lee, F. A. Nothaft, C. Kozyrakis, K. Periy- [122] Y. Tian, C. Lin, and K. Li, “Managing performance and power
athambi, and M. Horowitz, “Towards energy-proportional datacen- consumption tradeoff for multiple heterogeneous servers in cloud
ter memory with mobile dram,” in Proceedings of the 39th Annual computing,” Cluster Computing, vol. 17, no. 3, pp. 943–955, 2014.
International Symposium on Computer Architecture, ser. ISCA ’12. [123] B. Subramaniam and W.-C. Feng, “Enabling efficient power pro-
Washington, DC, USA: IEEE Computer Society, 2012, pp. 37–48. visioning for enterprise applications,” in Cluster, Cloud and Grid
[104] D. Economou, S. Rivoire, and C. Kozyrakis, “Full-system power Computing (CCGrid), 2014 14th IEEE/ACM International Sympo-
analysis and modeling for server environments,” in In Workshop on sium on, May 2014, pp. 71–80.
Modeling Benchmarking and Simulation (MOBS), 2006. [124] S.-W. Ham, M.-H. Kim, B.-N. Choi, and J.-W. Jeong, “Simplified
[105] A.-C. Orgerie, L. Lefvre, and I. Gurin-Lassous, “Energy-efficient server model to simulate data center cooling energy consumption,”
bandwidth reservation for bulk data transfers in dedicated wired Energy and Buildings, vol. 86, no. 0, pp. 328 – 339, 2015.
networks,” The Journal of Supercomputing, vol. 62, no. 3, pp. [125] T. Horvath and K. Skadron, “Multi-mode energy management for
1139–1166, 2012. multi-tier server clusters,” in Proceedings of the 17th International
[106] Y. Li, Y. Wang, B. Yin, and L. Guan, “An online power metering Conference on Parallel Architectures and Compilation Techniques,
model for cloud environment,” in Network Computing and Appli- ser. PACT ’08. New York, NY, USA: ACM, 2008, pp. 270–279.
cations (NCA), 2012 11th IEEE International Symposium on, Aug [126] C. Lefurgy, X. Wang, and M. Ware, “Server-level power control,”
2012, pp. 175–180. in Proceedings of the Fourth International Conference on Auto-
[107] X. Xu, K. Teramoto, A. Morales, and H. Huang, “Dual: Reliability- nomic Computing, ser. ICAC ’07. Washington, DC, USA: IEEE
aware power management in data centers,” in Cluster, Cloud and Computer Society, 2007, pp. 4–.
Grid Computing (CCGrid), 2013 13th IEEE/ACM International [127] G. Da Costa and H. Hlavacs, “Methodology of measurement for
Symposium on, May 2013, pp. 530–537. energy consumption of applications,” in Grid Computing (GRID),
[108] E. Elnozahy, M. Kistler, and R. Rajamony, “Energy-efficient server 2010 11th IEEE/ACM International Conference on, Oct 2010, pp.
clusters,” in Power-Aware Computer Systems, ser. Lecture Notes in 290–297.
Computer Science, B. Falsafi and T. Vijaykumar, Eds. Springer [128] J. C. McCullough, Y. Agarwal, J. Chandrashekar, S. Kuppuswamy,
Berlin Heidelberg, 2003, vol. 2325, pp. 179–197. A. C. Snoeren, and R. K. Gupta, “Evaluating the effectiveness of
[109] X. Fan, W.-D. Weber, and L. A. Barroso, “Power provisioning for model-based power characterization,” in Proceedings of the 2011
a warehouse-sized computer,” in Proceedings of the 34th Annual USENIX Conference on USENIX Annual Technical Conference,
International Symposium on Computer Architecture, ser. ISCA ’07. ser. USENIXATC’11. Berkeley, CA, USA: USENIX Association,
New York, NY, USA: ACM, 2007, pp. 13–23. 2011, pp. 12–12.

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 58

[129] T. Enokido and M. Takizawa, “An extended power consumption computing,” in Proceedings of the 2009 International Conference
model for distributed applications,” in Advanced Information Net- on Computer-Aided Design, ser. ICCAD ’09. New York, NY,
working and Applications (AINA), 2012 IEEE 26th International USA: ACM, 2009, pp. 652–657.
Conference on, March 2012, pp. 912–919. [149] R. Bertran, M. Gonzelez, X. Martorell, N. Navarro, and E. Ayguade,
[130] B. Mills, T. Znati, R. Melhem, K. Ferreira, and R. Grant, “Energy “A systematic methodology to generate decomposable and respon-
consumption of resilience mechanisms in large scale systems,” in sive power models for cmps,” Computers, IEEE Transactions on,
Parallel, Distributed and Network-Based Processing (PDP), 2014 vol. 62, no. 7, pp. 1289–1302, July 2013.
22nd Euromicro International Conference on, Feb 2014, pp. 528– [150] W. Bircher and L. John, “Complete system power estimation: A
535. trickle-down approach based on performance events,” in Perfor-
[131] M. Pawlish, A. Varde, and S. Robila, “Analyzing utilization rates mance Analysis of Systems Software, 2007. ISPASS 2007. IEEE
in data centers for optimizing energy management,” in Green International Symposium on, 2007, pp. 158–168.
Computing Conference (IGCC), 2012 International, June 2012, pp. [151] R. Bertran, M. Gonzàlez, X. Martorell, N. Navarro, and E. Ayguadé,
1–6. “Counter-based power modeling methods,” Comput. J., vol. 56,
[132] V. Maccio and D. Down, “On optimal policies for energy- no. 2, pp. 198–213, Feb. 2013.
aware servers,” in Modeling, Analysis Simulation of Computer [152] A. Merkel, J. Stoess, and F. Bellosa, “Resource-conscious schedul-
and Telecommunication Systems (MASCOTS), 2013 IEEE 21st ing for energy efficiency on multicore processors,” in Proceedings
International Symposium on, Aug 2013, pp. 31–39. of the 5th European Conference on Computer Systems, ser. EuroSys
[133] Q. Deng, L. Ramos, R. Bianchini, D. Meisner, and T. Wenisch, ’10. New York, NY, USA: ACM, 2010, pp. 153–166.
“Active low-power modes for main memory with memscale,” [153] S. Wang, H. Chen, and W. Shi, “Span: A software power analyzer
Micro, IEEE, vol. 32, no. 3, pp. 60–69, May 2012. for multicore computer systems,” Sustainable Computing: Infor-
[134] Q. Deng, D. Meisner, A. Bhattacharjee, T. F. Wenisch, and R. Bian- matics and Systems, vol. 1, no. 1, pp. 23 – 34, 2011.
chini, “Multiscale: Memory system dvfs with multiple memory [154] S. Hong and H. Kim, “An integrated gpu power and performance
controllers,” in Proceedings of the 2012 ACM/IEEE International model,” SIGARCH Comput. Archit. News, vol. 38, no. 3, pp. 280–
Symposium on Low Power Electronics and Design, ser. ISLPED 289, Jun. 2010.
’12. New York, NY, USA: ACM, 2012, pp. 297–302. [155] W.-T. Shiue and C. Chakrabarti, “Memory exploration for low
[135] Q. Deng, D. Meisner, A. Bhattacharjee, T. F. Wenisch, and power, embedded systems,” in Design Automation Conference,
R. Bianchini, “Coscale: Coordinating cpu and memory system 1999. Proceedings. 36th, 1999, pp. 140–145.
dvfs in server systems,” in Proceedings of the 2012 45th Annual [156] G. Contreras and M. Martonosi, “Power prediction for intel
IEEE/ACM International Symposium on Microarchitecture, ser. xscale R processors using performance monitoring unit events,” in
MICRO-45. Washington, DC, USA: IEEE Computer Society, Proceedings of the 2005 International Symposium on Low Power
2012, pp. 143–154. Electronics and Design, ser. ISLPED ’05. New York, NY, USA:
[136] J. Janzen, “Calculating memory system power for ddr sdram,” 2001. ACM, 2005, pp. 221–226.
[137] Micron Technology, Inc., “Calculating memory system power for [157] X. Chen, C. Xu, and R. Dick, “Memory access aware on-line volt-
ddr3,” 2007. age control for performance and energy optimization,” in Computer-
[138] J. Jeffers and J. Reinders, “Chapter 1 - introduction,” in Intel Xeon Aided Design (ICCAD), 2010 IEEE/ACM International Conference
Phi Coprocessor High Performance Programming, J. Jeffers and on, Nov 2010, pp. 365–372.
J. Reinders, Eds. Boston: Morgan Kaufmann, 2013, pp. 1 – 22. [158] A. Merkel and F. Bellosa, “Balancing power consumption
[139] Intel, “The problem of power consumption in servers, energy in multiprocessor systems,” in Proceedings of the 1st ACM
efficiency for information technology,” 2009. SIGOPS/EuroSys European Conference on Computer Systems
[140] I. Moreno and J. Xu, “Neural network-based overallocation for 2006, ser. EuroSys ’06. New York, NY, USA: ACM, 2006, pp.
improved energy-efficiency in real-time cloud environments,” in 403–414.
Object/Component/Service-Oriented Real-Time Distributed Com- [159] R. Basmadjian and H. de Meer, “Evaluating and modeling power
puting (ISORC), 2012 IEEE 15th International Symposium on, consumption of multi-core processors,” in Proceedings of the 3rd
April 2012, pp. 119–126. International Conference on Future Energy Systems: Where Energy,
[141] R. Joseph and M. Martonosi, “Run-time power estimation in Computing and Communication Meet, ser. e-Energy ’12. New
high performance microprocessors,” in Low Power Electronics and York, NY, USA: ACM, 2012, pp. 12:1–12:10.
Design, International Symposium on, 2001., 2001, pp. 135–140. [160] K. Li, “Optimal configuration of a multicore server processor for
[142] R. Hameed, W. Qadeer, M. Wachs, O. Azizi, A. Solomatnikov, managing the power and performance tradeoff,” The Journal of
B. C. Lee, S. Richardson, C. Kozyrakis, and M. Horowitz, “Un- Supercomputing, vol. 61, no. 1, pp. 189–214, 2012.
derstanding sources of inefficiency in general-purpose chips, isca,” [161] J. Cao, K. Li, and I. Stojmenovic, “Optimal power allocation
2010. and load distribution for multiple heterogeneous multicore server
[143] S. Li, J. H. Ahn, R. D. Strong, J. B. Brockman, D. M. Tullsen, and processors across clouds and data centers,” Computers, IEEE Trans-
N. P. Jouppi, “The mcpat framework for multicore and manycore actions on, vol. 63, no. 1, pp. 45–58, Jan 2014.
architectures: Simultaneously modeling power, area, and timing,” [162] Q. Liu, M. Moreto, V. Jimenez, J. Abella, F. J. Cazorla, and
ACM Trans. Archit. Code Optim., vol. 10, no. 1, pp. 5:1–5:29, Apr. M. Valero, “Hardware support for accurate per-task energy metering
2013. in multicore systems,” ACM Trans. Archit. Code Optim., vol. 10,
[144] V. Zyuban, J. Friedrich, C. Gonzalez, R. Rao, M. Brown, no. 4, pp. 34:1–34:27, Dec. 2013.
M. Ziegler, H. Jacobson, S. Islam, S. Chu, P. Kartschoke, [163] W. Shi, S. Wang, and B. Luo, “Cpt: An energy-efficiency model
G. Fiorenza, M. Boersma, and J. Culp, “Power optimization for multi-core computer systems,” 2013.
methodology for the ibm power7 microprocessor,” IBM Journal [164] O. Sarood, A. Langer, A. Gupta, and L. Kale, “Maximizing
of Research and Development, vol. 55, no. 3, pp. 7:1–7:9, May throughput of overprovisioned hpc data centers under a strict power
2011. budget,” in Proceedings of the International Conference for High
[145] T. Diop, N. E. Jerger, and J. Anderson, “Power modeling for Performance Computing, Networking, Storage and Analysis, ser.
heterogeneous processors,” in Proceedings of Workshop on General SC ’14. Piscataway, NJ, USA: IEEE Press, 2014, pp. 807–818.
Purpose Processing Using GPUs, ser. GPGPU-7. New York, NY, [165] V. Jiménez, F. J. Cazorla, R. Gioiosa, M. Valero, C. Boneti, E. Kur-
USA: ACM, 2014, pp. 90:90–90:98. sun, C.-Y. Cher, C. Isci, A. Buyuktosunoglu, and P. Bose, “Power
[146] T. Li and L. K. John, “Run-time modeling and estimation of and thermal characterization of power6 system,” in Proceedings of
operating system power consumption,” in Proceedings of the 2003 the 19th International Conference on Parallel Architectures and
ACM SIGMETRICS International Conference on Measurement and Compilation Techniques, ser. PACT ’10. New York, NY, USA:
Modeling of Computer Systems, ser. SIGMETRICS ’03. New York, ACM, 2010, pp. 7–18.
NY, USA: ACM, 2003, pp. 160–171. [166] R. Bertran, A. Buyuktosunoglu, M. Gupta, M. Gonzalez, and
[147] M. Jarus, A. Oleksiak, T. Piontek, and J. Weglarz, “Runtime power P. Bose, “Systematic energy characterization of cmp/smt processor
usage estimation of {HPC} servers for various classes of real-life systems via automated micro-benchmarks,” in Microarchitecture
applications,” Future Generation Computer Systems, vol. 36, no. 0, (MICRO), 2012 45th Annual IEEE/ACM International Symposium
pp. 299 – 310, 2014. on, Dec 2012, pp. 199–211.
[148] D. Shin, J. Kim, N. Chang, J. Choi, S. W. Chung, and E.-Y. [167] R. Bertran, Y. Becerra, D. Carrera, V. Beltran, M. Gonzalez Tal-
Chung, “Energy-optimal dynamic thermal management for green lada, X. Martorell, J. Torres, and E. Ayguade, “Accurate energy

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 59

accounting for shared virtualized environments using pmc-based [185] W. Stallings, Computer organization and architecture: designing
power modeling techniques,” in Grid Computing (GRID), 2010 11th for performance. Pearson Education Inc., 2010.
IEEE/ACM International Conference on, 2010, pp. 1–8. [186] J. Lin, H. Zheng, Z. Zhu, H. David, and Z. Zhang, “Thermal
[168] R. Bertran, M. Gonzlez, X. Martorell, N. Navarro, and E. Ayguad, modeling and management of dram memory systems,” SIGARCH
“Counter-based power modeling methods: Top-down vs. bottom- Comput. Archit. News, vol. 35, no. 2, pp. 312–322, Jun. 2007.
up,” The Computer Journal, 2012. [187] N. Vijaykrishnan, M. Kandemir, M. J. Irwin, H. S. Kim, and W. Ye,
[169] X. Fu and X. Wang, “Utilization-controlled task consolidation for “Energy-driven integrated hardware-software optimizations using
power optimization in multi-core real-time systems,” in Embedded simplepower,” SIGARCH Comput. Archit. News, vol. 28, no. 2, pp.
and Real-Time Computing Systems and Applications (RTCSA), 95–106, May 2000.
2011 IEEE 17th International Conference on, vol. 1, Aug 2011, [188] J. H. Ahn, N. P. Jouppi, C. Kozyrakis, J. Leverich, and R. S.
pp. 73–82. Schreiber, “Improving system energy efficiency with memory rank
[170] X. Qi and D. Zhu, “Power management for real-time embedded subsetting,” ACM Trans. Archit. Code Optim., vol. 9, no. 1, pp.
systems on block-partitioned multicore platforms,” in Embedded 4:1–4:28, Mar. 2012.
Software and Systems, 2008. ICESS ’08. International Conference [189] M. Rhu, M. Sullivan, J. Leng, and M. Erez, “A locality-aware
on, July 2008, pp. 110–117. memory hierarchy for energy-efficient gpu architectures,” in Pro-
[171] Y. Shao and D. Brooks, “Energy characterization and instruction- ceedings of the 46th Annual IEEE/ACM International Symposium
level energy model of intel’s xeon phi processor,” in Low Power on Microarchitecture, ser. MICRO-46. New York, NY, USA:
Electronics and Design (ISLPED), 2013 IEEE International Sym- ACM, 2013, pp. 86–98.
posium on, Sept 2013, pp. 389–394. [190] H. David, C. Fallin, E. Gorbatov, U. R. Hanebutte, and O. Mutlu,
[172] S. Kim, I. Roy, and V. Talwar, “Evaluating integrated graphics pro- “Memory power management via dynamic voltage/frequency scal-
cessors for data center workloads,” in Proceedings of the Workshop ing,” in Proceedings of the 8th ACM International Conference on
on Power-Aware Computing and Systems, ser. HotPower ’13. New Autonomic Computing, ser. ICAC ’11. New York, NY, USA: ACM,
York, NY, USA: ACM, 2013, pp. 8:1–8:5. 2011, pp. 31–40.
[173] J. Chen, B. Li, Y. Zhang, L. Peng, and J.-K. Peir, “Tree structured [191] K. T. Malladi, I. Shaeffer, L. Gopalakrishnan, D. Lo, B. C. Lee,
analysis on gpu power study,” in Computer Design (ICCD), 2011 and M. Horowitz, “Rethinking dram power modes for energy pro-
IEEE 29th International Conference on, Oct 2011, pp. 57–64. portionality,” in Proceedings of the 2012 45th Annual IEEE/ACM
[174] J. Chen, B. Li, Y. Zhang, L. Peng, and J.-K. Peir, “Statistical gpu International Symposium on Microarchitecture, ser. MICRO-45.
power analysis using tree-based methods,” in Green Computing Washington, DC, USA: IEEE Computer Society, 2012, pp. 131–
Conference and Workshops (IGCC), 2011 International, July 2011, 142.
pp. 1–6. [192] M. Sri-Jayantha, “Trends in mobile storage design,” in Low Power
Electronics, 1995., IEEE Symposium on, Oct 1995, pp. 54–57.
[175] J. Leng, T. Hetherington, A. ElTantawy, S. Gilani, N. S. Kim,
[193] Y. Zhang, S. Gurumurthi, and M. R. Stan, “Soda: Sensitivity
T. M. Aamodt, and V. J. Reddi, “Gpuwattch: Enabling energy
based optimization of disk architecture,” in Proceedings of the 44th
optimizations in gpgpus,” SIGARCH Comput. Archit. News, vol. 41,
Annual Design Automation Conference, ser. DAC ’07. New York,
no. 3, pp. 487–498, Jun. 2013.
NY, USA: ACM, 2007, pp. 865–870.
[176] J. Leng, Y. Zu, M. Rhu, M. Gupta, and V. J. Reddi, “Gpuvolt:
[194] S. Gurumurthi and A. Sivasubramaniam, Energy-Efficient Storage
Modeling and characterizing voltage noise in gpu architectures,” in
Systems for Data Centers. John Wiley & Sons, Inc., 2012, pp.
Proceedings of the 2014 International Symposium on Low Power
361–376.
Electronics and Design, ser. ISLPED ’14. New York, NY, USA:
[195] S. Sankar, Y. Zhang, S. Gurumurthi, and M. Stan, “Sensitivity-based
ACM, 2014, pp. 141–146.
optimization of disk architecture,” Computers, IEEE Transactions
[177] J. Lim, N. B. Lakshminarayana, H. Kim, W. Song, S. Yalamanchili, on, vol. 58, no. 1, pp. 69–81, Jan 2009.
and W. Sung, “Power modeling for gpu architectures using mcpat,” [196] I. Sato, K. Otani, M. Mizukami, S. Oguchi, K. Hoshiya, and
ACM Trans. Des. Autom. Electron. Syst., vol. 19, no. 3, pp. 26:1– K.-i. Shimokura, “Characteristics of heat transfer in small disk
26:24, Jun. 2014. enclosures at high rotation speeds,” Components, Hybrids, and
[178] K. Kasichayanula, D. Terpstra, P. Luszczek, S. Tomov, S. Moore, Manufacturing Technology, IEEE Transactions on, vol. 13, no. 4,
and G. Peterson, “Power aware computing on gpus,” in Application pp. 1006–1011, 1990.
Accelerators in High Performance Computing (SAAHPC), 2012 [197] A. Hylick, R. Sohan, A. Rice, and B. Jones, “An analysis of hard
Symposium on, July 2012, pp. 64–73. drive energy consumption,” in Modeling, Analysis and Simulation
[179] J. Zhao, G. Sun, G. H. Loh, and Y. Xie, “Optimizing gpu energy of Computers and Telecommunication Systems, 2008. MASCOTS
efficiency with 3d die-stacking graphics memory and reconfigurable 2008. IEEE International Symposium on, Sept 2008, pp. 1–10.
memory interface,” ACM Trans. Archit. Code Optim., vol. 10, no. 4, [198] A. Hylick and R. Sohan, “A methodology for generating disk drive
pp. 24:1–24:25, Dec. 2013. energy models using performance data,” Energy (Joules), vol. 80,
[180] D.-Q. Ren and R. Suda, “Global optimization model on power effi- p. 100, 2009.
ciency of gpu and multicore processing element for simd computing [199] J. Zedlewski, S. Sobti, N. Garg, F. Zheng, A. Krishnamurthy, and
with cuda,” Computer Science - Research and Development, vol. 27, R. Wang, “Modeling hard-disk power consumption,” in Proceedings
no. 4, pp. 319–327, 2012. of the 2Nd USENIX Conference on File and Storage Technologies,
[181] A. Marowka, “Analytical modeling of energy efficiency in hetero- ser. FAST ’03. Berkeley, CA, USA: USENIX Association, 2003,
geneous processors,” Computers & Electrical Engineering, vol. 39, pp. 217–230.
no. 8, pp. 2566 – 2578, 2013. [200] Q. Zhu, Z. Chen, L. Tan, Y. Zhou, K. Keeton, and J. Wilkes,
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials.
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015
This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.

[182] M. Rofouei, T. Stathopoulos, S. Ryffel, W. Kaiser, and M. Sarrafzadeh, "Energy-aware high performance computing with graphic processing units," in Proceedings of the 2008 Conference on Power Aware Computing and Systems, ser. HotPower'08. Berkeley, CA, USA: USENIX Association, 2008, pp. 11–11.
[183] B. Giridhar, M. Cieslak, D. Duggal, R. Dreslinski, H. M. Chen, R. Patti, B. Hold, C. Chakrabarti, T. Mudge, and D. Blaauw, "Exploring dram organizations for energy-efficient and resilient exascale memories," in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, ser. SC '13. New York, NY, USA: ACM, 2013, pp. 23:1–23:12.
[184] J. Lin, H. Zheng, Z. Zhu, E. Gorbatov, H. David, and Z. Zhang, "Software thermal management of dram memory for multicore systems," in Proceedings of the 2008 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, ser. SIGMETRICS '08. New York, NY, USA: ACM, 2008, pp. 337–348.
"Hibernator: Helping disk arrays sleep through the winter," SIGOPS Oper. Syst. Rev., vol. 39, no. 5, pp. 177–190, Oct. 2005.
[201] T. Bostoen, S. Mullender, and Y. Berbers, "Analysis of disk power management for data-center storage systems," in Future Energy Systems: Where Energy, Computing and Communication Meet (e-Energy), 2012 Third International Conference on, May 2012, pp. 1–10.
[202] Y. Deng, "What is the future of disk drives, death or rebirth?" ACM Comput. Surv., vol. 43, no. 3, pp. 23:1–23:27, Apr. 2011.
[203] K. Kant, "Data center evolution: A tutorial on state of the art, issues, and challenges," Computer Networks, vol. 53, no. 17, pp. 2939–2965, 2009 (Virtualized Data Centers).
[204] D. Andersen and S. Swanson, "Rethinking flash in the data center," Micro, IEEE, vol. 30, no. 4, pp. 52–54, July 2010.
[205] J. Park, S. Yoo, S. Lee, and C. Park, "Power modeling of solid state disk for dynamic power management policy design in embedded systems," in Software Technologies for Embedded and Ubiquitous Systems, ser. Lecture Notes in Computer Science, S. Lee and
P. Narasimhan, Eds. Springer Berlin Heidelberg, 2009, vol. 5860, pp. 24–35.
[206] V. Mohan, T. Bunker, L. Grupp, S. Gurumurthi, M. Stan, and S. Swanson, "Modeling power consumption of nand flash memories using flashpower," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 32, no. 7, pp. 1031–1044, July 2013.
[207] S. J. E. Wilton and N. Jouppi, "Cacti: an enhanced cache access and cycle time model," Solid-State Circuits, IEEE Journal of, vol. 31, no. 5, pp. 677–688, May 1996.
[208] Z. Li, K. M. Greenan, A. W. Leung, and E. Zadok, "Power consumption in enterprise-scale backup storage systems," in Proceedings of the 10th USENIX Conference on File and Storage Technologies, ser. FAST'12. Berkeley, CA, USA: USENIX Association, 2012, pp. 6–6.
[209] M. Allalouf, Y. Arbitman, M. Factor, R. I. Kat, K. Meth, and D. Naor, "Storage modeling for power estimation," in Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, ser. SYSTOR '09. New York, NY, USA: ACM, 2009, pp. 3:1–3:10.
[210] T. Inoue, A. Aikebaier, T. Enokido, and M. Takizawa, "A power consumption model of a storage server," in Network-Based Information Systems (NBiS), 2011 14th International Conference on, Sept 2011, pp. 382–387.
[211] A. Gandhi, M. Harchol-Balter, and I. Adan, "Server farms with setup costs," Performance Evaluation, vol. 67, no. 11, pp. 1123–1138, 2010.
[212] M. Mazzucco, D. Dyachuk, and M. Dikaiakos, "Profit-aware server allocation for green internet services," in Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2010 IEEE International Symposium on, 2010, pp. 277–284.
[213] M. Mazzucco and D. Dyachuk, "Balancing electricity bill and performance in server farms with setup costs," Future Generation Computer Systems, vol. 28, no. 2, pp. 415–426, 2012.
[214] A. Gandhi, M. Harchol-Balter, R. Das, and C. Lefurgy, "Optimal power allocation in server farms," in Proceedings of the Eleventh International Joint Conference on Measurement and Modeling of Computer Systems, ser. SIGMETRICS '09. New York, NY, USA: ACM, 2009, pp. 157–168.
[215] R. Lent, "Analysis of an energy proportional data center," Ad Hoc Networks, vol. 25, Part B, no. 0, pp. 554–564, 2015 (New Research Challenges in Mobile, Opportunistic and Delay-Tolerant Networks; Energy-Aware Data Centers: Architecture, Infrastructure, and Communication).
[216] I. Mitrani, "Trading power consumption against performance by reserving blocks of servers," in Computer Performance Engineering, ser. Lecture Notes in Computer Science, M. Tribastone and S. Gilmore, Eds. Springer Berlin Heidelberg, 2013, vol. 7587, pp. 1–15.
[217] Z. Liu, Y. Chen, C. Bash, A. Wierman, D. Gmach, Z. Wang, M. Marwah, and C. Hyser, "Renewable and cooling aware workload management for sustainable data centers," in Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, ser. SIGMETRICS '12. New York, NY, USA: ACM, 2012, pp. 175–186.
[218] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, "Cutting the electric bill for internet-scale systems," SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 123–134, Aug. 2009.
[219] M. Pedram, "Energy-efficient datacenters," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 31, no. 10, pp. 1465–1484, Oct 2012.
[220] J. Moore, J. Chase, P. Ranganathan, and R. Sharma, "Making scheduling "cool": Temperature-aware workload placement in data centers," in Proceedings of the Annual Conference on USENIX Annual Technical Conference, ser. ATEC '05. Berkeley, CA, USA: USENIX Association, 2005, pp. 5–5.
[221] D. Duolikun, T. Enokido, A. Aikebaier, and M. Takizawa, "Energy-efficient dynamic clusters of servers," The Journal of Supercomputing, pp. 1–15, 2014.
[222] A. Aikebaier, Y. Yang, T. Enokido, and M. Takizawa, "Energy-efficient computation models for distributed systems," in Network-Based Information Systems, 2009. NBIS '09. International Conference on, Aug 2009, pp. 424–431.
[223] A. Aikebaier, T. Enokido, and M. Takizawa, "Distributed cluster architecture for increasing energy efficiency in cluster systems," in Parallel Processing Workshops, 2009. ICPPW '09. International Conference on, Sept 2009, pp. 470–477.
[224] T. Inoue, A. Aikebaier, T. Enokido, and M. Takizawa, "Evaluation of an energy-aware selection algorithm for computation and storage-based applications," in Network-Based Information Systems (NBiS), 2012 15th International Conference on, Sept 2012, pp. 120–127.
[225] L. Velasco, A. Asensio, J. Berral, E. Bonetto, F. Musumeci, and V. López, "Elastic operations in federated datacenters for performance and cost optimization," Computer Communications, vol. 50, no. 0, pp. 142–151, 2014 (Green Networking).
[226] D. A. Patterson, "Technical perspective: The data center is the computer," Commun. ACM, vol. 51, no. 1, pp. 105–105, Jan. 2008.
[227] B. Heller, S. Seetharaman, P. Mahadevan, Y. Yiakoumis, P. Sharma, S. Banerjee, and N. McKeown, "Elastictree: Saving energy in data center networks," in Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI'10. Berkeley, CA, USA: USENIX Association, 2010, pp. 17–17.
[228] C. Kachris and I. Tomkos, "Power consumption evaluation of all-optical data center networks," Cluster Computing, vol. 16, no. 3, pp. 611–623, 2013.
[229] D. Abts, M. R. Marty, P. M. Wells, P. Klausler, and H. Liu, "Energy proportional datacenter networks," SIGARCH Comput. Archit. News, vol. 38, no. 3, pp. 338–347, Jun. 2010.
[230] J. Shuja, S. Madani, K. Bilal, K. Hayat, S. Khan, and S. Sarwar, "Energy-efficient data centers," Computing, vol. 94, no. 12, pp. 973–994, 2012.
[231] Q. Yi and S. Singh, "Minimizing energy consumption of fattree data center networks," SIGMETRICS Perform. Eval. Rev., vol. 42, no. 3, pp. 67–72, Dec. 2014.
[232] I. Widjaja, A. Walid, Y. Luo, Y. Xu, and H. Chao, "Small versus large: Switch sizing in topology design of energy-efficient data centers," in Quality of Service (IWQoS), 2013 IEEE/ACM 21st International Symposium on, June 2013, pp. 1–6.
[233] R. Tucker, "Green optical communications - part ii: Energy limitations in networks," Selected Topics in Quantum Electronics, IEEE Journal of, vol. 17, no. 2, pp. 261–274, March 2011.
[234] K. Hinton, G. Raskutti, P. Farrell, and R. Tucker, "Switching energy and device size limits on digital photonic signal processing technologies," Selected Topics in Quantum Electronics, IEEE Journal of, vol. 14, no. 3, pp. 938–945, May 2008.
[235] Y. Zhang and N. Ansari, "Hero: Hierarchical energy optimization for data center networks," Systems Journal, IEEE, vol. PP, no. 99, pp. 1–10, 2013.
[236] H. Jin, T. Cheocherngngarn, D. Levy, A. Smith, D. Pan, J. Liu, and N. Pissinou, "Joint host-network optimization for energy-efficient data center networking," in Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on, May 2013, pp. 623–634.
[237] D. Li, Y. Shang, and C. Chen, "Software defined green data center network with exclusive routing," in INFOCOM, 2014 Proceedings IEEE, April 2014, pp. 1743–1751.
[238] L. Niccolini, G. Iannaccone, S. Ratnasamy, J. Chandrashekar, and L. Rizzo, "Building a power-proportional software router," in Proceedings of the 2012 USENIX Conference on Annual Technical Conference, ser. USENIX ATC'12. Berkeley, CA, USA: USENIX Association, 2012, pp. 8–8.
[239] P. Mahadevan, P. Sharma, S. Banerjee, and P. Ranganathan, "A power benchmarking framework for network devices," in Proceedings of the 8th International IFIP-TC 6 Networking Conference, ser. NETWORKING '09. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 795–808.
[240] Cisco, "Cisco nexus 9500 platform switches," Cisco Nexus 9000 Series Switches, 2014.
[241] E. Bonetto, A. Finamore, M. Mellia, and R. Fiandra, "Energy efficiency in access and aggregation networks: From current traffic to potential savings," Computer Networks, vol. 65, no. 0, pp. 151–166, 2014.
[242] A. Vishwanath, K. Hinton, R. Ayre, and R. Tucker, "Modelling energy consumption in high-capacity routers and switches," vol. PP, no. 99, 2014, pp. 1–1.
[243] D. C. Kilper and R. S. Tucker, "Chapter 17 - energy-efficient telecommunications," in Optical Fiber Telecommunications (Sixth Edition), sixth edition ed., ser. Optics and Photonics, I. P. Kaminow, T. Li, and A. E. Willner, Eds. Boston: Academic Press, 2013, pp. 747–791.
[244] F. Jalali, C. Gray, A. Vishwanath, R. Ayre, T. Alpcan, K. Hinton, and R. Tucker, "Energy consumption of photo sharing in online social networks," in Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on, May 2014, pp. 604–611.
[245] H. Hlavacs, G. Da Costa, and J. Pierson, "Energy consumption of residential and professional switches," in Computational Science and Engineering, 2009. CSE '09. International Conference on, vol. 1, Aug 2009, pp. 240–246.
[246] P. Mahadevan, S. Banerjee, and P. Sharma, "Energy proportionality of an enterprise network," in Proceedings of the First ACM SIGCOMM Workshop on Green Networking, ser. Green Networking '10. New York, NY, USA: ACM, 2010, pp. 53–60.
[247] J. Ahn and H.-S. Park, "Measurement and modeling the power consumption of router interface," in Advanced Communication Technology (ICACT), 2014 16th International Conference on, Feb 2014, pp. 860–863.
[248] R. Bolla, R. Bruschi, F. Davoli, L. Di Gregorio, P. Donadio, L. Fialho, M. Collier, A. Lombardo, D. Reforgiato Recupero, and T. Szemethy, "The green abstraction layer: A standard power-management interface for next-generation network devices," Internet Computing, IEEE, vol. 17, no. 2, pp. 82–86, March 2013.
[249] R. Bolla, R. Bruschi, O. M. Jaramillo Ortiz, and P. Lago, "The energy consumption of tcp," in Proceedings of the Fourth International Conference on Future Energy Systems, ser. e-Energy '13. New York, NY, USA: ACM, 2013, pp. 203–212.
[250] R. Basmadjian, H. Meer, R. Lent, and G. Giuliani, "Cloud computing and its interest in saving energy: the use case of a private cloud," Journal of Cloud Computing, vol. 1, no. 1, 2012.
[251] C. Kachris and I. Tomkos, "Power consumption evaluation of hybrid wdm pon networks for data centers," in Networks and Optical Communications (NOC), 2011 16th European Conference on, July 2011, pp. 118–121.
[252] W. Van Heddeghem, F. Idzikowski, W. Vereecken, D. Colle, M. Pickavet, and P. Demeester, "Power consumption modeling in optical multilayer networks," Photonic Network Communications, vol. 24, no. 2, pp. 86–102, 2012.
[253] M. McGarry, M. Reisslein, and M. Maier, "Wdm ethernet passive optical networks," Communications Magazine, IEEE, vol. 44, no. 2, pp. 15–22, Feb 2006.
[254] T. Shimada, N. Sakurai, and K. Kumozaki, "Wdm access system based on shared demultiplexer and mmf links," Lightwave Technology, Journal of, vol. 23, no. 9, pp. 2621–2628, Sept 2005.
[255] S. Pelley, D. Meisner, P. Zandevakili, T. F. Wenisch, and J. Underwood, "Power routing: Dynamic power provisioning in the data center," SIGPLAN Not., vol. 45, no. 3, pp. 231–242, Mar. 2010.
[256] E. Oró, V. Depoorter, A. Garcia, and J. Salom, "Energy efficiency and renewable energy integration in data centres. strategies and modelling review," Renewable and Sustainable Energy Reviews, vol. 42, no. 0, pp. 429–445, 2015.
[257] D. Bouley and W. Torell, "Containerized power and cooling modules for data centers," 2012.
[258] S. Govindan, J. Choi, B. Urgaonkar, A. Sivasubramaniam, and A. Baldini, "Statistical profiling-based techniques for effective power provisioning in data centers," in Proceedings of the 4th ACM European Conference on Computer Systems, ser. EuroSys '09. New York, NY, USA: ACM, 2009, pp. 317–330.
[259] H. Luo, B. Khargharia, S. Hariri, and Y. Al-Nashif, "Autonomic green computing in large-scale data centers," in Energy-Efficient Distributed Computing Systems. John Wiley & Sons, Inc., 2012, pp. 271–299.
[260] X. Fu, X. Wang, and C. Lefurgy, "How much power oversubscription is safe and allowed in data centers," in Proceedings of the 8th ACM International Conference on Autonomic Computing, ser. ICAC '11. New York, NY, USA: ACM, 2011, pp. 21–30.
[261] D. Wang, C. Ren, and A. Sivasubramaniam, "Virtualizing power distribution in datacenters," in Proceedings of the 40th Annual International Symposium on Computer Architecture, ser. ISCA '13. New York, NY, USA: ACM, 2013, pp. 595–606.
[262] S. Pelley, D. Meisner, T. F. Wenisch, and J. W. VanGilder, "Understanding and abstracting total data center power," in Workshop on Energy-Efficient Design, 2009.
[263] N. Rasmussen, "Electrical efficiency modeling of data centers," 2006.
[264] P. Anderson, G. Backhouse, D. Curtis, S. Redding, and D. Wallom, "Low carbon computing: a view to 2050 and beyond," 2009.
[265] E. Lee, I. Kulkarni, D. Pompili, and M. Parashar, "Proactive thermal management in green datacenters," The Journal of Supercomputing, vol. 60, no. 2, pp. 165–195, 2012.
[266] Z. Abbasi, G. Varsamopoulos, and S. K. S. Gupta, "Tacoma: Server and workload management in internet data centers considering cooling-computing power trade-off and energy proportionality," ACM Trans. Archit. Code Optim., vol. 9, no. 2, pp. 11:1–11:37, Jun. 2012.
[267] N. Rasmussen, "Calculating total cooling requirements for data centers," White Paper, vol. 25, pp. 1–8, 2007.
[268] E. Pakbaznia and M. Pedram, "Minimizing data center cooling and server power costs," in Proceedings of the 14th ACM/IEEE International Symposium on Low Power Electronics and Design, ser. ISLPED '09. New York, NY, USA: ACM, 2009, pp. 145–150.
[269] A. Vasan, A. Sivasubramaniam, V. Shimpi, T. Sivabalan, and R. Subbiah, "Worth their watts? - an empirical study of datacenter servers," in High Performance Computer Architecture (HPCA), 2010 IEEE 16th International Symposium on, Jan 2010, pp. 1–10.
[270] C. Lefurgy, K. Rajamani, F. Rawson, W. Felter, M. Kistler, and T. Keller, "Energy management for commercial servers," Computer, vol. 36, no. 12, pp. 39–48, Dec 2003.
[271] J. Kim, M. Sabry, D. Atienza, K. Vaidyanathan, and K. Gross, "Global fan speed control considering non-ideal temperature measurements in enterprise servers," in Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014, March 2014, pp. 1–6.
[272] M. Zapater, J. L. Ayala, J. M. Moya, K. Vaidyanathan, K. Gross, and A. K. Coskun, "Leakage and temperature aware server control for improving energy efficiency in data centers," in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013, March 2013, pp. 266–269.
[273] D. Meisner and T. F. Wenisch, "Does low-power design imply energy efficiency for data centers?" in Proceedings of the 17th IEEE/ACM International Symposium on Low-power Electronics and Design, ser. ISLPED '11. Piscataway, NJ, USA: IEEE Press, 2011, pp. 109–114.
[274] O. Mämmelä, M. Majanen, R. Basmadjian, H. Meer, A. Giesler, and W. Homberg, "Energy-aware job scheduler for high-performance computing," Computer Science - Research and Development, vol. 27, no. 4, pp. 265–275, 2012.
[275] Cisco, "Cisco energy efficient data center solutions and best practices," 2007.
[276] M. Patterson, "The effect of data center temperature on energy efficiency," in Thermal and Thermomechanical Phenomena in Electronic Systems, 2008. ITHERM 2008. 11th Intersociety Conference on, May 2008, pp. 1167–1174.
[277] R. Ghosh, V. Sundaralingam, and Y. Joshi, "Effect of rack server population on temperatures in data centers," in Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm), 2012 13th IEEE Intersociety Conference on, May 2012, pp. 30–37.
[278] J. Dai, M. Ohadi, D. Das, and M. Pecht, "The telecom industry and data centers," in Optimum Cooling of Data Centers. Springer New York, 2014, pp. 1–8.
[279] H. Ma and C. Chen, "Development of a divided zone method for power savings in a data center," in Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM), 2013 29th Annual IEEE, March 2013, pp. 33–38.
[280] M. David and R. Schmidt, "Impact of ashrae environmental classes on data centers," in Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm), 2014 IEEE Intersociety Conference on, May 2014, pp. 1092–1099.
[281] S. K. S. Gupta, A. Banerjee, Z. Abbasi, G. Varsamopoulos, M. Jonas, J. Ferguson, R. R. Gilbert, and T. Mukherjee, "Gdcsim: A simulator for green data center design and analysis," ACM Trans. Model. Comput. Simul., vol. 24, no. 1, pp. 3:1–3:27, Jan. 2014.
[282] T. Malkamaki and S. Ovaska, "Data centers and energy balance in finland," in Green Computing Conference (IGCC), 2012 International, June 2012, pp. 1–6.
[283] X. Zhan and S. Reda, "Techniques for energy-efficient power budgeting in data centers," in Proceedings of the 50th Annual Design Automation Conference, ser. DAC '13. New York, NY, USA: ACM, 2013, pp. 176:1–176:7.
[284] J. Doyle, R. Shorten, and D. O'Mahony, "Stratus: Load balancing the cloud for carbon emissions control," Cloud Computing, IEEE Transactions on, vol. 1, no. 1, pp. 1–1, Jan 2013.
[285] R. Das, J. O. Kephart, J. Lenchner, and H. Hamann, "Utility-function-driven energy-efficient cooling in data centers," in Proceedings of the 7th International Conference on Autonomic Computing, ser. ICAC '10. New York, NY, USA: ACM, 2010, pp. 61–70.
[286] R. Das, S. Yarlanki, H. Hamann, J. O. Kephart, and V. Lopez, "A unified approach to coordinated energy-management in data centers," in Proceedings of the 7th International Conference on Network and Services Management, ser. CNSM '11. Laxenburg, Austria: International Federation for Information Processing, 2011, pp. 504–508.
[287] H. Hamann, M. Schappert, M. Iyengar, T. van Kessel, and A. Claassen, "Methods and techniques for measuring and improving data center best practices," in Thermal and Thermomechanical Phenomena in Electronic Systems, 2008. ITHERM 2008. 11th Intersociety Conference on, May 2008, pp. 1146–1152.
[288] H. Hamann, T. van Kessel, M. Iyengar, J. Y. Chung, W. Hirt, M. Schappert, A. Claassen, J. Cook, W. Min, Y. Amemiya, V. Lopez, J. Lacey, and M. O'Boyle, "Uncovering energy-efficiency opportunities in data centers," IBM Journal of Research and Development, vol. 53, no. 3, pp. 10:1–10:12, May 2009.
[289] R. T. Kaushik and K. Nahrstedt, "T: A data-centric cooling energy costs reduction approach for big data analytics cloud," in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, ser. SC '12. Los Alamitos, CA, USA: IEEE Computer Society Press, 2012, pp. 52:1–52:11.
[290] X. Han and Y. Joshi, "Energy reduction in server cooling via real time thermal control," in Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM), 2012 28th Annual IEEE, March 2012, pp. 20–27.
[291] J. Tu, L. Lu, M. Chen, and R. K. Sitaraman, "Dynamic provisioning in next-generation data centers with on-site power production," in Proceedings of the Fourth International Conference on Future Energy Systems, ser. e-Energy '13. New York, NY, USA: ACM, 2013, pp. 137–148.
[292] X. Zheng and Y. Cai, "Energy-aware load dispatching in geographically located internet data centers," Sustainable Computing: Informatics and Systems, vol. 1, no. 4, pp. 275–285, 2011.
[293] B. Whitehead, D. Andrews, A. Shah, and G. Maidment, "Assessing the environmental impact of data centres part 2: Building environmental assessment methods and life cycle assessment," Building and Environment, no. 0, pp. –, 2014.
[294] L. Wang and S. Khan, "Review of performance metrics for green data centers: a taxonomy study," The Journal of Supercomputing, vol. 63, no. 3, pp. 639–656, 2013.
[295] P. Rad, M. Thoene, and T. Webb, "Best practices for increasing data center energy efficiency," Dell Power Solutions Magazine, 2008.
[296] G. GRID, "Pue: A comprehensive examination of the metric," 2012.
[297] A. Khosravi, S. Garg, and R. Buyya, "Energy and carbon-efficient placement of virtual machines in distributed cloud data centers," in Euro-Par 2013 Parallel Processing, ser. Lecture Notes in Computer Science, F. Wolf, B. Mohr, and D. Mey, Eds. Springer Berlin Heidelberg, 2013, vol. 8097, pp. 317–328.
[298] J. Gao, "Machine learning applications for data center optimization," 2014.
[299] J. Dai, M. Ohadi, D. Das, and M. Pecht, "Data center energy flow and efficiency," in Optimum Cooling of Data Centers. Springer New York, 2014, pp. 9–30.
[300] K. Choo, R. M. Galante, and M. M. Ohadi, "Energy consumption analysis of a medium-size primary data center in an academic campus," Energy and Buildings, vol. 76, no. 0, pp. 414–421, 2014.
[301] R. Zhou, Y. Shi, and C. Zhu, "Axpue: Application level metrics for power usage effectiveness in data centers," in Big Data, 2013 IEEE International Conference on, Oct 2013, pp. 110–117.
[302] R. Giordanelli, C. Mastroianni, M. Meo, G. Papuzzo, and A. Roscetti, "Saving energy in data centers," 2013.
[303] G. A. Brady, N. Kapur, J. L. Summers, and H. M. Thompson, "A case study and critical assessment in calculating power usage effectiveness for a data centre," Energy Conversion and Management, vol. 76, no. 0, pp. 155–161, 2013.
[304] J. Yuventi and R. Mehdizadeh, "A critical analysis of power usage effectiveness and its use in communicating data center energy consumption," Energy and Buildings, vol. 64, no. 0, pp. 90–94, 2013.
[305] G. GRID, "Green grid data center power efficiency metrics: Pue and dcie," 2008.
[306] D. Chernicoff, "Data center energy efficiency," 2009.
[307] M. Patterson, S. Poole, C.-H. Hsu, D. Maxwell, W. Tschudi, H. Coles, D. Martinez, and N. Bates, "Tue, a new energy-efficiency metric applied at ornl's jaguar," in Supercomputing, ser. Lecture Notes in Computer Science, J. Kunkel, T. Ludwig, and H. Meuer, Eds. Springer Berlin Heidelberg, 2013, vol. 7905, pp. 372–382.
[308] Green IT Promotion Council, "Introduction of datacenter performance per energy," 2010.
[309] Green IT Promotion Council, "New data center energy efficiency evaluation index dppe (datacenter performance per energy) measurement guidelines (ver 2.05)," 2012.
[310] M. Obaidat, A. Anpalagan, and I. Woungang, Handbook of Green Information and Communication Systems. Elsevier Science, 2012.
[311] L. H. Sego, A. Márquez, A. Rawson, T. Cader, K. Fox, W. I. Gustafson, Jr., and C. J. Mundy, "Implementing the data center energy productivity metric," J. Emerg. Technol. Comput. Syst., vol. 8, no. 4, pp. 30:1–30:22, Nov. 2012.
[312] L. Wang and S. Khan, "Review of performance metrics for green data centers: a taxonomy study," The Journal of Supercomputing, vol. 63, no. 3, pp. 639–656, 2013.
[313] P. Mathew, S. Greenberg, D. Sartor, J. Bruschi, and L. Chu, "Self-benchmarking guide for data center infrastructure: Metrics, benchmarks, actions," 2010.
[314] T. Wilde, A. Auweter, M. Patterson, H. Shoukourian, H. Huber, A. Bode, D. Labrenz, and C. Cavazzoni, "Dwpe, a new data center energy-efficiency metric bridging the gap between infrastructure and workload," in High Performance Computing & Simulation (HPCS), 2014 International Conference on, July 2014, pp. 893–901.
[315] B. Aebischer and L. Hilty, "The energy demand of ict: A historical perspective and current methodological challenges," in ICT Innovations for Sustainability, ser. Advances in Intelligent Systems and Computing. Springer International Publishing, 2015, vol. 310, pp. 71–103.
[316] G. Le Louet and J.-M. Menaud, "Optiplace: Designing cloud management with flexible power models through constraint programing," in Utility and Cloud Computing (UCC), 2013 IEEE/ACM 6th International Conference on, Dec 2013, pp. 211–218.
[317] R. Raghavendra, P. Ranganathan, V. Talwar, Z. Wang, and X. Zhu, "No "power" struggles: Coordinated multi-level power management for the data center," SIGARCH Comput. Archit. News, vol. 36, no. 1, pp. 48–59, Mar. 2008.
[318] M. Islam, S. Ren, and G. Quan, "Online energy budgeting for virtualized data centers," in Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2013 IEEE 21st International Symposium on, Aug 2013, pp. 424–433.
[319] Z. Liu, M. Lin, A. Wierman, S. H. Low, and L. L. Andrew, "Greening geographical load balancing," in Proceedings of the ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, ser. SIGMETRICS '11. New York, NY, USA: ACM, 2011, pp. 233–244.
[320] M. Lin, Z. Liu, A. Wierman, and L. Andrew, "Online algorithms for geographical load balancing," in Green Computing Conference (IGCC), 2012 International, June 2012, pp. 1–10.
[321] E. Masanet, R. Brown, A. Shehabi, J. Koomey, and B. Nordman, "Estimating the energy use and efficiency potential of u.s. data centers," Proceedings of the IEEE, vol. 99, no. 8, pp. 1440–1453, Aug 2011.
[322] Y. Yao, L. Huang, A. Sharma, L. Golubchik, and M. Neely, "Power cost reduction in distributed data centers: A two-time-scale approach for delay tolerant workloads," Parallel and Distributed Systems, IEEE Transactions on, vol. 25, no. 1, pp. 200–211, Jan 2014.
[323] A. H. Mahmud and S. Ren, "Online capacity provisioning for carbon-neutral data center with demand-responsive electricity prices," SIGMETRICS Perform. Eval. Rev., vol. 41, no. 2, pp. 26–37, Aug. 2013.
[324] Z. Zhou, F. Liu, Y. Xu, R. Zou, H. Xu, J. Lui, and H. Jin, "Carbon-aware load balancing for geo-distributed cloud services," in Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2013 IEEE 21st International Symposium on, Aug 2013, pp. 232–241.
[325] Z. Liu, I. Liu, S. Low, and A. Wierman, "Pricing data center demand response," in The 2014 ACM International Conference on
Measurement and Modeling of Computer Systems, ser. SIGMETRICS '14. New York, NY, USA: ACM, 2014, pp. 111–123.
[326] J. Davis, S. Rivoire, M. Goldszmidt, and E. Ardestani, "Chaos: Composable highly accurate os-based power models," in Workload Characterization (IISWC), 2012 IEEE International Symposium on, Nov 2012, pp. 153–163.
[327] J. Davis, S. Rivoire, and M. Goldszmidt, "Star-cap: Cluster power management using software-only models," Tech. Rep. MSR-TR-2012-107.
[328] O. Laadan and J. Nieh, "Transparent checkpoint-restart of multiple processes on commodity operating systems," in Proceedings of the 2007 USENIX Annual Technical Conference, ser. ATC'07. Berkeley, CA, USA: USENIX Association, 2007, pp. 25:1–25:14.
[329] J. Stoess, C. Lang, and F. Bellosa, "Energy management for hypervisor-based virtual machines," in Proceedings of the 2007 USENIX Annual Technical Conference, ser. ATC'07. Berkeley, CA, USA: USENIX Association, 2007, pp. 1:1–1:14.
[330] N. Kim, J. Cho, and E. Seo, "Energy-based accounting and scheduling of virtual machines in a cloud system," in Green Computing and Communications (GreenCom), 2011 IEEE/ACM International Conference on, Aug 2011, pp. 176–181.
[331] F. Quesnel, H. Mehta, and J.-M. Menaud, "Estimating the power consumption of an idle virtual machine," in Green Computing and Communications (GreenCom), 2013 IEEE and Internet of Things (iThings/CPSCom), IEEE International Conference on and IEEE Cyber, Physical and Social Computing, Aug 2013, pp. 268–275.
[332] F. Chen, J. Grundy, J.-G. Schneider, Y. Yang, and Q. He, "Automated analysis of performance and energy consumption for cloud applications," in Proceedings of the 5th ACM/SPEC International Conference on Performance Engineering, ser. ICPE '14. New York, NY, USA: ACM, 2014, pp. 39–50.
[333] D. Meisner, C. M. Sadler, L. A. Barroso, W.-D. Weber, and T. F. Wenisch, "Power management of online data-intensive services," in Proceedings of the 38th Annual International Symposium on Computer Architecture, ser. ISCA '11. New York, NY, USA: ACM, 2011, pp. 319–330.
[334] M. Poess and R. Othayoth Nambiar, "A power consumption analysis of decision support systems," in Proceedings of the First Joint WOSP/SIPEW International Conference on Performance Engineering, ser. WOSP/SIPEW '10. New York, NY, USA: ACM, 2010, pp. 147–152.
[335] M. Horiuchi and K. Taura, "Acceleration of data-intensive workflow applications by using file access history," in High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion, Nov 2012, pp. 157–165.
[336] M. Gamell, I. Rodero, M. Parashar, and S. Poole, "Exploring energy and performance behaviors of data-intensive scientific workflows on systems with deep memory hierarchies," in High Performance Computing (HiPC), 2013 20th International Conference on, Dec 2013, pp. 226–235.
[337] L. Mashayekhy, M. Nejad, D. Grosu, D. Lu, and W. Shi, "Energy-aware scheduling of mapreduce jobs," in Big Data (BigData
of the Asia-Pacific Workshop on Systems, ser. APSYS '12. New York, NY, USA: ACM, 2012, pp. 3:1–3:6.
[344] N. Zhu, X. Liu, J. Liu, and Y. Hua, "Towards a cost-efficient mapreduce: Mitigating power peaks for hadoop clusters," Tsinghua Science and Technology, vol. 19, no. 1, pp. 24–32, Feb 2014.
[345] B. Feng, J. Lu, Y. Zhou, and N. Yang, "Energy efficiency for mapreduce workloads: An in-depth study," in Proceedings of the Twenty-Third Australasian Database Conference - Volume 124, ser. ADC '12. Darlinghurst, Australia: Australian Computer Society, Inc., 2012, pp. 61–70.
[346] W. Lang and J. M. Patel, "Energy management for mapreduce clusters," Proc. VLDB Endow., vol. 3, no. 1-2, pp. 129–139, Sep. 2010.
[347] A. Di Stefano, G. Morana, and D. Zito, "Improving the allocation of communication-intensive applications in clouds using time-related information," in Parallel and Distributed Computing (ISPDC), 2012 11th International Symposium on, June 2012, pp. 71–78.
[348] P. Pacheco, An Introduction to Parallel Programming, 1st ed. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2011.
[349] M. E. M. Diouri, O. Glück, J.-C. Mignot, and L. Lefèvre, "Energy estimation for mpi broadcasting algorithms in large scale hpc systems," in Proceedings of the 20th European MPI Users' Group Meeting, ser. EuroMPI '13. New York, NY, USA: ACM, 2013, pp. 111–116.
[350] M. Gamell, I. Rodero, M. Parashar, J. C. Bennett, H. Kolla, J. Chen, P.-T. Bremer, A. G. Landge, A. Gyulassy, P. McCormick, S. Pakin, V. Pascucci, and S. Klasky, "Exploring power behaviors and trade-offs of in-situ data analytics," in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, ser. SC '13. New York, NY, USA: ACM, 2013, pp. 77:1–77:12.
[351] C. Bunse and S. Stiemer, "On the energy consumption of design patterns," Softwaretechnik-Trends, vol. 33, no. 2, 2013.
[352] J. Arjona Aroca, A. Chatzipapas, A. Fernández Anta, and V. Mancuso, "A measurement-based analysis of the energy consumption of data center servers," in Proceedings of the 5th International Conference on Future Energy Systems, ser. e-Energy '14. New York, NY, USA: ACM, 2014, pp. 63–74.
[353] S. Wang, Y. Li, W. Shi, L. Fan, and A. Agrawal, "Safari: Function-level power analysis using automatic instrumentation," in Energy Aware Computing, 2012 International Conference on, Dec 2012, pp. 1–6.
[354] CoolEmAll Project, "Coolemall," URL: http://www.coolemall.eu, 2014.
[355] L. Cupertino, G. D. Costa, A. Oleksiak, W. Piatek, J.-M. Pierson, J. Salom, L. Sisó, P. Stolf, H. Sun, and T. Zilio, "Energy-efficient, thermal-aware modeling and simulation of data centers: The coolemall approach and evaluation results," Ad Hoc Networks, vol. 25, Part B, no. 0, pp. 535–553, 2015 (New Research Challenges in Mobile, Opportunistic and Delay-Tolerant Networks; Energy-Aware Data Centers: Architecture, Infrastructure, and Communication).
[356] C. Seo, G. Edwards, D. Popescu, S. Malek, and N. Medvidovic, "A framework for estimating the energy consumption induced by a distributed system's architectural style," in Proceedings of the
Congress), 2014 IEEE International Congress on, June 2014, pp. 8th International Workshop on Specification and Verification of
32–39. Component-based Systems, ser. SAVCBS ’09. New York, NY,
[338] L. Mashayekhy, M. Nejad, D. Grosu, Q. Zhang, and W. Shi, USA: ACM, 2009, pp. 27–34.
“Energy-aware scheduling of mapreduce jobs for big data appli- [357] C. Seo, S. Malek, and N. Medvidovic, “An energy consumption
cations,” Parallel and Distributed Systems, IEEE Transactions on, framework for distributed java-based systems,” in Proceedings of
vol. PP, no. 99, pp. 1–1, 2014. the Twenty-second IEEE/ACM International Conference on Auto-
[339] J. Dean and S. Ghemawat, “Mapreduce: Simplified data processing mated Software Engineering, ser. ASE ’07. New York, NY, USA:
on large clusters,” Commun. ACM, vol. 51, no. 1, pp. 107–113, Jan. ACM, 2007, pp. 421–424.
2008. [358] B. Cumming, G. Fourestey, O. Fuhrer, T. Gysi, M. Fatica, and
[340] T. White, Hadoop: The definitive guide. ” O’Reilly Media, Inc.”, T. C. Schulthess, “Application centric energy-efficiency study of
2012. distributed multi-core and hybrid cpu-gpu systems,” in Proceedings
[341] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, “Dryad: of the International Conference for High Performance Computing,
Distributed data-parallel programs from sequential building blocks,” Networking, Storage and Analysis, ser. SC ’14. Piscataway, NJ,
in Proceedings of the 2Nd ACM SIGOPS/EuroSys European Con- USA: IEEE Press, 2014, pp. 819–829.
ference on Computer Systems 2007, ser. EuroSys ’07. New York, [359] R. Koller, A. Verma, and A. Neogi, “Wattapp: An application aware
NY, USA: ACM, 2007, pp. 59–72. power meter for shared data centers,” in Proceedings of the 7th
[342] N. Zhu, L. Rao, X. Liu, J. Liu, and H. Guan, “Taming power peaks International Conference on Autonomic Computing, ser. ICAC ’10.
in mapreduce clusters,” in Proceedings of the ACM SIGCOMM New York, NY, USA: ACM, 2010, pp. 31–40.
2011 Conference, ser. SIGCOMM ’11. New York, NY, USA: [360] J. Demmel, A. Gearhart, B. Lipshitz, and O. Schwartz, “Perfect
ACM, 2011, pp. 416–417. strong scaling using no additional energy,” in Parallel Distributed
[343] N. Zhu, L. Rao, X. Liu, and J. Liu, “Handling more data with less Processing (IPDPS), 2013 IEEE 27th International Symposium on,
cost: Taming power peaks in mapreduce clusters,” in Proceedings May 2013, pp. 649–660.

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/COMST.2015.2481183, IEEE Communications Surveys & Tutorials
SUBMITTED TO IEEE COMMUNICATIONS SURVEYS & TUTORIALS, SEPTEMBER 2015 64


Miyuru Dayarathna is a research fellow at the School of Computer Engineering, Nanyang Technological University (NTU), Singapore. He received the B.Sc. (Hons) degree in Information Technology from the University of Moratuwa, Sri Lanka, in 2008, the Master's degree in Media Design from Keio University, Japan, in 2010, and the Ph.D. degree in Computer Science from the Tokyo Institute of Technology, Japan, in 2013. His research interests include stream computing, graph data management and mining, energy-efficient computer systems, cloud computing, high-performance computing, database systems, and performance engineering. He has published technical papers in various international journals and conferences.

Yonggang Wen (S'99-M'08-SM'14) is an assistant professor with the School of Computer Engineering at Nanyang Technological University, Singapore. He received his Ph.D. degree in Electrical Engineering and Computer Science (minor in Western Literature) from the Massachusetts Institute of Technology (MIT), Cambridge, USA. Previously, he worked at Cisco leading product development in content delivery networks, which had a revenue impact of 3 billion US dollars globally. Dr. Wen has published over 130 papers in top journals and prestigious conferences. His work on Multi-Screen Cloud Social TV has been featured by global media (more than 1600 news articles from over 29 countries) and received the ASEAN ICT Award 2013 (Gold Medal). His work on Cloud3DView for Data Centre Life-Cycle Management, as the only academia entry, made the top-four finalists of the Data Centre Dynamics Awards 2014 APAC. He is a co-recipient of Best Paper Awards at EAI Chinacom 2015, IEEE WCSP 2014, IEEE Globecom 2013, and IEEE EUC 2012, and a co-recipient of the 2015 IEEE Multimedia Best Paper Award. He serves on the editorial boards of IEEE Communications Surveys & Tutorials, IEEE Transactions on Multimedia, IEEE Transactions on Signal and Information Processing over Networks, IEEE Access, and Elsevier Ad Hoc Networks, and was elected Chair of the IEEE ComSoc Multimedia Communications Technical Committee (2014-2016). His research interests include cloud computing, green data centers, big data analytics, multimedia networks, and mobile computing.

Rui Fan is an assistant professor at Nanyang Technological University, Singapore. He received his Ph.D. from the Massachusetts Institute of Technology, where he also received MIT's Sprowls Award for best doctoral theses in computer science. His research interests include theoretical and applied algorithms for parallel and distributed computing, especially algorithms for multicores, GPUs, and cloud computing.
