Acos: An Autonomic Management Layer Enhancing Commodity Operating Systems
Abstract—Traditionally, operating systems have been in charge of serving as a convenient layer between applications and the bare-metal hardware, both by providing an abstraction of the hardware itself and by allocating available resources to the applications. Both of these roles are becoming ever more important due to the increasing complexity of modern computer architectures. The rise of chip multi-processors brought the consolidation of multiple applications and systems (i.e., through hardware-supported virtualization) onto a single piece of silicon. Heterogeneous Systems-on-Chip (SoCs) are becoming ubiquitous, while deep memory hierarchies (i.e., multiple levels of caches) have been around since the beginning of this millennium. Moreover, modern systems have to face the challenge of meeting a growing set of functional and non-functional requirements (e.g., performance, temperature, power consumption, etc.); a system-wide strategy is needed, and operating systems may become the right actors for this role.

This paper describes an approach to enhance commodity operating systems with an autonomic layer, introducing smart, automatic resource allocation through self-management capabilities, which in turn leverage the availability of user- and system-specified goals and constraints. The methodology for realizing such an Autonomic Operating System (AcOS) is illustrated through its building blocks: monitors, actuators, and adaptation policies. The applicability and usefulness of this approach are demonstrated by providing experimental evidence gathered from different case studies involving two different operating systems (i.e., GNU/Linux and FreeBSD) and dealing with diverse goals and constraints: performance and temperature requirements.

I. BACKGROUND AND INTRODUCTION

The turn of computer architectures from the well-established single-core structure to multiple (and, possibly, heterogeneous) processing elements is a years-long trend. This paradigm shift has been dictated by physical and architectural motivations [1]. Physical causes depend on the so-called power wall (i.e., the inability to increase the clock frequency without hurting power consumption), while architectural reasons relate to the well-known ILP wall (i.e., the diminishing returns of further micro-architectural optimizations). These difficulties made it infeasible to keep up with Joy's performance law and, to survive its commitment to performance improvements, the computer industry changed strategy, opening the multicore era.

In the single-core era, software performance improvements were provided by beefier single-core processors, and applications experienced the so-called "free lunch", with free-of-charge speedups obtainable by just upgrading to the latest processor. The new parallel course in computer architectures, despite being due to architectural reasons, carries the side effect of ending the free lunch, moving a good part of the burden of improving performance onto the software developers' shoulders. The demanding task of producing efficient and reliable parallel software adds to the already considerable bulk of expertise needed for successfully coping with the requirements in terms of computing performance, functionality, reliability, and constraint satisfaction imposed by today's IT. This situation leads to an increased need to push part of the system management effort into computing systems themselves, which can be achieved by leveraging autonomic computing techniques [2].

Autonomic computing carries a proposal of embedding self-* properties into computing systems with the objective of maintaining "good-enough" working conditions. The term autonomic strongly recalls the initial link between autonomic computing and the biological world; in fact, autonomic computing systems were initially supposed to mimic the behavior of the autonomic nervous system of human beings. In detail, computing systems should be able to monitor themselves and their environment, detect significant changes and decide how to react, and act to execute such decisions towards user-specified goals in terms of non-functional requirements (e.g., desired QoS, system health, etc.) [3].

Within this context, realizing support for autonomic management at the operating system level is crucial. The operating system is the central system layer, serving as the bridge between hardware components and applications by both offering abstractions for more convenient access to hardware devices and managing the allocation of system resources (e.g., processor or memory bandwidth). Hence, the operating system has both full access to the bare hardware and an overall view of the software being executed. The operating system is where maximum information regarding the status of the computing system as a whole and maximum freedom of action over its behavior are available. For these reasons, we claim that an appropriate infrastructure at the operating system level is fundamental for realizing an autonomic apparatus within a computing system, able to integrate the diverse self-management techniques proposed in the literature [4, 5, 6, 7, 8, 9, 10, 11]. This paper presents the proposed methodology and an evaluation through case studies of different implementations of this approach to the realization of an Autonomic Operating System (AcOS) layer.

In the remainder of this paper, Section II describes the proposed methodology, Section III presents case studies regarding
the implementation of this methodology over two well-known commodity operating systems and reports experimental results based on these case studies. Finally, Section IV gives an overview of related works and Section V concludes the paper.

II. AUTONOMIC OPERATING SYSTEM INFRASTRUCTURE

The creation of an infrastructure for autonomic computing in commodity operating systems opens the possibility of supporting self-management capabilities for any system supported by the OS of choice. For instance, enhancing GNU/Linux, and, more specifically, the Linux kernel, with an autonomic layer enables a wide variety of systems and devices (ranging from mobiles to supercomputers) to take advantage of the benefits carried by runtime self-adaptation. The Autonomic Operating System (AcOS) is aimed at this target, i.e., the definition of a unified infrastructure for autonomic computing and its implementation over commodity operating systems. The methodology at the base of AcOS leverages the Observe-Decide-Act (ODA) autonomic loop [3]; Figure 1 represents an overview of a computing system enhanced with the autonomic layer. The AcOS infrastructure defines a model for hosting an array of ODA autonomic loops within the autonomic layer. These self-management loops gather information from running applications, the system, and the hardware components (observation), analyze the collected data, comparing it against the desired system status and determining how to intervene (decision), and adjust available knobs (action).

Fig. 1. Overall view of the AcOS infrastructure model (applications on top of the operating system, with its subsystems such as the scheduler, file system, and device drivers, on top of the hardware components: CPU(s), memory, devices). The autonomic layer enhances the operating system implementing different autonomic loops, which evaluate the system status and act within operating system boundaries to meet user- and system-specified goals and constraints.

A. AcOS System Components

AcOS defines common interfaces and templates, standardizing these phases through the definition of three kinds of autonomic components:
• Actuators are wrappers around a parameter of the system and serve as the actual knobs which can be adjusted by the autonomic layer.
• Adaptation policies represent the core autonomic component, as they actually take decisions on how to adapt the system behavior. Adaptation policies implement a decision mechanism to decide how to act on the knobs provided by actuators.
• Monitors provide access to information about the current status of the system or of its environment and to user-defined goals. This information is accessed by adaptation policies to apply their decision mechanism.

The AcOS way of realizing the three phases of the ODA autonomic loop through autonomic components, along with more details about the overall approach, is further illustrated in the remainder of this section.

B. Acting on the System

Autonomic action on the computing system corresponds to the modification of one or more parameters which affect its runtime behavior. An actuator is a wrapper for one or more parameters, exporting a well-defined action in the form of an Application Programming Interface (API) which can be used by adaptation policies. For instance, an actuator may export the action of assigning a certain number of cores to the tasks of a specific application. The implementation of the calls exported through an actuator's API must take care of performing all the operations required to reliably carry out the action. Since the AcOS model places the autonomic layer within the operating system, actuators can affect all the system parameters managed by the operating system through its subsystems (e.g., processor scheduler, device drivers, etc.).

Note that not all the possible actions affect a system parameter: a different class of actuators can be defined at the level of a single application. For instance, an application may be able to vary at runtime the number of its working threads. This kind of actuator, however, still requires support at the operating system level to provide a standardized API to be used by adaptation policies with a broader view on the system than a tight adaptation loop within the application itself. For instance, to support the runtime adaptation of the number of threads in an application, the autonomic layer would define an actuator API for adaptation policies and also another interface through which applications capable of this kind of self-adaptation can register and create a communication channel (e.g., based on shared memory) through which they are informed of calls to the actuator by adaptation policies. In brief, an application-level actuator is just the same as a system-level one, but its backend is not fully implemented within the operating system; rather, it triggers an action in registered applications. Thanks to this mechanism, AcOS is able to make use of both system- and application-level actuators, enabling more powerful adaptation; for instance, when changing the number of cores assigned to an application, an adaptation policy could also change the number of its working threads to match the number of cores, obtaining the best possible scalability.

C. Making Decisions

In the AcOS model, decisions are made by adaptation policies, which are defined as a distributed decision-making infrastructure. Each adaptation policy chooses which monitors and actuators to use. To do so, adaptation policies embed a decision mechanism able to evaluate information about the system status and user-specified goals, which are made available by monitors (see Section II-D). The decision
mechanism within an adaptation policy can vary from a simple heuristic based on empirical observation to more complex control techniques based on control theory or machine learning. Adaptation policies can work at either the application-specific (e.g., performance) or the system-wide (e.g., system temperature) level, or even at both levels. For instance, an adaptation policy seeking the goal of keeping the temperature of the processor below a certain threshold by randomly injecting idle cycles in the CPU(s) would work at the system level; but if the idle-cycle injection is selectively performed taking into consideration the performance of the running applications, it would operate at the application level, while using system-level information and goals. Clearly, in each case an appropriate actuator to allow the action is needed.

This definition of adaptation policy leads to interaction problems in terms of control stability, for instance, when two adaptation policies want to use the same actuator in opposite directions. Currently, this problem is solved in AcOS by providing the user with the possibility of enabling or disabling each policy. One of the ongoing works is a study of the application of distributed decision theory to this context, to automatize the activation and deactivation of conflicting adaptation policies with a system-wide policies coordinator. Even if this is currently an open problem, the AcOS infrastructure has the major advantage of keeping all the components at the operating system level, thus simplifying the connection of the policies coordinator with the autonomic layer.

D. Gathering Information

The autonomic layer needs constantly updated information regarding the system status in order to be able to take informed decisions through the adaptation policies. In the AcOS model, this information is made available by monitors. Just like actuators, monitors can simply be wrappers around an information source already available to the operating system (e.g., the temperature of each core in the processor). In this case, a monitor is said to be passive [12], since it simply wraps already available information. On the other hand, active [12] monitors process the information in some way to synthesize a metric characterizing a certain runtime property (e.g., throughput). Besides making runtime information available through APIs, monitors also manage the specification of goals by the user and expose this additional data to adaptation policies. Hence, a monitor is characterized by the metric of the measurements it takes, and it must expose at least two APIs: one towards adaptation policies, allowing them to get the data on both the current status and the associated goal, and one towards users, allowing them to define goals on the monitor's measurement.

E. Autonomic Components Interoperation

In order to have a working autonomic layer, the three classes of autonomic components must be able to reliably and efficiently communicate. The AcOS model has been devised as an enhancement layer for contemporary commodity operating systems, which, in most cases, feature a monolithic kernel. In particular, our current design is strongly biased towards UNIX-like operating systems, with a strong separation between system code, which usually runs in kernel mode, and application code, which runs in user mode [13]¹. One of the challenges in the design of the autonomic infrastructure was to allow efficient communication between user and kernel-space, since diverse, interacting components can be placed in different spaces, or a single component may need to be partitioned. To avoid the use of system calls, which are time-consuming and possibly harmful [14, 15], on hot code-paths, the AcOS infrastructure leverages shared memory, mapping portions of physical memory into both user and kernel-space, allowing for fast, low-latency communication without unnecessary overheads [10]. This is done considering both security (i.e., managing permissions on the shared memory) and efficiency (i.e., laying out data in a cache-friendly way to avoid false sharing [16] and other subtle issues). Thanks to this approach, the most common operations are managed in the fastest possible way, while system calls (or alternative interfaces, such as Linux's sysfs and procfs) are employed to handle uncommon operations [10]. According to this model, an autonomic component can be implemented across user and kernel-space, making use of specially mapped shared memory to inexpensively pass information across address spaces. Moreover, we foresee similar optimizations between the different agents involved in the global system optimization.

III. CASES FOR AUTONOMIC COMPONENTS

We employed the methodology described in Section II to extend two well-known commodity operating systems (i.e., Linux and FreeBSD) with an autonomic layer. This section reports case studies involving the creation of autonomic components integrated in either Linux or FreeBSD and aimed at tackling interesting runtime management problems in the context of server systems. In more detail, one of the interesting problems in server systems is the management of the quality of service (QoS) yielded by the running applications. A typical class of workloads in this scenario is made of throughput-based applications processing a stream of data (e.g., a video encoder for online streaming). To enable autonomic QoS management within this context, we designed and implemented an active monitor for throughput, called Heart Rate Monitor (HRM), and different adaptation policies and actuators exporting actions over the task scheduler. Another compelling problem is that of thermal management, which can be a significant issue in server farms and data centers. To tackle this problem, we implemented a passive temperature monitor and an adaptation policy exploiting the available actuators. The remainder of this section gives a brief overview of these autonomic components and presents an experimental characterization of what it is possible to realize by applying the proposed methodology. For the experimental evaluation, we employed workstations equipped with current quad-core Intel processors (Xeon and Core i7) and applications from the PARSEC 2.1 [17] parallel benchmark suite.

¹ A different design that exploits a micro-kernel, message-passing based, distributed operating system is foreseeable and maybe even more suitable for the distributed, agent-like environment we describe.
trends have been artificially created by running a variable number of instances of another CPU-bound application (i.e., the cpuburn stress test), to simulate collateral system load. Figure 3 shows a plot of the microbenchmark's throughput, representing the global heart rate and six additional window heart rates over moving averages of sizes in the set {1, 5, 10, 15, 30, 60} seconds². The execution presents six different phases: initially, up to the point marked (1), there is a light additional load which makes the heart rates over shorter time windows quite noisy. Then, this load terminates and the application reaches its peak performance up to point (2), when another external load is started. It is apparent from the figure how this disturbance is clearly visible looking at the heart rate measured on the short term, while it could go almost unnoticed looking at heart rates taken over longer periods. At point (3) the second external load terminates and the microbenchmark goes back to its peak throughput but, at point (4), a heavier and longer-lasting load is applied. Again, it can be noticed how heart rates measured on shorter time windows give prompt feedback when a change in the performance happens, but tend to become noisy when the execution becomes more regular. Finally, at point (5), the final load terminates and, after some more time, the experiment is concluded.

Fig. 3. Global and six different window heart rates of an ad-hoc application showing different performance trends (throughput [heartbeats/s] over time [s], with curves for the global heart rate and for windows of 60, 30, 15, 10, 5, and 1 s; the load events (1)-(5) are annotated on the plot).

Using the right moving average can help identify execution phases of a workload; for instance, Figure 4 shows a plot of an instance of the x264 video encoder application from the PARSEC 2.1 suite working on the reference input and instrumented to emit one heartbeat per encoded frame. The figure shows both the global and window heart rate, which helps highlight a much lighter phase in the execution, due to the characteristics of the input. These two examples give evidence of how proper instrumentation with HRM can yield accurate runtime information, helping characterize applications' performance. Since these data are available at runtime, in the AcOS layer, in both kernel and user-space, adaptation policies can use them to pursue user-specified goals. This is what is presented in the remainder of this section.

Fig. 4. Execution phases of the x264 video encoder working on the native input of the PARSEC 2.1 suite (throughput [frames/s] over time [s], showing the global and window heart rates).

² Note that, when there are not enough data to compute a window heart rate over its full size, the measure is still provided using the available data.

B. Performance-Aware Scheduling

The information provided by HRM has been employed by different adaptation policies affecting the task scheduler in different ways. These case studies are based on Linux 3.3, and appropriate actuators were implemented to allow the autonomic layer access to two different scheduling parameters: task priority (obtained by scaling the tasks' virtual runtime [10]) and CPU affinity. The first actuator is used by an adaptation policy built for the Metronome framework [10] and called Performance-Aware Fair Scheduler (PAFS), while the second actuator is employed by another adaptation policy named Performance-Aware Processor Allocator ((PA)²).

1) Performance-Aware Fair Scheduling: The rationale behind PAFS is adapting the priority of the tasks belonging to HRM-instrumented applications according to whether their user-specified throughput goal is being attained or not. If an application is running too slowly, i.e., under its minimum desired heart rate, its priority is increased, and conversely if it is running over its maximum desired heart rate. This simple scheme has been implemented in an adaptation policy based on a heuristic decision mechanism, which affects the priority of the tasks belonging to instrumented applications (i.e., comprised in an HRM group) by scaling their virtual runtime, which is the metric used by Linux's Completely Fair Scheduler (CFS) to choose, at each context switch, which task to execute next. To evaluate this adaptation policy, two 4-threaded instances of the x264 video encoder have been run, both instrumented with HRM and attached to different groups, on a quad-core workstation equipped with an AcOS-enhanced Linux 3.3. This time, the workload consists of encoding a copy of the full-length Big Buck Bunny full HD video [19]. Figure 5a shows the two encoders managed by the Linux Completely Fair Scheduler (CFS), which is perfectly fair in assigning the bandwidth of the cores to the two instances of the same application, resulting in equal performance. The CFS features a sophisticated mechanism to assign different CPU bandwidths to different objects (e.g., applications); however, this mechanism is difficult to use when systems and administrators are faced with high-level performance goals (e.g., frames/s). High-level performance goals make HRM shine, since it provides general performance measures understandable by users, administrators, and systems, and effective adaptation policies can be designed and developed to exploit these data. In Figure 5b, two instances of x264 are executed with different performance goals (i.e., the red and green areas, which are respectively 30 to 60 frames/s and 70 to 100 frames/s) and are driven by the adaptive scheduler towards their performance goal, successfully exploiting the information HRM provides to adjust their virtual runtimes (which boils down to assign-

Fig. 5. Window heart rate and its LOESS interpolation for each instance of x264. (a) Unmanaged instances of x264. (b) Managed instances of x264; the performance goals, in frames/s, are [30,60] and [70,100] for the two instances.
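The PAFS heuristic described above lends itself to a compact sketch. The Python function below is a hypothetical illustration, not the Metronome/PAFS source: it nudges a task group's virtual-runtime scaling factor down when the measured heart rate is under the minimum goal (which, under a CFS-like scheduler, effectively raises the group's priority) and up when the rate exceeds the maximum, leaving it untouched inside the goal range. The step size and clamping bounds are invented for the example.

```python
def pafs_step(heart_rate, goal_min, goal_max, vruntime_scale,
              step=0.05, lo=0.5, hi=2.0):
    """One decision step of a PAFS-like heuristic (illustrative only).

    Under a CFS-like scheduler, a task whose virtual runtime grows
    more slowly is picked more often, i.e., scaling virtual runtime
    down effectively raises the task's priority.
    """
    if heart_rate < goal_min:
        # Too slow: boost priority by slowing virtual-runtime growth.
        return max(lo, vruntime_scale - step)
    if heart_rate > goal_max:
        # Too fast: give CPU bandwidth back to the other tasks.
        return min(hi, vruntime_scale + step)
    # Within the desired heart-rate range: leave the knob alone.
    return vruntime_scale
```

An adaptation policy would invoke such a step periodically for each HRM group, feeding it the group's window heart rate and goal range, and hand the resulting factor to the virtual-runtime actuator.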
problem at the very end of the memory hierarchy, altering the behavior of memory controllers [31, 32, 33, 34]. Other works addressed the problem through dynamic assignment of processors [35, 4, 36] and CPU bandwidth [10]. Researchers have also tackled the problem with more comprehensive frameworks capable of managing more than one resource. Bitirgen et al. [37] exploited machine learning to distribute shared resources on a CMP. Srikantaiah et al. [27] devised a strategy to partition both processors and caches. Hoffmann et al. [5] proposed SElf-awarE Computing (SEEC), harnessing both control theory and machine learning to meet user-defined performance goals by allocating cores and scaling frequencies. Sharifi et al. [9] presented METE, a framework for meeting QoS through control theory.

V. CONCLUSIONS AND FUTURE WORKS

This paper presents AcOS: a methodology for enhancing commodity operating systems with an autonomic management layer. This approach is based on three basic blocks, collectively called autonomic components, which ensure the availability of information and goals, manage adaptation decisions, and allow modifying system parameters through appropriate knobs. This methodology has been applied to the enhancement of two widespread open-source kernels (i.e., Linux and FreeBSD) with autonomic management of performance and temperature requirements. The case studies show that applications of the proposed methodology are able to achieve the different goals.

This work on AcOS opens the way to many developments; in particular, we are working towards the implementation of parts of the model which have not been extensively experimentally evaluated yet (e.g., the use of application-level actuators in concert with system-level ones). Another open issue is investigating the possibility of applying distributed decision theory to automatize the activation and deactivation of possibly conflicting adaptation policies with a system-wide coordinator.

REFERENCES

[1] Samuel H. Fuller and Lynette I. Millett. Computing Performance: Game Over or Next Level? Computer, 44(1), 2011.
[2] Jeffrey O. Kephart and David M. Chess. The Vision of Autonomic Computing. Computer, 36(1):41–50, 2003.
[3] Mazeiar Salehie and Ladan Tahvildari. Self-Adaptive Software: Landscape and Research Challenges. ACM Trans. Auton. Adapt. Syst., 4(2), 2009.
[4] Henry Hoffmann, Jonathan Eastep, Marco D. Santambrogio, Jason E. Miller, and Anant Agarwal. Application Heartbeats: A Generic Interface for Specifying Program Performance and Goals in Autonomous Computing Environments. In Proceedings of the 7th International Conference on Autonomic Computing, pages 79–88, 2010.
[5] Henry Hoffmann, Martina Maggio, Marco D. Santambrogio, Alberto Leva, and Anant Agarwal. SEEC: A Framework for Self-aware Management of Multicore Resources. Technical Report MIT-CSAIL-TR-2011-016, Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, 2011.
[6] Jeffrey O. Kephart and Rajarshi Das. Achieving Self-Management via Utility Functions. IEEE Internet Computing, 11(1):40–48, 2007.
[7] Gerald Tesauro. Reinforcement Learning in Autonomic Computing: A Manifesto and Case Studies. IEEE Internet Computing, 11(1):22–30, 2007.
[8] Robert W. Wisniewski, Dilma Da Silva, Marc A. Auslander, Orran Krieger, Michal Ostrowski, and Bryan S. Rosenburg. K42: Lessons for the OS Community. SIGOPS Oper. Syst. Rev., 42(1), 2008.
[9] Akbar Sharifi, Shekhar Srikantaiah, Asit K. Mishra, Mahmut Kandemir, and Chita R. Das. METE: Meeting End-to-End QoS in Multicores through System-Wide Resource Management. In Proceedings of the ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, 2011.
[10] Filippo Sironi, Davide B. Bartolini, Simone Campanoni, Fabio Cancare, Henry Hoffmann, Donatella Sciuto, and Marco D. Santambrogio. Metronome: Operating System Level Performance Management via Self-Adaptive Computing. In Proceedings of the 49th Design Automation Conference, 2012.
[11] Juan A. Colmenares, Sarah Bird, Henry Cook, Paul Pearce, David Zhu, John Shalf, Steven Hofmeyr, Krste Asanovic, and John Kubiatowicz. Resource Management in the Tessellation Manycore OS. In Proceedings of the 2nd Workshop on Hot Topics in Parallelism, 2010.
[12] Markus C. Huebscher and Julie A. McCann. A Survey of Autonomic Computing: Degrees, Models, and Applications. ACM Comput. Surv., 40(3), 2008.
[13] Andrew S. Tanenbaum. Modern Operating Systems. Prentice Hall PTR, Upper Saddle River, NJ, USA, 2nd edition, 2001.
[14] Livio Soares and Michael Stumm. FlexSC: Flexible System Call Scheduling with Exception-Less System Calls. In Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, 2010.
[15] Livio Soares and Michael Stumm. Exception-Less System Calls for Event-Driven Servers. In Proceedings of the 2011 USENIX Annual Technical Conference, 2011.
[16] Eddy Z. Zhang, Yunlian Jiang, and Xipeng Shen. Does Cache Sharing on Modern CMP Matter to the Performance of Contemporary Multithreaded Programs? In Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010.
[17] Princeton University. PARSEC Benchmark Suite Website, November 2011.
[18] Calin Cascaval, Evelyn Duesterwald, Peter F. Sweeney, and Robert W. Wisniewski. Performance and Environment Monitoring for Continuous Program Optimization. IBM Journal of Research and Development, 50(2.3), 2006.
[19] Big Buck Bunny. http://www.bigbuckbunny.org/.
[20] Rose Liu, Kevin Klues, Sarah Bird, Steven Hofmeyr, Krste Asanović, and John Kubiatowicz. Tessellation: Space-Time Partitioning in a Manycore Client OS. In Proceedings of the 1st Workshop on Hot Topics in Parallelism, 2009.
[21] Henry Hoffmann, Jonathan Eastep, Marco D. Santambrogio, Jason E. Miller, and Anant Agarwal. Application Heartbeats for Software Performance and Health. In Proceedings of the 15th Symposium on Principles and Practice of Parallel Programming, pages 347–348, 2010.
[22] Gerald Tesauro, David M. Chess, William E. Walsh, Rajarshi Das, Alla Segal, Ian Whalley, Jeffrey O. Kephart, and Steve R. White. A Multi-Agent Systems Approach to Autonomic Computing. In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, pages 464–471, 2004.
[23] G. Tesauro, R. Das, W. E. Walsh, and J. O. Kephart. Utility-Function-Driven Resource Allocation in Autonomic Systems. In Proceedings of the Second International Conference on Autonomic Computing, pages 342–343, 2005.
[24] G. Tesauro, N. K. Jong, R. Das, and M. N. Bennani. A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation. In Proceedings of the Third International Conference on Autonomic Computing, pages 65–73, 2006.
[25] Jin Heo and Tarek Abdelzaher. AdaptGuard: Guarding Adaptive Systems from Instability. In Proceedings of the 6th International Conference on Autonomic Computing, pages 77–86, 2009.
[26] Jichuan Chang and Gurindar S. Sohi. Cooperative Cache Partitioning for Chip Multiprocessors. In Proceedings of the 21st Annual International Conference on Supercomputing, 2007.
[27] Shekhar Srikantaiah, Reetuparna Das, Asit K. Mishra, Chita R. Das, and Mahmut Kandemir. A Case for Integrated Processor-Cache Partitioning in Chip Multiprocessors. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2009.
[28] S. Srikantaiah, E. Kultursay, Tao Zhang, M. Kandemir, M. J. Irwin, and Yuan Xie. MorphCache: A Reconfigurable Adaptive Multi-Level Cache Hierarchy. In Proceedings of the 17th IEEE International Symposium on High Performance Computer Architecture, 2011.
[29] Mahmut Kandemir, Taylan Yemliha, and Emre Kultursay. A Helper Thread Based Dynamic Cache Partitioning Scheme for Multithreaded Applications. In Proceedings of the 48th Design Automation Conference, 2011.
[30] Akbar Sharifi, Shekhar Srikantaiah, Mahmut Kandemir, and Mary Jane Irwin. Courteous Cache Sharing: Being Nice to Others in Capacity Management. In Proceedings of the 49th Annual Design Automation Conference, 2012 (to appear).
[31] Nauman Rafique, Won-Taek Lim, and Mithuna Thottethodi. Effective Management of DRAM Bandwidth in Multicore Processors. In Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques, 2007.
[32] Onur Mutlu and Thomas Moscibroda. Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors. In Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, 2007.
[33] Engin Ipek, Onur Mutlu, José F. Martínez, and Rich Caruana. Self-Optimizing Memory Controllers: A Reinforcement Learning Approach. In Proceedings of the 35th Annual International Symposium on Computer Architecture, 2008.
[34] Fang Liu and Yan Solihin. Studying the Impact of Hardware Prefetching and Bandwidth Partitioning in Chip Multiprocessors. In Proceedings of the ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems, 2011.
[35] Julita Corbalán, Xavier Martorell, and Jesús Labarta. Performance-Driven Processor Allocation. In Proceedings of the 4th Symposium on Operating System Design and Implementation, 2000.
[36] M. Maggio, H. Hoffmann, M. D. Santambrogio, A. Agarwal, and A. Leva. Controlling Software Applications via Resource Allocation within the Heartbeats Framework. In Proceedings of the 49th Conference on Decision and Control, pages 3736–3741, 2010.
[37] Ramazan Bitirgen, Engin Ipek, and Jose F. Martinez. Coordinated Management of Multiple Interacting Resources in Chip Multiprocessors: A Machine Learning Approach. In Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture, 2008.