DATA CENTER
ESSENTIALS
Introduction to the components that support a
data center and how they work together
Part G2: General >
Data Center Reliability
Data Center Reliability: Agenda
Data Center Goals
• Reliability
• Availability
• Redundancy
Tier levels
• Uptime Institute
• TIA
• Tier III
• Tier IV
Terms to Understand
Term          Definition
Redundancy    duplication of equipment or systems
Reliability   likelihood of meeting functions over time
Availability  amount of time system is functioning
Uptime        time that equipment & systems are functioning
Downtime      time that equipment & systems are not functioning
Reliability, Availability, Redundancy
Data Center Overview
Supporting systems operate together in a complex manner:
• Electrical
• Generators
• Cooling Towers
And many more:
• Plumbing
• Fuel Oil
• Controls
• Fire Protection
• Security
• Maintenance
• Etc.
Modern Data Centers: Three Key Definitions
Reliability
• The ability of a system or component to perform its required
functions under stated conditions for a specified period of
time
• May be given in number of 9’s of operation per year (five
9’s = 99.999%)
Reliability = business continuity
Reliability is needed from the bottom to the
top to meet the goals of the end users
Modern Data Centers: Three Key Definitions
Availability
• The proportion of time a system is in a
functioning condition
• Often given in number of 9’s of operation per year
(five 9’s = 99.999%)
A loss of availability doesn’t end when the power
comes back on. Servers also need time to recover.
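As a rough illustration of the point above, availability can be estimated as uptime divided by total time, with server recovery counted as downtime. The function name, the 365-day year, and the outage figures are assumptions made for this sketch; they do not come from the presentation.

```python
def availability(period_s: float, outages: list[tuple[float, float]]) -> float:
    """Return the fraction of period_s during which the system was functioning.

    Each outage is (power_loss_s, recovery_s); both parts count as downtime.
    """
    downtime_s = sum(loss + recovery for loss, recovery in outages)
    if downtime_s > period_s:
        raise ValueError("downtime exceeds the period")
    return (period_s - downtime_s) / period_s


YEAR_S = 365 * 24 * 3600  # assumed 365-day year

# Hypothetical year: one 60 s power loss, with and without 15 min of recovery.
power_only = availability(YEAR_S, [(60, 0)])
with_recovery = availability(YEAR_S, [(60, 15 * 60)])
print(f"power only: {power_only:.6%}  with recovery: {with_recovery:.6%}")
```

Counting the recovery window lowers the result, which is why the outage clock should run until the servers are back in service rather than until power returns.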
Availability & Reliability
Adding a 9 makes a difference
Availability  Number of 9s  Downtime per day  Downtime per month  Downtime per year
99.9999%      6             00:00:00.1        00:00:02.6          00:00:31.5
99.999%       5             00:00:00.9        00:00:26            00:05:15
99.99%        4             00:00:08          00:04:22            00:52:35
99.9%         3             00:01:26          00:43:49            08:45:56
99%           2             00:14:24          07:18:17            87:39:29
Going from 5 to 6 nines reduces theoretical downtime by 4 minutes, 43.5 seconds per year.
This may not seem like much; however, for financial and mission-critical support, it can be far too long.
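The downtime figures follow directly from the availability percentage: allowed downtime is (1 - availability) multiplied by the length of the period. The sketch below shows the arithmetic; the function name and the 365.25-day year are assumptions, so results may differ from the table by a second or so depending on rounding and the year length used.

```python
from datetime import timedelta

SECONDS_PER_DAY = 24 * 3600
DAYS_PER_YEAR = 365.25  # assumption; the slide does not state its year length


def downtime_seconds(availability_pct: float, period_days: float) -> float:
    """Allowed downtime in seconds over period_days at the given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (1 - availability_pct / 100) * period_days * SECONDS_PER_DAY


for pct in (99.0, 99.9, 99.99, 99.999, 99.9999):
    per_day = downtime_seconds(pct, 1)
    per_year = downtime_seconds(pct, DAYS_PER_YEAR)
    # timedelta is used only to format whole seconds as h:mm:ss
    print(f"{pct}%  per day: {per_day:.1f} s  per year: {timedelta(seconds=round(per_year))}")
```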
Modern Data Centers: Three Key Definitions
Redundancy
• Duplication of critical components or functions of
a system with the intention of increasing reliability
of the system
N+1: one extra piece of equipment
provided in parallel
N+2: two extra pieces of equipment
provided in parallel
2N: two separate support systems
2N+1: two separate support systems,
each with extra equipment provided in
parallel
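The notation above can be worked out from three numbers: the load, the capacity of one unit, and how many units are installed. The sketch below is one way to express that; the function name and its parameters are hypothetical, and it assumes identical units and that each separate support system can carry the full load on its own.

```python
import math


def redundancy_label(load_kw: float, unit_kw: float,
                     units_per_system: int, systems: int = 1) -> str:
    """Label a topology as N, N+k, 2N, or 2N+k.

    N is the number of units needed to carry the load. `systems` is the
    number of separate support systems, each sized for the full load.
    """
    n = math.ceil(load_kw / unit_kw)   # units required to carry the load (N)
    spare = units_per_system - n       # extra units provided in parallel
    if spare < 0:
        raise ValueError("not enough capacity to carry the load")
    prefix = "N" if systems == 1 else f"{systems}N"
    return prefix if spare == 0 else f"{prefix}+{spare}"


# Illustrative cases with a 300 kW load
print(redundancy_label(300, 100, 4))             # four 100 kW units in one system
print(redundancy_label(300, 100, 5))             # five 100 kW units in one system
print(redundancy_label(300, 300, 1, systems=2))  # two separate 300 kW systems
print(redundancy_label(300, 200, 3))             # three 200 kW units in one system
```

Note that two 300 kW units on a 300 kW load are N+1 when they share one system and 2N only when they form two separate support systems, which is why `systems` is a separate input.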
Activity: Redundancy
Determine the redundancy topology of the
equipment serving each of the loads below
[Figure: four equipment layouts, each serving a 300 kW load]
• Four 100 kW units: N+1
• Five 100 kW units: N+2
• Two 300 kW units: 2N
• Three 200 kW units: N+1
Tier Levels
Uptime Tier Levels
‘Uptime’
• pioneered by the Uptime Institute
• founded in 1993
Established Tier classification system
• Lowest reliability to highest: I, II, III, & IV
• Globally recognized system
• Although marketing material may list a tier level, it is not always accurate
UTI provides certifications for:
• Design documents
• Constructed facilities
• Specialists and Designers
Tier Levels
IV – Fault tolerant
• Multiple active paths
• Redundancy
• Concurrently Maintainable
III – Concurrently maintainable
• Multiple paths; only one active
• Redundancy
• Concurrently Maintainable
II – Redundant capacity
• Single path
• Redundant components
I – Basic
• Single path
• No redundancy
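The tier list above can be read as a small decision rule on the number of distribution paths, how many of them are active, and whether components are redundant. The sketch below encodes only those slide bullets; the function and its parameters are hypothetical, and an actual tier certification weighs far more than these three attributes.

```python
def tier_level(paths: int, active_paths: int, redundant_components: bool) -> str:
    """Map coarse design attributes to the tier names listed on the slide."""
    if paths < 1 or not 1 <= active_paths <= paths:
        raise ValueError("need at least one path and 1 <= active_paths <= paths")
    if paths == 1:
        # Single path: Tier II if components are redundant, otherwise Tier I
        return "II" if redundant_components else "I"
    if not redundant_components:
        raise ValueError("multiple paths without redundancy is not covered by the slide")
    # Multiple paths: only one active is Tier III, several active is Tier IV
    return "IV" if active_paths >= 2 else "III"


for args in [(1, 1, False), (1, 1, True), (2, 1, True), (2, 2, True)]:
    print(args, "-> Tier", tier_level(*args))
```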
TIA 942
The Telecommunications Industry Association (TIA) sets standards for the
telecommunications industry
Standard 942 was developed for data centers in 2005 and updated in 2013
Applies to:
• Network architecture
• File storage, backup and archiving
• Database management
• Network access control and security
• Web hosting
• Application hosting
• Content distribution
• System redundancy
• Electrical design
• Power management
• Environmental control
• Protection against physical hazards (fire, flood, windstorm)
[Figure: Telecom Topology, TIA 942]
Tier III: Concurrent Maintainability
Separate paths,
redundant equipment
Performing maintenance (planned or
emergency) may involve taking redundant
components offline
Concurrent maintainability permits
systems to be bypassed without affecting
the availability of the computing
equipment
There may be more than one way to
provide concurrent maintainability
• Each step and path of power, cooling, and other
systems must tolerate one failure or outage
• Additional outages, such as two components failing
at once, are not considered in a Tier III design
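The single-outage rule above can be checked mechanically: take each component offline in turn and confirm that at least one path to the load remains intact. The sketch below does that for paths modeled as sets of component names. The function and all component names are hypothetical, and the model ignores capacity, so it tests path separation only.

```python
def survives_any_single_outage(paths: list[set[str]]) -> bool:
    """True if, with any one component offline, some path is still intact.

    Each path is the set of component names that path depends on.
    """
    if not paths:
        return False  # no path at all cannot serve the load
    components = set().union(*paths)
    for offline in components:
        # A path stays usable only if it does not depend on the offline component
        if not any(offline not in path for path in paths):
            return False  # every path needs this component
    return True


# Hypothetical layouts: the first shares a utility feed and a PDU between paths
shared = [{"utility", "ups_a", "pdu"}, {"utility", "ups_b", "pdu"}]
separate = [{"utility_a", "ups_a", "pdu_a"}, {"utility_b", "ups_b", "pdu_b"}]
print(survives_any_single_outage(shared), survives_any_single_outage(separate))
```

Only one component is removed at a time, matching the statement that two simultaneous failures are outside the Tier III design basis.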
Tier IV: Fault Tolerant
Data Center Tour: Reliability
Simple block diagram by Eaton shows redundancy of equipment and power
architecture.
Activity
Determine if the tier diagram depicted meets Tier III or IV.
[Figure: four power diagrams, each serving a 300 kW load]
• Four 100 kW units; single point of failure: Neither
• Four 100 kW units; two paths to load, not separate, not enough redundancy for 2N: Tier III
• Two 300 kW units; two paths to load, each at 100%: Tier IV
• Three 200 kW units; two paths to load, not separate, not enough redundancy for 2N: Tier III
End of presentation