
DATA PROTECTION

PARTICIPANT GUIDE


© Copyright 2021 Dell Inc. Page i


Table of Contents

Data Replication
Data Replication Overview
Use of Replicas
Data Replication: Additional Information
Types of Replication
Local Replication: Snapshot
Local Replication: Clone
Remote Replication: Synchronous
Remote Replication: Asynchronous
Replication Types: Additional Information
Continuous Data Protection (CDP)
Key Continuous Data Protection Components
Continuous Data Protection: Local and Remote Replication
CDP: Additional Information
Knowledge Check

Data Backup
Backup Overview
Backup Architecture
Backup Operation
Recovery Operation
Backup Granularities
Agent-Based Backup
Image-Based Backup
Cloud-Based Backup (Backup as a Service)
Backup Architecture: Additional Information
Backup and Recovery Lab Demo
Knowledge Check

Data Deduplication
Data Deduplication Overview
Key Benefits of Data Deduplication
Data Deduplication Method: Source-Based
Data Deduplication Method: Target-Based
Data Deduplication: Additional Information
Knowledge Check

Data Archiving
Data Archiving Overview
Backup vs. Archive
Data Archiving Operations
Use Case: Email Archiving
Knowledge Check

Data Migration
Data Migration Overview
Hypervisor-Based Migration
Storage-Based Data Migration
Appliance-Based Data Migration
VM Migration: Additional Information
Knowledge Check

Concepts in Practice

Exercise: Data Protection


Data Replication


Data Replication Overview

Figure: Data replication across different locations (diagram labels: Servers, Source, Replica, Data Center A, Data Center B, Cloud)

Replication is the process of creating an exact copy (replica) of data to ensure
business continuity when there is a local outage or disaster.

• Replicas are used to restore and restart operations when there is data loss.
• Data can be replicated to one or more locations.

− For example, the production data is copied from the source (primary
storage) to the target. The target can be other storage in the same data
center, storage in a different data center, or the cloud.

In a replication environment, a compute system accesses the production data
from one or more LUNs on a storage system. These LUNs are known as source
LUNs, production LUNs, or the source. A LUN to which the production data is
replicated is called the target or replica.


Use of Replicas

Replicas can be used:

• As an alternative source for backup.
• To restart business operations or to recover the data.
• For running decision-support activities.
• For testing purposes.
• For data migration.

Notes:

Alternative source for backup: Under normal backup operations, data is read
from the production LUNs and written to the backup device. This approach places
an additional burden on the production infrastructure because production LUNs are
simultaneously involved in production operations and servicing data for backup
operations. To avoid this situation, a replica can be created from the production LUN
and used as the source for backup operations. This method
alleviates the backup I/O workload on the production LUNs.

Fast recovery and restart: For critical applications, replicas can be taken at short,
regular intervals. This approach allows easy and fast recovery from data loss. If a
complete failure of the source (production) LUN occurs, the replication solution
enables one to restart the production operation on the replica to reduce the RTO.

Decision-support activities, such as reporting: Running reports using the data on
the replicas greatly reduces the I/O burden that is placed on the production device.

Testing platform: Replicas are also used for testing new applications or upgrades.
For example, an organization may use the replica to test the production application
upgrade; if the test is successful, the upgrade may be implemented on the
production environment.

Data migration: Another use for a replica is data migration. Data migrations are
performed for various reasons such as migrating from a smaller capacity LUN to
one of a larger capacity for newer versions of the application.


Data Replication: Additional Information

To understand more about replication, click here.


Types of Replication

Local Replication

• Replicating data within the same location.
− Within a data center.
− Within a storage system.
• It is typically used for operational restore of data when there is data loss.

Remote Replication

• Replicating data to remote locations (locations can be geographically dispersed).
− Data can be synchronously or asynchronously replicated.
− It helps to mitigate the risks associated with regional outages.
• It enables organizations to replicate the data to the cloud for DR purposes.


Local Replication: Snapshot

A snapshot is a virtual copy of a set of files, a VM, or a LUN as it appeared at a
specific point in time (PIT). A point-in-time copy of data contains a consistent
image of the data as it appeared at that point in time.

Snapshots can establish recovery points in a small fraction of time and can reduce
Recovery Point Objective (RPO) by supporting more frequent recovery points. If a
file is lost or corrupted, it can typically be restored from the latest snapshot data in
a few seconds.

VM Snapshot

• A VM snapshot preserves the state and data of a VM at a specific PIT, enabling
quick restoration of the VM.
− The state includes the power state of the VM (powered-on, powered-off,
or suspended).
− The data includes all the files that make up the VM.
• For example:
− An administrator can create a snapshot of a VM and then make changes to the
VM, such as applying patches and software upgrades.
− If anything goes wrong, the administrator can restore the VM to its previous
state using the VM snapshot.
• Taking multiple snapshots provides several restore points for a VM.

Storage System-Based Snapshot

• A storage system-based snapshot provides space-optimal, pointer-based virtual
replication.
• At the time of replication session activation, the target (snapshot) contains
pointers to the location of the data on the source.
• The snapshot does not contain data at any time, so it is known as a
virtual replica.
• The snapshot is immediately accessible after the replication session activation.
Notes:

Multiple snapshots can be created from the same source LUN for various business
requirements. Some snapshot software provides the capability of automatic
termination of a snapshot upon reaching the expiration date. The unavailability of
the source device invalidates the data on the target. The storage system-based
snapshot uses a Redirect on Write (RoW) mechanism.

RoW redirects new writes that are destined for the source LUN to a reserved LUN
in the storage pool. In RoW, a new write from the compute system is written to
a new location (redirected) inside the pool. The original data remains where it is,
is read from its original location on the source LUN, and is untouched
by the RoW process.
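
The RoW behavior described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular storage system implements it: the class name RowVolume, its methods, and the use of a dictionary as the storage pool are all hypothetical and chosen only for illustration.

```python
class RowVolume:
    """Toy model of a LUN that supports Redirect on Write (RoW) snapshots."""

    def __init__(self, blocks):
        # Hypothetical storage pool: physical location -> data.
        self.pool = dict(enumerate(blocks))
        # Live block map: logical block number -> physical location.
        self.live_map = {i: i for i in range(len(blocks))}
        self.next_free = len(blocks)

    def snapshot(self):
        # The snapshot copies only pointers, never the data itself.
        return dict(self.live_map)

    def write(self, block_no, data):
        # Redirect the new write to a new location; the original block
        # stays where it is, so existing snapshots still point at it.
        # (A real system would also reclaim unreferenced blocks.)
        location = self.next_free
        self.next_free += 1
        self.pool[location] = data
        self.live_map[block_no] = location

    def read(self, block_no, block_map=None):
        # Read through the live map, or through a snapshot's pointer map.
        block_map = self.live_map if block_map is None else block_map
        return self.pool[block_map[block_no]]


vol = RowVolume(["A0", "B0", "C0"])
snap = vol.snapshot()
vol.write(1, "B1")
print(vol.read(1))        # current data
print(vol.read(1, snap))  # point-in-time data seen through the snapshot
```

In this sketch the snapshot is created instantly because it holds pointers only, which mirrors why a pointer-based snapshot is accessible immediately after activation.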


Local Replication: Clone

• Cloning provides the ability to create fully populated point-in-time copies of
LUNs within a storage system or to create a copy of an existing VM.
• Clone of a storage volume
− Initial synchronization is performed between the source LUN and the replica
(clone).
− During the synchronization process, the replica is not available for any
compute system access. Once the synchronization is completed, the replica
is exactly the same as the source LUN.
− Data changes made to both the source and the replica can be tracked at
some predefined granularity.
• VM clone
− A clone is a copy of an existing virtual machine (parent VM).
− Typically, clones are deployed when many identical VMs are required, which
reduces the time that is required to deploy a new VM.
− Two types of clones:

Full clone: An independent copy of a VM that shares nothing with the parent VM.

Linked clone: A clone that is created from a snapshot of the parent VM.


Remote Replication: Synchronous

• A write is committed to both the source and the remote replica before it is
acknowledged to the compute system.
• Synchronous replication enables restarting business operations at a remote site
with zero data loss and provides near-zero RPO.

Synchronous remote replication process

Notes:

A storage-based remote replication solution can avoid downtime by enabling
business operations at remote sites. Storage-based synchronous remote
replication provides near-zero RPO, where the target is always identical to the
source.

In synchronous replication, writes must be committed to the source and the remote
target prior to acknowledging “write complete” to the production compute system.
Additional writes on the source cannot occur until each preceding write has been
completed and acknowledged.

This approach ensures that data is always identical on the source and the target.
Further, writes are transmitted to the remote site exactly in the order in which they
are received at the source. Write ordering is maintained, which ensures
transactional consistency when the applications are restarted at the remote
location. As a result, the remote images are always restartable copies.

Application response time increases with synchronous remote replication because
writes must be committed on both the source and the target before the
“write complete” acknowledgment is sent to the compute system. The degree of impact on
response time depends primarily on the distance and the network bandwidth
between sites. If the bandwidth provided for synchronous remote replication is less
than the maximum write workload, there are times during the day when the
response time might be excessively elongated, causing applications to time out.
The distances over which synchronous replication can be deployed depend on the
capability of an application to tolerate the extensions in response time. Typically,
synchronous remote replication is deployed for distances less than 200 kilometers
(125 miles) between the two sites.
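
The effect of distance on response time can be estimated with simple propagation arithmetic. The sketch below is illustrative only. It assumes that light in optical fiber covers roughly 200 km per millisecond, which is an approximation that does not come from this guide, and it ignores switching, protocol overhead, bandwidth queuing, and the commit time on the storage systems. The function name and the round_trips parameter are hypothetical.

```python
# Assumption (not from the guide): light in optical fiber travels roughly
# 200 km per millisecond, about two thirds of its speed in a vacuum.
FIBER_KM_PER_MS = 200.0


def added_write_latency_ms(distance_km, round_trips=1):
    """Estimate the propagation delay that synchronous replication adds to one write.

    Each round trip carries the write to the remote site and the
    acknowledgment back, so it covers the distance twice.
    """
    return round_trips * 2 * distance_km / FIBER_KM_PER_MS


for km in (50, 200, 1000):
    print(f"{km:>5} km: +{added_write_latency_ms(km):.1f} ms per write")
```

Under these assumptions a 200 km link adds about 2 ms to every write and a 1000 km link about 10 ms, and a replication protocol that needs more than one round trip per write multiplies that figure accordingly.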


Remote Replication: Asynchronous

• A write is committed to the source and immediately acknowledged to the
compute system.
• Data is buffered at the source and sent to the remote site periodically.
• The replica lags behind the source by a finite amount (finite RPO).

Asynchronous remote replication process

Notes:

It is important for an organization to replicate data across geographical locations to
mitigate the risk involved during a disaster. If the data is replicated synchronously
between nearby sites and a disaster strikes, there is a chance that both
sites are impacted, which may lead to data loss and a service outage.

Asynchronous replication enables data to be replicated across sites that are
thousands of kilometers apart.

In asynchronous remote replication, a write from a production compute system is
committed to the source and immediately acknowledged to the compute system.
Asynchronous replication also mitigates the impact on the response time of an
application because the writes are acknowledged immediately to the compute
system.

In asynchronous replication, compute system writes are collected into a buffer (delta
set) at the source. This delta set is transferred to the remote site at regular
intervals. Adequate buffer capacity should be provisioned to perform asynchronous
replication. Some storage vendors offer a feature called delta set extension, which
enables the delta set to be offloaded from the buffer (cache) to specially configured
drives. This feature makes asynchronous replication resilient to a temporary increase
in write workload or a loss of the network link.

In asynchronous replication, RPO depends on the size of the buffer, the available
network bandwidth, and the write workload to the source. This replication can take
advantage of locality of reference (repeated writes to the same location). If the
same location is written multiple times in the buffer prior to transmission to the
remote site, only the final version of the data is transmitted. This feature conserves
link bandwidth.
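
The buffering and write-folding behavior described above can be sketched as follows. This is a minimal illustration, assuming the delta set can be modeled as a dictionary keyed by block address; the class DeltaSetBuffer and its methods are hypothetical names, and a real product would also bound the buffer size and preserve consistency across transfer cycles.

```python
class DeltaSetBuffer:
    """Toy model of the source-side buffer (delta set) in asynchronous replication."""

    def __init__(self):
        self.pending = {}     # block address -> latest data not yet sent
        self.writes_seen = 0  # total writes acknowledged to the compute system

    def write(self, address, data):
        # The write is acknowledged immediately, before anything is sent.
        self.writes_seen += 1
        # Repeated writes to the same address overwrite the buffered entry,
        # so only the final version is kept for transmission.
        self.pending[address] = data
        return "write complete"

    def flush(self, remote):
        # Transfer the whole delta set to the remote replica, then start a new one.
        sent = len(self.pending)
        remote.update(self.pending)
        self.pending = {}
        return sent


remote_replica = {}
buffer = DeltaSetBuffer()
buffer.write(10, "v1")
buffer.write(10, "v2")  # same location written again before transmission
buffer.write(11, "x")
blocks_sent = buffer.flush(remote_replica)
print(buffer.writes_seen, "writes acknowledged,", blocks_sent, "blocks transmitted")
```

Three writes are acknowledged but only two blocks cross the link, which is the bandwidth saving that locality of reference provides. Everything written after the last flush is what would be lost in a disaster, which is why the replica has a finite RPO.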


Replication Types: Additional Information

To understand more about various types of replication, click here.


Continuous Data Protection (CDP)

• Continuous Data Protection provides the capability to restore data and VMs to
any previous point in time (PIT).

− Data changes are continuously captured and stored at a separate location
from the production volume so that the data can be restored to any previous
PIT.

1: Continuous Data Protection provides continuous replication and tracks all the
changes to the production volumes, which enables recovery to any point in time.

2: Continuous Data Protection solutions have the capability to replicate data across
heterogeneous storage systems.

3: Continuous Data Protection supports both local and remote replication of data
and VMs to meet operational and disaster recovery requirements, respectively.

4: Continuous Data Protection supports various WAN optimization techniques
(deduplication, compression, and fast write) to reduce bandwidth requirements and
to make optimal use of the available bandwidth.

5: Continuous Data Protection supports multisite replication, where the data can be
replicated to more than two sites using synchronous and asynchronous replication.


Key Continuous Data Protection Components

Continuous Data Protection (CDP) components


Continuous Data Protection: Local and Remote Replication

Notes:

Typically, the replica is synchronized with the source, and then the replication
process starts. After the replication starts, all the writes from the compute system to
the source (production volume) are split into two copies. One copy is sent to the
local Continuous Data Protection appliance at the source site, and the other copy is
sent to the production volume. Then the local appliance writes the data to the
journal at the source site and the data in turn is written to the local replica. If a file is
accidentally deleted, or the file is corrupted, the local journal enables organizations
to recover the application data to any PIT.

In remote replication, the local appliance at the source site sends the received write
I/O to the appliance at the remote (DR) site. Then, the write is applied to the journal
volume at the remote site. As a next step, data from the journal volume is sent to
the remote replica at predefined intervals. Continuous Data Protection operates in
either synchronous or asynchronous mode.
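
The write splitting and journal-based recovery described above can be sketched as a small Python model. It is a simplified illustration under assumptions: the names CdpJournal, split_write, and image_at are hypothetical, timestamps are assumed to arrive in increasing order, and the appliance, local replica, and remote site are collapsed into a single in-memory journal.

```python
class CdpJournal:
    """Toy journal that records every write so that any PIT can be rebuilt."""

    def __init__(self, baseline):
        # Replica contents at the moment replication started.
        self.baseline = dict(baseline)
        # Journal entries in arrival order: (timestamp, address, data).
        self.entries = []

    def record(self, timestamp, address, data):
        self.entries.append((timestamp, address, data))

    def image_at(self, point_in_time):
        # Replay journaled writes onto the baseline up to the requested PIT.
        image = dict(self.baseline)
        for timestamp, address, data in self.entries:
            if timestamp > point_in_time:
                break
            image[address] = data
        return image


def split_write(production, journal, timestamp, address, data):
    # Each write is split: one copy goes to the production volume and
    # one copy goes to the journal.
    production[address] = data
    journal.record(timestamp, address, data)


production = {"report.doc": "v0"}
journal = CdpJournal(production)
split_write(production, journal, 1, "report.doc", "v1")
split_write(production, journal, 2, "report.doc", "corrupted")

print(production["report.doc"])            # current (bad) production state
print(journal.image_at(1)["report.doc"])   # state just before the corruption
```

Because every write is kept rather than only periodic copies, the recovery point can be chosen after the fact, for example the moment just before a file was corrupted.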


CDP: Additional Information

To understand about continuous data replication, click here.

Knowledge Check

1. Which provides the ability to create fully populated point-in-time copies of LUNs
within a storage system or create a copy of an existing VM?
a. Clone
b. Snapshot
c. Pointer-based virtual replica
d. Full volume virtual replica

Data Backup

Backup Overview

A backup is an additional copy of production data, which is created and retained for
the sole purpose of recovering lost or corrupted data.

Organizations implement backups to protect data from accidental
deletion, application crashes, data corruption, and disaster.

An organization needs data backups to:

• Recover lost or corrupted data for the smooth functioning of business
operations.
• Comply with regulatory requirements.
• Avoid financial and business loss.


Backup Architecture

In a backup environment, the common backup components are Backup Client,
Backup Server, Storage Node, and Backup Target (Backup Device).

Notes:

The role of a backup client is to gather the data that must be backed up and send it to
the storage node. The backup client can be installed on application servers, mobile
clients, and desktops. It also sends the tracking information to the backup server.

The backup server manages the backup operations and maintains the backup
catalog, which contains information about the backup configuration and backup
metadata. The backup configuration contains information about when to run
backups, which client data is to be backed up, and so on. The backup metadata
contains information about the backed-up data.

The storage node is responsible
for organizing the client’s data and writing the data to a backup device. A storage
node controls one or more backup devices.

In most implementations, the storage node and the backup server run on the same
system. Backup devices may be attached directly or through a network to the
storage node. The storage node sends the tracking information about the data that
is written to the backup device to the backup server. Typically, this information is
used for recoveries. Backup targets include tape, disk, virtual disk library, and the
cloud.
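
How the components above interact can be sketched in Python. This is a rough illustration rather than the design of any real backup product: the classes BackupServer and StorageNode, the function run_backup, and the device name disk1 are hypothetical, backup devices are modeled as dictionaries, and only the tracking information sent by the storage node is modeled.

```python
class BackupServer:
    """Maintains the backup catalog (metadata about the backed-up data)."""

    def __init__(self):
        self.catalog = []

    def track(self, client, path, device):
        # Tracking information received from the storage node.
        self.catalog.append({"client": client, "path": path, "device": device})

    def locate(self, client, path):
        # During a recovery, the catalog tells where a client's data was written.
        return [entry["device"] for entry in self.catalog
                if entry["client"] == client and entry["path"] == path]


class StorageNode:
    """Writes client data to the backup devices that it controls."""

    def __init__(self, server, devices):
        self.server = server
        self.devices = devices  # device name -> dict standing in for tape, disk, or cloud

    def write(self, client, path, data, device):
        self.devices[device][(client, path)] = data
        self.server.track(client, path, device)


def run_backup(client, files, node, device):
    # The backup client gathers the data and sends it to the storage node.
    for path, data in files.items():
        node.write(client, path, data, device)


server = BackupServer()
node = StorageNode(server, {"disk1": {}})
run_backup("app01", {"/etc/hosts": b"127.0.0.1 localhost"}, node, "disk1")
print(server.locate("app01", "/etc/hosts"))
```

Keeping the catalog on the backup server, separate from the data on the devices, is what lets a recovery find the right backup copy without scanning every device.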


Backup Operation

Steps to perform backup operation


Recovery Operation

After the data is backed up, it can be restored when required. A recovery operation
restores data to its original state at a specific Point in Time (PIT). Typically, backup
applications support restoring one or more individual files, directories, or VMs.

Steps to perform recovery operation


Backup Granularities

Backups can be categorized as Full, Incremental, and Cumulative (or
Differential).

Full Backup

• Full backup copies all data on the production volume to a backup device.

− It provides faster data recovery.
− It requires more storage space and takes more time to back up.

Full Backup-Restore

In this example, a full backup is created every Sunday. When data is lost in
production on Monday, the most recent full backup, which was created on the
previous Sunday, is used to restore the production data.

• Recovery Point Objective (RPO) determines which backup copy is used to
restore the production.


Incremental Backup

Incremental backup copies the data that has changed since the last backup.

• The main advantage of incremental backups is that fewer files are backed up
daily, allowing for shorter backup windows. A backup window is the period during
which a production volume is available to perform a backup.
• The primary disadvantage of incremental backups is that they can be
time-consuming to restore.

For example, a full backup is created on Sunday, and incremental backups are
created for the rest of the week. The backup that is created on Monday contains only
the data that has changed since Sunday. The backup that is created on Tuesday
contains only the data that has changed since Monday. The backup that is created on
Wednesday contains only the data that has changed since Tuesday. Suppose that an
administrator wants to restore the backup from Wednesday. The administrator has
to first restore the full backup that was created on Sunday. After that, the
administrator has to restore the backups that were created on Monday, Tuesday, and
Wednesday, in that order.

Cumulative Backup

Cumulative (differential) backup copies the data that has changed since the last full
backup.

• The advantage of differential backups over incremental backups is shorter
restore times. Restoring a differential backup never requires more than two copies.
• The tradeoff is that as time progresses, a differential backup can grow to
contain more data than an incremental backup.

For example, the administrator creates a full backup on Sunday and differential
backups for the rest of the week. The backup that is created on Monday contains
all the data that has changed since Sunday. It is therefore identical to an
incremental backup at this point. On Tuesday, however, the differential backup
backs up any data that has changed since Sunday (the full backup). Suppose that an
administrator wants to restore the backup from Tuesday. The administrator has to
first restore the full backup that was created on Sunday. After that, the
administrator has to restore the backup that was created on Tuesday.
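
The difference in restore effort between the two schemes can be worked out with a short Python sketch. It is an illustration under assumptions: the function restore_chain and the (day, kind) tuple format are hypothetical, the backup list is assumed to be in chronological order, and it is assumed to contain at least one full backup on or before the requested day.

```python
def restore_chain(backups, target_day):
    """Return the days whose backups must be restored, in order, to recover target_day.

    backups is a chronological list of (day, kind) tuples, where kind is
    "full", "incremental", or "cumulative".
    """
    days = [day for day, _ in backups]
    upto = backups[: days.index(target_day) + 1]

    # Start from the most recent full backup on or before the target day.
    last_full = max(i for i, (_, kind) in enumerate(upto) if kind == "full")
    chain = [upto[last_full]]
    tail = upto[last_full + 1:]

    # The latest cumulative backup already holds every change since the full
    # backup, so anything between the full backup and it can be skipped.
    cumulative_positions = [i for i, (_, kind) in enumerate(tail) if kind == "cumulative"]
    if cumulative_positions:
        tail = tail[cumulative_positions[-1]:]

    chain.extend(tail)
    return [day for day, _ in chain]


incremental_week = [("Sun", "full"), ("Mon", "incremental"),
                    ("Tue", "incremental"), ("Wed", "incremental")]
cumulative_week = [("Sun", "full"), ("Mon", "cumulative"),
                   ("Tue", "cumulative"), ("Wed", "cumulative")]

print(restore_chain(incremental_week, "Wed"))
print(restore_chain(cumulative_week, "Wed"))
```

With incremental backups the chain grows by one copy per day since the full backup, while with cumulative backups it never exceeds two copies, which is the restore-time advantage described above.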


Agent-Based Backup

In this approach, an agent or client is installed on a virtual machine (VM) or a
physical compute system. The agent streams the backup data to the backup
device.

Figure: Agent-based backup (diagram labels: Application Servers, Backup Server, Backup Device)

• Agent-based backup supports file-level backup and restore.

− It impacts the performance of applications running on compute systems.
− The agent running on the compute system consumes CPU cycles and
memory resources.


Image-Based Backup

Image-based backup makes a copy of the virtual machine disk and configuration
that is associated with a particular VM. The backup is saved as a single entity
called a VM image.

Figure: Image-based backup process (diagram labels: VM Management Server, Proxy Server, Application Servers, VM File System Volume, VM Snapshot, Backup Server, Backup Device; actions: Create Snapshot, Mount Snapshot, Backup Data)

• In an image-based backup, the backup software sends a request to the VM
management server to create a snapshot of the VMs to be backed up and to mount it
on the proxy server.
• The proxy server performs the backup by using the snapshot.

Notes:

This backup is used for restoring an entire VM if there is any hardware failure or
human error. It is also possible to restore individual files and folders within a virtual
machine.

In an image-level backup, the backup software can backup VMs without installing
backup agents inside the VMs or at the hypervisor-level. Proxy server performs the
backup operations, and it acts as the backup client. The proxy server offloads the
backup processing from the VMs.

The proxy server communicates with the management server responsible for
managing the virtualized compute environment. It sends commands to create a
snapshot of the VM to be backed up and to mount the snapshot on the proxy
server. A snapshot captures the configuration and virtual disk data of the target VM
and provides a point-in-time view of the VM.

The proxy server then performs backup by using the snapshot. Performing an
image-level backup of a virtual machine disk enables running a bare metal restore
of a VM.

Some vendors support a changed block tracking mechanism. This feature
identifies and tags any blocks that have changed since the last VM snapshot. This
method enables the backup application to back up only the blocks that have
changed, rather than backing up every block.
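
The changed block tracking idea can be illustrated with a small sketch. This is not any vendor's implementation; the class name TrackedDisk, its methods, and the tiny 4-byte block size are assumptions chosen only to keep the example readable.

```python
from typing import Dict, List, Set


class TrackedDisk:
    """Toy virtual disk that tags blocks written since the last backup (hypothetical)."""

    def __init__(self, num_blocks: int, block_size: int = 4) -> None:
        self.blocks: List[bytes] = [bytes(block_size) for _ in range(num_blocks)]
        # Before the first backup, every block counts as changed.
        self.changed: Set[int] = set(range(num_blocks))

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data
        self.changed.add(index)  # tag the block as changed

    def backup_changed_blocks(self) -> Dict[int, bytes]:
        """Return only the tagged blocks, then reset the tracking set."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta


disk = TrackedDisk(num_blocks=8)
full = disk.backup_changed_blocks()         # first backup copies all 8 blocks
disk.write(2, b"AAAA")
disk.write(5, b"BBBB")
incremental = disk.backup_changed_blocks()  # later backup copies only blocks 2 and 5
```

In this sketch the second backup transfers two blocks instead of eight, which is the saving the mechanism is meant to provide.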


Cloud-Based Backup (Backup as a Service)

Figure: Cloud-based backup (data center, backup data, cloud resources)

Organizations must protect their data regularly to avoid losses, stay compliant, and
preserve data integrity. They may face challenges with IT budgets and IT
management. Cloud-based data protection can address these challenges.

• It enables consumers to procure backup services on-demand.


• It reduces the backup management overhead.
• Cloud-based backup gives the consumers the flexibility to select a backup
technology based on their current requirements.






Knowledge Check

1. Which backup component manages the backup operations and maintains the
backup catalog?
a. Backup client
b. Backup target
c. Backup server
d. Backup device


Data Deduplication


Data Deduplication Overview

Challenges of duplicate data in a data center:

• It is difficult to protect the data within the budget.
• Duplicate data impacts the backup window.
• It increases network bandwidth requirements.

Figure: Example of data deduplication (before deduplication: total segments = 39; after deduplication: unique segments = 3)

Data deduplication provides a solution for organizations to overcome these
challenges in a backup and production environment.

Data deduplication is the process of detecting and identifying the unique data
segments within a given set of data to eliminate redundancy.

• Effectiveness of deduplication is expressed as a deduplication ratio 5.

5 It is the ratio of the amount of data before deduplication to the amount of data
after deduplication. This ratio is typically written as “ratio:1” or “ratio X” (for example, 10:1 or 10 X).
For example, if 200 GB of data consumes 20 GB of storage capacity after data
deduplication, the space reduction ratio is 10:1.
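
The ratio arithmetic can be checked with a short sketch. The function names below are illustrative and not part of any product; the 200 GB and 20 GB figures are the ones used in the example above.

```python
def deduplication_ratio(size_before: float, size_after: float) -> float:
    """Ratio of data size before deduplication to data size after deduplication."""
    if size_after <= 0:
        raise ValueError("size_after must be positive")
    return size_before / size_after


def space_saved_percent(size_before: float, size_after: float) -> float:
    """Percentage of capacity saved by deduplication."""
    return 100 * (size_before - size_after) / size_before


ratio = deduplication_ratio(200, 20)   # sizes in GB
saved = space_saved_percent(200, 20)
print(f"{ratio:g}:1, {saved:g}% space saved")
```

A 10:1 ratio corresponds to a 90 percent reduction in stored capacity, which is often the easier figure to compare against a storage budget.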


Notes:

In a data center environment, a certain percentage of the data that is retained on
backup media is redundant. The typical backup process for most organizations
consists of a series of daily incremental backups and weekly full backups. Daily
backups are retained for a few weeks, and weekly full backups are retained for
several months. Because of this process, multiple copies of identical or slowly
changing data are retained on backup media, leading to a high level of data
redundancy.

Many files are common across multiple systems in a data center environment.
Many users across an environment store identical files such as Word documents,
PowerPoint presentations, and Excel spreadsheets. Backups of these
systems contain many identical files. Also, many users keep multiple versions of
files that they are working on. Many of these files differ only slightly from other
versions, but are seen by backup applications as new data that must be protected.

This redundant data creates several challenges for organizations. Backing
up redundant data increases the amount of storage that is required to protect the
data, which in turn increases the storage infrastructure cost. Organizations must
protect the data within a limited budget. They are also
running out of backup window time and facing difficulties meeting recovery
objectives. Backing up a large amount of duplicate data to a remote site or the cloud
for DR purposes is also cumbersome and requires significant bandwidth.

Data deduplication provides a solution for organizations to overcome these
challenges in a backup and production environment. Deduplication is the process
of detecting and identifying the unique data segments (chunks) within a given set of
data to eliminate redundancy. Only one copy of the data is stored; the subsequent
copies are replaced with a pointer to the original data.
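
A minimal sketch of the "store one copy, replace the rest with pointers" idea follows. It assumes fixed-size segments and SHA-256 fingerprints as pointers; the guide does not specify a segmenting or fingerprinting method, and real products may use variable-length segments. The names deduplicate, restore, and SEGMENT_SIZE are hypothetical.

```python
import hashlib
from typing import Dict, List

SEGMENT_SIZE = 4  # bytes; unrealistically small, chosen to keep the example readable


def deduplicate(data: bytes, store: Dict[str, bytes]) -> List[str]:
    """Split data into fixed-size segments and keep each unique segment once.

    Returns the pointers (segment fingerprints) needed to rebuild the data.
    """
    pointers = []
    for offset in range(0, len(data), SEGMENT_SIZE):
        segment = data[offset:offset + SEGMENT_SIZE]
        fingerprint = hashlib.sha256(segment).hexdigest()
        store.setdefault(fingerprint, segment)  # only the first copy is stored
        pointers.append(fingerprint)
    return pointers


def restore(pointers: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data by following the pointers."""
    return b"".join(store[p] for p in pointers)


store: Dict[str, bytes] = {}
data = b"AAAABBBBAAAACCCCAAAABBBB"   # six segments, three of them unique
pointers = deduplicate(data, store)
```

Here six segments are described by six pointers, but only three segments are physically stored, and the data can still be rebuilt exactly.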


Key Benefits of Data Deduplication


1: By eliminating redundant data from the backup, the infrastructure requirement is
minimized. Data deduplication directly results in reduced storage requirements.
Smaller storage needs result in lower acquisition costs as well as reduced power
and cooling costs.

2: As data deduplication reduces the amount of content in the daily backup, users
can extend their retention policies. This approach can have a significant benefit to
users who require longer retention.

3: Data deduplication eliminates redundant content of backup data, which results in
backing up less data and a reduced backup window.

4: By using data deduplication at the client, redundant data is removed before the
data is transferred over the network. This approach reduces the network bandwidth
that is required for sending backup data to a remote site for DR purposes.


Data Deduplication Method: Source-Based

Figure: Source-based data deduplication (VMs, hypervisor, application server as backup client with deduplication agent, deduplication server, backup device)

• Data is deduplicated at the source (backup client).
− The backup client sends only new, unique segments across the network.
− It reduces storage capacity and network bandwidth requirements.
− It is recommended for remote office/branch office (ROBO) environments
for taking centralized backups.
• Cloud service providers use the source-based method when backing up data
from the consumer’s location to their own location.
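
The network saving of source-based deduplication can be sketched as a client that first asks the server which segment fingerprints it lacks and then sends only those segments. The DedupServer class, the client_backup function, and the fingerprint exchange are assumptions for illustration, not a description of a specific product protocol.

```python
import hashlib
from typing import Dict, List, Set


class DedupServer:
    """Hypothetical deduplication server that stores each unique segment once."""

    def __init__(self) -> None:
        self.segments: Dict[str, bytes] = {}

    def missing(self, fingerprints: List[str]) -> Set[str]:
        return {fp for fp in fingerprints if fp not in self.segments}

    def receive(self, new_segments: Dict[str, bytes]) -> None:
        self.segments.update(new_segments)


def client_backup(data: bytes, server: DedupServer, segment_size: int = 4) -> int:
    """Back up data and return the number of segment bytes sent over the network."""
    segments = [data[i:i + segment_size] for i in range(0, len(data), segment_size)]
    by_fingerprint = {hashlib.sha256(s).hexdigest(): s for s in segments}
    needed = server.missing(list(by_fingerprint))      # only fingerprints cross the wire here
    to_send = {fp: by_fingerprint[fp] for fp in needed}
    server.receive(to_send)                            # only new, unique segments are sent
    return sum(len(s) for s in to_send.values())


server = DedupServer()
first = client_backup(b"AAAABBBBCCCCAAAA", server)    # three unique segments are sent
second = client_backup(b"AAAABBBBCCCCDDDD", server)   # only the DDDD segment is new
```

In this sketch the second backup sends 4 bytes instead of 16, at the cost of hashing work on the client, which is the trade-off between source-based and target-based deduplication.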


Data Deduplication Method: Target-Based

Figure: Target-based data deduplication (VMs, hypervisor, application server as backup client, deduplication server, backup device, deduplication appliance)

• Data is deduplicated at the target.
− Inline
− Postprocess
• It offloads the deduplication process from the backup client.
• It requires sufficient network bandwidth.
• In some implementations, part of the deduplication load is moved to the backup
server.
− Reduces the burden on the target.
− Improves the overall backup performance.




Knowledge Check

1. What is the benefit of implementing source-based deduplication?
a. Reduces the amount of data sent over the network
b. Improves the performance of an application server (client)
c. Improves the performance of a backup device
d. Reduces the storage required to copy backup catalog


Data Archiving


Data Archiving Overview

Data archiving moves fixed content that is no longer actively accessed to a
separate low-cost archive storage system for long-term retention and future
reference.

• Data archiving saves primary storage capacity.


• Data archiving reduces backup window and backup storage cost.

Notes:

Data in the primary storage is actively accessed and changed. As data ages, it is
less likely to change and eventually becomes “fixed” but continues to be accessed
by applications and users. This data is called fixed data. Fixed data is growing at
over 90 percent annually. Keeping the fixed data in primary storage systems poses
several challenges.

First, preserving data on the primary storage system causes increasing
consumption of expensive primary storage. Second, data that must be preserved
over a long period for compliance reasons may be modified or deleted by the
users. This poses a risk of a compliance breach. Finally, the backup of
high-growth fixed data results in an increased backup window and related backup
storage cost. Data archiving addresses these challenges.

Data archiving is the process of moving fixed data that is no longer actively
accessed to a separate lower-cost archive storage system for long-term retention
and future reference. With archiving, the capacity on expensive primary storage
can be reclaimed by moving infrequently accessed data to lower-cost archive
storage.


Backup vs. Archive

Data Backup | Data Archive
Secondary copy of data | Primary copy of data
Used for data recovery operations | Available for data retrieval
Primary objective – operational recovery and disaster recovery | Primary objective – compliance adherence and lower cost
Typically short-term (weeks or months) retention | Long-term (months, years, or decades) retention


Data Archiving Operations

Data archiving components and operations

• The data archiving operation involves the archiving agent, the archive server
(policy engine), and the archive storage.
• The archiving agent scans primary storage to find files that meet the archiving
policy.
− The archive server indexes the files.
• Once the files have been indexed, they are moved to archive storage and small
stub 6 files are left on the primary storage.

6 The stub file contains the address of the archived file. As the size of the stub file is
small, it saves space on primary storage.
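
The scan, index, move, and stub steps can be sketched in memory as follows. The policy used here (archive files not accessed for a given number of days), the archive:// address format, and all class and function names are assumptions for illustration; the guide does not define a specific policy or address scheme.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileEntry:
    content: bytes
    days_since_access: int
    is_stub: bool = False


@dataclass
class ArchiveServer:
    """Hypothetical archive server: keeps an index and the archive storage."""
    index: Dict[str, str] = field(default_factory=dict)      # file path -> archive address
    storage: Dict[str, bytes] = field(default_factory=dict)  # archive address -> content

    def archive(self, path: str, content: bytes) -> str:
        address = f"archive://{len(self.storage)}"  # made-up address scheme
        self.index[path] = address                  # index the file
        self.storage[address] = content             # move it to archive storage
        return address


def run_archiving_agent(primary: Dict[str, FileEntry], server: ArchiveServer,
                        min_idle_days: int) -> List[str]:
    """Scan primary storage, archive files that meet the policy, and leave stubs."""
    archived = []
    for path, entry in list(primary.items()):
        if entry.is_stub or entry.days_since_access < min_idle_days:
            continue
        address = server.archive(path, entry.content)
        # The stub holds only the address of the archived file.
        primary[path] = FileEntry(address.encode(), entry.days_since_access, is_stub=True)
        archived.append(path)
    return archived


primary = {
    "/data/report.doc": FileEntry(b"x" * 1000, days_since_access=400),
    "/data/today.doc": FileEntry(b"y" * 10, days_since_access=1),
}
server = ArchiveServer()
archived = run_archiving_agent(primary, server, min_idle_days=180)
```

After the run, the old report occupies only a few bytes on primary storage while its full content remains retrievable through the address in the stub.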


Use Case: Email Archiving

• Email archiving is the process of archiving email messages from the mail server
to an archive storage.

− After the email is archived, it is retained for years, based on the retention
policy.

Legal Dispute

• Email archiving helps an organization to address legal disputes.

− For example, an organization may be involved in a legal dispute. It must
produce all email messages within a specified time period containing
specific keywords that were sent to or from certain people.

Government Compliance

• Email archiving helps to meet government compliance requirements such as
Sarbanes-Oxley and SEC regulations.

− For example, an organization must produce all email messages from all
individuals that are involved in stock sales or transfers. Failure to comply
with these requirements could cause an organization to incur penalties.

Mailbox Space Savings

• Email archiving provides more mailbox space by moving old email messages to
archive storage.


− For example, an organization may configure a quota on each mailbox to limit
its size. A fixed quota for a mailbox forces users to delete email messages
as they approach the quota size. However, users often must access email
messages that are weeks, months, or even years old. With email archiving,
organizations can free up space in user mailboxes and still provide user
access to older email messages.


Knowledge Check

1. Which archiving component scans primary storage to find files that meet the
archiving policy?
a. Archiving agent
b. Archiving storage
c. Archiving client
d. Archiving policy engine


Data Migration


Data Migration Overview

Data migration is a specialized replication technique that enables moving data
from one system to another within a data center, between data centers, between
clouds, and between a data center and a cloud. It transfers the data between hosts
(physical or virtual), storage devices, or formats.

• In today’s competitive business environment, IT organizations need
nondisruptive live migration solutions in place to meet the required SLAs.
• Organizations deploy data migration solutions for the following reasons:
− Data center maintenance without downtime.
− Avoid production impacts due to natural disasters.
− Facilitate technology upgrades and refreshes.
− Data center migration or consolidation.
− Workload balancing across data centers.


Hypervisor-Based Migration

VM Migrations

Figure: VM migration between compute systems (compute system 1, compute system 2, migrated VMs, network, storage system)

In this type of migration, virtual machines (VMs) are moved from one physical
compute system to another without any downtime. VM migration method enables:

• Scheduled maintenance without any downtime.


• VM load balancing.


VM Storage Migration

Figure: VM storage migration between storage systems (compute system, network, two storage systems)

In a VM storage migration, VM files are moved from one storage system to another
system without any downtime or service disruption.

Key benefits of this type of migration are as follows:

• It simplifies array migration and storage upgrades.


• Dynamically optimizes storage I/O performance.
• Efficiently manages storage capacity.


Storage-Based Data Migration

Storage-based migration moves block-level and file-level data between
heterogeneous storage systems.

SAN-based Migration

Figure: SAN-based migration (control device on the control storage system, remote device on the remote storage system, push and pull operations)

• SAN-based migration moves block-level data between heterogeneous storage
systems over SAN.
• The storage system that performs the migration is called the control storage system.
• Data migration solutions perform push 7 and pull 8 operations for data movement.

7 Data is pushed from the control system to the remote system.

8 Data is pulled from the remote system to the control system.


NAS-based Migration

NAS-based migration

• NAS-based migration moves file-level data between NAS systems over LAN or
WAN.

In this example, the new NAS system initiates the migration operation and pulls the
data directly from the old NAS system over the LAN. The key advantage of NAS to
NAS direct data migration is that there is no need for an external component (host
or appliance) to perform or initiate the migration process.


Appliance-Based Data Migration

• A virtualization appliance facilitates the movement of files from an old NAS
system to a new NAS system.
• While the files are being moved, clients can access their files nondisruptively.
− Clients can also read their files from the old location and write them back to
the new location without realizing that the physical location has changed.
• The virtualization appliance creates a virtualization layer that eliminates the
dependencies between the data that is accessed at the file level and the
location where the files are physically stored.
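
The virtualization layer described above can be pictured as a lookup table between the logical path that clients use and the NAS system that currently holds the file. The sketch below is an assumption-laden illustration: the class FileVirtualizationAppliance, its methods, and the use of plain dictionaries as NAS systems are all hypothetical.

```python
from typing import Dict


class FileVirtualizationAppliance:
    """Hypothetical appliance mapping logical file paths to a physical NAS system."""

    def __init__(self, old_nas: Dict[str, bytes], new_nas: Dict[str, bytes]) -> None:
        self.systems = {"old": old_nas, "new": new_nas}
        # Every file starts on the old NAS system.
        self.location = {path: "old" for path in old_nas}

    def read(self, path: str) -> bytes:
        # Clients use the same path regardless of where the file physically lives.
        return self.systems[self.location[path]][path]

    def write(self, path: str, data: bytes) -> None:
        # Writes land on the new NAS system; any old copy is dropped.
        self.systems["old"].pop(path, None)
        self.systems["new"][path] = data
        self.location[path] = "new"

    def migrate(self, path: str) -> None:
        # Background move of a file that has not been rewritten yet.
        if self.location[path] == "old":
            self.systems["new"][path] = self.systems["old"].pop(path)
            self.location[path] = "new"


old_nas = {"/share/a.txt": b"a", "/share/b.txt": b"b"}
new_nas: Dict[str, bytes] = {}
appliance = FileVirtualizationAppliance(old_nas, new_nas)

before = appliance.read("/share/a.txt")
appliance.migrate("/share/a.txt")
after = appliance.read("/share/a.txt")      # same path, new physical location
appliance.write("/share/b.txt", b"b2")      # read from old, written back to new
```

Because clients only ever see the logical path, the physical move is invisible to them, which is what makes the migration nondisruptive.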




Knowledge Check

1. Which migration moves file-level data between file servers over LAN or WAN?
a. NAS-based
b. Byte-based
c. SAN-based
d. Block-based


Concepts in Practice

Dell EMC NetWorker

Dell EMC NetWorker is a backup and recovery solution for
mission-critical business applications in physical and
virtual environments for on-premises and cloud.

• Unified backup and recovery software for the
enterprise: deduplication, backup to disk and tape,
snapshots, replication, and NAS.
• NetWorker provides a robust cloud capability enabling
long-term retention to the cloud, backup to the cloud
and backup in the cloud.
• NetWorker Module for Databases and Applications (NMDA) provides a data
protection solution for DB2, Informix, MySQL, Oracle, SAP IQ, and Sybase ASE
data.

Dell EMC PowerProtect Data Manager

Dell EMC PowerProtect Data Manager provides software defined data protection,
automated discovery, deduplication, operational agility, self-service and IT
governance for physical, virtual and cloud environments.

PowerProtect Data Manager:

• Enables the protection, management, and recovery of data in on-premises,
virtualized, and cloud deployments, including protection of in-cloud workloads.
• Enables the protection of traditional workloads including Oracle, Exchange,
SQL, and file systems as well as Kubernetes containers and virtual
environments.
• Restores data on-premises or in the cloud. Governance control ensures IT
compliance, making even the strictest service level objectives obtainable.

Dell EMC Data Protection Advisor (DPA)

DPA is a reporting and analytics platform that provides full visibility into the
effectiveness of your data protection strategy. It can automate and centralize the
collection and analysis of all data.

Dell EMC Data Protection Advisor:

• Provides visibility across physical and virtual environments from a unified
dashboard.
• Provides real-time monitoring and alerting of protection software and storage.
• Allows automated discovery of data protection infrastructure.

Dell EMC PowerProtect DP Series Appliance

PowerProtect DP series appliances deliver powerful backup and recovery of all of an
organization’s data, wherever it lives, using a single appliance.

• PowerProtect Appliance is the next generation of Integrated Data Protection
Appliance (IDPA). It is all-in-one data protection software and storage in a
single appliance that delivers backup, replication, recovery, search, analytics,
and more.


• Features include:
− Systems can scale to Petabyte of usable capacity.
− Cloud long-term retention and cloud DR-ready.
− Provides VMware integration.
• PowerProtect Appliance supports native Cloud DR with end-to-end
orchestration.

− Allows enterprises to copy backed-up VMs from on-premises IDPA
environments to a public cloud.

Dell EMC PowerProtect DD Series Appliances

DD series enables organizations to protect, manage, and recover data at scale
across their diverse environments.

• Integrates easily with existing infrastructures,
enabling ease-of-use with leading backup and
archiving applications.
• Natively tiers deduplicated data to any supported
cloud environment for long-term retention with
Dell EMC Cloud Tier.
• Provides fast disaster recovery with orchestrated
DR and provides an efficient architecture to
extend on-premises data protection.

PowerProtect DD Virtual Edition (DDVE) leverages
the power of DDOS to deliver software-defined
protection storage on-premises and in-cloud.

Dell EMC TimeFinder SnapVX

TimeFinder SnapVX is a local replication solution for PowerMax, VMAX All Flash
storage systems with cloud scalable snaps and clones to protect data. SnapVX
solution:


• Provides space-efficient local snapshots that can be
used for localized protection and recovery and other
use cases including development, test, analytics,
backups, and updating.
• Secures snapshots to prevent accidental or malicious
deletion, securing them for a specified retention
period.
− The snapshots are made as efficient as possible by
sharing point-in-time tracks which are called
snapshot deltas.

Dell EMC SRDF

SRDF is Dell EMC’s remote replication technology for PowerMax. SRDF:

• Provides disaster recovery and data mobility solutions.


• Copies data between the sites independently without the host.
− There are no limits to the distance between the source and the target
copies.
• Enables storage systems to be in the same room, different buildings, or
hundreds to thousands of kilometers apart.
• Provides the ability to maintain multiple, host-independent, remotely mirrored
copies of data.

Dell EMC RecoverPoint

Dell EMC RecoverPoint provides continuous data protection for comprehensive
operational and disaster recovery.

• RecoverPoint for Virtual Machines is a hypervisor-based, software-only data
protection solution for virtual machines.
• RecoverPoint:

− Enables Continuous Data Protection for any point in time (PIT) recovery to
optimize RPO and RTO.
− Provides synchronous (sync) or asynchronous (async) replication policies.


− Reduces WAN bandwidth consumption and uses available bandwidth
optimally.

VMware vSphere HA and FT

• VMware vSphere High Availability (HA) leverages multiple ESXi hosts that are
configured as a cluster to provide rapid recovery from outages.
− Provides high availability for applications running in virtual machines.
− Protects against a server failure by restarting the virtual machines on other
hosts within the cluster.
− Protects against application failure by continuously monitoring a virtual
machine and resetting it if a failure is detected.
• VMware vSphere Fault Tolerance (FT) provides a higher level of availability.

− Enables users to protect any virtual machine from a host failure with no loss
of data, transactions, or connections.
− Provides continuous availability by ensuring that the states of the Primary
and Secondary VMs are identical at any point in time.
− If either the host running the Primary VM or the host running the Secondary
VM fails, an immediate and transparent failover occurs.

Dell EMC Cloud Tier

Dell EMC Cloud Tier provides a solution for long-term retention. Using advanced
deduplication technology that reduces storage footprints, unique data is sent to the
cloud and data lands on the cloud object storage already deduplicated. Cloud
tiering:

• Provides native cloud tiering with no external appliance or cloud gateway
required.
• Enables efficient transfer of data to and from the cloud, using less bandwidth
(source-side deduplication).

Dell EMC Avamar

• Dell EMC Avamar enables fast, efficient backup and recovery through its
integrated variable-length deduplication technology.


• Avamar is optimized for fast, daily full backups of physical and virtual
environments, NAS servers, enterprise applications, remote offices and
desktops/laptops.
• Dell EMC Avamar is proven backup and recovery software that delivers secure
data protection for cloud, remote offices, desktops, laptops, and data centers.



Exercise: Data Protection

Scenario

A major multinational bank runs business-critical
applications in a data center:

• They have multiple remote and branch offices
(ROBO) across different geographic locations.
• They use tape as the primary backup storage media
for backing up virtual machines (VMs) and application data.
• They use an agent-based backup solution for backing
up data.
• The bank has a file-sharing environment in which
multiple NAS systems serve all the users.
− The data is backed up from application servers to backup device.
• Approximately 25% of data in the production environment is inactive data (fixed
content).
• The bank has two data centers which are 1000 miles apart.

Challenges

• Backup operations consume resources on the compute systems that are
running multiple VMs.
− This approach impacts the applications that are deployed on the VMs.
• Recovering data or VMs also takes more time.
• Backup environment has a huge amount of redundant data.
− It increases the infrastructure cost and impacts the backup window.
• Branch offices also have limited IT resources for managing backup.



− Backing up data from branch offices to a centralized data center involves
sending huge volumes of data over the WAN. It increases the cost of
deployment.
• The organization incurs a huge investment and operational expense in managing an
offsite backup infrastructure at a remote site.

Requirements

• Need faster backup and restore to meet the SLAs.


• Need to eliminate redundant copies of data.
• Need an effective solution to address the backup and recovery challenges of
remote and branch offices.
• Need to offload the backup workload from the compute system to avoid
performance impact to applications.
• The organization requires a strategy to eliminate backing up fixed content from
the production environment.
• The organization requires a solution to reduce the management overhead and
the investment cost in managing the offsite backup copy.
• The organization requires a remote replication solution for DR that should not
impact the response time of the application.

Deliverables

• Recommend solutions that meet the organization's requirements.

Solutions

• Implement a disk-based backup solution to improve the backup and recovery
performance for meeting SLAs.
• Implement a deduplication solution to eliminate the redundant copies of data.
• The organization can use disk-based backup solutions along with source-based
deduplication.



− Eliminate the challenges that are associated with centrally backing up
remote office data.
− Deduplication considerably reduces the required network bandwidth.
• Implement image-based backup that helps to offload backup operation from
VMs to a proxy server.
− No backup agent is required inside the VM to back it up.
• Organization can implement data archiving solutions that archive fixed content
from the production environment.
− Reduce the amount of data to be backed up.
• Organization can choose backup as a service to replicate the offsite backup
copy to the cloud.
− It saves CAPEX and reduces the management overhead to the organization.
• To meet the DR requirement, the organization can implement asynchronous
remote replication.

− It provides finite RPO and does not impact response time.
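
Why asynchronous remote replication leaves application response time unaffected but has a finite (nonzero) RPO can be seen in a small sketch. The class AsyncReplicatedVolume and its queue-based transfer are assumptions for illustration only, not a description of a specific replication product.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class AsyncReplicatedVolume:
    """Hypothetical volume that acknowledges writes before they reach the remote site."""

    def __init__(self) -> None:
        self.source: Dict[int, bytes] = {}
        self.target: Dict[int, bytes] = {}
        self.pending: Deque[Tuple[int, bytes]] = deque()

    def write(self, block: int, data: bytes) -> str:
        self.source[block] = data
        self.pending.append((block, data))  # queued for later transfer
        return "ack"                        # the application does not wait for the remote site

    def replicate(self, max_writes: int) -> int:
        """Send up to max_writes queued writes to the target, in order."""
        sent = 0
        while self.pending and sent < max_writes:
            block, data = self.pending.popleft()
            self.target[block] = data
            sent += 1
        return sent

    def unreplicated_writes(self) -> int:
        """Writes that would be lost if the source site failed right now."""
        return len(self.pending)


volume = AsyncReplicatedVolume()
for block, data in enumerate([b"a", b"b", b"c"]):
    volume.write(block, data)
lag_before = volume.unreplicated_writes()
sent = volume.replicate(max_writes=2)
lag_after = volume.unreplicated_writes()
```

The queued writes represent the exposure behind the RPO: the longer the link lags behind the source, the more recent data is at risk, which is the price paid for keeping the 1000-mile link out of the write path.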
