Dell EMC PowerProtect Data Manager: Dynamic NAS Protection
October 2022
H18883.2
White Paper
Abstract
This white paper describes how Dell PowerProtect Data Manager
protects NAS storage arrays and generic NAS shares using a dynamic
NAS-protection solution.
Dell Technologies
Copyright
The information in this publication is provided as is. Dell Inc. makes no representations or warranties of any kind with respect
to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular
purpose.
Use, copying, and distribution of any software described in this publication requires an applicable software license.
Copyright © 2021-2022 Dell Inc. or its subsidiaries. All Rights Reserved. Dell Technologies, Dell, EMC, Dell EMC and other
trademarks are trademarks of Dell Inc. or its subsidiaries. Intel, the Intel logo, the Intel Inside logo and Xeon are trademarks
of Intel Corporation in the U.S. and/or other countries. Other trademarks may be trademarks of their respective owners.
Published in the USA October 2022 H18883.2.
Dell Inc. believes the information in this document is accurate as of its publication date. The information is subject to change
without notice.
Contents
Executive summary
Introduction
References
Executive summary
Overview
Network attached storage (NAS) is an IP-based file-sharing storage device that is
attached to a local area network (LAN). NAS can serve various clients and servers over
an IP network. A NAS device uses its own operating system and integrated hardware and
software to meet a range of file-service needs.
NAS is widely used for its simplicity, ease of use, and outstanding performance. That
simplicity brings data protection challenges. For years, NAS and backup vendors have
used the Network Data Management Protocol (NDMP) to protect NAS data. NDMP has its
own limitations, such as manual slicing of a NAS share to achieve multi-stream backup,
limited parallel streams, and periodic full backups. Customers also struggle to protect
their growing amounts of data and to back up this data within their specified
backup windows.
PowerProtect Data Manager for NAS protection addresses today’s customer challenges
of protecting evolving NAS environments. Unlike NDMP-based solutions, dynamic NAS
protection is a NAS-vendor-agnostic solution. With dynamic NAS protection, customers
can overcome the limitations of NDMP.
Protecting NAS assets with Data Manager is a non-NDMP solution. Dynamic NAS
protection uses the NAS Protection Engine for backup and recovery orchestration. This
solution is easy to use, and provides automatic discovery, orchestration, and
management through the Data Manager UI. With its snapshot technology and intelligent
slicing, Data Manager protects NAS data efficiently within the required backup window.
This solution addresses some of the challenges of NAS protection with the
following capabilities:
• Vendor-agnostic solution for NAS protection
• Forever-incremental backups with no periodic full backups
• High number of parallel streams and multiple virtual containers to address scale
and performance
• Index, search, and restore
• Restore to any NAS device, such as an NFS or CIFS share
Audience
This white paper is intended for Dell Technologies customers, partners, and employees
looking to protect NAS storage arrays using Data Manager.
We value your feedback
Dell Technologies and the authors of this document welcome your feedback on this
document. Contact the Dell Technologies team by email.
Note: For links to other documentation for this topic, see the PowerProtect Data Manager Info
Hub.
Introduction
PowerProtect Data Manager for NAS overview
PowerProtect Data Manager for NAS protection is a software-only solution that supports
centralized backup and recovery for NAS assets. Dynamic NAS protection provides a
non-NDMP, crawl-and-backup solution in which the NAS Protection Engine internally uses
the Filesystem Agent (FSA) file-based backup (FBB) technology. Data Manager for NAS
protection supports multi-stream backup and restore. With centralized support, Data
Manager controls and manages end-to-end backup and recovery operations.
Data Manager for NAS protection supports all the Data Manager objectives such as DD
Replication, Cloud Tier, progress monitoring, and SLA compliance.
The dynamic NAS solution supports protection for Dell PowerStore, Dell Unity, and Dell
PowerScale (Isilon) NAS products. Through generic NAS support, it also protects any NFS
or CIFS share from other sources such as NetApp, Windows, and Linux file servers.
Note: Supported hardware or software platforms may be updated in subsequent releases. See the
support matrix at https://elabnavigator.dell.com/eln/modernHomeDataProtection for the latest
product information.
Data Manager for NAS protection supports the following restore use cases:
• Share-level restore
• Restore to any device, NFS, or CIFS
• Restore to original and alternate NAS shares
• File-Level Recovery (FLR): NAS backups are indexed on the Search Engine for
search and restore operations.
Note: See the PowerProtect Data Manager for Network Attached Storage User Guide
for details on how to review protection logs.
Architecture overview
The following high-level architecture describes the dynamic NAS asset backup and
recovery solution with Data Manager.
The NAS array is the primary storage location for the NAS data. Data is read from the
array and sent to the secondary storage location, which is a PowerProtect DD series
appliance. Data Manager protects the NAS assets using specialized NAS Protection
Engines.
The NAS Protection Engine is used as a data mover for backup and recovery.
Containerized NAS agents run on the NAS Protection Engine to support multiple NAS
protection operations. Each NAS container is preinstalled with the NAS agent and the FSA
agent. The NAS agent uses the FSA binaries for backup and recovery and orchestrates
them to run with multiple threads or streams to achieve optimal scale and
performance. Once the backup is completed, the Search Cluster creates indexes for the
NAS backup and supports search and FLR.
NAS discovery plug-in: The discovery plug-in enables automated discovery of supported
Dell Technologies NAS appliances as assets and stores the asset information to the
Elasticsearch database.
Virtual Proxy Orchestrator Daemon (vpod): NASDM uses vpod to orchestrate NAS
asset protection using NAS Protection Engines. NASDM integrates with the vpod
component for backup and recovery.
NAS Container: Docker container running on the Protection Engine for NAS data protection.
• Docker-based container for NAS workloads on the Protection Engine
• Runs on demand on the Protection Engine and is destroyed after the backup or
recovery job completes
• Each job runs in a separate NAS container
• Multiple containers can run simultaneously
• The NAS container is packaged with the NAS Protection Engine
NAS Agent: Backup and recovery agent for NAS shares.
• Manages NAS asset snapshots (create, delete, and mount)
• Uses the intelligent share slicer to create slices of a NAS asset for parallel backups
• Performs multi-stream backup to achieve optimal scale and performance
• Manages Filesystem Agent for backup and recovery
• Manages NAS metadata records for each NAS asset
• Periodically collects backup and recovery progress
FSA Agent: Dynamic NAS Protection uses FSA-FBB as the data mover for NAS data
protection. The FBB method is a crawl-and-backup approach: it crawls the
given file system and backs up files and metadata.
• Packaged with the NAS agent and installed as part of NAS container deployment on
the Protection Engine
• Moves the NAS data to PowerProtect DD series during backup and from
PowerProtect DD series during recovery
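The crawl-and-backup idea can be illustrated with a minimal Python sketch. This is not the FSA implementation: the function name `crawl` and the record fields are hypothetical, and the sketch only walks a locally mounted path and collects per-file metadata. A real data mover would also read file contents and send them to protection storage.

```python
import os
from typing import Dict, Iterator


def crawl(root: str) -> Iterator[Dict[str, object]]:
    """Walk a mounted file system and yield one metadata record per file.

    Hypothetical illustration of the crawl phase only; no data is moved.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                # Entries that cannot be read are skipped in this sketch.
                continue
            yield {
                "path": os.path.relpath(path, root),  # path relative to the share root
                "size": info.st_size,                 # size in bytes
                "mtime": info.st_mtime,               # last modification time
            }
```

Calling `list(crawl("/mnt/share"))` on a mounted share (the mount point is an assumed example) returns one record per file that could be read.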
Search cluster
The search cluster consists of one or more Search Engine nodes to index the NAS
backup data.
• The PowerProtect Data Manager search cluster is a multi-node Search Engine (up to
five nodes)
Note: Each search node can index up to 1 billion files. If no virtual machine
backups are configured and only NAS shares are being protected, each node can index up to 1
billion files across one or more shares.
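As a rough illustration of the limits above, the following sketch estimates how many search nodes a given file count would need. It assumes that only NAS backups are indexed and that files spread evenly across nodes; the helper name `search_nodes_needed` is hypothetical and is not part of Data Manager.

```python
FILES_PER_NODE = 1_000_000_000  # per-node index capacity stated in the note
MAX_NODES = 5                   # maximum nodes in a search cluster


def search_nodes_needed(total_files: int) -> int:
    """Return the node count needed to index total_files (hypothetical helper)."""
    if total_files < 0:
        raise ValueError("total_files must be non-negative")
    # Integer ceiling division avoids floating-point rounding.
    nodes = max(1, -(-total_files // FILES_PER_NODE))
    if nodes > MAX_NODES:
        raise ValueError("file count exceeds the capacity of a five-node cluster")
    return nodes
```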
Intelligent auto slicer for NAS protection
The NAS file-share auto slicer is a new library that is embedded in the Data Manager NAS
agent. The slicer splits a NAS asset (a NAS share or a file system) into multiple sub-assets
in preparation for multi-stream data movement to PowerProtect DD series. Slices are
created using parallel threads, and each slice is backed up concurrently using available
NAS Protection Engine containers and moved to a PowerProtect DD series appliance.
The slicer partitions NAS assets dynamically before each backup. Based on backup
history and changes in the content of the NAS asset being sliced, relevant slices are
added, removed, or rebalanced. Periodically, unbalanced trees are automatically
managed as content changes over time. No manual reconfiguration is required. The
default slice size is 200 GB or 1 million files, with a tolerance of 30%.
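To make the slice thresholds concrete, here is a minimal sketch that greedily groups directories into slices using the 200 GB and 1 million file limits with the 30% tolerance. It is not the Data Manager slicer, which also uses backup history and rebalancing. The function name, the input format, and the use of decimal gigabytes are assumptions.

```python
from typing import List, Tuple

SLICE_BYTES = 200 * 10**9   # 200 GB threshold (decimal GB assumed)
SLICE_FILES = 1_000_000     # 1 million files threshold
TOLERANCE = 0.30            # slices may exceed a threshold by up to 30%

Entry = Tuple[str, int, int]  # (directory name, size in bytes, file count)


def pack_slices(entries: List[Entry]) -> List[List[str]]:
    """Greedily group directories into slices that stay within the tolerated limits."""
    max_bytes = SLICE_BYTES * (1 + TOLERANCE)
    max_files = SLICE_FILES * (1 + TOLERANCE)
    slices: List[List[str]] = []
    current: List[str] = []
    cur_bytes = 0
    cur_files = 0
    for name, size, files in entries:
        # Close the current slice when adding this directory would exceed a limit.
        if current and (cur_bytes + size > max_bytes or cur_files + files > max_files):
            slices.append(current)
            current, cur_bytes, cur_files = [], 0, 0
        current.append(name)
        cur_bytes += size
        cur_files += files
    if current:
        slices.append(current)
    return slices
```

In this sketch a single directory larger than the limit still becomes one slice; a real slicer would descend into it and split it further.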
Auto distribution of backup streams
The dynamic NAS solution enables automated load balancing of Protection Engine hosts and
automatic scaling of containers to achieve maximum backup streams and reduce manual
management overhead.
Sizing recommendations
During backup, each asset is divided into smaller slices based on the threshold values
described below. Each slice is then serviced by an individual stream.
• Number of slices = (asset size) / (slice size). The threshold slice size is 200 GB
and/or a file count of 1 million, with a tolerance of 30%.
• When deciding the number of Protection Engines, it is recommended to apply a factor
of 1.2-1.5x to the above slice count.
• Each Protection Engine supports up to 24 concurrent streams.
The following are the recommended guidelines for Protection Engine count to achieve
optimized throughput.
▪ Number of Protection Engines = (number of slices) / 24, where 24 is the total count
of streams per Protection Engine, with each container serving eight of those
streams.
▪ With the current Data Manager v19.9 release, the recommendation is to scale
up to 11 Protection Engines for larger shares (for example 50 TB or larger).
• To achieve optimum performance, it is recommended to use a dedicated 10 GbE
network per Protection Engine, because the Protection Engine throughput is bounded
by the underlying network stack on the ESXi host.
• Multiple NAS Protection Engines with a dedicated 10 GbE network can achieve
better net aggregated throughput. This includes reading the data from NAS array
and writing it to protection storage.
• If the whole environment is 10 GbE network (NAS array, PowerProtect DD series
and multiple Protection Engines), the overall throughput is bound by 10 GbE
network speed.
• If the read throughput from the NAS array and write throughput to PowerProtect DD
series causes a bottleneck with multiple Protection Engines, it is recommended to
have more network ports on NAS array and on PowerProtect DD series.
• The Asset Level Parallelism setting helps to load balance the number of streams
across multiple shares. It enables all asset backups to run in parallel, with each
asset using as many concurrent streams as the user specifies for asset
parallelism. If enough containers are available, all these assets run
in parallel. With the Data Manager v19.9 release, the maximum supported value of
the Asset Level Parallelism parameter is 256 concurrent streams per asset.
• The sizing tool created by Dell Technologies can also be used to determine the
number of protection engines that must be deployed for a certain protection load.
Contact Dell Technologies support to download and use a recent version of the
tool. Editing the yellow highlighted sections in the tool gives the approximate
number of proxy engines that need to be deployed without any manual
calculation. Some of the inputs required from the customer are:
▪ Array type (PowerScale, Dell Unity, PowerStore, or generic)
▪ Total amount of NAS data to be protected in TB
▪ Total number of files in millions
▪ Expected backup duration for full (Gen 0) backup
▪ Network parameters such as the number of array nodes and ports, PowerProtect DD
series ports, and proxy ESXi ports involved in the backup. This data is
used to account for any bottlenecks in the proxy engine calculation.
▪ Expected backup duration for synthetic full or incremental (Gen 1) backup and the
approximate change rate expected between backups.
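The slice and Protection Engine formulas in the guidelines above can be worked through with a short sketch. It is an illustration under stated assumptions, not the Dell sizing tool: the function names are hypothetical, decimal units are used, the slice count is taken from whichever threshold (size or file count) yields more slices, and the 1.2-1.5x factor is read as a multiplier on the slice count.

```python
import math

SLICE_BYTES = 200 * 10**9   # 200 GB slice threshold (decimal GB assumed)
SLICE_FILES = 1_000_000     # 1 million files per slice
STREAMS_PER_ENGINE = 24     # concurrent streams per Protection Engine


def estimate_slices(total_bytes: int, total_files: int) -> int:
    """Slice count from whichever threshold (size or file count) yields more slices."""
    by_size = math.ceil(total_bytes / SLICE_BYTES)
    by_files = math.ceil(total_files / SLICE_FILES)
    return max(1, by_size, by_files)


def estimate_engines(slices: int, factor: float = 1.0) -> int:
    """Protection Engine count: slices / 24, optionally scaled by a headroom factor."""
    return max(1, math.ceil(slices * factor / STREAMS_PER_ENGINE))


def line_rate_tb_per_hour(gbps: float) -> float:
    """Upper bound on throughput for a link speed in Gb/s, ignoring protocol overhead."""
    return gbps / 8 * 3600 / 1000
```

For example, a 50 TB share with 100 million files gives `estimate_slices(50 * 10**12, 100_000_000)` = 250 slices and `estimate_engines(250)` = 11 Protection Engines, and a single 10 GbE link is capped at `line_rate_tb_per_hour(10)` = 4.5 TB/hr before any overhead.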
Steps for NAS protection
Complete the following tasks to protect NAS assets.
Note: See the PowerProtect Data Manager for Network Attached Storage User Guide or release
notes for details about prerequisites and initial configuration settings.
Enabling the NAS asset source
Enabling the asset source in Data Manager allows you to add and register the asset
source for the protection of NAS assets. From the Data Manager UI, the NAS asset
source can be enabled from the New Asset Source section, as shown below.
Adding a NAS appliance to Data Manager
For supported appliance types, the NAS appliance can be added as an asset source for
Data Manager to automatically discover any assets to protect.
Field     Description
Address   Enter the FQDN or IP address for the appliance management interface.
Port      Enter the port number for HTTPS REST API access to the appliance.
Dell PowerScale SmartConnect and multiple access zone support
Data Manager 19.12 has added the following features to support Dell PowerScale
SmartConnect and multiple access zones:
• PowerScale SmartConnect names can be added as an asset source without
duplication of assets during discovery. This makes it possible to leverage
SmartConnect for client connection load balancing, and for dynamic NFS failover and
failback of client connections across storage nodes, to provide optimal utilization
of the cluster resources.
• SmartConnect access zones are used as a data path where backups and restores
execute over the network, mapped to the zones of which the asset is part.
Note: See the PowerScale OneFS Web Administration Guide for more information about
SmartConnect and non-system access zones.
Adding a NAS share to Data Manager
For appliances where Data Manager does not support automatic discovery, the NAS
share can be added as an asset source.
Protocol   Syntax                                Default Port Numbers
NFS        <NAS>:/<share-path-and-name>          2049
           <NAS>:<port>/<share-path-and-name>
Note: <NAS> can be either the fully qualified domain name or IP address for the NAS. Verify and
use the user-defined port numbers, if any.
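The two NFS syntaxes above can be illustrated with a small parser. This is a sketch only: `parse_nfs_share` and `NfsShare` are hypothetical names, not part of Data Manager, and the sketch does not handle IPv6 literals, which contain colons.

```python
from typing import NamedTuple

DEFAULT_NFS_PORT = 2049  # default NFS port from the table above


class NfsShare(NamedTuple):
    host: str
    port: int
    path: str


def parse_nfs_share(spec: str) -> NfsShare:
    """Parse '<NAS>:/<share>' or '<NAS>:<port>/<share>' (hypothetical helper)."""
    host, sep, rest = spec.partition(":")
    if not sep or not host or "/" not in rest:
        raise ValueError(f"not a valid NFS share specification: {spec!r}")
    # Anything between ':' and the first '/' is the optional port.
    port_text, _, share = rest.partition("/")
    port = int(port_text) if port_text else DEFAULT_NFS_PORT
    return NfsShare(host, port, "/" + share)
```

With this sketch, a specification without a port falls back to 2049, while a user-defined port in the second syntax is returned as given.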
See the Dell PowerProtect Data Manager for Network Attached Storage User Guide for
more detailed steps about adding the NAS appliance or share to Data Manager.
Note: The discovery status of generic NAS asset sources is displayed as Unknown.
Notes:
• For Generic NAS assets, the size of the share is shown as 0 bytes during the initial
discovery. However, the size of the share is determined and displayed after the first
successful backup.
• In case of multi-protocol shares in supported Dell Technologies appliances, Data
Manager displays two entries of the same asset with different protocols
(CIFS/NFS).
Deploying a Protection Engine for NAS asset protection
The NAS Protection Engine is deployed on the selected VMware vCenter, and Data
Manager registers the Protection Engine. The NAS Protection Engine hosts the NAS
agent and FSA.
Enter a new value (1 to 256) for Maximum Streams and click Save. The default value is 8
streams.
Centralized protection policy for NAS assets
Data Manager supports centralized protection for NAS assets, where all the stages of the
protection policy are managed by Data Manager.
Note: Data Manager uses these credentials at the policy level for all shares unless otherwise
specified at the asset level. The credentials provide snapshot creation and export permissions on
the appliance and read/write access to the NAS shares.
If credentials are set at the protection policy level, all shares must be accessible with
the same credentials. If credentials are instead set at the asset level,
each asset uses its own credentials.
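The precedence described here, where asset-level credentials override the policy-level credentials, can be sketched as a simple lookup. The function and parameter names are hypothetical and do not reflect the Data Manager API.

```python
from typing import Dict, Optional


def resolve_credentials(
    asset: str,
    policy_credentials: Optional[str],
    asset_credentials: Dict[str, str],
) -> str:
    """Return the credential name to use for an asset.

    Asset-level credentials take precedence; otherwise the policy-level
    credentials apply. All names here are hypothetical.
    """
    if asset in asset_credentials:
        return asset_credentials[asset]
    if policy_credentials is not None:
        return policy_credentials
    raise LookupError(f"no credentials configured for asset {asset!r}")
```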
See the Dell PowerProtect Data Manager for Network Attached Storage User Guide for
more detailed steps about creating a protection policy for NAS protection.
With Data Manager 19.12, a backup can continue even if a data access denied or
ACL access denied failure is encountered on files. These flags can be set as
shown in the following figure.
Manual protection of NAS backup
When performing the NAS backup using the Protect Now option, Data Manager provides
Synthetic Full or Full as the backup selection type.
The following screen shows the protection job details from Data Manager UI for the NAS
protection job in progress.
The following screen shows the job summary details from Data Manager UI for the NAS
backup in progress.
The following screen shows the protection job details from Data Manager UI for the
successful NAS backup.
The following shows the job summary details from Data Manager UI for the successful
NAS backup.
Backup completed with exceptions
From Data Manager 19.12, the count of skipped ACL and data elements is
captured in the job summary, as shown in the following figure.
For index-enabled backups, the user can see all skipped elements in the UI by using the
Show Skipped files button.
The existing “Export Log” functionality is used to download the skipped element list in CSV
format.
slicer to create slices of NAS data. The file system agent moves the data slices in parallel
to the PowerProtect DD series appliance. NASDM initiates the indexing once the backup
is complete.
The following shows the options to restore and overwrite the original share or restore to
an alternate share or array.
NAS file-level restore using File Search
The NAS protection solution provides File Index, Search, and FLR, which allow you to search
for files and folders across the entire NAS backup. When the Search Engine is deployed and
the NAS protection policy is enabled with indexing, you can use the File Search option to
restore individual files and folders from one or more NAS backups.
From Data Manager 19.12, the File Search option can also be used to search for skipped
files that have not been backed up.
Note: Indexing must be enabled for the File Search option to become available.
File versions: Select how Data Manager should distinguish files from different backups, if
the selected files and folders exist in multiple backups of the same NAS asset.
The following shows selecting the destination restore location options for FLR.
Option                                                Description
Restore and Overwrite the Original Files and Folders  The restore operation overwrites any files at the original location with the same names.
Restore to an Alternate Share or Array                A table of available shares appears. Complete the following substeps.
Note: From PowerProtect Data Manager 19.11, an FLR restore to the original location allows
you to overwrite the existing files on the export or share without creating the entire path
hierarchy of the restored file or folder.
For alternate share restores, files and folders can now be recovered directly at the root level
of the alternate asset, share, or export. This helps to retain the complete folder hierarchy
when performing the FLR action.
Performance Results
Disclaimer: Backups were performed using PowerScale storage as the source and PowerProtect
DD series as the backup target with PowerProtect Data Manager. These results were derived
from Dell Technologies internal testing performed under varying conditions.
File systems come in a wide range of topologies. This section looks at the
average write throughput achieved when performing backups of file systems with a
balanced directory structure and with an unbalanced directory structure. Data Manager
uses auto slicing and intelligent scaling through proxies to support faster multi-stream
backups irrespective of the file system type.
Comparable results are seen with an unbalanced directory structure as well (millions of
files ranging from 32 KB to 4 GB in size, distributed across multiple directories). During an
initial full backup to PowerProtect DD series, write throughput is 2 TB/hr, and the
incremental backup (with 4% change in data) shows an increased throughput of 3 TB/hr.
For these tests, multiple proxy engines were used to support the workload, and the asset
level parallelism was set to 256. To learn more about calculating the number of proxies
required and setting protection engine parameters, see the section Sizing
recommendations.
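As a rough planning aid, a throughput figure such as the 2 TB/hr reported above can be turned into an estimated backup duration. The helper below is a hypothetical sketch; actual throughput depends on the environment and the test conditions, so the result is only an approximation.

```python
def backup_hours(data_tb: float, throughput_tb_per_hr: float) -> float:
    """Estimate backup duration in hours from data size and sustained throughput."""
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    if throughput_tb_per_hr <= 0:
        raise ValueError("throughput_tb_per_hr must be positive")
    return data_tb / throughput_tb_per_hr
```

For example, `backup_hours(50, 2)` estimates 25 hours for an initial full backup of a 50 TB share at a sustained 2 TB/hr.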
References
Dell Technologies documentation
The following Dell Technologies documentation provides other information related to this
document. Access to these documents depends on your login credentials. If you do not
have access to a document, contact your Dell Technologies representative.
• The Data Protection Info Hub
For PowerProtect Data Manager:
• Dell PowerProtect Data Manager for Network Attached Storage User Guide
• Dell PowerProtect Data Manager Administration and User Guide
• Dell PowerProtect Data Manager Deployment Guide
• Dell PowerProtect Data Manager Release Notes
For Dell PowerProtect DD series appliances:
• Dell DDOS Administration Guide