PowerScale - Upgrade Planning and Process Guide
April 2024
Notes, cautions, and warnings
NOTE: A NOTE indicates important information that helps you make better use of your product.
CAUTION: A CAUTION indicates either potential damage to hardware or loss of data and tells you how to avoid
the problem.
WARNING: A WARNING indicates a potential for property damage, personal injury, or death.
© 2013 - 2024 Dell Inc. or its subsidiaries. All rights reserved. Dell Technologies, Dell, and other trademarks are trademarks of Dell Inc. or its
subsidiaries. Other trademarks may be trademarks of their respective owners.
Contents
HealthCheck tool
Backup data
    SyncIQ backup
    NDMP backup
    Back up custom settings
Complete or stop jobs in progress
    Complete system jobs
Update Node Firmware Package
Update Drive Support Package
Configure IPMI ports
Disable SupportAssist or Secure Remote Services (SRS)
Chapter 6: Troubleshooting your upgrade
    Troubleshooting an upgrade
Chapter 1: Introduction to this guide
Topics:
• About this guide
• Where to get help
NOTE: For upgrades from OneFS 9.0.0.0 and earlier, contact your Dell Technologies account team.
Planning an upgrade 7
Review required documentation
Reviewing the documentation in this list helps you to understand the upgrade process and the impact the upgrade could have on
your workflow. Some links require a support login.
Required documentation
See the PowerScale OneFS InfoHub for the version of OneFS that corresponds with your deployment.
● PowerScale OneFS Supportability and Compatibility Guide
Confirm that your PowerScale software and PowerScale hardware are compatible with the version of OneFS to which you are
upgrading. Check on releases, release dates, and version status.
● PowerScale OneFS Release Notes
Read the OneFS release notes for each version between your current version and your target version for information about
new features, changes, resolved issues, and known issues.
● PowerScale OneFS Current Patches
Review patches that have been released for the version of OneFS to which you are upgrading.
Other documentation
● OneFS Technical and Security Advisories
Determine whether any PowerScale Technical Advisories or Security Advisories have been issued for the version of OneFS
to which you are upgrading.
● For tips and tricks from the services team, see KB article 200890.
Parallel upgrades
Dell Technologies recommends using the parallel upgrade option when upgrading any size cluster running OneFS 8.2.2 and later.
Parallel upgrades require a smaller maintenance window than rolling upgrades and, unlike simultaneous upgrades, do not
require an interruption of service.
A parallel upgrade installs the new operating system on a subset of nodes and restarts that subset of nodes at the same time.
Each subset of nodes attempts to make a reservation for its turn to upgrade until all nodes are upgraded. Node subsets and
reservations are based on diskpool and node availability.
During a parallel upgrade, node subsets that are not being upgraded remain online and can continue serving clients. However,
clients that are connected to a restarting node are disconnected and reconnected. How the client connection behaves when
a node is restarted depends on several factors including client type, client configuration (mount type, timeout settings), IP
allocation method, and how the client connected to the cluster. In OneFS 9.2.0.0 and later, client connection behavior is
managed by the disruption manager settings.
Rolling upgrades
A rolling upgrade installs the new operating system and restarts each node individually in the OneFS cluster so that only one
node is offline at a time. A rolling upgrade takes longer to complete than a simultaneous upgrade. During a rolling upgrade, nodes
that are not currently being upgraded remain online and can continue serving clients. However, clients that are connected to
a restarting node are disconnected and reconnected. How the client connection behaves when a node is restarted depends
on several factors including client type, client configuration (mount type, timeout settings), IP allocation method, and how the
client connected to the cluster.
Simultaneous upgrades
A simultaneous upgrade installs the new operating system and restarts all nodes in the OneFS cluster at the same time.
Simultaneous upgrades are faster than rolling upgrades but require a temporary interruption of service during the upgrade
process. All client connections to the cluster must be terminated before completing the upgrade, and data is inaccessible until
the installation of the new OneFS operating system is complete and the cluster is back online.
isi_for_array uname -r
Find your version of OneFS in the supported OneFS Upgrade Paths matrix and confirm which versions of OneFS are supported
upgrade paths.
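Before consulting the matrix, it can help to confirm that every node reports the same release. The following is a minimal sketch, not part of the documented procedure. It assumes that each output line has the form `<node>: <release>`; the function name distinct_releases, the node names, and the release strings in the sample are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "<node>: <release>" lines on stdin (an assumed
# output format) and prints each distinct release once, sorted.
distinct_releases() {
    awk -F': *' 'NF >= 2 { print $2 }' | sort -u
}

# Illustrative sample only; on a cluster you would pipe in the output of
#   isi_for_array uname -r
sample='node-1: 9.4.0.0
node-2: 9.4.0.0
node-3: 9.2.1.0'

count=$(printf '%s\n' "$sample" | distinct_releases | wc -l | tr -d ' ')
if [ "$count" -eq 1 ]; then
    echo "all nodes report the same release"
else
    echo "nodes report $count different releases"
fi
```

If more than one release is reported, resolve the mismatch before choosing an upgrade path.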
Upgrade Paths for supported versions of OneFS
The following table can be used to determine the supported upgrade path from the current version of OneFS on your cluster to
a target version of OneFS. The upgrade paths table displays the supported upgrade paths from major OneFS version to major
OneFS version, but does not include patch information.
Key:
● "O" : Rolling and Simultaneous upgrades available
● "=" : Parallel, Rolling, and Simultaneous upgrades available
● "x" : Unsupported upgrade path
● "–": For upgrades from OneFS 9.0.0 and earlier, contact your Dell Technologies account team.
NOTE: These supported upgrade paths do not guarantee bug fix and feature parity. For more information about using
patches to achieve bug fix and feature parity, contact your account team.
NOTE: If you are upgrading from an earlier version of OneFS than is listed in the upgrade matrix above, contact your
account team for assistance.
Consider upgrade limitations
If the upgrade cannot be completed for any reason—for example, if there is insufficient space on the cluster or if the upgrade
process detects a stalled drive—the system will revert to the current version and the upgrade will be cancelled. Preparing your
cluster as recommended in this guide will help you to avoid situations that might result in a cancelled upgrade.
NOTE: In OneFS 8.2.0 and later, you can pause and resume the upgrade process to resolve blocking issues.
Upgrade maintenance window
The maintenance window should encompass the pre-upgrade, upgrade, and post-upgrade phases.
Estimate the time that it takes to run the upgrade considering cluster size and upgrade type (parallel, rolling, or
simultaneous). Schedule time to inform users when the upgrade will take place and that client connections might be slow,
file access might be affected, and clients might be disconnected. A best practice is to upgrade the cluster during an
off-hours maintenance window.
Schedule time for node and drive health checks and replacement of bad hardware. Include time to update configurations
and settings that are not supported in the new version.
Estimate the time that it takes to back up your data, considering cluster size, number of files, types of files, and file
size. Also include time to collect information about the cluster such as status, logs, and settings.
If performing a parallel or rolling upgrade, consider whether you will configure client connection drain times, which will
extend the required maintenance window, but lower the impact on client connections.
Build in time to let the upgrade jobs run to completion and to reestablish permissions and connections.
Schedule time or extend the maintenance window to accommodate post-upgrade tasks such as reconfiguring custom
settings, updating scripts to reflect command and functionality changes in the upgrade version, and potential
troubleshooting.

(Optional) Upgrade a test cluster
If available, upgrading a test cluster with the same current version of OneFS before you upgrade your production cluster
can expose issues that could slow down or prevent the upgrade of your production system.
After you upgrade a test cluster, verify that the cluster is operational and validate key workflows on the test cluster by
simulating how administrators, users, and applications interact with the system.
Chapter 3: Completing pre-upgrade tasks
Topics:
• Pre-upgrade process - Overview
• Collect cluster information
• Check cluster readiness
• Verify configurations and settings
• Download the OneFS installation bundle file
• Upgrade compatibility check utility
• On-Cluster Analysis tool
• HealthCheck tool
• Backup data
• Complete or stop jobs in progress
• Update Node Firmware Package
• Update Drive Support Package
• Configure IPMI ports
• Disable SupportAssist or Secure Remote Services (SRS)
Pre-upgrade - Summary
The following is a summary of steps to perform during the pre-upgrade phase:
1. Collect cluster information.
a. Collect cluster status.
b. Gather cluster logs.
c. Check cluster hardware health.
d. Check cluster available space.
2. Resolve events and errors.
3. Verify cluster configuration.
a. Preserve the Kerberos keytab file.
b. Install DataIQ or InsightIQ.
4. Download the OneFS installation file.
5. Run the upgrade compatibility check utility.
a. Reconfigure unsupported SMB settings.
6. Run the On-Cluster Analysis tool.
7. Run the HealthCheck tool.
8. Back up cluster data.
9. Complete system jobs.
10. Update Node Firmware Package.
11. Update Drive Support Package.
12. Configure IPMI ports.
13. Disable SupportAssist.
isi_gather_info
To gather the log files in OneFS 9.1.0.0 and later, run the following command:
The files generated during the log gathering process are stored on the cluster in the /ifs/data/Isilon_Support/pkg
directory.
isi status -v
2. Check for drives that do not report a status of HEALTHY, L3, or JOURNAL.
NOTE: If a drive is degraded, do not continue with the upgrade until the issue is resolved.
4. If the cluster has an InfiniBand network, confirm whether a node has been assigned the OpenSM (subnet manager) main
role.
Confirm that the output displays only one node in the cluster as the main (opensm). The output should be similar to the
following:
/ifs
The /ifs directory cannot be more than 90 percent capacity.
If this directory is at or near the minimum available-space requirement, see the following resources for steps to address
the issue:
● Event ID 100010004, The cluster's /ifs partition is near capacity.

/var
The /var partition cannot be more than 90 percent capacity.
If this directory is at or near the minimum available-space requirement, see the following resources for steps to address
the issue:
● Event ID 100010001, The /var partition is near capacity.

/var/crash
The /var/crash directory cannot be more than 90 percent capacity.
If this directory is at or near the minimum available-space requirement, see the following resources for steps to address
the issue:
● Event ID 100010002, The /var/crash partition is near capacity.
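The 90 percent limits above can be checked mechanically. The following is a minimal sketch under stated assumptions, not a OneFS tool: it assumes POSIX-format `df -P` output, where the fifth column is the capacity percentage and the sixth is the mount point. The function name over_threshold, the file system names, and the figures in the sample are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `df -P`-style lines on stdin and prints every
# mount point whose capacity is greater than the limit given as argument 1.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)            # strip the trailing percent sign
        if (pct + 0 > limit + 0) print $6 " " pct "%"
    }'
}

# Illustrative sample only; on a node you might pipe in the output of
#   df -P /ifs /var /var/crash
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
examplefs0 1000 950 50 95% /ifs
examplefs1 1000 400 600 40% /var
examplefs2 1000 900 100 90% /var/crash'

printf '%s\n' "$sample" | over_threshold 90
```

In the sample, only /ifs is reported, because a value of exactly 90 percent does not exceed the limit.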
isi stat
● To confirm how much space is being used in each node pool, run the following command:
isi stat -p
● To confirm how much space is being used by critical directories on the cluster, run the following command:
The isi_for_array output is similar to the following for each node in the cluster:
● If the command returns any critical errors, check the log files in the following directories for more information:
○ /var/log
○ /var/log/messages
○ /var/crash
● NOTE: If any log files contain messages about a dynamic sector recovery (DSR) failure or a Data Integrity (IDI)
failure, contact PowerScale before you upgrade.
2. Cancel non-critical events before upgrading to prevent a recurrence of notifications that you know to be harmless.
If you have critical events that you are unable to resolve, contact support before upgrading.
NOTE: If you use the web administration interface to upgrade from OneFS 9.3.0.x or earlier to OneFS 9.4.0.0 or later,
you must first change the file extension from .isi to .tar.gz (for example, change OneFS_v9.4.0.0_Install.isi to
OneFS_v9.4.0.0.Install.tar.gz).
2. Open a secure shell (SSH) connection to any node in the cluster and log in using the root account.
3. Move the installation file that you downloaded into the /ifs/data directory on the cluster you want to upgrade.
4. Validate the integrity of the installation file.
For OneFS 9.4.0.0 and later, use the following instructions:
a. To import the package into the OneFS Catalog, run the isi upgrade catalog import /ifs/data/<filename>.isi command.
b. To verify the package, run the isi upgrade catalog verify --file /ifs/data/<filename>.isi command.
For OneFS 9.3.0.0 and earlier, use the following instructions:
a. Optional: On the OneFS downloads page, click Checksum Values and record the SHA-256 checksum value displayed.
b. Optional: In OneFS CLI, run the following command where <installation-file-name> is the name of the downloaded
installation file:
● sha256 /ifs/data/<installation-file-name>
c. Optional: Compare the SHA-256 checksum value that you recorded from the downloads page on the Online Support site
to the checksum value returned from the SHA-256 command. If the values do not match, re-download the installation
file.
gzip -d /ifs/data/<installation-file-name>
NOTE: Clusters running OneFS 9.4.0.0 and later are not required to unzip .tgz files.
6. (OneFS 9.3.0.0 and earlier) To unbundle the .tar file, run the following command where <installation-file-name> is the name
of the OneFS install tar file:
NOTE: Clusters running OneFS 9.4.0.0 and later are not required to unbundle .tar files.
7. (OneFS 9.3.0.0 and earlier) Validate the manifest file (example: OneFS_v9.3.0.0_Install.tar.gz). See the Checking Manifests
section of the PowerScale OneFS Security Configuration Guide for more information.
NOTE: Clusters running OneFS 9.4.0.0 and later use the OneFS Catalog to verify packages.
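The optional checksum comparison for OneFS 9.3.0.0 and earlier can be scripted rather than compared by eye. The following is a sketch under stated assumptions: the helper names file_sha256 and checksums_match are hypothetical, and the script uses sha256sum or shasum where they are installed, falling back to the sha256 command shown in the step above. The expected value must still be copied from the downloads page.

```shell
#!/bin/sh
# Hypothetical helper: print only the SHA-256 digest of a file, using
# whichever common tool is installed.
file_sha256() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{ print $1 }'
    elif command -v shasum >/dev/null 2>&1; then
        shasum -a 256 "$1" | awk '{ print $1 }'
    else
        sha256 -q "$1"               # BSD-style tool; -q prints the digest only
    fi
}

# Hypothetical helper: succeed when the digest of file $1 equals the expected
# value $2, ignoring letter case.
checksums_match() {
    actual=$(file_sha256 "$1" | tr 'A-F' 'a-f')
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage (file name and digest are placeholders supplied by the operator):
#   checksums_match /ifs/data/<installation-file-name> <recorded-checksum> \
#       && echo "checksum OK" || echo "checksum mismatch: re-download the file"
```

Comparing in lowercase avoids a false mismatch when the downloads page and the command print the digest in different letter case.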
1. Open a secure shell (SSH) connection to any node in the cluster and log in to the cluster with the root account.
2. Start the upgrade compatibility check utility by running the following command, where <install-image-path> is the file path
of the upgrade installation file.
isi upgrade cluster assess <install-image-path>
NOTE: The upgrade compatibility check utility might take several minutes to run. If the utility returns errors, resolve the
errors before continuing with the upgrade. Warnings are informational and do not prevent an upgrade.
3. View the results of the upgrade compatibility check by running the following command:
isi_upgrade_logs -a
Backup data
Back up your cluster data immediately before you upgrade. Schedule sufficient time for the backup to
complete before the upgrade window.
SyncIQ backup
SyncIQ is one option you can use to back up your OneFS cluster. SyncIQ creates and references snapshots to replicate a
consistent point-in-time image of a root directory.
For more information about backing up your OneFS cluster, see the OneFS CLI Administration Guide or the OneFS Web
Administration Guide for your version of OneFS.
NOTE: If you are upgrading your cluster from OneFS 8.1.0.x or earlier to OneFS 8.1.1.x, 8.1.2.x, or 8.1.3.x, and your cluster
is in Compliance mode, you must ensure that all SyncIQ partners are on the same OneFS code and patch level before
restarting SyncIQ backups, or the backups fail. This issue is resolved in OneFS 8.2.0 and later.
NOTE: If you are upgrading your cluster to OneFS 8.2.0 through 9.2.1.0, are using SyncIQ, and have non-networked
(NANON) nodes on your source cluster, SyncIQ jobs might fail after the upgrade. For more information, including a
workaround, see the KB article SyncIQ jobs failing with NANON after upgrade.
NDMP backup
Other OneFS cluster backup options include using the Network Data Management Protocol (NDMP).
From a backup server, you can perform both NDMP three-way backup and NDMP two-way backup processes between a cluster
and backup devices such as tape devices, media servers, and virtual tape libraries (VTLs).
See the OneFS Web Administration Guide or the OneFS CLI Administration Guide for information about backing up data using
NDMP.
Passwords for local user accounts
After you upgrade, you might have to reset the passwords of the local user accounts that you configured on the cluster.
Other users should be prepared to reset the passwords of their local accounts after the upgrade.
Make a list of the local accounts and their passwords before you upgrade.

sysctl parameters
If you changed the default value that is assigned to one or more sysctl parameters by editing the
/etc/mcp/override/sysctl.conf file, the /etc/mcp/templates directory, or the /etc/local/sysctl.conf file, you might
need to reset the parameter after you upgrade.
If you modified a sysctl parameter by editing another file (for example, the /etc/sysctl.conf file), the change will not
be preserved during the upgrade.
PowerScale does not recommend modifying sysctl parameters unless you are instructed to do so by Dell Technologies
Support. If you must modify a sysctl parameter, configure the parameter in the /etc/mcp/override/sysctl.conf file to
ensure that the change is preserved when you upgrade a node or a cluster.
Before you upgrade, document your custom sysctl parameters and back up the /etc/mcp/override/sysctl.conf file, the
/etc/mcp/templates directory, and the /etc/local/sysctl.conf file.
For more information about making sysctl changes persist through upgrades, see KB article 000102543 and KB article
000083411.

Cron jobs
Cron job settings that were not configured in the /etc/mcp/override/crontab.smbtime file are not preserved during an
upgrade.
Document and back up custom cron job settings or configure them in the /etc/mcp/override/crontab.smbtime file before
you upgrade.
After you upgrade, you might have to modify a cron job to accommodate changes to OneFS commands.
2. To cancel a job, run the following command where <job_id> is the ID of the job you want to cancel:
NOTE: Sync policies and jobs must be canceled or paused for the upgrade to complete successfully.
NOTE: Use of IPMI ports is supported in OneFS version 8.2.2 and later.
NOTE: You can upgrade OneFS using the command-line interface or the web administration interface.
Upgrade - Summary
The following is a summary of steps to perform during the upgrade phase:
1. Perform the upgrade.
2. Commit the upgrade.
3. Verify the upgrade.
NOTE: In 9.2.0.0 and later, you can use the upgrade status screen to monitor the progress of your upgrade, view
connected clients, and delay draining clients from specific nodes.
After the upgrade, a number of upgrade-related jobs may continue to run on the cluster for some time. During this time, the
cluster is accessible, but you might experience a decrease in cluster performance. After the jobs complete, performance will
return to normal. At this stage, the upgrade is complete, but is not committed. You can still roll back to the previous version of
OneFS. Some new features in the upgrade might not be available until the upgrade is committed.
In OneFS 9.2.0.0 and later, you can include the firmware upgrade with your OneFS upgrade. To perform a parallel upgrade
that includes a firmware upgrade, run the following command where <install-image-path> is the file path of the upgrade
install image and <firmware-path> is the file path of the firmware package.
NOTE: The isi upgrade cluster command runs asynchronously. The command does not run the entire upgrade
process; instead, it sets up the upgrade process, which nodes take turns controlling. For this reason, the command
returns quickly. To view the progress of the upgrade, use the isi upgrade view command or the web administration
interface.
Option: Nodes to select for upgrade
Description: Upgrade specific nodes with the --nodes <integer_range_list> option.
Specify the nodes in their upgrade order as a comma-separated list (for example, --nodes 7,3,2,5) or as
a dash-separated range (for example, --nodes 1-7) of logical node numbers (LNNs).
NOTE: We recommend that you upgrade all the nodes. If you upgrade some nodes, a weekly alert is
sent to confirm that the upgrade is making progress. Do not leave the cluster in a partially upgraded
state for a prolonged period. Some new features in the upgrade might not be available until all the nodes
in the cluster have been upgraded and the upgrade is committed. Refer to the release notes for the
OneFS version that you are upgrading to for information about features that require all the nodes to be
upgraded.
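To preview which LNNs a --nodes value selects before starting an upgrade, the two documented forms can be expanded locally. The following is a minimal sketch and not a OneFS command; the function name expand_nodes is hypothetical, and handling a list that mixes single LNNs with ranges (for example, 1-3,7) is an assumption that goes beyond the two forms described above.

```shell
#!/bin/sh
# Hypothetical helper: expand a --nodes style value into one LNN per line,
# preserving the order in which the nodes were listed.
expand_nodes() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        [ -n "$first" ] || continue  # skip empty items such as a stray comma
        last=${last:-$first}         # a single LNN is a range of one
        n=$first
        while [ "$n" -le "$last" ]; do
            echo "$n"
            n=$((n + 1))
        done
    done
}

expand_nodes 7,3,2,5   # prints 7, 3, 2, and 5 on separate lines
expand_nodes 1-7       # prints 1 through 7 on separate lines
```

The expansion only illustrates the argument format; it does not predict the order in which OneFS restarts the nodes.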
OneFS 9.2.0.0 and later ignores the node restart order. Instead, OneFS restarts any node that is ready, as long as only one
node is restarting at a time.
The following example for OneFS 9.0.0.0 through OneFS 9.1.0.x starts a parallel upgrade on nodes 7,3,2,5, in that order:
5. To add nodes to a current upgrade, run the following command, where <nodes> is the list of LNNs:
6. To add any remaining nodes to the upgrade, run the following command:
After the upgrade, a number of upgrade-related jobs may continue to run on the cluster for some time. During this time, the
cluster is accessible, but you might experience a decrease in cluster performance. After the jobs complete, performance will
return to normal. At this stage, the upgrade is complete, but is not committed. You can still roll back to the previous version of
OneFS. Some new features in the upgrade might not be available until the upgrade is committed.
SMB3: Client transitions from the restarting node to a new node without disruption.
NFSv2 and NFSv3: Client transitions from the restarting node to a new node without disruption.
NFSv4: Client transitions from the restarting node to a new node without disruption. Clients use NFSv4 failover support.
NOTE: For more information about NFS settings, see article 457328, Best practices for NFS client settings.
NOTE: In 9.2.0.0 and later, you can use the upgrade status screen to monitor the progress of your upgrade and view
connected clients.
After the upgrade, a number of upgrade-related jobs may continue to run on the cluster for some time. During this time, the
cluster is accessible, but you might experience a decrease in cluster performance. After the jobs complete, performance will
return to normal. At this stage, the upgrade is complete, but is not committed. You can still roll back to the previous version of
OneFS. Some new features in the upgrade might not be available until the upgrade is committed.
NOTE: The isi upgrade cluster command runs asynchronously. The command does not run the entire upgrade
process; instead, it sets up the upgrade process, which nodes take turns controlling. For this reason, the command
returns quickly. To view the progress of the upgrade, use the isi upgrade view command or the web administration
interface.
3. Optional: You can specify the following rolling upgrade options:
Option: Nodes to select for upgrade
Description: Upgrade specific nodes with the --nodes <integer_range_list> option.
Specify the nodes in their upgrade order as a comma-separated list (for example, --nodes 7,3,2,5) or as
a dash-separated range (for example, --nodes 1-7) of logical node numbers (LNNs).
NOTE: We recommend that you upgrade all the nodes. If you upgrade some nodes, a weekly alert is
sent to confirm that the upgrade is making progress. Do not leave the cluster in a partially upgraded
state for a prolonged period. Some new features in the upgrade might not be available until all the nodes
in the cluster have been upgraded and the upgrade is committed. Refer to the release notes for the
OneFS version that you are upgrading to for information about features that require all the nodes to be
upgraded.
OneFS 9.2.0.0 and later ignores the node restart order. Instead, OneFS restarts any node that is ready, as long as only one
node is restarting at a time.
The following example for OneFS 8.2.2 through OneFS 9.1.0.x starts a rolling upgrade on nodes 7,3,2,5, in that order:
The following example for OneFS 8.2.1 and earlier starts a rolling upgrade on nodes 7,3,2,5 in that order:
4. Optional: To add nodes to a current upgrade, run the following command, where <nodes> is the list of LNNs:
5. Optional: To add any remaining nodes to the upgrade, run the following command:
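After the upgrade, a number of upgrade-related jobs may continue to run on the cluster for some time. During this time, the
cluster is accessible, but you might experience a decrease in cluster performance. After the jobs complete, performance will
return to normal. At this stage, the upgrade is complete, but is not committed. You can still roll back to the previous version of
OneFS. Some new features in the upgrade might not be available until the upgrade is committed.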
SMB3: Client transitions from the restarting node to a new node without disruption.
NFSv2 and NFSv3: Client transitions from the restarting node to a new node without disruption.
NFSv4: Client transitions from the restarting node to a new node without disruption. Clients use NFSv4 failover support.
NOTE: If a client is reconnected to a node that has not yet been upgraded, the client might be required to reestablish a
connection to the cluster more than once.
NOTE: For more information about NFS settings, see article 457328, Best practices for NFS client settings.
NOTE: The isi upgrade cluster command runs asynchronously. The command does not run the entire upgrade
process; instead, it sets up the upgrade process, which nodes take turns controlling. For this reason, the command
returns quickly. To view the progress of the upgrade, use the isi upgrade view command or the web administration
interface.
After the upgrade, a number of upgrade-related jobs may continue to run on the cluster for some time. During this time, the
cluster is accessible, but you might experience a decrease in cluster performance. After the jobs complete, performance will
return to normal. At this stage, the upgrade is complete, but is not committed. You can still roll back to the previous version of
OneFS. Some new features in the upgrade might not be available until the upgrade is committed.
isi stat
2. Remove the installation files from the /ifs/data directory by running the following command where
<installation_file_name> is the name of the installation file:
rm /ifs/data/<installation_file_name>
3. To collect information about the cluster in OneFS 9.0.0.0 and earlier, run the following command:
isi_gather_info
To collect information about the cluster in OneFS 9.1.0.0 and later, run the following command:
Post-upgrade - Summary
The following is a summary of steps to perform during the post-upgrade phase:
1. Allow upgrade jobs to run.
2. Verify operational status.
3. Re-establish user privileges.
4. Restore client connections and workflows.
5. Verify Kerberos migration.
6. Restore custom settings.
7. Test custom scripts.
8. (Optional) Install latest patch.
NOTE: For more information about the Upgrade job, see KB Article 194551.
isi_for_array -s uname -a
2. View the status of the cluster and ensure all the nodes are operational:
isi status
3. Check the devices in the nodes to validate the status of the drives:
6. To verify network connectivity and SmartConnect functionality, ping all internal and external interfaces on the cluster.
7. Verify the network interfaces:
10. To check for issues, review the log files on the cluster:
isi_upgrade_logs
cat /var/log/messages
13. Check the status of the node firmware to ensure it is consistent across nodes:
For OneFS 9.0.0 and later:
isi_upgrade_logs --get-fw-report
14. Ensure that all the licenses carried over and remain up to date:
15. Check the status of the authentication providers to ensure that they remain active:
19. If you use SupportAssist or SRS, confirm that the service is reenabled.
NOTE: If you are using NDMP backups on your cluster, re-enable the NDMP service and test that it's working correctly.
Troubleshooting an upgrade
If you experience problems with your upgrade, try the following:
● Check the upgrade logs and review for errors.
● Search for OneFS upgrade information within the knowledge base on the PowerScale support site.
● Contact your Dell Technologies Support representative.