FAS8200 - Controller - Module Replacement

This document provides a comprehensive guide for replacing the controller module in a NetApp FAS8200 system, detailing prerequisites, shutdown procedures, and hardware replacement steps. It emphasizes the importance of ensuring system configurations and health before proceeding with the replacement, as well as the correct handling of components like boot media, DIMMs, and PCIe cards. The document also includes specific commands and procedures for both standard and MetroCluster configurations to ensure a smooth and effective replacement process.

Controller

Install and maintain


NetApp
December 18, 2024

This PDF was generated from https://docs.netapp.com/us-en/ontap-systems/fas8200/controller-replace-overview.html
on December 18, 2024. Always check docs.netapp.com for the latest.
Table of Contents
Controller
  Overview of controller module replacement - FAS8200
  Shut down the impaired controller - FAS8200
  Replace the controller module hardware - FAS8200
  Restore and verify the system configuration - FAS8200
  Recable the system and reassign disks - FAS8200
  Complete system restoration - FAS8200
Controller
Overview of controller module replacement - FAS8200
You must review the prerequisites for the replacement procedure and select the correct
one for your version of the ONTAP operating system.
• All drive shelves must be working properly.
• If your system is a FlexArray system or has a V_StorageAttach license, you must refer to the additional
required steps before performing this procedure.
• If your system is in an HA pair, the healthy controller must be able to take over the controller that is being
replaced (referred to in this procedure as the “impaired controller”).
• If your system is in a MetroCluster configuration, you must review the section Choosing the correct
recovery procedure to determine whether you should use this procedure.

If this is the procedure you should use, note that the controller replacement procedure for a controller in a
four or eight controller MetroCluster configuration is the same as that in an HA pair. No MetroCluster-
specific steps are required because the failure is restricted to an HA pair and storage failover commands
can be used to provide nondisruptive operation during the replacement.

• This procedure includes steps for automatically or manually reassigning drives to the replacement
controller, depending on your system’s configuration.

You should perform the drive reassignment as directed in the procedure.

• You must replace the failed component with a replacement FRU component you received from your
provider.
• You must be replacing a controller module with a controller module of the same model type. You cannot
upgrade your system by just replacing the controller module.
• You cannot change any drives or drive shelves as part of this procedure.
• In this procedure, the boot device is moved from the impaired controller to the replacement controller so
that the replacement controller will boot up in the same version of ONTAP as the old controller module.
• Any PCIe cards moved from the old controller module to the new controller module or added from existing
customer site inventory must be supported by the replacement controller module.

NetApp Hardware Universe

• It is important that you apply the commands in these steps on the correct systems:
◦ The impaired controller is the controller that is being replaced.
◦ The replacement controller is the new controller that is replacing the impaired controller.
◦ The healthy controller is the surviving controller.
• You must always capture the controller’s console output to a text file.

This provides you a record of the procedure so that you can troubleshoot any issues that you might
encounter during the replacement process.

Shut down the impaired controller - FAS8200
You can shut down or take over the impaired controller using different procedures,
depending on the storage system hardware configuration.

Option 1: Most systems
To shut down the impaired controller, you must determine the status of the controller and, if necessary,
take over the controller so that the healthy controller continues to serve data from the impaired controller
storage.

About this task


• If you have a SAN system, you must have checked event messages (cluster kernel-service
show) for the impaired controller SCSI blade. The cluster kernel-service show command
(from priv advanced mode) displays the node name, quorum status of that node, availability status of
that node, and operational status of that node.

Each SCSI-blade process should be in quorum with the other nodes in the cluster. Any issues must
be resolved before you proceed with the replacement.

• If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or
a healthy controller shows false for eligibility and health, you must correct the issue before shutting
down the impaired controller; see Synchronize a node with the cluster.

Steps
1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message:
system node autosupport invoke -node * -type all -message MAINT=<# of hours>h

The following AutoSupport message suppresses automatic case creation for two hours:
cluster1:> system node autosupport invoke -node * -type all -message MAINT=2h

2. Disable automatic giveback from the console of the healthy controller:
storage failover modify -node local -auto-giveback false

When you see Do you want to disable auto-giveback?, enter y.

3. Take the impaired controller to the LOADER prompt:

If the impaired controller is displaying…    Then…
The LOADER prompt                            Go to the next step.
Waiting for giveback…                        Press Ctrl-C, and then respond y when prompted.
System prompt or password prompt             Take over or halt the impaired controller from the healthy
                                             controller: storage failover takeover -ofnode impaired_node_name
                                             When the impaired controller shows Waiting for giveback…, press
                                             Ctrl-C, and then respond y.
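
The first and third steps above can be sketched as a small lookup. This is an illustrative Python sketch only: the function names and the console-state labels are hypothetical and are not part of ONTAP. Only the command strings are taken from the procedure.

```python
# Hypothetical helpers that only build the command text shown in the steps above.
# They do not talk to a controller; run the resulting commands yourself.

def autosupport_maint_command(hours: int) -> str:
    """Build the AutoSupport message that suppresses case creation for `hours`."""
    if hours <= 0:
        raise ValueError("maintenance window must be at least one hour")
    return f"system node autosupport invoke -node * -type all -message MAINT={hours}h"


def loader_action(console_state: str, impaired_node_name: str) -> str:
    """Map what the impaired controller's console shows to the action in step 3."""
    state = console_state.strip().lower()
    if state == "loader prompt":
        return "Go to the next step."
    if state.startswith("waiting for giveback"):
        return "Press Ctrl-C, and then respond y when prompted."
    if state in ("system prompt", "password prompt"):
        # Issued from the healthy controller, as stated in the table.
        return f"storage failover takeover -ofnode {impaired_node_name}"
    raise ValueError(f"unrecognized console state: {console_state!r}")
```

For example, `loader_action("System prompt", "node2")` yields the takeover command for a node assumed to be named node2.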

Option 2: Controller is in a two-node MetroCluster


To shut down the impaired controller, you must determine the status of the controller and, if necessary,
switch over the controller so that the healthy controller continues to serve data from the impaired
controller storage.

About this task


• You must leave the power supplies turned on at the end of this procedure to provide power to the
healthy controller.

Steps
1. Check the MetroCluster status to determine whether the impaired controller has automatically
switched over to the healthy controller: metrocluster show
2. Depending on whether an automatic switchover has occurred, proceed according to the following
table:

If the impaired controller…                  Then…
Has automatically switched over              Proceed to the next step.
Has not automatically switched over          Perform a planned switchover operation from the healthy
                                             controller: metrocluster switchover
Has not automatically switched over, you     Review the veto messages and, if possible, resolve the issue and
attempted switchover with the metrocluster   try again. If you are unable to resolve the issue, contact technical
switchover command, and the switchover       support.
was vetoed

3. Resynchronize the data aggregates by running the metrocluster heal -phase aggregates
command from the surviving cluster.

controller_A_1::> metrocluster heal -phase aggregates


[Job 130] Job succeeded: Heal Aggregates is successful.

If the healing is vetoed, you have the option of reissuing the metrocluster heal command with
the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft
vetoes that prevent the healing operation.

4. Verify that the operation has been completed by using the metrocluster operation show command.

controller_A_1::> metrocluster operation show


Operation: heal-aggregates
State: successful
Start Time: 7/25/2016 18:45:55
End Time: 7/25/2016 18:45:56
Errors: -

5. Check the state of the aggregates by using the storage aggregate show command.

controller_A_1::> storage aggregate show
Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
...
aggr_b2    227.1GB   227.1GB    0% online       0 mcc1-a2          raid_dp, mirrored, normal...

6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates
command.

mcc1A::> metrocluster heal -phase root-aggregates


[Job 137] Job succeeded: Heal Root Aggregates is successful

If the healing is vetoed, you have the option of reissuing the metrocluster heal command with
the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft
vetoes that prevent the healing operation.

7. Verify that the heal operation is complete by using the metrocluster operation show command
on the destination cluster:

mcc1A::> metrocluster operation show


Operation: heal-root-aggregates
State: successful
Start Time: 7/29/2016 20:54:41
End Time: 7/29/2016 20:54:42
Errors: -

8. On the impaired controller module, disconnect the power supplies.
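
The verification in steps 4 and 7 amounts to reading the Operation and State fields of the metrocluster operation show output. A minimal Python sketch of that check follows, assuming the output has the "Key: value" layout shown in the examples above. The function names are hypothetical and the code only parses text that you have captured from the console.

```python
def parse_operation_show(output: str) -> dict:
    """Parse 'Key: value' lines such as 'State: successful' into a dict."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so times like 18:45:55 stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def heal_succeeded(output: str, phase: str) -> bool:
    """True if the captured output reports `phase` (e.g. 'heal-aggregates') as successful."""
    fields = parse_operation_show(output)
    return fields.get("Operation") == phase and fields.get("State") == "successful"
```

Passing the expected phase name guards against reading the result of an earlier operation, for example checking heal-root-aggregates while the output still shows heal-aggregates.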

Replace the controller module hardware - FAS8200


To replace the controller module hardware, you must remove the impaired controller,
move FRU components to the replacement controller module, install the replacement
controller module in the chassis, and then boot the system to Maintenance mode.

Step 1: Open the controller module


To replace the controller module, you must first remove the old controller module from the chassis.

1. If you are not already grounded, properly ground yourself.


2. Loosen the hook and loop strap binding the cables to the cable management device, and then unplug the
system cables and SFPs (if needed) from the controller module, keeping track of where the cables were
connected.

Leave the cables in the cable management device so that when you reinstall the cable management
device, the cables are organized.

3. Remove and set aside the cable management devices from the left and right sides of the controller module.

4. If you left the SFP modules in the system after removing the cables, move them to the new controller
module.
5. Loosen the thumbscrew on the cam handle on the controller module.

[Figure: callouts identify the thumbscrew and the cam handle.]

6. Pull the cam handle downward and begin to slide the controller module out of the chassis.

Make sure that you support the bottom of the controller module as you slide it out of the chassis.

Step 2: Move the boot device


You must locate the boot media and follow the directions to remove it from the old controller and insert it in the
new controller.

1. Locate the boot media using the following illustration or the FRU map on the controller module:

2. Press the blue button on the boot media housing to release the boot media from its housing, and then
gently pull it straight out of the boot media socket.

Do not twist or pull the boot media straight up, because this could damage the socket or the
boot media.

3. Move the boot media to the new controller module, align the edges of the boot media with the socket
housing, and then gently push it into the socket.
4. Check the boot media to make sure that it is seated squarely and completely in the socket.

If necessary, remove the boot media and reseat it into the socket.

5. Push the boot media down to engage the locking button on the boot media housing.

Step 3: Move the NVMEM battery


To move the NVMEM battery from the old controller module to the new controller module, you must perform a
specific sequence of steps.

1. Check the NVMEM LED:

◦ If your system is in an HA configuration, go to the next step.
◦ If your system is in a stand-alone configuration, cleanly shut down the controller module, and then
check the NVRAM LED identified by the NV icon.

The NVRAM LED blinks while destaging contents to the flash memory when you halt the
system. After the destage is complete, the LED turns off.

▪ If power is lost without a clean shutdown, the NVMEM LED flashes until the destage is complete,
and then the LED turns off.
▪ If the LED is on and power is on, unwritten data is stored on NVMEM.

This typically occurs during an uncontrolled shutdown after ONTAP has successfully booted.

2. Open the CPU air duct and locate the NVMEM battery.

[Figure: callouts identify the battery lock tab and the NVMEM battery pack.]

3. Grasp the battery and press the blue locking tab marked PUSH, and then lift the battery out of the holder
and controller module.
4. Remove the battery from the controller module and set it aside.

Step 4: Move the DIMMs


To move the DIMMs, locate and move them from the old controller into the replacement controller and follow
the specific sequence of steps.

1. Locate the DIMMs on your controller module.


2. Note the orientation of the DIMM in the socket so that you can insert the DIMM in the replacement
controller module in the proper orientation.
3. Eject the DIMM from its slot by slowly pushing apart the two DIMM ejector tabs on either side of the DIMM,
and then slide the DIMM out of the slot.

Carefully hold the DIMM by the edges to avoid pressure on the components on the DIMM
circuit board.

The number and placement of system DIMMs depend on the model of your system.

The following illustration shows the location of system DIMMs:

4. Locate the slot where you are installing the DIMM.
5. Make sure that the DIMM ejector tabs on the connector are in the open position, and then insert the DIMM
squarely into the slot.

The DIMM fits tightly in the slot, but should go in easily. If not, realign the DIMM with the slot and reinsert it.

Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the slot.

6. Repeat these steps for the remaining DIMMs.


7. Move the NVMEM battery to the replacement controller module.
8. Align the tab or tabs on the battery holder with the notches in the controller module side, and then gently
push down on the battery housing until the battery housing clicks into place.

Step 5: Move a PCIe card


To move PCIe cards, locate and move them from the old controller into the replacement controller and follow
the specific sequence of steps.

You must have the new controller module ready so that you can move the PCIe cards directly from the old
controller module to the corresponding slots in the new one.

1. Loosen the thumbscrew on the controller module side panel.
2. Swing the side panel off the controller module.

[Figure: callouts identify the side panel and the PCIe card.]

3. Remove the PCIe card from the old controller module and set it aside.

Make sure that you keep track of which slot the PCIe card was in.

4. Repeat the preceding step for the remaining PCIe cards in the old controller module.
5. If necessary, open the new controller module side panel and slide off the PCIe card filler plate, and then
carefully install the PCIe card.

Be sure that you properly align the card in the slot and exert even pressure on the card when seating it in
the socket. The card must be fully and evenly seated in the slot.

6. Repeat the preceding step for the remaining PCIe cards that you set aside.
7. Close the side panel and tighten the thumbscrew.

Step 6: Move a caching module
You must move the caching modules from the impaired controller modules to the replacement controller
module when replacing a controller module.

1. Locate the caching module at the rear of the controller module and remove it:
a. Press the release tab.
b. Remove the heatsink.

The storage system comes with two slots available for the caching module and only one slot is
occupied, by default.

2. Move the caching module to the new controller module, and then align the edges of the caching module
with the socket housing and gently push it into the socket.
3. Verify that the caching module is seated squarely and completely in the socket. If necessary, remove the
caching module and reseat it into the socket.
4. Reseat and push the heatsink down to engage the locking button on the caching module housing.
5. Repeat the steps if you have a second caching module. Close the controller module cover.

Step 7: Install the controller
After you install the components from the old controller module into the new controller module, you must install
the new controller module into the system chassis and boot the operating system.

For HA pairs with two controller modules in the same chassis, the sequence in which you install the controller
module is especially important because it attempts to reboot as soon as you completely seat it in the chassis.

The system might update system firmware when it boots. Do not abort this process. The
procedure requires you to interrupt the boot process, which you can typically do at any time after
prompted to do so. However, if the system updates the system firmware when it boots, you must
wait until after the update is complete before interrupting the boot process.

1. If you are not already grounded, properly ground yourself.


2. If you have not already done so, close the CPU air duct.
3. Align the end of the controller module with the opening in the chassis, and then gently push the controller
module halfway into the system.

Do not completely insert the controller module in the chassis until instructed to do so.

4. Cable the management and console ports only, so that you can access the system to perform the tasks in
the following sections.

You will connect the rest of the cables to the controller module later in this procedure.

5. Complete the reinstallation of the controller module:

If your system is in an HA pair, perform these steps:

The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to
interrupt the boot process.

a. With the cam handle in the open position, firmly push the controller module in until it meets the
midplane and is fully seated, and then close the cam handle to the locked position. Tighten the
thumbscrew on the cam handle on back of the controller module.

Do not use excessive force when sliding the controller module into the chassis to avoid damaging
the connectors.

The controller begins to boot as soon as it is seated in the chassis.

b. If you have not already done so, reinstall the cable management device.
c. Bind the cables to the cable management device with the hook and loop strap.
d. When you see the message Press Ctrl-C for Boot Menu, press Ctrl-C to interrupt the boot process.

If you miss the prompt and the controller module boots to ONTAP, enter halt, and then at the
LOADER prompt enter boot_ontap, press Ctrl-C when prompted, and then boot to Maintenance mode.

e. Select the option to boot to Maintenance mode from the displayed menu.

If your system is in a stand-alone configuration, perform these steps:

a. With the cam handle in the open position, firmly push the controller module in until it meets the
midplane and is fully seated, and then close the cam handle to the locked position. Tighten the
thumbscrew on the cam handle on back of the controller module.

Do not use excessive force when sliding the controller module into the chassis to avoid damaging
the connectors.

b. If you have not already done so, reinstall the cable management device.
c. Bind the cables to the cable management device with the hook and loop strap.
d. Reconnect the power cables to the power supplies and to the power sources, turn on the power to
start the boot process, and then press Ctrl-C after you see the Press Ctrl-C for Boot Menu message.

If you miss the prompt and the controller module boots to ONTAP, enter halt, and then at the
LOADER prompt enter boot_ontap, press Ctrl-C when prompted, and then boot to Maintenance mode.

e. From the boot menu, select the option for Maintenance mode.

Important: During the boot process, you might see the following prompts:

◦ A prompt warning of a system ID mismatch and asking to override the system ID.
◦ A prompt warning that when entering Maintenance mode in an HA configuration you must ensure that
the healthy controller remains down.

You can safely respond y to these prompts.

Restore and verify the system configuration - FAS8200


After completing the hardware replacement and booting to Maintenance mode, you verify
the low-level system configuration of the replacement controller and reconfigure system
settings as necessary.

Step 1: Set and verify system time after replacing the controller
You should check the time and date on the replacement controller module against the healthy controller
module in an HA pair, or against a reliable time server in a stand-alone configuration. If the time and date do
not match, you must reset them on the replacement controller module to prevent possible outages on clients
due to time differences.

About this task


It is important that you apply the commands in the steps on the correct systems:

• The replacement node is the new node that replaced the impaired node as part of this procedure.
• The healthy node is the HA partner of the replacement node.

Steps
1. If the replacement node is not at the LOADER prompt, halt the system to the LOADER prompt.
2. On the healthy node, check the system time: cluster date show

The date and time are based on the configured timezone.

3. At the LOADER prompt, check the date and time on the replacement node: show date

The date and time are given in GMT.

4. If necessary, set the date in GMT on the replacement node: set date mm/dd/yyyy
5. If necessary, set the time in GMT on the replacement node: set time hh:mm:ss
6. At the LOADER prompt, confirm the date and time on the replacement node: show date

The date and time are given in GMT.
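
Because the healthy node reports time in its configured timezone while the LOADER prompt expects GMT, the conversion in steps 2 through 5 can be sketched as follows. This is a hedged Python illustration: the function name is hypothetical, and the caller is assumed to supply the healthy node's time together with its timezone offset. Only the set date and set time formats come from the steps above.

```python
from datetime import datetime, timedelta, timezone


def loader_time_commands(local_time: datetime) -> list:
    """Return the LOADER 'set date' and 'set time' commands, expressed in GMT.

    `local_time` must be timezone-aware (the healthy node's time plus its offset).
    """
    if local_time.tzinfo is None:
        raise ValueError("local_time must carry a timezone")
    gmt = local_time.astimezone(timezone.utc)
    return [
        gmt.strftime("set date %m/%d/%Y"),  # mm/dd/yyyy, as in step 4
        gmt.strftime("set time %H:%M:%S"),  # hh:mm:ss, as in step 5
    ]


# Example: 20:30:00 on December 18, 2024 at an assumed offset of GMT-5
# is 01:30:00 on December 19 in GMT.
example = datetime(2024, 12, 18, 20, 30, 0, tzinfo=timezone(timedelta(hours=-5)))
commands = loader_time_commands(example)
```

Note that the date can change during the conversion, which is why both commands are derived from the same converted value.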

Step 2: Verify and set the HA state of the controller module


You must verify the HA state of the controller module and, if necessary, update the state to match your system
configuration.

1. In Maintenance mode from the new controller module, verify that all components display the same HA
state: ha-config show

The HA state should be the same for all components.

2. If the displayed system state of the controller module does not match your system configuration, set the HA
state for the controller module: ha-config modify controller ha-state

The value for HA-state can be one of the following:

◦ ha
◦ mcc
◦ mcc-2n
◦ mccip
◦ non-ha
3. Confirm that the setting has changed: ha-config show
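
As a guard against typing an unsupported state, the list of permitted values above can be checked before the command is composed. The following Python sketch is illustrative only. The helper name is hypothetical, and it simply formats the ha-config modify controller command given in this step.

```python
# Values listed in this procedure for the HA state.
VALID_HA_STATES = ("ha", "mcc", "mcc-2n", "mccip", "non-ha")


def ha_config_modify_command(ha_state: str) -> str:
    """Build the Maintenance mode command for a permitted HA state."""
    if ha_state not in VALID_HA_STATES:
        raise ValueError(
            f"{ha_state!r} is not one of: {', '.join(VALID_HA_STATES)}"
        )
    return f"ha-config modify controller {ha_state}"
```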

Recable the system and reassign disks - FAS8200


Continue the replacement procedure by recabling the storage and confirming disk
reassignment.

Step 1: Recable the system


Verify the controller module’s storage and network connections.

Steps
1. Verify that the cabling is correct by using Active IQ Config Advisor.
a. Download and install Config Advisor.
b. Enter the information for the target system, and then click Collect Data.
c. Click the Cabling tab, and then examine the output. Make sure that all disk shelves are displayed and
all disks appear in the output, correcting any cabling issues you find.
d. Check other cabling by clicking the appropriate tab, and then examining the output from Config Advisor.

Step 2: Reassign disks


If the storage system is in an HA pair, the system ID of the new controller module is automatically assigned to
the disks when the giveback occurs at the end of the procedure. You must use the correct procedure for your
configuration.

Option 1: Verify the system ID change on an HA system

You must confirm the system ID change when you boot the replacement controller and then verify that the
change was implemented.

This procedure applies only to systems running ONTAP in an HA pair.

1. If the replacement controller is in Maintenance mode (showing the *> prompt), exit Maintenance mode and
go to the LOADER prompt: halt
2. From the LOADER prompt on the replacement controller, boot the controller, entering y if you are prompted
to override the system ID due to a system ID mismatch: boot_ontap
3. Wait until the Waiting for giveback… message is displayed on the replacement controller console and
then, from the healthy controller, verify that the new partner system ID has been automatically assigned:
storage failover show

In the command output, you should see a message that the system ID has changed on the impaired
controller, showing the correct old and new IDs. In the following example, node2 has undergone
replacement and has a new system ID of 151759706.

node1> storage failover show
                          Takeover
Node         Partner      Possible State Description
------------ ------------ -------- -------------------------------------
node1        node2        false    System ID changed on partner (Old:
                                   151759755, New: 151759706), In takeover
node2        node1        -        Waiting for giveback (HA mailboxes)

4. From the healthy controller, verify that any coredumps are saved:
a. Change to the advanced privilege level: set -privilege advanced

You can respond Y when prompted to continue into advanced mode. The advanced mode prompt
appears (*>).

b. Save any coredumps: system node run -node local-node-name partner savecore
c. Wait for the savecore command to complete before issuing the giveback.

You can enter the following command to monitor the progress of the savecore command: system
node run -node local-node-name partner savecore -s

d. Return to the admin privilege level: set -privilege admin

5. If your storage system has Storage or Volume Encryption configured, you must restore Storage or Volume
Encryption functionality by using one of the following procedures, depending on whether you are using
onboard or external key management:
◦ Restore onboard key management encryption keys
◦ Restore external key management encryption keys


6. Give back the controller:
a. From the healthy controller, give back the replaced controller’s storage: storage failover
giveback -ofnode replacement_node_name

The replacement controller takes back its storage and completes booting.

If you are prompted to override the system ID due to a system ID mismatch, you should enter y.

If the giveback is vetoed, you can consider overriding the vetoes.

Find the High-Availability Configuration content for your version of ONTAP 9

b. After the giveback has been completed, confirm that the HA pair is healthy and that takeover is
possible: storage failover show

The output from the storage failover show command should not include the System ID changed
on partner message.

7. Verify that the disks were assigned correctly: storage disk show -ownership

The disks belonging to the replacement controller should show the new system ID. In the following
example, the disks owned by node1 now show the new system ID, 1873775277:

node1> storage disk show -ownership

Disk  Aggregate Home  Owner DR Home Home ID    Owner ID   DR Home ID Reserver   Pool
----- --------- ----- ----- ------- ---------- ---------- ---------- ---------- -----
1.0.0 aggr0_1   node1 node1 -       1873775277 1873775277 -          1873775277 Pool0
1.0.1 aggr0_1   node1 node1 -       1873775277 1873775277 -          1873775277 Pool0
.
.
.
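
The system ID change reported by storage failover show in step 3 can be pulled out of captured console text with a short pattern match. The Python sketch below is a hedged illustration: the function name is hypothetical, and it assumes the message wording shown in the example output, where the description wraps across several lines.

```python
import re


def system_id_change(failover_output: str):
    """Return (old_id, new_id) from a 'System ID changed on partner' message, or None."""
    # The State Description column wraps, so collapse all whitespace first.
    flat = " ".join(failover_output.split())
    match = re.search(
        r"System ID changed on partner \(Old: (\d+), New: (\d+)\)", flat
    )
    return (match.group(1), match.group(2)) if match else None
```

After the giveback in step 6, the same function should return None for fresh storage failover show output, because the message is expected to disappear.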

Option 2: Manually reassign the system ID on systems in a two-node MetroCluster configuration

In a two-node MetroCluster configuration running ONTAP, you must manually reassign disks to the new
controller’s system ID before you return the system to normal operating condition.

About this task


This procedure applies only to systems in a two-node MetroCluster configuration running ONTAP.

You must be sure to issue the commands in this procedure on the correct node:

• The impaired node is the node on which you are performing maintenance.
• The replacement node is the new node that replaced the impaired node as part of this procedure.
• The healthy node is the DR partner of the impaired node.

Steps
1. If you have not already done so, reboot the replacement node, interrupt the boot process by entering
Ctrl-C, and then select the option to boot to Maintenance mode from the displayed menu.

You must enter Y when prompted to override the system ID due to a system ID mismatch.

2. View the old system IDs from the healthy node:
metrocluster node show -fields node-systemid,dr-partner-systemid

In this example, the Node_B_1 is the old node, with the old system ID of 118073209:

dr-group-id cluster   node     node-systemid dr-partner-systemid
----------- --------- -------- ------------- -------------------
1           Cluster_A Node_A_1 536872914     118073209
1           Cluster_B Node_B_1 118073209     536872914
2 entries were displayed.

3. View the new system ID at the Maintenance mode prompt on the impaired node: disk show

In this example, the new system ID is 118065481:

Local System ID: 118065481


...
...

4. Reassign disk ownership (for FAS systems) or LUN ownership (for FlexArray systems), by using the
system ID information obtained from the disk show command: disk reassign -s old system ID

In the case of the preceding example, the command is: disk reassign -s 118073209

You can respond Y when prompted to continue.

5. Verify that the disks (or FlexArray LUNs) were assigned correctly: disk show -a

Verify that the disks belonging to the replacement node show the new system ID for the replacement node.
In the following example, the disks owned by system-1 now show the new system ID, 118065481:

*> disk show -a
Local System ID: 118065481

DISK       OWNER                 POOL   SERIAL NUMBER  HOME
-------    -------------         -----  -------------  -------------
disk_name  system-1 (118065481)  Pool0  J8Y0TDZC       system-1 (118065481)
disk_name  system-1 (118065481)  Pool0  J8Y09DXC       system-1 (118065481)
.
.
.

6. From the healthy node, verify that any coredumps are saved:

a. Change to the advanced privilege level: set -privilege advanced

You can respond Y when prompted to continue into advanced mode. The advanced mode prompt
appears (*>).

b. Verify that the coredumps are saved: system node run -node local-node-name partner
savecore

If the command output indicates that savecore is in progress, wait for savecore to complete before
issuing the giveback. You can monitor the progress of the savecore using the system node run
-node local-node-name partner savecore -s command.

c. Return to the admin privilege level: set -privilege admin


7. If the replacement node is in Maintenance mode (showing the *> prompt), exit Maintenance mode and go
to the LOADER prompt: halt
8. Boot the replacement node: boot_ontap
9. After the replacement node has fully booted, perform a switchback: metrocluster switchback
10. Verify the MetroCluster configuration: metrocluster node show -fields configuration-state

node1_siteA::> metrocluster node show -fields configuration-state

dr-group-id cluster     node         configuration-state
----------- ----------- ------------ -------------------
1           node1_siteA node1mcc-001 configured
1           node1_siteA node1mcc-002 configured
1           node1_siteB node1mcc-003 configured
1           node1_siteB node1mcc-004 configured

4 entries were displayed.

11. Verify the operation of the MetroCluster configuration in Data ONTAP:


a. Check for any health alerts on both clusters: system health alert show
b. Confirm that the MetroCluster is configured and in normal mode: metrocluster show
c. Perform a MetroCluster check: metrocluster check run
d. Display the results of the MetroCluster check: metrocluster check show
e. Run Config Advisor. Go to the Config Advisor page on the NetApp Support Site at
support.netapp.com/NOW/download/tools/config_advisor/.

After running Config Advisor, review the tool’s output and follow the recommendations in the output to
address any issues discovered.

12. Simulate a switchover operation:


a. From any node’s prompt, change to the advanced privilege level: set -privilege advanced

You need to respond with y when prompted to continue into advanced mode and see the advanced
mode prompt (*>).

b. Perform the switchover operation with the -simulate parameter: metrocluster switchover
-simulate
c. Return to the admin privilege level: set -privilege admin
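
The system ID checks in steps 3 through 5 can also be run against captured console output. The following is a minimal Python sketch, not NetApp tooling. The function names are hypothetical, and the sketch assumes the output matches the examples above: a "Local System ID:" line, and owner system IDs shown in parentheses.

```python
import re


def parse_local_system_id(disk_show_output: str) -> str:
    """Return the ID from the 'Local System ID:' line of captured `disk show` output."""
    match = re.search(r"Local System ID:\s*(\d+)", disk_show_output)
    if match is None:
        raise ValueError("no 'Local System ID' line found")
    return match.group(1)


def build_reassign_command(old_system_id: str) -> str:
    """Compose the step 4 command text; the old system ID must be purely numeric."""
    if not old_system_id.isdigit():
        raise ValueError(f"system ID must be numeric: {old_system_id!r}")
    return f"disk reassign -s {old_system_id}"


def unexpected_owner_ids(disk_show_a_output: str, new_system_id: str) -> set[str]:
    """Return every parenthesised system ID that differs from the new system ID.

    An empty set means every listed disk shows the new system ID.
    Assumption: IDs appear as "(digits)", as in the sample output above.
    """
    found = set(re.findall(r"\((\d+)\)", disk_show_a_output))
    return found - {new_system_id}


if __name__ == "__main__":
    # Sample text copied from the examples in this procedure.
    sample = (
        "Local System ID: 118065481\n"
        "disk_name system-1 (118065481) Pool0 J8Y0TDZC system-1 (118065481)\n"
        "disk_name system-1 (118065481) Pool0 J8Y09DXC system-1 (118065481)\n"
    )
    new_id = parse_local_system_id(sample)
    print(build_reassign_command("118073209"))
    print(unexpected_owner_ids(sample, new_id))
```

If unexpected_owner_ids returns a non-empty set, some disks still show a different system ID and the reassignment should be reviewed before you boot the replacement node.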

Complete system restoration - FAS8200


To restore your system to full operation, you must restore the NetApp Storage Encryption
configuration (if necessary), install licenses for the new controller, and return the
failed part to NetApp, as described in the RMA instructions shipped with the kit.

Step 1: Install licenses for the replacement controller in ONTAP


You must install new licenses for the replacement node if the impaired node was using ONTAP features that
require a standard (node-locked) license. For features with standard licenses, each node in the cluster should
have its own key for the feature.

About this task


Until you install license keys, features requiring standard licenses continue to be available to the replacement
node. However, if the impaired node was the only node in the cluster with a license for the feature, no
configuration changes to the feature are allowed. Also, using unlicensed features on the node might put you
out of compliance with your license agreement, so you should install the replacement license key or keys on
the replacement node as soon as possible.

Before you begin


The license keys must be in the 28-character format.

You have a 90-day grace period in which to install the license keys. After the grace period, all old licenses are
invalidated. After a valid license key is installed, you have 24 hours to install all of the keys before the grace
period ends.

If your system was initially running ONTAP 9.10.1 or later, use the procedure documented in
Post Motherboard Replacement Process to update Licensing on a AFF/FAS system. If you are
unsure of the initial ONTAP release for your system, see NetApp Hardware Universe for more
information.

Steps
1. If you need new license keys, obtain replacement license keys on the NetApp Support Site in the My
Support section under Software licenses.

The new license keys that you require are automatically generated and sent to the email
address on file. If you fail to receive the email with the license keys within 30 days, you
should contact technical support.

2. Install each license key: system license add -license-code license-key, license-key...
3. Remove the old licenses, if desired:
a. Check for unused licenses: license clean-up -unused -simulate

b. If the list looks correct, remove the unused licenses: license clean-up -unused
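
Before you install the keys, you might want to check them against the 28-character format and assemble the command text in one place. The following is a minimal Python sketch with hypothetical function names. It checks length only, because this procedure does not describe the allowed characters, and it assumes the keys are joined with commas and no spaces.

```python
LICENSE_KEY_LENGTH = 28  # from "Before you begin": keys use the 28-character format


def invalid_license_keys(keys: list[str]) -> list[str]:
    """Return the keys whose length, after trimming whitespace, is not 28."""
    return [key for key in keys if len(key.strip()) != LICENSE_KEY_LENGTH]


def build_license_add_command(keys: list[str]) -> str:
    """Compose a single `system license add` command covering all keys.

    Assumption: keys are comma-separated with no spaces between them.
    """
    if not keys:
        raise ValueError("no license keys supplied")
    bad = invalid_license_keys(keys)
    if bad:
        raise ValueError(f"not in 28-character format: {bad}")
    return "system license add -license-code " + ",".join(key.strip() for key in keys)


if __name__ == "__main__":
    # Placeholder keys for illustration only; these are not real license keys.
    placeholder_keys = ["A" * 28, "B" * 28]
    print(build_license_add_command(placeholder_keys))
```

Rejecting a malformed key before it reaches the cluster keeps a copy-and-paste error from interrupting the installation partway through the list.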

Step 2: Verify LIFs and register the serial number


Before returning the replacement node to service, you should verify that the LIFs are on their home ports,
register the serial number of the replacement node if AutoSupport is enabled, and reset automatic giveback.

Steps
1. Verify that the logical interfaces are reporting to their home server and ports: network interface show
-is-home false

If any LIFs are listed as false, revert them to their home ports: network interface revert -vserver
* -lif *

2. Register the system serial number with NetApp Support.


◦ If AutoSupport is enabled, send an AutoSupport message to register the serial number.
◦ If AutoSupport is not enabled, call NetApp Support to register the serial number.
3. If an AutoSupport maintenance window was triggered, end it by using the system node autosupport
invoke -node * -type all -message MAINT=END command.
4. If automatic giveback was disabled, reenable it: storage failover modify -node local
-auto-giveback true

Step 3: Switch back aggregates in a two-node MetroCluster configuration


After you have completed the FRU replacement in a two-node MetroCluster configuration, you can perform the
MetroCluster switchback operation. This returns the configuration to its normal operating state, with the
sync-source storage virtual machines (SVMs) on the formerly impaired site now active and serving data from
the local disk pools.

This task only applies to two-node MetroCluster configurations.

Steps
1. Verify that all nodes are in the enabled state: metrocluster node show

cluster_B::> metrocluster node show

DR                           Configuration  DR
Group Cluster Node           State          Mirroring Mode
----- ------- -------------- -------------- --------- --------------------
1     cluster_A
              controller_A_1 configured     enabled   heal roots completed
      cluster_B
              controller_B_1 configured     enabled   waiting for switchback recovery
2 entries were displayed.

2. Verify that resynchronization is complete on all SVMs: metrocluster vserver show
3. Verify that any automatic LIF migrations being performed by the healing operations were completed
successfully: metrocluster check lif show
4. Perform the switchback by using the metrocluster switchback command from any node in the
surviving cluster.
5. Verify that the switchback operation has completed: metrocluster show

The switchback operation is still running when a cluster is in the waiting-for-switchback state:

cluster_B::> metrocluster show


Cluster Configuration State Mode
-------------------- ------------------- ---------
Local: cluster_B configured switchover
Remote: cluster_A configured waiting-for-switchback

The switchback operation is complete when the clusters are in the normal state:

cluster_B::> metrocluster show


Cluster Configuration State Mode
-------------------- ------------------- ---------
Local: cluster_B configured normal
Remote: cluster_A configured normal

If a switchback is taking a long time to finish, you can check on the status of in-progress baselines by using
the metrocluster config-replication resync-status show command.

6. Reestablish any SnapMirror or SnapVault configurations.
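
The completion check in step 5 can be applied to captured metrocluster show output. The following is a minimal Python sketch with hypothetical function names. It assumes the row layout shown in the examples above ("Local:" or "Remote:", then the cluster name, the configuration state, and the mode). Output from other ONTAP releases might be laid out differently.

```python
import re

# Matches rows such as "Local: cluster_B configured switchover".
_ROW = re.compile(
    r"^[ \t]*(Local|Remote):[ \t]+(\S+)[ \t]+(\S+)[ \t]+(\S+)[ \t]*$",
    re.MULTILINE,
)


def cluster_modes(metrocluster_show_output: str) -> dict[str, str]:
    """Map each cluster name to the value in its Mode column."""
    return {m.group(2): m.group(4) for m in _ROW.finditer(metrocluster_show_output)}


def switchback_complete(metrocluster_show_output: str) -> bool:
    """Return True only when both rows were found and every mode is 'normal'."""
    modes = cluster_modes(metrocluster_show_output)
    return len(modes) == 2 and all(mode == "normal" for mode in modes.values())


if __name__ == "__main__":
    # Sample text copied from the "still running" example in step 5.
    still_running = (
        "Cluster Configuration State Mode\n"
        "-------------------- ------------------- ---------\n"
        "Local: cluster_B configured switchover\n"
        "Remote: cluster_A configured waiting-for-switchback\n"
    )
    print(cluster_modes(still_running))
    print(switchback_complete(still_running))
```

Requiring both rows means that empty or truncated output is treated as not complete, so a failed capture is not mistaken for a finished switchback.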

Step 4: Return the failed part to NetApp


Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return
and Replacements page for further information.

Copyright information

Copyright © 2024 NetApp, Inc. All Rights Reserved. Printed in the U.S. No part of this document covered by
copyright may be reproduced in any form or by any means—graphic, electronic, or mechanical, including
photocopying, recording, taping, or storage in an electronic retrieval system—without prior written permission
of the copyright owner.

Software derived from copyrighted NetApp material is subject to the following license and disclaimer:

THIS SOFTWARE IS PROVIDED BY NETAPP “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL
NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

NetApp reserves the right to change any products described herein at any time, and without notice. NetApp
assumes no responsibility or liability arising from the use of products described herein, except as expressly
agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any
patent rights, trademark rights, or any other intellectual property rights of NetApp.

The product described in this manual may be protected by one or more U.S. patents, foreign patents, or
pending applications.

LIMITED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set
forth in subparagraph (b)(3) of the Rights in Technical Data -Noncommercial Items at DFARS 252.227-7013
(FEB 2014) and FAR 52.227-19 (DEC 2007).

Data contained herein pertains to a commercial product and/or commercial service (as defined in FAR 2.101)
and is proprietary to NetApp, Inc. All NetApp technical data and computer software provided under this
Agreement is commercial in nature and developed solely at private expense. The U.S. Government has a non-
exclusive, non-transferrable, nonsublicensable, worldwide, limited irrevocable license to use the Data only in
connection with and in support of the U.S. Government contract under which the Data was delivered. Except
as provided herein, the Data may not be used, disclosed, reproduced, modified, performed, or displayed
without the prior written approval of NetApp, Inc. United States Government license rights for the Department
of Defense are limited to those rights identified in DFARS clause 252.227-7015(b) (FEB 2014).

Trademark information

NETAPP, the NETAPP logo, and the marks listed at http://www.netapp.com/TM are trademarks of NetApp, Inc.
Other company and product names may be trademarks of their respective owners.
