
VMware vSAN 8.0

Table of Contents
Release Notes.....................................................................................................................................12
VMware vSAN 8.0 Update 3 Release Notes................................................................................................................ 12
VMware vSAN 8.0 Update 2 Release Notes................................................................................................................ 16
VMware vSAN 8.0 Update 1 Release Notes................................................................................................................ 26
VMware vSAN 8.0 Release Notes.................................................................................................................................38
vSAN Planning and Deployment...................................................................................................... 50
Updated Information...................................................................................................................................................... 50
What Is vSAN................................................................................................................................................................ 50
vSAN Concepts......................................................................................................................................................... 51
Characteristics of vSAN.....................................................................................................................................51
vSAN Terms and Definitions..............................................................................................................................53
How vSAN Differs from Traditional Storage...................................................................................................... 56
Building a vSAN Cluster............................................................................................................................................... 56
vSAN Deployment Options....................................................................................................................................... 58
Integrate vSAN with Other VMware Software............................................................................................................. 59
Limitations of vSAN....................................................................................................................................................... 60
Requirements for Enabling vSAN................................................................................................................................ 60
Hardware Requirements for vSAN.......................................................................................................................... 60
Cluster Requirements for vSAN................................................................................................................................62
Software Requirements for vSAN.............................................................................................................................62
Networking Requirements for vSAN......................................................................................................................... 62
License Requirements...............................................................................................................................................62
Per TiB License for vSAN................................................................................................................................. 63
VMware Cloud Foundation License for vSAN...................................................................................................63
VMware vSphere Foundation Capacity License for vSAN................................................................................63
Per CPU License for vSAN............................................................................................................................... 64
Per Core License for vSAN............................................................................................................................... 64
Designing and Sizing a vSAN Cluster......................................................................................................................... 65
Designing and Sizing vSAN Storage........................................................................................................................65
Planning Capacity in vSAN............................................................................................................................... 65
Design Considerations for Flash Caching Devices in vSAN.............................................................................67
Design Considerations for Flash Capacity Devices in vSAN............................................................................ 68
Design Considerations for Magnetic Disks in vSAN......................................................................................... 69
Design Considerations for Storage Controllers in vSAN...................................................................................70
Designing and Sizing vSAN Hosts........................................................................................................................... 70
Design Considerations for a vSAN Cluster.............................................................................................. 71
Designing the vSAN Network................................................................................................................... 72
Creating Static Routes for vSAN Networking....................................................................................................74
Best Practices for vSAN Networking.................................................................................................................75
Designing and Sizing vSAN Fault Domains............................................................................................................. 75
Using Boot Devices and vSAN.................................................................................................................................76
Persistent Logging in a vSAN Cluster...................................................................................................................... 77
Preparing a New or Existing Cluster for vSAN.......................................................................................................... 77
Preparing Storage..................................................................................................................................................... 77
Verify the Compatibility of Storage Devices...................................................................................................... 77
Preparing Storage Devices................................................................................................................................77
Preparing Storage Controllers........................................................................................................................... 79
Mark Flash Devices as Capacity Using ESXCLI.............................................................................................. 80
Untag Flash Devices Used as Capacity Using ESXCLI................................................................................... 81
Mark Flash Devices as Capacity Using RVC.................................................................................................. 81
Providing Memory for vSAN..................................................................................................................................... 82
Preparing Your Hosts for vSAN................................................................................................................................ 82
vSAN and vCenter Server Compatibility.................................................................................................................. 82
Configuring the vSAN Network................................................................................................................................. 82
Creating a Single Site vSAN Cluster........................................................................................................................... 83
Characteristics of a vSAN Cluster............................................................................................................................ 83
Before Creating a vSAN Cluster...............................................................................................................................84
Using Quickstart to Configure and Expand a vSAN Cluster.................................................................................... 85
Use Quickstart to Configure a vSAN Cluster.................................................................................................... 87
Manually Enabling vSAN.......................................................................................................................................... 89
Set Up a VMkernel Network for vSAN.............................................................................................................. 90
Create a vSAN Cluster...................................................................................................................................... 90
Configure a Cluster for vSAN Using the vSphere Client.................................................................................. 90
Edit vSAN Settings............................................................................................................................................ 93
Enable vSAN on an Existing Cluster.................................................................................................................95
Configure License Settings for a vSAN Cluster....................................................................................................... 96
View a Subscribed Feature for a vSAN Cluster.......................................................................................................96
View vSAN Datastore............................................................................................................................................... 96
Using vSAN and vSphere HA...................................................................................................................................97
Deploying vSAN with vCenter Server.......................................................................................................................99
Turn Off vSAN...........................................................................................................................................................99
Creating a vSAN Stretched Cluster or Two-Node vSAN Cluster............................................................................ 100
What Are vSAN Stretched Clusters........................................................................................................................100
vSAN Stretched Cluster Design Considerations............................................................................................. 102
Best Practices for Working with vSAN Stretched Clusters............................................................................. 102
vSAN Stretched Clusters Network Design...................................................................................... 103
What Are Two-Node vSAN Clusters.......................................................................................................103
Use Quickstart to Configure a vSAN Stretched Cluster or Two-Node vSAN Cluster............................................. 104
Manually Configure vSAN Stretched Cluster..........................................................................................................106
Change the Preferred Fault Domain...................................................................................................................... 107
Deploying a vSAN Witness Appliance....................................................................................................................107
Set Up the vSAN Network on the Witness Appliance.....................................................................................108
Configure Management Network on the Witness Appliance...........................................................................108
Configure Network Interface for Witness Traffic..............................................................................................109
Change the Witness Host....................................................................................................................................... 111
Convert a vSAN Stretched Cluster to a Single Site vSAN Cluster........................................................................ 111
vSAN Network Design..................................................................................................................... 112
What is vSAN Network................................................................................................................................................ 112
Understanding vSAN Networking...............................................................................................................................114
vSAN Network Characteristics................................................................................................................................ 115
ESXi Traffic Types...................................................................................................................................................116
Network Requirements for vSAN............................................................................................................................117
Physical NIC Requirements.............................................................................................................................117
Bandwidth and Latency Requirements............................................................................................................118
Layer 2 and Layer 3 Support.......................................................................................................................... 119
Routing and Switching Requirements............................................................................................................. 119
vSAN Network Port Requirements.................................................................................................................120
Network Firewall Requirements.......................................................................................................................121
Using Unicast in vSAN Network.................................................................................................................................121
Pre-Version 5 Disk Group Behavior....................................................................................................................... 121
Version 5 Disk Group Behavior.............................................................................................................................. 122
DHCP Support on Unicast Network....................................................................................................................... 122
IPv6 Support on Unicast Network.......................................................................................................................... 122
Query Unicast with ESXCLI....................................................................................................................................122
View the Communication Modes..................................................................................................................... 122
Verify the vSAN Cluster Hosts........................................................................................................................ 123
View the vSAN Network Information............................................................................................................... 123
Intra-Cluster Traffic..................................................................................................................................................124
Intra-Cluster Traffic in a Single Rack.............................................................................................................. 124
Intra-Cluster Traffic in a vSAN Stretched Cluster............................................................................................125
Configuring IP Network Transport............................................................................................................................. 125
vSphere TCP/IP Stacks.......................................................................................................................................... 125
vSphere RDMA....................................................................................................................................................... 127
IPv6 Support............................................................................................................................................................127
Static Routes........................................................................................................................................................... 127
Jumbo Frames........................................................................................ 128
Using VMware NSX with vSAN...................................................................................................................128
Using Congestion Control and Flow Control............................................................................................................128
Basic NIC Teaming, Failover, and Load Balancing.................................................................................................. 129
Basic NIC Teaming................................................................................................................................................. 130
Configure Load Balancing for NIC Teams.......................................................................................................131
Advanced NIC Teaming............................................................................................................................................... 132
Link Aggregation Group Overview..........................................................................................................................133
Static and Dynamic Link Aggregation............................................................................................................. 134
Static LACP with Route Based on IP Hash.................................................................................................... 135
Understanding Network Air Gaps........................................................................................................................... 136
Pros and Cons of Air Gap Network Configurations with vSAN..............................................................................136
NIC Teaming Configuration Examples....................................................................................................................137
Configuration 1: Single vmknic, Route Based on Physical NIC Load............................................................. 137
Configuration 2: Multiple vmknics, Route Based on Originating Port ID.........................................................138
Configuration 3: Dynamic LACP......................................................................................................................141
Configuration 4: Static LACP – Route Based on IP Hash.............................................................................. 145
Network I/O Control..................................................................................................................................................... 148
Network I/O Control Configuration Example...........................................................................................................149
Understanding vSAN Network Topologies................................................................................................................ 150
Standard Deployments............................................................................................................................................150
vSAN Stretched Cluster Deployments....................................................................................................................153
Two Node vSAN Deployments............................................................................................................................... 158
Configuration of Network from Data Sites to Witness Host................................................................................... 160
Corner Case Deployments......................................................................................................................................161
Troubleshooting the vSAN Network...........................................................................................................................162
Using Multicast in vSAN Network.............................................................................................................................. 170
Internet Group Management Protocol.................................................................................................................... 170
Protocol Independent Multicast...............................................................................................................................171
Networking Considerations for vSAN File Service.................................................................................................. 171
Networking Considerations for iSCSI on vSAN........................................................................................................172
Characteristics of vSAN iSCSI Network................................................................................................................. 173
Migrating from Standard to Distributed vSwitch......................................................................................................173
Checklist Summary for vSAN Network......................................................................................................................177
Administering VMware vSAN........................................................................................................ 179
Updated Information.................................................................................................................................................... 179
What Is vSAN.............................................................................................................................................................. 179
vSAN Concepts....................................................................................................................................................... 179
Characteristics of vSAN...................................................................................................................................180
vSAN Terms and Definitions............................................................................................................................181
How vSAN Differs from Traditional Storage.................................................................................... 185
Building a vSAN Cluster............................................................................................................................. 185
vSAN Deployment Options..................................................................................................................................... 186
Integrate vSAN with Other VMware Software........................................................................................................... 188
Limitations of vSAN..................................................................................................................................................... 188
Configuring and Managing a vSAN Cluster.............................................................................................................. 189
Configure a Cluster for vSAN Using the vSphere Client........................................................................................189
Enable vSAN on an Existing Cluster......................................................................................................................192
Turn Off vSAN.........................................................................................................................................................192
Edit vSAN Settings..................................................................................................................................................192
View vSAN Datastore............................................................................................................................................. 194
Upload Files or Folders to vSAN Datastores......................................................................................................... 195
Download Files or Folders from vSAN Datastores.................................................................................................196
Using vSAN Policies.................................................................................................................................................... 196
What are vSAN Policies......................................................................................................................................... 197
How vSAN Manages Policy Changes.................................................................................................................... 201
View vSAN Storage Providers................................................................................................................................ 201
What are vSAN Default Storage Policies............................................................................................................... 202
Change the Default Storage Policy for vSAN Datastores...................................................................................... 204
Define a Storage Policy for vSAN Using vSphere Client....................................................................................... 205
Expanding and Managing a vSAN Cluster................................................................................................................ 206
Expanding a vSAN Cluster..................................................................................................................................... 207
Expanding vSAN Cluster Capacity and Performance..................................................................................... 207
Use Quickstart to Add Hosts to a vSAN Cluster.............................................................................................207
Add a Host to the vSAN Cluster..................................................................................................................... 208
Configuring Hosts in the vSAN Cluster Using Host Profile............................................................................. 209
Sharing Remote vSAN Datastores......................................................................................................................... 210
View Remote vSAN Datastores.......................................................................................................................216
Mount Remote vSAN Datastore...................................................................................................................... 217
Unmount Remote vSAN Datastore..................................................................................................................217
Monitor Datastore Sharing with vSphere Client.............................................................................................. 217
Add Remote vCenter as Datastore Source.....................................................................................................219
Working with Members of the vSAN Cluster in Maintenance Mode...................................................................... 219
Check the Data Migration Capabilities of a Host in the vSAN Cluster............................................................220
Place a Member of vSAN Cluster in Maintenance Mode............................................................................... 221
Managing Fault Domains in vSAN Clusters........................................................................................................... 223
Create a New Fault Domain in vSAN Cluster................................................................................................224
Move Host into Selected Fault Domain in vSAN Cluster...............................................................................225
Move Hosts out of a Fault Domain in vSAN Cluster...................................................................................... 225
Rename a Fault Domain in vSAN Cluster...................................................................................................... 225
Remove Selected Fault Domains from vSAN Cluster................................................................... 226
Tolerate Additional Failures with Fault Domain in vSAN Cluster................................................... 226
Using vSAN Data Protection...................................................................................................................................226
Deploying the Snapshot Service Appliance.................................................................................................... 229
Create a vSAN Data Protection Group........................................................................................................... 230
Delete vSAN Snapshots.................................................................................................................................. 232
Restore a VM from a vSAN Snapshot............................................................................................................ 232
Clone a VM from a vSAN Snapshot............................................................................................................... 233
Using the vSAN iSCSI Target Service....................................................................................................................233
Enable the vSAN iSCSI Target Service.......................................................................................................... 234
Create a vSAN iSCSI Target........................................................................................................................... 234
Add a LUN to a vSAN iSCSI Target............................................................................................................... 235
Resize a LUN on a vSAN iSCSI Target..........................................................................................................235
Create a vSAN iSCSI Initiator Group.............................................................................................................. 235
Assign a Target to a vSAN iSCSI Initiator Group........................................................................................... 236
Turn Off the vSAN iSCSI Target Service........................................................................................................ 236
Monitor vSAN iSCSI Target Service................................................................................................................ 237
vSAN File Service................................................................................................................................................... 237
Limitations and Considerations of vSAN File Service.....................................................................................238
Enable vSAN File Service............................................................................................................................... 239
Configure vSAN File Service........................................................................................................................... 241
Edit vSAN File Service.................................................................................................................................... 245
Create a vSAN File Share...............................................................................................................................246
View vSAN File Shares................................................................................................................................... 247
Access vSAN File Shares............................................................................................................................... 247
Edit a vSAN File Share................................................................................................................................... 249
Manage SMB File Share on vSAN Cluster..................................................................................................... 249
Delete a vSAN File Share............................................................................................................................... 250
vSAN Distributed File System Snapshot......................................................................................................... 250
Rebalance Workload on vSAN File Service Hosts......................................................................................... 251
Reclaiming Space with Unmap in vSAN Distributed File System................................................................... 252
Upgrade vSAN File Service.............................................................................................................................252
Monitor Performance of vSAN File Service.................................................................................................... 253
Monitor vSAN File Share Capacity..................................................................................................................254
Monitor vSAN File Service and File Share Health.......................................................................................... 254
Migrate a Hybrid vSAN Cluster to an All-Flash Cluster......................................................................................... 254
Shutting Down and Restarting the vSAN Cluster...................................................................................................255
Shut Down the vSAN Cluster Using the Shutdown Cluster Wizard................................................................256
Restart the vSAN Cluster................................................................................................................................ 256
Manually Shut Down and Restart the vSAN Cluster.......................................................................................257
Device Management in a vSAN Cluster.....................................................................................................259
Managing Storage Devices in vSAN Cluster..........................................................................................259
Create a Disk Group or Storage Pool in vSAN Cluster.................................................................................. 260
Claim Storage Devices for vSAN Original Storage Architecture Cluster.........................................................261
Claim Storage Devices for vSAN Express Storage Architecture Cluster........................................................ 261
Claim Disks for vSAN Direct........................................................................................................................... 262
Working with Individual Devices in vSAN Cluster.................................................................................................. 262
Add Devices to the Disk Group in vSAN Cluster............................................................................................263
Check a Disk or Disk Group's Data Migration Capabilities from vSAN Cluster.............................................. 263
Remove Disk Groups or Devices from vSAN................................................................................................. 264
Recreate a Disk Group in vSAN Cluster.........................................................................................................265
Using Locator LEDs in vSAN.........................................................................................................................265
Mark Devices as Flash in vSAN..................................................................................................................... 266
Mark Devices as HDD in vSAN...................................................................................................................... 267
Mark Devices as Local in vSAN......................................................................................................................267
Mark Devices as Remote in vSAN..................................................................................................................268
Add a Capacity Device to vSAN Disk Group.................................................................................................. 268
Remove Partition From Devices......................................................................................................................269
Increasing Space Efficiency in a vSAN Cluster........................................................................................................269
vSAN Space Efficiency Features............................................................................................................................269
Reclaiming Storage Space in vSAN with SCSI Unmap......................................................................................... 269
Using Deduplication and Compression in vSAN Cluster........................................................................................270
Deduplication and Compression Design Considerations in vSAN Cluster......................................................272
Enable Deduplication and Compression on a New vSAN Cluster.................................................................. 272
Enable Deduplication and Compression on an Existing vSAN Cluster...........................................................272
Disable Deduplication and Compression on vSAN Cluster.............................................................................273
Reduce VM Redundancy for vSAN Cluster.................................................................................................... 273
Add or Remove Disks with Deduplication and Compression Enabled............................................................274
Using RAID 5 or RAID 6 Erasure Coding in vSAN Cluster................................................................................... 274
RAID 5 or RAID 6 Design Considerations in vSAN Cluster...................................................................................275
Using Encryption in a vSAN Cluster......................................................................................................................... 275
vSAN Data-In-Transit Encryption............................................................................................................................ 275
Enable Data-In-Transit Encryption on a vSAN Cluster................................................................................... 276
vSAN Data-At-Rest Encryption............................................................................................................................... 276
How vSAN Data-At-Rest Encryption Works.................................................................................................... 276
Design Considerations for vSAN Data-At-Rest Encryption............................................................................. 277
Set Up the Standard Key Provider..................................................................................................................278
Enable Encryption on a New vSAN Cluster....................................................................................................283
Generate New Encryption Keys...................................................................................................................... 284
Enable vSAN Encryption on Existing vSAN Cluster....................................................................................... 284
vSAN Encryption and Core Dumps.................................................................285
Upgrading the vSAN Cluster.......................................................................................................................287
Before You Upgrade vSAN..................................................................................................................................... 287
Upgrade the vCenter Server...................................................................................................................................289
Upgrade the ESXi Hosts.........................................................................................................................................289
About the vSAN Disk Format................................................................................................................................. 289
Upgrading vSAN Disk Format Using vSphere Client...................................................................................... 290
Upgrade vSAN Disk Format Using RVC......................................................................................................... 291
Verify the vSAN Disk Format Upgrade............................................................................................................292
About vSAN Object Format.................................................................................................................................... 292
Verify the vSAN Cluster Upgrade........................................................................................................................... 293
Using the RVC Upgrade Command Options During vSAN Cluster Upgrade.........................................................293
vSAN Build Recommendations for vSphere Lifecycle Manager............................................................................ 294
vSAN Monitoring and Troubleshooting......................................................................................... 296
What Is vSAN.............................................................................................................................................................. 296
Monitoring the vSAN Cluster...................................................................................................................................... 296
Monitor vSAN Capacity...........................................................................................................................................296
Monitor Physical Devices in vSAN Cluster.............................................................................................................301
Monitor Devices that Participate in vSAN Datastores............................................................................................ 301
Monitor Virtual Objects in vSAN Cluster................................................................................................................ 301
Monitor Container Volumes in vSAN Cluster......................................................................................................... 302
About Reserved Capacity in vSAN Cluster............................................................................................................ 302
Configure Reserved Capacity for vSAN Cluster............................................................................................. 303
About vSAN Cluster Resynchronization................................................................................................................. 305
Monitor the Resynchronization Tasks in vSAN Cluster...................................................................................305
About vSAN Cluster Rebalancing...........................................................................................................................306
Configure Automatic Rebalance in vSAN Cluster........................................................................................... 307
Using the vSAN Default Alarms............................................................................................................................. 309
View vSAN Default Alarms.............................................................................................................................. 309
View vSAN Network Alarms............................................................................................................................ 309
Using the VMkernel Observations for Creating vSAN Alarms.............................................................................. 309
Creating a vCenter Server Alarm for a vSAN Event.......................................................................................310
Monitoring vSAN Skyline Health................................................................................................................................ 311
About the vSAN Skyline Health..............................................................................................................................312
Check vSAN Skyline Health................................................................................................................................... 313
Monitor vSAN from ESXi Host Client..................................................................................................................... 314
Proactive Tests on vSAN Cluster........................................................................................................................... 314
Managing Proactive Hardware.................................................................................................................................... 314
About Hardware Support Managers....................................................................................................................... 315
Deploying and Configuring Hardware Support Managers...................................................................................... 315
Registering Hardware Support Manager................................................................................................ 315
Associating and Dissociating Hosts........................................................................................................315
Processing Hardware Failures................................................................................................................................316
Monitoring vSAN Performance................................................................................................................................... 316
About the vSAN Performance Service................................................................................................................... 316
Configure vSAN Performance Service................................................................................................................... 317
Use Saved Time Range in vSAN Cluster.............................................................................................................. 318
View vSAN Cluster Performance............................................................................................................................318
View vSAN Host Performance................................................................................................................................320
View vSAN VM Performance..................................................................................................................................322
Use vSAN I/O Insight..............................................................................................................................................322
View vSAN I/O Insight Metrics........................................................................................................................ 323
Use vSAN I/O Trip Analyzer................................................................................................................................... 325
View vSAN Performance Metrics for Support Cases............................................................................................. 326
Using vSAN Performance Diagnostics................................................................................................................... 327
View vSAN Obfuscation Map..................................................................................................................................327
Handling Failures and Troubleshooting vSAN......................................................................................................... 327
Uploading a vSAN Support Bundle........................................................................................................................ 328
Using Esxcli Commands with vSAN.......................................................................................................................328
Using vsantop Command-Line Tool........................................................................................................................331
vSAN Configuration on an ESXi Host Might Fail................................................................................................... 331
Not Compliant Virtual Machine Objects Do Not Become Compliant Instantly........................................................332
vSAN Cluster Configuration Issues........................................................................................................................ 332
Handling Failures in vSAN......................................................................................................................................333
Failure Handling in vSAN................................................................................................................................ 333
Troubleshooting vSAN..................................................................................................................................... 341
Replacing Existing Hardware Components in vSAN Cluster.......................................................................... 344
Shutting Down and Restarting the vSAN Cluster.................................................................................................... 348
Shut Down the vSAN Cluster Using the Shutdown Cluster Wizard....................................................................... 348
Restart the vSAN Cluster....................................................................................................................................... 349
Manually Shut Down and Restart the vSAN Cluster..............................................................................................349
Documentation Legal Notice.......................................................................................... 352

Release Notes
Release notes include product enhancements and notices, bug fixes, and resolved issues.

VMware vSAN 8.0 Update 3 Release Notes


This document contains the following sections:
• Introduction
• What's in the Release Notes
• What's New
• VMware vSAN Community
• Product Support Notices
• Upgrades for This Release
• Limitations
• Known Issues

Introduction

VMware vSAN 8.0 Update 3 | 25 JUN 2024 | ISO Build 24022510


Check for additions and updates to these release notes.

What's in the Release Notes


These release notes introduce you to new features in VMware vSAN 8.0 Update 3 and provide information about known
issues.

What's New
vSAN 8.0 Update 3 introduces the following new features and enhancements:
Licensing
Capacity-based licensing. The subscription-based licensing in VMware Cloud Foundation 5.2 entitles customers to 1
Tebibyte (TiB) of vSAN capacity per VCF core license, with additional capacity available through a capacity-based add-on
license.
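As a purely illustrative example (hypothetical cluster sizes, not figures from this release): a 4-host cluster with 32
licensed cores per host carries 4 x 32 = 128 VCF core licenses, which entitles it to 128 TiB of vSAN capacity; a cluster
that needs 160 TiB of capacity would also require a 32 TiB capacity add-on license.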
Flexible Topologies
Stretched cluster support on vSAN ESA. In VMware Cloud Foundation 5.2, vSAN Express Storage Architecture fully
supports stretched cluster topologies. This enables VCF to configure workload and management domains that provide
site-level resilience for your workloads and data.
vSAN Max as principal storage. VCF 5.2 supports vSAN Max as primary, centralized shared storage. This enables VCF
to configure workload domains with disaggregated vSAN Max clusters, in addition to aggregated vSAN HCI clusters, to
increase your flexibility.
vSAN File Services up to 250 file shares. vSAN 8.0 Update 3 improves the scalability of native File Services in VMware
Cloud Foundation by increasing the number of shares per cluster to 250 on vSAN ESA.
Data Protection
vSAN local data protection leveraging ESA scalable snapshots. vSAN data protection enables you to capture local
snapshots using an intuitive new UI, and store them on your vSAN datastore. Use protection groups to easily define VM
membership, snapshot schedules, retention, and immutability criteria for VMs. You can use these snapshots to revert,
restore, recover, or clone VMs for enhanced levels of protection.
Integration with VMware Live Cyber Recovery (VLCR) for faster ransomware recovery. vSAN data protection
integration with VMware Live Cyber Recovery (VLCR) captures point-in-time snapshots for cloud-based ransomware
protection, and VLCR provides the tools to protect data off-site and recover VMs in an isolated recovery environment
(IRE) for analysis before restoring them on-premises. VLCR uses a local snapshot and only updates the changes (deltas)
from the IRE to production, drastically reducing restore times.

Improved Resiliency
Congestion remediation. vSAN 8.0 Update 3 enhances vSAN OSA's ability to detect and remediate various types of
congestion early, preventing cluster-wide I/O latencies.
Adaptive delete congestion. vSAN now provides adaptive delete congestion for compression-only disk groups in vSAN
OSA, improving IOPS performance and delivering more predictable application responses.
Enhanced Management
Proactive hardware management. vSAN 8.0 Update 3 introduces a new method for collecting critical storage device
telemetry from preferred server vendors, enabling predictive management of hardware issues. Proactive Hardware
Management leverages OEM vendors' predictive failure monitoring tools integrated into vCenter via API (after the OEM
Hardware Support Manager is set up and configured) to help you make more informed decisions about hardware
maintenance.
Data-at-rest encryption disable for vSAN ESA. You can disable data-at-rest encryption on vSAN ESA clusters at any
point after enabling it. vSAN ESA now supports the following operations for data-at-rest encryption: enable encryption,
disable encryption, shallow rekey, and deep rekey.
Customizable alarm thresholds for NVMe storage devices in vSAN ESA. vSAN 8.0 Update 3 enables you to
customize alert thresholds for device endurance, tailoring them to specific clusters, hosts, disk vendors, or even individual
devices.
vSAN I/O Trip Analyzer cluster level view. vSAN 8.0 Update 3 enables you to run performance analysis on multiple
VMs simultaneously. You can select up to 8 VMs at once, and quickly perform analysis on each selected VM.
Enhanced awareness of vSAN Max using Aria Operations. VMware Cloud Foundation Operations with VCF 5.2
introduces enhanced visibility for vSAN Max clusters throughout its user interface. This enables Aria Operations to track
resource utilization and health status.
Federated vSAN health monitoring in Aria Operations. The latest Aria Operations introduces federated vSAN cluster
health monitoring for clusters spanning across multiple vCenters.

VMware vSAN Community


Use the vSAN Community Web site to provide feedback and request assistance with any problems you find while using
vSAN.

Product Support Notices

Deprecation of locales:


Beginning with the next major release, we will be reducing the number of supported localization languages. The three
supported languages will be:
• Japanese
• Spanish
• French
The following languages will no longer be supported:
• Italian, German, Brazilian Portuguese, Traditional Chinese, Korean, Simplified Chinese
Impact:
• Users who have been using the deprecated languages will no longer receive updates or support in these languages.
• All user interfaces, help documentation, and customer support will be available in the three supported languages
mentioned above.

Upgrades for This Release


For instructions about upgrading vSAN, see the VMware vSAN 8.0 Update 3 documentation.
Note: Before performing the upgrade, please review the most recent version of the VMware Compatibility Guide to
validate that the latest vSAN version is available for your platform.
vSAN 8.0 Update 3 is a new release that requires a full upgrade to vSphere 8.0 Update 3. Perform the following tasks to
complete the upgrade:
1. Upgrade to vCenter Server 8.0 Update 3. For more information, see the VMware vSphere 8.0 Update 3 Release
Notes.
2. Upgrade hosts to ESXi 8.0 Update 3. For more information, see the VMware vSphere 8.0 Update 3 Release Notes.
3. Upgrade the vSAN on-disk format to version 20.0. If upgrading from on-disk format version 3.0 or later, no data
evacuation is required (metadata update only).
4. Upgrade FSVM to enable new File Service features and get all the latest updates.
Note: vSAN retired disk format version 1.0 in vSAN 7.0 Update 1. Disks running disk format version 1.0 are no
longer recognized by vSAN, and upgrades to vSAN 7.0 Update 1 through vSphere Update Manager, ISO install, or esxcli
are blocked while such disks are present. To avoid these issues, upgrade disks running disk format version 1.0 to a
higher version. If you have disks on version 1.0, a health check alerts you to upgrade the disk format version.
Disk format version 1.0 does not have performance and snapshot enhancements, and it lacks support for advanced
features including checksum, deduplication and compression, and encryption. For more information about vSAN disk
format version, see KB 2148493.
Upgrading the On-disk Format for Hosts with Limited Capacity
During an upgrade of the vSAN on-disk format from version 1.0 or 2.0, a disk group evacuation is performed. The disk
group is removed and upgraded to on-disk format version 17.0, and the disk group is added back to the cluster. For two-
node or three-node clusters, or clusters without enough capacity to evacuate each disk group, select Allow Reduced
Redundancy from the vSphere Client. You also can use the following RVC command to upgrade the on-disk format:
vsan.ondisk_upgrade --allow-reduced-redundancy
When you allow reduced redundancy, your VMs are unprotected for the duration of the upgrade, because this method
does not evacuate data to the other hosts in the cluster. It removes each disk group, upgrades the on-disk format, and
adds the disk group back to the cluster. All objects remain available, but with reduced redundancy.
If you enable deduplication and compression during the upgrade, you can select Allow Reduced Redundancy from the
vSphere Client.
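For reference, the RVC command above is run from a Ruby vSphere Console session on the vCenter Server. The following sketch is illustrative only; the vCenter address, datacenter, and cluster names are placeholders that must be replaced with values from your environment.
# Open an RVC session to the vCenter Server
rvc administrator@vsphere.local@vcsa.example.com
# Navigate to the datacenter's compute resources and run the upgrade with reduced redundancy
cd /vcsa.example.com/Datacenter/computers
vsan.ondisk_upgrade vSAN-Cluster --allow-reduced-redundancy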


Limitations
For information about maximum configuration limits for the vSAN 8.0 Update 3 release, see the Configuration Maximums
documentation.

Known Issues

vSAN ESA: Scheduled snapshots missing in Snapshots list


Some processing operations can cause a VM in a data protection group to miss a scheduled snapshot. This is expected
behavior, and does not affect future scheduled snapshots.
Workaround: None.
vSAN ESA: Data protection does not activate if VM has vSphere snapshots
This issue occurs when you take a vSphere snapshot of a VM before vSAN takes any data protection snapshots. vSAN
data protection cannot be activated for the VM, and vSAN does not take any scheduled or manual vSAN snapshots.
Workaround: Delete the vSphere snapshot.
Failed vCenter Server tasks while creating vSAN File share at high concurrency
While creating vSAN file shares at high concurrency (more than 32 threads), the vCenter Server tasks might fail with the
error: Operation not allowed in current state. This error does not impact the back-end execution, and the corresponding
host tasks can succeed.
Workaround: Verify the health status of the created file shares using the vSAN health check. Close the current vCenter
Server session and open a new session.

vSAN File Service does not support NFSv4 delegations


vSAN File Service does not support NFSv4 delegations in this release.
Workaround: None.
In stretched cluster, file server with no affinity cannot rebalance
In the stretched cluster vSAN File Service environment, a file server with no affinity location configured cannot be
rebalanced between Preferred ESXi hosts and Non-preferred ESXi hosts.
Workaround: Set the affinity location of the file server to Preferred or Non-Preferred by editing the file service domain
configuration.
Deleting files in a file share might not be reflected in vSAN capacity view
The allocated blocks might not be returned to vSAN storage immediately after all the files are deleted, so it can take
some time before the reclaimed storage capacity is reflected in the vSAN capacity view. When new data is written to the
same file share, the deleted blocks might be reused before they are returned to vSAN storage.
If unmap is enabled and vSAN deduplication is disabled, the space might not be freed back to vSAN unless 4 MB-aligned
regions are freed in VDFS. If unmap is enabled and vSAN deduplication is enabled, space freed by VDFS is returned to
vSAN after a delay.

Workaround: To release the storage back to vSAN immediately, delete the file shares.

vSAN allows a VM to be provisioned across local and remote datastores


vSphere does not prevent users from provisioning a VM across local and remote datastores in an HCI Mesh environment.
For example, you can provision one VMDK on the local vSAN datastore and one VMDK on remote vSAN datastore. This
is not supported because vSphere HA is not supported with this configuration.


Workaround: Do not provision a VM across local and remote datastores.


Power Off VMs fails with Question Pending
If a VM has a pending question, you are not allowed to do any VM-related operations until the question is answered.
Workaround: Try to free the disk space on the relevant volume, and then click Retry.
vSAN OSA: In deduplication clusters, reactive rebalancing might not happen when the disks show more than
80% full
In deduplication clusters, when the disks display more than 80% full on the dashboard, the reactive rebalancing might not
start as expected. This is because in deduplication clusters, pending writes and deletes are also considered for calculating
the free capacity.
Workaround: None.
Host failure when converting data host into witness host
When you convert a vSAN cluster into a stretched cluster, you must provide a witness host. You can convert a data host
into the witness host, but you must use maintenance mode with Full data migration during the process. If you place the
host into maintenance mode with Ensure accessibility option, and then configure it as the witness host, the host might
fail with a purple diagnostic screen.
Workaround: Remove the disk group on the witness host and then re-create the disk group.
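A hedged command-line sketch of this workaround on the witness host follows; the device identifiers are placeholders, and the same steps can also be performed from the vSphere Client.
# List vSAN storage on the witness host to identify the cache device of its disk group
esxcli vsan storage list
# Remove the disk group by specifying its cache-tier (SSD) device
esxcli vsan storage remove -s naa.cache_device_id
# Re-create the disk group by claiming the cache and capacity devices again
esxcli vsan storage add -s naa.cache_device_id -d naa.capacity_device_id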
Duplicate VM with the same name in vCenter Server when residing host fails during datastore migration
If a VM is undergoing storage vMotion from vSAN to another datastore, such as NFS, and the host on which
it resides encounters a failure on the vSAN network, causing HA failover of the VM, the VM might be duplicated in the
vCenter Server.
Workaround: Power off the invalid VM and unregister it from the vCenter Server.
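If you prefer the command line, a hedged sketch using the ESXi host shell follows; the Vmid is a placeholder obtained from the first command, and the duplicate can also be unregistered directly in the vSphere Client.
# List the VMs registered on the host and note the Vmid of the invalid duplicate
vim-cmd vmsvc/getallvms
# Power off the duplicate VM and unregister it
vim-cmd vmsvc/power.off <Vmid>
vim-cmd vmsvc/unregister <Vmid>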

VMware vSAN 8.0 Update 2 Release Notes


This document contains the following sections:
• Introduction
• What's in the Release Notes
• What's New
• VMware vSAN Community
• Upgrades for This Release
• Limitations
• Known Issues

Introduction

VMware vSAN 8.0 Update 2 | 21 SEP 2023 | ISO Build 22380479


Check for additions and updates to these release notes.

What's in the Release Notes


These release notes introduce you to new features in VMware vSAN 8.0 Update 2 and provide information about known
issues.

What's New
vSAN 8.0 Update 2 introduces the following new features and enhancements:


Disaggregated Storage
Enhanced topologies for disaggregation with vSAN Express Storage Architecture bring feature parity for vSAN OSA and
vSAN ESA.
vSAN ESA support for stretched clusters in disaggregated topology. vSAN ESA supports disaggregation when
using vSAN stretched clusters. In addition to supporting several stretched cluster configurations, vSAN also optimizes the
network paths for certain topologies to improve the performance capabilities of stretched cluster configurations.
Support of disaggregation across clusters using multiple vCenter Servers. vSAN 8.0 Update 2 supports
disaggregation across environments using multiple vCenter Servers when using vSAN ESA. This enables vSphere or
vSAN clusters managed by one vCenter Server to use the storage resources of a vSAN cluster managed by a different
vCenter Server.
vSAN ESA Adaptive Write path for disaggregated storage. Disaggregated deployments get the performance benefits
of a new adaptive write path previously introduced in vSAN 8.0 Update 1 for standard ESA based deployments. VMs
running on a vSphere or vSAN cluster that consume storage from another vSAN ESA cluster can take advantage of this
capability. Adaptive write path technology in a disaggregated environment helps your VMs achieve higher throughput and
lower latency, and do so automatically in real time, without any interaction by the administrator.
Core Platform Enhancements
Integrated File Services for Cloud Native and traditional workloads. vSAN 8.0 Update 2 supports vSAN File Service
on vSAN Express Storage Architecture. File service clients can benefit from performance and efficiency enhancements
provided by vSAN ESA.
Adaptive Write Path optimizations in vSAN ESA. vSAN ESA introduces an adaptive write path that helps the cluster
ingest and process data more quickly. This optimization improves performance for workloads driving high I/O to single
object (VMDK), and also improves aggregate cluster performance.
Increased number of VMs per host in vSAN ESA clusters (up to 500/host). vSAN 8.0 Update 2 supports up to 500
VMs per host on vSAN ESA clusters, provided the underlying hardware infrastructure can support it. Now you can
leverage NVMe-based high performance hardware platforms optimized for the latest generation of CPUs with high core
densities, and consolidate more VMs per host.
New ReadyNode profile and support for read-intensive devices for vSAN ESA. vSAN ESA announces the availability
of new ReadyNode profiles designed for small data centers and edge environments with lower overall hardware
requirements on a per-node basis. This release also introduces support for read-intensive storage devices.
vSAN ESA support for encryption deep rekey. vSAN clusters using data-at-rest encryption have the ability to perform a
deep rekey operation. A deep rekey decrypts the data that has been encrypted and stored on a vSAN cluster using the old
encryption key, and re-encrypts the data using newly issued encryption keys prior to storing it on the vSAN cluster.

Enriched Operations
vSAN ESA prescriptive disk claim. vSAN ESA includes a prescriptive disk claim process that further simplifies
management of storage devices in each host in a vSAN cluster. This feature provides consistency to the disk claiming
process during initial deployment and cluster expansion.
Capacity reporting enhancements. Overhead breakdown in vSAN ESA space reporting displays both the ESA object
overhead and the original file system overhead.
Auto-Policy management improvements in vSAN ESA. Enhanced auto-policy management feature determines if the
default storage policy needs to be adjusted when a user adds or removes a host from a cluster. If vSAN identifies a need
to change the default storage policy, it triggers a health check warning. You can make the change with a simple click at
which time vSAN reconfigures the cluster with the new policy.
Skyline Health remediation enhancements. vSAN Skyline Health helps you reduce resolution times by providing
deployment-specific guidance along with more prescriptive guidance on how to resolve issues.


Key expiration for clusters with data-at-rest encryption. vSAN 8.0 Update 2 supports the use of KMS servers with a
key expiration attribute used for assigning an expiration date to a Key Encryption Key (KEK).
I/O top contributors enhancements. vSAN Performance Service has improved the process to find performance hot
spots over a customizable time period to help you diagnose performance issues while using multiple types of sources for
analysis (VMs, host disks, and so on).
I/O Trip Analyzer supported on two-node clusters and stretched clusters. vSAN 8.0 Update 2 has enhanced the
I/O Trip Analyzer to report on workloads in a vSAN stretched cluster. Now you can determine where the primary source of
latency is occurring in a vSAN stretched cluster, as well as latencies in other parts of the stack that can contribute to the
overall latency experienced by the VM.
Easier configuration for two-node clusters and stretched clusters. Several new features help with the management of
two-node and stretched cluster deployments.
• Witness host traffic configured in the vSphere Client.
• Support for medium sized witness host appliance in vSAN ESA.
• Support in vLCM to manage lifecycle of shared witness host appliance types.

Cloud Native Storage


CSI snapshot support for TKG service. Cloud Native Storage introduces CSI snapshot support for TKG Service,
enabling K8s users and backup vendors to take persistent volume snapshots on TKGS.
Data mobility of Cloud Native persistent volumes across datastores. This release introduces built-in migration of
persistent volumes across datastores in the vSphere Client.

VMware vSAN Community


Use the vSAN Community Web site to provide feedback and request assistance with any problems you find while using
vSAN.

Upgrades for This Release


For instructions about upgrading vSAN, see the VMware vSAN 8.0 Update 2 documentation.
Note: Before performing the upgrade, please review the most recent version of the VMware Compatibility Guide to
validate that the latest vSAN version is available for your platform.
Note: vSAN Express Storage Architecture is available only for new deployments. You cannot upgrade a cluster to vSAN
ESA.
vSAN 8.0 Update 2 is a new release that requires a full upgrade to vSphere 8.0 Update 2. Perform the following tasks to
complete the upgrade:
1. Upgrade to vCenter Server 8.0 Update 2. For more information, see the VMware vSphere 8.0 Update 2 Release
Notes.
2. Upgrade hosts to ESXi 8.0 Update 2. For more information, see the VMware vSphere 8.0 Update 2 Release Notes.
3. Upgrade the vSAN on-disk format to version 19.0. If upgrading from on-disk format version 3.0 or later, no data
evacuation is required (metadata update only).
4. Upgrade FSVM to enable new File Service features and get all the latest updates.
Note: vSAN retired disk format version 1.0 in vSAN 7.0 Update 1. Disks running disk format version 1.0 are no
longer recognized by vSAN, and upgrades to vSAN 7.0 Update 1 through vSphere Update Manager, ISO install, or esxcli
are blocked while such disks are present. To avoid these issues, upgrade disks running disk format version 1.0 to a
higher version. If you have disks on version 1.0, a health check alerts you to upgrade the disk format version.


Disk format version 1.0 does not have performance and snapshot enhancements, and it lacks support for advanced
features including checksum, deduplication and compression, and encryption. For more information about vSAN disk
format version, see KB 2148493.
Upgrading the On-disk Format for Hosts with Limited Capacity
During an upgrade of the vSAN on-disk format from version 1.0 or 2.0, a disk group evacuation is performed. The disk
group is removed and upgraded to on-disk format version 17.0, and the disk group is added back to the cluster. For two-
node or three-node clusters, or clusters without enough capacity to evacuate each disk group, select Allow Reduced
Redundancy from the vSphere Client. You also can use the following RVC command to upgrade the on-disk format:
vsan.ondisk_upgrade --allow-reduced-redundancy
When you allow reduced redundancy, your VMs are unprotected for the duration of the upgrade, because this method
does not evacuate data to the other hosts in the cluster. It removes each disk group, upgrades the on-disk format, and
adds the disk group back to the cluster. All objects remain available, but with reduced redundancy.
If you enable deduplication and compression during the upgrade, you can select Allow Reduced Redundancy from the
vSphere Client.

Limitations
For information about maximum configuration limits for the vSAN 8.0 Update 2 release, see the Configuration Maximums
documentation.

Known Issues

Failed to consolidate vmdk of vRDM on vSAN ESA cluster


This issue can occur if you use a vRDM-type virtual disk with compatibility mode set to virtual and disk mode set to
dependent. In the vSAN ESA datastore, the corresponding virtual disk cannot be consolidated in the snapshot deletion
operation after the snapshot is created. This problem can cause the virtual disk to fail to create new snapshots.
Workaround: None.
Encryption operations blocked if Native Key Provider is not backed up
If you are using Native Key Provider for vSAN data-at-rest encryption, and the Native Key Provider is not backed up,
encryption-related reconfiguration operations might fail with the following message: The KMS cluster {kmsCluster}
is a Native Key Provider that is not backed up yet .
Workaround: Back up the Native Key Provider before enabling data-at-rest encryption.
Must manually assign storage policy to restored VMs
If you restore or create a linked clone from a deleted VM, vSAN does not automatically assign a storage policy. The new
VM does not have a storage policy.
Workaround: Manually assign the storage policy after you create a linked clone VM or restore a VM.
Japanese text for Skyline Health displays incorrectly when vCenter is air gapped
This issue can occur when the vCenter network connection is air gapped. Some Japanese text for Skyline Health localizes
incorrectly. You might see a message ID such as the following: com.vmware.vsan.health.XXX
Workaround: Perform the following steps:
1. SSH to vCenter.
2. Open the following file: /etc/vmware-vsan-health/cloudHealthResources/locale/ja/vsanhealthremediation.vmsg
3. Search for the following: vsan.health.optimaldsdefaultpolicy.htf.0.default.0.des


4. Merge the two lines into one as shown below, and save the file.
5. Restart the vSAN health service: /usr/sbin/vmon-cli -r vsan-health
Before:
vsan.health.optimaldsdefaultpolicy.htf.0.default.0.des=健全性テーブルに表示される
vSAN の最適なデータストアのデフォルト ポリシーの「許容される障害の数」または「サイトの耐障害性」の構成 (ある
いは両方) が推奨ポリシーと一致しません。
After:
vsan.health.optimaldsdefaultpolicy.htf.0.default.0.des=健全性テーブルに表示される vSAN の最適なデータストアのデ
フォルト ポリシーの「許容される障害の数」または「サイトの耐障害性」の構成 (あるいは両方) が推奨ポリシーと一致
しません。
hostAffinity policy option lost during upgrade
When you upgrade from vSAN 6.7 to vSAN 8.0, the vCenter Server hostaffinity option value is changed to false.
Workaround: Set the hostaffinity option back to true to continue using vSAN HostLocal policy for a normal VM.
Cannot enable File Service if vCenter Server internet connectivity is disabled
If you disable vCenter Server internet connectivity, the Enable File Service dialog does not display File service agent
section and you cannot select OVF.
Workaround: To enable vCenter Server internet connectivity:
1. Navigate to Cluster > Configure > vSAN > Internet Connectivity.
2. Click Edit to open Edit Internet Connectivity dialog.
3. Select Enable Internet access for all vSAN clusters checkbox and click Apply.
Cannot deactivate encryption on vSAN ESA
After you enable data-at-rest encryption on a vSAN ESA cluster, you cannot deactivate it.
Workaround: None.
vSAN File Service does not support NFSv4 delegations
vSAN File Service does not support NFSv4 delegations in this release.
Workaround: None.
In stretched cluster, file server with no affinity cannot rebalance
In the stretched cluster vSAN File Service environment, a file server with no affinity location configured cannot be
rebalanced between Preferred ESXi hosts and Non-preferred ESXi hosts.
Workaround: Set the affinity location of the file server to Preferred or Non-Preferred by editing the file service domain
configuration.
Remediation of ESXi hosts in a vSphere Lifecycle Manager cluster with vSAN fails if vCenter services are
deployed on custom ports
If vCenter Server services are deployed on custom ports in a cluster with vSAN, vSphere DRS, and vSphere HA,
remediation of vSphere Lifecycle Manager clusters might fail. This problem is caused by a vSAN resource health check
error. ESXi hosts cannot enter maintenance mode, which leads to failing remediation tasks.
Workaround: None.
When vSAN file service is enabled, DFC-related operations such as upgrade, enabling encryption or data-
efficiency might fail


When file service is enabled, an agent VM runs on each host. The underlying vSAN object might be placed across
multiple diskgroups. When the first diskgroup gets converted, the vSAN object becomes inaccessible and the agent VM
is in an invalid state. If you try to delete the VM and redeploy a new VM, the operation fails due to the VM’s invalid state.
The VM gets unregistered but the inaccessible object still exists there. When the next diskgroup gets converted, there is
a precheck for inaccessible objects in the whole cluster. This check fails the DFC since it finds inaccessible objects of the
old agent VM.
Workaround: Manually remove the inaccessible objects.
When such failure happens, you can see the DFC task failure.
1. Identify the inaccessible objects from the failure task fault information.
2. To ensure that the objects belong to the agent VM, inspect the hostd log file and confirm that the objects belong to the
VM’s object layout.
3. Log in to the host and use the /usr/lib/vmware/osfs/bin/objtool command to remove the objects manually (see the sketch below).
Note: To prevent this problem, disable file service before performing any DFC-related operation.
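The following is a hedged sketch of step 3; the object UUID is a placeholder taken from the failed task's fault information, and the exact objtool options can vary between releases.
# On the affected ESXi host, inspect the object to confirm it belongs to the old agent VM
/usr/lib/vmware/osfs/bin/objtool getAttr -u <object-uuid>
# Force-delete the inaccessible object
/usr/lib/vmware/osfs/bin/objtool delete -u <object-uuid> -f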

Cannot extract host profile on a vSAN HCI mesh compute-only host


vSAN host profile plugin does not support vSAN HCI mesh compute-only hosts. If you try to extract the host profile on an
HCI mesh compute-only host, the attempt fails.
Workaround: None.
Deleting files in a file share might not be reflected in vSAN capacity view
The allocated blocks might not be returned to vSAN storage immediately after all the files are deleted, so it can take
some time before the reclaimed storage capacity is reflected in the vSAN capacity view. When new data is written to the
same file share, the deleted blocks might be reused before they are returned to vSAN storage.
If unmap is enabled and vSAN deduplication is disabled, the space might not be freed back to vSAN unless 4 MB-aligned
regions are freed in VDFS. If unmap is enabled and vSAN deduplication is enabled, space freed by VDFS is returned to
vSAN after a delay.

Workaround: To release the storage back to vSAN immediately, delete the file shares.

vCenter VM crash on stretched cluster with data-in-transit encryption


vCenter VM might crash on a vSAN stretched cluster if the vCenter VM is on vSAN with data-in-transit encryption
enabled. When all hosts in one site are down and then power on again, the vCenter VM might crash after the failed site
returns to service.
Workaround: Use the following script to resolve this problem: thumbPrintRepair.py

vSAN allows a VM to be provisioned across local and remote datastores


vSphere does not prevent users from provisioning a VM across local and remote datastores in an HCI Mesh environment.
For example, you can provision one VMDK on the local vSAN datastore and one VMDK on remote vSAN datastore. This
is not supported because vSphere HA is not supported with this configuration.
Workaround: Do not provision a VM across local and remote datastores.
The object reformatting task is not progressing
If object reformatting is needed after an upgrade, a health alert is triggered, and vSAN begins reformatting. vSAN
performs this task in batches, and it depends on the amount of transient capacity available in the cluster. When the
transient capacity exceeds the maximum limit, vSAN waits for the transient capacity to be freed before proceeding
with the reformatting. During this phase, the task might appear to be halted. The health alert will clear and the task will
progress when transient capacity is available.


Workaround: None. The task is working as expected.

System VMs cannot be powered-off


With the release of vSphere Cluster Services (vCLS) in vSphere 7.0 Update 1, a set of system VMs might be placed
within the vSAN cluster. These system VMs cannot be powered off by users. This issue can impact some vSAN
workflows, which are documented in the following article: https://kb.vmware.com/s/article/80877
Workaround: For more information about this issue, refer to this KB article: https://kb.vmware.com/s/article/80483.
vSAN File Service cannot be enabled due to an old vSAN on-disk format version
vSAN File Service cannot be enabled with the vSAN on-disk format version earlier than 11.0 (this is the on-disk format
version in vSAN 7.0).
Workaround: Upgrade the vSAN disk format version before enabling File Service.

Host failure in hot-plug scenario when drive is reinserted


During a hot drive removal, VMware native NVMe hot-plug can cause a host failure if the NVMe drive is pulled and
reinserted within one minute. This is applicable to both vSphere and vSAN for any new or existing drive reinsertion.
Workaround: After removing a hot drive, wait for one minute before you reinsert the new or existing drive.
Cannot place last host in a cluster into maintenance mode, or remove a disk or disk group
Operations in Full data migration or Ensure accessibility mode might fail without providing guidance to add a new
resource, when there is only one host left in the cluster and that host enters maintenance mode. This can also happen
when there is only one disk or disk group left in the cluster and that disk or disk group is to be removed.
Workaround: Before you place the last remaining host in the cluster into maintenance mode with Full data migration or
Ensure accessibility mode selected, add another host with the same configuration to the cluster. Before you remove the
last remaining disk or disk group in the cluster, add a new disk or disk group with the same configuration and capacity.
Object reconfiguration workflows might fail due to the lack of capacity if one or more disks or disk groups are
almost full
vSAN resyncs get paused when the disks in non-deduplication clusters or disk groups in deduplication clusters reach a
configurable resync pause fullness threshold. This is to avoid filling up the disks with resync I/O. If the disks reach this
threshold, vSAN stops reconfiguration workflows, such as EMM, repairs, rebalance, and policy change.
Workaround: If space is available elsewhere in the cluster, rebalancing the cluster frees up space on the other disks, so
that subsequent reconfiguration attempts succeed.
After recovery from cluster full, VMs can lose HA protection
In a vSAN cluster that has hosts with disks 100% full, the VMs might have a question pending and hence lose HA
protection. Also, the VMs that had a pending question are not HA protected after recovering from the cluster-full scenario.
Workaround: After recovering from the vSAN cluster-full scenario, perform one of the following actions:
• Disable and re-enable HA.
• Reconfigure HA.
• Power off and power on the VMs.

Power Off VMs fails with Question Pending


If a VM has a pending question, you are not allowed to do any VM-related operations until the question is answered.
Workaround: Try to free the disk space on the relevant volume, and then click Retry.
When the cluster is full, the IP addresses of VMs either change to IPV6 or become unavailable


When a vSAN cluster is full with one or more disk groups reaching 100%, there can be a VM pending question that
requires user action. If the question is not answered and the cluster full condition is left unattended, the IP addresses
of the VMs might change to IPv6 or become unavailable. This prevents you from using SSH to access the VMs. It also
prevents you from using the VM console, because the console goes blank after you type root.
Workaround: None.
Unable to remove a dedupe enabled disk group after a capacity disk enters PDL state
When a capacity disk in a dedupe-enabled disk group is removed, or its unique ID changes, or when the device
experiences an unrecoverable hardware error, it enters Permanent Device Loss (PDL) state. If you try to remove the disk
group, you might see an error message informing you that the action cannot be completed.
Workaround: Whenever a capacity disk is removed, or its unique ID changes, or when the device experiences an
unrecoverable hardware error, wait for a few minutes before trying to remove the disk group.
In deduplication clusters, reactive rebalancing might not happen when the disks show more than 80% full
In deduplication clusters, when the disks display more than 80% full on the dashboard, the reactive rebalancing might not
start as expected. This is because in deduplication clusters, pending writes and deletes are also considered for calculating
the free capacity.
Workaround: None.
TRIM/UNMAP commands from Guest OS fail
If the Guest OS attempts to perform space reclamation during online snapshot consolidation, the TRIM/UNMAP
commands fail. This failure keeps space from being reclaimed.
Workaround: Try to reclaim the space after the online snapshot operation is complete. If subsequent TRIM/UNMAP
operations fail, remount the disk.
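As an illustration only, in a Linux guest the retry and remount might look like the following; the mount point is a placeholder and other guest operating systems have equivalent tools.
# Retry space reclamation on the affected filesystem after snapshot consolidation completes
fstrim -v /mnt/data
# If TRIM/UNMAP operations still fail, remount the filesystem and retry
umount /mnt/data
mount /mnt/data
fstrim -v /mnt/data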
Space reclamation from SCSI TRIM/UNMAP is lost when online snapshot consolidation is performed
Space reclamation achieved from SCSI TRIM/UNMAP commands is lost when you perform online snapshot consolidation.
Offline snapshot consolidation does not affect SCSI unmap operation.
Workaround: Reclaim the space after online snapshot consolidation is complete.

Host failure when converting data host into witness host


When you convert a vSAN cluster into a stretched cluster, you must provide a witness host. You can convert a data host
into the witness host, but you must use maintenance mode with Full data migration during the process. If you place the
host into maintenance mode with Ensure accessibility option, and then configure it as the witness host, the host might
fail with a purple diagnostic screen.
Workaround: Remove the disk group on the witness host and then re-create the disk group.
Duplicate VM with the same name in vCenter Server when residing host fails during datastore migration
If a VM is undergoing storage vMotion from vSAN to another datastore, such as NFS, and the host on which
it resides encounters a failure on the vSAN network, causing HA failover of the VM, the VM might be duplicated in the
vCenter Server.
Workaround: Power off the invalid VM and unregister it from the vCenter Server.
Reconfiguring an existing stretched cluster under a new vCenter Server causes vSAN to issue a health check
warning
When rebuilding a current stretched cluster under a new vCenter Server, the vSAN cluster health check is red. The
following message appears: vSphere cluster members match vSAN cluster members


Workaround: Use the following procedure to configure the stretched cluster.


1. Use SSH to log in to the witness host.
2. Decommission the disks on witness host. Run the following command: esxcli vsan storage remove -s "SSD
UUID"
3. Force the witness host to leave the cluster. Run the following command: esxcli vsan cluster leave
4. Reconfigure the stretched cluster from the new vCenter Server (Configure > vSAN > Fault Domains & Stretched
Cluster).
Disk format upgrade fails while vSAN resynchronizes large objects
If the vSAN cluster contains very large objects, the disk format upgrade might fail while the object is resynchronized. You
might see the following error message: Failed to convert object(s) on vSAN
vSAN cannot perform the upgrade until the object is resynchronized. You can check the status of the resynchronization
(Monitor > vSAN > Resyncing Components) to verify when the process is complete.

Workaround: Wait until no resynchronization is pending, then retry the disk format upgrade.
Powered off VMs appear as inaccessible during witness host replacement
When you change a witness host in a stretched cluster, VMs that are powered off appear as inaccessible in the vSphere
Web Client for a brief time. After the process is complete, powered off VMs appear as accessible. All running VMs appear
as accessible throughout the process.
Workaround: None.
Cannot place hosts in maintenance mode if they have faulty boot media
vSAN cannot place hosts with faulty boot media into maintenance mode. The task to enter maintenance mode might fail
with an internal vSAN error, due to the inability to save configuration changes. You might see log events similar to the
following: Lost Connectivity to the device xxx backing the boot filesystem
Workaround: Remove disk groups manually from each host, using the Full data evacuation option. Then place the host
in maintenance mode.
After stretched cluster failover, VMs on the preferred site register alert: Failed to failover
If the secondary site in a stretched cluster fails, VMs failover to the preferred site. VMs already on the preferred site might
register the following alert: Failed to failover.
Workaround: Ignore this alert. It does not impact the behavior of the failover.
During network partition, components in the active site appear to be absent
During a network partition in a vSAN two-host or stretched cluster, the vSphere Web Client might display a view of the
cluster from the perspective of the non-active site. You might see active components in the primary site displayed as
absent.
Workaround: Use RVC commands to query the state of objects in the cluster. For example: vsan.vm_object_info

Some objects are non-compliant after force repair


After you perform a force repair, some objects might not be repaired because the ownership of the objects was transferred
to a different node during the process. The force repair might be delayed for those objects.
Workaround: Attempt the force repair operation after all other objects are repaired and resynchronized. You can wait until
vSAN repairs the objects.
When you move a host from one encrypted cluster to another, and then back to the original cluster, the task fails


When you move a host from an encrypted vSAN cluster to another encrypted vSAN cluster, then move the host back to
the original encrypted cluster, the task might fail. You might see the following message: A general system error
occurred: Invalid fault . This error occurs because vSAN cannot re-encrypt data on the host using the original
encryption key. After a short time, vCenter Server restores the original key on the host, and all unmounted disks in the
vSAN cluster are mounted.
Workaround: Reboot the host and wait for all disks to get mounted.
Cannot perform deep rekey if a disk group is unmounted
Before vSAN performs a deep rekey, it performs a shallow rekey. The shallow rekey fails if an unmounted disk group is
present. The deep rekey process cannot begin.
Workaround: Remount or remove the unmounted disk group.
Log entries state that firewall configuration has changed
A new firewall entry appears in the security profile when vSAN encryption is enabled: vsanEncryption. This rule controls
how hosts communicate directly to the KMS. When it is triggered, log entries are added to /var/log/vobd.log . You
might see the following messages:
Firewall configuration has changed. Operation 'addIP4' for rule set vsanEncryption
succeeded.
Firewall configuration has changed. Operation 'removeIP4' for rule set vsanEncryption
succeeded.
These messages can be ignored.

Workaround: None.
HA failover does not occur after setting Traffic Type option on a vmknic to support witness traffic
If you set the traffic type option on a vmknic to support witness traffic, vSphere HA does not automatically discover the
new setting. You must manually disable and then re-enable HA so it can discover the vmknic. If you configure the vmknic
and the vSAN cluster first, and then enable HA on the cluster, it does discover the vmknic.
Workaround: Manually disable vSphere HA on the cluster, and then re-enable it.
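For reference, a hedged example of tagging a VMkernel adapter for witness traffic from the ESXi command line follows; the adapter name vmk1 is a placeholder.
# Tag vmk1 to carry vSAN witness traffic on the data hosts
esxcli vsan network ip add -i vmk1 -T=witness
# Verify the traffic type assigned to the vSAN-enabled VMkernel adapters
esxcli vsan network list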

After resolving network partition, some VM operations on linked clone VMs might fail
Some VM operations on linked clone VMs that are not producing I/O inside the guest operating system might fail. The
operations that might fail include taking snapshots and suspending the VMs. This problem can occur after a network
partition is resolved, if the parent base VM's namespace is not yet accessible. When the parent VM's namespace
becomes accessible, HA is not notified to power on the VM.
Workaround: Power cycle VMs that are not actively running I/O operations.
Cannot place a witness host in Maintenance Mode
When you attempt to place a witness host in Maintenance Mode, the host remains in the current state and you see the
following notification: A specified parameter was not correct.
Workaround: When placing a witness host in Maintenance Mode, choose the No data migration option.
Moving the witness host into and then out of a stretched cluster leaves the cluster in a misconfigured state
If you place the witness host in a vSAN-enabled vCenter cluster, an alarm notifies you that the witness host cannot reside
in the cluster. But if you move the witness host out of the cluster, the cluster remains in a misconfigured state.
Workaround: Move the witness host out of the vSAN stretched cluster, and reconfigure the stretched cluster. For more
information, see this article: https://kb.vmware.com/s/article/2130587.


Unmounted vSAN disks and disk groups displayed as mounted in the vSphere Web Client Operational Status
field
After the vSAN disks or disk groups are unmounted by either running the esxcli vsan storage disk group
unmount command or by the vSAN Device Monitor service when disks show persistently high latencies, the vSphere
Web Client incorrectly displays the Operational Status field as mounted.
Workaround: Use the Health field to verify disk status, instead of the Operational Status field.

VMware vSAN 8.0 Update 1 Release Notes


This document contains the following sections:
• What's in the Release Notes
• What's New
• VMware vSAN Community
• Upgrades for This Release
• Limitations
• Known Issues

What's in the Release Notes


These release notes introduce you to new features in VMware vSAN 8.0 Update 1 and provide information on resolved
and known issues.

What's New
vSAN 8.0 Update 1 introduces the following new features and enhancements:
Disaggregated Storage
Disaggregation with vSAN Express Storage Architecture. vSAN 8.0 Update 1 provides disaggregation support for
vSAN Express Storage Architecture (ESA), as it is supported with vSAN Original Storage Architecture (OSA). You can
mount remote vSAN datastores that reside in other vSAN ESA server clusters. You also can use an ESA cluster as the
external storage resource for a compute-only cluster. All capabilities and limits that apply to disaggregation support for
vSAN OSA also apply to vSAN ESA. vSAN ESA client clusters can connect only to a vSAN ESA based server cluster.
Disaggregation for vSAN stretched clusters (vSAN OSA). This release supports vSAN stretched clusters in
disaggregated topology. In addition to supporting several stretched cluster configurations, vSAN can optimize network
paths for certain topologies to improve stretched cluster performance.
Disaggregation across clusters using multiple vCenter Servers (vSAN OSA). vSAN 8.0 Update 1 introduces support
for vSAN OSA disaggregation across environments using multiple vCenter Servers. This enables clusters managed by
one vCenter Server to use storage resources that reside on a vSAN cluster managed by a different vCenter Server.

Optimized Performance, Durability, and Flexibility


• Improved performance with new Adaptive Write Path. vSAN ESA introduces a new adaptive write path that
dynamically optimizes guest workloads that issue large streaming writes, resulting in higher throughput and lower
latency with no additional complexity.
• Optimized I/O processing for single VMDK/objects (vSAN ESA). vSAN ESA has optimized the I/O processing that
occurs for each object that resides on a vSAN datastore, increasing the performance of VMs with a significant amount
of virtual hardware storage resources.
• Enhanced durability in maintenance mode scenarios. When a vSAN ESA cluster enters maintenance mode (EMM)
with Ensure Accessibility (applies to RAID 5/6 Erasure Coding), vSAN can write all incremental updates to another
host in addition to the hosts holding the data. This helps ensure the durability of the changed data if additional hosts
fail while the original host is still in maintenance mode.
• Increased administrative storage capacity on vSAN datastores using customizable namespace objects. You
can customize the size of namespace objects that enable administrators to store ISO files, VMware content library, or
other infrastructure support files on a vSAN datastore.
• Witness appliance certification. In vSAN 8.0 Update 1, the software acceptance level for vSAN witness appliance
has changed to Partner Supported. All vSphere Installation Bundles (VIBs) must be certified.

Simplified Management
• Auto-policy management for the default storage policy (vSAN ESA). vSAN ESA introduces auto-policy
management, an optional feature that creates and assigns a default storage policy designed for the cluster. Based on
the size and type of cluster, auto-policy management selects the ideal level of failure to tolerate and data placement
scheme. Skyline health uses this data to monitor and alert you if the default storage policy is ideal or sub-optimal, and
guides you to adjust the default policy based on the cluster characteristics. Skyline health actively monitors the cluster
as its size changes, and provides new recommendations as needed.
• Skyline health intelligent cluster health scoring, diagnostics and remediation. Improve efficiency by using the
cluster health status and troubleshooting dashboard that prioritizes identified issues, enabling you to focus and take
action on the most important issues.
• High resolution performance monitoring in vSAN performance service. The vSAN performance service provides
real-time monitoring that collects and renders performance metrics every 30 seconds, making monitoring and
troubleshooting more meaningful. VMware snapshot APIs are unchanged. VMware VADP supports all vSAN ESA
native snapshot operations on the vSphere platform.
• VM I/O trip analyzer task scheduling. You can schedule VM I/O trip analyzer runs based on time of day, for a particular
duration and frequency, to capture details for repeat-offender VMs. The diagnostics data collected are available for
analysis in the VM I/O trip analyzer interface in vCenter.
• PowerCLI enhancements. PowerCLI supports the following new capabilities:
• vSAN ESA disaggregation
• vSAN OSA disaggregation for stretched clusters
• vSAN OSA disaggregation across multiple vCenter Servers
• vSAN cluster shutdown
• Object format updates and custom namespace objects

Cloud Native Storage


• Cloud Native Support for TKGs and supervisor clusters (vSAN ESA). Containers powered by vSphere and vSAN
can consume persistent storage for developers and administrators, and use the improved performance and efficiency
for their cloud native workloads.
• Data Persistence platform support using common vSphere switching. vSAN Data Persistence platform allows
third-party ISVs to build solutions, such as S3-compatible object stores, that run natively on vSAN. vDPp is now
compatible with VMware vSphere Distributed Switches, reducing the cost and complexity of these solutions.
• Thick provisioning for persistent volumes using SPBM on VMFS datastores (VMware vSAN Direct
Configuration). Persistent volumes can be programmatically provisioned as thick when defined in the storage class
that is mapped to a storage policy.

VMware vSAN Community


Use the vSAN Community Web site to provide feedback and request assistance with any problems you find while using
vSAN.

Upgrades for This Release


For instructions about upgrading vSAN, see the VMware vSAN 8.0 Update 1 documentation.


Note: Before performing the upgrade, please review the most recent version of the VMware Compatibility Guide to
validate that the latest vSAN version is available for your platform.
Note: vSAN Express Storage Architecture is available only for new deployments. You cannot upgrade a cluster to vSAN
ESA.
vSAN 8.0 Update 1 is a new release that requires a full upgrade to vSphere 8.0 Update 1. Perform the following tasks to
complete the upgrade:
1. Upgrade to vCenter Server 8.0 Update 1. For more information, see the VMware vSphere 8.0 Update 1 Release
Notes.
2. Upgrade hosts to ESXi 8.0 Update 1. For more information, see the VMware vSphere 8.0 Update 1 Release Notes.
3. Upgrade the vSAN on-disk format to version 18.0. If upgrading from on-disk format version 3.0 or later, no data
evacuation is required (metadata update only).
4. Upgrade FSVM to enable new File Service features and get all the latest updates.
Note: vSAN retired disk format version 1.0 in vSAN 7.0 Update 1. Disks running disk format version 1.0 are no
longer recognized by vSAN, and upgrades to vSAN 7.0 Update 1 through vSphere Update Manager, ISO install, or esxcli
are blocked while such disks are present. To avoid these issues, upgrade disks running disk format version 1.0 to a
higher version. If you have disks on version 1.0, a health check alerts you to upgrade the disk format version.
Disk format version 1.0 does not have performance and snapshot enhancements, and it lacks support for advanced
features including checksum, deduplication and compression, and encryption. For more information about vSAN disk
format version, see KB 2148493.
Upgrading the On-disk Format for Hosts with Limited Capacity
During an upgrade of the vSAN on-disk format from version 1.0 or 2.0, a disk group evacuation is performed. The disk
group is removed and upgraded to on-disk format version 17.0, and the disk group is added back to the cluster. For two-
node or three-node clusters, or clusters without enough capacity to evacuate each disk group, select Allow Reduced
Redundancy from the vSphere Client. You also can use the following RVC command to upgrade the on-disk format:
vsan.ondisk_upgrade --allow-reduced-redundancy
When you allow reduced redundancy, your VMs are unprotected for the duration of the upgrade, because this method
does not evacuate data to the other hosts in the cluster. It removes each disk group, upgrades the on-disk format, and
adds the disk group back to the cluster. All objects remain available, but with reduced redundancy.
If you enable deduplication and compression during the upgrade, you can select Allow Reduced Redundancy from the
vSphere Client.

Limitations
For information about maximum configuration limits for the vSAN 8.0 Update 1 release, see the Configuration Maximums
documentation.

Known Issues

Snapshots on vSAN ESA HCI Mesh client cluster not supported


Certain snapshot operations on VMs deployed on a HCI Mesh client cluster over vSAN ESA server cluster might fail under
specific conditions. Do not use snapshots in vSAN ESA client cluster or migrate VMs with snapshots to a vSAN ESA client
cluster.
Workaround: None.
Remote datastore on vSAN ESA compute client cluster does not show valid capacity


This issue affects compute-only client clusters that mount from a vSAN ESA server cluster. When you mount a remote
datastore, the datastore capacity value shown on the host and the vSphere client does not match the actual value. Aside
from the reporting issue, there is no known impact on VM operations.
Workaround: None.
Cannot enable File Service if vCenter Server internet connectivity is disabled
If you disable vCenter Server internet connectivity, the Enable File Service dialog does not display File service agent
section and you cannot select OVF.
Workaround: To enable vCenter Server internet connectivity:
1. Navigate to Cluster > Configure > vSAN > Internet Connectivity.
2. Click Edit to open Edit Internet Connectivity dialog.
3. Select Enable Internet access for all vSAN clusters checkbox and click Apply.
KMS connection health checks not available when KMS is offline
This issue affects vSAN health checks for clusters with data-at-rest encryption. When the KMS is offline, the following
health check might not be available: VMware vCenter and all hosts are connected to Key Management Servers. If this
issue occurs, you cannot see warnings or errors that indicate the offline status of the KMS.
Workaround: None.
Mount remote datastore from a stretched server cluster fails with message: Site affinity provided in
server cluster configuration are not present
Mounting a remote datastore from a stretched server cluster might fail under the following conditions:
• The client vSAN cluster already has another datastore from a different stretched server cluster.
• The stretched server clusters have different fault domain names.
• The client has asymmetric network topology to both the server clusters.
The following message is displayed: Site affinity provided in server cluster configuration are not
present in server cluster fault domains
Workaround: Rename the fault domains on both server clusters to match, and retry the operation.
Sequential workload performance improvements not enabled
Some performance improvements for sequential workloads cannot take effect until the vSAN object moves from the
host or the host is rebooted. You must manually abdicate the DOM owner of all vSAN objects to enable performance
improvements.
Workaround: After you upgrade from vSAN 8.0 to 8.0 Update 1, use the following command to manually abdicate the
DOM owner of all vSAN objects:
vsish -e set /vmkModules/vsan/dom/ownerAbdicateAll 1
Adding host back to cluster fails with the following message: A general system error occurred: Too many
outstanding requests
vSAN module unload operation can time out while waiting for control device references. If this happens, an attempt to
move the host out of the cluster fails with the following message: Operation timed out
Any further attempts to move the host back to the cluster fail with the following message: A general system error
occurred: Too many outstanding requests
Workaround: Reboot the host before adding it back to the cluster.
Virtual machine snapshot fails after extending virtual disk size in vSAN ESA


This issue affects any virtual machine that has CBRC enabled in a vSAN ESA cluster. If you extend the size of the VM's
virtual disks, taking a virtual machine snapshot fails.
Workaround: Perform the following steps to take a VM snapshot after you extend the size of a VM's virtual disks.
1. Power off the virtual machine and disable CBRC on all disks through the API.
2. Take the virtual machine snapshot.
3. Re-enable CBRC and power on the virtual machine.
Linked clone VMs migrated to vSAN ESA create snapshots for linked clone vsanSparse disks
When migrating VMs from a VMFS/NFS/vSAN OSA datastore to a vSAN ESA datastore, vSAN cannot distinguish between
a snapshot vsanSparse disk and a linked clone vsanSparse disk. Since vSAN ESA supports native snapshots, a native
snapshot disk is created. If you migrate multiple VMs with the moveAllDiskBackingsAndAllowSharing option, each VM
attempts to create a native snapshot of a base disk and run I/O on that object. Only the last VM can run I/O; the other
VMs fail.
Workaround: To avoid this issue, do not use the moveAllDiskBackingsAndAllowSharing option when migrating linked
clone VMs to a vSAN ESA cluster.
hostAffinity policy option lost during upgrade
When you upgrade from vSAN 6.7 to vSAN 8.0, the vCenter Server hostaffinity option value is changed to false.
Workaround: Set the hostaffinity option back to true to continue using vSAN HostLocal policy for a normal VM.
Cannot upgrade cluster to vSAN Express Storage Architecture
You cannot upgrade or convert a cluster on vSAN Original Storage Architecture to vSAN Express Storage Architecture.
vSAN ESA is supported only on new deployments.

Workaround: None.
Encryption deep rekey not supported on vSAN ESA
vSAN Express Storage Architecture does not support encryption deep rekey in this release.

Workaround: None.
vSAN File Service not supported on vSAN ESA
vSAN Express Storage Architecture does not support vSAN File Service in this release.
Workaround: None.
Cannot change encryption settings on vSAN ESA
Encryption can be configured on vSAN ESA only during cluster creation. You cannot change the settings later.
Workaround: None.
vSAN File Service does not support NFSv4 delegations
vSAN File Service does not support NFSv4 delegations in this release.
Workaround: None.
In stretched cluster, file server with no affinity cannot rebalance
In the stretched cluster vSAN File Service environment, a file server with no affinity location configured cannot be
rebalanced between Preferred ESXi hosts and Non-preferred ESXi hosts.
Workaround: Set the affinity location of the file server to Preferred or Non-Preferred by editing the file service domain
configuration.


Kubernetes pods with CNS volumes cannot be created, deleted, or re-scheduled during vSAN stretched cluster
partition
When a vSAN stretched cluster has a network partition between sites, an intermittent timing issue can cause volume
information to be lost from the CNS. When volume metadata is not present in the CNS, you cannot create, delete, or
re-schedule pods with CNS volumes. vSphere CSI Driver must access volume information from CNS to perform these
operations.
When the network partition is fixed, CNS volume metadata is restored, and pods with CNS volumes can be created,
deleted, or re-scheduled.

Workaround: None.
Shutdown Cluster wizard displays an error on HCI Mesh compute-only cluster
The vSAN Shutdown Cluster wizard is designed for vSAN clusters that have a vSAN datastore and vSAN services. It does
not support HCI Mesh compute-only clusters. If you use the wizard to shut down a compute-only cluster, it displays the
following error message:
Cannot retrieve the health service data.
Workaround: None. Do not use the vSAN Shutdown Cluster wizard on an HCI Mesh compute-only cluster.
Remediation of ESXi hosts in a vSphere Lifecycle Manager cluster with vSAN fails if vCenter services are
deployed on custom ports
If vCenter Server services are deployed on custom ports in a cluster with vSAN, vSphere DRS, and vSphere HA,
remediation of vSphere Lifecycle Manager clusters might fail. This problem is caused by a vSAN resource health check
error. ESXi hosts cannot enter maintenance mode, which leads to failing remediation tasks.
Workaround: None.
When vSAN file service is enabled, DFC-related operations such as upgrade, enabling encryption or data-
efficiency might fail
When file service is enabled, an agent VM runs on each host. The underlying vSAN object might be placed across
multiple disk groups. When the first disk group gets converted, the vSAN object becomes inaccessible and the agent VM
is in an invalid state. If you try to delete the VM and redeploy a new VM, the operation fails due to the VM’s invalid state.
The VM gets unregistered but the inaccessible object still exists. When the next disk group gets converted, there is
a precheck for inaccessible objects in the whole cluster. This check fails the DFC because it finds inaccessible objects of the
old agent VM.
Workaround: Manually remove the inaccessible objects.
When such a failure happens, you can see the failed DFC task.
1. Identify the inaccessible objects from the failure task fault information.
2. To ensure that the objects belong to the agent VM, inspect the hostd log file and confirm that the objects belong to the
VM’s object layout.
3. Log in to the host and use the /usr/lib/vmware/osfs/bin/objtool command to remove the objects manually, as shown in the example below.
Note: To prevent this problem, disable file service before performing any DFC-related operation.
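For example, a minimal sketch of the removal from the ESXi shell is shown below. The object UUID is a placeholder, and objtool option names can differ between releases, so verify them with the command's built-in help before running it:
# Confirm the attributes of the inaccessible object first (UUID is hypothetical)
/usr/lib/vmware/osfs/bin/objtool getAttr -u 5d8f2a3b-1234-4cde-9abc-0123456789ab
# Force-delete the inaccessible object once you have confirmed it belongs to the old agent VM
/usr/lib/vmware/osfs/bin/objtool delete -u 5d8f2a3b-1234-4cde-9abc-0123456789ab -f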

esxcli vsan cluster leave command fails to disable vSAN on an ESXi host
In some cases, the following command fails to disable vSAN on a member host: esxcli vsan cluster leave
You might see an error message similar to the following:
Failed to unmount default vSAN datastore. Unable to complete Sysinfo operation. Please see the VMKernel log file for
more details.


Workaround: Perform the following steps in the vSphere Client to disable vSAN on a single member host:
1. Place the host into maintenance mode.
2. Move the host out of the vSAN cluster, and into its parent data center.
vSAN service on the host is disabled automatically during the movement.

Cannot extract host profile on a vSAN HCI mesh compute-only host


vSAN host profile plugin does not support vSAN HCI mesh compute-only hosts. If you try to extract the host profile on an
HCI mesh compute-only host, the attempt fails.
Workaround: None.
Deleting files in a file share might not be reflected in vSAN capacity view
Allocated blocks might not be returned to vSAN storage immediately after all the files are deleted, so it can take some
time for the reclaimed storage capacity to appear in the vSAN capacity view. When new data is
written to the same file share, the deleted blocks might be reused before they are returned to vSAN storage.
If unmap is enabled and vSAN deduplication is disabled, space might not be freed back to vSAN unless 4 MB-aligned
space is freed in VDFS. If unmap is enabled and vSAN deduplication is enabled, space freed by VDFS is freed
back to vSAN with a delay.

Workaround: To release the storage back to vSAN immediately, delete the file shares.

vSAN over RDMA might experience lower performance due to network congestion
RDMA requires lossless network infrastructure that is free of congestion. If your network has congestion, certain large I/O
workloads might experience lower performance than TCP.
Workaround: Address any network congestion issues following OEM best practices for RDMA.
vCenter VM crash on stretched cluster with data-in-transit encryption
vCenter VM might crash on a vSAN stretched cluster if the vCenter VM is on vSAN with data-in-transit encryption
enabled. When all hosts in one site go down and are then powered on again, the vCenter VM might crash after the failed site
returns to service.
Workaround: Use the following script to resolve this problem: thumbPrintRepair.py

vSAN allows a VM to be provisioned across local and remote datastores


vSphere does not prevent users from provisioning a VM across local and remote datastores in an HCI Mesh environment.
For example, you can provision one VMDK on the local vSAN datastore and one VMDK on remote vSAN datastore. This
is not supported because vSphere HA is not supported with this configuration.
Workaround: Do not provision a VM across local and remote datastores.
The object reformatting task is not progressing
If object reformatting is needed after an upgrade, a health alert is triggered, and vSAN begins reformatting. vSAN
performs this task in batches, and it depends on the amount of transient capacity available in the cluster. When the
transient capacity exceeds the maximum limit, vSAN waits for the transient capacity to be freed before proceeding
with the reformatting. During this phase, the task might appear to be halted. The health alert will clear and the task will
progress when transient capacity is available.
Workaround: None. The task is working as expected.

System VMs cannot be powered-off


With the release of vSphere Cluster Services (vCLS) in vSphere 7.0 Update 1, a set of system VMs might be placed
within the vSAN cluster. These system VMs cannot be powered-off by users. This issue can impact some vSAN
workflows, which are documented in the following article: https://kb.vmware.com/s/article/80877
Workaround: For more information about this issue, refer to this KB article: https://kb.vmware.com/s/article/80483.
vSAN File Service cannot be enabled due to an old vSAN on-disk format version
vSAN File Service cannot be enabled with the vSAN on-disk format version earlier than 11.0 (this is the on-disk format
version in vSAN 7.0).
Workaround: Upgrade the vSAN disk format version before enabling File Service.

Remediate cluster task might fail in large scale cluster due to vSAN health network test issues
In large-scale clusters with more than 16 hosts, intermittent ping failures can occur during host upgrade. These failures can
interrupt host remediation in vSphere Lifecycle Manager.

Workaround: After remediation pre-check passes, silence alerts for the following vSAN health tests (see the example after this list):
• vSAN: Basic (unicast) connectivity check
• vSAN: MTU check (ping with large packet size)
When the remediation task is complete, restore alerts for the vSAN health tests.
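The following RVC sketch shows one way to silence and later restore these checks; the cluster path is a placeholder, and the health check identifiers and option flags can vary between RVC versions, so confirm them with vsan.health.silent_health_check_status before use:
# List the currently silenced health checks for the cluster (cluster path is hypothetical)
vsan.health.silent_health_check_status ~/computers/MyCluster
# Silence the unicast connectivity and MTU ping checks for the duration of the remediation
vsan.health.silent_health_check_configure ~/computers/MyCluster -a 'smallpingtest,largepingtest'
# Restore the alerts after the remediation task completes
vsan.health.silent_health_check_configure ~/computers/MyCluster -r 'smallpingtest,largepingtest'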

Host failure in hot-plug scenario when drive is reinserted


During a hot drive removal, VMware native NVMe hot-plug can cause a host failure if the NVMe drive is pulled and
reinserted within one minute. This is applicable to both vSphere and vSAN for any new or existing drive reinsertion.
Workaround: After removing a hot drive, wait for one minute before you reinsert the new or existing drive.
Cannot place last host in a cluster into maintenance mode, or remove a disk or disk group
Operations in Full data migration or Ensure accessibility mode might fail without providing guidance to add a new
resource, when there is only one host left in the cluster and that host enters maintenance mode. This can also happen
when there is only one disk or disk group left in the cluster and that disk or disk group is to be removed.
Workaround: Before you place the last remaining host in the cluster into maintenance mode with Full data migration or
Ensure accessibility mode selected, add another host with the same configuration to the cluster. Before you remove the
last remaining disk or disk group in the cluster, add a new disk or disk group with the same configuration and capacity.
Object reconfiguration workflows might fail due to the lack of capacity if one or more disks or disk groups are
almost full
vSAN resyncs get paused when the disks in non-deduplication clusters or disk groups in deduplication clusters reach a
configurable resync pause fullness threshold. This is to avoid filling up the disks with resync I/O. If the disks reach this
threshold, vSAN stops reconfiguration workflows, such as EMM, repairs, rebalance, and policy change.
Workaround: If space is available elsewhere in the cluster, rebalancing the cluster frees up space on the other disks, so
that subsequent reconfiguration attempts succeed.
After recovery from cluster full, VMs can lose HA protection
In a vSAN cluster that has hosts with disks 100% full, VMs might have a pending question and therefore lose HA
protection. VMs that had a pending question also remain unprotected by HA after the cluster recovers from the full condition.
Workaround: After recovering from a vSAN cluster full scenario, perform one of the following actions:
• Disable and re-enable HA.
• Reconfigure HA.
• Power off and power on the VMs.


Power Off VMs fails with Question Pending


If a VM has a pending question, you are not allowed to do any VM-related operations until the question is answered.
Workaround: Try to free the disk space on the relevant volume, and then click Retry.
When the cluster is full, the IP addresses of VMs either change to IPV6 or become unavailable
When a vSAN cluster is full with one or more disk groups reaching 100%, there can be a VM pending question that
requires user action. If the question is not answered and the cluster full condition is left unattended, the IP addresses of the
VMs might change to IPv6 or become unavailable. This prevents you from using SSH to access the VMs. It also prevents
you from using the VM console, because the console goes blank after you type root.
Workaround: None.
Unable to remove a dedupe enabled disk group after a capacity disk enters PDL state
When a capacity disk in a dedupe-enabled disk group is removed, or its unique ID changes, or when the device
experiences an unrecoverable hardware error, it enters Permanent Device Loss (PDL) state. If you try to remove the disk
group, you might see an error message informing you that the action cannot be completed.
Workaround: Whenever a capacity disk is removed, or its unique ID changes, or when the device experiences an
unrecoverable hardware error, wait for a few minutes before trying to remove the disk group.
In deduplication clusters, reactive rebalancing might not happen when the disks show more than 80% full
In deduplication clusters, when the disks display more than 80% full on the dashboard, the reactive rebalancing might not
start as expected. This is because in deduplication clusters, pending writes and deletes are also considered for calculating
the free capacity.
Workaround: None.
TRIM/UNMAP commands from Guest OS fail
If the Guest OS attempts to perform space reclamation during online snapshot consolidation, the TRIM/UNMAP
commands fail. This failure keeps space from being reclaimed.
Workaround: Try to reclaim the space after the online snapshot operation is complete. If subsequent TRIM/UNMAP
operations fail, remount the disk.
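As an illustration, in a Linux guest you can retry the reclamation manually once the snapshot consolidation has finished; the exact command depends on the guest operating system and file system:
# Run inside the guest to trim all mounted file systems that support discard
sudo fstrim -av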
Space reclamation from SCSI TRIM/UNMAP is lost when online snapshot consolidation is performed
Space reclamation achieved from SCSI TRIM/UNMAP commands is lost when you perform online snapshot consolidation.
Offline snapshot consolidation does not affect SCSI unmap operation.
Workaround: Reclaim the space after online snapshot consolidation is complete.

Host failure when converting data host into witness host


When you convert a vSAN cluster into a stretched cluster, you must provide a witness host. You can convert a data host
into the witness host, but you must use maintenance mode with Full data migration during the process. If you place the
host into maintenance mode with Ensure accessibility option, and then configure it as the witness host, the host might
fail with a purple diagnostic screen.
Workaround: Remove the disk group on the witness host and then re-create the disk group.
Duplicate VM with the same name in vCenter Server when residing host fails during datastore migration
If a VM is undergoing storage vMotion from vSAN to another datastore, such as NFS, and the host on which
it resides encounters a failure on the vSAN network, causing HA failover of the VM, the VM might be duplicated in the
vCenter Server.
Workaround: Power off the invalid VM and unregister it from the vCenter Server.


Reconfiguring an existing stretched cluster under a new vCenter Server causes vSAN to issue a health check
warning
When rebuilding a current stretched cluster under a new vCenter Server, the vSAN cluster health check is red. The
following message appears: vSphere cluster members match vSAN cluster members
Workaround: Use the following procedure to configure the stretched cluster.
1. Use SSH to log in to the witness host.
2. Decommission the disks on the witness host (see the example after these steps for locating the SSD UUID). Run the following command: esxcli vsan storage remove -s "SSD UUID"
3. Force the witness host to leave the cluster. Run the following command: esxcli vsan cluster leave
4. Reconfigure the stretched cluster from the new vCenter Server (Configure > vSAN > Fault Domains & Stretched
Cluster).
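To locate the SSD UUID used in step 2, one option is to list the vSAN-claimed devices on the witness host and note the VSAN UUID reported for the cache (SSD) device; the exact output fields can vary slightly by release:
# On the witness host, list vSAN storage devices and their UUIDs
esxcli vsan storage list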
Disk format upgrade fails while vSAN resynchronizes large objects
If the vSAN cluster contains very large objects, the disk format upgrade might fail while the object is resynchronized. You
might see the following error message: Failed to convert object(s) on vSAN
vSAN cannot perform the upgrade until the object is resynchronized. You can check the status of the resynchronization
(Monitor > vSAN > Resyncing Components) to verify when the process is complete.

Workaround: Wait until no resynchronization is pending, then retry the disk format upgrade.
vSAN stretched cluster configuration lost after you disable vSAN on a cluster
If you disable vSAN on a stretched cluster, the stretched cluster configuration is not retained. The stretched cluster,
witness host, and fault domain configuration is lost.
Workaround: Reconfigure the stretched cluster parameters when you re-enable the vSAN cluster.

Powered off VMs appear as inaccessible during witness host replacement


When you change a witness host in a stretched cluster, VMs that are powered off appear as inaccessible in the vSphere
Web Client for a brief time. After the process is complete, powered off VMs appear as accessible. All running VMs appear
as accessible throughout the process.
Workaround: None.
Cannot place hosts in maintenance mode if they have faulty boot media
vSAN cannot place hosts with faulty boot media into maintenance mode. The task to enter maintenance mode might fail
with an internal vSAN error, due to the inability to save configuration changes. You might see log events similar to the
following: Lost Connectivity to the device xxx backing the boot filesystem
Workaround: Remove disk groups manually from each host, using the Full data evacuation option. Then place the host
in maintenance mode.
After stretched cluster failover, VMs on the preferred site register alert: Failed to failover
If the secondary site in a stretched cluster fails, VMs failover to the preferred site. VMs already on the preferred site might
register the following alert: Failed to failover.
Workaround: Ignore this alert. It does not impact the behavior of the failover.
During network partition, components in the active site appear to be absent
During a network partition in a vSAN two-host or stretched cluster, the vSphere Web Client might display a view of the
cluster from the perspective of the non-active site. You might see active components in the primary site displayed as
absent.


Workaround: Use RVC commands to query the state of objects in the cluster. For example: vsan.vm_object_info

Some objects are non-compliant after force repair


After you perform a force repair, some objects might not be repaired because the ownership of the objects was transferred
to a different node during the process. The force repair might be delayed for those objects.
Workaround: Attempt the force repair operation after all other objects are repaired and resynchronized. You can wait until
vSAN repairs the objects.
When you move a host from one encrypted cluster to another, and then back to the original cluster, the task fails
When you move a host from an encrypted vSAN cluster to another encrypted vSAN cluster, then move the host back to
the original encrypted cluster, the task might fail. You might see the following message: A general system error
occurred: Invalid fault . This error occurs because vSAN cannot re-encrypt data on the host using the original
encryption key. After a short time, vCenter Server restores the original key on the host, and all unmounted disks in the
vSAN cluster are mounted.
Workaround: Reboot the host and wait for all disks to get mounted.
Stretched cluster imbalance after a site recovers
When you recover a failed site in a stretched cluster, sometimes hosts in the failed site are brought back sequentially over
a long period of time. vSAN might overuse some hosts when it begins repairing the absent components.
Workaround: Recover all of the hosts in a failed site together within a short time window.
VM operations fail due to HA issue with stretched clusters
Under certain failure scenarios in stretched clusters, certain VM operations such as vMotions or powering on a VM might
be impacted. These failure scenarios include a partial or a complete site failure, or the failure of the high-speed network
between the sites. This problem is caused by the dependency on VMware HA being available for normal operation of
stretched cluster sites.
Workaround: Disable vSphere HA before performing vMotion, VM creation, or powering on VMs. Then re-enable vSphere
HA.
Cannot perform deep rekey if a disk group is unmounted
Before vSAN performs a deep rekey, it performs a shallow rekey. The shallow rekey fails if an unmounted disk group is
present. The deep rekey process cannot begin.
Workaround: Remount or remove the unmounted disk group.
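If your ESXi release provides the esxcli vsan storage diskgroup namespace (an assumption you can verify with the command's built-in help), the unmounted disk group can be remounted from the host's shell, for example:
# Identify the UUID of the unmounted disk group
esxcli vsan storage list
# Remount the disk group (the UUID is a placeholder; confirm the option names for your release)
esxcli vsan storage diskgroup mount -u 52abcdef-1234-5678-9abc-def012345678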
Log entries state that firewall configuration has changed
A new firewall entry appears in the security profile when vSAN encryption is enabled: vsanEncryption. This rule controls
how hosts communicate directly to the KMS. When it is triggered, log entries are added to /var/log/vobd.log . You
might see the following messages:
Firewall configuration has changed. Operation 'addIP4' for rule set vsanEncryption
succeeded.
Firewall configuration has changed. Operation 'removeIP4' for rule set vsanEncryption
succeeded.
These messages can be ignored.

Workaround: None.
HA failover does not occur after setting Traffic Type option on a vmknic to support witness traffic


If you set the traffic type option on a vmknic to support witness traffic, vSphere HA does not automatically discover the
new setting. You must manually disable and then re-enable HA so it can discover the vmknic. If you configure the vmknic
and the vSAN cluster first, and then enable HA on the cluster, it does discover the vmknic.
Workaround: Manually disable vSphere HA on the cluster, and then re-enable it.

iSCSI MCS is not supported


vSAN iSCSI target service does not support Multiple Connections per Session (MCS).
Workaround: None.
Any iSCSI initiator can discover iSCSI targets
vSAN iSCSI target service allows any initiator on the network to discover iSCSI targets.
Workaround: You can isolate your ESXi hosts from iSCSI initiators by placing them on separate VLANs.
After resolving network partition, some VM operations on linked clone VMs might fail
Some VM operations on linked clone VMs that are not producing I/O inside the guest operating system might fail. The
operations that might fail include taking snapshots and suspending the VMs. This problem can occur after a network
partition is resolved, if the parent base VM's namespace is not yet accessible. When the parent VM's namespace
becomes accessible, HA is not notified to power on the VM.
Workaround: Power cycle VMs that are not actively running I/O operations.
Cannot place a witness host in Maintenance Mode
When you attempt to place a witness host in Maintenance Mode, the host remains in the current state and you see the
following notification: A specified parameter was not correct.
Workaround: When placing a witness host in Maintenance Mode, choose the No data migration option.
Moving the witness host into and then out of a stretched cluster leaves the cluster in a misconfigured state
If you place the witness host in a vSAN-enabled vCenter cluster, an alarm notifies you that the witness host cannot reside
in the cluster. But if you move the witness host out of the cluster, the cluster remains in a misconfigured state.
Workaround: Move the witness host out of the vSAN stretched cluster, and reconfigure the stretched cluster. For more
information, see this article: https://kb.vmware.com/s/article/2130587.
When a network partition occurs in a cluster which has an HA heartbeat datastore, VMs are not restarted on the
other data site
When the preferred or secondary site in a vSAN cluster loses its network connection to the other sites, VMs running on
the site that loses network connectivity are not restarted on the other data site, and the following error might appear:
vSphere HA virtual machine HA failover failed .
This is expected behavior for vSAN clusters.
Workaround: Do not select HA heartbeat datastore while configuring vSphere HA on the cluster.
Unmounted vSAN disks and disk groups displayed as mounted in the vSphere Web Client Operational Status
field
After the vSAN disks or disk groups are unmounted, either by running the esxcli vsan storage diskgroup
unmount command or by the vSAN Device Monitor service when disks show persistently high latencies, the vSphere
Web Client incorrectly displays the Operational Status field as mounted.
Workaround: Use the Health field to verify disk status, instead of the Operational Status field.
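As an additional cross-check from the host's shell, you can list the vSAN storage devices and review the per-device state fields, such as whether a device still participates in CMMDS; the exact field names can vary slightly by release:
# On the affected host, inspect the state that vSAN reports for each device
esxcli vsan storage list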


VMware vSAN 8.0 Release Notes


This document contains the following sections:
• Introduction
• What's in the Release Notes
• What's New
• VMware vSAN Community
• Upgrades for This Release
• Limitations
• Resolved Issues
• Known Issues

Introduction

VMware vSphere 8.0 | 11 OCT 2022


VMware ESXi 8.0 | 11 OCT 2022 | ISO Build 20513097
Check for additions and updates to these release notes.

What's in the Release Notes


These release notes introduce you to new features in VMware vSAN 8.0 and provide information on resolved and known
issues.

What's New
vSAN 8.0 introduces the following new features and enhancements:
Additional Features and Enhancements
• Enhanced network uplink latency metrics. vSAN defines more meaningful and relevant metrics catered to the
environment, whether the latencies are temporary or from an excessive workload.
• RDT level checksums. You can set checksums at the RDT layer. These new checksums can aid in debugging and
triaging.
• vSAN File Service debugging. File Service Day 0 operations have been improved for efficient validation and
troubleshooting.
• vSAN File Service over IPv6. You can create a file service domain with IPv6 network.
• vSAN File Service network reconfiguration. You can change file server IPs including the primary IP to new IPs in the
same or different subnet.
• vSphere Client Remote Plug-ins. All VMware-owned local plug-ins are transitioning to the new remote plug-in
architecture. vSAN local plug-ins have been moved to vSphere Client remote plug-ins. The local vSAN plug-ins are
deprecated in this release.
• vLCM HCL disk device. Enhancements improve vLCM’s functionality and efficiency for checking compatibility with the
desired image. It includes a check for “partNumber” and “vendor" to add coverage for more vendors.
• Reduced start time of vSAN health service. The time needed to stop vSAN health service as a part of vCenter
restart or upgrade has been reduced to 5 seconds.
• vSAN health check provides perspective to VCF LCM. This release provides only relevant vSAN health checks to
VCF in order to improve LCM resiliency in VCF.
• vSAN improves cluster NDU for VMC. New capabilities improve design and operation of a highly secure, reliable,
and operationally efficient service.
• vSAN encryption key verification. Detects invalid or corrupt keys sent from the KMS server, identifies discrepancies
between in-memory and on-disk DEKs, and alerts customers in case of discrepancies.


• Better handling of large component deletes. Reclaims the logical space and accounts for the physical space faster,
without causing NO_SPACE error.
• Renamed vSAN health "Check" to "Finding." This change makes the term consistent with all VMware products.
• Place vSAN in separate sandbox domain. Daemon sandboxing prevents lateral movement and provides defense in
depth. Starting with vSAN 8.0, least privilege security model is implemented, wherein any daemon that does not have
its custom sandbox domain defined, will run as a deprivileged domain. This achieves least-privilege model on an ESXi
host, with all vSAN running in their own sandbox domain with the least possible privilege.
• vSAN Proactive Insights. This mechanism enables vSAN clusters connected to VMware Analytics Cloud to identify
software and hardware anomalies proactively.
• Management and monitoring of PMEM for SAP HANA. You can manage PMEM devices within the hosts. vSAN
provides management capabilities such as health checks, performance monitoring, and space reporting for the
PMEM devices. PMEM management capabilities do not require vSAN services to be enabled. vSAN does not use
PMEM devices for caching vSAN metadata or for vSAN data services such as encryption, checksum, or dedupe and
compression. The PMEM datastore is local to each host, but can be managed from the monitor tab at the cluster level.
• Replace MD5, SHA1, and SHA2 in vSAN. SHA1 is no longer considered secure, so VMware is replacing SHA1,
MD5, and SHA2 with SHA256 across all VMware products, including vSAN.
• IL6 compliance. vSAN 8.0 is IL6 compliant.

Intuitive, Agile Operations


• Consistent interfaces across all vSAN platforms. vSAN ESA uses the same screens and workflows as vSAN OSA,
so the learning curve is small.
• Per-VM policies increase flexibility. vSAN ESA is moving cluster-wide settings to the SPBM level. In this release,
SPBM compression settings give you granular control down to the VM or even VMDK level, and you can apply them
broadly with datastore default policies.
• Proactive Insight into compatibility and compliance. This mechanism helps vSAN clusters connected to VMware
Analytics Cloud identify software and hardware anomalies. If an OEM partner publishes an advisory about issues for a
drive or I/O controller listed in vSAN HCL, you can be notified about the potentially impacted environment.
Availability and Serviceability
• Simplified and accelerated servicing per device. vSAN ESA removes the complexity of disk groups, which
streamlines the replacement process for failed drives.
• Smaller failure domains and reduced data resynchronization. vSAN ESA has no single points of failure in its
storage pool design. vSAN data and metadata are protected according to the Failures To Tolerate (FTT) SPBM setting.
Neither caching nor compression lead to more than a single disk failure domain if a disk crashes. Resync operations
complete faster with vSAN ESA.
• Enhanced data availability and improved SLAs. Reduction in disk failure domains and quicker repair times means
you can improve the SLAs provided to your customers or business units.
• vSAN boot-time optimizations. vSAN boot logic has been further optimized for faster startup.
• Enhanced shutdown and startup workflows. The vSAN cluster shutdown and cluster startup process has been
enhanced to support vSAN clusters that house vCenter or infrastructure services such as AD, DNS, DHCP, and so on.
• Reduced vSAN File Service failover time. vSAN File Service planned failovers have been streamlined.
Fast, Efficient Data Protection with vSAN ESA Native Snapshots
• Negligible performance impact. Long snapshot chains and deep snapshot chains cause minimal performance
impact.
• Faster snapshot operations. Applications that suffered from snapshot create or snapshot delete stun times will
perform better with vSAN ESA.
• Consistent partner backup application experience using VMware VADP. VMware snapshot APIs are unchanged.
VMware VADP supports all vSAN ESA native snapshot operations on the vSphere platform.


Supreme Resource and Space Efficiency


• Erasure Coding without compromising performance. The vSAN ESA RAID5/RAID6 capabilities with Erasure
Coding provide a highly efficient Erasure Coding code path, so you can have both a high-performance and a space-
efficient storage policy.
• Improved compression. vSAN ESA has advanced compression capabilities that can bring up to 4x better
compression. Compression is performed before data is sent across the vSAN network, providing better bandwidth
usage.
• Expanded usable storage potential. vSAN ESA consists of a single-tier architecture with all devices contributing to
capacity. This flat storage pool removes the need for disk groups with caching devices.
• Reduced performance overhead for high VM consolidation. Resource and space efficiency improvements enable
you to store more VM data per cluster, potentially increasing VM consolidation ratios.
• HCI Mesh support for 10 client clusters. A storage server cluster can be shared with up to 10 client clusters.

Performance without Tradeoffs


vSAN Express Storage Architecture. vSAN ESA is an alternative architecture that provides the potential for huge boosts
in performance with more predictable I/O latencies and optimized space efficiency.
Increased write buffer. vSAN Original Storage Architecture can support more intensive workloads. You can configure
vSAN hosts to increase the write buffer from 600 GB to 1.6 TB.
Native snapshots with minimal performance impact. vSAN ESA file system has snapshots built in. These native
snapshots cause minimal impact to VM performance, even if the snapshot chain gets deep. The snapshots are fully
compatible with existing backup applications using VMware VADP.

VMware vSAN Community


Use the vSAN Community Web site to provide feedback and request assistance with any problems you find while using
vSAN.

Upgrades for This Release


For instructions about upgrading vSAN, see the VMware vSAN 8.0 documentation.
Note: Before performing the upgrade, please review the most recent version of the VMware Compatibility Guide to
validate that the latest vSAN version is available for your platform.
Note: vSAN Express Storage Architecture is available only for new deployments. You cannot upgrade a cluster to vSAN
ESA.
vSAN 8.0 is a new release that requires a full upgrade to vSphere 8.0. Perform the following tasks to complete the
upgrade:
1. Upgrade to vCenter Server 8.0. For more information, see the VMware vSphere 8.0 Release Notes.
2. Upgrade hosts to ESXi 8.0. For more information, see the VMware vSphere 8.0 Release Notes.
3. Upgrade the vSAN on-disk format to version 17.0. If upgrading from on-disk format version 3.0 or later, no data
evacuation is required (metadata update only).
4. Upgrade FSVM to enable new File Service features such as access based enumeration for SMB shares.
Note: vSAN retired disk format version 1.0 in vSAN 7.0 Update 1. Disks running disk format version 1.0 are no
longer recognized by vSAN. vSAN will block upgrade through vSphere Update Manager, ISO install, or esxcli to vSAN 7.0
Update 1. To avoid these issues, upgrade disks running disk format version 1.0 to a higher version. If you have disks on
version 1.0, a health check alerts you to upgrade the disk format version.


Disk format version 1.0 does not have performance and snapshot enhancements, and it lacks support for advanced
features including checksum, deduplication and compression, and encryption. For more information about vSAN disk
format version, see KB 2148493.
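One way to confirm the current on-disk format version of a host's disks before upgrading is to review the version reported for each vSAN-claimed device; the exact field name in the output can vary by release:
# On each host, list vSAN devices and check the reported on-disk format version
esxcli vsan storage list | grep -i "format version"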
Upgrading the On-disk Format for Hosts with Limited Capacity
During an upgrade of the vSAN on-disk format from version 1.0 or 2.0, a disk group evacuation is performed. The disk
group is removed and upgraded to on-disk format version 17.0, and the disk group is added back to the cluster. For two-
node or three-node clusters, or clusters without enough capacity to evacuate each disk group, select Allow Reduced
Redundancy from the vSphere Client. You also can use the following RVC command to upgrade the on-disk format:
vsan.ondisk_upgrade --allow-reduced-redundancy
When you allow reduced redundancy, your VMs are unprotected for the duration of the upgrade, because this method
does not evacuate data to the other hosts in the cluster. It removes each disk group, upgrades the on-disk format, and
adds the disk group back to the cluster. All objects remain available, but with reduced redundancy.
If you enable deduplication and compression during the upgrade to vSAN 8.0, you can select Allow Reduced
Redundancy from the vSphere Client.

Limitations
For information about maximum configuration limits for the vSAN 8.0 release, see the Configuration Maximums
documentation.

Resolved Issues

RemoveFileShare task failure may cause vSAN File Services server failover
The RemoveFileShare task for the NFS share might fail on the vCenter Server even though the share is deleted. This happens
because the NFS server fails while removing the export. This does not cause any problems in the overall workflow as the
share gets successfully deleted.
When the NFS server fails, it triggers a vSAN File Services server failover. Because the NFS server and SMB server fail
over together, any SMB shares exported from the same vSAN File Services server experience SMB mount
disruptions. SMB mount disruption due to server failover is a known behavior because vSAN does not support transparent
failover for SMB servers.
Workaround: None.
vSAN Health cannot find VUM with proxy configured
When a proxy is configured for vSAN, the vsan-health service falsely reported that VMware Update Manager (VUM) is
disabled or not installed.
This issue is fixed in this release.

Known Issues

In stretched cluster, file server with no affinity cannot rebalance


In the stretched cluster vSAN File Service environment, a file server with no affinity location configured cannot be
rebalanced between Preferred ESXi hosts and Non-preferred ESXi hosts.
Workaround: Set the affinity location of the file server to Preferred or Non-Preferred by editing the file service domain
configuration.
vSAN File Service does not support NFSv4 delegations
vSAN File Service does not support NFSv4 delegations in this release.


Workaround: None.
Cannot change encryption settings on vSAN ESA
Encryption can be configured on vSAN ESA only during cluster creation. You cannot change the settings later.
Workaround: None.
vSAN File Service not supported on vSAN ESA
vSAN Express Storage Architecture does not support vSAN File Service in this release.
Workaround: None.
Encryption deep rekey not supported on vSAN ESA
vSAN Express Storage Architecture does not support encryption deep rekey in this release.

Workaround: None.
Cannot upgrade cluster to vSAN Express Storage Architecture
You cannot upgrade or convert a cluster on vSAN Original Storage Architecture to vSAN Express Storage Architecture.
vSAN ESA is supported only on new deployments.

Workaround: None.
hostAffinity policy option lost during upgrade
When you upgrade from vSAN 6.7 to vSAN 8.0, the vCenter Server hostaffinity option value is changed to false.
Workaround: Set the hostaffinity option back to true to continue using vSAN HostLocal policy for a normal VM.
Kubernetes pods with CNS volumes cannot be created, deleted, or re-scheduled during vSAN stretched cluster
partition
When a vSAN stretched cluster has a network partition between sites, an intermittent timing issue can cause volume
information to be lost from the CNS. When volume metadata is not present in the CNS, you cannot create, delete, or
re-schedule pods with CNS volumes. vSphere CSI Driver must access volume information from CNS to perform these
operations.
When the network partition is fixed, CNS volume metadata is restored, and pods with CNS volumes can be created,
deleted, or re-scheduled.

Workaround: None.
Shutdown Cluster wizard displays an error on HCI Mesh compute-only cluster
The vSAN Shutdown Cluster wizard is designed for vSAN clusters that have a vSAN datastore and vSAN services. It does
not support HCI Mesh compute-only clusters. If you use the wizard to shut down a compute-only cluster, it displays the
following error message:
Cannot retrieve the health service data.
Workaround: None. Do not use the vSAN Shutdown Cluster wizard on an HCI Mesh compute-only cluster.
Remediation of ESXi hosts in a vSphere Lifecycle Manager cluster with vSAN fails if vCenter services are
deployed on custom ports
If vCenter Server services are deployed on custom ports in a cluster with vSAN, vSphere DRS, and vSphere HA,
remediation of vSphere Lifecycle Manager clusters might fail. This problem is caused by a vSAN resource health check
error. ESXi hosts cannot enter maintenance mode, which leads to failing remediation tasks.
Workaround: None.


When vSAN file service is enabled, DFC-related operations such as upgrade, enabling encryption or data-
efficiency might fail
When file service is enabled, an agent VM runs on each host. The underlying vSAN object might be placed across
multiple disk groups. When the first disk group gets converted, the vSAN object becomes inaccessible and the agent VM
is in an invalid state. If you try to delete the VM and redeploy a new VM, the operation fails due to the VM’s invalid state.
The VM gets unregistered but the inaccessible object still exists. When the next disk group gets converted, there is
a precheck for inaccessible objects in the whole cluster. This check fails the DFC because it finds inaccessible objects of the
old agent VM.
Workaround: Manually remove the inaccessible objects.
When such a failure happens, you can see the failed DFC task.
1. Identify the inaccessible objects from the failure task fault information.
2. To ensure that the objects belong to the agent VM, inspect the hostd log file and confirm that the objects belong to the
VM’s object layout.
3. Log in to the host and use the /usr/lib/vmware/osfs/bin/objtool command to remove the objects manually.
Note: To prevent this problem, disable file service before performing any DFC-related operation.

esxcli vsan cluster leave command fails to disable vSAN on an ESXi host
In some cases, the following command fails to disable vSAN on a member host: esxcli vsan cluster leave
You might see an error message similar to the following:
Failed to unmount default vSAN datastore. Unable to complete Sysinfo operation. Please see the VMKernel log file for
more details.
Workaround: Perform the following steps in the vSphere Client to disable vSAN on a single member host:
1. Place the host into maintenance mode.
2. Move the host out of the vSAN cluster, and into its parent data center.
vSAN service on the host is disabled automatically during the movement.

Cannot extract host profile on a vSAN HCI mesh compute-only host


vSAN host profile plugin does not support vSAN HCI mesh compute-only hosts. If you try to extract the host profile on an
HCI mesh compute-only host, the attempt fails.
Workaround: None.
Deleting files in a file share might not be reflected in vSAN capacity view
Allocated blocks might not be returned to vSAN storage immediately after all the files are deleted, so it can take some
time for the reclaimed storage capacity to appear in the vSAN capacity view. When new data is
written to the same file share, the deleted blocks might be reused before they are returned to vSAN storage.
If unmap is enabled and vSAN deduplication is disabled, space might not be freed back to vSAN unless 4 MB-aligned
space is freed in VDFS. If unmap is enabled and vSAN deduplication is enabled, space freed by VDFS is freed
back to vSAN with a delay.

Workaround: To release the storage back to vSAN immediately, delete the file shares.

vSAN over RDMA might experience lower performance due to network congestion
RDMA requires lossless network infrastructure that is free of congestion. If your network has congestion, certain large I/O
workloads might experience lower performance than TCP.
Workaround: Address any network congestion issues following OEM best practices for RDMA.


vCenter VM crash on stretched cluster with data-in-transit encryption


vCenter VM might crash on a vSAN stretched cluster if the vCenter VM is on vSAN with data-in-transit encryption
enabled. When all hosts in one site go down and are then powered on again, the vCenter VM might crash after the failed site
returns to service.
Workaround: Use the following script to resolve this problem: thumbPrintRepair.py

VM migration from VMFS datastore or vSAN datastore to vSAN datastore fails


When you have Content Based Read Cache (CBRC) enabled, sVmotion or xVmotion might fail to migrate a VM that has
one or more snapshots to the vSAN datastore. You might see the following error message: The operation is not supported
on the object.
The following messages appear in /var/log/vmware/vpxd
/2021-01-31T17:12:27.477Z error vpxd[18588] [Originator@6876 sub=vpxLro opID=65ef3b53-01] [VpxLRO] Unexpected
Exception: N5Vmomi5Fault12NotSupported9ExceptionE(Message is: The operation is not supported on the object.,
--> Fault cause: vmodl.fault.NotSupported
--> Fault Messages are:
--> (null)
--> )
-->
Workaround: Consolidate snapshots, or delete all snapshots before migration.
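As an illustrative sketch, the snapshots can also be consolidated from the ESXi shell before retrying the migration; the VM ID below is hypothetical, and the same operation can be performed from the vSphere Client:
# Find the ID of the affected virtual machine
vim-cmd vmsvc/getallvms
# Remove and consolidate all snapshots for that VM ID
vim-cmd vmsvc/snapshot.removeall 12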

vSAN allows a VM to be provisioned across local and remote datastores


vSphere does not prevent users from provisioning a VM across local and remote datastores in an HCI Mesh environment.
For example, you can provision one VMDK on the local vSAN datastore and one VMDK on remote vSAN datastore. This
is not supported because vSphere HA is not supported with this configuration.
Workaround: Do not provision a VM across local and remote datastores.
The object reformatting task is not progressing
If object reformatting is needed after an upgrade, a health alert is triggered, and vSAN begins reformatting. vSAN
performs this task in batches, and it depends on the amount of transient capacity available in the cluster. When the
transient capacity exceeds the maximum limit, vSAN waits for the transient capacity to be freed before proceeding
with the reformatting. During this phase, the task might appear to be halted. The health alert will clear and the task will
progress when transient capacity is available.
Workaround: None. The task is working as expected.

System VMs cannot be powered-off


With the release of vSphere Cluster Services (vCLS) in vSphere 7.0 Update 1, a set of system VMs might be placed
within the vSAN cluster. These system VMs cannot be powered-off by users. This issue can impact some vSAN
workflows, which are documented in the following article: https://kb.vmware.com/s/article/80877
Workaround: For more information about this issue, refer to this KB article: https://kb.vmware.com/s/article/80483.
vSAN File Service cannot be enabled due to an old vSAN on-disk format version
vSAN File Service cannot be enabled with the vSAN on-disk format version earlier than 11.0 (this is the on-disk format
version in vSAN 7.0).
Workaround: Upgrade the vSAN disk format version before enabling File Service.


Remediate cluster task might fail in large scale cluster due to vSAN health network test issues
In large-scale clusters with more than 16 hosts, intermittent ping failures can occur during host upgrade. These failures can
interrupt host remediation in vSphere Lifecycle Manager.

Workaround: After remediation pre-check passes, silence alerts for the following vSAN health tests:
• vSAN: Basic (unicast) connectivity check
• vSAN: MTU check (ping with large packet size)
When the remediation task is complete, restore alerts for the vSAN health tests.

Host failure in hot-plug scenario when drive is reinserted


During a hot drive removal, VMware native NVMe hot-plug can cause a host failure if the NVMe drive is pulled and
reinserted within one minute. This is applicable to both vSphere and vSAN for any new or existing drive reinsertion.
Workaround: After removing a hot drive, wait for one minute before you reinsert the new or existing drive.
Cannot place last host in a cluster into maintenance mode, or remove a disk or disk group
Operations in Full data migration or Ensure accessibility mode might fail without providing guidance to add a new
resource, when there is only one host left in the cluster and that host enters maintenance mode. This can also happen
when there is only one disk or disk group left in the cluster and that disk or disk group is to be removed.
Workaround: Before you place the last remaining host in the cluster into maintenance mode with Full data migration or
Ensure accessibility mode selected, add another host with the same configuration to the cluster. Before you remove the
last remaining disk or disk group in the cluster, add a new disk or disk group with the same configuration and capacity.
Object reconfiguration workflows might fail due to the lack of capacity if one or more disks or disk groups are
almost full
vSAN resyncs get paused when the disks in non-deduplication clusters or disk groups in deduplication clusters reach a
configurable resync pause fullness threshold. This is to avoid filling up the disks with resync I/O. If the disks reach this
threshold, vSAN stops reconfiguration workflows, such as EMM, repairs, rebalance, and policy change.
Workaround: If space is available elsewhere in the cluster, rebalancing the cluster frees up space on the other disks, so
that subsequent reconfiguration attempts succeed.
After recovery from cluster full, VMs can lose HA protection
In a vSAN cluster that has hosts with disks 100% full, VMs might have a pending question and therefore lose HA
protection. VMs that had a pending question also remain unprotected by HA after the cluster recovers from the full condition.
Workaround: After recovering from a vSAN cluster full scenario, perform one of the following actions:
• Disable and re-enable HA.
• Reconfigure HA.
• Power off and power on the VMs.

Power Off VMs fails with Question Pending


If a VM has a pending question, you are not allowed to do any VM-related operations until the question is answered.
Workaround: Try to free the disk space on the relevant volume, and then click Retry.
When the cluster is full, the IP addresses of VMs either change to IPV6 or become unavailable
When a vSAN cluster is full with one or more disk groups reaching 100%, there can be a VM pending question that
requires user action. If the question is not answered and the cluster full condition is left unattended, the IP addresses of the
VMs might change to IPv6 or become unavailable. This prevents you from using SSH to access the VMs. It also prevents
you from using the VM console, because the console goes blank after you type root.


Workaround: None.
Unable to remove a dedupe enabled disk group after a capacity disk enters PDL state
When a capacity disk in a dedupe-enabled disk group is removed, or its unique ID changes, or when the device
experiences an unrecoverable hardware error, it enters Permanent Device Loss (PDL) state. If you try to remove the disk
group, you might see an error message informing you that the action cannot be completed.
Workaround: Whenever a capacity disk is removed, or its unique ID changes, or when the device experiences an
unrecoverable hardware error, wait for a few minutes before trying to remove the disk group.
vSAN health indicates non-availability-related non-compliance with failed pending policy
A policy change request leaves the object health status of vSAN in a non-availability-related non-compliance state. This is
because there might be other scheduled work that is utilizing the requested resources. However, vSAN reschedules this
policy request automatically as resources become available.
Workaround: The vSAN periodic scan fixes this issue automatically in most cases. However, other work in progress might
use up available resources even after the policy change was accepted but not applied. You can add more capacity if the
capacity reporting displays a high value.

In deduplication clusters, reactive rebalancing might not happen when the disks show more than 80% full
In deduplication clusters, when the disks display more than 80% full on the dashboard, the reactive rebalancing might not
start as expected. This is because in deduplication clusters, pending writes and deletes are also considered for calculating
the free capacity.
Workaround: None.
TRIM/UNMAP commands from Guest OS fail
If the Guest OS attempts to perform space reclamation during online snapshot consolidation, the TRIM/UNMAP
commands fail. This failure keeps space from being reclaimed.
Workaround: Try to reclaim the space after the online snapshot operation is complete. If subsequent TRIM/UNMAP
operations fail, remount the disk.
Space reclamation from SCSI TRIM/UNMAP is lost when online snapshot consolidation is performed
Space reclamation achieved from SCSI TRIM/UNMAP commands is lost when you perform online snapshot consolidation.
Offline snapshot consolidation does not affect SCSI unmap operation.
Workaround: Reclaim the space after online snapshot consolidation is complete.

Host failure when converting data host into witness host


When you convert a vSAN cluster into a stretched cluster, you must provide a witness host. You can convert a data host
into the witness host, but you must use maintenance mode with Full data migration during the process. If you place the
host into maintenance mode with Ensure accessibility option, and then configure it as the witness host, the host might
fail with a purple diagnostic screen.
Workaround: Remove the disk group on the witness host and then re-create the disk group.
Duplicate VM with the same name in vCenter Server when residing host fails during datastore migration
If a VM is undergoing storage vMotion from vSAN to another datastore, such as NFS, and the host on which
it resides encounters a failure on the vSAN network, causing HA failover of the VM, the VM might be duplicated in the
vCenter Server.
Workaround: Power off the invalid VM and unregister it from the vCenter Server.
Reconfiguring an existing stretched cluster under a new vCenter Server causes vSAN to issue a health check
warning


When rebuilding a current stretched cluster under a new vCenter Server, the vSAN cluster health check is red. The
following message appears: vSphere cluster members match vSAN cluster members
Workaround: Use the following procedure to configure the stretched cluster.
1. Use SSH to log in to the witness host.
2. Decommission the disks on witness host. Run the following command: esxcli vsan storage remove -s "SSD
UUID"
3. Force the witness host to leave the cluster. Run the following command: esxcli vsan cluster leave
4. Reconfigure the stretched cluster from the new vCenter Server (Configure > vSAN > Fault Domains & Stretched
Cluster).
Disk format upgrade fails while vSAN resynchronizes large objects
If the vSAN cluster contains very large objects, the disk format upgrade might fail while the object is resynchronized. You
might see the following error message: Failed to convert object(s) on vSAN
vSAN cannot perform the upgrade until the object is resynchronized. You can check the status of the resynchronization
(Monitor > vSAN > Resyncing Components) to verify when the process is complete.

Workaround: Wait until no resynchronization is pending, then retry the disk format upgrade.
vSAN stretched cluster configuration lost after you disable vSAN on a cluster
If you disable vSAN on a stretched cluster, the stretched cluster configuration is not retained. The stretched cluster,
witness host, and fault domain configuration is lost.
Workaround: Reconfigure the stretched cluster parameters when you re-enable the vSAN cluster.

Powered off VMs appear as inaccessible during witness host replacement


When you change a witness host in a stretched cluster, VMs that are powered off appear as inaccessible in the vSphere
Web Client for a brief time. After the process is complete, powered off VMs appear as accessible. All running VMs appear
as accessible throughout the process.
Workaround: None.
Cannot place hosts in maintenance mode if they have faulty boot media
vSAN cannot place hosts with faulty boot media into maintenance mode. The task to enter maintenance mode might fail
with an internal vSAN error, due to the inability to save configuration changes. You might see log events similar to the
following: Lost Connectivity to the device xxx backing the boot filesystem
Workaround: Remove disk groups manually from each host, using the Full data evacuation option. Then place the host
in maintenance mode.
After stretched cluster failover, VMs on the preferred site register alert: Failed to failover
If the secondary site in a stretched cluster fails, VMs failover to the preferred site. VMs already on the preferred site might
register the following alert: Failed to failover.
Workaround: Ignore this alert. It does not impact the behavior of the failover.
During network partition, components in the active site appear to be absent
During a network partition in a vSAN two-host or stretched cluster, the vSphere Web Client might display a view of the
cluster from the perspective of the non-active site. You might see active components in the primary site displayed as
absent.
Workaround: Use RVC commands to query the state of objects in the cluster. For example: vsan.vm_object_info

Some objects are non-compliant after force repair


After you perform a force repair, some objects might not be repaired because the ownership of the objects was transferred
to a different node during the process. The force repair might be delayed for those objects.
Workaround: Attempt the force repair operation after all other objects are repaired and resynchronized. You can wait until
vSAN repairs the objects.
When you move a host from one encrypted cluster to another, and then back to the original cluster, the task fails
When you move a host from an encrypted vSAN cluster to another encrypted vSAN cluster, then move the host back to
the original encrypted cluster, the task might fail. You might see the following message: A general system error
occurred: Invalid fault . This error occurs because vSAN cannot re-encrypt data on the host using the original
encryption key. After a short time, vCenter Server restores the original key on the host, and all unmounted disks in the
vSAN cluster are mounted.
Workaround: Reboot the host and wait for all disks to get mounted.
Stretched cluster imbalance after a site recovers
When you recover a failed site in a stretched cluster, sometimes hosts in the failed site are brought back sequentially over
a long period of time. vSAN might overuse some hosts when it begins repairing the absent components.
Workaround: Recover all of the hosts in a failed site together within a short time window.
VM operations fail due to HA issue with stretched clusters
Under certain failure scenarios in stretched clusters, certain VM operations such as vMotions or powering on a VM might
be impacted. These failure scenarios include a partial or a complete site failure, or the failure of the high-speed network
between the sites. This problem is caused by the dependency on VMware HA being available for normal operation of
stretched cluster sites.
Workaround: Disable vSphere HA before performing vMotion, VM creation, or powering on VMs. Then re-enable vSphere
HA.
Cannot perform deep rekey if a disk group is unmounted
Before vSAN performs a deep rekey, it performs a shallow rekey. The shallow rekey fails if an unmounted disk group is
present. The deep rekey process cannot begin.
Workaround: Remount or remove the unmounted disk group.
Log entries state that firewall configuration has changed
A new firewall entry appears in the security profile when vSAN encryption is enabled: vsanEncryption. This rule controls
how hosts communicate directly to the KMS. When it is triggered, log entries are added to /var/log/vobd.log. You
might see the following messages:
Firewall configuration has changed. Operation 'addIP4' for rule set vsanEncryption
succeeded.
Firewall configuration has changed. Operation 'removeIP4' for rule set vsanEncryption
succeeded.
These messages can be ignored.

Workaround: None.
HA failover does not occur after setting Traffic Type option on a vmknic to support witness traffic
If you set the traffic type option on a vmknic to support witness traffic, vSphere HA does not automatically discover the
new setting. You must manually disable and then re-enable HA so it can discover the vmknic. If you configure the vmknic
and the vSAN cluster first, and then enable HA on the cluster, it does discover the vmknic.
Workaround: Manually disable vSphere HA on the cluster, and then re-enable it.


iSCSI MCS is not supported


vSAN iSCSI target service does not support Multiple Connections per Session (MCS).
Workaround: None.
Any iSCSI initiator can discover iSCSI targets
vSAN iSCSI target service allows any initiator on the network to discover iSCSI targets.
Workaround: You can isolate your ESXi hosts from iSCSI initiators by placing them on separate VLANs.
After resolving network partition, some VM operations on linked clone VMs might fail
Some VM operations on linked clone VMs that are not producing I/O inside the guest operating system might fail. The
operations that might fail include taking snapshots and suspending the VMs. This problem can occur after a network
partition is resolved, if the parent base VM's namespace is not yet accessible. When the parent VM's namespace
becomes accessible, HA is not notified to power on the VM.
Workaround: Power cycle VMs that are not actively running I/O operations.
Cannot place a witness host in Maintenance Mode
When you attempt to place a witness host in Maintenance Mode, the host remains in the current state and you see the
following notification: A specified parameter was not correct.
Workaround: When placing a witness host in Maintenance Mode, choose the No data migration option.
Moving the witness host into and then out of a stretched cluster leaves the cluster in a misconfigured state
If you place the witness host in a vSAN-enabled vCenter cluster, an alarm notifies you that the witness host cannot reside
in the cluster. But if you move the witness host out of the cluster, the cluster remains in a misconfigured state.
Workaround: Move the witness host out of the vSAN stretched cluster, and reconfigure the stretched cluster. For more
information, see this article: https://kb.vmware.com/s/article/2130587.
When a network partition occurs in a cluster which has an HA heartbeat datastore, VMs are not restarted on the
other data site
When the preferred or secondary site in a vSAN cluster loses its network connection to the other sites, VMs running on
the site that loses network connectivity are not restarted on the other data site, and the following error might appear:
vSphere HA virtual machine HA failover failed.
This is expected behavior for vSAN clusters.
Workaround: Do not select HA heartbeat datastore while configuring vSphere HA on the cluster.
Unmounted vSAN disks and disk groups displayed as mounted in the vSphere Web Client Operational Status
field
After the vSAN disks or disk groups are unmounted by either running the esxcli vsan storage disk group
unmount command or by the vSAN Device Monitor service when disks show persistently high latencies, the vSphere
Web Client incorrectly displays the Operational Status field as mounted.
Workaround: Use the Health field to verify disk status, instead of the Operational Status field.


vSAN Planning and Deployment


vSAN Planning and Deployment describes how to design and deploy a vSAN cluster in a vSphere environment. The
information includes system requirements, sizing guidelines, and suggested best practices.

Intended Audience
This manual is intended for anyone who wants to design and deploy a vSAN cluster in a VMware vSphere environment.
The information in this manual is written for experienced system administrators who are familiar with virtual machine
technology and virtual datacenter operations. This manual assumes familiarity with VMware vSphere, including VMware
ESXi, vCenter Server, and the vSphere Client.
For more information about vSAN features and how to configure a vSAN cluster, see Administering VMware vSAN.
For more information about monitoring a vSAN cluster and fixing problems, see the vSAN Monitoring and Troubleshooting
Guide.

vSphere Client and vSphere Web Client


Instructions in this guide reflect the vSphere Client (an HTML5-based GUI). You can also use the instructions to perform
the tasks by using the vSphere Web Client (a Flex-based GUI).
Tasks for which the workflow differs significantly between the vSphere Client and the vSphere Web Client have duplicate
procedures that provide steps according to the respective client interface. The procedures that relate to the vSphere Web
Client, contain vSphere Web Client in the title.
NOTE
In vSphere 6.7 Update 1, almost all of the vSphere Web Client functionality is implemented in the vSphere
Client.

Updated Information
This document is updated with each release of the product or when necessary.
This table provides the update history of vSAN Planning and Deployment.

Revision Description

25 JUL 2024 • Updated vSAN license information in License Requirements.
• Clarified stretched cluster and two-host cluster support for vSAN Storage Policy for SMP-FT VMs in vSAN Stretched Cluster Design Considerations.
• Additional minor updates.

25 JUN 2024 Initial release.

What Is vSAN
VMware vSAN is a distributed layer of software that runs natively as a part of the ESXi hypervisor.
vSAN aggregates local or direct-attached capacity devices of a host cluster and creates a single storage pool shared
across all hosts in the vSAN cluster. While supporting VMware features that require shared storage, such as HA, vMotion,
and DRS, vSAN eliminates the need for external shared storage and simplifies storage configuration and virtual machine
provisioning activities.


vSAN Concepts
VMware vSAN uses a software-defined approach that creates shared storage for virtual machines.
It virtualizes the local physical storage resources of ESXi hosts and turns them into pools of storage that can be
divided and assigned to virtual machines and applications according to their quality-of-service requirements. vSAN is
implemented directly in the ESXi hypervisor.
You can configure vSAN to work as either a hybrid or all-flash cluster. In hybrid clusters, flash devices are used for the
cache layer and magnetic disks are used for the storage capacity layer. In all-flash clusters, flash devices are used for
both cache and capacity.
You can activate vSAN on existing host clusters, or when you create a new cluster. vSAN aggregates all local capacity
devices into a single datastore shared by all hosts in the vSAN cluster. You can expand the datastore by adding capacity
devices or hosts with capacity devices to the cluster. vSAN works best when all ESXi hosts in the cluster share similar or
identical configurations across all cluster members, including similar or identical storage configurations. This consistent
configuration balances virtual machine storage components across all devices and hosts in the cluster. Hosts without any
local devices also can participate and run their virtual machines on the vSAN datastore.
In vSAN Original Storage Architecture (OSA), each host that contributes storage devices to the vSAN datastore must
provide at least one device for flash cache and at least one device for capacity. The devices on the contributing host
form one or more disk groups. Each disk group contains one flash cache device, and one or multiple capacity devices for
persistent storage. Each host can be configured to use multiple disk groups.
In vSAN Express Storage Architecture (ESA), all storage devices claimed by vSAN contribute to capacity and
performance. Each host's storage devices claimed by vSAN form a storage pool. The storage pool represents the amount
of caching and capacity provided by the host to the vSAN datastore.
For best practices, capacity considerations, and general recommendations about designing and sizing a vSAN cluster,
see the VMware vSAN Design and Sizing Guide.

Characteristics of vSAN
The following characteristics apply to vSAN, its clusters, and datastores.
vSAN includes numerous features to add resiliency and efficiency to your data computing and storage environment.

Table 1: vSAN Features

Supported Features Description

Shared storage support vSAN supports VMware features that require shared storage, such as HA,
vMotion, and DRS. For example, if a host becomes overloaded, DRS can
migrate virtual machines to other hosts in the cluster.
On-disk format vSAN on-disk virtual file format provides highly scalable snapshot and
clone management support per vSAN cluster. For information about
the number of virtual machine snapshots and clones supported per
vSAN cluster, refer to the vSphere Configuration Maximums at https://configmax.esp.vmware.com/home.
All-flash and hybrid configurations vSAN can be configured for all-flash or hybrid cluster.
Fault domains vSAN supports configuring fault domains to protect hosts from rack or
chassis failures when the vSAN cluster spans across multiple racks or
blade server chassis in a data center.
File service vSAN file service enables you to create file shares in the vSAN datastore
that client workstations or VMs can access.

iSCSI target service vSAN iSCSI target service enables hosts and physical workloads that
reside outside the vSAN cluster to access the vSAN datastore.
vSAN Stretched cluster and Two node vSAN cluster vSAN supports stretched clusters that span across two geographic
locations.
Support for Windows Server Failover Clusters (WSFC) vSAN 6.7 Update 3 and later releases support SCSI-3 Persistent
Reservations (SCSI3-PR) on a virtual disk level required by Windows
Server Failover Cluster (WSFC) to arbitrate an access to a shared disk
between nodes. Support of SCSI-3 PRs enables configuration of WSFC
with a disk resource shared between VMs natively on vSAN datastores.
Currently the following configurations are supported:
• Up to 6 application nodes per cluster.
• Up to 64 shared virtual disks per node.
NOTE
Microsoft SQL Server 2012 or later running on Microsoft
Windows Server 2012 or later has been qualified on vSAN.
vSAN health service vSAN health service includes preconfigured health check tests to monitor,
troubleshoot, diagnose the cause of cluster component problems, and
identify any potential risk.
vSAN performance service vSAN performance service includes statistical charts used to monitor IOPS,
throughput, latency, and congestion. You can monitor performance of a
vSAN cluster, host, disk group, disk, and VMs.
Integration with vSphere storage features vSAN integrates with vSphere data management features traditionally used
with VMFS and NFS storage. These features include snapshots, linked
clones, and vSphere Replication.
Virtual Machine Storage Policies vSAN works with VM storage policies to support a VM-centric approach to
storage management.
If you do not assign a storage policy to the virtual machine during
deployment, the vSAN Default Storage Policy is automatically assigned to
the VM.
Rapid provisioning vSAN enables rapid provisioning of storage in the vCenter Server® during
virtual machine creation and deployment operations.
Deduplication and compression vSAN performs block-level deduplication and compression to save
storage space. When you enable deduplication and compression on a
vSAN all-flash cluster, redundant data within each disk group is reduced.
Deduplication and compression is a cluster-wide setting, but the functions
are applied on a disk group basis. Compression-only vSAN is applied on a
per-disk basis.
Data at rest encryption vSAN provides data at rest encryption. Data is encrypted after all other
processing, such as deduplication, is performed. Data at rest encryption
protects data on storage devices, in case a device is removed from the
cluster.
Data in transit encryption vSAN can encrypt data in transit across hosts in the cluster. When you
enable data-in-transit encryption, vSAN encrypts all data and metadata
traffic between hosts.

SDK support The VMware vSAN SDK is an extension of the VMware vSphere
Management SDK. It includes documentation, libraries and code examples
that help developers automate installation, configuration, monitoring, and
troubleshooting of vSAN.

vSAN Terms and Definitions


vSAN introduces specific terms and definitions that are important to understand.
Before you get started with vSAN, review the key vSAN terms and definitions.

Disk Group (vSAN Original Storage Architecture)


A disk group is a unit of physical storage capacity and performance on a host and a group of physical devices that provide
performance and capacity to the vSAN cluster. On each ESXi host that contributes its local devices to a vSAN cluster,
devices are organized into disk groups.
Each disk group must have one flash cache device and one or multiple capacity devices. The devices used for caching
cannot be shared across disk groups, and cannot be used for other purposes. A single caching device must be dedicated
to a single disk group. In hybrid clusters, flash devices are used for the cache layer and magnetic disks are used for the
storage capacity layer. In an all-flash cluster, flash devices are used for both cache and capacity. For information about
creating and managing disk groups, see Administering VMware vSAN.

Storage Pool (vSAN Express Storage Architecture)


A storage pool is a representation of all storage devices on a host that are claimed by vSAN. Each host contains one
storage pool. Each device in the storage pool contributes both capacity and performance. The number of storage devices
allowed is determined by the host configuration.

Consumed Capacity
Consumed capacity is the amount of physical capacity consumed by one or more virtual machines at any point. Many
factors determine consumed capacity, including the consumed size of your .vmdk files, protection replicas, and so on.
When calculating for cache sizing, do not consider the capacity used for protection replicas.

Object-Based Storage
vSAN stores and manages data in the form of flexible data containers called objects. An object is a logical volume that
has its data and metadata distributed across the cluster. For example, every .vmdk is an object, as is every snapshot.
When you provision a virtual machine on a vSAN datastore, vSAN creates a set of objects comprised of multiple
components for each virtual disk. It also creates the VM home namespace, which is a container object that stores all
metadata files of your virtual machine. Based on the assigned virtual machine storage policy, vSAN provisions and
manages each object individually, which might also involve creating a RAID configuration for every object.
NOTE
If vSAN Express Storage Architecture is enabled, not every snapshot is a new object. A base .vmdk and its
snapshots are contained in one vSAN object. Additionally, in vSAN ESA, digest is backed by vSAN objects.


When vSAN creates an object for a virtual disk and determines how to distribute the object in the cluster, it considers the
following factors:
• vSAN verifies that the virtual disk requirements are applied according to the specified virtual machine storage policy
settings.
• vSAN verifies that the correct cluster resources are used at the time of provisioning. For example, based on the
protection policy, vSAN determines how many replicas to create. The performance policy determines the amount of
flash read cache allocated for each replica and how many stripes to create for each replica and where to place them in
the cluster.
• vSAN continually monitors and reports the policy compliance status of the virtual disk. If you find any noncompliant
policy status, you must troubleshoot and resolve the underlying problem.
NOTE
When required, you can edit VM storage policy settings. Changing the storage policy settings does not affect
virtual machine access. vSAN actively throttles the storage and network resources used for reconfiguration
to minimize the impact of object reconfiguration to normal workloads. When you change VM storage policy
settings, vSAN might initiate an object recreation process and subsequent resynchronization. See vSAN
Monitoring and Troubleshooting.
• vSAN verifies that the required protection components, such as mirrors and witnesses, are placed on separate hosts
or fault domains. For example, to rebuild components during a failure, vSAN looks for ESXi hosts that satisfy the
placement rules where protection components of virtual machine objects must be placed on two different hosts, or
across fault domains.

vSAN Datastore
After you enable vSAN on a cluster, a single vSAN datastore is created. It appears as another type of datastore in the list
of datastores that might be available, including Virtual Volume, VMFS, and NFS. A single vSAN datastore can provide
different service levels for each virtual machine or each virtual disk. In vCenter Server®, storage characteristics of the
vSAN datastore appear as a set of capabilities. You can reference these capabilities when defining a storage policy for
virtual machines. When you later deploy virtual machines, vSAN uses this policy to place virtual machines in the optimal
manner based on the requirements of each virtual machine. For general information about using storage policies, see the
vSphere Storage documentation.
A vSAN datastore has specific characteristics to consider.
• vSAN provides a single vSAN datastore accessible to all hosts in the cluster, whether or not they contribute storage to
the cluster. Each host can also mount any other datastores, including Virtual Volumes, VMFS, or NFS.
• You can use Storage vMotion to move virtual machines between vSAN datastores, NFS datastores, and VMFS
datastores.
• Only magnetic disks and flash devices used for capacity can contribute to the datastore capacity. The devices used for
flash cache are not counted as part of the datastore.

Objects and Components


Each object is composed of a set of components, determined by capabilities that are in use in the VM Storage Policy.
For example, with Failures to tolerate set to 1, vSAN ensures that the protection components, such as replicas and
witnesses, are placed on separate hosts in the vSAN cluster, where each replica is an object component. In addition, in
the same policy, if the Number of disk stripes per object configured to two or more, vSAN also stripes the object across
multiple capacity devices and each stripe is considered a component of the specified object. When needed, vSAN might
also break large objects into multiple components.
A vSAN datastore contains the following object types:
VM Home Namespace
The virtual machine home directory where all virtual machine configuration files are stored, such as .vmx, log files, .vmdk
files, and snapshot delta description files.


VMDK
A virtual machine disk or .vmdk file that stores the contents of the virtual machine's hard disk drive.
VM Swap Object
Created when a virtual machine is powered on.
Snapshot Delta VMDKs
Created when virtual machine snapshots are taken. Such delta disks are not created for vSAN Express Storage Architecture.
Memory object
Created when the snapshot memory option is selected when creating or suspending a virtual machine.

Virtual Machine Compliance Status: Compliant and Noncompliant


A virtual machine is considered noncompliant when one or more of its objects fail to meet the requirements of its assigned
storage policy. For example, the status might become noncompliant when one of the mirror copies is inaccessible. If your
virtual machines are in compliance with the requirements defined in the storage policy, the status of your virtual machines
is compliant. From the Physical Disk Placement tab on the Virtual Disks page, you can verify the virtual machine object
compliance status. For information about troubleshooting a vSAN cluster, see vSAN Monitoring and Troubleshooting.

Component State: Degraded and Absent States


vSAN acknowledges the following failure states for components:
• Degraded. A component is Degraded when vSAN detects a permanent component failure and determines that
the failed component cannot recover to its original working state. As a result, vSAN starts to rebuild the degraded
components immediately. This state might occur when a component is on a failed device.
• Absent. A component is Absent when vSAN detects a temporary component failure where components, including all its
data, might recover and return vSAN to its original state. This state might occur when you are restarting hosts or if you
unplug a device from a vSAN host. vSAN starts to rebuild the components in absent status after waiting for 60 minutes.

Object State: Healthy and Unhealthy


Depending on the type and number of failures in the cluster, an object might be in one of the following states:
• Healthy. When at least one full RAID 1 mirror is available, or the minimum required number of data segments are
available, the object is considered healthy.
• Unhealthy. An object is considered unhealthy when no full mirror is available or the minimum required number of data
segments are unavailable for RAID 5 or RAID 6 objects. If fewer than 50 percent of an object's votes are available, the
object is unhealthy. Multiple failures in the cluster can cause objects to become unhealthy. When the operational status
of an object is considered unhealthy, it impacts the availability of the associated VM.

Witness
A witness is a component that contains only metadata and does not contain any actual application data. It serves as
a tiebreaker when a decision must be made regarding the availability of the surviving datastore components, after a
potential failure. A witness consumes approximately 2 MB of space for metadata on the vSAN datastore when using on-
disk format 1.0, and 4 MB for on-disk format version 2.0 and later.
vSAN maintains a quorum by using an asymmetrical voting system where each component might have more than one
vote to decide the availability of objects. Greater than 50 percent of the votes that make up a VM’s storage object must
be accessible at all times for the object to be considered available. When 50 percent or fewer votes are accessible to all
hosts, the object is no longer accessible to the vSAN datastore. Inaccessible objects can impact the availability of the
associated VM.
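To make this voting rule concrete, here is a minimal sketch in Python (not VMware code) that applies the greater-than-50-percent rule to a hypothetical set of component votes; the component names and vote counts are illustrative only.

# Minimal sketch: applies the quorum rule described above. An object remains
# accessible only while strictly more than 50 percent of its votes are on
# components that are currently reachable.
def object_accessible(votes, reachable):
    """votes: dict of component name -> vote count (voting can be asymmetrical).
    reachable: set of component names that are currently accessible."""
    total_votes = sum(votes.values())
    available_votes = sum(v for name, v in votes.items() if name in reachable)
    return available_votes * 2 > total_votes  # strictly greater than 50 percent

# Hypothetical RAID-1 object: two data replicas plus a witness component.
votes = {"replica_a": 1, "replica_b": 1, "witness": 1}
print(object_accessible(votes, {"replica_a", "witness"}))  # True: 2 of 3 votes
print(object_accessible(votes, {"replica_b"}))             # False: 1 of 3 votes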


Storage Policy-Based Management (SPBM)


When you use vSAN, you can define virtual machine storage requirements, such as performance and availability, in the
form of a policy. vSAN ensures that the virtual machines deployed to vSAN datastores are assigned at least one virtual
machine storage policy. When you know the storage requirements of your virtual machines, you can define storage
policies and assign the policies to your virtual machines. If you do not apply a storage policy when deploying virtual
machines, vSAN automatically assigns a default vSAN policy with Failures to tolerate set to 1, a single disk stripe for
each object, and thin provisioned virtual disk. For best results, define your own virtual machine storage policies, even
if the requirements of your policies are the same as those defined in the default storage policy. For information about
working with vSAN storage policies, see Administering VMware vSAN.

vSphere PowerCLI
VMware vSphere PowerCLI adds command-line scripting support for vSAN, to help you automate configuration and
management tasks. vSphere PowerCLI provides a Windows PowerShell interface to the vSphere API. PowerCLI includes
cmdlets for administering vSAN components. For information about using vSphere PowerCLI, see vSphere PowerCLI
Documentation.

How vSAN Differs from Traditional Storage


Although vSAN shares many characteristics with traditional storage arrays, the overall behavior and function of vSAN is
different.
For example, vSAN can manage and work only with ESXi hosts, and a single vSAN instance provides a single datastore
for the cluster.
vSAN and traditional storage also differ in the following key ways:
• vSAN does not require external networked storage for storing virtual machine files remotely, such as on a Fibre
Channel (FC) or Storage Area Network (SAN).
• Using traditional storage, the storage administrator preallocates storage space on different storage systems. vSAN
automatically turns the local physical storage resources of the ESXi hosts into a single pool of storage. These pools
can be divided and assigned to virtual machines and applications according to their quality-of-service requirements.
• vSAN does not behave like traditional storage volumes based on LUNs or NFS shares. The iSCSI target service uses
LUNs to enable an initiator on a remote host to transport block-level data to a storage device in the vSAN cluster.
• Some standard storage protocols, such as FCP, do not apply to vSAN.
• vSAN is highly integrated with vSphere. You do not need dedicated plug-ins or a storage console for vSAN, compared
to traditional storage. You can deploy, manage, and monitor vSAN by using the vSphere Client.
• A dedicated storage administrator does not need to manage vSAN. Instead a vSphere administrator can manage a
vSAN environment.
• With vSAN, VM storage policies are automatically assigned when you deploy new VMs. The storage policies can be
changed dynamically as needed.

Building a vSAN Cluster


You can choose the storage architecture and deployment option when creating a vSAN cluster.
Choose the vSAN storage architecture that best suits your resources and your needs.

vSAN Original Storage Architecture


vSAN Original Storage Architecture (OSA) is designed for a wide range of storage devices, including flash solid state
drives (SSD) and magnetic disk drives (HDD). Each host that contributes storage contains one or more disk groups. Each
disk group contains one flash cache device and one or more capacity devices.


vSAN Express Storage Architecture


vSAN Express Storage Architecture (ESA) is designed for high-performance NVMe based TLC flash devices and high
performance networks. Each host that contributes storage contains a single storage pool of one or more flash devices.
Each flash device provides caching and capacity to the cluster.

Depending on your requirement, you can deploy vSAN in the following ways.

vSAN ReadyNode
The vSAN ReadyNode is a preconfigured solution of the vSAN software provided by VMware partners, such as Cisco,
Dell, HPE, Fujitsu, IBM, and Supermicro. This solution includes validated server configuration in a tested, certified
hardware form factor for vSAN deployment that is recommended by the server OEM and VMware. For information about
the vSAN ReadyNode solution for a specific partner, visit the VMware Partner website.

User-Defined vSAN Cluster


You can build a vSAN cluster by selecting individual software and hardware components, such as drivers, firmware,
and storage I/O controllers that are listed in the vSAN Compatibility Guide (VCG) website at http://www.vmware.com/
resources/compatibility/search.php. You can choose any servers, storage I/O controllers, capacity and flash cache
devices, memory, any number of cores you must have per CPU, that are certified and listed on the VCG website. Review
the compatibility information on the VCG website before choosing software and hardware components, drivers, firmware,
and storage I/O controllers that vSAN supports. When designing a vSAN cluster, use only devices, firmware, and drivers
that are listed on the VCG website. Using software and hardware versions that are not listed in the VCG might cause
cluster failure or unexpected data loss. For information about designing a vSAN cluster, see "Designing and Sizing a
vSAN Cluster" in vSAN Planning and Deployment.


vSAN Deployment Options


This section covers the supported deployment options for vSAN clusters.

Single Site vSAN Cluster


A single site vSAN cluster consists of a minimum of three hosts. Typically, all hosts in a single site vSAN cluster reside at
a single site, and are connected on the same Layer 2 network. All-flash configurations and vSAN Express Storage
Architecture require 10 Gb network connections.
For more information, see Creating a Single Site vSAN Cluster .

Two-Node vSAN Cluster


Two-node vSAN clusters are often used for remote office/branch office environments, typically running a small number of
workloads that require high availability. A two-node vSAN cluster consists of two hosts at the same location, connected
to the same network switch or directly connected. You can configure a two-node vSAN cluster that uses a third host as a
witness, which can be located remotely from the branch office. Usually the witness resides at the main site, along with the
vCenter Server.
For more information, see Creating a vSAN Stretched Cluster or Two-Node vSAN Cluster.


vSAN Stretched Cluster


A vSAN stretched cluster provides resiliency against the loss of an entire site. The hosts in a vSAN stretched cluster are
distributed evenly across two sites. The two sites must have a network latency of no more than five milliseconds (5 ms). A
vSAN witness host resides at a third site to provide the witness function. The witness also acts as tie-breaker in scenarios
where a network partition occurs between the two data sites. Only metadata such as witness components is stored on the
witness.
For more information, see Creating a vSAN Stretched Cluster or Two-Node vSAN Cluster.

Integrate vSAN with Other VMware Software


After you have vSAN up and running, it is integrated with the rest of the VMware software stack.
You can do most of what you can do with traditional storage by using vSphere components and features including
vSphere vMotion, snapshots, clones, Distributed Resource Scheduler (DRS), vSphere High Availability, VMware Site
Recovery Manager, and more.

vSphere HA
You can enable vSphere HA and vSAN on the same cluster. As with traditional datastores, vSphere HA provides the same
level of protection for virtual machines on vSAN datastores. This level of protection imposes specific restrictions when
vSphere HA and vSAN interact. For specific considerations about integrating vSphere HA and vSAN, see Using vSAN
and vSphere HA.


VMware Horizon View


You can integrate vSAN with VMware Horizon View. When integrated, vSAN provides the following benefits to virtual
desktop environments:
• High-performance storage with automatic caching
• Storage policy-based management, for automatic remediation
For information about integrating vSAN with VMware Horizon, see the VMware with Horizon View documentation. For
designing and sizing VMware Horizon View for vSAN, see the Designing and Sizing Guide for Horizon View.

Limitations of vSAN
This topic discusses the limitations of vSAN.
When working with vSAN, consider the following limitations:
• vSAN does not support hosts participating in multiple vSAN clusters. However, a vSAN host can access other external
storage resources that are shared across clusters.
• vSAN does not support vSphere DPM and Storage I/O Control.
• vSAN does not support SE Sparse disks.
• vSAN does not support RDM, VMFS, diagnostic partition, and other device access features.

Requirements for Enabling vSAN


Before you deploy a vSAN cluster, verify that your environment meets the requirements for running vSAN.

Hardware Requirements for vSAN


Verify that your ESXi hosts and storage devices meet the vSAN hardware requirements.

Storage Device Requirements


All capacity devices, drivers, and firmware versions in your configuration must be certified and listed in the vSAN section
of the VMware Compatibility Guide.

Table 2: vSAN Original Storage Architecture storage device requirements

Storage Component Requirements

Cache • One SAS or SATA solid-state disk (SSD) or PCIe flash device.
• Before calculating the Failures to tolerate, check the size of the flash
caching device in each disk group. For hybrid cluster, it must provide at
least 10 percent of the anticipated storage consumed on the capacity
devices, not including replicas such as mirrors.
• vSphere Flash Read Cache must not use any of the flash devices
reserved for vSAN cache.
• The cache flash devices must not be formatted with VMFS or another file
system.
Capacity • Hybrid group configuration must have at least one SAS or NL-SAS
magnetic disk.
• All-flash disk group configuration must have at least one SAS, or SATA
solid-state disk (SSD), or PCIe flash device.


Storage controllers One SAS or SATA host bus adapter (HBA), or a RAID controller that is in
passthrough mode or RAID 0 mode.
To avoid issues, consider these points when the same storage controller is
backing both vSAN and non-vSAN disks:
Do not mix the controller mode for vSAN and non-vSAN disks to avoid
handling the disks inconsistently, which can negatively impact vSAN
operation. If the vSAN disks are in RAID mode, the non-vSAN disks must
also be in RAID mode.
When you use non-vSAN disks for VMFS, use the VMFS datastore only for
scratch, logging, and core dumps.
Do not run virtual machines from a disk or RAID group that shares its
controller with vSAN disks or RAID groups.
Do not passthrough non-vSAN disks to virtual machine guests as Raw
Device Mappings (RDMs).
To learn about controller supported features, such as passthrough and
RAID, refer to the vSAN HCL: https://www.vmware.com/resources/
compatibility/search.php?deviceCategory=vsan

Table 3: vSAN Express Storage Architecture storage device requirements

Storage Component Requirements

Cache and capacity Each storage pool must have at least one NVMe TLC device.

Host Memory
The memory requirements for vSAN Original Storage Architecture depend on the number of disk groups and devices
that the ESXi hypervisor must manage. For more information, see the VMware knowledge base article at https://
kb.vmware.com/s/article/2113954.
vSAN Express Storage Architecture requires at least 128 GB host memory. The memory needed for your environment
depends on the number of devices in the host's storage pool.

Flash Boot Devices


During installation, the ESXi installer creates a coredump partition on the boot device. The default size of the coredump
partition satisfies most installation requirements.
• If the ESXi host has 512 GB of memory or less, you can boot the host from a USB, SD, or SATADOM
device. When you boot a vSAN host from a USB device or SD card, the size of the boot device must be at least 4 GB.
• If the ESXi host has more than 512 GB of memory, consider the following guidelines.
– You can boot the host from a SATADOM or disk device with a size of at least 16 GB. When you use a SATADOM
device, use a single-level cell (SLC) device.
– If you are using vSAN 6.5 or later, you must resize the coredump partition on ESXi hosts to boot from USB/SD
devices.
When you boot an ESXi 6.0 or later host from USB device or from SD card, vSAN trace logs are written to RAMDisk.
These logs are automatically offloaded to persistent media during shutdown or system crash (panic). This is the only
supported method for handling vSAN traces when booting an ESXi host from a USB stick or SD card. If a power failure occurs,
vSAN trace logs are not preserved.
When you boot an ESXi 6.0 or later host from a SATADOM device, vSAN trace logs are written directly to the SATADOM
device. Therefore it is important that the SATADOM device meets the specifications outlined in this guide.


Cluster Requirements for vSAN


Verify that a host cluster meets the requirements for enabling vSAN.
• All capacity devices, drivers, and firmware versions in your configuration must be certified and listed in the vSAN
section of the VMware Compatibility Guide.
• A standard vSAN cluster must contain a minimum of three hosts that contribute capacity to the cluster. A two host
vSAN cluster consists of two data hosts and an external witness host. For information about the considerations for a
three-host cluster, see Design Considerations for a vSAN Cluster.
• A host that resides in a vSAN cluster must not participate in other clusters.

Software Requirements for vSAN


Verify that the vSphere components in your environment meet the software version requirements for using vSAN.
To use the full set of vSAN capabilities, the ESXi hosts that participate in vSAN clusters must be version 8.0 Update 1 or
later. During the vSAN upgrade from previous versions, you can keep the current on-disk format version, but you cannot
use many of the new features. vSAN 8.0 Update 1 and later software supports all on-disk formats.

Networking Requirements for vSAN


Verify that the network infrastructure and the networking configuration on the ESXi hosts meet the minimum networking
requirements for vSAN.

Table 4: Networking Requirements for vSAN

Networking Component Requirement

Host Bandwidth Each host must have minimum bandwidth dedicated to vSAN.
• vSAN OSA: Dedicated 1 Gbps for hybrid configurations, dedicated or
shared 10 Gbps for all-flash configurations
• vSAN ESA: Dedicated or shared 10 Gbps
For information about networking considerations in vSAN, see Designing the
vSAN Network.
Connection between hosts Each host in the vSAN cluster, regardless of whether it contributes capacity,
must have a VMkernel network adapter for vSAN traffic. See Set Up a
VMkernel Network for vSAN.
Host network All hosts in your vSAN cluster must be connected to a vSAN Layer 2 or Layer 3
network.
IPv4 and IPv6 support The vSAN network supports both IPv4 and IPv6.
Network latency • Maximum of 1 ms RTT for single site (non-stretched) vSAN clusters
between all hosts in the cluster
• Maximum of 5 ms RTT between the two main sites for vSAN stretched
clusters
• Maximum of 200 ms RTT from a main site to the vSAN witness host

License Requirements
vSAN clusters can be licensed under per TiB, per CPU, or per core licensing models.
In a vSphere environment converted to VMware Cloud-connection based vSphere+ subscription, you can continue to use
vSAN CPU license keys. For more information, see the VMware vSphere+ documentation.


Per TiB License for vSAN


The per tebibyte (TiB) license for vSAN in VMware Cloud Foundation is subscription based.
You can assign a vSAN per TiB license to a single vSAN cluster or multiple vSAN clusters. If multiple vSAN clusters share
a single per capacity license, the capacity gets shared by multiple vSAN clusters. To calculate the capacity that you need
for your vSAN environment, you need enough TiB licenses for the total physical device capacity in tebibytes on all the
ESXi hosts in each vSAN cluster.
For example, consider a vSAN cluster with 3 ESXi hosts, 1 CPU per host, and each host has 4.7 TiBs of storage per CPU.
With a total of 14.1 TiBs (3 * 1 * 4.7) of storage, the cluster rounds up the license usage to 15 TiBs vSAN capacity.
The vSAN clusters reflect the total raw storage capacity available. For more information about calculating the license
capacity that you need for your environment, see the VMware knowledge base article at https://kb.vmware.com/s/
article/95927.
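As a rough illustration, the following Python sketch reproduces the example above. It assumes that license usage is the total raw device capacity of the cluster in TiB, rounded up; the function name and inputs are illustrative only.

import math

# Minimal sketch: per TiB license usage as total raw capacity rounded up to whole TiB.
def per_tib_license_usage(hosts, cpus_per_host, tib_per_cpu):
    raw_tib = hosts * cpus_per_host * tib_per_cpu
    return math.ceil(raw_tib)

# Example from the text: 3 hosts, 1 CPU per host, 4.7 TiB of storage per CPU.
print(per_tib_license_usage(3, 1, 4.7))  # 15 (14.1 TiB rounded up)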

VMware Cloud Foundation License for vSAN


The license for vSAN in VMware Cloud Foundation is subscription based.
Consider a vSAN cluster with 3 ESXi hosts, 1 CPU per host, and 6 cores per CPU. For each VCF core that you purchase,
you receive one TiB of vSAN capacity. You must purchase the subscription capacity of 16 cores per CPU because it is the
required minimum license capacity. With a total of 48 (3 * 1 * 16) licensed cores in the VCF cluster, you receive 48 TiB of
vSAN capacity.

Number of ESXi Hosts (in vSAN cluster) | Number of CPUs per ESXi Host | Cores per CPU | Core License | Entitled vSAN Capacity (TiB)
3 | 1 | 6 (minimum 16 cores required) | 48 | 48
3 | 2 | 16 | 96 | 96
3 | 2 | 24 | 144 | 144

The license use of the vSAN is recalculated and updated in the following cases:
• If you assign a new license to the vSAN cluster
• If you add a new host to the vSAN cluster
• If a host is removed from the cluster
• If the total number of TiBs in a cluster changes
You need to purchase a vSAN add-on license if you need additional capacity. For capacity larger than the total entitled
vSAN capacity in tebibytes, you can purchase additional vSAN capacity. When you purchase additional capacity, you
receive a vSAN capacity license. You can combine multiple license keys and apply the resulting license key to vSAN
clusters.

VMware vSphere Foundation Capacity License for vSAN


With the vSAN 8.0 Update 3 release, you do not need a separate vSAN license to deploy vSAN clusters with VMware
vSphere.
With the VMware vSphere Foundation license, you receive 0.25 tebibyte (TiB) of vSAN storage per vSAN host physical
core.
You can use a Solution License to license all the components of VVF. For more information on applying the Solution
License to the VVF components, see "VMware vSphere Foundation (VVF) Licensing" in the vCenter Server and Host
Management guide.


To calculate the capacity that you need for your vSAN environment, you need the total number of licensed CPU cores for
each CPU on all the ESXi hosts in your environment. For example, consider a vSAN cluster with 3 ESXi hosts, 1 CPU
per host, and 8 physical cores per CPU. You can use up to 0.25 TiB of included vSAN storage per vSAN host physical
core. You must purchase a vSphere Foundation license with the subscription capacity of 16 cores per CPU because it is
the required minimum license capacity. With a total of 48 (3 * 1 * 16) licensed cores in the cluster, you receive 12 TiB
(0.25 TiB * total licensed cores in the vSAN cluster) of capacity. Similarly, with a total of 128 (4 * 2 * 16) licensed cores,
you receive 32 TiB of capacity.

Number of ESXi Hosts (in vSAN cluster) | Number of CPUs per ESXi Host | Cores per CPU | Core License | Entitled vSAN Capacity (TiB)
3 | 1 | 8 (minimum 16 cores required) | 48 | 12
4 | 2 | 16 | 128 | 32
4 | 2 | 24 | 192 | 48
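The rows of the table above can be reproduced with a short Python sketch. It assumes the rules stated in this section: each CPU is licensed for at least 16 cores, and every licensed core entitles the cluster to 0.25 TiB of vSAN storage. The function name is illustrative only.

MIN_CORES_PER_CPU = 16   # required minimum license capacity per CPU
TIB_PER_CORE = 0.25      # included vSAN storage per licensed core

# Minimal sketch: total licensed cores and the entitled vSAN capacity in TiB.
def vvf_entitled_vsan_tib(hosts, cpus_per_host, cores_per_cpu):
    licensed_cores_per_cpu = max(cores_per_cpu, MIN_CORES_PER_CPU)
    total_licensed_cores = hosts * cpus_per_host * licensed_cores_per_cpu
    return total_licensed_cores, total_licensed_cores * TIB_PER_CORE

print(vvf_entitled_vsan_tib(3, 1, 8))   # (48, 12.0)  first row of the table
print(vvf_entitled_vsan_tib(4, 2, 16))  # (128, 32.0)
print(vvf_entitled_vsan_tib(4, 2, 24))  # (192, 48.0)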

You need to purchase an add-on license if you need additional capacity. vSAN clusters with more than 0.25 TiB of
storage per core require a vSAN add-on license for the entire storage capacity of the cluster. For more information about
calculating the license capacity that you need for your environment, see the VMware knowledge base article at https://
kb.vmware.com/s/article/95927.

Per CPU License for vSAN


After you enable vSAN on a cluster, you must assign the cluster an appropriate vSAN license.
Similar to vSphere licenses, vSAN licenses have per CPU capacity. When you assign a per CPU vSAN license to a
cluster, the amount of license capacity used equals the total number of CPUs in the hosts participating in the cluster.
The vSAN cluster can have any of the following:
• vSAN CPU license with a maximum 32 physical cores per CPU that needs one vSAN license for every 32 cores of
CPU.
• vSAN CPU license without a maximum physical core that needs one vSAN license for each CPU.
For example, if you have a vSAN cluster that contains 4 hosts with 2 CPUs each, assign the cluster a vSAN license with a
minimum capacity of 8 CPUs assuming the quantity of physical cores on each CPU is less than or equal to 32 cores.
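A minimal Python sketch of this counting rule, as read from the description above: one license covers a CPU with up to 32 physical cores, and a CPU with more cores needs one license for every block of 32 cores. The function and values are illustrative only.

import math

# Minimal sketch: per CPU license count, assuming one license per 32 physical cores.
def per_cpu_license_count(hosts, cpus_per_host, cores_per_cpu):
    licenses_per_cpu = math.ceil(cores_per_cpu / 32)
    return hosts * cpus_per_host * licenses_per_cpu

print(per_cpu_license_count(4, 2, 32))  # 8, matching the example above
print(per_cpu_license_count(4, 2, 48))  # 16: two licenses per CPU once cores exceed 32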
The license use of the vSAN is recalculated and updated in the following cases:
• If you assign a new license to the vSAN cluster
• If you add a new host to the vSAN cluster
• If a host is removed from the cluster
• If the total number of CPUs in a cluster changes
You must maintain the vSAN clusters in compliance with the vSAN licensing model. The total number of CPUs of all hosts
in the cluster must not exceed the capacity of the vSAN license that is assigned to the cluster.

Per Core License for vSAN


The per core licensing model is subscription based.
To calculate the capacity you need for your environment, you need the total number of physical CPU cores for each
CPU on all ESXi hosts in your vSAN cluster. Each core requires a single license, and the minimum license capacity you
can purchase is 16 cores per CPU.


For example, if you have 1 ESXi host with 1 CPU, and 8 CPU cores per CPU, you must purchase the subscription
capacity of 16 cores per CPU because it is the minimum license capacity.

Number of ESXi Hosts | Number of CPUs | Cores per CPU | Number of Core Licenses
1 | 1 | 8 | 16
2 | 2 | 8 | 64
2 | 2 | 16 | 64

For more information about calculating the number of licenses you need for your environment, see the VMware
knowledge base article at https://kb.vmware.com/s/article/95927.
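The license counts in the table above follow from a simple rule: one license per physical core, with a minimum of 16 core licenses per CPU. The following Python sketch (illustrative only) reproduces those values.

MIN_CORES_PER_CPU = 16  # minimum license capacity per CPU

# Minimal sketch: per core license count with the 16-core minimum applied per CPU.
def per_core_license_count(hosts, cpus_per_host, cores_per_cpu):
    return hosts * cpus_per_host * max(cores_per_cpu, MIN_CORES_PER_CPU)

print(per_core_license_count(1, 1, 8))   # 16
print(per_core_license_count(2, 2, 8))   # 64
print(per_core_license_count(2, 2, 16))  # 64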

License and Evaluation Period Expiry


When the license or the evaluation period of vSAN expires, you can continue to use the currently configured vSAN
resources and features if you have an active license. However, you cannot add SSD or HDD capacity to an existing disk
group or create new disk groups.

vSAN for Desktop


vSAN for Desktop is intended for use in VDI environments, such as vSphere for Desktop or Horizon™ View™. The license
use for vSAN for Desktop equals the total number of powered on VMs in a cluster with enabled vSAN.
To remain EULA compliant, the license use for vSAN for Desktop must not exceed the license capacity. The number of
powered on desktop VMs in a vSAN cluster must be less than or equal to the license capacity of vSAN for Desktop.

Designing and Sizing a vSAN Cluster


For best performance and use, plan the capabilities and configuration of your hosts and their storage devices before you
deploy vSAN in a vSphere environment. Carefully consider certain host and networking configurations within the vSAN
cluster.
The Administering VMware vSAN documentation examines the key points about designing and sizing a vSAN cluster. For
detailed instructions about designing and sizing a vSAN cluster, see the VMware vSAN Design and Sizing Guide.

Designing and Sizing vSAN Storage


Plan capacity and cache based on the expected data storage consumption. Consider your requirements for availability
and endurance.

Planning Capacity in vSAN


You can calculate the capacity of a vSAN datastore to accommodate the virtual machines (VMs) files in the cluster, and to
handle failures and maintenance operations.

Raw Capacity
Use this formula to determine the raw capacity of a vSAN datastore. Multiply the total number of disk groups in the cluster
by the size of the capacity devices in those disk groups. Subtract the overhead required by the vSAN on-disk format.
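A minimal Python sketch of this formula. The on-disk format overhead varies by configuration, so it is passed in here as an assumed input rather than computed; all values are hypothetical.

# Minimal sketch: raw capacity = disk groups x capacity per disk group, minus
# the overhead required by the vSAN on-disk format.
def vsan_raw_capacity_tib(disk_groups, capacity_per_disk_group_tib, overhead_tib):
    return disk_groups * capacity_per_disk_group_tib - overhead_tib

# Hypothetical cluster: 6 disk groups, 8 TiB of capacity devices per disk group,
# and 0.5 TiB assumed for on-disk format overhead.
print(vsan_raw_capacity_tib(6, 8, 0.5))  # 47.5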

Failures to Tolerate
When you plan the capacity of the vSAN datastore, in addition to the number of virtual machines and the size of their
VMDK files, you must consider the Failures to tolerate of the virtual machine storage policies for the cluster.


The Failures to tolerate has an important role when you plan and size storage capacity for vSAN. Based on the
availability requirements of a virtual machine, the setting might result in doubled consumption or more, compared with the
consumption of a virtual machine and its individual devices.
For example, if the Failures to tolerate is set to 1 failure - RAID-1 (Mirroring), virtual machines can use about 50
percent of the raw capacity. If the FTT is set to 2, the usable capacity is about 33 percent. If the FTT is set to 3, the usable
capacity is about 25 percent.
But if the Failures to tolerate is set to 1 failure - RAID-5 (Erasure Coding), virtual machines can use about 75 percent
of the raw capacity. If the FTT is set to 2 failures - RAID-6 (Erasure Coding), the usable capacity is about 67 percent.
For more information about RAID 5/6, see Administering VMware vSAN.
For information about the attributes in a vSAN storage policy, see Administering VMware vSAN.
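The approximate percentages quoted above can be expressed as usable-to-raw ratios. The following Python sketch is illustrative only and uses the rounded figures from this section to estimate usable capacity for a hypothetical raw datastore size.

# Minimal sketch: approximate usable capacity for common Failures to tolerate policies.
USABLE_RATIO = {
    "RAID-1, FTT=1": 1 / 2,  # two full copies            -> about 50 percent usable
    "RAID-1, FTT=2": 1 / 3,  # three full copies          -> about 33 percent usable
    "RAID-1, FTT=3": 1 / 4,  # four full copies           -> about 25 percent usable
    "RAID-5, FTT=1": 3 / 4,  # 3 data + 1 parity segments -> about 75 percent usable
    "RAID-6, FTT=2": 4 / 6,  # 4 data + 2 parity segments -> about 67 percent usable
}

raw_tib = 100  # hypothetical raw datastore capacity
for policy, ratio in USABLE_RATIO.items():
    print(f"{policy}: ~{raw_tib * ratio:.0f} TiB usable of {raw_tib} TiB raw")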

Capacity Sizing Guidelines


• Keep some unused space to prevent vSAN from rebalancing the storage load. vSAN rebalances the components
across the cluster whenever the consumption on a single capacity device reaches 80 percent or more. The rebalance
operation might impact the performance of applications. To avoid these issues, keep storage consumption to less than
80 percent (a simple check is sketched after this list). vSAN 7.0 Update 1 and later enables you to manage unused capacity using operations reserve and host
rebuild reserve.
• Plan extra capacity to handle any potential failure or replacement of capacity devices, disk groups, and hosts. When
a capacity device is not reachable, vSAN recovers the components from another device in the cluster. When a flash
cache device fails or is removed, vSAN recovers the components from the entire disk group.
• Reserve extra capacity to make sure that vSAN recovers components after a host failure or when a host enters
maintenance mode. For example, provision hosts with enough capacity so that you have sufficient free capacity left for
components to rebuild after a host failure or during maintenance. This extra space is important when you have more
than three hosts, so you have sufficient free capacity to rebuild the failed components. If a host fails, the rebuilding
takes place on the storage available on another host, so that another failure can be tolerated. However, in a three-
host cluster, vSAN does not perform the rebuild operation if the Failures to tolerate is set to 1 because when one host
fails, only two hosts remain in the cluster. To tolerate a rebuild after a failure, you must have at least three surviving
hosts.
• Provide enough temporary storage space for changes in the vSAN VM storage policy. When you dynamically change a
VM storage policy, vSAN might create a new RAID tree layout of the object. When vSAN instantiates and synchronizes
a new layout, the object may consume extra space temporarily. Keep some temporary storage space in the cluster to
handle such changes.
• If you plan to use advanced features, such as software checksum or deduplication and compression, reserve extra
capacity to handle the operational overhead.
You can use the vSAN Sizer tool at https://vsansizer.esp.vmware.com/ to assist with capacity requirements, and to
determine how vSAN can meet your performance requirements.
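As a simple planning aid, the following Python sketch flags a cluster whose projected consumption would exceed the 80 percent guideline. vSAN's rebalancing trigger applies per capacity device, so this cluster-level check is a simplification for planning only, and all values are hypothetical.

REBALANCE_THRESHOLD = 0.80  # vSAN starts rebalancing at 80 percent consumption on a device

# Minimal sketch: check whether projected consumption stays under the guideline.
def projected_utilization_ok(consumed_tib, planned_growth_tib, raw_tib):
    projected = (consumed_tib + planned_growth_tib) / raw_tib
    return projected < REBALANCE_THRESHOLD, projected

ok, projected = projected_utilization_ok(consumed_tib=60, planned_growth_tib=25, raw_tib=100)
print(ok, f"{projected:.0%}")  # False 85% -- plan additional capacity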

Considerations for Virtual Machine Objects


When you plan the storage capacity in the vSAN datastore, consider the space required in the datastore for the VM home
namespace objects, snapshots, and swap files.
• VM Home Namespace. You can assign a storage policy specifically to the home namespace object for a virtual
machine. To prevent unnecessary allocation of capacity and cache storage, vSAN applies only the Failures to tolerate
and the Force provisioning settings from the policy on the VM home namespace. Plan storage space to meet the
requirements for a storage policy assigned to a VM Home Namespace whose Failures to tolerate is greater than 0.
• Snapshots. Delta devices inherit the policy of the base VMDK file. Plan extra space according to the expected size and
number of snapshots, and to the settings in the vSAN storage policies.


The space that is required varies, depending on how often the virtual machine changes data and how long a snapshot is
attached to the virtual machine.
• Swap files. In vSAN 6.7 and later, virtual machine swap files inherit the storage policy of the VM Namespace.

Design Considerations for Flash Caching Devices in vSAN


Plan the configuration of flash devices for vSAN cache and all-flash capacity to provide high performance and required
storage space, and to accommodate future growth.

Choosing Between PCIe or SSD Flash Devices


Choose SSD flash devices according to the requirements for performance, capacity, write endurance, and cost of the
storage.
• Compatibility. The model of the SSD devices must be listed in the vSAN section of the VMware Compatibility Guide.
• Performance. PCIe devices generally have faster performance than SATA devices.
• Capacity. The maximum capacity that is available for PCIe devices is generally greater than the maximum capacity
that is currently listed for SATA devices in the VMware Compatibility Guide.
• Write endurance. The write endurance of the SSD devices must meet the requirements for capacity or for cache in all-
flash configurations, and for cache in hybrid configurations.
For information about the write endurance requirements for all-flash and hybrid configurations, see the VMware
vSAN Design and Sizing Guide. For information about the write endurance class of SSD devices, see the vSAN section of
the VMware Compatibility Guide.
• Cost. PCIe devices generally have higher cost than SSD devices.

Flash Devices as vSAN Cache


Design the configuration of flash cache for vSAN for write endurance, performance, and potential growth based on these
considerations.


Table 5: Sizing vSAN Cache

Storage Configuration Considerations

All-flash and hybrid configurations • A higher cache-to-capacity ratio eases future capacity growth. Oversizing
cache enables you to add more capacity to an existing disk group without
the need to increase the size of the cache.
• Flash caching devices must have high write endurance.
• Replacing a flash caching device is more complicated than replacing a
capacity device because such an operation impacts the entire disk group.
• If you add more flash devices to increase the size of the cache, you must
create more disk groups. The ratio between flash cache devices and disk
groups is always 1:1.
A configuration of multiple disk groups provides the following advantages:
– Reduced risk of failure. If a single caching device fails, fewer capacity
devices are affected.
– Potentially improved performance if you deploy multiple disk groups that
contain smaller flash caching devices.
However, when you configure multiple disk groups, the memory
consumption of the hosts increases.
All-flash configurations In all-flash configurations, vSAN uses the cache layer for write caching only.
The write cache must be able to handle high write activities. This approach
extends the life of capacity flash that might be less expensive and might have
lower write endurance.
Hybrid configurations The flash caching device must provide at least 10 percent of the anticipated
storage that virtual machines are expected to consume, not including replicas
such as mirrors (see the sizing sketch after this table). The Primary level of failures to tolerate attribute from the
VM storage policy does not impact the size of the cache.
If the read cache reservation is configured in the active VM storage policy, the
hosts in the vSAN cluster must have sufficient cache to satisfy the reservation
during a post-failure rebuild or maintenance operation.
If the available read cache is not sufficient to satisfy the reservation, the rebuild
or maintenance operation fails. Use read cache reservation only if you must
meet a specific, known performance requirement for a particular workload.
The use of snapshots consumes cache resources. If you plan to use several
snapshots, consider dedicating more cache than the conventional 10 percent
cache-to-consumed-capacity ratio.
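The 10 percent guideline for hybrid clusters can be illustrated with a short Python sketch. It assumes only the conventional cache-to-consumed-capacity ratio described above and a hypothetical consumed capacity; it does not account for the extra cache recommended when snapshots are used heavily.

CACHE_TO_CONSUMED_RATIO = 0.10  # at least 10 percent of anticipated consumed capacity

# Minimal sketch: cluster-wide flash cache sizing for a hybrid configuration.
def hybrid_cache_size_tib(anticipated_consumed_tib):
    """anticipated_consumed_tib: VM data the cluster is expected to consume,
    excluding protection replicas such as mirrors."""
    return anticipated_consumed_tib * CACHE_TO_CONSUMED_RATIO

# Hypothetical cluster expected to consume 20 TiB of VM data (single copy).
print(hybrid_cache_size_tib(20))  # 2.0 TiB of flash cache across the cluster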

Design Considerations for Flash Capacity Devices in vSAN


Plan the configuration of flash capacity devices for vSAN all-flash configurations to provide high performance and required
storage space, and to accommodate future growth.

Choosing Between PCIe or SSD Flash Devices


Choose SSD flash devices according to the requirements for performance, capacity, write endurance, and cost of the
storage.
• Compatibility. The model of the SSD devices must be listed in the vSAN section of the VMware Compatibility Guide.
• Performance. PCIe devices generally have faster performance than SATA devices.
• Capacity. The maximum capacity that is available for PCIe devices is generally greater than the maximum capacity
that is currently listed for SATA devices in the VMware Compatibility Guide.


• Write endurance. The write endurance of the SSD devices must meet the requirements for capacity or for cache in all-
flash configurations, and for cache in hybrid configurations.
For information about the write endurance requirements for all-flash and hybrid configurations, see the VMware
Design and Sizing Guide. For information about the write endurance class of SSD devices, see the vSAN section of the
VMware Compatibility Guide.
• Cost. PCIe devices generally have higher cost than SSD devices.

Flash Devices as vSAN Capacity


In all-flash configurations, vSAN does not use cache for read operations and does not apply the read-cache reservation
setting from the VM storage policy. For cache, you can use a small amount of more expensive flash that has high write
endurance. For capacity, you can use flash that is less expensive and has lower write endurance.
Plan a configuration of flash capacity devices by following these guidelines:
• For better performance of vSAN, use more disk groups of smaller flash capacity devices.
• For balanced performance and predictable behavior, use the same type and model of flash capacity devices.

Design Considerations for Magnetic Disks in vSAN


Plan the size and number of magnetic disks for capacity in hybrid configurations by following the requirements for storage
space and performance.

SAS and NL-SAS Magnetic Devices


Use SAS or NL-SAS magnetic devices by following the requirements for performance, capacity, and cost of the vSAN
storage.
• Compatibility. The model of the magnetic disk must be certified and listed in the vSAN section of the VMware
Compatibility Guide.
• Performance. SAS and NL-SAS devices provide faster performance than SATA magnetic devices.
• Capacity. The capacity of SAS or NL-SAS magnetic disks for vSAN is available in the vSAN section of the VMware
Compatibility Guide. Consider using a larger number of smaller devices instead of a smaller number of larger devices.
• Cost. SAS and NL-SAS devices can be expensive.

Magnetic Disks as vSAN Capacity


Plan a magnetic disk configuration by following these guidelines:
• For better performance of vSAN, use many magnetic disks that have smaller capacity.
You must have enough magnetic disks that provide adequate aggregated performance for transferring data between
cache and capacity. Using more small devices provides better performance than using fewer large devices. Using
multiple magnetic disk spindles can speed up the destaging process.
In environments that contain many virtual machines, the number of magnetic disks is also important for read
operations when data is not available in the read cache and vSAN reads it from the magnetic disk. In environments
that contain a small number of virtual machines, the disk number impacts read operations if the Number of disk
stripes per object in the active VM storage policy is greater than one.
• For balanced performance and predictable behavior, use the same type and model of magnetic disks in a vSAN
datastore.
• Dedicate a high enough number of magnetic disks to satisfy the value of the Failures to tolerate and the Number of
disk stripes per object attributes in the defined storage policies. For information about the VM storage policies for
vSAN, see Administering VMware vSAN.


Design Considerations for Storage Controllers in vSAN


Use storage controllers on the hosts of a vSAN cluster that best satisfy your requirements for performance and availability.
• Use storage controller models, and driver and firmware versions that are listed in the VMware Compatibility Guide.
Search for vSAN in the VMware Compatibility Guide.
• Use multiple storage controllers, if possible, to improve performance and to isolate a potential controller failure to only
a subset of disk groups.
• Use storage controllers that have the highest queue depths in the VMware Compatibility Guide. Using controllers with
high queue depth improves performance. For example, when vSAN is rebuilding components after a failure or when a
host enters maintenance mode.
• Use storage controllers in passthrough mode for best performance of vSAN. Storage controllers in RAID 0 mode
require higher configuration and maintenance efforts compared to storage controllers in passthrough mode.
• Deactivate caching on the controller, or set caching to 100 percent Read.

Designing and Sizing vSAN Hosts


Plan the configuration of the hosts in your vSAN cluster for best performance and availability.

Memory and CPU


Calculate the memory and the CPU requirements of the hosts in the vSAN cluster based on the following considerations.

Table 6: Sizing Memory and CPU of vSAN Hosts

Compute Resource Considerations

Memory • Memory per virtual machine


• Memory per host, based on the expected number of virtual
machines
• vSAN Original Storage Architecture must have at least 32
GB memory to support 5 disk groups per host and 7 capacity
devices per disk group.
• vSAN Express Storage Architecture requires at least 512 GB
memory.
Hosts that have 512-GB memory or less can boot from a USB,
SD, or SATADOM device. If the memory of the host is greater than
512 GB, boot the host from a SATADOM or disk device.
For more information, see the VMware knowledge base article at
https://kb.vmware.com/s/article/2113954
CPU • Sockets per host
• Cores per socket
• Number of vCPUs based on the expected number of virtual
machines
• vCPU-to-core ratio
NOTE
vSAN Express Storage Architecture requires at least
32 CPU cores per host.


Host Networking
Provide more bandwidth for vSAN traffic to improve performance.
• vSAN Original Storage Architecture
– If you plan to use hosts that have 1-GbE adapters, dedicate adapters for vSAN only. For all-flash configurations,
plan hosts that have dedicated or shared 10-GbE adapters.
– If you plan to use 10-GbE adapters, they can be shared with other traffic types for both hybrid and all-flash
configurations.
• vSAN Express Storage Architecture
– Plan to use hosts that have dedicated or shared 25-GbE adapters or better.
– Network adapters can be shared with other traffic types.
• If a network adapter is shared with other traffic types, use a vSphere Distributed Switch to isolate vSAN traffic by using
Network I/O Control and VLANs.
• Create a team of physical adapters to provide redundancy for vSAN traffic.

Disk Groups vs. Storage Pools


vSAN Original Storage Architecture uses disk groups to balance performance and reliability. If a flash cache or storage
controller stops responding and a disk group fails, vSAN rebuilds all components from another location in the cluster.
Using multiple disk groups, with each disk group providing a portion of datastore capacity, provides advantages but also
has disadvantages.
• Advantages of multiple disk groups
– Performance is improved because the datastore has more aggregated cache, and I/O operations are faster.
– Risk of failure is spread among multiple disk groups.
– If a disk group fails, vSAN rebuilds fewer components, so performance is improved.
• Disadvantages of multiple disk groups
– Costs are increased because two or more caching devices are required.
– More memory is required to handle more disk groups.
– Multiple storage controllers are required to reduce the risk of a single point of failure.
vSAN Express Storage Architecture uses storage pools, where each device provides both performance and capacity. Any
single device can fail without impacting the availability of data on any of the other devices in the storage pool. This design
reduces the size of a failure domain.

Drive Bays
For easy maintenance, consider hosts whose drive bays and PCIe slots are at the front of the server body.

Hot Plug and Swap of Devices


Consider the storage controller passthrough mode support for easy hot plugging or replacement of magnetic disks and
flash capacity devices on a host. If a controller works in RAID 0 mode, you must perform additional steps before the host
can discover the new drive.

Design Considerations for a vSAN Cluster


Design the configuration of hosts and management nodes for best availability and tolerance to consumption growth.


Sizing the vSAN Cluster for Failures to Tolerate


You configure the Failures to tolerate (FTT) attribute in the VM storage policies to handle host failures. The number
of hosts required for the cluster is calculated as follows: 2 * FTT + 1. For example, a policy with FTT set to 2 requires at
least five hosts. The more failures the cluster is configured to tolerate, the more capacity hosts are required.
If the cluster hosts are connected in rack servers, you can organize the hosts into fault domains to improve resilience
against issues such as top-of-rack switch failures and loss of server rack power. See Designing and Sizing vSAN Fault
Domains .

Limitations of a Two-Host or Three-Host Cluster Configuration


In a three-host configuration, you can tolerate only one host failure by setting the number of failures to tolerate to 1. vSAN
saves each of the two required replicas of virtual machine data on separate hosts. The witness object is on a third host.
Because of the small number of hosts in the cluster, the following limitations exist:
• When a host fails, vSAN cannot rebuild data on another host to protect against another failure.
• If a host must enter maintenance mode, vSAN cannot evacuate data from the host to maintain policy compliance.
While the host is in maintenance mode, data is exposed to a potential failure or inaccessibility if an additional failure
occurs.
You can use only the Ensure data accessibility data evacuation option. Ensure data accessibility guarantees that
the object remains available during data migration, although it might be at risk if another failure occurs. vSAN objects
on two-host or three-host clusters are not policy compliant. When the host exits maintenance mode, objects are rebuilt
to ensure policy compliance.
In any situation where two-host or three-host cluster has an inaccessible host or disk group, vSAN objects are at risk of
becoming inaccessible should another failure occur.

Balanced and Unbalanced Cluster Configuration


vSAN works best on hosts with uniform configurations, including storage configurations.
Using hosts with different configurations has the following disadvantages in a vSAN cluster:
• Reduced predictability of storage performance because vSAN does not store the same number of components on
each host.
• Different maintenance procedures.
• Reduced performance on hosts in the cluster that have smaller or different types of cache devices.

Deploying vCenter Server on vSAN


If the vCenter Server becomes unavailable, vSAN continues to operate normally and virtual machines continue to run.
If vCenter Server is deployed on the vSAN datastore, and a problem occurs in the vSAN cluster, you can use a Web
browser to access each ESXi host and monitor vSAN through the vSphere Host Client. vSAN health information is visible
in the Host Client, and also through esxcli commands.
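For example, if vCenter Server is temporarily unavailable, you can check cluster membership and health directly from the ESXi Shell of a cluster member. The following commands are a minimal illustration; the available namespaces and output fields vary by ESXi release.
esxcli vsan cluster get          # show cluster membership and the state of the local node
esxcli vsan health cluster list  # list the vSAN health check results as seen by this host
esxcli vsan storage list         # list the local devices that vSAN has claimed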

Designing the vSAN Network


Consider networking features that can provide availability, security, and bandwidth guarantee in a vSAN cluster.
For details about the vSAN network configuration, see the Network Design Guide.

Networking Failover and Load Balancing


vSAN uses the teaming and failover policy that is configured on the backing virtual switch for network redundancy only.
vSAN does not use NIC teaming for load balancing.


If you plan to configure a NIC team for availability, consider these failover configurations.

Teaming Algorithm Failover Configuration of the Adapters in the Team

Route based on originating virtual port Active/Passive


Route based on IP hash Active/Active with static EtherChannel for the standard switch and
LACP port channel for the distributed switch
Route based on physical network adapter load Active/Active

vSAN supports IP-hash load balancing, but cannot guarantee improvement in performance for all configurations. You can
benefit from IP hash when vSAN is one of many consumers of the physical adapter. In this case, IP hash performs load balancing. If vSAN
is the only consumer, you might observe no improvement. This behavior specifically applies to 1-GbE environments. For
example, if you use four 1-GbE physical adapters with IP hash for vSAN, you might not be able to use more than 1 Gbps.
This behavior also applies to all NIC teaming policies that VMware supports.
vSAN does not support multiple VMkernel adapters on the same subnet. You can use different VMkernel adapters on
different subnets, such as another VLAN or separate physical fabric. Providing availability by using several VMkernel
adapters has configuration costs that involve vSphere and the network infrastructure. You can increase network
availability by teaming physical network adapters.
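For example, to team two physical adapters in an active/standby arrangement on a standard switch, you can set an explicit failover order. This is a sketch only; the switch and uplink names (vSwitch1, vmnic2, vmnic3) are placeholders, and on a vSphere Distributed Switch you configure the equivalent policy in the vSphere Client.
esxcli network vswitch standard policy failover set -v vSwitch1 -a vmnic2 -s vmnic3
esxcli network vswitch standard policy failover get -v vSwitch1   # verify the active and standby uplinks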

Using Unicast in vSAN Network


In vSAN 6.6 and later releases, multicast is not required on the physical switches that support the vSAN cluster. You can
design a simple unicast network for vSAN. Earlier releases of vSAN rely on multicast to enable heartbeat and to exchange
metadata between hosts in the cluster. If some hosts in your vSAN cluster are running earlier versions of software, a
multicast network is still required. For more information about using multicast in a vSAN cluster, refer to an earlier version
of Administering VMware vSAN.
NOTE
The following configuration is not supported: vCenter Server deployed on a vSAN 6.6 cluster that is using IP
addresses from DHCP without reservations. You can use DHCP with reservations, because the assigned IP
addresses are bound to the MAC addresses of VMkernel ports.

Using RDMA
vSAN 7.0 Update 2 and later releases can use Remote Direct Memory Access (RDMA). RDMA typically has lower CPU
utilization and less I/O latency. If your hosts support the RoCE v2 protocol, you can enable RDMA through the vSAN
network service in vSphere Client.
Consider the following guidelines when designing vSAN over RDMA:
• Each vSAN host must have a vSAN certified RDMA-capable NIC, as listed in the vSAN section of the VMware
Compatibility Guide. Use only the same model network adapters from the same vendor on each end of the connection.
Configure the DCBx mode to IEEE.
• All hosts must support RDMA. If any host loses RDMA support, the entire vSAN cluster switches to TCP.
• The network must be lossless. Configure network switches to use Data Center Bridging with Priority Flow Control.
Configure a lossless traffic class for vSAN traffic marked at priority level 3.
• vSAN with RDMA does not support LACP or IP-hash-based NIC teaming. vSAN with RDMA does support NIC failover.
• All hosts must be on the same subnet. vSAN with RDMA supports up to 32 hosts.
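Before you enable the service, you can confirm that a host exposes RDMA-capable adapters. The following command is a quick verification sketch only; enabling RDMA for vSAN is done through the vSAN network service in the vSphere Client.
esxcli rdma device list   # list the RDMA devices that ESXi detects and their associated uplinks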


Allocating Bandwidth for vSAN by Using Network I/O Control


vSAN traffic can share physical network adapters with other system traffic types, such as vSphere vMotion traffic, vSphere
HA traffic, and virtual machine traffic. To guarantee the amount of bandwidth required for vSAN, use vSphere Network I/O
Control in the vSphere Distributed Switch.
In vSphere Network I/O Control, you can configure reservation and shares for the vSAN outgoing traffic.
• Set a reservation so that Network I/O Control guarantees that minimum bandwidth is available on the physical adapter
for vSAN.
• Set shares so that when the physical adapter assigned for vSAN becomes saturated, certain bandwidth is available
to vSAN and to prevent vSAN from consuming the entire capacity of the physical adapter during rebuild and
synchronization operations. For example, the physical adapter might become saturated when another physical adapter
in the team fails and all traffic in the port group is transferred to the other adapters in the team.
For example, on a 10-GbE physical adapter that handles traffic for vSAN, vSphere vMotion, and virtual machines, you can
configure certain bandwidth and shares.

Table 7: Example Network I/O Control Configuration for a Physical Adapter That Handles vSAN

Traffic Type Reservation, Gbps Shares

vSAN 1 100
vSphere vMotion 0.5 70
Virtual machine 0.5 30

If the network adapter becomes saturated, Network I/O Control allocates 5 Gbps to vSAN on the physical adapter, because vSAN holds 100 of the 200 total shares (50 percent of the 10-GbE bandwidth).
For information about using vSphere Network I/O Control to configure bandwidth allocation for vSAN traffic, see the
vSphere Networking documentation.

Marking vSAN Traffic


Priority tagging is a mechanism to indicate to the connected network devices that vSAN traffic has high Quality of Service
(QoS) demands. You can assign vSAN traffic to a certain class and mark the traffic accordingly with a Class of Service
(CoS) value from 0 (low priority) to 7 (high priority). Use the traffic filtering and marking policy of vSphere Distributed
Switch to configure priority levels.

Segmenting vSAN Traffic in a VLAN


Consider isolating vSAN traffic in a VLAN for enhanced security and performance, especially if you share the capacity of
the backing physical adapter among several traffic types.

Jumbo Frames
If you plan to use jumbo frames with vSAN to improve CPU performance, verify that jumbo frames are enabled on all
network devices and hosts in the cluster.
By default, the TCP segmentation offload (TSO) and large receive offload (LRO) features are enabled on ESXi. Consider
whether using jumbo frames improves the performance enough to justify the cost of enabling them on all nodes on the
network.
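For example, on a host that uses a standard switch, you can set an MTU of 9000 and then confirm end-to-end connectivity with unfragmented jumbo packets. The switch name, VMkernel adapter, and target address below are placeholders; on a distributed switch, set the MTU in the switch settings instead.
esxcli network vswitch standard set -v vSwitch1 -m 9000   # raise the MTU on the standard switch
esxcli network ip interface set -i vmk2 -m 9000           # raise the MTU on the vSAN VMkernel adapter
vmkping -I vmk2 -d -s 8972 192.168.50.12                  # test with the don't-fragment flag and a near-9000-byte payload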

Creating Static Routes for vSAN Networking


You might need to create static routes in your vSAN environment.


In traditional configurations, where vSphere uses a single default gateway, all routed traffic attempts to reach its
destination through this gateway.
NOTE
vSAN 7.0 and later enables you to override the default gateway for the vSAN VMkernel adapter on each host,
and configure a gateway address for the vSAN network.
However, certain vSAN deployments might require static routing. For example, deployments where the witness is on
a different network, or the vSAN stretched cluster deployment, where both the data sites and the witness host are on
different networks.
To configure static routing on your ESXi hosts, use the esxcli command:
esxcli network ip route ipv4 add -g gateway-to-use -n remote-network
remote-network is the remote network that your host must access, and gateway-to-use is the interface to use when
traffic is sent to the remote network.
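For example, if the witness network 192.168.110.0/24 is reachable through a gateway at 172.16.10.1 on the vSAN VLAN, the commands might look like the following. The addresses are placeholders for illustration only.
esxcli network ip route ipv4 add -g 172.16.10.1 -n 192.168.110.0/24
esxcli network ip route ipv4 list   # verify that the static route is present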
For information about network design for vSAN stretched clusters, see Administering VMware vSAN.

Best Practices for vSAN Networking


Consider networking best practices for vSAN to improve performance and throughput.
• vSAN OSA: For hybrid configurations, dedicate at least 1 GbE physical network adapter. Place vSAN traffic on a
dedicated or shared 10 GbE physical adapter for best networking performance. For all-flash configurations, use a
dedicated or shared 10 GbE physical network adapter.
• vSAN ESA: Use a dedicated or shared 25 GbE physical network adapter.
• Provision one additional physical NIC as a failover NIC.
• If you use a shared network adapter, place the vSAN traffic on a distributed switch and configure Network I/O Control
to guarantee bandwidth to vSAN.

Designing and Sizing vSAN Fault Domains


vSAN fault domains can spread redundancy components across the servers in separate computing racks. In this way, you
can protect the environment from a rack-level failure such as loss of power or connectivity.

Fault Domain Constructs


vSAN requires at least three fault domains to support Failures to tolerate (FTT) of 1. Each fault domain consists of one
or more hosts. Fault domain definitions must acknowledge physical hardware constructs that might represent a potential
zone of failure, for example, an individual computing rack enclosure.
If possible, use at least four fault domains. Three fault domains do not support certain data evacuation modes, and vSAN
is unable to reprotect data after a failure. In this case, you need an additional fault domain with capacity for rebuilding,
which you cannot provide with only three fault domains.
If fault domains are enabled, vSAN applies the active virtual machine storage policy to the fault domains instead of the
individual hosts.
Calculate the number of fault domains in a cluster based on the FTT attribute from the storage policies that you plan to
assign to virtual machines.
number of fault domains = 2 * FTT + 1

If a host is not a member of a fault domain, vSAN interprets it as a stand-alone fault domain.


Using Fault Domains Against Failures of Several Hosts


Consider a cluster that contains four server racks, each with two hosts. If the Failures to tolerate is set to one and
fault domains are not enabled, vSAN might store both replicas of an object with hosts in the same rack enclosure. In
this way, applications might be exposed to a potential data loss on a rack-level failure. When you configure hosts that
could potentially fail together into separate fault domains, vSAN ensures that each protection component (replicas and
witnesses) is placed in a separate fault domain.
If you add hosts and capacity, you can use the existing fault domain configuration or you can define fault domains.
For balanced storage load and fault tolerance when using fault domains, consider the following guidelines:
• Provide enough fault domains to satisfy the Failures to tolerate that are configured in the storage policies.
Define at least three fault domains. Define a minimum of four domains for best protection.
• Assign the same number of hosts to each fault domain.
• Use hosts that have uniform configurations.
• Dedicate one fault domain of free capacity for rebuilding data after a failure, if possible.

Using Boot Devices and vSAN


Starting an ESXi installation that is a part of a vSAN cluster from a flash device imposes certain restrictions.
When you boot a vSAN host from a USB/SD device, you must use a high-quality USB or SD flash drive of 4 GB or larger.
When you boot a vSAN host from a SATADOM device, you must use a single-level cell (SLC) device. The size of the boot
device must be at least 16 GB.
During installation, the ESXi installer creates a coredump partition on the boot device. The default size of the coredump
partition satisfies most installation requirements.
• If the ESXi host has 512 GB of memory or less, you can boot the host from a USB, SD, or SATADOM device.
• If the ESXi host has more than 512 GB of memory, consider the following guidelines.
– You can boot the host from a SATADOM or disk device with a size of at least 16 GB. When you use a SATADOM
device, use a single-level cell (SLC) device.
– If you are using vSAN 6.5 or later, you must resize the coredump partition on ESXi hosts to boot from USB/SD
devices.
Hosts that boot from a disk have a local VMFS datastore. If a disk with VMFS runs virtual machines, use a separate disk
for the ESXi boot that is not used for vSAN. In this case, you need separate controllers.

Log Information and Boot Devices in vSAN


When you boot ESXi from a USB or SD device, log information and stack traces are lost on host reboot. They are lost
because the scratch partition is on a RAM drive. Use persistent storage for logs, stack traces, and memory dumps.
Do not store log information on the vSAN datastore. This configuration is not supported because a failure in the vSAN
cluster could impact the accessibility of log information.
Consider the following options for persistent log storage:
• Use a storage device that is not used for vSAN and is formatted with VMFS or NFS.
• Configure the ESXi Dump Collector and vSphere Syslog Collector on the host to send memory dumps and system logs
to vCenter Server.
For information about setting up the scratch partition with a persistent location, see the vCenter Server Installation and
Setup documentation.


Persistent Logging in a vSAN Cluster


Provide storage for persistence of the logs from the hosts in the vSAN cluster.
If you install ESXi on a USB or SD device and you allocate local storage to vSAN, you might not have enough local
storage or datastore space left for persistent logging.
To avoid potential loss of log information, configure the ESXi Dump Collector and vSphere Syslog Collector to redirect
ESXi memory dumps and system logs to a network server.
For more information about configuring the vSphere Syslog Collector, see http://kb.vmware.com/kb/2021652.
For more information about configuring the ESXi Dump Collector, see https://kb.vmware.com/s/article/2002954.
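As a minimal sketch, the following commands send the host system logs to a remote syslog target and configure the network coredump settings used by the ESXi Dump Collector. The host name, IP address, VMkernel adapter, and ports are placeholders; adjust them for your environment.
esxcli system syslog config set --loghost='udp://loghost.example.com:514'
esxcli system syslog reload
esxcli system coredump network set --interface-name vmk0 --server-ipv4 192.168.10.25 --server-port 6500
esxcli system coredump network set --enable true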

Preparing a New or Existing Cluster for vSAN


Before you deploy a vSAN cluster and start using it as virtual machine storage, you must provide the infrastructure that is
required for correct operation of vSAN.

Preparing Storage
Provide enough disk space for vSAN and for the virtualized workloads that use the vSAN datastore.

Verify the Compatibility of Storage Devices


Consult the VMware Compatibility Guide to verify that your storage devices, drivers, and firmware are compatible with
vSAN.
You can choose from several options for vSAN compatibility.
• Use a vSAN ReadyNode server, a physical server that OEM vendors and VMware validate for vSAN compatibility.
• Assemble a node by selecting individual components from validated device models.

VMware Compatibility Guide Section Component Type for Verification

Systems Physical server that runs ESXi.


vSAN • Magnetic disk SAS model for hybrid configurations.
• Flash device model that is listed in the VMware Compatibility
Guide. Certain models of PCIe flash devices can also work
with vSAN. Consider also write endurance and performance
class.
• Storage controller model that supports passthrough.
vSAN can work with storage controllers that are configured
for RAID 0 mode if each storage device is represented as an
individual RAID 0 group.

Preparing Storage Devices


Use flash devices and magnetic disks based on the requirements for vSAN.
Verify that the cluster has the capacity to accommodate anticipated virtual machine consumption and the Failures to
tolerate in the storage policy for the virtual machines.


The storage devices must meet the following requirements so that vSAN can claim them:
• The storage devices are local to the ESXi hosts. vSAN cannot claim remote devices.
• The storage devices do not have any existing partition information.
• On the same host, you cannot have both all-flash and hybrid disk groups.

Prepare Devices for Disk Groups


Each disk group provides one flash caching device and at least one magnetic disk or one flash capacity device. For hybrid
clusters, the capacity of the flash caching device must be at least 10 percent of the anticipated consumed storage on the
capacity device, without the protection copies.
vSAN requires at least one disk group on a host that contributes storage to a cluster that consists of at least three hosts.
Use hosts that have uniform configuration for best performance of vSAN.

Raw and Usable Capacity


Provide raw storage capacity that is greater than the capacity for virtual machines to handle certain cases.
• Do not include the size of the flash caching devices as capacity. These devices do not contribute storage and are used
as cache unless you have added flash devices for storage.
• Provide enough space to handle the Failures to tolerate (FTT) value in a virtual machine storage policy. An FTT value
that is greater than 0 extends the device footprint. If the FTT is set to 1, the footprint is double. If the FTT is set to 2, the
footprint is triple, and so on.
• Verify whether the vSAN datastore has enough space for an operation by examining the space on the individual hosts
rather than on the consolidated vSAN datastore object. For example, when you evacuate a host, all free space in the
datastore might be on the host that you are evacuating. The cluster is not able to accommodate the evacuation to
another host.
• Provide enough space to prevent the datastore from running out of capacity, if workloads that have thinly provisioned
storage start consuming a large amount of storage.
• Verify that the physical storage can accommodate the reprotection and maintenance mode of the hosts in the vSAN
cluster.
• Consider the vSAN overhead to the usable storage space.
– On-disk format version 3.0 and later adds an extra overhead, typically no more than 1-2 percent capacity per
device. Deduplication and compression with software checksum enabled require extra overhead of approximately
6.2 percent capacity per device.
For more information about planning the capacity of vSAN datastores, see the VMware Design and Sizing Guide.

vSAN Policy Impact on Capacity


The vSAN storage policy for virtual machines affects the capacity devices in several ways.


Table 8: vSAN VM Policy and Raw Capacity

Aspects of Policy Influence Description

Policy changes • The Failures to tolerate (FTT) influences the physical storage space
that you must supply for virtual machines. The greater the FTT is for
higher availability, the more space you must provide.
When FTT is set to 1, it imposes two replicas of the VMDK file of a
virtual machine. With FTT set to 1, a VMDK file that is 50 GB requires
100-GB space on different hosts. If the FTT is changed to 2, you must
have enough space to support three replicas of the VMDK across the
hosts in the cluster, or 150 GB.
• Some policy changes, such as a new number of disk stripes per object,
require temporary resources. vSAN recreates the objects affected by
the change. For a certain time, the physical storage must accommodate
the old and new objects.
Available space for reprotecting or maintenance mode When you place a host in maintenance mode or you clone a virtual
machine, the datastore might not be able to evacuate the virtual machine
objects, although the vSAN datastore indicates that enough space is
available. This lack of space can occur if the free space is on the host that
is being placed in maintenance mode.

Preparing Storage Controllers


Configure the storage controller on each host according to the requirements of vSAN.
Verify that the storage controllers on the vSAN hosts satisfy certain requirements for mode, driver, and firmware version,
queue depth, caching, and advanced features.

Table 9: Examining Storage Controller Configuration for vSAN

Storage Controller Feature Storage Controller Requirement

Required mode • Review the vSAN requirements in the VMware Compatibility Guide for the
required mode, passthrough or RAID 0, of the controller.
• If both passthrough and RAID 0 modes are supported, configure passthrough
mode instead of RAID0. RAID 0 introduces complexity for disk replacement.
RAID mode • In the case of RAID 0, create one RAID volume per physical disk device.
• Do not enable a RAID mode other than the mode listed in the VMware
Compatibility Guide.
• Do not enable controller spanning.
Driver and firmware version • Use the latest driver and firmware version for the controller according to
VMware Compatibility Guide.
• If you use the in-box controller driver, verify that the driver is certified for vSAN.
OEM ESXi releases might contain drivers that are not certified and listed in the
VMware Compatibility Guide.
Queue depth Verify that the queue depth of the controller is 256 or higher. Higher queue depth
provides improved performance.
Cache Deactivate the storage controller cache, or set it to 100 percent read if disabling
cache is not possible.
Advanced features Deactivate advanced features, for example, HP SSD Smart Path.


Mark Flash Devices as Capacity Using ESXCLI


You can manually mark the flash devices on each host as capacity devices using esxcli.
Verify that you are using vSAN 6.5 or later.
1. To learn the name of the flash device that you want to mark as capacity, run the following command on each host.
a) In the ESXi Shell, run the esxcli storage core device list command.
b) Locate the device name at the top of the command output and write the name down.
The esxcli storage core device list command lists all device information identified by ESXi.
The esxcli vsan storage tag command that you use in the following steps takes these options:

Table 10: Command Options

Options Description

-d|--disk=str The name of the device that you want to tag as a capacity device. For example, mpx.vmhba1:C0:T4:L0
-t|--tag=str Specify the tag that you want to add or remove. For example, the capacityFlash tag is used for marking a flash device for capacity.


2. In the output, verify that the Is SSD attribute for the device is true.
3. To tag a flash device as capacity, run the esxcli vsan storage tag add -d <device name> -t capacityFlash
command.
For example, the esxcli vsan storage tag add -t capacityFlash -d mpx.vmhba1:C0:T4:L0 command, where
mpx.vmhba1:C0:T4:L0 is the device name.

4. Verify whether the flash device is marked as capacity.


a) In the output, identify whether the IsCapacityFlash attribute for the device is set to 1.
Command Output
You can run the vdq -q -d <device name> command to verify the IsCapacityFlash attribute. For example,
running the vdq -q -d mpx.vmhba1:C0:T4:L0 command, returns the following output.
[
{
"Name" : "mpx.vmhba1:C0:T4:L0",
"VSANUUID" : "",
"State" : "Eligible for use by VSAN",
"ChecksumSupport": "0",
"Reason" : "None",
"IsSSD" : "1",
"IsCapacityFlash": "1",
"IsPDL" : "0",
},
]


Untag Flash Devices Used as Capacity Using ESXCLI


You can untag flash devices that are used as capacity devices, so that they are available for caching.
1. To untag a flash device marked as capacity, run the esxcli vsan storage tag remove -d <device name> -
t capacityFlash command. For example, the esxcli vsan storage tag remove -t capacityFlash -d
mpx.vmhba1:C0:T4:L0 command, where mpx.vmhba1:C0:T4:L0 is the device name.

2. Verify whether the flash device is untagged.


a) In the output, identify whether the IsCapacityFlash attribute for the device is set to 0.
Command Output
You can run the vdq -q -d <device name> command to verify the IsCapacityFlash attribute. For example,
running the vdq -q -d mpx.vmhba1:C0:T4:L0 command, returns the following output.

[
{
"Name" : "mpx.vmhba1:C0:T4:L0",
"VSANUUID" : "",
"State" : "Eligible for use by VSAN",
"ChecksumSupport": "0",
"Reason" : "None",
"IsSSD" : "1",
"IsCapacityFlash": "0",
"IsPDL" : "0",
},
]

Mark Flash Devices as Capacity Using RVC


Run the vsan.host_claim_disks_differently RVC command to mark storage devices as flash, capacity flash, or
magnetic disk (HDD).
• Verify that you are using vSAN version 6.5 or later.
• Verify that SSH is enabled on the vCenter Server.
You can use the RVC tool to tag flash devices as capacity devices either individually, or in a batch by specifying the model
of the device. When you want to tag flash devices as capacity devices, you can include them in all-flash disk groups.
NOTE
The vsan.host_claim_disks_differently command does not check the device type before tagging them.
The command tags any device that you append with the capacity_flash command option, including the
magnetic disks and devices that are already in use. Make sure that you verify the device status before tagging.
For information about the RVC commands for vSAN management, see the RVC Command Reference Guide.
1. Open an SSH connection to the vCenter Server.
2. Log in to the vCenter Server by using a local account that has administrator privilege.
3. Start the RVC by running the following command.
rvc local_user_name@target_vCenter_Server

For example, to use the same vCenter Server to mark flash devices for capacity as a user root, run the following
command:
rvc root@localhost


4. Enter the password for the user name.


5. Navigate to the vcenter_server/data_center/computers/cluster/hosts directory in the vSphere
infrastructure.
6. Run the vsan.host_claim_disks_differently command with the --claim-type capacity_flash --model
model_name options to mark all flash devices of the same model as capacity on all hosts in the cluster.
vsan.host_claim_disks_differently --claim-type capacity_flash --model model_name *

Enable vSAN on the cluster and claim capacity devices.

Providing Memory for vSAN


Provision hosts with memory to support the maximum number of devices and disk groups that you intend to use for vSAN.
To satisfy the case of the maximum number of devices and disk groups, you must provision hosts with 32 GB of memory
for system operations. For information about the maximum device configuration, refer to the vSphere Configuration
Maximums at https://configmax.esp.vmware.com/home.

Preparing Your Hosts for vSAN


As a part of the preparation for enabling vSAN, review the requirements and recommendations about the configuration of
hosts for the cluster.
• Verify that the storage devices on the hosts, and the driver and firmware versions for them, are listed in the vSAN
section of the VMware Compatibility Guide.
• Make sure that a minimum of three hosts contribute storage to the vSAN datastore.
• For maintenance and remediation operations on failure, add at least four hosts to the cluster.
• Designate hosts that have uniform configuration for best storage balance in the cluster.
• Do not add hosts that have only compute resources to the cluster to avoid unbalanced distribution of storage
components on the hosts that contribute storage. Virtual machines that require a large amount of storage space and run
on compute-only hosts might store a large number of components on individual capacity hosts. As a result, the storage
performance in the cluster might be lower.
• Do not configure aggressive CPU power management policies on the hosts for saving power. Certain applications
that are sensitive to CPU speed latency might have low performance. For information about CPU power management
policies, see the vSphere Resource Management documentation.
• If your cluster contains blade servers, consider extending the capacity of the datastore with an external storage
enclosure that is connected to the blade servers. Make sure the storage enclosure is listed in the vSAN section of the
VMware Compatibility Guide.
• Consider the configuration of the workloads that you place on a hybrid or all-flash disk configuration.
– For high levels of predictable performance, provide a cluster of all-flash disk groups.
– For balance between performance and cost, provide a cluster of hybrid disk groups.

vSAN and vCenter Server Compatibility


Synchronize the versions of vCenter Server and ESXi to avoid potential faults caused by mismatched software.
For best integration between vSAN components on vCenter Server and ESXi, deploy the latest version of the two vSphere
components. See the vCenter Server Installation and Setup and vSphere Upgrade documentation.

Configuring the vSAN Network


Before you enable vSAN on a cluster of ESXi hosts, you must provide the necessary network infrastructure to carry vSAN
communication.


vSAN provides a distributed storage solution, which implies exchanging data across the ESXi hosts that participate in the
cluster. Preparing the network for installing vSAN includes certain configuration aspects.
For information about network design guidelines, see Designing the vSAN Network.

Placing Hosts in the Same Subnet


Hosts must be connected in the same subnet for best networking performance. In vSAN 6.0 and later, you can also
connect hosts in the same Layer 3 network if necessary.

Dedicating Network Bandwidth on a Physical Adapter


Allocate at least 1 Gbps bandwidth for vSAN. You might use one of the following configuration options:
• vSAN OSA: Dedicate 1 GbE physical adapters for a hybrid host configuration, or use dedicated or shared 10-GbE
physical adapters if possible. Use dedicated or shared 10 GbE physical adapters for all-flash configurations.
• vSAN ESA: Use dedicated or shared 25 GbE physical adapters.
• Direct vSAN traffic on a physical adapter that handles other system traffic and use vSphere Network I/O Control on a
distributed switch to reserve bandwidth for vSAN.

Configuring a Port Group on a Virtual Switch


Configure a port group on a virtual switch for vSAN.
• Assign the physical adapter for vSAN to the port group as an active uplink.
When you need a NIC team for network availability, select a teaming algorithm based on the connection of the physical
adapters to the switch.
• If designed, assign vSAN traffic to a VLAN by enabling tagging in the virtual switch.

Examining the Firewall on a Host for vSAN


vSAN sends messages on certain ports on each host in the cluster. Verify that the host firewalls allow traffic on these
ports.
When you enable vSAN on a cluster, all required ports are added to ESXi firewall rules and configured automatically.
There is no need for an administrator to open any firewall ports or enable any firewall services manually.
You can view open ports for incoming and outgoing connections. Select the ESXi host, and click Configure > Security
Profile.
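You can also review the rulesets from the ESXi Shell and filter for vSAN-related entries. The ruleset names (for example, rdt, cmmds, vsanvp) vary between releases, so treat the filter pattern as illustrative.
esxcli network firewall ruleset list | grep -i -E 'vsan|rdt|cmmds'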

Creating a Single Site vSAN Cluster


You can enable vSAN when you create a vSphere cluster, or you can enable vSAN on an existing cluster.

Characteristics of a vSAN Cluster


Before working on a vSAN environment, be aware of the characteristics of a vSAN cluster.
A vSAN cluster includes the following characteristics:
• You can have multiple vSAN clusters for each vCenter Server instance. You can use a single vCenter Server to
manage more than one vSAN cluster.
• vSAN consumes all devices, including flash cache and capacity devices, and does not share devices with other
features.
• vSAN clusters can include hosts with or without capacity devices. The minimum requirement is three hosts with
capacity devices. For best results, create a vSAN cluster with uniformly configured hosts.
• If a host contributes capacity, it must have at least one flash cache device and one capacity device.


• In hybrid clusters, the magnetic disks are used for capacity and flash devices for read and write cache. vSAN allocates
70 percent of all available cache for read cache and 30 percent of available cache for the write buffer. In a hybrid
configuration, the flash devices serve as a read cache and a write buffer.
• In all-flash clusters, one designated flash device is used as a write cache, and additional flash devices are used for
capacity. All read requests come directly from the flash pool capacity.
• Only local or direct-attached capacity devices can participate in a vSAN cluster. vSAN cannot consume other external
storage, such as SAN or NAS, attached to cluster.
To learn about the characteristics of a vSAN cluster configured through Quickstart, see Using Quickstart to Configure and
Expand a vSAN Cluster.
For best practices about designing and sizing a vSAN cluster, see Designing and Sizing a vSAN Cluster.

Before Creating a vSAN Cluster


This topic provides a checklist of software and hardware requirements for creating a vSAN cluster. You can also use the
checklist to verify that the cluster meets the guidelines and basic requirements.

Requirements for vSAN Cluster


Before you get started, verify specific models of hardware devices, and specific versions of drivers and firmware in the
VMware Compatibility Guide website at http://www.vmware.com/resources/compatibility/search.php. The following table
lists the key software and hardware requirements supported by vSAN.
CAUTION
Using uncertified software and hardware components, drivers, controllers, and firmware might cause
unexpected data loss and performance issues.


Table 11: vSAN Cluster Requirements

Requirements Description

ESXi hosts • Verify that you are using the latest version of ESXi on your hosts.
• Verify that there are at least three ESXi hosts with supported storage configurations
available to be assigned to the vSAN cluster. For best results, configure the vSAN
cluster with four or more hosts.
Memory • Verify that each host has a minimum of 32 GB of memory.
• For larger configurations and better performance, you must have a minimum of 32 GB
of memory in the cluster. See Designing and Sizing vSAN Hosts.

Storage I/O controllers, drivers, firmware • Verify that the storage I/O controllers, drivers, and firmware versions are certified and
listed in the VCG website at http://www.vmware.com/resources/compatibility/search.php.
• Verify that the controller is configured for passthrough or RAID 0 mode.
• Verify that the controller cache and advanced features are deactivated. If you cannot
deactivate the cache, you must set the read cache to 100 percent.
• Verify that you are using controllers with higher queue depths. Using controllers with
queue depths less than 256 can significantly impact the performance of your virtual
machines during maintenance and failure.

Cache and capacity • For vSAN Original Storage Architecture, verify that vSAN hosts contributing storage to
the cluster have at least one cache and one capacity device. vSAN requires exclusive
access to the local cache and capacity devices of the hosts in the vSAN cluster.
They cannot share these devices with other uses, such as Virtual Flash File System
(VFFS), VMFS partitions, or an ESXi boot partition.
• For vSAN Express Storage Architecture, verify that hosts contributing storage have
compatible flash storage devices.
• For best results, create a vSAN cluster with uniformly configured hosts.

Network connectivity • Verify that each host is configured with at least one network adapter.
• For hybrid configurations, verify that vSAN hosts have a minimum dedicated
bandwidth of 1 GbE.
• For all-flash configurations, verify that vSAN hosts have a minimum bandwidth of 10
GbE.
For best practices and considerations about designing the vSAN network, see Designing
the vSAN Network and Networking Requirements for Virtual SAN.
vSAN and vCenter Server compatibility Verify that you are using the latest version of the vCenter Server.
License key • Verify that you have a valid vSAN license key.
• To use the all-flash feature, your license must support that capability.
• To use advanced features, such as vSAN stretched clusters or deduplication and
compression, your license must support those features.
• Verify that the amount of license capacity that you plan on using equals the total
number of CPUs in the hosts participating in the vSAN cluster. Do not provide
license capacity only for hosts providing capacity to the cluster. For information about
licensing for vSAN, see the vCenter Server and Host Management documentation.

For detailed information about vSAN cluster requirements, see Requirements for Enabling vSAN.
For in-depth information about designing and sizing the vSAN cluster, see the VMware vSAN Design and Sizing Guide.

Using Quickstart to Configure and Expand a vSAN Cluster


You can use the Quickstart workflow to quickly create, configure, and expand a vSAN cluster.


Quickstart consolidates the workflow to enable you to quickly configure a new vSAN cluster that uses recommended
default settings for common functions such as networking, storage, and services. Quickstart groups common tasks and
uses configuration wizards that guide you through the process. Once you enter the required information on each wizard,
Quickstart configures the cluster based on your input.

Quickstart uses the vSAN health service to validate the configuration and help you correct configuration issues. Each
Quickstart card displays a configuration checklist. You can click a green message, yellow warning, or red failure to display
details.
Hosts added to a Quickstart cluster are automatically configured to match the cluster settings. The ESXi software and
patch levels of new hosts must match those in the cluster. Hosts cannot have any networking or vSAN configuration


when added to a cluster using the Quickstart workflow. For more information about adding hosts, see "Expanding a vSAN
Cluster" in Administering VMware vSAN.
NOTE
If you modify any network settings outside of Quickstart, you hamper your ability to add and configure more
hosts to the cluster using the Quickstart workflow.

Characteristics of a Quickstart Cluster


A vSAN cluster configured using Quickstart has the following characteristics.
• Hosts must have ESXi 6.0 Update 2 or later.
• Hosts all have a similar configuration, including network settings. Quickstart modifies network settings on each host to
match the cluster requirements.
• Cluster configuration is based on recommended default settings for networking and services.
• Licenses are not assigned through the Quickstart workflow. You must manually assign a license to your cluster.

Managing and Expanding a Quickstart Cluster


Once you complete the Quickstart workflow, you can manage the cluster through vCenter Server, using the vSphere
Client or command-line interface.
You can use the Quickstart workflow to add hosts to the cluster and claim additional disks. But once the cluster is
configured through Quickstart, you cannot use Quickstart to modify the cluster configuration.
The Quickstart workflow is available only through the HTML5-based vSphere Client.

Skipping Quickstart
You can use the Skip Quickstart button to exit the Quickstart workflow, and continue configuring the cluster and its hosts
manually. You can add new hosts individually, and manually configure those hosts. Once skipped, you cannot restore the
Quickstart workflow for the cluster.
The Quickstart workflow is designed for new clusters. When you upgrade an existing vSAN cluster to 6.7 Update 1 or
later, the Quickstart workflow appears. Skip the Quickstart workflow and continue to manage the cluster through vCenter
Server.

Use Quickstart to Configure a vSAN Cluster


You can use the Quickstart workflow to quickly configure a vSAN cluster.
• Verify that hosts are running ESXi 6.0 Update 2 or later.
• Verify that ESXi hosts in the cluster do not have any existing vSAN or networking configuration.


NOTE
If you perform network configuration through Quickstart, then modify those parameters from outside of
Quickstart, you cannot use Quickstart to add or configure additional hosts.
1. Navigate to the cluster in the vSphere Client.
2. Click the Configure tab, and select Configuration > Quickstart.
3. (optional) On the Cluster basics card, click Edit to open the Cluster basics wizard.
a) (Optional) Enter a cluster name.
b) Select basic services, such as DRS, vSphere HA, and vSAN.
Check Enable vSAN ESA to use vSAN Express Storage Architecture. vSAN Express Storage Architecture is
optimized for high-performance flash storage devices that provide greater performance and efficiency.
c) Click OK or Finish.
4. On the Add hosts card, click Add to open the Add hosts wizard.
a) On the Add hosts page, enter information for new hosts, or click Existing hosts and select from hosts listed in the
inventory.
b) On the Host summary page, verify the host settings.
c) On the Ready to complete page, click Finish.
NOTE
If you are running vCenter Server on a host, the host cannot be placed into maintenance mode as you add it
to a cluster using the Quickstart workflow. The same host also can be running a Platform Services Controller.
All other VMs on the host must be powered off.
5. On the Cluster configuration card, click Configure to open the Cluster configuration wizard.
a) (vSAN ESA clusters) On the Cluster Type page, enter the HCI cluster type:
• vSAN HCI provides compute resources and storage resources. The datastore can be shared across data
centers and vCenters.
• vSAN Max provides storage resources, but not compute resources. The datastore can be mounted by remote
vSAN clusters across data centers and vCenters.
b) On the Configure the distributed switches page, enter networking settings, including distributed switches, port
groups, and physical adapters.
• In the Distributed switches section, enter the number of distributed switches to configure from the drop-down
menu. Enter a name for each distributed switch. Click Use Existing to select an existing distributed switch.
If the host has a standard virtual switch with the same name as the selected distributed switch, the standard
switch is migrated to the corresponding distributed switch.
Network resource control is enabled and set to version 3. Distributed switches with network resource control
version 2 cannot be used.
• In the Port Groups section, select a distributed switch to use for vMotion and a distributed switch to use for the
vSAN network.
• In the Physical adapters section, select a distributed switch for each physical network adapter. You must
assign each distributed switch to at least one physical adapter.
If the physical adapters chosen are attached to a standard virtual switch with the same name across hosts, the
standard switch is migrated to the distributed switch. If the physical adapters chosen are unused, there is no
migration from standard switch to distributed switch.


c) On the vMotion traffic page, enter IP address information for vMotion traffic.
d) On the Storage traffic page, enter IP address information for storage traffic.
e) On the Advanced options page, enter information for cluster settings, including DRS, HA, vSAN, host options, and
EVC.
f) On the Claim disks page, select storage devices on each host. For clusters with vSAN Original Storage
Architecture, select one cache device and one or more capacity devices. For clusters with vSAN Express Storage
Architecture, select flash devices for the host's storage pool.
NOTE
Only the vSAN Data Persistence platform can consume vSAN Direct storage. The vSAN Data
Persistence platform provides a framework for software technology partners to integrate with VMware
infrastructure. Each partner must develop their own plug-in for VMware customers to receive the benefits
of the vSAN Data Persistence platform. The platform is not operational until the partner solution running
on top is operational. For more information, see vSphere with Tanzu Configuration and Management.
g) (Optional) On the Create fault domains page, define fault domains for hosts that can fail together.
For more information about fault domains, see "Managing Fault Domains in vSAN Clusters" in Administering
VMware vSAN.
h) (Optional) On the Proxy setting page, configure the proxy server if your system uses one.
i) On the Review page, verify the cluster settings, and click Finish.
You can manage the cluster through your vCenter.
You can add hosts to the cluster through Quickstart. For more information, see "Expanding a vSAN Cluster" in
Administering VMware vSAN.

Manually Enabling vSAN


To create a vSAN cluster, you create a vSphere host cluster and enable vSAN on the cluster.
A vSAN cluster can include hosts with capacity and hosts without capacity. Follow these guidelines when you create a
vSAN cluster.
• A vSAN cluster must include a minimum of three ESXi hosts. For a vSAN cluster to tolerate host and device failures,
at least three hosts that join the vSAN cluster must contribute capacity to the cluster. For best results, consider adding
four or more hosts contributing capacity to the cluster.
• Only ESXi 5.5 Update 1 or later hosts can join the vSAN cluster.
• Before you move a host from a vSAN cluster to another cluster, make sure that the destination cluster is vSAN
enabled.
• To be able to access the vSAN datastore, an ESXi host must be a member of the vSAN cluster.
After you enable vSAN, the vSAN storage provider is automatically registered with vCenter Server and the vSAN
datastore is created. For information about storage providers, see the vSphere Storage documentation.


Set Up a VMkernel Network for vSAN


To enable the exchange of data in the vSAN cluster, you must provide a VMkernel network adapter for vSAN traffic on
each ESXi host.
1. Right-click the host, and select Add Networking.
2. On the Select connection type page, select VMkernel Network Adapter and click Next.
3. On the Select target device page, configure the target switching device.
4. On the Port properties page, select vSAN service.
5. Complete the VMkernel adapter configuration.
6. On the Ready to complete page, verify that vSAN is Enabled in the status for the VMkernel adapter, and click Finish.

vSAN network is enabled for the host.
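
You can also tag a VMkernel adapter for vSAN traffic from the ESXi command line instead of the vSphere Client. This is a sketch only; vmk2 is an assumed adapter name, so substitute the adapter you created for vSAN traffic.
esxcli vsan network ip add -i vmk2 -T vsan
# Verify that the adapter now carries vSAN traffic:
esxcli vsan network list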


You can enable vSAN on the host cluster.

Create a vSAN Cluster


You can create a cluster, and then configure the cluster for vSAN.
1. Right-click a data center and select New Cluster.
2. Type a name for the cluster in the Name text box.
3. Turn on DRS, vSphere HA, and vSAN for the cluster.
Check Enable vSAN ESA to use vSAN Express Storage Architecture. vSAN Express Storage Architecture is
optimized for high-performance flash storage devices that provide greater performance and efficiency.
4. Click OK.
The cluster appears in the inventory.
5. Add hosts to the vSAN cluster.
vSAN clusters can include hosts with or without capacity devices. For best results, add hosts with capacity.
Configure services for the vSAN cluster. See Configure a Cluster for vSAN Using the vSphere Client.

Configure a Cluster for vSAN Using the vSphere Client


You can use the vSphere Client to configure vSAN on an existing cluster.
Verify that your environment meets all requirements. See "Requirements for Enabling vSAN" in vSAN Planning and
Deployment.


Create a cluster and add hosts to the cluster before enabling and configuring vSAN. Configure the port properties on each
host to add the vSAN service.
NOTE
You can use Quickstart to quickly create and configure a vSAN cluster. For more information, see "Using
Quickstart to Configure and Expand a vSAN Cluster" in vSAN Planning and Deployment.
1. Navigate to an existing host cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.

a) Select an HCI configuration type.


• vSAN HCI provides compute resources and storage resources. The datastore can be shared across clusters in
the same data center, and across clusters managed by remote vCenters.
• vSAN Compute Cluster provides vSphere compute resources only. It can mount datastores served by vSAN
Max clusters in the same data center and from remote vCenters.
• vSAN Max (vSAN ESA clusters) provides storage resources, but not compute resources. The datastore can be
mounted by client vSphere clusters and vSAN clusters in the same data center and from remote vCenters.

b) Select a deployment option (Single site vSAN cluster, Two node vSAN cluster, or vSAN stretched cluster).
c) Click Configure to open the Configure vSAN wizard.

4. Select vSAN ESA if your cluster is compatible, and click Next.


5. Configure the vSAN services to use, and click Next.
Configure data management features, including deduplication and compression, data-at-rest encryption, data-in-transit
encryption. Select RDMA (remote direct memory access) if your network supports it.
6. Claim disks for the vSAN cluster, and click Next.
For vSAN Original Storage Architecture (vSAN OSA), each host that contributes storage requires at least one flash
device for cache, and one or more devices for capacity. For vSAN Express Storage Architecture (vSAN ESA), each
host that contributes storage requires one or more flash devices.
7. Create fault domains to group hosts that can fail together.
8. Review the configuration, and click Finish.

Enabling vSAN creates a vSAN datastore and registers the vSAN storage provider. vSAN storage providers are built-in
software components that communicate the storage capabilities of the datastore to vCenter Server.
Verify that the vSAN datastore has been created. See View vSAN Datastore.
Verify that the vSAN storage provider is registered.
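
You can also check from each host that its local devices were claimed by vSAN. A minimal sketch, assuming shell access to the host; no specific device names are assumed.
esxcli vsan storage list
# Lists each device claimed by vSAN, including whether it is used for cache or
# capacity (vSAN OSA) and the disk group or storage pool it belongs to.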


Edit vSAN Settings


You can edit the settings of your vSAN cluster to configure data management features and enable services provided by
the cluster.
Edit the settings of an existing vSAN cluster if you want to enable deduplication and compression, or to enable encryption.
If you enable deduplication and compression, or if you enable encryption, the on-disk format of the cluster is automatically
upgraded to the latest version.


1. Navigate to the vSAN cluster.


2. Click the Configure tab.
a) Under vSAN, select Services.
b) Click the Edit or Enable button for the service you want to configure.
• Configure Storage. Click Mount Remote Datastores to use storage from other vSAN clusters.
• Configure vSAN performance service. For more information, see "Monitoring vSAN Performance" in vSAN
Monitoring and Troubleshooting.
• Enable File Service. For more information, see "vSAN File Service" in Administering VMware vSAN.


• Configure vSAN Network options. For more information, see "Configuring vSAN Network" in vSAN Planning
and Deployment.
• Configure iSCSI target service. For more information, see "Using the vSAN iSCSI Target Service" in
Administering VMware vSAN.
• Configure Data Services, including deduplication and compression, data-at-rest encryption, and data-in-transit
encryption.
• Configure vSAN Data Protection. Before you can use vSAN Data Protection, you must deploy the vSAN
Snapshot Service. For more information, see "Deploying the Snapshot Service Appliance" in Administering
VMware vSAN.
• Configure capacity reservations and alerts. For more information, see "About Reserved Capacity" in vSAN
Monitoring and Troubleshooting.
• Configure advanced options:
– Object Repair Timer
– Site Read Locality for vSAN stretched clusters
– Thin Swap provisioning
– Large Cluster Support for up to 64 hosts
– Automatic Rebalance
• Configure vSAN historical health service.
c) Modify the settings to match your requirements.
3. Click Apply to confirm your selections.

Enable vSAN on an Existing Cluster


You can enable vSAN on an existing cluster, and configure features and services.
Verify that your environment meets all requirements. See "Requirements for Enabling vSAN" in vSAN Planning and
Deployment.
1. Navigate to an existing host cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.
a) Select a configuration type (Single site vSAN cluster, Two node vSAN cluster, or vSAN Stretched cluster).
b) Select I need local vSAN Datastore if you plan to add disk groups or storage pools to the cluster hosts.
c) Click Configure to open the Configure vSAN wizard.
4. Select vSAN ESA if your cluster is compatible, and click Next.
5. Configure the vSAN services to use, and click Next.
Configure data management features, including deduplication and compression, data-at-rest encryption, data-in-transit
encryption. Select RDMA (remote direct memory access) if your network supports it.
6. Claim disks for the vSAN cluster, and click Next.
For vSAN Original Storage Architecture (vSAN OSA), each host that contributes storage requires at least one flash
device for cache, and one or more devices for capacity. For vSAN Express Storage Architecture (vSAN ESA), each
host that contributes storage requires one or more flash devices.


7. Create fault domains to group hosts that can fail together.


8. Review the configuration, and click Finish.

Configure License Settings for a vSAN Cluster


You must assign a license to a vSAN cluster before its evaluation period expires or its currently assigned license expires.
• To view and manage vSAN licenses, you must have the Global > Licenses privilege on the vCenter Server systems.
If you upgrade, combine, or divide vSAN licenses, you must assign the new licenses to vSAN clusters. When you assign a
vSAN license to a cluster, the amount of license capacity used equals the total number of CPUs in the hosts participating
in the cluster. The license use of the vSAN cluster is recalculated and updated every time you add or remove a host from
the cluster. For information about managing licenses and licensing terminology and definitions, see the vCenter Server
and Host Management documentation.
When you enable vSAN on a cluster, you can use vSAN in evaluation mode to explore its features. The evaluation period
starts when vSAN is enabled, and expires after 60 days. To use vSAN, you must license the cluster before the evaluation
period expires. Just like vSphere licenses, vSAN licenses have per CPU capacity. Some advanced features, such as all-
flash configuration and vSAN stretched clusters, require a license that supports the feature.
1. Navigate to your vSAN cluster.
2. Click the Configure tab.
3. Under Licensing, select vSAN Cluster.
4. Click Assign License.
5. Select an existing license and click OK.

View a Subscribed Feature for a vSAN Cluster


For vSAN+ clusters that are on subscription, you can view the subscription usage using the VMC Console or view the
subscribed feature list using the vCenter Server. For more information on subscription usage in the VMC Console, see
"View Subscription Usage and Billing" in the Using and Managing vSphere+ Guide.
vCenter Server must be converted to a vSphere+ subscription.

1. Navigate to your vSAN cluster.


2. Click the Configure tab.
3. Under Licensing & Subscription, select vSAN Cluster to view the list of features subscribed.
After you add the vSAN+ subscription to your vSphere+ environment, the number of cores displayed is equal to the
total number of physical CPU cores of each CPU on all the hosts associated with the vSAN clusters. You require
a minimum of 16 core capacity per CPU. Physical CPUs with less than 16 cores per CPU are counted as 16 cores
per CPU usage. To view information on vSAN+ subscription, see vSphere+ and vSAN+ Subscriptions. For more
information on the core requirements, see "Purchase Subscriptions" in the Getting Started with vSphere+ Guide.

View vSAN Datastore


After you enable vSAN, a single datastore is created. You can review the capacity of the vSAN datastore.


Configure vSAN and disk groups or storage pools.

1. Navigate to Storage.
2. Select the vSAN datastore.
3. Click the Configure tab.
4. Review the vSAN datastore capacity.
The size of the vSAN datastore depends on the number of capacity devices per ESXi host and the number of ESXi
hosts in the cluster. For example, if a host has seven 2 TB capacity devices, and the cluster includes eight hosts,
the approximate storage capacity is 7 x 2 TB x 8 = 112 TB. When using the all-flash configuration, flash devices are
used for capacity. For hybrid configuration, magnetic disks are used for capacity.
Some capacity is allocated for metadata.
• On-disk format version 1.0 adds approximately 1 GB per capacity device.
• On-disk format version 2.0 adds capacity overhead, typically no more than 1-2 percent capacity per device.
• On-disk format version 3.0 and later adds capacity overhead, typically no more than 1-2 percent capacity
per device. Deduplication and compression with software checksum enabled require additional overhead of
approximately 6.2 percent capacity per device.
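
The raw-capacity arithmetic above can be reproduced with a quick shell calculation. This sketch uses the example values from this section (eight hosts, seven capacity devices per host, 2 TB per device) and ignores metadata overhead and the space consumed by storage policy replicas.
# Example values only
hosts=8
devices_per_host=7
device_tb=2
echo "Approximate raw capacity: $((hosts * devices_per_host * device_tb)) TB"  # 112 TB
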
Create a storage policy for virtual machines using the storage capabilities of the vSAN datastore. For information, see the
vSphere Storage documentation.

Using vSAN and vSphere HA


You can enable vSphere HA and vSAN on the same cluster. vSphere HA provides the same level of protection for virtual
machines on vSAN datastores as it does on traditional datastores. This level of protection imposes specific restrictions
when vSphere HA and vSAN interact.


ESXi Host Requirements


You can use vSAN with a vSphere HA cluster only if the following conditions are met:
• The cluster's ESXi hosts all must be version 5.5 Update 1 or later.
• The cluster must have a minimum of three ESXi hosts, unless it is a vSAN two-host cluster. For best results, configure
the vSAN cluster with four or more hosts.
NOTE
vSAN 7.0 Update 2 and later supports Proactive HA. Select the following remediation method: Maintenance
mode for all failures. Quarantine mode is supported, but it does not protect against data loss if the host in
quarantine mode fails, and there are objects with FTT=0 or objects with FTT=1 that are degraded.

Networking Differences
vSAN uses its own logical network. When vSAN and vSphere HA are enabled for the same cluster, the HA interagent
traffic flows over this storage network rather than the management network. vSphere HA uses the management network
only when vSAN is turned off. vCenter Server chooses the appropriate network when vSphere HA is configured on a host.
NOTE
Make sure vSphere HA is not enabled when you enable vSAN on the cluster. Then you can re-enable vSphere
HA.
When a virtual machine is only partially accessible in all network partitions, you cannot power on the virtual machine
or fully access it in any partition. For example, if you partition a cluster into P1 and P2, the VM namespace object is
accessible to the partition named P1 and not to P2. The VMDK is accessible to the partition named P2 and not to P1. In
such cases, the virtual machine cannot be powered on and it is not fully accessible in any partition.
The following table shows the differences in vSphere HA networking whether or not vSAN is used.

Table 12: vSphere HA Networking Differences

Network used by vSphere HA
  vSAN On: vSAN storage network
  vSAN Off: Management network
Heartbeat datastores
  vSAN On: Any datastore mounted to more than one host, but not vSAN datastores
  vSAN Off: Any datastore mounted to more than one host
Host declared isolated
  vSAN On: Isolation addresses not pingable and vSAN storage network inaccessible
  vSAN Off: Isolation addresses not pingable and management network inaccessible

If you change the vSAN network configuration, the vSphere HA agents do not automatically acquire the new network
settings. To change the vSAN network, you must re-enable host monitoring for the vSphere HA cluster:
1. Deactivate Host Monitoring for the vSphere HA cluster.
2. Make the vSAN network changes.
3. Right-click all hosts in the cluster and select Reconfigure HA.
4. Reactivate Host Monitoring for the vSphere HA cluster.

Capacity Reservation Settings


When you reserve capacity for your vSphere HA cluster with an admission control policy, this setting must be coordinated
with the corresponding Failures to tolerate policy setting in the vSAN rule set. It must not be lower than the capacity
reserved by the vSphere HA admission control setting. For example, if the vSAN rule set allows for only two failures, the
vSphere HA admission control policy must reserve capacity that is equivalent to only one or two host failures. If you are
using the Percentage of Cluster Resources Reserved policy for a cluster that has eight hosts, you must not reserve more
than 25 percent of the cluster resources. In the same cluster, with the Failures to tolerate policy, the setting must not
be higher than two hosts. If vSphere HA reserves less capacity, failover activity might be unpredictable. Reserving too
much capacity overly constrains the powering on of virtual machines and intercluster vSphere vMotion migrations. For
information about the Percentage of Cluster Resources Reserved policy, see the vSphere Availability documentation.
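
The percentage in the example above follows directly from the number of host failures you want to tolerate divided by the number of hosts. A quick shell sketch of that arithmetic, using the example values from this section only:
# Example values only: 8-host cluster designed to tolerate 2 host failures
hosts=8
failures_to_tolerate=2
echo "Reserve at most $((failures_to_tolerate * 100 / hosts))% of cluster resources"  # 25%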

vSAN and vSphere HA Behavior in a Multiple Host Failure


After a vSAN cluster fails with a loss of failover quorum for a virtual machine object, vSphere HA might not be able to
restart the virtual machine even when the cluster quorum has been restored. vSphere HA guarantees the restart only
when it has a cluster quorum and can access the most recent copy of the virtual machine object. The most recent copy is
the last copy to be written.
Consider an example where a vSAN virtual machine is provisioned to tolerate one host failure. The virtual machine runs
on a vSAN cluster that includes three hosts, H1, H2, and H3. All three hosts fail in a sequence, with H3 being the last host
to fail.
After H1 and H2 recover, the cluster has a quorum (one host failure tolerated). Despite this quorum, vSphere HA is unable
to restart the virtual machine because the last host that failed (H3) contains the most recent copy of the virtual machine
object and is still inaccessible.
In this example, either all three hosts must recover at the same time, or the two-host quorum must include H3. If neither
condition is met, HA attempts to restart the virtual machine when host H3 is online again.

Deploying vSAN with vCenter Server


You can create a vSAN cluster as you deploy vCenter Server, and host the vCenter Server on that cluster.
The vCenter Server is a preconfigured virtual machine used to administer ESXi hosts in a cluster. You can host the
vCenter Server on a vSAN cluster.
When you use the vCenter Server Installer to deploy a vCenter, you can create a single-host vSAN cluster, and host the
vCenter Server on the cluster. During Stage 1 of the deployment, when you select a datastore, click Install on a new
vSAN cluster containing the target host. Follow the steps in the Installer wizard to complete the deployment.
The vCenter Server Installer creates a one-host vSAN cluster, with disks claimed from the host. vCenter Server is
deployed on the vSAN cluster.
After you complete the deployment, you can manage the single-host vSAN cluster with the vCenter. You must complete
the configuration of the vSAN cluster.

Turn Off vSAN


You can turn off vSAN for a host cluster.
Verify that the hosts are in maintenance mode. For more information, see Place a Member of vSAN Cluster in
Maintenance Mode.
When you turn off vSAN for a cluster, all virtual machines and data services located on the vSAN datastore become
inaccessible. If you have consumed storage on the vSAN cluster using vSAN Direct, then the vSAN Direct monitoring
services, such as health checks, space reporting, and performance monitoring, are not available. If you intend to use
virtual machines while vSAN is off, make sure you migrate virtual machines from vSAN datastore to another datastore
before turning off the vSAN cluster.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.
4. Click Turn Off vSAN.
5. On the Turn Off vSAN dialog, confirm your selection.

Creating a vSAN Stretched Cluster or Two-Node vSAN Cluster


You can create a vSAN stretched cluster that spans two geographic locations (or sites). vSAN stretched clusters enable
you to extend the vSAN datastore across two sites to use it as stretched storage. The vSAN stretched cluster continues to
function if a failure or scheduled maintenance occurs at one site.

What Are vSAN Stretched Clusters


vSAN stretched clusters extend the vSAN cluster from a single data site to two sites for a better level of availability and
intersite load balancing. vSAN stretched clusters are typically deployed in environments where the distance between data
centers is limited, such as metropolitan or campus environments.
You can use vSAN stretched clusters to manage planned maintenance and avoid disaster scenarios, because
maintenance or loss of one site does not affect the overall operation of the cluster. In a vSAN stretched cluster
configuration, both data sites are active sites. If either site fails, vSAN uses the storage on the other site. vSphere HA
restarts any VM that must be restarted on the remaining active site.
You must designate one site as the preferred site. The other site becomes a secondary or nonpreferred site. If the network
connection between the two active sites is lost, vSAN continues operation with the preferred site. The site designated as
preferred typically is the one that remains in operation, unless it is resyncing or has another issue. The site that leads to
maximum data availability is the one that remains in operation.
A vSAN stretched cluster can tolerate one link failure at a time without data becoming unavailable. A link failure is a loss
of network connection between the two sites or between one site and the witness host. During a site failure or loss of
network connection, vSAN automatically switches to fully functional sites.
vSAN 7.0 Update 3 and later vSAN stretched clusters can tolerate a witness host failure when one site is unavailable.
Configure the storage policy Site disaster tolerance rule to Site mirroring - stretched cluster. If one site is down due to
maintenance or failure and the witness host fails, objects become non-compliant but remain accessible.
For more information about working with vSAN stretched clusters, see the vSAN Stretched Cluster Guide.

Witness Host
Each vSAN stretched cluster consists of two data sites and one witness host. The witness host resides at a third site
and contains the witness components of virtual machine objects. The witness host does not store customer data, only
metadata, such as the size and UUID of vSAN object and components.
The witness host serves as a tiebreaker when a decision must be made regarding availability of datastore components
when the network connection between the two sites is lost. In this case, the witness host typically forms a vSAN cluster
with the preferred site. But if the preferred site becomes isolated from the secondary site and the witness, the witness host
forms a cluster using the secondary site. When the preferred site is online again, data is resynchronized to ensure that
both sites have the latest copies of all data.
If the witness host fails, all corresponding objects become noncompliant but are fully accessible.


The witness host has the following characteristics:


• The witness host can use low bandwidth/high latency links.
• The witness host cannot run VMs.
• A single witness host can support only one vSAN stretched cluster. Two-node vSAN clusters can share a single
witness host.
• The witness host must have one VMkernel adapter with vSAN traffic enabled, with connections to all hosts in the
cluster. The witness host uses one VMkernel adapter for management and one VMkernel adapter for vSAN data traffic.
The witness host can have only one VMkernel adapter dedicated to vSAN.
• The witness host must be a standalone host dedicated to the vSAN stretched cluster. It cannot be added to any other
cluster or moved in inventory through vCenter Server.
The witness host can be a physical host or an ESXi host running inside a VM. The VM witness host does not provide
other types of functionality, such as storing or running VMs. Multiple witness hosts can run as VMs on a single physical
server. For patching and basic networking and monitoring configuration, the VM witness host works in the same way as
a typical ESXi host. You can manage it with vCenter Server, patch it and update it by using esxcli or vSphere Lifecycle
Manager, and monitor it with standard tools that interact with ESXi hosts.
You can use a witness virtual appliance as the witness host in a vSAN stretched cluster. The witness virtual appliance
is an ESXi host in a VM, packaged as an OVF or OVA. Different appliances and different options are available, based on
the vSAN architecture and the size of the deployment.

vSAN Stretched Clusters and Fault Domains


vSAN stretched clusters use fault domains to provide redundancy and failure protection across sites. Each site in a vSAN
stretched cluster resides in a separate fault domain.
A vSAN stretched cluster requires three fault domains: the preferred site, the secondary site, and a witness host. Each
fault domain represents a separate site. When the witness host fails or enters maintenance mode, vSAN considers it a
site failure.
In vSAN 6.6 and later releases, you can provide an extra level of local fault protection for virtual machine objects in vSAN
stretched clusters. When you configure a vSAN stretched cluster, the following policy rules are available for objects in the
cluster:
• Site disaster tolerance. For vSAN stretched clusters, this rule defines the failure tolerance method. Select Site
mirroring - stretched cluster.
• Failures to tolerate (FTT). For vSAN stretched clusters, FTT defines the number of additional host failures that a
virtual machine object can tolerate.
• Data locality. You can set this rule to None, Preferred, or Secondary. This rule enables you to restrict virtual
machine objects to a selected site in the vSAN stretched cluster.
In a vSAN stretched cluster with local fault protection, even when one site is unavailable, the cluster can perform repairs
on missing or broken components in the available site.
vSAN 7.0 and later continue to serve I/O if any disks or disk groups on one site reach 96 percent full or 5 GB free capacity
(whichever is less) while disks on the other site have free space available. Components on the affected site are marked
absent, and vSAN continues to perform I/O to healthy object copies on the other site. When disks on the affected site
reach 94 percent capacity or 10 GB free (whichever is less), the absent components become available. vSAN resyncs the
available components and all objects become policy compliant.


vSAN Stretched Cluster Design Considerations


Consider these guidelines when working with a vSAN stretched cluster.
• Configure DRS settings for the vSAN stretched cluster.
– DRS must be enabled on the cluster. If you place DRS in partially automated mode, you can control which VMs to
migrate to each site. vSAN 7.0 Update 2 enables you to operate DRS in automatic mode, and recover gracefully
from network partitions.
– Create two host groups, one for the preferred site and one for the secondary site.
– Create two VM groups, one to hold the VMs on the preferred site and one to hold the VMs on the secondary site.
– Create two VM-Host affinity rules that map VMs-to-host groups, and specify which VMs and hosts reside in the
preferred site and which VMs and hosts reside in the secondary site.
– Configure VM-Host affinity rules to perform the initial placement of VMs in the cluster.
• Configure HA settings for the vSAN stretched cluster.
– HA rule settings should respect VM-Host affinity rules during failover.
– Disable HA datastore heartbeats.
– Use HA with Host Failure Monitoring, Admission Control, and set FTT to the number of hosts in each site.
• vSAN stretched clusters require on-disk format 2.0 or later. If necessary, upgrade the on-disk format before configuring
a vSAN stretched cluster. See "Upgrade vSAN Disk Format" in Administering VMware vSAN.
• Configure the FTT to 1 for vSAN stretched clusters.
• vSAN stretched clusters support enabling Symmetric Multiprocessing Fault Tolerance (SMP-FT) VMs only when Site
Disaster Tolerance is set to None with either Preferred or Secondary. vSAN does not support SMP-FT VMs on a
vSAN stretched cluster with Site Disaster Tolerance set to 1 or more. vSAN two-host clusters support enabling SMP-
FT with FTT set to 1 only when both data nodes are in the same site.
• When a host is disconnected or not responding, you cannot add or remove the witness host. This limitation ensures
that vSAN collects enough information from all hosts before initiating reconfiguration operations.
• Using esxcli to add or remove hosts is not supported for vSAN stretched clusters.
• Do not create snapshots of the witness host or backup the witness host. If the witness host fails, change the witness
host.

Best Practices for Working with vSAN Stretched Clusters


When working with vSAN stretched clusters, follow these recommendations for proper performance.
• If one of the sites (fault domains) in a vSAN stretched cluster is inaccessible, new VMs can still be provisioned in the
subcluster containing the operational site. These new VMs are implicitly force provisioned and are non-compliant until
the partitioned site rejoins the cluster. This implicit force provisioning is performed only when two of the three sites are
available. A site here refers to either a data site or the witness host.
• If an entire site goes offline due to a power outage or loss of network connection, restart the site immediately, without
much delay. Instead of restarting vSAN hosts one by one, bring all hosts online approximately at the same time, ideally
within a span of 10 minutes. By following this process, you avoid resynchronizing a large amount of data across the
sites.
• If a host is permanently unavailable, remove the host from the cluster before you perform any reconfiguration tasks.
• If you want to clone a VM witness host to support multiple vSAN stretched clusters, do not configure the VM as a
witness host before cloning it. First deploy the VM from OVF, then clone the VM, and configure each clone as a
witness host for a different cluster. Or you can deploy as many VMs as you need from the OVF, and configure each
one as a witness host for a different cluster.


vSAN Stretched Clusters Network Design


All three sites in a vSAN stretched cluster communicate across the management network and across the vSAN network.
The VMs in both data sites communicate across a common virtual machine network.
A vSAN stretched cluster must meet certain basic networking requirements.
• Management network requires connectivity across all three sites, using a Layer 2 stretched network or a Layer 3
network.
• The vSAN network requires connectivity across all three sites. It must have independent routing and connectivity
between the data sites and the witness host. vSAN supports both Layer 2 and Layer 3 between the two data sites, and
Layer 3 between the data sites and the witness host.
• VM network requires connectivity between the data sites, but not the witness host. Use a Layer 2 stretched network or
Layer 3 network between the data sites. In the event of a failure, the VMs do not require a new IP address to work on
the remote site.
• vMotion network requires connectivity between the data sites, but not the witness host. Use a Layer 2 stretched or a
Layer 3 network between data sites.
NOTE
vSAN over RDMA is not supported on vSAN stretched clusters or two-node vSAN clusters.

Using Static Routes on ESXi Hosts


If you use a single default gateway on ESXi hosts, each ESXi host contains a default TCP/IP stack that has a single
default gateway. The default route is typically associated with the management network TCP/IP stack.
The management network and the vSAN network might be isolated from one another. For example, the management
network might use vmk0 on physical NIC 0, while the vSAN network uses vmk2 on physical NIC 1 (separate network
adapters for two distinct TCP/IP stacks). This configuration implies that the vSAN network has no default gateway.
In vSAN 7.0 and later, you can override the default gateway for the vSAN VMkernel adapter on each host, and configure a
gateway address for the vSAN network.
You also can use static routes to communicate across networks. Consider a vSAN network that is stretched over two data
sites on a Layer 2 broadcast domain (for example, 172.10.0.0) and the witness host is on another broadcast domain (for
example, 172.30.0.0). If the VMkernel adapters on a data site try to connect to the vSAN network on the witness host, the
connection fails because the default gateway on the ESXi host is associated with the management network. There is no
route from the management network to the vSAN network.
Define a new routing entry that indicates which path to follow to reach a particular network. For a vSAN network on a
vSAN stretched cluster, you can add static routes to ensure proper communication across all hosts.
For example, you can add a static route to the hosts on each data site, so requests to reach the 172.30.0.0 witness
network are routed through the 172.10.0.0 interface. Also add a static route to the witness host so that requests to reach
the 172.10.0.0 network for the data sites are routed through the 172.30.0.0 interface.
NOTE
If you use static routes, you must manually add the static routes for new ESXi hosts added to either site before
those hosts can communicate across the cluster. If you replace the witness host, you must update the static
route configuration.
Use the esxcli network ip route command to add static routes.
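
For example, the static routes described above might look like the following. This is a sketch only; the gateway addresses and the /16 prefix length are assumptions, and the subnets are the example networks used in this section.
# On each data-site host, route traffic for the witness network through a gateway
# on the local vSAN network (gateway address is a placeholder):
esxcli network ip route ipv4 add -n 172.30.0.0/16 -g 172.10.0.1
# On the witness host, add the reverse route back to the data sites:
esxcli network ip route ipv4 add -n 172.10.0.0/16 -g 172.30.0.1
# Verify the routing table:
esxcli network ip route ipv4 list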

What Are Two-Node vSAN Clusters


A two-node vSAN cluster has two hosts at the same location. The witness function is performed at a second site on a
dedicated virtual appliance.


Two-node vSAN clusters are often used for remote office/branch office environments, typically running a small number of
workloads that require high availability. A two-node vSAN cluster consists of two hosts at the same location, connected to
the same network switch or directly connected. A third host acts as a witness host, which can be located remotely from
the branch office. Usually the witness host resides at the main site, with the vCenter Server.
A single witness host can support up to 64 two-node vSAN clusters. The number of clusters supported by a shared
witness host is based on the host memory.
When you configure a two-node vSAN cluster in Quickstart or with the Configure vSAN wizard, you can select a witness
host. To assign a new witness host for your cluster, right-click the cluster in the vSphere Client and select vSAN >
Assign Shared Witness.

Use Quickstart to Configure a vSAN Stretched Cluster or Two-Node vSAN Cluster


You can use the Quickstart workflow to quickly configure a vSAN stretched cluster or two-node vSAN cluster.
• Deploy a host outside of any cluster to use as a witness host.
• Verify that hosts are running ESXi 6.0 Update 2 or later. For a two-node vSAN cluster, verify that hosts are running
ESXi 6.1 or later.
• Verify that ESXi hosts in the cluster do not have any existing vSAN or networking configuration.
When you create a cluster in the vSphere Client, the Quickstart workflow appears. You can use Quickstart to perform
basic configuration tasks, such as adding hosts and claiming disks.
1. Navigate to the cluster in the vSphere Client.
2. Click the Configure tab, and select Configuration > Quickstart.
3. On the Cluster basics card, click Edit to open the Cluster basics wizard.
a) Enter the cluster name.
b) Enable the vSAN slider.
Select vSAN ESA if your cluster is compatible. You also can enable other features, such as DRS or vSphere HA.
c) Click Finish.
4. On the Add hosts card, click Add to open the Add hosts wizard.
a) On the Add hosts page, enter information for new hosts, or click Existing hosts and select from hosts listed in the
inventory.
b) On the Host summary page, verify the host settings.
c) On the Ready to complete page, click Finish.
5. On the Cluster configuration card, click Configure to open the Cluster configuration wizard.
a) (vSAN ESA clusters) On the Cluster Type page, enter the HCI cluster type:
• vSAN HCI provides compute resources and storage resources. The datastore can be shared across data
centers and vCenters.
• vSAN Scale Flex provides storage resources, but not compute resources. The datastore can be mounted by
remote vSAN clusters across data centers and vCenters.
b) On the Configure the distributed switches page, enter networking settings, including distributed switches, port
groups, and physical adapters.
• In the Distributed switches section, enter the number of distributed switches to configure from the drop-down
menu. Enter a name for each distributed switch. Click Use Existing to select an existing distributed switch.
If the physical adapters chosen are attached to a standard virtual switch with the same name across hosts, the
standard switch is migrated to the distributed switch. If the physical adapters chosen are unused, there is no
migration from standard switch to distributed switch.


Network resource control is enabled and set to version 3. Distributed switches with network resource control
version 2 cannot be used.
• In the Port Groups section, select a distributed switch to use for vMotion and a distributed switch to use for the
vSAN network.
• In the Physical adapters section, select a distributed switch for each physical network adapter. You must
assign each distributed switch to at least one physical adapter.
This mapping of physical NICs to the distributed switches is applied to all hosts in the cluster. If you are using
an existing distributed switch, the physical adapter selection can match the mapping of the distributed switch.
c) On the vMotion traffic page, enter IP address information for vMotion traffic.
d) On the Storage traffic page, enter IP address information for storage traffic.
e) On the Advanced options page, enter information for cluster settings, including DRS, HA, vSAN, host options, and
EVC.
In the vSAN options section, select vSAN Stretched cluster or Two node vSAN cluster as the Deployment type.
f) On the Claim disks page, select storage devices to create the vSAN datastore.
For vSAN Original Storage Architecture, select devices for cache and for capacity. vSAN uses those devices to
create disk groups on each host.
For vSAN Express Storage Architecture, select compatible flash devices or enable I want vSAN to manage the
disks. vSAN uses those devices to create storage pools on each host.
g) (Optional) On the Proxy settings page, configure the proxy server if your system uses one.
h) On the Configure fault domains page, define fault domains for the hosts in the Preferred site and the Secondary
site.
For more information about fault domains, see "Managing Fault Domains in vSAN Clusters" in Administering
VMware vSAN.
i) On the Select witness host page, select a host to use as a witness host. The witness host cannot be part of the
vSAN stretched cluster, and it can have only one VMkernel adapter configured for vSAN data traffic.
Before you configure the witness host, verify that it is empty and does not contain any components. A two-node
vSAN cluster can share a witness with other two-node vSAN clusters.
j) On the Claim disks for witness host page, select disks on the witness host.
k) On the Review page, verify the cluster settings, and click Finish.
You can manage the cluster through vCenter Server.
You can add hosts to the cluster and modify the configuration through Quickstart. You also can modify the configuration
manually with the vSphere Client.


Manually Configure vSAN Stretched Cluster


Configure a vSAN cluster that stretches across two geographic locations or sites.
• Verify that you have a minimum of three hosts: one for the preferred site, one for the secondary site, and one host to
act as a witness.
• Verify that you have configured one host to serve as the witness host for the vSAN stretched cluster. Verify that the
witness host is not part of the vSAN cluster, and that it has only one VMkernel adapter configured for vSAN data traffic.
• Verify that the witness host is empty and does not contain any components. To configure an existing vSAN host as a
witness host, first evacuate all data from the host and delete the storage devices.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Fault Domains.

4. Click Configure Stretched Cluster to open the vSAN stretched cluster configuration wizard.
5. Select the hosts that you want to assign to the secondary fault domain and click >>.
The hosts that are listed under the Preferred fault domain are in the preferred site.
6. Click Next.
7. Select a witness host that is not a member of the vSAN stretched cluster and click Next.
8. Claim storage devices on the witness host and click Next.
For vSAN Original Storage Architecture, select devices for cache and for capacity.
For vSAN Express Storage Architecture, select compatible flash devices or enable I want vSAN to manage the
disks.


9. On the Ready to complete page, review the configuration and click Finish.

Change the Preferred Fault Domain


You can configure the Secondary site as the Preferred site. The current Preferred site becomes the Secondary site.
NOTE
Objects with Data locality=Preferred policy setting always move to the Preferred fault domain. Objects with
Data locality=Secondary always move to the Secondary fault domain. If you change the Preferred domain to
Secondary, and the Secondary domain to Preferred, these objects move from one site to the other. This action
might cause an increase in resynchronization activity. To avoid unnecessary resynchronization, you can change
the Data locality setting to None before you swap the Preferred and Secondary domains. Once you swap the
domains back, you can reset the Data locality.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Fault Domains.
4. Select the secondary fault domain and click the Change Preferred Fault Domain icon.
5. Click Yes or Apply to confirm.
The selected fault domain is marked as the preferred fault domain.

Deploying a vSAN Witness Appliance


Specific vSAN configurations, such as a stretched cluster, require a witness host. Instead of using a dedicated physical
ESXi host as a witness host, you can deploy the vSAN witness appliance. The appliance is a preconfigured virtual
machine that runs ESXi and is distributed as an OVA file.
You can cross-host the witness for multiple stretched clusters only if the clusters run in four geographical locations. vSAN
does not support cross-hosting with clusters available only in two locations.
Unlike a general purpose ESXi host, the witness appliance does not run virtual machines. Its only purpose is to serve as a
vSAN witness, and it can contain only witness components.
The workflow to deploy and configure the vSAN witness appliance includes this process.
When you deploy the vSAN witness appliance, you must configure the size of the witness supported by the vSAN
stretched cluster. Choose one of the following options:
• Tiny supports up to 750 components (10 VMs or fewer).
• Medium supports up to 21,833 components (500 VMs). As a shared witness, the Medium witness appliance supports
up to 21,000 components and up to 21 two-node vSAN clusters.
• Large supports up to 45,000 components (more than 500 VMs). As a shared witness, the Large witness appliance
supports up to 24,000 components and up to 24 two-node vSAN clusters.
• Extra Large supports up to 64,000 components (more than 500 VMs). As a shared witness, the Extra Large witness
appliance supports up to 64,000 components and up to 64 two-node vSAN clusters.
NOTE
These estimates are based on standard VM configurations. The number of components that make up a VM can
vary, depending on the number of virtual disks, policy settings, snapshot requirements, and so on. For more
information about witness appliance sizing for two-node vSAN clusters, refer to the vSAN 2 Node Guide.
You also must select a datastore for the vSAN witness appliance. The witness appliance must use a different datastore
than the vSAN stretched cluster datastore.
1. Download the appliance from the VMware website.


2. Deploy the appliance to a vSAN host or cluster. For more information, see Deploying OVF Templates in the vSphere
Virtual Machine Administration documentation.
3. Configure the vSAN network on the witness appliance.
4. Configure the management network on the witness appliance.
5. Add the appliance to vCenter Server as a witness ESXi host. Make sure to configure the vSAN VMkernel interface on
the host.

Set Up the vSAN Network on the Witness Appliance


The vSAN witness appliance includes two preconfigured network adapters. You must change the configuration of the
second adapter so that the appliance can connect to the vSAN network.
1. Navigate to the virtual appliance that contains the witness host.
2. Right-click the appliance and select Edit Settings.
3. On the Virtual Hardware tab, expand the second Network adapter.
4. From the drop-down menu, select the vSAN port group and click OK.

Configure Management Network on the Witness Appliance


Configure the witness appliance, so that it is reachable on the network.
By default, the appliance can automatically obtain networking parameters if your network includes a DHCP server. If not,
you must configure appropriate settings.
1. Power on your witness appliance and open its console.
Because your appliance is an ESXi host, you see the Direct Console User Interface (DCUI).
2. Press F2 and navigate to the Network Adapters page.
3. On the Network Adapters page, verify that at least one vmnic is selected for transport.
4. Configure the IPv4 parameters for the management network.
a) Navigate to the IPv4 Configuration section and change the default DHCP setting to static.
b) Enter the following settings:
• IP address
• Subnet mask
• Default gateway
5. Configure DNS parameters.
• Primary DNS server
• Alternate DNS server
• Hostname
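
If you prefer to configure these settings from the ESXi Shell instead of the DCUI, a minimal sketch follows. All addresses and names are placeholders; substitute the values for your environment, and verify that vmk0 is the management VMkernel adapter on your witness appliance.
# Static IPv4 address for the management VMkernel adapter:
esxcli network ip interface ipv4 set -i vmk0 -I 192.168.10.50 -N 255.255.255.0 -t static
# Default gateway for the management network:
esxcli network ip route ipv4 add -n default -g 192.168.10.1
# DNS servers and hostname:
esxcli network ip dns server add -s 192.168.10.10
esxcli system hostname set --host=vsan-witness --domain=example.com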


Configure Network Interface for Witness Traffic


You can separate data traffic from witness traffic in two-node vSAN clusters and vSAN stretched clusters.
• Verify that the data site to witness traffic connection has a minimum bandwidth of 2 Mbps for every 1,000 vSAN
components.
• Verify the latency requirements:
– Two-node vSAN clusters must have less than 500 ms RTT.
– vSAN stretched clusters with less than 11 hosts per site must have less than 200 ms RTT.
– vSAN stretched clusters with 11 or more hosts per site must have less than 100 ms RTT.
• Verify that the vSAN data connection meets the following requirements.
– For hosts directly connected in a two-node vSAN cluster, use a 10 Gbps direct connection between hosts. Hybrid
clusters also can use a 1 Gbps crossover connection between hosts.
– For hosts connected to a switched infrastructure, use a 10 Gbps shared connection (required for all-flash clusters),
or a 1 Gbps dedicated connection.
• Verify that data traffic and witness traffic use the same IP version.
vSAN data traffic requires a low-latency, high-bandwidth link. Witness traffic can use a high-latency, low-bandwidth and
routable link. To separate data traffic from witness traffic, you can configure a dedicated VMkernel network adapter for
vSAN witness traffic.
You can add support for a direct network cross-connection to carry vSAN data traffic in a vSAN stretched cluster. You can
configure a separate network connection for witness traffic. On each data host in the cluster, configure the management
VMkernel network adapter to also carry witness traffic. Do not configure the witness traffic type on the witness host.
NOTE
Network Address Translation (NAT) is not supported between vSAN data hosts and the witness host.
1. Open an SSH connection to the ESXi host.
2. Use the esxcli network ip interface list command to determine which VMkernel network adapter is used for
management traffic.
For example:
esxcli network ip interface list
vmk0
Name: vmk0
MAC Address: e4:11:5b:11:8c:16
Enabled: true
Portset: vSwitch0
Portgroup: Management Network
Netstack Instance: defaultTcpipStack
VDS Name: N/A
VDS UUID: N/A
VDS Port: N/A
VDS Connection: -1
Opaque Network ID: N/A
Opaque Network Type: N/A
External ID: N/A
MTU: 1500
TSO MSS: 65535
Port ID: 33554437

vmk1
Name: vmk1
MAC Address: 00:50:56:6a:3a:74
Enabled: true
Portset: vSwitch1
Portgroup: vsandata
Netstack Instance: defaultTcpipStack
VDS Name: N/A
VDS UUID: N/A
VDS Port: N/A
VDS Connection: -1
Opaque Network ID: N/A
Opaque Network Type: N/A
External ID: N/A
MTU: 9000
TSO MSS: 65535
Port ID: 50331660

NOTE
Multicast information is included for backward compatibility. vSAN 6.6 and later releases do not require
multicast.
3. Use the esxcli vsan network ip add command to configure the management VMkernel network adapter to support
witness traffic.
esxcli vsan network ip add -i vmkx -T witness

4. Use the esxcli vsan network list command to verify the new network configuration.
For example:
esxcli vsan network list
Interface
VmkNic Name: vmk0
IP Protocol: IP
Interface UUID: 8cf3ec57-c9ea-148b-56e1-a0369f56dcc0
Agent Group Multicast Address: 224.2.3.4
Agent Group IPv6 Multicast Address: ff19::2:3:4
Agent Group Multicast Port: 23451
Master Group Multicast Address: 224.1.2.3
Master Group IPv6 Multicast Address: ff19::1:2:3
Master Group Multicast Port: 12345
Host Unicast Channel Bound Port: 12321
Multicast TTL: 5
Traffic Type: witness

Interface
VmkNic Name: vmk1
IP Protocol: IP
Interface UUID: 6df3ec57-4fb6-5722-da3d-a0369f56dcc0
Agent Group Multicast Address: 224.2.3.4
Agent Group IPv6 Multicast Address: ff19::2:3:4
Agent Group Multicast Port: 23451
Master Group Multicast Address: 224.1.2.3
Master Group IPv6 Multicast Address: ff19::1:2:3
Master Group Multicast Port: 12345
Host Unicast Channel Bound Port: 12321
Multicast TTL: 5
Traffic Type: vsan


In the vSphere Client, the management VMkernel network interface is not selected for vSAN traffic. Do not re-enable the
interface in the vSphere Client.

Change the Witness Host


You can replace or change the witness host for a vSAN stretched cluster.
Verify that the witness host is not in use by another cluster, has a VMkernel adapter configured for vSAN traffic, and has
no vSAN partitions on its disks.
Change the ESXi host used as a witness host for your vSAN stretched cluster.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Fault Domains.
4. Click the Change button. The Change Witness Host wizard opens.
5. Select a new host to use as a witness host, and click Next.
6. Claim disks on the new witness host, and click Next.
7. On the Ready to complete page, review the configuration, and click Finish.

Convert a vSAN Stretched Cluster to a Single Site vSAN Cluster


You can decommission a vSAN stretched cluster and convert it to a single site vSAN cluster.
• Back up all running VMs, and verify that all VMs are compliant with their current storage policy.
• Ensure that no health issues exist, and that all resync activities are complete.
• Change the associated storage policy to move all VM objects to one site. Use the Data locality rule to restrict virtual
machine objects to the selected site.
When you decommission a vSAN stretched cluster, the witness host is removed, but the fault domain configuration
remains. Because the witness host is not available, all witness components are missing for your virtual machines. To
ensure full availability for your VMs, repair the cluster objects immediately.
1. Navigate to the vSAN stretched cluster.
2. Click the Configure tab.
3. Under vSAN, click Fault Domains.
4. Disable the vSAN stretched cluster.
a) Click Disable. The Remove Witness Host dialog opens.
b) Click Remove to confirm.
5. Remove the fault domain configuration.
a) Select a fault domain, and choose Actions > Delete. Click Yes to confirm.
b) Select the other fault domain, and choose Actions > Delete. Click Yes to confirm.
6. Remove the witness host from inventory.
7. Repair the objects in the cluster.
a) Click the Monitor tab.
b) Under vSAN, click Health and click vSAN object health.
c) Click Repair object immediately.
vSAN recreates the witness components within the cluster.


vSAN Network Design


The vSAN Network Design guide describes network requirements, network design, and configuration practices for
deploying a highly available and scalable vSAN cluster.
vSAN is a distributed storage solution. As with any distributed solution, the network is an important component of the
design. For best results, you must adhere to the guidance provided in this document, because improper networking
hardware and design can lead to unfavorable results.
At VMware, we value inclusion. To foster this principle within our customer, partner, and internal community, we create
content using inclusive language.

Intended Audience
This guide is intended for anyone who is designing, deploying, and managing a vSAN cluster. The information in this
guide is written for experienced network administrators who are familiar with network design and configuration, virtual
machine management, and virtual data center operations. This guide also assumes familiarity with VMware vSphere,
including VMware ESXi, vCenter Server, and the vSphere Client.

Related Documents
In addition to this guide, you can refer to the following guides to know more about vSAN networking:
• vSAN Planning and Deployment Guide, to know more about creating vSAN clusters
• Administering VMware vSAN, to configure a vSAN cluster and learn more about vSAN features
• vSAN Monitoring and Troubleshooting Guide, to monitor and troubleshoot vSAN clusters

What is vSAN Network


You can use vSAN to provision the shared storage within vSphere. vSAN aggregates local or direct-attached storage
devices of a host cluster and creates a single storage pool shared across all hosts in the vSAN cluster.
vSAN is a distributed and shared storage solution that depends on a highly available, properly configured network for
vSAN storage traffic. A high performing and available network is crucial to a successful vSAN deployment. This guide
provides recommendations on how to design and configure a vSAN network.
vSAN has a distributed architecture that relies on a high-performing, scalable, and resilient network. All host nodes
within a vSAN cluster communicate over the IP network. All the hosts must maintain IP unicast connectivity, so they can
communicate over a Layer 2 or Layer 3 network. For more information on the unicast communication, see Using Unicast
in vSAN Network.


vSAN Networking Terms and Definitions


vSAN introduces specific terms and definitions that are important to understand. Before you start designing your vSAN
network, review the key vSAN networking terms and definitions.

Terms Definitions

CLOM The Cluster-Level Object Manager (CLOM) is responsible for ensuring that an object's configuration matches its
storage policy. The CLOM checks whether enough fault domains are available to satisfy that policy. It decides where to
place components and witnesses in a cluster.
CMMDS The Cluster Monitoring, Membership, and Directory Service
(CMMDS) is responsible for the recovery and maintenance of a
cluster of networked node members. It manages the inventory of
items such as host nodes, devices, and networks. It also stores
metadata information, such as policies and RAID configuration for
vSAN objects.
DOM The Distributed Object Manager (DOM) is responsible for creating
the components and distributing them across the cluster. After a
DOM object is created, one of the nodes (host) is nominated as
the DOM owner for that object. This host handles all IOPS to that
DOM object by locating the respective child components across
the cluster and redirecting the I/O to respective components
over the vSAN network. DOM objects include vdisk, snapshot,
vmnamespace, vmswap, vmem, and so on.
LSOM The Log-Structured Object Manager (LSOM) is responsible
for locally storing the data on the vSAN file system as vSAN
Component or LSOM-Object (data component or witness
component).
NIC Teaming Network Interface Card (NIC) teaming can be defined as two or
more network adapters (NICs) that are set up as a "team" for high
availability and load balancing.
NIOC Network I/O Control (NIOC) determines the bandwidth that
different network traffic types are given on a vSphere distributed
switch. The bandwidth distribution is a user configurable
parameter. When NIOC is enabled, distributed switch traffic is
divided into predefined network resource pools: Fault Tolerance
traffic, iSCSI traffic, vMotion traffic, management traffic, vSphere
Replication traffic, NFS traffic, and virtual machine traffic.


Objects and Components Each object is composed of a set of components, determined by capabilities that are in use in
the VM Storage Policy.
A vSAN datastore contains several object types:
• VM Home Namespace - The VM Home Namespace is a
virtual machine home directory where all virtual machine
configuration files are stored. This includes files such as .vmx,
log files, vmdks, and snapshot delta description files.
• VMDK - VMDK is a virtual machine disk or .vmdk file that
stores the contents of the virtual machine's hard disk drive.
• VM Swap Object - VM Swap Objects are created when a
virtual machine is powered on.
• Snapshot Delta VMDKs - Snapshot Delta VMDKs are created
when virtual machine snapshots are taken.
• Memory Object - Memory Objects are created when the
snapshot memory option is selected when creating or
suspending a virtual machine.
RDT The Reliable Data Transport (RDT) protocol is used for
communication between hosts over the vSAN VMkernel ports. It
uses TCP at the transport layer and is responsible for creating and
destroying TCP connections (sockets) on demand. It is optimized to
send large files.
SPBM Storage Policy-Based Management (SPBM) provides a storage
policy framework that serves as a single unified control panel
across a broad range of data services and storage solutions. This
framework helps you to align storage with application demands of
your virtual machines.
VASA The vSphere Storage APIs for Storage Awareness (VASA) is a
set of application program interfaces (APIs) that enables vCenter
Server to recognize the capabilities of storage arrays. VASA
providers communicate with vCenter Server to determine the
storage topology, capability, and state information which supports
policy-based management, operations management, and DRS
functionality.
VLAN A VLAN enables a single physical LAN segment to be further
segmented so that groups of ports are isolated from one another
as if they were on physically different segments.
Witness Component A witness is a component that contains only metadata and does
not contain any actual application data. It serves as a tiebreaker
when a decision must be made regarding the availability of the
surviving datastore components, after a potential failure. A witness
consumes approximately 2 MB of space for metadata on the
vSAN datastore when using on-disk format 1.0, and 4 MB for the
on-disk format for version 2.0 and later.

Understanding vSAN Networking


A vSAN network facilitates the communication between cluster hosts, and must guarantee fast performance, high
availability, and bandwidth.
vSAN uses the network to communicate between the ESXi hosts and for virtual machine disk I/O.


Virtual machines (VMs) on vSAN datastores are made up of a set of objects, and each object can be made up of one or
more components. These components are distributed across multiple hosts for resilience to drive and host failures. vSAN
maintains and updates these components using the vSAN network.
The following diagram provides a high-level overview of the vSAN network:

vSAN Network Characteristics


vSAN is network-dependent. Understanding and configuring the right vSAN network settings is key to avoiding
performance and stability issues.
A reliable and robust vSAN network has the following characteristics:
Unicast
vSAN 6.6 and later releases support unicast communication. Unicast traffic is a one-to-one transmission of IP packets
from one point in the network to another. The primary host sends a heartbeat to all other hosts each second over unicast,
which confirms that the hosts are active and participating in the vSAN cluster. You can design a simple unicast network
for vSAN. For more information on unicast communication, see Using Unicast in vSAN Network.
NOTE
If possible, always use the latest version of vSAN.
Layer 2 and Layer 3 Network
All hosts in the vSAN cluster must be connected through a Layer 2 or Layer 3 network. vSAN releases earlier than
vSAN 6.0 support only Layer 2 networking, whereas subsequent releases include support for both Layer 2 and Layer 3
protocols. Use a Layer 2 or Layer 3 network to provide communication between the data sites and the witness site. For
more information on Layer 2 and Layer 3 network topologies, see Standard Deployments.
VMkernel Network


Each ESXi host in a vSAN cluster must have a network adapter for vSAN communication. All the intra-cluster node
communication happens through the vSAN VMkernel port. VMkernel ports provide Layer 2 and Layer 3 services to each
vSAN host and hosted virtual machines.
vSAN Network Traffic
Several different traffic types are available in the vSAN network, such as the storage traffic and the unicast traffic. The
compute and storage of a virtual machine can be on the same host or on different hosts in the cluster. A VM that is not
configured to tolerate a failure might be running on one host, and accessing a VM object or component that resides on a
different host. This implies that all I/O from the VM passes through the network. The storage traffic constitutes most of the
traffic in a vSAN cluster.
The cluster-related communication between all the ESXi hosts creates traffic in the vSAN cluster. This unicast traffic also
contributes to the vSAN network traffic.
Virtual Switch
vSAN supports the following types of virtual switches:
• The Standard Virtual Switch provides connectivity from VMs and VMkernel ports to external networks. This switch is
local to each ESXi host.
• A vSphere Distributed Switch provides central control of the virtual switch administration across multiple ESXi hosts. A
distributed switch also provides networking features such as Network I/O Control (NIOC) that can help you set Quality
of Service (QoS) levels on vSphere or virtual network. vSAN includes vSphere Distributed Switch irrespective of the
vCenter Server version.
Bandwidth
vSAN traffic can share physical network adapters with other system traffic types, such as vSphere vMotion traffic, vSphere
HA traffic, and virtual machine traffic. In shared network configurations where vSAN, vSphere management, vSphere
vMotion traffic, and so on, are on the same physical network, use vSphere Network I/O Control in the distributed switch
to guarantee the amount of bandwidth required for vSAN.
In vSphere Network I/O Control, you can configure reservation and shares for the vSAN outgoing traffic:
• Set a reservation so that Network I/O Control guarantees that a minimum bandwidth is available on the physical
adapter for vSAN.
• Set the share value to 100 so that when the physical adapter assigned for vSAN becomes saturated, certain bandwidth
is available to vSAN. For example, the physical adapter might become saturated when another physical adapter in the
team fails and all traffic in the port group is transferred to the other adapters in the team.
For information about using Network I/O Control to configure bandwidth allocation for vSAN traffic, see the vSphere
Networking documentation.

ESXi Traffic Types


ESXi hosts use different network traffic types to support vSAN.
Following are the different traffic types that you need to set up for vSAN.


Table 13: Network Traffic Types

Traffic Types Description

Management network The management network is the primary network interface that
uses a VMkernel TCP/IP stack to facilitate the host connectivity
and management. It can also handle the system traffic such as
vMotion, iSCSI, Network File System (NFS), Fibre Channel over
Ethernet (FCoE), and fault tolerance.
Virtual Machine network With virtual networking, you can network virtual machines and
build complex networks within a single ESXi host or across
multiple ESXi hosts.
vMotion network Traffic type that facilitates migration of VM from one host to
another. Migration with vMotion requires correctly configured
network interfaces on source and target hosts. Ensure that the
vMotion network is distinct from the vSAN network.
vSAN network A vSAN cluster requires the VMkernel network for the exchange of
data. Each ESXi host in the vSAN cluster must have a VMkernel
network adapter for the vSAN traffic. For more information, refer to
"Manually Enabling vSAN" in vSAN Planning and Deployment.

Network Requirements for vSAN


vSAN is a distributed storage solution that depends on the network for communication between hosts. Before deployment,
ensure that your vSAN environment has all the networking requirements.

Physical NIC Requirements


Network Interface Cards (NICs) used in vSAN hosts must meet certain requirements. vSAN works on 10 Gbps, 25 Gbps,
40 Gbps, 50 Gbps, and 100 Gbps networks.
Ensure your hosts meet the minimum NIC requirements for vSAN Original Storage Architecture (OSA) or vSAN Express
Storage Architecture (ESA).

Table 14: vSAN OSA Minimum NIC Requirements and Recommendations

Single Site vSAN Cluster
• Hybrid Cluster: 1 GbE NIC supported (Minimum); 10 GbE NIC supported (Recommended); NICs greater than 10 GbE supported.
• All-Flash Cluster: 1 GbE NIC not supported; 10 GbE NIC supported (Recommended); NICs greater than 10 GbE supported.
• Inter-node latency: Less than 1 ms RTT.
• Inter-site link bandwidth or latency: NA.
• Latency and bandwidth between nodes and vSAN witness hosts: NA.

vSAN Stretched Cluster (Hybrid or All-Flash Cluster)
• 1 GbE NIC not supported; 10 GbE NIC supported (Minimum); NICs greater than 10 GbE supported.
• Inter-node latency: Less than 1 ms RTT within each site.
• Inter-site link bandwidth or latency: Recommended is 10 GbE (workload dependent) and 5 ms RTT or less.
• Latency between nodes and vSAN witness hosts: Less than 200 ms RTT for up to 10 hosts per site; less than 100 ms RTT for 11–15 hosts per site.
• Bandwidth between nodes and vSAN witness hosts: 2 Mbps per 1000 components (maximum of 100 Mbps with 45 k components).

Two-Node vSAN Cluster
• Hybrid Cluster: 1 GbE NIC supported (up to 10 VMs); 10 GbE NIC supported (Recommended). All-Flash Cluster: 1 GbE NIC not supported; 10 GbE NIC supported (Minimum). NICs greater than 10 GbE supported.
• Inter-node latency: Less than 1 ms RTT within the same site.
• Inter-site link bandwidth or latency: Recommended is 10 GbE and 5 ms RTT or less.
• Latency between nodes and vSAN witness hosts: Less than 500 ms RTT.
• Bandwidth between nodes and vSAN witness hosts: 2 Mbps per 1000 components (maximum of 1.5 Mbps).

Table 15: vSAN ESA Minimum NIC Requirements and Recommendations

Single Site vSAN Cluster
• 1 GbE NIC not supported; 10 GbE NIC supported; NICs greater than 10 GbE supported.
• Inter-node latency: Less than 1 ms RTT.
• Inter-site link bandwidth or latency: NA.
• Latency and bandwidth between nodes and vSAN witness hosts: NA.

vSAN Stretched Cluster
• 1 GbE NIC not supported; 10 GbE NIC supported; NICs greater than 10 GbE supported.
• Inter-node latency: Less than 1 ms RTT within each site.
• Inter-site link bandwidth or latency: Minimum 10 GbE (workload dependent) and 5 ms RTT.
• Latency between nodes and vSAN witness hosts: Less than 200 ms RTT for up to 10 hosts per site; less than 100 ms RTT for 11–15 hosts per site.
• Bandwidth between nodes and vSAN witness hosts: 2 Mbps per 1000 components (maximum of 100 Mbps with 45 k components).

Two-Node vSAN Cluster
• 1 GbE NIC not supported; 10 GbE NIC supported; NICs greater than 10 GbE supported.
• Inter-node latency: Less than 1 ms RTT within the same site.
• Inter-site link bandwidth or latency: Recommended is 25 GbE and 5 ms RTT or less.
• Latency between nodes and vSAN witness hosts: Less than 500 ms RTT.
• Bandwidth between nodes and vSAN witness hosts: 2 Mbps per 1000 components (maximum of 1.5 Mbps).

NOTE
These NIC requirements assume a packet loss of no more than 0.0001% in hyper-converged environments. There can be
a drastic impact on vSAN performance if any of these thresholds are exceeded.
For more information about the vSAN stretched cluster NIC requirements, see vSAN Stretched Cluster Guide.

Bandwidth and Latency Requirements


To ensure high performance and availability, vSAN clusters must meet certain bandwidth and network latency
requirements.
The bandwidth requirements between the primary and secondary sites of a vSAN stretched cluster depend on the vSAN
workload, amount of data, and the way you want to handle failures. For more information, see VMware vSAN Design and
Sizing Guide.


Table 16: Bandwidth and Latency Requirements

Site to Site
• Bandwidth: vSAN OSA: minimum of 10 Gbps. vSAN ESA: minimum of 10 Gbps.
• Latency: Less than 5 ms latency RTT.

Site to Witness
• Bandwidth: 2 Mbps per 1000 vSAN components.
• Latency: Less than 500 ms latency RTT for 1 host per site; less than 200 ms latency RTT for up to 10 hosts per site; less than 100 ms latency RTT for 11-15 hosts per site.
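As a rough worked example of the site-to-witness sizing rule (the component count used here is hypothetical), a cluster with 15,000 vSAN components needs approximately 15 x 2 Mbps = 30 Mbps of bandwidth to the witness site, which is well below the 100 Mbps maximum that applies at 45 k components.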

Layer 2 and Layer 3 Support


VMware recommends Layer 2 connectivity between all vSAN hosts sharing the subnet.
vSAN also supports deployments using routed Layer 3 connectivity between vSAN hosts. You must consider the number
of hops and additional latency incurred while the traffic gets routed.

Table 17: Layer 2 and Layer 3 Support

Hybrid Cluster: L2 supported, L3 supported. L2 is recommended and L3 is supported.
All-Flash Cluster: L2 supported, L3 supported. L2 is recommended and L3 is supported.
vSAN Stretched Cluster Data: L2 supported, L3 supported. Both L2 and L3 between data sites are supported.
vSAN Stretched Cluster Witness: L2 not supported, L3 supported. L2 between data and witness sites is not supported.
Two-Node vSAN Cluster: L2 supported, L3 supported. Both L2 and L3 between data sites are supported.

Routing and Switching Requirements


All three sites in a vSAN stretched cluster communicate across the management network and across the vSAN network.
The VMs in all data sites communicate across a common virtual machine network.
Following are the vSAN stretched cluster routing requirements:

Table 18: Routing Requirements

Site to Site (Default deployment, Layer 2): Routing not required.
Site to Site (Default deployment, Layer 3): Use static routes or gateway override.
Site to Witness (Default deployment, Layer 3): Use static routes or gateway override.
Site to Witness (Witness Traffic Separation, Layer 3): Use static routes or gateway override when using an interface other than the Management (vmk0) interface.
Site to Witness (Witness Traffic Separation, Layer 2 for two-host cluster): Static routes are not required.

Virtual Switch Requirements


You can create a vSAN network with either vSphere Standard Switch or vSphere Distributed Switch. Use a distributed
switch to prioritize bandwidth for vSAN traffic. vSAN uses a distributed switch with all the vCenter Server versions.
The following table compares the advantages of a distributed switch with those of a standard switch:

Table 19: Virtual Switch Types

Availability: No impact with either the vSphere Distributed Switch or the vSphere Standard Switch. You can use either of the options.
Manageability: Positive impact with the distributed switch, negative impact with the standard switch. The distributed switch is centrally managed across all hosts, unlike the standard switch, which is managed on each host individually.
Performance: Positive impact with the distributed switch, negative impact with the standard switch. The distributed switch has added controls, such as Network I/O Control, which you can use to guarantee performance for vSAN traffic.
Recoverability: Positive impact with the distributed switch, negative impact with the standard switch. The distributed switch configuration can be backed up and restored; the standard switch does not have this functionality.
Security: Positive impact with the distributed switch, negative impact with the standard switch. The distributed switch has added built-in security controls to help protect traffic.

vSAN Network Port Requirements


vSAN deployments require specific network ports and settings to provide access and services.
vSAN sends messages on certain ports on each host in the cluster. Verify that the host firewalls allow traffic on these
ports. For the list of all supported vSAN ports and protocols, see the VMware Ports and Protocols portal at https://ports.broadcom.com/.

Firewall Considerations
When you enable vSAN on a cluster, all required ports are added to ESXi firewall rules and configured automatically.
There is no need for an administrator to open any firewall ports or enable any firewall services manually.


You can view open ports for incoming and outgoing connections. Select the ESXi host, and click Configure > Security
Profile.
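If you prefer to confirm the firewall state from the command line, one option is to list the ESXi firewall rulesets from an SSH or ESXi Shell session. This is only a quick check; ruleset names and the vSAN-related entries shown vary by release.
esxcli network firewall ruleset list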

Network Firewall Requirements


When you configure the network firewall, consider which version of vSAN you are deploying.
When you enable vSAN on a cluster, all required ports are added to ESXi firewall rules and configured automatically. You
do not need to open any firewall ports or enable any firewall services manually. You can view open ports for incoming and
outgoing connections in the ESXi host security profile (Configure > Security Profile).
vsanEncryption Firewall Rule
If your cluster uses vSAN encryption, consider the communication between hosts and the KMS server.
vSAN encryption requires an external Key Management Server (KMS). vCenter Server obtains the key IDs from the KMS,
and distributes them to the ESXi hosts. KMS servers and ESXi hosts communicate directly with each other. KMS servers
might use different port numbers, so the vsanEncryption firewall rule enables you to simplify communication between each
vSAN host and the KMS server. This allows a vSAN host to communicate directly to any port on a KMS server (TCP port
0 through 65535).
When a host establishes communication to a KMS server, the following operations occur.
• The KMS server IP is added to the vsanEncryption rule and the firewall rule is enabled.
• Communication between vSAN node and KMS server is established during the exchange.
• After communication between the vSAN node and the KMS server ends, the IP address is removed from
the vsanEncryption rule, and the firewall rule is deactivated again.
vSAN hosts can communicate with multiple KMS hosts using the same rule.

Using Unicast in vSAN Network


Unicast traffic refers to a one-to-one transmission from one point in the network to another. vSAN version 6.6 and later
uses unicast to simplify network design and deployment.
All ESXi hosts use the unicast traffic, and the vCenter Server becomes the source for the cluster membership. The vSAN
nodes are automatically updated with the latest host membership list that vCenter provides. vSAN communicates using
unicast for CMMDS updates.
Releases earlier than vSAN version 6.6 rely on multicast to enable the heartbeat and to exchange metadata between
hosts in the cluster. If some hosts in your vSAN cluster are running earlier versions of the software, a multicast network
is still required. The switch to unicast network from multicast provides better performance and network support. For more
information on multicast, see Using Multicast in vSAN Network.

Pre-Version 5 Disk Group Behavior


The presence of hosts or disk groups that predate on-disk version 5 in a vSAN version 6.6 cluster can cause the cluster to
revert to multicast communication.
vSAN version 6.6 clusters automatically revert to multicast communication in the following situations:
• All cluster hosts are running vSAN version 6.5 or lower.
• All disk groups are using on-disk version 3 or earlier.
• A non-vSAN 6.6 host such as vSAN 6.2 or vSAN 6.5 is added to the cluster.
For example, if a host running vSAN 6.5 or earlier is added to an existing vSAN 6.6 cluster, the cluster reverts to multicast
mode and includes the 6.5 host as a valid node. To avoid this behavior, use the latest version for both ESXi hosts and
on-disk format. To ensure that vSAN cluster continues communicating in unicast mode and does not revert to multicast,
upgrade the disk groups on the vSAN 6.6 hosts to on-disk version 5.0.
NOTE
Avoid having a mixed mode cluster where vSAN version 6.5 or earlier are available in the same cluster along
with vSAN version 6.6 or later.

Version 5 Disk Group Behavior


The presence of a single version 5 disk group in a vSAN version 6.6 cluster triggers the cluster to communicate
permanently in unicast mode.
In an environment where a vSAN 6.6 cluster is already using an on-disk version 5 and a vSAN 6.5 node is added to the
cluster, the following events occur:
• The vSAN 6.5 node forms its own network partition.
• The vSAN 6.5 node continues to communicate in multicast mode but is unable to communicate with vSAN 6.6 nodes
as they use unicast mode.
A cluster summary warning about the on-disk format appears, showing that one node is at an earlier version. You can
upgrade the node to the latest version. You cannot upgrade disk format versions while a cluster is in a mixed mode.

DHCP Support on Unicast Network


vCenter Server deployed on a vSAN 6.6 cluster can use IP addresses from Dynamic Host Configuration Protocol (DHCP)
without reservations.
You can use DHCP with reservations as the assigned IP addresses are tied to the MAC addresses of VMkernel ports.

IPv6 Support on Unicast Network


vSAN 6.6 supports IPv6 with unicast communications.
With IPv6, the link-local address is automatically configured on any interface using the link-local prefix. By default, vSAN
does not add the link local address of a node to other neighboring cluster nodes. As a result, vSAN 6.6 does not support
IPv6 link local addresses for unicast communications.

Query Unicast with ESXCLI


You can run ESXCLI commands to determine the unicast configuration.

View the Communication Modes


Using the esxcli vsan cluster get command, you can view the CMMDS mode (unicast or multicast) of the vSAN
cluster node.
Run the esxcli vsan cluster get command.

Cluster Information
Enabled: true
Current Local Time: 2020-04-09T18:19:52Z
Local Node UUID: 5e8e3dc3-43ab-5452-795b-a03d6f88f022
Local Node Type: NORMAL
Local Node State: AGENT
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: 5e8e3d3f-3015-9075-49b6-a03d6f88d426
Sub-Cluster Backup UUID: 5e8e3daf-e5e0-ddb6-a523-a03d6f88dd4a
Sub-Cluster UUID: 5282f9f3-d892-3748-de48-e2408dc34f72
Sub-Cluster Membership Entry Revision: 11
Sub-Cluster Member Count: 5
Sub-Cluster Member UUIDs: 5e8e3d3f-3015-9075-49b6-a03d6f88d426, 5e8e3daf-e5e0-ddb6-a523-a03d6f88dd4a, 5e8e3d73-6d1c-0b81-1305-a03d6f888d22, 5e8e3d33-5825-ee5c-013c-a03d6f88ea4c, 5e8e3dc3-43ab-5452-795b-a03d6f88f022
Sub-Cluster Member HostNames: testbed-1.vmware.com, testbed2.vmware.com, testbed3.vmware.com, testbed4.vmware.com, testbed5.vmware.com
Sub-Cluster Membership UUID: 0f438e5e-d400-1bb2-f4d1-a03d6f88d426
Unicast Mode Enabled: true
Maintenance Mode State: OFF
Config Generation: ed845022-5c08-48d0-aa1d-6b62c0022222 7 2020-04-08T22:44:14.889

Verify the vSAN Cluster Hosts


Use the esxcli vsan cluster unicastagent list command to verify whether the vSAN cluster hosts are
operating in unicast mode.
Run the esxcli vsan cluster unicastagent list command.

NodeUuid IsWitness Supports Unicast IP Address Port Iface Name Cert Thumbprint
SubClusterUuid
------------------------------------ --------- ---------------- ---------- ----- ----------
5e8e3d73-6d1c-0b81-1305-a03d6f888d22 0 true 10.198.95.10 12321
43:80:B7:A1:3F:D1:64:07:8C:58:01:2B:CE:A2:F5:DE:D6:B1:41:AB
5e8e3daf-e5e0-ddb6-a523-a03d6f88dd4a 0 true 10.198.94.240 12321
FE:39:D7:A5:EF:80:D6:41:CD:13:70:BD:88:2D:38:6C:A0:1D:36:69
5e8e3d3f-3015-9075-49b6-a03d6f88d426 0 true 10.198.94.244 12321
72:A3:80:36:F7:5D:8F:CE:B0:26:02:96:00:23:7D:8E:C5:8C:0B:E1
5e8e3d33-5825-ee5c-013c-a03d6f88ea4c 0 true 10.198.95.11 12321
5A:55:74:E8:5F:40:2F:2B:09:B5:42:29:FF:1C:95:41:AB:28:E0:57

The output includes the vSAN node UUID, IPv4 address, IPv6 address, UDP port with which vSAN node communicates,
and whether the node is a data host (0) or a witness host (1). You can use this output to identify the vSAN cluster nodes
that are operating in unicast mode and view the other hosts in the cluster. vCenter Server maintains the output list.

View the vSAN Network Information


Use the esxcli vsan network list command to view the vSAN network information such as the VMkernel interface
that vSAN uses for communication, the unicast port (12321), and the traffic type (vSAN or witness) associated with the
vSAN interface.
Run the esxcli vsan network list command.

Interface
VmkNic Name: vmk1
IP Protocol: IP
Interface UUID: e290be58-15fe-61e5-1043-246e962c24d0
Agent Group Multicast Address: 224.2.3.4
Agent Group IPv6 Multicast Address: ff19::2:3:4
Agent Group Multicast Port: 23451
Master Group Multicast Address: 224.1.2.3
Master Group IPv6 Multicast Address: ff19::1:2:3
Master Group Multicast Port: 12345
Host Unicast Channel Bound Port: 12321
Multicast TTL: 5
Traffic Type: vsan

This output also displays the multicast information.

Intra-Cluster Traffic
In unicast mode, the primary node addresses each cluster node individually, sending the same message to every vSAN
node in the cluster.
For example, if N is the number of vSAN nodes, the primary node sends each message N times. This results in a slight
increase of vSAN CMMDS traffic. You might not notice this slight increase during normal, steady-state operations.

Intra-Cluster Traffic in a Single Rack


If all the nodes in a vSAN cluster are connected to the same top-of-rack (TOR) switch, the total increase in traffic
is only between the primary node and the switch. If a vSAN cluster spans more than one TOR switch, traffic between
the switches increases. If a cluster spans many racks, multiple TORs form fault domains (FD) for rack awareness. The
primary node sends N messages to each rack or fault domain, where N is the number of hosts in that fault domain.


Intra-Cluster Traffic in a vSAN Stretched Cluster


In a vSAN stretched cluster, the primary node is located at the preferred site.
In a fault domain, CMMDS data must be communicated from the secondary site to the preferred site. To calculate the
traffic in a vSAN stretched cluster, multiply the number of nodes in the secondary site by the CMMDS node size (in MB)
and by the number of nodes in the secondary site.
Traffic in a vSAN stretched cluster = number of nodes in the secondary site * CMMDS node size (in MB) * number of
nodes in the secondary site.

With the unicast traffic, there is no change in the witness site traffic requirements.

Configuring IP Network Transport


Transport protocols provide communication services across the network. These services include the TCP/IP stack and
flow control.

vSphere TCP/IP Stacks


vSphere does not include a dedicated TCP/IP stack for the vSAN traffic service, and does not support the creation of a
custom vSAN TCP/IP stack. To ensure that vSAN traffic in Layer 3 network topologies leaves the host over the vSAN
VMkernel network interface, add the vSAN VMkernel network interface to the default TCP/IP stack and define static
routes for all hosts in the vSAN cluster.


NOTE
vSAN does not have its own TCP/IP stack. Use static routes to route vSAN traffic across L3 networks.
vSphere 6.0 introduced a new TCP/IP stack architecture, which can use multiple TCP/IP stacks to manage different
VMkernel network interfaces. With this architecture, you can configure traffic services such as vMotion, management, and
fault tolerance on isolated TCP/IP stacks, which can use multiple default gateways.
For network traffic isolation and security requirements, deploy the different traffic services onto different network segments
or VLANs. This prevents the different traffic services from traversing through the same default gateway.

When you configure the traffic services on separate TCP/IP stacks, deploy each traffic service type onto its own network
segment. The network segments are accessed through a physical network adapter with VLAN segmentation. Map each
segment to different VMkernel network interfaces with the respective traffic services enabled.

TCP/IP Stacks Available in vSphere


vSphere provides TCP/IP stacks that support vSAN traffic requirements.
• Default TCP/IP Stack. Manage the host-related traffic services. This stack shares a single default gateway between all
configured network services.
• vMotion TCP/IP Stack. Isolates vMotion traffic onto its own stack. The use of this stack completely removes or
deactivates vMotion traffic from the default TCP/IP stack.
• Provisioning TCP/IP Stack. Isolates some virtual machine-related operations, such as cold migrations, cloning,
snapshot, or NFC traffic.
• Mirror TCP/IP Stack. Separates port mirroring traffic from management traffic. Without this stack, mirror traffic is
bound to the default TCP/IP stack.
• Ops TCP/IP Stack. Provides support for vSphere network flow data collection.
You can select a different TCP/IP stack during the creation of a VMkernel interface.
Environments with isolated network requirements for the vSphere traffic services cannot use the same default gateway
to direct traffic. Using different TCP/IP stacks simplifies management for traffic isolation, because you can use different
default gateways and avoid adding static routes. Use this technique when you must route vSAN traffic to another network
that is not accessible over the default gateway.
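To see which TCP/IP stacks are instantiated on a host, you can list the available netstacks from the command line. The exact stack names shown depend on the ESXi release and on which services have been configured.
esxcli network ip netstack list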


vSphere RDMA
vSAN 7.0 Update 2 and later supports Remote Direct Memory Access (RDMA) communication.
RDMA allows direct memory access from the memory of one computer to the memory of another computer without
involving the operating system or CPU. The transfer of memory is offloaded to the RDMA-capable Host Channel Adapters
(HCA).
vSAN supports the RoCE v2 protocol. RoCE v2 requires a network configured for lossless operation that is free of
congestion. If your network has congestion, certain large I/O workloads might experience lower performance than TCP.
Each vSAN host must have a vSAN certified RDMA-capable NIC, as listed in the vSAN section of the VMware
Compatibility Guide. Use only the same model network adapters from the same vendor on each end of the connection.
NOTE
vSphere RDMA is not supported on vSAN stretched clusters, two-node vSAN clusters, vSAN Max, or Datastore
Sharing (HCI-mesh).
All hosts in the cluster must support RDMA. If any host loses RDMA support, the entire vSAN cluster switches to TCP.
vSAN with RDMA supports NIC failover, but does not support LACP or IP-hash-based NIC teaming.
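To confirm whether a host exposes RDMA-capable adapters before enabling vSAN over RDMA, you can list the RDMA devices from the command line. The device names in the output (vmrdma0, and so on) depend on the installed NICs and drivers.
esxcli rdma device list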

IPv6 Support
vSAN 6.2 and later supports IPv6.

vSAN supports the following IP versions.


• IPv4
• IPv6 (vSAN 6.2 and later)
• Mixed IPv4/IPv6 (vSAN 6.2 and later)
In releases earlier than vSAN 6.2, only IPv4 is supported. Use mixed mode when migrating your vSAN cluster from IPv4
to IPv6.
IPv6 multicast is also supported.
For more information about using IPv6, consult with your network vendor.

Static Routes
You can use static routes to allow vSAN network interfaces from hosts on one subnet to reach the hosts on another
network.
Most organizations separate the vSAN network from the management network, so the vSAN network does not have a
default gateway. In an L3 deployment, hosts that are on different subnets or different L2 segments cannot reach each
other over the default gateway, which is typically associated with the management network.
Use static routes to allow the vSAN network interfaces from hosts on one subnet to reach the vSAN networks on hosts on
the other network. Static routes instruct a host how to reach a particular network over an interface, rather than using the
default gateway.
The following example shows how to add an IPv4 static route to an ESXi host. Specify the gateway (-g) and the network (-
n) you want to reach through that gateway:
esxcli network ip route ipv4 add -g 172.16.10.253 -n 192.168.10.0/24

When the static routes have been added, vSAN traffic connectivity is available across all networks, assuming the physical
infrastructure allows it. Run the vmkping command to test and confirm communication between the different networks by
pinging the IP address or the default gateway of the remote network. You also can check different size packets (-s) and
prevent fragmentation (-d) of the packet.
vmkping -I vmk3 192.168.10.253
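For example, to test a jumbo-frame path without fragmentation, you can combine the packet size and don't-fragment options. The interface name and IP address shown here are placeholders; 8972 bytes allows for the ICMP and IP headers when the MTU is 9000.
vmkping -I vmk3 -s 8972 -d 192.168.10.253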

Jumbo Frames
vSAN fully supports jumbo frames on the vSAN network.
Jumbo frames are Ethernet frames with more than 1500 bytes of payload. Jumbo frames typically carry up to 9000 bytes
of payload, but variations exist.
Using jumbo frames can reduce CPU utilization and improve throughput.
NOTE
Enable jumbo frames support for vSAN Max deployments to improve performance.
You must decide whether these gains outweigh the overhead of implementing jumbo frames throughout the network. In
data centers where jumbo frames are already enabled in the network infrastructure, you can use them for vSAN. The
operational cost of configuring jumbo frames throughout the network might outweigh the limited CPU and performance
benefits.
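As a minimal sketch of how jumbo frames might be enabled from the command line on a standard switch and a vSAN VMkernel interface (the switch and interface names are examples; on a vSphere Distributed Switch, set the MTU in the switch settings in vCenter instead):
esxcli network vswitch standard set -v vSwitch1 -m 9000
esxcli network ip interface set -i vmk3 -m 9000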

Using VMware NSX with vSAN


vSAN and VMware NSX can be deployed and coexist in the same vSphere infrastructure.
NSX does not support the configuration of the vSAN data network over an NSX-managed VXLAN or Geneve overlay.
vSAN and NSX are compatible. vSAN and NSX are not dependent on each other to deliver their functionalities, resources,
and services.
One reason VMkernel traffic is not supported over the NSX-managed VxLAN overlay is to avoid any circular dependency
between the VMkernel networks and the VxLAN overlay that they support. The logical networks delivered with the NSX-
managed VxLAN overlay are used by virtual machines, which require network mobility and flexibility.
When you implement LACP/LAG in NSX, a Cisco Nexus environment defines the LAGs as virtual port channels (vPCs).

Using Congestion Control and Flow Control


Use flow control to manage the rate of data transfer between senders and receivers on the vSAN network. Congestion
control handles congestion in the network.

Flow Control
You can use flow control to manage the rate of data transfer between two devices.
Flow control is configured when two physically connected devices perform auto-negotiation.
An overwhelmed network node might send a pause frame to halt the transmission of the sender for a specified period. A
frame with a multicast destination address sent to a switch is forwarded out through all other ports of the switch. Pause
frames have a special multicast destination address that distinguishes them from other multicast traffic. A compliant switch
does not forward a pause frame. Frames sent to this range are meant to be acted upon only within the switch. Pause
frames have a limited duration, and expire after a time interval. Two computers that are connected through a switch never
send pause frames to each other, but can send pause frames to a switch.
One reason to use pause frames is to support network interface controllers (NICs) that do not have enough buffering to
handle full-speed reception. This problem is uncommon with advances in bus speeds and memory sizes.


Congestion Control
Congestion control helps you control the traffic on the network.
Congestion control applies mainly to packet switching networks. Network congestion within a switch might be caused by
overloaded inter-switch links. If inter-switch links overload the capability on the physical layer, the switch introduces pause
frames to protect itself.

Priority Flow Control


Priority-based flow control (PFC) helps you eliminate frame loss due to congestion.
Priority-based flow control (IEEE 802.1Qbb) is achieved by a mechanism similar to pause frames, but operates on
individual priorities. PFC is also called Class-Based Flow Control (CBFC) or Per Priority Pause (PPP).

Flow Control and Congestion Control


Flow control is an end-to-end mechanism that controls the traffic between a sender and a receiver. Flow control occurs in
the data link layer and the transport layer.
Congestion control is used by a network to control congestion in the network. This problem is not as common in modern
networks with advances in bus speeds and memory sizes. A more likely scenario is network congestion within a switch.
Congestion Control is handled by the network layer and the transport layer.

Flow Control Design Considerations


By default, flow control is enabled on all network interfaces in ESXi hosts.
Flow control configuration on a NIC is done by the driver. When a NIC is overwhelmed by network traffic, the NIC sends
pause frames.
Flow control mechanisms such as pause frames can trigger overall latency in the VM guest I/O due to increased latency
at the vSAN network layer. Some network drivers provide module options that configure flow control functionality within
the driver. Some network drivers enable you to modify the configuration options using the ethtool command-line utility
on the console of the ESXi host. Use module options or ethtool, depending on the implementation details of a given
driver.
For information about configuring flow control on ESXi hosts, see VMware KB 1013413.
In deployments with 1 Gbps, leave flow control enabled on ESXi network interfaces (default). If pause frames are a
problem, carefully plan disabling flow control in conjunction with Hardware Vendor Support or VMware Global Support
Services.
To learn how you can recognize the presence of pause frames being sent from a receiver to an ESXi host, see
Troubleshooting the vSAN Network. A large number of pause frames in an environment usually indicates an underlying
network or transport issue to investigate.
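To check the current pause-frame settings on an uplink, you can query the NIC from the ESXi command line. Depending on the driver and ESXi release, one or both of the following commands report the receive and transmit pause configuration; vmnic0 is an example name.
esxcli network nic pauseParams list
ethtool -a vmnic0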

Basic NIC Teaming, Failover, and Load Balancing


Many vSAN environments require some level of network redundancy.
You can use NIC teaming to achieve network redundancy. You can configure two or more network adapters (NICs)
as a team for high availability and load balancing. Basic NIC teaming is available with vSphere networking, and these
techniques can affect vSAN design and architecture.
Several NIC teaming options are available. Avoid NIC teaming policies that require physical switch configurations, or
that require an understanding of networking concepts such as Link Aggregation. Best results are achieved with a basic,
simple, and reliable setup.
If you are not sure about NIC teaming options, use an Active/Standby configuration with explicit failover.


Basic NIC Teaming


Basic NIC teaming uses multiple physical uplinks, one vmknic, and a single switch.
vSphere NIC teaming uses multiple uplink adapters, called vmnics, which are associated with a single virtual switch
to form a team. This is the most basic option, and you can configure it using a standard vSphere standard switch or a
vSphere distributed switch.

Failover and Redundancy


vSAN can use the basic NIC teaming and failover policy provided by vSphere.
NIC teaming on a vSwitch can have multiple active uplinks, or an Active/Standby uplink configuration. Basic NIC teaming
does not require any special configuration at the physical switch layer.
NOTE
vSAN does not use NIC teaming for load balancing.
A typical NIC teaming configuration has the following settings. When working on distributed switches, edit the settings of
the distributed port group used for vSAN traffic.
• Load balancing: Route based on originating virtual port
• Network failure detection: Link status only
• Notify switches: Yes
• Failback: Yes
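As a minimal sketch, these settings can also be applied from the command line to a standard switch port group. The port group name is an example; on a distributed switch, edit the distributed port group policies in vCenter instead.
esxcli network vswitch standard portgroup policy failover set --portgroup-name="vSAN" --load-balancing=portid --failure-detection=link --notify-switches=true --failback=true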


Configure Load Balancing for NIC Teams


Several load-balancing techniques are available for NIC teaming, and each technique has its pros and cons.

Route Based on Originating Virtual Port


In Active/Active or Active/Passive configurations, use Route based on originating virtual port for basic NIC teaming.
When this policy is in effect, only one physical NIC is used per VMkernel port.
Pros
• This is the simplest NIC teaming method that requires minimal physical switch configuration.
• This method requires only a single port for vSAN traffic, which simplifies troubleshooting.
Cons
• A single VMkernel interface is limited to a single physical NIC's bandwidth. As typical vSAN environments use one
VMkernel adapter, only one physical NIC in the team is used.

Route Based on Physical NIC Load


Route Based on Physical NIC Load is based on Route Based on Originating Virtual Port, where the virtual switch
monitors the actual load of the uplinks and takes steps to reduce load on overloaded uplinks. This load-balancing method
is available only with a vSphere Distributed Switch, not on vSphere Standard Switches.
The distributed switch calculates uplinks for each VMkernel port by using the port ID and the number of uplinks in the NIC
team. The distributed switch checks the uplinks every 30 seconds, and if the load exceeds 75 percent, the port ID of the
VMkernel port with the highest I/O is moved to a different uplink.
Pros
• No physical switch configuration is required.
• Although vSAN has one VMkernel port, the same uplinks can be shared by other VMkernel ports or network services.
vSAN can benefit by using different uplinks from other contending services, such as vMotion or management.
Cons
• As vSAN typically only has one VMkernel port configured, its effectiveness is limited.
• The ESXi VMkernel reevaluates the traffic load after each time interval, which can result in processing overhead.

Settings: Network Failure Detection


Use the default setting: Link status only. Do not use Beacon probing for link failure detection. Beacon probing requires at
least three physical NICs to avoid split-brain scenarios. For more details, see VMware KB 1005577.

Settings: Notify Switches


Use the default setting: Yes. Physical switches have MAC address forwarding tables to associate each MAC address
with a physical switch port. When a frame comes in, the switch determines the destination MAC address in the table and
decides the correct physical port.
If a NIC failover occurs, the ESXi host must notify the network switches that something has changed, or the physical
switch might continue to use the old information and send the frames to the wrong port.
When you set Notify Switches to Yes, if one physical NIC fails and traffic is rerouted to a different physical NIC in the
team, the virtual switch sends notifications over the network to update the lookup tables on physical switches.
This setting does not catch VLAN misconfigurations, or uplink losses that occur further upstream in the network. The
vSAN network partitions health check can detect these issues.


Settings: Failback
This option determines how a physical adapter is returned to active duty after recovering from a failure. A failover event
triggers the network traffic to move from one NIC to another. When a link up state is detected on the originating NIC,
traffic automatically reverts to the original network adapter when Failback is set to Yes. When Failback is set to No, a
manual failback is required.
Setting Failback to No can be useful in some situations. For example, after a physical switch port recovers from a failure,
the port might be active but can take several seconds to begin forwarding traffic. Automatic Failback has been known to
cause problems in certain environments that use the Spanning Tree Protocol. For more information about Spanning Tree
Protocol (STP), see VMware KB 1003804.

Setting Failover Order


Failover order determines which links are active during normal operations, and which links are active in the event of a
failover. Different supported configurations are possible for the vSAN network.
Active/Standby uplinks: If a failure occurs on an Active/Standby setup, the NIC driver notifies vSphere of a link down
event on Uplink 1. The standby Uplink 2 becomes active, and traffic resumes on Uplink 2.
Active/Active uplinks: If you set the failover order to Active/Active, the virtual port used by vSAN traffic still cannot use both
physical ports at the same time. Because both Uplink 1 and Uplink 2 are already active, no standby uplink needs to become
active after a failure.
NOTE
When using an Active/Active configuration, ensure that Failback is set to No. For more information, see VMware
KB 2072928.
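For reference, an Active/Standby failover order can be set on a standard switch port group from the command line as follows. The port group and vmnic names are examples; on a distributed switch, configure the failover order on the distributed port group in vCenter.
esxcli network vswitch standard portgroup policy failover set --portgroup-name="vSAN" --active-uplinks=vmnic1 --standby-uplinks=vmnic2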

Advanced NIC Teaming


You can use advanced NIC teaming methods with multiple VMkernel adapters to configure the vSAN network. If you use
Link Aggregation Protocol (LAG/LACP), the vSAN network can be configured with a single VMkernel adapter.
You can use advanced NIC teaming to implement an air gap, so a failure that occurs on one network path does not impact
the other network path. If any part of one network path fails, the other network path can carry the traffic. Configure multiple
VMkernel NICs for vSAN on different subnets, such as another VLAN or separate physical network fabric.
vSphere and vSAN do not support multiple VMkernel adapters (vmknics) on the same subnet. For more details, see
VMware KB 2010877.


Link Aggregation Group Overview


By using the LACP protocol, a network device can negotiate an automatic bundling of links by sending LACP packets to a
peer.
Link Aggregation allows one or more links to be aggregated together to form a Link Aggregation Group (LAG).
LAG can be configured as either static (manual) or dynamic by using LACP to negotiate the LAG formation. LACP can be
configured as follows:
Active
Devices immediately send LACP messages when the port comes up. End devices with LACP enabled (for example, ESXi
hosts and physical switches) send and receive frames called LACP messages to each other to negotiate the creation of a
LAG.
Passive
Devices place a port into a passive negotiating state, in which the port only responds to received LACP messages but does not
initiate negotiation.

NOTE
If the host and switch are both in passive mode, the LAG does not initialize, because an active part is required to
trigger the linking. At least one must be Active.
In vSphere 5.5 and later releases, this functionality is called Enhanced LACP. This functionality is only supported on
vSphere Distributed Switch version 5.5 or later.
For more information about LACP support on a vSphere Distributed Switch, see the vSphere 6 Networking
documentation.
NOTE
The number of LAGs you can use depends on the capabilities of the underlying physical environment and the
topology of the virtual network.
For more information about the different load-balancing options, see KB 2051826.


Static and Dynamic Link Aggregation


You can use LACP to combine and aggregate multiple network connections.
When LACP is in active or dynamic mode, a physical switch sends LACP messages to network devices, such as ESXi
hosts, to negotiate the creation of a Link Aggregation Group (LAG).
To configure Link Aggregation on hosts using vSphere Standard Switches (and pre-5.5 vSphere Distributed Switches),
configure a static channel-group on the physical switch. See your vendor documentation for more details.

Pros and Cons of Dynamic Link Aggregation


Consider the tradeoffs to using Dynamic Link Aggregation.
Pros
• Improves performance and bandwidth. One vSAN host or VMkernel port can communicate with many other vSAN hosts using many different load-balancing options.
• Provides network adapter redundancy. If a NIC fails and the link-state fails, the remaining NICs in the team continue to pass traffic.
• Improves traffic balancing. Balancing of traffic after failures is automatic and fast.
Cons
• Less flexible. Physical switch configuration requires that physical switch ports be configured in a port-channel configuration.
• More complex. Use of multiple switches to produce a full physical redundancy configuration is complex. Vendor-specific implementations add to the complexity.


Static LACP with Route Based on IP Hash


You can create a vSAN 6.6 cluster using static LACP with an IP-hash policy. This section focuses on vSphere Standard
Switches, but you also can use vSphere Distributed Switches.
Select the Route based on IP Hash load balancing policy at the vSwitch or port-group level. Set all uplinks assigned to the
static channel group to the Active Uplink position in the Teaming and Failover policies at the virtual switch or port-group
level. The number of ports in the port-channel must be the same as the number of uplinks in the team.

Pros and Cons of Static LACP with IP Hash


Consider the tradeoffs to using Static LACP with IP Hash.
Pros
• Improves performance and bandwidth. One vSAN host or VMkernel port can communicate with many other vSAN
hosts using the IP Hash algorithm.
• Provides network adapter redundancy. If a NIC fails and the link-state fails, the remaining NICs in the team continue
to pass traffic.
• Adds flexibility. You can use IP Hash with both vSphere Standard Switches and vSphere Distributed Switches.


Cons
• Physical switch configuration is less flexible. Physical switch ports must be configured in a static port-channel
configuration.
• Increased chance of misconfiguration. Static port-channels form without any verification on either end (unlike LACP
dynamic port-channel).
• More complex. Introducing full physical redundancy configuration increases complexity when multiple switches are
used. Implementations can become quite vendor specific.
• Limited load balancing. If your environment has only a few IP addresses, the virtual switch might consistently pass
the traffic through one uplink in the team. This can be especially true for small vSAN clusters.

Understanding Network Air Gaps


You can use advanced NIC teaming methods to create an air-gap storage fabric. Two storage networks are used to create
a redundant storage network topology, with each storage network physically and logically isolated from the other by an air
gap.
You can configure a network air gap for vSAN in a vSphere environment. Configure multiple VMkernel ports per vSAN
host. Associate each VMkernel port to dedicated physical uplinks, using either a single vSwitch or multiple virtual
switches, such as vSphere Standard Switch or vSphere Distributed Switch.

Typically, each uplink must be connected to fully redundant physical infrastructure.


This topology is not ideal. The failure of components such as NICs on different hosts that reside on the same network can
lead to interruption of storage I/O. To avoid this problem, implement physical NIC redundancy on all hosts and all network
segments. Configuration example 2 addresses this topology in detail.
These configurations are applicable to both L2 and L3 topologies, with both unicast and multicast configurations.

Pros and Cons of Air Gap Network Configurations with vSAN


Network air gaps can be useful to separate and isolate vSAN traffic. Use caution when configuring this topology.
Pros
• Physical and logical separation of vSAN traffic.


Cons
• vSAN does not support multiple VMkernel adapters (vmknics) on the same subnet. For more information, see VMware
KB 2010877.
• Setup is complex and error prone, so troubleshooting is more complex.
• Network availability is not guaranteed with multiple vmknics in some asymmetric failures, such as one NIC failure on
one host and another NIC failure on another host.
• Load-balanced vSAN traffic across physical NICs is not guaranteed.
• Costs increase for vSAN hosts, as you might need multiple VMkernel adapters (vmknics) to protect multiple physical
NICs (vmnics). For example, 2 x 2 vmnics might be required to provide redundancy for two vSAN vmknics.
• Required logical resources are doubled, such as VMkernel ports, IP addresses, and VLANs.
• vSAN does not implement port binding. This means that techniques such as multi-pathing are not available.
• Layer 3 topologies are not suitable for vSAN traffic with multiple vmknics. These topologies might not function as
expected.
• Command-line host configuration might be required to change vSAN multicast addresses.
Dynamic LACP combines, or aggregates, multiple network connections in parallel to increase throughput and provide
redundancy. When NIC teaming is configured with LACP, load balancing of the vSAN network across multiple uplinks
occurs. This load balancing happens at the network layer, and is not done through vSAN.
NOTE
Other terms sometimes used to describe link aggregation include port trunking, link bundling, Ethernet/network/
NIC bonding, EtherChannel.
This section focuses on Link Aggregation Control Protocol (LACP). The IEEE standard is 802.3ad, but some vendors
have proprietary LACP features, such as PAgP (Port Aggregation Protocol). Follow the best practices recommended by
your vendor.
NOTE
The LACP support introduced in vSphere Distributed Switch 5.1 only supports IP-hash load balancing. vSphere
Distributed Switch 5.5 and later fully support LACP.
LACP is an industry standard that uses port-channels. Many hashing algorithms are available. The vSwitch port-group
policy and the port-channel configuration must agree and match.

NIC Teaming Configuration Examples


The following NIC teaming configurations illustrate typical vSAN networking scenarios.

Configuration 1: Single vmknic, Route Based on Physical NIC Load


You can configure basic Active/Active NIC Teaming with the Route based on Physical NIC Load policy for vSAN hosts.
Use a vSphere Distributed Switch (vDS).
For this example, the vDS must have two uplinks configured for each host. A distributed port group is designated for vSAN
traffic and isolated to a specific VLAN. Jumbo frames are already enabled on the vDS with an MTU value of 9000.
Configure teaming and failover for the distributed port group for vSAN traffic as follows:
• Load balancing policy set to Route Based on Physical Nic Load.
• Network failure detection set to Link status only.
• Notify Switches set to Yes.
• Failback set to No. You can set Failback to Yes, but not for this example.
• Ensure both uplinks are in the Active uplinks position.


Network Uplink Redundancy Lost


When the link down state is detected, the workload switches from one uplink to another. There is no noticeable impact to
the vSAN cluster and VM workload.

Recovery and Failback


When you set Failback to No, traffic is not promoted back to the original vmnic. If Failback is set to Yes, traffic is
promoted back to the original vmnic on recovery.

Load Balancing
Since this is a single VMkernel NIC, there is no performance benefit to using Route based on physical load.
Only one physical NIC is in use at any time. The other physical NIC is idle.

Configuration 2: Multiple vmknics, Route Based on Originating Port ID


You can use two non-routable VLANs that are logically and physically separated, to produce an air-gap topology.
This example provides configuration steps for a vSphere distributed switch, but you also can use vSphere standard
switches. It uses two 10 Gb physical NICs and logically separates them on the vSphere networking layer.
Create two distributed port groups for each vSAN VMkernel vmknic. Each port group has a separate VLAN tag. For vSAN
VMkernel configuration, two IP addresses on both VLANs are required for vSAN traffic.
NOTE
Practical implementations typically use four physical uplinks for full redundancy.


For each port group, the teaming and failover policy use the default settings.
• Load balancing set to Route based on originating port ID
• Network failure detection set to Link Status Only
• Notify Switches set to the default value of Yes
• Failback set to the default value of Yes
• The uplink configuration has one uplink in the Active position and one uplink in the Unused position.
One network is completely isolated from the other network.

vSAN Port Group 1


This example uses a distributed port group called vSAN-DPortGroup-1. VLAN 3266 is tagged for this port group with the
following Teaming and Failover policy:
• Traffic on the port group tagged with VLAN 3266
• Load balancing set to Route based on originating port ID
• Network failure detection set to Link Status Only
• Notify Switches set to default value of Yes
• Failback set to default value of Yes
• The uplink configuration has Uplink 1 in the Active position and Uplink 2 in the Unused position.

vSAN Port Group 2


To complement vSAN port group 1, configure a second distributed port group called vSAN-portgroup-2, with the following
differences:
• Traffic on the port group tagged with VLAN 3265
• The uplink configuration has Uplink 2 in the Active position and Uplink 1 in the Unused position.

vSAN VMkernel Port Configuration


Create two vSAN VMkernel interfaces, one on each port group. In this example, the VMkernel interfaces are named vmk1
and vmk2.
• vmk1 is associated with VLAN 3266 (172.40.0.xx), and therefore with port group vSAN-DPortGroup-1.
• vmk2 is associated with VLAN 3265 (192.60.0.xx), and therefore with port group vSAN-DPortGroup-2.
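Assuming the two VMkernel interfaces have already been created on their respective port groups, the following commands show one way to tag them for vSAN traffic and verify the result from the ESXi command line:
esxcli vsan network ip add -i vmk1
esxcli vsan network ip add -i vmk2
esxcli vsan network list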


Load Balancing
vSAN has no load balancing mechanism to differentiate between multiple vmknics, so the vSAN I/O path chosen is not
deterministic across physical NICs. The vSphere performance charts show that one physical NIC is often more utilized
than the other. A simple I/O test performed in our labs, using 120 VMs with a 70:30 read/write ratio and a 64K block size
on a four-host all-flash vSAN cluster, revealed an unbalanced load across NICs, as shown in the vSphere performance
graphs.

Network Uplink Redundancy Lost


Consider a network failure introduced in this configuration, where vmnic1 goes down on a given vSAN host. As a result,
port vmk2 is impacted. A failing NIC triggers both network connectivity alarms and redundancy alarms.
For vSAN, this failover process triggers approximately 10 seconds after CMMDS (Cluster Monitoring, Membership,
and Directory Services) detects a failure. During failover and recovery, vSAN stops any active connections on the failed
network, and attempts to re-establish connections on the remaining functional network.
Since two separate vSAN VMkernel ports communicate on isolated VLANs, vSAN health check failures might be
triggered. This is expected as vmk2 can no longer communicate to its peers on VLAN 3265.
The performance charts show that the affected workload has restarted on vmnic0, since vmnic1 has a failure. This test
illustrates an important distinction between vSphere NIC teaming and this topology. vSAN attempts to re-establish or
restart connections on the remaining network.
However, in some failure scenarios, recovering the impacted connections might require up to 90 seconds to complete,
due to ESXi TCP connection timeout. Subsequent connection attempts might fail, but connection attempts time out at 5
seconds, and the attempts rotate through all possible IP addresses. This behavior might affect virtual machine guest I/O.
As a result, application and virtual machine I/O might have to be retried.
For example, on Windows Server 2012 VMs, Event IDs 153 (device reset) and 129 (retry events) might be logged during
the failover and recovery process. In the example, event ID 129 was logging for approximately 90 seconds until the I/O
was recovered.


You might have to modify disk timeout settings of some guest OSes to ensure that they are not severely impacted. Disk
timeout values might vary, depending on the presence of VMware Tools, and the specific guest OS type and version. For
more information about changing guest OS disk timeout values, go to VMware KB 1009465.

Recovery and Failback


When the network is repaired, workloads are not automatically rebalanced unless another failure forces them to move.
As soon as the impacted network is recovered, it becomes available for new TCP connections.

Configuration 3: Dynamic LACP


You can configure a two-port LACP port channel on a switch and a two-uplink Link Aggregation Group on a vSphere
distributed switch.
In this example, use 10Gb networking with two physical uplinks per server.
NOTE
vSAN over RDMA does not support this configuration.

Configure the Network Switch


Configure the network switch with the following settings.
• Identify the ports in question where the vSAN host will connect.
• Create a port channel.
• If using VLANs, then trunk the correct VLAN to the port channel.
• Configure the desired distribution or load-balancing options (hash).
• Set the LACP mode to active/dynamic.
• Verify MTU configuration.

Configure vSphere
Configure the vSphere network with the following settings.
• Configure vDS with the correct MTU.
• Add hosts to vDS.
• Create a LAG with the correct number of uplinks and matching attributes to port channel.
• Assign physical uplinks to the LAG.
• Create a distributed port group for vSAN traffic and assign correct VLAN.
• Configure VMkernel ports for vSAN with correct MTU.
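After the hosts are added to the vDS and the LAG is configured as listed above, you can verify the LACP negotiation from each host. A brief verification sketch, assuming the LAG has been created on the vDS as described:
• Check the LACP configuration pushed to the host from the vDS.
esxcli network vswitch dvs vmware lacp config get
• Check the runtime status of the LAG and its member uplinks.
esxcli network vswitch dvs vmware lacp status get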

Set Up the Physical Switch


Configure the physical switch with the following settings. For guidance about how to set up this configuration on Dell
servers, refer to: http://www.dell.com/Support/Article/us/en/19/HOW10364.
Configure a two uplink LAG:
• Use switch ports 36 and 18.
• This configuration uses VLAN trunking, so port channel is in VLAN trunk mode, with the appropriate VLANs trunked.
• Use the following method for load-balancing or load distribution: Source and destination IP addresses, TCP/UDP
port and VLAN
• Verify that the LACP mode is Active (Dynamic).


Use the following commands to configure an individual port channel on a Dell switch:
• Create a port-channel.
# interface port-channel 1
• Set port-channel to VLAN trunk mode.
# switchport mode trunk
• Allow VLAN access.
# switchport trunk allowed vlan 3262
• Configure the load balancing option.
# hashing-mode 6
• Assign the correct ports to the port-channel and set the mode to Active.
• Verify that the port channel is configured correctly.
# show interfaces port-channel 1
Channel Ports Ch-Type Hash Type Min-links Local Prf
------- ----------------------------- -------- --------- --------- ---------
Po1 Active: Te1/0/36, Te1/0/18 Dynamic 6 1 Disabled
Hash Algorithm Type
1 - Source MAC, VLAN, EtherType, source module and port Id
2 - Destination MAC, VLAN, EtherType, source module and port Id
3 - Source IP and source TCP/UDP port
4 - Destination IP and destination TCP/UDP port
5 - Source/Destination MAC, VLAN, EtherType, source MODID/port
6 - Source/Destination IP and source/destination TCP/UDP port
7 - Enhanced hashing mode
# interface range Te1/0/36, Te1/0/18
# channel-group 1 mode active
Full configuration:
# interface port-channel 1
# switchport mode trunk
# switchport trunk allowed vlan 3262
# hashing-mode 6
# exit
# interface range Te1/0/36,Te1/0/18
# channel-group 1 mode active
# show interfaces port-channel 1
NOTE
Repeat this procedure on all participating switch ports that are connected to vSAN hosts.

Set Up vSphere Distributed Switch


Before you begin, make sure that the vDS is at a version that supports LACP. To verify, right-click the vDS and check
whether the Upgrade option is available; if it is, upgrade the vDS before continuing.


Create LAG on vDS


To create a LAG on a distributed switch, select the vDS, click the Configure tab, and select LACP. Add a new LAG.

Configure the LAG with the following properties:


• LAG name: lag1
• Number of ports: 2 (to match port channel on switch)
• Mode: Active, to match the physical switch.
• Load balancing mode: Source and destination IP addresses, TCP/UDP port and VLAN

Add Physical Uplinks to LAG


vSAN hosts have been added to the vDS. Assign the individual vmnics to the appropriate LAG ports.
• Right click the vDS, and select Add and Manage Hosts…
• Select Manage Host Networking, and add your attached hosts.
• On Manage Physical Adapters, select the appropriate adapters and assign them to the LAG port.
• Migrate vmnic0 from Uplink 1 position to port 0 on LAG1.
Repeat the procedure for vmnic1 to the second LAG port position, lag1-1.


Configure Distributed Port Group Teaming and Failover Policy


Assign the LAG group as an Active uplink on distributed port group teaming and failover policy. Select or create the
designated distributed port group for vSAN traffic. This configuration uses a vSAN port group called vSAN with VLAN ID
3262 tagged. Edit the port group, and configure Teaming and Failover Policy to reflect the new LAG configuration.
Ensure the LAG group lag1 is in the active uplinks position, and ensure the remaining uplinks are in the Unused position.
NOTE
When a link aggregation group (LAG) is selected as the only active uplink, the load-balancing mode of the LAG
overrides the load-balancing mode of the port group. Therefore, the following policy plays no role: Route based
on originating virtual port.

Create the VMkernel Interfaces


The final step is to create the VMkernel interfaces to use the new distributed port group, ensuring that they are tagged for
vSAN traffic. Observe that each vSAN vmknic can communicate over vmnic0 and vmnic1 on a LAG group to provide load
balancing and failover.


Configure Load Balancing


From a load balancing perspective, there is not a consistent balance of traffic across all hosts on all vmnics in this LAG
setup, but there is more consistency compared to Route based on physical NIC load used in Configuration 1 and the
air-gapped/multiple vmknics method used in Configuration 2.
The individual hosts’ vSphere performance graph shows improved load balancing.

Network Uplink Redundancy Lost


When vmnic1 is disabled on a given vSAN host, a Network Redundancy alarm is triggered.
No vSAN health alarms are triggered, and the impact to Guest I/O is minimal compared to the air-gapped, multi-vmknics
configuration. This configuration does not have to stop any TCP sessions with LACP configured.

Recovery and Failback


In a failback scenario, the behavior differs between Load Based Teaming, multiple vmknics, and LACP in a vSAN
environment. After vmnic1 recovers, traffic is automatically balanced across both active uplinks. This behavior can be
advantageous for vSAN traffic.

Failback Set to Yes or No?


A LAG load-balancing policy overrides the Teaming and Failover policy for vSphere distributed port groups, so also
consider the guidance on the Failback value. Lab tests show no discernible behavior differences between Failback set
to Yes or No with LACP, because the LAG settings take priority over the port-group settings.
NOTE
Network failure detection values remain as link status only, since beacon probing is not supported with LACP.
See VMware KB Understanding IP Hash load balancing (2006129)

Configuration 4: Static LACP – Route Based on IP Hash


You can use a two-port LACP static port-channel on a switch, and two active uplinks on a vSphere Standard Switch.


In this configuration, use 10Gb networking with two physical uplinks per server. A single VMkernel interface (vmknic) for
vSAN exists on each host.
For more information about host requirements and configuration examples, see the following VMware Knowledge Base
articles:
• Host requirements for link aggregation for ESXi and ESX (1001938)
• Sample configuration of EtherChannel / Link Aggregation Control Protocol (LACP) with ESXi/ESX and Cisco/HP
switches (KB 1004048)
NOTE
vSAN over RDMA does not support this configuration.

Configure the Physical Switch


Configure a two-uplink static port-channel as follows:
• Switch ports 43 and 44
• VLAN trunking, so port-channel is in VLAN trunk mode, with the appropriate VLANs trunked.
• Do not specify the load-balancing policy on the port-channel group.
These steps can be used to configure an individual port-channel on the switch:
Step 1: Create a port-channel.
#interface port-channel 13
Step 2: Set port-channel to VLAN trunk mode.
#switchport mode trunk
Step 3: Allow appropriate VLANs.
#switchport trunk allowed vlan 3266
Step 4: Assign the correct ports to the port-channel and set the mode to on (static).
#interface range Te1/0/43, Te1/0/44
#channel-group 13 mode on
Step 5: Verify that the port-channel is configured as a static port-channel.
#show interfaces port-channel 13

Channel Ports Ch-Type Hash Type Min-links Local Prf


------- ----------------------------- -------- --------- --------- --
Po13 Active: Te1/0/43, Te1/0/44 Static 7 1 Disabled
Hash Algorithm Type
1 - Source MAC, VLAN, EtherType, source module and port Id
2 - Destination MAC, VLAN, EtherType, source module and port Id
3 - Source IP and source TCP/UDP port
4 - Destination IP and destination TCP/UDP port
5 - Source/Destination MAC, VLAN, EtherType, source MODID/port
6 - Source/Destination IP and source/destination TCP/UDP port
7 - Enhanced hashing mode


Configure vSphere Standard Switch


This example assumes you understand the configuration and creation of vSphere Standard Switches.
This example uses the following configuration:
• Identical vSAN hosts
• Uplinks named vmnic0 and vmnic1
• VLAN 3266 trunked to the switch ports and port-channel
• Jumbo frames
On each host, create a vSwitch1 with MTU set to 9000, and vmnic0 and vmnic1 added to the vSwitch. On the Teaming
and Failover Policy, set both adapters to the Active position. Set the Load Balancing Policy to Route Based on IP Hash.
Configure teaming and failover for the port group for vSAN traffic as follows:
• Load balancing policy set to Route Based on IP hash.
• Network failure detection set to Link status only.
• Notify Switches set to Yes.
• Failback set to Yes.
• Ensure both uplinks are in the Active uplinks position.
Use defaults for network detection, Notify Switches and Failback. All port groups inherit the Teaming and Failover Policy
that was set at the vSwitch level. You can override individual port group teaming and failover policies to differ from the
parent vSwitch, but make sure you use the same set of uplinks for IP hash load balancing for all port groups.
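The same standard switch configuration can also be applied from the ESXi command line. This is a minimal sketch, assuming the vSwitch name vSwitch1 and the uplinks vmnic0 and vmnic1 used in this example; the vSAN port group and VMkernel interface are created afterward in the usual way.
• Create the standard switch and set the MTU for jumbo frames.
esxcli network vswitch standard add -v vSwitch1
esxcli network vswitch standard set -v vSwitch1 -m 9000
• Add both physical uplinks to the switch.
esxcli network vswitch standard uplink add -v vSwitch1 -u vmnic0
esxcli network vswitch standard uplink add -v vSwitch1 -u vmnic1
• Set both uplinks active and the load-balancing policy to IP hash.
esxcli network vswitch standard policy failover set -v vSwitch1 -a vmnic0,vmnic1 -l iphash
• Verify the teaming and failover policy.
esxcli network vswitch standard policy failover get -v vSwitch1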

Configure Load Balancing


Although both physical uplinks are utilized, there is not a consistent balance of traffic across all physical vmnics. The
figure shows that the only active traffic is vSAN traffic, which is essentially four vmknics or IP addresses. The behavior
might be caused by the low number of IP addresses and possible hashes. However, in some situations, the virtual switch
might consistently pass the traffic through one uplink in the team. For further details on the IP Hash algorithm, see the
official vSphere documentation about Route Based on IP Hash.


Network Redundancy
In this example, vmnic1 is connected to a port that has been disabled from the switch, to focus on failure and redundancy
behavior. Note that a network uplink redundancy alarm has triggered.
No vSAN health alarms were triggered. Cluster and VM components are not affected and Guest Storage I/O is not
interrupted by this failure.

Recovery and Failback


Once vmnic1 recovers, traffic is automatically balanced across both active uplinks.

Network I/O Control


Use vSphere Network I/O Control to set Quality of Service (QoS) levels on network traffic.
vSphere Network I/O Control is a feature available with vSphere Distributed Switches. Use it to implement Quality of
Service (QoS) on network traffic. This can be useful for vSAN when vSAN traffic must share the physical NIC with other
traffic types, such as vMotion, management, virtual machines.

Reservations, Shares, and Limits


You can set a reservation so that Network I/O Control guarantees minimum bandwidth is available on the physical
adapter for vSAN.
Reservations can be useful when bursty traffic, such as vMotion or full host evacuation, might impact vSAN traffic.
Reservations are only invoked if there is contention for network bandwidth. One disadvantage with reservations in
Network I/O Control is that unused reservation bandwidth cannot be allocated to virtual machine traffic. The total
bandwidth reserved among all system traffic types cannot exceed 75 percent of the bandwidth provided by the physical
network adapter with the lowest capacity.
vSAN best practices for reservations. Traffic reserved for vSAN cannot be allocated to virtual machine traffic, so avoid
using NIOC reservations in vSAN environments.
Setting shares makes a certain bandwidth available to vSAN when the physical adapter assigned for vSAN becomes
saturated. This prevents vSAN from consuming the entire capacity of the physical adapter during rebuild and
synchronization operations. For example, the physical adapter might become saturated when another physical adapter
in the team fails and all traffic in the port group is transferred to the remaining adapters in the team. The shares option
ensures that no other traffic impacts the vSAN network.
vSAN recommendation on shares. This is the fairest bandwidth allocation technique in NIOC, and is preferred for use in
vSAN environments.
Setting limits defines the maximum bandwidth that a certain traffic type can consume on an adapter. Even if no other
traffic type is using the additional bandwidth, the traffic type with the limit cannot consume it.
vSAN recommendation on limits. As traffic types with limits cannot consume additional bandwidth, avoid using NIOC
limits in vSAN environments.

Network Resource Pools


You can view all system traffic types that can be controlled with Network I/O Control. If you have multiple virtual machine
networks, you can assign certain bandwidth to virtual machine traffic. Use network resource pools to consume parts of
that bandwidth based on the virtual machine port group.


Enabling Network I/O Control


You can enable Network I/O Control in the configuration properties of the vDS. Right-click the vDS in the vSphere Client,
and choose menu Settings > Edit Settings.
NOTE
Network I/O Control is only available on vSphere distributed switches, not on standard vSwitches.
You can use Network I/O Control to reserve bandwidth for network traffic based on the capacity of the physical adapters
on a host. For example, if vSAN traffic uses 10 GbE physical network adapters, and those adapters are shared with other
system traffic types, you can use vSphere Network I/O Control to guarantee a certain amount of bandwidth for vSAN. This
can be useful when traffic such as vSphere vMotion, vSphere HA, and virtual machine traffic share the same physical NIC
as the vSAN network.

Network I/O Control Configuration Example


You can configure Network I/O Control for a vSAN cluster.
Consider a vSAN cluster with a single 10 GbE physical adapter. This NIC handles traffic for vSAN, vSphere vMotion,
and virtual machines. To change the shares value for a traffic type, select that traffic type from the System Traffic view
(VDS > Configure > Resource Allocation > System Traffic), and click Edit. The shares value for vSAN traffic has been
changed from the default of Normal/50 to High/100.


Edit the other traffic types to match the share values shown in the table.

Table 20: Sample NIOC Settings

Traffic Type Shares Value


vSAN High 100
vSphere vMotion Low 25
Virtual machine Normal 50
iSCSI/NFS Low 25

If the 10 GbE adapter becomes saturated, Network I/O Control allocates bandwidth in proportion to the share values.
With the 200 total shares in the table, vSAN receives 5 Gbps, virtual machine traffic receives 2.5 Gbps, and vMotion and
iSCSI/NFS receive 1.25 Gbps each. Use these values as a starting point for the NIOC configuration on your vSAN
network. Ensure that vSAN has the highest priority of any traffic type.
For more details about the various parameters for bandwidth allocation, see vSphere Networking documentation.
With each of the vSphere editions for vSAN, VMware provides a vSphere Distributed Switch as part of the edition.
Network I/O Control can be configured with any vSAN edition.

Understanding vSAN Network Topologies


vSAN architecture supports different network topologies. These topologies impact the overall deployment and
management of vSAN.
The introduction of unicast support in vSAN 6.6 simplifies the network design.

Standard Deployments
vSAN supports several single-site deployment types.

Layer-2, Single Site, Single Rack


This network topology is responsible for forwarding packets through intermediate Layer 2 devices such as hosts, bridges,
or switches.


The Layer 2 network topology offers the simplest implementation and management of vSAN. VMware recommends
the use and configuration of IGMP Snooping to avoid sending unnecessary multicast traffic on the network. In this first
example, we are looking at a single site, and perhaps even a single rack of servers using vSAN 6.5 or earlier. This
version uses multicast, so enable IGMP Snooping. Since everything is on the same L2, you need not configure routing for
multicast traffic.
Layer 2 implementations are simplified even further with vSAN 6.6 and later, which introduces unicast support. IGMP
Snooping is not required.

Layer 2, Single Site, Multiple Racks


This network topology works with the Layer 2 implementation where there are multiple racks, and multiple top-of-rack
switches, or TORs, connected to a core switch.
In the following figures, the blue dotted line between the TORs shows that the vSAN network is available and accessible
to all the hosts in the vSAN cluster. The TORs are not physically connected to each other, and the hosts in the different
racks do not communicate with each other over Layer 3, which would otherwise imply using PIM to route multicast traffic
between the hosts.
VMware recommends that all TORs are configured for IGMP Snooping, to prevent unnecessary multicast traffic on the
network. As there is no routing of the traffic, there is no need to configure PIM to route the multicast traffic.
This implementation is easier in vSAN 6.6 and later, because vSAN traffic is unicast. With unicast traffic, there is no need
to configure IGMP Snooping on the switches.


Layer 3, Single Site, Multiple Racks


This network topology works for vSAN deployments where Layer 3 is used to route vSAN traffic.
This simple Layer 3 network topology uses multiple racks in the same data center, each with its own TOR switch. Route
the vSAN network between the different racks over L3, to allow all the hosts in the vSAN cluster to communicate. Place
the vSAN VMkernel ports on different subnets or VLANs, and use a separate subnet or VLAN for each rack.
This network topology routes packets through intermediate Layer 3 capable devices, such as routers and Layer 3 capable
switches. Whenever hosts are deployed across different Layer 3 network segments, the result is a routed network
topology.
With vSAN 6.5 and earlier, VMware recommends the use and configuration of IGMP Snooping, because these
deployments require multicast. Configure PIM on the physical switches to facilitate the routing of the multicast traffic.
vSAN 6.6 and later simplifies this topology. As there is no multicast traffic, there is no need to configure IGMP Snooping.
You do not need to configure PIM to route multicast traffic.
Here is an overview of an example vSAN 6.6 deployment over L3. There is no requirement for IGMP Snooping or PIM,
because there is no multicast traffic.


vSAN Stretched Cluster Deployments


vSAN supports stretched cluster deployments that span two locations.
In vSAN 6.5 and earlier, vSAN traffic between data sites is multicast for metadata and unicast for I/O.
In vSAN 6.6 and later, all traffic is unicast. In all versions of vSAN, the witness traffic between a data site and the witness
host is unicast.

Layer 2 Everywhere
You can configure a vSAN stretched cluster in a Layer 2 network, but this configuration is not recommended.
Consider a design where the vSAN stretched cluster is configured in one large Layer 2 design. Data Site 1 and Site 2 are
where the virtual machines are deployed. Site 3 contains the witness host.
NOTE
For best results, do not use a stretched Layer 2 network across all sites.
To demonstrate Layer 2 everywhere as simply as possible, we use switches (and not routers) in the topologies.
Layer 2 networks cannot have any loops (multiple paths), so features such as Spanning Tree Protocol (STP) are needed
to block one of the paths between Site 1 and Site 2. Now consider a situation where the direct link between Site 1 and
Site 2 is broken. Network traffic can then be switched from Site 1 to Site 2 through the
witness host at Site 3. As VMware supports a much lower bandwidth and higher latency for the witness host, you see a
significant decrease in performance if data network traffic passes through a lower specification witness site.
If switching traffic between data sites through the witness site does not impact latency of applications, and bandwidth is
acceptable, a stretched L2 configuration between sites is possible. In most cases, such a configuration is not feasible, and
adds complexity to the networking requirements.


With vSAN 6.5 or earlier, which uses multicast traffic, you must configure IGMP snooping on the switches. This is not
necessary with vSAN 6.6 and later. PIM is not necessary because there is no routing of multicast traffic.

Supported vSAN Stretched Cluster Configurations


vSAN supports stretched cluster configurations.
The following configurations prevent traffic from Site 1 from being routed to Site 2 through the witness host in the event of a
failure on either data site's network, which avoids performance degradation. To ensure that data traffic
is not switched through the witness host, use the following network topology.
Between Site 1 and Site 2, implement a stretched Layer 2 switched configuration or a Layer 3 routed configuration. Both
configurations are supported.
Between Site 1 and the witness host, implement a Layer 3 routed configuration.
Between Site 2 and the witness host, implement a Layer 3 routed configuration.
These configurations (L2+L3, and L3 everywhere) are described with considerations given to multicast in vSAN 6.5 and
earlier, and unicast only, which is available in vSAN 6.6. Multicast traffic introduces additional configuration steps for IGMP
snooping, and PIM for routing multicast traffic.
We shall examine a stretched Layer 2 network between the data sites and a Layer 3 routed network to the witness site. To
demonstrate a combination of Layer 2 and Layer 3 as simply as possible, use a combination of switches and routers in the
topologies.

Stretched Layer 2 Between Data Sites, Layer 3 to Witness Host


vSAN supports stretched Layer 2 configurations between data sites.
The only traffic that is routed in this case is the witness traffic. With vSAN 6.5 and earlier, which uses multicast, use
IGMP snooping for the multicast traffic on the stretched L2 vSAN between data sites. However, since the witness traffic is
unicast, there is no need to implement PIM on the Layer 3 segments.


With vSAN 6.6, which uses unicast, there is no requirement to consider IGMP snooping or PIM.

Layer 3 Everywhere
In this vSAN stretched cluster configuration, the data traffic is routed between the data sites and the witness host.
To implement Layer 3 everywhere as simply as possible, use routers or routing switches in the topologies.
For example, consider an environment with vSAN 6.5 or earlier, which uses multicast traffic. In this case, configure IGMP
snooping on the data site switches to manage the amount of multicast traffic on the network. This is unnecessary at the
witness host since witness traffic is unicast. The multicast traffic is routed between the data sites, so configure PIM to
allow multicast routing.
With vSAN 6.6 and later, neither IGMP snooping nor PIM are needed because all the routed traffic is unicast.


Separating Witness Traffic on vSAN Stretched Clusters


vSAN supports separating witness traffic on stretched clusters.
In vSAN 6.5 and later releases, you can separate witness traffic from vSAN traffic in two-node configurations. This
means that the two vSAN hosts can be directly connected without a 10 Gb switch.
This witness traffic separation is only supported on two-node deployments in vSAN 6.6. Separating the witness traffic on
vSAN stretched clusters is supported in vSAN 6.7 and later.

Using vSAN Stretched Cluster to Achieve Rack Awareness


With vSAN stretched clusters, vSAN provides rack awareness in a single site.
If you have two racks of vSAN hosts, you can continue to run your vSAN cluster after a complete rack failure. In this case,
availability of the VM workloads is provided by the remaining rack and a remote witness host.
NOTE
For this configuration to be supported, do not place the witness host within the two racks of vSAN hosts.


In this example, if rack 1 fails, rack 2 and the witness host provide VM availability. This configuration is a pre-vSAN 6.6
environment, and needs multicast configured on the network. The witness host must be on the vSAN network. Witness
traffic is unicast. In vSAN 6.6 and later, all traffic is unicast.
This topology is also supported over L3. Place the vSAN VMkernel ports on different subnets or VLANs, and use a
separate subnet or VLAN for each rack.


This topology supports deployments with two racks to achieve rack awareness (fault domains) with a vSAN stretched
cluster. This solution uses a witness host that is external to the cluster.

Two Node vSAN Deployments


vSAN supports two-node deployments. Two-node vSAN deployments are used for remote offices/branch offices (ROBO)
that have a small number of workloads, but require high availability.
vSAN two-node deployments use a third witness host, which can be located remotely from the branch office. Often the
witness is maintained in the main data center, along with the management components, such as the vCenter Server.

Two Node vSAN Deployments Earlier than vSAN 6.5


vSAN releases earlier than 6.5 that support two-node deployments require a physical 10 Gb switch at the remote site. If
the only servers at this remote site are the vSAN hosts, this can be an inefficient solution.
With this deployment, if there are no other devices using the 10 Gb switch, then no consideration needs to be given to
IGMP snooping. If other devices at the remote site share the 10 Gb switch, use IGMP snooping to prevent excessive and
unnecessary multicast traffic.
PIM is not required because the only routed traffic is witness traffic, which is unicast.


Two Node Deployments for vSAN 6.5 and Later


vSAN 6.5 and later supports two-node deployments.
With vSAN version 6.5 and later, this two-node vSAN implementation is much simpler. vSAN 6.5 and later allows the two
hosts at the data site to be directly connected.

To enable this functionality, the witness traffic is separated completely from the vSAN data traffic. The vSAN data traffic
can flow between the two nodes on the direct connect, while the witness traffic can be routed to the witness site over the
management network.
The witness appliance can be located remotely from the branch office. For example, the witness might be running back in
the main data center, alongside the management infrastructure (vCenter Server, vROps, Log Insight, and so on). Another
supported place where the witness can reside remotely from the branch office is in vCloud Air.
In this configuration, there is no switch at the remote site. As a result, there is no need to configure support for multicast
traffic on the vSAN back-to-back network. You do not need to consider multicast on the management network because the
witness traffic is unicast.
vSAN 6.6 and later uses all unicast, so there are no multicast considerations. Multiple remote office/branch office two-
node deployments are also supported, so long as each has their own unique witness.


Common Considerations for Two Node vSAN Deployments


Two-node vSAN deployments support additional topologies. This section describes common configurations.
For more information about two-node configurations and detailed deployment considerations outside of network, see the
vSAN core documentation.

Running the Witness on Another Two Node vSAN Cluster


vSAN does not support running the witness on another two-node cluster.

Witness Running on Another Standard vSAN Deployment


vSAN supports witness running on another standard vSAN deployment.
This configuration is supported. Any failure on the two-node vSAN at the remote site does not impact the availability of the
standard vSAN environment at the main data center.

Configuration of Network from Data Sites to Witness Host


The host interfaces in the data sites communicate to the witness host over the vSAN network. There are different
configuration options available.
This topic discusses how to implement these configurations. It addresses how the interfaces on the hosts in the data sites,
which communicate to each other over the vSAN network, communicate with the witness host.


Option 1: Physical ESXi Witness Connected over L3 with Static Routes


The data sites can be connected over a stretched L2 network. Use this also for the data sites’ management network,
vSAN network, vMotion network, and virtual machine network.
The physical network router in this network infrastructure does not automatically transfer traffic from the hosts in the data
sites (site 1 and site 2) to the host in the witness site (site 3). To configure the vSAN stretched cluster successfully, all
hosts in the cluster must communicate. It is possible to deploy a vSAN stretched cluster in this environment.
The solution is to use static routes configured on the ESXi hosts, so that the vSAN traffic from site 1 and site 2 can reach
the witness host in site 3. In the case of the ESXi hosts on the data sites, add a static route to the vSAN interface, which
redirects traffic to the witness host on site 3 over a specified gateway for that network. In the case of the witness host,
the vSAN interface must have a static route added, which redirects vSAN traffic destined for the hosts in the data sites.
Use the following command to add a static route on each ESXi host in the vSAN stretched cluster: esxcli network ip
route ipv4 add -g <gateway> -n <network>
NOTE
The vCenter Server must be able to manage the ESXi hosts at both the data sites and the witness site. As
long as there is direct connectivity from the witness host to vCenter Server, there are no additional concerns
regarding the management network.
There is no need to configure a vMotion network or a VM network, or add any static routes for these networks in the
context of a vSAN stretched cluster. Virtual machines are never migrated or deployed to the vSAN witness host. Its
purpose is to maintain witness objects only, and does not require either of these networks for this task.
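For illustration, assume the witness host's vSAN network is 192.168.110.0/24 and that it is reachable from the data sites through the gateway 172.40.0.1 on the vSAN VLAN. These addresses are hypothetical and must be replaced with the values for your environment.
• On each data-site host, add a route to the witness vSAN subnet.
esxcli network ip route ipv4 add -g 172.40.0.1 -n 192.168.110.0/24
• On the witness host, add the reverse route to the data sites' vSAN subnet.
esxcli network ip route ipv4 add -g 192.168.110.1 -n 172.40.0.0/24
• Verify the routing table and test connectivity over the vSAN VMkernel interface.
esxcli network ip route ipv4 list
vmkping -I vmk1 192.168.110.10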

Option 2: Virtual ESXi Witness Appliance Connected over L3 with Static Routes
Since the witness host is a virtual machine that gets deployed on a physical ESXi host, which is not part of the vSAN
cluster, that physical ESXi host must have a minimum of one VM network pre-configured. This VM network must reach
both the management network and the vSAN network shared by the ESXi hosts on the data sites.
NOTE
The witness host does not need to be a dedicated host. It can be used for many other VM workloads, while
simultaneously hosting the witness.
An alternative option is to have two preconfigured VM networks on the underlying physical ESXi host, one for the
management network and one for the vSAN network. When the virtual ESXi witness is deployed on this physical ESXi
host, the network needs to be attached and configured accordingly.
Once you have deployed the virtual ESXi witness host, configure the static route. Assume that the data sites are
connected over a stretched L2 network. Use this also for the data sites’ management network, vSAN network, vMotion
network, and virtual machine network. vSAN traffic is not routed from the hosts in the data sites (site 1 and site 2) to the
host in the witness site (site 3) over the default gateway. To configure the vSAN stretched cluster successfully, all hosts in
the cluster require static routes, so that the vSAN traffic from site 1 and site 2 can reach the witness host in site 3. Use the
esxcli network ip route command to add a static route on each ESXi host.

Corner Case Deployments


It is possible to deploy vSAN in unusual, or corner-case configurations.
These unusual topologies require special considerations.

Three Locations, No vSAN Stretched Cluster, Distributed Witness Hosts


You can deploy vSAN across multiple rooms, buildings or sites, rather than deploy a stretched cluster configuration.
This configuration is supported. The one requirement is that the latency between the sites must be at the same level as
the latency expected for a normal vSAN deployment in the same data center. The latency must be less than 1 ms
between all hosts. If latency is greater than this value, consider a vSAN stretched cluster, which tolerates latency of up
to 5 ms. With vSAN
6.5 or earlier, additional considerations for multicast must be addressed.
For best results, maintain a uniform configuration across all sites in such a topology. To maintain availability of VMs,
configure fault domains, where the hosts in each room, building, or site are placed in the same fault domain. Avoid
asymmetric partitioning of the cluster, where host A cannot communicate to host B, but host B can communicate to host A.

Two-Node Deployed as 1+1+W Stretched Cluster


You can deploy a two-node configuration as a vSAN stretched cluster configuration, placing each host in different rooms,
buildings, or sites.
Attempts to increase the number of hosts at each site fail with an error related to licensing. For any cluster that is larger
than two hosts and that uses the dedicated witness appliance/host feature (N+N+Witness, where N>1), the configuration
is considered a vSAN stretched cluster.

Troubleshooting the vSAN Network


vSAN allows you to examine and troubleshoot the different types of issues that arise from a misconfigured vSAN network.
vSAN operations depend on the network configuration, reliability, and performance. Many support requests stem from an
incorrect network configuration, or the network not performing as expected.
Use the vSAN health service to resolve network issues. Network health checks can direct you to an appropriate
Knowledge Base article, depending on the results of the health check. The Knowledge Base article provides instructions
to solve the network problem.

Network Health Checks


The health service includes a category for networking health checks.
Each health check has an Ask VMware link. If a health check fails, click Ask VMware and read the associated VMware
Knowledge Base article for further details, and guidance on how to address the issue at hand.
The following networking health checks provide useful information about your vSAN environment.
• vSAN: Basic (unicast) connectivity check. This check verifies that IP connectivity exists among all ESXi hosts in the
vSAN cluster, by pinging each ESXi host on the vSAN network from each other ESXi host.
• vMotion: Basic (unicast) connectivity check. This check verifies that IP connectivity exists among all ESXi hosts in
the vSAN cluster that have vMotion configured. Each ESXi host on the vMotion network pings all other ESXi hosts.
• All hosts have a vSAN vmknic configured. This check ensures each ESXi host in the vSAN cluster has a VMkernel
NIC configured for vSAN traffic.
• All hosts have matching multicast settings. This check ensures that each host has a properly configured multicast
address.
• All hosts have matching subnets. This check tests that all ESXi hosts in a vSAN cluster have been configured so
that all vSAN VMkernel NICs are on the same IP subnet.
• Hosts disconnected from VC. This check verifies that the vCenter Server has an active connection to all ESXi hosts
in the vSAN cluster.
• Hosts with connectivity issues. This check refers to situations where vCenter Server lists the host as connected,
but API calls from vCenter to the host are failing. It can highlight connectivity issues between a host and the vCenter
Server.
• Network latency. This check performs a network latency check of vSAN hosts. If the latency exceeds the threshold of
5 ms, a warning is displayed.
• vMotion: MTU checks (ping with large packet size). This check complements the basic vMotion ping connectivity
check. Maximum Transmission Unit size is increased to improve network performance. Incorrectly configured MTUs
might not appear as a network configuration issue, but can cause performance issues.
• vSAN cluster partition. This health check examines the cluster to see how many partitions exist. It displays an error if
there is more than a single partition in the vSAN cluster.
• Multicast assessment based on other checks. This health check aggregates data from all network health checks. If
this check fails, it indicates that multicast is likely the root cause of a network partition.
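You can review these checks in the vSphere Client, or query them directly from any host in the cluster. A brief sketch using the esxcli vsan health namespace (available in vSAN 6.6 and later); the exact check names can vary slightly between releases:
• List all health checks and their current status.
esxcli vsan health cluster list
• Show the details of an individual check, for example the basic connectivity check.
esxcli vsan health cluster get -t "vSAN: Basic (unicast) connectivity check"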

Commands to Check the Network


When the vSAN network has been configured, use these commands to check its state. You can check which VMkernel
Adapter (vmknic) is used for vSAN, and what attributes it contains.
Use ESXCLI and RVC commands to verify that the network is fully functional, and to troubleshoot any network issues with
vSAN.
You can verify that the vmknic used for the vSAN network is uniformly configured correctly across all hosts, check that
multicast is functional, and verify that hosts participating in the vSAN cluster can successfully communicate with one
another.

esxcli vsan network list


This command enables you to identify the VMkernel interface used by the vSAN network.
The output below shows that the vSAN network is using vmk1. This command continues to work even if vSAN has been
turned off and the hosts no longer participate in vSAN.
The Agent Group Multicast and Master Group Multicast are also important to check.
[root@esxi-dell-m:~] esxcli vsan network list
Interface
VmkNic Name: vmk1
IP Protocol: IP
Interface UUID: 32efc758-9ca0-57b9-c7e3-246e962c24d0
Agent Group Multicast Address: 224.2.3.4
Agent Group IPv6 Multicast Address: ff19::2:3:4
Agent Group Multicast Port: 23451
Master Group Multicast Address: 224.1.2.3
Master Group IPv6 Multicast Address: ff19::1:2:3
Master Group Multicast Port: 12345
Host Unicast Channel Bound Port: 12321
Multicast TTL: 5
Traffic Type: vsan

This provides useful information, such as which VMkernel interface is being used for vSAN traffic. In this case, it is vmk1.
However, also shown are the multicast addresses. This information might be displayed even when the cluster is running
in unicast mode. The output includes the multicast group addresses and ports. Port 23451 is used for the heartbeat, sent every second
by the primary, and is visible on every other host in the cluster. Port 12345 is used for the CMMDS updates between the
primary and backup.

esxcli network ip interface list


This command enables you to verify which vSwitch or distributed switch the VMkernel interface is attached to, and the
MTU size, which can be useful if jumbo frames have been configured in the environment. In this example, the MTU is
set to 9000.


[root@esxi-dell-m:~] esxcli network ip interface list


vmk0
Name: vmk0
<<truncated>>
vmk1
Name: vmk1
MAC Address: 00:50:56:69:96:f0
Enabled: true
Portset: DvsPortset-0
Portgroup: N/A
Netstack Instance: defaultTcpipStack
VDS Name: vDS
VDS UUID: 50 1e 5b ad e3 b4 af 25-18 f3 1c 4c fa 98 3d bb
VDS Port: 16
VDS Connection: 1123658315
Opaque Network ID: N/A
Opaque Network Type: N/A
External ID: N/A
MTU: 9000
TSO MSS: 65535
Port ID: 50331814

The Maximum Transmission Unit size is shown as 9000, so this VMkernel port is configured for jumbo frames, which
require an MTU of 9000. VMware does not make any recommendation around the use of jumbo frames. However,
jumbo frames are supported for use with vSAN.

esxcli network ip interface ipv4 get -i vmk1


This command displays information such as IP address and netmask of the vSAN VMkernel interface.
With this information, an administrator can now begin to use other commands available at the command line to check that
the vSAN network is working correctly.
[root@esxi-dell-m:~] esxcli network ip interface ipv4 get -i vmk1
Name IPv4 Address IPv4 Netmask IPv4 Broadcast Address Type Gateway DHCP DNS
---- ------------ ------------- -------------- ------------ ------- --------
vmk1 172.40.0.9 255.255.255.0 172.40.0.255 STATIC 0.0.0.0 false

vmkping
The vmkping command verifies whether all the other ESXi hosts on the network are responding to your ping requests.
~ # vmkping -I vmk2 172.32.0.3 -s 1472 -d
PING 172.32.0.3 (172.32.0.3): 56 data bytes
64 bytes from 172.32.0.3: icmp_seq=0 ttl=64 time=0.186 ms
64 bytes from 172.32.0.3: icmp_seq=1 ttl=64 time=2.690 ms
64 bytes from 172.32.0.3: icmp_seq=2 ttl=64 time=0.139 ms

--- 172.32.0.3 ping statistics ---


3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.139/1.005/2.690 ms

While it does not verify multicast functionality, it can help identify a rogue ESXi host that has network issues. You can also
examine the response times to see if there is any abnormal latency on the vSAN network.


If jumbo frames are configured, this command does not report any issues if the jumbo frame MTU size is incorrect. By
default, this command uses an MTU size of 1500. If there is a need to verify if jumbo frames are successfully working end-
to-end, use vmkping with a larger packet size (-s) option as follows:
~ # vmkping -I vmk2 172.32.0.3 -s 8972 -d
PING 172.32.0.3 (172.32.0.3): 8972 data bytes
9008 bytes from 172.32.0.3: icmp_seq=0 ttl=64 time=0.554 ms
9008 bytes from 172.32.0.3: icmp_seq=1 ttl=64 time=0.638 ms
9008 bytes from 172.32.0.3: icmp_seq=2 ttl=64 time=0.533 ms

--- 172.32.0.3 ping statistics ---


3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.533/0.575/0.638 ms
~ #

Consider adding -d to the vmkping command to test if packets can be sent without fragmentation.

esxcli network ip neighbor list


This command helps to verify if all vSAN hosts are on the same network segment.
In this configuration, we have a four-host cluster, and this command returns the ARP (Address Resolution Protocol)
entries of the other three hosts, including their IP addresses and their vmknic (vSAN is configured to use vmk1 on all
hosts in this cluster).
[root@esxi-dell-m:~] esxcli network ip neighbor list -i vmk1
Neighbor Mac Address Vmknic Expiry State Type
----------- ----------------- ------ ------- ----- -------
172.40.0.12 00:50:56:61:ce:22 vmk1 164 sec Unknown
172.40.0.10 00:50:56:67:1d:b2 vmk1 338 sec Unknown
172.40.0.11 00:50:56:6c:fe:c5 vmk1 162 sec Unknown
[root@esxi-dell-m:~]

esxcli network diag ping


This command checks for duplicates on the network, and round-trip times.
To get even more detail regarding the vSAN network connectivity between the various hosts, ESXCLI provides a powerful
network diagnostic command. Here is an example of one such output, where the VMkernel interface is on vmk1 and the
remote vSAN network IP of another host on the network is 172.40.0.10
[root@esxi-dell-m:~] esxcli network diag ping -I vmk1 -H 172.40.0.10
Trace:
Received Bytes: 64
Host: 172.40.0.10
ICMP Seq: 0
TTL: 64
Round-trip Time: 1864 us
Dup: false
Detail:

Received Bytes: 64
Host: 172.40.0.10
ICMP Seq: 1
TTL: 64
Round-trip Time: 1834 us
Dup: false
Detail:

Received Bytes: 64
Host: 172.40.0.10
ICMP Seq: 2
TTL: 64
Round-trip Time: 1824 us
Dup: false
Detail:
Summary:
Host Addr: 172.40.0.10
Transmitted: 3
Recieved: 3
Duplicated: 0
Packet Lost: 0
Round-trip Min: 1824 us
Round-trip Avg: 1840 us
Round-trip Max: 1864 us
[root@esxi-dell-m:~]

vsan.lldpnetmap
This RVC command displays uplink port information.
If there are non-Cisco switches with Link Layer Discovery Protocol (LLDP) enabled in the environment, there is an
RVC command to display uplink <-> switch <-> switch port information. For more information on RVC, refer to the RVC
Command Guide.
This helps you determine which hosts are attached to which switches when the vSAN cluster is spanning multiple
switches. It can help isolate a problem to a particular switch when only a subset of the hosts in the cluster is impacted.
> vsan.lldpnetmap 0
2013-08-15 19:34:18 -0700: This operation will take 30-60 seconds ...
+---------------+---------------------------+
| Host          | LLDP info                 |
+---------------+---------------------------+
| 10.143.188.54 | w2r13-vsan-x650-2: vmnic7 |
|               | w2r13-vsan-x650-1: vmnic5 |
+---------------+---------------------------+

This is only available with switches that support LLDP. To configure it, log in to the switch and run the following:
switch# config t
Switch(Config)# feature lldp

To verify that LLDP is enabled:


switch(config)#do show running-config lldp

NOTE
LLDP operates in both send and receive mode, by default. Check the settings of your vDS properties if the
physical switch information is not being discovered. By default, vDS is created with discovery protocol set to
CDP, Cisco Discovery Protocol. To resolve this, set the discovery protocol to LLDP, and set operation to both on
the vDS.

Checking Multicast Communications


Multicast configurations can cause issues for initial vSAN deployment.


One of the simplest ways to verify if multicast is working correctly in your vSAN environment is by using the tcpdump-uw
command. This command is available from the command line of the ESXi hosts.
This tcpdump-uw command shows if the primary is correctly sending multicast packets (port and IP info) and if all other
hosts in the cluster are receiving them.
On the primary, this command shows the packets being sent out to the multicast address. On all other hosts, the same
packets are visible (from the primary to the multicast address). If they are not visible, multicast is not working correctly.
Run the tcpdump-uw command shown here on any host in the cluster, and the heartbeats from the primary are visible. In
this case, the primary is at IP address 172.32.0.2. The -v for verbosity is optional.
[root@esxi-hp-02:~] tcpdump-uw -i vmk2 multicast -v
tcpdump-uw: listening on vmk2, link-type EN10MB (Ethernet), capture size 96 bytes
11:04:21.800575 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 34917, offset 0, flags [none], proto
UDP (17), length 228)
172.32.0.4.44824 > 224.1.2.3.12345: UDP, length 200
11:04:22.252369 IP truncated-ip - 234 bytes missing! (tos 0x0, ttl 5, id 15011, offset 0, flags [none], proto
UDP (17), length 316)
172.32.0.2.38170 > 224.2.3.4.23451: UDP, length 288
11:04:22.262099 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 3359, offset 0, flags [none], proto
UDP (17), length 228)
172.32.0.3.41220 > 224.2.3.4.23451: UDP, length 200
11:04:22.324496 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 20914, offset 0, flags [none], proto
UDP (17), length 228)
172.32.0.5.60460 > 224.1.2.3.12345: UDP, length 200
11:04:22.800782 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 35010, offset 0, flags [none], proto
UDP (17), length 228)
172.32.0.4.44824 > 224.1.2.3.12345: UDP, length 200
11:04:23.252390 IP truncated-ip - 234 bytes missing! (tos 0x0, ttl 5, id 15083, offset 0, flags [none], proto
UDP (17), length 316)
172.32.0.2.38170 > 224.2.3.4.23451: UDP, length 288
11:04:23.262141 IP truncated-ip - 146 bytes missing! (tos 0x0, ttl 5, id 3442, offset 0, flags [none], proto
UDP (17), length 228)
172.32.0.3.41220 > 224.2.3.4.23451: UDP, length 200

While this output might seem a little confusing, suffice to say that the output shown here indicates that the four hosts in the
cluster are getting a heartbeat from the primary. This tcpdump-uw command must be run on every host to verify that they
are all receiving the heartbeat. This verifies that the primary is sending the heartbeats, and every other host in the cluster
is receiving them, which indicates that multicast is working.
If some of the vSAN hosts are not able to pick up the one-second heartbeats from the primary, the network administrator
needs to check the multicast configuration of their switches.
To avoid the truncated-ip - 146 bytes missing! message, use the -s0 option with the same command to stop truncating
packets:
[root@esxi-hp-02:~] tcpdump-uw -i vmk2 multicast -v -s0
tcpdump-uw: listening on vmk2, link-type EN10MB (Ethernet), capture size 65535 bytes
11:18:29.823622 IP (tos 0x0, ttl 5, id 56621, offset 0, flags [none], proto UDP (17), length 228)
172.32.0.4.44824 > 224.1.2.3.12345: UDP, length 200
11:18:30.251078 IP (tos 0x0, ttl 5, id 52095, offset 0, flags [none], proto UDP (17), length 228)
172.32.0.3.41220 > 224.2.3.4.23451: UDP, length 200
11:18:30.267177 IP (tos 0x0, ttl 5, id 8228, offset 0, flags [none], proto UDP (17), length 316)
172.32.0.2.38170 > 224.2.3.4.23451: UDP, length 288
11:18:30.336480 IP (tos 0x0, ttl 5, id 28606, offset 0, flags [none], proto UDP (17), length 228)
172.32.0.5.60460 > 224.1.2.3.12345: UDP, length 200
11:18:30.823669 IP (tos 0x0, ttl 5, id 56679, offset 0, flags [none], proto UDP (17), length 228)
172.32.0.4.44824 > 224.1.2.3.12345: UDP, length 200

Another useful tcpdump-uw check is related to IGMP (Internet Group Management Protocol) membership. Hosts (and network
devices) use IGMP to establish multicast group membership.
Each ESXi host in the vSAN cluster sends out regular IGMP membership reports (Join).
The tcpdump command shows IGMP member reports from a host:
[root@esxi-dell-m:~] tcpdump-uw -i vmk1 igmp
tcpdump-uw: verbose output suppressed, use -v or -vv for full protocol decode
listening on vmk1, link-type EN10MB (Ethernet), capture size 262144 bytes
15:49:23.134458 IP 172.40.0.9 > igmp.mcast.net: igmp v3 report, 1 group record(s)
15:50:22.994461 IP 172.40.0.9 > igmp.mcast.net: igmp v3 report, 1 group record(s)

The output shows that IGMP v3 reports are taking place, indicating that the ESXi host is regularly updating its membership.
If a network administrator has any doubt whether vSAN ESXi hosts are performing IGMP correctly, running this command
on each ESXi host in the cluster and examining this trace can verify it.
If you have multicast communications, use IGMP v3.
In fact, the following command can be used to look at multicast and IGMP traffic at the same time:
[root@esxi-hp-02:~] tcpdump-uw -i vmk2 multicast or igmp -v -s0

A common issue is that the vSAN cluster is configured across multiple physical switches, and while multicast has been
enabled on one switch, it has not been enabled across switches. In this case, the cluster forms with two ESXi hosts in one
partition, and another ESXi host (connected to the other switch) is unable to join this cluster. Instead it forms its own vSAN
cluster in another partition. The vsan.lldpnetmap command seen earlier can help you determine network configuration,
and which hosts are attached to which switch.
While a vSAN cluster forms, there are indicators that show multicast might be an issue.
Assume that the checklist for subnet, VLAN, MTU has been followed, and each host in the cluster can vmkping every
other host in the cluster.
If there is a multicast issue when the cluster is created, a common symptom is that each ESXi host forms its own vSAN
cluster, with itself as the primary. If each host has a unique network partition ID, this symptom suggests that there is no
multicast between any of the hosts.
However, if there is a situation where a subset of the ESXi hosts form a cluster, and another subset form another cluster,
and each have unique partitions with their own primary, backup and perhaps even agent hosts, multicast is enabled in
the switch, but not across switches. vSAN shows hosts on the first physical switch forming their own cluster partition, and
hosts on the second physical switch forming their own cluster partition, each with its own primary. If you can verify which
switches the hosts in the cluster connect to, and hosts in a cluster are connected to the same switch, then this probably is
the issue.
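To see how the cluster has partitioned, compare the cluster membership view from each host. Run the following command on every host suspected of being in a different partition:
esxcli vsan cluster get
Compare fields such as the Sub-Cluster UUID, the Sub-Cluster Member Count, and the Local Node State (MASTER, BACKUP, or AGENT) across hosts. Hosts that report different sub-cluster membership, or a member count lower than the cluster size, are in separate network partitions.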

Checking vSAN Network Performance


Make sure that there is sufficient bandwidth between your ESXi hosts. The iperf tool can assist you in testing whether your
vSAN network is performing optimally.
To check the performance of the vSAN network, you can use the iperf tool to measure maximum TCP bandwidth and
latency. It is located in /usr/lib/vmware/vsan/bin/iperf.copy. Run it with --help to see the various options.
Use this tool to check network bandwidth and latency between ESXi hosts participating in a vSAN cluster.
VMware KB 2001003 can assist with setup and testing.
This is most useful when a vSAN cluster is being commissioned. Running iperf tests on the vSAN network when the
cluster is already in production can impact the performance of the virtual machines running on the cluster.
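A minimal sketch of such a test between two hosts, assuming the vSAN VMkernel addresses 172.40.0.9 and 172.40.0.10 used elsewhere in this guide; you might need to temporarily allow the iperf traffic through the ESXi firewall, as described in the KB article.
• On the first host, start the iperf server bound to its vSAN VMkernel address.
/usr/lib/vmware/vsan/bin/iperf.copy -s -B 172.40.0.9
• On the second host, run the client against the server for 30 seconds with four parallel streams.
/usr/lib/vmware/vsan/bin/iperf.copy -c 172.40.0.9 -t 30 -P 4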


Checking vSAN Network Limits


The vsan.check_limits RVC command verifies that none of the vSAN thresholds are being breached.
> ls
0 /
1 vcsa-04.rainpole.com/
> cd 1
/vcsa-04.rainpole.com> ls
0 Datacenter (datacenter)
/vcsa-04.rainpole.com> cd 0
/vcsa-04.rainpole.com/Datacenter> ls
0 storage/
1 computers [host]/
2 networks [network]/
3 datastores [datastore]/
4 vms [vm]/
/vcsa-04.rainpole.com/Datacenter> cd 1
/vcsa-04.rainpole.com/Datacenter/computers> ls
0 Cluster (cluster): cpu 155 GHz, memory 400 GB
1 esxi-dell-e.rainpole.com (standalone): cpu 38 GHz, memory 123 GB
2 esxi-dell-f.rainpole.com (standalone): cpu 38 GHz, memory 123 GB
3 esxi-dell-g.rainpole.com (standalone): cpu 38 GHz, memory 123 GB
4 esxi-dell-h.rainpole.com (standalone): cpu 38 GHz, memory 123 GB
/vcsa-04.rainpole.com/Datacenter/computers> vsan.check_limits 0
2017-03-14 16:09:32 +0000: Querying limit stats from all hosts ...
2017-03-14 16:09:34 +0000: Fetching vSAN disk info from esxi-dell-m.rainpole.com (may take a moment) ...
2017-03-14 16:09:34 +0000: Fetching vSAN disk info from esxi-dell-n.rainpole.com (may take a moment) ...
2017-03-14 16:09:34 +0000: Fetching vSAN disk info from esxi-dell-o.rainpole.com (may take a moment) ...
2017-03-14 16:09:34 +0000: Fetching vSAN disk info from esxi-dell-p.rainpole.com (may take a moment) ...
2017-03-14 16:09:39 +0000: Done fetching vSAN disk infos
+--------------------------+--------------------+------------------------------------------------------------------+
| Host                     | RDT                | Disks                                                            |
+--------------------------+--------------------+------------------------------------------------------------------+
| esxi-dell-m.rainpole.com | Assocs: 1309/45000 | Components: 485/9000                                             |
|                          | Sockets: 89/10000  | naa.500a075113019b33: 0% Components: 0/0                         |
|                          | Clients: 136       | naa.500a075113019b37: 40% Components: 81/47661                   |
|                          | Owners: 138        | t10.ATA_____Micron_P420m2DMTFDGAR1T4MAX_____ 0% Components: 0/0  |
|                          |                    | naa.500a075113019b41: 37% Components: 80/47661                   |
|                          |                    | naa.500a07511301a1eb: 38% Components: 81/47661                   |
|                          |                    | naa.500a075113019b39: 39% Components: 79/47661                   |
|                          |                    | naa.500a07511301a1ec: 41% Components: 79/47661                   |
<<truncated>>


From a network perspective, it is the RDT associations (Assocs) and sockets count that are important. There are 45,000
associations per host in vSAN 6.0 and later. An RDT association is used to track peer-to-peer network state within vSAN.
vSAN is sized so that it never runs out of RDT associations. vSAN also limits how many TCP sockets it is allowed to use,
and vSAN is sized so that it never runs out of its allocation of TCP sockets. There is a limit of 10,000 sockets per host.
A vSAN client represents an object's access in the vSAN cluster. The client typically represents a virtual machine running on
a host. The client and the object might not be on the same host. There is no hard defined limit, but this metric is shown to
help understand how clients balance across hosts.
There is only one vSAN owner for a given vSAN object, typically co-located with the vSAN client accessing this object.
vSAN owners coordinate all access to the vSAN object and implement functionality, such as mirroring and striping. There
is no hard defined limit, but this metric is once again shown to help understand how owners balance across hosts.

Using Multicast in vSAN Network


Multicast is a network communication technique that sends information packets to a group of destinations over an IP
network.
Releases earlier than vSAN version 6.6 support IP multicast and use it as a discovery protocol to identify the nodes
trying to join a vSAN cluster. These releases depend on IP multicast communication while joining and leaving the cluster
groups and during other intra-cluster communication operations.
Ensure that you enable and configure the IP multicast in the IP network segments to carry the vSAN traffic service.
An IP multicast address is called a Multicast Group (MG). IP multicast sends source packets to multiple receivers as
a group transmission. IP multicast relies on communication protocols that hosts, clients, and network devices use to
participate in multicast-based communications. Communication protocols such as Internet Group Management Protocol
(IGMP) and Protocol Independent Multicast (PIM) are the main components and dependencies for the use of IP multicast
communications.
While creating a vSAN cluster, a default multicast address is assigned to each vSAN cluster. The vSAN traffic service
automatically assigns the default multicast address settings to each host. This multicast address sends frames to a default
multicast group and multicast group agent.
When multiple vSAN clusters reside on the same Layer 2 network, VMware recommends changing the default multicast
address within the additional vSAN clusters. This prevents multiple clusters from receiving all multicast streams. See
VMware KB 2075451 for more information about changing the default vSAN multicast address.
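As a quick check from the ESXi shell, you can list the multicast addresses that a host currently uses for vSAN. The set command below only illustrates assigning non-default addresses to an additional cluster; the interface name and addresses are hypothetical, and the exact option names should be verified against VMware KB 2075451 for your ESXi version.

# Show the vSAN network configuration, including the agent and master
# group multicast addresses and ports in use:
esxcli vsan network list

# Illustrative only: assign non-default multicast addresses on interface vmk2
# (verify the option names against KB 2075451 before use):
esxcli vsan network ipv4 set -i vmk2 -d 224.2.3.4 -u 224.1.2.3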

Internet Group Management Protocol


You can use Internet Group Management Protocol (IGMP) to add receivers to the IP Multicast group membership within
the Layer 2 domains.
IGMP allows receivers to send requests to the multicast groups they want to join. Becoming a member of a multicast
group allows routers to forward traffic for the multicast groups on the Layer 3 segment where the receiver is connected to
a switch port.
You can use IGMP snooping to limit the physical switch ports participating in the multicast group to only vSAN VMkernel
port uplinks. IGMP snooping is configured with an IGMP snooping querier. The need to configure an IGMP snooping
querier to support IGMP snooping varies by switch vendor. Consult your specific switch vendor for IGMP snooping
configuration.
vSAN supports both IGMP version 2 and IGMP version 3. When you perform the vSAN deployment across Layer 3
network segments, you can configure a Layer 3 capable device such as a router or a switch with a connection and access
to the same Layer 3 network segments.
All VMkernel ports on the vSAN network subscribe to a multicast group using IGMP to avoid multicast flooding all network
ports.


NOTE
You can deactivate IGMP snooping if vSAN is on a non-routed or trunked VLAN that you can extend to the
vSAN ports of all the hosts in the cluster.

Protocol Independent Multicast


Protocol Independent Multicast (PIM) consists of Layer 3 multicast routing protocols.
It provides different communication techniques for IP multicast traffic to reach receivers that are in different Layer 3
segments from the multicast group sources. For vSAN clusters earlier than version 6.6 that span different subnets, you
must use PIM to enable the multicast traffic to flow across those subnets. Consult your network vendor for the
implementation of PIM.

Networking Considerations for vSAN File Service


vSAN File Service is a layer that sits on top of vSAN to provide file shares. It currently supports SMB, NFSv3, and
NFSv4.1 file shares.
Following are the network considerations for vSAN File Service:
• You must allocate static IP addresses as file server IPs from the vSAN File Service network. Each IP address is an
access point to vSAN file shares.
– For best performance, the number of IP addresses must be equal to the number of hosts in the vSAN cluster.
– All the static IP addresses should be from the same subnet.
– Every static IP address has a corresponding FQDN, which should be part of the Forward lookup and Reverse
lookup zones in the DNS server.
• You must prepare the network that is used as the vSAN File Service network:
– If using standard switch based network, the Promiscuous Mode and Forged Transmits are enabled as part of the
vSAN File Services enablement process.
– If using DVS based network, vSAN File Services are supported on DVS version 6.6.0 or later. Create a dedicated
port group for vSAN File Services in the DVS. MacLearning and Forged Transmits are enabled as part of the vSAN
File Services enablement process for a provided DVS port group.
NOTE
If using NSX-based network, ensure that MacLearning is enabled for the provided network entity from
the NSX admin console, and all the hosts and File Services nodes are connected to the desired NSX-T
network.
• For SMB share and NFS share with Kerberos security, you must provide information about your AD domain and
organizational unit (optional). In addition, a user account with sufficient privileges to create and delete objects is
required.
• Ensure that the file server can access AD server and DNS server. The file server must be able to access all the ports
required by AD service.
Following are the ports that vSAN File Service uses for network connectivity. Ensure that these ports are not blocked
by the firewall.

Service | Port Number | Entity | Connectivity Requirements
Server Message Block (SMB) | TCP port 445 | File Servers | External network to file servers
Quotas for a user of a local filesystem (RQUOTA) | TCP port 875 | File Servers | External network to file servers
Network File System (NFS) | TCP and UDP port 2049 | File Servers | External network to file servers. NFSv3 can use both TCP and UDP ports, but NFSv4.1 uses only TCP.
NFS Mount | TCP and UDP port 20048 | File Servers | External network to file servers
Network Status Monitor (NSM) server daemon | TCP and UDP port 27689 | File Servers | External network to file servers. Both inward and outward communication must be permitted.
Network Lock Manager (NLM) | TCP and UDP port 32803 | File Servers | External network to file servers. Allows the connection initiated from File Server to client. Inbound and outbound connections must be allowed on the firewall. The default port is UDP.
Sun remote procedure call (sunrpc) | TCP and UDP port 111 | File Servers | External network to file servers
LDAP | TCP port 389 | Active Directory (AD) servers (if AD domain is configured) | File servers to AD servers
LDAP to Global Catalog | TCP port 3268 | AD servers (if AD domain is configured) | File servers to AD servers
Kerberos | TCP port 88 | AD servers (if AD domain is configured) | File servers to AD servers
Kerberos password change | TCP port 464 | AD servers (if AD domain is configured) | File servers to AD servers
Domain Name Server (DNS) | TCP and UDP port 53 | DNS servers | File servers to DNS servers
vSAN Distributed File System (VDFS) Server | TCP port 1564 | ESXi hosts | Inside vSAN network
Remote Procedure Call | TCP port 135 | AD servers (if AD domain is configured) | File servers to AD servers
NetBIOS Session Service | TCP port 139 | AD servers (if AD domain is configured) | File servers to AD servers
DNS | UDP port 53 | AD servers (if AD domain is configured) | File servers to AD servers
LDAP, DC Locator, and Net Logon | UDP port 389 | AD servers (if AD domain is configured) | File servers to AD servers
Randomly allocated high TCP ports | TCP 49152 - 65535 | AD servers (if AD domain is configured) | File servers to AD servers

Networking Considerations for iSCSI on vSAN


vSAN iSCSI target service allows hosts and physical workloads that reside outside the vSAN cluster to access the vSAN
datastore. This feature enables an iSCSI initiator on a remote host to transport block-level data to an iSCSI target on a
storage device within the vSAN cluster.
The iSCSI targets on vSAN are managed using Storage Policy Based Management (SPBM), similar to other vSAN
objects. For the iSCSI LUNs, vSAN provides space savings through deduplication and compression, and security
through encryption. For enhanced security, vSAN iSCSI target service uses Challenge Handshake Authentication Protocol
(CHAP) and Mutual CHAP authentication.


vSAN identifies each iSCSI target by a unique iSCSI qualified Name (IQN). The iSCSI target is presented to a remote
iSCSI initiator using the IQN, so that the initiator can access the LUN of the target. vSAN iSCSI target service allows
creating iSCSI initiator groups. The iSCSI initiator group restricts access to only those initiators that are members of the
group.

Characteristics of vSAN iSCSI Network


Following are the characteristics of a vSAN iSCSI network:
• iSCSI Routing - iSCSI initiators can make routed connections to vSAN iSCSI targets over an L3 network.
• IPv4 and IPv6 - vSAN iSCSI network supports both IPv4 and IPv6.
• IP Security - IPSec on the vSAN iSCSI network provides increased security.
NOTE
ESXi hosts support IPsec using IPv6 only.
• Jumbo Frames - Jumbo Frames are supported on the vSAN iSCSI network.
• NIC Teaming - All NIC teaming configurations are supported on the vSAN iSCSI network.
• Multiple Connections per Session (MCS) - vSAN iSCSI implementation does not support MCS.

Migrating from Standard to Distributed vSwitch


You can migrate from a vSphere Standard Switch to a vSphere Distributed Switch and use Network I/O Control, which
enables you to prioritize vSAN traffic with Quality of Service (QoS).
WARNING
It is best to have access to the ESXi hosts, although you might not need it. If something goes wrong, you can
access the console of the ESXi hosts.
Make a note of the original vSwitch setup. In particular, note the load-balancing and NIC teaming settings on the source.
Make sure the destination configuration matches the source.

Create a Distributed Switch


Create the distributed vSwitch and give it a name.
1. In the vSphere Client Host and Clusters view, right-click a data center and select menu New Distributed Switch.
2. Enter a name.
3. Select the version of the vSphere Distributed Switch. In this example, version 6.6.0 is used for the migration.
4. Add the settings. Determine how many uplinks you are currently using for networking. This example has six:
management, vMotion, virtual machines, and three for vSAN (a LAG configuration). Enter 6 for the number of uplinks.
Your environment might be different, but you can edit it later.
You can create a default port group at this point, but additional port groups are needed.
5. Finish the configuration of the distributed vSwitch.
The next step is to configure and create the additional port groups.
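If you prefer to script this step, the following PowerCLI sketch creates a similar distributed switch and port groups. The switch name, port group names, and non-management VLAN IDs are assumptions for illustration; match them to your own environment.

# Create the distributed switch with six uplinks, as in this example
$dc  = Get-Datacenter -Name "Datacenter"
$vds = New-VDSwitch -Name "vDS01" -Location $dc -NumUplinkPorts 6 -Version "6.6.0"

# Create distributed port groups for management, vMotion, VM, and vSAN traffic,
# tagging each with its VLAN (VLAN 51 is the management VLAN in this example;
# the other VLAN IDs are placeholders)
New-VDPortgroup -VDSwitch $vds -Name "DPG-Mgmt"    -VlanId 51
New-VDPortgroup -VDSwitch $vds -Name "DPG-vMotion" -VlanId 52
New-VDPortgroup -VDSwitch $vds -Name "DPG-VM"      -VlanId 53
New-VDPortgroup -VDSwitch $vds -Name "DPG-vSAN"    -VlanId 54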


Create Port Groups


A single default port group was created for the management network. Edit this port group to make sure it has all the
characteristics of the management port group on the standard vSwitch, such as VLAN and NIC teaming, and failover
settings.

Configure the management port group.


1. In the vSphere Client Networking view, select the distributed port group, and click Edit.
2. For some port groups, you must change the VLAN. Since VLAN 51 is the management VLAN, tag the distributed port
group accordingly.


3. Click OK.
Create distributed port groups for vMotion, virtual machine networking, and vSAN networking.
1. Right-click the vSphere Distributed Switch and select menu Distributed Port Group > New Distributed Port Group.
2. For this example, create a port group for the vMotion network.
Create all the distributed port groups on the distributed vSwitch. Then migrate the uplinks, VMkernel networking, and
virtual machine networking to the distributed vSwitch and associated distributed port groups.
WARNING
Migrate the uplinks and networks in a step-by-step fashion to proceed smoothly and with caution.

Migrate Management Network


Migrate the management network (vmk0) and its associated uplink (vmnic0) from the standard vSwitch to the distributed
vSwitch (vDS).
1. Add hosts to the vDS.
a. Right-click the vDS and select menu Add and Manage Hosts.
b. Add hosts to the vDS. Click the green Add icon (+), and add all hosts from the cluster.
2. Configure the physical adapters and VMkernel adapters.
a. Click Manage physical adapters to migrate the physical adapters and VMkernel adapters, vmnic0 and vmk0 to
the vDS.
b. Select an appropriate uplink on the vDS for physical adapter vmnic0. For this example, use Uplink1. The physical
adapter is selected and an uplink is chosen.
3. Migrate the management network on vmk0 from the standard vSwitch to the distributed vSwitch. Perform these steps
on each host.
a. Select vmk0, and click Assign port group.
b. Assign the distributed port group created for the management network earlier.
4. Finish the configuration.
a. Review the changes to ensure that you are adding four hosts, four uplinks (vmnic0 from each host), and four
VMkernel adapters (vmk0 from each host).
b. Click Finish.
When you examine the networking configuration of each host, review the switch settings, with one uplink (vmnic0) and the
vmk0 management port on each host.
Repeat this process for the other networks.
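A scripted equivalent of this migration is sketched below for a single host, using the distributed switch and management port group from the earlier sketch; the host, switch, and port group names are assumptions. Moving the physical NIC and the VMkernel adapter in one operation helps the host keep management connectivity during the change.

# Add the host to the distributed switch
$vds    = Get-VDSwitch -Name "vDS01"
$vmhost = Get-VMHost -Name "esxi-dell-e.rainpole.com"
Add-VDSwitchVMHost -VDSwitch $vds -VMHost $vmhost

# Migrate vmnic0 together with vmk0 to the management distributed port group
$pnic = Get-VMHostNetworkAdapter -VMHost $vmhost -Physical -Name "vmnic0"
$vmk  = Get-VMHostNetworkAdapter -VMHost $vmhost -VMKernel -Name "vmk0"
$pg   = Get-VDPortgroup -VDSwitch $vds -Name "DPG-Mgmt"
Add-VDSwitchPhysicalNetworkAdapter -DistributedSwitch $vds -VMHostPhysicalNic $pnic `
    -VMHostVirtualNic $vmk -VirtualNicPortgroup $pg -Confirm:$false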


Migrate vMotion
To migrate the vMotion network, use the same steps used for the management network.
Before you begin, ensure that the distributed port group for the vMotion network has the same attributes as the port group
on the standard vSwitch. Then migrate the uplink used for vMotion (vmnic1), with the VMkernel adapter (vmk1).

Migrate vSAN Network


If you have a single uplink for the vSAN network, then use the same process as before. However, if you are using more
than one uplink, there are additional steps.
If the vSAN network is using Link Aggregation (LACP), or it is on a different VLAN to the other VMkernel networks, place
some of the uplinks into an unused state for certain VMkernel adapters.
For example, VMkernel adapter vmk2 is used for vSAN. However, uplinks vmnic3, 4 and 5 are used for vSAN and
they are in a LACP configuration. Therefore, for vmk2, all other vmnics (0, 1 and 2) must be placed in an unused state.
Similarly, for the management adapter (vmk0) and vMotion adapter (vmk1), place the vSAN uplinks/vmnics in an unused
state.
Modify the settings of the distributed port group and change the path policy and failover settings. On the Manage
physical network adapter page, perform the steps for multiple adapters.
Assign the vSAN VMkernel adapter (vmk2) to the distributed port group for vSAN.
NOTE
If you are only now migrating the uplinks for the vSAN network, you might not be able to change the distributed
port group settings until after the migration. During this time, vSAN might have communication issues. After the
migration, move to the distributed port group settings and make any policy changes and mark any uplinks to
be unused. vSAN networking then returns to normal when this task is finished. Use the vSAN health service to
verify that everything is functional.
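If you manage the teaming policy from PowerCLI, the vSAN distributed port group can be adjusted as sketched below; the port group and uplink port names are placeholders for this example, so substitute the names used in your environment. If the vSAN uplinks form a LAG, the LAG itself appears as the active uplink instead of individual uplink ports.

# On the vSAN distributed port group, mark the non-vSAN uplinks as unused
Get-VDPortgroup -Name "DPG-vSAN" |
    Get-VDUplinkTeamingPolicy |
    Set-VDUplinkTeamingPolicy -UnusedUplinkPort "Uplink1","Uplink2","Uplink3"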


Migrate VM Network
The final task needed to migrate the network from a standard vSwitch to a distributed vSwitch is to migrate the VM
network.
Manage host networking.
1. Right-click the vDS and choose menu Add and Manage Hosts.
2. Select all the hosts in the cluster, to migrate virtual machine networking for all hosts to the distributed vSwitch.
Do not move any uplinks. However, if the VM networking on your hosts used a different uplink, then migrate the uplink
from the standard vSwitch.
3. Select the VMs to migrate from a virtual machine network on the standard vSwitch to the virtual machine distributed
port group on the distributed vSwitch. Click Assign port group, and select the distributed port group.
4. Review the changes and click Finish. In this example, you are moving two VMs. Any templates using the original
standard vSwitch virtual machine network must be converted to virtual machines, and edited. The new distributed
port group for virtual machines must be selected as the network. This step cannot be achieved through the migration
wizard.
Since the standard vSwitch no longer has any uplinks or port groups, it can be safely removed.
This completes the migration from a vSphere Standard Switch to a vSphere Distributed Switch.

Checklist Summary for vSAN Network


Use the checklist summary to verify your vSAN network requirements.
• Check if you use shared 10Gb NIC or dedicated 1Gb NIC. All-flash clusters require 10Gb NICs.
• Verify that redundant NIC teaming connections are configured.
• Verify that flow control is enabled on the ESXi host NICs.
• Verify that VMkernel port for vSAN network traffic is configured on each host.
• Verify that you have identical VLAN, MTU and subnet across all interfaces.
• Verify that you can run vmkping successfully between all hosts. Use the health service to verify.
• If you use jumbo frames, verify that you can run vmkping successfully with a 9000-byte packet size between all hosts
(see the example after this checklist). Use the health service to verify.
• If your vSAN version is earlier than v6.6, verify that multicast is enabled on the network.


• If your vSAN version is earlier than v6.6 and multiple vSAN clusters are on the same network, configure multicast to
use unique multicast addresses.
• If your vSAN version is earlier than v6.6 and spans multiple switches, verify that multicast is configured across
switches.
• If your vSAN version is earlier than v6.6 and is routed, verify that PIM is configured to allow multicast routing.
• Ensure that the physical switch can meet vSAN requirements (multicast, flow control, feature interoperability).
• Verify that the network does not have performance issues, such as excessive dropped packets or pause frames.
• Verify that network limits are within acceptable margins.
• Test vSAN network performance with iperf, and verify that it meets expectations.
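The following is a minimal sketch of the vmkping checks referenced above, run from the ESXi shell. It assumes vmk2 is the vSAN VMkernel interface and 172.40.0.12 is another host's vSAN IP address; both are hypothetical.

# Basic reachability over the vSAN VMkernel interface
vmkping -I vmk2 172.40.0.12

# Jumbo frame check: an 8972-byte payload plus IP and ICMP headers produces a
# 9000-byte frame, and -d sets the don't-fragment bit so an MTU mismatch fails
vmkping -I vmk2 -d -s 8972 172.40.0.12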


Administering VMware vSAN


Administering VMware vSAN describes how to configure and manage a vSAN cluster in a VMware vSphere®
environment.
In addition, Administering VMware vSAN explains how to manage the local physical storage resources that serve as
storage capacity devices in a vSAN cluster, and how to define storage policies for virtual machines deployed to vSAN
datastores.
At VMware, we value inclusion. To foster this principle within our customer, partner, and internal community, we create
content using inclusive language.

Intended Audience
This information is for experienced virtualization administrators who are familiar with virtualization technology, day-to-day
data center operations, and vSAN concepts.
For more information about vSAN and how to create a vSAN cluster, see the vSAN Planning and Deployment Guide.
For more information about monitoring a vSAN cluster and fixing problems, see the vSAN Monitoring and Troubleshooting
Guide.

Updated Information
This document is updated with each release of the product or when necessary.
This table provides the update history of Administering VMware vSAN.

Revision: 25 JUL 2024
• Updated licensing information on vSAN Max in Sharing Remote vSAN Datastores.
• Added information about restoring deleted VMs in Restore a VM from a vSAN Snapshot.
• Additional minor updates.

Revision: 25 JUN 2024
• Initial release.

What Is vSAN
VMware vSAN is a distributed layer of software that runs natively as a part of the ESXi hypervisor.
vSAN aggregates local or direct-attached capacity devices of a host cluster and creates a single storage pool shared
across all hosts in the vSAN cluster. While supporting VMware features that require shared storage, such as HA, vMotion,
and DRS, vSAN eliminates the need for external shared storage and simplifies storage configuration and virtual machine
provisioning activities.

vSAN Concepts
VMware vSAN uses a software-defined approach that creates shared storage for virtual machines.
It virtualizes the local physical storage resources of ESXi hosts and turns them into pools of storage that can be
divided and assigned to virtual machines and applications according to their quality-of-service requirements. vSAN is
implemented directly in the ESXi hypervisor.


You can configure vSAN to work as either a hybrid or all-flash cluster. In hybrid clusters, flash devices are used for the
cache layer and magnetic disks are used for the storage capacity layer. In all-flash clusters, flash devices are used for
both cache and capacity.
You can activate vSAN on existing host clusters, or when you create a new cluster. vSAN aggregates all local capacity
devices into a single datastore shared by all hosts in the vSAN cluster. You can expand the datastore by adding capacity
devices or hosts with capacity devices to the cluster. vSAN works best when all ESXi hosts in the cluster share similar or
identical configurations across all cluster members, including similar or identical storage configurations. This consistent
configuration balances virtual machine storage components across all devices and hosts in the cluster. Hosts without any
local devices also can participate and run their virtual machines on the vSAN datastore.
In vSAN Original Storage Architecture (OSA), each host that contributes storage devices to the vSAN datastore must
provide at least one device for flash cache and at least one device for capacity. The devices on the contributing host
form one or more disk groups. Each disk group contains one flash cache device, and one or multiple capacity devices for
persistent storage. Each host can be configured to use multiple disk groups.
In vSAN Express Storage Architecture (ESA), all storage devices claimed by vSAN contribute to capacity and
performance. Each host's storage devices claimed by vSAN form a storage pool. The storage pool represents the amount
of caching and capacity provided by the host to the vSAN datastore.
For best practices, capacity considerations, and general recommendations about designing and sizing a vSAN cluster,
see the VMware vSAN Design and Sizing Guide.

Characteristics of vSAN
The following characteristics apply to vSAN, its clusters, and datastores.
vSAN includes numerous features to add resiliency and efficiency to your data computing and storage environment.

Table 21: vSAN Features

Supported Features Description

Shared storage support vSAN supports VMware features that require shared storage, such as HA,
vMotion, and DRS. For example, if a host becomes overloaded, DRS can
migrate virtual machines to other hosts in the cluster.
On-disk format vSAN on-disk virtual file format provides highly scalable snapshot and
clone management support per vSAN cluster. For information about
the number of virtual machine snapshots and clones supported per
vSAN cluster, refer to vSphere Configuration Maximums at https://configmax.esp.vmware.com/home.
All-flash and hybrid configurations vSAN can be configured for all-flash or hybrid cluster.
Fault domains vSAN supports configuring fault domains to protect hosts from rack or
chassis failures when the vSAN cluster spans across multiple racks or
blade server chassis in a data center.
File service vSAN file service enables you to create file shares in the vSAN datastore
that client workstations or VMs can access.
iSCSI target service vSAN iSCSI target service enables hosts and physical workloads that
reside outside the vSAN cluster to access the vSAN datastore.
vSAN Stretched cluster and Two node vSAN cluster vSAN supports stretched clusters that span across two geographic
locations.


Supported Features Description

Support for Windows Server Failover Clusters (WSFC) vSAN 6.7 Update 3 and later releases support SCSI-3 Persistent
Reservations (SCSI3-PR) on a virtual disk level required by Windows
Server Failover Cluster (WSFC) to arbitrate an access to a shared disk
between nodes. Support of SCSI-3 PRs enables configuration of WSFC
with a disk resource shared between VMs natively on vSAN datastores.
Currently the following configurations are supported:
• Up to 6 application nodes per cluster.
• Up to 64 shared virtual disks per node.
NOTE
Microsoft SQL Server 2012 or later running on Microsoft
Windows Server 2012 or later has been qualified on vSAN.
vSAN health service vSAN health service includes preconfigured health check tests to monitor,
troubleshoot, diagnose the cause of cluster component problems, and
identify any potential risk.
vSAN performance service vSAN performance service includes statistical charts used to monitor IOPS,
throughput, latency, and congestion. You can monitor performance of a
vSAN cluster, host, disk group, disk, and VMs.
Integration with vSphere storage features vSAN integrates with vSphere data management features traditionally used
with VMFS and NFS storage. These features include snapshots, linked
clones, and vSphere Replication.
Virtual Machine Storage Policies vSAN works with VM storage policies to support a VM-centric approach to
storage management.
If you do not assign a storage policy to the virtual machine during
deployment, the vSAN Default Storage Policy is automatically assigned to
the VM.
Rapid provisioning vSAN enables rapid provisioning of storage in the vCenter Server® during
virtual machine creation and deployment operations.
Deduplication and compression vSAN performs block-level deduplication and compression to save
storage space. When you enable deduplication and compression on a
vSAN all-flash cluster, redundant data within each disk group is reduced.
Deduplication and compression is a cluster-wide setting, but the functions
are applied on a disk group basis. Compression-only vSAN is applied on a
per-disk basis.
Data at rest encryption vSAN provides data at rest encryption. Data is encrypted after all other
processing, such as deduplication, is performed. Data at rest encryption
protects data on storage devices, in case a device is removed from the
cluster.
Data in transit encryption vSAN can encrypt data in transit across hosts in the cluster. When you
enable data-in-transit encryption, vSAN encrypts all data and metadata
traffic between hosts.
SDK support The VMware vSAN SDK is an extension of the VMware vSphere
Management SDK. It includes documentation, libraries and code examples
that help developers automate installation, configuration, monitoring, and
troubleshooting of vSAN.

vSAN Terms and Definitions


vSAN introduces specific terms and definitions that are important to understand.


Before you get started with vSAN, review the key vSAN terms and definitions.

Disk Group (vSAN Original Storage Architecture)


A disk group is a unit of physical storage capacity and performance on a host and a group of physical devices that provide
performance and capacity to the vSAN cluster. On each ESXi host that contributes its local devices to a vSAN cluster,
devices are organized into disk groups.
Each disk group must have one flash cache device and one or multiple capacity devices. The devices used for caching
cannot be shared across disk groups, and cannot be used for other purposes. A single caching device must be dedicated
to a single disk group. In hybrid clusters, flash devices are used for the cache layer and magnetic disks are used for the
storage capacity layer. In an all-flash cluster, flash devices are used for both cache and capacity. For information about
creating and managing disk groups, see Administering VMware vSAN.

Storage Pool (vSAN Express Storage Architecture)


A storage pool is a representation of all storage devices on a host that are claimed by vSAN. Each host contains one
storage pool. Each device in the storage pool contributes both capacity and performance. The number of storage devices
allowed is determined by the host configuration.

Consumed Capacity
Consumed capacity is the amount of physical capacity consumed by one or more virtual machines at any point. Many
factors determine consumed capacity, including the consumed size of your .vmdk files, protection replicas, and so on.
When calculating for cache sizing, do not consider the capacity used for protection replicas.

Object-Based Storage
vSAN stores and manages data in the form of flexible data containers called objects. An object is a logical volume that
has its data and metadata distributed across the cluster. For example, every .vmdk is an object, as is every snapshot.
When you provision a virtual machine on a vSAN datastore, vSAN creates a set of objects comprised of multiple
components for each virtual disk. It also creates the VM home namespace, which is a container object that stores all
metadata files of your virtual machine. Based on the assigned virtual machine storage policy, vSAN provisions and
manages each object individually, which might also involve creating a RAID configuration for every object.
NOTE
If vSAN Express Storage Architecture is enabled, every snapshot is not a new object. A base .vmdk and its
snapshots are contained in one vSAN object. Additionally, in vSAN ESA, digest is backed by vSAN objects.
When vSAN creates an object for a virtual disk and determines how to distribute the object in the cluster, it considers the
following factors:
• vSAN verifies that the virtual disk requirements are applied according to the specified virtual machine storage policy
settings.
• vSAN verifies that the correct cluster resources are used at the time of provisioning. For example, based on the
protection policy, vSAN determines how many replicas to create. The performance policy determines the amount of
flash read cache allocated for each replica and how many stripes to create for each replica and where to place them in
the cluster.
• vSAN continually monitors and reports the policy compliance status of the virtual disk. If you find any noncompliant
policy status, you must troubleshoot and resolve the underlying problem.
NOTE
When required, you can edit VM storage policy settings. Changing the storage policy settings does not affect
virtual machine access. vSAN actively throttles the storage and network resources used for reconfiguration
to minimize the impact of object reconfiguration to normal workloads. When you change VM storage policy


settings, vSAN might initiate an object recreation process and subsequent resynchronization. See vSAN
Monitoring and Troubleshooting.
• vSAN verifies that the required protection components, such as mirrors and witnesses, are placed on separate hosts
or fault domains. For example, to rebuild components during a failure, vSAN looks for ESXi hosts that satisfy the
placement rules where protection components of virtual machine objects must be placed on two different hosts, or
across fault domains.

vSAN Datastore
After you enable vSAN on a cluster, a single vSAN datastore is created. It appears as another type of datastore in the list
of datastores that might be available, including Virtual Volume, VMFS, and NFS. A single vSAN datastore can provide
different service levels for each virtual machine or each virtual disk. In vCenter Server®, storage characteristics of the
vSAN datastore appear as a set of capabilities. You can reference these capabilities when defining a storage policy for
virtual machines. When you later deploy virtual machines, vSAN uses this policy to place virtual machines in the optimal
manner based on the requirements of each virtual machine. For general information about using storage policies, see the
vSphere Storage documentation.
A vSAN datastore has specific characteristics to consider.
• vSAN provides a single vSAN datastore accessible to all hosts in the cluster, whether or not they contribute storage to
the cluster. Each host can also mount any other datastores, including Virtual Volumes, VMFS, or NFS.
• You can use Storage vMotion to move virtual machines between vSAN datastores, NFS datastores, and VMFS
datastores.
• Only magnetic disks and flash devices used for capacity can contribute to the datastore capacity. The devices used for
flash cache are not counted as part of the datastore.

Objects and Components


Each object is composed of a set of components, determined by capabilities that are in use in the VM Storage Policy.
For example, with Failures to tolerate set to 1, vSAN ensures that the protection components, such as replicas and
witnesses, are placed on separate hosts in the vSAN cluster, where each replica is an object component. In addition, in
the same policy, if the Number of disk stripes per object is configured to two or more, vSAN also stripes the object across
multiple capacity devices and each stripe is considered a component of the specified object. When needed, vSAN might
also break large objects into multiple components.
A vSAN datastore contains the following object types:
VM Home Namespace
The virtual machine home directory where all virtual machine configuration files are stored, such as .vmx, log files, .vmdk
files, and snapshot delta description files.
VMDK
A virtual machine disk or .vmdk file that stores the contents of the virtual machine's hard disk drive.
VM Swap Object
Created when a virtual machine is powered on.
Snapshot Delta VMDKs
Created when virtual machine snapshots are taken. Such delta disks are not created for vSAN Express Storage Architecture.
Memory object
Created when the snapshot memory option is selected when creating or suspending a virtual machine.

Virtual Machine Compliance Status: Compliant and Noncompliant


A virtual machine is considered noncompliant when one or more of its objects fail to meet the requirements of its assigned
storage policy. For example, the status might become noncompliant when one of the mirror copies is inaccessible. If your
virtual machines are in compliance with the requirements defined in the storage policy, the status of your virtual machines


is compliant. From the Physical Disk Placement tab on the Virtual Disks page, you can verify the virtual machine object
compliance status. For information about troubleshooting a vSAN cluster, see vSAN Monitoring and Troubleshooting.

Component State: Degraded and Absent States


vSAN acknowledges the following failure states for components:
• Degraded. A component is Degraded when vSAN detects a permanent component failure and determines that
the failed component cannot recover to its original working state. As a result, vSAN starts to rebuild the degraded
components immediately. This state might occur when a component is on a failed device.
• Absent. A component is Absent when vSAN detects a temporary component failure where components, including all its
data, might recover and return vSAN to its original state. This state might occur when you are restarting hosts or if you
unplug a device from a vSAN host. vSAN starts to rebuild the components in absent status after waiting for 60 minutes.

Object State: Healthy and Unhealthy


Depending on the type and number of failures in the cluster, an object might be in one of the following states:
• Healthy. When at least one full RAID 1 mirror is available, or the minimum required number of data segments are
available, the object is considered healthy.
• Unhealthy. An object is considered unhealthy when no full mirror is available or the minimum required number of data
segments are unavailable for RAID 5 or RAID 6 objects. If fewer than 50 percent of an object's votes are available, the
object is unhealthy. Multiple failures in the cluster can cause objects to become unhealthy. When the operational status
of an object is considered unhealthy, it impacts the availability of the associated VM.

Witness
A witness is a component that contains only metadata and does not contain any actual application data. It serves as
a tiebreaker when a decision must be made regarding the availability of the surviving datastore components, after a
potential failure. A witness consumes approximately 2 MB of space for metadata on the vSAN datastore when using on-
disk format 1.0, and 4 MB for on-disk format version 2.0 and later.
vSAN maintains a quorum by using an asymmetrical voting system where each component might have more than one
vote to decide the availability of objects. Greater than 50 percent of the votes that make up a VM’s storage object must
be accessible at all times for the object to be considered available. When 50 percent or fewer votes are accessible to all
hosts, the object is no longer accessible to the vSAN datastore. Inaccessible objects can impact the availability of the
associated VM.

Storage Policy-Based Management (SPBM)


When you use vSAN, you can define virtual machine storage requirements, such as performance and availability, in the
form of a policy. vSAN ensures that the virtual machines deployed to vSAN datastores are assigned at least one virtual
machine storage policy. When you know the storage requirements of your virtual machines, you can define storage
policies and assign the policies to your virtual machines. If you do not apply a storage policy when deploying virtual
machines, vSAN automatically assigns a default vSAN policy with Failures to tolerate set to 1, a single disk stripe for
each object, and thin provisioned virtual disk. For best results, define your own virtual machine storage policies, even
if the requirements of your policies are the same as those defined in the default storage policy. For information about
working with vSAN storage policies, see Administering VMware vSAN.

vSphere PowerCLI
VMware vSphere PowerCLI adds command-line scripting support for vSAN, to help you automate configuration and
management tasks. vSphere PowerCLI provides a Windows PowerShell interface to the vSphere API. PowerCLI includes
cmdlets for administering vSAN components. For information about using vSphere PowerCLI, see vSphere PowerCLI
Documentation.
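For example, the short session below (a sketch; the vCenter Server and cluster names are hypothetical) reports the vSAN configuration and space usage of a cluster:

# Connect to vCenter Server and select the vSAN cluster
Connect-VIServer -Server "vcsa-04.rainpole.com"
$cluster = Get-Cluster -Name "Cluster"

# Show whether vSAN is enabled and which services are configured
Get-VsanClusterConfiguration -Cluster $cluster

# Report capacity and free space on the vSAN datastore
Get-VsanSpaceUsage -Cluster $cluster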


How vSAN Differs from Traditional Storage


Although vSAN shares many characteristics with traditional storage arrays, the overall behavior and function of vSAN is
different.
For example, vSAN can manage and work only with ESXi hosts, and a single vSAN instance provides a single datastore
for the cluster.
vSAN and traditional storage also differ in the following key ways:
• vSAN does not require external networked storage for storing virtual machine files remotely, such as on a Fibre
Channel (FC) or Storage Area Network (SAN).
• Using traditional storage, the storage administrator preallocates storage space on different storage systems. vSAN
automatically turns the local physical storage resources of the ESXi hosts into a single pool of storage. These pools
can be divided and assigned to virtual machines and applications according to their quality-of-service requirements.
• vSAN does not behave like traditional storage volumes based on LUNs or NFS shares. The iSCSI target service uses
LUNs to enable an initiator on a remote host to transport block-level data to a storage device in the vSAN cluster.
• Some standard storage protocols, such as FCP, do not apply to vSAN.
• vSAN is highly integrated with vSphere. You do not need dedicated plug-ins or a storage console for vSAN, compared
to traditional storage. You can deploy, manage, and monitor vSAN by using the vSphere Client.
• A dedicated storage administrator does not need to manage vSAN. Instead a vSphere administrator can manage a
vSAN environment.
• With vSAN, VM storage policies are automatically assigned when you deploy new VMs. The storage policies can be
changed dynamically as needed.

Building a vSAN Cluster


You can choose the storage architecture and deployment option when creating a vSAN cluster.
Choose the vSAN storage architecture that best suits your resources and your needs.

vSAN Original Storage Architecture


vSAN Original Storage Architecture (OSA) is designed for a wide range of storage devices, including flash solid state
drives (SSD) and magnetic disk drives (HDD). Each host that contributes storage contains one or more disk groups. Each
disk group contains one flash cache device and one or more capacity devices.


vSAN Express Storage Architecture


vSAN Express Storage Architecture (ESA) is designed for high-performance NVMe based TLC flash devices and high
performance networks. Each host that contributes storage contains a single storage pool of one or more flash devices.
Each flash device provides caching and capacity to the cluster.

Depending on your requirement, you can deploy vSAN in the following ways.

vSAN ReadyNode
The vSAN ReadyNode is a preconfigured solution of the vSAN software provided by VMware partners, such as Cisco,
Dell, HPE, Fujitsu, IBM, and Supermicro. This solution includes validated server configuration in a tested, certified
hardware form factor for vSAN deployment that is recommended by the server OEM and VMware. For information about
the vSAN ReadyNode solution for a specific partner, visit the VMware Partner website.

User-Defined vSAN Cluster


You can build a vSAN cluster by selecting individual software and hardware components, such as drivers, firmware,
and storage I/O controllers that are listed in the vSAN Compatibility Guide (VCG) website at http://www.vmware.com/
resources/compatibility/search.php. You can choose any servers, storage I/O controllers, capacity and flash cache
devices, memory, any number of cores you must have per CPU, that are certified and listed on the VCG website. Review
the compatibility information on the VCG website before choosing software and hardware components, drivers, firmware,
and storage I/O controllers that vSAN supports. When designing a vSAN cluster, use only devices, firmware, and drivers
that are listed on the VCG website. Using software and hardware versions that are not listed in the VCG might cause
cluster failure or unexpected data loss. For information about designing a vSAN cluster, see "Designing and Sizing a
vSAN Cluster" in vSAN Planning and Deployment.

vSAN Deployment Options


This section covers the supported deployment options for vSAN clusters.

Single Site vSAN Cluster


A single site vSAN cluster consists of a minimum of three hosts. Typically, all hosts in a single site vSAN cluster reside at
a single site, and are connected on the same Layer 2 network. All-flash configurations require 10 Gb network connections,
as does vSAN Express Storage Architecture.
For more information, see Creating a Single Site vSAN Cluster .


Two-Node vSAN Cluster


Two-node vSAN clusters are often used for remote office/branch office environments, typically running a small number of
workloads that require high availability. A two-node vSAN cluster consists of two hosts at the same location, connected
to the same network switch or directly connected. You can configure a two-node vSAN cluster that uses a third host as a
witness, which can be located remotely from the branch office. Usually the witness resides at the main site, along with the
vCenter Server.
For more information, see Creating a vSAN Stretched Cluster or Two-Node vSAN Cluster.

vSAN Stretched Cluster


A vSAN stretched cluster provides resiliency against the loss of an entire site. The hosts in a vSAN stretched cluster are
distributed evenly across two sites. The two sites must have a network latency of no more than five milliseconds (5 ms). A
vSAN witness host resides at a third site to provide the witness function. The witness also acts as tie-breaker in scenarios
where a network partition occurs between the two data sites. Only metadata such as witness components is stored on the
witness.
For more information, see Creating a vSAN Stretched Cluster or Two-Node vSAN Cluster.


Integrate vSAN with Other VMware Software


After you have vSAN up and running, it is integrated with the rest of the VMware software stack.
You can do most of what you can do with traditional storage by using vSphere components and features including
vSphere vMotion, snapshots, clones, Distributed Resource Scheduler (DRS), vSphere High Availability, VMware Site
Recovery Manager, and more.

vSphere HA
You can enable vSphere HA and vSAN on the same cluster. As with traditional datastores, vSphere HA provides the same
level of protection for virtual machines on vSAN datastores. This level of protection imposes specific restrictions when
vSphere HA and vSAN interact. For specific considerations about integrating vSphere HA and vSAN, see Using vSAN
and vSphere HA.

VMware Horizon View


You can integrate vSAN with VMware Horizon View. When integrated, vSAN provides the following benefits to virtual
desktop environments:
• High-performance storage with automatic caching
• Storage policy-based management, for automatic remediation
For information about integrating vSAN with VMware Horizon, see the VMware with Horizon View documentation. For
designing and sizing VMware Horizon View for vSAN, see the Designing and Sizing Guide for Horizon View.

Limitations of vSAN
This topic discusses the limitations of vSAN.
When working with vSAN, consider the following limitations:
• vSAN does not support hosts participating in multiple vSAN clusters. However, a vSAN host can access other external
storage resources that are shared across clusters.
• vSAN does not support vSphere DPM and Storage I/O Control.
• vSAN does not support SE Sparse disks.
• vSAN does not support RDM, VMFS, diagnostic partition, and other device access features.


Configuring and Managing a vSAN Cluster


You can configure and manage a vSAN cluster by using the vSphere Client, esxcli commands, and other tools.

Configure a Cluster for vSAN Using the vSphere Client


You can use the vSphere Client to configure vSAN on an existing cluster.
Verify that your environment meets all requirements. See "Requirements for Enabling vSAN" in vSAN Planning and
Deployment.


Create a cluster and add hosts to the cluster before enabling and configuring vSAN. Configure the port properties on each
host to add the vSAN service.
NOTE
You can use Quickstart to quickly create and configure a vSAN cluster. For more information, see "Using
Quickstart to Configure and Expand a vSAN Cluster" in vSAN Planning and Deployment .
1. Navigate to an existing host cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.

a) Select an HCI configuration type.


• vSAN HCI provides compute resources and storage resources. The datastore can be shared across clusters in
the same data center, and across clusters managed by remote vCenters.
• vSAN Compute Cluster provides vSphere compute resources only. It can mount datastores served by vSAN
Max clusters in the same data center and from remote vCenters.
• vSAN Max (vSAN ESA clusters) provides storage resources, but not compute resources. The datastore can be
mounted by client vSphere clusters and vSAN clusters in the same data center and from remote vCenters.

b) Select a deployment option (Single site vSAN cluster, Two node vSAN cluster, or vSAN stretched cluster).
c) Click Configure to open the Configure vSAN wizard.

4. Select vSAN ESA if your cluster is compatible, and click Next.


5. Configure the vSAN services to use, and click Next.
Configure data management features, including deduplication and compression, data-at-rest encryption, data-in-transit
encryption. Select RDMA (remote direct memory access) if your network supports it.
6. Claim disks for the vSAN cluster, and click Next.
For vSAN Original Storage Architecture (vSAN OSA), each host that contributes storage requires at least one flash
device for cache, and one or more devices for capacity. For vSAN Express Storage Architecture (vSAN ESA), each
host that contributes storage requires one or more flash devices.
7. Create fault domains to group hosts that can fail together.
8. Review the configuration, and click Finish.

Enabling vSAN creates a vSAN datastore and registers the vSAN storage provider. vSAN storage providers are built-in
software components that communicate the storage capabilities of the datastore to vCenter Server.
Verify that the vSAN datastore has been created. See View vSAN Datastore.
Verify that the vSAN storage provider is registered.
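You can also enable vSAN from PowerCLI instead of the wizard. The sketch below applies a default configuration to a hypothetical cluster named Cluster; features such as vSAN ESA, encryption, and fault domains still need to be configured separately.

# Enable vSAN on an existing cluster with default settings
Set-Cluster -Cluster "Cluster" -VsanEnabled:$true -Confirm:$false

# Confirm the resulting vSAN configuration
Get-VsanClusterConfiguration -Cluster "Cluster"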


Enable vSAN on an Existing Cluster


You can enable vSAN on an existing cluster, and configure features and services.
Verify that your environment meets all requirements. See "Requirements for Enabling vSAN" in vSAN Planning and
Deployment.
1. Navigate to an existing host cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.
a) Select a configuration type (Single site vSAN cluster, Two node vSAN cluster, or vSAN stretched cluster).
b) Select I need local vSAN Datastore if you plan to add disk groups or storage pools to the cluster hosts.
c) Click Configure to open the Configure vSAN wizard.
4. Select vSAN ESA if your cluster is compatible, and click Next.
5. Configure the vSAN services to use, and click Next.
Configure data management features, including deduplication and compression, data-at-rest encryption, data-in-transit
encryption. Select RDMA (remote direct memory access) if your network supports it.
6. Claim disks for the vSAN cluster, and click Next.
For vSAN Original Storage Architecture (vSAN OSA), each host that contributes storage requires at least one flash
device for cache, and one or more devices for capacity. For vSAN Express Storage Architecture (vSAN ESA), each
host that contributes storage requires one or more flash devices.
7. Create fault domains to group hosts that can fail together.
8. Review the configuration, and click Finish.

Turn Off vSAN


You can turn off vSAN for a host cluster.
Verify that the hosts are in maintenance mode. For more information, see Place a Member of vSAN Cluster in
Maintenance Mode.
When you turn off vSAN for a cluster, all virtual machines and data services located on the vSAN datastore become
inaccessible. If you have consumed storage on the vSAN cluster using vSAN Direct, then the vSAN Direct monitoring
services, such as health checks, space reporting, and performance monitoring, are not available. If you intend to use
virtual machines while vSAN is off, make sure you migrate virtual machines from vSAN datastore to another datastore
before turning off the vSAN cluster.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.
4. Click Turn Off vSAN.
5. On the Turn Off vSAN dialog, confirm your selection.

Edit vSAN Settings


You can edit the settings of your vSAN cluster to configure data management features and enable services provided by
the cluster.
Edit the settings of an existing vSAN cluster if you want to enable deduplication and compression, or to enable encryption.
If you enable deduplication and compression, or if you enable encryption, the on-disk format of the cluster is automatically
upgraded to the latest version.


1. Navigate to the vSAN cluster.


2. Click the Configure tab.
a) Under vSAN, select Services.
b) Click the Edit or Enable button for the service you want to configure.
• Configure Storage. Click Mount Remote Datastores to use storage from other vSAN clusters.
• Configure vSAN performance service. For more information, see "Monitoring vSAN Performance" in vSAN
Monitoring and Troubleshooting.
• Enable File Service. For more information, see "vSAN File Service" in Administering VMware vSAN .


• Configure vSAN Network options. For more information, see "Configuring vSAN Network" in vSAN Planning
and Deployment.
• Configure iSCSI target service. For more information, see "Using the vSAN iSCSI Target Service" in
Administering VMware vSAN.
• Configure Data Services, including deduplication and compression, data-at-rest encryption, and data-in-transit
encryption.
• Configure vSAN Data Protection. Before you can use vSAN Data Protection, you must deploy the vSAN
Snapshot Service. For more information, see "Deploying the Snapshot Service Appliance" in Administering
VMware vSAN.
• Configure capacity reservations and alerts. For more information, see "About Reserved Capacity" in vSAN
Monitoring and Troubleshooting.
• Configure advanced options:
– Object Repair Timer
– Site Read Locality for vSAN stretched clusters
– Thin Swap provisioning
– Large Cluster Support for up to 64 hosts
– Automatic Rebalance
• Configure vSAN historical health service.
c) Modify the settings to match your requirements.
3. Click Apply to confirm your selections.

View vSAN Datastore


After you enable vSAN, a single datastore is created. You can review the capacity of the vSAN datastore.


Configure vSAN and add disk groups or storage pools.

1. Navigate to Storage.
2. Select the vSAN datastore.
3. Click the Configure tab.
4. Review the vSAN datastore capacity.
The size of the vSAN datastore depends on the number of capacity devices per ESXi host and the number of ESXi
hosts in the cluster. For example, if a host has seven 2 TB capacity devices, and the cluster includes eight hosts,
the approximate storage capacity is 7 x 2 TB x 8 = 112 TB. When using the all-flash configuration, flash devices are
used for capacity. For hybrid configuration, magnetic disks are used for capacity.
Some capacity is allocated for metadata.
• On-disk format version 1.0 adds approximately 1 GB per capacity device.
• On-disk format version 2.0 adds capacity overhead, typically no more than 1-2 percent capacity per device.
• On-disk format version 3.0 and later adds capacity overhead, typically no more than 1-2 percent capacity
per device. Deduplication and compression with software checksum enabled require additional overhead of
approximately 6.2 percent capacity per device.
Create a storage policy for virtual machines using the storage capabilities of the vSAN datastore. For information, see the
vSphere Storage documentation.

Upload Files or Folders to vSAN Datastores


You can upload vmdk files to a vSAN datastore.
You can also upload folders to a vSAN datastore. For more information about datastores, see vSphere Storage. When
you upload a vmdk file to a vSAN datastore, the following considerations apply:
• You can upload only stream-optimized vmdk files to a vSAN datastore. VMware stream-optimized file format is a
monolithic sparse format compressed for streaming. If you want to upload a vmdk file that is not in stream-optimized
format, then, before uploading, convert it to stream-optimized format using the vmware-vdiskmanager command-line
utility (see the example following this list). For more information, see Virtual Disk Manager User's Guide.
• When you upload a vmdk file to a vSAN datastore, the vmdk file inherits the default policy of that datastore. The
vmdk does not inherit the policy of the VM from which it was downloaded. vSAN creates the objects by applying the
vsanDatastore default policy, which is RAID-1. You can change the default policy of the datastore. See Change the
Default Storage Policy for vSAN Datastores .
• You must upload a vmdk file to the VM home folder.
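If you script the conversion step mentioned in the considerations above, a small wrapper around the vmware-vdiskmanager utility is usually enough. The sketch below is illustrative only: it assumes the utility is installed and on the PATH, the file names are placeholders, and the -t 5 disk type is assumed to select the stream-optimized format, so confirm the option against the Virtual Disk Manager User's Guide for your release.

import subprocess

def convert_to_stream_optimized(src_vmdk, dst_vmdk):
    # -r converts the source disk; -t 5 requests the compressed, stream-optimized format.
    subprocess.run(["vmware-vdiskmanager", "-r", src_vmdk, "-t", "5", dst_vmdk], check=True)

convert_to_stream_optimized("web01.vmdk", "web01_stream.vmdk")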
1. Navigate to vSAN Datastore.
2. Click the Files tab.
Option Description
Upload Files 1. Select the target folder and click Upload Files. You see a
message informing that you can upload vmdk files only in
VMware stream-optimized format. If you try uploading a
vmdk file in a different format, you see an internal server
error message.
2. Click Upload.
3. Locate the item to upload on the local computer and click
Open.
Upload Folders 1. Select the target folder and click Upload Folder. You see
a message informing that you can upload vmdk files only in
VMware stream-optimized format.
2. Click Upload.
3. Locate the item to upload on the local computer and click
Open.

Download Files or Folders from vSAN Datastores


You can download files and folders from a vSAN datastore.
For more information about datastores, see vSphere Storage. The vmdk files are downloaded as stream-optimized files
with the filename <vmdkName>_stream.vmdk . VMware stream-optimized file format is a monolithic sparse format
compressed for streaming.
You can convert a VMware stream-optimized vmdk file to other vmdk file formats using the vmware-vdiskmanager
command‐line utility. For more information, see Virtual Disk Manager User’s Guide.
1. Navigate to vSAN Datastore.
2. Click the Files tab and then click Download.
You see a message alerting you that vmdk files are downloaded from the vSAN datastores in VMware stream-
optimized format with the filename extension .stream.vmdk .
3. Click Download.
4. Locate the item to download and then click Download.

Using vSAN Policies


When you use vSAN, you can define virtual machine storage requirements, such as performance and availability, in a
policy.
vSAN ensures that each virtual machine deployed to vSAN datastores is assigned at least one storage policy. After they
are assigned, the storage policy requirements are pushed to the vSAN layer when a virtual machine is created. The virtual
device is distributed across the vSAN datastore to meet the performance and availability requirements.


vSAN uses storage providers to supply information about underlying storage to the vCenter Server. This information helps
you to make appropriate decisions about virtual machine placement, and to monitor your storage environment.

What are vSAN Policies


vSAN storage policies define storage requirements for your virtual machines.
These policies determine how the virtual machine storage objects are provisioned and allocated within the datastore to
guarantee the required level of service. When you enable vSAN on a host cluster, a single vSAN datastore is created and
a default storage policy is assigned to the datastore.
When you know the storage requirements of your virtual machines, you can create a storage policy referencing
capabilities that the datastore advertises. You can create several policies to capture different types or classes of
requirements.
Each virtual machine deployed to vSAN datastores is assigned at least one virtual machine storage policy. You can assign
storage policies when you create or edit virtual machines.
NOTE
If you do not assign a storage policy to a virtual machine, vSAN assigns a default policy. The default policy has
Failures to tolerate set to 1, a single disk stripe per object, and a thin-provisioned virtual disk.
The VM swap object and the VM snapshot memory object adhere to the storage policies assigned to a VM, with Failures
to tolerate set to 1. They might not have the same availability as other objects that have been assigned a policy with a
different value for Failures to tolerate.
NOTE
If vSAN Express Storage Architecture is enabled, snapshots are not separate objects. A base VMDK and its
snapshots are contained in one vSAN object. In addition, in vSAN ESA, the digest is backed by a vSAN object. This
differs from vSAN Original Storage Architecture.


Table 22: Storage Policy - Availability

Capability Description

Failures to tolerate (FTT) Defines the number of host and device failures that a virtual machine object can
tolerate. For n failures tolerated, each piece of data written is stored in n+1 places,
including parity copies if using RAID-5 or RAID-6.
If fault domains are configured, 2n+1 fault domains with hosts contributing capacity
are required. A host which does not belong to a fault domain is considered its own
single-host fault domain.
You can select a data replication method that optimizes for performance or
capacity. RAID-1 (Mirroring) uses more disk space to place the components of
objects but provides better performance for accessing the objects. RAID-5/6
(Erasure Coding) uses less disk space, but performance is reduced. You can select
one of the following options:
• No data redundancy: Specify this option if you do not want vSAN to protect
a single mirror copy of virtual machine objects. This means that your data
is unprotected, and you might lose data when the vSAN cluster encounters
a device failure. The host might experience unusual delays when entering
maintenance mode. The delays occur because vSAN must evacuate the object
from the host for the maintenance operation to complete successfully.
• No data redundancy with host affinity: Specify this option only if you want
to run vSAN Shared Nothing Architecture (SNA) workloads on the vSAN Data
Persistence Platform.
• 1 failure - RAID-1 (Mirroring): Specify this option if your VM object can tolerate
one host or device failure. To protect a 100 GB VM object by using RAID-1
(Mirroring) with an FTT of 1, you consume 200 GB.
• 1 failure - RAID-5 (Erasure Coding): Specify this option if your VM object
can tolerate one host or device failure. For vSAN OSA, to protect a 100 GB
VM object by using RAID-5 (Erasure Coding) with an FTT of 1, you consume
133.33 GB.
NOTE
If you use vSAN Express Storage Architecture, vSAN creates an
optimized RAID-5 format based on the cluster size. If the number
of hosts in the cluster is less than 6, vSAN creates a RAID-5 (2+1)
format. If the number of hosts is equal to or greater than 6, vSAN
creates a RAID-5 (4+1) format. When the cluster size eventually
expands or shrinks, vSAN automatically readjusts the format after 24
hours from the configuration change.
• 2 failures - RAID-1 (Mirroring): Specify this option if your VM object can
tolerate up to two device failures. Since you need to have an FTT of 2 using
RAID-1 (Mirroring), there is a capacity overhead. To protect a 100 GB VM
object by using RAID-1(Mirroring) with an FTT of 2, you consume 300 GB.
• 2 failures - RAID-6 (Erasure Coding): Specify this option if your VM objects
can tolerate up to two device failures. To protect a 100 GB VM object by using
RAID-6 (Erasure Coding) with an FTT of 2, you consume 150 GB. For more
information, refer to Using RAID 5 or RAID 6 Erasure Coding in vSAN Cluster.
• 3 failures - RAID-1 (Mirroring): Specify this option if your VM objects can
tolerate up to three device failures. To protect a 100 GB VM object by using
RAID-1 (Mirroring) with an FTT of 3, you consume 400 GB.
NOTE
If you create a storage policy and you do not specify a value for FTT,
vSAN creates a single mirror copy of the VM objects. It can tolerate a
single failure. However, if multiple component failures occur, your data
might be at risk.
Site disaster tolerance This rule defines whether to use a standard, stretched, or 2-node cluster. If you use
a vSAN stretched cluster, you can define whether data is mirrored at both sites or
only at one site. For a vSAN stretched cluster, you can choose to keep data on the
Preferred or Secondary site for host affinity.

• None - standard cluster is the default value. This means that there is no site
disaster tolerance.
• Host mirroring - 2 node cluster defines the number of additional failures that
an object can tolerate after the number of failures defined by FTT is reached.
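The capacity figures quoted above (200 GB for RAID-1 with FTT=1, 133.33 GB for RAID-5, and so on) follow from fixed multipliers. The sketch below restates those multipliers for a 100 GB object on vSAN OSA; it illustrates the table rather than acting as an exact space calculator, and vSAN ESA adapts its RAID-5 geometry to the cluster size, so its overhead differs.

# Raw capacity consumed by a 100 GB object under the Failures to tolerate options above (vSAN OSA).
MULTIPLIERS = {
    "RAID-1, FTT=1": 2.0,        # two full copies             -> 200 GB
    "RAID-5, FTT=1": 4.0 / 3,    # 3 data + 1 parity segments  -> 133.33 GB
    "RAID-1, FTT=2": 3.0,        # three full copies           -> 300 GB
    "RAID-6, FTT=2": 6.0 / 4,    # 4 data + 2 parity segments  -> 150 GB
    "RAID-1, FTT=3": 4.0,        # four full copies            -> 400 GB
}

logical_gb = 100
for policy, factor in MULTIPLIERS.items():
    print(f"{policy}: {logical_gb * factor:.2f} GB consumed")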

Table 23: Storage Policy - Storage rules

Capability Description

Encryption services Defines the encryption options for the VMs that you deploy to your datastore.
Choose one of the following options:
• Data-At-Rest encryption: Specify this option if you want to apply encryption to
the data that is stored in your datastore.
• No encryption: Specify this option if you do not want to apply any form of
encryption to your data.
• No preference: Specify this option if you do not want to explicitly apply any
encryption rules. By selecting this option, vSAN applies both rules to your VMs.
Space efficiency Defines the space efficiency options for the VMs that you deploy to your datastore.
Choose one of the following options:
• Deduplication and compression: Specify this option if you want to apply both
deduplication and compression to your data.
• Compression only: Specify this option if you want to apply only compression
to your data.
NOTE
For vSAN Original Storage Architecture, compression is a cluster-
level setting. For vSAN Express Storage Architecture, compression
only is performed at the object level. This means that you can use
compression for one VM but not for another VM in the same cluster.
• No space efficiency: Specify this option if you do not want to apply
compression to your objects.
• No preference: Specify this option if you do not want to explicitly apply
any space efficiency rules. By selecting this option, vSAN applies all space
efficiency rules to your VMs.
Storage tier Specify the storage tier for all VMs with the defined storage policy. Choose one of
the following options:
• All flash: Specify this option if you want to make your VMs compatible with all-
flash environment.
• Hybrid: Specify this option if you want to make your VMs compatible with only
hybrid environment.
• No preference: Specify this option if you do not want to explicitly apply any
storage tier rules. By selecting this option, vSAN makes the VMs compatible
with both hybrid and all flash environments.


Table 24: Storage Policy - Advanced Policy Rules

Capability Description

Number of disk stripes per object The minimum number of capacity devices across which each replica of a virtual
machine object is striped. A value higher than 1 might result in better performance,
but also results in higher use of system resources.
Default value is 1. Maximum value is 12.
Do not change the default striping value.
In a hybrid environment, the disk stripes are spread across magnetic disks. For an
all-flash configuration, the striping is across flash devices that make up the capacity
layer. Make sure that your vSAN environment has sufficient capacity devices
present to accommodate the request.
IOPS limit for object Defines the IOPS limit for an object, such as a VMDK. IOPS is calculated as the
number of I/O operations, using a weighted size. If the system uses the default
base size of 32 KB, a 64-KB I/O represents two I/O operations.
When calculating IOPS, read and write are considered equivalent, but cache hit
ratio and sequentiality are not considered. If a disk’s IOPS exceeds the limit, I/O
operations are throttled. If the IOPS limit for object is set to 0, IOPS limits are not
enforced.
vSAN allows the object to double the rate of the IOPS limit during the first second
of operation or after a period of inactivity.
Object space reservation Percentage of the logical size of the virtual machine disk (vmdk) object that must
be reserved, or thick provisioned when deploying virtual machines. The following
options are available:
• Thin provisioning (default)
• 25% reservation
• 50% reservation
• 75% reservation
• Thick provisioning
Flash read cache reservation (%) Flash capacity reserved as read cache for the virtual machine object. Specified as
a percentage of the logical size of the virtual machine disk (vmdk) object. Reserved
flash capacity cannot be used by other objects. Unreserved flash is shared fairly
among all objects. Use this option only to address specific performance issues.
You do not have to set a reservation to get cache. Setting read cache reservations
might cause a problem when you move the virtual machine object because the
cache reservation settings are always included with the object.
The Flash Read Cache Reservation storage policy attribute is supported only for
hybrid storage configurations. Do not use this attribute when defining a VM storage
policy for an all-flash cluster or for a vSAN ESA cluster.
Default value is 0%. Maximum value is 100%.
NOTE
By default, vSAN dynamically allocates read cache to storage objects
based on demand. This feature represents the most flexible and the
most optimal use of resources. As a result, typically, you do not need to
change the default 0 value for this parameter.
Note: To increase the value when solving a performance problem,
exercise caution. Over-provisioned cache reservations across several
virtual machines can cause flash device space to be wasted on over-
reservations. These cache reservations are not available to service the
workloads that need the required space at a given time. This space
wasting and unavailability might lead to performance degradation.


Disable Object Checksum If the option is set to No, the object calculates checksum information to ensure
the integrity of its data. If this option is set to Yes, the object does not calculate
checksum information.
vSAN uses end-to-end checksum to ensure the integrity of data by confirming that
each copy of a file is exactly the same as the source file. The system checks the
validity of the data during read/write operations, and if an error is detected, vSAN
repairs the data or reports the error.
If a checksum mismatch is detected, vSAN automatically repairs the data by
overwriting the incorrect data with the correct data. Checksum calculation and
error-correction are performed as background operations.
The default setting for all objects in the cluster is No, which means that checksum
is enabled.
NOTE
For vSAN Express Storage Architecture, object checksum is always on
and cannot be deactivated.

Force provisioning If the option is set to Yes, the object is provisioned even if the Failures to tolerate,
Number of disk stripes per object, and Flash read cache reservation policies
specified in the storage policy cannot be satisfied by the datastore. Use this
parameter in bootstrapping scenarios and during an outage when standard
provisioning is no longer possible.
The default No is acceptable for most production environments. vSAN fails to
provision a virtual machine when the policy requirements are not met, but it
successfully creates the user-defined storage policy.

When working with virtual machine storage policies, you must understand how the storage capabilities affect the
consumption of storage capacity in the vSAN cluster. For more information about designing and sizing considerations of
storage policies, refer to "Designing and Sizing a vSAN Cluster" in vSAN Planning and Deployment.
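One of the advanced rules in the table above, IOPS limit for object, counts I/O against a weighted size. The sketch below restates that weighting with the default 32 KB base size; it is an illustration of the rule as described here, not vSAN's internal accounting.

import math

BASE_IO_SIZE_KB = 32  # default weighting base for the IOPS limit rule

def weighted_io_count(io_size_kb, io_count):
    # Each I/O counts as ceil(size / base) operations; reads and writes are weighted the same.
    return io_count * math.ceil(io_size_kb / BASE_IO_SIZE_KB)

print(weighted_io_count(64, 1))    # a single 64 KB I/O counts as 2 operations
print(weighted_io_count(16, 100))  # 100 small I/Os count as 100 operations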

How vSAN Manages Policy Changes


vSAN 6.7 Update 3 and later manages policy changes to reduce the amount of transient space consumed across the
cluster.
Transient capacity is generated when vSAN reconfigures objects for a policy change.
When you modify a policy, the change is accepted but not applied immediately. vSAN batches the policy change requests
and performs them asynchronously, to maintain a fixed amount of transient space.
Policy changes are rejected immediately for non-capacity related reasons, such as changing a RAID-5 policy to RAID-6
on a five-host cluster.
You can view transient capacity usage in the vSAN Capacity monitor. To verify the status of a policy change on an object,
use the vSAN health service to check the vSAN object health.

View vSAN Storage Providers


Enabling vSAN automatically configures and registers a storage provider for each host in the vSAN cluster.
vSAN storage providers are built-in software components that communicate datastore capabilities to vCenter Server.
A storage capability typically is represented by a key-value pair, where the key is a specific property offered by the
datastore. The value is a number or range that the datastore can provide for a provisioned object, such as a virtual
machine home namespace object or a virtual disk. You can also use tags to create user-defined storage capabilities and
reference them when defining a storage policy for a virtual machine. For information about how to apply and use tags with
datastores, see the vSphere Storage documentation.


The vSAN storage providers report a set of underlying storage capabilities to vCenter Server. They also communicate with
the vSAN layer to report the storage requirements of the virtual machines. For more information about storage providers,
see the vSphere Storage documentation.
vSAN 6.7 and later releases register only one vSAN Storage Provider for all the vSAN clusters managed by the vCenter
Server using the following URL:
https://<VC fqdn>:<VC https port>/vsan/vasa/version.xml
Verify that the storage providers are registered.
1. Navigate to the vCenter Server in the vSphere Client.
2. Click the Configure tab, and click Storage Providers.

The storage provider for vSAN appears on the list.


NOTE
You cannot manually unregister storage providers used by vSAN. To remove or unregister the vSAN storage
providers, remove corresponding hosts from the vSAN cluster and then add the hosts back. Make sure that at
least one storage provider is active.
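As a quick, unofficial reachability check, you can fetch the version.xml URL listed above from a management host. The sketch below uses the third-party requests library; the vCenter name and port are placeholders, and certificate verification is skipped only to keep the example short.

import requests

# Placeholders: substitute your vCenter FQDN and HTTPS port.
url = "https://vcenter.example.com:443/vsan/vasa/version.xml"
response = requests.get(url, verify=False, timeout=10)  # use proper CA validation in practice
print(response.status_code)   # 200 indicates the provider endpoint is responding
print(response.text[:200])    # beginning of the version.xml payload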

What are vSAN Default Storage Policies


vSAN requires that the virtual machines deployed on the vSAN datastores are assigned at least one storage policy.
When provisioning a virtual machine, if you do not explicitly assign a storage policy, vSAN assigns a default storage policy
to the virtual machine. Each default policy contains vSAN rule sets and a set of basic storage capabilities, typically used
for the placement of virtual machines deployed on vSAN datastores.

Table 25: vSAN Default Storage Policy Specifications

Specification Setting

Failures to tolerate 1
Number of disk stripes per object 1
Flash read cache reservation, or flash capacity used for the read 0
cache
Object space reservation 0
NOTE
Setting the Object space reservation to zero means
that the virtual disk is thin provisioned, by default.

Force provisioning No

If you use a vSAN Express Storage Architecture cluster, depending on your cluster size, you can use one of the ESA
policies listed here.

Table 26: vSAN ESA Default Storage Policy Specifications - RAID-5

Specification Setting

Failures to tolerate 1
Number of disk stripes per object 1
Flash read cache reservation, or flash capacity used for the read 0
cache


Object space reservation Thin provisioning


Force provisioning No

NOTE
RAID-5 in vSAN ESA supports three host clusters. If you enable auto-policy management, the cluster must have
four hosts to use RAID-5.

Table 27: vSAN ESA Default Storage Policy Specifications - RAID-6

Specification Setting

Failures to tolerate 2
Number of disk stripes per object 1
Flash read cache reservation, or flash capacity used for the read 0
cache
Object space reservation Thin provisioning
Force provisioning No

NOTE
To use RAID-6, you must have at least six hosts in the cluster.
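Taken together, the notes above amount to a simple host-count rule of thumb for which ESA default policy applies. The sketch below only restates those notes; it is not vSAN's exact auto-policy logic, and stretched or 2-node clusters follow different rules.

def esa_default_raid(hosts, auto_policy=False):
    # Restates the host-count notes above; not the exact auto-policy algorithm.
    if hosts >= 6:
        return "RAID-6, FTT=2"                  # Table 27
    min_raid5_hosts = 4 if auto_policy else 3   # auto-policy management requires four hosts for RAID-5
    if hosts >= min_raid5_hosts:
        return "RAID-5, FTT=1"                  # Table 26
    return "RAID-1 (Mirroring)"                 # too few hosts for erasure coding

print(esa_default_raid(3))                      # RAID-5, FTT=1
print(esa_default_raid(3, auto_policy=True))    # RAID-1 (Mirroring)
print(esa_default_raid(8))                      # RAID-6, FTT=2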
You can review the configuration settings for the default virtual machine storage policy when you navigate to the VM
Storage Policies > name of the default storage policy > Rule-Set 1: VSAN.
For best results, consider creating and using your own VM storage policies, even if the requirements of the policy are
same as those defined in the default storage policy. In some cases, when you scale up a cluster, you must modify the
default storage policy to maintain compliance with the requirements of the Service Level Agreement for VMware Cloud on
AWS.
When you assign a user-defined storage policy to a datastore, vSAN applies the settings for the user-defined policy on the
specified datastore. Only one storage policy can be the default policy for the vSAN datastore.

vSAN Default Storage Policy Characteristics


The following characteristics apply to the vSAN datastore default storage policies.
• A vSAN datastore default storage policy is assigned to all virtual machine objects if you do not assign any other vSAN
policy when you provision a virtual machine. The VM Storage Policy text box is set to Datastore default on the Select
Storage page. For more information about using storage policies, refer to the vSphere Storage documentation.
NOTE
VM swap and VM memory objects receive a vSAN default storage policy with Force provisioning set to
Yes.
• A vSAN default policy only applies to vSAN datastores. You cannot apply a default storage policy to non-vSAN
datastores, such as NFS or a VMFS datastore.
• Objects in a vSAN Express Storage Architecture cluster with RAID 0 or RAID 1 configuration will have 3 disk stripes,
even if the default policy defines only 1 disk stripe.
• Because the vSAN Default Storage Policy is compatible with any vSAN datastore in the vCenter Server, you can move
your virtual machine objects provisioned with the default policy to any vSAN datastore in the vCenter Server.
• You can clone the default policy and use it as a template to create a user-defined storage policy.
• You can edit the default policy, if you have the StorageProfile.View privilege. You must have at least one vSAN-enabled
cluster that contains at least one host. Typically you do not edit the settings of the default storage policy.


• You cannot edit the name and description of the default policy, or the vSAN storage provider specification. All other
parameters including the policy rules are editable.
• You cannot delete the default storage policy.
• A default storage policy is assigned when the policy that you assign during virtual machine provisioning does not
include rules specific to vSAN.

Auto Policy Management


Clusters with vSAN Express Storage Architecture can use Auto Policy Management to generate an optimal default
storage policy, based on the cluster type (standard or stretched) and the number of hosts. vSAN configures the Site
disaster tolerance and Failures to tolerate to optimal settings for the cluster.
The name of the auto-generated policy is based on the cluster name, as follows: ClusterName - Optimal Default Datastore
Policy
When you enable Auto Policy, vSAN assigns a new optimal policy to the vSAN datastore, and that policy becomes the
datastore default policy for the cluster.
To enable Auto Policy management, use the slide control on vSAN > Services > Storage > Edit.

Change the Default Storage Policy for vSAN Datastores


You can change the default storage policy for a selected vSAN datastore.
Verify that the VM storage policy you want to assign as the default policy to the vSAN datastore meets the requirements of
virtual machines in the vSAN cluster.
1. Navigate to the vSAN datastore.
2. Click Configure.
3. Under General, click the Default Storage Policy Edit button, and select the storage policy that you want to assign as
the default policy to the vSAN datastore.
NOTE
You can also edit the Improved Virtual Disk Home Storage Policy. Click Edit and select the home storage
policy that you want to assign as the storage policy for the home object.
You can choose from a list of storage policies that are compatible with the vSAN datastore, such as the vSAN Default
Storage Policy and user-defined storage policies that have vSAN rule sets defined.
4. Select a policy and click OK.
The storage policy is applied as the default policy when you provision new virtual machines without explicitly specifying
a storage policy for a datastore.
You can define a new storage policy for virtual machines. See Define a Storage Policy for vSAN Using vSphere Client.


Define a Storage Policy for vSAN Using vSphere Client


You can create a storage policy that defines storage requirements for a VM and its virtual disks.

• Verify that the vSAN storage provider is available. Refer to View vSAN Storage Providers.
• Required privileges: Profile-driven storage.Profile-driven storage view and Profile-driven storage.Profile-driven
storage update
NOTE
Clusters with vSAN Express Storage Architecture can use Auto Policy management. For more information, refer
to What are vSAN Default Storage Policies.
In this policy, you reference storage capabilities supported by the vSAN datastore.
1. Navigate to Policies and Profiles, then click VM Storage Policies.
2. Click Create.
3. On the Name and description page, select a vCenter Server.
4. Type a name and a description for the storage policy and click Next.
5. On the Policy structure page, select Enable rules for "vSAN" storage, and click Next.
6. On the vSAN page, define the policy rule set, and click Next.
a) On the Availability tab, define the Site disaster tolerance and Failures to tolerate.
Availability options define the rules for failures to tolerate, Data locality, and Failure tolerance method.
• Site disaster tolerance defines the type of site failure tolerance used for virtual machine objects.
• Failures to tolerate defines the number of host and device failures that a virtual machine object can tolerate,
and the data replication method.
For example, if you choose Dual site mirroring and 2 failures - RAID-6 (Erasure Coding), vSAN configures the
following policy rules:
• Failures to tolerate: 1
• Secondary level of failures to tolerate: 2
• Data locality: None


• Failure tolerance method: RAID-5/6 (Erasure Coding) - Capacity


b) On the Storage Rules tab, define the encryption, space efficiency, and storage tier rules that can be used along
with the HCI Mesh to distinguish the remote datastores.
• Encryption services: Defines the encryption rules for virtual machines that you deploy with this policy. You can
choose one of the following options:
– Data-At-Rest encryption: Encryption is enabled on the virtual machines.
– No encryption: Encryption is not enabled on the virtual machines.
– No preference: Makes the virtual machines compatible with both Data-At-Rest encryption and No encryption
options.
• Space Efficiency: Defines the space saving rules for the virtual machines that you deploy with this policy. You
can choose one of the following options:
– Deduplication and compression: Enables both deduplication and compression on the virtual machines.
Deduplication and compression are available only on all-flash disk groups. For more information, see
Deduplication and Compression Design Considerations in vSAN Cluster.
– Compression only: Enables only compression on the virtual machines. Compression is available only on
all-flash disk groups. For more information, see Deduplication and Compression Design Considerations in
vSAN Cluster.
– No space efficiency: Space efficiency features are not enabled on the virtual machines. Choosing this
option requires datastores without any space efficiency options to be turned on.
– No preference: Makes the virtual machines compatible with all the options.
• Storage tier: Specifies the storage tier for the virtual machines that you deploy with this policy. You can choose
one of the following options. Choosing the No preference option makes the virtual machines compatible with
both hybrid and all flash environments.
– All flash
– Hybrid
– No preference
c) On the Advanced Policy Rules tab, define advanced policy rules, such as number of disk stripes per object and
IOPS limits.
d) On the Tags tab, click Add Tag Rule, and define the options for your tag rule.
Make sure that the values you provide are within the range of values advertised by storage capabilities of the vSAN
datastore.
7. On the Storage compatibility page, review the list of datastores under the COMPATIBLE and INCOMPATIBLE tabs
and click Next.
To be eligible, a datastore does not need to satisfy all rule sets within the policy. The datastore must satisfy at least
one rule set and all rules within this set. Verify that the vSAN datastore meets the requirements set in the storage
policy and that it appears on the list of compatible datastores.
8. On the Review and finish page, review the policy settings, and click Finish.

The new policy is added to the list.


Assign this policy to a virtual machine and its virtual disks. vSAN places the virtual machine objects according to the
requirements specified in the policy. For information about applying the storage policies to virtual machine objects, see the
vSphere Storage documentation.

Expanding and Managing a vSAN Cluster


After you have set up your vSAN cluster, you can add hosts and capacity devices, remove hosts and devices, and
manage failure scenarios.


Expanding a vSAN Cluster


You can expand an existing vSAN cluster by adding hosts or adding devices to existing hosts, without disrupting any
ongoing operations.
Use one of the following methods to expand your vSAN cluster.
• Add new ESXi hosts to the cluster that are configured using the supported cache and capacity devices. See Add a
Host to the vSAN Cluster.
• Move existing ESXi hosts to the vSAN cluster and configure them by using host profile. See Configuring Hosts in the
vSAN Cluster Using Host Profile.
• Add new capacity devices to ESXi hosts that are cluster members. See Add Devices to the Disk Group in vSAN
Cluster.

Expanding vSAN Cluster Capacity and Performance


If your vSAN cluster is out of storage capacity or when you notice reduced performance, you can expand the cluster for
capacity and performance.
• (Only for vSAN Original Storage Architecture) Expand the storage capacity of your cluster either by adding storage
devices to existing disk groups or by adding disk groups. New disk groups require flash devices for the cache. For
information about adding devices to disk groups, see Add Devices to the Disk Group in vSAN Cluster. Adding capacity
devices without increasing the cache might reduce your cache-to-capacity ratio to an unsupported level. For more
information, see vSAN Planning and Deployment.
Improve the cluster performance by adding at least one cache device (flash) and one capacity device (flash or
magnetic disk) to an existing storage I/O controller or to a new host. Or you can add one or more hosts with disk
groups to produce the same performance impact after vSAN completes automatic rebalance in the vSAN cluster.
• (Only for vSAN Express Storage Architecture) Expand the storage capacity of your cluster by adding flash devices to
the storage pools of the existing hosts or by adding one or more new hosts with flash devices.
Although compute-only hosts can exist in a vSAN cluster, and consume capacity from other hosts in the cluster, add
uniformly configured hosts for efficient operation. Although it is best to use the same or similar devices in your disk groups
or storage pools, any device listed on the vSAN HCL is supported. Try to distribute capacity evenly across hosts. For
information about adding devices to disk groups or storage pools, see Create a Disk Group or Storage Pool in vSAN
Cluster.
After you expand the cluster capacity, enable automatic rebalance to distribute resources evenly across the cluster. For
more information, see vSAN Monitoring and Troubleshooting.
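For the cache-to-capacity caution above, a back-of-the-envelope check can help before you add capacity devices to existing OSA disk groups. The 10 percent target in the sketch below is a commonly cited sizing guideline rather than a product limit; confirm the current recommendation in vSAN Planning and Deployment before relying on it.

def cache_ratio_pct(cache_tb, capacity_tb):
    # Ratio of cache-tier capacity to capacity-tier capacity in a disk group.
    return 100.0 * cache_tb / capacity_tb

# Example: a 1.6 TB cache device in front of 7 x 2 TB capacity devices.
print(cache_ratio_pct(1.6, 14.0))   # ~11.4%, above a 10% guideline
# Adding two more 2 TB capacity devices without more cache dilutes the ratio.
print(cache_ratio_pct(1.6, 18.0))   # ~8.9%, below a 10% guideline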

Use Quickstart to Add Hosts to a vSAN Cluster


If you configured your vSAN cluster through Quickstart, you can use the Quickstart workflow to add hosts and storage
devices to the cluster.
• The Quickstart workflow must be available for your vSAN cluster.
• No network configuration performed through the Quickstart workflow has been modified from outside of the Quickstart
workflow.
• Networking settings configured while creating the cluster with Quickstart have not been modified.
When you add new hosts to the vSAN cluster, you can use the Cluster configuration wizard to complete the host
configuration. For more information about Quickstart, see "Using Quickstart to Configure and Expand a vSAN Cluster" in
vSAN Planning and Deployment.


NOTE
If you are running vCenter Server on a host, the host cannot be placed into maintenance mode as you add it to
a cluster using the Quickstart workflow. The same host also can be running a Platform Services Controller. All
other VMs on the host must be powered off.
1. Navigate to the cluster in the vSphere Client.
2. Click the Configure tab, and select Configuration > Quickstart.
3. On the Add hosts card, click Launch to open the Add hosts wizard.
a) On the Add hosts page, enter information for new hosts, or click Existing hosts and select from hosts listed in the
inventory.
b) On the Host summary page, verify the host settings.
c) On the Ready to complete page, click Finish.
4. On the Cluster configuration card, click Launch to open the Cluster configuration wizard.
a) On the Configure the distributed switches page, enter networking settings for the new hosts.
b) (optional) On the Claim disks page, select disks on each new host.
c) (optional) On the Create fault domains page, move the new hosts into their corresponding fault domains.
For more information about fault domains, see Managing Fault Domains in vSAN Clusters.
d) On the Ready to complete page, verify the cluster settings, and click Finish.

Add a Host to the vSAN Cluster


You can add ESXi hosts to a running vSAN cluster without disrupting any ongoing operations.
• Verify that the resources, including drivers, firmware, and storage I/O controllers, are listed in the VMware Compatibility
Guide at http://www.vmware.com/resources/compatibility/search.php.
• VMware recommends creating uniformly configured hosts in the vSAN cluster, so you have an even distribution of
components and objects across devices in the cluster. However, there might be situations where the cluster becomes
unevenly balanced, particularly during maintenance or if you overcommit the capacity of the vSAN datastore with
excessive virtual machine deployments.
The new host's resources become associated with the cluster.
1. Navigate to the vSAN cluster.
2. Right-click the cluster and select Add Hosts. The Add hosts wizard appears.
Option Description
New hosts 1. Enter the host name or IP address.
2. Enter the user name and password associated with the host.
Existing hosts 1. Select hosts that you previously added to vCenter Server.

3. Click Next.
4. View the summary information and click Next.
5. Review the settings and click Finish.
The host is added to the cluster.
Verify that the vSAN Disk Balance health check is green.
For more information about vSAN cluster configuration and fixing problems, see "vSAN Cluster Configuration Issues" in
vSAN Monitoring and Troubleshooting.
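If you add hosts programmatically rather than through the wizard, the same operation is available through the vSphere API. The pyVmomi sketch below is a minimal illustration under stated assumptions: the vCenter address, cluster name, host name, and credentials are placeholders, SSL verification is disabled only for brevity, and production code should supply the host's SSL thumbprint and wait for the task to complete.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

context = ssl._create_unverified_context()  # brevity only; validate certificates in practice
si = SmartConnect(host="vcenter.example.com", user="[email protected]",
                  pwd="password", sslContext=context)
content = si.RetrieveContent()

# Locate the target cluster by name (placeholder name).
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "vSAN-Cluster")

# Connection details for the new host. Without sslThumbprint, the task can fail with an
# SSL verification fault that reports the thumbprint you need to supply.
spec = vim.host.ConnectSpec(hostName="esxi-09.example.com",
                            userName="root", password="host-password")
task = cluster.AddHost_Task(spec=spec, asConnected=True)
# Wait for the task with your preferred task helper, then verify the vSAN Disk Balance health check.
Disconnect(si)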


Configuring Hosts in the vSAN Cluster Using Host Profile


When you have multiple hosts in the vSAN cluster, you can use the profile of an existing vSAN host to configure the hosts
in the vSAN cluster.
• Verify that the host is in maintenance mode.
• Verify that the hardware components, drivers, firmware, and storage I/O controllers are listed in the VMware
Compatibility Guide at http://www.vmware.com/resources/compatibility/search.php.
The host profile includes information about storage configuration, network configuration, and other characteristics of the
host. If you are planning to create a cluster with many hosts, such as 8, 16, 32, or 64 hosts, use the host profile feature.
Host profiles enable you to add more than one host at a time to the vSAN cluster.
1. Create a host profile.
a) Navigate to the Host Profiles view.
b) Click the Extract Profile from a Host icon.
c) Select the host that you intend to use as the reference host and click Next.
The selected host must be an active host.
d) Type a name and description for the new profile and click Next.
e) Review the summary information for the new host profile and click Finish.
The new profile appears in the Host Profiles list.


2. Attach the host to the intended host profile.


a) From the Profile list in the Host Profiles view, select the host profile to be applied to the vSAN host.
b) Click the Attach/Detach Hosts and clusters to a host profile icon.
c) Select the host from the expanded list and click Attach to attach the host to the profile.
The host is added to the Attached Entities list.
d) Click Next.
e) Click Finish to complete the attachment of the host to the profile.
3. Detach the referenced vSAN host from the host profile.
When a host profile is attached to a cluster, the host or hosts within that cluster are also attached to the host profile.
However, when the host profile is detached from the cluster, the association between the host or hosts in the cluster
and that of the host profile remains intact.
a) From the Profile List in the Host Profiles view, select the host profile to be detached from a host or cluster.
b) Click the Attach/Detach Hosts and clusters to a host profile icon.
c) Select the host or cluster from the expanded list and click Detach.
d) Click Detach All to detach all the listed hosts and clusters from the profile.
e) Click Next.
f) Click Finish to complete the detachment of the host from the host profile.
4. Verify the compliance of the vSAN host to its attached host profile and determine if any configuration parameters on
the host are different from those specified in the host profile.
a) Navigate to a host profile.
The Objects tab lists all host profiles, the number of hosts attached to that host profile, and the summarized results
of the last compliance check.
b) Click the Check Host Profile Compliance icon.

To view specific details about which parameters differ between the host that failed compliance and the host profile,
click the Monitor tab and select the Compliance view. Expand the object hierarchy and select the non-compliant
host. The parameters that differ are displayed in the Compliance window, below the hierarchy.
If compliance fails, use the Remediate action to apply the host profile settings to the host. This action changes all
host profile-managed parameters to the values that are contained in the host profile attached to the host.
c) To view specific details about which parameters differ between the host that failed compliance and the host profile,
click the Monitor tab and select the Compliance view.
d) Expand the object hierarchy and select the failing host.
The parameters that differ are displayed in the Compliance window, below the hierarchy.
5. Remediate the host to fix compliance errors.
a) Select the Monitor tab and click Compliance.
b) Right-click the host or hosts to remediate and select All vCenter Actions > Host Profiles > Remediate.
You can update or change the user input parameters for the host profiles policies by customizing the host.
c) Click Next.
d) Review the tasks that are necessary to remediate the host profile and click Finish.
The host is part of the vSAN cluster and its resources are accessible to the vSAN cluster. The host can also access all
existing vSAN storage I/O policies in the vSAN cluster.

Sharing Remote vSAN Datastores


Remote datastore sharing enables vSAN clusters to share their datastores with other clusters.


You can provision VMs running on your local cluster to use storage space on a remote datastore. When you provision a
new virtual machine, you can select a remote datastore that is mounted to the client cluster. You can assign any compatible
storage policy configured for the remote datastore.
Mounting a remote datastore is a cluster-wide configuration. When you mount a remote datastore to a vSAN cluster, it is
available to all hosts in the cluster.
When you create a vSphere cluster for mounting a remote datastore, select one of the following vSAN cluster types:
• vSAN HCI cluster provides compute resources and storage resources. It can share its datastore across data centers
and vCenter instances and mount datastores from other vSAN HCI clusters.
• vSAN Compute Cluster is a vSphere cluster that provides compute resources only. It can mount datastores served by
vSAN Max clusters and vSAN HCI clusters.
• vSAN Max (vSAN ESA only) provides storage resources, but not compute resources. Its datastore can be mounted by
remote vSAN Max or vSAN HCI clusters across data centers and vCenter instances.
vSAN datastore sharing has the following design considerations:
• vSAN Original Storage Architecture clusters running 8.0 Update 1 or later can share datastores across clusters in the
same data center, or across clusters managed by remote vCenters, as long as they are on the same network. vSAN
Express Storage Architecture clusters running 8.0 Update 2 or later have this feature.
• A vSAN HCI or vSAN Max cluster can serve its local datastore to up to 10 client clusters.
• A client cluster can mount up to 5 remote datastores from one or more vSAN server clusters.
• A single datastore can be mounted to up to 128 vSAN hosts, including hosts in the local vSAN server cluster.
• All objects that make up a VM must reside on the same datastore.
• For vSphere HA to work with vSAN datastore sharing, configure the following failure response for Datastore with APD:
Power off and restart VMs.
• Client hosts that are not part of a cluster are not supported. You can configure a single host compute-only cluster, but
vSphere HA does not work unless you add a second host to the cluster.
• Data-in-transit encryption is not supported.
The following configurations are not supported with vSAN datastore sharing:
• Remote provisioning of iSCSI volumes, or CNS persistent volumes. You can provision iSCSI volumes on the local
vSAN datastore, but not on any remote vSAN datastore. For remote provisioning of CNS persistent volumes, see
vSphere Functionality Supported by vSphere Container Storage Plug-in and Using vSphere Container Storage Plug-in
for HCI Mesh Deployment in the vSphere Storage guide.
• Air-gapped networks or clusters using multiple vSAN VMkernel ports

Disaggregated Storage with vSAN Max


vSAN Max is a fully distributed, scalable, shared storage solution for vSphere clusters and vSAN clusters. Storage
resources are disaggregated from compute resources, so you can scale storage and compute resources independently.
vSAN Max uses vSAN Express Storage Architecture and high-density vSAN Ready Nodes for increased capacity and
performance.
NOTE
vSAN Max can be deployed by purchasing VMware Cloud Foundation or by acquiring the advanced add-on offer
for VMware vSphere Foundation. Licensing for vSAN Max is based on a per-TiB metric, which corresponds to
the total amount of raw storage capacity needed for the environments.
A vSAN Max cluster acts as a server cluster that only provides storage. You can mount its datastore to vSphere clusters
configured as vSAN compute clusters or vSAN HCI client clusters.


vSAN Max clusters have the following design considerations:


• Supported only on vSAN Express Storage Architecture running on vSAN Ready Nodes certified for vSAN Max.
• Not compatible with vSAN Original Storage Architecture.
• Acts as a storage server only, not as a client. Do not run workload VMs on vSAN Max hosts.
• Requires a minimum of six hosts, and 150 TiB per host. To optimize performance, use a uniform configuration of
storage devices across all hosts.
• Requires 100 Gbps network connections between hosts in the vSAN Max cluster, and 10 Gbps connections from
compute clients to the vSAN Max cluster. For best performance, enable support for jumbo frames (MTU = 9000) and
ensure you have sufficient resources at the network spine.
• Enable Auto-Policy management (Configure > vSAN > Services > Storage > Edit) to ensure optimal levels of
resilience and space efficiency.
• Enable Automatic rebalance (Configure > vSAN > Services > Advanced Options > Edit) to ensure an evenly
balanced, distributed storage system.
NOTE
You can configure vSAN Max only during cluster creation. You cannot convert an existing vSAN HCI cluster to a
vSAN Max cluster, and you cannot convert a vSAN Max cluster to a vSAN HCI cluster. You must deactivate vSAN
on the cluster and reconfigure the cluster.
vSAN Compute cluster
A vSAN Compute cluster is a vSphere cluster with a small vSAN element that enables it to mount a vSAN Max datastore. The
hosts in a Compute cluster do not have local storage. You can monitor the capacity, health, and performance of the remote
datastore.
vSAN compute clusters have the following design considerations:
• vSAN networking must be configured on hosts in the Compute cluster.
• No storage devices can be present on hosts in a Compute cluster.
• No data management features can be configured on the Compute cluster.


Cross-Cluster Capacity Sharing


vSAN HCI clusters can share their datastores with other vSAN HCI clusters. A vSAN HCI cluster can act as a server to
provide data storage, or as a client that consumes storage.
vSAN Original Storage Architecture and vSAN Express Storage Architecture are not compatible, and cannot share
datastores with each other. A client cluster cannot mount datastores from different vSAN architectures. If a cluster has
mounted a datastore that uses vSAN Original Storage Architecture, it cannot mount a datastore that uses vSAN Express
Storage Architecture.

Use the Remote Datastores view to monitor and manage remote datastores mounted on the local vSAN cluster. Each
client vSAN cluster can mount remote datastores from server vSAN clusters. Each compatible vSAN cluster also can act
as a server, and allow other vSAN clusters to mount its local datastore.


Monitor views for capacity, performance, health, and placement of virtual objects show the status of remote objects and
datastores.

Using Remote vCenters as Datastore Sources


vSAN HCI and vSAN Max clusters can share remote datastores across vCenters. You can add a remote vCenter as a
datastore source for clusters on the local vCenter. Client clusters on the local vCenter can mount datastores that reside on
the remote vCenter.


Use the vCenter's Remote Datastores page to manage remote datastore sources (Configure > vSAN > Remote
Datastores). Click the tabs to access information about shared datastores across vCenters, add vCenters as datastores
sources, and mount datastores to local clusters.

Datastore Sources View and manage datastore sources residing in remote vCenters. You can add or remove remote datastore
sources for the local vCenter.
Clusters View and manage clusters residing in the local vCenter. You can mount or unmount datastores from remote
vCenters to the selected cluster.
Datastores View all datastores available under this vCenter.

vCenter to vCenter datastore sharing has the following design considerations:


• Each vCenter can serve up to 10 client vCenters.
• Each client vCenter can add up to 5 remote vCenter datastore sources.
• When a VM on a client cluster managed by one vCenter uses storage from a server managed by another vCenter, the
storage policy on the client's vCenter takes precedence.


View Remote vSAN Datastores


Use the Remote Datastores page to view remote datastores mounted to the local vSAN cluster, and client clusters sharing
the local datastore.

1. Navigate to the local vSAN cluster.


2. Click the Configure tab.
3. Under vSAN, click Remote Datastores.

This view lists information about each datastore mounted to the local cluster.
• Server cluster that hosts the datastore
• vCenter of the server cluster (if applicable)
• Capacity usage of the datastore
• Free capacity available
• Number of VMs using the datastore (number of VMs using the compute resources of the local cluster, but the storage
resources of the server cluster)
• Client clusters that have mounted the datastore
You can mount or unmount remote datastores from this page.


Mount Remote vSAN Datastore


You can mount one or more datastores from other vSAN clusters.
1. Navigate to the local vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Remote Datastores.
4. Click Mount Remote Datastore to open the wizard.
5. (Optional) Select a remote vCenter as the datastore source.
6. Select a datastore.
7. (Optional) If the server cluster is a vSAN stretched cluster, configure Site Coupling to choose the optimal data path
between the vSAN HCI servers and the clients.
A vSAN stretched cluster might have an asymmetrical network, where links within each site have higher bandwidth
and lower latency than links between sites. A symmetrical network has similar links within each site and across sites.
a) On the Network Topology page, select Symmetrical or Asymmetrical. If you select Asymmetrical, the Site
Coupling page appears.
b) Select a site on the server cluster to couple with the appropriate client site. Select the server site that is physically
closer or adjacent to each client site.
8. Check the datastore compatibility, and click Finish.

The remote datastore is mounted to the local vSAN cluster.


When you provision a VM, you can select the remote datastore as the storage resource. Assign a storage policy that is
supported by the remote datastore.

Unmount Remote vSAN Datastore


You can unmount a remote datastore from a vSAN cluster.
If no virtual machines on the local cluster are using the remote vSAN datastore, you can unmount the datastore from your
local vSAN cluster.
1. Navigate to the local vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Remote Datastores.
4. Select a remote datastore, and click Unmount.
5. Click Unmount to confirm.

The selected datastore is unmounted from the local cluster.

Monitor Datastore Sharing with vSphere Client


You can use the vSphere Client to monitor the status of vSAN datastore sharing operations.
vSAN capacity monitor notifies you when remote datastores are mounted to the cluster. You can select the remote
datastore to view its capacity information.
The Virtual Objects view shows the datastore where virtual objects reside. The Physical disk placement view for a VM
located on a remote datastore shows information about its remote location.


vSAN health checks report on the status of HCI functions.


• Data > vSAN Object health check shows accessibility information of remote objects.
• Network > Server cluster partition check reports about network partitions between hosts in the client cluster and the
server cluster.
• Network > Latency checks the latency between hosts in the client cluster and the server cluster.
vSAN cluster performance views include VM performance charts that display the VM level performance of the client
cluster from the perspective of the remote cluster. You can select a remote datastore to view the performance.

You can run pro-active tests on remote datastores to verify VM creation and network performance. The VM creation test
creates a VM on the remote datastore. The Network performance test checks the network performance between all hosts
in the client cluster and all hosts in the server cluster.


Add Remote vCenter as Datastore Source


You can add a remote vCenter as a remote datastore source for clients on the local vCenter.

1. Navigate to the vCenter in the vSphere Client.


2. Select Configure > vSAN > Remote Datastores.
3. On the Datastore Sources tab, click Add Remote Datastore Source to open the wizard.
4. Enter information to specify the remote vCenter.
5. Check the compatibility, review the configuration, and click Finish.

The remote vCenter is added as a datastore source. vSAN clusters on this vCenter can mount remote datastores that
reside on the remote vCenter.

Working with Members of the vSAN Cluster in Maintenance Mode


Before you shut down, reboot, or disconnect a host that is a member of a vSAN cluster, you must put the host in
maintenance mode.
When working with maintenance mode, consider the following guidelines:
• When you place an ESXi host in maintenance mode, you must select a data evacuation mode, such as Ensure
accessibility or Full data migration.
• When any member host of a vSAN cluster enters maintenance mode, the cluster capacity automatically reduces as the
member host no longer contributes storage to the cluster.
• A virtual machine's compute resources might not reside on the host that is being placed in maintenance mode, and the
storage resources for virtual machines might be located anywhere in the cluster.
• The Ensure accessibility mode is faster than the Full data migration mode because Ensure accessibility mode
migrates only those components from the host that are essential for running the virtual machines. When in this mode,
if you encounter a failure, the availability of your virtual machine is affected. Selecting the Ensure accessibility mode
does not reprotect your data during failure and you might experience unexpected data loss.
• When you select the Full data migration mode, your data is automatically reprotected against a failure, if the
resources are available and the Failures to tolerate set to 1 or more. When in this mode, all components from the
host are migrated and, depending on the amount of data you have on the host, the migration might take longer. With
Full data migration mode, your virtual machines can tolerate failures, even during planned maintenance.
• When working with a three-host cluster, you cannot place a host in maintenance mode with Full data migration.
Consider designing a cluster with four or more hosts for maximum availability.
Before you place a host in maintenance mode, you must verify the following:
• If you are using Full data migration mode, verify that the cluster has enough hosts and capacity available to meet the
Failures to tolerate policy requirements.
• Verify that enough flash capacity exists on the remaining hosts to handle any flash read cache reservations. To
analyze the current capacity use per host, and whether a single host failure might cause the cluster to run out of
space and impact the cluster capacity, cache reservation, and cluster components, run the following RVC command:
vsan.whatif_host_failures. For information about the RVC commands, see the RVC Command Reference Guide.
• Verify that you have enough capacity devices in the remaining hosts to handle stripe width policy requirements, if
selected.
• Make sure that you have enough free capacity on the remaining hosts to handle the amount of data that must be
migrated from the host entering maintenance mode.
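A rough arithmetic check like the one below can complement these verifications by comparing the data that must move off the host with the free space remaining on the rest of the cluster. It is only an illustration; the data migration pre-check described later in this section and the vsan.whatif_host_failures command give authoritative answers, and the 20 percent slack figure is an arbitrary safety margin rather than a vSAN requirement.

def can_evacuate(data_on_host_tb, free_on_other_hosts_tb, slack_pct=20.0):
    # Keep some free space in reserve instead of filling the remaining hosts completely.
    usable_tb = sum(free_on_other_hosts_tb) * (1 - slack_pct / 100)
    print(f"data to move: {data_on_host_tb} TB, usable free space: {usable_tb:.1f} TB")
    return usable_tb >= data_on_host_tb

print(can_evacuate(6.5, [3.0, 2.5, 4.0, 3.5]))  # True: Full data migration is likely to fit
print(can_evacuate(6.5, [1.0, 1.5, 2.0]))       # False: add capacity or reconsider the evacuation mode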


The Confirm Maintenance Mode dialog box provides information to guide your maintenance activities. You can view the
impact of each data evacuation option.
• Whether or not sufficient capacity is available to perform the operation.
• How much data will be moved.
• How many objects will become non-compliant.
• How many objects will become inaccessible.

Check the Data Migration Capabilities of a Host in the vSAN Cluster


Use data migration pre-check to identify the impact of migration options when placing a host into maintenance mode or
removing it from the cluster.
Before you place a vSAN host into maintenance mode, run the data migration pre-check. The test results provide
information to help you determine the impact to cluster capacity, predicted health checks, and any objects that will go out
of compliance. If the operation will not succeed, pre-check provides information about what resources are needed.


1. Navigate to the vSAN cluster.


2. Click the Monitor tab.
3. Under vSAN, click Data Migration Pre-check.
4. Select a host, a data migration option, and click Pre-check.
vSAN runs the data migration precheck tests.
5. View the test results.
The pre-check results show whether the host can safely enter maintenance mode.
• The Object Compliance and Accessibility tab displays objects that might have issues after the data migration.
• The Cluster Capacity tab displays the impact of data migration on the vSAN cluster before and after you perform
the operation.
• The Predicted Health tab displays the health checks that might be affected by the data migration.
If the pre-check indicates that you can place the host into maintenance mode, you can click Enter Maintenance Mode to
migrate the data and place the host into maintenance mode.

Place a Member of vSAN Cluster in Maintenance Mode


Before you shut down, reboot, or disconnect a host that is a member of a vSAN cluster, you must place the host in
maintenance mode.
Verify that your environment has the capabilities required for the option you select.
When you place a host in maintenance mode, you must select a data evacuation mode, such as Ensure accessibility
or Full data migration. When any member host of a vSAN cluster enters maintenance mode, the cluster capacity is
automatically reduced, because the member host no longer contributes capacity to the cluster.
NOTE
The vSAN File Service VMs (FSVM) running on a host are automatically powered off when a host in the vSAN
cluster enters maintenance mode.


Any vSAN iSCSI targets served by this host are transferred to other hosts in the cluster, and the iSCSI initiators are
redirected to the new target owner.
1. Right-click the host and select Maintenance Mode > Enter Maintenance Mode.
2. Select a data evacuation mode and click OK.
Option Description
Ensure accessibility This is the default option. When you power off or remove the
host from the cluster, vSAN migrates just enough data to ensure
every object is accessible after the host goes into maintenance
mode. Select this option if you want to take the host out of the
cluster temporarily, for example, to install upgrades, and plan to
have the host back in the cluster. This option is not appropriate if
you want to remove the host from the cluster permanently.
Typically, only partial data evacuation is required. However,
the virtual machine might no longer be fully compliant with a VM
storage policy during evacuation. That means it might not have
access to all its replicas. If a failure occurs while the host is in
maintenance mode and the Failures to tolerate is set to 1, you
might experience data loss in the cluster.
NOTE
This is the only evacuation mode available if you are
working with a three-host cluster or a vSAN cluster
configured with three fault domains.
Full data migration vSAN evacuates all data to other hosts in the cluster and
maintains the current object compliance state. Select this option
if you plan to migrate the host permanently. When evacuating
data from the last host in the cluster, make sure that you migrate
the virtual machines to another datastore and then place the
host in maintenance mode.
This evacuation mode results in the largest amount of data
transfer and consumes the most time and resources. All
the components on the local storage of the selected host
are migrated elsewhere in the cluster. When the host enters
maintenance mode, all virtual machines have access to their
storage components and are still compliant with their assigned
storage policies.
NOTE
If there are objects in reduced availability state, this
mode maintains this compliance state and does not
guarantee that the objects will become compliant.
If a virtual machine object that has data on the host
is not accessible and is not fully evacuated, the host
cannot enter maintenance mode.

No data migration vSAN does not evacuate any data from this host. If you power
off or remove the host from the cluster, some virtual machines
might become inaccessible.

A cluster with three fault domains has the same restrictions that a three-host cluster has, such as the inability to use
Full data migration mode or to reprotect data after a failure.
Alternatively, you can place a host in maintenance mode by using ESXCLI. Before placing a host in this mode,
ensure that you have powered off the VMs that run on the host.
To place the host in maintenance mode with a specific data evacuation action, run the following command on the host:
esxcli system maintenanceMode set --enable 1 --vsanmode=<str>

Following are the string values allowed for vsanmode:


• ensureObjectAccessibility - Evacuate data from the disk to ensure object accessibility in the vSAN cluster, before
entering maintenance mode.
NOTE
The default value is ensureObjectAccessibility. This value will be used if you do not specify any value for
the vsanmode.
• evacuateAllData - Evacuate all data from the disk before entering maintenance mode.
• noAction - Do not move vSAN data out of the disk before entering maintenance mode.
To verify the status of the host, run the following command:
esxcli system maintenanceMode get

To exit maintenance mode, run the following command:


esxcli system maintenanceMode set --enable 0
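Putting these commands together, a typical sequence for a short maintenance window might look like the following sketch; the Enabled and Disabled lines are illustrative of the output returned by the get command:

esxcli system maintenanceMode set --enable 1 --vsanmode=ensureObjectAccessibility
esxcli system maintenanceMode get
Enabled
esxcli system maintenanceMode set --enable 0
esxcli system maintenanceMode get
Disabled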

You can track the progress of data migration in the cluster. For more information see vSAN Monitoring and
Troubleshooting.

Managing Fault Domains in vSAN Clusters


Fault domains enable you to protect against rack or chassis failure if your vSAN cluster spans across multiple racks or
blade server chassis.
You can create fault domains and add one or more hosts to each fault domain. A fault domain consists of one or more
vSAN hosts grouped according to their physical location in the data center. When configured, fault domains enable vSAN
to tolerate failures of entire physical racks as well as failures of a single host, capacity device, network link, or a network
switch dedicated to a fault domain.
The Failures to tolerate policy for the cluster depends on the number of failures a virtual machine is provisioned to
tolerate. When a virtual machine is configured with the Failures to tolerate set to 1 (FTT=1), vSAN can tolerate a single
failure of any kind and of any component in a fault domain, including the failure of an entire rack.
When you configure fault domains on a rack and provision a new virtual machine, vSAN ensures that protection objects,
such as replicas and witnesses, are placed in different fault domains. For example, if a virtual machine's storage policy
has the Failures to tolerate set to N (FTT=n), vSAN requires a minimum of 2*n+1 fault domains in the cluster. When
virtual machines are provisioned in a cluster with fault domains using this policy, the copies of the associated virtual
machine objects are stored across separate racks.
A minimum of three fault domains are required to support FTT=1. For best results, configure four or more fault domains in
the cluster. A cluster with three fault domains has the same restrictions that a three host cluster has, such as the inability


to reprotect data after a failure and the inability to use the Full data migration mode. For information about designing and
sizing fault domains, see "Designing and Sizing vSAN Fault Domains" in vSAN Planning and Deployment.
Consider a scenario where you have a vSAN cluster with 16 hosts. The hosts are spread across four racks, that is, four
hosts per rack. To tolerate an entire rack failure, create a fault domain for each rack. You can configure a cluster of such
capacity with the Failures to tolerate set to 1. If you want the Failures to tolerate set to 2, configure five fault domains in
the cluster.
When a rack fails, all resources in the rack, including CPU and memory, become unavailable to the cluster. To reduce
the impact of a potential rack failure, configure fault domains of smaller sizes. Increasing the number of fault domains
increases the total amount of resource availability in the cluster after a rack failure.
When working with fault domains, follow these best practices.
• Configure a minimum of three fault domains in the vSAN cluster. For best results, configure four or more fault domains.
• A host not included in any fault domain is considered to reside in its own single-host fault domain.
• You do not need to assign every vSAN host to a fault domain. If you decide to use fault domains to protect the vSAN
environment, consider creating equal sized fault domains.
• When moved to another cluster, vSAN hosts retain their fault domain assignments.
• When designing a fault domain, place a uniform number of hosts in each fault domain.
For guidelines about designing fault domains, see "Designing and Sizing vSAN Fault Domains" in vSAN Planning and
Deployment.
• You can add any number of hosts to a fault domain. Each fault domain must contain at least one host.

Create a New Fault Domain in vSAN Cluster


To ensure that the virtual machine objects continue to run smoothly during a rack failure, you can group hosts in different
fault domains.
• Choose a unique fault domain name. vSAN does not support duplicate fault domain names in a cluster.
• Verify the version of your ESXi hosts. You can include only hosts that run ESXi 6.0 or later in fault domains.
• Verify that your vSAN hosts are online. You cannot assign hosts to a fault domain that is offline or unavailable due to
hardware configuration issue.
When you provision a virtual machine on the cluster with fault domains, vSAN distributes protection components, such as
witnesses and replicas of the virtual machine objects across different fault domains. As a result, the vSAN environment
becomes capable of tolerating entire rack failures in addition to a single host, storage disk, or network failure.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Fault Domains.
4. Click the plus icon. The New Fault Domain wizard opens.
5. Enter the fault domain name.
6. Select one or more hosts to add to the fault domain.
A fault domain cannot be empty. You must select at least one host to include in the fault domain.
7. Click Create.
The selected hosts appear in the fault domain. Each fault domain displays the used and reserved capacity information.
This enables you to view the capacity distribution across the fault domain.


Move Host into Selected Fault Domain in vSAN Cluster


You can move a host into a selected fault domain in the vSAN cluster.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Fault Domains.
4. Click and drag the host that you want to add onto an existing fault domain.
The selected host appears in the fault domain.

Move Hosts out of a Fault Domain in vSAN Cluster


Depending on your requirement, you can move hosts out of a fault domain.
Verify that the host is online. You cannot move hosts that are offline or unavailable from a fault domain.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Fault Domains.
a) Click and drag the host from the fault domain to the Standalone Hosts area.
b) Click Move to confirm.

The selected host is no longer part of the fault domain. Any host that is not part of a fault domain is considered to reside in
its own single-host fault domain.
You can add hosts to fault domains. See Move Host into Selected Fault Domain in vSAN Cluster.

Rename a Fault Domain in vSAN Cluster


You can change the name of an existing fault domain in your vSAN cluster.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Fault Domains.
a) Click the Actions icon on the right side of the fault domain, and choose Edit.
b) Enter a new fault domain name.
4. Click Apply or OK.
The new name appears in the list of fault domains.


Remove Selected Fault Domains from vSAN Cluster


When you no longer need a fault domain, you can remove it from the vSAN cluster.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Fault Domains.
4. Click the Actions icon on the right side of the fault domain, and select Delete.
5. Click Delete to confirm.

All hosts in the fault domain are removed and the selected fault domain is deleted from the vSAN cluster. Each host that is
not part of a fault domain is considered to reside in its own single-host fault domain.

Tolerate Additional Failures with Fault Domain in vSAN Cluster


Fault domains in a vSAN cluster provide resilience and ensure that data remains available during failures, based on the
storage policy.
With Failures to tolerate (FTT) set to 1, an object can tolerate a single failure. However, a temporary failure followed by a
permanent failure in a cluster can result in data loss. An additional fault domain gives vSAN the ability to create a
durability component without requiring a higher FTT for the object. vSAN creates this extra component during planned
and unplanned failures. Unplanned failures include network disconnects, disk failures, and host failures. Planned failures
include entering maintenance mode (EMM). For example, a six-host cluster with a RAID 6 object cannot create a durability
component if there is a host failure.
vSAN ensures the availability of object data when components go offline and come back online unexpectedly,
based on the FTT specified in the storage policy. During a failure, writes destined for the failed component are redirected to
the durability component. When the component recovers from the transient failure and comes back online, vSAN
resynchronizes the component and removes the durability component.
Without the durability component in place, if a second permanent failure occurs in the cluster and the mirror object is
affected, the object data is permanently lost even if the first failure is resolved.

Using vSAN Data Protection


vSAN data protection enables you to quickly recover VMs from operational failure or ransomware attacks, using native
snapshots stored locally on the vSAN cluster.
vSAN data protection is supported on vSAN HCI clusters powered by vSAN ESA. It uses native vSAN snapshots to
capture the current state of your VMs. You can use vSAN snapshots to restore a VM to its previous state, or clone a VM
for development and testing.


vSAN data protection requires the VMware Snapshot Service to manage vSAN snapshots. Deploy the Snapshot Service
appliance to enable vSAN data protection in the vSphere Client.
Use the following tabs to navigate the vSAN data protection page.

Tab Description

Summary Displays general information about vSAN data protection, including the number of protection groups,
percentage of protected VMs, number of VM snapshots, and amount of storage space used for snapshots.
Protection Groups Displays a list of vSAN data protection groups and their status. Select a protection group to view snapshots
in the protection group, or edit the configuration.
VMs Displays a list of VMs in the vSAN cluster with details about their data protection status. Deleted VMs that
have snapshots available are visible here.
You can select a VM and click to restore or clone the VM.


vSAN Snapshots
vSAN snapshots preserve the state and data of a virtual machine at the time you take the snapshot. This local archive
preserves the VM's data as it existed at that time. You can restore a VM to the state that existed when the snapshot was
taken, or create a linked clone VM that matches the state preserved in the snapshot.
Taking a snapshot captures the VM state at a specific point in time. vSAN snapshots are not quiesced, and they capture
the current running state of the VM.
Snapshots operate on individual virtual machines. Each VM requires a separate snapshot. You can take manual or
scheduled snapshots of virtual machines by placing them in protection groups.
Each vSAN snapshot contains the state of the VM's namespace object and virtual disk objects. vSAN takes snapshots of
VMs in protection groups at scheduled intervals. These vSAN snapshots are stored locally in the vSAN datastore.

Protection Groups
Protection groups enable you to schedule and manage snapshots for one or multiple VMs. You can add VMs to a
protection group, configure snapshot schedules, and view snapshot information.
Select a protection group, and use the following tabs to manage the group.

Tab Description

Overview Displays general information about the protection group, including a list of member VMs, the snapshot
schedules, and the number of snapshots taken.
Snapshots Displays the snapshot series associated with the protection group. You can select and delete individual
snapshots from the series.
VMs Displays a list of VMs that are members of the protection group, and the number of snapshots available for
each VM.

When you create a protection group, add member VMs and configure one or more snapshot schedules. You can add VMs
individually, or enter VM name patterns to add all VMs that match the pattern. You can use both methods to add VMs to
the protection group.
You can define multiple snapshot schedules to periodically capture the state of VMs in a protection group. As new
snapshots are captured, vSAN removes old snapshots from the series, based on the retention setting. You also can take a
manual snapshot to capture the current state of VMs in the protection group.
Enable immutability mode on a protection group for additional security. You cannot edit or delete this protection group,
change the VM membership, or edit or delete its snapshots. An immutable snapshot is a read-only copy of data that cannot be
modified or deleted, even by an attacker with administrative privileges.
NOTE
Once immutability mode is enabled on a protection group, it cannot be disabled by an administrator.
You can monitor and modify protection groups from the Protection Groups tab. Click a protection group to view details.
• Overview displays general information about the protection group, including VM membership, snapshot schedules,
and number of snapshots.
• Snapshots displays a list of snapshots available in the protection group. You can select a snapshot, and click >> to
view individual snapshots for each VM, and perform actions.
• VMs displays a list of VMs in the protection group with details about the available snapshots. Select a VM radio button,
and click Restore VM or Clone VM, then choose a snapshot.


Click one of the following buttons to perform actions on the protection group.

Action Description

Take snapshot You can change the default name of the snapshot, and define the retention period. vSAN takes a separate
snapshot for each VM in the protection group.
Edit You can add or remove VMs, modify the VM name patterns, and add or modify the snapshot schedules.
Pause schedule/ You can pause the snapshot schedules defined for the protection group. No snapshots are taken or deleted
Resume schedule while the schedules are paused.

To delete a protection group, click the More... icon next to the group name, and select Delete. When you delete the
protection group, you must decide how to manage its snapshots.
• Keep snapshots until their expiration date. The protection group will be deleted after all existing snapshots have
expired.
• Delete snapshots. The protection group and its existing snapshots are deleted immediately.

vSAN and VMware Live Cyber Recovery


VMware Live Cyber Recovery can leverage vSAN snapshots on the protected site for faster recovery of ransomware-
infected VMs in the cloud. VLCR can reduce restore times by using vSAN snapshots to update only the VM deltas at the
production site.
For more information, refer to "Fast Restore Using VMware vSAN Local Snapshots" in VMware Live Cyber Recovery.

Deploying the Snapshot Service Appliance


vSAN data protection requires the VMware Snapshot Service appliance to manage vSAN snapshots.
Deploy the Snapshot Service appliance at the same site as your vCenter, with a low latency network connection.
Download and deploy the OVA file to add the VMware Snapshot Service appliance. Deploying the appliance OVA is
similar to deploying a virtual machine from a template.
This appliance requires a trusted vCenter Server certificate. To obtain the certificate, choose any one of the following:
• From the vCenter home page, click Download trusted root CA certificates. Extract the certificate files, open Certs > lin, and copy the text from the file with the .0 extension. For detailed instructions, refer to the following KB article: https://knowledge.broadcom.com/external/article/330833/how-to-download-and-install-vcenter-serv.html.
• From a system with OpenSSL installed, use the following command:

openssl s_client -connect <vCenter Server fqdn>:443 -prexit


Copy the text from -----BEGIN CERTIFICATE----- to -----END CERTIFICATE-----
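If you prefer to capture the certificate text directly to a file instead of copying it manually, the following sketch can be used; the vCenter FQDN and output file name are hypothetical, -showcerts prints the full chain, and the sed filter keeps only the text between the BEGIN and END markers:

openssl s_client -connect vcenter.example.com:443 -showcerts </dev/null 2>/dev/null | sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' > vcenter-certs.pem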


1. Download the VMware Snapshot Service appliance from the Broadcom website at https://support.broadcom.com/
group/ecx/downloads.
2. Right-click the vSAN cluster in the vSphere Client, and select Deploy OVF Template to open the wizard.
3. On the Select an OVF template page, specify the location of the appliance OVA file and click Next.
4. On the Select a name and folder page, enter a unique name for the appliance and select your data center as the
deployment location.
5. On the Select a compute resource page, select the vSAN cluster as the compute resource.
6. On the Select storage page, select a datastore.
7. On the Select networks page, select the same network as the vCenter, and click Next.
8. On the Customize template page, enter the vCenter Server FQDN in the vCenter Server Hostname field. For example, sfo-w01-
dp01.sfo.rainpole.io. The vCenter Server credentials are used only once to create a dedicated local service account.

The VMware Snapshot Service is deployed to the specified vCenter, and vSAN data protection pages are available in the
vSphere Client.

Create a vSAN Data Protection Group


Place VMs in a data protection group to schedule and manage snapshots consistently for all VM members of the group.
Ensure your vSAN cluster meets the following requirements:
• vSAN Express Storage Architecture
• vSAN 8.0 Update 3 or later
• VMware Snapshot Service appliance deployed on vCenter
Protection groups enable you to schedule and manage vSAN snapshots for one or multiple VMs. You cannot add linked
clone VMs or VMs that have vSphere snapshots to a vSAN data protection group.


1. Navigate to a vSAN cluster in the vSphere Client.


2. Click the Configure tab, and select vSAN > Data Protection.
3. Select Protection Groups, and click Create Protection Group to open the wizard.
a) On the General page, enter a name for the protection group and choose how to define VM membership.
NOTE
Enable immutability mode to take read-only snapshots that cannot be modified or deleted, even by an
attacker with administrative privileges. Once immutability mode is enabled, it cannot be disabled by an
administrator.
b) (Optional) On the Add VM name patterns page, enter one or more VM name patterns to match.
All VMs in the cluster with a name that matches the pattern are added to the protection group. Use special
characters to help define each VM name pattern.
• Use * to match zero or more characters. For example, VM name patterns database* and prod-*-x match VMs
named "databaseSQL", "prod-1-x", and "prod-23-x"
• Use ? to match exactly one character. For example, the VM name pattern prod-? matches a VM named "prod-1",
but not "prod-23".
c) (Optional) On the Select individual VMs page, select VMs from the list to add as members of the protection
group.
d) On the Add snapshots schedules page, define the snapshot schedules and retention intervals.
You can add up to 10 snapshot schedules. Enter the schedule name, and select how often vSAN takes snapshots
of VMs in the protection group. Select how long to keep the scheduled snapshots.
e) On the Review page, review your selections, and click Create.
You can edit the protection group settings. You can take a manual snapshot to capture the current state of VMs in the
protection group.


Delete vSAN Snapshots


Use the vSphere Client to delete vSAN snapshots from a protection group.
Select vSAN snapshots in a protection group, and delete the snapshots from the group.

1. Navigate to the vSAN cluster in the vSphere Client, and select Configure > vSAN > Data Protection.
2. Select the Protection Groups tab, click a protection group, and select the Snapshots tab.
3. Select a snapshot, and click Delete Snapshot.
4. Click Delete.

Restore a VM from a vSAN Snapshot


You can use a vSAN snapshot to restore a VM to its previous state preserved by the snapshot.
When you restore a VM from a vSAN snapshot, vSAN replaces the current VM with the snapshot VM. You can restore a
deleted VM that has snapshots available.
1. Right-click a VM in the vSphere Client, and select Snapshots > vSAN Data Protection > Snapshot
Management.
To find snapshots for a removed or deleted VM, go to the Configure > vSAN > Data Protection page, click the VMs tab,
and click Removed VMs.


2. Select a snapshot from the list, and click Restore VM.


3. On the Restore dialog, click Restore to perform the operation.
The VM is powered off, and a new snapshot is created to capture the current state of the VM, so you can revert to it if
necessary.

The VM is restored to the previous state specified by the snapshot.

Clone a VM from a vSAN Snapshot


You can use a vSAN snapshot to create a linked clone VM to match the state of the original VM.
When you clone a VM from a vSAN snapshot, you must specify the location and compute resource for the clone.
1. Right-click a VM in the vSphere Client, and select Snapshots > vSAN Data Protection > Snapshot
Management.
2. Select a snapshot from the list, and click Clone VM to open the Clone VM dialog.
3. Enter a name for the clone, select a location, and click Next.
4. Select a compute resource for the clone, and click Next.
5. Review the information, and click Clone.

The linked clone VM is created, and is available in vCenter.

Using the vSAN iSCSI Target Service


Use the iSCSI target service to enable hosts and physical workloads that reside outside the vSAN cluster to access the
vSAN datastore.
This feature enables an iSCSI initiator on a remote host to transport block-level data to an iSCSI target on a storage
device in the vSAN cluster. vSAN 6.7 and later releases support Windows Server Failover Clustering (WSFC), so WSFC
nodes can access vSAN iSCSI targets.
After you configure the vSAN iSCSI target service, you can discover the vSAN iSCSI targets from a remote host. To
discover vSAN iSCSI targets, use the IP address of any host in the vSAN cluster, and the TCP port of the iSCSI target. To
ensure high availability of the vSAN iSCSI target, configure multipath support for your iSCSI application. You can use the
IP addresses of two or more hosts to configure the multipath.
NOTE
vSAN iSCSI target service does not support other vSphere or ESXi clients or initiators, third-party hypervisors,
or migrations using raw device mapping (RDMs).
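As an illustration, the following sketch shows discovery and login from a Linux initiator that uses the open-iscsi tools. The host IP address and target IQN are placeholders, and the default vSAN iSCSI TCP port 3260 is assumed; repeat the discovery against a second vSAN host IP address to configure an additional path for multipathing.

iscsiadm -m discovery -t sendtargets -p <vsan host ip>:3260
iscsiadm -m node -T <target IQN> -p <vsan host ip>:3260 --login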
vSAN iSCSI target service supports the following CHAP authentication methods:
CHAP
In CHAP authentication, the target authenticates the initiator, but the initiator does not authenticate the target.
Mutual CHAP
In mutual CHAP authentication, an extra level of security enables the initiator to authenticate the target.
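If the target is configured for CHAP, the initiator-side credentials can be set on the open-iscsi node record before login, as in the following sketch with placeholder values; for Mutual CHAP, the corresponding node.session.auth.username_in and node.session.auth.password_in settings are also required.

iscsiadm -m node -T <target IQN> -p <vsan host ip>:3260 --op update -n node.session.auth.authmethod -v CHAP
iscsiadm -m node -T <target IQN> -p <vsan host ip>:3260 --op update -n node.session.auth.username -v <chap user>
iscsiadm -m node -T <target IQN> -p <vsan host ip>:3260 --op update -n node.session.auth.password -v <chap secret>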

For more information about using the vSAN iSCSI target service, refer to the iSCSI Target Usage Guide at https://core.vmware.com/resource/vsan-iscsi-target-usage-guide.


iSCSI Targets
You can add one or more iSCSI targets that provide storage blocks as logical unit numbers (LUNs). vSAN identifies each
iSCSI target by a unique iSCSI Qualified Name (IQN). You can use the IQN to present the iSCSI target to a remote iSCSI
initiator so that the initiator can access the LUN of the target.
Each iSCSI target contains one or more LUNs. You define the size of each LUN, assign a vSAN storage policy to each
LUN, and enable the iSCSI target service on a vSAN cluster. You can configure a storage policy to use as the default
policy for the home object of the vSAN iSCSI target service.

iSCSI Initiator Groups


You can define a group of iSCSI initiators that have access to a specified iSCSI target. The iSCSI initiator group restricts
access to only those initiators that are members of the group. If you do not define an iSCSI initiator or initiator group, then
each target is accessible to all iSCSI initiators.
A unique name identifies each iSCSI initiator group. You can add one or more iSCSI initiators as members of the group.
Use the IQN of the initiator as the member initiator name.

Enable the vSAN iSCSI Target Service


Before you can create iSCSI targets and LUNs and define iSCSI initiator groups, you must enable the iSCSI target service
on the vSAN cluster.
1. Navigate to the vSAN cluster and click Configure > vSAN > Services.
2. On the vSAN iSCSI Target Service row, click ENABLE.
The Edit vSAN iSCSI Target Service wizard opens.
3. Edit the vSAN iSCSI target service configuration.
You can select the default network, TCP port, and Authentication method at this time. You also can select a vSAN
storage policy.
4. Click the Enable vSAN iSCSI Target service slider to turn it on and then click APPLY.

The vSAN iSCSI target service is enabled.


After the iSCSI target service is enabled, you can create iSCSI targets and LUNs, and define iSCSI initiator groups.

Create a vSAN iSCSI Target


You can create or edit an iSCSI target and its associated LUN.
Verify that the vSAN iSCSI target service is enabled.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
a) Under vSAN, click iSCSI Target Service.
b) Click the iSCSI Targets tab.
c) Click Add. The New iSCSI Target dialog box is displayed. If you leave the target IQN field blank, the IQN is
generated automatically.
d) Enter a target Alias.
e) Select a Storage policy, Network, TCP port, and Authentication method.
f) Select the I/O Owner Location. This feature is available only if you have configured the vSAN cluster as a stretched
cluster. It allows you to specify the site location that hosts the iSCSI target service for a target, which helps
avoid cross-site iSCSI traffic. If you have set the policy as HFT>=1, then in the event of a site failure, the I/O
owner location changes to the alternate site. After the site failure is recovered, the I/O owner location automatically
changes back to the original I/O owner location as per the configuration. You can select one of the following
options to set the site location:
• Either: Hosts the iSCSI target service either on Preferred or Secondary site.
• Preferred: Hosts the iSCSI target service on the Preferred site.
• Secondary: Hosts the iSCSI target service on the Secondary site.
3. Click OK.

iSCSI target is created and listed under the vSAN iSCSI Targets section with the information such as IQN, I/O owner host,
and so on.
Define a list of iSCSI initiators that can access this target.

Add a LUN to a vSAN iSCSI Target


You can add one or more LUNs to a vSAN iSCSI target, or edit an existing LUN.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
a) Under vSAN, click iSCSI Target Service.
b) Click the iSCSI Targets tab, and select a target.
c) In the vSAN iSCSI LUNs section, click Add. The Add LUN to Target dialog box is displayed.
d) Enter the size of the LUN. The vSAN Storage Policy configured for the iSCSI target service is assigned
automatically. You can assign a different policy to each LUN.
3. Click Add.

Resize a LUN on a vSAN iSCSI Target


Depending on your requirement, you can increase the size of an online LUN.
Online resizing of the LUN is enabled only if all hosts in the cluster are upgraded to vSAN 6.7 Update 3 or later.
1. In the vSphere Client, navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click iSCSI Target Service.
4. Click the iSCSI Targets tab and select a target.
5. In the vSAN iSCSI LUNs section, select a LUN and click Edit. The Edit LUN dialog box is displayed.
6. Increase the size of the LUN depending on your requirement.
7. Click OK.

Create a vSAN iSCSI Initiator Group


You can create a vSAN iSCSI initiator group to provide access control for vSAN iSCSI targets.
Only iSCSI initiators that are members of the initiator group can access the vSAN iSCSI targets.
NOTE
If an initiator group for access control is created on an iSCSI target, initiators outside that group cannot access
the target. Existing connections from these initiators are lost and cannot be recovered
until they are added to the initiator group. You must check the current initiator connections and ensure that all
the authorized initiators are added to the initiator group before group creation.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
a) Under vSAN, click iSCSI Target Service.
b) Click the Initiator Groups tab, and click Add. The New Initiator Group dialog box is displayed.
c) Enter a name for the iSCSI initiator group.
d) (Optional) To add members to the initiator group, enter the IQN of each member. Use the following format to enter
the member IQN:
iqn.YYYY-MM.domain:name
Where:
• YYYY = year, such as 2016
• MM = month, such as 09
• domain = domain where the initiator resides
• name = member name (optional)
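For example, iqn.2016-09.com.example:initiator01 is a valid member IQN; the domain and name in this example are hypothetical.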
3. Click OK or Create.
Add members to the iSCSI initiator group.

Assign a Target to a vSAN iSCSI Initiator Group


You can assign a vSAN iSCSI target to an iSCSI initiator group.
Verify that you have an existing iSCSI initiator group.
Only those initiators that are members of the initiator group can access the assigned targets.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
a) Under vSAN, click iSCSI Target Service.
b) Select the Initiator Groups tab.
c) In the Accessible Targets section, click Add. The Add Accessible Targets dialog box is displayed.
d) Select a target from the list of available targets.
3. Click Add.

Turn Off the vSAN iSCSI Target Service


You can turn off the vSAN iSCSI target service.
Workloads running on iSCSI LUNs are stopped when you turn off the iSCSI target service. Before you turn it off, ensure
that there are no workloads running on iSCSI LUNs.
Turning off the vSAN iSCSI target service does not delete the LUNs and targets. If you want to reclaim the space, delete the
LUNs and targets manually before you turn off the vSAN iSCSI target service.
1. Navigate to the vSAN cluster and click Configure > vSAN > Services.

2. On the vSAN iSCSI Target Service row, click EDIT.


The Edit vSAN iSCSI Target Service wizard opens.


3. Click the Enable vSAN iSCSI Target Service slider to turn it off and click Apply.

The vSAN iSCSI target service is turned off.

Monitor vSAN iSCSI Target Service


You can monitor the iSCSI target service to view the physical placement of iSCSI target components and to check for
failed components.
Verify that you have enabled the vSAN iSCSI target service and created targets and LUNs.
You also can monitor the health status of the iSCSI target service.
1. Browse to the vSAN cluster.
2. Click Monitor and select Virtual Objects. iSCSI targets are listed on the page.
3. Select a target and click View Placement Details. The Physical Placement shows where the data components of the
target are located.
4. Click Group components by host placement to view the hosts associated with the iSCSI data components.

vSAN File Service


Use the vSAN file service to create file shares in the vSAN datastore that client workstations or VMs can access.
The data stored in a file share can be accessed from any device that has access rights. vSAN File Service is a layer
that sits on top of vSAN to provide file shares. It currently supports SMB, NFSv3, and NFSv4.1 file shares. vSAN
File Service comprises the vSAN Distributed File System (vDFS), which provides the underlying scalable file system by
aggregating vSAN objects, and a Storage Services Platform, which provides resilient file server endpoints and a control plane
for deployment, management, and monitoring. File shares are integrated with the existing vSAN Storage Policy Based
Management on a per-share basis. vSAN File Service provides the capability to host file shares directly on the vSAN
cluster.


When you configure vSAN file service, vSAN creates a single VDFS distributed file system for the cluster which will be
used internally for management purposes. A file service VM (FSVM) is placed on each host. The FSVMs manage file
shares in the vSAN datastore. Each FSVM contains a file server that provides both NFS and SMB service.
You must provide a static IP address pool as input during the file service enablement workflow. One of the IP addresses is
designated as the primary IP address. The primary IP address can be used for accessing all the shares in the file services
cluster with the help of SMB and NFSv4.1 referrals. A file server is started for every IP address provided in the IP pool.
A file share is exported by only one file server. However, the file shares are evenly distributed across all the file servers.
To provide computing resources that help manage access requests, the number of IP addresses must be equal to the
number of hosts in the vSAN cluster.
vSAN file service supports vSAN stretched clusters and two-node vSAN clusters. A two-node vSAN cluster should have
two data node servers in the same location or office, and the witness in a remote or shared location.
For more information about Cloud Native Storage (CNS) file volumes, see the VMware vSphere Container Storage Plug-in
documentation and vSphere with Tanzu Configuration and Management documentation.

Limitations and Considerations of vSAN File Service


Consider the following when configuring vSAN File Service:
• vSAN 8.0 supports two-node configurations and stretched clusters.
• vSAN 8.0 supports 64 file servers in a 64 host setup.
• vSAN 8.0 supports 100 file shares.
• vSAN 8.0 Update 2 supports File Service on Express Storage Architecture (ESA).


• vSAN 8.0 Update 3 ESA cluster supports 250 file shares. Out of those 250 file shares, maximum 100 file shares can
be SMB. For example, if you create 100 SMB file shares then the cluster can only support additional 150 NFS file
shares.
• vSAN File Services does not support the following:
– Read-Only Domain Controllers (RODC) for joining domains because the RODC cannot create machine accounts.
As a security best practice, pre-create a dedicated organizational unit in Active Directory and ensure that the
configured user name controls this organizational unit.
– Disjoint namespace.
– Multi domain and Single Active Directory Forest environments.
• When a host enters maintenance mode, the file server moves to another FSVM. The FSVM on the host that entered
maintenance mode is powered off. After the host exits maintenance mode, the FSVM is powered on.
• The vSAN File Services VM (FSVM) Docker internal network might overlap with the customer network without warning or
reconfiguration.
There is a known conflict issue if the specified file service network overlaps with the Docker internal network
(172.17.0.0/16). This causes routing problems for the traffic to the correct endpoint.
As a workaround, specify a different file service network so that it does not overlap with the Docker internal network
(172.17.0.0/16).

Enable vSAN File Service


You can enable vSAN File Services on a vSAN Original Storage Architecture (OSA) cluster or a vSAN Express Storage
Architecture (ESA) cluster.


Ensure that the following are configured before enabling the vSAN File Services:
• The vSAN cluster must be a regular vSAN cluster, a vSAN stretched cluster, or a vSAN ROBO cluster.
• Every ESXi host in the vSAN cluster must have minimal hardware requirements such as:
– 4 Core CPU
– 16 GB physical memory
• You must ensure to prepare the network as vSAN File Service network:
– If using standard switch based network, the Promiscuous Mode and Forged Transmits are enabled as part of the
vSAN File Services enablement process.
– If using DVS based network, vSAN File Services are supported on DVS version 6.6.0 or later. Create a dedicated
port group for vSAN File Services in the DVS. MacLearning and Forged Transmits are enabled as part of the vSAN
File Services enablement process for a provided DVS port group.
– IMPORTANT
If using NSX-based network, ensure that MacLearning is enabled for the provided network entity from
the NSX admin console, and all the hosts and File Services nodes are connected to the desired NSX-T
network.
1. Navigate to the vSAN cluster and click Configure > vSAN > Services.
2. On the File Service row, click Enable.
The Enable File Service wizard opens.

3. From the Select drop-down, select a network.


4. In the File service agent, select one of the following options to download the OVF file.

Option Description

Automatically load latest OVF This option lets the system search and download the OVF.
NOTE
• Ensure that you have configured the proxy and firewall so that vCenter can access the following website and download the appropriate JSON file: https://download3.vmware.com/software/VSANOVF/FsOvfMapping.json
For more information about configuring the vCenter DNS, IP address, and proxy settings, see vCenter Server Appliance Configuration.
• Use current OVF: Lets you use the OVF that is already available.
• Automatically load latest OVF: Lets the system search and download the latest OVF.
Manually load OVF This option allows you to browse and select an OVF that is already available on your local system.
NOTE
If you select this option, you should upload all the following files:
• VMware-vSAN-File-Services-Appliance-x.x.x.x-x_OVF10.mf
• VMware-vSAN-File-Services-Appliance-x.x.x.x-x-x_OVF10.cert
• VMware-vSAN-File-Services-Appliance-x.x.x.x-x-x-system.vmdk
• VMware-vSAN-File-Services-Appliance-x.x.x.x-x-cloud-components.vmdk
• VMware-vSAN-File-Services-Appliance-x.x.x.x-x-log.vmdk
• VMware-vSAN-File-Services-Appliance-x.x.x.x-x_OVF10.ovf

5. Click Enable.

• The OVF is downloaded and deployed.


• The vSAN File Service is enabled.
• A File Services VM (FSVM) is placed on each host.
NOTE
The FSVMs are managed by the vSAN File Services. Do not perform any operation on the FSVMs.

Configure vSAN File Service


You can configure the File Service, which enables you to create file shares on your vSAN datastore.


Ensure the following before configuring the vSAN File Service:


• Enable vSAN file service.
• Allocate static IP addresses as file server IPs from the vSAN File Service network. Each IP address is a single access point to
vSAN file shares.
– For best performance, the number of IP addresses must be equal to the number of hosts in the vSAN cluster.
– All the static IP addresses must be from the same subnet.
– Every static IP address has a corresponding FQDN, which must be part of the Forward lookup and Reverse lookup
zones in the DNS server (see the lookup example after this list).
• If you are planning to create a Kerberos based SMB file share or a Kerberos based NFS file share, you need the
following:
– Microsoft Active Directory (AD) domain to provide authentication to create an SMB file share or an NFS file share
with the Kerberos security.
– (Optional) Active Directory Organizational Unit to create all file server computer objects.
– A domain user in the directory service with the sufficient privileges to create and delete computer objects.
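Before configuring the domain, you can confirm the forward and reverse lookup records for each planned file server address. The following sketch uses a hypothetical FQDN and IP address; both queries must resolve for the configuration to succeed:

# The forward lookup must return the static IP address planned for the file server
nslookup fs01.rainpole.io
# The reverse lookup of that address must return the same FQDN
nslookup 10.10.10.11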
1. Navigate to the vSAN cluster and click Configure > vSAN > Services.
2. On the File Service row, click Configure Domain.
The File Service Domain wizard opens.

3. In the File service domain page, enter the unique namespace and click Next. The domain name must be at least
two characters long. The first character must be a letter or a number. The remaining characters can include
letters, numbers, underscores ( _ ), periods ( . ), and hyphens ( - ).
4. In the Networking page, enter the following information, and click Next:
• Protocol: You can select IPv4 or IPv6. vSAN File Service supports only a single IP stack, either IPv4 or IPv6.
Reconfiguration between IPv4 and IPv6 is not supported.
• DNS servers: Enter a valid DNS server to ensure the proper configuration of File Service.
• DNS suffixes: Provide the DNS suffix that is used with the file service. All other DNS suffixes from where the
clients can access these file servers must also be included. File Service does not support DNS domains with a
single label, such as "app", "wiz", "com", and so on. A domain name given to the file service must be of the format
thisdomain.registeredrootdnsname. The DNS name and suffix must adhere to the best practices detailed in https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/plan/selecting-the-forest-root-domain.
• Subnet mask: Enter a valid subnet mask. This text box appears when you select IPv4.
• Prefix length: Enter a number between 1 and 128. This text box appears when you select IPv6.
• Gateway: Enter a valid gateway.
• IP Pool: Enter primary IP address and DNS name.
With vSAN 8.0 Update 3, vSAN ESA cluster supports 250 file shares. Out of those 250 file shares, maximum 100 file
shares can be SMB. For example, if you create 100 SMB file shares then the cluster can only support additional 150
NFS file shares.
Each file server on a vSAN ESA cluster can support a maximum of 25 file shares and requires at least 10 IPs to have
the maximum of 250 shares. With the increase in the file servers or file shares per host, there might be an impact
on the performance of vSAN File Service. For best performance, the number of IP addresses must be equal to the
number of hosts in the vSAN cluster.
Affinity site option is available if you are configuring vSAN file service on a vSAN stretched cluster. This option allows
you to configure the placement of the file server on Preferred or Secondary site. This helps in reducing the cross-site
traffic latency. The default value is Either, which indicates that no site affinity rule is applied to the file server.
NOTE
If your cluster is a ROBO cluster, ensure that the Affinity site value is set to Either.
In a site failure event, the file server affiliated to that site fails over to the other site. The file server fails back to the
affiliated site when it is recovered. Configure more file servers to one site if more workloads can be expected from a
certain site.
NOTE
If the file server contains SMB file shares, then it does not failback automatically even if the site failure is
recovered.
Consider the following while configuring the IP addresses and DNS names:
• To ensure proper configuration of File Service, the IP addresses you enter in the Networking page must be static
addresses and the DNS server must have records for those IP addresses. For best performance, the number of IP
addresses must be equal to the number of hosts in the vSAN cluster.
• You can have a maximum of 64 hosts in the cluster. If large scale cluster support is configured, you can enter up to
64 IP addresses.
• You can use the following options to automatically fill the IP address and DNS server name text boxes:
AUTO FILL: This option is displayed after you enter the first IP address in the IP address text box. Click the AUTO
FILL option to automatically fill the remaining fields with sequential IP addresses, based on the subnet mask and
gateway address of the IP address that you have provided in the first row. You can edit the auto filled IP addresses.
LOOK UP DNS: This option is displayed after you enter the first IP address in the IP address text box. Click the
LOOK UP DNS option to automatically retrieve the FQDN corresponding to the IP addresses in the IP address
column.


NOTE
• All valid rules apply for the FQDNs. For more information, see https://tools.ietf.org/html/rfc953.
• The first part of the FQDN, also known as NetBIOS Name, must not have more than 15 characters.
The FQDNs are automatically retrieved only under the following conditions:
– You must have entered a valid DNS server in the Domain page.
– The IP addresses entered in the IP Pool page must be static addresses and the DNS server must have records
for those IP addresses.
5. In the Directory service page, enter the following information and click Next.

Option Description

Directory service Configure an Active Directory domain to vSAN File Service for
authentication. If you are planning to create an SMB file share or
an NFSv4.1 file share with Kerberos authentication, then you must
configure an AD domain to vSAN File Service.
AD domain Fully qualified domain name joined by the file server.
Preferred AD Server Enter the IP address of the preferred AD server. If there are multiple
IP addresses, ensure that they are separated by commas.
Organizational unit (Optional) Contains the computer account that the vSAN File Service
creates. In an organization with complex hierarchies, create the
computer account in a specified container by using a forward
slash mark to denote hierarchies (for example, organizational_unit/
inner_organizational_unit).
NOTE
By default, the vSAN File Service creates the computer
account in the Computers container.

AD username User name to be used for connecting and configuring the Active
Directory service.
This user name authenticates to Active Directory on the domain.
A domain user authenticates the domain controller and creates
vSAN File Service computer accounts, related SPN entries, and
DNS entries (when using Microsoft DNS). As a best practice,
create a dedicated service account for the file service.
The domain user in the directory service must have sufficient
privileges to create and delete computer objects, and the following privilege:
• (Optional) Add/Update DNS entries



Password Password for the user name of the Active Directory on the domain.
vSAN File Service uses the password to authenticate to AD and to
create the vSAN File Service computer account.

NOTE
• vSAN File Service does not support the following:
– Read-Only Domain Controllers (RODC) for joining domains because the RODC cannot create
machine accounts. As a security best practice, pre-create a dedicated organizational unit in Active
Directory and ensure that the user name provided here controls this organizational unit.
– Disjoint namespace.
– Multi domain and Single Active Directory Forest environments.
• Only English characters are supported for Active Directory user name.
• Only single AD domain configuration is supported. However, the file servers can be put on a valid DNS
subdomain. For example, an AD domain with the name example.com can have file server FQDN as
name1.eng.example.com .
• Pre-created computer objects for file servers are not supported. Make sure that the user provided here
has sufficient privileges over the organizational unit.
• vSAN File Service updates the DNS records for the file servers if the Active Directory is also used as a
DNS server and the user has sufficient permission to update the DNS records. vSAN File Service also
has a Health Check to indicate if the forward and reverse lookups for file servers are working properly.
However, if other proprietary solutions are used as DNS servers, the VI admin must update these
DNS records.
6. Review the settings and click Finish.

The file service domain is configured. File servers are started with the IP addresses that were assigned during the vSAN
File Service configuration process.

Edit vSAN File Service


You can edit and reconfigure the settings of a vSAN File Service.
• If you are upgrading from vSAN 7.0 to 7.0 Update 1, you can create SMB and NFS Kerberos file shares. This requires
configuring the Active Directory domain to vSAN File Service.
• If there are active shares, changing the Active Directory domain is not permitted as this action can disrupt the user
permissions on the active shares.
• If your Active Directory password has been changed, then you can edit the Active Directory configuration settings and
provide the new password.
NOTE
This action might cause minor disruption to the inflight I/Os on the file shares.

1. Navigate to the vSAN cluster and click Configure > vSAN > Services.
2. On the File Service row, click Edit> Edit domain.
The File Service Domain wizard opens.


3. In the File service domain page, edit the file service domain name and click Next.
4. In the Networking page, make the appropriate configuration changes and click Next. You can edit the primary IP
addresses, static IP addresses, and DNS names. You can add or remove the primary IP addresses or static IP
addresses. You cannot change the DNS name without changing the IP.
NOTE
Changing domain information is a disruptive action. It might require all clients to use new URLs to reconnect
to the file shares.
5. In the Directory service page, make appropriate directory related changes and click Next.
NOTE
You cannot change the AD domain, organizational unit, and username after initially configuring vSAN File
Services.
6. In the Review page, click Finish after making necessary changes.

The changes are applied to the vSAN File Service configuration.

Create a vSAN File Share


When the vSAN file service is enabled, you can create one or more file shares on the vSAN datastore.
If you are creating an SMB file share or a NFSv4.1 file share with Kerberos security, then ensure that you have configured
vSAN File Service to an AD domain.
Considerations for Share Name and Usage
• Usernames with non-ASCII characters can be used to access share data.
• Share names cannot exceed 80 characters and can contain English letters, numbers, and hyphens. Every
hyphen must be preceded and followed by a letter or a number. Consecutive hyphens are not allowed.
• For SMB type shares, file and directory names can contain any Unicode-compatible strings.
• For pure NFSv4 type shares, file and directory names can contain any UTF-8 compatible strings.
• For pure NFSv3 and NFSv3+NFSv4 shares, file and directory names can contain only ASCII-compatible strings.
• Migrating share data from older NFSv3 shares to new vSAN File Service shares that are NFSv4 only requires converting
all file and directory names to UTF-8 encoding. Third-party tools are available for this conversion (a sketch follows this list).
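One possible approach is sketched below, assuming the open source convmv utility is available on a Linux client and that the existing names use ISO-8859-1 encoding; the mount path and source encoding are placeholders for your environment. convmv performs a dry run by default and applies the renames only when --notest is specified.

# Preview how file and directory names would be converted to UTF-8
convmv -f iso-8859-1 -t utf-8 -r /mnt/old-nfsv3-share
# Apply the conversion after reviewing the preview
convmv -f iso-8859-1 -t utf-8 -r --notest /mnt/old-nfsv3-share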
vSAN File Service does not support using NFS file shares on ESXi.
1. Navigate to the vSAN cluster and click Configure > vSAN > File Shares.
With vSAN 8.0 Update 3, vSAN ESA cluster supports 250 file shares. Out of those 250 file shares, maximum 100 file
shares can be SMB. For example, if you create 100 SMB file shares then the cluster can only support additional 150
NFS file shares.
Each file server on a vSAN ESA cluster can support a maximum of 25 file shares and requires at least 10 IPs to have
the maximum of 250 shares. With the increase in the file servers or file shares per host, there might be an impact on
the performance of vSAN File Service. For best performance, the number of IP addresses must be equal to the
number of hosts in the vSAN cluster.
2. Click Add.
The Create file share wizard opens.
3. In the General page, enter the following information and click Next.
• Name: Enter a name for the file share.
• Protocol: Select an appropriate protocol. vSAN File Service supports SMB and NFS file system protocols.


If you select the SMB protocol, you can also configure the SMB file share to accept only the encrypted data using
the Protocol encryption option.
If you select the NFS protocol, you can configure the file share to support either NFS 3, NFS 4, or both NFS 3 and
NFS 4 versions. If you select NFS 4 version, you can set either AUTH_SYS or Kerberos security.
NOTE
SMB protocol and Kerberos security for NFS protocol can be configured only if the vSAN File Service is
configured with Active Directory. For more information, see Configure vSAN File Service.
• With SMB protocol, you can hide the files and folders that the share client user does not have permission to access
using the Access based enumeration option.
• Storage Policy: Select an appropriate storage policy.
• Affinity site: This option is available if you are creating a file share on a vSAN stretched cluster. This option helps
you place the file share on a file server that belongs to the site of your choice. Use this option when you prefer low
latency while accessing the file share. The default value is Either, which indicates that the file share is placed on a
site with less traffic on either preferred or secondary site.
• Storage space quotas: You can set the following values:
– Share warning threshold: When the share reaches this threshold, a warning message is displayed.
– Share hard quota: When the share reaches this threshold, new block allocation is denied.
• Labels: A label is a key-value pair that helps you organize file shares. You can attach labels to each file share and
then filter them based on their labels. A label key is a string with 1~250 characters. A label value is a string and the
length of the label value should be less than 1k characters. vSAN File Service supports up to 5 labels per share.
4. The Net access control page, provides options to define access to the file share. Net access control options are
available only for NFS shares. Select one of the following options and click Next.
• No access: Select this option to make the file share inaccessible from any IP address.
• Allow access from any IP: Select this option to make the file share accessible from all IP addresses.
• Customize net access: Select this option to define permissions for specific IP addresses. Using this option you
can specify whether a particular IP address can access, make changes, or only read the file share. You can also
enable Root squash for each IP address. You can enter the IP addresses in the following formats:
– A single IP address. For example, 123.23.23.123
– IP address along with a subnet mask. For example, 123.23.23.0/8
– A range by specifying a starting IP address and ending IP address separated by a hyphen ( - ). For example,
123.23.23.123-123.23.23.128
– Asterisk ( * ) to imply all the clients.
5. In the Review page, review the settings, and then click Finish.
A new file share is created on the vSAN datastore.

View vSAN File Shares


You can view the list of vSAN file shares.
To view the list of vSAN file shares, navigate to the vSAN cluster and click Configure > vSAN > File Service Shares.
A list of vSAN file shares appears. For each file share, you can view information such as storage policy, hard quota, usage
over quota, actual usage, and so on. The vSAN ESA cluster displays the number of existing file shares and the maximum
file share limit allowed in a cluster.

Access vSAN File Shares


You can access a file share from a host client.


Access NFS File Share

You can access a file share from a host client, using an operating system that communicates with NFS file systems.
For RHEL-based Linux distributions, NFS 4.1 support is available in RHEL 7.3 and CentOS 7.3-1611 running kernel 3.10.0-514 or later. For Debian-based Linux distributions, NFS 4.1 support is available in Linux kernel version 4.0.0 or later. All NFS clients must have unique hostnames for NFSv4.1 to work. You can use the Linux mount command with the Primary IP to mount a vSAN file share to the client. For example: mount -t nfs4 -o minorversion=1,sec=sys <primary ip>:/vsanfs/<share name> <local_mount_point>. NFSv3 support is available for RHEL-based and Debian-based Linux distributions. You can use the Linux mount command to mount a vSAN file share to the client. For example: mount -t nfs -o vers=3 <nfsv3_access_point> <local_mount_point>.
Sample NFSv4.1 commands for verifying the NFS file share from a host client:
[root@localhost ~]# mount -t nfs4 -o minorversion=1,sec=sys <primary ip address>:/vsanfs/TestShare-0 /mnt/TestShare-0
[root@localhost ~]# cd /mnt/TestShare-0/
[root@localhost TestShare-0]# mkdir bar
[root@localhost TestShare-0]# touch foo
[root@localhost TestShare-0]# ls -l
total 0
drwxr-xr-x. 1 root root 0 Feb 19 18:35 bar
-rw-r--r--. 1 root root 0 Feb 19 18:35 foo
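An NFSv3 mount can be verified in the same way. A minimal sketch, assuming the NFSv3 access point obtained from the vSphere Client and an existing local mount point (the paths are illustrative):
[root@localhost ~]# mount -t nfs -o vers=3 <nfsv3 access point> /mnt/TestShare-0
[root@localhost ~]# ls -l /mnt/TestShare-0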
Access NFS Kerberos File Share

A Linux client accessing an NFS Kerberos share should have a valid Kerberos ticket.
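A ticket can be obtained with kinit and verified with klist before mounting. A minimal sketch, assuming an Active Directory user user1 in the realm EXAMPLE.COM (both names are illustrative):
[root@localhost ~]# kinit user1@EXAMPLE.COM
[root@localhost ~]# klist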
Sample NFSv4.1 commands for verifying the NFS Kerberos file share from a host client: An NFS Kerberos share can be mounted using the following mount command:
[root@localhost ~]# mount -t nfs4 -o minorversion=1,sec=krb5/krb5i/krb5p <primary ip address>:/vsanfs/TestShare-0 /mnt/TestShare-0
[root@localhost ~]# cd /mnt/TestShare-0/
[root@localhost TestShare-0]# mkdir bar
[root@localhost TestShare-0]# touch foo
[root@localhost TestShare-0]# ls -l
total 0
drwxr-xr-x. 1 root root 0 Feb 19 18:35 bar
-rw-r--r--. 1 root root 0 Feb 19 18:35 foo

Changing Ownership of an NFS Kerberos share: You must log in using the AD domain user name to change the ownership of a share. The AD domain user name provided in the file service configuration acts as a sudo user for the Kerberos file share.
[root@localhost ~]# mount -t nfs4 -o minorversion=1,sec=sys <primary ip address>:/vsanfs/TestShare-0 /mnt/TestShare-0
[fsadmin@localhost ~]# chown user1 /mnt/TestShare-0
[user1@localhost ~]# ls -l /mnt/TestShare-0
total 0
drwxr-xr-x. 1 user1 domain users 0 Feb 19 18:35 bar
-rw-r--r--. 1 user1 domain users 0 Feb 19 18:35 foo

Access SMB File Share

You can access an SMB file share from a Windows client.


Ensure that the Windows client is joined to the Active Directory domain that is configured with vSAN File Service.
1. Copy the SMB file share path using the following procedure:
1. Navigate to the vSAN cluster and click Configure > vSAN > File Service Shares.
List of all the vSAN file shares appears.
2. Select the SMB file share that you want to access from the Windows client.
3. Click COPY PATH > SMB.
The SMB file share path gets copied to your clipboard.
2. Log into the Windows client as a normal Active Directory domain user.
3. Access the SMB file share using the path that you have copied.
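For example, you can open the copied UNC path directly in File Explorer, or map it to a drive letter from a command prompt. A minimal sketch, assuming a hypothetical share path \\fs01.example.com\TestShare-0 (substitute the path you copied in step 1):
C:\> net use Z: \\fs01.example.com\TestShare-0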

Edit a vSAN File Share


You can edit the settings of a vSAN file share.
1. Navigate to the vSAN cluster and click Configure > vSAN > File Service Shares.
List of all the vSAN file shares appears.
2. Select the file share that you want to modify and click EDIT.
3. In the Edit file share page, make appropriate changes to the file share settings and click Finish.

The file share settings are updated.


NOTE
vSAN does not allow file share protocol change between SMB and NFS.

Manage SMB File Share on vSAN Cluster


vSAN File Service supports the shared folders snap-in for the Microsoft Management Console (MMC) for managing the
SMB shares on the vSAN cluster.
You can perform the following tasks on vSAN File System SMB shares using the MMC tool:
• Manage Access Control List (ACL).
• Close open files.
• View active sessions.
• View open files.
• Close client connections.
1. Copy the MMC Command using the following procedure:
1. Navigate to the vSAN cluster and click Configure > vSAN > File Service Shares.
List of all the vSAN file shares appears.
2. Select the SMB file share that you want to manage from the Windows client using the MMC tool.
3. Click COPY MMC COMMAND.
The MMC command gets copied to your clipboard.


2. Log into the Windows client as a file service admin user. The file service admin user is configured when you create the
file service domain. A file service admin user has all the privileges on the file server.
3. In the search box on the taskbar, type Run, and then select Run.
4. In the Run box, run the MMC command that you have copied to access and manage the SMB share using the MMC
tool.
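The copied command typically launches the Shared Folders snap-in targeted at the file server that backs the share. A hypothetical example, assuming a file server named fs01.example.com; the exact command copied from the vSphere Client may differ:
mmc fsmgmt.msc /computer=\\fs01.example.com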

Delete a vSAN File Share


You can delete a file share when you no longer need it.
When you delete a file share, all the snapshots associated with that file share are also deleted.
1. Navigate to the vSAN cluster and click Configure > vSAN > File Service Shares.
List of all the vSAN file shares appears.
2. Select the file share that you want to delete and click DELETE.
3. On the Delete file shares dialogue, click DELETE.

vSAN Distributed File System Snapshot


A snapshot provides a space-efficient and time-based archive of the data.
It provides the ability to retrieve data from a file or a set of files in the event of accidental deletion of a file. A file system
level snapshot provides you information about the files that have been changed and the changes made to the file. It
provides you an automated file recovery service and it is more efficient compared to the traditional tape-based backup
method. A snapshot on its own does not provide a full disaster recovery solution but it can be used by the third-party
backup vendors to copy the changed files (incremental backup) to a different physical location.
vSAN File Services has a built-in feature that allows you to create a point- in-time image of the vSAN file share. When the
vSAN File Service is enabled, you can create up to 32 snapshots per share. A vSAN file share snapshot is a file system
snapshot that provides a point-in-time image of a vSAN file share.
NOTE
vSAN distributed file system snapshot is supported on version 7.0 Update 2 or later.

Considerations for File System Snapshot


• Use Default as the snapshot name to retrieve data.
• Snapshot name cannot exceed 100 characters and can contain English characters, numbers, and special characters
except the following:
– " (ASCII 34)
– $ (ASCII 36)
– % (ASCII 37)
– & (ASCII 38)
– * (ASCII 42)
– / (ASCII 47)
– : (ASCII 58)
– < (ASCII 60)
– > (ASCII 62)
– ? (ASCII 63)
– \ (ASCII 92)
– ^ (ASCII 94)
– | (ASCII 124)


– ~ (ASCII 126)

Create a Snapshot

When the vSAN file service is enabled, you can create one or more snapshots that provide a point-in-time image of the
vSAN file share. You can create a maximum of 32 snapshots per file share.
You should have created a vSAN file share.
1. Navigate to the vSAN cluster and click Configure > vSAN > File Service Shares.
A list of vSAN file shares appears.
2. Select the file share for which you want to create a snapshot and then click SNAPSHOTS > NEW SNAPSHOT.
Create new snapshot dialogue appears.
3. On the Create new snapshot dialogue, provide a name for the snapshot, and click Create.

A point-in-time snapshot for the selected file share is created.


View a Snapshot

You can view the list of snapshots along with the information such as date and time of the snapshot creation, and its size.
1. Navigate to the vSAN cluster and click Configure > vSAN > File Service Shares.
A list of vSAN file shares appears.
2. Select a file share and click SNAPSHOTS.

A list of snapshots for that file share appears. You can view information such as date and time of the snapshot creation,
and its size.
Delete a Snapshot

You can delete a snapshot when you no longer need it.


1. Navigate to the vSAN cluster and click Configure > vSAN > File Service Shares.
A list of vSAN file shares appears.
2. Select a file share and click SNAPSHOTS.
A list of snapshots that belong to the file share you have selected appears.
3. Select the snapshot that you want to delete and click DELETE.

Rebalance Workload on vSAN File Service Hosts


Skyline Health displays the workload balance health status for all the hosts that are part of the vSAN File Service
Infrastructure.

If there is an imbalance in the workload of a host, you can correct it by rebalancing the workload.
1. Navigate to the vSAN cluster and then click Monitor > vSAN > Skyline Health.
2. Under Skyline Health, expand File Service and then click Infrastructure Health.
The Infrastructure Health tab displays a list of all the hosts that are part of the vSAN File Service infrastructure. For
each host, the status of workload balance is displayed. If there is an imbalance in the workload of a host, an alert is
displayed in the Description column.


3. Click REMEDIATE IMBALANCE and then REBALANCE to fix the imbalance.


Before proceeding with rebalancing, consider the following:
• During rebalancing, containers in the hosts with an imbalanced workload might be moved to other hosts. The
rebalancing activity might also impact the other hosts in the cluster.
• During the rebalance process, the workloads running on NFS shares are not disrupted. However, the I/O to SMB
shares located in the containers that have moved are disrupted.

The host workload is balanced and the workload balance status turns green.

Reclaiming Space with Unmap in vSAN Distributed File System


UNMAP commands enable you to reclaim storage space that is mapped to deleted files in the vSAN Distributed File
System (VDFS) created by the guest on the vSAN object.
vSAN 6.7 Update 2 and later supports UNMAP commands. Deleting or removing files and snapshots frees space within
the file system. This free space is mapped to a storage device until the file system releases or unmaps it. vSAN supports
reclamation of free space, which is also called the unmap operation. You can free storage space in the VDFS when you
delete file shares and snapshots, consolidate file shares and snapshots, and so on. You can unmap storage space when
you delete files or snapshots.
Unmap capability is not enabled by default. To enable unmap on a vSAN cluster, use the following RVC command:
vsan.unmap_support --enable

When you enable unmap on a vSAN cluster, you must power off and then power on all VMs. VMs must use virtual
hardware version 13 or above to perform unmap operations.

Upgrade vSAN File Service


When you upgrade the file service, the upgrade is performed on a rolling basis.
Ensure that the following are upgraded:
• ESXi Hosts
• vCenter Server
• vSAN disk format
During the upgrade, the file server containers running on the virtual machines that are undergoing the upgrade fail over to other virtual machines. The file shares remain accessible during the upgrade, but you might experience some interruptions while accessing them.
1. Navigate to the vSAN cluster and then click Configure > vSAN > Services.
2. Under vSAN Services, on the File Service row, click CHECK UPGRADE.
3. In the Upgrade File Service dialog box, select one of the following deployment options and then click UPGRADE.
• Automatic approach: This is the default option. This option lets the system search and download the OVF. After the upgrade begins, you cannot cancel the task.
NOTE
vSAN requires internet connectivity for this option.
• Manual approach: This option allows you to browse and select an OVF that is already available on your local system. After the upgrade begins, you cannot cancel the task.
NOTE
If you select this option, you should upload all the following files:
– VMware-vSAN-File-Services-Appliance-x.x.x.x-x_OVF10.mf
– VMware-vSAN-File-Services-Appliance-x.x.x.x-x-x_OVF10.cert
– VMware-vSAN-File-Services-Appliance-x.x.x.x-x-x-system.vmdk
– VMware-vSAN-File-Services-Appliance-x.x.x.x-x-cloud-components.vmdk
– VMware-vSAN-File-Services-Appliance-x.x.x.x-x-log.vmdk
– VMware-vSAN-File-Services-Appliance-x.x.x.x-x_OVF10.ovf

Monitor Performance of vSAN File Service


You can monitor the performance of NFS and SMB file shares.
Ensure that vSAN Performance Service is enabled. If you are using the vSAN Performance Service for the first time,
you see a message alerting you to enable it. For more information about vSAN Performance Service, see the vSAN
Monitoring and Troubleshooting Guide.
1. Navigate to the vSAN cluster and then click Monitor > vSAN > Performance.
2. Click the FILE SHARE tab.
3. Select one of the following options:
• Time Range:
– Select Last to select the number of hours for which you want to view the performance report.
– Select CUSTOM to select the date and time for which you want to view the performance report.
– Select SAVE to add the current setting as an option to the Time Range list.
• File share: Select the file share for which you want to generate and view the performance report.

4. Click SHOW RESULTS.

The throughput, IOPS, and latency metrics of the vSAN file service for the selected period are displayed.
For more information on vSAN Performance Graphs, see the VMware knowledge base article at https://kb.vmware.com/s/
article/2144493.


Monitor vSAN File Share Capacity


You can monitor the capacity for both native file shares and CNS-managed file shares.
1. Navigate to the vSAN cluster and then click Monitor > vSAN > Capacity.
2. Click the CAPACITY USAGE tab.
3. In the Usage breakdown before dedupe and compression section, expand User objects.

The file share capacity information is displayed.


For more information about monitoring vSAN capacity, see the vSAN Monitoring and Troubleshooting Guide.

Monitor vSAN File Service and File Share Health


You can monitor the health of both vSAN file service and file share objects.
View vSAN File Service Health

You can monitor the vSAN file service health.


Ensure that vSAN Performance Service is enabled.
1. Navigate to the vSAN cluster and then click Monitor > vSAN.
2. In the Skyline Health section, expand File Service.
3. Click the following file service health parameters to view the status.
• Infrastructure health: Displays the file service infrastructure health status per ESXi host. For more information, click the Info tab.
• File Server Health: Displays the file server health status. For more information, click the Info tab.
• Share health: Displays the file service share health. For more information, click the Info tab.

Monitor vSAN File Share Objects Health

You can monitor the health of file share objects.


To view the file share object health, navigate to the vSAN cluster and then click Monitor > vSAN > Virtual Objects.
The device information such as name, identifier or UUID, number of devices used for each virtual machine, and how they
are mirrored across hosts is displayed in the VIEW PLACEMENT DETAILS section.

Migrate a Hybrid vSAN Cluster to an All-Flash Cluster


You can migrate the disk groups in a hybrid vSAN cluster to all-flash disk groups.
• Ensure that all the vSAN policies that the cluster uses specify No preference for encryption services, space efficiency,
and storage tier.
• You must use RAID-1 (Mirroring) for Failures to tolerate until all the disk groups are converted to all-flash.
These prerequisites are not applicable when you migrate a cluster from SSD to NVMe or NVMe to SSD.
The vSAN hybrid cluster uses magnetic disks for the capacity layer and flash devices for the cache layer. You can change
the configuration of the disk groups in the cluster so that it uses flash devices on the cache layer and the capacity layer.


NOTE
Follow the steps to migrate a hybrid vSAN cluster to Solid State Drive (SSD), hybrid vSAN cluster to NVMe, or
SSD to NVMe.
1. Remove the hybrid disk groups on the host.
a) In the vSphere Client, navigate to the vSAN cluster, and click the Configure tab.
b) Under vSAN, click Disk Management.
c) Under Disk Groups, select the disk group to remove, click …, and then click Remove.
Select Full data migration as a migration mode and click Yes.
NOTE
Migrate the disk groups on each host in the vSAN cluster.
2. Remove the physical HDD disks from the host.
3. Add the flash devices to the host.
Verify that no partitions exist on the flash devices.
4. Create the all-flash disk groups on the host.
5. Repeat the steps 1 through 4 on each host until all the hybrid disk groups are converted to the all-flash disk groups.
NOTE
If you cannot hot-plug disks on the host, place the host in maintenance mode before removing disks in the
vSphere Client. Shut down the host to replace the disks with flash devices. Then power on the host, exit
maintenance mode, and create new disk groups.

Shutting Down and Restarting the vSAN Cluster


You can shut down the entire vSAN cluster to perform maintenance or troubleshooting.
Use the Shutdown Cluster wizard to shut down the vSAN cluster. The wizard performs the necessary steps and alerts you when it requires user action. You also can manually shut down the cluster, if necessary.
NOTE
When you shut down a vSAN stretched cluster, the witness host remains active.


Shut Down the vSAN Cluster Using the Shutdown Cluster Wizard
Use the Shutdown cluster wizard to gracefully shut down the vSAN cluster for maintenance or troubleshooting.
The Shutdown Cluster Wizard is available with vSAN 7.0 Update 3 and later releases.
NOTE
If you have a vSphere with Tanzu environment, you must follow the specified order when shutting down or
starting up the components. For more information, see "Shutdown and Startup of VMware Cloud Foundation" in
the VMware Cloud Foundation Operations Guide.
1. Prepare the vSAN cluster for shutdown.
a) Check the vSAN Skyline Health to confirm that the cluster is healthy.
b) Power off all virtual machines (VMs) stored in the vSAN cluster, except for vCenter Server VMs, vCLS VMs and
file service VMs. If vCenter Server is hosted on the vSAN cluster, do not power off the vCenter Server VM or VM
service VMs (such as DNS, Active Directory) used by vCenter Server.
c) If this is an HCI Mesh server cluster, power off all client VMs stored on the cluster. If the client cluster's vCenter
Server VM is stored on this cluster, either migrate or power off the VM. Once this server cluster is shut down, its shared datastore is inaccessible to clients.
d) Verify that all resynchronization tasks are complete.
Click the Monitor tab and select vSAN > Resyncing Objects.
NOTE
If any member hosts are in lockdown mode, add the host's root account to the security profile Exception User
list. For more information, see Lockdown Mode in vSphere Security.
2. Right-click the vSAN cluster in the vSphere Client, and select menu Shutdown cluster.
You also can click Shutdown Cluster on the vSAN Services page.
3. On the Shutdown cluster wizard, verify that the Shutdown pre-checks are green checks. Resolve any issues that are
red exclamations. Click Next.
If vCenter Server appliance is deployed on the vSAN cluster, the Shutdown cluster wizard displays the vCenter Server
notice. Note the IP address of the orchestration host, in case you need it during the cluster restart. If vCenter Server
uses service VMs such as DNS or Active Directory, note them as exceptional VMs in the Shutdown cluster wizard.
4. Enter a reason for performing the shutdown, and click Shutdown.
The vSAN Services page changes to display information about the shutdown process.
5. Monitor the shutdown process.
vSAN performs the steps to shut down the cluster, powers off the system VMs, and powers off the hosts.
Restart the vSAN cluster. See Restart the vSAN Cluster.

Restart the vSAN Cluster


You can restart a vSAN cluster that is shut down for maintenance or troubleshooting.

1. Power on the cluster hosts.


If the vCenter Server is hosted on the vSAN cluster, wait for vCenter Server to restart.
2. Right-click the vSAN cluster in the vSphere Client, and select menu Restart cluster.
You also can click Restart Cluster on the vSAN Services page.
3. On the Restart Cluster dialog, click Restart.
The vSAN Services page changes to display information about the restart process.


4. After the cluster has restarted, check the vSAN Skyline Health and resolve any outstanding issues.

Manually Shut Down and Restart the vSAN Cluster


You can manually shut down the entire vSAN cluster to perform maintenance or troubleshooting.
Use the Shutdown Cluster wizard unless your workflow requires a manual shut down. When you manually shut down the
vSAN cluster, do not deactivate vSAN on the cluster.
NOTE
If you have a vSphere with Tanzu environment, you must follow the specified order when shutting down or
starting up the components. For more information, see "Shutdown and Startup of VMware Cloud Foundation" in
the VMware Cloud Foundation Operations Guide.
1. Shut down the vSAN cluster.
a) Check the vSAN Skyline Health to confirm that the cluster is healthy.
b) Power off all virtual machines (VMs) running in the vSAN cluster, if vCenter Server is not hosted on the cluster. If
vCenter Server is hosted in the vSAN cluster, do not power off the vCenter Server VM or service VMs (such as
DNS, Active Directory) used by vCenter Server.
c) If vSAN file service is enabled in the vSAN cluster, you must deactivate the file service. Deactivating the vSAN file
service removes the empty file service domain. If you want to retain the empty file service domain after restarting
the vSAN cluster, you must create an NFS or SMB file share before deactivating the vSAN file service.
d) Click the Configure tab and turn off HA. As a result, the cluster does not register host shutdowns as failures.
For vSphere 7.0 U1 and later, enable vCLS retreat mode. For more information, see the VMware knowledge base
article at https://kb.vmware.com/s/article/80472.
e) Verify that all resynchronization tasks are complete.
Click the Monitor tab and select vSAN > Resyncing Objects.
f) If vCenter Server is hosted on the vSAN cluster, power off the vCenter Server VM.
Make a note of the host that runs the vCenter Server VM. It is the host where you must restart the vCenter Server
VM.
g) Deactivate cluster member updates from vCenter Server by running the following command on the ESXi hosts in
the cluster. Ensure that you run the following command on all the hosts.
esxcfg-advcfg -s 1 /VSAN/IgnoreClusterMemberListUpdates
h) Log in to any host in the cluster other than the witness host.
i) Run the following command only on that host. If you run the command on multiple hosts concurrently, it may cause
a race condition causing unexpected results.
python /usr/lib/vmware/vsan/bin/reboot_helper.py prepare

The command returns and prints the following:


Cluster preparation is done.
NOTE
• The cluster is fully partitioned after the successful completion of the command.
• If you encounter an error, resolve the issue based on the error message and try enabling vCLS retreat
mode again.
• If there are unhealthy or disconnected hosts in the cluster, remove the hosts and retry the command.


j) Place all the hosts into maintenance mode with No Action. If the vCenter Server is powered off, use the following
command to place the ESXi hosts into maintenance mode with No Action.
esxcli system maintenanceMode set -e true -m noAction

Perform this step on all the hosts.


To avoid the risk of data unavailability while using No Action at the same time on multiple hosts, followed by a
reboot of multiple hosts, see the VMware knowledge base article at https://kb.vmware.com/s/article/60424. To
perform simultaneous reboot of all hosts in the cluster using a built-in tool, see the VMware knowledge base article
at https://kb.vmware.com/s/article/70650.
k) After all hosts have successfully entered maintenance mode, perform any necessary maintenance tasks and
power off the hosts.
2. Restart the vSAN cluster.
a) Power on the ESXi hosts.
Power on the physical box where ESXi is installed. The ESXi host starts, locates the VMs, and functions normally.
If any hosts fail to restart, you must manually recover the hosts or move the bad hosts out of the vSAN cluster.
b) When all the hosts are back after powering on, exit all hosts from maintenance mode. If the vCenter Server is
powered off, use the following command on the ESXi hosts to exit maintenance mode.
esxcli system maintenanceMode set -e false

Perform this step on all the hosts.


c) Log in to one of the hosts in the cluster other than the witness host.
d) Run the following command only on that host. If you run the command on multiple hosts concurrently, it may cause
a race condition causing unexpected results.
python /usr/lib/vmware/vsan/bin/reboot_helper.py recover

The command returns and prints the following:


Cluster reboot/power-on is completed successfully!
e) Verify that all the hosts are available in the cluster by running the following command on each host.
esxcli vsan cluster get
f) Enable cluster member updates from vCenter Server by running the following command on the ESXi hosts in the
cluster. Ensure that you run the following command on all the hosts.
esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates
g) Restart the vCenter Server VM if it is powered off. Wait for the vCenter Server VM to be powered up and
running. To deactivate vCLS retreat mode, see the VMware knowledge base article at https://kb.vmware.com/s/
article/80472.
h) Verify again that all the hosts are participating in the vSAN cluster by running the following command on each host.
esxcli vsan cluster get
i) Restart the remaining VMs through vCenter Server.
j) Check the vSAN Skyline Health and resolve any outstanding issues.
k) (Optional) Enable vSAN file service.
l) (Optional) If the vSAN cluster has vSphere Availability enabled, you must manually restart vSphere Availability to
avoid the following error: Cannot find vSphere HA master agent.
To manually restart vSphere Availability, select the vSAN cluster and navigate to:
1. Configure > Services > vSphere Availability > EDIT > Disable vSphere HA
2. Configure > Services > vSphere Availability > EDIT > Enable vSphere HA


3. If there are unhealthy or disconnected hosts in the cluster, recover or remove the hosts from the vSAN cluster. If
vCenter Server uses service VMs such as DNS or Active Directory, note them as exceptional VMs in the Shutdown
cluster wizard.
Retry the above commands only after the vSAN Skyline Health shows all available hosts in the green state.
If you have a three-node vSAN cluster, the reboot_helper.py recover command does not work when one host has failed. As an administrator, do the following:
1. Temporarily remove the failure host information from the unicast agent list.
2. Add the host after running the following command.
reboot_helper.py recover

Following are the commands to remove and add the host to a vSAN cluster:
#esxcli vsan cluster unicastagent remove -a <IP Address> -t node -u <NodeUuid>
#esxcli vsan cluster unicastagent add -t node -u <NodeUuid> -U true -a <IP Address> -p 12321

Restart the vSAN cluster. See Restart the vSAN Cluster.

Device Management in a vSAN Cluster


You can perform various device management tasks in a vSAN cluster.
You can create hybrid or all-flash disk groups, enable vSAN to claim devices for capacity and cache, turn LED indicators
on or off, mark devices as flash, mark remote devices as local, and so on.
NOTE
Marking devices as flash and marking remote devices as local are not supported in a vSAN Express Storage
Architecture cluster.

Managing Storage Devices in vSAN Cluster


When you configure vSAN on a cluster, claim storage devices on each host to create the vSAN datastore.
The vSAN cluster initially contains a single vSAN datastore. As you claim disks for disk groups or storage pool on each
host, the size of the datastore increases according to the amount of physical capacity added by those devices.
vSAN has a uniform workflow for claiming disks across all scenarios. You can list all available disks by model and size, or
by host.
Add a Disk Group (vSAN Original Storage Architecture)
When you add a disk group, you must specify the host and the devices to claim. Each disk group contains one flash cache
device and one or more capacity devices. You can create multiple disk groups on each host, and claim a cache device for
each disk group.
When adding a disk group, consider the ratio of flash cache to consumed capacity. The ratio depends on the requirements and workload of the cluster. For a hybrid cluster, consider a flash cache to consumed capacity ratio of at least 10 percent (not including replicas such as mirrors).
NOTE
If a new ESXi host is added to the vSAN cluster, the local storage from that host is not added to the vSAN
datastore automatically. You must add a disk group to use the storage from the new host.
Add a Storage Pool (vSAN Express Storage Architecture)
Each host that contributes storage contains a single storage pool of flash devices. Each flash device provides caching and
capacity to the cluster. You can add a storage pool using any compatible devices. vSAN creates only one storage pool per
host, irrespective of the number of storage disks the host is attached to.
Claim Disks for vSAN Direct
Use vSAN Direct to enable stateful services to access raw, non-vSAN local storage through a direct path.


You can claim host-local devices for vSAN Direct, and use vSAN to manage and monitor those devices. On each local
device, vSAN Direct creates an independent VMFS datastore and makes it available to your stateful application.
Each local vSAN Direct datastore appears as a vSAN-D datastore.
NOTE
If vSAN Express Storage Architecture is enabled for the cluster, you cannot claim disks for vSAN Direct.

Create a Disk Group or Storage Pool in vSAN Cluster


Depending on the storage architecture you use in your cluster, you can decide to create a disk group or storage pool.

Create a Disk Group on a Host (vSAN Original Storage Architecture)


You can claim cache and capacity devices to define disk groups on a vSAN host. Select one cache device and one or
more capacity devices to create the disk group.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management, select a host from the table, and click VIEW DISKS.
4. Click CREATE DISK GROUP.
5. Select disks to claim.
a. Select the flash device to use for the cache tier.
b. Select the disks to use for the capacity tier.
6. Click Create to confirm your selections.
The new disk group appears in the list.

Create a Storage Pool on a Host (vSAN Express Storage Architecture)


You can claim disks to define a storage pool on a vSAN host. Each host that contributes storage contains a single storage
pool of flash devices. Each flash device provides caching and capacity to the cluster. You can create a storage pool with
any devices that are compatible for ESA. vSAN creates only one storage pool per host.
In a storage pool, each device provides both caching and capacity in a single tier. This is different from a Disk Group,
which has dedicated devices in different tiers of cache and capacity.
Use vSAN Managed Disk Claim to automatically claim all compatible disks on the cluster hosts. When you add new hosts,
vSAN will also claim compatible disks on those hosts. Any disks added manually are not affected by this setting. You can
manually add such disks to the storage pool.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Click Claim Unused Disks.
NOTE
You can change the disk claim mode to use vSAN Managed Disk Claim. vSAN will automatically claim all
compatible devices on cluster hosts.
5. Group by host.
6. Select compatible disks to claim.


7. Click Create to confirm your selections.


NOTE
The Disk Management page appears with the hosts listed. The "Disks in use" column reflects the updated number of disks claimed on each host. To see the claimed disks for a host, click the "View disks" button.

Claim Storage Devices for vSAN Original Storage Architecture Cluster


You can select a group of cache and capacity devices, and vSAN organizes them into default disk groups.
In this method, you select devices to create disk groups for the vSAN cluster. You need one cache device and at least one
capacity device for each disk group.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Click Claim Unused Disks.
5. Select devices to add to disk groups.
• For hybrid disk groups, each host that contributes storage must contribute one flash cache device and one or more HDD capacity devices. You can add only one cache device per disk group.
• Select a flash device to be used as cache and click Claim for cache tier.
• Select one or more HDD devices to be used as capacity and click Claim for capacity tier for each of them.
• Click Create or OK.
• For all-flash disk groups, each host that contributes storage must contribute one flash cache device and one or more flash capacity devices. You can add only one cache device per disk group.
• Select a flash device to be used as cache and click Claim for cache tier.
• Select one or more flash devices to be used for capacity and click Claim for capacity tier for each of them.
• Click Create or OK.
vSAN claims the devices that you selected and organizes them into default disk groups that contribute the vSAN
datastore.
To verify the role of each device added to the all-flash disk group, navigate to the "Claimed as" column for a given host
on the Disk Management page. The table shows the list of devices and their purpose in a disk group. For all-flash and
hybrid disk groups, the cache disk is always shown first in the disk group grid.

Claim Storage Devices for vSAN Express Storage Architecture Cluster


You can select a group of devices from a host, and vSAN organizes them into a storage pool.
After vSAN ESA is enabled, you can claim disks either manually or automatically. In the manual method, you can select a
group of storage devices to be claimed.
In automatic disk claim, vSAN automatically selects all compatible disks from the hosts. When new hosts are added to
the cluster, vSAN automatically claims the compatible disks available in those hosts and adds the storage to the vSAN
datastore.


You can choose devices that are not reported as certified for vSAN ESA and those devices will be considered in the storage pool, but such a configuration is not recommended and can impact performance.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. To manually claim disks, click Claim Unused Disks.
a) Select the devices that you want to claim.
b) Click Create.
5. To automatically claim disks, click CHANGE DISK CLAIM MODE and click the vSAN managed disk claim toggle
button.
NOTE
If you chose to use vSAN managed disk claiming when configuring the cluster, the toggle button would be
already enabled.
vSAN claims the devices that you selected and organizes them into storage pools that support the vSAN datastore.
By default, vSAN creates one storage pool for each ESXi host that contributes storage to the cluster. If the selected
devices are not certified for vSAN ESA, those devices are not considered for creating storage pools.

Claim Disks for vSAN Direct


You can claim local storage devices as vSAN Direct for use with the vSAN Data Persistence Platform.
NOTE
Only the vSAN Data Persistence platform can consume vSAN Direct storage. The vSAN Data Persistence
platform provides a framework for software technology partners to integrate with VMware infrastructure.
Each partner must develop their own plug-in for VMware customers to receive the benefits of the vSAN Data
Persistence platform. The platform is not operational until the partner solution running on top is operational. For
more information, see vSphere with Tanzu Configuration and Management.
1. In the vSphere Client, navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Click Claim Unused Disks.
5. On the Claim Unused Disks dialog, select the vSAN Direct tab.
6. Select a device to claim by selecting the checkbox in the Claim for vSAN Direct column.
NOTE
Devices claimed for your vSAN cluster do not appear in the vSAN Direct tab.
7. Click Create.

For each device you claim, vSAN creates a new vSAN Direct datastore.
You can click the Datastores tab to display the vSAN Direct datastores in your cluster.

Working with Individual Devices in vSAN Cluster


You can perform various device management tasks in the vSAN cluster.
You can add devices to a disk group, remove devices from a disk group, enable or disable locator LEDs, and mark
devices. You can also add or remove disks that are claimed using the vSAN Direct.


Add Devices to the Disk Group in vSAN Cluster


When you configure vSAN to claim disks in manual mode, you can add additional local devices to existing disk groups.
The devices must be the same type as the existing devices in the disk groups, such as SSD or magnetic disks.
1. Navigate to the cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Select the disk group, and click Add Disks.
5. Select the device that you want to add and click Add.
If you add a used device that contains residual data or partition information, you must first clean the device. For
information about removing partition information from devices, see Remove Partition From Devices. You can also run
the host_wipe_vsan_disks RVC command to format the device.
Verify that the vSAN Disk Balance health check is green. If the Disk Balance health check issues a warning, perform
automatic rebalance operation during off-peak hours. For more information, see "Configure Automatic Rebalance in vSAN
Cluster" in vSAN Monitoring and Troubleshooting.

Check a Disk or Disk Group's Data Migration Capabilities from vSAN Cluster
Use the data migration pre-check to find the impact of migration options when unmounting a disk or disk group, or
removing it from the vSAN cluster.
Run the data migration pre-check before you unmount or remove a disk or disk group from the vSAN cluster. The test
results provide information to help you determine the impact to cluster capacity, predicted health checks, and any objects
that will go out of compliance. If the operation will not succeed, pre-check provides information about what resources are
needed.
1. Navigate to the vSAN cluster.
2. Click the Monitor tab.
3. Under vSAN, click Data Migration Pre-check.
4. Select a disk or disk group, choose a data migration option, and click Pre-check.
vSAN runs the data migration precheck tests.
5. View the test results.
The pre-check results show whether you can safely unmount or remove the disk or disk group.
• The Object Compliance and Accessibility tab displays objects that might have issues after the data migration.
• The Cluster Capacity tab displays the impact of data migration on the vSAN cluster before and after you perform
the operation.
• The Predicted Health tab displays the health checks that might be affected by the data migration.
If the pre-check indicates that you can unmount or remove the device, click the option to continue the operation.

Remove Disk Groups or Devices from vSAN


You can remove selected devices from a disk group, or you can remove an entire disk group from a vSAN OSA cluster.
Run data migration pre-check on the device or disk group before you remove it from the cluster. For more information, see Check a Disk or Disk Group's Data Migration Capabilities from vSAN Cluster.
Because removing unprotected devices might be disruptive for the vSAN datastore and virtual machines in the datastore, avoid removing devices or disk groups.
Typically, you delete devices or disk groups from vSAN when you are upgrading a device or replacing a failed device, or
when you must remove a cache device. Other vSphere storage features can use any flash-based device that you remove
from the vSAN cluster.
Deleting a disk group permanently deletes the disk membership and the data stored on the devices.
NOTE
Removing one flash cache device or all capacity devices from a disk group removes the entire disk group.
NOTE
If the cluster uses deduplication and compression, you cannot remove a single disk from the disk group. You
must remove the entire disk group.
Evacuating data from devices or disk groups might result in the temporary noncompliance of virtual machine storage
policies.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Remove a disk group or selected devices.
• Remove the Disk Group:
1. Under Disk Groups, select the disk group to remove, click …, and then click Remove.
2. Select a data evacuation mode.
• Remove the Selected Device:
1. Under Disk Groups, select the disk group that contains the device that you are removing.
2. Under Disks, select the device to remove, and click Remove Disk(s).
3. Select a data evacuation mode.

5. Click Yes or Remove to confirm.


The data is evacuated from the selected devices or disk group.

Recreate a Disk Group in vSAN Cluster


When you recreate a disk group in the vSAN cluster, the existing disks are removed from the disk group, and the disk
group is deleted.
vSAN recreates the disk group with the same disks. When you recreate a disk group on a vSAN cluster, vSAN manages
the process for you. vSAN evacuates data from all disks in the disk group, removes the disk group, and creates the disk
group with the same disks.
1. Navigate to the vSAN cluster in the vSphere Client.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Under Disk Groups, select the disk group to recreate.
5. Click …, and then click Recreate.
The Recreate Disk Group dialog box appears.
6. Select a data migration mode, and click Recreate.

All data residing on the disks is evacuated. The disk group is removed from the cluster, and recreated.

Using Locator LEDs in vSAN


You can use locator LEDs to identify the location of storage devices.
vSAN can light the locator LED on a failed device so that you can easily identify the device. This is particularly useful
when you are working with multiple hot plug and host swap scenarios.
Consider using I/O storage controllers with pass-through mode, because controllers with RAID 0 mode require additional
steps to enable the controllers to recognize locator LEDs.
For information about configuring storage controllers with RAID 0 mode, see your vendor documentation.

Locator LEDs

You can turn locator LEDs on vSAN storage devices on or off. When you turn on the locator LED, you can identify the
location of a specific storage device.
• Verify that you have installed the supported drivers for storage I/O controllers that enable this feature. For information
about the drivers that are certified by VMware, see the VMware Compatibility Guide at http://www.vmware.com/
resources/compatibility/search.php.
• In some cases, you might need to use third-party utilities to configure the Locator LED feature on your storage I/O
controllers. For example, when you are using HP you should verify that the HP SSA CLI is installed.


For information about installing third-party VIBs, see the vSphere Upgrade documentation.
When you no longer need a visual alert on your vSAN devices, you can turn off locator LEDs on the selected devices.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Select a host to view the list of devices.
5. At the bottom of the page, select one or more storage devices from the list, and perform the desired action for the
locator LEDs.
Option Action
Turn on LED Turns on locator LED on the selected storage device. You also
can use the Manage tab and click Storage> Storage Devices.
Turn off LED Turns off locator LED on the selected storage device. You also
can use the Manage tab and click Storage> Storage Devices.

Mark Devices as Flash in vSAN


When flash devices are not automatically identified as flash by ESXi hosts, you can manually mark them as local flash
devices.
• Verify that the device is local to your host.
• Verify that the device is not in use.
• Make sure that the virtual machines accessing the device are powered off and the datastore is unmounted.
Flash devices might not be recognized as flash when they are enabled for RAID 0 mode rather than passthrough mode.
When devices are not recognized as local flash, they are excluded from the list of devices offered for vSAN and you
cannot use them in the vSAN cluster. Marking these devices as local flash makes them available to vSAN.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Select the host to view the list of available devices.
5. From the Show drop-down menu at the bottom of the page, select Not in Use.
6. Select one or more flash devices from the list and click Mark as Flash Disk.
7. Click Yes to save your changes.
The Drive type for the selected devices appears as Flash.


Mark Devices as HDD in vSAN


When local magnetic disks are not automatically identified as HDD devices by ESXi hosts, you can manually mark them
as local HDD devices.
• Verify that the magnetic disk is local to your host.
• Verify that the magnetic disk is not in use and is empty.
• Verify that the virtual machines accessing the device are powered off.
If you marked a magnetic disk as a flash device, you can change the disk type of the device by marking it as a magnetic
disk.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Select the host to view the list of available magnetic disks.
5. From the Show drop-down menu at the bottom of the page, select Not in Use.
6. Select one or more magnetic disks from the list and click Mark as HDD Disk.
7. Click Yes to save.
The Drive Type for the selected magnetic disks appears as HDD.

Mark Devices as Local in vSAN


When hosts are using external SAS enclosures, vSAN might recognize certain devices as remote and might be unable to
automatically claim them as local.


Make sure that the storage device is not shared.


In such cases, you can mark the devices as local.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Select a host to view the list of devices.
5. From the Show drop-down menu at the bottom of the page, select Not in Use.
6. From the list of devices, select one or more remote devices that you want to mark as local and click Mark as local disk.
7. Click Yes to save your changes.

Mark Devices as Remote in vSAN


Hosts that use external SAS controllers can share devices.
You can manually mark those shared devices as remote, so that vSAN does not claim the devices when it creates disk
groups. In vSAN, you cannot add shared devices to a disk group.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Select a host to view the list of devices.
5. From the Show drop-down menu at the bottom of the page, select Not in Use.
6. Select one or more devices that you want to mark as remote and click Mark as remote.
7. Click Yes to confirm.

Add a Capacity Device to vSAN Disk Group


You can add a capacity device to an existing vSAN disk group.
Verify that the device is formatted and is not in use.
You cannot add a shared device to a disk group.
1. Navigate to the cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Select a disk group.
5. Click Add Disks at the bottom of the page.
6. Select the capacity device that you want to add to the disk group.
7. Click OK or Add.
The device is added to the disk group.


Remove Partition From Devices


You can remove partition information from a device so vSAN can claim the device for use.
Verify that the device is not in use by ESXi as boot disk, VMFS datastore, or vSAN.
If you have added a device that contains residual data or partition information, you must remove all preexisting partition
information from the device before you can claim it for vSAN use. VMware recommends adding clean devices to disk
groups.
When you remove partition information from a device, vSAN deletes the primary partition that includes disk format
information and logical partitions from the device.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
4. Select a host to view the list of available devices.
5. From the Show drop-down menu, select Ineligible.
6. Select a device from the list and click Erase partitions.
7. Click OK to confirm.
The device is clean and does not include any partition information.

Increasing Space Efficiency in a vSAN Cluster


You can use space efficiency techniques to reduce the amount of space for storing data.
These techniques reduce the total storage space required to meet your needs.

vSAN Space Efficiency Features


You can use space efficiency techniques to reduce the amount of space for storing data.
These techniques reduce the total storage capacity required to meet your needs. vSAN 6.7 Update 1 and later supports
SCSI unmap commands that enable you to reclaim storage space that is mapped to a deleted vSAN object.
You can use deduplication and compression on a vSAN cluster to eliminate duplicate data and reduce the amount
of space required to store data. Or you can use compression-only vSAN to reduce storage requirements without
compromising server performance.
You can set the Failure tolerance method policy attribute on VMs to use RAID 5 or RAID 6 erasure coding. Erasure
coding can protect your data while using less storage space than the default RAID 1 mirroring.
You can use deduplication and compression, and RAID 5 or RAID 6 erasure coding to increase storage space savings.
RAID 5 or RAID 6 each provide clearly defined space savings over RAID 1. Deduplication and compression can provide
additional savings.

Reclaiming Storage Space in vSAN with SCSI Unmap


SCSI UNMAP commands enable you to reclaim storage space that is mapped to deleted files in the file system created by
the guest on the vSAN object.
vSAN 6.7 Update 1 and later supports SCSI UNMAP. Deleting or removing files frees space within the file system. This
free space is mapped to a storage device until the file system releases or unmaps it. vSAN supports reclamation of free
space, which is also called the unmap operation. You can free storage space in the vSAN datastore when you delete or
migrate a VM, consolidate a snapshot, and so on.


Reclaiming storage space can provide a higher host-to-flash I/O throughput and improve the flash endurance.
Unmap capability is not enabled by default. Enable Guest Trim/Unmap on the vSAN Services Advanced options tab.
When you enable unmap on a vSAN cluster, you must power off and then power on all VMs. VMs must use virtual
hardware version 13 or above to perform unmap operations.
vSAN also supports the SCSI UNMAP commands issued directly from a guest operating system to reclaim storage space.
vSAN supports offline unmaps and inline unmaps. On Linux OS, offline unmaps are performed with the fstrim(8)
command, and inline unmaps are performed when the mount -o discard command is used. On Windows OS, NTFS
performs inline unmaps by default.
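A minimal sketch of both reclamation methods from a Linux guest, with a check of the corresponding Windows setting; the device and mount point names are illustrative:
# Offline unmap: trim free space on a mounted file system
[root@guest ~]# fstrim -v /mnt/data
# Inline unmap: mount the file system with the discard option
[root@guest ~]# mount -o discard /dev/sdb1 /mnt/data
# On Windows, verify that delete notification (inline unmap) is enabled; 0 means enabled
C:\> fsutil behavior query DisableDeleteNotify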

Using Deduplication and Compression in vSAN Cluster


vSAN can perform block-level deduplication and compression to save storage space.
When you enable deduplication and compression on a vSAN all-flash cluster, redundant data within each disk group
or storage pool is reduced. Deduplication removes redundant data blocks, whereas compression removes additional
redundant data within each data block. These techniques work together to reduce the amount of space required to store
the data. vSAN applies deduplication and then compression as it moves data from the cache tier to the capacity tier. Use
compression-only vSAN for workloads that do not benefit from deduplication, such as online transactional processing.
Deduplication occurs inline when data is written back from the cache tier to the capacity tier. The deduplication algorithm
uses a fixed block size and is applied within each disk group. Redundant copies of a block within the same disk group are
deduplicated.
For the vSAN Original Storage Architecture, deduplication and compression are enabled as a cluster-wide setting,
but they are applied on a disk group basis. Additionally, you cannot enable compression on specific workloads as the
settings cannot be changed through vSAN policies. When you enable deduplication and compression on a vSAN cluster,
redundant data within a particular disk group is reduced to a single copy.
NOTE
Compression-only vSAN is applied on a per-disk basis.
For the vSAN Express Storage Architecture, compression is enabled by default on the cluster. If you do not want to enable
compression on some of your virtual machine workloads, you can do so by creating a customized storage policy and
applying the policy to the virtual machines. Additionally, compression for vSAN Express Storage Architecture is only for
new writes. Old blocks are left uncompressed even after compression is turned on for an object.
You can enable deduplication and compression when you create a vSAN all-flash cluster or when you edit an existing
vSAN all-flash cluster. For more information, see Enable Deduplication and Compression on an Existing vSAN Cluster.
When you enable or disable deduplication and compression, vSAN performs a rolling reformat of every disk group or
storage pool on every host. Depending on the data stored on the vSAN datastore, this process might take a long time. Do
not perform these operations frequently. If you plan to disable deduplication and compression, you must first verify that
enough physical capacity is available to place your data.
NOTE
Deduplication and compression might not be effective for encrypted VMs, because VM Encryption encrypts data
on the host before it is written out to storage. Consider storage tradeoffs when using VM Encryption.

How to Manage Disks in a Cluster with Deduplication and Compression


NOTE
This topic is applicable only for vSAN Original Storage Architecture cluster.


Consider the following guidelines when managing disks in a cluster with deduplication and compression enabled. These
guidelines do not apply to compression-only vSAN.
• Avoid adding disks to a disk group incrementally. For more efficient deduplication and compression, consider adding a
disk group to increase the cluster storage capacity.
• When you add a disk group manually, add all the capacity disks at the same time.
• You cannot remove a single disk from a disk group. You must remove the entire disk group to make modifications.
• A single disk failure causes the entire disk group to fail.

Verifying Space Savings from Deduplication and Compression


The amount of storage reduction from deduplication and compression depends on many factors, including the type of
data stored and the number of duplicate blocks. Larger disk groups tend to provide a higher deduplication ratio. You can
check the results of deduplication and compression by viewing the Usage breakdown before dedup and compression in
the vSAN Capacity monitor.

You can view the Usage breakdown before dedup and compression when you monitor vSAN capacity in the vSphere
Client. It displays information about the results of deduplication and compression. The Used Before space indicates the
logical space required before applying deduplication and compression, while the Used After space indicates the physical
space used after applying deduplication and compression. The Used After space also displays an overview of the amount
of space saved, and the Deduplication and Compression ratio.
The Deduplication and Compression ratio is based on the logical (Used Before) space required to store data before
applying deduplication and compression, in relation to the physical (Used After) space required after applying
deduplication and compression. Specifically, the ratio is the Used Before space divided by the Used After space. For
example, if the Used Before space is 3 GB, but the physical Used After space is 1 GB, the deduplication and compression
ratio is 3x.
When deduplication and compression are enabled on the vSAN cluster, it might take several minutes for capacity updates
to be reflected in the Capacity monitor as disk space is reclaimed and reallocated.
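As a quick illustration of the arithmetic (using hypothetical numbers, not values taken from a real cluster), the ratio and savings can be computed as follows.

def dedup_compression_ratio(used_before_gb: float, used_after_gb: float) -> float:
    """Logical space required before data reduction divided by physical space used after."""
    return used_before_gb / used_after_gb

used_before = 3.0   # GB of logical space before deduplication and compression
used_after = 1.0    # GB of physical space after deduplication and compression
print(f"Ratio: {dedup_compression_ratio(used_before, used_after):.0f}x, "
      f"saved: {used_before - used_after:.0f} GB")   # Ratio: 3x, saved: 2 GB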


Deduplication and Compression Design Considerations in vSAN Cluster


Consider these guidelines when you configure deduplication and compression in a vSAN cluster.
• Deduplication and compression are available only on all-flash disk groups.
• On-disk format version 3.0 or later is required to support deduplication and compression.
• You must have a valid license to enable deduplication and compression on a cluster.
• When you enable deduplication and compression on a vSAN cluster, all disk groups participate in data reduction
through deduplication and compression.
• vSAN can eliminate duplicate data blocks within each disk group, but not across disk groups (applicable only for vSAN
Original Storage Architecture).
• Capacity overhead for deduplication and compression is approximately five percent of total raw capacity.
• Policies must have either 0 percent or 100 percent object space reservations. Policies with 100 percent object space
reservations are always honored, but can make deduplication and compression less efficient.

Enable Deduplication and Compression on a New vSAN Cluster


You can enable deduplication and compression when you configure a new vSAN all-flash cluster.
1. Navigate to a new all-flash vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.
a) Click EDIT under Data Services.
b) Select a space efficiency option: Deduplication and compression, or Compression only.
c) Under Encryption, enable data-at-rest encryption by using the toggle button.
NOTE
If you use a vSAN Express Storage Architecture cluster, you cannot change this setting after claiming disks.
d) (Optional) Select Allow Reduced Redundancy. If needed, vSAN reduces the protection level of your VMs while
enabling Deduplication and Compression. For more details, see Reduce VM Redundancy for vSAN Cluster.
4. Complete your cluster configuration.

Enable Deduplication and Compression on an Existing vSAN Cluster


You can enable deduplication and compression by editing configuration parameters on an existing all-flash vSAN cluster.
To enable on a vSAN Original Storage Architecture cluster:
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.
a. Click to edit Space Efficiency.
b. Select a space efficiency option: Deduplication and compression, or Compression only.
c. (Optional) Select Allow Reduced Redundancy. If needed, vSAN reduces the protection level of your VMs while
enabling Deduplication and Compression. For more details, see Reduce VM Redundancy for vSAN Cluster.
4. Click Apply to save your configuration changes.
To enable on a vSAN Express Storage Architecture cluster:
1. Navigate to the cluster.
2. Click the Configure tab.


3. Under vSAN, select Services.


4. Under Data Services, click EDIT.
a. Under Encryption, enable data-at-rest encryption by using the toggle button.
NOTE
You cannot change this setting after claiming disks.
b. Enable data-in-transit encryption by using the Data-In-Transit encryption toggle button, and specify the rekey
interval.
c. (Optional) Select Allow Reduced Redundancy. If needed, vSAN reduces the protection level of your VMs while
enabling Deduplication and Compression. For more details, see Reduce VM Redundancy for vSAN Cluster.
5. Click Apply to save your configuration changes.
While enabling deduplication and compression, vSAN updates the on-disk format of each disk group of the cluster. To
accomplish this change, vSAN evacuates data from the disk group, removes the disk group, and recreates it with a new
format that supports deduplication and compression.
The enablement operation does not require virtual machine migration or DRS. The time required for this operation
depends on the number of hosts in the cluster and amount of data. You can monitor the progress on the Tasks and
Events tab.

Disable Deduplication and Compression on vSAN Cluster


You can disable deduplication and compression on your vSAN cluster.
When deduplication and compression are disabled on the vSAN cluster, the size of the used capacity in the cluster can
expand (based on the deduplication ratio). Before you disable deduplication and compression, verify that the cluster has
enough capacity to handle the size of the expanded data.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
a) Under vSAN, select Services.
b) Click Edit.
c) Disable Deduplication and Compression.
d) (Optional) Select Allow Reduced Redundancy. If needed, vSAN reduces the protection level of your VMs, while
disabling Deduplication and Compression. See Reduce VM Redundancy for vSAN Cluster.
3. Click Apply or OK to save your configuration changes.

While disabling deduplication and compression, vSAN changes the disk format on each disk group of the cluster.
It evacuates data from the disk group, removes the disk group, and recreates it with a format that does not support
deduplication and compression.
The time required for this operation depends on the number of hosts in the cluster and amount of data. You can monitor
the progress on the Tasks and Events tab.

Reduce VM Redundancy for vSAN Cluster


When you enable deduplication and compression, in certain cases, you might need to reduce the level of protection for
your virtual machines.
Enabling deduplication and compression requires a format change for disk groups. To accomplish this change,
vSAN evacuates data from the disk group, removes the disk group, and recreates it with a new format that supports
deduplication and compression.
In certain environments, your vSAN cluster might not have enough resources for the disk group to be fully evacuated.
Examples of such deployments include a three-node cluster with no resources to evacuate the replica or witness while maintaining full protection, or a four-node cluster with RAID-5 objects already deployed. In the latter case, you have no place to move part of the RAID-5 stripe, because RAID-5 objects require a minimum of four nodes.
You can still enable deduplication and compression and use the Allow Reduced Redundancy option. This option keeps
the VMs running, but the VMs might be unable to tolerate the full level of failures defined in the VM storage policy. As a
result, temporarily during the format change for deduplication and compression, your virtual machines might be at risk of
experiencing data loss. vSAN restores full compliance and redundancy after the format conversion is completed.

Add or Remove Disks with Deduplication and Compression Enabled


When you add disks to a vSAN cluster with enabled deduplication and compression, specific considerations apply.
• You can add a capacity disk to a disk group with enabled deduplication and compression. However, for more efficient
deduplication and compression, instead of adding capacity disks, create a new disk group to increase cluster storage
capacity.
• When you remove a disk from a cache tier, the entire disk group is removed. Removing a cache tier disk when
deduplication and compression are enabled triggers data evacuation.
• Deduplication and compression are implemented at a disk group level. You cannot remove a capacity disk from the
cluster with enabled deduplication and compression. You must remove the entire disk group.
• If a capacity disk fails, the entire disk group becomes unavailable. To resolve this issue, identify and replace the failing
component immediately. When removing the failed disk group, use the No Data Migration option.

Using RAID 5 or RAID 6 Erasure Coding in vSAN Cluster


You can use RAID 5 or RAID 6 erasure coding to protect against data loss and increase storage efficiency.
Erasure coding can provide the same level of data protection as mirroring (RAID 1), while using less storage capacity.
RAID 5 or RAID 6 erasure coding enables vSAN to tolerate the failure of up to two capacity devices in the datastore. You
can configure RAID 5 on all-flash clusters with four or more fault domains. You can configure RAID 5 or RAID 6 on all-
flash clusters with six or more fault domains.
RAID 5 or RAID 6 erasure coding requires less additional capacity to protect your data than RAID 1 mirroring. For
example, a VM protected by a Failures to tolerate value of 1 with RAID 1 requires twice the virtual disk size, but with
RAID 5 it requires 1.33 times the virtual disk size. The following table shows a general comparison between RAID 1 and
RAID 5 or RAID 6.

Table 28: Capacity Required to Store and Protect Data at Different RAID Levels

RAID Configuration                                           Failures to Tolerate   Data Size   Capacity Required
RAID 1 (mirroring)                                           1                      100 GB      200 GB
RAID 5 or RAID 6 (erasure coding) with four fault domains    1                      100 GB      133 GB
RAID 1 (mirroring)                                           2                      100 GB      300 GB
RAID 5 or RAID 6 (erasure coding) with six fault domains     2                      100 GB      150 GB
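The capacity figures in Table 28 follow directly from the protection scheme: RAID 1 stores Failures to tolerate + 1 full copies of the data, RAID 5 adds one parity segment to a 3+1 stripe, and RAID 6 adds two parity segments to a 4+2 stripe. The following sketch reproduces that arithmetic; it assumes the 3+1 and 4+2 stripe widths used by the vSAN Original Storage Architecture, which are the basis of the table.

def capacity_required_gb(data_gb: float, raid: str, ftt: int) -> float:
    """Approximate raw capacity needed to protect data_gb at a given RAID level."""
    if raid == "RAID1":
        return data_gb * (ftt + 1)        # ftt + 1 full replicas
    if raid == "RAID5/6" and ftt == 1:
        return data_gb * 4 / 3            # RAID 5: 3 data + 1 parity
    if raid == "RAID5/6" and ftt == 2:
        return data_gb * 6 / 4            # RAID 6: 4 data + 2 parity
    raise ValueError("unsupported RAID level / failures to tolerate combination")

for raid, ftt in [("RAID1", 1), ("RAID5/6", 1), ("RAID1", 2), ("RAID5/6", 2)]:
    print(raid, "FTT", ftt, "->", round(capacity_required_gb(100, raid, ftt)), "GB")
# RAID1 FTT 1 -> 200 GB, RAID5/6 FTT 1 -> 133 GB, RAID1 FTT 2 -> 300 GB, RAID5/6 FTT 2 -> 150 GB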

RAID 5 or RAID 6 erasure coding is a policy attribute that you can apply to virtual machine components. To use RAID 5,
set Failure tolerance method to RAID-5/6 (Erasure Coding) and Failures to tolerate to 1. To use RAID 6, set Failure
tolerance method to RAID-5/6 (Erasure Coding) and Failures to tolerate to 2. RAID 5 or RAID 6 erasure coding does
not support a Failures to tolerate value of 3.


To use RAID 1, set Failure tolerance method to RAID-1 (Mirroring). RAID 1 mirroring requires fewer I/O operations to
the storage devices, so it can provide better performance. For example, a cluster resynchronization takes less time to
complete with RAID 1.
NOTE
In a vSAN stretched cluster, the Failure tolerance method of RAID-5/6 (Erasure Coding) applies only to the
Site disaster tolerance setting.
NOTE
For a vSAN Express Storage Architecture cluster, depending on the number of fault domains that you use, the
number of components listed under RAID 5 (Monitor > vSAN > Virtual Objects > testVM > View Placement Details) will vary. If six or more fault domains are available in the cluster, then five components will be listed
under RAID 5. If five or fewer fault domains are available, then three components will be listed.
For more information about configuring policies, see Using vSAN Policies.
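For illustration, the component counts described in the note above can be expressed as a small lookup. This is only a restatement of the note (five components corresponds to a 4+1 stripe, three components to a 2+1 stripe); the exact layout is chosen by vSAN itself.

def esa_raid5_components(fault_domains: int) -> int:
    """Number of RAID 5 components per the note above: five with six or more
    fault domains in a vSAN Express Storage Architecture cluster, otherwise three."""
    return 5 if fault_domains >= 6 else 3

print(esa_raid5_components(6))   # 5
print(esa_raid5_components(4))   # 3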

RAID 5 or RAID 6 Design Considerations in vSAN Cluster


Consider these guidelines when you configure RAID 5 or RAID 6 erasure coding in a vSAN cluster.
• RAID 5 or RAID 6 erasure coding is available only on all-flash disk groups.
• On-disk format version 3.0 or later is required to support RAID 5 or RAID 6.
• You must have a valid license to enable RAID 5/6 on a cluster.
• You can achieve additional space savings by enabling deduplication and compression on the vSAN cluster.

Using Encryption in a vSAN Cluster


You can encrypt data-in transit in your vSAN cluster, and encrypt data-at-rest in your vSAN datastore.
vSAN can encrypt data in transit across hosts in the vSAN cluster. Data-in-transit encryption protects data as it moves
around the vSAN cluster.
vSAN can encrypt data at rest in the vSAN datastore. Data-at-rest encryption protects data on storage devices, in case a
device is removed from the cluster.

vSAN Data-In-Transit Encryption


vSAN can encrypt data in transit, as it moves across hosts in your vSAN cluster.
vSAN can encrypt data in transit across hosts in the cluster. When you enable data-in-transit encryption, vSAN encrypts
all data and metadata traffic between hosts.
vSAN data-in-transit encryption has the following characteristics:
• vSAN uses AES-256 bit encryption on data in transit.
• vSAN data-in-transit encryption is not related to data-at-rest-encryption. You can enable or disable each one
separately.
• Forward secrecy is enforced for vSAN data-in-transit encryption.
• Traffic between data hosts and witness hosts is encrypted.
• File service data traffic between the VDFS proxy and VDFS server is encrypted.
• vSAN file services inter-host connections are encrypted.
vSAN uses symmetric keys that are generated dynamically and shared between hosts. Hosts dynamically generate an
encryption key when they establish a connection, and they use the key to encrypt all traffic between the hosts. You do not
need a key management server to perform data-in-transit encryption.
Each host is authenticated when it joins the cluster, ensuring that connections are allowed only to trusted hosts. When a host is removed from the cluster, its authentication certificate is removed.


vSAN data-in-transit encryption is a cluster-wide setting. When enabled, all data and metadata traffic is encrypted as it
transits across hosts.

Enable Data-In-Transit Encryption on a vSAN Cluster


You can enable data-in-transit encryption by editing the configuration parameters of a vSAN cluster.

1. Navigate to an existing cluster.


2. Click the Configure tab.
3. Under vSAN, select Services and click the Data-In-Transit Encryption Edit button.
4. Click to enable Data-In-Transit encryption, and select a rekey interval.
5. Click Apply.

Encryption of data in transit is enabled on the vSAN cluster. vSAN encrypts all data moving across hosts and file service
inter-host connections in the cluster.
vSAN Data-At-Rest Encryption
vSAN can encrypt data at rest in your vSAN datastore.
When you enable data-at-rest encryption, vSAN encrypts data after all other processing, such as deduplication, is performed. Data-at-rest encryption protects data on storage devices, in case a device is removed from the cluster.
Using encryption on your vSAN datastore requires some preparation. After your environment is set up, you can enable
data-at-rest encryption on your vSAN cluster.
Data-at-rest encryption requires an external Key Management Server (KMS) or a vSphere Native Key Provider. For more
information about vSphere encryption, see vSphere Security.
You can use an external Key Management Server (KMS), the vCenter Server system, and your ESXi hosts to encrypt
data in your vSAN cluster. vCenter Server requests encryption keys from an external KMS. The KMS generates and
stores the keys, and vCenter Server obtains the key IDs from the KMS and distributes them to the ESXi hosts.
vCenter Server does not store the KMS keys, but keeps a list of key IDs.

How vSAN Data-At-Rest Encryption Works


When you enable data-at-rest encryption, vSAN encrypts everything in the vSAN datastore.
All files are encrypted, so all virtual machines and their corresponding data are protected. Only administrators with
encryption privileges can perform encryption and decryption tasks. vSAN uses encryption keys as follows:
• vCenter Server requests an AES-256 Key Encryption Key (KEK) from the KMS. vCenter Server stores only the ID of
the KEK, but not the key itself.
• The ESXi host encrypts disk data using the industry standard AES-256 XTS mode. Each disk has a different randomly
generated Data Encryption Key (DEK).
• Each ESXi host uses the KEK to encrypt its DEKs, and stores the encrypted DEKs on disk. The host does not store
the KEK on disk. If a host reboots, it requests the KEK with the corresponding ID from the KMS. The host can then
decrypt its DEKs as needed.
• A host key is used to encrypt core dumps, not data. All hosts in the same cluster use the same host key. When
collecting support bundles, a random key is generated to re-encrypt the core dumps. You can specify a password to
encrypt the random key.
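The key hierarchy above follows the common envelope-encryption pattern: a per-disk DEK encrypts the data, and the KEK only wraps (encrypts) the DEKs so that they can be stored safely on disk. The following Python sketch illustrates that pattern conceptually using the third-party cryptography package; it is not vSAN code, and it uses AES-GCM for brevity where vSAN uses AES-256 XTS for disk data.

from os import urandom
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

kek = AESGCM.generate_key(bit_length=256)   # Key Encryption Key, obtained from the key provider
dek = AESGCM.generate_key(bit_length=256)   # randomly generated per-disk Data Encryption Key

# Wrap (encrypt) the DEK with the KEK; only the wrapped DEK is stored on disk.
nonce = urandom(12)
wrapped_dek = AESGCM(kek).encrypt(nonce, dek, None)

# After a reboot, the host requests the KEK by its ID and unwraps its DEKs.
assert AESGCM(kek).decrypt(nonce, wrapped_dek, None) == dek

# Disk data is then encrypted with the DEK.
data_nonce = urandom(12)
ciphertext = AESGCM(dek).encrypt(data_nonce, b"guest data block", None)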


When a host reboots, it does not mount its disk groups until it receives the KEK. This process can take several minutes
or longer to complete. You can monitor the status of the disk groups in the vSAN health service, under Physical disks >
Software state health.

Encryption Key Persistence


In vSAN 7.0 Update 3 and later, data-at-rest encryption can continue to function even when the key server is temporarily
offline or unavailable. With key persistence enabled, the ESXi hosts can persist the encryption keys even after a reboot.
Each ESXi host obtains the encryption keys initially and retains them in its key cache. If the ESXi host has a Trusted
Platform Module (TPM), the encryption keys are persisted in the TPM across reboots. The host does not need to request
encryption keys. Encryption operations can continue when the key server is unavailable, because the keys have persisted
in the TPM.
Use the following commands to enable key persistence on a cluster host.
esxcli system settings encryption set --mode=TPM

esxcli system security keypersistence enable

For more information about encryption key persistence, see "Key Persistence Overview" in vSphere Security.

Using vSphere Native Key Provider


vSAN 7.0 Update 2 supports vSphere Native Key Provider. If your environment is set up for vSphere Native Key Provider,
you can use it to encrypt virtual machines in your vSAN cluster. For more information, see "Configuring and Managing
vSphere Native Key Provider" in vSphere Security.
vSphere Native Key Provider does not require an external Key Management Server (KMS). vCenter Server generates the
Key Encryption Key and pushes it to the ESXi hosts. The ESXi hosts then generate Data Encryption Keys.
NOTE
If you use vSphere Native Key Provider, make sure you back up the Native Key Provider so that reconfiguration tasks run smoothly.
vSphere Native Key Provider can coexist with an existing key server infrastructure.

Design Considerations for vSAN Data-At-Rest Encryption


Consider these guidelines when working with data-at-rest encryption.
• Do not deploy your KMS server on the same vSAN datastore that you plan to encrypt.
• Encryption is CPU intensive. AES-NI significantly improves encryption performance. Enable AES-NI in your BIOS.
• The witness host in a vSAN stretched cluster does not participate in vSAN encryption. The witness host does not store
customer data, only metadata, such as the size and UUID of vSAN object and components.
NOTE
If the witness host is an appliance running on another cluster, you can encrypt the metadata stored on it.
Enable data-at-rest encryption on the cluster that contains the witness host.
• Establish a policy regarding core dumps. Core dumps are encrypted because they can contain sensitive information. If
you decrypt a core dump, carefully handle its sensitive information. ESXi core dumps might contain keys for the ESXi
host and for the data on it.
– Always use a password when you collect a vm-support bundle. You can specify the password when you generate
the support bundle from the vSphere Client or using the vm-support command.


The password recrypts core dumps that use internal keys to use keys that are based on the password. You
can later use the password to decrypt any encrypted core dumps that might be included in the support bundle.
Unencrypted core dumps or logs are not affected.
– The password that you specify during vm-support bundle creation is not persisted in vSphere components. You are
responsible for keeping track of passwords for support bundles.

Set Up the Standard Key Provider


Use a standard key provider to distribute the keys that encrypt the vSAN datastore.
Before you can encrypt the vSAN datastore, you must set up a standard key provider to support encryption. That task
includes adding the KMS to vCenter Server and establishing trust with the KMS. vCenter Server provisions encryption
keys from the key provider.
The KMS must support the Key Management Interoperability Protocol (KMIP) 1.1 standard. See the vSphere Compatibility
Matrices for details.

Add a KMS to vCenter Server

You add a Key Management Server (KMS) to your vCenter Server system from the vSphere Web Client.
• Verify that the key server is in the vSphere Compatibility Matrices and is KMIP 1.1 compliant.
– Verify that you have the required privileges: Cryptographer.ManageKeyServers
• Connecting to a KMS by using only an IPv6 address is not supported.
• Connecting to a KMS through a proxy server that requires user name or password is not supported.
vCenter Server creates a KMS cluster when you add the first KMS instance. If you configure the KMS cluster on two or
more vCenter Servers, make sure you use the same KMS cluster name.
NOTE
Do not deploy your KMS servers on the vSAN cluster that you plan to encrypt. If a failure occurs, hosts in the vSAN cluster must communicate with the KMS.
• When you add the KMS, you are prompted to set this cluster as a default. You can later change the default cluster
explicitly.
• After vCenter Server creates the first cluster, you can add KMS instances from the same vendor to the cluster.
• You can set up the cluster with only one KMS instance.
• If your environment supports KMS solutions from different vendors, you can add multiple KMS clusters.
1. Log in to the vCenter Server system with the vSphere Web Client.
2. Browse the inventory list and select the vCenter Server instance.
3. Click Configure and click Key Management Servers.
4. Click Add KMS, specify the KMS information in the wizard, and click OK.
Option            Value
KMS cluster       Select Create new cluster for a new cluster. If a cluster exists, you can select that cluster.
Cluster name      Name for the KMS cluster. You can use this name to connect to the KMS if your vCenter Server instance becomes unavailable.
Server alias      Alias for the KMS. You can use this alias to connect to the KMS if your vCenter Server instance becomes unavailable.
Server address    IP address or FQDN of the KMS.
Server port       Port on which vCenter Server connects to the KMS.
Proxy address     Optional proxy address for connecting to the KMS.
Proxy port        Optional proxy port for connecting to the KMS.
User name         Some KMS vendors allow users to isolate encryption keys that are used by different users or groups by specifying a user name and password. Specify a user name only if your KMS supports this functionality, and if you intend to use it.
Password          Some KMS vendors allow users to isolate encryption keys that are used by different users or groups by specifying a user name and password. Specify a password only if your KMS supports this functionality, and if you intend to use it.

Establish a Standard Key Provider Trusted Connection by Exchanging Certificates


After you add the standard key provider to the vCenter Server system, you can establish a trusted connection.
Add the standard key provider.
The exact process depends on the certificates that the key provider accepts, and on your company policy.
1. Navigate to the vCenter Server.
2. Click Configure and select Key Providers under Security.
3. Select the key provider.
The KMS for the key provider is displayed.
4. Select the KMS.
5. From the Establish Trust drop-down menu, select Make KMS trust vCenter.
6. Select the option appropriate for your server and follow the steps.
Option                                 See
vCenter Server Root CA certificate     Use the Root CA Certificate Option to Establish a Standard Key Provider Trusted Connection.
vCenter Server Certificate             Use the Certificate Option to Establish a Standard Key Provider Trusted Connection.
Upload certificate and private key     Use the Upload Certificate and Private Key Option to Establish a Standard Key Provider Trusted Connection.
New Certificate Signing Request        Use the New Certificate Signing Request Option to Establish a Standard Key Provider Trusted Connection.

Use the Root CA Certificate Option to Establish a Standard Key Provider Trusted Connection
Some Key Management Server (KMS) vendors require that you upload your root CA certificate to the KMS.
All certificates that are signed by your root CA are then trusted by this KMS. The root CA certificate that vSphere Virtual
Machine Encryption uses is a self-signed certificate that is stored in a separate store in the VMware Endpoint Certificate
Store (VECS) on the vCenter Server system.


NOTE
Generate a root CA certificate only if you want to replace existing certificates. If you do, other certificates that are
signed by that root CA become invalid. You can generate a new root CA certificate as part of this workflow.
1. Navigate to the vCenter Server.
2. Click Configure and select Key Providers under Security.
3. Select the key provider with which you want to establish a trusted connection.
The KMS for the key provider is displayed.
4. From the Establish Trust drop-down menu, select Make KMS trust vCenter.
5. Select vCenter Root CA Certificate and click Next.
The Download Root CA Certificate dialog box is populated with the root certificate that vCenter Server uses for
encryption. This certificate is stored in VECS.
6. Copy the certificate to the clipboard or download the certificate as a file.
7. Follow the instructions from your KMS vendor to upload the certificate to their system.
NOTE
Some KMS vendors require that the KMS vendor restarts the KMS to pick up the root certificate that you
upload.
Finalize the certificate exchange. See Finish the Trust Setup for a Standard Key Provider.
Use the Certificate Option to Establish a Standard Key Provider Trusted Connection
Some Key Management Server (KMS) vendors require that you upload the vCenter Server certificate to the KMS.
After the upload, the KMS accepts traffic that comes from a system with that certificate. vCenter Server generates a
certificate to protect connections with the KMS. The certificate is stored in a separate key store in the VMware Endpoint
Certificate Store (VECS) on the vCenter Server system.
1. Navigate to the vCenter Server.
2. Click Configure and select Key Providers under Security.
3. Select the key provider with which you want to establish a trusted connection.
The KMS for the key provider is displayed.
4. From the Establish Trust drop-down menu, select Make KMS trust vCenter.
5. Select vCenter Certificate and click Next.
The Download Certificate dialog box is populated with the root certificate that vCenter Server uses for encryption. This
certificate is stored in VECS.
NOTE
Do not generate a new certificate unless you want to replace existing certificates.
6. Copy the certificate to the clipboard or download it as a file.
7. Follow the instructions from your KMS vendor to upload the certificate to the KMS.
Finalize the trust relationship. See Finish the Trust Setup for a Standard Key Provider.
Use the New Certificate Signing Request Option to Establish a Standard Key Provider Trusted Connection


Some Key Management Server (KMS) vendors require that vCenter Server generate a Certificate Signing Request (CSR)
and send that CSR to the KMS.
The KMS signs the CSR and returns the signed certificate. You can upload the signed certificate to vCenter Server. Using
the New Certificate Signing Request option is a two-step process. First you generate the CSR and send it to the KMS
vendor. Then you upload the signed certificate that you receive from the KMS vendor to vCenter Server.
1. Navigate to the vCenter Server.
2. Click Configure and select Key Providers under Security.
3. Select the key provider with which you want to establish a trusted connection.
The KMS for the key provider is displayed.
4. From the Establish Trust drop-down menu, select Make KMS trust vCenter.
5. Select New Certificate Signing Request (CSR) and click Next.
6. In the dialog box, copy the full certificate in the text box to the clipboard or download it as a file.
Use the Generate new CSR button in the dialog box only if you explicitly want to generate a CSR.
7. Follow the instructions from your KMS vendor to submit the CSR.
8. When you receive the signed certificate from the KMS vendor, click Key Providers again, select the key provider, and
from the Establish Trust drop-down menu, select Upload Signed CSR Certificate.
9. Paste the signed certificate into the bottom text box or click Upload File and upload the file, and click Upload.
Finalize the trust relationship. See Finish the Trust Setup for a Standard Key Provider.
Use the Upload Certificate and Private Key Option to Establish a Standard Key Provider Trusted Connection
Some Key Management Server (KMS) vendors require that you upload the KMS server certificate and private key to the
vCenter Server system.
• Request a certificate and private key from the KMS vendor. The files are X509 files in PEM format.
Some KMS vendors generate a certificate and private key for the connection and make them available to you. After you
upload the files, the KMS trusts your vCenter Server instance.
1. Navigate to the vCenter Server.
2. Click Configure and select Key Providers under Security.
3. Select the key provider with which you want to establish a trusted connection.
The KMS for the key provider is displayed.
4. From the Establish Trust drop-down menu, select Make KMS trust vCenter.
5. Select KMS certificate and private key and click Next.
6. Paste the certificate that you received from the KMS vendor into the top text box or click Upload a File to upload the
certificate file.
7. Paste the key file into the bottom text box or click Upload a File to upload the key file.
8. Click Establish Trust.
Finalize the trust relationship. See Finish the Trust Setup for a Standard Key Provider.
Set the Default Key Provider Using the vSphere Client

You can use the vSphere Client to set the default key provider at the vCenter Server level.


As a best practice, verify that the Connection Status in the Key Providers tab shows Active and a green check mark.
You must set the default key provider if you do not make the first key provider the default, or if your environment uses
multiple key providers and you remove the default one.
1. Log in using the vSphere Client.
2. Navigate to the vCenter Server.
3. Click Configure and select Key Providers under Security.
4. Select the key provider.
5. Click Set as Default.
A confirmation dialog box appears.
6. Click Set as Default.
The key provider displays as the current default.
Finish the Trust Setup for a Standard Key Provider

Unless the Add Standard Key Provider dialog prompted you to trust the KMS, you must explicitly establish trust after
certificate exchange is complete.
You can complete the trust setup, that is, make vCenter Server trust the KMS, either by trusting the KMS or by uploading
a KMS certificate. You have two options:
• Trust the certificate explicitly by using the Upload KMS certificate option.
• Upload a KMS leaf certificate or the KMS CA certificate to vCenter Server by using the Make vCenter Trust KMS
option.
NOTE
If you upload the root CA certificate or the intermediate CA certificate, vCenter Server trusts all certificates that
are signed by that CA. For strong security, upload a leaf certificate or an intermediate CA certificate that the
KMS vendor controls.
1. Navigate to the vCenter Server.
2. Click Configure and select Key Providers under Security.
3. Select the key provider with which you want to establish a trusted connection.
The KMS for the key provider is displayed.
4. Select the KMS.
5. Select one of the following options from the Establish Trust drop-down menu.
Option                    Action
Make vCenter Trust KMS    In the dialog box that appears, click Trust.
Upload KMS certificate    1. In the dialog box that appears, either paste in the certificate, or click Upload a file and browse to the certificate file.
                          2. Click Upload.


Enable Encryption on a New vSAN Cluster


You can enable encryption when you configure a new vSAN cluster.
• Required privileges:
– Host.Inventory.EditCluster
– Cryptographer.ManageEncryptionPolicy
– Cryptographer.ManageKMS
– Cryptographer.ManageKeys
• You must have set up a KMS cluster and established a trusted connection between vCenter Server and the KMS.
1. Navigate to an existing cluster in the vSphere Web Client.
2. Click the Configure tab.
3. Under vSAN, select General and click the Configure vSAN button.
4. On the vSAN capabilities page, select the Encryption check box, and select a KMS cluster.
NOTE
Make sure the Erase disks before use check box is deselected, unless you want to wipe existing data from
the storage devices as they are encrypted.
5. On the Claim disks page, specify which disks to claim for the vSAN cluster.
a) Select a flash device to be used for capacity and click the Claim for capacity tier icon.
b) Select a flash device to be used as cache and click the Claim for cache tier icon.

6. Complete your cluster configuration.

Encryption of data at rest is enabled on the vSAN cluster. vSAN encrypts all data added to the vSAN datastore.


Generate New Encryption Keys


You can generate new encryption keys, in case a key expires or becomes compromised.
• Required privileges:
– Host.Inventory.EditCluster
– Cryptographer.ManageKeys
• You must have set up a KMS cluster and established a trusted connection between vCenter Server and the KMS.
The following options are available when you generate new encryption keys for your vSAN cluster.
• If you generate a new KEK, all hosts in the vSAN cluster receive the new KEK from the KMS. Each host's DEK is re-encrypted with the new KEK.
• If you choose to re-encrypt all data using new keys, a new KEK and new DEKs are generated. A rolling disk re-format
is required to re-encrypt data.
1. Navigate to the vSAN host cluster in the vSphere Web Client.
2. Click the Configure tab.
3. Under vSAN, select General.
4. In the vSAN is turned ON pane, click the Generate new encryption keys button.
5. To generate a new KEK, click OK. The DEKs will be re-encrypted with the new KEK.
• To generate a new KEK and new DEKs, and re-encrypt all data in the vSAN cluster, select the following check box: Also re-encrypt all data on the storage using new keys.
• If your vSAN cluster has limited resources, select the Allow Reduced Redundancy check box. If you allow reduced redundancy, your data might be at risk during the disk reformat operation.

Enable vSAN Encryption on Existing vSAN Cluster


You can enable encryption by editing the configuration parameters of an existing vSAN cluster.
• Required privileges:
– Host.Inventory.EditCluster
– Cryptographer.ManageEncryptionPolicy
– Cryptographer.ManageKMS
– Cryptographer.ManageKeys
• You must have set up a KMS cluster and established a trusted connection between vCenter Server and the KMS.
• The cluster's disk-claiming mode must be set to manual.
1. Navigate to the vSAN host cluster in the vSphere Web Client.
2. Click the Configure tab.
3. Under vSAN, select General.
4. In the vSAN is turned ON pane, click the Edit button.
5. On the Edit vSAN settings dialog, check the Encryption check box, and select a KMS cluster.
6. (Optional) If the storage devices in your cluster contain sensitive data, select the Erase disks before use check box.
This setting directs vSAN to wipe existing data from the storage devices as they are encrypted.
7. Click OK.

A rolling reformat of all disk groups takes place as vSAN encrypts all data in the vSAN datastore.


vSAN Encryption and Core Dumps


If your vSAN cluster uses data-at-rest encryption, and if an error occurs on the ESXi host, the resulting core dump is
encrypted to protect data.
Core dumps that are included in the vm-support package are also encrypted.
NOTE
Core dumps can contain sensitive information. Follow your organization's data security and privacy policy when
handling core dumps.

Core Dumps on ESXi Hosts


When an ESXi host crashes, an encrypted core dump is generated and the host reboots. The core dump is encrypted with
the host key that is in the ESXi key cache. What you can do next depends on several factors.
• In most cases, vCenter Server retrieves the key for the host from the KMS and attempts to push the key to the ESXi
host after reboot. If the operation is successful, you can generate the vm-support package and you can decrypt or re-
encrypt the core dump.
• If vCenter Server cannot connect to the ESXi host, you might be able to retrieve the key from the KMS.
• If the host used a custom key, and that key differs from the key that vCenter Server pushes to the host, you cannot
manipulate the core dump. Avoid using custom keys.

Core Dumps and vm-support Packages


When you contact VMware Technical Support because of a serious error, your support representative usually asks you to
generate a vm-support package. The package includes log files and other information, including core dumps. If support
representatives cannot resolve the issues by looking at log files and other information, you can decrypt the core dumps to
make relevant information available. Follow your organization's security and privacy policy to protect sensitive information,
such as host keys.

Core Dumps on vCenter Server Systems


A core dump on a vCenter Server system is not encrypted. vCenter Server already contains potentially sensitive
information. At the minimum, ensure that the vCenter Server is protected. You also might consider turning off core dumps
for the vCenter Server system. Other information in log files can help determine the problem.

Collect a vm-support Package for an ESXi Host in an Encrypted vSAN Datastore

If data-at-rest encryption is enabled on a vSAN cluster, any core dumps in the vm-support package are encrypted.
Inform your support representative that data-at-rest encryption is enabled for the vSAN datastore. Your support
representative might ask you to decrypt core dumps to extract relevant information.


NOTE
Core dumps can contain sensitive information. Follow your organization's security and privacy policy to protect
sensitive information such as host keys.
You can collect the package, and you can specify a password if you expect to decrypt the core dump later. The vm-
support package includes log files, core dump files, and more.
1. Log in to vCenter Server using the vSphere Client.
2. Click Hosts and Clusters, and right-click the ESXi host.
3. Select Export System Logs.
4. In the dialog box, select Password for encrypted core dumps, and specify and confirm a password.
5. Leave the defaults for other options or make changes if requested by VMware Technical Support, and click Finish.
6. Specify a location for the file.
7. If your support representative asked you to decrypt the core dump in the vm-support package, log in to any ESXi host
and follow these steps.
a) Log in to the ESXi and connect to the directory where the vm-support package is located.
The filename follows the pattern esx.date_and_time.tgz.
b) Make sure that the directory has enough space for the package, the uncompressed package, and the
recompressed package, or move the package.
c) Extract the package to the local directory.
vm-support -x *.tgz .
The resulting file hierarchy might contain core dump files for the ESXi host, usually in /var/core, and might
contain multiple core dump files for virtual machines.
d) Decrypt each encrypted core dump file separately.
crypto-util envelope extract --offset 4096 --keyfile vm-support-incident-key-file --password encryptedZdump decryptedZdump
vm-support-incident-key-file is the incident key file that you find at the top level in the directory.
encryptedZdump is the name of the encrypted core dump file.
decryptedZdump is the name for the file that the command generates. Make the name similar to the
encryptedZdump name.
e) Provide the password that you specified when you created the vm-support package.
f) Remove the encrypted core dumps, and compress the package again.
vm-support --reconstruct

8. Remove any files that contain confidential information.


Decrypt or Re-Encrypt an Encrypted Core Dump on ESXi Host

You can decrypt or re-encrypt an encrypted core dump on your ESXi host by using the crypto-util CLI.
The ESXi host key that was used to encrypt the core dump must be available on the ESXi host that generated the core
dump.
You can decrypt and examine the core dumps in the vm-support package yourself. Core dumps might contain sensitive
information. Follow your organization's security and privacy policy to protect sensitive information, such as host keys.
For details about re-encrypting a core dump and other features of crypto-util, see the command-line help.


NOTE
crypto-util is for advanced users.

1. Log directly in to the ESXi host on which the core dump occurred.
If the ESXi host is in lockdown mode, or if SSH access is not enabled, you might have to enable access first.
2. Determine whether the core dump is encrypted.
Option               Description
Monitor core dump    crypto-util envelope describe vmmcores.ve
zdump file           crypto-util envelope describe --offset 4096 zdumpFile

3. Decrypt the core dump, depending on its type.


Option               Description
Monitor core dump    crypto-util envelope extract vmmcores.ve vmmcores
zdump file           crypto-util envelope extract --offset 4096 zdumpEncrypted zdumpUnencrypted

Upgrading the vSAN Cluster


Upgrading vSAN is a multistage process, in which you must perform the upgrade procedures in the order described here.
NOTE
You cannot upgrade a vSAN Original Storage Architecture cluster to a vSAN Express Storage Architecture cluster by using the vSphere Client or Ruby vSphere Console (RVC).
Before you attempt to upgrade, make sure you understand the complete upgrade process clearly to ensure a smooth and
uninterrupted upgrade. If you are not familiar with the general vSphere upgrade procedure, you should first review the
vSphere Upgrade documentation.
NOTE
Failure to follow the sequence of upgrade tasks described here can lead to data loss and cluster failure.
The vSAN cluster upgrade proceeds in the following sequence of tasks.
1. Upgrade the vCenter Server. See the vSphere Upgrade documentation.
2. Upgrade the ESXi hosts. See Upgrade the ESXi Hosts. For information about migrating and preparing your ESXi hosts
for upgrade, see the vSphere Upgrade documentation.
3. Upgrade the vSAN disk format. Upgrading the disk format is optional, but for best results, upgrade the objects to use
the latest version. The on-disk format exposes your environment to the complete feature set of vSAN. See Upgrade
vSAN Disk Format Using RVC.

Before You Upgrade vSAN


Plan and design your upgrade to be fail-safe.
Before you attempt to upgrade vSAN, verify that your environment meets the vSphere hardware and software
requirements.

Upgrade Prerequisite
Consider the aspects that might delay the overall upgrade process. For guidelines and best practices, see the vSphere
Upgrade documentation.
Review the key requirements before you upgrade your cluster.


Table 29: Upgrade Prerequisites

Software, hardware, drivers, firmware, and storage I/O controllers
    Verify that the new version of vSAN supports the software and hardware components, drivers, firmware, and storage I/O controllers that you plan on using. Supported items are listed on the VMware Compatibility Guide website at http://www.vmware.com/resources/compatibility/search.php.
vSAN version
    Verify that you are using the latest version of vSAN. You cannot upgrade from a beta version to the new vSAN. When you upgrade from a beta version, you must perform a fresh deployment of vSAN.
Disk space
    Verify that you have enough space available to complete the software version upgrade. The amount of disk storage needed for the vCenter Server installation depends on your vCenter Server configuration. For guidelines about the disk space required for upgrading vSphere, see the vSphere Upgrade documentation.
vSAN disk format
    The vSAN disk format upgrade is a metadata upgrade that does not require data evacuation or rebuilding.
vSAN hosts
    Verify that you have placed the vSAN hosts in maintenance mode and selected the Ensure data accessibility or Evacuate all data option. You can use vSphere Lifecycle Manager to automate and test the upgrade process. However, when you use vSphere Lifecycle Manager to upgrade vSAN, the default evacuation mode is Ensure data accessibility. When you use the Ensure data accessibility mode, your data is not protected, and if you encounter a failure while upgrading vSAN, you might experience unexpected data loss. However, the Ensure data accessibility mode is faster than the Evacuate all data mode, because you do not need to move all data to another host in the cluster. For information about various evacuation modes, see the Administering VMware vSAN documentation.
Virtual Machines
    Verify that you have backed up your virtual machines.

Recommendations
Consider the following recommendations when deploying ESXi hosts for use with vSAN:
• If ESXi hosts are configured with memory capacity of 512 GB or less, use SATADOM, SD, USB, or hard disk devices
as the installation media.
• If ESXi hosts are configured with memory capacity greater than 512 GB, use a separate magnetic disk or flash device
as the installation device. If you are using a separate device, verify that vSAN is not claiming the device.
• When you boot a vSAN host from a SATADOM device, you must use a single-level cell (SLC) device and the size of
the boot device must be at least 16 GB.
• To ensure your hardware meets the requirements for vSAN, refer to vSAN Planning and Deployment.
vSAN 6.5 and later enables you to adjust the boot size requirements for an ESXi host in a vSAN cluster.

Upgrading the Witness Host in a Two Host or vSAN Stretched Cluster


The witness host for a two host cluster or vSAN stretched cluster is located outside of the vSAN cluster, but it is managed
by the same vCenter Server. You can use the same process to upgrade the witness host as you use for a vSAN data
host.
Upgrade the witness host before you upgrade the data hosts.


Using vSphere Lifecycle Manager to upgrade hosts in parallel can result in the witness host being upgraded in parallel
with one of the data hosts. To avoid upgrade problems, configure vSphere Lifecycle Manager so it does not upgrade the
witness host in parallel with the data hosts.

Upgrade the vCenter Server


This first task to perform during the vSAN upgrade is a general vSphere upgrade, which includes upgrading vCenter
Server and ESXi hosts.
VMware supports in-place upgrades on 64-bit systems from vCenter Server 4.x, vCenter Server 5.0.x, vCenter Server
5.1.x, and vCenter Server 5.5 to vCenter Server 6.0 and later. The vCenter Server upgrade includes a database schema
upgrade and an upgrade of the vCenter Server.
The details and level of support for an upgrade to ESXi 7.0 depend on the host to be upgraded and the upgrade method
that you use. Verify that the upgrade path from your current version of ESXi to the version to which you are upgrading, is
supported. For more information, see the VMware Product Interoperability Matrices at http://www.vmware.com/resources/compatibility/sim/interop_matrix.php.
Instead of performing an in-place upgrade to vCenter Server, you can use a different machine for the upgrade. For
detailed instructions and upgrade options, see the vCenter Server Upgrade documentation.

Upgrade the ESXi Hosts


After you upgrade the vCenter Server, the next task for the vSAN cluster upgrade is upgrading the ESXi hosts to use the
current version.
You can upgrade the ESXi hosts in the vSAN cluster using:
• vSphere Lifecycle Manager - By using images or baselines, vSphere Lifecycle Manager enables you to upgrade ESXi
hosts in the vSAN cluster. The default evacuation mode is Ensure data accessibility. If you use this mode, and while
upgrading vSAN you encounter a failure, data can become inaccessible until one of the hosts is back online. For
information about working with evacuation and maintenance modes, see Working with Members of the vSAN Cluster in
Maintenance Mode. For more information about upgrades and updates, see the Managing Host and Cluster Lifecycle
documentation.
• Esxcli command - You can use components, base images, and add-ons as new software deliverables to update or
patch ESXi 7.0 hosts using the manual upgrade.
When you upgrade a vSAN cluster with configured fault domains, vSphere Lifecycle Manager upgrades a host within
a single fault domain and then proceeds to the next host. This ensures that the cluster has the same vSphere versions
running on all the hosts. When you upgrade a vSAN stretched cluster, vSphere Lifecycle Manager upgrades all the hosts
from the preferred site and then proceeds to the host in the secondary site. This ensures that the cluster has the same
vSphere versions running on all the hosts. For more information on the upgrading a vSAN stretched cluster, see the
Managing Host and Cluster Lifecycle documentation.
Before you attempt to upgrade the ESXi hosts, review the best practices discussed in the vSphere Upgrade
documentation. VMware provides several ESXi upgrade options. Choose the upgrade option that works best with the
type of host that you are upgrading. For detailed instructions and upgrade options, see the VMware ESXi Upgrade
documentation.
1. Upgrade the vSAN disk format. See Upgrade vSAN Disk Format Using RVC.
2. Verify the host license. In most cases, you must reapply your host license. For more information about applying host
licenses, see the vCenter Server and Host Management documentation.
3. Upgrade the virtual machines on the hosts by using the vSphere Client or vSphere Lifecycle Manager.

About the vSAN Disk Format


After you complete your ESXi update, upgrade the vSAN on-disk format to access the complete feature set of vSAN.


Each vSAN release supports the on-disk format of prior releases. All hosts in the cluster must have the same on-disk
format version. Because some features are tied to the on-disk format version, it's best to upgrade the vSAN on-disk
format to the highest version supported by the ESXi version. For more information, refer to https://kb.vmware.com/s/article/2148493.
vSAN on-disk format version 3 and higher require only a metadata upgrade that takes a few minutes. No disk evacuation
or reconfiguration is performed during the on-disk format upgrade.
Before you upgrade the vSAN on-disk format, run the Pre-Check Upgrade to ensure a smooth upgrade. The pre-check
identifies potential issues that might prevent a successful upgrade, such as failed disks or unhealthy objects.
NOTE
Once you upgrade the on-disk format, you cannot roll back software on the hosts or add certain older hosts to
the cluster.

Upgrading vSAN Disk Format Using vSphere Client


After you have finished upgrading the vSAN hosts, you can perform the disk format upgrade.
• Verify that you are using the updated version of vCenter Server.
• Verify that you are using the latest version of ESXi hosts.
• Verify that the disks are in a healthy state. Navigate to the Disk Management page to verify the object status.
• Verify that the hardware and software that you plan on using are certified and listed in the VMware Compatibility Guide
website at http://www.vmware.com/resources/compatibility/search.php.
• Verify that you have enough free space to perform the disk format upgrade. Run the RVC command,
vsan.whatif_host_failures, to determine whether you have enough capacity to complete the upgrade or perform a
component rebuild, in case you encounter any failure during the upgrade.
• Verify that your hosts are not in maintenance mode. When upgrading the disk format, do not place the hosts in
maintenance mode. When any member host of a vSAN cluster enters maintenance mode, the member host no longer
contributes capacity to the cluster. The cluster capacity is reduced and the cluster upgrade might fail.
• Verify that there are no component rebuilding tasks currently in progress in the vSAN cluster. For information about
vSAN resynchronization, see vSphere Monitoring and Performance.


NOTE
If you enable encryption or deduplication and compression on an existing vSAN cluster, the on-disk format is
automatically upgraded to the latest version. This procedure is not required. See Edit vSAN Settings.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, select Disk Management.
4. (Optional) Click Pre-check Upgrade.
The upgrade pre-check analyzes the cluster to uncover any issues that might prevent a successful upgrade. Some of
the items checked are host status, disk status, network status, and object status. Upgrade issues are displayed in the
Disk pre-check status text box.
5. Click Upgrade.
6. Click Yes on the Upgrade dialog box to perform the upgrade of the on-disk format.

vSAN successfully upgrades the on-disk format. The On-disk Format Version column displays the disk format version of
storage devices in the cluster.
If a failure occurs during the upgrade, you can check the Resyncing Objects page. Wait for all resynchronizations to
complete, and run the upgrade again. You also can check the cluster health using the health service. After you have
resolved any issues raised by the health checks, you can run the upgrade again.

Upgrade vSAN Disk Format Using RVC


After you have finished upgrading the vSAN hosts, you can use the Ruby vSphere Console (RVC) to continue with the
disk format upgrade.
• Verify that you are using the updated version of vCenter Server.
• Verify that the version of the ESXi hosts running in the vSAN cluster is 6.5 or later.
• Verify that the disks are in a healthy state from the Disk Management page. You can also run the vsan.disks_stats
RVC command to verify disk status.


• Verify that the hardware and software that you plan on using are certified and listed in the VMware Compatibility Guide
website at http://www.vmware.com/resources/compatibility/search.php.
• Verify that you have enough free space to perform the disk format upgrade. Run the RVC
vsan.whatif_host_failures command to determine that you have enough capacity to complete the upgrade or
perform a component rebuild in case you encounter failure during the upgrade.
• Verify that you have PuTTY or similar SSH client installed for accessing RVC.
For detailed information about downloading the RVC tool and using the RVC commands, see the RVC Command
Reference Guide.
• Verify that your hosts are not in maintenance mode. When upgrading the on-disk format, do not place your hosts in
maintenance mode. When any member host of a vSAN cluster enters maintenance mode, the available resource
capacity in the cluster is reduced because the member host no longer contributes capacity to the cluster. The cluster
upgrade might fail.
• Verify that there are no component rebuilding tasks currently in progress in the vSAN cluster by running the RVC
vsan.resync_dashboard command.

1. Log in to your vCenter Server using RVC.


2. Run the following RVC command to view the disk status: vsan.disks_stats /<vCenter IP address or hostname>/<data center name>/computers/<cluster name>
For example: vsan.disks_stats /192.168.0.1/BetaDC/computers/VSANCluster
The command lists the names of all devices and hosts in the vSAN cluster. The command also displays the current
disk format and its health status. You can also check the current health of the devices in the Health Status column
from the Disk Management page. For example, the device status appears as Unhealthy in the Health Status column
for the hosts or disk groups that have failed devices.
3. Run the following RVC command: vsan.ondisk_upgrade <path to vsan cluster>
For example: vsan.ondisk_upgrade /192.168.0.1/BetaDC/computers/VSANCluster
4. Monitor the progress in RVC.
RVC upgrades one disk group at a time.
After the disk format upgrade has completed successfully, the following message appears.
Done with disk format upgrade phase
There are n v1 objects that require upgrade
Object upgrade progress: n upgraded, 0 left
Object upgrade completed: n upgraded
Done VSAN upgrade
5. Run the following RVC command to verify that the object versions are upgraded to the new on-disk format:
vsan.obj_status_report
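Taken together, a typical session from RVC might look like the following sketch; the inventory path is an example value from this guide, so substitute your own vCenter Server address, data center, and cluster names:
# Check device health and the current on-disk format before upgrading
vsan.disks_stats /192.168.0.1/BetaDC/computers/VSANCluster
# Run the on-disk format upgrade for the cluster
vsan.ondisk_upgrade /192.168.0.1/BetaDC/computers/VSANCluster
# After the upgrade finishes, confirm that object versions use the new format
vsan.obj_status_report /192.168.0.1/BetaDC/computers/VSANCluster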

Verify the vSAN Disk Format Upgrade


After you finish upgrading the disk format, you must verify whether the vSAN cluster is using the new on-disk format.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, click Disk Management.
The current disk format version appears in the Disk Format Version column.

About vSAN Object Format


The operations space needed by vSAN to perform a policy change or other such operation on an object created by vSAN 7.0 or earlier is the space used by the largest object in the cluster.

This is typically difficult to plan for, and hence the guidance was to keep 30 percent free space in the cluster, assuming that the largest object in the cluster is unlikely to consume more than 25 percent of the space and that 5 percent of the space is reserved to make sure the cluster does not become full due to policy changes. In vSAN 7.0 Update 1 and later, all objects are created in a new format that allows vSAN to perform a policy change on an object as long as there is 255 GB of free space per host for objects smaller than 8 TB, and 765 GB per host for objects of 8 TB or larger.
After a cluster is upgraded to vSAN 7.0 Update 1 or later from vSAN 7.0 or an earlier release, objects greater than 255 GB created with the older release must be rewritten in the new format before vSAN can provide the benefit of performing operations on an object with the new free space requirements. If there are objects that must be converted to the new object format, a new object format health alert is displayed after the upgrade. The alert reports the number of objects that must be fixed and the amount of data that will be rewritten, and allows the health state to be remediated by starting a relayout task that fixes these objects. The cluster might experience a drop of about 20 percent in performance while the relayout task is in progress. The resync dashboard provides more accurate information about the amount of time this operation takes to complete.

Verify the vSAN Cluster Upgrade


The vSAN cluster upgrade is not complete until you have verified that you are using the latest version of vSphere and
vSAN is available for use.
1. Navigate to the vSAN cluster.
2. Click the Configure tab, and verify that vSAN is listed.
• You also can navigate to your ESXi host and select Summary > Configuration, and verify that you are using the
latest version of the ESXi host.

Using the RVC Upgrade Command Options During vSAN Cluster Upgrade
The vsan.ondisk_upgrade command provides various command options that you can use to control and manage the
vSAN cluster upgrade.
For example, you can allow reduced redundancy to perform the upgrade when you have little free space available. Run
the vsan.ondisk_upgrade --help command to display the list of RVC command options.
Use these command options with the vsan.ondisk_upgrade command.

Table 30: Upgrade Command Options

Options Description

--hosts_and_clusters Use to specify paths to all host systems in the cluster or cluster's compute resources.
--ignore-objects, -i Use to skip vSAN object upgrade. You can also use this command option to eliminate
the object version upgrade. When you use this command option, objects continue to use
the current on-disk format version.
--allow-reduced-redundancy, -a Use to remove the requirement of having a free space equal to one disk group during
disk upgrade. With this option, virtual machines operate in a reduced redundancy
mode during upgrade, which means certain virtual machines might be unable to
tolerate failures temporarily and that inability might cause data loss. vSAN restores full
compliance and redundancy after the upgrade is completed.
--force, -f Use to enable force-proceed and automatically answer all confirmation questions.
--help, -h Use to display the help options.
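For example, to run the disk format upgrade while allowing reduced redundancy on a cluster with little free space, you might run the following command; the inventory path is an example value:
vsan.ondisk_upgrade --allow-reduced-redundancy /192.168.0.1/BetaDC/computers/VSANCluster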

For information about using the RVC commands, see the RVC Command Reference Guide.


vSAN Build Recommendations for vSphere Lifecycle Manager


vSAN generates system baselines and baseline groups that you can use with vSphere Lifecycle Manager.
vSphere Lifecycle Manager in vSphere 7.0 includes the system baselines that Update Manager provided in earlier
vSphere releases. It also includes new image management functionality for hosts running ESXi 7.0 and later.
vSAN 6.6.1 and later generates automated build recommendations for vSAN clusters. vSAN combines information in
the VMware Compatibility Guide and vSAN Release Catalog with information about the installed ESXi releases. These
recommended updates provide the best available release to keep your hardware in a supported state.
System baselines for vSAN 6.7.1 to vSAN 7.0 also can include device driver and firmware updates. These updates
support the ESXi software recommended for your cluster.
For vSAN 6.7.3 and later, you can choose to provide build recommendations for the current ESXi release only, or for the
latest supported ESXi release. A build recommendation for the current release includes all patches and driver updates for
the release.
In vSAN 7.0 and later, vSAN build recommendations include patch updates and applicable driver updates. To update
firmware on vSAN 7.0 clusters, you must use an image through vSphere Lifecycle Manager.

vSAN System Baselines


vSAN build recommendations are provided through vSAN system baselines for vSphere Lifecycle Manager. These system
baselines are managed by vSAN. They are read-only and cannot be customized.
vSAN generates one baseline group for each vSAN cluster. vSAN system baselines are listed in the Baselines pane of
the Baselines and Groups tab. You can continue to create and remediate your own baselines.
vSAN system baselines can include custom ISO images provided by certified vendors. If hosts in your vSAN cluster
have OEM-specific custom ISOs, then vSAN recommended system baselines can include custom ISOs from the same
vendor. vSphere Lifecycle Manager cannot generate a recommendation for custom ISOs not supported by vSAN. If you
are running a customized software image that overrides the vendor name in the host's image profile, vSphere Lifecycle
Manager cannot recommend a system baseline.
vSphere Lifecycle Manager automatically scans each vSAN cluster to check compliance against the baseline group. To
upgrade your cluster, you must manually remediate the system baseline through vSphere Lifecycle Manager. You can
remediate vSAN system baseline on a single host or on the entire cluster.

vSAN Release Catalog


The vSAN release catalog maintains information about available releases, preference order for releases, and critical
patches needed for each release. The vSAN release catalog is hosted on the VMware Cloud.
vSAN requires Internet connectivity to access the release catalog. You do not need to be enrolled in the Customer
Experience Improvement Program (CEIP) for vSAN to access the release catalog.
If you do not have an Internet connection, you can upload the vSAN release catalog directly to the vCenter Server. In the
vSphere Client, click Configure > vSAN > Update, and click Upload from file in the Release Catalog section. You can
download the latest vSAN release catalog.
vSphere Lifecycle Manager enables you to import storage controller drivers recommended for your vSAN cluster. Some
storage controller vendors provide a software management tool that vSAN can use to update controller drivers. If the
management tool is not present on ESXi hosts, you can download the tool.

Working with vSAN Build Recommendations


vSphere Lifecycle Manager checks the installed ESXi releases against information in the Hardware Compatibility List
(HCL) in the VMware Compatibility Guide. It determines the correct upgrade path for each vSAN cluster, based on the


current vSAN Release Catalog. vSAN also includes the necessary drivers and patch updates for the recommended
release in its system baseline.
vSAN build recommendations ensure that each vSAN cluster remains at the current hardware compatibility status or
better. If hardware in the vSAN cluster is not included on the HCL, vSAN can recommend an upgrade to the latest release,
since it is no worse than the current state.
NOTE
vSphere Lifecycle Manager uses the vSAN health service when performing remediation precheck for hosts
in a vSAN cluster. vSAN health service is not available on hosts running ESXi 6.0 Update 1 or earlier. When
vSphere Lifecycle Manager upgrades hosts running ESXi 6.0 Update 1 or earlier, the upgrade of the last host
in the vSAN cluster might fail. If remediation failed because of vSAN health issues, you can still complete
the upgrade. Use the vSAN health service to resolve health issues on the host, then take that host out of
maintenance mode to complete the upgrade workflow.
The following examples describe the logic behind vSAN build recommendations.
Example 1
A vSAN cluster is running 6.0 Update 2, and its hardware is included on the 6.0 Update 2 HCL. The HCL lists the hardware
as supported up to release 6.0 Update 3, but not supported for 6.5 and later. vSAN recommends an upgrade to 6.0 Update 3,
including the necessary critical patches for the release.
Example 2
A vSAN cluster is running 6.7 Update 2, and its hardware is included on the 6.7 Update 2 HCL. The hardware is also
supported on the HCL for release 7.0 Update 3. vSAN recommends an upgrade to release 7.0 Update 3.
Example 3
A vSAN cluster is running 6.7 Update 2 and its hardware is not on the HCL for that release. vSAN recommends an upgrade
to 7.0 Update 3, even though the hardware is not on the HCL for 7.0 Update 3. vSAN recommends the upgrade because the
new state is no worse than the current state.
Example 4
A vSAN cluster is running 6.7 Update 2, and its hardware is included on the 6.7 Update 2 HCL. The hardware is also
supported on the HCL for release 7.0 Update 3 and selected baseline preference is patch-only. vSAN recommends an
upgrade to 7.0 Update 3, including the necessary critical patches for the release.

The recommendation engine runs periodically (once each day), or when the following events occur.
• Cluster membership changes. For example, when you add or remove a host.
• The vSAN management service restarts.
• A user logs in to Broadcom Support Portal using a web browser or RVC.
• An update is made to the VMware Compatibility Guide or the vSAN Release Catalog.
The vSAN Build Recommendation health check displays the current build that is recommended for the vSAN cluster. It
also can warn you about any issues with the feature.

System Requirements
vSphere Lifecycle Manager is an extension service in vCenter Server 7.0 and later.
vSAN requires Internet access to update release metadata, to check the VMware Compatibility Guide, and to download
ISO images from Broadcom Support Portal.
vSAN requires valid credentials to download ISO images for upgrades from Broadcom Support Portal. For hosts running
6.0 Update 1 and earlier, you must use RVC to enter the Broadcom Support Portal credentials. For hosts running later
software, you can log in from the ESX Build Recommendation health check.
To enter Broadcom Support Portal credentials from RVC, run the following command: vsan.login_iso_depot -u <username> -p <password>


vSAN Monitoring and Troubleshooting


vSAN Monitoring and Troubleshooting describes how to monitor and troubleshoot VMware vSAN® by using the vSphere
Client.
In addition, vSAN Monitoring and Troubleshooting explains how to monitor and troubleshoot a vSAN cluster using esxcli
and RVC commands, and other tools.
At VMware, we value inclusion. To foster this principle within our customer, partner, and internal community, we create
content using inclusive language.

Intended Audience
This manual is intended for anyone who wants to monitor vSAN operation and performance, or troubleshoot problems
with a vSAN cluster. The information in this manual is written for experienced system administrators who are familiar with
virtual machine technology and virtual datacenter operations. This manual assumes familiarity with VMware vSphere,
including VMware ESXi, vCenter Server, and the vSphere Client.
For more information about vSAN and how to create a vSAN cluster, see the vSAN Planning and Deployment Guide.
For more information about vSAN features and how to configure a vSAN cluster, see Administering VMware vSAN.

What Is vSAN
VMware vSAN is a distributed layer of software that runs natively as a part of the ESXi hypervisor.
vSAN aggregates local or direct-attached capacity devices of a host cluster and creates a single storage pool shared
across all hosts in the vSAN cluster. While supporting VMware features that require shared storage, such as HA, vMotion,
and DRS, vSAN eliminates the need for external shared storage and simplifies storage configuration and virtual machine
provisioning activities.

Monitoring the vSAN Cluster


You can monitor the vSAN cluster and all the objects related to it.
You can monitor all of the objects in a vSAN environment, including hosts that participate in a vSAN cluster and the vSAN
datastore. For more information about monitoring objects and storage resources in a vSAN cluster, see the vSphere
Monitoring and Performance documentation.

Monitor vSAN Capacity


You can monitor the capacity of the vSAN datastore, vSAN Direct storage, and Persistent Memory (PMem) storage.
You can analyze usage and view the capacity breakdown at the cluster level.
The cluster Summary page includes a summary of vSAN capacity. You also can view more detailed information in the
Capacity monitor.

1. Navigate to the vSAN cluster.
2. Click the Monitor tab.
3. Under vSAN, click Capacity to view the vSAN capacity information.

If you have deduplication and compression enabled, you can view the deduplication and compression savings and the deduplication and compression ratio.
NOTE
vSAN Express Storage Architecture (ESA) does not support deduplication.

The capacity information includes the following terms:

• Free space – Total free space available in the cluster.
• Used space – Total written physical space.
• Actually written – Actually used capacity. This capacity is displayed when deduplication or compression are not enabled.
• Compression savings – Space saved when data compression is enabled.
• Object reserved – Includes the reservation for objects created with a policy that has specified object space reservation. This capacity is not actually used by the objects.
• Reserved capacity – Includes the operations reserve and the host rebuild reserve.
• What if analysis enables you to estimate the effective free space while keeping the deduplication ratio as 1. Effective
free space is an estimate of free space available based on the selected storage policy. The effective free space
typically is smaller than the free space available on the disks, due to the cluster topology or the distribution of space
across fault domains. For example, consider a cluster with 100 GB free space available on the disks. However, 100 GB
cannot be provisioned as a single 100 GB object due to the distribution of free space across fault domains. If there are
three fault domains and each fault domain has 33 GB free space, then the largest object that you can create with FTT
1 is 33 GB.
• Oversubscription reports the vSAN capacity required if all the thin provisioned VMs and user objects are used at
full capacity. It shows a ratio of the required usage compared with the total available capacity. While calculating the
oversubscription, vSAN includes all the available VMs, user objects, and storage policy overhead, and does not
consider the vSAN namespace and swap objects.
NOTE
Oversubscription is applicable only for vSAN hosts that are running 6.7 Update 1 or later.
NOTE
PMem storage does not support What if analysis and Oversubscription.
– The Usage breakdown before deduplication and compression displays the amount of storage space used by VMs,
user objects, and the system. You can view a pie chart that represents the different usage categories. Click the pie
chart to view the details of the selected category.


Following are the different usage categories available:

Category Description

VM (user objects) usage Displays the following:


• VM home objects - Usage of VM namespace object.
• Swap objects - Usage of VM swap files.
• VMDK - Capacity consumed by VMDK objects that reside
on the vSAN datastore that can be categorized as primary
data and replica usage. Primary data includes the actual user
data written into the physical disk which does not include any
overhead. Replica usage displays the RAID overhead for the
virtual disk.
• VM memory snapshots - Usage of memory snapshot file for
VMs.
• Block container volumes (attached to a VM) - Capacity
consumed by the container objects that are attached to a VM.
• vSphere replication persistent state file - vSAN object used to
store the persistent state file (PSF) at source site.
Non-VM (user objects) usage Displays iSCSI objects, block container volumes that are not
attached to VM, user-created files, ISO files, VM templates, files
shares, file container volumes, and vSAN objects used by the
vSphere replication service at the target site.
System usage Displays the following:
• Performance management objects - Capacity consumed by
objects created for storing performance metrics when you
enable the performance service.
• File system overhead - vSAN on-disk format overhead that
may take up on the capacity drives.
• ESA object overhead - vSAN ESA uses the capacity to store
object metadata and to provide high performance.
• Checksum overhead - Overhead to store all the checksums.
• Dedup & compression overhead - Overhead to get the benefits
of deduplication and compression. This data is visible only if
you enable deduplication and compression.
• Operations usage - Temporary space usage in a cluster. The
temporary space usage includes temporary capacity used for
rebalance operations or moving objects due to FTT changes.
• Native trace objects - Capacity consumed by objects created
for storing vSAN traces.

NOTE
PMem only supports VMDK, Non-Volatile Dual In-line Memory Module (NVDIMM), and file system
overhead.
When you enable deduplication and compression, it might take several minutes for capacity updates to be reflected in the
Capacity monitor, as disk space is reclaimed and reallocated. For more information about deduplication and compression,
see "Using Deduplication and Compression" in Administering VMware vSAN.
In vSAN ESA, Usage by Snapshots displays the snapshot usage by the vSAN datastore. You can delete one or more
snapshots and free the used space, thus managing space consumption. To delete a snapshot, right-click the virtual

machine > Snapshots > Manage Snapshots. Click Delete to delete a snapshot. Click Delete All Snapshots to delete all
the snapshots. The following are the different usage snapshots available:

Snapshot Description

Container volume snapshots Displays the container volume snapshot usage in the vSAN
datastore.
VMDK snapshots Displays the VMDK snapshot usage in the vSAN datastore.
vSAN file share snapshots Displays the file share snapshot usage in the vSAN datastore.
Current data Displays the usage data that is not included in the snapshot usage
data. You can calculate the current data by subtracting the total
snapshot usage from the total used space.

You can check the history of capacity usage in the vSAN datastore. Click Capacity History, select a time range, and click
Show Results.
The Capacity monitor displays two thresholds represented as vertical markers in the bar chart:
• Operations threshold - Displays the space vSAN requires to perform internal operations in the cluster. If the used
space reaches beyond that threshold, vSAN might not be able to operate properly.
• Host rebuild threshold - Displays the space vSAN requires to tolerate one host failure. If the used space reaches
beyond the host rebuild threshold and the host fails, vSAN might not successfully restore all data from the failed host.
If you enable reserved capacity, the Capacity monitor displays the following:
• Operations reserve - Reserved space in the cluster for internal operations.
• Host rebuild reserve - Reserved space for vSAN to be able to repair in case of single host failure. The Capacity
monitor displays the host rebuild threshold only when the host rebuild reserve is enabled.
If the resynchronization of objects is in progress in a cluster, vSAN displays the capacity used in the capacity chart as operations usage. If there is enough free space in the cluster, vSAN might use more space than the operations threshold so that the resyncing operations complete faster.
Click the Configure tab to enable the capacity reserve. You can also click Configure > vSAN > Services to enable the capacity reserve. For more information on configuring the reserved capacity, see Configure Reserved Capacity for vSAN Cluster.
In a cluster, if utilization exceeds the host rebuild threshold and the reserved capacity is not enabled, the capacity chart turns yellow as a warning. If the most consumed host fails, vSAN cannot recover the data. If you enable the host rebuild reserve, the capacity chart turns yellow at 80% of the host rebuild threshold. If the used space exceeds the operations threshold and the reserved capacity is not enabled, vSAN cannot perform or complete operations such as rebalancing or resyncing object components due to policy changes. In that case, the capacity chart turns red to indicate that the disk usage exceeds the operations threshold. For more information about capacity reserve, see About Reserved Capacity in vSAN Cluster.


Monitor Physical Devices in vSAN Cluster


You can monitor hosts, cache devices, and capacity devices used in the vSAN cluster.

1. Navigate to the vSAN cluster.


2. Click the Configure tab.
3. Click Disk Management to review all hosts, cache devices, and capacity devices in the cluster. The physical location
is based on the hardware location of cache and capacity devices on vSAN hosts. You can see the virtual objects on
any selected host, disk group, or disk and view the impact of the selected entity to the virtual objects in the cluster.
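If you prefer the command line, you can also list the devices that vSAN has claimed on an individual host; a minimal sketch, run from an ESXi host shell:
# List the vSAN-claimed devices on this host, including their role and state
esxcli vsan storage list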

Monitor Devices that Participate in vSAN Datastores


Verify the status of the devices that back the vSAN datastore. You can check whether the devices experience any problems.
1. Navigate to Storage.
2. Select the vSAN datastore.
3. Click the Configure tab.
You can view general information about the vSAN datastore, including capacity, capabilities, and the default storage
policy.
4. Display information about local devices.
a) Click Disk Management and select the disk group to display local devices in the table at the bottom of the page.
b) Click Capacity to review information about the amount of capacity provisioned and used in the cluster, and also to
review a breakdown of the used capacity by object type or by data type.

Monitor Virtual Objects in vSAN Cluster


You can view the status of virtual objects in the vSAN cluster.
When one or more hosts are unable to communicate with the vSAN datastore, the information about virtual objects might
not be displayed.
1. Navigate to the vSAN cluster.
2. Click the Monitor tab.
3. Under vSAN, select Virtual Objects to view the corresponding virtual objects in the vSAN cluster.
4. Click to filter the virtual objects based on name, type, storage policy, and UUID.
a) Select the check box on one of the virtual objects and click View Placement Details to open the Physical
Placement dialog box. You can view the device information, such as name, identifier or UUID, number of devices
used for each virtual machine, and how they are mirrored across hosts.
b) On the Physical Placement dialog box, select the Group components by host placement check box to organize
the objects by host and by disk.
NOTE
At the cluster level, the Container Volumes filter displays detached container volumes. To view attached
volumes, expand the VM to which the container is attached.


5. Select the check box of the attached block type or file volumes and click View Performance. You can use the
vSAN cluster performance charts to monitor the workload in your cluster. For more information on the vSAN cluster
performance charts, see View vSAN Cluster Performance.
6. Select the check box on one of the container volumes and click View Container Volume. For more information about
monitoring container volumes, see Monitor Container Volumes in vSAN Cluster.
7. Select the check box on one of the file volumes and click View File Share. For more information about file volume,
see Administering VMware vSAN.
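As a complement to the vSphere Client view, you can also get a host-side summary of object health from the command line; a minimal sketch, assuming a recent ESXi release that includes the vSAN debug namespace:
# Summarize the health of vSAN objects as seen from this host
esxcli vsan debug object health summary get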

Monitor Container Volumes in vSAN Cluster


You can view the status of the container volumes in the vSAN cluster.
1. Navigate to the vSAN cluster.
2. Click the Monitor tab.
3. Under Cloud Native Storage, select Container Volumes to view the container volumes in the vSAN cluster. You can
view information about the volume name, label, datastore, compliance status, health status, and capacity quota.
4. Click to view the following:
• Click the Basics tab to view the volume details such as volume type, ID, datastore, storage policy, compliance, and
health status.
• Click the Kubernetes objects tab to view Kubernetes related data such as Kubernetes cluster, namespace, pod,
persistent volume claim, labels, and so on.
• Click the Physical Placement tab to view the type, host, cache, and capacity disk of the virtual object components.
• Click the Performance tab to view the performance of the container volumes.
5. Select the check box for the volumes that have an out-of-date policy status. Click Reapply Policy to reapply the policy
on the selected volumes.
6. Select the check box for the container volume you want to delete and click Delete.
7. Use the Add Filter option to add filters to the container volumes.

About Reserved Capacity in vSAN Cluster


vSAN requires capacity for its internal operations.
For a cluster to be able to tolerate a single host failure, vSAN requires free space to restore the data of the failed host.
The capacity required to restore a host failure matches the total capacity of the largest host in the cluster. These values
are represented as thresholds in the Capacity Monitor page:
• Operations threshold - Displays the space vSAN requires to run its internal operations in the cluster. If the used space
exceeds the operations threshold, vSAN might not operate properly.
• Host rebuild threshold - Displays the space vSAN requires to tolerate one host failure. If the used space exceeds the
host rebuild threshold and the host fails, vSAN might not successfully restore all data from the failed host.
For more information on the capacity thresholds, see Monitor vSAN Capacity.
vSAN provides the option to reserve capacity in advance so that enough free space is available to perform internal operations and to repair data back to compliance following a single host failure. When you enable reserved capacity, vSAN prevents you from using that space to create workloads, preserving the capacity available in the cluster. By default, the reserved capacity is not enabled.
If there is enough free space in the vSAN cluster, you can enable the operations reserve and/or the host rebuild reserve.
• Operations reserve - Reserved space in the cluster for vSAN internal operations.
• Host rebuild reserve - Reserved space for vSAN to be able to repair in case of a single host failure.


These soft reservations prevent the creation of new VMs or the powering on of VMs if those operations would consume the reserved space. After the reserved capacity is enabled, vSAN does not prevent powered-on VM operations, such as I/O from the guest operating system or applications, from consuming the space even after the threshold limits are reached. After you enable the reserved capacity, you must monitor the disk space health alerts and capacity usage in the cluster and take appropriate actions to keep the capacity usage below the threshold limits.
NOTE
The reserved capacity is not supported on a vSAN stretched cluster, cluster with fault domains and nested fault
domains, ROBO cluster, or if the number of hosts in the cluster is less than four.
To enable reserved capacity for the host rebuild, you must first enable the operations reserve. When you enable
operations reserve, vSAN reserves 5% additional capacity in the operations reserve as a buffer to ensure you have time
to react to the capacity fullness before the actual threshold is reached.
vSAN indicates when the capacity usage is high in a cluster. The indications can be in the form of health alerts, the capacity chart turning yellow or red, and so on. Because of the reservation, vSAN might not have enough free space left, which can prevent operations such as creating VMs or VM snapshots, or creating or extending virtual disks.
NOTE
You cannot enable reserved capacity if the cluster is at a capacity higher than the specified threshold.

Capacity Reservation Considerations


Following are the considerations if you enable reserved capacity:
• When you enable reserved capacity with the host rebuild reserve and a host is put into maintenance mode, the host
might not come back online. In this case, vSAN continues to reserve capacity for another host failure. This host failure
is in addition to the host that is already in the maintenance mode. This might cause the failure of operations if the
capacity usage is above the host rebuild threshold.
• When you enable reserved capacity with the host rebuild reserve and a host fails, vSAN might not start repairing the
affected objects until the repair timer expires. During this time, vSAN continues to reserve capacity for another host
failure. This can cause failure of operations if the capacity usage is above the current host rebuild threshold, after the
first host failure. After the repairs are complete, you can deactivate the reserved capacity for the host rebuild reserve if
the cluster does not have the capacity for another host failure.

Configure Reserved Capacity for vSAN Cluster


You can configure reserved capacity for a vSAN cluster to reserve space for internal operations. You can also reserve capacity for data repair following a single host failure. Ensure that you have the following required privileges: Host.Inventory.EditCluster and Host.Config.Storage.


Verify that the vSAN cluster:


• Is not configured as a vSAN stretched cluster or ROBO cluster.
• Has no fault domains and nested fault domains created.
• Has a minimum of four hosts.

1. Navigate to the vSAN cluster.


2. Click the Configure tab.
3. Under vSAN, select Services.
4. Click to edit the Reservations and Alerts.


5. Click to enable or deactivate the operations reserve. On enabling the operations reserve, vSAN ensures that the
cluster has enough space to complete the internal operations.
6. Click to enable or deactivate the host rebuild reserve. On enabling the host rebuild reserve, vSAN provides the
reservation of space to repair data back to compliance following a single host failure. You can enable the host rebuild
reserve only after you enable the operations reserve. After enabling, if you deactivate the operations reserve, the host
rebuild reserve gets automatically deactivated.
7. Select Customize alerts. You can set a customized threshold to receive warning and error alerts. The threshold
percentage is calculated based on the available capacity, which is the difference between the total capacity and the
reserved capacity. If you do not set a customized value, vSAN uses the default thresholds to generate alerts.
8. Click Apply.

About vSAN Cluster Resynchronization


You can monitor the status of virtual machine objects that are being resynchronized in the vSAN cluster.

When a hardware device, host, or network fails, or if a host is placed into maintenance mode, vSAN initiates
resynchronization in the vSAN cluster. However, vSAN might briefly wait for the failed components to come back online
before initiating resynchronization tasks.
The following events trigger resynchronization in the cluster:
• Editing a virtual machine (VM) storage policy. When you change VM storage policy settings, vSAN might initiate object recreation and subsequent resynchronization of the objects.
Certain policy changes might cause vSAN to create another version of an object and synchronize it with the previous version. When the synchronization is complete, the original object is discarded.
vSAN ensures that VMs continue to run and are not interrupted by this process. This process might require additional temporary capacity.
• Restarting a host after a failure.
• Recovering hosts from a permanent or long-term failure. If a host is unavailable for more than 60 minutes (by default), vSAN creates copies of data to recover the full policy compliance.
• Evacuating data by using the Full data migration mode before you place a host in maintenance mode.
• Exceeding the utilization threshold of a capacity device. Resynchronization is triggered when capacity device utilization in the vSAN cluster approaches or exceeds the threshold level of 80 percent.
If a VM is not responding due to latency caused by resynchronization, you can throttle the IOPS used for
resynchronization.

Monitor the Resynchronization Tasks in vSAN Cluster


To evaluate the status of objects that are being resynchronized, you can monitor the resynchronization tasks that are
currently in progress.


Verify that hosts in your vSAN cluster are running ESXi 7.0 or later.
1. Navigate to the vSAN cluster.
2. Select the Monitor tab.
3. Click vSAN.
4. Select Resyncing objects.
5. Track the progress of resynchronization of virtual machine objects.
The Object Repair Time defines the time vSAN waits before repairing a non-compliant object after placing a host in
a failed state or maintenance mode. The default setting is 60 minutes. To change the setting, edit the Object Repair
Timer (Configure > vSAN > Services > Advanced Options).
You can also view the following information about the objects that are resynchronized:

Objects Description

Total resyncing objects Total number of objects to be resynchronized in the vSAN cluster.
Bytes left to resync Data (in bytes) that is remaining before the resynchronization is
complete.
Total resyncing ETA Estimated time left for the resynchronization to complete.
The objects to be resynchronized are categorized as active,
queued, and suspended. The objects that are actively
synchronizing fall in the active category. The objects that are in the
queue for resynchronization are the queued objects. The objects
that were actively synchronizing but are now in the suspended
state fall in the suspended category.
Scheduled resyncing Remaining number of objects to be resynchronized.
You can classify scheduled resyncing into two categories:
scheduled and pending. The scheduled category displays the
objects that are not resyncing because the delay timer has not
expired. Resynchronization of objects starts once the timer
expires. The pending category displays the objects with the
expired delay timer that cannot be resynchronized. This can be
due to insufficient resources in the current cluster or the vSAN
FTT policy set on the cluster not being met.

You can also view the resynchronization objects based on various filters such as Intent and Status. Using Show first, you can modify how many objects the view displays.
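You can also check resynchronization activity from the command line using the RVC command referenced earlier in this guide; the inventory path is an example value:
# Display the resync dashboard for the cluster
vsan.resync_dashboard /192.168.0.1/BetaDC/computers/VSANCluster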

About vSAN Cluster Rebalancing


When any capacity device in your cluster reaches 80 percent full, vSAN automatically rebalances the cluster.
The vSAN cluster rebalancing continues until the space and component usage on all capacity devices is below the threshold. Cluster rebalancing evenly distributes resources across the cluster to maintain consistent performance and
availability.
The following operations can cause disk capacity to reach 80% and initiate cluster rebalancing:
• Hardware failures occur on the cluster.
• vSAN hosts are placed in maintenance mode with the Evacuate all data option.
• vSAN hosts are placed in maintenance mode with Ensure data accessibility when objects assigned FTT=0 reside on
the host.


NOTE
To provide enough space for maintenance and reprotection, and to minimize automatic rebalancing events in the
vSAN cluster, consider keeping 30-percent capacity available at all times.

Configure Automatic Rebalance in vSAN Cluster


vSAN automatically rebalances data on the disks by default. You can configure settings for automatic rebalancing.
Your vSAN cluster can become unbalanced based on the space or component usage for many reasons such as when you
create objects of different sizes, when you add new hosts or capacity devices, or when objects write different amounts of
data to the disks. If the cluster becomes unbalanced, vSAN automatically rebalances the disks. Based on the space or
component usage, this operation moves components from over-utilized disks to under-utilized disks.
You can enable or deactivate automatic rebalance, and configure the variance threshold for triggering an automatic
rebalance. If any two disks in the cluster have a variance in capacity or component usage that exceeds the rebalancing
threshold, vSAN begins rebalancing the cluster.
Disk rebalancing can impact the I/O performance of your vSAN cluster. By default, the rebalance threshold is set at 30 percent, which keeps the cluster relatively balanced without significantly impacting performance. If the cluster becomes severely imbalanced, such as after adding one or more hosts or disks, temporarily using a lower threshold of 10 or 20 percent makes the cluster more evenly balanced. Perform this change during off-peak periods to minimize the performance impact of the rebalancing activity. Once the rebalancing is complete, you can change the threshold back to the default 30 percent.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.
4. Click to edit Advanced Options.

5. Click to enable or deactivate Automatic Rebalance.


You can change the variance threshold to any percentage from 20 to 75.
You can use the vSAN Skyline Health to check the disk balance. Expand the Cluster category, and select vSAN Disk
Balance.


Using the vSAN Default Alarms


You can use the default vSAN alarms to monitor the cluster, hosts, and existing vSAN licenses.
The default alarms are automatically triggered when the events corresponding to the alarms are activated or if one or all
the conditions specified in the alarms are met. You cannot edit the conditions or delete the default alarms. To configure
alarms that are specific to your requirements, create custom alarms for vSAN. See Creating a vCenter Server Alarm for a
vSAN Event.
For information about monitoring alarms, events, and editing existing alarm settings, see the vSphere Monitoring and
Performance documentation.

View vSAN Default Alarms


Use the default vSAN alarms to monitor your cluster, hosts, analyze any new events, and assess the overall cluster
health.
1. Navigate to the vSAN cluster.
2. Click Configure and then click Alarm Definitions.
3. In the search box, type vSAN to display the alarms that are specific to vSAN. Type vSAN Health Service Alarm to search for vSAN health service alarms.
The default vSAN alarms are displayed.
4. From the list of alarms, click each alarm to view the alarm definition.

View vSAN Network Alarms


vSAN network diagnostics queries the latest network metrics and compares the metrics statistics with the defined
threshold values.
The vSAN performance service must be turned on.
If the value reaches above the threshold that you have set, vSAN network diagnostics raises an alarm. You must
acknowledge and manually reset the triggered alarms to green after fixing the network issues.
1. Navigate to the host in the vSAN cluster.
2. Click the Monitor tab.
3. Under vSAN, select Performance.
4. Select Physical Adapters, and select a NIC. Select a time range for your query. vSAN displays performance charts
for the physical NIC (pNIC), including throughput, packets per second, and packets loss rate.
5. Open the Threshold settings dialog box and enter a threshold value to receive warning and error alerts.
6. Click Save.

vSAN displays the performance statistics of all the network I/Os in use. The vSAN network diagnostics results appear in the vCenter Server alerts. The vSAN network alerts generated by the network diagnostics service include links to the related performance charts.

Using the VMkernel Observations for Creating vSAN Alarms


VMkernel Observations (VOBs) are system events that you can use to set up vSAN alarms.


vSAN alarms are used for monitoring and troubleshooting performance and networking issues in the vSAN cluster. In
vSAN, these events are known as observations.

VMware ESXi Observation IDs for vSAN


Each VOB event is associated with an identifier (ID). Before you create a vSAN alarm in the vCenter Server, you must identify an appropriate VOB ID for the vSAN event for which you want to create an alert. The VOB IDs are recorded in the VMware ESXi Observation Log file (vobd.log). For example, use the following VOB IDs to create alerts for any device failures in the cluster.
• esx.problem.vob.vsan.lsom.diskerror
• esx.problem.vob.vsan.pdl.offline
To review the list of VOB IDs for vSAN, open the vobd.log file located on your ESXi host in the /var/log directory. The
log file contains the following VOB IDs that you can use for creating vSAN alarms.

Table 31: VOB IDs for vSAN

VOB ID Description

esx.audit.vsan.clustering.enabled The vSAN clustering service is enabled.


esx.clear.vob.vsan.pdl.online The vSAN device has come online.
esx.clear.vsan.clustering.enabled The vSAN clustering service is enabled.
esx.clear.vsan.vsan.network.available vSAN has one active network configuration.
esx.clear.vsan.vsan.vmknic.ready A previously reported vmknic has acquired a valid IP.
esx.problem.vob.vsan.lsom.componentthreshold vSAN reaches the near node component count limit.
esx.problem.vob.vsan.lsom.diskerror A vSAN device is in a permanent error state.
esx.problem.vob.vsan.lsom.diskgrouplimit vSAN fails to create a disk group.
esx.problem.vob.vsan.lsom.disklimit vSAN fails to add devices to a disk group.
esx.problem.vob.vsan.lsom.diskunhealthy vSAN disk is unhealthy.
esx.problem.vob.vsan.pdl.offline A vSAN device is offline.
esx.problem.vsan.clustering.disabled vSAN clustering services are not enabled.
esx.problem.vsan.lsom.congestionthreshold vSAN device memory or SSD congestion has been updated.
esx.problem.vsan.net.not.ready A vmknic is added to vSAN network configuration without a valid IP address.
This happens when the vSAN network is not ready.
esx.problem.vsan.net.redundancy.lost The vSAN network configuration does not have the required redundancy.
esx.problem.vsan.no.network.connectivity vSAN does not have existing networking configuration, which is in use.
esx.problem.vsan.vmknic.not.ready A vmknic is added to the vSAN network configuration without a valid IP address.
esx.problem.vob.vsan.lsom.devicerepair The vSAN device is offline and in a repaired state because of I/O failures.
esx.problem.vsan.health.ssd.endurance One or more vSAN disks exceed the warning usage of estimated endurance
threshold.
esx.problem.vsan.health.ssd.endurance.error A vSAN disk exceeds the estimated endurance threshold.
esx.problem.vsan.health.ssd.endurance.warning A vSAN disk exceeds 90% of its estimated endurance threshold.
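To see which of these observations a host has actually logged, you can search the vobd.log file directly; a minimal sketch, run from an SSH session on the ESXi host:
# List recent vSAN-related observations
grep -i vsan /var/log/vobd.log | tail -20
# Check specifically for device failure events
grep esx.problem.vob.vsan.lsom.diskerror /var/log/vobd.log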

Creating a vCenter Server Alarm for a vSAN Event


You can create alarms to monitor events on the selected vSAN object, including the cluster, hosts, datastores, networks,
and virtual machines.


You must have the required privilege level of Alarms.Create Alarm or Alarm.Modify Alarm.
1. Navigate to the vSAN cluster.
2. On the Configure tab, select Alarm Definitions and click Add.
3. In the Name and Targets page, enter a name and description for the new alarm.
4. From the Target type drop-down menu, select the type of inventory object that you want this alarm to monitor and
click Next.
Depending on the type of target that you choose to monitor, the summary that follows the Targets section changes.
5. In the Alarm Rule page, select a trigger from the drop-down menu.
The combined event triggers are displayed. You can set the rule for a single event only. You must create multiple rules
for multiple events.
6. Click Add Argument to select an argument from the drop-down menu.
a) Select an operator from the drop-down menu.
b) Select an option from the drop-down menu to set the threshold for triggering an alarm.
c) Select severity of the alarm from the drop-down menu. You can set the condition to either Show as Warning or
Show as Critical, but not for both. You must create a separate alarm definition for warning and critical status.
7. Select Send email notifications, to send email notifications when alarms are triggered.
8. In the Email to text box, enter recipient addresses. Use commas to separate multiple addresses.
9. Select Send SNMP traps to send traps when alarms are triggered on a vCenter Server instance.
10. Select Run script to run scripts when alarms are triggered.
11. In the Run this script text box, enter the following script or command:
For this type of command... Enter this...
EXE executable files: Full pathname of the command. For example, to run the cmd.exe command in the C:\tools directory, type: c:\tools\cmd.exe
BAT batch file: Full pathname of the command as an argument to the c:\windows\system32\cmd.exe command. For example, to run the cmd.bat command in the C:\tools directory, type: c:\windows\system32\cmd.exe /c c:\tools\cmd.bat

12. Select an advanced action from the drop-down menu. You can define the advanced actions for virtual machines and hosts. You can add multiple advanced actions for an alarm.
13. Click Next to set the Reset Rule.
14. Select Reset the alarm to green and click Next to review the alarm definition.
15. Select Enable this alarm to enable the alarm and click Create.

The alarm is configured.

Monitoring vSAN Skyline Health


You can check the overall health of the vSAN cluster, including hardware compatibility and networking configuration and
operations.
You can also check the advanced vSAN configuration options, storage device health, and virtual machine object health.


About the vSAN Skyline Health


Use the vSAN Skyline health to monitor the health of your vSAN cluster.
You can use the vSAN Skyline health to monitor the status of cluster components, diagnose issues, and troubleshoot
problems. The health findings cover hardware compatibility, network configuration and operation, advanced vSAN
configuration options, storage device health, and virtual machine objects.

You can use Overview to monitor the core health issues of your vSAN cluster. You can also view the following:
• Cluster health score based on the health findings
• View the health score trend for 24 hours
• View the health score trend for a particular period
Ensure that the Historical Health Service is enabled to view details of the Health score trend. Click View Details in the
Health score trend chart to examine the health state of the cluster for a selected time point within 24 hours. Use Custom
to customize the time range as per your requirement.
You can use the vSAN Health findings to diagnose issues, troubleshoot problems, and remediate the problems.
The health findings are classified as follows:
• Unhealthy – One or more critical or important issues are detected that need attention.
• Healthy – No issues that need attention are found.
• Info – Findings that might not impact the cluster's running state but are important for awareness.
• Silenced – Findings that have been intentionally silenced so that they do not trigger vSAN health alarms.
To troubleshoot an issue, you can sort the findings by root cause to resolve the primary issues initially and then verify if
the impacted issues can also be resolved.
vSAN periodically retests each health finding and updates the results. To run the health findings and update the results
immediately, click the Retest button.


If you participate in the Customer Experience Improvement Program (CEIP), you can run health findings and send
the data to VMware for advanced analysis. Click Retest with Online health and then click OK. Online notifications is
enabled by default if the vCenter Server can connect to VMware Analytics Cloud without enrolling CEIP. If you do not
want to participate in CEIP, you can still receive vSAN health notifications for software and hardware issues using Online
notifications.

Monitoring vSAN Health on a Host


The ESXi host client is a browser-based interface for managing a single ESXi host. It enables you to manage the host
when vCenter Server is not available. The host client provides tabs for managing and monitoring vSAN at the host level.
• The vSAN tab displays basic vSAN configuration.
• The Hosts tab displays the hosts participating in the vSAN cluster.
• The Health tab displays host-level health findings.
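Host-level health can also be queried without a browser; a minimal sketch using esxcli on the host, available in recent ESXi releases:
# List the vSAN health checks and their status as reported by this host
esxcli vsan health cluster list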

Viewing vSAN Health History


The vSAN health history helps you examine health issues by querying the historical health records. You can only view
the historical health data of a cluster. By default, the health history is enabled. To deactivate the health history, select
the cluster and navigate to the Configure > vSAN > Services > Historical Health Service and click Disable. If you
deactivate the health history, all the health data collected on the vCenter Server database gets purged. The database
stores the health data for up to 30 days depending on the available capacity.
Using the Skyline Health view, you can view the health history for a selected time range. The start date of the time range
must not be earlier than 30 days from the current date. The end date must not be later than the current date. Based
on your selection, you can view the historical health findings. Click View History Details to view the history of a health
finding within a selected time period. The historical data is displayed as a graphical representation with green circles,
yellow triangles, and red squares showing success, warning, and failure respectively. The detailed information about each
health finding result is displayed in a table.

Using vSAN Support Insight


vSAN support insight is a platform that helps you maintain a reliable and consistent compute, storage, and network
environment. VMware support uses the vSAN support insight to monitor the vSAN performance diagnostics and resolve
performance issues. vSAN uses Customer Experience Improvement Program (CEIP) to send data to VMware for analysis
on a regular basis. To deactivate CEIP, select vSphere Client > Administration > Customer Experience Improvement
Program > Leave Program.

Check vSAN Skyline Health


You can view the status of vSAN health findings to verify the configuration and operation of your vSAN cluster.
1. Navigate to the vSAN cluster.
2. Click the Monitor tab.
3. Under vSAN, select Skyline Health to review the vSAN health finding.
4. Under Health findings, perform the following:
• Click Unhealthy to view the issues and the details. Click Troubleshoot to troubleshoot and fix an issue. You can
sort the findings by root cause to resolve the primary issues and then verify if the impacted issues can be resolved.
• Click View History Details to identify the status history of the health finding for a particular time period. The
default time period is 24 hours. You can also customize the time period as per your requirement. The status of an
unhealthy finding is displayed in yellow or red. Click the Ask VMware button to open a knowledge base article that


describes the health finding and provides information about how to resolve the issue. You can also view the status
history of the health finding for a given period using History Details tab.
• You can click Silence alert on a health finding, so it does not display any warnings or failures.
• Click Healthy to view health findings that are healthy. Click View Current Result to view the current status of the
health finding. Click View History Details to identify the status history of the health finding for a particular time
period. The status is displayed in green. You can also view the status history of the health finding for a given period
using History Details tab.

Monitor vSAN from ESXi Host Client


You can monitor vSAN health and basic configuration through the ESXi host client.
1. Open a browser and enter the IP address of the host.
The browser redirects to the login page for the host client.
2. Enter the username and password for the host, and click Login.
3. In the host client navigator, click Storage.
4. In the main page, click the vSAN datastore to display the Monitor link in the navigator.
5. Click the tabs to view vSAN information for the host.
a) Click the vSAN tab to display basic vSAN configuration.
b) Click the Hosts tab to display the hosts participating in the vSAN cluster.
c) Click the Health tab to display host-level health findings.
6. (Optional) On the vSAN tab, click Edit Settings to correct configuration issues at the host level.
Select the values that match the configuration of your vSAN cluster, and click Save.

Proactive Tests on vSAN Cluster


You can initiate a health test on your vSAN cluster to verify that the cluster components are working as expected.
NOTE
You must not conduct the proactive test in a production environment as it creates network traffic and impacts the
vSAN workload.
Run the VM creation test to verify the vSAN cluster health. Running the test creates a virtual machine on each host in the
cluster. The test creates a VM and deletes it. If the VM creation and deletion tasks are successful, assume that the cluster
components are working as expected and the cluster is functional.
Run the Network performance test to detect and diagnose connectivity issues, and to make sure the network bandwidth between the hosts supports the requirements of vSAN. The test is performed between the hosts in the cluster. It verifies the network bandwidth between hosts, and reports a warning if the bandwidth is less than 850 Mbps. You can run the proactive test at a maximum speed limit of 10 Gbps. In vSAN ESA, the proactive test reports an error when the result is zero bps, and the Health Status displays the test results as info when the result is a non-zero number.
To access a proactive test, select your vSAN cluster in the vSphere Client, and click the Monitor tab. Click vSAN >
Proactive Tests.

Managing Proactive Hardware


vSAN Proactive Hardware Management (PHM) informs you of any dying disks based on disk predictive failure events
generated by the Original Equipment Manufacturer (OEM) vendor.
Based on this information provided, you can take the necessary remediation. PHM resides within the vSAN management
service on the vCenter Server. The Hardware Support Manager (HSM) is registered with the vCenter Server. PHM collects
vendor hardware information from HSM and sends it to vSAN.


About Hardware Support Managers


The deployment method and the management of a hardware support manager are determined by the respective OEM
vendor.
Several of the major OEM vendors develop and supply hardware support managers. For example:
• Dell - The hardware support manager that Dell provides is part of their host management solution, OpenManage
Integration for VMware vCenter (OMIVV), which you deploy as an appliance.
• HPE - The hardware support managers that HPE provides are part of their management tools, iLO Amplifier and
OneView, which you deploy as appliances.
• Lenovo - The hardware support manager that Lenovo provides is part of their server management solution, Lenovo
XClarity Integrator for VMware vCenter, which you deploy as an appliance.
You can find the full list of all VMware-certified hardware support managers in the VMware Compatibility Guide at https://
www.vmware.com/resources/compatibility/search.php?deviceCategory=hsm.

Deploying and Configuring Hardware Support Managers


Regardless of the hardware vendor, you must deploy the hardware support manager appliance on a host with sufficient
memory, storage, and processing resources.
Typically, hardware support manager appliances are distributed as OVF or OVA templates. You can deploy them on any
host in any vCenter Server instance.
After you deploy the appliance, you must power on the appliance virtual machine and register the appliance as a vCenter
Server extension. You might need to log in to the appliance as an administrator. Depending on the vendor, a hardware
support manager might register with only one vCenter Server system or with multiple vCenter Server systems.
A vCenter Server plug-in user interface might become available in the vSphere Client after you deploy a hardware
support manager appliance, but the hardware support manager might also have a separate user interface of its own. For
example, OMIVV, iLO Amplifier, and Lenovo XClarity Integrator for VMware vCenter all have a vCenter Server plug-in
user interface, which helps you configure and work with the respective hardware support manager.
For detailed information about deploying, configuring, and managing hardware support managers, refer to the respective
OEM-provided documentation.

Registering Hardware Support Manager


You must register the HSM with PHM, which resides within the vSAN management service on the vCenter Server, by
using the vendor management service.
For detailed information about registering hardware support managers, refer to the respective OEM-provided
documentation.

Associating and Dissociating Hosts


After registering the HSM with PHM, you must associate the appropriate hosts available in the vCenter Server with the
HSM. This enables PHM on each host. The HSM informs PHM of any change in the managed host list, and PHM
associates the managed hosts available in a vSAN cluster. When a host is associated with or dissociated from PHM, a
vCenter Server event is generated. For detailed information about associating and dissociating hosts, refer to the
respective OEM-provided documentation.


Processing Hardware Failures


PHM checks for HSM generated hardware failure events every 10 minutes.
You can customize the time interval using the vSAN configuration file.
1. Log in to vCenter Server console as root.
2. Open the /usr/lib/vsan-health/VsanVcMgmtConfig.xml file.
3. Set the interval value using healthUpdatePollIntervalInSeconds xml tag.
4. Restart the vSAN Health service.
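For example, assuming you want PHM to poll every 5 minutes instead of every 10, the relevant tag in
/usr/lib/vsan-health/VsanVcMgmtConfig.xml can be inspected and edited as sketched below. The value 300 is only
illustrative, and the exact position of the tag within the file can differ between vCenter Server versions:

# Locate the current polling interval in the vSAN management configuration file
grep healthUpdatePollIntervalInSeconds /usr/lib/vsan-health/VsanVcMgmtConfig.xml
# After editing, the tag would look similar to the following (300 seconds = 5 minutes):
# <healthUpdatePollIntervalInSeconds>300</healthUpdatePollIntervalInSeconds>

Restart the vSAN health service as described in step 4 so that the new interval takes effect.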

PHM uses these events to generate alarms, which appear in vSAN Skyline Health. For more information on the
vSAN Skyline Health events, see the VMware knowledge base article at https://knowledge.broadcom.com/external/
article?articleNumber=367770.

Monitoring vSAN Performance


You can monitor the performance of your vSAN cluster.
Performance charts are available for clusters, hosts, physical disks, virtual machines, and virtual disks.

About the vSAN Performance Service


You can use vSAN performance service to monitor the performance of your vSAN environment, and investigate potential
problems.
The performance service collects and analyzes performance statistics and displays the data in a graphical format. You
can use the performance charts to manage your workload and determine the root cause of problems.

When the vSAN performance service is turned on, the cluster summary displays an overview of vSAN performance
statistics, including IOPS, throughput, and latency. You can view detailed performance statistics for the cluster, and for
each host, disk group, and disk in the vSAN cluster. You also can view performance charts for virtual machines and virtual
disks.


Configure vSAN Performance Service


Use the vSAN Performance Service to monitor the performance of vSAN clusters, hosts, disks, and VMs.
• All hosts in the vSAN cluster must be running ESXi 7.0 or later.
• Before you configure the vSAN Performance Service, make sure that the cluster is properly configured and has no
unresolved health problems.
NOTE
When you create a vSAN OSA cluster, the Performance Service is optional; you can enable and configure it at any
time. When you create a vSAN ESA cluster, the Performance Service is enabled by default, and you can then
configure it.
To support the Performance Service, vSAN uses a Stats database object to collect statistical data. The Stats database is
a namespace object in the cluster's vSAN datastore.
1. Navigate to the vSAN cluster.
2. Click the Configure tab.
3. Under vSAN, select Services.
4. (Optional for vSAN ESA cluster.) Click the Performance Service Enable button.
5. (Optional for vSAN ESA cluster.) In vSAN Performance Service Settings, select a storage policy for the stats database
object.
6. (Optional for vSAN ESA cluster.) Click Enable to enable vSAN Performance Service.
7. Click Edit if you want to select a different storage policy in the vSAN Performance Service Settings.
8. (Optional) Select the check box to enable verbose mode. This check box appears only after you enable the vSAN
Performance Service. When enabled, vSAN collects and saves additional performance metrics to the Stats DB object.
If you enable verbose mode for more than 5 days, a warning message appears indicating that verbose mode can be
resource-intensive. Do not leave it enabled for an extended period.
9. (Optional) Select the check box to enable the network diagnostic mode. This check box appears only after you enable
the vSAN Performance Service. When enabled, vSAN collects and saves additional network performance metrics to a
RAM disk stats object. If you enable the network diagnostic mode for more than a day, a warning message appears
indicating that the network diagnostic mode can be resource-intensive. Do not leave it enabled for an extended
duration.
10. Click Apply.

Use Saved Time Range in vSAN Cluster


You can select saved time ranges from the time range picker in performance views.
• The vSAN performance service must be turned on.
• All hosts in the vSAN cluster must be running ESXi 7.0 or later.
You can manually save a time range with a customized name. When you run a storage performance test, the selected time
range is saved automatically. You can save a time range for any of the performance views.
1. Navigate to the vSAN cluster.
2. Click the Monitor tab and click Performance.
3. Select any tab, such as Backend. In the time range drop-down, select Save.
4. Enter a name for the selected time range.
5. Confirm your changes.
You can save the selected time range at the VM and the host level.

View vSAN Cluster Performance


You can use the vSAN cluster performance charts to monitor the workload in your cluster and determine the root cause of
problems.
The vSAN performance service must be turned on before you can view performance charts.
When the performance service is turned on, the cluster summary displays an overview of vSAN performance statistics,
including vSAN IOPS, throughput, and latency. At the cluster level, you can view detailed statistical charts for virtual
machine consumption and the vSAN back end.
NOTE
• To view iSCSI performance charts, all hosts in the vSAN cluster must be running ESXi 7.0 or later.
• To view file service performance charts, you must enable vSAN File Service.
• To view vSAN Direct performance charts, you must claim disks for vSAN Direct.


• To view PMem performance charts, you must have PMem storage attached to the hosts in the cluster.
1. Navigate to the vSAN cluster.
2. Click the Monitor tab.
3. Under vSAN, select Performance.
4. Select Top Contributors. Based on the I/O latency graph of the cluster, you can select a timestamp and get the top
contributors with latency statistics. You can also select a single contributor and view the latency graph. You can
switch between the combined view and the table view.
5. Select VM.
Perform one of the following:
• Select Cluster level metrics to display the aggregated performance metrics for the cluster that you selected.
• Select Show specific VMs to display metrics for all the VMs selected. If you enable Show separate chart by
VMs, vSAN displays separate metrics for all the VMs selected.
Select a time range for your query. vSAN displays performance charts for clients running on the cluster, including
IOPS, throughput, latency, congestions, and outstanding I/Os. The statistics on these charts are aggregated from
the hosts within the cluster. You can also select Real-Time as the time range that displays real-time data that is
automatically refreshed every 30 seconds. The real-time statistics data is retained in the SQL database for seven days
until it gets purged.
6. Select Backend. Select a time range for your query. vSAN displays performance charts for the cluster back-end
operations, including IOPS, throughput, latency, congestions, and outstanding I/Os. The statistics on these charts are
aggregated from the hosts within the cluster.
7. Select File Share and choose a file. Select a time range for your query. Select NFS performance or File system
performance based on the protocol layer performance or file system layer performance that you want to display.
vSAN displays performance charts for vSAN file services, including IOPS, throughput, and latency.
8. Select iSCSI and select an iSCSI target or LUN. Select a time range for your query. vSAN displays performance charts
for iSCSI targets or LUNs, including IOPS, bandwidth, latency, and outstanding I/O.
9. (Optional) Select I/O Insight. For more information on I/O Insight, see Use vSAN I/O Insight.
10. Select vSAN Direct to display the performance data of the vSAN direct disks. Select a time range for your query.
vSAN displays performance charts for vSAN direct, including IOPS, bandwidth, latency, and outstanding I/O.
11. Select PMEM to display the performance data of all VMs placed on the PMem storage. Select a time range for your
query. You can also select Real-time as the time range that displays real time data that is automatically refreshed
every 30 seconds. PMem displays performance charts including IOPS, bandwidth, and latency. For more information
about PMem metrics collection settings, see https://kb.vmware.com/s/article/89100.
12. Click Refresh or Show Results to update the display.

View vSAN Host Performance


You can use the vSAN host performance charts to monitor the workload on your hosts and determine the root cause of
problems.
The vSAN performance service must be turned on before you can view performance charts.
To view the following performance charts, hosts in the vSAN cluster must be running ESXi 7.0 or later: Physical Adapters,
VMkernel Adapters, VMkernel Adapters Aggregation, iSCSI, vSAN - Backend resync I/Os, resync IOPS, resync
throughput, Disk Group resync latency.
You can view vSAN performance charts for hosts, disk groups, and individual storage devices. When the performance
service is turned on, the host summary displays performance statistics for each host and its attached disks. At the host
level, you can view detailed statistical charts for virtual machine consumption and the vSAN back end, including IOPS,
throughput, latency, and congestion. Additional charts are available to view the local client cache read IOPS and hit rate.


At the disk group level, you can view statistics for the disk group. At the disk level, you can view statistics for an individual
storage device.
1. Navigate to the vSAN cluster, and select a host.
2. Click the Monitor tab.
3. Under vSAN, select Performance.
4. Select VM.
• Select Host level metrics to display the aggregated performance metrics for the host that you selected.
• Select Show specific VMs to display metrics for all the VMs selected on the host. If you enable Show separate
chart by VMs, vSAN displays separate metrics for all the VMs selected on the host.
Select a time range for your query. vSAN displays performance charts for clients running on the host, including IOPS,
throughput, latency, congestions, and outstanding I/Os. You can also select Real-Time as the time range that displays
real-time data that is automatically refreshed every 30 seconds. The real-time statistics data is retained in the SQL
database for seven days until it gets purged.
5. In vSAN ESA, select Backend Cache. Select a time range for your query. vSAN displays the performance charts for
the backend cache operations of the host, including the overall backend cache statistics, the overall cache misses by
the different types, cache misses by type for the different transactions, and the cache latency for the different
transactions.
6. Select Backend. Select a time range for your query. vSAN displays performance charts for the host back-end
operations, including IOPS, throughput, latency, congestions, outstanding I/Os, and resync I/Os.
7. Perform one of the following:
• Select Disks, and select a disk group. Select a time range for your query. vSAN displays performance charts for
the disk group, including front end (Guest) IOPS, throughput, and latency, as well as overhead IOPS and latency.
It also displays the read-cache hit rate, evictions, write-buffer free percentage, capacity and usage, cache disk
destage rate, congestions, outstanding I/O, outstanding I/O size, delayed I/O percentage, delayed I/O average
latency, internal queue IOPS, internal queue throughput, resync IOPS, resync throughput, and resync latency.
• In vSAN ESA, select Disks, and then select a disk. Select a time range for your query. vSAN displays performance
charts for the disk, including vSAN layer IOPS, throughput, and latency. It also displays the physical or firmware
layer IOPS, throughput, and latency.
8. Select Physical Adapters, and select a NIC. Select a time range for your query. vSAN displays performance charts
for the physical NIC (pNIC), including throughput, packets per second, and packets loss rate.
9. Select Host Network, and select a VMkernel adapter, such as vmk1. Select a time range for your query. vSAN
displays performance charts for all network I/Os processed in the network adapters used by vSAN, including
throughput, packets per second, and packets loss rate.
10. Select iSCSI. Select a time range for your query. vSAN displays performance charts for all the iSCSI services on the
host, including IOPS, bandwidth, latency, and outstanding I/Os.
11. (Optional) Select I/O Insight. For more information on I/O Insight, see Use vSAN I/O Insight.
12. Select vSAN Direct to display the performance data of the vSAN direct disks. Select a time range for your query.
vSAN displays performance charts for vSAN direct, including IOPS, bandwidth, latency, and outstanding I/O.
13. Select PMEM to display the performance data of all VMs placed on the PMem storage. Select a time range for your
query. You can also select Real-time as the time range that displays real time data that is automatically refreshed


every 30 seconds. PMem displays the performance charts including IOPS, bandwidth, and latency. For more
information about PMem metrics collection settings, see https://kb.vmware.com/s/article/89100.
14. Click Refresh or Show Results to update the display.

View vSAN VM Performance


You can use the vSAN VM performance charts to monitor the workload on your virtual machines and virtual disks.
The vSAN performance service must be turned on before you can view performance charts.
When the performance service is turned on, you can view detailed statistical charts for virtual machine performance and
virtual disk performance. VM performance statistics cannot be collected during migration between hosts, so you might
notice a gap of several minutes in the VM performance chart.
NOTE
The performance service supports only virtual SCSI controllers for virtual disks. Virtual disks using other
controllers, such as IDE, are not supported.
1. Navigate to the vSAN cluster, and select a VM.
2. Click the Monitor tab.
3. Under vSAN, select Performance.
4. Select VM. Select a time range for your query. vSAN displays performance charts for the VM, including IOPS,
throughput, and latency.
5. Select Virtual Disk. Select a time range for your query. vSAN displays performance charts for the virtual disks,
including IOPS, delayed normalized IOPS, virtual SCSI IOPS, virtual SCSI throughput, and virtual SCSI latency. The
virtual SCSI latency performance charts display a highlighted area due to the IOPS limit enforcement.
6. (Optional) In the Virtual Disk, click New I/O Insight Instance. For more information on I/O Insight, see Use vSAN I/O
Insight.
7. Click Refresh or Show Results to update the display.

Use vSAN I/O Insight


I/O Insight allows you to select and view I/O performance metrics of virtual machines in a vSAN cluster.
By understanding the I/O characteristics of VMs, you can ensure better capacity planning and performance tuning.
1. Navigate to the vSAN cluster or host.
You can also access I/O Insight from the VM. Select the VM and navigate to Monitor > vSAN > Performance >
Virtual Disks.


2. Click the Monitor tab.


3. Under vSAN, select Performance.
4. Select the I/O Insight tab and click New Instance.
5. Select the required hosts or VMs that you want to monitor. You can also search for VMs.
6. Click Next.
7. Enter a name and select a duration.
8. Click Next and review the instance information.
9. Click Finish.
I/O Insight instance monitors the selected VMs for the specified duration. However, you can stop an instance before
completion of the duration that you specified.
NOTE
VMs monitored by I/O Insight must not be vMotioned. vMotion stops the VMs from being monitored and will
result in an unsuccessful trace.

vSAN displays performance charts for the VMs in the cluster, including IOPS, throughput, I/O size distribution, I/O latency
distribution, and so on.
You can view metrics for the I/O Insight instance that you created.

View vSAN I/O Insight Metrics


I/O Insight performance metrics chart displays the metrics at the virtual disk level.
When I/O Insight is running, vSAN collects and displays the metrics for selected VMs, for a set duration. You can view the
performance metrics for up to 90 days. The I/O Insight instances are automatically deleted after this period.
1. Navigate to the vSAN cluster or host.
You can also access I/O Insight from the VM. Select the VM and navigate to Monitor > vSAN > Performance >
Virtual Disks.


2. Click the Monitor tab.


3. Under vSAN, select Performance.
4. Select the I/O Insight tab. You can organize the instances based on time or hosts.
5. To view the metrics of an instance, click the options menu for that instance and click View Metrics. You can optionally
stop a running instance before it completes the specified duration.


You can rerun an instance, and rename or delete the existing instances.

Use vSAN I/O Trip Analyzer


You can use vSAN I/O trip analyzer to diagnose the virtual machine I/O latency issues.
The vSAN performance service must be enabled before you can run the I/O trip analyzer and view the test results.
vSAN latency issues can be caused by outstanding I/Os, network hardware issues, network congestions, or disk
slowness. The trip analyzer allows you to get the breakdown of the latencies at each layer of the vSAN stack. The
topology diagram shows only the hosts with VM I/O traffic.

NOTE
All the ESXi hosts and vCenter Server in the vSAN cluster must be running 7.0 Update 3 or later.
Using the I/O trip analyzer scheduler (available in 8.0 Update 1 or later), you can set the recurrence for I/O trip analyzer
diagnostic operations. You can run the analysis once or set it to recur at a later time. When the scheduled time is
reached, the scheduler automatically collects the results. You can view the results collected within the last 30 days.


NOTE
The I/O trip analyzer supports stretched cluster and multiple VMs (maximum 8 VMs and 64 VMDKs) in one
diagnostic run for a single cluster.
1. Navigate to the vSAN cluster, and select a VM.
2. Click the Monitor tab.
3. Under vSAN, select I/O Trip Analyzer.
4. Click Run New Test.
5. In the Run VM I/O Trip Analyzer Test, select the duration of the test.
6. (Optional) Select Schedule for a future time to schedule the test for a later time. You can either select Start now or
enter a time based on your requirement in the Custom time field. Select the repeat options and click Schedule.
NOTE
You can schedule only a single I/O trip analyzer per cluster. You can schedule another I/O trip analyzer after
deleting the current scheduler. To delete a scheduler, click Schedules > Delete. You can also modify a
schedule that you created. Click Schedules > Edit.
7. Click RUN. The trip analyzer test data is persisted and is available only for 30 days.
NOTE
vSAN does not support I/O trip analyzer for virtual disks in a remote vSAN datastore.
8. Click VIEW RESULT to view the visualized I/O topology.
9. From the Virtual Disks drop-down, select the disk for which you want to view the I/O topology. You can also view the
performance details of the network and the disk groups. Click the edge points of the topology to view the latency
details. If there is a latency issue, click the red icon to focus on that area.

View vSAN Performance Metrics for Support Cases


Use the vSAN cluster performance metrics to monitor the performance of your cluster and determine the root cause of the
performance issues.
The vSAN performance service must be turned on before you can view performance charts.
You can use the vSAN Obfuscation Map to identify the obfuscated data sent to VMware. For more information on the
obfuscation map, see View vSAN Obfuscation Map.
1. Navigate to the vSAN cluster.
2. Click the Monitor tab.
3. Under vSAN, select Support > Performance For Support.
4. Select a performance dashboard from the drop-down menu.
5. Select hosts, disks, or NICs from the drop-down menu.
6. Select a time range for your query.
The default time range is the most recent hour. You can increase the range to include the last 24 hours, or define a
custom time range within the last 90 days. If you used the HCIbench tool to run performance benchmark tests on the
vSAN cluster, the time ranges of those tests appear in the drop-down menu.
7. Click Show Results.
vSAN displays performance charts for selected entities, such as IOPS, throughput, latency, congestions, and
outstanding I/Os.


Using vSAN Performance Diagnostics


You can use vSAN performance diagnostics to improve the performance of your vSAN OSA cluster, and resolve
performance issues.
• The vSAN performance service must be turned on.
• vCenter Server requires Internet access to download ISO images and patches and to send data to VMware to analyze
vSAN performance data.
• You must participate in the Customer Experience Improvement Program (CEIP).
The vSAN performance diagnostics tool analyzes previously run benchmarks gathered from the vSAN performance
service. It can detect issues, suggest remediation steps, and provide supporting performance graphs for further insight.
The vSAN performance service provides the data used to analyze vSAN performance diagnostics. vSAN uses CEIP to
send data to VMware for analysis.
NOTE
Do not use vSAN performance diagnostics for general evaluation of performance on a production vSAN cluster.
1. Navigate to the vSAN cluster.
2. Click the Monitor tab.
3. Under vSAN, select Performance Diagnostics.
4. Select a benchmark goal from the drop-down menu.
You can select a goal based on the performance improvement that you want to achieve, such as maximum IOPS,
maximum throughput, or minimum latency.
5. Select a time range for your query.
The default time range is the most recent hour. You can increase the range to include the last 24 hours, or define a
custom time range within the last 90 days. If you used the HCIbench tool to run performance benchmark tests on the
vSAN cluster, the time ranges of those tests appear in the drop-down menu.
6. Click Show Results.

When you click Show Results, vSAN transmits performance data to the vSphere backend analytics server. After
analyzing the data, the vSAN performance diagnostics tool displays a list of issues that might have affected the
benchmark performance for the chosen goal.
You can click to expand each issue to view more details about each issue, such as a list of affected items. You also can
click See More or Ask VMware to display a Knowledge Base article that describes recommendations to address the issue
and achieve your performance goal.

View vSAN Obfuscation Map


You can use the vSAN Obfuscation Map to identify the obfuscated data sent to VMware.
vSAN Obfuscation Map provides mapping of the obfuscated data sent to VMware as part of Customer Experience
Improvement Program (CEIP) to facilitate communication during the Support Request process between vSAN user and
VMware Global Support. Use notepad or any text editor to view the obfuscation map. For more information on obfuscation
map, see the VMware knowledge base article at https://kb.vmware.com/s/article/51120.

Handling Failures and Troubleshooting vSAN


If you encounter problems when using vSAN, you can use troubleshooting topics.
The topics help you understand the problem and offer you a workaround, when it is available.


Uploading a vSAN Support Bundle


You can upload a vSAN support bundle so VMware service personnel can analyze the diagnostic information.
VMware Technical Support routinely requests diagnostic information from your vSAN cluster when a support request is
addressed. The support bundle is an archive that contains diagnostic information related to the environment, such as
product specific logs, configuration files, and so on.
The log files, collected and packaged into a zip file, include the following:
• vCenter support bundle
• Host support bundle
The host support bundle in the cluster includes the following:
["Userworld:HostAgent", "Userworld:FDM",
"System:VMKernel", "System:ntp", "Storage:base", "Network:tcpip",
"Network:dvs", "Network:base", "Logs:System", "Storage:VSANMinimal",
"Storage:VSANHealth", "System:BaseMinmal", "Storage:VSANTraces"]

vSAN performs an automated upload of the support bundle, and does not allow you to review, obfuscate, or otherwise edit
the contents of your support data prior to it being sent to VMware. vSAN connects to the FTP port 21 or HTTPS port 443
of the target server with the domain name vmware.com, to automatically upload the support bundle.
NOTE
Data collected in the support bundle may be considered sensitive. If your support data contains regulated data,
such as personal, health care, or financial data, you may want to avoid uploading the support bundle.
1. Right-click the vSAN cluster in the vSphere Client.
2. Choose menu vSAN > Upload support bundle...
3. Enter your service request ID and a description of your issue.
4. Click Upload.
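If you prefer to collect and review a host-level bundle manually before sending anything to support, you can generate one
directly on an ESXi host. This is a minimal sketch, assuming SSH or the ESXi Shell is enabled; the command prints the
location of the resulting archive when it finishes:

# Generate a standard ESXi host support bundle from the host shell
vm-support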

Using Esxcli Commands with vSAN


Use Esxcli commands to obtain information about vSAN OSA or vSAN ESA and to troubleshoot your vSAN environment.
The following commands are available:

Command Description

esxcli vsan network list Verify which VMkernel adapters are used for vSAN
communication.
esxcli vsan storage list List storage disks claimed by vSAN.
esxcli vsan storagepool list List storage pool claimed by vSAN ESA. This command is
applicable only for vSAN ESA cluster.
esxcli vsan cluster get Get vSAN cluster information.
esxcli vsan health Get vSAN cluster health status.
esxcli vsan debug Get vSAN cluster debug information.

The esxcli vsan debug commands can help you debug and troubleshoot the vSAN cluster, especially when vCenter
Server is not available.
Use: esxcli vsan debug {cmd} [cmd options]


Debug commands:

Command Description

esxcli vsan debug disk Debug vSAN physical disks.


esxcli vsan debug object Debug vSAN objects.
esxcli vsan debug resync Debug vSAN resyncing objects.
esxcli vsan debug controller Debug vSAN disk controllers.
esxcli vsan debug limit Debug vSAN limits.
esxcli vsan debug vmdk Debug vSAN VMDKs.

Example esxcli vsan debug commands:


esxcli vsan debug disk summary get
Overall Health: green
Component Metadata Health: green
Memory Pools (heaps): green
Memory Pools (slabs): green
esxcli vsan debug disk list
UUID: 52e1d1fa-af0e-0c6c-f219-e5e1d224b469
Name: mpx.vmhba1:C0:T1:L0
SSD: False
Overall Health: green
Congestion Health:
State: green
Congestion Value: 0
Congestion Area: none
In Cmmds: true
In Vsi: true
Metadata Health: green
Operational Health: green
Space Health:
State: green
Capacity: 107365793792 bytes
Used: 1434451968 bytes
Reserved: 150994944 bytes
esxcli vsan debug object health summary get
Health Status Number Of Objects
------------------------------------------------ -----------------
reduced-availability-with-no-rebuild-delay-timer 0
reduced-availability-with-active-rebuild 0
inaccessible 0
data-move 0
healthy 1
nonavailability-related-incompliance 0
nonavailability-related-reconfig 0
reduced-availability-with-no-rebuild 0
esxcli vsan debug object list
Object UUID: 47cbdc58-e01c-9e33-dada-020010d5dfa3
Version: 5
Health: healthy
Owner:


Policy:
stripeWidth: 1
CSN: 1
spbmProfileName: vSAN Default Storage Policy
spbmProfileId: aa6d5a82-1c88-45da-85d3-3d74b91a5bad
forceProvisioning: 0
cacheReservation: 0
proportionalCapacity: [0, 100]
spbmProfileGenerationNumber: 0
hostFailuresToTolerate: 1

Configuration:
RAID_1
Component: 47cbdc58-6928-333f-0c51-020010d5dfa3
Component State: ACTIVE, Address Space(B): 273804165120 (255.00GB),
Disk UUID: 52e95956-42cf-4d30-9cbe-763c616614d5, Disk Name: mpx.vmhba1..
Votes: 1, Capacity Used(B): 373293056 (0.35GB),
Physical Capacity Used(B): 369098752 (0.34GB), Host Name: sc-rdops...
Component: 47cbdc58-eebf-363f-cf2b-020010d5dfa3
Component State: ACTIVE, Address Space(B): 273804165120 (255.00GB),
Disk UUID: 52d11301-1720-9901-eb0a-157d68b3e4fc, Disk Name: mpx.vmh...
Votes: 1, Capacity Used(B): 373293056 (0.35GB),
Physical Capacity Used(B): 369098752 (0.34GB), Host Name: sc-rdops-vm..
Witness: 47cbdc58-21d2-383f-e45a-020010d5dfa3
Component State: ACTIVE, Address Space(B): 0 (0.00GB),
Disk UUID: 52bfd405-160b-96ba-cf42-09da8c2d7023, Disk Name: mpx.vmh...
Votes: 1, Capacity Used(B): 12582912 (0.01GB),
Physical Capacity Used(B): 4194304 (0.00GB), Host Name: sc-rdops-vm...

Type: vmnamespace
Path: /vmfs/volumes/vsan:52134fafd48ad6d6-bf03cb6af0f21b8d/New Virtual Machine
Group UUID: 00000000-0000-0000-0000-000000000000
Directory Name: New Virtual Machine
esxcli vsan debug controller list
Device Name: vmhba1
Device Display Name: LSI Logic/Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ult..
Used By VSAN: true
PCI ID: 1000/0030/15ad/1976
Driver Name: mptspi
Driver Version: 4.23.01.00-10vmw
Max Supported Queue Depth: 127
esxcli vsan debug limit get
Component Limit Health: green
Max Components: 750
Free Components: 748
Disk Free Space Health: green
Lowest Free Disk Space: 99 %
Used Disk Space: 1807745024 bytes
Used Disk Space (GB): 1.68 GB
Total Disk Space: 107365793792 bytes
Total Disk Space (GB): 99.99 GB
Read Cache Free Reservation Health: green


Reserved Read Cache Size: 0 bytes


Reserved Read Cache Size (GB): 0.00 GB
Total Read Cache Size: 0 bytes
Total Read Cache Size (GB): 0.00 GB
esxcli vsan debug vmdk list
Object: 50cbdc58-506f-c4c2-0bde-020010d5dfa3
Health: healthy
Type: vdisk
Path: /vmfs/volumes/vsan:52134fafd48ad6d6-bf03cb6af0f21b8d/47cbdc58-e01c-9e33-
dada-020010d5dfa3/New Virtual Machine.vmdk
Directory Name: N/A
esxcli vsan debug resync list
Object Component Bytes Left To Resync GB Left To Resync
---------------- --------------------- -------------------- -----------------
31cfdc58-e68d... Component:23d1dc58... 536870912 0.50
31cfdc58-e68d... Component:23d1dc58... 1073741824 1.00
31cfdc58-e68d... Component:23d1dc58... 1073741824 1.00

Using vsantop Command-Line Tool


Use the vsantop command-line tool, which runs on ESXi hosts, to view real-time vSAN performance metrics.
You can use this tool to monitor vSAN performance. To display the different performance views and metrics in vsantop,
enter the following commands:

Command Description

^L Redraw screen
Space Update display
h or ? Help; show this text
q Quit
f/F Add or remove fields
o/O Change the order of displayed fields
s Set the delay in seconds between updates
# Set the number of instances to display
E Change the selected entity type
L Change the length of the field
l Limit display to specific node id
. Sort by column, same number twice to change sort order

vSAN Configuration on an ESXi Host Might Fail


In certain circumstances, the task of configuring vSAN on a particular host might fail.
An ESXi host that joins a vSAN cluster fails to have vSAN configured.


If a host does not meet hardware requirements or experiences other problems, vSAN might fail to configure the host. For
example, insufficient memory on the host might prevent vSAN from being configured.
1. Place the host that causes the failure in Maintenance Mode.
2. Move the host out of the vSAN cluster.
3. Resolve the problem that prevents vSAN from being configured on the host.
4. Exit Maintenance Mode.
5. Move the host back into the vSAN cluster.
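Steps 1 and 4 can also be performed from the ESXi shell. The following is a minimal sketch, assuming you want vSAN to
keep objects accessible while the host is in maintenance mode; choose the vSAN data evacuation mode (-m) that fits
your situation:

# Enter maintenance mode and ensure vSAN object accessibility
esxcli system maintenanceMode set -e true -m ensureObjectAccessibility
# ...resolve the problem that blocked the vSAN configuration, then exit maintenance mode
esxcli system maintenanceMode set -e false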

Not Compliant Virtual Machine Objects Do Not Become Compliant Instantly


When you use the Check Compliance button, a virtual machine object does not change its status from Not Compliant to
Compliant even though vSAN resources have become available and satisfy the virtual machine profile.
When you use force provisioning, you can provision a virtual machine object even when the policy specified in the virtual
machine profile cannot be satisfied with the resources available in the vSAN cluster. The object is created, but remains in
the non-compliant status.
vSAN is expected to bring the object into compliance when storage resources in the cluster become available, for
example, when you add a host. However, the object's status does not change to compliant immediately after you add
resources.
This occurs because vSAN regulates the pace of the reconfiguration to avoid overloading the system. The amount of time
it takes for compliance to be achieved depends on the number of objects in the cluster, the I/O load on the cluster and the
size of the object in question. In most cases, compliance is achieved within a reasonable time.

vSAN Cluster Configuration Issues


After you change the vSAN configuration, vCenter Server performs validation checks for vSAN configuration.
Error messages indicate that vCenter Server has detected a problem with vSAN configuration.
NOTE
Validation checks are also performed as a part of a host synchronization process.
If vCenter Server detects any configuration problems, it displays error messages. Use the following methods to fix vSAN
configuration problems.

Table 32: vSAN Configuration Errors and Solutions

Error: Host with the vSAN service enabled is not in the vCenter cluster.
Solution: Add the host to the vSAN cluster.
1. Right-click the host, and select Move To.
2. Select the vSAN cluster and click OK.

Error: Host is in a vSAN enabled cluster but does not have the vSAN service enabled.
Solution: Verify whether the vSAN network is properly configured and enabled on the host. See vSAN Planning and Deployment.

Error: vSAN network is not configured.
Solution: Configure the vSAN network. See vSAN Planning and Deployment.

Error: Host cannot communicate with all other nodes in the vSAN enabled cluster.
Solution: Might be caused by network isolation. See the vSAN Planning and Deployment documentation.

Error: Found another host participating in the vSAN service which is not a member of this host's vCenter cluster.
Solution: Make sure that the vSAN cluster configuration is correct and all vSAN hosts are in the same subnet. See vSAN
Planning and Deployment.


Handling Failures in vSAN


vSAN handles failures of the storage devices, hosts and network in the cluster according to the severity of the failure.
You can diagnose problems in vSAN by observing the performance of the vSAN datastore and network.

Failure Handling in vSAN


vSAN implements mechanisms for indicating failures and rebuilding unavailable data for data protection.
Failure States of vSAN Components

In vSAN, components that have failed can be in absent or degraded state.


According to the component state, vSAN uses different approaches for recovering virtual machine data. vSAN also
provides alerts about the type of component failure. See Using the VMkernel Observations for Creating vSAN Alarms and
Using the vSAN Default Alarms.
vSAN supports two types of failure states for components:

Table 33: Failure States of Components in vSAN

Degraded
Description: A component is in the degraded state if vSAN detects a permanent component failure and assumes that the
component is not going to recover to a working state.
Recovery: vSAN starts rebuilding the affected components immediately.
Cause: Failure of a flash caching device, a magnetic or flash capacity device failure, or a storage controller failure.

Absent
Description: A component is in the absent state if vSAN detects a temporary component failure where the component
might recover and restore its working state.
Recovery: vSAN starts rebuilding absent components if they are not available within a certain time interval. By default,
vSAN starts rebuilding absent components after 60 minutes.
Cause: Lost network connectivity, failure of a physical network adapter, an ESXi host failure, an unplugged flash caching
device, or an unplugged magnetic disk or flash capacity device.
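The 60-minute delay before vSAN starts rebuilding absent components is controlled by the object repair timer. A minimal
sketch for inspecting its current value from the ESXi shell follows; do not change this advanced option without guidance
from VMware support:

# Display the current repair delay, in minutes, for absent components
esxcli system settings advanced list -o /VSAN/ClomRepairDelay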

Examine the Failure State of a Component


You can determine whether a component is in the absent or degraded failure state.
If a failure occurs in the cluster, vSAN marks the components for an object as absent or degraded based on the failure
severity.
1. Navigate to the cluster.
2. On the Monitor tab, click vSAN and select Virtual Objects.
The home directories and virtual disks of the virtual machines in the cluster appear.
3. Select the check box on one of the virtual objects and click View Placement Details to open the Physical Placement
dialog. You can view device information, such as name, identifier or UUID, number of devices used for each virtual
machine, and how they are mirrored across hosts.
If a failure has occurred in the vSAN cluster, the Placement and Availability is equal to Absent or Degraded.


Object States That Indicate Problems in vSAN

Examine the compliance status and the operational state of a virtual machine object to find how a failure in the cluster
affects the virtual machine.

Table 34: Object State

Object State Type Description

Compliance Status The compliance status of a virtual machine object indicates whether it meets the
requirements of the assigned VM storage policy.
Operational State The operational state of an object can be healthy or unhealthy. It indicates the type
and number of failures in the cluster.
An object is healthy if an intact replica is available and more than 50 percent of the
object's votes are still available.
An object is unhealthy if no intact replica is available or less than 50 percent of
the object's votes are available. For example, an object might become unhealthy
if a network failure occurs in the cluster and a host becomes isolated.

To determine the overall influence of a failure on a virtual machine, examine the compliance status and the operational
state. If the operational state remains healthy although the object is noncompliant, the virtual machine can continue using
the vSAN datastore. If the operational state is unhealthy, the virtual machine cannot use the datastore.

Examine the Health of an Object in vSAN


Use the vSphere Client to examine whether a virtual machine is healthy.
A virtual machine is considered as healthy when a replica of the VM object and more than 50 percent of the votes for an
object are available.
1. Navigate to the cluster.
2. On the Monitor tab, click vSAN and select Virtual Objects.
The home directories and virtual disks of the virtual machines in the cluster appear.
3. Select an object type in the Affected inventory objects area at the top of the page to display information about each
object, such as object state, storage policy, and vSAN UUID.
If the inventory object is Unhealthy, the vSphere Client indicates the reason for the unhealthy state in brackets.


Examine the Compliance of a Virtual Machine in vSAN


Use the vSphere Client to examine whether a virtual machine object is compliant with the assigned VM storage policy.
1. Examine the compliance status of a virtual machine.
a) Browse to the virtual machine in the navigator.
b) On the Summary tab, examine the value of the VM Storage Policy Compliance property under VM Storage
Policies.
2. Examine the compliance status of the objects of the virtual machine.
a) Navigate to the cluster.
b) On the Monitor tab, click vSAN and select Virtual Objects.
c) Select an object type in the Affected inventory objects area at the top of the page to display information about
each object, such as object state, storage policy, and vSAN UUID.
d) Select the check box on one of the virtual objects and click View Placement Details to open the Physical
Placement dialog. You can view device information, such as name, identifier or UUID, number of devices used for
each virtual machine, and how they are mirrored across hosts.
e) On the Physical Placement dialog, check the Group components by host placement check box to organize the
objects by host and by disk.
Accessibility of Virtual Machines Upon a Failure in vSAN

If a virtual machine uses vSAN storage, its storage accessibility might change according to the type of failure in the vSAN
cluster.
Changes in the accessibility occur when the cluster experiences more failures than the policy for a virtual machine object
tolerates.
As a result from a failure in the vSAN cluster, a virtual machine object might become inaccessible. An object is
inaccessible if a full replica of the object is not available because the failure affects all replicas, or when less than 50
percent of the object's votes are available.
According to the type of object that is inaccessible, virtual machines behave in the following ways:

Table 35: Inaccessibility of Virtual Machine Objects

VM Home Namespace
Virtual Machine State: Inaccessible, or orphaned if vCenter Server or the ESXi host cannot access the .vmx file of the
virtual machine.
Virtual Machine Symptoms: The virtual machine process might crash and the virtual machine might be powered off.

VMDK
Virtual Machine State: Inaccessible.
Virtual Machine Symptoms: The virtual machine remains powered on, but the I/O operations on the VMDK are not being
performed. After a certain timeout passes, the guest operating system ends the operations.

Virtual machine inaccessibility is not a permanent state. After the underlying issue is resolved, and a full replica and more
than 50 percent of the object's votes are restored, the virtual machine automatically becomes accessible again.

Storage Device is Failing in vSAN Cluster

vSAN monitors the performance of each storage device and proactively isolates unhealthy devices.
It detects gradual failure of a storage device and isolates the device before congestion builds up within the affected host
and the entire vSAN cluster.


If a disk experiences sustained high latencies or congestion, vSAN considers the device as a dying disk, and evacuates
data from the disk. vSAN handles the dying disk by evacuating or rebuilding data. No user action is required, unless the
cluster lacks resources or has inaccessible objects.

Component Failure State and Accessibility


The vSAN components that reside on the magnetic disk or flash capacity device are marked as absent.

Behavior of vSAN
vSAN responds to the storage device failure in the following ways.

Parameter Behavior

Alarms An alarm is generated from each host whenever an unhealthy


device is diagnosed. A warning is issued whenever a disk is
suspected of being unhealthy.
Health finding The Disk operation health finding issues a warning for the dying
disk.
Health status On the Disk Management page, the health status of the dying disk
is listed as Unhealthy. When vSAN completes evacuation of data,
the health status is listed as DyingDiskEmpty.
Rebuilding data vSAN examines whether the hosts and the capacity devices
can satisfy the requirements for space and placement rules for
the objects on the failed device or disk group. If such a host
with capacity is available, vSAN starts the recovery process
immediately because the components are marked as degraded.
If resources are available, vSAN automatically reprotects the data.

If vSAN detects a disk with a permanent error, it makes a limited number of attempts to revive the disk by unmounting and
mounting it.

Capacity Device Not Accessible in vSAN Cluster

When a magnetic disk or flash capacity device fails, vSAN evaluates the accessibility of the objects on the device.
vSAN rebuilds them on another host if space is available and the Primary level of failures to tolerate is set to 1 or more.

Component Failure State and Accessibility


The vSAN components that reside on the magnetic disk or flash capacity device are marked as degraded.


Behavior of vSAN
vSAN responds to the capacity device failure in the following ways.

Primary level of failures to tolerate: If the Primary level of failures to tolerate in the VM storage policy is equal to or
greater than 1, the virtual machine objects are still accessible from another host in the cluster. If resources are available,
vSAN starts an automatic reprotection. If the Primary level of failures to tolerate is set to 0, a virtual machine object is
inaccessible if one of the object's components resides on the failed capacity device. Restore the virtual machine from a
backup.

I/O operations on the capacity device: vSAN stops all running I/O operations for 5-7 seconds until it re-evaluates whether
an object is still available without the failed component. If vSAN determines that the object is available, all running I/O
operations are resumed.

Rebuilding data: vSAN examines whether the hosts and the capacity devices can satisfy the requirements for space and
placement rules for the objects on the failed device or disk group. If such a host with capacity is available, vSAN starts
the recovery process immediately because the components are marked as degraded. If resources are available, an
automatic reprotection occurs.

Storage Pool Device Is Not Accessible in vSAN ESA Cluster

When a storage pool device fails, vSAN evaluates the accessibility of the objects on the device.
vSAN rebuilds them on another host if space is available and the Primary level of failures to tolerate is set to 1 or more.

Component Failure State and Accessibility


vSAN responds to the storage pool device failure in the following ways.

Primary level of failures to tolerate: If the Primary level of failures to tolerate in the VM storage policy is equal to or
greater than 1, the virtual machine objects are still accessible from another host in the cluster. If resources are available,
vSAN starts an automatic reprotection. If the Primary level of failures to tolerate is set to 0, a virtual machine object is
inaccessible if one of the object's components resides on the failed capacity device. Restore the virtual machine from a
backup.

I/O operations on the capacity device: vSAN stops all running I/O operations for 5-7 seconds until it re-evaluates whether
an object is still available without the failed component. If vSAN determines that the object is available, all running I/O
operations are resumed.

Rebuilding data: vSAN examines whether the hosts and the capacity devices can satisfy the requirements for space and
placement rules for the objects on the failed device or disk group. If such a host with capacity is available, vSAN starts
the recovery process immediately because the components are marked as degraded. If resources are available, an
automatic reprotection occurs.

A Flash Caching Device Is Not Accessible in a vSAN Cluster

When a flash caching device fails, vSAN evaluates the accessibility of the objects on the disk group that contains the
cache device.
vSAN rebuilds them on another host if possible and the Primary level of failures to tolerate is set to 1 or more.

Component Failure State and Accessibility


Both cache device and capacity devices that reside in the disk group, for example, magnetic disks, are marked as
degraded. vSAN interprets the failure of a single flash caching device as a failure of the entire disk group.

Behavior of vSAN
vSAN responds to the failure of a flash caching device in the following way:

Primary level of failures to tolerate: If the Primary level of failures to tolerate in the VM storage policy is equal to or
greater than 1, the virtual machine objects are still accessible from another host in the cluster. If resources are available,
vSAN starts an automatic reprotection. If the Primary level of failures to tolerate is set to 0, a virtual machine object is
inaccessible if one of the object's components is on the failed disk group.

I/O operations on the disk group: vSAN stops all running I/O operations for 5-7 seconds until it re-evaluates whether an
object is still available without the failed component. If vSAN determines that the object is available, all running I/O
operations are resumed.

Rebuilding data: vSAN examines whether the hosts and the capacity devices can satisfy the requirements for space and
placement rules for the objects on the failed device or disk group. If such a host with capacity is available, vSAN starts
the recovery process immediately because the components are marked as degraded.

A Host Is Not Responding in vSAN Cluster

If a host stops responding due to failure or reboot of the host, vSAN waits for the host to recover and rebuilds the
components elsewhere in the cluster.

Component Failure State and Accessibility


The vSAN components that reside on the host are marked as absent.


Behavior of vSAN
vSAN responds to the host failure in the following way:

Primary level of failures to tolerate: If the Primary level of failures to tolerate in the VM storage policy is equal to or
greater than 1, the virtual machine objects are still accessible from another host in the cluster. If resources are available,
vSAN starts an automatic reprotection. If the Primary level of failures to tolerate is set to 0, a virtual machine object is
inaccessible if the object's components reside on the failed host.

I/O operations on the host: vSAN stops all running I/O operations for 5-7 seconds until it re-evaluates whether an object is
still available without the failed component. If vSAN determines that the object is available, all running I/O operations are
resumed.

Rebuilding data: If the host does not rejoin the cluster within 60 minutes, vSAN examines whether some of the other hosts
in the cluster can satisfy the requirements for cache, space, and placement rules for the objects on the inaccessible host.
If such a host is available, vSAN starts the recovery process. If the host rejoins the cluster after 60 minutes and recovery
has started, vSAN evaluates whether to continue the recovery or stop it and resynchronize the original components.

Network Connectivity Is Lost in vSAN Cluster

When the connectivity between the hosts in the cluster is lost, vSAN determines the active partition.
vSAN rebuilds the components from the isolated partition on the active partition if the connectivity is not restored.

Component Failure State and Accessibility


vSAN determines the partition where more than 50 percent of the votes of an object are available. The components on the
isolated hosts are marked as absent.


Behavior of vSAN
vSAN responds to a network failure in the following way:

Primary level of failures to tolerate: If the Primary level of failures to tolerate in the VM storage policy is equal to or
greater than 1, the virtual machine objects are still accessible from another host in the cluster. If resources are available,
vSAN starts an automatic reprotection. If the Primary level of failures to tolerate is set to 0, a virtual machine object is
inaccessible if the object's components are on the isolated hosts.

I/O operations on the isolated hosts: vSAN stops all running I/O operations for 5-7 seconds until it re-evaluates whether
an object is still available without the failed component. If vSAN determines that the object is available, all running I/O
operations are resumed.

Rebuilding data: If the host rejoins the cluster within 60 minutes, vSAN synchronizes the components on the host. If the
host does not rejoin the cluster within 60 minutes, vSAN examines whether some of the other hosts in the cluster can
satisfy the requirements for cache, space, and placement rules for the objects on the inaccessible host. If such a host is
available, vSAN starts the recovery process. If the host rejoins the cluster after 60 minutes and recovery has started,
vSAN evaluates whether to continue the recovery or stop it and resynchronize the original components.

A Storage Controller Fails in vSAN Cluster

When a storage controller fails, vSAN evaluates the accessibility of the objects on the disk groups that are attached to the
controller.
vSAN rebuilds them on another host.

Symptoms
If a host contains a single storage controller and multiple disk groups, and all devices in all disk groups have failed, you
might assume that a failure in the common storage controller is the root cause. Examine the VMkernel log messages to
determine the nature of the fault.

Component Failure State and Accessibility


When a storage controller fails, the components on the flash caching devices and capacity devices in all disk groups that
are connected to the controller are marked as degraded.
If a host contains multiple controllers, and only the devices that are attached to an individual controller are inaccessible,
then you might assume that this controller has failed.


Behavior of vSAN
vSAN responds to a storage controller failure in the following way:

Primary level of failures to tolerate: If the Primary level of failures to tolerate in the VM storage policy is equal to or
greater than 1, the virtual machine objects are still accessible from another host in the cluster. If resources are available,
vSAN starts an automatic reprotection. If the Primary level of failures to tolerate is set to 0, a virtual machine object is
inaccessible if the object's components reside on the disk groups that are connected to the storage controller.

Rebuilding data: vSAN examines whether the hosts and the capacity devices can satisfy the requirements for space and
placement rules for the objects on the failed device or disk group. If such a host with capacity is available, vSAN starts
the recovery process immediately because the components are marked as degraded.

vSAN Stretched Cluster Site Fails or Loses Network Connection

A vSAN stretched cluster manages failures that occur due to the loss of a network connection between sites or the
temporary loss of one site.

vSAN Stretched Cluster Failure Handling


In most cases, the vSAN stretched cluster continues to operate during a failure and automatically recovers after the failure
is resolved.

Table 36: How vSAN Stretched Cluster Handles Failures

Type of Failure Behavior

Network Connection Lost Between Active Sites If the network connection fails between the two active sites, the witness host
and the preferred site continue to service storage operations, and keep data
available. When the network connection returns, the two active sites are
resynchronized.
Secondary Site Fails or Loses Network Connection If the secondary site goes offline or becomes isolated from the preferred
site and the witness host, the witness host and the preferred site continue to
service storage operations, and keep data available. When the secondary
site returns to the cluster, the two active sites are resynchronized.
Preferred Site Fails or Loses Network Connection If the preferred site goes offline or becomes isolated from the secondary site
and the witness host, the secondary site continues storage operations if it
remains connected to the witness host. When the preferred site returns to
the cluster, the two active sites are resynchronized.
Witness Host Fails or Loses Network Connection If the witness host goes offline or becomes isolated from the preferred
site or the secondary site, objects become noncompliant but data remains
available. VMs that are currently running are not affected.

Troubleshooting vSAN
Examine the performance and accessibility of virtual machines to diagnose problems in the vSAN cluster.


Verify Drivers, Firmware, Storage I/O Controllers Against the VMware Compatibility Guide

Use the vSAN Skyline Health to verify whether your hardware components, drivers, and firmware are compatible with
vSAN.
Using hardware components, drivers, and firmware that are not compatible with vSAN might cause problems in the
operation of the vSAN cluster and the virtual machines running on it.
The hardware compatibility health findings verify your hardware against the VMware Compatibility Guide. For more
information about using the vSAN Skyline Health, see Monitoring vSAN Skyline Health.
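If you prefer a command-line check from an ESXi host, recent releases also expose the health findings through esxcli.
This is a sketch only; the available namespaces and output format can differ between releases.
# List the vSAN health check groups and their current status.
esxcli vsan health cluster list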
Examining Performance in vSAN Cluster

Monitor the performance of virtual machines, hosts, and the vSAN datastore to identify potential storage problems.
Regularly monitor the following performance indicators, for example by using the performance charts in the vSphere
Client, to identify faults in vSAN storage:
• Datastore. Rate of I/O operations on the aggregated datastore.
• Virtual Machine. I/O operations, memory and CPU usage, network throughput and bandwidth.
You can use the vSAN performance service to access detailed performance charts. For information about using the
performance service, see Monitoring vSAN Performance. For more information about using performance data in a vSAN
cluster, see the Troubleshooting Reference Manual.

Network Misconfiguration Status in vSAN Cluster

After you enable vSAN on a cluster, the datastore is not assembled correctly because of a detected network
misconfiguration.
After you enable vSAN on a cluster, on the Summary tab for the cluster the Network Status for vSAN appears as
Misconfiguration detected.
One or more members of the cluster cannot communicate because of either of the following reasons:
• A host in the cluster does not have a VMkernel adapter for vSAN.
• The hosts cannot connect to each other over the network.
Join the members of the cluster to the same network. See vSAN Planning and Deployment.
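You can check both conditions from an ESXi host. The following is a minimal sketch that assumes SSH access; the
VMkernel adapter name vmk1 and the peer address are placeholders.
# Confirm that the host has a VMkernel adapter enabled for vSAN traffic.
esxcli vsan network list
# Test connectivity to another cluster member over the vSAN VMkernel adapter.
vmkping -I vmk1 192.168.10.12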
Virtual Machine Appears as Noncompliant, Inaccessible or Orphaned in the vSAN Cluster

The state of a virtual machine that stores data on a vSAN datastore appears as noncompliant, inaccessible, or orphaned
due to the vSAN cluster failures.
A virtual machine on a vSAN datastore is in one of the following states that indicate a fault in the vSAN cluster.
• The virtual machine is noncompliant and the compliance status of some of its objects is noncompliant. See Examine
the Compliance of a Virtual Machine in vSAN.
• The virtual machine object is inaccessible or orphaned. See Examine the Failure State of a Component.
If an object replica is still available on another host, vSAN forwards the I/O operations of the virtual machine to the replica.
If an object of the virtual machine can no longer satisfy the requirements of the assigned VM storage policy, vSAN
considers it noncompliant. For example, a host might temporarily lose connectivity. See Object States That Indicate
Problems in vSAN.
If vSAN cannot locate a full replica or more than 50 percent of the votes for the object, the virtual machine becomes
inaccessible. If vSAN detects that the .vmx file is not accessible because the VM Home Namespace is corrupted, the
virtual machine becomes orphaned. See Accessibility of Virtual Machines Upon a Failure in vSAN.


If the cluster contains enough resources, vSAN automatically recovers the corrupted objects if the failure is permanent.
If the cluster does not have enough resources to rebuild the corrupted objects, extend the space in the cluster. See
Administering VMware vSAN.
Attempt to Create a Virtual Machine on vSAN Fails

When you try to deploy a virtual machine in a vSAN cluster, the operation fails with an error that the virtual machine files
cannot be created.
The operation for creating a virtual machine fails with an error status: Cannot complete file creation
operation.
The deployment of a virtual machine on vSAN might fail for several reasons.
• vSAN cannot allocate space for the virtual machine storage policies and virtual machine objects. Such a failure
might occur if the datastore does not have enough usable capacity, for example, if a physical disk is temporarily
disconnected from the host.
• The virtual machine has very large virtual disks and the hosts in the cluster cannot provide storage for them based on
the placement rules in the VM storage policy.
For example, if the Primary level of failures to tolerate in the VM storage policy is set to 1, vSAN must store two
replicas of a virtual disk in the cluster, each replica on a different host. The datastore might have this space after
aggregating the free space on all hosts in the cluster. However, the cluster might not contain two hosts that can each
provide enough space to store a separate replica of the virtual disk.
vSAN does not move components between hosts or disk groups to free space for a new replica, even though the
cluster might contain enough space for provisioning the new virtual machine.
Verify the state of the capacity devices in the cluster.
a) Navigate to the cluster.
b) On the Monitor tab, click vSAN and select Physical Disks.
c) Examine the capacity and health status of the devices on the hosts in the cluster.
vSAN Stretched Cluster Configuration Error When Adding a Host

Before adding new hosts to a vSAN stretched cluster, all current hosts must be connected. If a current host is
disconnected, the configuration of the new host is incomplete.
After you add a host to a vSAN stretched cluster in which some hosts are disconnected, on the Summary tab for the
cluster the Configuration Status for vSAN appears as Unicast agent unset on host.
When a new host joins a stretched cluster, vSAN must update the configuration on all hosts in the cluster. If one or more
hosts are disconnected from the vCenter Server, the update fails. The new host successfully joins the cluster, but its
configuration is incomplete.
Verify that all hosts are connected to vCenter Server, and click the link provided in the Configuration Status message to
update the configuration of the new host.
If you cannot rejoin the disconnected host, remove the disconnected host from the cluster, and click the link provided in
the Configuration Status message to update the configuration of the new host.
vSAN Stretched Cluster Configuration Error When Using RVC to Add a Host

If you use the RVC tool to add a host to a vSAN stretched cluster, the configuration of the new host is incomplete.
After you use the RVC tool to add a host to a vSAN stretched cluster, on the Summary tab for the cluster the Configuration
Status for vSAN appears as Unicast agent unset on host.


When a new host joins a stretched cluster, vSAN must update the configuration on all hosts in the cluster. If you use the
RVC tool to add the host, the update does not occur. The new host successfully joins the cluster, but its configuration is
incomplete.
Verify that all hosts are connected to vCenter Server, and click the link provided in the Configuration Status message to
update the configuration of the new host.
Cannot Add or Remove the Witness Host in vSAN Stretched Cluster

Before adding or removing the witness host in a vSAN stretched cluster, all current hosts must be connected. If a current
host is disconnected, you cannot add or remove the witness host.
When you add or remove a witness host in a vSAN stretched cluster in which some hosts are disconnected, the operation
fails with an error status: The operation is not allowed in the current state. Not all hosts in
the cluster are connected to Virtual Center.
When the witness host joins or leaves a stretched cluster, vSAN must update the configuration on all hosts in the cluster.
If one or more hosts are disconnected from the vCenter Server, the witness host cannot be added or removed.
Verify that all hosts are connected to vCenter Server, and retry the operation. If you cannot rejoin the disconnected host,
remove the disconnected host from the cluster, and then you can add or remove the witness host.
Disk Group Becomes Locked in vSAN Cluster

In an encrypted vSAN cluster, when communication between a host and the KMS is lost, the disk group can become
locked if the host reboots.
vSAN locks a host's disk groups when the host reboots and it cannot get the KEK from the KMS. The disks behave as if
they are unmounted. Objects on the disks become inaccessible.
You can view a disk group's health status on the Disk Management page in the vSphere Client. An Encryption health
finding warning notifies you that a disk is locked.
Hosts in an encrypted vSAN cluster do not store the KEK on disk. If a host reboots and cannot get the KEK from the KMS,
vSAN locks the host's disk groups.
To exit the locked state, you must restore communication with the KMS and reestablish the trust relationship.

Replacing Existing Hardware Components in vSAN Cluster


Under certain conditions, you must replace hardware components, drivers, firmware, and storage I/O controllers in the
vSAN cluster.
In vSAN, you should replace hardware devices when you encounter failures or if you must upgrade your cluster.
vSAN ESA contains a single storage pool of flash devices. Each flash device provides caching and capacity to the cluster.
For more information on how the vSAN ESA is designed, see the vSAN Planning and Deployment guide.

Replace a Flash Caching Device on a Host in vSAN Cluster

You must replace a flash caching device if you detect a failure or when you upgrade the device.
• Verify that the storage controllers on the hosts are configured in passthrough mode and support the hot-plug feature.


If the storage controllers are configured in RAID 0 mode, see the vendor documentation for information about adding
and removing devices.
• If you upgrade the flash caching device, verify the following requirements:
– Verify that the cluster contains enough space to migrate the data from the disk group that is associated with the
flash device.
– Place the host in maintenance mode.
Removing the cache device removes the entire disk group from the vSAN cluster. When you replace a flash caching
device, the virtual machines on the disk group become inaccessible and the components on the group are marked as
degraded. See A Flash Caching Device Is Not Accessible in a vSAN Cluster.
1. Navigate to the vSAN cluster.
2. On the Configure tab, click Disk Management under vSAN.
3. Select the entire disk group that contains the flash caching device that you want to remove. vSAN does not allow you
to remove only the cache disk; to remove it, you must remove the entire disk group.
4. Click REMOVE.


5. In the Remove Disk Group dialog box, select one of the following data migration modes to evacuate the data on the
disks.
• Full data migration - Transfers all the data available on the host to other hosts in the cluster.
• Ensure accessibility - Partially transfers the data available on the host to the other hosts in the cluster. During the
data transfer, all virtual machines on the host remain accessible.
• No data migration - There is no data transfer from the host. At this time, some objects might become inaccessible.
6. Click GO TO PRE-CHECK to find the impact on the cluster if the object is removed or placed in maintenance mode.
7. Click REMOVE to remove the disk group.

vSAN removes the flash caching device along with the entire disk group from the cluster.
1. Add a new device to the host.
The host automatically detects the device.
2. If the host is unable to detect the device, perform a device rescan.
For more information on creating a disk group, claiming storage devices, or adding devices to the disk group in the vSAN
Cluster, see Device Management in a vSAN Cluster.
Replace a Capacity Device in vSAN Cluster

You must replace a flash capacity device or a magnetic disk if you detect a failure or when you upgrade it.
• Verify that the storage controllers on the hosts are configured in passthrough mode and support the hot-plug feature.
If the storage controllers are configured in RAID 0 mode, see the vendor documentation for information about adding
and removing devices.
• If you upgrade the capacity device, verify that the cluster contains enough space to migrate the data from the capacity
device.
Before you physically remove the device from the host, you must manually delete the device from vSAN. When you
unplug a capacity device without removing it from the vSAN cluster, the components on the disk are marked as absent. If
the capacity device fails, the components on the disk are marked as degraded. When the number of failures of the object
replica with the affected components exceeds the FTT value, the virtual machines on the disk become inaccessible. See
Capacity Device Not Accessible in vSAN Cluster.


NOTE
If your vSAN cluster uses deduplication and compression, you must remove the entire disk group from the
cluster before you replace the device.
You can also watch the video about how to replace a failed capacity device in vSAN.
1. Navigate to the vSAN cluster.
2. On the Configure tab, click Disk Management under vSAN.
3. Select the flash capacity device or magnetic disk, and click Remove Disk.
NOTE
You cannot remove a capacity device from the cluster with enabled deduplication and compression.
You must remove the entire disk group. If you want to remove a disk group from a vSAN cluster with
deduplication and compression enabled, see "Adding or Removing Disks with Deduplication and
Compression Enabled" in Administering VMware vSAN.
4. In the Remove Disk dialog box, select Full data migration to transfer all the data available on the host to other hosts
in the cluster.
5. Click Go To Pre-Check to find the impact on the cluster if the object is removed or placed in maintenance mode.
6. Click Remove to remove the capacity device.
1. Add a new device to the host.
The host automatically detects the device.
2. If the host is unable to detect the device, perform a device rescan.
Replace a Storage Pool Device in vSAN ESA Cluster

The storage pool represents the amount of capacity provided by the host to the vSAN datastore.
• Verify that the storage controllers on the hosts are configured in passthrough mode and support the hot-plug feature.
If the storage controllers are configured in RAID 0 mode, see the vendor documentation for information about adding
and removing devices.
• If you upgrade the storage pool device, verify that the cluster contains enough space to migrate the data from the
storage pool device.
Each host's storage devices claimed by vSAN form a storage pool. All storage devices claimed by vSAN contribute to
capacity and performance.
1. Navigate to the vSAN cluster.
2. On the Configure tab, click Disk Management under vSAN.
3. Select the storage pool device, and click Remove Disk.
4. In the Remove Disk dialog box, select Full data migration to transfer all the data available on the host to other hosts
in the cluster.
5. Click Go To Pre-Check to find the impact on the cluster if the object is removed or placed in maintenance mode.
6. Click Remove to remove the storage pool device.
1. Add a new device to the host.
The host automatically detects the device.
2. If the host is unable to detect the device, perform a device rescan.
3. Claim a disk using the vSAN cluster > Configure > vSAN > Disk Management.
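Alternatively, you can claim the replacement device from the ESXi Shell. The following is a sketch only; the device
name is a placeholder, and the exact option syntax may differ in your release (see Table 37 for the storage pool
commands).
# Verify the current storage pool membership on the host.
esxcli vsan storagepool list
# Claim the replacement device into the storage pool (device name is an example).
esxcli vsan storagepool add -d naa.55cd2e404c185332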


Replace a Storage Controller in vSAN Cluster

You must replace a storage controller on a host if you detect a failure.


1. Place the host into maintenance mode and power down the host.
2. Replace the failed card.
The replacement storage controller must have a supported firmware level listed in the VMware Compatibility Guide.
3. Power on the host.
4. Configure the card for passthrough mode. Refer to the vendor documentation for information about configuring the
device.
5. Exit maintenance mode.
Remove a Device from a Host in vSAN Cluster by Using an ESXCLI Command

If you detect a failed storage device or if you upgrade a device, you can manually remove it from a host by using an
ESXCLI command.
Verify that the storage controllers on the hosts are configured in passthrough mode and support the hot-plug feature.
If the storage controllers are configured in RAID 0 mode, see the vendor documentation for information about adding and
removing devices.
If you remove a flash caching device, vSAN deletes the disk group that is associated with the flash device and all its
member devices.
1. Open an SSH connection to the ESXi host.
2. To identify the device ID of the failed device, run this command and note the device ID in the output.
esxcli vsan storage list

3. To remove the device from vSAN, run this command.


esxcli vsan storage remove -d device_id
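For example, the sequence looks like the following. The device ID shown is a placeholder; take the actual ID from the
output of the list command on your host.
# Identify the failed device and note its ID, for example naa.55cd2e404c185332.
esxcli vsan storage list
# Remove the failed device from vSAN.
esxcli vsan storage remove -d naa.55cd2e404c185332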

The following commands are available for managing a vSAN ESA cluster:

Table 37: vSAN ESA Commands

Command Description

esxcli vsan storagepool add Add physical disk for vSAN usage.
esxcli vsan storagepool list List vSAN storage pool configuration.
esxcli vsan storagepool mount Mount vSAN disk from storage pool.
esxcli vsan storagepool rebuild Rebuild vSAN storage pool disks.
esxcli vsan storagepool remove Remove physical disk from storage pool. Requires one --disk or --
uuid param.
esxcli vsan storagepool unmount Unmount vSAN disk from storage pool.
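For example, to replace a failed device in a vSAN ESA storage pool from the ESXi Shell, you might run the following.
The device name is a placeholder.
# List the storage pool to find the failed device or its UUID.
esxcli vsan storagepool list
# Remove the failed device from the storage pool (requires --disk or --uuid).
esxcli vsan storagepool remove --disk=naa.55cd2e404c185332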

1. Add a new device to the host.


The host automatically detects the device.
2. If the host is unable to detect the device, perform a device rescan.


Shutting Down and Restarting the vSAN Cluster


You can shut down the entire vSAN cluster to perform maintenance or troubleshooting.
Use the Shutdown Cluster wizard to shut down the vSAN cluster. The wizard performs the necessary steps and alerts you
when it requires user action. You also can manually shut down the cluster, if necessary.
NOTE
When you shut down a vSAN stretched cluster, the witness host remains active.

Shut Down the vSAN Cluster Using the Shutdown Cluster Wizard
Use the Shutdown cluster wizard to gracefully shut down the vSAN cluster for maintenance or troubleshooting.
The Shutdown Cluster Wizard is available with vSAN 7.0 Update 3 and later releases.
NOTE
If you have a vSphere with Tanzu environment, you must follow the specified order when shutting down or
starting up the components. For more information, see "Shutdown and Startup of VMware Cloud Foundation" in
the VMware Cloud Foundation Operations Guide.
1. Prepare the vSAN cluster for shutdown.
a) Check the vSAN Skyline Health to confirm that the cluster is healthy.
b) Power off all virtual machines (VMs) stored in the vSAN cluster, except for vCenter Server VMs, vCLS VMs and
file service VMs. If vCenter Server is hosted on the vSAN cluster, do not power off the vCenter Server VM or VM
service VMs (such as DNS, Active Directory) used by vCenter Server.
c) If this is an HCI Mesh server cluster, power off all client VMs stored on the cluster. If the client cluster's vCenter
Server VM is stored on this cluster, either migrate or power off the VM. Once this server cluster is shut down, its
shared datastore is inaccessible to clients.
d) Verify that all resynchronization tasks are complete.
Click the Monitor tab and select vSAN > Resyncing Objects.
NOTE
If any member hosts are in lockdown mode, add the host's root account to the security profile Exception User
list. For more information, see Lockdown Mode in vSphere Security.
2. Right-click the vSAN cluster in the vSphere Client, and select Shutdown cluster from the menu.
You also can click Shutdown Cluster on the vSAN Services page.


3. On the Shutdown cluster wizard, verify that all shutdown pre-checks display green check marks. Resolve any issues
that are flagged with red exclamation marks. Click Next.
If vCenter Server appliance is deployed on the vSAN cluster, the Shutdown cluster wizard displays the vCenter Server
notice. Note the IP address of the orchestration host, in case you need it during the cluster restart. If vCenter Server
uses service VMs such as DNS or Active Directory, note them as exceptional VMs in the Shutdown cluster wizard.
4. Enter a reason for performing the shutdown, and click Shutdown.
The vSAN Services page changes to display information about the shutdown process.
5. Monitor the shutdown process.
vSAN performs the steps to shut down the cluster, powers off the system VMs, and powers off the hosts.
Restart the vSAN cluster. See Restart the vSAN Cluster.
Restart the vSAN Cluster
You can restart a vSAN cluster that is shut down for maintenance or troubleshooting.

1. Power on the cluster hosts.


If the vCenter Server is hosted on the vSAN cluster, wait for vCenter Server to restart.
2. Right-click the vSAN cluster in the vSphere Client, and select Restart cluster from the menu.
You also can click Restart Cluster on the vSAN Services page.
3. On the Restart Cluster dialog, click Restart.
The vSAN Services page changes to display information about the restart process.
4. After the cluster has restarted, check the vSAN Skyline Health and resolve any outstanding issues.

Manually Shut Down and Restart the vSAN Cluster


You can manually shut down the entire vSAN cluster to perform maintenance or troubleshooting.
Use the Shutdown Cluster wizard unless your workflow requires a manual shut down. When you manually shut down the
vSAN cluster, do not deactivate vSAN on the cluster.
NOTE
If you have a vSphere with Tanzu environment, you must follow the specified order when shutting down or
starting up the components. For more information, see "Shutdown and Startup of VMware Cloud Foundation" in
the VMware Cloud Foundation Operations Guide.
1. Shut down the vSAN cluster.
a) Check the vSAN Skyline Health to confirm that the cluster is healthy.
b) Power off all virtual machines (VMs) running in the vSAN cluster, if vCenter Server is not hosted on the cluster. If
vCenter Server is hosted in the vSAN cluster, do not power off the vCenter Server VM or service VMs (such as
DNS, Active Directory) used by vCenter Server.
c) If vSAN file service is enabled in the vSAN cluster, you must deactivate the file service. Deactivating the vSAN file
service removes the empty file service domain. If you want to retain the empty file service domain after restarting
the vSAN cluster, you must create an NFS or SMB file share before deactivating the vSAN file service.
d) Click the Configure tab and turn off HA. As a result, the cluster does not register host shutdowns as failures.
For vSphere 7.0 U1 and later, enable vCLS retreat mode. For more information, see the VMware knowledge base
article at https://kb.vmware.com/s/article/80472.
e) Verify that all resynchronization tasks are complete.
Click the Monitor tab and select vSAN > Resyncing Objects.


f) If vCenter Server is hosted on the vSAN cluster, power off the vCenter Server VM.
Make a note of the host that runs the vCenter Server VM. It is the host where you must restart the vCenter Server
VM.
g) Deactivate cluster member updates from vCenter Server by running the following command. Ensure that you run
the command on all the ESXi hosts in the cluster. For a combined per-host example of this command and the
maintenance mode command in step 1.j, see the sketch after the last sub-step of step 1.
esxcfg-advcfg -s 1 /VSAN/IgnoreClusterMemberListUpdates
h) Log in to any host in the cluster other than the witness host.
i) Run the following command only on that host. If you run the command on multiple hosts concurrently, it may cause
a race condition causing unexpected results.
python /usr/lib/vmware/vsan/bin/reboot_helper.py prepare

The command returns and prints the following:


Cluster preparation is done.
NOTE
• The cluster is fully partitioned after the successful completion of the command.
• If you encounter an error, resolve the issue based on the error message and try enabling vCLS retreat
mode again.
• If there are unhealthy or disconnected hosts in the cluster, remove the hosts and retry the command.
j) Place all the hosts into maintenance mode with No Action. If the vCenter Server is powered off, use the following
command to place the ESXi hosts into maintenance mode with No Action.
esxcli system maintenanceMode set -e true -m noAction

Perform this step on all the hosts.


To avoid the risk of data unavailability while using No Action at the same time on multiple hosts, followed by a
reboot of multiple hosts, see the VMware knowledge base article at https://kb.vmware.com/s/article/60424. To
perform simultaneous reboot of all hosts in the cluster using a built-in tool, see the VMware knowledge base article
at https://kb.vmware.com/s/article/70650.
k) After all hosts have successfully entered maintenance mode, perform any necessary maintenance tasks and
power off the hosts.
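The following is a minimal per-host sketch of steps 1.g and 1.j, run over SSH on each ESXi host. The verification
commands are optional additions and assume a standard ESXi Shell.
# Step 1.g: stop vCenter Server from updating the cluster member list, then verify the setting.
esxcfg-advcfg -s 1 /VSAN/IgnoreClusterMemberListUpdates
esxcfg-advcfg -g /VSAN/IgnoreClusterMemberListUpdates
# Step 1.j: enter maintenance mode with No Action, then confirm the state.
esxcli system maintenanceMode set -e true -m noAction
esxcli system maintenanceMode get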
2. Restart the vSAN cluster.
a) Power on the ESXi hosts.
Power on the physical server where ESXi is installed. The ESXi host starts, locates the VMs, and functions normally.


If any hosts fail to restart, you must manually recover the hosts or move the failed hosts out of the vSAN cluster.
b) When all the hosts are back after powering on, exit all hosts from maintenance mode. If the vCenter Server is
powered off, use the following command on the ESXi hosts to exit maintenance mode.
esxcli system maintenanceMode set -e false

Perform this step on all the hosts.


c) Log in to one of the hosts in the cluster other than the witness host.
d) Run the following command only on that host. If you run the command on multiple hosts concurrently, it may cause
a race condition causing unexpected results.
python /usr/lib/vmware/vsan/bin/reboot_helper.py recover

The command returns and prints the following:


Cluster reboot/power-on is completed successfully!
e) Verify that all the hosts are available in the cluster by running the following command on each host.
esxcli vsan cluster get
f) Enable cluster member updates from vCenter Server by running the following command. Ensure that you run the
command on all the ESXi hosts in the cluster.
esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates
g) Restart the vCenter Server VM if it is powered off. Wait for the vCenter Server VM to be powered up and
running. To deactivate vCLS retreat mode, see the VMware knowledge base article at https://kb.vmware.com/s/
article/80472.
h) Verify again that all the hosts are participating in the vSAN cluster by running the following command on each host.
esxcli vsan cluster get
i) Restart the remaining VMs through vCenter Server.
j) Check the vSAN Skyline Health and resolve any outstanding issues.
k) (Optional) Enable vSAN file service.
l) (Optional) If the vSAN cluster has vSphere Availability enabled, you must manually restart vSphere Availability to
avoid the following error: Cannot find vSphere HA master agent.
To manually restart vSphere Availability, select the vSAN cluster and navigate to:
1. Configure > Services > vSphere Availability > EDIT > Disable vSphere HA
2. Configure > Services > vSphere Availability > EDIT > Enable vSphere HA
3. If there are unhealthy or disconnected hosts in the cluster, recover them or remove them from the vSAN cluster.
Retry the above commands only after the vSAN Skyline Health shows all available hosts in the green state.
If you have a three-node vSAN cluster, the reboot_helper.py recover command does not work when one host has failed.
As an administrator, do the following:
1. Temporarily remove the failure host information from the unicast agent list.
2. Add the host after running the following command.
reboot_helper.py recover

Following are the commands to remove and add the host to a vSAN cluster:
#esxcli vsan cluster unicastagent remove -a <IP Address> -t node -u <NodeUuid>
#esxcli vsan cluster unicastagent add -t node -u <NodeUuid> -U true -a <IP Address> -p 12321
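To find the NodeUuid and IP address of the failed host, you can list the unicast agent entries known to the other hosts.
This is a sketch only; availability of the subcommand can vary by release.
# List the current unicast agent entries; note the failed host's NodeUuid and IP address.
esxcli vsan cluster unicastagent list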

Restart the vSAN cluster. See Restart the vSAN Cluster.


Documentation Legal Notice


This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred
to as the “Documentation”) is for your informational purposes only and is subject to change or withdrawal by Broadcom
at any time. This Documentation is proprietary information of Broadcom and may not be copied, transferred, reproduced,
disclosed, modified or duplicated, in whole or in part, without the prior written consent of Broadcom.
If you are a licensed user of the software product(s) addressed in the Documentation, you may print or otherwise make
available a reasonable number of copies of the Documentation for internal use by you and your employees in connection
with that software, provided that all Broadcom copyright notices and legends are affixed to each reproduced copy.
The right to print or otherwise make available copies of the Documentation is limited to the period during which the
applicable license for such software remains in full force and effect. Should the license terminate for any reason, it is your
responsibility to certify in writing to Broadcom that all copies and partial copies of the Documentation have been returned
to Broadcom or destroyed.
TO THE EXTENT PERMITTED BY APPLICABLE LAW, BROADCOM PROVIDES THIS DOCUMENTATION “AS
IS” WITHOUT WARRANTY OF ANY KIND, INCLUDING WITHOUT LIMITATION, ANY IMPLIED WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NONINFRINGEMENT. IN NO EVENT WILL
BROADCOM BE LIABLE TO YOU OR ANY THIRD PARTY FOR ANY LOSS OR DAMAGE, DIRECT OR INDIRECT,
FROM THE USE OF THIS DOCUMENTATION, INCLUDING WITHOUT LIMITATION, LOST PROFITS, LOST
INVESTMENT, BUSINESS INTERRUPTION, GOODWILL, OR LOST DATA, EVEN IF BROADCOM IS EXPRESSLY
ADVISED IN ADVANCE OF THE POSSIBILITY OF SUCH LOSS OR DAMAGE.
The use of any software product referenced in the Documentation is governed by the applicable license agreement and
such license agreement is not modified in any way by the terms of this notice.
The manufacturer of this Documentation is Broadcom Inc.
Provided with “Restricted Rights.” Use, duplication or disclosure by the United States Government is subject to the
restrictions set forth in FAR Sections 12.212, 52.227-14, and 52.227-19(c)(1) - (2) and DFARS Section 252.227-7014(b)
(3), as applicable, or their successors.
Copyright © 2005–2025 Broadcom. All Rights Reserved. The term “Broadcom” refers to Broadcom Inc. and/or its
subsidiaries. All trademarks, trade names, service marks, and logos referenced herein belong to their respective
companies.
