First off, let me just say I am a HUGE VMware fan. I am even a VCP myself. I think they are a great company with incredible product offerings. A few months ago VMware announced the VSAN feature for vSphere 5.5. It allows you to cluster together DAS and make it look like a true SAN to the hypervisor, so you can do things like vMotion, HA, and FT without a physical SAN. This stuff is like black magic, though there are a few holes in it. Here is my list of what I feel needs to be addressed if someone wants to leverage VSAN as a Tier 1 storage strategy:
- Limits of Using vSphere Replication with Virtual SAN Storage
- For reasons of load and I/O latency, Virtual SAN storage is subject to limits in terms of the numbers of hosts that you can include in a Virtual SAN cluster and the number of virtual machines that you can run on each host. See the Limits section in the VMware Virtual SAN Design and Sizing Guide.
- Using vSphere Replication adds to the load on the storage. Every virtual machine generates regular read and write operations. Configuring vSphere Replication on those virtual machines adds another read operation to the regular read and write operations, which increases the I/O latency on the storage. The precise number of virtual machines that you can replicate to Virtual SAN storage by using vSphere Replication depends on your infrastructure. If you notice slower response times when you configure vSphere Replication for virtual machines in Virtual SAN storage, monitor the I/O latency of the Virtual SAN infrastructure. Potentially reduce the number of virtual machines that you replicate in the Virtual SAN datastore.
- Limits of Local Protection
- If you enable multiple point-in-time snapshots, you must take into account the additional components that each snapshot creates in the Virtual SAN storage, based on the number of disks per virtual machine, the size of the disks, the number of snapshots to retain, and the number of failures to tolerate. When retaining snapshots and using Virtual SAN storage, you must calculate the number of extra components that you require for each virtual machine:
- Number of disks x number of point-in-time snapshots x number of mirror and witness components
- The number of point-in-time snapshots that you retain can increase I/O latency on the Virtual SAN storage.
- Stealing resources from the hosts
- When you enable VSAN you are taking away CPU cycles and RAM from your ESXi hosts that should be used to service your VMs. This load only increases when you enable snapshots and vSphere Replication.
- Limited Scalability
- Currently VSAN supports a maximum of 32 hosts in a cluster but the recommended size is 8 or under.
- The current limit is 40 VMs per host when there are 9 or more hosts in a cluster (with 8 hosts or fewer you can do ~100 VMs per host, though).
- With 8 hosts in a cluster, you get a maximum of 40 SSDs and 280 HDDs in a fully configured VSAN cluster.
- You can only have 100 VMs on each host which means a fully configured VSAN with best practice can only handle 800 VMs.
- Networking considerations
- At a minimum VSAN will require a dedicated 1Gbps NIC port. 10Gbps is preferred with solutions like these, in order to keep latency down between the hosts in the cluster. Customers should always have an additional NIC port available for resiliency purposes as well.
- Scalability
- Add capacity by adding HDDs; add performance by adding SSDs.
- Eventually you have to add hosts to scale the VSAN cluster. That means that in 3-5 years your virtual environment will be built on antiquated hardware, with no easy way to upgrade aside from buying new hosts for a second cluster and vMotioning everything over to it. So you essentially have to rebuy all the hardware in your environment, which gets more costly as the VSAN cluster grows.
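The sizing arithmetic in the list above is easy to get wrong by hand, so here is a rough sketch of it in Python. The function and parameter names are my own, not anything from VMware. The per-host defaults of 5 SSDs and 35 HDDs are simply the 40 SSD / 280 HDD figures divided by 8 hosts, and the value of 3 mirror and witness components in the example is an assumed number for illustration only.

```python
def extra_snapshot_components(disks, snapshots, mirror_witness_components):
    """Extra Virtual SAN components for one VM's retained snapshots.

    Implements: number of disks x number of point-in-time snapshots
    x number of mirror and witness components.
    """
    return disks * snapshots * mirror_witness_components


def cluster_maximums(hosts, vms_per_host, ssds_per_host=5, hdds_per_host=35):
    """Cluster-wide totals from per-host limits.

    The SSD/HDD defaults are derived from the post's figures
    (40 SSDs and 280 HDDs across 8 hosts), not from a spec sheet.
    """
    return {
        "vms": hosts * vms_per_host,
        "ssds": hosts * ssds_per_host,
        "hdds": hosts * hdds_per_host,
    }


if __name__ == "__main__":
    # Hypothetical VM: 2 disks, 4 retained snapshots, 3 components each (assumed).
    print(extra_snapshot_components(2, 4, 3))
    # The recommended 8-host cluster at ~100 VMs per host.
    print(cluster_maximums(8, 100))
    # The 32-host maximum at the 40-VMs-per-host limit.
    print(cluster_maximums(32, 40))
```

Plugging in the numbers from the list reproduces the 800 VM, 40 SSD, and 280 HDD totals for an 8-host cluster, and makes it quick to see what a larger cluster at the lower per-host VM limit would give you.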
Additional things to note:
- VSAN management clusters are used in larger environments that segregate a vSphere cluster for management functions like Operations Manager, etc. This has no impact on production storage and is possibly a less expensive consumption option.
- Up to 32 Hosts / Nodes can be part of the VSAN cluster contributing storage to the datastore. Other non-members can use the storage just like any other datastore.
- Some hero numbers VMware is putting out: at max, VSAN can handle 4+ petabytes with 4,000,000 IOPS.
I feel that if VMware is able to address these issues in later releases, they may take a good chunk out of the storage market. It will be interesting to see how the big storage players like EMC, NetApp, IBM, etc. address this in the future.