I was watching some training videos on EMC’s new ScaleIO, and at one point they briefly talked about calculating bandwidth for a ScaleIO cluster. The math seems easy enough, but I was surprised to find that there isn’t a tool for it, even with this new ScaleIO initiative.
The way ScaleIO writes data is very similar to RAID 1/0, though ScaleIO itself does not use any sort of RAID for protection. Every write IO has to be sent over the network twice, because every slice of a ScaleIO volume is held in two different locations. Reads, on the other hand, only come off the primary slice, so there is no read penalty. Here is the math the way I see it.
Total IOPS = Writes + Reads
[((Average IO Size in KB * Write IOPS) * 2) + (Average IO Size in KB * Read IOPS)] = Required Bandwidth per host in KBps
Then convert KBps to Kbps:
Required Bandwidth per host in KBps * 8 = Required Bandwidth per host in Kbps
Required Bandwidth per host in Kbps / 1024 = Required Bandwidth per host in Mbps
Then multiply that by the number of hosts:
Required Bandwidth per host in Mbps * Number of hosts = Required Bandwidth for the whole cluster in Mbps
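If you’d rather see the whole thing in one place, here is a quick Python sketch of the math above. The function name and the sample workload numbers are mine, just for illustration; they aren’t part of the spreadsheet.

```python
def required_bandwidth_mbps(avg_io_kb, write_iops, read_iops, num_hosts):
    """Estimate the network bandwidth a ScaleIO cluster needs, in Mbps."""
    # Per-host throughput in KBps: writes count twice (mirrored copy),
    # reads only hit the primary slice
    per_host_kbps = (avg_io_kb * write_iops * 2) + (avg_io_kb * read_iops)

    # KBps -> Kbps -> Mbps (x8 for bytes to bits, /1024 to get megabits)
    per_host_mbps = per_host_kbps * 8 / 1024

    # Multiply by the number of hosts for the whole cluster
    return per_host_mbps * num_hosts


if __name__ == "__main__":
    # Example workload: 8 KB average IO, 2,000 write IOPS and
    # 6,000 read IOPS per host, 4 hosts in the cluster
    print(required_bandwidth_mbps(8, 2000, 6000, 4))  # 2500.0 Mbps
```

With those example numbers, each host needs about 625 Mbps, or roughly 2,500 Mbps for the whole cluster.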
So I took all of this and put it in an Excel doc called “ScaleIO Bandwidth Guestimator” to automate a lot of the math.
The reason I chose the average rather than the 95th percentile is that most of the performance collection documents we work with report the average but don’t always include the 95th percentile.
In reality, people should be scoping for both the average and the maximum so they can cover all their bases when looking at ScaleIO.
Tell me what you think in your comments below! 🙂