We have a Dell MD3220i with 24 SAS drives, dual controllers, and two 1Gb links per controller. Each ESXi 5.1 host is connected to both controllers using dedicated NICs.
We only have a few VMs sitting on this datastore at the moment (about 10 low disk IO VMs).
However, we're experiencing VM crashes. The VMs lock up with the error "The lock protecting virtualdisk.vmdk has been lost. This is most likely due to underlying storage having problems, resulting in this virtual machine getting powered on at another ESX host as well. This virtual machine needs to be powered off at this host now. Kindly confirm that the virtual machine is running successfully on another host before clicking the OK button."
Looking through the VMware logs, we see lots of long-latency errors (Long VMFS rsv time). The performance charts also point to disk latency problems: random read latency spikes up to 1600 ms and write latency spikes up to 11000 ms. However, I have a monitoring tool (Orion Storage Manager) that monitors our Dell MD3220i's performance, and it doesn't show any of those latency spikes. It shows latency consistently under 10 ms.
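One way to line the log messages up against the array-side statistics is to tally the "Long VMFS rsv time" entries per hour. The following is only a minimal sketch: it assumes each log line begins with an ISO 8601 timestamp (for example 2013-01-15T10:23:45.123Z), and the function name `count_long_reservations` is made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumption: every log line starts with an ISO 8601 timestamp.
# Adjust the pattern if the log format on your hosts differs.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}):\d{2}:\d{2}")
MARKER = "Long VMFS rsv time"


def count_long_reservations(lines):
    """Return a Counter mapping 'YYYY-MM-DDTHH' to the number of marker lines."""
    per_hour = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:  # lines without a leading timestamp are skipped
            per_hour[match.group(1)] += 1
    return per_hour


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_rsv.py <path to a copy of the vmkernel log>
    with open(sys.argv[1], errors="replace") as log:
        for hour, count in sorted(count_long_reservations(log).items()):
            print(hour, count)
```

If the busiest hours coincide with something scheduled (backups, snapshots, a rescan), that narrows the search; if they don't show up on the array at all, the path between host and array becomes the main suspect.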
Troubleshooting done so far:
-Checked our iSCSI network switch, no errors
-Upgraded the NIC firmware and BIOS on our ESXi hosts
-Upgraded the NIC drivers on our ESXi hosts
-Disabled NetQueue (since we only have 1Gb interfaces)
Any ideas what else to check?