Channel: VMware Communities: Message List - vSphere™ Storage

Re: I am out of ideas - High Latency on a LUN - on hosts with no VMs


Evening,

 

You got a fun one there...  A few questions first:

 

additional blade to our infrastructure  - Was it a Gen 7 or Gen 8 HP blade?  If so, then I know what caused the initial issue and how to solve it.  Look into the Emulex parts on your blade; they caused the issue and will continue to do so: random NICs going down without correct failover.


->

Locks - our guess .. So some VMs we expected to be the culprit, were rebooted .. and ola ... latency gone.

No one can explain what happens, why that "fixed" some issues, but heh - we were happy ...

-> This is a fun issue. I have seen cases like this where descriptors need to be cleaned up, but for that to create this much latency is odd.  I am afraid you took the correct route with the reboot; neither side will admit to or really diagnose the issue when the solution is as easy as a reboot... (I once had a kernel panic in ESXi. Support asked how long a re-install would take and I said about 30 minutes. Their response was: who cares about root cause, we have been on the phone for 2 hours, just reinstall. Not to blame VMware support, it was the quickest way to get back into a good state.)

 


Now VMkernel logs show some SCSI aborts and yes, this is likely due to storage issues which we may still have - however, how can the only hosts now showing latency be the ones with no VMs on them once they are out of maintenance mode, when they look fine in maintenance mode and all the other hosts, with VMs running, are fine?

-> If a host has no VMs and holds no locks, the only difference between a host in maintenance mode and one that is not is HA.  If you are running HA, I guess it could be the HA datastore lock files (assuming you're running 5.0+) causing the latency.
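
One way to test that guess is to see whether the SCSI aborts in the VMkernel log cluster around the times a host leaves maintenance mode. Below is a rough Python sketch for bucketing abort messages by hour. It is only an illustration: the function name `aborts_per_hour` is made up, the match is a plain case-insensitive search for the word "abort" rather than any official message format, and it assumes each log line starts with an ISO-style timestamp.

```python
import re
from collections import Counter

# Assumption: log lines begin with an ISO-8601 style timestamp such as
# 2013-05-01T12:34:56.789Z. Lines without one are counted as "unknown".
TS_HOUR = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}):")


def aborts_per_hour(lines):
    """Count lines that mention 'abort' (case-insensitive), grouped by hour."""
    counts = Counter()
    for line in lines:
        # Plain substring check; this is not tied to any exact ESXi wording.
        if "abort" not in line.lower():
            continue
        match = TS_HOUR.match(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

Feed it an open log file (any iterable of lines works) and compare the busy hours with the times you took the host out of maintenance mode. If the counts jump right after that point, it supports the HA theory; if they are spread evenly, look elsewhere.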

 

No matter what, you have a bad situation that should not happen.   I would do the following:

 

1. Send the logs from your chassis backplane to the vendor to make sure nothing is boinked up

2. Reboot each ESXi host in an orderly fashion

3. Reboot the storage controllers in an orderly fashion

 

This will essentially clear out any odd locks.   If it does not solve the problem, I am afraid you will have to press the vendors more...

 

Also, if you provide more info on the ESXi version and the storage/blade type, it might help.

