Morning,
->So I suppose you guys are using IP storage (iSCSI?) - because we only use the onboard NICs (Emulex indeed) for network traffic. Our storage - which seems to be the issue here - is using QLogic HBAs. We never actually had problems where the NICs refused to send traffic. I have now forwarded VMware my findings yet again and asked for an explanation and how we can avoid this in the future. Still waiting for a reply, so it will be interesting to see what the outcome is.
- We are 100% FC SAN based storage using HBAs as well; the original outage was caused by the Emulex NICs, see below:
We introduced an additional blade to our infrastructure. It was load-tested for 10 days, all stable and nice. Then on Monday that host disappeared from vCenter.
The host itself was still up, it just could not connect to vCenter / the vSphere Client. The VMs were up too, so that was a bonus. After hours with VMware support they basically gave up and we had no choice but to bounce the host - and to add insult to injury, HA didn't work and did not fail the VMs over.
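For what it's worth, before bouncing a host that is still up but won't reconnect to vCenter, it is usually worth trying to restart the management agents from the ESXi shell or DCUI first - the VMs keep running while hostd/vpxa restart. A minimal sketch (assuming you can still get shell/console access; exact script names can vary between ESXi builds):

    # check whether the management agents are still responding
    /etc/init.d/hostd status
    /etc/init.d/vpxa status

    # restart just the agents vCenter talks to - running VMs are not affected
    /etc/init.d/hostd restart
    /etc/init.d/vpxa restart

    # or restart all management agents in one go
    services.sh restart

It won't help if the NIC driver itself has wedged, but it is non-disruptive, so there is little to lose before a full reboot.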
-> This was caused by the Emulex NICs, which started the mess... the remaining problem now looks like a storage issue to me:
Remediation:
1. To avoid further network outages, remove the Emulex NICs and replace them with Broadcom.
2. To resolve the current latency issues, press the storage vendor and reboot the array controllers ASAP (a quick way to capture the latency evidence is sketched below).
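On point 2, it helps to capture device latencies from the host before and after the vendor bounces the controllers, so you have numbers to push back with. A rough sketch (esxcli syntax shown is the 5.x form, older builds differ; the device ID is just a placeholder):

    # interactive latency view from the ESXi shell (or resxtop from the vMA):
    #   run esxtop, press 'd' for the HBA view or 'u' for the per-device view
    #   DAVG/cmd = latency from the array/fabric, KAVG/cmd = latency in the VMkernel,
    #   GAVG/cmd = DAVG + KAVG = what the guest actually sees
    esxtop

    # list the HBAs, their drivers and link state
    esxcli storage core adapter list

    # per-device statistics for a specific LUN (device ID is a placeholder)
    esxcli storage core device stats get -d naa.xxxxxxxxxxxxxxxx

If DAVG dominates while KAVG stays near zero, the latency is coming from the fabric or the array, which strengthens the case against the storage vendor rather than the hosts.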
Just my two cents on this issue.