Hi All,
First and foremost, hello to everyone in the VMware community. I'm new here and this is my first post. I'm hoping someone will be able to assist me, as I am at my wits' end.
Our Setup
Hosts : DL380 Gen8s running ESXi 5.5
HBAs : Emulex LPe12000 8Gb
Switch : HP SAN 8/8 Fabric switch
SAN : P2000 G3 FC ( 3 Enclosures with 10 x 900GB 10K rpm disks in each )
I have noticed high disk latency on a couple of our servers. For example, on a SQL server running one DB, the drive holding that DB shows latency of 60ms. On another server, where we have a raw LUN mapped for file share purposes, writing to that LUN generates latency of around 60ms or more.
Hardware acceleration is enabled, along with Storage I/O Control. We are using RAID 5 vdisks, with approximately three small 2TB volumes presented to the ESXi hosts to form a storage cluster.
I've tried everything I can think of to reduce the latency: ensuring there is adequate RAM/CPU, making sure the disks are eager zeroed, and splitting the drives across multiple SCSI controllers.
I've also done the following, as per HP's recommended best practices for the HP P2000 G3 SAN.
Changed the default PSP to Round Robin for the HP P2000 G3:
esxcli storage nmp satp set --default-psp=VMW_PSP_RR --satp=VMW_SATP_ALUA
Next, ran this to set the PSP for all existing volumes to Round Robin:
for i in `esxcli storage nmp device list | grep naa.600` ; do esxcli storage nmp device set --device $i --psp VMW_PSP_RR; done
Finally, set the path change frequency from 1000 IOPS (default) to 1, so that each I/O goes down the next (optimized) path:
for i in `esxcli storage nmp device list | grep naa.600` ; do esxcli storage nmp psp roundrobin deviceconfig set -t iops -I 1 -d $i; done
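One thing worth noting about the two loops above: `grep naa.600` also matches the indented "Device Display Name" lines, which carry the NAA ID in parentheses, so the loop variable picks up stray words as well as device IDs. A minimal sketch of a stricter filter follows, assuming the device ID starts at the beginning of its line in the `esxcli storage nmp device list` output. The function name `list_naa_devices` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `esxcli storage nmp device list` output on stdin
# and print only the device IDs, one per line.
list_naa_devices() {
    # Anchor at line start so the indented
    # "Device Display Name: ... (naa.600...)" lines are skipped.
    grep '^naa\.600'
}

# Possible usage on a host, to read back the Round Robin settings per device:
#   for dev in $(esxcli storage nmp device list | list_naa_devices); do
#       esxcli storage nmp psp roundrobin deviceconfig get -d "$dev"
#   done
```

Reading the settings back this way should show whether the IOPS limit of 1 was actually applied to each LUN.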
This has made no difference to the latency.
I've also installed the latest VAAI.
I've taken some random snippets from two hosts.
The VMkernel logs for host 2 are reporting:
2014-01-14T20:18:45.807Z cpu8:33531)WARNING: Migrate: 262: Invalid message type for new connection: 542393671. Expecting message of type INIT (0).
2014-01-14T20:19:01.661Z cpu12:34002)Config: 346: "HostLocalSwapDirEnabled" = 0, Old Value: 0, (Status: 0x0)
2014-01-14T20:21:02.771Z cpu6:32811)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412e803e8cc0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2014-01-14T20:21:02.771Z cpu6:32811)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2014-01-14T20:21:02.771Z cpu6:32811)ScsiDeviceIO: 2337: Cmd(0x412e803e8cc0) 0x1a, CmdSN 0x21cb from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2014-01-14T20:21:02.787Z cpu6:32811)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412e803e8cc0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2014-01-14T20:21:02.806Z cpu4:32809)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412e803e8cc0, 0) to dev "mpx.vmhba35:C0:T0:L0" on path "vmhba35:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2014-01-14T20:21:02.806Z cpu4:32809)ScsiDeviceIO: 2337: Cmd(0x412e803e8cc0) 0x1a, CmdSN 0x21cf from world 0 to dev "mpx.vmhba35:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2014-01-14T20:26:02.773Z cpu3:32808)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412e82d28680, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2014-01-14T20:26:02.773Z cpu3:32808)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2014-01-14T20:26:02.773Z cpu3:32808)ScsiDeviceIO: 2337: Cmd(0x412e82d28680) 0x1a, CmdSN 0x21e8 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2014-01-14T20:26:02.792Z cpu3:32808)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412e82d28680, 0) to dev "mpx.vmhba35:C0:T0:L0" on path "vmhba35:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2014-01-14T20:26:02.792Z cpu3:32808)ScsiDeviceIO: 2337: Cmd(0x412e82d28680) 0x1a, CmdSN 0x21ee from world 0 to dev "mpx.vmhba35:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
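To make excerpts like this easier to read, here is a rough sketch that counts the `ScsiDeviceIO` failures per device and per H/D/P status triple. It assumes the log format shown above, with the device name in quotes after the word `dev` and the status starting at the `H:0x..` field. The function name `summarize_scsi_failures` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count ScsiDeviceIO failures per device and status.
# Reads vmkernel log text on stdin; prints "<count> <device> <H> <D> <P>".
summarize_scsi_failures() {
    awk '
        /ScsiDeviceIO/ && / failed H:/ {
            dev = ""; status = ""
            for (i = 1; i <= NF; i++) {
                # The quoted device name follows the word "dev".
                if ($i == "dev") dev = $(i + 1)
                # Host, device and plugin status are three adjacent fields.
                if ($i ~ /^H:0x/) status = $i " " $(i + 1) " " $(i + 2)
            }
            gsub(/"/, "", dev)
            count[dev " " status]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -k2
}

# Possible usage on a host:
#   summarize_scsi_failures < /var/log/vmkernel.log
```

Run against the host 2 excerpt above, every failure is grouped under the two `mpx.vmhba` devices; none of those lines reference a `naa.600` LUN.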
Host 5 is reporting:
2014-01-14T20:17:00.194Z cpu15:32820)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x2a failed <3/23> sid x010400, did x010000, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:03.400Z cpu16:33500)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x2a failed <1/3> sid x010400, did x010800, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:35.242Z cpu10:33500)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x28 failed <0/4> sid x010400, did x010900, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:35.242Z cpu10:33322)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x28 failed <0/4> sid x010400, did x010900, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:35.242Z cpu10:33322)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x28 failed <0/4> sid x010400, did x010900, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:35.242Z cpu10:33500)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x28 failed <0/4> sid x010400, did x010900, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:35.242Z cpu10:33322)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x28 failed <0/4> sid x010400, did x010900, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:35.242Z cpu10:33322)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x28 failed <0/4> sid x010400, did x010900, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:35.243Z cpu10:33322)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x28 failed <0/4> sid x010400, did x010900, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:35.243Z cpu10:33322)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x28 failed <0/4> sid x010400, did x010900, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:17:35.245Z cpu10:33500)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x28 failed <0/4> sid x010400, did x010900, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:18:09.402Z cpu1:32800)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd x2a failed <1/3> sid x010400, did x010800, oxid xffff SCSI Reservation Conflict -
2014-01-14T20:20:42.407Z cpu6:32811)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 1:(0):3271: FCP cmd xa3 failed <0/1> sid x010400, did x010900, oxid xffff SCSI Chk Cond - Unit Attn: Data(x2:x6:x3f:xe)
2014-01-14T20:20:42.409Z cpu23:514490)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 0:(0):3271: FCP cmd xa3 failed <3/2> sid x010400, did x010900, oxid xffff SCSI Chk Cond - Unit Attn: Data(x2:x6:x3f:xe)
2014-01-14T20:20:42.409Z cpu23:514490)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 0:(0):3271: FCP cmd xa3 failed <3/1> sid x010400, did x010900, oxid xffff SCSI Chk Cond - Unit Attn: Data(x2:x6:x3f:xe)
2014-01-14T20:20:42.409Z cpu23:514490)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 0:(0):3271: FCP cmd xa3 failed <3/8> sid x010400, did x010900, oxid xffff SCSI Chk Cond - Unit Attn: Data(x2:x6:x3f:xe)
2014-01-14T20:20:42.409Z cpu23:33320)lpfc: lpfc_scsi_cmd_iocb_cmpl:2145: 0:(0):3271: FCP cmd xa3 failed <3/0> sid x010400, did x010900, oxid xffff SCSI Chk Cond - Unit Attn: Data(x2:x6:x3f:xe)
2014-01-14T20:20:42.414Z cpu19:32824)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412fc00128c0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2014-01-14T20:20:42.414Z cpu19:32824)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2014-01-14T20:20:42.414Z cpu19:32824)ScsiDeviceIO: 2337: Cmd(0x412fc00128c0) 0x1a, CmdSN 0x888e from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2014-01-14T20:25:42.423Z cpu21:32826)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412fc6702c80, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x4 0x0 0x0. Act:EVAL
2014-01-14T20:25:42.423Z cpu21:32826)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2014-01-14T20:25:42.423Z cpu21:32826)ScsiDeviceIO: 2337: Cmd(0x412fc6702c80) 0x1a, CmdSN 0x88aa from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x4 0x0 0x0.
2014-01-14T20:30:42.432Z cpu21:32826)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412fc42be600, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2014-01-14T20:30:42.432Z cpu21:32826)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2014-01-14T20:30:42.432Z cpu21:32826)ScsiDeviceIO: 2337: Cmd(0x412fc42be600) 0x1a, CmdSN 0x88c2 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2014-01-14T20:32:07.873Z cpu11:32816)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412e8a0d9ec0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2014-01-14T20:32:07.873Z cpu11:32816)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2014-01-14T20:32:07.873Z cpu11:32816)ScsiDeviceIO: 2337: Cmd(0x412e8a0d9ec0) 0x1a, CmdSN 0x88cc from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2014-01-14T20:32:07.891Z cpu11:32816)ScsiDeviceIO: 2337: Cmd(0x412e8a0d9ec0) 0x1a, CmdSN 0x88cd from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2014-01-14T20:35:42.441Z cpu19:32824)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412fc6531c00, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2014-01-14T20:35:42.441Z cpu19:32824)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2014-01-14T20:35:42.441Z cpu19:32824)ScsiDeviceIO: 2337: Cmd(0x412fc6531c00) 0x1a, CmdSN 0x88db from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2014-01-14T20:40:42.447Z cpu12:32817)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412fc23d6200, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0. Act:EVAL
2014-01-14T20:40:42.447Z cpu12:32817)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2014-01-14T20:40:42.447Z cpu12:32817)ScsiDeviceIO: 2337: Cmd(0x412fc23d6200) 0x1a, CmdSN 0x88f1 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.
2014-01-14T20:45:42.457Z cpu19:32824)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x1a (0x412fc36e9240, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2014-01-14T20:45:42.457Z cpu19:32824)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "mpx.vmhba32:C0:T0:L0" state in doubt; requested fast path state update...
2014-01-14T20:45:42.457Z cpu19:32824)ScsiDeviceIO: 2337: Cmd(0x412fc36e9240) 0x1a, CmdSN 0x8907 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
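The reservation conflicts in the host 5 log can be tallied in a similar way. This is a small sketch that counts the `lpfc` "SCSI Reservation Conflict" lines per FCP opcode, assuming the opcode is the field right after `cmd` as in the lines above. The function name `count_reservation_conflicts` is made up, and only the two opcodes seen here are labelled (x28 and x2a are the standard SCSI READ(10) and WRITE(10) opcodes).

```shell
#!/bin/sh
# Hypothetical helper: count lpfc reservation conflicts per FCP opcode.
# Reads vmkernel log text on stdin; prints "<count> <opcode> <label>".
count_reservation_conflicts() {
    awk '
        /lpfc:/ && /SCSI Reservation Conflict/ {
            op = ""
            # The opcode is the field that follows the word "cmd".
            for (i = 1; i < NF; i++)
                if ($i == "cmd") op = $(i + 1)
            count[op]++
        }
        END {
            # Standard SCSI opcodes; anything else is labelled "other".
            name["x28"] = "READ(10)"
            name["x2a"] = "WRITE(10)"
            for (op in count) {
                label = (op in name) ? name[op] : "other"
                print count[op], op, label
            }
        }
    ' | sort -k2
}

# Possible usage on a host:
#   count_reservation_conflicts < /var/log/vmkernel.log
```

On the host 5 excerpt above this gives 9 conflicts on READ(10) and 3 on WRITE(10), all within roughly a minute.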
I've attached some pictures:
Latency 1 : this is a server using VMDKs only, no RDMs
Latency 2 : this server has an RDM attached (scsi1:0); latency is the same
I've also updated the HBA firmware on host 2 to 2.01a12, which is the latest available for the LPe12000. Again, this has made no difference.
The SAN management controllers are also running the latest available firmware.
Any help would be greatly appreciated.