Hi all, I'm getting my notes together for VCP-550 (yes, 550) and have concluded that I really don't understand multipathing. So I will be asking a whole flurry of questions on the subject.
First question:
How does multipathing really work?
I am trying to think through how it would be implemented in software, and here's the mental model I have:
- there is an ESXi host with 2 or more FC HBAs
- there are 2 or more FC switches
- there is a storage array with 2 or more FC ports
The ESXi hypervisor must have some process that manages a "routing table", analogous to a Cisco router maintaining its routing table.
In the case of FC multipathing, the host has a destination FC address and figures out that there are N possible paths to it; let's say all N paths are up. So the "routing table" has a list of target LUNs, and for each LUN it has a list of egress FC ports and maybe the policy in effect (Fixed, MRU, or RR).
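To make the "routing table" idea concrete, here is a rough Python sketch of the structure I'm picturing. Every name in it (`Policy`, `Path`, `LunEntry`, `table`, the `vmhba`/`array-port` labels) is made up by me for illustration; it is only my mental model, not how ESXi actually stores this.

```python
from dataclasses import dataclass, field
from enum import Enum


class Policy(Enum):
    FIXED = "Fixed"
    MRU = "MRU"
    RR = "RR"


@dataclass
class Path:
    hba: str          # egress FC port on the host (hypothetical label)
    target_port: str  # FC port on the array where this path ends
    up: bool = True


@dataclass
class LunEntry:
    policy: Policy
    paths: list = field(default_factory=list)

    def live_paths(self):
        """Return only the paths currently marked up."""
        return [p for p in self.paths if p.up]


# The "routing table": one entry per target LUN.
table = {
    "lun-0": LunEntry(
        Policy.RR,
        [Path("vmhba1", "array-port-A"), Path("vmhba2", "array-port-B")],
    ),
}


def egress_ports(lun_id):
    """List the host ports that can currently reach the given LUN."""
    return [p.hba for p in table[lun_id].live_paths()]
```

With both paths up, `egress_ports("lun-0")` lists both HBAs; marking a path down (`up = False`) drops it from the list, which is the "route withdrawn" behaviour I'd expect by analogy with a router.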
When a VM issues a SCSI write command to a virtual disk, ESXi "intercepts" that command, encapsulates it into FCP frames, looks at its "routing table", and realizes there are multiple paths and that multipathing has been configured. The hypervisor then decides what to do with each outgoing FCP frame based on the multipathing config, let's say Round Robin, and transmits the frames to the array according to the path selection policy of the NMP (Native Multipathing Plugin).
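Here is how I imagine the path choice itself, as a small Python sketch. The `PathSelector` class and its methods are hypothetical names of mine, and the behaviour is my understanding of the three policies: Fixed returns to the preferred path whenever it is up, MRU stays on whatever path it last used until that path fails, and RR rotates across the live paths. The sketch rotates on every call to `select()`, which is a simplification on my part.

```python
class PathSelector:
    """Toy model of per-LUN path selection (illustrative only, not ESXi code)."""

    def __init__(self, paths, policy="RR", preferred=None):
        self.paths = list(paths)
        self.state = {p: True for p in self.paths}  # path -> is it up?
        self.policy = policy
        self.preferred = preferred if preferred is not None else self.paths[0]
        self.current = self.preferred
        self._next = 0  # rotation counter used by RR

    def mark(self, path, up):
        """Record a path going up or down."""
        self.state[path] = up

    def select(self):
        """Pick the path for the next outgoing unit of work."""
        live = [p for p in self.paths if self.state[p]]
        if not live:
            raise RuntimeError("all paths down")
        if self.policy == "Fixed":
            # Use the preferred path whenever it is up (fails back).
            self.current = self.preferred if self.state[self.preferred] else live[0]
        elif self.policy == "MRU":
            # Stay on the current path until it fails (no failback).
            if not self.state[self.current]:
                self.current = live[0]
        elif self.policy == "RR":
            # Rotate across whatever paths are currently up.
            self.current = live[self._next % len(live)]
            self._next += 1
        else:
            raise ValueError(f"unknown policy: {self.policy}")
        return self.current


rr = PathSelector(["vmhba1", "vmhba2"], policy="RR")
print([rr.select() for _ in range(4)])
```

The difference I'm trying to capture between Fixed and MRU shows up only after a failed path recovers: Fixed goes back to the preferred path, MRU stays where it is.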
The frames eventually reach the storage array, possibly arriving on multiple FC ports, so the array's network stack needs to reassemble them in order and then pass the raw SCSI command to the SCSI controller. After the write, the SCSI controller returns a completion status message. The storage array's network stack then looks at its own "routing table", chooses an egress FC port, and transmits the frame back to the host.
So in other words, multipathing is handled entirely by the OS of the host and of the array; the VM has no clue, and the FC switches in between have no idea what those two rascals are doing.
Is this anywhere close to how multipathing works?