Channel: VMware Communities : Discussion List - All Communities

Round Robin PSP and Clariion CX4 ALUA


I'm trying to understand the documented "need" to enable ALUA for the host when you use Round Robin Path Selection in ESXi 5.0.

 

Here are the paths for an RDM LUN:
[Screenshot: VMware MRU.PNG]

I have four paths: two are active, and one of those is being used for IO. I assume the two active paths are those going to the Clariion Storage Processor (SP) that currently owns the LUN, and the two "Stand by" paths go to the other SP. So ESXi has looked at all the available paths and picked one of the two active ones to use for IO. It will stick with this path unless it fails, and will then switch to another, which I assume will be a different active path. The failure might be caused by a LUN trespass initiated by the array (e.g. a code upgrade on the SP) or by a physical failure of the HBA, fabric, or SP FC port on the path that is currently active and being used for IO. Which paths are marked as Active or Stand by is determined by which SP currently owns the LUN.
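
The behaviour described above can be sketched as a toy Python model. This is only an illustration of my understanding, not VMware's actual PSP code: the `Path` class, the `path_state` and `mru_select` helpers, and the path names are all hypothetical, and the model ignores details such as the host itself triggering a trespass.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str            # hypothetical runtime name of the path
    sp: str              # storage processor the path ends on: "A" or "B"
    failed: bool = False


def path_state(path, owner_sp):
    """Paths to the SP that owns the LUN are active, the others are standby."""
    if path.failed:
        return "dead"
    return "active" if path.sp == owner_sp else "standby"


def mru_select(paths, owner_sp, current=None):
    """Keep the current path while it is active, else take the first active one."""
    by_name = {p.name: p for p in paths}
    if current in by_name and path_state(by_name[current], owner_sp) == "active":
        return current
    for p in paths:
        if path_state(p, owner_sp) == "active":
            return p.name
    return None  # no usable path in this simplified model


paths = [
    Path("vmhba1:C0:T0:L5", "A"),
    Path("vmhba2:C0:T0:L5", "A"),
    Path("vmhba1:C0:T1:L5", "B"),
    Path("vmhba2:C0:T1:L5", "B"),
]

current = mru_select(paths, owner_sp="A")                  # initial choice
paths[0].failed = True                                     # e.g. an HBA or FC port fails
after_failure = mru_select(paths, "A", current)            # moves to the other active path
after_trespass = mru_select(paths, "B", after_failure)     # LUN now owned by SP B
```

In this model a physical failure moves IO to the other path on the same SP, while a trespass moves it to a path on the other SP, because the set of active paths follows LUN ownership.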

 

Now, here's a VM where I've changed the PSP to Round Robin:

[Screenshot: vmware rr.PNG]

Now both active paths are being used for IO. These are all the available paths to the SP that owns the LUN. The paths to the other SP are still shown as Stand by. Presumably the ESXi host won't use these paths: according to https://community.emc.com/docs/DOC-12817, and assuming the Clariion initiator failover mode is set to 1, IOs sent down these paths will be rejected, so the host wouldn't be able to use them even if it wanted to.
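
As a rough sketch of what I think Round Robin is doing here, the toy Python model below rotates IO across only the paths that lead to the owning SP. The `round_robin` function and the path names are hypothetical, and it switches path on every IO purely for simplicity; the real PSP switches after a configurable number of IOs.

```python
from collections import Counter
from itertools import cycle, islice


def round_robin(paths, owner_sp, n_ios):
    """Return the path used for each of n_ios IOs, rotating over active paths only."""
    active = [name for name, sp in paths if sp == owner_sp]
    return list(islice(cycle(active), n_ios))


# (path name, SP the path ends on); names are made up for illustration
paths = [
    ("vmhba1:C0:T0:L5", "A"),
    ("vmhba2:C0:T0:L5", "A"),
    ("vmhba1:C0:T1:L5", "B"),
    ("vmhba2:C0:T1:L5", "B"),
]

issued = round_robin(paths, owner_sp="A", n_ios=8)
usage = Counter(issued)  # IO count per path; standby paths never appear
```

With SP A owning the LUN, the eight IOs are split evenly over the two paths to SP A and nothing is sent to SP B.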

 

What I don't understand is why the recommendation is to change the Clariion settings for the ESXi host to enable ALUA (change the initiator failover mode to 4). I know that by using Round Robin with initiator failover mode set to 1 I'm only using 50% of the possible paths, but if I changed this to 4, 50% of the IO would be going to the "wrong" SP. OK, ALUA might handle that, but the documentation seems to imply that I "must" use ALUA with Round Robin. Why?

 

What am I missing/not understanding? From my point of view, using both active paths is still twice as good as just using one.
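
To make the arithmetic behind that explicit, here is a small Python sketch. It takes at face value my assumption that with failover mode 4 Round Robin would spread IO evenly over all four paths; whether ESXi really does that is part of what I'm asking. The `path_usage` helper and the constants are just for illustration.

```python
from fractions import Fraction

TOTAL_PATHS = 4      # paths to the LUN in my setup
PATHS_TO_OWNER = 2   # paths ending on the SP that owns the LUN


def path_usage(paths_used):
    """Return (share of all paths in use, share of IO reaching the non-owning SP).

    Assumes the owning SP's paths are used first and IO is spread evenly.
    """
    share_used = Fraction(paths_used, TOTAL_PATHS)
    to_non_owner = max(paths_used - PATHS_TO_OWNER, 0)
    return share_used, Fraction(to_non_owner, paths_used)


scenarios = {
    "MRU, failover mode 1": 1,
    "Round Robin, failover mode 1": 2,
    "Round Robin over all paths (assumed for failover mode 4)": 4,
}

for label, used in scenarios.items():
    share_used, share_wrong_sp = path_usage(used)
    print(f"{label}: {share_used} of paths used, {share_wrong_sp} of IO to the non-owning SP")
```

Under these assumptions, Round Robin with failover mode 1 doubles the number of paths carrying IO compared with MRU without sending anything to the non-owning SP.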

 

I've seen references to Round Robin causing LUN trespass, but surely that would need the initiator failover mode to be set to 0 (LUN-based trespass mode). DOC-12817 implies that initiator failover mode defaults to 0, but all my hosts seem to be set to 1, and I (and the other storage admins) don't recall changing this. Perhaps what I've been reading is now out of date?

 

Thanks in advance for any help/advice.

