An important FC zoning tip to simplify your life

Use WWN-based FC zones, not port-based ones. It will save you effort during troubleshooting.

Many storage teams use port numbers instead of WWNs when configuring FC zones. Are you one of them?

Personally, I think you are making a big mistake.

Port-based FC zones seem to make life easier, but I will show you that the opposite can be true.

If every configuration change has to be handled as a planned change under your documented processes, read on.

Is life easier with port-based FC zones?

Consider a common storage infrastructure design.

A disk array is connected to the FC switch by a few pairs of FC ports (usually labeled 1A, 1B, 2A, 2B, …). The servers are connected by a pair of FC ports (serverA, serverB) as well.

Then a typical situation occurs: the monitoring system sends an alert that one particular server has lost one disk path.

Because it happened on only one server, you can rule out trouble on the path between the array and the switch.

That leaves a fault on the server’s FC port, the FC cable, or the switch’s FC port.

What does the usual troubleshooting process look like?

Check the port status on the switch: the presence of an optical signal, Tx/Rx levels, bad-frame counters, and so on. (I suggest monitoring the error counters periodically and reporting anomalies.)

Everything seems fine on the switch side (counters and so on), except that the Rx signal is zero.

Therefore you have a faulty SFP on the server or on the FC switch, or a broken cable.
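
The reasoning above can be written down as a small triage rule. The Python sketch below is only an illustration: the `PortReading` fields, the `triage` function, and the 1 µW threshold are hypothetical names and values, not taken from any switch vendor's interface.

```python
from dataclasses import dataclass


@dataclass
class PortReading:
    """Hypothetical snapshot of one switch port (field names are assumptions)."""
    rx_power_uw: float  # received optical power in microwatts
    tx_power_uw: float  # transmitted optical power in microwatts
    bad_frames: int     # growth of the bad-frame counter since the last poll


def triage(reading: PortReading, rx_floor_uw: float = 1.0) -> list[str]:
    """Return the components worth suspecting for one port reading."""
    if reading.rx_power_uw < rx_floor_uw:
        # No light arrives at the switch: suspect the sending SFP,
        # the cable, or the receiving SFP on the switch.
        return ["server SFP", "FC cable", "switch SFP"]
    if reading.bad_frames > 0:
        # Light is present but frames are damaged: report it as an anomaly.
        return ["investigate error counters"]
    return []


suspects = triage(PortReading(rx_power_uw=0.0, tx_power_uw=450.0, bad_frames=0))
print(suspects)
```

A rule like this only narrows the list of suspects. Finding the actual culprit still takes hands-on work in the data center.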

Over years of practice, my colleagues and I tuned the troubleshooting procedure as follows:

  1. Move the cable on the FC switch to another port (if one is available), as it’s the easiest step. (Replacing the switch SFP will not always help, as the port behind it can be broken as well.)
    If it didn’t help, just move the cable back.

  2. Replace the SFP on the server.

  3. Replace the FC cable between the server and the patch panel. It’s the second easiest step, and the probability of eliminating the failure is pretty high.

  4. Replace the FC cable between the FC switch and the patch panel. The replacement is as difficult as in step 3, but cables next to switches are (usually) moved around less often than those in server racks.

  5. Replace the FC card in the server.

With this procedure, you will eliminate all possible points of failure.

Note that you start with the easiest step: just unplug the FC cable from the switch and plug it into another port. In most cases, that solves the issue.

And the beauty is, you don’t need to reconfigure anything.

But only if you are not using port-based FC zones.

How does step 1 have to change with port-based FC zones?

1a. Add a new, unused port to the zone
1b. Move the cable to this new port
1c. If it didn’t help, move the cable back to its original port
1d. Remove the new port from the FC zone
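
To see why the extra steps appear, here is a minimal Python model of zone membership. It is a sketch, not how any real fabric is implemented: the `is_zoned` function, the WWNs, and the "domain,port" strings are all made up for illustration.

```python
def is_zoned(zone, attachments, wwn):
    """True if the device with this WWN is currently covered by the zone.

    zone        -- set of members: ("wwn", "<wwn>") or ("port", "<domain,port>")
    attachments -- dict mapping "<domain,port>" to the WWN logged in there
    """
    for port, logged_in in attachments.items():
        if logged_in == wwn:
            return ("wwn", wwn) in zone or ("port", port) in zone
    return False  # the device is not logged in anywhere


SERVER = "10:00:00:00:c9:00:00:01"  # made-up server WWN
wwn_zone = {("wwn", SERVER), ("wwn", "50:06:01:60:00:00:00:01")}
port_zone = {("port", "1,4"), ("port", "1,0")}

before = {"1,4": SERVER}  # cable in its original switch port
after = {"1,9": SERVER}   # cable moved to a spare port (step 1)

# WWN-based zone: membership follows the device wherever it is plugged in.
print(is_zoned(wwn_zone, before, SERVER), is_zoned(wwn_zone, after, SERVER))

# Port-based zone: the move silently drops the server out of the zone.
print(is_zoned(port_zone, before, SERVER), is_zoned(port_zone, after, SERVER))

# Step 1a: only an extra zone change brings the server back.
port_zone_fixed = port_zone | {("port", "1,9")}
print(is_zoned(port_zone_fixed, after, SERVER))
```

In this model the WWN-based zone needs no edit for the cable move, while the port-based zone needs one edit before the move and another one afterwards to clean up.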

In every environment that uses port-based FC zones, the admins claim it’s better because they don’t need to do anything when an FC card is replaced in a server. No zone change means no call on Saturday night when the vendor replaces hardware in the server.

But that’s the only case where the storage support team has an advantage.

But would you start troubleshooting with a server outage and a hardware replacement? (O.K., sometimes it’s obvious, such as when the card is missing from the device list.)

And if you are replacing an FC interface in the server, you have to raise a change request anyway, and the FC alias change is just one small task in that change. (No, don’t tell me you are not using aliases!!!)
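
Why the alias change is such a small task can be shown with a short Python sketch. The alias names, zone name, and WWNs below are invented for the example, and `resolve` is a hypothetical helper, not a switch command.

```python
# Aliases give each WWN a readable name; zones refer only to alias names.
aliases = {
    "serverA_hba0": "10:00:00:00:c9:00:00:01",  # made-up WWN of the old card
    "array_1A": "50:06:01:60:00:00:00:01",      # made-up WWN of the array port
}
zones = {"z_serverA_array1A": ["serverA_hba0", "array_1A"]}


def resolve(zone_name, aliases, zones):
    """Expand a zone into the set of WWNs its aliases currently point to."""
    return {aliases[member] for member in zones[zone_name]}


zones_before = {name: list(members) for name, members in zones.items()}

# The FC card is replaced: only the alias is repointed to the new WWN.
aliases["serverA_hba0"] = "10:00:00:00:c9:00:00:02"

print(zones == zones_before)  # zone definitions are untouched
print(sorted(resolve("z_serverA_array1A", aliases, zones)))
```

However many zones the server belongs to, the card replacement touches one alias entry and no zone definitions.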

My procedure may not seem any easier than your current one (if you are using port-based FC zones). The admin configuration effort is almost equal.

But if you are using ITIL (or a similar process framework), the difference can be huge.

Every configuration change has to pass an approval process.

And that takes much more effort than the configuration itself. That is where my advice will save you a lot of time.
