HP VSS update fixes Cluster Shared Volume error: ‘STATUS_IO_DEVICE_ERROR(c0000185)’

If you’re mounting CSVs over an iSCSI session in Windows Server, then you’re no doubt aware of the many hotfixes that Microsoft has put out to address various disconnect and data corruption issues, in particular the one referenced in KB2878635.

However, in our environment, even with all available hotfixes installed on all of our nodes, we were still getting the error below (quoted in full so that search engines can find it) during VSS backups, and once in a while even during non-backup operations:

Cluster Shared Volume ‘MSA 2040-1 vd02’ (‘MSA 2040-1 vd02’) is no longer available on this node because of ‘STATUS_IO_DEVICE_ERROR(c0000185)’. All I/O will temporarily be queued until a path to the volume is reestablished.

It turns out that the true fix might have come from our storage vendor, HP, in the form of an updated VSS hardware provider.

According to the release notes, this update prevents “mapping of volumes to non-existent controller ports on controllers with less than four host ports.” It just so happens that we are currently using only two of the four available iSCSI ports on each controller. Since installing the updated VSS, VDS and CAPI Proxy software and rebooting each node, we haven’t seen these errors!

4/30/2014 – Update: the ‘STATUS_IO_DEVICE_ERROR(c0000185)’ errors have returned; ignore the above.

5/14/2014 – Update: problem solved! I found that two of our nodes still had a VSS hardware provider for a retired SAN installed. It didn’t show up in the Windows uninstaller Control Panel and, if my memory serves, it didn’t show up in ‘vssadmin list providers’ either, but a search of the registry for the provider name turned up references all over the place. We could delete all of the references on one node, but on the other node the registry permissions were too screwed up to delete the keys. After deleting the registry keys on the first node and wiping and reinstalling Windows on the other, no more Event ID 5120 errors!
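
For anyone who wants to check their own nodes for leftover providers, here is a rough Python sketch of that hunt. It parses the text printed by ‘vssadmin list providers’ (the command that lists registered VSS providers) and flags any provider you did not expect to see. Since our stale provider was hiding in the registry, it also includes a Windows-only helper that lists the entries under the VSS Providers registry key. Everything here is an assumption for illustration: the function names are mine, the second provider in the sample output is made up, and I am assuming the display name lives in each key's default value. A stale provider may also leave references elsewhere in the registry that this will not find.

```python
import re
import sys

# Made-up sample in the layout `vssadmin list providers` prints. The second
# provider name and GUID are hypothetical placeholders, not a real product.
SAMPLE_OUTPUT = """\
Provider name: 'Microsoft Software Shadow Copy provider 1.0'
   Provider type: System
   Provider Id: {b5946137-7b9f-4925-af80-51abd60b20d5}
   Version: 1.0.0.7

Provider name: 'Example Retired SAN VSS Provider'
   Provider type: Hardware
   Provider Id: {00000000-0000-0000-0000-000000000001}
   Version: 1.2.3.4
"""

# Matches the "Provider name/type/Id" lines; other lines are ignored.
FIELD = re.compile(r"^\s*Provider (name|type|Id):\s*(.+?)\s*$")


def parse_providers(text):
    """Return one dict per provider with 'name', 'type' and 'id' keys."""
    providers = []
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue
        key, value = match.group(1).lower(), match.group(2).strip("'")
        if key == "name":
            providers.append({"name": value})  # a name line starts a new record
        elif providers:
            providers[-1][key] = value
    return providers


def unexpected_providers(providers, expected_names):
    """Return the providers whose name is not in the expected set."""
    return [p for p in providers if p["name"] not in expected_names]


def registry_provider_names():
    """Windows only: list (GUID, name) pairs under the VSS Providers key.

    Assumes each provider's display name is stored in the default value
    of its GUID subkey; an empty string is returned if it is missing.
    """
    import winreg  # imported here so the rest of the script runs anywhere

    path = r"SYSTEM\CurrentControlSet\Services\VSS\Providers"
    names = []
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as root:
        subkey_count = winreg.QueryInfoKey(root)[0]
        for index in range(subkey_count):
            guid = winreg.EnumKey(root, index)
            with winreg.OpenKey(root, guid) as subkey:
                try:
                    name, _ = winreg.QueryValueEx(subkey, "")
                except OSError:
                    name = ""
                names.append((guid, name))
    return names


if __name__ == "__main__":
    expected = {"Microsoft Software Shadow Copy provider 1.0"}
    for provider in unexpected_providers(parse_providers(SAMPLE_OUTPUT), expected):
        print("Unexpected provider:", provider["name"], provider["id"])
    if sys.platform == "win32":
        for guid, name in registry_provider_names():
            if name not in expected:
                print("Registry entry to review:", guid, name)
```

To use it on a real node, replace SAMPLE_OUTPUT with the actual output of the command and add your current SAN's provider name to the expected set. Anything that shows up in the registry but not in the command output deserves a closer look.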

1/21/2015

Recently, we started experiencing the same ‘STATUS_IO_DEVICE_ERROR(c0000185)’ errors again.

The problem is that when VSS-enabled backup software such as Veeam B&R (not that it’s Veeam’s fault; Veeam B&R is awesome!) starts a backup, it chooses among the available VSS providers on its own. But you can (and should!) force it to use the correct VSS provider for your target (SAN).

The reappearance of this problem probably coincided with creating a new failover cluster. New cluster means default settings in Veeam B&R.