Sun Microsystems, Inc.  Sun System Handbook - ISO 4.1 October 2012 Internal/Partner Edition

Asset ID: 1-72-1478668.1
Update Date:2012-08-16
Keywords:

Solution Type  Problem Resolution Sure

Solution  1478668.1 :   Sun Storage 7000 Unified Storage System: Messages like "stmf_ic_scsi_data_msg_unmarshal: nvlist lookup of icsd_task_msgid and friends failed" when upgrading appliance software on a cluster  


Related Items
  • Sun Storage 7410 Unified Storage System
  • Sun Storage 7310 Unified Storage System
  • Sun ZFS Storage 7320
  • Sun ZFS Storage 7420
Related Categories
  • PLA-Support>Sun Systems>DISK>NAS>SN-DK: 7xxx NAS
  • .Old GCS Categories>Sun Microsystems>Storage - Disk>Unified Storage




In this Document
Symptoms
Changes
Cause
Solution


Applies to:

Sun Storage 7410 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun Storage 7310 Unified Storage System - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7320 - Version Not Applicable to Not Applicable [Release N/A]
Sun ZFS Storage 7420 - Version Not Applicable to Not Applicable [Release N/A]
7000 Appliance OS (Fishworks)

Symptoms

When upgrading a cluster that presents Fibre Channel LUNs to clients, error messages like the following may appear in the system log after the first head has been upgraded successfully:

Jul 26 11:48:18 7420-a pppt: [ID 247295 kern.warning] WARNING: stmf_ic_scsi_data_msg_unmarshal: nvlist lookup of icsd_task_msgid and friends failed
Jul 26 11:48:18 7420-a pppt: [ID 100316 kern.warning] WARNING: stmf_ic_rx_msg: unmarshal failed

These messages are visible in both system.sys and debug.sys, whether viewed from the shell or in a support bundle.
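
When reviewing a saved copy of system.sys or debug.sys, it can help to separate these two warnings from everything else in the log. The following is a minimal sketch in Python, not an Oracle-supplied tool: the function names and the third sample line are hypothetical, and the patterns are taken only from the two messages quoted above. It does not check whether the cluster is actually mid-upgrade, so a match is harmless only in the situation this document describes.

```python
import re

# Patterns built from the two warnings quoted in this document.
# The numeric message ID is matched loosely in case it differs.
EXPECTED_WARNINGS = [
    re.compile(
        r"pppt: \[ID \d+ kern\.warning\] WARNING: "
        r"stmf_ic_scsi_data_msg_unmarshal: nvlist lookup of "
        r"icsd_task_msgid and friends failed"
    ),
    re.compile(
        r"pppt: \[ID \d+ kern\.warning\] WARNING: "
        r"stmf_ic_rx_msg: unmarshal failed"
    ),
]


def is_expected_upgrade_warning(line):
    """Return True if the line is one of the two pppt warnings above."""
    return any(pattern.search(line) for pattern in EXPECTED_WARNINGS)


def split_log(lines):
    """Split log lines into (expected warnings, all other lines)."""
    expected, other = [], []
    for line in lines:
        if is_expected_upgrade_warning(line):
            expected.append(line)
        else:
            other.append(line)
    return expected, other


if __name__ == "__main__":
    sample = [
        "Jul 26 11:48:18 7420-a pppt: [ID 247295 kern.warning] WARNING: "
        "stmf_ic_scsi_data_msg_unmarshal: nvlist lookup of icsd_task_msgid "
        "and friends failed",
        "Jul 26 11:48:18 7420-a pppt: [ID 100316 kern.warning] WARNING: "
        "stmf_ic_rx_msg: unmarshal failed",
        # Hypothetical unrelated line, included only for illustration.
        "Jul 26 11:48:19 7420-a example: unrelated message",
    ]
    expected, other = split_log(sample)
    print(len(expected), "expected warning(s);", len(other), "other line(s)")
```

Any lines left in the second list are unrelated to this issue and should be reviewed separately.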

At the same time, the STANDBY path(s) from the clients to the LUNs may go offline. These paths are normally presented by the "standby" cluster peer (i.e. the partner of the head that owns the pool containing the LUNs).

This can make it seem as though the error messages are related to the loss of the standby paths, and that proceeding with the upgrade may be dangerous.


Changes

The cluster appliance software is being upgraded.

One head has been successfully upgraded, has rebooted, and has rejoined the cluster, but is now running a different appliance software version from its peer.

Cause

There is actually no problem here.

The STANDBY paths are lost because one cluster head is now running a different version of the appliance software from its peer, so no synchronization of configuration takes place. Only the head that owns the pool containing the LUNs can provide paths at this time: the ACTIVE paths. This is entirely expected.

The error messages reported are due to task management commands being sent down the offline path as clients attempt to discover LUNs. There is no fault associated with them.

See <SunBug 7187364>

Solution

Continue with the upgrade as normal.

When the second head is upgraded and the reboots take place, the paths that were previously offline on the peer head will come online and become ACTIVE.

When the second head finishes the upgrade and rejoins the cluster, both heads run the same appliance version and the configuration can be fully synchronized. The standby paths to all the LUNs will then reappear.

Once the standby paths are back, no further error messages will be seen in the system log.


Attachments
This solution has no attachment
  Copyright © 2012 Sun Microsystems, Inc.  All rights reserved.