Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Sun Alert Sure Solution

1019002.1: Collecting Support Data On Certain Arrays May Cause One or Both Array Controllers to Reboot
PreviouslyPublishedAs: 231741
Bug Id: <SUNBUG: 6620880>
Product:
   Sun StorageTek Flexline 240 Array
   Sun StorageTek Flexline 280 Array
   Sun StorageTek Flexline 380 Array
   Sun StorageTek 6130 Array
   Sun StorageTek 6140 Array
   Sun StorageTek 6540 Array
Date of Workaround Release: 15-Feb-2008
Date of Resolved Release: 16-Jun-2008

Collecting Support Data On Certain Arrays May Cause One or Both Array Controllers to Reboot

1. Impact
For Sun StorageTek 6130, 6140, 6540 and Flexline 240, 280, and 380 arrays, collecting support data may cause one or both array controllers to reboot. Results can range from device path failovers to loss of access due to both primary and secondary paths to a device being inaccessible.

2. Contributing Factors
This issue can occur on the following platforms and in the following releases:
This issue can only occur during array support data collection using SANtricity or Common Array Manager (CAM), specifically during the "Capture State" function of the collection, which generates the stateCaptureData.dmp file. The following specific circumstances lead to this bug:

1) The Host Port/HBA must be, or have been, seen in the SAN by one or both controllers. To verify this, check that it is listed in the WWPN drop-down in CAM, or as an un-aliased Host Port in SANtricity. This means that "fake" WWPNs that are aliased on the array, or proactively created before the SAN is zoned, WILL NOT cause this issue.

2) The Host Port/Initiator must have an existing alias. This means that not only has the HBA been seen by the array, but at some point the user created an alias for it in order to assign it to a specific host for Volume-to-LUN mapping on the array.

3) Since having been aliased, as in condition 2, the Host Port/Initiator must have either:
   A) been removed (unaliased) and placed in the "Free" list (available as a drop-down in CAM), or
   B) had its alias name changed.

4) After conditions 1-3 are met, collecting "All Support Data", or specifically the "State Capture" (SANtricity only), will cause the controllers to panic.

3. Symptoms
At a minimum, users will see device path failovers. At worst, they will see a loss of access due to both primary and secondary paths to a device being inaccessible. This issue can only occur during array support data collection.

Host message events showing SCSI or Fibre Channel messages indicating loss of access to one or both array controllers are the most common symptom associated with this issue. SANtricity or CAM may report an error during collection, similar to the following (CAM error shown):

   ionShow 99 fails!

The controller(s) will reboot because a defined initiator or host port is not seen by the array at the time the data collection command is run.

4. Workaround
This issue can be avoided by not collecting support data when you know that:

   A) An Initiator/Host Port alias for a connected HBA has been changed.
   B) An Initiator/Host Port alias has been deleted from the array. This is normal if you replace/remove an HBA as part of reallocation or hardware replacement.

If either of the above applies, consider performing a controller reset to avoid an unplanned outage. Otherwise, individual collections of "Alarms", "Array Profiles", and "Major Event Logs" are possible within each of the management utilities.

The following CLI commands can be run to obtain support data for CAM:

   service -d <array_name> -c print -t arrayprofile > arrayProfileSummary.txt

The following SMcli or script editor commands can be run under SANtricity:
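
Separately from the management-utility commands, the contributing factors in section 2 and the workaround in section 4 can be sketched as a pre-collection check. The Python below is a minimal illustration only, not part of CAM or SANtricity: the HostPort record, its field names, and the helper functions are all hypothetical, and the port history is assumed to come from the administrator's own change records, since this alert does not describe a way to query it from the array. The only command taken from this alert is the CAM "service ... -t arrayprofile" invocation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostPort:
    """Hypothetical record of one host port/initiator, kept by the administrator."""
    wwpn: str
    seen_in_san: bool            # condition 1: seen by one or both controllers
    has_been_aliased: bool       # condition 2: an alias was created at some point
    alias_removed: bool = False  # condition 3A: unaliased, back in the "Free" list
    alias_renamed: bool = False  # condition 3B: alias name was changed


def is_at_risk(port: HostPort) -> bool:
    """True when conditions 1-3 of the alert all hold for this port."""
    return (
        port.seen_in_san
        and port.has_been_aliased
        and (port.alias_removed or port.alias_renamed)
    )


def risky_ports(ports: List[HostPort]) -> List[str]:
    """WWPNs for which a full support data collection should be avoided."""
    return [p.wwpn for p in ports if is_at_risk(p)]


def array_profile_command(array_name: str) -> List[str]:
    """argv for the CAM array profile collection quoted in this alert.

    The caller is expected to redirect stdout to a file such as
    arrayProfileSummary.txt.
    """
    return ["service", "-d", array_name, "-c", "print", "-t", "arrayprofile"]
```

If risky_ports() returns any entries, the workaround applies: skip "All Support Data" and "State Capture", and fall back to individual collections such as the array profile. A port that was aliased before ever being seen in the SAN (condition 1 not met) is not flagged, matching the note about "fake" WWPNs.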

5. Resolution
This issue is addressed in the following releases:
The above firmware releases are available with CAM 6.1.0 at:

   http://www.sun.com/download/index.jsp
   View by Category -> Systems Administration -> Storage Management

This Sun Alert notification is being provided to you on an "AS IS" basis. This Sun Alert notification may contain information provided by third parties. The issues described in this Sun Alert notification may or may not impact your system(s). Sun makes no representations, warranties, or guarantees as to the information contained herein. ANY AND ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY DISCLAIMED. BY ACCESSING THIS DOCUMENT YOU ACKNOWLEDGE THAT SUN SHALL IN NO EVENT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES THAT ARISE OUT OF YOUR USE OR FAILURE TO USE THE INFORMATION CONTAINED HEREIN. This Sun Alert notification contains Sun proprietary and confidential information. It is being provided to you pursuant to the provisions of your agreement to purchase services from Sun, or, if you do not have such an agreement, the Sun.com Terms of Use. This Sun Alert notification may only be used for the purposes contemplated by these agreements.

Copyright 2000-2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved.

Modification History
16-Jun-2008: Updated Contributing Factors and Resolution sections; Now Resolved
14-Jul-2008: Updated Workaround section for corrections

Internal Comments
Please send technical questions to the following email: [email protected] and CC the following persons:
   Internal Contributor/Submitter
   Internal Eng Responsible Engineer
   Internal Services Knowledge Engineer

Note: Firmware 07.10.25.10 is the next revision up, but is not bundled with CAM. In order to obtain this firmware, a call with Sun Support is required.

If stateCapture is required, please escalate the issue, and a tailored serial collection sequence will be made available.

Controllers will report an exception in the exception log (excLogShow) as in the following example:

   ---- Log Entry #33 OCT-10-2007 08:27:55 AM ----
   Exception: Invalid Opcode
   pc: 0x00000780 (Unknown Program Counter)
   Registers:
   edi = 1e330955  esi = 0  ebp = 1d549f08  esp = 1d549ec4
   ebx = 14  edx = 0  ecx = 8  eax = 0
   eflags = 246  pc = 780
   Stack Trace:
   53cdba vxTaskEntry +a : vkiTask (1d54a8cc, 0, 0, 0, 0, 0, 0, 0, 0, 0)
   4bb6eb vkiTask +bb : srcOpTask (1e33b2f0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1d54a6c0, 4bb640, 0, 0, eeeeeeee, 1d54a6a4, ...)
   1d9c325b srcOpTask +db : cmdProcess ([17b651d8, ffffffff, 1d54a640, 297, 1d54a8cc, ...]) ???
   1dbb32ff cmdProcess +6f : symSYMbolCommandHandler (17b651d8, &cmdE0, 0, 53cf13, 17b651d8, 3429334d)
   1dd9fd54 symSYMbolCommandHandler+44 : svciov_dispatch (17a8ba10, 17b651d8, 1d54a5a4, &cmdE0)
   1ddb26bf svciov_dispatch +3cf : stateCapture_1 ([17a7b780, 17b651d8, 54, 17a7b780, &cmdE0, ...]) ???
   1dd9dd91 stateCapture_1 +41 : systemStateCapture ([17a7b75c, 1, a010241, 1ddb28b9, 0, ...]) ???
   1ddf5737 systemStateCapture +6b7 : ionShow (63, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
   1e0defa2 ionShow +52 : showState__CQ23ion10IonManager ([1d264434, 1d54a39c, 1d54a390, 1e0def5f, &ionShow, ...]) ???
   1e0df0e6 showState__CQ23ion10IonManager+a6 : tditnShow__CQ23ion10IonManageri (1d264434, 0, 1ddf4fa0, 1d54a39c)
   1e103977 tditnShow__CQ23ion10IonManageri+a7 : tditn +340 ([0, 43108e, 1d54a310, 1e1038e4, 1, ...]) ???
   1e102f2e tditn +42e : tditn +70 (0, 4, 0, 1e3308c7)
   ********
   Task Id: 0x1d54a6f0
   Name: "symTask1"
   Status: 0x00 (ready)
   Options: 0x0004 (dealloc_stk)
   Priority: 125
   Stack base: 0x1d54a6f0
   Stack end: 0x1d5476f0 (adjusted for name)
   Stack size: 0x3000 (12288)
   Stack margin: 0x17c (380)
   Stack limit: 0x1d5479b8
   Pend queue: 0x17a7b498
   Last errno: 0x1c0001
   value = 1 = 0x1
   ->

Internal Contributor/submitter: [email protected]
Internal Eng Responsible Engineer: [email protected]
Internal Services Knowledge Engineer: [email protected]
Internal Eng Business Unit Group: NWS (Network Storage)
Internal Escalation ID: 37959518, 37970982, 1-22943001, 65766342

Internal Sun Alert & FAB Admin Info
13-Feb-2008, david m: draft created, send for review
14-Feb-2008, david m: revised draft considerable, resend for review
16-Jun-2008, david m: CAM release, republish resolved

Attachments
This solution has no attachment