Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type Technical Instruction Sure Solution 1012305.1 : Sun Fire[TM] 15K/12K Servers: DR attach/detach operations failing due to portid conflicts arising from third party device drivers
Previously Published As 216980
Applies to:
Sun Fire 12K Server
Sun Fire 15K Server
Sun Fire E20K Server - Version: Not Applicable to Not Applicable [Release: NA to NA]
Sun Fire E25K Server - Version: Not Applicable to Not Applicable [Release: NA to NA]
Sun SPARC Sun OS

Goal

The dynamic reconfiguration (DR) feature on the Sun Fire[TM] 12K/15K/20K/25K servers enables you to perform hardware configuration changes to a live domain that is running the Solaris[TM] Operating System without causing machine downtime.

This article discusses a known condition affecting DR operations on Sun Fire 12K/15K/20K/25K servers, arising from portid conflicts introduced by certain third-party device drivers.

Solution

You can execute DR operations from the system controller (SC) by using the SMS commands addboard(1M), moveboard(1M), deleteboard(1M), and rcfgadm(1M). Alternatively, the same DR operations can be initiated from the domain OS environment via the cfgadm CLI.

The failure signature of the affected DR attach/detach operations is as follows:

A. Failed DR attach operations

hgc-6-sc0% date;addboard -da sb0

The above DR attach failure is accompanied by the following message in the domain OS logs:

Aug 16 06:22:48 aji drmach: [ID 801593 kern.warning] WARNING: Cannot read property value: Device Node 0x0: property name

B. Failed DR detach operations

xcat3-sc0:sms-svc:6> deleteboard SB0

The above DR detach failure is accompanied by the following message in the domain OS logs:

Jul 21 23:30:57 nebula2 gptwocfg: [ID 200766 kern.notice] ndi_devi_offline failed

The root cause of the above failure signatures has been isolated to the following condition: if a device node exists within the domain OS environment whose portid maps to a board involved in the DR attach/detach operation, and that device driver instance does not have a "property-name" attribute, the DR operation fails.
The two DR failure examples above (A and B) demonstrate different symptoms of the same error condition. In the DR attach failure example, the device node originating the addboard failure was isolated to an instance of the Hitachi HDLM driver dlmfdrv, while in the DR detach failure example, the device node found to be in conflict with SB0's processors was isolated to an instance of the Compaq RAID driver swsp.

Underlying both conflicting-portid scenarios is the incorrect assumption that the portids of devices attached to the Safari bus are unique. In both situations, the third-party device driver instance resulted in a duplicated portid condition. The DR failure was caused by improper identification of devices, i.e., relying on the portid property as the primary means of locating the device list of the DR operation's target board.

This issue with using portid as the primary means of identifying devices can occur in either a Solaris[TM] 8 or a Solaris[TM] 9 environment. The fix for the Solaris Operating Environment is delivered in the following patches:
116979-04 = platform/SUNW,Sun-Fire-15000/kernel/misc/sparcv9/platmod
110836-06 = platform/sun4u/kernel/misc/sparcv9/gptwocfg
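
The duplicated-portid condition described above can be illustrated with a minimal Python sketch. It is not the Solaris implementation and not the logic of the patches: the node records, their field names, and the helper functions are all hypothetical, chosen only to show why selecting a board's devices by portid alone can pull in an unrelated third-party driver node that lacks a name property.

```python
from collections import defaultdict

# Hypothetical, simplified device nodes (illustration only, not real
# device tree output). "name" is None when the node lacks that property.
NODES = [
    {"node": "cpu-0", "portid": 0, "name": "cpu"},
    {"node": "cpu-1", "portid": 1, "name": "cpu"},
    {"node": "third-party-0", "portid": 0, "name": None},
    {"node": "cpu-32", "portid": 32, "name": "cpu"},
]


def nodes_by_portid(nodes, portids):
    """Select nodes using portid alone (the fragile approach)."""
    return [n for n in nodes if n["portid"] in portids]


def duplicated_portids(nodes):
    """Return {portid: [node labels]} for every portid used more than once."""
    seen = defaultdict(list)
    for n in nodes:
        seen[n["portid"]].append(n["node"])
    return {p: labels for p, labels in seen.items() if len(labels) > 1}


def unnamed_nodes(nodes):
    """Return labels of nodes that have no name property."""
    return [n["node"] for n in nodes if n["name"] is None]


# Assume (hypothetically) that the target board owns portids 0 and 1.
selected = nodes_by_portid(NODES, {0, 1})
print([n["node"] for n in selected])   # includes the third-party node
print(duplicated_portids(NODES))       # portid 0 is claimed twice
print(unnamed_nodes(selected))         # the node that would trip the lookup
```

In this sketch the third-party node is selected only because it reuses portid 0, which mirrors the incorrect uniqueness assumption described in the article.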
Internal Comments

For internal Oracle/Sun use only: see Bug IDs 4873095 & 4913987

starcat, dr, cfgadm, addboard, deleteboard, deprobe, property, name, portid, attribute, drmach

Previously Published As Doc 80017

Attachments
This solution has no attachment