Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type: Technical Instruction Sure

Solution 1020243.1: T5440 CMP configuration changes
PreviouslyPublishedAs 254769

Description

One of the major design changes for the T5440 is that the POST state of each PLX chipset depends on the CMP configuration, which in turn determines which paths are available to the internal and external I/O. This document covers the process, and potential problems, for changing CMP configurations.

Steps to Follow

The following process is detailed in the Platform Service manual.

Access to each PLX is via the local CMP if present; otherwise the upstream port is disabled and communication is PLX <-> PLX, driven by the next lowest numbered CMP (i.e. CMP0 > PLX3 > PLX2 | CMP1 > PLX1 > PLX0 in a 2P configuration). In a 1P configuration all paths are accessed via the CMP0 upstream path.

What does this mean in the field?

Whenever we change the configuration we need to ensure that the ILOM/VBSC are aware of which upstream ports should be active, and that the OS is updated with any device path changes.

The ILOM holds the current masks that determine which ports are active. We can force VBSC to update these by rescanning on just the next power-on, or after every power-on, via the ioreconfigure ILOM parameter. The default is never to rescan, even if the CMP configuration changes, whether following a CMP failure or an increase/reduction in the number of modules installed, so engineers need to do this manually whenever changes are made.

Any changes also need to be reflected in the OS. We provide a Perl script that must be run while booted off a net install image with the root disk mounted. It must be run before booting the OS after a configuration change, to prevent errors and a possible path_to_inst rebuild.

Note: Both the procedure and the script are covered in the T5440 Service manual. The Service manual calls the reconfig.pl script reconf.pl, but it is the same script.
Note: The script 'reconfig.pl' is available via patch 10264587: I/O Remapping Script for Sun SPARC Enterprise T5440 Server - Solaris SPARC

In summary, after changing the CMP configuration we need to do the following:

1. On the ILOM set the reconfigure parameter:

    set /HOST ioreconfigure=nextboot

If you don't reconfigure the upstream PLX ports when increasing the number of CMPs, you will still be able to access all devices, but all I/O will be driven through the single upstream port, which is likely to cost performance.

If you fail to reconfigure after reducing the number of CMPs, you will lose access to whichever devices were connected via that specific upstream port. For example, take a 4P system reduced to a 1P with no ioreconfigure: the VBSC will try to access the onboard network through the only active upstream port it has (pci@400 = CMP0), but the upstream to CMP1 is still held in the configuration, so device access fails:

    {0} ok boot net -s
    Boot device: /pci@400/pci@0/pci@8/pci@0/pci@8/pci@0/pci@c/network@0 File and args: -s
    ERROR: boot-read fail

    Can't locate boot device

    {0} ok

If you booted to Solaris in this state, network services and their dependent services would fail to start.
If you boot to Solaris[TM] after the ioreconfigure but before running reconfig.pl (via the net install image), you will lose access to devices as above. In addition, when you later perform the OS device path reconfigure, multiple device path entries will be created in the platform path_to_inst, resulting in errors on reboot:

    # mount /dev/dsk/c0t0d0s0 /mnt
    # cd /mnt
    # /reconfig.pl
    replacing /pci@600 with /pci@400/pci@0/pci@8 in /etc/path_to_inst
    updating /dev symlinks
    replacing /pci@700 with /pci@500/pci@0/pci@8 in /etc/path_to_inst
    updating /dev symlinks
    # ls -lc etc/path_to_inst
    -r--r--r-- 1 root root 2687 Nov 6 10:10 etc/path_to_inst
    #

The system was then rebooted to check that everything was OK, and the following errors were reported during boot:

    WARNING: multiple instance number assignments for '/pci@400/pci@0/pci@8/pci@0' (driver pxb_plx), 18 used
    WARNING: multiple instance number assignments for '/pci@400/pci@0/pci@8/pci@0/pci@9' (driver pxb_plx), 19 used
    WARNING: multiple instance number assignments for '/pci@400/pci@0/pci@8/pci@0/pci@c' (driver pxb_plx), 20 used
    WARNING: multiple instance number assignments for '/pci@400/pci@0/pci@8/pci@0/pci@d' (driver pxb_plx), 21 used
    WARNING: multiple instance number assignments for '/pci@500/pci@0/pci@8/pci@0' (driver pxb_plx), 22 used
    WARNING: multiple instance number assignments for '/pci@500/pci@0/pci@8/pci@0/pci@9' (driver pxb_plx), 23 used
    WARNING: multiple instance number assignments for '/pci@500/pci@0/pci@8/pci@0/pci@c' (driver pxb_plx), 24 used

The reconfig.pl output shows that the CMP2 and CMP3 upstream paths (/pci@600 and /pci@700) have been replaced with paths through CMP0 and CMP1 (/pci@400 and /pci@500) due to the PLX <-> PLX pathing.

The best method for clearing the multiple path entries is to rebuild the path_to_inst from scratch:

    # echo "#path_to_inst_bootstrap_1" > /etc/path_to_inst
    # sync
    # sync
    # sync
    # reboot

If the customer is using LDOM this will cause further problems, since we will lose any virtual devices. At this time we are unsure of the implications for ZFS.
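The duplicate-entry failure described above can be illustrated with a short Python sketch. This is not the reconfig.pl script and is not meant to be run against a live system. It only models the two prefix substitutions shown in the reconfig.pl output and then looks for device paths that end up with more than one instance number. It assumes path_to_inst entries take the form of a quoted physical path, an instance number, and a quoted driver name. The instance number 6 in the sample data is hypothetical, and the idea that the premature Solaris boot had already added an entry for the new path is an inference from the symptoms, not something the procedure states.

```python
import re
from collections import defaultdict

# Prefix substitutions copied from the reconfig.pl output quoted in this document.
REMAP = {
    "/pci@600": "/pci@400/pci@0/pci@8",
    "/pci@700": "/pci@500/pci@0/pci@8",
}

# Assumed entry layout: "physical path" instance-number "driver"
ENTRY = re.compile(r'^"(?P<path>[^"]+)"\s+(?P<inst>\d+)\s+"(?P<drv>[^"]+)"\s*$')


def remap_line(line, remap=REMAP):
    """Rewrite the leading path prefix of one entry; other lines pass through."""
    m = ENTRY.match(line)
    if not m:
        return line  # comments and blank lines are left untouched
    path = m.group("path")
    for old, new in remap.items():
        # Match whole path components only, so /pci@6000 is not affected.
        if path == old or path.startswith(old + "/"):
            path = new + path[len(old):]
            break
    return '"%s" %s "%s"' % (path, m.group("inst"), m.group("drv"))


def duplicate_assignments(lines):
    """Return {(path, driver): [instances]} for paths with more than one instance."""
    seen = defaultdict(set)
    for line in lines:
        m = ENTRY.match(line)
        if m:
            seen[(m.group("path"), m.group("drv"))].add(int(m.group("inst")))
    return {key: sorted(insts) for key, insts in seen.items() if len(insts) > 1}


if __name__ == "__main__":
    before = [
        '"/pci@600/pci@0" 6 "pxb_plx"',               # hypothetical original entry
        '"/pci@400/pci@0/pci@8/pci@0" 18 "pxb_plx"',  # assumed to be added by the premature boot
    ]
    after = [remap_line(line) for line in before]
    print(duplicate_assignments(after))
    # prints {('/pci@400/pci@0/pci@8/pci@0', 'pxb_plx'): [6, 18]}
```

Under these assumptions, the old entry is rewritten onto a path that already has its own instance number, which matches the shape of the "multiple instance number assignments" warnings. Running the remap before the first Solaris boot avoids the collision because the second entry never exists.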
So, in summary, customers and field engineers need to follow the correct procedure every time. This has so far proven to be 100% reliable in reconfiguring the platform and OS correctly. However, we need to be aware of what occurs when things go wrong; in fairness, it is reasonably simple to recover from.

Please be aware that customers using software RAID (such as SVM) will need to detach one side of their root mirror prior to running the reconfigure script. Once booted from the network OS image, SVM will not be available and the underlying drive, rather than the metadevice, will be mounted. Once the reconfigure and reboot are complete, simply reattach the submirror and allow it to synchronize.

NOTE: A new Solaris command, device_remap, was added in S10U8 and later; it provides the functionality of the reconfig.pl script. For more details, refer to the man page of the device_remap command.

* The reconfig.pl script is supported on all Solaris 10 releases.
* The device_remap command (script) is supported with S10U8 and later and is not certified on earlier Solaris releases.

Product
Sun SPARC Enterprise T5440 Server

PLX, CMP, ioreconfigure, reconfig.pl

Change History
Date: 2010-12-21
User name: Dencho Kojucharov
Action: Currency check
Comments: audited by Entry-Level SPARC Content Lead; made a few minor updates (added reference to service manual)

Attachments
This solution has no attachment