Sun System Handbook - ISO 3.4 June 2011 Internal/Partner Edition
Solution Type Technical Instruction Sure Solution

1006645.1 : What is a Solaris[TM] PCI IOMMU error telling me?
Previously Published As 209268

Description

When there is a Peripheral Component Interconnect (PCI) bus I/O Memory Management Unit (IOMMU) fault, the Solaris[TM] kernel logs a long and complex error message and may panic. This document explains what that message contains. The example is based on the Schizo host-to-PCI bridge chip that is used in most UltraSPARC[TM] based systems.

Steps to Follow

A system can have several PCI buses connected to it. Each PCI bus has its own address space that uses 32-bit physical addressing. The system's CPU, memory and host-to-PCI bridges operate in their own 64-bit address space (for example, the Safari bus address space on a Sun Fire[TM] 6800), with physical and virtual addresses. To support Direct Memory Access (DMA), the host-to-PCI bridge chips need to be able to translate between the two address spaces in a controlled manner. This is achieved using a small IOMMU in each host-to-PCI bridge chip.

When a driver that manages a DMA-capable PCI device wants to command the device to perform a DMA, it must first program the IOMMU to translate an unused segment of the PCI bus's physical address space into a similar-sized segment of system physical memory. When the PCI device completes the DMA, it informs the Solaris device driver using an interrupt. The device driver can then destroy the mapping, allowing some other driver to use the address space.

The IOMMU in each host-to-PCI bridge is only responsible for translations that are programmed for PCI devices in the tree of PCI buses and devices below that specific host-to-PCI bridge. If a DMA accesses a PCI address that is currently unmapped or has the wrong permissions, the host-to-PCI bridge generates an error interrupt and logs a message like...
line
====
1 - pcisch: [ID 462479 kern.warning] WARNING: pcisch3 (pci@9,600000): PCI fault log start:
2 - pcisch: [ID 309153 kern.notice] PCI iommu error
3 - pcisch: [ID 866426 kern.notice] pcisch3: Error 1 on IOMMU TLB entry 2:
4 - Context=0 not Writable not Streamable
5 - PCI Page Size=8k Address in page c446a000
6 - pcisch: [ID 219581 kern.notice] Memory: Valid not Cacheable Page Frame=0
7 - pcisch: [ID 684763 kern.notice] pcisch3 (pci@9,600000): PBM
8 - AFSR=0x0.00000000
9 - pcisch: [ID 120591 kern.notice] dwordmask=0 bytemask=0
10 - pcisch: [ID 829486 kern.notice] pcisch3 (pci@9,600000): PCI primary error (0):
11 - pcisch: [ID 227296 kern.notice] pcisch3 (pci@9,600000): PCI secondary error (0):
12 - pcisch: [ID 748186 kern.notice] pcisch3 (pci@9,600000): PBM AFAR 0.00000000:
13 - pcisch: [ID 127741 kern.warning] WARNING: pcisch3: PCI config space
14 - CSR=0xaa0<signaled-target-abort>
15 - pcisch: [ID 656289 kern.notice] pcisch3 (pci@9,600000): PCI fault log end.
16 - pcisch: [ID 686566 kern.notice] Scrubbing PCI iommu TLB entries
17 - pcisch: [ID 193938 kern.notice] No fatal PCI bus error(s)

An IOMMU entry consists of two linked data structures: the tag and the data. These are held in a small array, called the Translation Lookaside Buffer (TLB), inside the bridge chip. This array is a 64-entry subset of the much larger array of Translation Table Entry (TTE) structures that are held in memory and searched automatically when there is no match for the PCI address in the TLB array.

TLB tag

Bits    Description
=====   ===========
32-25   context, used to link entries
24-23   error type: 00 = protection error, 01 = invalid, 10 = timeout error, 11 = UE ECC error
22      error: 0 = no error, 1 = error
21      writeable
20      streaming
19      page size: 0 = 8KB, 1 = 64KB
18-0    19-bit virtual page number (PCI address >> 13)

TLB data

Bits    Description
====    ===========
32      valid, TLB data is valid
31      reserved
30      cacheable
29-0    30-bit physical page frame number (system address >> 13)

TTE

Bits    Description
====    ===========
63      valid bit
61      page size
60      streamable
59      localbus (ignored)
58-51   context number
42-13   bits 42-13 of the system physical address
12-7    software bits
4       cacheable
1       writeable

So when a DMA is performed from a PCI device to the host-to-PCI bridge, the bottom 13 bits of the PCI address (the offset within the 8KB page) are saved. The upper 19 bits are then compared to the entries in the TLB array. If the 19 bits match a valid entry, the 30 bits of physical page frame number (system bus address divided by 8KB) are combined with the 13 bits of saved offset to produce a 43-bit system physical address. This is the real target of the DMA, it addresses the real memory in the machine, and the DMA continues. If the valid bit is not set in the TLB data, the search for a match continues.

If no match is found in the small TLB array, the PCI bus address is used to calculate the offset into the much larger array of TTEs. The TTE at that address is loaded into the TLB array and used. If that entry has the wrong permissions or the valid bit is not set, then we have an IOMMU error. The error reporting code walks the small TLB array looking for entries with the error bit set and then prints out most of these fields.

Let's go through this error line by line.

Line 1 - tells us which host-to-PCI bridge is reporting the error. The PCI DMA transaction that caused the message must have come from a PCI device below this node in the device tree.
Line 2 - there are several types of error that the pcisch host-to-PCI bridge chip can generate; this tells us that it is an IOMMU error.
Line 3 - "Error 1" tells us the type of IOMMU error that was found; this is just the "error type" field from the tag (01 = invalid). It also tells us the entry in the TLB array where the error walker found a tag with the error bit set.
Line 4 - prints the context, writeable and streamable bits from the tag.
Line 5 - prints out the page size and the PCI bus address (with the 13 bits of offset masked out) from the TLB tag.
Line 6 - prints out the valid bit, the cacheable bit and the 30-bit Page Frame Number (PFN) from the TLB data.

Now in our example we see that the error was an invalid error, yet the valid bit is set. This means that the lookup of the address in the small TLB array failed, so the PCI virtual page number was used to locate the TTE in the memory array, but the valid bit was not set in the selected TTE. The TTE is always loaded into the TLB data and tag entries, and the valid bit in the TLB data is set because the entry was installed successfully. Note that the PCI virtual page number is not stored in the memory array of TTEs but is calculated from the incoming DMA transaction. There is a direct link between the size of the memory array of TTEs

Lines 7-12 - print out internal registers.
Lines 13-14 - print out the PCI configuration space status of the host-to-PCI bridge's own PCI node. Its default action on receiving an invalid DMA transaction is to send a target abort back.
Lines 15 and on - just housekeeping.

On the whole, this message is not very useful, as it is reporting an error condition caused by an invalid action. The faulty component could be the following:

- The device driver, which has set up or removed (torn down) an IOMMU mapping incorrectly.

Product
Sun Fire E25K Server
Sun Fire V890 Server
Sun Fire V880 Server
Sun Fire V490 Server

kernel, pci, panic, iommu, fault

Previously Published As 82572

Attachments
This solution has no attachment
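
As an illustration of the TLB tag and TLB data layouts and the address translation described in this document, the following is a minimal Python sketch. It is not taken from the pcisch driver. The function names, dictionary keys and the sample values in the usage example are hypothetical, the bit positions are copied from the tables in this document and have not been checked against hardware documentation, and only 8KB pages are handled.

```python
"""Sketch: decode IOMMU TLB tag/data words using this document's bit tables."""

PAGE_SHIFT = 13  # 8KB pages: the low 13 bits are the offset within the page

# Error type encoding from the "TLB tag" table (bits 24-23).
ERROR_TYPES = {
    0: "protection error",
    1: "invalid",
    2: "timeout error",
    3: "UE ECC error",
}


def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of word as an integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_tlb_tag(tag):
    """Split a TLB tag word into the fields listed in the 'TLB tag' table."""
    return {
        "context": bits(tag, 32, 25),
        "error_type": ERROR_TYPES[bits(tag, 24, 23)],
        "error": bool(bits(tag, 22, 22)),
        "writable": bool(bits(tag, 21, 21)),
        "streaming": bool(bits(tag, 20, 20)),
        "page_size": "64KB" if bits(tag, 19, 19) else "8KB",
        # Virtual page number shifted back up: the "Address in page" value.
        "pci_page_address": bits(tag, 18, 0) << PAGE_SHIFT,
    }


def decode_tlb_data(data):
    """Split a TLB data word into the fields listed in the 'TLB data' table."""
    return {
        "valid": bool(bits(data, 32, 32)),
        "cacheable": bool(bits(data, 30, 30)),
        "page_frame": bits(data, 29, 0),
    }


def translate(pci_addr, tag, data):
    """Translate a 32-bit PCI address through one TLB entry (8KB pages only).

    Returns the system physical address, or None when the entry does not
    match the address or its valid bit is clear. Permission and error bits
    are deliberately ignored to keep the sketch short.
    """
    offset = pci_addr & ((1 << PAGE_SHIFT) - 1)  # saved low 13 bits
    vpn = pci_addr >> PAGE_SHIFT                 # upper 19 bits
    if bits(tag, 18, 0) != vpn or not bits(data, 32, 32):
        return None
    # Page frame number supplies the upper bits, the saved offset the lower.
    return (bits(data, 29, 0) << PAGE_SHIFT) | offset


if __name__ == "__main__":
    # Hypothetical raw words chosen to match the fields printed in the
    # example log: error type 1, error bit set, page address c446a000.
    print(decode_tlb_tag(0xC62235))
    print(decode_tlb_data(1 << 32))
```

Decoding the hypothetical tag word 0xC62235 gives context 0, error type "invalid", not writable, not streaming, an 8KB page and page address c446a000, which corresponds to lines 3 to 5 of the example message. The translate function follows the steps in the text: split off the 13-bit offset, compare the 19-bit virtual page number, and combine the page frame number with the offset.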