
Conference mvblab::sable

Title:SABLE SYSTEM PUBLIC DISCUSSION
Moderator:COSMIC::PETERSON
Created:Mon Jan 11 1993
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2614
Total number of notes:10244

2580.0. "Strange CPU-Exception - Kurt can you have a look?" by ATZIS2::PUTZENLECHNE (wherever is fun, there's always ALPHA) Thu Apr 17 1997 11:03

Hi Kurt!

Can you have a look at the following errorlog entry of a 2100 4/275?
UNIX 3.2G
SRM 4.5
PAL 1.45

There must be a "quadword shift" in the errorlog frame, because this is
the errorlog frame of a Correctable Memory Error on Memory Module 2
(BIU_STAT = 240, MEM ERR REG for Module 2 is 40001).
Up to now BIU_STAT was located at address 1A8; in this errorlog output
it is at address 1B0!

The same goes for the memory registers: the Error Register for Module 2
used to be at address 3F0; in this output it is at 3F8!

--> All registers are shifted backwards - Is this new in 3.2G?
    Where does it come from?

thanks for any hint
Helmut
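
As a quick check of the offsets quoted above, here is a minimal sketch in
Python (not part of uerf; the dictionary names are illustrative only). It
assumes the quoted addresses are hex byte offsets into the record, as shown
in the left-hand column of the dump.

```python
# Offsets quoted in the note (hex byte offsets into the uerf record).
expected = {"BIU_STAT": 0x1A8, "module 2 error register": 0x3F0}
observed = {"BIU_STAT": 0x1B0, "module 2 error register": 0x3F8}

QUADWORD_BYTES = 8  # an Alpha quadword is 64 bits

# Distance each register moved, in bytes.
shifts = {name: observed[name] - expected[name] for name in expected}

for name, shift in shifts.items():
    print(f"{name}: shifted by {shift} bytes = {shift // QUADWORD_BYTES} quadword(s)")
```

Both registers move by the same 8 bytes, i.e. exactly one quadword.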

entwama1:root> uerf -R -Z | more
                                                  uerf version 4.2-011 (122)


********************************* ENTRY     1. *********************************

----- EVENT INFORMATION -----

EVENT CLASS                             ERROR EVENT
OS EVENT TYPE                  100.     CPU EXCEPTION
SEQUENCE NUMBER                 44.
OPERATING SYSTEM                        DEC OSF/1
OCCURRED/LOGGED ON                      Thu Apr 17 09:06:31 1997
OCCURRED ON SYSTEM                      entwama1
SYSTEM ID                 x00060009     CPU TYPE:  DEC 2100
SYSTYPE                   x00000000
PROCESSOR COUNT                  4.
PROCESSOR WHO LOGGED      x00000002

----- UNIT INFORMATION -----

UNIT CLASS                              CPU

RECORD ENTRY DUMP:

  RECORD HEADER
0000:   002C04B8  00060009  00060101  3355CBF7        *..,...........U3*
0010:   77746E65  31616D61  00000000  00000000        *entwama1........*
0020:   00000004  00000002  10010064  00000000        *........d.......*
0030:   00000000  00000000                            *........        *

  RECORD BODY
0038:   00000002  00000228  00000220  80000000        *....(... .......*
0048:   00000110  000001A0  0000008A  00000001        *................*
0058:   0000008A  00000001  07B39800  00000000        *................*
0068:   00000004  001382F8  0063FD9C  00000000        *..........c.....*
0078:   00000000  00000000  00000009  00000000        *................*
0088:   00000240  00000000  00004E00  00000000        *@........N......*
0098:   00000600  00000000  00000000  00000000        *................*
00A8:   0046F5E0  FFFFFC00  00000000  00000000        *..F.............*
00B8:   0046F980  FFFFFC00  0046F9B0  FFFFFC00        *..F.......F.....*
00C8:   0046FA10  FFFFFC00  0046F780  FFFFFC00        *..F.......F.....*
00D8:   0046F490  FFFFFC00  00019410  00000000        *..F.............*
00E8:   1FFED340  00000001  A929BA58  FFFFFFFF        *@.......X.).....*
00F8:   005E2F80  FFFFFC00  EDCBA987  EDCBA987        *./^.............*
0108:   72727272  40424272  FFFFFFFD  CF77BF3F        *rrrrrBB@....?.w.*
0118:   00000000  00000000  00010000  00000000        *................*
0128:   0063FD9B  00000000  00000002  00000000        *..c.............*
0138:   0BAB0000  00000000  00000000  FFFFFFFC        *................*
0148:   00000001  00000000  3EA65A58  00000000        *........XZ.>....*
0158:   00014125  00000000  00000000  00000000        *%A..............*
0168:   00000000  00000000  00000004  001382F8        *................*
0178:   00014000  00000000  000004F0  00000000        *.@..............*
0188:   00000000  00000000  00003F01  00000000        *.........?......*
0198:   00000003  00000000  FFFFFFFF  00000007        *................*
01A8:   0000142E  00000000  00000240  00000000        *........@.......*
01B8:   00019410  00000000  5000E567  0000000E        *........g..P....*
01C8:   00000000  00000000  00019710  00000000        *................*
01D8:   00006970  00000000  0004F833  00002600        *pi......3....&..*
01E8:   C00001C5  C00001C5  00000000  00000000        *................*
01F8:   00000000  00000000  00000A00  00000A00        *................*
0208:   20C5CD20  20C5CD20  0D1B4C84  0D1B4C84        * ..  .. .L...L..*
0218:   00000868  00000868  C4000000  C8002000        *h...h........ ..*
0228:   E0800043  E0800043  2F600B43  0F600B43        *C...C...C.`/C.`.*
0238:   00000000  00000000  00000000  00000000        *................*
0248:   00000000  00000000  0FD7CCE8  0FD7CCE8        *................*
0258:   0FF9EE89  0FF9EE19  0002001A  0002001A        *................*
0268:   00000011  000000B8  27060190  FE000082        *...........'....*
0278:   00000000  00000000  E3800010  E3800010        *................*
0288:   20600B43  00600B43  00000000  00000000        *C.` C.`.........*
0298:   5A79A860  00000007  00000010  00000000        *`.yZ............*
02A8:   00000000  00000000  0010603F  00000000        *........?`......*
02B8:   400807FF  00000000  3FF00000  00000000        *...@.......?....*
02C8:   00000000  00000000  000C01FF  00000000        *................*
02D8:   1FF00000  00000000  00440000  00000000        *..........D.....*
02E8:   00110803  00000663  0011080D  00012618        *....c........&..*
02F8:   0011080E  0000821A  0011080F  0003221C        *............."..*
0308:   00110810  0003F61E  00110811  0002DA20        *............ ...*
0318:   00110800  0000065D  00110801  0000065F        *....]......._...*
0328:   00000008  00000058  00000000  00000000        *....X...........*
0338:   00000000  00000000  00000000  00000000        *................*
0348:   00000000  00000000  00000000  00000000        *................*
0358:   00000000  00000000  00000000  00000000        *................*
0368:   00000000  00000000  00000000  00000000        *................*
0378:   00000000  00000000  00000000  00000000        *................*
0388:   00000008  00000058  00000001  00000000        *....X...........*
0398:   00000000  00000000  00000000  00000000        *................*
03A8:   00000000  00000000  00000000  00000000        *................*
03B8:   00000000  00000000  00000000  00000000        *................*
03C8:   00000000  00000000  00000000  00000000        *................*
03D8:   00000000  00000000  00000000  00000000        *................*
03E8:   00000008  00000058  00000002  00000000        *....X...........*
03F8:   00040001  00000000  069E6800  E2800008        *.........h......*
0408:   F083FFFF  00601E83  8005506A  8005506A        *......`.jP..jP..*
0418:   0C020C02  0E560E56  00000B45  00000017        *....V.V.E.......*
0428:   20000000  20000000  00000800  00000800        *... ... ........*
0438:   000001D8  000001D8  00000000  00000000        *................*
0448:   00000008  00000058  00000003  00000000        *....X...........*
0458:   00000000  00000000  E2C00008  E2C00008        *................*
0468:   20601E83  00601E83  8015506B  8015506B        *..` ..`.kP..kP..*
0478:   071506A1  04230555  0000000D  00000017        *....U.#.........*
0488:   20000000  20000000  00000800  00000800        *... ... ........*
0498:   000001D8  000001D8  00000000  00000000        *................*
04A8:   00000000  00000000  00000000  5E3C7E25        *............%~<^*
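
The two values quoted at the top of this note can be read back out of the
dump with a small script. This is only a sketch: it assumes each dump line
is a hex byte offset followed by 32-bit longwords in ascending address order
(consistent with the ASCII column, where 77746E65 31616D61 decodes to
"entwama1"). The names DUMP, parse_dump and quadword are hypothetical
helpers, not part of uerf.

```python
import re

# Three lines copied verbatim from the RECORD BODY above.
DUMP = """\
01A8:   0000142E  00000000  00000240  00000000        *........@.......*
03E8:   00000008  00000058  00000002  00000000        *....X...........*
03F8:   00040001  00000000  069E6800  E2800008        *.........h......*
"""

# Offset, colon, then one or more 8-digit hex longwords; the ASCII
# column starts with '*' and is therefore never matched.
LINE_RE = re.compile(r"^([0-9A-Fa-f]{4}):\s+((?:[0-9A-Fa-f]{8}\s+)+)")

def parse_dump(text):
    """Return {byte offset: 32-bit longword} for every longword in the dump."""
    words = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        base = int(m.group(1), 16)
        for i, tok in enumerate(m.group(2).split()):
            words[base + 4 * i] = int(tok, 16)
    return words

def quadword(words, offset):
    """Combine two adjacent longwords (low longword first) into 64 bits."""
    return words[offset] | (words[offset + 4] << 32)

words = parse_dump(DUMP)
print(hex(quadword(words, 0x1B0)))  # value found where BIU_STAT shows up
print(hex(quadword(words, 0x3F8)))  # value found where the module 2 register shows up
```

With the lines above, offset 1B0 holds 240 and offset 3F8 holds 40001,
matching the values given in the note, while 1A8 holds 142E.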

2580.1. "Who knows?" by ATZIS2::PUTZENLECHNE (wherever is fun, there's always ALPHA) Wed Apr 23 1997 07:28 (5 lines)

    Hi!
    
    Is anybody else able to answer the question in .0?
    
    Helmut

2580.2. "Seen on AS4100" by DANGER::HAYES Wed Apr 23 1997 13:19 (94 lines)
          
    This is a blitz released against the AS4100. It is certainly similar;
    perhaps the console folks can comment.
    
    Dennis
    
    
      <<< MVBLAB::SYS$SYSDEVICE:[NOTES$LIBRARY]ALPHASERVER_4100.NOTE;1 >>>
                             -< AlphaServer 4100 >-
================================================================================
Note 93.39                        Field Blitzes                         39 of 45
MOVMON::DAVIS                                        79 lines  26-FEB-1997 10:03
                        -< Errorlog Decode Byte Offset >-
--------------------------------------------------------------------------------

Request log number: 886

+---------------------------+TM
|   |   |   |   |   |   |   |
| d | i | g | i | t | a | l |      TIME   DEPENDENT   BLITZ
|   |   |   |   |   |   |   |      
+---------------------------+


   
      BLITZ TITLE: Alpha Server 4100 - Errorlog Decode Byte Offset
 
                                                DATE: February 26, 1997
      AUTHOR:Ted Gent      			TD #: 
      DTN:223-6530         
      ENET:POBOXA::GENT  	   		CROSS REFERENCE #'s:
      DEPARTMENT:SBU Engineering		(PRISM/TIME/CLD#'s) 
                                                
         					
                                                        
      INTENDED AUDIENCE: U.S/EUROPE/GIA		PRIORITY LEVEL: 2
                         
                                                 (1=TIME CRITICAL,
                                                  2=NON-TIME CRITICAL)
      =====================================================================

Subject : Alpha Server 4100 - Errorlog Decode Byte Offset


Problem
-------

  A bug has been found in the Digital UNIX and OpenVMS PAL code which
  causes, under certain circumstances, the errorlog to be offset by a byte
  in Correctable Read Data (MChk 620/630) handling. The potential for the
  problem to occur is present in all AlphaServer 4100 SRM Console Code prior
  to release 4.8-5.

  If you encounter entries in the errorlog indicating a Machine Check code
  of 400, an unassigned code, that record may be offset by a byte. Please
  note that this only affects individual records; other entries in the same
  log file will be correctly recorded and reported.

Description
-----------

  1. An I/O-detected 620 CRD error is delivered as a device IPL (21)
     interrupt by the IOD.
  2. PAL is supposed to raise IPL to MCHK IPL at this point. In all
     versions of SRM console prior to V4.8-X, IPL is not raised.
  3. During the course of handling the 620 error, PALcode will scrub
     the location if the 620 was a memory CRD error. When PAL reads
     the bad location in order to write it back (scrub it), if the
     error was not a transient, the CPU takes a 630 CRD error since
     it sees the same bad ECC.
  4. If IPL is at MCHK IPL, which it should be at, the 630 will pend
     until the 620 can be fully serviced. The bug in PAL is such that
     the 630 can potentially come in on top of the 620. [Software CRD
     handlers typically raise IPL to MCHK IPL (31), but there's still
     a window between PAL dispatching to the interrupt service routine
     (ISR) and when IPL is actually raised by the handler.]
  5. The net effect of the 630 on top of the 620 is that PAL passes a
     minus one (FFFFFFFF) pointer to the ISR and when this pointer is
     added to various other hard-coded offsets, it results in the CRD
     frame passed by PAL to the upper level software being off by one
     byte. This causes the CRD entry to the screen (if at console) or
     the system error log entry (if at UNIX/VMS level) to be shifted
     by one byte.
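
The pointer arithmetic in step 5 can be illustrated with a toy model in
Python. This is not PALcode: the frame contents, the field offset and the
32-bit wraparound are assumptions chosen only to show how adding FFFFFFFF
to a hard-coded offset ends up one byte away from the intended field.

```python
import struct

MASK32 = 0xFFFFFFFF

def read_field(frame, pointer, offset):
    """Read a 32-bit little-endian field at pointer + offset (32-bit wraparound)."""
    addr = (pointer + offset) & MASK32
    return struct.unpack_from("<I", frame, addr)[0]

# Hypothetical 16-byte frame holding the bytes 0x10 .. 0x1F.
frame = bytes(range(0x10, 0x20))
FIELD_OFFSET = 4  # hypothetical hard-coded offset of some register field

good = read_field(frame, 0x00000000, FIELD_OFFSET)  # correct pointer
bad = read_field(frame, 0xFFFFFFFF, FIELD_OFFSET)   # minus-one pointer

print(f"correct pointer:   {good:08X}")
print(f"minus-one pointer: {bad:08X}")
```

With the correct pointer the read covers bytes 4..7 of the frame; with the
minus-one pointer it covers bytes 3..6, so every field decoded this way is
displaced by one byte.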

Resolution
----------

  Upgrade to the V4.8-5 SRM console if you are experiencing this problem.
  Occurrences have been rare and the only documented and analysed case was
  from a site being monitored by DPP that was experiencing greater than 15
  CRDs per hour.

bc/tg 02/26/97