
Conference noted::netware

Title:Novell NetWare
Moderator:NETCAD::STEFANI
Created:Wed Feb 27 1991
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2146
Total number of notes:7285

2123.0. "DEFEA/SFT3, Primary Interrupt faults" by CHEFS::CHOIC (Shaking and Moving) Mon Feb 17 1997 15:05

    Hi,
    
    I have a customer who has a problem very much like Note 1465.*, i.e.
    
    primary interrupt controller detected a lost hardware interrupt..
    
    They have a big (read important) end-user who is running 4 pairs of
    servers on SFT III using DEFEA-AA (SAS MMF) with NetWare V4 and 4.11.
    
    The above errors start as soon as the drivers are loaded; when they use
    DEFPAs, there's no problem.
    
    Hardware is a number of different machines: Pentium 120 and 150 MHz, or
    Pentium Pros.
    
    They've tried the latest V2.80 and V2.81 drivers.
    
    The cards are Rev02, f/w V2.46.  IRQ is 10 (not shared) - the reseller
    is aware that IRQ 15 is "bad news for NetWare".
    
    The original note 1465 is dated Aug 94 and suggested the bug had been
    fixed.  Could it have come back?  
    
    Lots of business if we can fix this...
    
    Thanks in advance.
    
    Clinton

T.R  Title  User  Personal Name  Date  Lines
2123.1. by NETCAD::STEFANI (FDDI Adapters R Us) Mon Feb 17 1997 15:58 (39 lines)

Hello Clinton,

>>    I have a customer who has a problem very much like Note 1465.*, i.e.
>>    primary interrupt controller detected a lost hardware interrupt..

This error message has been on-again, off-again for several years now.  Back in
'94 when I wrote that note, I did change the way I disabled interrupts on the
DEFEA in the DEFEA.LAN driver and it seemed to help minimize the problem.  Was
I convinced it completely went away?  No, not really, but it did help the
customers who saw it back then.

This problem is NetWare only, and system, option, and configuration-specific. 
Sometimes physically changing the cards around or changing the IRQ assignments
makes it go away.  When it doesn't go away after repeated attempts, the only
recourse left is to turn off the message in AUTOEXEC.NCF.
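
For anyone scripting that last step, here is a minimal sketch in Python. It assumes the console parameter is spelled "Display Lost Interrupt Alerts" (check the exact name against the NetWare SET documentation for your version) and that the file contents have already been read into a string. The function name is hypothetical, and the sketch does not consider where in the file the line should sit relative to the LOAD commands.

```python
# Assumed parameter name; verify against the NetWare SET documentation.
SETTING = "SET DISPLAY LOST INTERRUPT ALERTS"


def disable_lost_interrupt_alerts(ncf_text: str) -> str:
    """Return AUTOEXEC.NCF text with the lost-interrupt alert switched off.

    An existing SET line for the parameter is rewritten to OFF; if none
    is present, one is appended at the end of the file.
    """
    out = []
    found = False
    for line in ncf_text.splitlines():
        # Compare only the part before '=', ignoring case and extra spaces.
        key = " ".join(line.split("=", 1)[0].upper().split())
        if key == SETTING:
            out.append(f"{SETTING} = OFF")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{SETTING} = OFF")
    return "\n".join(out) + "\n"
```

Note that this only silences the alert; it does nothing about whatever is causing the lost interrupt.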

The DEFEA supports four IRQ assignments (9, 10, 11, and 15) so you can try
changing the IRQs.  You didn't mention whether you're using the DEFEA as an
MSL or LAN card.  If you're using the DEFEA.LAN driver, that driver supports
IRQ sharing.  If you're using the DECMSL4X.MSL driver, you'll need to assign a
unique IRQ for it.
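
The IRQ rules above can be written down as a quick sanity check. This is only a sketch: the function name and the (driver, IRQ) input format are made up for illustration, and it only looks at the DEFEA cards themselves, not at other devices in the system that might be using the same IRQ.

```python
from collections import defaultdict

DEFEA_IRQS = {9, 10, 11, 15}   # IRQs the DEFEA supports, per the note above
SHARING_OK = {"DEFEA.LAN"}     # drivers described above as supporting IRQ sharing


def check_irqs(assignments):
    """Check a list of (driver_name, irq) pairs, one per DEFEA card.

    Returns a list of human-readable problems; an empty list means the
    assignments are consistent with the rules stated in the note.
    """
    problems = []
    by_irq = defaultdict(list)
    for driver, irq in assignments:
        if irq not in DEFEA_IRQS:
            problems.append(f"{driver}: IRQ {irq} is not a DEFEA option")
        by_irq[irq].append(driver)
    for irq, drivers in sorted(by_irq.items()):
        # Sharing is only acceptable if every driver on the line supports it.
        if len(drivers) > 1 and any(d not in SHARING_OK for d in drivers):
            problems.append(
                f"IRQ {irq} is shared by {', '.join(drivers)}, "
                "but not all of them support sharing"
            )
    return problems
```

For example, check_irqs([("DEFEA.LAN", 10), ("DECMSL4X.MSL", 10)]) reports one problem, because the MSL driver needs an IRQ of its own.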

>>    The above errors start as soon as the drivers are loaded; when they use
>>    DEFPAs, there's no problem.

Different adapters, different bus type - too many differences to ensure
consistent behavior across both adapter families.

>>    Lots of business if we can fix this...

You'd need to place HW probes on the EISA bus and possibly the Intel primary
and secondary PICs to try and trace what's happening.  A lot of work to even
begin figuring out who to point fingers at.

Try changing the slots and resources around.  If that doesn't work, try to
convince the customer to disable the message in AUTOEXEC.NCF.

Regards,
   Larry

2123.2. "Clarification?" by CHEFS::CHOIC (Shaking and Moving) Tue Feb 18 1997 11:17 (23 lines)

    Hi Larry,
    
    Thanks for the quick reply!
    
    >You'd need to place HW probes on the EISA bus and possibly the Intel
    >primary and secondary PICs to try and trace what's happening.  A lot 
    >of work to even begin figuring out who to point fingers at.
    
    What's an Intel Primary & secondary PIC?  Did you mis-type PCI?
    This sounds difficult/impossible.  The customer would have to have
    probes, know what to look for and be able to differentiate good from
    bad signals.
    
    >Try changing the slots and resources around.  If that doesn't work, try
    >to convince the customer to disable the message in AUTOEXEC.NCF.
    
    If the customer were to disable the message, does that mask a problem,
    or is this message more of a nuisance than actually warning of a
    genuine problem?
    
    Thanks,
    
    Clinton

2123.3. by NETCAD::STEFANI (FDDI Adapters R Us) Thu Feb 20 1997 03:18 (37 lines)

    >>>You'd need to place HW probes on the EISA bus and possibly the Intel
    >>>primary and secondary PICs to try and trace what's happening.  A lot 
    >>>of work to even begin figuring out who to point fingers at.
    >>
    >>What's an Intel Primary & secondary PIC?  Did you mis-type PCI?

Nope.  I meant PIC.  It refers to the programmable interrupt controller (PIC)
chips on Intel motherboards.  There is a master (primary) and a slave
(secondary) PIC to support IRQ lines 0-15.
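
As general PC/AT background (not something specific to the DEFEA or to NetWare), the split between the two controllers can be sketched like this. The function name is made up for illustration; the layout assumed is the standard one, with IRQs 0-7 on the master and IRQs 8-15 on the slave, which is cascaded through the master's IRQ 2 input.

```python
def pic_for_irq(irq):
    """Map a PC/AT IRQ number (0-15) to (controller, input line).

    Assumes the standard two-controller layout: the master takes
    IRQs 0-7 and the slave takes IRQs 8-15. The master's input 2 is
    used for the cascade from the slave.
    """
    if not 0 <= irq <= 15:
        raise ValueError(f"IRQ {irq} is outside the range 0-15")
    if irq < 8:
        return ("master", irq)
    return ("slave", irq - 8)


print(pic_for_irq(10))  # -> ('slave', 2)
```

All four IRQs the DEFEA can use (9, 10, 11, and 15) land on the slave controller in this layout.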

    >>This sounds difficult/impossible.  The customer would have to have
    >>probes, know what to look for and be able to differentiate good from
    >>bad signals.
    
That's true.

    >>>Try changing the slots and resources around.  If that doesn't work, try
    >>>to convince the customer to disable the message in AUTOEXEC.NCF.
    >>
    >>If the customer were to disable the message, does that mask a problem,
    >>or is this message more of a nuisance than actually warning of a
    >>genuine problem?

Opinions vary.  Novell used to take a hard-line stance that this problem should
get resolved, not masked.  Now I don't see that hard-line stance from them. 
However, if you take any other commercial operating system, do you recall ever
seeing an annoying message and system beeping because of a lost HW interrupt? 
Me neither.  Obviously if the cards and system are causing lost interrupts,
you'd expect it on more environments than just NetWare.

By turning off the message, the OS will just ignore the condition.  Will you
see a system performance drop because of it?  I don't think so, but that's hard
to say.  If the customer is willing to turn off the message and run the system
for a while, you can judge whether the performance and stability are there.

Regards,
   Larry