Re: [PATCH] ia64: reset console_loglevel so INIT output always goes to console

From: Russ Anderson <rja_at_efs.americas.sgi.com>
Date: 2005-01-10 04:06:37

Keith Owens wrote:
> 
> We are slowly but steadily moving to recovery from some MCA events.  If
> one of the cpus is spinning disabled when an MCA occurs then the
> disabled cpu will get a slave INIT event as part of the MCA rendezvous.
> If the MCA is recoverable then the slave INIT event will also be
> recoverable and will eventually return to user space.
> 
> That change is still some way off, but bear it in mind when changing
> the existing code.

Good points, Keith.

There are a number of changes that will be needed now that MCAs and INITs
are becoming recoverable.  A disabled CPU should not receive an INIT
as part of MCA rendezvous.  Some of the changes will require changes
in the MCA and SAL specs.

For example, having SAL rendezvous all the CPUs before calling OS_MCA
may have been reasonable when Linux lacked the ability to recover from
an MCA.  But now that that is changing, the decision to rendezvous CPUs
should be made later, in Linux, and only if it cannot recover from the
MCA.  Does it really make sense to rendezvous 512 CPUs just because one
CPU happened to hit an uncorrectable memory error in a user application
(and recovers by killing the application and discarding the page)?
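
To make the ordering concrete, here is a minimal sketch of a handler
that classifies the error first and asks for a rendezvous only as a
last resort.  It is not the existing ia64 MCA code: every type and
function name below (mca_event, mca_next_action, the severity and
action enums) is hypothetical and is not taken from the SAL or MCA
specs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical severity classes; not taken from the SAL spec. */
enum mca_severity { MCA_CORRECTED, MCA_RECOVERABLE, MCA_FATAL };

/* Hypothetical summary of one MCA event as seen by the OS handler. */
struct mca_event {
    enum mca_severity severity;
    bool in_user_context;   /* error hit while running a user application */
    bool page_discardable;  /* the bad page can be taken out of use */
};

/* What the OS handler wants to happen next. */
enum mca_action {
    MCA_RESUME,               /* nothing to do, resume the CPU */
    MCA_KILL_TASK_AND_RESUME, /* kill the application, discard the page */
    MCA_RENDEZVOUS_ALL        /* cannot recover locally, gather all CPUs */
};

/* Try local recovery first; fall back to a rendezvous only on failure. */
enum mca_action mca_next_action(const struct mca_event *ev)
{
    assert(ev != NULL);

    if (ev->severity == MCA_CORRECTED)
        return MCA_RESUME;

    if (ev->severity == MCA_RECOVERABLE &&
        ev->in_user_context && ev->page_discardable)
        return MCA_KILL_TASK_AND_RESUME;

    /* Everything else is treated as unrecoverable on this CPU. */
    return MCA_RENDEZVOUS_ALL;
}
```

With this shape, the 512-CPU case above ends in
MCA_KILL_TASK_AND_RESUME on the one affected CPU, and the other CPUs
are never disturbed.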

Does it still make sense to have only one call into OS_MCA at
a time?  Or is it more reasonable to support multiple OS_MCAs
and let the Linux MCA code coordinate processing of the OS_MCA,
when needed?  As the code progresses, it should be reasonable to move
more of the decision & coordination code further into the
recovery code (or at least not prevent that from happening) so
that, for example, multiple independent MCAs can be recovered
in parallel.
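
As a rough illustration of what "coordinate only when needed" could
look like, the sketch below gives each CPU its own MCA slot so that
independent handlers can run at the same time, and uses one shared
counter to record that some CPU failed to recover.  This is a
user-space model with C11 atomics, not kernel code; the names
(mca_slots, mca_enter, mca_exit, mca_rendezvous_needed) and the
MAX_CPUS value are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_CPUS 512  /* illustrative, matches the 512-CPU example */

/* Hypothetical per-CPU state: is this CPU inside its MCA handler? */
struct cpu_mca_slot {
    atomic_bool busy;
};

static struct cpu_mca_slot mca_slots[MAX_CPUS];

/* Number of handlers that gave up on local recovery. */
static atomic_int rendezvous_requests;

/* Claim this CPU's slot.  Other CPUs' slots are not touched, so
 * handlers on different CPUs never serialize against each other. */
bool mca_enter(int cpu)
{
    bool expected = false;

    if (cpu < 0 || cpu >= MAX_CPUS)
        return false;
    return atomic_compare_exchange_strong(&mca_slots[cpu].busy,
                                          &expected, true);
}

/* Release the slot; note a failed recovery for later coordination. */
void mca_exit(int cpu, bool recovered)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);

    if (!recovered)
        atomic_fetch_add(&rendezvous_requests, 1);
    atomic_store(&mca_slots[cpu].busy, false);
}

/* Global coordination is needed only if some CPU failed to recover. */
bool mca_rendezvous_needed(void)
{
    return atomic_load(&rendezvous_requests) > 0;
}
```

Two CPUs can be between mca_enter() and mca_exit() at the same time,
and mca_rendezvous_needed() stays false as long as both recover.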

As I said, this will require changes in the MCA & SAL specs.
Some are simply clearing up ambiguities in the specs, as Keith found
in MCA logging of recovered errors.  Some will be more fundamental 
changes to support better recovery.  The code has reached the point 
where we need to start making enhancements to those specs.

Thanks,
-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc          rja@sgi.com
Received on Sun Jan 9 12:06:49 2005
