Re: [PATCH&RFC 2/2] OS_MCA Recovery from poisoned memory read

From: Christian Cotte-Barrot <Christian.Cotte-Barrot_at_bull.net>
Date: 2004-08-23 18:29:20

Hidetoshi Seto wrote:
> 
> Keith Owens wrote:
> > mca_handler_bh() is running as an extension of the MCA event which
> > means that it is not irq safe.  It is not safe to get any external lock
> > in mca_page_isolate() or mca_handler_bh().
> 
> Therefore the MCA handler couldn't recover the system if the process is
> running in kernel-mode since it possibly has such important kernel locks.
> 
> > AFAICT, my concerns about the MCA event and mca_handler_bh() not being
> > irq safe are only a problem for the case when the MCA was triggered by
> > user space code but was delivered when the cpu was in kernel code.
> > Maybe we do not support the problem case.
> >
> > *    offending process  affected process  OS MCA do
> > *     kernel mode        kernel mode       down system
> > *     kernel mode        user   mode       kill the process
> > *     user   mode        kernel mode       kill the process <=== problem
> > *     user   mode        user   mode       kill the process
> 
> So we have to guarantee that the process is running in user-mode,
> which all processes can't have any locks of kernel, right?
> 
> It would be happy if this fixed patch satisfy your requirement.
> 

Surprisingly, this looks like it is being treated as a new discovery
(the locking problem within the MCA handler)?

Here is Zoltan's document, which includes a design showing how to avoid
the race (paragraphs 3.5, 3.5.1, 3.5.2 ...), along with much other
material:
  http://mca-recovery.sourceforge.net
    Linux Itanium MCA Recovery Design Document, third draft
      http://mca-recovery.sourceforge.net/mca3.html
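
For reference, the policy in Keith's table quoted above could be sketched
in C roughly as follows. This is only an illustration of the four rows:
the enum and function names (cpu_mode, mca_action, mca_policy,
mca_is_problem_case) are hypothetical and are not taken from the patch or
from the design document.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not from the patch. */
enum cpu_mode { MODE_KERNEL, MODE_USER };
enum mca_action { MCA_DOWN_SYSTEM, MCA_KILL_PROCESS };

/*
 * Action per row of the quoted table: only the kernel/kernel
 * combination brings the system down; every other row kills
 * the process.
 */
static enum mca_action mca_policy(enum cpu_mode offending,
                                  enum cpu_mode affected)
{
    if (offending == MODE_KERNEL && affected == MODE_KERNEL)
        return MCA_DOWN_SYSTEM;
    return MCA_KILL_PROCESS;
}

/*
 * The row marked "<=== problem": the MCA was triggered by user
 * space code but delivered while the cpu was in kernel code, so
 * kernel locks may be held when the handler runs.
 */
static bool mca_is_problem_case(enum cpu_mode offending,
                                enum cpu_mode affected)
{
    return offending == MODE_USER && affected == MODE_KERNEL;
}
```

Keeping the problem-case check separate from the policy lookup makes it
easy to refuse recovery for that one row without changing the others.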

-- 
+===========+=======================+==================================+
|  |\/\/\/| |                       |                                  |
|  |      | |Christian Cotte-Barrot |org.  :BULL/                      |
|  | (~)(o) |Bull S.A.              |office:FREC/B1-401                |
| C      _) |1, rue de Provence     |mailto:                           |
|  | ,___|  |B.P. 208               |   Christian.Cotte-Barrot@bull.net|
|  |   /    |38432 ECHIROLLES CEDEX |phone :+33 (0)476297725 (229 7725)|
| /----\    |FRANCE                 |fax   :+33 (0)476297518 (229 7518)|
+===========+=======================+==================================+
Received on Mon Aug 23 04:29:56 2004
