Re: [patch] Memory Error Handling Improvement

From: Russ Anderson <rja_at_sgi.com>
Date: 2005-06-25 06:11:09

David Mosberger wrote:
>
>   Russ> I assume you mean usage like in
>   Russ> arch/ia64/oprofile/backtrace.c, so the usage in mca_drv.c
>   Russ> would be along the lines of:
>
>   Russ>       extern char ia64_ivt[];
>
>   Russ>       if (psr1->cpl != 0 || (pmsa->pmsa_iip >= (unsigned
>   Russ> long)ia64_ivt+0x3000 &&
>
>   Russ> Do you have the same objection for interrupt_pnr?  If so, what
>   Russ> is the best way to calculate the offset in ivt.S (which looks
>   Russ> hardcoded for other routines)?
>
> I thought interrupt_pnr was at the end of the vector.  Now that I look
> closer, the whole thing looks rather doubious/fragile to me.  What
> exactly are you trying to do there?

The intent is to handle cases where a user application loads from
memory with bad ECC, but the MCA surfaces in the interrupt (kernel)
code.

Testing with error injection showed a significant number of cases
where the MCA surfaced early in the interrupt routine, even though
the load of the bad data was launched from a user process.  Adding
the second condition to look for these cases allowed them to be
recovered.  Analysis of the recovered MCA records showed that 7-10%
of the recoveries were this case when the error recovery code ran
alongside other activity that caused interrupts.

Previously, if the MCA surfaced while the CPU was in privileged
mode, the code would not try to recover.  This change adds a second
condition: whether the kernel is early in the interrupt routine,
which it determines by checking the instruction address range.  As
Hidetoshi Seto points out, the check should also make sure
the interrupted process was in user mode.  That has been added
to the patch and tested.
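
As a rough illustration only, the decision described above can be
modelled in user space as below.  The struct, the function name and
the window parameters are hypothetical and are not part of the patch;
in the patch itself the values come from pmsa_ipsr, pmsa_xpsr and
pmsa_iip, and the window runs from ia64_ivt+0x3000 up to
interrupt_pnr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of the recovery decision; not kernel code. */
struct mca_ctx {
	unsigned int ipsr_cpl;	/* privilege level when the MCA surfaced (0 = kernel) */
	unsigned int xpsr_cpl;	/* privilege level of the context that was interrupted */
	uint64_t iip;		/* instruction pointer when the MCA surfaced */
};

/*
 * Return true if the affected process may be terminated and the system
 * recovered.  [win_start, win_end) stands in for the early part of the
 * external interrupt handler.
 */
static bool mca_is_recoverable(const struct mca_ctx *c,
			       uint64_t win_start, uint64_t win_end)
{
	assert(win_start <= win_end);

	/* Original condition: the MCA surfaced directly in user mode. */
	if (c->ipsr_cpl != 0)
		return true;

	/*
	 * Added condition: the MCA surfaced in kernel mode, but early in
	 * the interrupt routine, and the interrupted context was user mode.
	 */
	return c->xpsr_cpl != 0 &&
	       c->iip >= win_start && c->iip < win_end;
}
```

The window is half-open, so an iip equal to the end label is treated
as past the early interrupt code and is not recovered.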

A major concern is verifying that the correct process is killed.
The name of the process killed is logged to /var/log/messages,
and the test itself also verifies it.  Early on there was a bug in
the error injection routine that changed the ECC on the wrong
physical address.  This was identified because the error injection
test completed without being killed.  Likewise, if the recovery code
killed the wrong process, it would be obvious because the error
injection test would not get killed.

We have also been testing the recovery code in manufacturing 
with real bad DIMMs.

The updated patch follows.

Signed-off-by: Russ Anderson (rja@sgi.com)

-----------------------------------------------------------
Index: linux-2.6/arch/ia64/kernel/mca_drv.c
===================================================================
--- linux-2.6.orig/arch/ia64/kernel/mca_drv.c	2005-06-24 14:24:08.100951706 -0500
+++ linux-2.6/arch/ia64/kernel/mca_drv.c	2005-06-24 14:31:03.253211915 -0500
@@ -118,10 +118,11 @@
  */
 
 void
-mca_handler_bh(unsigned long paddr)
+mca_handler_bh(unsigned long paddr, void *iip, unsigned long ipsr)
 {
-	printk(KERN_DEBUG "OS_MCA: process [pid: %d](%s) encounters MCA.\n",
-		current->pid, current->comm);
+	printk(KERN_DEBUG "OS_MCA: process [cpu %d, pid: %d, uid: %d, iip: %p, psr: 0x%lx, paddr: 0x%lx](%s) encounters MCA.\n",
+		smp_processor_id(), current->pid, current->uid, iip, ipsr, paddr, current->
+comm);
 
 	spin_lock(&mca_bh_lock);
 	if (mca_page_isolate(paddr) == ISOLATE_OK) {
@@ -414,21 +415,27 @@
 	 */
 
 	psr1 =(struct ia64_psr *)&(peidx_minstate_area(peidx)->pmsa_ipsr);
+	psr2 =(struct ia64_psr *)&(peidx_minstate_area(peidx)->pmsa_xpsr);
 
 	/*
 	 *  Check the privilege level of interrupted context.
 	 *   If it is user-mode, then terminate affected process.
 	 */
-	if (psr1->cpl != 0) {
+	pmsa = (pal_min_state_area_t *)(sal_to_os_handoff_state->pal_min_state | (6ul<<61));
+
+	if (psr1->cpl != 0 || ((psr2->cpl != 0) &&
+			       (pmsa->pmsa_iip >= (unsigned long)ia64_ivt+0x3000 &&
+			        pmsa->pmsa_iip <  (unsigned long)&interrupt_pnr))) {
 		smei = peidx_bus_check(peidx, 0);
 		if (smei->valid.target_identifier) {
 			/*
 			 *  setup for resume to bottom half of MCA,
 			 * "mca_handler_bhhook"
 			 */
-			pmsa = (pal_min_state_area_t *)(sal_to_os_handoff_state->pal_min_state | (6ul<<61));
-			/* pass to bhhook as 1st argument (gr8) */
+			/* pass to bhhook as argument (gr8, ...) */
 			pmsa->pmsa_gr[8-1] = smei->target_identifier;
+			pmsa->pmsa_gr[9-1] = pmsa->pmsa_iip;
+			pmsa->pmsa_gr[10-1] = pmsa->pmsa_ipsr;
 			/* set interrupted return address (but no use) */
 			pmsa->pmsa_br0 = pmsa->pmsa_iip;
 			/* change resume address to bottom half */
@@ -438,6 +445,7 @@
 			psr2 = (struct ia64_psr *)&pmsa->pmsa_ipsr;
 			psr2->cpl = 0;
 			psr2->ri  = 0;
+			psr2->bn  = 1;
 			psr2->i  = 0;
 
 			return 1;
Index: linux-2.6/arch/ia64/kernel/mca_drv_asm.S
===================================================================
--- linux-2.6.orig/arch/ia64/kernel/mca_drv_asm.S	2005-06-24 14:24:08.100951706 -0500
+++ linux-2.6/arch/ia64/kernel/mca_drv_asm.S	2005-06-24 14:31:03.254188466 -0500
@@ -19,7 +19,7 @@
 	;;						//
 	clrrrb						//
 	;;						
-	alloc		r16=ar.pfs,0,2,1,0		// make a new frame
+	alloc		r16=ar.pfs,0,2,3,0		// make a new frame
 	;;
 	mov		ar.rsc=0
 	;;
@@ -40,11 +40,13 @@
 	movl		loc1=mca_handler_bh		// recovery C function
 	;;
 	mov		out0=r8				// poisoned address
+	mov		out1=r9				// iip
+	mov		out2=r10			// psr
 	mov		b6=loc1
 	;;
 	mov		loc1=rp
 	;;
-	ssm		psr.i
+	ssm		psr.i | psr.ic
 	;;
 	br.call.sptk.many    rp=b6			// does not return ...
 	;;
Index: linux-2.6/arch/ia64/kernel/ivt.S
===================================================================
--- linux-2.6.orig/arch/ia64/kernel/ivt.S	2005-06-24 14:24:08.097045503 -0500
+++ linux-2.6/arch/ia64/kernel/ivt.S	2005-06-24 14:31:03.255165017 -0500
@@ -785,6 +785,8 @@
 	;;
 	SAVE_REST
 	;;
+	.global	interrupt_pnr
+interrupt_pnr:
 	alloc r14=ar.pfs,0,0,2,0 // must be first in an insn group
 	mov out0=cr.ivr		// pass cr.ivr as first arg
 	add out1=16,sp		// pass pointer to pt_regs as second arg
Index: linux-2.6/arch/ia64/kernel/mca_drv.h
===================================================================
--- linux-2.6.orig/arch/ia64/kernel/mca_drv.h	2005-06-24 14:24:08.100951706 -0500
+++ linux-2.6/arch/ia64/kernel/mca_drv.h	2005-06-24 14:31:03.256141568 -0500
@@ -111,3 +111,6 @@
 	slidx_foreach_entry(__pos, &((slidx)->sec)) { __count++; }\
 	__count; })
 
+extern char ia64_ivt[];
+extern void *interrupt_pnr;
+EXPORT_SYMBOL(interrupt_pnr);
Received on Fri Jun 24 16:20:00 2005