mca.c: Incorrect recovery from TLB errors?

From: Keith Owens <kaos_at_sgi.com>
Date: 2004-02-09 12:53:28

In both the 2.4 and 2.6 kernels, ia64_return_to_sal_check() in
arch/ia64/kernel/mca.c has

	if (psp->cc == 1 && psp->bc == 1 && psp->rc == 1 && psp->uc == 1)
		ia64_os_to_sal_handoff_state.imots_os_status = IA64_MCA_COLD_BOOT;
	else
		ia64_os_to_sal_handoff_state.imots_os_status = IA64_MCA_CORRECTED;

Why does it test for all the cc/bc/rc/uc bits being set?  Surely that
should be or, not and?  The real test for recovery is

	psp->tc && !(psp->cc || psp->bc || psp->rc || psp->uc)

The existing code is also inconsistent with the test in mca_asm.S, which
only tests for psp->tc being 1 and ignores the other bits.
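
To make the difference between the two tests concrete, here is a minimal
stand-alone sketch.  The struct psp_bits type and both helper names are
made up for illustration; the real pal_processor_state_info_t has many
more fields, and only the five bits discussed here are modelled.

```c
/*
 * Hypothetical stand-in for the five pal_processor_state_info_t bits
 * discussed above.  Not the kernel's definition.
 */
struct psp_bits {
	unsigned int tc : 1;	/* TLB check */
	unsigned int cc : 1;	/* cache check */
	unsigned int bc : 1;	/* bus check */
	unsigned int rc : 1;	/* register file check */
	unsigned int uc : 1;	/* micro-architectural check */
};

/*
 * Mirrors the existing test: the error is reported as corrected unless
 * cc, bc, rc and uc are all set at the same time.
 */
static int existing_test_recovers(const struct psp_bits *p)
{
	return !(p->cc == 1 && p->bc == 1 && p->rc == 1 && p->uc == 1);
}

/*
 * Mirrors the proposed test: only a TLB error with none of the other
 * check bits set counts as recovered.
 */
static int proposed_test_recovers(const struct psp_bits *p)
{
	return p->tc && !(p->cc || p->bc || p->rc || p->uc);
}
```

With only bc set, existing_test_recovers() returns 1 (the error would be
reported as IA64_MCA_CORRECTED) while proposed_test_recovers() returns 0.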

Tony: it makes life easier for kdb if the "am I going to recover" test
is promoted from ia64_return_to_sal_check() to ia64_mca_ucmc_handler()
and passed down to ia64_return_to_sal_check().  Otherwise kdb has to
duplicate the code in ia64_return_to_sal_check() to decide if the MCA
is recoverable or not.  Normally you do not want kdb to handle a
recovered error.  Any objections to this?

void
ia64_mca_ucmc_handler(void)
{
	pal_processor_state_info_t *psp = (pal_processor_state_info_t *)
		&ia64_sal_to_os_handoff_state.proc_state_param;
	int recover = psp->tc && !(psp->cc || psp->bc || psp->rc || psp->uc);
	...
	ia64_return_to_sal_check(psp, recover);
}
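
As a rough illustration of the data flow in this proposal, the sketch
below computes the flag once and uses it both to choose the status handed
back to SAL and to decide whether a debugger hook runs.  Everything
prefixed with sketch_ or SKETCH_ is a hypothetical stand-in, not existing
kernel or kdb code, and the status function is simplified to take only
the flag.

```c
/* Stand-in status values; the real constants live in the ia64 headers. */
enum sketch_status { SKETCH_CORRECTED, SKETCH_COLD_BOOT };

/* Stand-in for the five processor state bits discussed in this thread. */
struct sketch_psp {
	unsigned int tc : 1, cc : 1, bc : 1, rc : 1, uc : 1;
};

/* Counts how often the hypothetical debugger hook was entered. */
static int sketch_debugger_calls;

static void sketch_debugger_hook(void)
{
	sketch_debugger_calls++;
}

/* The status decision no longer inspects the bits, it trusts the flag. */
static enum sketch_status sketch_return_to_sal_check(int recover)
{
	return recover ? SKETCH_CORRECTED : SKETCH_COLD_BOOT;
}

static enum sketch_status sketch_ucmc_handler(const struct sketch_psp *psp)
{
	/* Computed once, at the top of the handler. */
	int recover = psp->tc &&
		      !(psp->cc || psp->bc || psp->rc || psp->uc);

	/* The debugger only sees errors that will not be recovered. */
	if (!recover)
		sketch_debugger_hook();

	return sketch_return_to_sal_check(recover);
}
```

The point of the arrangement is that the recoverability test lives in one
place, so a debugger does not need its own copy of it.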

Received on Sun Feb 8 20:57:50 2004