[PATCH 06/10] IOCHK interface for I/O error handling/detecting

From: Hidetoshi Seto <seto.hidetoshi_at_jp.fujitsu.com>
Date: 2005-06-09 22:56:03
[This is 6 of 10 patches, "iochk-06-mcanotify.patch"]

- This is a headache:
   When ia64 hits a hardware problem, the OS can ask SAL
   (System Abstraction Layer: the ia64 firmware) to gather
   system status by calling the SAL_GET_STATE_INFO procedure.

   However (depending on the platform's SAL implementation,
   hopefully), while gathering, SAL also checks the status of
   every host bridge and, after that, resets the state...

   So we have to take care of this reset by SAL.

   Handling MCA (Machine Check Abort) is one situation where
   we have to take care. MCA was originally designed as a
   critical interruption, so when an MCA arrives, SAL gathers
   system status without being asked, before the OS gets control.
   Since the bridge states are therefore already reset on entry
   to the MCA handler, the OS should notify all "check-in"
   contexts of the "loss of state" by setting their error flag,
   iocookie->error.

   It would be better if the OS could learn the bridge state
   from the data SAL gathered, but for now I just take the
   simple way.
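
   The notification rule described above can be sketched in plain
   user-space C. This is only an illustration under assumptions: the
   struct layouts, the hand-rolled singly linked list (the kernel code
   uses list_for_each_entry over iochk_devices), and the helper
   iochk_register() are hypothetical stand-ins, not the code in this
   patch. Only notify_bridge_error(), iochk_devices, cookie->host and
   cookie->error are names taken from the diff below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct pci_dev {
	int id;
};

struct iocookie {
	struct iocookie *next;	/* simplified link; the kernel uses list_head */
	struct pci_dev *host;	/* host bridge this context sits under */
	int error;		/* set when the bridge state may be lost */
};

/* All currently "checked-in" contexts. */
struct iocookie *iochk_devices;

/* Hypothetical helper: add a context to the list with a clean error flag. */
void iochk_register(struct iocookie *cookie, struct pci_dev *host)
{
	assert(cookie != NULL);
	cookie->host = host;
	cookie->error = 0;
	cookie->next = iochk_devices;
	iochk_devices = cookie;
}

/*
 * bridge == NULL: global notify (e.g. MCA), mark every context.
 * bridge != NULL: local notify, mark only contexts under that bridge.
 */
void notify_bridge_error(struct pci_dev *bridge)
{
	struct iocookie *cookie;

	for (cookie = iochk_devices; cookie; cookie = cookie->next) {
		if (!bridge || cookie->host == bridge)
			cookie->error = 1;
	}
}
```

   With two contexts under different bridges, a local notify for one
   bridge leaves the other context's flag untouched, while a call with
   a null bridge (the MCA case) marks both, so every running
   transaction is treated as suspect.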

A PCI parity error is one of the causes of MCA; is that OK?
Next, "data poisoning" helps us... see the next patch (7 of 10).

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>


  arch/ia64/kernel/mca.c      |   13 +++++++++++++
  arch/ia64/lib/iomap_check.c |    7 ++++++-
  2 files changed, 19 insertions(+), 1 deletion(-)

Index: linux-
--- linux-
+++ linux-
@@ -77,6 +77,11 @@
  #include <asm/irq.h>
  #include <asm/hw_irq.h>

+#include <linux/pci.h>
+extern void notify_bridge_error(struct pci_dev *bridge);
  #if defined(IA64_MCA_DEBUG_INFO)
  # define IA64_MCA_DEBUG(fmt...)	printk(fmt)
@@ -893,6 +898,14 @@ ia64_mca_ucmc_handler(void)
  		sal_log_record_header_t *rh = IA64_LOG_CURR_BUFFER(SAL_INFO_TYPE_MCA);
  		rh->severity = sal_log_severity_corrected;
+		/*
+		 * SAL already reads and clears error bits on bridge registers,
+		 * so we should have all running transactions to retry.
+		 */
+		notify_bridge_error(0);
  	 *  Wakeup all the processors which are spinning in the rendezvous
Index: linux-
--- linux-
+++ linux-
@@ -109,7 +109,12 @@ void notify_bridge_error(struct pci_dev

  	/* notify error to all transactions using this host bridge */
-	if (bridge) {
+	if (!bridge) {
+		/* global notify, ex. MCA */
+		list_for_each_entry(cookie, &iochk_devices, list) {
+			cookie->error = 1;
+		}
+	} else {
  		/* local notify, ex. Parity, Abort etc. */
  		list_for_each_entry(cookie, &iochk_devices, list) {
  			if (cookie->host == bridge)

Received on Thu Jun 9 08:59:31 2005
