Re: [PATCH/RFC] I/O-check interface for driver's error handling

From: Benjamin Herrenschmidt <benh_at_kernel.crashing.org>
Date: 2005-03-05 09:37:25
On Fri, 2005-03-04 at 14:54 +0100, Pavel Machek wrote:
> Hi!
> 
> > > If there's no ->error method, at least call ->remove so one device only
> > > takes itself down.
> > >
> > > Does this make sense?
> > 
> > This was my thought too last time we had this discussion.  A completely 
> > asynchronous call is probably needed in addition to Hidetoshi's proposed API, 
> > since as you point out, the driver may not be running when an error occurs 
> > (e.g. in the case of a DMA error or more general bus problem).  The async
> 
> Hmm, before we go async way (nasty locking, no?) could driver simply
> ask "did something bad happen while I was sleeping?" at the beginning of each
> function?
> 
> For DMA problems, driver probably has its own, timer-based,
> "something is wrong" timer, anyway, no?

No, there is no nasty locking: by the time the callback happens, pretty
much all IO has stopped anyway due to the errors, and we aren't on a
critical code path.

Polling for errors might be possible, but async notification is the way
to go because whatever does error management needs to be able to do the
following as separate steps:

 - notify all drivers on the affected bus segment
 - once the above is done, and based on system/driver capabilities (API
   to be defined), possibly re-enable IO access and do a new round of
   notifications
 - based on system/driver capabilities, possibly reset the slot and
   notify drivers to re-initialize themselves.
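
To make the three steps above concrete, here is a rough userspace sketch
of the sequencing. Every name in it (sketch_driver, sketch_segment, the
DRV_CAP_* flags, handle_segment_error) is made up for illustration, since
the capability API is still to be defined, and the points where the
platform would actually re-enable IO or reset the slot are only marked
with comments. Drivers without an ->error method get ->remove called, as
suggested earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not an existing kernel API. */
enum err_event { ERR_DETECTED, ERR_IO_REENABLED, ERR_SLOT_RESET };
enum err_outcome { OUTCOME_IO_RECOVERED, OUTCOME_SLOT_RESET, OUTCOME_REMOVED };

#define DRV_CAP_IO_RECOVER 0x1u /* can carry on once IO access is re-enabled */
#define DRV_CAP_REINIT     0x2u /* can re-initialize after a slot reset */

struct sketch_driver {
    unsigned caps;
    void (*error)(struct sketch_driver *drv, enum err_event ev); /* may be NULL */
    void (*remove)(struct sketch_driver *drv);                   /* may be NULL */
    int removed;
    int events_seen;
};

struct sketch_segment {
    struct sketch_driver **drivers;
    size_t ndrivers;
    int can_reenable_io; /* assumed platform capability */
    int can_reset_slot;  /* assumed platform capability */
};

/* Example ->error callback: just counts the notifications it receives. */
void count_event(struct sketch_driver *drv, enum err_event ev)
{
    (void)ev;
    drv->events_seen++;
}

static void take_down(struct sketch_driver *d)
{
    if (d->remove)
        d->remove(d);
    d->removed = 1;
}

/* One round of notifications; drivers without ->error are removed. */
static void notify_all(struct sketch_segment *seg, enum err_event ev)
{
    for (size_t i = 0; i < seg->ndrivers; i++) {
        struct sketch_driver *d = seg->drivers[i];
        if (d->removed)
            continue;
        if (d->error)
            d->error(d, ev);
        else
            take_down(d);
    }
}

/* True if every driver still bound on the segment advertises 'cap'. */
static int all_have_cap(const struct sketch_segment *seg, unsigned cap)
{
    for (size_t i = 0; i < seg->ndrivers; i++)
        if (!seg->drivers[i]->removed && !(seg->drivers[i]->caps & cap))
            return 0;
    return 1;
}

enum err_outcome handle_segment_error(struct sketch_segment *seg)
{
    assert(seg != NULL);

    /* Step 1: tell everyone on the affected segment. */
    notify_all(seg, ERR_DETECTED);

    /* Step 2: re-enable IO if platform and drivers allow it. */
    if (seg->can_reenable_io && all_have_cap(seg, DRV_CAP_IO_RECOVER)) {
        /* the platform would re-enable IO access here */
        notify_all(seg, ERR_IO_REENABLED);
        return OUTCOME_IO_RECOVERED;
    }

    /* Step 3: otherwise reset the slot and ask drivers to re-init. */
    if (seg->can_reset_slot && all_have_cap(seg, DRV_CAP_REINIT)) {
        /* the platform would reset the slot here */
        notify_all(seg, ERR_SLOT_RESET);
        return OUTCOME_SLOT_RESET;
    }

    /* Nothing else possible: take the remaining drivers down. */
    for (size_t i = 0; i < seg->ndrivers; i++)
        if (!seg->drivers[i]->removed)
            take_down(seg->drivers[i]);
    return OUTCOME_REMOVED;
}
```

Requiring every remaining driver to advertise a capability before using
it is just one possible policy for the sketch; a real implementation
could just as well decide per driver.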

Ben.


Received on Fri Mar 4 17:57:27 2005
