Re: [linux-usb-devel] Re: serious 2.6 bug in USB subsystem?

From: David Brownell <david-b_at_pacbell.net>
Date: 2004-03-11 03:22:26
I'll send out a revised patch later, thanks!  It's also good
this code got a more careful read.  It seems like some things
are not as obvious as I might like...

That patch will merge those list corruption fixes I sent, the
"else" you verified was needed (ugh!!!), and some of what you
include here.   It won't add new BUG_ON calls (WARN at worst)
or a duplicate ED state (see next).


>   >>   ...  add a new state
>   >> ED_DESCHEDULED, which is treated exactly like ED_IDLE, except
>   >> that in this state, the HC may still be referring to the ED in
>   >> question.  Thus, if
> 
>   David.B> Sounds exactly like ED_UNLINK -- except maybe that it's not
>   David.B> been put onto ed_rm_list (with ED_DEQUEUE set).  Why add
>   David.B> another state?
>   ...
> 
> The current OHCI relies on the internals of the dma_pool()
> implementation.  If the implementation changed to, say, modify the
> memory that is free or, heaven forbid, return the memory to the
> kernel, you'd get _extremely_ difficult to track down race conditions.

The current implementation _does_ poison memory on free, if
slab poisoning is enabled.  (That's why I asked if you were
using it.)

And that's been quite handy for reporting list corruption bugs,
from races or otherwise. But those list corruption bugs hit a
blind spot in that code:  it doesn't check for modify-after-free.
Which is why those bugs were able to hide for so long!


It'd be good if you said _how_ you think it relies on such
internals.  Some of your debug diagnostics wrongly claimed
allocation of "new" EDs when they were just being re-used.
That'd make intentional/safe re-use look like a bug or race.


> Even so, the code isn't race-free, like I explained yesterday:
> 
>  - ed_alloc() clears the ED to 0 via memset()
>  - the order in which memset() clears memory is undefined (various
>    from platform to platform etc)

There's a wmb() before any ED is handed off to the OHCI silicon;
that forces a defined order.  Top of ed_schedule().  First use,
or Nth re-use; no matter.

>  - thus you might get a case where hwTailP is 0 but hwHeadP
>    is non-zero, which would cause the HC to happily start
>    dereferencing the descriptor

If you assume a bug where the ED is freed but still in use,
that's hardly the only thing that'd go wrong!!  You can't
use such a potential bug to prove something else is broken.

- Dave



-
To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Received on Wed Mar 10 11:26:39 2004

This archive was generated by hypermail 2.1.8 : 2005-08-02 09:20:24 EST