I'll send out a revised patch later, thanks! It's also good this code got a more careful read. It seems like some things are not as obvious as I might like... That patch will merge those list corruption fixes I sent, the "else" you verified was needed (ugh!!!), and some of what you include here. It won't add new BUG_ON calls (WARN at worst) or a duplicate ED state (see next). > >> ... add a new state > >> ED_DESCHEDULED, which is treated exactly like ED_IDLE, except > >> that in this state, the HC may still be referring to the ED in > >> question. Thus, if > > David.B> Sounds exactly like ED_UNLINK -- except maybe that it's not > David.B> been put onto ed_rm_list (with ED_DEQUEUE set). Why add > David.B> another state? > ... > > The current OHCI relies on the internals of the dma_pool() > implementation. If the implementation changed to, say, modify the > memory that is free or, heaven forbid, return the memory to the > kernel, you'd get _extremely_ difficult to track down race conditions. The current implementation _does_ poison memory on free, if slab poisoning is enabled. (That's why I asked if you were using it.) And that's been quite handy for reporting list corruption bugs, from races or otherwise. But those list corruption bugs hit a blind spot in that code: it doesn't check for modify-after-free. Which is why those bugs were able to hide for so long! It'd be good if you said _how_ you think it relies on such internals. Some of your debug diagnostics wrongly claimed allocation of "new" EDs when they were just being re-used. That'd make intentional/safe re-use look like a bug or race. > Even so, the code isn't race-free, like I explained yesterday: > > - ed_alloc() clears the ED to 0 via memset() > - the order in which memset() clears memory is undefined (various > from platform to platform etc) There's a wmb() before any ED is handed off to the OHCI silicon; that forces a defined order. Top of ed_schedule(). First use, or Nth re-use; no matter. > - thus you might get a case where hwTailP is 0 but hwHeadP > is non-zero, which would cause the HC to happily start > dereferencing the descriptor If you assume a bug where the ED is freed but still in use, that's hardly the only thing that'd go wrong!! You can't use such a potential bug to prove something else is broken. - Dave - To unsubscribe from this list: send the line "unsubscribe linux-ia64" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.htmlReceived on Wed Mar 10 11:26:39 2004
This archive was generated by hypermail 2.1.8 : 2005-08-02 09:20:24 EST