Re: [PATCH] New CMC/CPE polling

From: Alex Williamson <alex_williamson_at_hp.com>
Date: 2003-08-05 07:15:33
"Luck, Tony" wrote:
> 
> >    You might be right on the clearing side, I think moving it down
> > a couple lines and disabling local interrupts would eliminate the
> > potential hole though.  Something like this:
> >
> > ia64_mca_cmc_int_caller(...)
> > {
> > ...
> >     smp_call_function(ia64_mca_cmc_vector_enable, NULL, 1, 0);
> >     local_irq_disable();
> >     ia64_mca_cmc_vector_enable(NULL);
> >     cmc_polling_enabled = 0;
> > ...
> >
> >    Does that address the race you were looking at?  I don't see one
> > on the setting end, could you be more specific?  The spinlock feels
> > like it does the trick to me.  Thanks for the comments,
> 
> My issue is that you check and set cmc_polling_enabled under the
> protection of a lock (cmc_history_lock ... I realise that this is
> mostly protecting the cmc_history[], but it also defends against
> multiple cpus taking a CMCI at the same time and all trying to
> switch over to polling mode).
> 
> But you clear cmc_polling_enabled without the lock.  So the race is
> between enabling the interrupt (on all cpus) and taking the next
> interrupt.  The local_irq_disable() in the above fragment fixes that
> for the current cpu, but for other cpus you can end up in the handler
> with cmc_polling_enabled set to the wrong value.
> 
> One way might be harmless, but I'm not sure which.

Tony,

   I believe it is harmless.  I make use of the spinlock when setting
cmc_polling_enabled to serialize the cpus.  ia64_mca_cmc_int_handler()
can obviously have multiple CPUs in it at the same time.  However,
the polling sequence is naturally serialized, so there will never
be more than one CPU in the block that clears cmc_polling_enabled.
I didn't feel it necessary to pull the spinlock out of the interrupt
handler when it wasn't needed.

   With the modifications above, there is a tiny window where the
other CPUs could take a CMCI while the cmc_polling_enabled flag is in
the wrong state.  IMHO, this doesn't really seem like a problem.
The goal of polling for CMCs is simply to make the system usable, not
to count every CMC that occurs.  Forward progress towards setting
the flag in the right state is ensured by disabling interrupts, so
the window is guaranteed to be very small.  FWIW, this is the same
state we're in between ia64_mca_init() and ia64_mca_late_init().

	Alex

-- 
Alex Williamson                             HP Linux & Open Source Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Received on Mon Aug 4 17:15:44 2003

This archive was generated by hypermail 2.1.8 : 2005-08-02 09:20:16 EST