Re: Attribute spinlock contention ticks to caller.

From: Robin Holt <holt_at_sgi.com>
Date: 2005-09-17 08:29:49
On Fri, Sep 16, 2005 at 02:37:31AM -0700, Stephane Eranian wrote:
> Robin,
> 
> On Thu, Sep 15, 2005 at 05:29:00PM -0500, Robin Holt wrote:
> > > It seems you assume you know something in advance here. 
> > > I think you need a two-step process somehow. First you need to
> > > discover that you have contention, i.e., lots of samples
> > > in the contention code. Second you want to know from where
> > > and that's why you record the return from contention rather
> > > than contention. This sequence makes sense. 
> > > 
> > > With your patch, you would skip the first step.  If you don't
> > > know you have contention, how would you interpret the samples
> > > you get? For each sample, you have to search backwards to see
> > > if there is a br.call or similar that points to some 
> > > spinlock code. Why would you do this costly search systematically?
> > > Unless the tool is designed just to look for this. 
> > 
> > Unfortunately, without this patch, ia64_spinlock_contention becomes the
> > top billing issue on nearly every large cpu count sample.  It does not
> > mean you are contending on the same lock.  I have been fooled many times
> > into chasing a contention problem when in reality there were many locks
> > lightly contended which artificially raised the number of ticks to a
> > significant level.
> > 
> You have another side effect in here: interrupt masking. With existing
> PMU, you cannot take a sample if interrupts are masked. For some
> of the kernel code, it means that you get samples attributed to the
> bundle(s) just following interrupt unmasking.

At least in those situations, you can clearly see the samples are
attributable to the function which disabled/enabled interrupts.  Not some
generic code which has nothing to do with the function.

> I think what you are really after here is kernel call stack unwinding.
> Your patch is effectively a quick hack to get this for a specific function
> and for one level of unwinding.

Are you saying I should add the heavy steps of checking the top of the
call stack and unwinding a single step if I am in ia64_spinlock_contention,
instead of the relatively light check against two global symbols
and a register move?
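
To make the "light check" concrete, here is a rough sketch in plain C.
It is not the patch itself: the struct, the function name, and the
addresses are invented for illustration, and the two constants merely
stand in for the two global symbols that bracket
ia64_spinlock_contention.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two global symbols that bracket the
 * contention code; the addresses are made up. */
static const uint64_t contention_start = 0x1000;
static const uint64_t contention_end   = 0x1040; /* one past the end */

/* Hypothetical sample: where the tick landed, plus the saved return
 * address of the spin_lock() caller at that moment. */
struct sample {
    uint64_t ip;
    uint64_t return_addr;
};

/* Return the address the tick should be billed to. */
uint64_t attributed_ip(const struct sample *s)
{
    assert(contention_start < contention_end);

    /* Two compares against the bracketing symbols... */
    if (s->ip >= contention_start && s->ip < contention_end)
        return s->return_addr; /* ...and one move: bill the caller. */

    return s->ip; /* Everything else is billed where it was sampled. */
}
```

A sample with ip 0x1010 and return_addr 0x2000 would be billed to
0x2000; a sample at 0x3000 stays at 0x3000. No unwinder is involved.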

> I have shown in several presentations (incl. Gelato May 2004) that the existing
> infrastructure can be used to sample the kernel call stack. I have written
> a prototype perfmon2 sampling format that does just that. You can
> say, for instance: "Every 100,0000 cycles in the kernel record the full
> (or partial) call stack". The format is just a prototype at this point
> but I think it could be useful for your situation. The format was
> designed to show the power of the interface in that it allows you to
> sample on a PMU event and yet record non-PMU-based information.

Can you cite one instance where it is more helpful to know that _ANY_
lock in the system is hitting ia64_spinlock_contention as opposed to
the function which has the spin_lock() code?  I can recall times when
network traffic was contending on locks and causing my application to
appear to have a contended lock.  With this change, at least we know
the degradation is due to something external.
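
To put numbers on the difference, here is a toy tally in C. It is only
an illustration under my own assumptions: the names and the fixed table
size are invented, and the input is simply the caller address recorded
for each tick that landed in the contention code.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CALLERS 16 /* arbitrary bound for this sketch */

struct contention_summary {
    unsigned total_ticks;          /* what a single contention bucket shows */
    unsigned busiest_caller_ticks; /* worst single caller after attribution */
    size_t   distinct_callers;     /* how many callers share the total */
};

/* Summarize contention ticks, given the caller address of each tick. */
struct contention_summary summarize(const uint64_t *callers, size_t n)
{
    uint64_t addr[MAX_CALLERS];
    unsigned ticks[MAX_CALLERS];
    struct contention_summary out = { 0, 0, 0 };

    for (size_t i = 0; i < n; i++) {
        size_t j;

        out.total_ticks++;

        /* Linear search is fine for a handful of callers. */
        for (j = 0; j < out.distinct_callers; j++)
            if (addr[j] == callers[i])
                break;

        if (j == out.distinct_callers) {
            if (out.distinct_callers == MAX_CALLERS)
                continue; /* table full: counted in the total only */
            addr[j] = callers[i];
            ticks[j] = 0;
            out.distinct_callers++;
        }

        ticks[j]++;
        if (ticks[j] > out.busiest_caller_ticks)
            out.busiest_caller_ticks = ticks[j];
    }
    return out;
}
```

Twelve ticks spread evenly over four callers and twelve ticks from one
caller both give total_ticks == 12, which is all the unpatched profile
can show. Only busiest_caller_ticks (3 versus 12) tells the lightly
contended case from the real problem.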

Thanks,
Robin
Received on Sat Sep 17 08:30:59 2005