Re: [Linux-ia64] Re: web page on O(1) scheduler

From: Hans Boehm <Hans.Boehm_at_hp.com>
Date: 2003-05-23 11:07:46
On Wed, 21 May 2003, Arjan van de Ven wrote:

> oh you mean the OpenMP broken behavior of calling sched_yield() in a
> tight loop to implement spinlocks ?
> 
> I'd guess that instead of second guessing the runtime, they should use
> the pthreads primitives which are the fastest for the platform one would
> hope.. (eg futexes nowadays)
> 

That really depends on the circumstances.  I think there will always
be applications that use custom synchronization because they need the
last bit of performance in their specific environment.

I just ran a quick test to compare

a) 10,000,000 lock/unlock pairs with a rudimentary custom user-level spinlock
implementation, and
b) 10,000,000 pthread_mutex_lock/pthread_mutex_unlock pairs.

All locking is done by a single thread; this is the completely contention-free
case.

On a 1GHz Itanium 2 I get

Custom lock: 180 msecs
Custom lock: 1382 msecs

On a 2GHz Xeon, I get

Custom lock: 646 msecs
Custom lock: 1659 msecs

There are good reasons for the differences:

1) The pthread implementation needs an atomic operation on lock exit to
check whether waiters need to be awoken.  The spin lock just needs a
release barrier, which is cheap on both platforms.

2) The custom lock can be inlined.  The pthread one normally involves a
dynamic library call.  That has to be the case if you want to be able
to plug in different thread implementations.

3) Pthread locks come in various flavors, and the interface is designed
such that you have to check at runtime which flavor you have.

In the contention case there are other interesting issues, since it's
often far more efficient to spin before attempting to yield, and pthreads
implementations don't always do that.

The original case may also have involved barrier synchronization instead
of locks, in which case there is probably at least as much motivation to
"roll your own".

To reproduce my results from attached files (ao is my current attempt at
a portable atomic operations library):

tar xvfz ao-0.2.tar.gz
cp time_lock.c ao-0.2
cd ao-0.2
gcc -O2 time_lock.c -lpthread
./a.out

-- 
Hans Boehm
(hboehm@hpl.hp.com)



Received on Thu May 22 17:57:28 2003

This archive was generated by hypermail 2.1.8 : 2005-08-02 09:20:15 EST