Re: speeding up thread-creation

From: Stephane Eranian
Date: 2003-11-22 04:22:17

On Fri, Nov 21, 2003 at 12:19:59AM -0800, David Mosberger wrote:
> It occurred to me that at present, we're copying lots of state on a
> clone2() for absolutely no reason.  Not only that, but the large size
> of the "thread_struct" probably also causes poor cache-locality since
> the task-structure is effectively split in two, with a large unused
> gap in between.  I think it might make sense to move all the large
> thread_struct-state (IA-32 registers, pmcs[], pmds[], dbr[], ibr[],
> and fph[]) into a separate "thread_lazy" structure and then put that
> structure at a place where it doesn't hurt (perhaps above the
> thread_info structure).  If I counted right, this state accounts for
> 2KB so not copying it in copy_process() ought to speed up
> thread-creation significantly and avoid stomping needlessly on the L1
> d-cache.
That looks like a good idea.

I assume you want to rely on the thread's flags to determine if it is worth
copying the thread_lazy structure during a clone. For perfmon, we may need
to have two flags: one that says we are storing information in pmds/pmcs and
one that says we need to context switch the PMU state. Today the PM_VALID
flag is used to mean the latter only.
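
To make the idea concrete, here is a rough user-space sketch of what a
flag-gated copy could look like. It is not kernel code and not a proposed
patch: the structure layouts, array sizes, the two flag names
(LAZY_PM_STATE, LAZY_PM_CTXSW) and the function copy_thread_sketch() are
all made up for illustration. Only the general split into a small hot part
and a large "thread_lazy" part, and the need for two separate perfmon
flags, come from the discussion above.

```c
#include <stddef.h>

/* Hypothetical flags: the thread only says two distinct meanings are needed. */
#define LAZY_PM_STATE 0x1UL /* pmds/pmcs in the lazy area hold live data */
#define LAZY_PM_CTXSW 0x2UL /* PMU state must be context-switched */

/* Large, rarely used state. Sizes are placeholders, not the ia64 layout. */
struct thread_lazy {
    unsigned long pmcs[32];
    unsigned long pmds[32];
    unsigned long dbr[8];
    unsigned long ibr[8];
};

/* Small part that every clone has to copy. Fields are placeholders. */
struct thread_small {
    unsigned long flags;
    unsigned long ksp;
};

struct task_sketch {
    struct thread_small thread;
    struct thread_lazy lazy; /* would live where it does not hurt locality */
};

/*
 * Copy the small state unconditionally and the lazy state only when the
 * parent's flags say it is in use. Returns the number of bytes copied so
 * the saving is visible. When the flag is clear, the child's lazy area is
 * left untouched, so no cache lines are pulled in for it.
 */
size_t copy_thread_sketch(struct task_sketch *child,
                          const struct task_sketch *parent)
{
    size_t copied = sizeof(child->thread);

    child->thread = parent->thread;
    if (parent->thread.flags & LAZY_PM_STATE) {
        child->lazy = parent->lazy;
        copied += sizeof(child->lazy);
    }
    return copied;
}
```

In this sketch only LAZY_PM_STATE gates the copy. LAZY_PM_CTXSW is defined
just to show that the "must context switch the PMU" meaning stays a separate
bit. Whether a child should inherit the parent's perfmon state at all is a
policy question the sketch does not try to answer.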

Received on Fri Nov 21 12:34:25 2003
