Re: page fault scalability patch V11 [0/7]: overview

From: William Lee Irwin III
Date: 2004-11-21 06:08:18

On Sat, Nov 20, 2004 at 09:14:11AM -0800, Linus Torvalds wrote:
> I will pretty much guarantee that if you put the per-thread patches next
> to some abomination with per-cpu allocation for each mm, the choice will
> be clear. Especially if the per-cpu/per-mm thing tries to avoid false
> cacheline sharing, which sounds really "interesting" in itself.
>
> And without the cacheline sharing avoidance, what's the point of this 
> again? It sure wasn't to make the code simpler. It was about performance 
> and scalability.

"The perfect is the enemy of the good."

The "perfect" cacheline separation achieved that way is at the cost of
destabilizing the kernel. The dense per-cpu business is only really a
concession to the notion that the counter needs to be split up at all,
which has never been demonstrated with performance measurements. In fact,
Robin Holt has performance measurements demonstrating the opposite.

The "good" alternatives are negligibly different wrt. performance, and
don't carry the high cost of rwlock starvation that breaks boxen.

-- wli
Received on Sat Nov 20 14:08:44 2004