Re: Page fault scalability patch V18: Drop first acquisition of ptl

From: Benjamin Herrenschmidt <>
Date: 2005-03-03 17:11:53
On Wed, 2005-03-02 at 21:51 -0800, Christoph Lameter wrote:
> On Wed, 2 Mar 2005, David S. Miller wrote:
> > Actually, I guess I could do the pte_cmpxchg() stuff, but only if it's
> > used to "add" access.  If the TLB miss handler races, we just go into
> > the handle_mm_fault() path unnecessarily in order to synchronize.
> >
> > However, if this pte_cmpxchg() thing is used for removing access, then
> > sparc64 can't use it.  In such a case a race in the TLB handler would
> > result in using an invalid PTE.  I could "spin" on some lock bit, but
> > there is no way I'm adding instructions to the carefully constructed
> > TLB miss handler assembler on sparc64 just for that :-)
> There is no need to provide pte_cmpxchg. If the arch does not support
> cmpxchg on ptes (CONFIG_ATOMIC_TABLE_OPS not defined)
> then it will fall back to using pte_get_and_clear while holding the
> page_table_lock to insure that the entry is not touched while performing
> the comparison.

Nah, this is wrong :)

We actually _want_ pte_cmpxchg on ppc64, because we can do the stuff,
but it requires some careful manipulation of some bits in the PTE that
are beyond linux common layer understanding :) Like the BUSY bit which
is a lock bit for arbitrating with the hash fault handler for example.

Also, if it's ever used to cmpxchg from anything but a !present PTE, it
will need additional massaging (like the COW case where we just
"replace" a PTE with set_pte). We also need to preserve some bits in
there that indicate if the PTE was in the hash table and where in the
hash so we can flush it afterward.


To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
the body of a message to
More majordomo info at
Received on Thu Mar 3 01:43:19 2005

This archive was generated by hypermail 2.1.8 : 2005-08-02 09:20:36 EST