Re: page table lock patch V15 [0/7]: overview

From: Andi Kleen <ak_at_suse.de>
Date: 2005-01-14 07:17:41

On Thu, Jan 13, 2005 at 10:16:58AM -0800, Christoph Lameter wrote:
> On Thu, 13 Jan 2005, Andi Kleen wrote:
> 
> > The rule in i386/x86-64 is that you cannot set the PTE in a non atomic way
> > when its present bit is set (because the hardware could asynchronously
> > change bits in the PTE that would get lost). Atomic way means clearing
> > first and then replacing in an atomic operation.
> 
> Hmm. I replaced that portion in the swapper with an xchg operation
> and inspect the result later. Clearing a pte and then setting it to
> something would open a window for the page fault handler to set up a new

Yes, that code usually assumes the page table lock is held.

> pte there since it does not take the page_table_lock. That xchg must be
> atomic for PAE mode to work then.

You can always use cmpxchg8b for that if you want. But to make
it really atomic you may need a LOCK prefix, and with that the
cost is not much lower than a real spinlock.
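
As a rough illustration of the compare-and-exchange approach, here is a
userspace sketch using C11 atomics rather than kernel code. The type
pae_pte_t and the helpers pte_replace_atomic() and pte_install_if_none()
are hypothetical names made up for this example; on x86 a compiler would
typically implement the 64-bit compare-exchange with a locked cmpxchg8b
(32-bit) or cmpxchg (64-bit).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 64-bit PTE as in PAE mode; this is not the kernel's pte_t. */
typedef _Atomic uint64_t pae_pte_t;

#define PTE_PRESENT 0x1ULL

/*
 * Replace *ptep with newval only if it still holds *oldval.
 * Returns true if the swap happened. On failure, *oldval is updated
 * with the value that was actually found, so the caller can inspect it.
 */
static bool pte_replace_atomic(pae_pte_t *ptep, uint64_t *oldval,
                               uint64_t newval)
{
    return atomic_compare_exchange_strong(ptep, oldval, newval);
}

/*
 * Install a present pte only if the slot is still empty, as a fault
 * handler running without the page table lock might want to do.
 */
static bool pte_install_if_none(pae_pte_t *ptep, uint64_t newval)
{
    uint64_t expected = 0;

    assert(newval & PTE_PRESENT);
    return pte_replace_atomic(ptep, &expected, newval);
}
```

A second pte_install_if_none() on the same slot fails instead of
overwriting the first entry, which is the property a lockless fault path
would need. Each call still pays for a locked bus operation.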


> 
> > This helps you because you shouldn't be looking at the pte anyways
> > when pte_present is false. When it is not false it is always updated
> > atomically.
> 
> so pmd_present, pud_none and pgd_none could be considered atomic even if
> the pm/u/gd_t is a multi-word entity? In that case the current approach

The optimistic read function I posted would do this.

But you have to read multiple entries anyway, and that sequence of
reads could be non-atomic, no? (e.g. to do something on a PTE you
always need to read the PGD/PUD/PMD first)

In theory you could do this lazily with retries too, but it would probably
be somewhat costly and complicated.
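
To make the retry idea concrete for a single two-word entry, here is a
userspace sketch with C11 atomics. It is not necessarily the optimistic
read function mentioned above; struct pte2, pte2_read() and
pte2_install() are hypothetical names. It assumes the writer always
clears the low word (which holds the present bit) before touching the
high word and writes the low word last.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical two-word entry: each half loads atomically, the pair does not. */
struct pte2 {
    _Atomic uint32_t low;   /* bit 0 is the present bit */
    _Atomic uint32_t high;
};

/*
 * Optimistic read: sample low, then high, then low again, and retry
 * if low changed in between. A result with the present bit clear
 * should be treated as "no entry", whatever the high word says.
 */
static uint64_t pte2_read(struct pte2 *p)
{
    uint32_t low, high;

    do {
        low = atomic_load(&p->low);
        high = atomic_load(&p->high);
    } while (low != atomic_load(&p->low));

    return ((uint64_t)high << 32) | low;
}

/* Writer side of the assumed protocol: clear, set high, then set low. */
static void pte2_install(struct pte2 *p, uint64_t val)
{
    assert(val & 1);                        /* only present entries */
    atomic_store(&p->low, 0);               /* entry becomes not present */
    atomic_store(&p->high, (uint32_t)(val >> 32));
    atomic_store(&p->low, (uint32_t)val);   /* present bit goes in last */
}
```

This only covers one entry. Walking PGD/PUD/PMD/PTE would need a read
like this at every level plus a recheck of the upper levels, which is
where the cost and complexity come from.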

> would work for higher level entities and in particular S/390 would be in
> the clear.
> 
> But then the issues of replacing multi-word ptes on i386 PAE remain. If no
> write lock is held on mmap_sem then all writes to pte's must be atomic in

mmap_sem is only for VMAs. The page tables themselves are protected by the
page table lock.

> order for the get_pte_atomic operation to work reliably.

-Andi
Received on Thu Jan 13 16:04:56 2005