Re: page fault scalability patch final : i386 tested, x86_64 support added

From: Christoph Lameter <>
Date: 2004-09-02 02:43:54
On Wed, 1 Sep 2004, Benjamin Herrenschmidt wrote:

> The removal of the page table lock has other more subtle side effects
> on ppc64 (and ppc32 too) that aren't trivial to solve. Typically, due
> to the way we use the hash table as a TLB cache.
> For example, out ptep_test_and_clear will first clear the PTE and then
> flush the hash table entry. If in the meantime another CPU gets in,
> takes a fault, re-populates the PTE and fills the hash table via
> update_mmu_cache, we may end up with 2 hash PTEs for the same linux
> PTE at least for a short while. This is a potential cause of checkstop
> on ppc CPUs.
> There may be other subtle races of that sort I haven't encovered yet.
> We need to spend more time on our (ppc/ppc64) side to figure out what
> is the extent of the problem. We may have a cheap way to fix most of the
> issues using the PAGE_BUSY bit we have in the PTEs as a lock, but we
> don't have that facility on ppc32.
> I think there wouldn't be a problem if we could guarantee exclusion
> between page fault and clearing of a PTE (that is basically having the
> swapper take the mm write sem) but I don't think that's realistic, oh
> well, not that I understand anything about the swap code anyways...

We may be able to accomplish that by generic routines for
ptep_cmpxchg and so on that would use the page table lock for platforms
that do not support atomic pte operations.

Something along the lines of:

pte_t ptep_xchg(struct mm_struct *mm, pte_t *ptep, pte_t new) {
	pte_t old;

	old = *ptep
	set_pte(ptep, new_pte);
	/* Do rehashing */
	return old;

This would limit the time that the page_table_lock is held to a minimum
and may still offer some of the performance improvements.

Would that be acceptable?
To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
the body of a message to
More majordomo info at
Received on Wed Sep 1 12:55:06 2004

This archive was generated by hypermail 2.1.8 : 2005-08-02 09:20:30 EST