On Wed, 1 Sep 2004, Benjamin Herrenschmidt wrote: > The removal of the page table lock has other more subtle side effects > on ppc64 (and ppc32 too) that aren't trivial to solve. Typically, due > to the way we use the hash table as a TLB cache. > > For example, out ptep_test_and_clear will first clear the PTE and then > flush the hash table entry. If in the meantime another CPU gets in, > takes a fault, re-populates the PTE and fills the hash table via > update_mmu_cache, we may end up with 2 hash PTEs for the same linux > PTE at least for a short while. This is a potential cause of checkstop > on ppc CPUs. > > There may be other subtle races of that sort I haven't encovered yet. > > We need to spend more time on our (ppc/ppc64) side to figure out what > is the extent of the problem. We may have a cheap way to fix most of the > issues using the PAGE_BUSY bit we have in the PTEs as a lock, but we > don't have that facility on ppc32. > > I think there wouldn't be a problem if we could guarantee exclusion > between page fault and clearing of a PTE (that is basically having the > swapper take the mm write sem) but I don't think that's realistic, oh > well, not that I understand anything about the swap code anyways... We may be able to accomplish that by generic routines for ptep_cmpxchg and so on that would use the page table lock for platforms that do not support atomic pte operations. Something along the lines of: pte_t ptep_xchg(struct mm_struct *mm, pte_t *ptep, pte_t new) { pte_t old; spin_lock(&mm->page_table_lock); old = *ptep set_pte(ptep, new_pte); /* Do rehashing */ spin_unlock(&mm->page_table_lock); return old; } This would limit the time that the page_table_lock is held to a minimum and may still offer some of the performance improvements. Would that be acceptable? - To unsubscribe from this list: send the line "unsubscribe linux-ia64" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.htmlReceived on Wed Sep 1 12:55:06 2004
This archive was generated by hypermail 2.1.8 : 2005-08-02 09:20:30 EST