On Fri, Jan 14, 2005 at 03:54:59PM +1100, Nick Piggin wrote:
> On Fri, 2005-01-14 at 05:39 +0100, Andi Kleen wrote:
>
> > As you can see cmpxchg is slightly faster for the cache hot case,
> > but incredibly slow for cache cold (probably because it does something
> > nasty on the bus). This is pretty consistent across Intel and AMD CPUs.
> > Given that page tables are likely more often cache cold than hot
> > I would use the lazy variant.
> >
>
> I have a question about your trickery with the read_pte function ;)
>
> pte_t read_pte(volatile pte_t *pte)
> {
> 	pte_t n;
> 	do {
> 		n.pte_low = pte->pte_low;
> 		rmb();
> 		n.pte_high = pte->pte_high;
> 		rmb();
> 	} while (n.pte_low != pte->pte_low);
> 	return n;
> }
>
> Versus the existing set_pte function. Presumably the order here
> can't be changed otherwise you could set the present bit before
> the high bit, and race with the hardware MMU?

The hardware MMU only ever adds some bits (D etc.). It never changes
the address. It won't clear P bits. The page fault handler also doesn't
clear them, only the swapper does. With that knowledge you could probably
do some optimizations.

> So I think you can get a non-atomic result. Are you relying on
> assumptions about the value of pte_low not causing any problems
> in the page fault handler?

I don't know. You have to ask Christopher L. I only commented
on one subthread where he asked about atomic pte reading,
but haven't studied his patches in detail.

-Andi

Received on Fri Jan 14 05:46:36 2005
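
For reference, the lazy read discussed above can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not code from the patches: the two-word pte_t layout, the helper names read_pte_lazy, set_pte_ordered and pte_val64, and the use of C11 fences in place of the kernel's rmb()/wmb() are all hypothetical stand-ins. The writer stores the high word before the low word, which is the ordering the thread presumes for set_pte so that the present bit never becomes visible before the high bits. Note that the retry loop only detects a change in pte_low, so a concurrent change to pte_high alone would go unnoticed, which is the non-atomic result asked about above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit PTE held as two 32-bit words. */
typedef struct {
    uint32_t pte_low;   /* low word: present bit and flags (assumed layout) */
    uint32_t pte_high;  /* high word: upper address bits (assumed layout) */
} pte_t;

/*
 * Lazy (non-cmpxchg) read: load low, then high, then re-check low.
 * The acquire fences are userspace substitutes for the kernel's rmb().
 * Only a change in pte_low triggers a retry.
 */
static pte_t read_pte_lazy(const volatile pte_t *pte)
{
    pte_t n;

    do {
        n.pte_low = pte->pte_low;
        atomic_thread_fence(memory_order_acquire);
        n.pte_high = pte->pte_high;
        atomic_thread_fence(memory_order_acquire);
    } while (n.pte_low != pte->pte_low);

    return n;   /* return the assembled copy, not the pointer */
}

/*
 * Writer sketch: store the high word first, then the low word, so that
 * the present bit (assumed to live in pte_low) is published last.
 * The release fence is a userspace substitute for wmb().
 */
static void set_pte_ordered(volatile pte_t *ptep, pte_t val)
{
    ptep->pte_high = val.pte_high;
    atomic_thread_fence(memory_order_release);
    ptep->pte_low = val.pte_low;
}

/* Combine the two halves into one 64-bit value for inspection. */
static uint64_t pte_val64(pte_t p)
{
    return ((uint64_t)p.pte_high << 32) | p.pte_low;
}
```

In a single thread, calling set_pte_ordered() on a slot and then read_pte_lazy() on the same slot simply returns the stored value; the ordering only matters once another CPU or the MMU touches the entry concurrently.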