Andi Kleen <ak@muc.de> writes:

> As you can see cmpxchg is slightly faster for the cache hot case,
> but incredibly slow for cache cold (probably because it does something
> nasty on the bus). This is pretty consistent to Intel and AMD CPUs.
> Given that page tables are likely more often cache cold than hot
> I would use the lazy variant.

Sorry, my benchmark program actually had a bug (first loop included
page faults). Here are updated numbers. They are somewhat different:

Athlon 64:
readpte hot 25
readpte cold 171
readpte_cmp hot 18
readpte_cmp cold 162

Nocona:
readpte hot 118
readpte cold 443
readpte_cmp hot 22
readpte_cmp cold 224

The difference is much smaller here. Assuming cache cold cmpxchg8b is
better, at least on the Intel CPUs which have a slow rmb().

-Andi

Received on Thu Jan 13 23:52:53 2005
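For readers without the benchmark source, the two read strategies compared above can be sketched roughly as follows. This is a userspace illustration under stated assumptions, not the benchmark or kernel code: the `struct pte64` layout, the function bodies, and the use of C11 atomics in place of the kernel's `rmb()` and `cmpxchg8b` are all hypothetical. Only the names `readpte` and `readpte_cmp` come from the numbers in the message.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit page table entry stored as two 32-bit halves,
 * as it would be on a 32-bit machine with 64-bit PTEs. */
struct pte64 {
    _Atomic uint32_t low;
    _Atomic uint32_t high;
};

/* "Lazy" variant (assumed shape): read low, read high, then re-read low
 * and retry if it changed in between. The acquire fences stand in for
 * the kernel's rmb(). */
static uint64_t readpte_lazy(struct pte64 *p)
{
    uint32_t lo, hi;
    do {
        lo = atomic_load_explicit(&p->low, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        hi = atomic_load_explicit(&p->high, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
    } while (lo != atomic_load_explicit(&p->low, memory_order_relaxed));
    return ((uint64_t)hi << 32) | lo;
}

/* cmpxchg variant (assumed shape): a compare-exchange whose expected and
 * new values are both 0. If the entry is 0 it is rewritten with 0;
 * otherwise the exchange fails and the current 64-bit value is copied
 * into `expected`. Either way the whole value is obtained in one atomic
 * operation. */
static uint64_t readpte_cmp(_Atomic uint64_t *p)
{
    uint64_t expected = 0;
    atomic_compare_exchange_strong(p, &expected, 0);
    return expected;
}
```

The lazy variant pays for two barriers per read, which is where a slow `rmb()` would show up. The compare-exchange variant avoids the barriers but always performs a locked read-modify-write on the cache line.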