RE: [RFC] 4-level page table directories.

From: Chen, Kenneth W <kenneth.w.chen_at_intel.com>
Date: 2005-11-09 07:27:17
Robin Holt wrote on Tuesday, November 08, 2005 11:37 AM
> Can I get you to reproduce this?  I have tried many times and
> your test is giving me numbers that are very close between 3
> and 4 level page tables.  For 25 runs, I got:
> 
> With 3-level page table kernel: Average of 25 is 24612659771.96
> With 4-level page table kernel: Average of 25 is 24686556792.96
> 
> Which is showing that a vhtp_miss test is adding a 0.30% overhead
> which can also be expressed as an average 1.44 clock cycles per
> miss.  As of this writing, the loops have run over 250 times and
> the min reading to this point is 23946196482.  This is nowhere
> close to your min reported.

The other option is to instrument the vhpt_miss handler and measure the
average clock ticks spent in that handler.  I had that instrumented
and measured with 3-level/4-level page table configurations.  I
measured 221 clocks with the 3-level page table, versus 298 clocks with
the 4-level page table [*].  This measurement certainly depends on the
system chipset/platform, but the point is that the penalty with a
4-level page table is clearly visible.  Not only does the low-level
handler have to walk the extra level, it also incurs additional cache
misses while walking the table, and it has the damaging side effect of
evicting other working-set data that resides in the cache.
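
To illustrate the extra-level cost, here is a toy sketch (not the ia64
handler, which is assembly) of a software walk over a radix table.  The
names BITS_PER_LEVEL, map_addr and lookup are made up for this example,
and the 4-bit fan-out is an arbitrary assumption.  It only counts
dependent loads; it does not model cache misses or evictions.

```python
# Toy radix "page table": each directory is a list of ENTRIES slots.
# Assumption: 4 index bits per level, chosen only to keep the demo small.
BITS_PER_LEVEL = 4
ENTRIES = 1 << BITS_PER_LEVEL


def index(addr, level):
    """Return the slot index that `addr` selects at directory `level`."""
    return (addr >> (level * BITS_PER_LEVEL)) & (ENTRIES - 1)


def map_addr(root, levels, addr, leaf):
    """Install `leaf` for `addr`, creating intermediate directories."""
    node = root
    for level in range(levels - 1, 0, -1):
        i = index(addr, level)
        if node[i] is None:
            node[i] = [None] * ENTRIES
        node = node[i]
    node[index(addr, 0)] = leaf


def lookup(root, levels, addr):
    """Walk the table; return (entry, number of dependent loads)."""
    node = root
    loads = 0
    for level in range(levels - 1, 0, -1):
        nxt = node[index(addr, level)]
        loads += 1          # one load per directory level
        if nxt is None:
            return None, loads
        node = nxt
    loads += 1              # final load of the leaf entry
    return node[index(addr, 0)], loads


if __name__ == "__main__":
    for levels in (3, 4):
        root = [None] * ENTRIES
        map_addr(root, levels, 0x123, "pte")
        entry, loads = lookup(root, levels, 0x123)
        print(f"{levels}-level walk: entry={entry!r}, loads={loads}")
```

Each load depends on the result of the previous one, so the fourth
level cannot be overlapped with the other three.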

- Ken

[*] measured on: 1.6 GHz Itanium 2 processor, 9M L3, Intel server platform
    SR870BN4.  32GB PC2100 DDR memory.
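
For reference, the arithmetic behind the two sets of numbers in this
thread can be reproduced with the short sketch below.  It uses only the
figures quoted above; the variable names are made up, and converting
clocks to nanoseconds assumes the handler clock count is in cycles of
the 1.6 GHz core clock.

```python
# Figures taken from this thread; names are illustrative only.
CLOCK_HZ = 1.6e9            # assumption: clocks are 1.6 GHz core cycles

clocks_3level = 221         # instrumented vhpt_miss, 3-level tables
clocks_4level = 298         # instrumented vhpt_miss, 4-level tables

extra_clocks = clocks_4level - clocks_3level
relative_increase = extra_clocks / clocks_3level
extra_ns = extra_clocks / CLOCK_HZ * 1e9

# Averages of 25 runs reported in the quoted message.
avg_3level = 24612659771.96
avg_4level = 24686556792.96
run_overhead = (avg_4level - avg_3level) / avg_3level

if __name__ == "__main__":
    print(f"extra clocks per miss : {extra_clocks}")
    print(f"handler increase      : {relative_increase:.1%}")
    print(f"extra time per miss   : {extra_ns:.1f} ns")
    print(f"whole-run overhead    : {run_overhead:.2%}")
```

The handler itself gets about a third slower, while the whole-run
figure stays near 0.3% because miss handling is only a small part of
that test's total run time.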

Received on Wed Nov 09 07:27:56 2005