[Patch 2/3] Free page tables belonging to remote nodes.

From: Robin Holt <holt_at_sgi.com>
Date: 2005-03-03 05:41:03

This patch is simple but necessary for large NUMA configurations.
It ensures that only pages from the local node are added to a
cpu's quicklist.  This prevents pages from becoming trapped on a remote
node's quicklist, which can otherwise happen by starting a process,
touching a large number of pages to fill pmd and pte entries, migrating
to another node, and then unmapping or exiting.  Under those conditions
the pages get trapped, and if the machine has more than 100 nodes of
the same size, the calculated pgtable high water mark will be larger
than any single node, so page table cache flushing will never occur.
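
The trapping scenario can be illustrated with a small userspace model.
This is only a sketch, not kernel code: every model_* name is
hypothetical, malloc()/free() stand in for the page allocator, and there
is one quicklist per node rather than per cpu.  It shows pages that were
allocated on one node and freed while running on another going straight
back to the allocator instead of onto the local quicklist.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_NR_NODES 2	/* hypothetical: two NUMA nodes */

/* Hypothetical stand-in for one page-table page. */
struct model_page {
	struct model_page *next;	/* quicklist link (the kernel stores it in the page's first word) */
	int nid;			/* node whose memory backs this page */
};

/* One quicklist per node for brevity; the kernel keeps one per cpu. */
static struct model_page *model_quicklist[MODEL_NR_NODES];
static long model_quicklist_size[MODEL_NR_NODES];
static long model_pages_freed;	/* pages handed back to the allocator */

static struct model_page *model_page_alloc(int nid)
{
	struct model_page *page = malloc(sizeof(*page));

	assert(page != NULL);
	page->next = NULL;
	page->nid = nid;
	return page;
}

/* Models pgtable_quicklist_free() with the node check added by this patch. */
static void model_quicklist_free(struct model_page *page, int running_nid)
{
	if (page->nid != running_nid) {
		/* Remote page: return it at once instead of caching it here. */
		free(page);
		model_pages_freed++;
		return;
	}
	/* Local page: push it onto this node's quicklist. */
	page->next = model_quicklist[running_nid];
	model_quicklist[running_nid] = page;
	model_quicklist_size[running_nid]++;
}

/*
 * Allocate nr_pages on alloc_nid, then free them all while running on
 * free_nid.  Returns how many pages ended up cached on free_nid's
 * quicklist.
 */
static long model_migrate_and_exit(int alloc_nid, int free_nid, int nr_pages)
{
	long before = model_quicklist_size[free_nid];
	int i;

	for (i = 0; i < nr_pages; i++)
		model_quicklist_free(model_page_alloc(alloc_nid), free_nid);
	return model_quicklist_size[free_nid] - before;
}
```

In this model, model_migrate_and_exit(0, 1, 1000) leaves node 1's
quicklist unchanged, because every page is remote to the node doing the
free.  With matching alloc_nid and free_nid, the pages are cached as
before.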

I ran lmbench lat_proc fork and lat_proc exec on a zx1 with and without
this patch and did not notice any change.

On an sn2 machine, there was a slight improvement which is possibly
due to pages from other nodes trapped on the test node before starting
the run.  I did not investigate further.

Signed-off-by: Robin Holt <holt@sgi.com>

Process fork+exit: 186.7037 microseconds
Process fork+execve: 699.0000 microseconds
Process fork+/bin/sh -c: 2960.0000 microseconds

Process fork+exit: 182.2333 microseconds
Process fork+execve: 692.7500 microseconds
Process fork+/bin/sh -c: 2905.5000 microseconds

 pgalloc.h |   10 ++++++++++
 1 files changed, 10 insertions(+)

Index: linux-2.6/include/asm-ia64/pgalloc.h
--- linux-2.6.orig/include/asm-ia64/pgalloc.h	2005-03-02 12:19:09.672133971 -0600
+++ linux-2.6/include/asm-ia64/pgalloc.h	2005-03-02 12:23:57.155152045 -0600
@@ -64,6 +64,13 @@
 static inline void
 pgtable_quicklist_free (void *pgtable_entry)
+	if (unlikely(page_to_nid(virt_to_page(pgtable_entry)) != numa_node_id())) {
+		free_page((unsigned long) pgtable_entry);
+		return;
+	}
 	*(unsigned long *)pgtable_entry = (unsigned long) __ia64_per_cpu_var(pgtable_quicklist);
 	__ia64_per_cpu_var(pgtable_quicklist) = (unsigned long *) pgtable_entry;
Received on Wed Mar 2 13:41:32 2005
