Kernel bug fix: hugepage free_pgtables()

From: Chen, Kenneth W <kenneth.w.chen@intel.com>
Date: 2003-12-25 10:46:25
We recently discovered a bug on ia64 when unmapping an address space
that belongs to a huge page region.  The generic code unmap_region()
calls free_pgtables() to free any pages that may be in use as page
tables.  However, it makes no distinction between a region mapped with
normal pages and one mapped with huge TLB pages.  The problem arises
when free_pgtables() calculates the PGDIR-aligned area based on the
default page size; in the huge page case it should really use the huge
TLB page size instead.  The pgd_index calculation should also be
adjusted accordingly.

So we need architecture-specific code to handle the huge TLB case.
This also requires changes in the generic part of the kernel; those
generic changes have already been merged into Andrew Morton's
2.6.0-mm1.  Here is the ia64 part of the patch, adopting the new
free-page-table semantics.

A bit more detail on the kernel bug: consider two huge page mappings
like the two in this example, the first ending at a PGDIR_SIZE
boundary and the second starting at the next PGDIR_SIZE (64GB with a
16K page size):

8000000ff0000000-8000001000000000 rw-s
8000001000000000-8000001010000000 rw-s

Unmapping the first vma tricks free_pgtables() into thinking it can
remove the set of pgds indexed at 0x400, and it goes ahead and purges
the entire pmd/pte set that is still in use by the second mapping.
Any subsequent access to the pmd/pte entries of the second, still
active mapping then triggers the bug.  We've seen hard kernel hangs on
some platforms; other platforms generate an MCA, plus all kinds of
unpleasant results.

David, please apply for 2.6.

- Ken


Received on Wed Dec 24 18:46:47 2003

This archive was generated by hypermail 2.1.8 : 2005-08-02 09:20:21 EST