Mprotect needs arch hook for updated PTE settings

From: Seth, Rohit <rohit.seth_at_intel.com>
Date: 2005-03-16 04:43:31
Recently on IA-64, we found an issue where stale data could be used
by applications.  The sequence of operations, which includes a few
mprotect calls from user space (glibc), goes like this:

1- The text region of an executable is mmapped using PROT_READ|PROT_EXEC.
As a result, a shared page is allocated to the user.

2- The user then requests that the text region be mprotected with
PROT_READ|PROT_WRITE.  The kernel removes the execute permission and
leaves the read permission on the text region.

3- A subsequent write by the user results in a page fault and
eventually in a COW break.  The user gets a new private copy of the
page.  At this point the kernel marks the new page for deferred flush.

4- The user then requests that the text region be mprotected back to
PROT_READ|PROT_EXEC.  The mprotect support code in the kernel flushes
the caches, updates the PTEs and then flushes the TLBs.  However, after
updating the PTEs with the new permissions, we don't let the
arch-specific code know about the new mappings (through an
update_mmu_cache-like routine).  IA-64 typically uses update_mmu_cache
to check the deferred flush flag (set in step 3) and so maintain cache
coherency lazily (the local I and D caches on IA-64 are incoherent).
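
For reference, the user-space side of steps 1 to 4 can be sketched
roughly as below.  This is only an illustration, not glibc's actual
code: the helper name remap_text_sequence, the scratch file under /tmp
and the byte value written are all assumptions made for the example.
It only replays the system calls; it does not execute the modified
page, so it will not by itself show the stale instruction fetch.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: replays steps 1-4 on a scratch file.
 * Returns 0 on success and stores the byte read back in *seen,
 * or returns -1 if any call fails (e.g. /tmp mounted noexec).
 */
int remap_text_sequence(unsigned char *seen)
{
	char path[] = "/tmp/mprotect-seq-XXXXXX";
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int ret = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);			/* file goes away when fd is closed */
	if (ftruncate(fd, (off_t)len) < 0)
		goto out_close;

	/* Step 1: private, file-backed, read+exec, like a text segment. */
	p = mmap(NULL, len, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	/* Step 2: drop execute, ask for write. */
	if (mprotect(p, len, PROT_READ | PROT_WRITE) < 0)
		goto out_unmap;

	/* Step 3: the first write faults and breaks COW. */
	p[0] = 0x5a;

	/* Step 4: back to read+exec; this is where the hook is missing. */
	if (mprotect(p, len, PROT_READ | PROT_EXEC) < 0)
		goto out_unmap;

	/* A data load sees the new byte; an instruction fetch might not. */
	*seen = p[0];
	ret = 0;
out_unmap:
	munmap(p, len);
out_close:
	close(fd);
	return ret;
}
```

A caller would pass a pointer to an unsigned char and check for a zero
return before looking at the byte.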

DavidM suggested that we add a hook in the function change_pte_range
in mm/mprotect.c.  This would let the architecture-specific code look
at the new PTEs and decide whether it needs to update any other
architectural/kernel state based on the updated PTE values (the new
permissions).
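
To make the intended effect of the hook concrete, here is a small
user-space model of the lazy flush logic.  It is a sketch under stated
assumptions, not kernel code: every type and function name in it
(model_page, model_pte, model_update_mmu_cache and so on) is an
invented stand-in, and the flush is reduced to a counter.  It assumes
that the IA-64 update_mmu_cache only flushes for executable mappings
of pages not yet marked clean.

```c
#include <stdbool.h>

struct model_page {
	bool arch_clean;	/* stand-in for PG_arch_1 ("page is clean") */
	int icache_flushes;	/* counts the modelled i-cache flushes */
};

struct model_pte {
	struct model_page *page;
	bool exec;		/* stand-in for the execute permission */
};

/* Stand-in for update_mmu_cache: flush lazily, only when needed. */
void model_update_mmu_cache(struct model_pte pte)
{
	if (!pte.exec || pte.page->arch_clean)
		return;		/* not executable, or already coherent */
	pte.page->icache_flushes++;
	pte.page->arch_clean = true;
}

/* Stand-in for the proposed hook: pass the new PTE to the arch code. */
void model_lazy_mmu_update(struct model_pte opte, struct model_pte npte)
{
	(void)opte;		/* old value unused, as in the prototype */
	model_update_mmu_cache(npte);
}

/* Stand-in for change_pte_range acting on a single PTE. */
void model_change_prot(struct model_pte *pte, bool exec, bool call_hook)
{
	struct model_pte old = *pte;

	pte->exec = exec;
	if (call_hook)
		model_lazy_mmu_update(old, *pte);
}

/* Replays steps 3 and 4; returns how many flushes were performed. */
int model_replay(bool call_hook)
{
	struct model_page cow = { .arch_clean = false, .icache_flushes = 0 };
	struct model_pte pte = { .page = &cow, .exec = false };

	/* Step 3: COW page mapped without exec, so nothing is flushed. */
	model_update_mmu_cache(pte);

	/* Step 4: mprotect back to exec, with or without the hook. */
	model_change_prot(&pte, true, call_hook);
	return cow.icache_flushes;
}
```

In this model, replaying the sequence without the hook leaves the new
page executable but never flushed, while replaying it with the hook
performs exactly one flush.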

Please let me know your comments.

A prototype patch including the support for IA-64 goes as follows:

--- linux-2.6.10/mm/mprotect.c  2004-12-24 13:35:50.000000000 -0800
+++ linux-2.6.10.work/mm/mprotect.c     2005-03-14 23:51:01.511553704 -0800
@@ -46,14 +46,16 @@
                end = PMD_SIZE;
        do {
                if (pte_present(*pte)) {
-                       pte_t entry;
+                       pte_t entry, nentry;

                        /* Avoid an SMP race with hardware updated dirty/clean
                         * bits by wiping the pte and then setting the new pte
                         * into place.
                         */
                        entry = ptep_get_and_clear(pte);
-                       set_pte(pte, pte_modify(entry, newprot));
+                       nentry = pte_modify(entry, newprot);
+                       set_pte(pte, nentry);
+                       lazy_mmu_update(pte, entry, nentry);
                }
                address += PAGE_SIZE;
                pte++;
--- linux-2.6.10/arch/ia64/mm/init.c    2004-12-24 13:34:32.000000000 -0800
+++ linux-2.6.10.work/arch/ia64/mm/init.c       2005-03-15 00:16:21.880422504 -0800
@@ -95,6 +95,11 @@
        set_bit(PG_arch_1, &page->flags);       /* mark page as clean */
 }

+void
+lazy_mmu_update(pte_t *pte, pte_t opte, pte_t npte)
+{
+       update_mmu_cache(NULL, 0, npte);
+}
 inline void
 ia64_set_rbs_bot (void)
 {
Received on Tue Mar 15 12:48:50 2005