2.4.31 TLB corruption

From: Smarduch Mario-CMS063 <CMS063_at_motorola.com>
Date: 2005-06-16 01:34:23
Here's a race condition that appears possible. For illustration,
consider a context range (used for per-task RID selection) of 1-100.
The real range is 21 bits wide with a starting context of 300,
which makes the selected context values much sparser and the
race much more difficult to trip.

As an example, suppose that after next==100 the following
context values are still owned by various tasks:

Now, on a 2-way system, the task with context 1 is executing on CPU 0
and the task with context 2 on CPU 1. Both happen to call fork(),
eventually winding up in dup_mmap().

CPU 0 (orig ctxt=1):            CPU 1 (orig ctxt=2):
--------------------            --------------------
- Both call flush_tlb_mm(); this sets their mm->context == 0
- Eventually both get into activate_context(mm)
                                - grabs ia64_ctx.lock first
                                - context wraps around, so
                                  wrap_mmu_context() gets called
                                - chooses context=1, limit=10
                                - flushes local TLB, marks lazy
                                  flush needed on CPU 0
- now acquires ia64_ctx.lock
- chooses context=2 and
  installs it in its RRs
- appears to resume in user mode with
  a RID matching that of the task running
  on CPU 1 (i.e. with its previous
  TLB entries with RID=2 installed)
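
The sequence above can be replayed with a small single-threaded toy
model. This is only a sketch and not the kernel code: every toy_*
name is made up for illustration, the third task owning context 10 is
an assumption chosen so that the wrap yields limit=10, and the lock is
modelled simply by the order of the two calls.

```c
#include <assert.h>

#define TOY_MAX_CTX 100u   /* toy range 1..100, as in the message */

struct toy_mm  { unsigned context; };      /* 0 means "no context assigned" */
struct toy_ctx { unsigned next, limit; };  /* hypothetical stand-in for ia64_ctx */

/* On wrap, restart at 1 and rescan every mm: step past values still in
 * use and set limit to the lowest in-use value above next. */
static void toy_wrap(struct toy_ctx *c, const struct toy_mm *all, int n)
{
    c->next = 1;
rescan:
    c->limit = TOY_MAX_CTX + 1;
    for (int i = 0; i < n; i++) {
        unsigned used = all[i].context;
        if (used == 0)
            continue;           /* flushed mm: its old value looks free */
        if (used == c->next) {
            c->next++;
            goto rescan;
        }
        if (used > c->next && used < c->limit)
            c->limit = used;
    }
}

/* The caller is assumed to hold the allocator lock. */
static unsigned toy_get_context(struct toy_ctx *c, struct toy_mm *mm,
                                const struct toy_mm *all, int n)
{
    if (mm->context == 0) {
        if (c->next >= c->limit)
            toy_wrap(c, all, n);
        assert(c->next <= TOY_MAX_CTX);  /* toy model never fills the range */
        mm->context = c->next++;
    }
    return mm->context;
}

/* Replays the interleaving: out[0] is the new context on CPU 0,
 * out[1] the new context on CPU 1. */
void toy_replay(unsigned out[2])
{
    /* all[0]: task on CPU 0 (was 1), all[1]: task on CPU 1 (was 2),
     * all[2]: assumed third task owning context 10. */
    struct toy_mm all[3] = { {1}, {2}, {10} };
    struct toy_ctx ctx = { TOY_MAX_CTX + 1, TOY_MAX_CTX + 1 }; /* range used up */

    all[0].context = 0;   /* flush_tlb_mm() on CPU 0 */
    all[1].context = 0;   /* flush_tlb_mm() on CPU 1 */

    out[1] = toy_get_context(&ctx, &all[1], all, 3);  /* CPU 1 wins the lock */
    out[0] = toy_get_context(&ctx, &all[0], all, 3);  /* CPU 0 goes second */
}
```

In this model toy_replay() hands context 1 to the task on CPU 1 and
context 2 to the task on CPU 0. The scan in toy_wrap() cannot see 1
and 2 as in use because both mm->context fields were already zeroed.
Whether the real allocator can be driven into the same state is the
open question.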

This whole scheme is complex and elusive; I'd appreciate
feedback from this group.

- mario			   
Received on Wed Jun 15 11:34:37 2005
