Re: hugetlb demand paging patch part [2/3]

From: 'David Gibson' <>
Date: 2004-04-16 14:49:17
On Thu, Apr 15, 2004 at 09:13:38PM -0700, Chen, Kenneth W wrote:
> David Gibson wrote on Thursday, April 15, 2004 8:27 PM
> > Ah!  So it's just an optimiziation - it makes a bit more sense to me
> > now.  I had assumed that this case (hugepage get_user_pages()) would
> > be sufficiently rare that it would not require optimization.
> > Apparently not.
> It's a huge deal because for *every* I/O, kernel has to do get_user_pages()
> to lock the page, it's really gets in the way with the spin_lock as well.
>         spin_lock(&mm->page_table_lock);
>         do {
>                 struct page *map;
>                 int lookup_write = write;
>                 while (!(map = follow_page(mm, start, lookup_write))) {
> With current state of art platform, I/O requirement pushes into 200K
> per second, this become quite significant.

Ok.  This makes sense now that you explain it.

> > Do you know where the cycles are going without this optimization?  In
> > particular, could it be just the find_vma() in hugepage_vma() called
> > before follow_huge_addr()?  I note that IA64 is the only arch to have
> > a non-trivial hugepage_vma()/follow_huge_addr() and that its
> > follow_huge_addr() doesn't actually use the vma passed in.
> That's one, plus the spin lock mentioned above.

And akpm has just explained why it can be avoided in the hugepage

> > If we could get rid of follow_hugetlb_pages() it would remove an ugly
> > function from every arch, which would be nice.
> I hope the goal here is not to trim code for existing prefaulting scheme.
> That function has to go for demand paging, and demand paging comes with
> a performance price most people don't realize.  If the goal here is to
> make the code prettier, I vote against that.

Well, I'm attempting to understand the hugepage code across all the
archs, so that I can try to implement copy-on-write with a minimum of
arch specific gunk.  Simplifying and consolidating the existing code
across archs would be a helpful first step, if possible.

David Gibson			| For every complex problem there is a
david AT	| solution which is simple, neat and
				| wrong.
To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
the body of a message to
More majordomo info at
Received on Fri Apr 16 00:54:30 2004

This archive was generated by hypermail 2.1.8 : 2005-08-02 09:20:25 EST