Note: this page is largely kept for archival purposes.

Please see Ia64SuperPages for current work

How Shimizu Super Pages are implemented

Users do not need to concern themselves with Superpages; they are the concern of the operating system, in our case Linux. Whether Superpages are supported depends on the hardware platform. For instance, the Alpha architecture supports a base page size of 8 KB and Superpage sizes of 64, 512 and 4096 KB, whereas the i386 architecture supports 4 KB and 4096 KB pages. Information on Superpages on Intel 64-bit architectures can be found at Ia64SuperPages.

My (DarrenWilliams) first introduction to Superpages was here at Gelato. The first paper I read on the subject was Practical, transparent operating system support for superpages (Navarro et al. 2002), published in ACM Operating Systems Review, Special Issue Winter 2002, Proceedings of the Fifth Symposium on Operating Systems Design and Implementation (OSDI '02). The authors present an overview of the requirements, implementation details and results of experiments carried out. Gelato@UNSW is working on an Ia64SuperPages variant.

Shimizu Super Page patch

The following describes how Shimizu's patch seamlessly implements Super-pages. We hope that this description will help others who wish to port the patch to architectures not covered by the current implementation, or to update it to new kernel versions. For the discussion, Shimizu's Super-page patch will be referred to as the patch.

Super-page initialisation

Order of page sizes

When defining super-pages it is possible, and most practical, to define each one as a power-of-two multiple (an order) of the base page size. The Alpha architecture offers four page sizes (8 KB, 64 KB, 512 KB and 4096 KB). For initialisation, the order of each page size has to be defined; on the Alpha this is the set {0,3,6,9}. That is, a page of order ord has a size of (2^3 * 2^ord) * 1024 bytes, so the base page corresponds to ord=0.
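As a quick check of the arithmetic above, the following Python sketch computes the page size for each Alpha order. The names ALPHA_ORDERS and page_size are illustrative assumptions and are not identifiers from the patch.

```python
# Illustrative only: these names are assumptions, not identifiers from the patch.
ALPHA_ORDERS = (0, 3, 6, 9)   # the orders listed in the text


def page_size(order):
    """Size in bytes of a page of the given order: (2^3 * 2^order) * 1024."""
    return (2 ** 3 * 2 ** order) * 1024


for order in ALPHA_ORDERS:
    print(f"order {order}: {page_size(order) // 1024} KB")
```

The four orders give 8, 64, 512 and 4096 KB, matching the page sizes quoted for the Alpha.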

Super-page allocation

When allocating super-pages, the first consideration is whether a page order increase can be addressed, i.e. that the increase does not go past the end of the task's virtual memory area. If the addressing is valid, the largest possible super-page that does not exceed the end of the task's virtual memory area is allocated.

Allocation details

 |---------------------------------------------------------------|---------------------------------------------------------------| <--- order 4

 [ order 3 skipped ]

 |---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------| <--- order 2
        
 |-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------| <--- order 1 

 +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ <--- order 0 (base page size)
     ^
     |-------------------------------------------------------------------------------------------------------------------------------- .... and so on 
    addr                             <---- request length ---->

Assume you have a request for a large amount of memory, starting at addr.

We run through an algorithm (in pseudocode):

order = 0
for all page orders do
    remainder <- the difference between addr and start of (order+1) page
    if (remainder > 0) and (request length > current page size)
        while (remainder != 0)
            if (order != 0)
                /* order zero is the system page size, and we already have that page assigned to us */
                assign page of size order to us
            addr = addr + sizeof(superpage @ order)
            remainder = remainder - sizeof(superpage @ order)
        endwhile
    endif
done

XXX: when we fall through here, we find the next-biggest superpage that fits the allocation and assign that.
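The idea behind the walk above can be sketched in Python. This is a simplified greedy restatement, not the patch's code: at each step it picks the largest page that is naturally aligned at addr and still fits inside the request. It assumes that addr and the request length are multiples of the base page size, and it uses the Alpha orders from the text. All names (plan_pages, BASE_PAGE, ALPHA_ORDERS) are hypothetical.

```python
# Hypothetical sketch; none of these names come from the patch.
BASE_PAGE = 8 * 1024            # Alpha base page size in bytes
ALPHA_ORDERS = (0, 3, 6, 9)     # orders listed in the text


def plan_pages(addr, length, orders=ALPHA_ORDERS):
    """Return a list of (address, order) pairs covering [addr, addr + length).

    Assumes addr and length are multiples of BASE_PAGE, so an order-0
    page always fits and the inner loop always finds a candidate.
    """
    if addr % BASE_PAGE or length % BASE_PAGE:
        raise ValueError("addr and length must be base-page aligned")
    end = addr + length
    plan = []
    while addr < end:
        # Try the largest order first; fall back towards order 0.
        for order in sorted(orders, reverse=True):
            size = BASE_PAGE << order
            if addr % size == 0 and addr + size <= end:
                break
        plan.append((addr, order))
        addr += size
    return plan


# A 128 KB request starting one base page (8 KB) into the address space.
for address, order in plan_pages(8 * 1024, 128 * 1024):
    print(f"{address // 1024:4d} KB: order {order}")
```

For this request the plan is seven base pages (up to the 64 KB boundary), one order-3 page, and one final base page. That has the same shape as the diagram: small pages until addr reaches a larger boundary, then a larger page.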

Super-page de-allocation

De-allocation of a super-page follows the current Linux path; however, before zapping the directory tree, any super-pages are downgraded to base pages with a call to adj_sp_range.

Shimizu super page patch features

* View of super page state in procfs.
* Control of super page configuration from sysctl.

Super-page interface

The interface to super-pages is provided through mm/super_page.c. There are currently and one optional super_page_getinf that interfaces with the proc filesystem providing super-page information.

Linux with SuperPages

Currently we are working on two versions of the Linux kernel: 2.5 and 2.6 (the latter has not been released as of 19/08/2003). The interface supplied by the 2.6 kernel has been cleaned up a little, with deprecated interfaces removed and additional functionality provided beyond that of the 2.5 kernel.

Kernel 2.5

Extending Linux with Superpage support started with the 2.5.44 kernel. A Superpage patch supplied by Naohiko Shimizu can be found here and further details about Dr Naohiko Shimizu's work can be found here (note that occasionally this link goes down).

The only change here is to apply the super-page patch to the source tree that was dedicated to the super-page, in my case this was;

Running the Test Cases

Naohiko Shimizu wrote several C programs that test the pager these are

Turning the pager on/off

Problems and Bugs

Current Problems

Kernel 2.6

The 2.6 kernel has been updated with SuperPages support, and a patch can be found here: 2.6.0 Superpage patch. This patch still produces the page allocation error; watch this space for updates.

We have also been keeping the Alpha part of the original patch up to date with current kernels. All the patches that Gelato@UNSW has maintained can be found here.

Testing Results

The following test results are averages of several test runs carried out in a variety of environments. The tests used a C program, written by Dr Naohiko Shimizu, that performs a matrix transpose. The results that follow were obtained before the final problems were solved; I will add to them as I rectify the problems mentioned previously. The test environment used for these tests was:

Matrix Transpose (throughput in MB/sec)

Size of matrix   Superpage On   Std. Dev.   Superpage Off   Std. Dev.
3000 Store       25.30          0.02        19.93           0.03
3000 Load        29.32          0.00        21.72           0.06
3100 Store       25.19          0.02        19.68           0.02
3100 Load        26.64          0.00        20.14           0.04
3150 Store       24.87          0.02        19.43           0.04
3150 Load        25.56          0.01        19.35           0.02
3175 Store       25.41          0.02        19.66           0.03
3175 Load        29.41          0.01        21.50           0.05
3200 Store       22.91          0.01        20.82           0.07
3200 Load        29.23          0.00        22.76           0.05
Store Average    23.76          0.00        19.75           0.00
Load Average     26.79          0.00        20.67           0.00

Extensive test results can be found here. To reproduce these results, the matrix transpose program can be found here, and a script that executes it and generates output files can be found here. The script finds all executables in the current directory except for itself and runs them 10 times, placing any output into 10 separate result files.
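For readers without access to the original script, its described behaviour can be approximated with the Python sketch below. This is not the original script. The result file naming (results.1 to results.10), the one-file-per-run layout and the function names are all assumptions made for illustration.

```python
import os
import subprocess


def find_executables(directory, skip=()):
    """Names of executable regular files in directory, excluding those in skip."""
    names = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name in skip:
            continue
        if os.path.isfile(path) and os.access(path, os.X_OK):
            names.append(name)
    return names


def run_all(directory=".", runs=10, skip=()):
    """Run every executable `runs` times, writing one result file per run.

    The file name pattern results.<run> is an assumption, not taken from
    the original script. Returns the list of files written.
    """
    written = []
    for run in range(1, runs + 1):
        out_path = os.path.join(directory, f"results.{run}")
        with open(out_path, "w") as out:
            for name in find_executables(directory, skip):
                proc = subprocess.run(
                    [os.path.join(directory, name)],
                    capture_output=True,
                    text=True,
                )
                out.write(f"== {name} ==\n{proc.stdout}")
        written.append(out_path)
    return written


# Example (not executed here): run_all(".", runs=10, skip={"this_script.py"})
```

The skip argument plays the role of "except for itself": pass the script's own file name so that it does not try to run itself.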

Matrix Transpose (throughput in MB/sec)

Size of matrix   Superpage On   Std. Dev.   Superpage Off   Std. Dev.
3000 Store       25.46          0.02        20.03           0.03
3000 Load        29.47          0.01        21.86           0.05
3100 Store       25.32          0.02        19.83           0.05
3100 Load        26.87          0.00        20.19           0.04
3150 Store       24.97          0.01        19.55           0.03
3150 Load        25.68          0.00        19.42           0.03
3175 Store       25.52          0.03        19.99           0.64
3175 Load        29.51          0.00        21.62           0.05
3200 Store       22.94          0.01        20.82           0.15
3200 Load        29.30          0.00        22.86           0.06
Store Average    24.84          0.77        19.92           0.04
Load Average     28.17          1.51        20.86           1.25

The results above show that superpages increase throughput by approximately 20% on the store strobe of the test and approximately 26% on the load strobe. The updated 2.6.0 kernel was capable of running the tests at a matrix size of 3300 and also showed a slight improvement in memory performance before the superpage patch was applied.

Matrix transpose testing shows that superpages improve throughput by an average of about 20%. One notable result was in the 3300 matrix test. This test can be found in the extended results and shows that the throughput of the store strobe is lower for a kernel with SuperPages than for a non-Superpage kernel. This effect can be attributed to the extra work the memory manager performs for SuperPages. When a page is allocated, the superpager attempts to allocate the largest Superpage available (in the case of the Alpha architecture, 4 MB). If a contiguous 4 MB hole is not available in memory, the page to be allocated is downgraded to the next available page size (on the Alpha, 512 KB). The downgrading repeats until the base page size is reached; if there is still no space available, the swap daemon is called to evict a page from memory.
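The downgrade behaviour described above can be modelled with a small Python sketch. It is a toy model under stated assumptions, not the patch's allocator. Physical memory is represented as a list of booleans, one per base page frame. A superpage of a given order needs a naturally aligned run of 2^order free frames. The evict callback is a hypothetical stand-in for the swap daemon. None of the names come from the patch.

```python
# Hypothetical model: one boolean per base page frame (True = free).
ALPHA_ORDERS = (9, 6, 3, 0)  # 4 MB, 512 KB, 64 KB, 8 KB, largest first


def find_free_run(free, order):
    """Return the start of the first aligned run of 2**order free frames, or None."""
    run = 1 << order
    for start in range(0, len(free) - run + 1, run):
        if all(free[start:start + run]):
            return start
    return None


def allocate(free, evict):
    """Try the largest order first and downgrade until something fits.

    evict stands in for the swap daemon: it is called only when not even
    a base page is free, and is expected to free at least one frame.
    Returns (first_frame, order) and marks the frames as used.
    """
    for attempt in range(2):
        for order in ALPHA_ORDERS:
            start = find_free_run(free, order)
            if start is not None:
                for frame in range(start, start + (1 << order)):
                    free[frame] = False
                return start, order
        if attempt == 0:
            evict(free)
    raise MemoryError("no free frame even after eviction")


# 16 frames (128 KB) with frame 3 in use: the first 64 KB region is
# fragmented, so the order-3 page comes from frames 8..15.
memory = [True] * 16
memory[3] = False
print(allocate(memory, evict=lambda frames: None))
```

Each failed search at a larger order is work that a kernel without superpages never does, which is consistent with the explanation given for the 3300 result.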

Patches

Patches are constantly being updated; you can find all of the patches that Gelato@UNSW has generated here.

ToDo

IA64wiki: ShimizuSuperPages (last edited 2009-12-10 03:13:56 by localhost)
