Modify glibc libraries for testing tlb-sharing patches

The Gelato team created patches that share TLB entries for mmap()-ed objects.

If you want to share TLB entries among processes, what you basically need to do is map the same thing at the same place. That means mapping each object at a fixed address and sharing the Address Space ID (on Itanium, the Region ID). We already have patches (LongFormatVhpt) for sharing the region ID. For the fixed addresses, we can either let the kernel do the bookkeeping needed to map the same object at the same address (for example, by storing the addresses in the inode structures) or modify the glibc libraries to pre-arrange an address for each object (library).

The idea behind this is simple: scan the whole system, find all the libraries we are interested in, and assign a unique address range to each of them. (Since the kernel knows nothing about the reserved addresses, it is very likely to map different things at the same address; see Bugs below. This problem should be fixed in the future.)
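The pre-arrangement step can be pictured with a minimal Python sketch. It is not the actual tooling: it assumes the text-segment sizes are already known, that ranges are handed out from the start of the shared region (0x2000000000000000), and that each range is aligned to 0x10000, the page granularity that appears in the linker script. The function name `assign_slots` and the example paths and sizes are hypothetical.

```python
SHARED_REGION_BASE = 0x2000000000000000  # assumed start of the shared region
SLOT_ALIGN = 0x10000                     # assumed alignment (64 KB)

def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)

def assign_slots(text_sizes, base=SHARED_REGION_BASE, alignment=SLOT_ALIGN):
    """Give every library a unique, non-overlapping address range.

    text_sizes maps a library path to the size of its text segment in bytes.
    Returns a dict mapping each path to its (start, end) range, end exclusive.
    """
    slots = {}
    cursor = align_up(base, alignment)
    for path in sorted(text_sizes):          # sorted for a reproducible layout
        start = cursor
        end = start + text_sizes[path]
        slots[path] = (start, end)
        cursor = align_up(end, alignment)    # next library starts on a fresh slot
    return slots

# Hypothetical libraries and sizes, for illustration only.
example = {"/lib/libc.so.6.1": 0x2A0000, "/lib/libm.so.6.1": 0x91234}
for path, (start, end) in assign_slots(example).items():
    print(f"{path}: {start:#018x}-{end:#018x}")
```

Because the assignment is computed once for the whole system, every process that loads a given library finds it at the same address, which is the precondition for sharing its TLB entries.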

As we all know, each ELF-format library has two important segments: the text segment and the data segment. We can share only the text segment, because the data segment must be private to each process. Usually the system maps the data segment right after the text segment in the same region, but we have to break that convention and map the text segment into a shared region and the data segment into a private region. To do that, we have to change both the libraries being mapped and ld.so, the dynamic loader in charge of mapping libraries in user space.

One thing should be mentioned here. The Itanium processor has two interesting registers: IP, which points to the currently executing instruction, and GP, which points to the data segment currently in use. So at first I thought the two segments were independent of each other. (It seems quite straightforward that you would reference instructions through IP and data through GP.) Yet when I tried to map the two segments individually, I found that the Itanium architecture assumes the text segment is followed by the data segment, as they were laid out at compile time, so it will reference instructions through the GP pointer. The solutions are either to tell the compiler/assembler not to generate the GP-related instructions (then we can map the segments freely), or to have the linker do something tricky: relocate the text segment into the shared region and the data segment into the private region.

What it does

We implemented the latter, and the key is the linker script. In ia64 Linux, the shared region is region 1 (0x2000000000000000-0x3fffffffffffffff), and we chose region 3 (0x6000000000000000-0x7fffffffffffffff) as the private region. The offset from the shared region to the private region is 0x4000000000000000, so we changed the linker script to put the text segment first and then skip 0x4000000000000000 before placing the data segment.
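The address arithmetic can be checked with a short Python sketch. It is only an illustration: the sample addresses are made up, the in-page offset is ignored, and the names `SEGMENT_GAP`, `data_address`, and `in_range` are hypothetical. It shows that text placed anywhere in the shared range gets its data placed in the 0x6000000000000000 range, always at the same distance from the text.

```python
SHARED_START = 0x2000000000000000   # shared region (text segments)
SHARED_END = 0x3FFFFFFFFFFFFFFF
PRIVATE_START = 0x6000000000000000  # private region (data segments)
PRIVATE_END = 0x7FFFFFFFFFFFFFFF
SEGMENT_GAP = 0x4000000000000000    # skipped by the modified linker script

def data_address(text_address):
    """Address the same gap further on from an address in the text mapping."""
    return text_address + SEGMENT_GAP

def in_range(address, start, end):
    """True if address lies in the inclusive range [start, end]."""
    return start <= address <= end

# Made-up addresses inside the shared region.
for text in (0x2000000000000000, 0x2000000000340000, 0x3FFFFFFFFFFF0000):
    data = data_address(text)
    assert in_range(text, SHARED_START, SHARED_END)
    assert in_range(data, PRIVATE_START, PRIVATE_END)
    print(f"text {text:#018x} -> data {data:#018x}")
```

Since the gap is a constant fixed at link time, the distance between the two segments is the same wherever the library is loaded, so the layout the linker assumed is preserved even though the segments sit in different regions.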

Build and install (on Debian Linux)

Let's assume your machine is running a kernel with the TLB-sharing patches.

Set up the chroot environment.

See (GlibcDevelopmentEnvironment) for more details.

Install the patch for glibc and set up the build environment.

See (GlibcDevelopmentEnvironment) for more details. The glibc patch is in glibc.diff. Make sure your environment contains the following commands/packages, which are needed for building glibc:

If you hit errors such as an undefined macro, try copying unistd.h from your Linux source tree to YOUR_CHROOT/usr/include/asm/

Build glibc in the chroot environment.

There are two ways to change the linker script while you're building the glibc libraries.

Change linker script by hand

  1. Go to your glibc build directory, say /usr/src/libc-obj

  2. Run make

  3. Go to /usr/src/libc-obj

    • In the linker script shlib.lds (the linker script for all shared libraries), find the following line, which is at the beginning of the data segment:

      • . = ALIGN(0x10000) + (. & (0x10000 - 1));

      The line above makes the data segment skip to the next page-aligned address. We change it to
      • . = 0x4000000000000000 + (. & (0x10000 - 1));

  4. Then run make install
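To see what the edit in step 3 does, the two location-counter expressions can be modelled in Python. This is a sketch that assumes `.` holds the link-time address reached at the end of the text segment and that `ALIGN(n)` rounds `.` up to the next multiple of `n`, as in GNU ld. The function names and the sample value of `dot` are hypothetical.

```python
PAGE = 0x10000                  # page size used in shlib.lds
GAP = 0x4000000000000000        # distance to the private region

def align(dot, boundary):
    """Model of ld's ALIGN(boundary): round dot up to a multiple of boundary."""
    return (dot + boundary - 1) & ~(boundary - 1)

def original_data_start(dot):
    # . = ALIGN(0x10000) + (. & (0x10000 - 1));
    return align(dot, PAGE) + (dot & (PAGE - 1))

def patched_data_start(dot):
    # . = 0x4000000000000000 + (. & (0x10000 - 1));
    return GAP + (dot & (PAGE - 1))

dot = 0x2A1234                  # hypothetical end of the text segment
print(hex(original_data_start(dot)))  # 0x2b1234
print(hex(patched_data_start(dot)))   # 0x4000000000001234
```

Both expressions keep the low 16 bits of `.`, so the offset of the data segment within its page is unchanged. Only the page-aligned part differs: the original moves to the next page, while the patched version jumps by the full gap.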

Change the default linker script of ld

The second way is to change the linker's default linker script, so that the build generates the linker script we expect. The change is the same as above, but this time we make it in the linker's default script; then, when we build the glibc package, shlib.lds is changed automatically. See (ChangeDefLinkerScript) for more on how to change the default linker script.

Bugs

You can NOT run two chroot environments on the same machine.

As mentioned before, the kernel knows nothing about this address space management, so it is quite happy to map two different objects at the same address. You can see this by installing two chroot environments that differ slightly from each other, say, with a slightly changed libc.so. Run one, then the other. The kernel then maps two different libraries at the same address with the same Region ID but different Protection Keys. The first one sets up the TLB entries, and when the second one searches the TLB, it finds a "valid" entry with the region ID and virtual address it wants but with the other protection key. The pk_fault handler is therefore invoked to install the correct keys, but it keeps running forever because it can never find the key it needs.
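The collision can be illustrated with a toy Python model. It is a deliberate simplification, not a description of the hardware or of the kernel's fault handler: it assumes only that a TLB entry is found by region ID and virtual address and carries one protection key. The region ID, address, and key values are hypothetical.

```python
# Toy model: TLB entries are looked up by (region ID, virtual address) only.
tlb = {}

def access(region_id, address, protection_key):
    """Return 'miss', 'hit', or 'key mismatch' for one memory access."""
    entry = (region_id, address)
    if entry not in tlb:
        tlb[entry] = protection_key   # first user installs the translation
        return "miss"
    if tlb[entry] == protection_key:
        return "hit"
    return "key mismatch"             # valid entry, but for another object

RID = 0x1234                          # hypothetical shared region ID
ADDR = 0x2000000000340000             # hypothetical address of libc.so
print(access(RID, ADDR, protection_key=7))   # first chroot: miss
print(access(RID, ADDR, protection_key=7))   # first chroot again: hit
print(access(RID, ADDR, protection_key=9))   # second chroot: key mismatch
```

In this model the second chroot never gets a miss that would let it install its own translation, which corresponds to the fault that cannot be resolved in the scenario above.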

IA64wiki: Glibc4TLBSharing (last edited 2009-12-10 03:14:06 by localhost)
