Re: Inefficient ia64 system call implementation in glibc

From: John Worley <jworley_at_fc.hp.com>
Date: 2003-09-20 07:46:24
H.J. Lu <hjl@lucon.org> wrote:

> The inline ia64 system call assumes all values passed to kernel are
> signed 64bit. It does sign extension if the incoming arg is not signed
> 64bit. In case of fxstat.c:
> 
> int
> __fxstat (int vers, int fd, struct stat *buf)
> {
>   return INLINE_SYSCALL (fstat, 2, fd, CHECK_1 (buf));
> }
>  
> it leads to
> 
> 0000000000000000 <__fxstat>:
>    0:   00 20 39 0c 80 05       [MII]       alloc r36=ar.pfs,14,6,0
>    6:   f0 e0 01 12 48 a0                   mov r15=1212
>    c:   04 08 00 84                         mov r37=r1
>   10:   01 38 01 44 00 21       [MII]       mov r39=r34
>   16:   60 02 84 2c 00 60                   sxt4 r38=r33
> 					    ^^^^^^^^^^^^^
>   1c:   04 00 c4 00                         mov r35=b0;;
>   20:   0a 00 00 00 00 02       [MMI]       break.m 0x100000;;
>   26:   10 02 20 00 42 e0                   mov r33=r8

     The real inefficiency here is the compiler's scheduling, not the
sign extension. Given the realities of the Itanium 2 implementation,
the first two bundles as emitted will require 3 cycles to execute. A
better coding would be:

	{	.mmi
		alloc	r36=ar.pfs,14,6,0
		mov	r15=1212
		mov	r35=b0
	}
	{	.mmi
		mov	r37=r1
		mov	r39=r34
		sxt4	r38=r33
	} ;;

     which will execute in one cycle. The sign extension, although
"unnecessary", doesn't cost any cycles. Admittedly, you could use the
mi;;i template to pack the break instruction into the second bundle if
you didn't have to sign-extend, but I'd rather see the 3 vs. 1 cycle
problem addressed first.
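
     For anyone unsure what the sxt4 is doing there, here is a rough C
sketch of its effect. It assumes the upper 32 bits of the register
holding an int argument are not guaranteed to be meaningful on entry,
which is why the value gets widened before the break. The helper name
sxt4() is only an illustration, not anything from glibc or the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): mimic what "sxt4 r38=r33"
 * computes. Keep the low 32 bits of the register, treat them as a
 * signed int, and widen to 64 bits by copying bit 31 upward.
 *
 * Note: converting an out-of-range uint32_t to int32_t is
 * implementation-defined before C23; common compilers wrap in two's
 * complement, which is what this sketch relies on.
 */
static int64_t sxt4(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;      /* discard the upper 32 bits */
    return (int64_t)(int32_t)low;      /* sign-extend bit 31 */
}
```

     With that definition, sxt4(0x00000000ffffffff) gives -1 and
sxt4(0xdeadbeef00000005) gives 5, i.e. whatever was in the upper half
of the register is ignored.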

	Regards,
	John "I worry about this stuff way too much" Worley
	john.worley@hp.com


Received on Fri Sep 19 17:49:56 2003