Re: Itanium2@900MHz slower than alpha@666MHz ?

From: Ionut Georgescu <george_at_physik.tu-cottbus.de>
Date: 2003-10-25 04:55:17
Thanks, but I still need to learn about this kind of optimization.
After running the program I don't get any .dpi file, and the name of the
.dpy file seems to be derived from the pid of the process rather than
from the name of the program.


On Fri, Oct 24, 2003 at 10:42:40AM -0700, Chen, Kenneth W wrote:
> It wasn't clear from the description whether you actually turned on
> profile guided optimization with the Electron compiler.  It is a two-pass
> compilation: once with -prof_gen to generate an execution profile and then
> once with -prof_use to complete the PGO build.
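
A minimal sketch of the two-pass sequence described above, written as a small Python helper that only assembles the command lines and does not run them. The source file prog.c, the output name prog, and the default -O2 flag are assumptions for illustration; only ecc, -prof_gen and -prof_use come from this thread. The training run between the two compiles is the step that produces the execution profile.

```python
import shlex


def pgo_build_steps(sources, output, compiler="ecc", base_flags=("-O2",)):
    """Return the three steps of a two-pass PGO build as argument lists.

    Step 1 compiles an instrumented binary, step 2 runs it on
    representative input, step 3 recompiles using the recorded profile.
    """
    srcs = list(sources)
    flags = list(base_flags)
    instrumented = [compiler, *flags, "-prof_gen", "-o", output, *srcs]
    training_run = ["./" + output]  # hypothetical: run with typical input
    optimized = [compiler, *flags, "-prof_use", "-o", output, *srcs]
    return [instrumented, training_run, optimized]


if __name__ == "__main__":
    # prog.c / prog are placeholder names, not taken from the thread.
    for step in pgo_build_steps(["prog.c"], "prog"):
        print(shlex.join(step))
```

Printing the steps first makes it easy to check the flag order before handing them to a Makefile or a build script.
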
> 
> One other neat thing about the Itanium architecture is the capability of
> its performance counters.  They can do cycle accounting, which breaks
> down the number of cycles that are lost to various kinds of
> micro-architectural events.  It is based on the CPU's actual stall cycles in
> the pipeline, so you can see exactly where the stalls are coming from and
> eliminate any guesswork.
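
To illustrate what a cycle-accounting breakdown looks like, here is a small Python sketch that turns raw counter values into per-category shares of the total cycle count. The category names and the counts are hypothetical placeholders, not real Itanium 2 event names or measurements; the actual events would have to be taken from Intel's manuals and collected with a tool such as pfmon.

```python
def stall_breakdown(total_cycles, stall_cycles):
    """Return (category, share_of_total) pairs, largest share first.

    total_cycles: total CPU cycles counted for the run.
    stall_cycles: dict mapping a stall category to cycles lost to it.
    An extra "unstalled" entry covers the cycles not attributed to a stall.
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    stalled = sum(stall_cycles.values())
    if stalled > total_cycles:
        raise ValueError("stall cycles exceed total cycles")
    shares = {name: cycles / total_cycles for name, cycles in stall_cycles.items()}
    shares["unstalled"] = (total_cycles - stalled) / total_cycles
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the report.
    report = stall_breakdown(1000, {"data_cache": 400, "fpu": 100})
    for name, share in report:
        print(f"{name:12s} {share:6.1%}")
```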

Are qprof and pfmon enough to do this ?

> 
> See the Electron compiler user's guide for the PGO methodology:
> http://www.intel.com/software/products/compilers/c60l/resources/c_ug_lnx.pdf

Is this the same for 7.x ?

> 
> Cycle accounting is described in the Intel Itanium Software Developer's
> Manual.
> 

Thank you for the info. I'll try to find the bottleneck.

Ionut

> - Ken
> 
> 
> -----Original Message-----
> From: linux-ia64-owner@vger.kernel.org
> [mailto:linux-ia64-owner@vger.kernel.org] On Behalf Of Ionut Georgescu
> Sent: Friday, October 24, 2003 8:15 AM
> To: linux-ia64@vger.kernel.org
> Subject: Itanium2@900MHz slower than alpha@666MHz ?
> 
> Hello,
> 
> I am puzzled about the speed of a zx2000 workstation with a 900MHz CPU.
> According to the SPECfp2000 benchmarks, this workstation should be about
> twice as fast as a DS10 Alpha workstation, and according to the fftw2
> benchmarks at least 50% faster (double precision, real data, 256x256 FFT
> transforms). I ran the fftw2 benchmark myself and I could reproduce the
> data on fftw.org.
> 
> However, my program is about 40% slower on the zx2000 than on the Alpha.
> It only does some Fourier transforms (fftw2, 256x256) and some matrix
> operations (sort of an inner product). Both fftw2 and the program have
> been compiled with ecc -O2 -ipo -limf. ecc is Version 7.1, Build
> 20030307.
> 
> Both the alpha and the Itanium2 run Debian stable and kernel 2.4.20.
> 
> Is there anything else I can do to improve performance? I tried to do some
> profiling (CFLAGS="-g -p -Ob0 -O0 -inline_debug_info"), but the report
> is missing the call graph and a lot of other information, so I
> can't trust the quality of the data. Right now I'm trying to dig my
> way through qprof and pfmon (for the moment qprof fails when
> QPROF_HW_EVENT is set).
> 
> Thanks a lot,
> Ionut
> 
> -- 
> ***************
> * Ionut Georgescu
> * http://www.physik.tu-cottbus.de/~george/
> * Registered Linux User #244479
> *
> * "In Windows you can do everything Microsoft wants you to do; in Unix you
> *                can do anything the computer is able to do."
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Received on Fri Oct 24 15:21:47 2003
