Talk:Intel C++ Compiler
From Gentoo Linux Wiki
Just For Fun?
I want to reinstall my Gentoo in about 2 months, and I wonder whether the gains from icc are worth all the trouble and manual work. Let's say icc code is 30% faster than code from gcc; on the other hand, you cannot build everything with icc, and it seems to cause a fair bit of trouble. Can someone tell me whether installing the kernel, X and KDE is "possible" using icc, or icc mixed with gcc, and how much additional work it will cost?
-- mmatczuk@yahoo.co.uk
"man icc" does not work. No man files were installed when I emerged icc.
Also, it may be worth mentioning that you can test this out without integrating it into portage; just add the variables to your command line. e.g. "# CC="icc" CXX="icc" CFLAGS="-O2 -tpp6 -march=pentiumiii" CXXFLAGS="-O2 -tpp6 -march=pentiumiii -parallel" emerge package"
-- jnicol@backnine.org
Reporting the problem with the man page at http://bugs.gentoo.org would be helpful.
-- nicktastic@gmail.com
Comparisons
Sure, ICC has vectorization and such, but is it worth re-emerging all of these packages? A very interesting statistic would be benchmarks of each working package compiled with ICC against the same package compiled with GCC. It is pointless to risk a program not working after the recompile for marginal gains. However, if there is a lot of floating-point math going on, I hear ICC produces programs that really haul.
Re: Comparisons
ICC does indeed do a better job on floating-point math, and it also appears to optimize C++ much better than gcc. For most GUIs/WMs/DEs this may or may not have an effect: only environments (and apps) written in C++ would see any huge difference, and floating-point math is mostly for scientific code; most graphical apps don't use it, so no real gain there. (I'm not sure where I heard that C++ on icc is better; I've seen it in a few places, but again, don't quote me on that :-P)
--69.163.5.47 03:47, 25 May 2005 (GMT)
ICC compiler flags
For me, being the one to wander into the unknown, I like to have moderate CFLAGS (-march=pentium4 -O3 -xW (Willy core)), so I don't expect icc to really haul ass over gcc. In the end, if I have a whole computer, I can optimize the stuff that needs it (anything graphical) to the max, but first let's see if it works.
Well - I got Gentoo in a VM compiled with ICC (whatever did compile) and encoding a video gives me this:
time mencoder a.mpg -ovc xvid -xvidencopts bitrate=-50000:pass=2 -nosound -o a.avi
Gentoo in VM:
real 1m6.512s
user 0m15.470s
sys 0m47.800s
Debian host machine:
real 2m8.984s
user 2m7.716s
sys 0m1.180s
As to why, I wish I knew; I'm trying to figure it out. What I can tell you is that compiling xvid with gcc doesn't make a big difference, and mencoder is compiled with gcc anyway since it didn't like icc.
icc.cfg parsed?
I'm currently having problems getting icc (dev-lang/icc-9.0.021 from portage) to compile a sed (sys-apps/sed-4.1.4) that actually works. I mean, compilation works just fine, but the resulting binary just segfaults.
I, "of course" :), followed the guide to do:
echo "-no-gcc" >> /opt/intel/compiler81/bin/icc.cfg
(I did "nano /opt/intel/compiler90/bin/icc.cfg" and added a new line with "-no-gcc" (no ") and also added it to the existing line), but sed still isn't useable.
Now I added "-no-gcc" to CFLAGS in my make.conf.
Because of that, I wonder if that icc.cfg is really used.
Any comments? --A.Skwar 23:07, 6 Aug 2005 (GMT)
Yes, it is parsed. You can see it from the output of configure:
checking whether we are using the GNU C compiler... no
You can analogously add "-no-gcc" to icpc.cfg.
Try compiling sed without IPO to prevent segfaults. --Jgorski 16:33, 7 September 2006 (UTC)
LD and AR
Do you also have to set LD and AR to the intel equivalents?
LD=xild
AR=xiar
- icc defaults to gcc-compatible mode, so linking with gld or assembling with gas shouldn't give any problems. I've defined them in my make script as the Intel native ones, but regular apps compile just as well if you leave them out.
ICC For non-Intel CPUs
Is there any advantage to using ICC on non-Intel CPUs (e.g. AMD XP/64), given that the newer AMD CPUs support the same instruction set as the newer P4s?
- I can't speak from experience (I have not tried ICC), but I would think the answer is "no": while the instruction set is the same, the architecture is different, and the architecture is what ICC optimizes for. I could be wrong though. MighMoS 21:08, 12 October 2005 (GMT)
Well, I've just tried icc-9.0.30 with -O2 -march=pentiumii -ip -ipo -rcd -parallel -fomit-frame-pointer on an athlon-mp. The generated binary is about 30% faster than the one from gcc with athlon-specific options. The source file calculates different means for a big array of floats.
A run of a gcc compiled version vs. Intel:
darkstar Means # time nice -n -19 ./means 123456789; time nice -n -19 ./meansi 123456789;
Memory size : 470.95 Mb
ArithmicMean : 0.500000
GeometricMean : 0.000000
HarmonicMean : 0.052060
HeronianMean : 0.333333
RootMeanSquare : 0.577350
Total time spent: 14.110000 seconds
real 0m14.144s
user 0m13.290s
sys 0m0.830s
Memory size : 470.95 Mb
ArithmicMean : 0.500000
GeometricMean : 0.000000
HarmonicMean : 0.052060
HeronianMean : 0.333333
RootMeanSquare : 0.577350
Total time spent: 9.370000 seconds
real 0m9.578s
user 0m11.620s
sys 0m0.950s
darkstar Means #
I have yet to install ifc, but here are the results of g95 compiled version:
darkstar Means # time nice -n -19 ./meansf 123456789
Memory size : 470.95 Mb
Arithmetic mean : 0.500000
Geometric mean : 0.000000
Harmonic Mean : 0.052060
Heronian Mean : 0.333333
Root Mean Sqrt : 0.577350
Total time spent: 14.140000 seconds
real 0m14.170s
user 0m13.470s
sys 0m0.680s
C Source code listing of the program:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>
float ArithmicMean (float *array, unsigned long int length);
float GeometricMean (float *array, unsigned long int length);
float HarmonicMean (float *array, unsigned long int length);
float HeronianMean (float *array, unsigned long int length);
float RootMeanSquare (float *array, unsigned long int length);
void exitError (char *message);
float bytes2human (unsigned long int bytes, char *text);
int main(int argc, char **argv)
{
    unsigned long int L, i, arraySize;
    long int n;                 /* signed, so a negative argument is caught */
    float *array=NULL, pSize;
    char pText[3] = "";         /* zeroed before bytes2human() inspects it */
    clock_t TStart, TEnd;

    /* Check and get argument */
    if (argc!=2) exitError("Please supply the array length\n");
    n = atol(argv[1]);
    if (n<=0) exitError("Argument conversion error\n");
    L = n;

    /* Allocate memory */
    arraySize = sizeof(float)*L;
    pSize = bytes2human(arraySize, pText);
    printf("Memory size : %6.2f %s\n", pSize, pText);
    TStart = clock();
    array = malloc(arraySize);
    if (!array) exitError("Not enough memory to hold array\n");

    /* Fill the array with dummy values */
    for (i=0; i<L; ++i) array[i] = (i+1.0)/L;

    /* Calculate the means and clean up */
    printf("ArithmicMean : %10.6f\n", ArithmicMean (array, L));
    printf("GeometricMean : %10.6f\n", GeometricMean (array, L));
    printf("HarmonicMean : %10.6f\n", HarmonicMean (array, L));
    printf("HeronianMean : %10.6f\n", HeronianMean (array, L));
    printf("RootMeanSquare : %10.6f\n", RootMeanSquare(array, L));

    /* Print total time spent calculating */
    free(array);
    TEnd = clock();
    printf("Total time spent: %10.6f seconds\n",
           ((double)(TEnd-TStart))/CLOCKS_PER_SEC);
    return 0;
}
float ArithmicMean(float *array, unsigned long int length)
{
    unsigned long int i=0;
    double sum=0.0;
    for (i=0; i<length; i++) sum += array[i];
    return sum/length;
}

float GeometricMean(float *array, unsigned long int length)
{
    unsigned long int i=0;
    double prod=1.0;
    for (i=0; i<length; i++) prod *= array[i];
    return pow(prod, (1.0/length));
}

float HarmonicMean(float *array, unsigned long int length)
{
    unsigned long int i=0;
    double divsum=0.0;
    for (i=0; i<length; i++) divsum += 1.0/array[i];
    return length/divsum;
}

float HeronianMean(float *array, unsigned long int length)
{
    return 1.0/3.0 * (2.0*ArithmicMean(array, length) + GeometricMean(array, length));
}

float RootMeanSquare(float *array, unsigned long int length)
{
    unsigned long int i=0;
    double sqrsum=0.0;
    for (i=0; i<length; i++) sqrsum += array[i] * array[i];
    return sqrt(sqrsum/length);
}
void exitError(char *message)
{
    /* Print the message as data, never as a format string */
    fprintf(stderr, "%s", message);
    exit(1);
}
float bytes2human(unsigned long int bytes, char *text)
{
    /* Scales bytes to Gb/Mb/Kb/b and writes the NUL-terminated unit into
       text, which must hold at least 3 chars. Each branch terminates the
       string itself, so text need not be initialised by the caller. */
    float ret=0.0;
    if (bytes>1024*1024*1024) {
        ret = bytes / 1024.0/1024.0/1024.0;
        text[0] = 'G'; text[1] = 'b'; text[2] = '\0';
    }
    else if (bytes>1024*1024) {
        ret = bytes / 1024.0/1024.0;
        text[0] = 'M'; text[1] = 'b'; text[2] = '\0';
    }
    else if (bytes>1024) {
        ret = bytes / 1024.0;
        text[0] = 'K'; text[1] = 'b'; text[2] = '\0';
    }
    else {
        ret = bytes;
        text[0] = 'b'; text[1] = '\0';
    }
    return ret;
}
Fortran Source code:
PROGRAM Means
  IMPLICIT NONE
  !
  ! Declare all variables
  !
  INTEGER*8 :: S=1, I, L, ERR
  REAL, DIMENSION(:), ALLOCATABLE :: X
  CHARACTER(LEN=25) :: C
  CHARACTER(LEN=2) :: pText
  REAL :: TSTART, TEND, pSize
  !
  ! Read in the array length from the command line
  IF (COMMAND_ARGUMENT_COUNT() < 1) THEN
    STOP 'Please supply the array length'
  END IF
  CALL GET_COMMAND_ARGUMENT(1, C)
  L = ATOI(C, S)
  IF (L <= 0) THEN
    STOP 'Argument conversion error'
  END IF
  pSize = bytes2human(L*SIZEOF(pSize), pText)
  WRITE (*, '(''Memory size : ''(F6.2)'' ''(A))') pSize, pText
  CALL CPU_TIME(TSTART)
  ALLOCATE( X(1:L), STAT=ERR )
  IF (ERR /= 0) THEN
    STOP 'Error allocating memory for array'
  END IF
  !
  ! Fill the allocated array
  !
  DO I = 1, L
    X(I) = I/REAL(L)
  END DO
  !
  ! And print the results
  !
  WRITE (*, '(''Arithmetic mean : ''(F10.6))') ArithmicMean(X, L)
  WRITE (*, '(''Geometric mean : ''(F10.6))') GeometricMean(X, L)
  WRITE (*, '(''Harmonic Mean : ''(F10.6))') HarmonicMean(X, L)
  WRITE (*, '(''Heronian Mean : ''(F10.6))') HeronianMean(X, L)
  WRITE (*, '(''Root Mean Sqrt : ''(F10.6))') RootMeanSquare(X, L)
  DEALLOCATE(X)
  CALL CPU_TIME(TEND)
  WRITE (*, '(''Total time spent: ''(F10.6)'' seconds'')') TEND-TSTART
CONTAINS
  !
  ! Computes the arithmic mean for the given array and length
  !
  REAL FUNCTION ArithmicMean(ar, len)
    INTEGER*8 :: len
    REAL :: ar(len)
    ArithmicMean = SUM(ar)/REAL(len)
  END FUNCTION
  !
  ! Computes the geometric mean for the given array and length
  !
  REAL FUNCTION GeometricMean(ar, len)
    INTEGER*8 :: len
    REAL :: ar(len)
    GeometricMean = PRODUCT(ar)**(1.0/len)
  END FUNCTION
  !
  ! Computes the harmonic mean for the given array and length
  !
  REAL FUNCTION HarmonicMean(ar, len)
    INTEGER*8 :: len
    REAL :: ar(len)
    HarmonicMean = len/SUM(1.0/ar)
  END FUNCTION
  !
  ! Computes the heronian mean for the given array and length
  !
  REAL FUNCTION HeronianMean(ar, len)
    INTEGER*8 :: len
    REAL :: ar(len)
    HeronianMean = 1.0/3.0 * (2.0*ArithmicMean(ar,len) + GeometricMean(ar,len))
  END FUNCTION
  !
  ! Computes the root mean square for the given array and length
  !
  REAL FUNCTION RootMeanSquare(ar, len)
    INTEGER*8 :: len
    REAL :: ar(len)
    RootMeanSquare = SQRT(SUM(ar**2.0)/len)
  END FUNCTION

  FUNCTION atoi(string,i) RESULT (value)
    CHARACTER (len=*) :: string
    INTEGER*8 :: i, ii, max, sign, value
    CHARACTER (len=10), PARAMETER :: digit = '0123456789'
    max = len(string)
    value = 0
    sign = 1
    ! CALL skipbl(string,i)
    IF (string(i:i)=='-') THEN
      sign = -1
      i = i + 1
    END IF
    DO i = i, max
      ii = index(digit,string(i:i))
      IF (ii==0) THEN
        value = sign*value
        RETURN
      END IF
      value = (value*10) + (ii-1)
    END DO
  END FUNCTION atoi

  REAL FUNCTION bytes2human(bytes, text)
    INTEGER*8 :: bytes
    CHARACTER(2) :: text
    IF (bytes > 1024*1024*1024) THEN
      bytes2human = bytes / 1024.0/1024.0/1024.0
      text = 'Gb'
    ELSE IF (bytes > 1024*1024) THEN
      bytes2human = bytes / 1024.0/1024.0
      text = 'Mb'
    ELSE IF (bytes > 1024) THEN
      bytes2human = bytes / 1024.0
      text = 'Kb'
    ELSE
      bytes2human = REAL(bytes)
      text = 'b'
    END IF
    RETURN
  END FUNCTION bytes2human
END PROGRAM Means
General things like lzma, bzip2 and other cpu intensive packages benefit from these compiles too. (/me loves lzma-utils :)
Version 10
Does this information apply to the most recent version of icc?
my experience
In response to above question, here is what I have experienced with the latest ICC (10.0.026) on ~amd64.
Relevant entries in /etc/make.conf
CFLAGS="-O2 -xT -inline-level=2 -parallel -fomit-frame-pointer -fPIC"
CXXFLAGS="${CFLAGS}"
GCCFLAGS="-march=nocona -O2 -pipe -fomit-frame-pointer"
GCXXFLAGS="${GCCFLAGS}"
My /etc/portage/package.gcc is
app-shells/bash #bash2 works, but not bash3
sys-apps/gawk #compile error
#sys-apps/module-init-tools #compile error
sys-apps/sandbox #fails to build properly if -no-gcc is used
sys-apps/sed #breaks emerge if compiled with icc
sys-devel/binutils #toolchain
sys-devel/gcc #toolchain
sys-libs/db #linking error
sys-libs/glibc #toolchain
sys-libs/ncurses #needed if you're going to compile bash with gcc
sys-libs/libstdc++-v3 #toolchain
sys-libs/zlib #needs to be gcc for linking continuity later on
sys-process/procps
# (Above is based on this information.
# I tried to emerge each of these with ICC,
# but eventually failed except module-init-tools.
# I left them alone as you see.)
# (Below is what I have trouble with ICC,
# which is not mentioned in this info or
# reported as successful emerge with version 8.)
sys-fs/udev
sys-apps/util-linux
net-analyzer/traceroute
app-portage/eix
media-sound/esound
net-misc/openssh
dev-libs/apr
media-gfx/splashutils
app-arch/bzip2
app-admin/apache-tools
net-dialup/ppp
www-servers/apache
mail-client/sylpheed
dev-util/subversion
x11-base/xorg-server
x11-drivers/xf86-input-keyboard
app-text/tetex
app-office/texmacs
media-sound/alsa-driver
x11-drivers/xf86-input-mouse
x11-drivers/xf86-video-i810
Both ICC and GCC versions of the same package installed?
Question: What I need is to have both versions - one gcc and one intel icc - of the same package (for example of blas, cblas, umfpack, etc.) compiled. What is the most "gentoo way" to do this? Ideally, it would be great to emerge the packages (not to compile them myself somewhere in my /home) and, if possible, to further update both versions via emerge --update. Any hints?
