Intel C++ / gcc 2.95.4 comparisons - Pike-devel

16 Jan 2003


Thought these results were interesting. Note that compiling with
gcc-3.2 would probably generate better code, especially with
arch-dependent data. The Intel binary is compiled with arch-specific
code for multiple platforms (i.e. like -march, but for multiple
platforms at once, which is rather nice). I'm also using multi-file
interprocedural optimizations (linking Pike takes quite a long time).
The arch is Athlon; a Pentium would probably get even better results.

gcc version 2.95.4 20011002 (Debian prerelease):
test                        total    user    mem   (runs)
Pike start overhead........ 0.228s  0.001s  3352kb  (22)
Ackermann.................. 1.669s  1.453s  3532kb   (3)
Array & String Juggling.... 1.026s  0.808s  3660kb   (5)
Clone null-object.......... 0.488s  0.273s  3340kb  (11) (12100000/s)
Clone object............... 0.909s  0.692s  3340kb   (6) (2602410/s)
Compile.................... 1.975s  1.760s  3504kb   (3) (41148 lines/s)
Compile & Exec............. 1.790s  1.577s  3520kb   (3) (1144313 lines/s)
GC......................... 1.269s  0.925s  3468kb   (4)
Matrix multiplication...... 0.862s  0.643s  5144kb   (6)
Loops Nested (local)....... 0.578s  0.362s  3324kb   (9) (416857184 iters/s)
Loops Nested (global)...... 0.899s  0.642s  3324kb   (6) (156877872 iters/s)
Loops Recursed............. 1.442s  1.225s  3324kb   (4) (3423922 iters/s)

Intel(R) C++ Compiler for 32-bit applications, Version 7.0 Build 20021021Z:
test                        total    user    mem   (runs)
Pike start overhead........ 0.191s  0.000s  3760kb  (25)
Ackermann.................. 1.068s  0.864s  3956kb   (5)
Array & String Juggling.... 1.007s  0.804s  3968kb   (5)
Clone null-object.......... 0.426s  0.237s  3728kb  (12) (15157895/s)
Clone object............... 0.816s  0.626s  3728kb   (7) (3356164/s)
Compile.................... 1.594s  1.405s  3916kb   (4) (68726 lines/s)
Compile & Exec............. 1.667s  1.397s  3884kb   (3) (1291790 lines/s)
GC......................... 1.068s  0.880s  3876kb   (5)
Matrix multiplication...... 0.746s  0.556s  5680kb   (7)
Loops Nested (local)....... 0.727s  0.534s  3760kb   (7) (219808464 iters/s)
Loops Nested (global)...... 1.083s  0.894s  3760kb   (5) (93832312 iters/s)
Loops Recursed............. 0.784s  0.594s  3760kb   (7) (12351014 iters/s)
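
Note the slowdowns in the two nested loop tests with the Intel build,
though: Loops Nested (local) goes from 0.578s to 0.727s, and Loops
Nested (global) from 0.899s to 1.083s.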
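
For anyone who wants the two tables side by side, here is a rough
sketch in Python that turns the "total" columns into a gcc/Intel time
ratio per test. The numbers are copied from the tables above; the
script itself, its names, and the choice of "total" as the column to
compare are my own assumptions and not part of the Pike benchmark
suite.

```python
# "total" times in seconds, copied from the two tables above:
# (gcc 2.95.4, Intel C++ 7.0). Assumes the column is directly comparable.
totals = {
    "Pike start overhead":     (0.228, 0.191),
    "Ackermann":               (1.669, 1.068),
    "Array & String Juggling": (1.026, 1.007),
    "Clone null-object":       (0.488, 0.426),
    "Clone object":            (0.909, 0.816),
    "Compile":                 (1.975, 1.594),
    "Compile & Exec":          (1.790, 1.667),
    "GC":                      (1.269, 1.068),
    "Matrix multiplication":   (0.862, 0.746),
    "Loops Nested (local)":    (0.578, 0.727),
    "Loops Nested (global)":   (0.899, 1.083),
    "Loops Recursed":          (1.442, 0.784),
}


def speedup(gcc_seconds, icc_seconds):
    """Ratio of gcc time to Intel time; above 1.0 means Intel was faster."""
    return gcc_seconds / icc_seconds


ratios = {name: speedup(*pair) for name, pair in totals.items()}

# Tests where the Intel build took longer than the gcc build.
slower = [name for name, ratio in ratios.items() if ratio < 1.0]

for name, ratio in ratios.items():
    marker = "  <-- slower with Intel" if ratio < 1.0 else ""
    print(f"{name:<26} {ratio:.2f}x{marker}")
```

A ratio above 1.0 means the Intel build finished that test faster; the
only entries that fall below 1.0 are the two nested loop tests.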