Intel C++ / gcc 2.95.4 comparisons

15 Jan 2003


      I thought these results were interesting. Note that compiling with
gcc-3.2 would probably generate better code, especially with
arch-dependent data. The Intel-compiled binary contains arch-specific
code for multiple platforms (i.e. like -march, but for several platforms
at once, which is rather nice). I'm also using multi-file
interprocedural optimizations (which make linking Pike take quite a
long time). The arch is Athlon; a Pentium would probably get even
better results.

gcc version 2.95.4 20011002 (Debian prerelease):
test                        total    user    mem   (runs)
Pike start overhead........ 0.228s  0.001s  3352kb  (22)
Ackermann.................. 1.669s  1.453s  3532kb   (3)
Array & String Juggling.... 1.026s  0.808s  3660kb   (5)
Clone null-object.......... 0.488s  0.273s  3340kb  (11) (12100000/s)
Clone object............... 0.909s  0.692s  3340kb   (6) (2602410/s)
Compile.................... 1.975s  1.760s  3504kb   (3) (41148 lines/s)
Compile & Exec............. 1.790s  1.577s  3520kb   (3) (1144313 lines/s)
GC......................... 1.269s  0.925s  3468kb   (4)
Matrix multiplication...... 0.862s  0.643s  5144kb   (6)
Loops Nested (local)....... 0.578s  0.362s  3324kb   (9) (416857184 iters/s)
Loops Nested (global)...... 0.899s  0.642s  3324kb   (6) (156877872 iters/s)
Loops Recursed............. 1.442s  1.225s  3324kb   (4) (3423922 iters/s)

Intel(R) C++ Compiler for 32-bit applications, Version 7.0 Build 20021021Z:
test                        total    user    mem   (runs)
Pike start overhead........ 0.191s  0.000s  3760kb  (25)
Ackermann.................. 1.068s  0.864s  3956kb   (5)
Array & String Juggling.... 1.007s  0.804s  3968kb   (5)
Clone null-object.......... 0.426s  0.237s  3728kb  (12) (15157895/s)
Clone object............... 0.816s  0.626s  3728kb   (7) (3356164/s)
Compile.................... 1.594s  1.405s  3916kb   (4) (68726 lines/s)
Compile & Exec............. 1.667s  1.397s  3884kb   (3) (1291790 lines/s)
GC......................... 1.068s  0.880s  3876kb   (5)
Matrix multiplication...... 0.746s  0.556s  5680kb   (7)
Loops Nested (local)....... 0.727s  0.534s  3760kb   (7) (219808464 iters/s)
Loops Nested (global)...... 1.083s  0.894s  3760kb   (5) (93832312 iters/s)
Loops Recursed............. 0.784s  0.594s  3760kb   (7) (12351014 iters/s)

Note, though, that the Intel build is slower in the two nested loop
tests (0.727s vs 0.578s local, 1.083s vs 0.899s global).
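
To make the two tables easier to compare, here is a minimal Python sketch that computes the gcc/Intel ratio for each test. It uses the "total" column from the results above and assumes that column is a per-run time, which the tables do not state. The names TOTALS, speedup and slower_with_icc are made up for this illustration and are not part of Pike or its benchmark suite.

```python
# "total" column in seconds, copied from the two result tables:
# (gcc 2.95.4, Intel C++ 7.0). Assumed to be per-run times.
TOTALS = {
    "Pike start overhead": (0.228, 0.191),
    "Ackermann": (1.669, 1.068),
    "Array & String Juggling": (1.026, 1.007),
    "Clone null-object": (0.488, 0.426),
    "Clone object": (0.909, 0.816),
    "Compile": (1.975, 1.594),
    "Compile & Exec": (1.790, 1.667),
    "GC": (1.269, 1.068),
    "Matrix multiplication": (0.862, 0.746),
    "Loops Nested (local)": (0.578, 0.727),
    "Loops Nested (global)": (0.899, 1.083),
    "Loops Recursed": (1.442, 0.784),
}


def speedup(gcc_s, icc_s):
    """Return gcc time divided by Intel time; above 1 means Intel was faster."""
    return gcc_s / icc_s


def slower_with_icc(totals):
    """Return the names of tests where the Intel build took longer than gcc."""
    return [name for name, (gcc_s, icc_s) in totals.items() if icc_s > gcc_s]


if __name__ == "__main__":
    for name, (gcc_s, icc_s) in TOTALS.items():
        print(f"{name:<28} {speedup(gcc_s, icc_s):.2f}x")
    print("Slower with Intel:", ", ".join(slower_with_icc(TOTALS)))
```

On these numbers the only tests where the Intel build comes out slower are the two nested loop tests, and the largest gain is in Loops Recursed.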
