Well, I think I have added the first opteron to the build farm. :)
you have been to the CeBit? and you didn't mention it? we could have met. i was there for 4 days. (ok, maybe *I* should have mentioned that, but who would expect you to travel such a distance just to look at computers)
greetings, martin.
Such distance = 12 hours.
/ Martin Nilsson (har bott i google)
Previous text:
2003-04-07 12:30: Subject: Re: First Opteron?
you have been to the CeBit? and you didn't mention it? we could have met. i was there for 4 days. (ok, maybe *I* should have mentioned that, but who would expect you to travel such a distance just to look at computers)
greetings, martin.
/ Brevbäraren
exactly, compared to 3 hours for me. ok, instead of sulking over lost opportunities let's all try to remember to announce trips like these :-)
greetings, martin.
On Mon, Apr 07, 2003 at 07:40:01AM +0200, David Hedbor @ Pike developers forum scribbled:
Well, I think I have added the first opteron to the build farm. :)
How does the beast work? :)
It seems to work fine. I haven't done a whole lot of stuff with it though. As a comparison, it ran the 7.4 build in 6 minutes, while my Athlon 1.2 ran it in 24m. Of course, it's dual CPU, so taking that into account it's still more than 2x as fast (per CPU), since a large part of the compile isn't dual-CPU friendly (most of the testsuite and configure). Of course, it also has fewer modules, so that should be taken into account too, I suppose.
The box is a dual 1.6 GHz Opteron with 6 GB of memory. It's unfortunately not mine (rather, it's a review box that sits on a friend's DSL line).
Anyone have any fun benchmarks to run on it?
/ David Hedbor
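The per-CPU reasoning in the message above can be sketched numerically. This is only an illustration under assumptions that are not in the thread: it applies Amdahl's law, the `serial_fraction` parameter (the share of the build that cannot use the second CPU) is a made-up knob, and the function name `per_cpu_speedup` is hypothetical. Only the 24-minute and 6-minute build times come from the message, and the sketch ignores the differences in clock speed and installed modules.

```python
def per_cpu_speedup(old_minutes, new_minutes, cpus, serial_fraction):
    """Estimate how much faster one CPU of the new box is than the old box.

    serial_fraction is an assumed share (0.0 to 1.0) of the build that
    runs on a single CPU no matter how many are available.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    # Amdahl's law: multi-CPU time as a fraction of single-CPU time.
    parallel_scale = serial_fraction + (1.0 - serial_fraction) / cpus
    # Estimated build time if the new box used only one CPU.
    single_cpu_minutes = new_minutes / parallel_scale
    return old_minutes / single_cpu_minutes


if __name__ == "__main__":
    # 24 minutes on the Athlon, 6 minutes on the dual Opteron (from the thread).
    for serial in (0.0, 0.5, 1.0):
        estimate = per_cpu_speedup(24, 6, cpus=2, serial_fraction=serial)
        print(f"serial fraction {serial:.1f}: about {estimate:.1f}x per CPU")
```

With no serial work the estimate is 2x per CPU, and it rises toward 4x as more of the build is serial, which is consistent with the "more than 2x" reading.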
Previous text:
2003-04-07 12:10: Subject: Re: First Opteron?
On Mon, Apr 07, 2003 at 07:40:01AM +0200, David Hedbor @ Pike developers forum scribbled:
Well, I think I have added the first opteron to the build farm. :)
How does the beast work? :)
marek
/ Brevbäraren
In other words you really have no clue how much faster it is, since 5GB (or whatever) memory is probably the most significant difference.
/ Martin Nilsson (har bott i google)
Previous text:
2003-04-07 20:25: Subject: Re: First Opteron?
On Mon, Apr 07, 2003 at 07:40:01AM +0200, David Hedbor @ Pike developers forum scribbled:
Well, I think I have added the first opteron to the build farm. :)
How does the beast work? :)
It seems to work fine. I haven't done a whole lot of stuff with it though. As a comparison, it ran the 7.4 build in 6 minutes, while my Athlon 1.2 ran it in 24m. Of course, it's dual CPU, so taking that into account it's still more than 2x as fast (per CPU), since a large part of the compile isn't dual-CPU friendly (most of the testsuite and configure). Of course, it also has fewer modules, so that should be taken into account too, I suppose.
The box is a dual 1.6 GHz Opteron with 6 GB of memory. It's unfortunately not mine (rather, it's a review box that sits on a friend's DSL line).
Anyone have any fun benchmarks to run on it?
/ David Hedbor
Well, memory hardly makes a difference for compilation or benchmarks, since the comparison boxes have 1 GB or more memory with plenty free.
My conclusion is that for everyday tasks, which don't benefit greatly from the 1 MB cache, it's of similar speed to (somewhat faster than) an Athlon MP at the same clock frequency. This is pretty decent since it IS running in 64-bit mode rather than 32-bit. Also, mind you, this is without ANY architecture-specific optimizations (unless they are enabled by default in the compiler).
For tasks where cache memory is more important, such as 3D rendering, compilation etc., it SHOULD be faster. However, I can't really compare compilation of Pike since it depends a lot on which libraries I have installed. I should compile some other big project without external dependencies and compare that.
/ David Hedbor
Previous text:
2003-04-07 20:30: Subject: Re: First Opteron?
In other words you really have no clue how much faster it is, since 5GB (or whatever) memory is probably the most significant difference.
/ Martin Nilsson (har bott i google)
Talking 64-bit. Is Pike already supporting that, or is it planned?
/ edde (nu med båge!!!)
Previous text:
2003-04-07 20:47: Subject: Re: First Opteron?
Well, memory hardly makes a difference for compilation or benchmarks, since the comparison boxes have 1 GB or more memory with plenty free.
My conclusion is that for everyday tasks, which don't benefit greatly from the 1 MB cache, it's of similar speed to (somewhat faster than) an Athlon MP at the same clock frequency. This is pretty decent since it IS running in 64-bit mode rather than 32-bit. Also, mind you, this is without ANY architecture-specific optimizations (unless they are enabled by default in the compiler).
For tasks where cache memory is more important, such as 3D rendering, compilation etc., it SHOULD be faster. However, I can't really compare compilation of Pike since it depends a lot on which libraries I have installed. I should compile some other big project without external dependencies and compare that.
/ David Hedbor
64-bit, you mean???
I was more curious whether there are any plans to support the Opteron/Athlon XP-64 (or whatever it will be called) x86-64 features. That is, of course, if it will be of any use for Pike to do so.
/ edde (nu med båge!!!)
Previous text:
2003-04-08 05:45: Subject: Re: First Opteron?
Pike has worked on 32 bit computers for many years (think Alpha for example).
/ David Hedbor
Right.
/ David Hedbor
Previous text:
2003-04-08 05:50: Subject: Re: First Opteron?
64-bit, you mean???
I was more curious whether there are any plans to support the Opteron/Athlon XP-64 (or whatever it will be called) x86-64 features. That is, of course, if it will be of any use for Pike to do so.
/ edde (nu med båge!!!)
I was more curious whether there are any plans to support the Opteron/Athlon XP-64 (or whatever it will be called) x86-64 features. That is, of course, if it will be of any use for Pike to do so.
Since I have made the first compilation on that arch, no. Also I don't know what support gcc has at this point.
/ David Hedbor
Previous text:
2003-04-08 05:50: Subject: Re: First Opteron?
64-bit, you mean???
I was more curious whether there are any plans to support the Opteron/Athlon XP-64 (or whatever it will be called) x86-64 features. That is, of course, if it will be of any use for Pike to do so.
/ edde (nu med båge!!!)
Dunno, but I saw that Mandrake already released an Opteron version of their Linux distro...
/ edde (nu med båge!!!)
Previous text:
2003-04-08 05:53: Subject: Re: First Opteron?
I was more curious whether there are any plans to support the Opteron/Athlon XP-64 (or whatever it will be called) x86-64 features. That is, of course, if it will be of any use for Pike to do so.
Since I have made the first compilation on that arch, no. Also I don't know what support gcc has at this point.
/ David Hedbor
Pike supports 64-bit. Pike has always supported 64-bit pointers, and supports 64-bit floats since way back. Only recently though it's begun to support 64-bit int, and int size > pointer size.
/ Mirar
Previous text:
2003-04-08 04:38: Subject: Re: First Opteron?
Talking 64-bit. Is Pike already supporting that, or is it planned?
/ edde (nu med båge!!!)
How does it compare in the benchmarks?
I see it has no special compilation flags, I assume it compiles for 32 bit compat mode then? (Can gcc make Opteron 64-bit code?)
/ Mirar
Previous text:
2003-04-07 07:39: Subject: First Opteron?
Well, I think I have added the first opteron to the build farm. :)
/ David Hedbor
As far as I know, it defaults to 64 bit rather. Cut from configure:
checking size of char *... 8
checking for int... yes
checking size of int... 4
checking for short... yes
checking size of short... 2
checking for float... yes
checking size of float... 4
checking for double... yes
checking size of double... 8
checking for long double... yes
checking size of long double... 16
checking for long... yes
checking size of long... 8
checking for long long... yes
checking size of long long... 8
As for benchmarks:
test                        total  user   mem    (runs)
Pike start overhead........ 0.023s 0.000s 3560kb (25)
Ackermann.................. 0.605s 0.584s 3932kb (9)
Append array............... 0.457s 0.431s 3548kb (11) (1160338/s)
Append mapping............. 2.409s 2.380s 3644kb (3) (4202/s)
Append multiset............ 0.431s 0.406s 3576kb (12) (24641/s)
Array & String Juggling.... 0.476s 0.448s 4164kb (11)
Read binary INT16.......... 0.272s 0.249s 4092kb (19) (4008439/s)
Read binary INT32.......... 1.350s 1.320s 4204kb (4) (378788/s)
Read binary INT128......... 0.501s 0.476s 4652kb (10) (21008/s)
Clone null-object.......... 0.186s 0.163s 3492kb (25) (1842752/s)
Clone object............... 0.401s 0.378s 3488kb (13) (792683/s)
Compile.................... 0.913s 0.892s 6776kb (6) (27073 lines/s)
Compile & Exec............. 0.726s 0.706s 11404kb (7) (852186 lines/s)
GC......................... 0.453s 0.429s 3720kb (12)
Insert in mapping.......... 0.422s 0.399s 3560kb (12) (1252610/s)
Insert in multiset......... 0.688s 0.665s 3560kb (8) (751880/s)
Matrix multiplication...... 0.368s 0.347s 5396kb (14)
Loops Nested (local)....... 0.419s 0.383s 3556kb (12) (43862003 iters/s)
Loops Nested (global)...... 0.691s 0.670s 3560kb (8) (25040621 iters/s)
Loops Recursed............. 0.446s 0.425s 3560kb (12) (2467238 iters/s)
By the way, I haven't explored possible additional flags for x86-64 or even athlon specific optimizations (such as march= and -ftune and such things). Also for completeness, here's the same benchmark on a dual Athlon-MP 2000+ (which runs just over 1.6 GHz). It's missing the 'read binary' tests, I assume that's because it doesn't have crypto:
test                        total  user   mem    (runs)
Pike start overhead........ 0.084s 0.002s 3380kb (25)
Ackermann.................. 1.214s 1.132s 3588kb (5)
Append array............... 0.586s 0.494s 3336kb (9) (1011236/s)
Append mapping............. 2.949s 2.870s 3436kb (2) (3484/s)
Append multiset............ 0.501s 0.417s 3400kb (11) (23965/s)
Array & String Juggling.... 0.668s 0.579s 3712kb (8)
Clone null-object.......... 0.257s 0.173s 3392kb (20) (1739130/s)
Clone object............... 0.534s 0.452s 3392kb (10) (663717/s)
Compile.................... 1.313s 1.227s 5136kb (4) (19666 lines/s)
Compile & Exec............. 1.271s 1.186s 3592kb (5) (507083 lines/s)
GC......................... 0.749s 0.666s 3352kb (7)
Insert in mapping.......... 0.486s 0.403s 3380kb (11) (1241535/s)
Insert in multiset......... 0.935s 0.852s 3384kb (6) (587084/s)
Matrix multiplication...... 0.490s 0.405s 5196kb (11)
Loops Nested (local)....... 0.344s 0.261s 3380kb (15) (64362724 iters/s)
Loops Nested (global)...... 0.685s 0.600s 3380kb (8) (27962026 iters/s)
Loops Recursed............. 1.621s 1.535s 3380kb (4) (683111 iters/s)
/ David Hedbor
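To make the two benchmark runs above easier to compare, here is a small sketch that divides the Athlon MP "total" time by the Opteron "total" time for a handful of tests. The timings are copied from the two tables in the message; the names `TOTALS` and `ratios` are made up for this illustration, and the selection of rows is arbitrary. A ratio above 1 means the Opteron run was faster on that test.

```python
# (test name, Opteron total seconds, Athlon MP total seconds),
# copied from the "total" column of the two benchmark runs above.
TOTALS = [
    ("Ackermann", 0.605, 1.214),
    ("Compile", 0.913, 1.313),
    ("GC", 0.453, 0.749),
    ("Loops Nested (local)", 0.419, 0.344),
    ("Loops Recursed", 0.446, 1.621),
]


def ratios(rows):
    """Return {test name: Athlon time / Opteron time} for each row."""
    return {name: athlon / opteron for name, opteron, athlon in rows}


if __name__ == "__main__":
    for name, ratio in ratios(TOTALS).items():
        print(f"{name:<24} {ratio:.2f}x")
```

These are single runs on boxes with different installed modules, so the ratios are only a rough guide.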
Previous text:
2003-04-07 12:50: Subject: First Opteron?
How does it compare in the benchmarks?
I see it has no special compilation flags, I assume it compiles for 32 bit compat mode then? (Can gcc make Opteron 64-bit code?)
/ Mirar
checking size of char *... 8
checking size of int... 4
checking size of long... 8
Looks like LP64.
/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)
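The LP64 reading above follows directly from the three sizes quoted from configure. As a rough sketch, the mapping from type sizes to the usual data-model names can be written as a lookup table; the function name `data_model` and the table layout are hypothetical, and only the four common models are listed.

```python
import ctypes

# (sizeof int, sizeof long, sizeof pointer) -> conventional data-model name.
MODELS = {
    (4, 4, 4): "ILP32",
    (4, 8, 8): "LP64",
    (4, 4, 8): "LLP64",
    (8, 8, 8): "ILP64",
}


def data_model(int_size, long_size, pointer_size):
    """Name the C data model for the given type sizes, or 'unknown'."""
    return MODELS.get((int_size, long_size, pointer_size), "unknown")


if __name__ == "__main__":
    # Sizes reported by configure on the Opteron box in this thread.
    print("Opteron box:", data_model(4, 8, 8))
    # Sizes on whatever machine runs this script.
    print("This machine:", data_model(
        ctypes.sizeof(ctypes.c_int),
        ctypes.sizeof(ctypes.c_long),
        ctypes.sizeof(ctypes.c_void_p),
    ))
```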
Previous text:
2003-04-07 20:40: Subject: First Opteron?
As far as I know, it defaults to 64 bit rather. Cut from configure:
checking size of char *... 8
checking for int... yes
checking size of int... 4
checking for short... yes
checking size of short... 2
checking for float... yes
checking size of float... 4
checking for double... yes
checking size of double... 8
checking for long double... yes
checking size of long double... 16
checking for long... yes
checking size of long... 8
checking for long long... yes
checking size of long long... 8
As for benchmarks:
test                        total  user   mem    (runs)
Pike start overhead........ 0.023s 0.000s 3560kb (25)
Ackermann.................. 0.605s 0.584s 3932kb (9)
Append array............... 0.457s 0.431s 3548kb (11) (1160338/s)
Append mapping............. 2.409s 2.380s 3644kb (3) (4202/s)
Append multiset............ 0.431s 0.406s 3576kb (12) (24641/s)
Array & String Juggling.... 0.476s 0.448s 4164kb (11)
Read binary INT16.......... 0.272s 0.249s 4092kb (19) (4008439/s)
Read binary INT32.......... 1.350s 1.320s 4204kb (4) (378788/s)
Read binary INT128......... 0.501s 0.476s 4652kb (10) (21008/s)
Clone null-object.......... 0.186s 0.163s 3492kb (25) (1842752/s)
Clone object............... 0.401s 0.378s 3488kb (13) (792683/s)
Compile.................... 0.913s 0.892s 6776kb (6) (27073 lines/s)
Compile & Exec............. 0.726s 0.706s 11404kb (7) (852186 lines/s)
GC......................... 0.453s 0.429s 3720kb (12)
Insert in mapping.......... 0.422s 0.399s 3560kb (12) (1252610/s)
Insert in multiset......... 0.688s 0.665s 3560kb (8) (751880/s)
Matrix multiplication...... 0.368s 0.347s 5396kb (14)
Loops Nested (local)....... 0.419s 0.383s 3556kb (12) (43862003 iters/s)
Loops Nested (global)...... 0.691s 0.670s 3560kb (8) (25040621 iters/s)
Loops Recursed............. 0.446s 0.425s 3560kb (12) (2467238 iters/s)
By the way, I haven't explored possible additional flags for x86-64 or even athlon specific optimizations (such as march= and -ftune and such things). Also for completeness, here's the same benchmark on a dual Athlon-MP 2000+ (which runs just over 1.6 GHz). It's missing the 'read binary' tests, I assume that's because it doesn't have crypto:
test                        total  user   mem    (runs)
Pike start overhead........ 0.084s 0.002s 3380kb (25)
Ackermann.................. 1.214s 1.132s 3588kb (5)
Append array............... 0.586s 0.494s 3336kb (9) (1011236/s)
Append mapping............. 2.949s 2.870s 3436kb (2) (3484/s)
Append multiset............ 0.501s 0.417s 3400kb (11) (23965/s)
Array & String Juggling.... 0.668s 0.579s 3712kb (8)
Clone null-object.......... 0.257s 0.173s 3392kb (20) (1739130/s)
Clone object............... 0.534s 0.452s 3392kb (10) (663717/s)
Compile.................... 1.313s 1.227s 5136kb (4) (19666 lines/s)
Compile & Exec............. 1.271s 1.186s 3592kb (5) (507083 lines/s)
GC......................... 0.749s 0.666s 3352kb (7)
Insert in mapping.......... 0.486s 0.403s 3380kb (11) (1241535/s)
Insert in multiset......... 0.935s 0.852s 3384kb (6) (587084/s)
Matrix multiplication...... 0.490s 0.405s 5196kb (11)
Loops Nested (local)....... 0.344s 0.261s 3380kb (15) (64362724 iters/s)
Loops Nested (global)...... 0.685s 0.600s 3380kb (8) (27962026 iters/s)
Loops Recursed............. 1.621s 1.535s 3380kb (4) (683111 iters/s)
/ David Hedbor
Loops Nested (local)....... 0.419s 0.383s 3556kb (12) (43862003 iters/s)
Loops Nested (global)...... 0.691s 0.670s 3560kb (8) (25040621 iters/s)
Loops Recursed............. 0.446s 0.425s 3560kb (12) (2467238 iters/s)
The last one is impressive, the other scores are "just good", I assume they all are around the same level as a Barton at 1600MHz (with the increased cache, which is probably where the main difference is).
By the way, I haven't explored possible additional flags for x86-64 or even athlon specific optimizations (such as march= and -ftune and such things). Also for completeness, here's the same benchmark on a dual Athlon-MP 2000+ (which runs just over 1.6 GHz). It's missing the 'read binary' tests, I assume that's because it doesn't have crypto:
Yes, it needs Crypto.randomness.pike_random(), which seems to have disappeared. I can't run those tests either. Why is that gone? :( It was hardly dependent on any crypto stuff.
/ Mirar
Previous text:
2003-04-07 20:40: Subject: First Opteron?
As far as I know, it defaults to 64 bit rather. Cut from configure:
checking size of char *... 8
checking for int... yes
checking size of int... 4
checking for short... yes
checking size of short... 2
checking for float... yes
checking size of float... 4
checking for double... yes
checking size of double... 8
checking for long double... yes
checking size of long double... 16
checking for long... yes
checking size of long... 8
checking for long long... yes
checking size of long long... 8
As for benchmarks:
test                        total  user   mem    (runs)
Pike start overhead........ 0.023s 0.000s 3560kb (25)
Ackermann.................. 0.605s 0.584s 3932kb (9)
Append array............... 0.457s 0.431s 3548kb (11) (1160338/s)
Append mapping............. 2.409s 2.380s 3644kb (3) (4202/s)
Append multiset............ 0.431s 0.406s 3576kb (12) (24641/s)
Array & String Juggling.... 0.476s 0.448s 4164kb (11)
Read binary INT16.......... 0.272s 0.249s 4092kb (19) (4008439/s)
Read binary INT32.......... 1.350s 1.320s 4204kb (4) (378788/s)
Read binary INT128......... 0.501s 0.476s 4652kb (10) (21008/s)
Clone null-object.......... 0.186s 0.163s 3492kb (25) (1842752/s)
Clone object............... 0.401s 0.378s 3488kb (13) (792683/s)
Compile.................... 0.913s 0.892s 6776kb (6) (27073 lines/s)
Compile & Exec............. 0.726s 0.706s 11404kb (7) (852186 lines/s)
GC......................... 0.453s 0.429s 3720kb (12)
Insert in mapping.......... 0.422s 0.399s 3560kb (12) (1252610/s)
Insert in multiset......... 0.688s 0.665s 3560kb (8) (751880/s)
Matrix multiplication...... 0.368s 0.347s 5396kb (14)
Loops Nested (local)....... 0.419s 0.383s 3556kb (12) (43862003 iters/s)
Loops Nested (global)...... 0.691s 0.670s 3560kb (8) (25040621 iters/s)
Loops Recursed............. 0.446s 0.425s 3560kb (12) (2467238 iters/s)
By the way, I haven't explored possible additional flags for x86-64 or even athlon specific optimizations (such as march= and -ftune and such things). Also for completeness, here's the same benchmark on a dual Athlon-MP 2000+ (which runs just over 1.6 GHz). It's missing the 'read binary' tests, I assume that's because it doesn't have crypto:
test                        total  user   mem    (runs)
Pike start overhead........ 0.084s 0.002s 3380kb (25)
Ackermann.................. 1.214s 1.132s 3588kb (5)
Append array............... 0.586s 0.494s 3336kb (9) (1011236/s)
Append mapping............. 2.949s 2.870s 3436kb (2) (3484/s)
Append multiset............ 0.501s 0.417s 3400kb (11) (23965/s)
Array & String Juggling.... 0.668s 0.579s 3712kb (8)
Clone null-object.......... 0.257s 0.173s 3392kb (20) (1739130/s)
Clone object............... 0.534s 0.452s 3392kb (10) (663717/s)
Compile.................... 1.313s 1.227s 5136kb (4) (19666 lines/s)
Compile & Exec............. 1.271s 1.186s 3592kb (5) (507083 lines/s)
GC......................... 0.749s 0.666s 3352kb (7)
Insert in mapping.......... 0.486s 0.403s 3380kb (11) (1241535/s)
Insert in multiset......... 0.935s 0.852s 3384kb (6) (587084/s)
Matrix multiplication...... 0.490s 0.405s 5196kb (11)
Loops Nested (local)....... 0.344s 0.261s 3380kb (15) (64362724 iters/s)
Loops Nested (global)...... 0.685s 0.600s 3380kb (8) (27962026 iters/s)
Loops Recursed............. 1.621s 1.535s 3380kb (4) (683111 iters/s)
/ David Hedbor
It is not gone for me.
Pike v7.5 release 5 running Hilfe v3.5 (Incremental Pike Frontend)
Crypto.randomness.pike_random()->read(5);
(1) Result: "\2Ûìów"
Though there is no use in calling pike_random since it's only the random_string function wrapped up in an object that looks like the real random-objects.
/ Martin Nilsson (har bott i google)
Previous text:
2003-04-07 21:31: Subject: First Opteron?
Loops Nested (local)....... 0.419s 0.383s 3556kb (12) (43862003 iters/s)
Loops Nested (global)...... 0.691s 0.670s 3560kb (8) (25040621 iters/s)
Loops Recursed............. 0.446s 0.425s 3560kb (12) (2467238 iters/s)
The last one is impressive, the other scores are "just good", I assume they all are around the same level as a Barton at 1600MHz (with the increased cache, which is probably where the main difference is).
By the way, I haven't explored possible additional flags for x86-64 or even athlon specific optimizations (such as march= and -ftune and such things). Also for completeness, here's the same benchmark on a dual Athlon-MP 2000+ (which runs just over 1.6 GHz). It's missing the 'read binary' tests, I assume that's because it doesn't have crypto:
Yes, it needs Crypto.randomness.pike_random(), which seems to have disappeared. I can't run those tests either. Why is that gone? :( It was hardly dependent on any crypto stuff.
/ Mirar
Weird.
Oh. I didn't know that. Maybe the benchmark tests should be changed then. :)
/ Mirar
Previous text:
2003-04-07 21:37: Subject: First Opteron?
It is not gone for me.
Pike v7.5 release 5 running Hilfe v3.5 (Incremental Pike Frontend)
Crypto.randomness.pike_random()->read(5);
(1) Result: "\2Ûìów"
Though there is no use in calling pike_random since it's only the random_string function wrapped up in an object that looks like the real random-objects.
/ Martin Nilsson (har bott i google)
pike-devel@lists.lysator.liu.se