Is anyone seriously attached to the current pike -x benchmark format/output?
I have done some work to
a: only output relevant information
b: actually output relevant information.
Additionally, I have added a JSON output mode and a way to run a benchmark with an old result as the baseline (and then report the differences, which is useful when you are trying to optimize something).
What is left of the output is somewhat sparse, though.
Perhaps the most important missing part is the memory usage, but that figure was rather random before, since it measured the memory use _after_ the benchmark had run.
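To illustrate why a reading taken after the run says little, here is a minimal sketch in Python (not Pike, and not how the benchmark tool itself accounts for memory). It uses the standard tracemalloc module; the workload function is a made-up stand-in for a benchmark body. By the time the workload returns, its temporary data has been freed, so the "after" figure is tiny while the peak reflects what the run actually needed.

```python
import tracemalloc


def workload() -> int:
    """Hypothetical benchmark body: allocates about 10 MB of temporary data."""
    data = [bytes(1024) for _ in range(10_000)]
    return len(data)  # 'data' is released when the function returns


tracemalloc.start()
workload()
# current = memory still allocated now, peak = highest point during the run
after, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"after run: {after} bytes, peak during run: {peak} bytes")
```

A peak (or during-run) measurement of this kind is one possible replacement for the old figure; whether it fits the Pike benchmark framework is an open question here.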
# leopard$ make benchmark BENCHARGS="-c $PWD/compare"
(...)
----------------------------------------------------
Test                                  Result  Change
----------------------------------------------------
Ackermann                              38M/s   -0.1%
Append array                           23M/s   -0.1%
Append mapping (+)                   4902k/s    0.2%
Append mapping (|)                   5571k/s    6.3%
Append multiset                       105k/s   -3.3%
Array & String Juggling               110k/s   -4.8%
Array Copy                             41M/s    2.0%
Array Zero                            285k/s    0.0%
Binary Trees                          994k/s    3.3%
Clone null-object                      11M/s    1.1%
Clone object                         5627k/s   -9.4%
Compile                          97k lines/s   -0.1%
Compile & Exec                   92k lines/s   -0.3%
Foreach (arr,global)                   71M/s   -2.6%
Foreach (arr,local)                   174M/s    1.4%
Foreach (arr;local;global)             37M/s   -1.1%
Foreach (arr;local;local)              52M/s    1.0%
GC                                    1197/s    2.2%
Insert in array                        48M/s    0.4%
Insert in mapping                      10M/s   -2.2%
Insert in multiset                   3868k/s    2.4%
Loops Nested (global)                  31M/s    2.9%
Loops Nested (local)                  181M/s   -0.4%
Loops Nested (local,var)              161M/s   -2.0%
Loops Recursed                         18M/s    1.1%
Matrix multiplication (100x100)    2.23 GF/s    1.9%
Read binary INT128                    197k/s   -0.9%
Read binary INT16                      19M/s   20.1%
Read binary INT32                      17M/s   -2.0%
Replace (parallel)                    9831/s    5.0%
Replace (serial)                       13k/s   -2.6%
Sort equal integers                    59M/s   -0.9%
Sort ordered integers                  88M/s   -0.1%
Sort unordered integers                14M/s    3.9%
Sort unordered objects                569k/s   -2.2%
String Creation                      3053k/s   -8.1%
String Creation (existing)             10M/s    1.9%
String Creation (wide)                678k/s    1.9%
Tag removal u. Parser.HTML           4929k/s    3.5%
Tag removal u. Regexp.PCRE           2124k/s    1.4%
Tag removal u. array_sscanf          6150k/s   -0.2%
Tag removal u. division              3113k/s   -1.5%
Tag removal u. search                5218k/s   -2.9%
Tag removal using a loop              895k/s   -2.3%
Tag removal using sscanf              718k/s    2.7%
Upper/lower case shift 0              137M/s   -0.3%
Upper/lower case shift 1               60M/s   -1.4%
call_out handling                     177k/s    0.2%
call_out handling (with id)          4560k/s   -2.4%
----------------------------------------------------
                                                0.3%
----------------------------------------------------
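As a rough illustration of how a Change column like the one above could be derived from a saved baseline, here is a small Python sketch. Everything about the data layout is an assumption: the thread does not say how the baseline is stored, so the sketch simply maps each test name to a result object with an n_over_time field, and the numbers are made up to resemble two of the rows. The sign convention (positive means faster than the baseline) is also a guess.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def compare(baseline: dict, current: dict) -> dict:
    """Map every test present in both runs to its percent change."""
    return {
        name: percent_change(baseline[name]["n_over_time"], result["n_over_time"])
        for name, result in current.items()
        if name in baseline and baseline[name]["n_over_time"] > 0
    }


# Hypothetical results, keyed by test name (layout assumed, values invented).
baseline = {"Ackermann": {"n_over_time": 38_000_000}, "GC": {"n_over_time": 1171}}
current = {"Ackermann": {"n_over_time": 37_962_000}, "GC": {"n_over_time": 1197}}

changes = compare(baseline, current)
for name, change in sorted(changes.items()):
    print(f"{name:<32}{change:>7.1f}%")
```

The 0.3% in the footer is consistent with a plain arithmetic mean of the per-test changes (they sum to 12.6 over 49 tests), though the message does not say how that figure is computed.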
I have some tools for clusterized benchmarks that rely on the Shoot.pmod "raw" output format - but that's probably untouched?
No, sorry, the raw output and API are totally different.
Specifically:
Pike v7.9 release 11 running Hilfe v3.5 (Incremental Pike Frontend)
Tools.Shoot.run_sub( Tools.Shoot["MatrixMult"](), 3, 0.0 );
Compiler Warning: 1: Returning a void expression. Converted to zero.
{"readable":"2.22 GF/s","n_over_time":2218219278,"n":6800000000,"loops":33,"time":3.065522}
(1) Result: 0
So... The individual tests output JSON, and starting them is done differently.
Additionally, the actual 'n' and other fields are drastically different for most tests: they now tend to reflect the number of whatever the test is doing, not the number of top-level loops.
As an example, the matrix multiply returns the number of flops.
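For anyone updating scripts that consume this output, a minimal Python sketch of reading one result line follows. The field meanings are inferred from the sample above rather than from any documentation: n looks like the total amount of work (flops in the matrix case), time like seconds, and n_over_time like their ratio. The sketch only checks that this reading is consistent with the numbers shown.

```python
import json
import math

# The JSON line printed by the MatrixMult run quoted above.
raw = ('{"readable":"2.22 GF/s","n_over_time":2218219278,'
       '"n":6800000000,"loops":33,"time":3.065522}')

result = json.loads(raw)

# Assumption: n_over_time is n divided by time (work units per second).
rate = result["n"] / result["time"]
consistent = math.isclose(rate, result["n_over_time"], rel_tol=1e-6)

print(f"{result['readable']}: computed {rate / 1e9:.2f} G/s, consistent={consistent}")
```

Since n now counts test-specific work rather than top-level loops, comparing n_over_time values only makes sense between runs of the same test.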
The time it takes to update my scripts is probably less than what you spent writing that message, so go ahead.
How about also backporting to 7.8? That would make it a lot easier to compare the two branches.
pike-devel@lists.lysator.liu.se