Maamoun TK maamoun.tk@googlemail.com writes:
On Mon, Nov 30, 2020 at 11:18 PM Maamoun TK maamoun.tk@googlemail.com wrote:
On POWER9 I get the following benchmark results with "./configure --enable-power-altivec":
chacha encrypt 763.57
chacha decrypt 780.64
regards, Mamone
I got this result using the ppc-chacha-2core branch on the same machine:
chacha encrypt 565.79
chacha decrypt 582.10
I've tried running the benchmark on gcc135, and that gives me much more consistent values than gcc112. The 2-way code (currently on the master branch) gives 686 MByte/s. The 4-way code you tried gives 958 MByte/s. I then replaced the inner loop with a version with better interleaving, written by Torbjörn Granlund (just pushed to the branch). That gives 1225 MByte/s.
And for reference, the plain C implementation gives 363 MByte/s.
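To put the gcc135 figures side by side, they can be expressed as speedups relative to the plain C implementation. The following is only an illustrative sketch: the dictionary keys and the helper name `speedups_vs_baseline` are made up for this example, and the only inputs are the four throughput numbers quoted in this message.

```python
# Throughput figures (MByte/s) quoted in this message for gcc135.
# The labels are illustrative names, not identifiers from the code base.
THROUGHPUT_MBYTE_PER_S = {
    "plain C": 363.0,
    "2-way": 686.0,
    "4-way": 958.0,
    "4-way, better interleaving": 1225.0,
}


def speedups_vs_baseline(throughput, baseline="plain C"):
    """Return each variant's throughput divided by the baseline's throughput."""
    base = throughput[baseline]
    return {name: value / base for name, value in throughput.items()}


if __name__ == "__main__":
    for name, ratio in speedups_vs_baseline(THROUGHPUT_MBYTE_PER_S).items():
        print(f"{name:28s} {ratio:.2f}x")
```

These ratios only compare the numbers reported here for one machine; they say nothing about other machines, such as the POWER9 results quoted earlier.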
Regards, /Niels