Maamoun TK <maamoun.tk@googlemail.com> writes:
On POWER9 I got the following benchmark result:
./configure:
  chacha encrypt 308.58
  chacha decrypt 325.87

./configure --enable-power-altivec "master branch":
  chacha encrypt 342.15
  chacha decrypt 356.24

./configure --enable-power-altivec "ppc-chacha-2core":
  chacha encrypt 648.97
  chacha decrypt 648.00
It's gotten better with every further optimization on the core, great work.
Nice. So almost a factor of 2 speedup from doing 2 blocks in parallel. I wonder if one can get close to another factor of two by going to 4 blocks. I hope to get the time to try that out; it should be fairly easy. (And if that does work out fine, maybe the code to do only 2 blocks could be removed.)
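For readers unfamiliar with the idea, here is a minimal scalar C sketch of what "2 blocks in parallel" means. It is not the PowerPC assembly from the ppc-chacha-2core branch. The names sketch_core_1way and sketch_core_2way are hypothetical, and the state layout (a 64-bit block counter in words 12 and 13) is an assumption. The point is only that two blocks with consecutive counters run through the same rounds independently, so their quarter rounds can be interleaved to keep more of the machine busy.

```c
#include <stdint.h>
#include <string.h>

#define ROTL32(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

/* One ChaCha quarter round on four state words. */
#define QROUND(a, b, c, d) do {          \
    a += b; d ^= a; d = ROTL32(d, 16);   \
    c += d; b ^= c; b = ROTL32(b, 12);   \
    a += b; d ^= a; d = ROTL32(d, 8);    \
    c += d; b ^= c; b = ROTL32(b, 7);    \
  } while (0)

/* Hypothetical reference: one block at a time. */
void
sketch_core_1way(uint32_t out[16], const uint32_t in[16], unsigned rounds)
{
  uint32_t x[16];
  unsigned i;

  memcpy(x, in, sizeof(x));
  for (i = 0; i < rounds; i += 2)
    {
      /* Column round. */
      QROUND(x[0], x[4], x[8],  x[12]);
      QROUND(x[1], x[5], x[9],  x[13]);
      QROUND(x[2], x[6], x[10], x[14]);
      QROUND(x[3], x[7], x[11], x[15]);
      /* Diagonal round. */
      QROUND(x[0], x[5], x[10], x[15]);
      QROUND(x[1], x[6], x[11], x[12]);
      QROUND(x[2], x[7], x[8],  x[13]);
      QROUND(x[3], x[4], x[9],  x[14]);
    }
  for (i = 0; i < 16; i++)
    out[i] = x[i] + in[i];
}

/* Hypothetical 2-way variant: block 0 uses the counter in IN, block 1
   uses counter + 1. The two blocks never depend on each other, so their
   quarter rounds are interleaved. */
void
sketch_core_2way(uint32_t out0[16], uint32_t out1[16],
                 const uint32_t in[16], unsigned rounds)
{
  uint32_t x[16], y[16], y0[16];
  unsigned i;

  memcpy(x, in, sizeof(x));
  memcpy(y0, in, sizeof(y0));
  /* Assumed layout: 64-bit counter, low word 12, high word 13. */
  if (++y0[12] == 0)
    y0[13]++;
  memcpy(y, y0, sizeof(y));

  for (i = 0; i < rounds; i += 2)
    {
      QROUND(x[0], x[4], x[8],  x[12]); QROUND(y[0], y[4], y[8],  y[12]);
      QROUND(x[1], x[5], x[9],  x[13]); QROUND(y[1], y[5], y[9],  y[13]);
      QROUND(x[2], x[6], x[10], x[14]); QROUND(y[2], y[6], y[10], y[14]);
      QROUND(x[3], x[7], x[11], x[15]); QROUND(y[3], y[7], y[11], y[15]);

      QROUND(x[0], x[5], x[10], x[15]); QROUND(y[0], y[5], y[10], y[15]);
      QROUND(x[1], x[6], x[11], x[12]); QROUND(y[1], y[6], y[11], y[12]);
      QROUND(x[2], x[7], x[8],  x[13]); QROUND(y[2], y[7], y[8],  y[13]);
      QROUND(x[3], x[4], x[9],  x[14]); QROUND(y[3], y[4], y[9],  y[14]);
    }
  for (i = 0; i < 16; i++)
    {
      out0[i] = x[i] + in[i];
      out1[i] = y[i] + y0[i];
    }
}
```

Calling sketch_core_2way(out0, out1, state, 20) should give the same two blocks as calling sketch_core_1way twice with the counter incremented in between. A 4-block version would follow the same pattern with four independent states; whether that gets close to another factor of two presumably depends on how many vector registers and execution units are left over.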
Regards, /Niels