New subject: x86 sha_ni

8 Feb 2018


Jeffrey Walton <noloader@gmail.com> writes:
...
> Looks good on a Celeron J3455, which is a [low-end] Goldmont machine
> with the instructions:
> [...]
...
> goldmont:nettle$ LD_LIBRARY_PATH=.lib:/usr/local/lib64/
> ./examples/nettle-benchmark
> sha1_compress: 84.60 cycles

85 cycles is a lot less than the 136 cycles I observed in my testing.
The function is 131 instructions long, so it's approximately 1.5
instructions per cycle.
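
The estimate above is a single division. As a minimal sketch, assuming
each of the 131 instructions executes exactly once per call (the variable
names are only illustrative):

```python
# Figures quoted in this message.
INSTRUCTIONS = 131         # length of the sha1_compress function
CYCLES_GOLDMONT = 84.60    # cycles per call reported on the Celeron J3455
CYCLES_EARLIER = 136.0     # cycles per call observed in the earlier testing


def instructions_per_cycle(instructions, cycles):
    """Average number of instructions retired per clock cycle."""
    return instructions / cycles


ipc_goldmont = instructions_per_cycle(INSTRUCTIONS, CYCLES_GOLDMONT)
ipc_earlier = instructions_per_cycle(INSTRUCTIONS, CYCLES_EARLIER)

print(f"Goldmont:        {ipc_goldmont:.2f} instructions per cycle")
print(f"earlier testing: {ipc_earlier:.2f} instructions per cycle")
```

On the same assumption, the 136-cycle measurement comes out at slightly
under one instruction per cycle.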
...
>           sha1       update 1194.33
>   openssl sha1       update 1321.71

And this is an 11% difference (compared to 8% in my benchmarks). That
makes sense: if the main crunching takes fewer cycles, then the per-block
function call overhead is relatively larger.
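
A rough sketch of that reasoning, using the two throughput figures quoted
above. The fixed per-block overhead of 10 cycles is a made-up number chosen
only to show the trend, not a measurement:

```python
# Throughput figures quoted above (units as printed by nettle-benchmark).
NETTLE_SHA1 = 1194.33
OPENSSL_SHA1 = 1321.71


def relative_gap(faster, slower):
    """How much faster `faster` is than `slower`, as a fraction."""
    return faster / slower - 1.0


def overhead_share(compress_cycles, overhead_cycles):
    """Fraction of the per-block time spent outside the compression itself."""
    return overhead_cycles / (compress_cycles + overhead_cycles)


gap = relative_gap(OPENSSL_SHA1, NETTLE_SHA1)
print(f"openssl is {gap:.1%} faster")

# Hypothetical fixed per-block overhead, in cycles (an assumption).
ASSUMED_OVERHEAD = 10.0
for cycles in (136.0, 84.60):
    share = overhead_share(cycles, ASSUMED_OVERHEAD)
    print(f"compress = {cycles:6.2f} cycles -> overhead share {share:.1%}")
```

Whatever the real overhead is, the same fixed cost takes a larger share of
each block once the compression itself gets faster.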
...
> A small suggestion may be to update Section 8 Installation
> (https://www.lysator.liu.se/~nisse/nettle/nettle.html). It was not
> obvious to me how to enable the hardware acceleration.

There's an --enable-x86-aesni configure option which should enable the
aesni code unconditionally in non-fat builds. And an --enable-arm-neon.
But it seems I forgot to add a corresponding --enable-x86-sha-ni.
But --enable-fat is the most common way to enable the support. I'm
considering enabling it by default in the next release.

Regards,
/Niels
-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
