Re: x86 sha_ni

8 Feb 2018


Jeffrey Walton <noloader@gmail.com> writes:
...
> Looks good on a Celeron J3455, which is a [low-end] Goldmont machine
> with the instructions:
[...]
...
> goldmont:nettle$ LD_LIBRARY_PATH=.lib:/usr/local/lib64/ ./examples/nettle-benchmark
> sha1_compress: 84.60 cycles
85 cycles is a lot less than the 136 cycles I observed in my testing.
The function is 131 instructions long, so it's approximately 1.5
instructions per cycle.
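
As a quick check of the arithmetic above, here is a small Python sketch that uses only the figures quoted in this message (131 instructions, 84.60 cycles on Goldmont, and the earlier 136-cycle measurement). The variable names are illustrative only.

```python
# Figures taken from the message text; nothing here is measured by this script.
instructions = 131        # length of the sha1_compress function
cycles_goldmont = 84.60   # reported by nettle-benchmark on the Celeron J3455
cycles_earlier = 136      # the earlier measurement mentioned above

# Instructions per cycle for each measurement.
ipc_goldmont = instructions / cycles_goldmont
ipc_earlier = instructions / cycles_earlier

print(f"Goldmont: {ipc_goldmont:.2f} instructions per cycle")  # 1.55
print(f"Earlier:  {ipc_earlier:.2f} instructions per cycle")   # 0.96
```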
...
>           sha1       update 1194.33
>   openssl sha1       update 1321.71

And this is an 11% difference (compared to 8% in my benchmarks). That
makes sense: if the main crunching takes fewer cycles, then the per-block
function call overhead is relatively larger.
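
The percentage can be reproduced from the two benchmark lines quoted above. This is only a sketch, and it assumes the numbers are throughput figures where higher means faster. The result depends on which value is taken as the baseline; the 11% figure matches the gap measured relative to the nettle result.

```python
# Throughput figures copied from the quoted benchmark output.
nettle_sha1 = 1194.33
openssl_sha1 = 1321.71

gap = openssl_sha1 - nettle_sha1

# Gap expressed relative to each baseline.
relative_to_nettle = gap / nettle_sha1
relative_to_openssl = gap / openssl_sha1

print(f"relative to nettle:  {relative_to_nettle:.1%}")   # 10.7%
print(f"relative to openssl: {relative_to_openssl:.1%}")  # 9.6%
```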
...
> A small suggestion may be to update Section 8 Installation
> (https://www.lysator.liu.se/~nisse/nettle/nettle.html). It was not
> obvious to me how to enable the hardware acceleration.
There's an --enable-x86-aesni configure option which should enable the
aesni code unconditionally in non-fat builds. And an --enable-arm-neon.
But it seems I forgot to add a corresponding --enable-x86-sha-ni.
But --enable-fat is the most common way to enable the support. I'm
considering enabling it by default in the next release.
Regards,
/Niels
-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
