nisse@lysator.liu.se (Niels Möller) writes:
The replacement for sha1-compress.asm below seems to run at roughly 2 cycles/byte when I benchmark it on an "AMD Ryzen 7 1700X" CPU in the gcc compile farm. That is still slightly slower than openssl; to squeeze out a few more cycles, it might help to change the sha1_compress interface to let it process more than one 64-byte block at a time.
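To make the multi-block idea concrete, here is a rough C sketch of what such an interface could look like. The names sketch_sha1_compress and sketch_sha1_compress_n, and the argument order, are made up for illustration and are not the actual Nettle interface. The single-block function is a plain portable SHA-1 compression per FIPS 180-4, not the sha_ni code. The point is only the loop: with a block count in the interface, an assembly implementation could keep the state in registers across blocks instead of loading and storing it for every 64 bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define SHA1_BLOCK_SIZE 64
#define ROTL32(n, x) (((x) << (n)) | ((x) >> (32 - (n))))

/* Portable reference compression of one 64-byte block (FIPS 180-4).
   Hypothetical name, for illustration only. */
static void
sketch_sha1_compress(uint32_t *state, const uint8_t *block)
{
  uint32_t w[80];
  uint32_t a, b, c, d, e;
  unsigned i;

  /* Read the block as 16 big-endian words, then expand to 80 words. */
  for (i = 0; i < 16; i++)
    w[i] = ((uint32_t) block[4*i] << 24) | ((uint32_t) block[4*i+1] << 16)
      | ((uint32_t) block[4*i+2] << 8) | (uint32_t) block[4*i+3];
  for (; i < 80; i++)
    w[i] = ROTL32(1, w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16]);

  a = state[0]; b = state[1]; c = state[2]; d = state[3]; e = state[4];

  for (i = 0; i < 80; i++)
    {
      uint32_t f, k, t;
      if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
      else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
      else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
      else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
      t = ROTL32(5, a) + f + e + k + w[i];
      e = d; d = c; c = ROTL32(30, b); b = a; a = t;
    }

  state[0] += a; state[1] += b; state[2] += c; state[3] += d; state[4] += e;
}

/* Hypothetical multi-block entry point: compress n consecutive blocks
   and return a pointer just past the last block consumed. */
static const uint8_t *
sketch_sha1_compress_n(uint32_t *state, size_t n, const uint8_t *blocks)
{
  for (; n > 0; n--, blocks += SHA1_BLOCK_SIZE)
    sketch_sha1_compress(state, blocks);
  return blocks;
}
```

A caller would pass all the full blocks it has in one call and use the returned pointer to find any leftover partial block.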
I hope to be able to wire it up via fat-x86_64.c reasonably soon. In the meantime, if anyone wants to try it out, just change the sha1-compress.asm symlink to point to this file.
Enabled via fat-x86_64 now, and pushed to a branch named x86_64-sha_ni-sha1.
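For anyone who wants to check whether their machine will take the new code path before benchmarking, a minimal C sketch of the feature test follows. It assumes GCC or clang with <cpuid.h>; the function name sketch_have_sha_ni is made up, and this is not how fat-x86_64.c itself does the detection. The only fact relied on is that the SHA extensions are reported in CPUID leaf 7, subleaf 0, EBX bit 29.

```c
#if defined(__x86_64__) || defined(__i386__)
# include <cpuid.h>   /* GCC/clang helper header; an assumption of this sketch */
#endif

/* Return 1 if the CPU reports the SHA extensions, 0 otherwise
   (including on non-x86 builds). Hypothetical helper name. */
static int
sketch_have_sha_ni(void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned eax, ebx, ecx, edx;

  /* __get_cpuid_count returns 0 if leaf 7 is not supported. */
  if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
    return 0;

  /* CPUID.(EAX=7,ECX=0):EBX bit 29 is the SHA feature flag. */
  return (ebx >> 29) & 1;
#else
  return 0;
#endif
}
```

If this returns 0, a fat build would be expected to fall back to the generic x86_64 sha1 code, so benchmark numbers from such a machine say nothing about the new file.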
I intend to merge to master soon.
Testing and benchmarking appreciated.
Regards, /Niels