Eric Richter <erichte@linux.ibm.com> writes:
This set introduces an optimized powerpc64 assembly implementation for SHA256 and SHA512. It has been derived from BSD-2-Clause licensed code authored by IBM, originally released in the IBM POWER Cryptography Reference Implementation project[1], modified to work in Nettle, and contributed under the GPL license.
Development of this new implementation targeted POWER 10; however, it supports the POWER 8 ISA and above. The following commits provide the performance data I recorded on POWER 10, though similar improvements can be found on P8/P9.
Thanks, I've had a first quick look. Nice speedup, and it looks pretty good. I wasn't aware of the vshasigma instructions.
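
For readers who, like me, hadn't met these instructions: as far as I understand it, vshasigmaw computes one of the four SHA-256 sigma functions on each 32-bit word element of a vector register, selected by immediate operands, and vshasigmad does the same for the SHA-512 sigmas on 64-bit elements. Below is a minimal scalar C sketch of the SHA-256 variants as defined in FIPS 180-4. The function names are my own, made up for illustration; this is not code from the patch set or from Nettle.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits; all callers pass 0 < n < 32. */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
  return (x >> n) | (x << (32 - n));
}

/* "Big" sigma functions, used on the working variables a and e
   in each compression round. (Hypothetical names.) */
static inline uint32_t sha256_big_sigma0(uint32_t x)
{
  return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22);
}

static inline uint32_t sha256_big_sigma1(uint32_t x)
{
  return rotr32(x, 6) ^ rotr32(x, 11) ^ rotr32(x, 25);
}

/* "Small" sigma functions, used in the message schedule expansion. */
static inline uint32_t sha256_small_sigma0(uint32_t x)
{
  return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3);
}

static inline uint32_t sha256_small_sigma1(uint32_t x)
{
  return rotr32(x, 17) ^ rotr32(x, 19) ^ (x >> 10);
}
```

Each of these costs several rotate/shift and xor operations in scalar code, so a single instruction that evaluates one of them across a whole vector would plausibly account for a good part of the speedup.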
One comment on the Nettle ppc conventions: I prefer to use register names rather than just register numbers; that helps me avoid some confusion when some instructions take v1 registers and others take vs1 registers. Preferably, configure with ASM_FLAGS=-mregnames during development. For assemblers that don't accept register names (which seems to be the default), machine.m4 arranges for translation from v1 --> 1, etc.
As an aside: I have tested this patch set on POWER 8 and POWER 10 hardware running little-endian Linux distributions; however, I have not yet been able to test on a big-endian distro. I can confirm, however, that the original source in IPCRI does compile and pass tests for both little and big endian via qemu-user, so barring human error in deriving the version for Nettle, it is expected to be functional.
There are big-endian tests in the CI pipeline (hosted on the mirror repo at https://gitlab.com/gnutls/nettle), using cross-compiling + qemu-user. I also have a similar setup locally.
Regards, /Niels