I've updated this set to use the proper conventions for register names, and
also adjusted the IV macro according to the suggestions provided.
I can also confirm that I've gotten a working build environment based on
the approach in the GitLab CI configuration, and that the ppc64 big-endian
build does indeed pass tests.
Amended original cover letter:
This set introduces an optimized powerpc64 assembly implementation for
SHA256 and SHA512. It has been derived from BSD-2-Clause licensed
code authored by IBM, originally released in the IBM POWER
Cryptography Reference Implementation project[1], modified to work in
Nettle, contributed under the GPL license.
Development of this new implementation targeted POWER 10, but it
supports the POWER 8 ISA and above. The following commits provide the
performance data I recorded on POWER 10, though similar improvements can
be found on P8/P9.
I have tested this patch set on POWER 8 and POWER 10 hardware running
little-endian Linux distributions, and via qemu-user for big-endian ppc64.
Eric Richter (2):
  powerpc64: Add optimized assembly for sha256-compress-n
  powerpc64: Add optimized assembly for sha512-compress-n
 fat-ppc.c                             |  22 ++
 powerpc64/fat/sha256-compress-n-2.asm |  36 +++
 powerpc64/fat/sha512-compress-2.asm   |  36 +++
 powerpc64/p8/sha256-compress-n.asm    | 323 +++++++++++++++++++++++++
 powerpc64/p8/sha512-compress.asm      | 327 ++++++++++++++++++++++++++
 5 files changed, 744 insertions(+)
 create mode 100644 powerpc64/fat/sha256-compress-n-2.asm
 create mode 100644 powerpc64/fat/sha512-compress-2.asm
 create mode 100644 powerpc64/p8/sha256-compress-n.asm
 create mode 100644 powerpc64/p8/sha512-compress.asm
--
2.44.0