What is x86/sha1-compress.nlms? How can I implement the nettle_compress_n function for that particular type?
regards, Mamone
On Sun, Aug 15, 2021 at 2:10 AM Maamoun TK maamoun.tk@googlemail.com wrote:
On Thu, Aug 12, 2021 at 4:26 PM Maamoun TK maamoun.tk@googlemail.com wrote:
On Tue, Aug 10, 2021 at 11:55 PM Niels Möller nisse@lysator.liu.se wrote:
Maamoun TK maamoun.tk@googlemail.com writes:
I made a merge request in the main repository that optimizes SHA1 for the s390x architecture with fat build support: !33 https://git.lysator.liu.se/nettle/nettle/-/merge_requests/33.
Regarding the discussion on https://git.lysator.liu.se/nettle/nettle/-/merge_requests/33#note_10005: It seems the sha1 instructions on s390x are fast enough that the overhead of loading constants, and loading and storing the state, all per block, is a significant cost.
I think it makes sense to change the internal convention for _sha1_compress so that it can do multiple blocks. There are currently 5 assembly implementations that would need updating: arm/v6, arm64/crypto, x86, x86_64 and x86_64/sha_ni. And the C implementation, of course.
If it turns out to be too large a change to do them all at once, one could introduce some new _sha1_compress_n function or the like, and use it when available. Actually, we probably need to do that anyway, since for historical reasons, _nettle_sha1_compress is a public function, and needs to be kept (as just a simple C wrapper) for backwards compatibility. Changing it incrementally should be doable but a bit hairy.
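To make the proposed convention concrete, here is a minimal C sketch of a multi-block compress function plus a single-block wrapper. The names sketch_sha1_compress_n and sketch_sha1_compress, the argument order, and the returned input pointer are assumptions made for illustration, not nettle's actual internal interface. The point it tries to show is that the state is loaded once before the block loop and stored once after it, instead of once per block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SHA1_BLOCK_SIZE 64

static uint32_t
rotl32(uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* Hypothetical multi-block convention: process `blocks` consecutive
   64-byte blocks and return a pointer just past the consumed input. */
static const uint8_t *
sketch_sha1_compress_n(uint32_t *state, size_t blocks, const uint8_t *input)
{
  /* Load the state once, outside the per-block loop. */
  uint32_t h0 = state[0], h1 = state[1], h2 = state[2];
  uint32_t h3 = state[3], h4 = state[4];

  for (; blocks > 0; blocks--, input += SKETCH_SHA1_BLOCK_SIZE)
    {
      uint32_t w[80];
      uint32_t a = h0, b = h1, c = h2, d = h3, e = h4;
      unsigned i;

      /* Read the block as 16 big-endian words, then expand to 80. */
      for (i = 0; i < 16; i++)
        w[i] = ((uint32_t) input[4*i] << 24) | ((uint32_t) input[4*i+1] << 16)
          | ((uint32_t) input[4*i+2] << 8) | (uint32_t) input[4*i+3];
      for (i = 16; i < 80; i++)
        w[i] = rotl32(w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16], 1);

      for (i = 0; i < 80; i++)
        {
          uint32_t f, k, t;
          if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
          else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
          else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
          else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
          t = rotl32(a, 5) + f + e + k + w[i];
          e = d; d = c; c = rotl32(b, 30); b = a; a = t;
        }
      h0 += a; h1 += b; h2 += c; h3 += d; h4 += e;
    }

  /* Store the state once, after all blocks are done. */
  state[0] = h0; state[1] = h1; state[2] = h2;
  state[3] = h3; state[4] = h4;
  return input;
}

/* Single-block function kept only as a thin wrapper, in the spirit of
   the backwards-compatibility wrapper mentioned above. */
static void
sketch_sha1_compress(uint32_t *state, const uint8_t *input)
{
  sketch_sha1_compress_n(state, 1, input);
}
```

A caller with a long message could then hand all complete blocks to sketch_sha1_compress_n in one call and continue from the returned pointer, while existing users of the single-block entry point keep working through the wrapper.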
There are some other similar compression functions with assembly implementation, for md5, sha256 and sha512. But there's no need to change them all at the same time, or at all.
Regarding the MD_UPDATE macro, that one is defined in the public header file macros.h (which in retrospect was a mistake). So it's probably best to leave it unchanged. New macros for the new convention should be put into some internal header, e.g., md-internal.h.
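As a rough illustration of what such an internal macro might look like, here is a sketch of an update macro built around a multi-block compress function. Everything in it is hypothetical: SKETCH_MD_UPDATE_N, struct sketch_ctx and its fields, and the stand-in sketch_compress_n (which only folds bytes into state[0] so the example runs, and is not SHA1) are invented for this example and are not taken from macros.h or any nettle header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 64

/* Hypothetical context layout, for illustration only. */
struct sketch_ctx
{
  uint32_t state[5];
  uint64_t count;                   /* number of blocks processed */
  unsigned index;                   /* bytes currently buffered in block */
  uint8_t block[SKETCH_BLOCK_SIZE];
};

/* Stand-in for a multi-block compress function. It is NOT SHA1; it
   folds the input bytes into state[0] so the sketch is runnable. */
static const uint8_t *
sketch_compress_n(uint32_t *state, size_t blocks, const uint8_t *input)
{
  size_t n = blocks * SKETCH_BLOCK_SIZE;
  size_t i;
  for (i = 0; i < n; i++)
    state[0] = state[0] * 31 + input[i];
  return input + n;
}

/* Hypothetical update macro: flush a partially filled buffer first,
   pass all complete blocks to f in a single call, buffer the rest. */
#define SKETCH_MD_UPDATE_N(ctx, length, data, f) do {                    \
    size_t md_len = (length);                                            \
    const uint8_t *md_data = (data);                                     \
    size_t md_left = sizeof((ctx)->block) - (ctx)->index;                \
    if ((ctx)->index > 0 && md_len >= md_left)                           \
      {                                                                  \
        memcpy((ctx)->block + (ctx)->index, md_data, md_left);           \
        f((ctx)->state, 1, (ctx)->block);                                \
        (ctx)->count++;                                                  \
        (ctx)->index = 0;                                                \
        md_data += md_left;                                              \
        md_len -= md_left;                                               \
      }                                                                  \
    if ((ctx)->index == 0 && md_len >= sizeof((ctx)->block))             \
      {                                                                  \
        size_t md_blocks = md_len / sizeof((ctx)->block);                \
        md_data = f((ctx)->state, md_blocks, md_data);                   \
        (ctx)->count += md_blocks;                                       \
        md_len %= sizeof((ctx)->block);                                  \
      }                                                                  \
    memcpy((ctx)->block + (ctx)->index, md_data, md_len);                \
    (ctx)->index += (unsigned) md_len;                                   \
  } while (0)

/* Example update function using the macro. */
static void
sketch_update(struct sketch_ctx *ctx, size_t length, const uint8_t *data)
{
  SKETCH_MD_UPDATE_N(ctx, length, data, sketch_compress_n);
}
```

With this shape, a large update reaches the compress function as one call covering all complete blocks, which is where a per-call setup cost would be amortized.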
The x86, x86_64, and arm implementations still need to be adapted to the new compress function.
I modified the basic x86_64 implementation to provide the sha1_compress_n function, in the same branch. Unfortunately, my x86_64 CPU doesn't support the SHA extension, so I'm trying to figure out a simple way to test the hardware-accelerated implementation.
regards, Mamone