I added support for the sha1_compress_n function on the ARM architecture in the same branch: https://git.lysator.liu.se/mamonet/nettle/-/tree/sha1-compress-n
regards, Mamone
On Sat, Aug 21, 2021 at 5:22 AM Maamoun TK maamoun.tk@googlemail.com wrote:
On Thu, Aug 19, 2021 at 8:48 AM Niels Möller nisse@lysator.liu.se wrote:
Maamoun TK maamoun.tk@googlemail.com writes:
What is x86/sha1-compress.nlms? How can I implement the nettle_compress_n function for that particular type?
That's an input file for an obscure "loop mixer" tool. IIRC, it was written mainly by David Harvey for use with GMP loops. This tool permutes the instructions of an assembly loop, taking dependencies into account, benchmarks each variant, and tries to find the fastest instruction sequence. It seems I tried this tool on x86 sha1_compress back in 2009, on an AMD K7, and it gave a 17% speedup at the time, according to the commit message for 1e757582ac7f8465b213d9761e17c33bd21ca686.
So you can just ignore this file. And you may want to look at the more readable version of x86/sha1_compress.asm, just before that commit.
Thanks. I left the nlms files as they are and modified x86/sha1_compress.asm to work with the sha1_compress_n function. I've kept the function parameters on the stack, since the instructions can operate on memory operands and the x86 calling convention passes parameters on the stack. I'm not sure whether those parameters are read-only or may be modified by the callee; TBH, I haven't touched x86 32-bit code for 8 years. What I did was reserve stack slots for two of the parameters and adjust the values in those new locations, so the original values stay unmodified.
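To illustrate the "working copy" approach described above, here is a rough C sketch, not the code in the branch. The names toy_compress and toy_compress_n, the argument order, and the returned pointer are all assumptions made for the example, and the per-block function is a stand-in that does not implement SHA-1. The point is only that the block count and input pointer are copied into locals, and only the copies are advanced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 64  /* SHA-1 processes 64-byte blocks */

/* Stand-in for the real per-block compression. This is NOT SHA-1;
   it only folds the block into the state so the loop can be exercised. */
static void
toy_compress(uint32_t *state, const uint8_t *block)
{
  for (size_t i = 0; i < BLOCK_SIZE; i++)
    state[i % 5] = state[i % 5] * 31u + block[i];
}

/* Hypothetical compress_n-style wrapper (signature assumed for this
   sketch): processes `blocks` consecutive blocks and returns a pointer
   just past the consumed input. */
static const uint8_t *
toy_compress_n(uint32_t *state, size_t blocks, const uint8_t *input)
{
  /* Working copies, analogous to the extra stack slots in the assembly:
     the incoming arguments themselves are never written. */
  size_t remaining = blocks;
  const uint8_t *p = input;

  assert(state != NULL);

  while (remaining > 0)
    {
      toy_compress(state, p);
      p += BLOCK_SIZE;
      remaining--;
    }
  return p;
}
```

In C the distinction is invisible because arguments are passed by value; in hand-written x86 assembly the copies correspond to extra stack slots, which is what makes the approach safe regardless of whether the argument slots may be overwritten.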
regards, Mamone