On Sat, Nov 21, 2020 at 5:32 PM Niels Möller nisse@lysator.liu.se wrote:
Is this a loop, part of a loop, or is there some vector load instruction that lets you pass a byte length?
It generates a mask that matches the length of the leftovers. For example, if the length is 1, the generated mask is 0xFF000000000000000000000000000000; the mask is then ANDed with the vector register holding the leftovers to clear the unneeded extra bytes. It's not exactly like the first approach, but it avoids using the stack and handles the leftovers inside the assembly implementation. Sorry for the mix-up.
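To illustrate the masking idea in portable C (this is only a sketch, not the assembly from the patch): the helper names leftover_mask and clear_extra_bytes are hypothetical, and the sketch assumes the leftover bytes sit at the start of the 16-byte block, so that a length of 1 corresponds to the mask 0xFF followed by fifteen zero bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Hypothetical helper: build a mask whose first `length` bytes are 0xFF
   and whose remaining bytes are 0x00. Assumes byte 0 of the block is the
   most significant byte of the 128-bit value. */
static void
leftover_mask(uint8_t mask[BLOCK_SIZE], size_t length)
{
  memset(mask, 0x00, BLOCK_SIZE);
  memset(mask, 0xFF, length < BLOCK_SIZE ? length : BLOCK_SIZE);
}

/* Hypothetical helper: AND the block with the mask, so that every byte
   past the first `length` leftover bytes is cleared. */
static void
clear_extra_bytes(uint8_t block[BLOCK_SIZE], size_t length)
{
  uint8_t mask[BLOCK_SIZE];
  size_t i;

  leftover_mask(mask, length);
  for (i = 0; i < BLOCK_SIZE; i++)
    block[i] &= mask[i];
}
```

In an assembly implementation, the byte loop would presumably be a single vector AND on the register holding the leftovers.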
I recommend the third approach so we don't have to deal with the leftover bytes in the upcoming implementations, but the problem is that gcm_init_key() initializes the table for the compatible gcm_hash() function,
If we go this way, the power assembly file would have to provide an implementation of gcm_gf_mul, compatible with its gcm_init_key. It would do essentially the same thing as the single-block part of gcm_hash. But approach 1 is fine too, if it doesn't get too complicated.
Your recent mails have not included actual patches, neither inline nor as attachments. E.g., https://lists.lysator.liu.se/pipermail/nettle-bugs/2020/009234.html. (The mailing list software might discard some attachments, but content-type: text/x-patch and the like should be fine). If your mail client doesn't cooperate, feel free to create a pull request on git.lysator.liu.se instead (and ping the list).
I made a merge request on git.lysator.liu.se; it turned out to be easier for me to push patches to the repository this way. I hope you don't mind handling future patches the same way.
regards, Mamone