I've just pushed new AES code using Intel's AES-NI instructions. See
https://git.lysator.liu.se/nettle/nettle/blob/530014f3f811d9018ec83a8748fdbc...
It gave a speedup of almost 10 times on the Haswell machine where I tested it (and in addition, it should avoid side-channel leaks in those functions). Clearly, this will be more useful once there is support for fat binaries, which detect the presence of these instructions at run time. For now, it has to be enabled explicitly with the configure argument --enable-x86-aesni.
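For reference, a minimal sketch of what such a run-time check could look like in C. It assumes GCC or clang on x86 (for <cpuid.h> and __get_cpuid); the helper names ecx_has_aesni and cpu_has_aesni are hypothetical and not part of nettle. CPUID leaf 1 reports AES-NI in bit 25 of ECX.

```c
/* Sketch only: assumes GCC/clang on x86; on other targets it reports 0. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/* CPUID leaf 1 reports AES-NI support in bit 25 of ECX. */
#define CPUID_ECX_AESNI (1u << 25)

/* Hypothetical helper: test the AES-NI bit in a leaf 1 ECX value. */
static int
ecx_has_aesni (unsigned ecx)
{
  return (ecx & CPUID_ECX_AESNI) != 0;
}

/* Hypothetical helper: 1 if the running CPU reports AES-NI, else 0. */
static int
cpu_has_aesni (void)
{
#if HAVE_X86_CPUID
  unsigned eax, ebx, ecx, edx;

  /* __get_cpuid returns 0 if the requested leaf is not supported. */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;
  return ecx_has_aesni (ecx);
#else
  return 0;
#endif
}
```

A fat binary could call something like cpu_has_aesni () once at startup and pick between the aesni and the generic functions based on the result.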
I have one question: how should support for these instructions be enabled in the assembler? For now I added a pseudo-op
.arch bdver2
and that seems to work, but it's a bit too specific for my taste. I would have preferred something like .arch generic64,aes, but I couldn't get that to work. So what's the right way to do this?
I haven't played with the corresponding arch flags to gcc, but I'd prefer to declare within the .asm file itself which instruction set it is intended for.
Feedback on the actual assembler code is also appreciated, of course. It's pretty basic, a dozen lines, no unrolling or other cleverness.
Regards, /Niels