On Tue, 2015-01-27 at 22:53 +0100, Niels Möller wrote:
> Nikos Mavrogiannopoulos <nmav@gnutls.org> writes:
> > About the release... Since you added the fat, would it include AESNI + PCLMUL?
> AESNI is in. If you have the time, it would be interesting if you could benchmark it against the gnutls code. The nettle implementation is pretty basic, maybe it could be sped up a bit by unrolling or by caching subkeys in registers.
These are the numbers I get with the current implementation:

$ ./gnutls-cli --benchmark-ciphers
  AES-128-CBC-SHA1     0.41 GB/sec
  AES-128-CBC-SHA256   0.27 GB/sec
  AES-128-GCM          3.02 GB/sec
If I use only nettle's implementation:

$ GNUTLS_CPUID_OVERRIDE=0x1 ./gnutls-cli --benchmark-ciphers
  AES-128-CBC-SHA1     0.29 GB/sec
  AES-128-CBC-SHA256   188.68 MB/sec
  AES-128-GCM          0.29 GB/sec
(I verified that nettle detects aesni)
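For reference, a minimal sketch of what such a detection check looks at on x86. This is not nettle's or gnutls's actual code; the struct and function names are hypothetical. It only decodes the ECX feature bits reported by CPUID leaf 1 (bit 1 for PCLMULQDQ, bit 9 for SSSE3, bit 25 for AES-NI):

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bits in ECX after executing CPUID with EAX = 1. */
#define CPUID1_ECX_PCLMULQDQ (UINT32_C(1) << 1)
#define CPUID1_ECX_SSSE3     (UINT32_C(1) << 9)
#define CPUID1_ECX_AESNI     (UINT32_C(1) << 25)

/* Hypothetical container for the features discussed in this thread. */
struct x86_crypto_features {
  bool aesni;
  bool pclmul;
  bool ssse3;
};

/* Decode an ECX value obtained from CPUID leaf 1 (hypothetical helper). */
static struct x86_crypto_features
decode_cpuid1_ecx(uint32_t ecx)
{
  struct x86_crypto_features f;
  f.aesni  = (ecx & CPUID1_ECX_AESNI) != 0;
  f.pclmul = (ecx & CPUID1_ECX_PCLMULQDQ) != 0;
  f.ssse3  = (ecx & CPUID1_ECX_SSSE3) != 0;
  return f;
}
```

With GCC or Clang on x86, the ECX value can be obtained with `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>` and passed to the decoder.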
The GCM figure depends heavily on pclmul, so it's only listed for completeness. AES-CBC is noticeably slower, though.
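To put the gap in numbers, here is a small sketch that computes the ratios from the two benchmark runs above. The helper names are hypothetical, and it assumes the benchmark reports decimal units (1 GB/sec = 1000 MB/sec), which I have not checked against the gnutls-cli source:

```c
/* Assumption: decimal units, i.e. 1 GB/sec = 1000 MB/sec. */
static double
gb_to_mb(double gb_per_sec)
{
  return gb_per_sec * 1000.0;
}

/* Ratio of the default-run throughput to the nettle-only throughput,
   both given in MB/sec (hypothetical helper). */
static double
speedup(double default_mb_per_sec, double nettle_mb_per_sec)
{
  return default_mb_per_sec / nettle_mb_per_sec;
}

/* Figures quoted above, normalized to MB/sec. */
static double
cbc_sha1_ratio(void)
{
  return speedup(gb_to_mb(0.41), gb_to_mb(0.29));
}

static double
cbc_sha256_ratio(void)
{
  return speedup(gb_to_mb(0.27), 188.68);
}

static double
gcm_ratio(void)
{
  return speedup(gb_to_mb(3.02), gb_to_mb(0.29));
}
```

Under that assumption both CBC suites come out at roughly 1.4x and GCM at roughly 10x.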
I don't know if it helps, but the code I currently use for AESNI is:
https://github.com/openssl/openssl/blob/e0fc7961c4fbd27577fb519d9aea2dc78874...
Unrelated, but I realized that I also have overrides for non-AESNI systems, which use this implementation by Mike Hamburg:
https://github.com/openssl/openssl/blob/e0fc7961c4fbd27577fb519d9aea2dc78874...
This takes advantage of SSSE3 and is faster while being constant time as well.
> Haven't looked carefully at pclmul, so I don't know how difficult it is to make use of it.
No idea either. I can only provide a link to the existing code I use:
https://github.com/openssl/openssl/blob/c1669e1c205dc8e695fb0c10a655f434e758...
It provides the low-level functions used to implement GCM in:
https://gitorious.org/gnutls/gnutls/source/eabf1f27d255577bad60d302abf46a969...
> > If yes that would reduce significantly the assembly shipped in gnutls (only the padlock functions would remain).
> I guess padlock code could be ported over to Nettle, if it's still relevant.
That would of course be ideal. I can hardly help with that, though...
regards,
Nikos