On Wed, Jan 3, 2018 at 7:36 PM, Niels Möller nisse@lysator.liu.se wrote:
"Daniel P. Berrange" berrange@redhat.com writes:
I wrote a crude/simple test program to compare the performance of AES-128-CBC across openssl, gcrypt, nettle and gnutls, and was surprised to find that nettle is consistently ~25% slower than the other libraries for its AESNI implementation.
I've now pushed new aesni code to the master-updates branch. It reads all subkeys into registers upfront, and unrolls the round loop. This brings a great speedup when calling the aes functions with many blocks at a time, but little difference when doing only one block at a time. Results for aes128, when benchmarking on my machine (Intel Broadwell):
ECB encrypt and decrypt: About 90% speedup, from 1.25 cycles/byte to 0.65, about the same as openssl, or even *slightly* faster.
That's great news.
CBC encrypt: No significant change, about 5.7 cycles/byte. CBC decrypt: About 60% speedup, from 1.5 cycles/byte down to 0.93.
CTR mode: No significant change, about 2.5 cycles/byte.
I think it's reasonable to speed up CTR mode by passing more blocks per call to the encryption function (currently it does 4 blocks at a time), and maybe by some more efficient routine to generate the counter input.
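A minimal sketch of that batching idea, not the actual nettle code: fill a buffer with many consecutive counter blocks, then hand the whole buffer to the block cipher in a single call so that a multi-block implementation can work on the blocks in parallel. The names ctr_fill, ctr_crypt_batched, cipher_func and CTR_BATCH are hypothetical, the counter is assumed to be a 128-bit big-endian value, and the cipher callback is assumed to allow in-place operation (dst == src).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16
#define CTR_BATCH 32   /* blocks per cipher call; hypothetical tuning value */

/* Hypothetical callback type: encrypt `length` bytes (a multiple of
   BLOCK_SIZE) from src to dst; assumed to allow dst == src. */
typedef void cipher_func(const void *ctx, size_t length,
                         uint8_t *dst, const uint8_t *src);

/* Write `nblocks` consecutive big-endian counter values into `out`,
   starting from `ctr`, and leave `ctr` advanced past the last one. */
void
ctr_fill(uint8_t *ctr, size_t nblocks, uint8_t *out)
{
  size_t i;
  for (i = 0; i < nblocks; i++, out += BLOCK_SIZE)
    {
      int j;
      memcpy(out, ctr, BLOCK_SIZE);
      /* Increment with carry, starting at the last (least significant) byte. */
      for (j = BLOCK_SIZE - 1; j >= 0 && ++ctr[j] == 0; j--)
        ;
    }
}

/* CTR encrypt/decrypt, calling the cipher on up to CTR_BATCH blocks at once. */
void
ctr_crypt_batched(const void *ctx, cipher_func *f, uint8_t *ctr,
                  size_t length, uint8_t *dst, const uint8_t *src)
{
  uint8_t buf[CTR_BATCH * BLOCK_SIZE];

  while (length > 0)
    {
      size_t n = length < sizeof(buf) ? length : sizeof(buf);
      size_t nblocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
      size_t i;

      ctr_fill(ctr, nblocks, buf);
      /* One call for the whole batch, encrypting the counters in place. */
      f(ctx, nblocks * BLOCK_SIZE, buf, buf);

      /* XOR the key stream into the data; a trailing partial block
         uses only the first bytes of its key stream block. */
      for (i = 0; i < n; i++)
        dst[i] = src[i] ^ buf[i];

      length -= n;
      dst += n;
      src += n;
    }
}
```

With this shape, the batch size is a single constant that can be tuned against the benchmark, and ctr_fill is the only piece that would need replacing by a faster counter generation routine.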
Improving CBC would need some structural and possibly ugly changes.
If I had to choose between optimizing one of the two, I'd say CTR. All the modern AEAD modes (GCM, CCM) use CTR, while CBC is only used as a legacy, backwards-compatible mode.
regards, Nikos