I wrote a crude test program to compare the performance of AES-128-CBC across openssl, gcrypt, nettle and gnutls, and was surprised to find that nettle's AESNI implementation is consistently slower than the other libraries, by roughly 20-30%.
On my Core i7-6820HQ I get
  nettle:   850 MB/s
  gcrypt:  1172 MB/s
  gnutls:  1230 MB/s
  openssl: 1153 MB/s
with versions
  nettle-3.3-2.fc26.x86_64
  libgcrypt-1.7.8-1.fc26.x86_64
  gnutls-3.5.14-1.fc26.x86_64
  openssl-1.1.0f-7.fc26.x86_64
And on Xeon E5-2609 I get
  nettle:  325 MB/s
  gcrypt:  403 MB/s
  gnutls:  414 MB/s
  openssl: 414 MB/s
with versions
  nettle-3.3-1.fc25.x86_64
  libgcrypt-1.7.8-1.fc25.x86_64
  gnutls-3.5.14-1.fc25.x86_64
  openssl-1.0.2k-1.fc25.x86_64
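As a quick sanity check on the size of the gap, the relative slowdown can be worked out from the figures above. This is only a sketch in Python: the helper names `slowdown_pct` and `nettle_gaps` are made up for illustration, and the numbers are simply copied from the two result sets above.

```python
# Throughput figures in MB/s, copied from the two result sets above.
results = {
    "Core i7-6820HQ": {"nettle": 850, "gcrypt": 1172, "gnutls": 1230, "openssl": 1153},
    "Xeon E5-2609": {"nettle": 325, "gcrypt": 403, "gnutls": 414, "openssl": 414},
}


def slowdown_pct(slow, fast):
    """Percentage by which throughput `slow` falls short of `fast` (hypothetical helper)."""
    return (1.0 - slow / fast) * 100.0


def nettle_gaps(figures):
    """Map each non-nettle library to nettle's slowdown against it, in percent."""
    return {
        lib: slowdown_pct(figures["nettle"], rate)
        for lib, rate in figures.items()
        if lib != "nettle"
    }


for cpu, figures in results.items():
    for lib, gap in nettle_gaps(figures).items():
        print(f"{cpu}: nettle is {gap:.1f}% slower than {lib}")
```

On these numbers the gap works out to roughly 26-31% on the Core i7 and 19-22% on the Xeon.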
Naively I would have expected them all to be pretty much equal, given that they all end up using the same AES-NI hardware instructions. Has anyone else done comparative benchmarks of nettle's implementation against the others and seen the same kind of results? I'll attach my test program to this mail, so if I made a mistake in usage there feel free to point it out.
FWIW, I also found there is some weird interaction between nettle and glibc-2.23. If I have that glibc version and run with NETTLE_FAT_VERBOSE=1, it claims it is picking the AESNI implementation, but the performance figures clearly show it is actually running the pure software implementation, because they're 100 MB/s instead of 325 MB/s. I upgraded to glibc 2.24 and this weirdness went away, so I've not investigated it further.
Regards,
Daniel