"Daniel P. Berrange" <berrange@redhat.com> writes:
Naively I would have expected them all to be pretty much equal given that they're delegating to the same hardware routines.
This is not completely unexpected.
* Nettle's AESNI assembly routines were written for simplicity and small code size, without putting a lot of effort into it. They could probably be sped up by some unrolling or more careful instruction scheduling. Patches welcome (but we shouldn't use excessive unrolling unless there's a significant speedup).
* Nettle's AES-CBC uses general CBC functions invoking the AES encrypt and decrypt functions. In particular for CBC *en*crypt, this adds significant overhead for function calls, and the memxor function will examine src/dst alignment once per block. CBC *de*crypt is usually a bit faster, since we can then decrypt more than one block at a time.
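To illustrate the second point, here is a minimal sketch of a generic CBC layer on top of a block cipher. It is not Nettle's code: ToyCipher is a hypothetical placeholder (an XOR plus a byte rotation, not AES), and the names cbc_encrypt, cbc_decrypt and xor are made up for this example. It only shows why encryption needs one cipher call per block, since each block depends on the previous ciphertext block, while decryption can hand many blocks to the cipher in a single call.

```python
BLOCK = 16  # block size in bytes, same as AES


class ToyCipher:
    """Placeholder block cipher (NOT AES, not secure): XOR with key, rotate bytes."""

    def __init__(self, key):
        assert len(key) == BLOCK
        self.key = key
        self.calls = 0  # counts invocations, to make call overhead visible

    def encrypt(self, data):
        # Accepts any whole number of blocks and processes them independently.
        self.calls += 1
        out = bytearray()
        for i in range(0, len(data), BLOCK):
            x = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], self.key))
            out += x[1:] + x[:1]  # rotate left by one byte
        return bytes(out)

    def decrypt(self, data):
        self.calls += 1
        out = bytearray()
        for i in range(0, len(data), BLOCK):
            blk = data[i:i + BLOCK]
            x = blk[-1:] + blk[:-1]  # rotate right by one byte
            out += bytes(a ^ b for a, b in zip(x, self.key))
        return bytes(out)


def xor(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_encrypt(cipher, iv, plaintext):
    # Each block needs the previous ciphertext block as input, so the
    # cipher has to be called once per block: the work is serialized.
    assert len(plaintext) % BLOCK == 0
    prev = iv
    out = bytearray()
    for i in range(0, len(plaintext), BLOCK):
        prev = cipher.encrypt(xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return bytes(out)


def cbc_decrypt(cipher, iv, ciphertext):
    # All ciphertext blocks are already known, so one cipher call can
    # cover the whole buffer; the chaining XOR is applied afterwards.
    assert len(ciphertext) % BLOCK == 0
    raw = cipher.decrypt(ciphertext)
    return xor(raw, iv + ciphertext[:-BLOCK])


key = bytes(range(16))
iv = bytes(16)
msg = bytes(range(64))  # four blocks

enc = ToyCipher(key)
ct = cbc_encrypt(enc, iv, msg)
dec = ToyCipher(key)
pt = cbc_decrypt(dec, iv, ct)
print(pt == msg, enc.calls, dec.calls)  # True 4 1
```

For the four-block message, the encrypt side makes four calls into the cipher and the decrypt side makes one. With a hardware AES instruction doing the per-block work, that per-call overhead can plausibly become a noticeable share of the total.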
comparative benchmarks of nettle's impl against others & seen the same kind of results ?
You can also try the ./examples/nettle-benchmark program; if openssl was found at configure time, it includes benchmarks of some openssl functions for comparison.
FWIW, I also found there is some weird interaction between nettle and glibc-2.23. If I have that glibc version and run with NETTLE_FAT_VERBOSE=1, it claims it is picking the AESNI impl, but the performance figures clearly show it is actually running the pure software impl, because they're 100 MB/s instead of 325 MB/s.
Odd. Nettle-3.1 used glibc's IFUNC feature, but that was disabled in later versions due to problems with the order in which the resolver functions were called.
Regards, /Niels