On 02/07/2011 01:20 PM, Niels Möller wrote:
Nikos Mavrogiannopoulos <nmav@gnutls.org> writes:
I've also done a comparison benchmark of AES-GCM (the 4-bit table one) versus HMAC-SHAx+AES-CBC... AES-GCM in software is disappointing...
Now I've tried 8-bit tables. Then I get into the same ballpark as md5 and the sha functions (benchmarking on Intel x86_64):

Algorithm  mode    Mbyte/s  cycles/byte  cycles/block
md5        update   174.20         7.12        455.48
sha1       update   158.09         7.84        501.89
sha256     update    68.36        18.14       1160.65
sha512     update   104.99        11.81       1511.55
gmac       auth      65.93        18.80        300.87

I think both sha512 and gmac benefit from 64-bit wide registers, while md5, sha1 and sha256 do not. And I think there are still a couple of micro-optimizations left to do for gmac. (I'm only benchmarking gmac; the encryption should be about the same as AES in ECB or CTR mode, which is roughly 17 cycles/byte on the same hardware.)

Now the question is whether it's a good tradeoff to expand the key to a 4 KB table.
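To make the 4 KB figure concrete, here is a minimal sketch of the 8-bit table idea. It assumes the standard GCM field representation (blocks as 128-bit big-endian integers, leftmost bit is the coefficient of x^0). The names `expand_key`, `table_mul`, `REDUCE` and `ghash` are illustrative only and are not Nettle's API. The expanded key holds 256 entries of 16 bytes each, which is where the 4 KB comes from.

```python
R = 0xE1 << 120  # reduction constant for GCM's bit-reflected GF(2^128)

def mul_x(v):
    """Multiply a field element by x: shift right, reduce if a bit falls off."""
    return (v >> 1) ^ (R if v & 1 else 0)

def gf_mul(x, y):
    """Bit-by-bit reference multiplication (slow, no tables)."""
    z, v = 0, y
    for i in range(127, -1, -1):  # integer bit 127 is the leftmost block bit
        if (x >> i) & 1:
            z ^= v
        v = mul_x(v)
    return z

def mul_x8(v):
    """Multiply by x^8 using eight single-bit steps (used only for setup)."""
    for _ in range(8):
        v = mul_x(v)
    return v

def expand_key(h):
    """Expand hash subkey h: table[b] = (b placed in the first byte) * h."""
    return [gf_mul(b << 120, h) for b in range(256)]

# Key-independent reduction terms for the byte shifted out by a x^8 step.
REDUCE = [mul_x8(b) for b in range(256)]

def table_mul(x, table):
    """Compute x * h with 16 lookups, Horner style from the last byte up."""
    z = 0
    for i in range(16):
        z = (z >> 8) ^ REDUCE[z & 0xFF]      # z = z * x^8
        z ^= table[(x >> (8 * i)) & 0xFF]    # add (byte * h)
    return z

def ghash(h, blocks):
    """GHASH over a list of 128-bit integer blocks (length block included)."""
    table = expand_key(h)
    y = 0
    for block in blocks:
        y = table_mul(y ^ block, table)
    return y

TABLE_BYTES = 256 * 16  # size of the expanded key in bytes
```

Python integers are used here only to keep the arithmetic readable; a C version would store each entry as two 64-bit words, which is presumably where the 64-bit registers help.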
4 KB is not much on a desktop. There are constrained systems where this might be a problem, though. Libtomcrypt had a definition to cope with this issue (e.g. LOW_FOOTPRINT or so).
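A rough sketch of what such a footprint switch could look like: the table width is chosen by a flag, trading memory for lookups per block. The `low_footprint` flag and all function names are hypothetical. They are not libtomcrypt's actual macro or Nettle's API. A 4-bit table needs 16 entries of 16 bytes (256 bytes) and 32 lookups per block, while an 8-bit table needs 4096 bytes and 16 lookups.

```python
R = 0xE1 << 120  # reduction constant for GCM's bit-reflected GF(2^128)

def mul_x(v):
    """Multiply a field element by x."""
    return (v >> 1) ^ (R if v & 1 else 0)

def gf_mul(x, y):
    """Bit-by-bit reference multiplication, used to build the tables."""
    z, v = 0, y
    for i in range(127, -1, -1):
        if (x >> i) & 1:
            z ^= v
        v = mul_x(v)
    return z

def table_width(low_footprint):
    """Hypothetical switch: 4-bit tables on small systems, 8-bit otherwise."""
    return 4 if low_footprint else 8

def expand_key(h, width):
    """table[n] = (n placed in the leading `width` bits) * h."""
    return [gf_mul(n << (128 - width), h) for n in range(1 << width)]

def table_bytes(width):
    """Memory used by the expanded key: 2^width entries of 16 bytes."""
    return (1 << width) * 16

def table_mul(x, table, width):
    """Compute x * h with 128/width lookups, Horner style from the low end."""
    mask = (1 << width) - 1
    z = 0
    for i in range(128 // width):
        for _ in range(width):        # z = z * x^width
            z = mul_x(z)
        z ^= table[(x >> (width * i)) & mask]
    return z
```

Both widths give the same result for the same key, so the choice can stay hidden behind the key-setup call.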
On the other hand, systems that might have an assembler-optimized version would need to share the same big state as well... I don't know. That's why I like hiding that stuff :)
regards, Nikos