On Tue, 2018-01-30 at 20:57 +0100, Niels Möller wrote:
Nikos Mavrogiannopoulos <n.mavrogiannopoulos@gmail.com> writes:
To follow up on this: gcm gets an 8% speedup (on my system) by replacing gcm_crypt() with ctr_crypt(). As is, however, that change replaces the 32-bit counter with an "unlimited" counter. Wouldn't introducing an assert on the encrypt and decrypt lengths be sufficient to share that code?
I think it's valid to use gcm with an IV that makes the 32-bit counter start close to 2^32 - 1, in which case propagating the carry further than 32 bits would produce incorrect results. Right? (I'm afraid there's no test case for that, though.)
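The difference can be illustrated with two plain byte-wise increments (hypothetical helper names, not Nettle code): GCM increments only the big-endian 32-bit field in bytes 12..15 mod 2^32, while a generic CTR increment lets the carry run through the whole block, so they diverge exactly when the counter wraps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GCM-style increment: bump the big-endian 32-bit counter in bytes
   12..15; any carry out of bit 32 is discarded. */
static void
gcm_inc32(uint8_t ctr[16])
{
  for (int i = 15; i >= 12; i--)
    if (++ctr[i] != 0)
      break;
}

/* Generic CTR increment: carry propagates through all 16 bytes. */
static void
ctr_inc(uint8_t ctr[16])
{
  for (int i = 15; i >= 0; i--)
    if (++ctr[i] != 0)
      break;
}
```

Starting both from a counter of 2^32 - 1, gcm_inc32 wraps bytes 12..15 to zero and leaves byte 11 alone, while ctr_inc carries into byte 11, corrupting the IV-derived part of a GCM counter block.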
I agree it would be very nice to reuse ctr_crypt and not duplicate most of the logic. But I think we need a gcm-specific variant of ctr_fill. To do that, it would make sense to add a field
uint32_t u32[4];
to the nettle_block16 union.
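For reference, the union would then look something like this (a sketch based on nettle-types.h at the time; the exact existing members may differ):

```c
#include <stdint.h>

/* Sketch of nettle_block16 with the suggested u32 member added, so a
   gcm-specific fill function can touch the 32-bit counter word
   directly instead of going byte by byte. */
union nettle_block16
{
  uint8_t b[16];
  unsigned long w[16 / sizeof(unsigned long)];
  uint32_t u32[4];   /* proposed addition */
};
```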
To reduce code duplication, we could add a fill function pointer as an argument to ctr_crypt16, and use that for gcm. Not sure if that's a good idea, but it might be nice and clean, and the cost of an indirect call to the fill function should be negligible.
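The callback could be shaped roughly like this (the typedef name and signature are illustrative, not Nettle's final API): expand the counter into a run of consecutive counter blocks and advance it past the last one, so ctr_crypt16 can call it instead of its built-in ctr_fill16.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified block type for this sketch. */
union nettle_block16 { uint8_t b[16]; uint32_t u32[4]; };

/* Hypothetical fill callback type for the proposed extra argument
   to ctr_crypt16. */
typedef void nettle_fill16_func(uint8_t *ctr, size_t blocks,
                                union nettle_block16 *buffer);

/* A trivial fill for demonstration: full 128-bit big-endian
   increment, as in the generic CTR case. */
static void
demo_fill(uint8_t *ctr, size_t blocks, union nettle_block16 *buffer)
{
  for (size_t i = 0; i < blocks; i++)
    {
      memcpy(buffer[i].b, ctr, 16);
      for (int j = 15; j >= 0; j--)
        if (++ctr[j] != 0)
          break;
    }
}
```

ctr_crypt16 would receive a nettle_fill16_func * and invoke it once per batch of blocks; gcm would pass its own 32-bit-wrapping variant.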
It turned out that ctr_crypt16() would not handle the whole input, and that was complicating things. I've modified it to do so, and added the fill-function parameter. I did write a gcm_fill(), but I didn't see the need for the nettle_block16 update: the version I did (quite simplistic) didn't seem to differ in performance from ctr_fill16().
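A simplistic, byte-oriented gcm_fill along those lines might look like this (a sketch; the version actually merged into Nettle may differ): copy the counter block, then increment only the big-endian 32-bit field in bytes 12..15, wrapping mod 2^32 so no carry ever reaches the IV-derived part.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified block type for this sketch. */
union nettle_block16 { uint8_t b[16]; uint32_t u32[4]; };

/* Byte-oriented gcm_fill sketch: no endianness assumptions, no use
   of the u32 member, hence no need for the nettle_block16 change. */
static void
gcm_fill(uint8_t *ctr, size_t blocks, union nettle_block16 *buffer)
{
  for (size_t i = 0; i < blocks; i++)
    {
      memcpy(buffer[i].b, ctr, 16);
      /* Increment only the low 32-bit counter field, mod 2^32. */
      for (int j = 15; j >= 12; j--)
        if (++ctr[j] != 0)
          break;
    }
}
```

Starting the counter at 2^32 - 2, the third generated block wraps to zero while byte 11 (part of the IV-derived prefix) stays untouched.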
regards, Nikos