Speaking of umac, I'm also looking at the umac context structs, for potential micro-optimizations and fixes before they become part of the ABI.
Some fields, like nonce_length, index, and (for umac32 and umac64) nonce_low, fit in 16 or even 8 bits. So it might make sense to make them adjacent.
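As a sketch of what I mean (field names taken from the discussion above, sizes and the rest of the struct purely illustrative, not Nettle's actual layout), grouping the small fields lets them share a single 32-bit slot instead of each occupying a padded word:

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical sketch of a repacked umac context.  The real struct
   has more members; the point is only that index, nonce_length and
   nonce_low, placed adjacently, pack into four bytes. */
struct umac_ctx_sketch
{
  uint32_t l1_key[256];   /* placeholder size */
  uint64_t count;         /* block counter, see below */
  /* small fields grouped together: */
  uint16_t index;
  uint8_t nonce_length;
  uint8_t nonce_low;
};
```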
And on the other hand, the umac block count is currently a 32-bit unsigned number, and will wrap around after 2^32 blocks, or 2^42 bytes. Other hash functions typically support data sizes up to 2^64 (except sha512, which uses a 128-bit counter, which seems gross overkill).
For umac, the block counter is only needed to keep track of when to switch to different layer 2 hashing, and to keep track of odd and even blocks for poly128. So it could probably be made to work with only 16 bits and some saturation logic. But extending it to 64 bits seems simpler.
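To illustrate why extending to 64 bits is the simple option (names here are mine, not Nettle's): both uses of the counter stay trivial, and wraparound becomes a non-issue at 2^64 blocks.

```c
#include <stdint.h>

/* Hypothetical sketch of the two uses of the block counter named
   above, with a 64-bit count.  No saturation logic needed. */

/* Switch to the second-layer hashing once we've seen more than one
   block. */
static int
use_l2_hash (uint64_t count)
{
  return count > 0;
}

/* poly128 processes blocks in even/odd pairs; only the low bit of
   the count matters for that. */
static int
block_is_odd (uint64_t count)
{
  return (int) (count & 1);
}
```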
It would also be nice if we could force 16-byte alignment for the l1_key array (this is important for assembly routines), which would then imply 16-byte alignment for the complete context struct. That could help x86 sse2 assembly, and possibly also ARM, but I'm not sure the system there (primarily linker and malloc) really makes 16-byte alignment possible.
And it would also be good to get a reasonably large alignment for the block buffer.
In gcc, there's __attribute__ ((aligned (16))), but since this gets part of the ABI, we can't use it in public headers unless we can specify the same alignment for *all* reasonable compilers for the given architecture.
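To make the concern concrete, a per-compiler macro would look something like the following (the macro name is my invention; the point is that every compiler targeting a given ABI has to produce the same layout, or the struct layout, and hence the ABI, diverges):

```c
#include <stdint.h>

/* Hypothetical sketch of a per-compiler alignment request. */
#if defined(__GNUC__)
# define NETTLE_ALIGN16 __attribute__ ((aligned (16)))
#elif defined(_MSC_VER)
# define NETTLE_ALIGN16 __declspec (align (16))
#else
# define NETTLE_ALIGN16 /* no known way to request alignment */
#endif

struct aligned_ctx
{
  NETTLE_ALIGN16 uint32_t l1_key[256];
};
```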
Regards, /Niels
On Tue, Apr 16, 2013 at 11:55 AM, Niels Möller nisse@lysator.liu.se wrote:
It would also be nice if we could force 16-byte alignment for the l1_key
array (this is important for assembly routines), which would then imply 16-byte alignment for the complete context struct. That could help x86 sse2 assembly, and possibly also ARM, but I'm not sure the system there (primarily linker and malloc) really makes 16-byte alignment possible.
Would it make sense to force allocation of the context (i.e., no context on the stack) via ctx_alloc() function that will use posix_memalign or memalign?
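Something like the following minimal sketch, assuming a POSIX system (the function name ctx_alloc follows the suggestion above; it is not an existing Nettle interface):

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Hypothetical ctx_alloc(): heap-only context allocation via
   posix_memalign, so alignment no longer depends on the ABI's
   stack rules.  Caller frees with free(). */
void *
ctx_alloc (size_t size)
{
  void *p;
  if (posix_memalign (&p, 16, size) != 0)
    return NULL;
  return p;
}
```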
Alternatively you could have a separate set of functions that would operate on aligned data.
regards, Nikos
Nikos Mavrogiannopoulos n.mavrogiannopoulos@gmail.com writes:
Would it make sense to force allocation of the context (i.e., no context on the stack) via ctx_alloc() function that will use posix_memalign or memalign?
I don't think so. That would be a departure from how Nettle's interfaces currently work: "no memory allocation".
As far as I understand, if we could tell the compiler that the structure must be 16-byte aligned, then it should arrange that also for stack allocated objects.
But maybe it won't be reliable. For example,
struct some_ctx *ctx = alloca (sizeof(*ctx));
is a valid use, which depends on what alignment alloca provides. I'm not sure exactly how that works, but the ABI typically specifies a required alignment for the stack pointer, and I suspect that (i) alloca won't round to a larger alignment than that, and (ii) the ABIs of relevant platforms are unlikely to specify an alignment larger than 8 bytes.
Another ugly alternative would be to allocate one or a few extra elements and align manually, something like
uint32_t a[SIZE + 3];
#define ALIGNED_A ((uint32_t *)(((uintptr_t) a + 15) & ~(uintptr_t) 15))
But that's *too* ugly, I think.
And I'm not sure how much difference it would really make to performance. I guess it's not worth doing unless there's a large, demonstrated gain in performance.
(And umac is not the only case where the x86 assembly files use movups/movupd where I'd prefer to use movaps/movapd).
Regards, /Niels
On Tue, Apr 16, 2013 at 1:08 PM, Niels Möller nisse@lysator.liu.se wrote:
Another ugly alternative would be to allocate one or a few extra elements and align manually, something like
uint32_t a[SIZE + 3];
#define ALIGNED_A ((uint32_t *)(((uintptr_t) a + 15) & ~(uintptr_t) 15))
But that's *too* ugly, I think.
Indeed, from what I see I don't think there is a non-ugly solution to that problem :) If you want to avoid the alignment trickery, you could provide aligned and unaligned versions of the functions and let the caller cope with the alignment.
And I'm not sure how much difference it would really make to performance. I guess it's not worth doing unless there's a large, demonstrated gain in performance.
The results will be very CPU-specific. If you have any benchmark or test code, I could test on i7 and amd 64 cpus.
regards, Nikos
Nikos Mavrogiannopoulos n.mavrogiannopoulos@gmail.com writes:
On Tue, Apr 16, 2013 at 1:08 PM, Niels Möller nisse@lysator.liu.se wrote:
And I'm not sure how much difference it would really make to performance. I guess it's not worth doing unless there's a large, demonstrated gain in performance.
The results will be very CPU-specific. If you have any benchmark or test code, I could test on i7 and amd 64 cpus.
No, I don't have any good benchmark. But maybe it matters mostly for code which is close to memory bandwidth limits.
Speaking of benchmarks, I've written some more umac assembly (not yet in the public repo, I'll try to get it in later today).
x86_64 (Intel i5, 3.4 GHz):
Algorithm       mode      Mbyte/s
sha256          update     286.04
sha512          update     433.52
umac32          update   17837.65
umac64          update    8364.80
umac96          update    6447.72
umac128         update    5270.74
ARM (Cortex-A9, 1 GHz):
Algorithm       mode      Mbyte/s
sha256          update      31.69
sha512          update      30.38
umac32          update     937.02
umac64          update     464.81
umac96          update     383.02
umac128         update     350.13
So umac128 seems to be an order of magnitude faster than sha2, at least on machines with decent multiplication performance.
Regards, /Niels
nisse@lysator.liu.se (Niels Möller) writes:
Speaking of benchmarks, I've written some more umac assembly (not yet in the public repo, I'll try to get it in later today).
Pushed in now. I also updated http://www.lysator.liu.se/~nisse/nettle/plan.html
Regards, /Niels
nettle-bugs@lists.lysator.liu.se