Nikos Mavrogiannopoulos <nmav@gnutls.org> writes:
> btw. the current _salsa20_core takes rounds as a variable. Wouldn't it allow for better optimizations (loop unrolling actually) if that was a static function, or that doesn't matter much?
I don't think it matters very much. But I haven't tried it.
My understanding is that this kind of loop branch is handled well by the branch predictors in current CPUs (in contrast to unpredictable branches, which cost many cycles).
And since a single iteration should be 60-100 instructions, the loop overhead should be almost negligible. Unrolling matters more for small loops.
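To make the comparison concrete, here is a rough sketch of the two variants in question. The names sketch_salsa20_core and sketch_salsa20_core_20 are hypothetical, made up for illustration, and this is not the actual Nettle code. The first takes the round count as a run-time argument. The second fixes it at 20, so a compiler that inlines the call sees a constant trip count and is free to unroll the loop. The round structure follows the public Salsa20 specification.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* One Salsa20 quarter-round on four words of the state. */
#define QROUND(a, b, c, d) do {   \
    b ^= rotl32(a + d, 7);        \
    c ^= rotl32(b + a, 9);        \
    d ^= rotl32(c + b, 13);       \
    a ^= rotl32(d + c, 18);       \
  } while (0)

/* Hypothetical variant 1: round count passed as a run-time variable.
   Each loop iteration is one double round (column round + row round),
   so the loop branch is taken rounds/2 times and is easy to predict. */
void sketch_salsa20_core(uint32_t dst[16], const uint32_t src[16],
                         unsigned rounds)
{
  uint32_t x[16];
  unsigned i;

  for (i = 0; i < 16; i++)
    x[i] = src[i];

  for (i = 0; i < rounds; i += 2)
    {
      /* Column round. */
      QROUND(x[0],  x[4],  x[8],  x[12]);
      QROUND(x[5],  x[9],  x[13], x[1]);
      QROUND(x[10], x[14], x[2],  x[6]);
      QROUND(x[15], x[3],  x[7],  x[11]);
      /* Row round. */
      QROUND(x[0],  x[1],  x[2],  x[3]);
      QROUND(x[5],  x[6],  x[7],  x[4]);
      QROUND(x[10], x[11], x[8],  x[9]);
      QROUND(x[15], x[12], x[13], x[14]);
    }

  /* Feed-forward: add the input words to the permuted state. */
  for (i = 0; i < 16; i++)
    dst[i] = x[i] + src[i];
}

/* Hypothetical variant 2: round count fixed at compile time. If the
   compiler inlines the call, the trip count is a known constant. */
void sketch_salsa20_core_20(uint32_t dst[16], const uint32_t src[16])
{
  sketch_salsa20_core(dst, src, 20);
}
```

Both variants compute the same thing; whether the fixed-count one is measurably faster would have to be benchmarked on the target compiler and CPU.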
Regards,
/Niels