I think there's only one sensitive use of memcmp within nettle, and that's the tag comparison in ccm_decrypt_message. I've now written a private function memeql_sec to do that comparison in a more side-channel silent fashion.
  static int
  memeql_sec (const void *a, const void *b, size_t n)
  {
    volatile const unsigned char *ap = (const unsigned char *) a;
    volatile const unsigned char *bp = (const unsigned char *) b;
    volatile unsigned char d;
    size_t i;
    for (d = i = 0; i < n; i++)
      d |= (ap[i] ^ bp[i]);
    return d == 0;
  }
The idea is to avoid leaking (via timing or memory access patterns) the location of the first difference. Information that a guess for a forged MAC tag matches some characters of the correct MAC can be used to attack the MAC key, in particular for MAC algorithms with linear structure such as gcm and poly1305 (which is why chacha-poly1305 uses a new poly1305 key for each message, and why one shouldn't use gcm with short authentication tags).
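To make the leak concrete, here is a minimal sketch contrasting an early-exit comparison with the accumulate-and-test pattern. The names leaky_compare_steps and leak_demo are hypothetical and not part of nettle; the step count is only a stand-in for running time, under the assumption that time grows with the number of bytes examined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical early-exit comparison, for illustration only.  It returns
   the number of bytes examined, which depends on where the first
   mismatch is; that dependence is what an attacker could measure. */
static size_t
leaky_compare_steps (const unsigned char *a, const unsigned char *b, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (a[i] != b[i])
      return i + 1;             /* stopped after examining i + 1 bytes */
  return n;
}

/* Accumulate-and-test pattern: every byte pair is read and folded into d,
   so the work done does not depend on where the buffers differ. */
static int
memeql_sec (const void *a, const void *b, size_t n)
{
  volatile const unsigned char *ap = (const unsigned char *) a;
  volatile const unsigned char *bp = (const unsigned char *) b;
  volatile unsigned char d;
  size_t i;
  for (d = i = 0; i < n; i++)
    d |= (ap[i] ^ bp[i]);
  return d == 0;
}

/* Hypothetical demo: two wrong guesses for a 4-byte tag. */
static void
leak_demo (void)
{
  const unsigned char tag[4]   = { 0xde, 0xad, 0xbe, 0xef };
  const unsigned char early[4] = { 0x00, 0xad, 0xbe, 0xef }; /* first byte wrong */
  const unsigned char late[4]  = { 0xde, 0xad, 0xbe, 0x00 }; /* last byte wrong */

  /* The early-exit version distinguishes the two wrong guesses ... */
  assert (leaky_compare_steps (tag, early, 4) == 1);
  assert (leaky_compare_steps (tag, late, 4) == 4);

  /* ... while memeql_sec reports only "equal" or "not equal". */
  assert (memeql_sec (tag, early, 4) == 0);
  assert (memeql_sec (tag, late, 4) == 0);
  assert (memeql_sec (tag, tag, 4) == 1);
}
```

With the early-exit version, a guess whose first bytes are right takes measurably more steps than one whose first byte is wrong, which is the information the attack described above relies on.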
memeql_sec is a bit similar to http://nacl.cr.yp.to/verify.html.
Now, applications using nettle are likely doing a lot of similar comparisons on hash and hmac digests. So it would be good to make this function public. But then I'd need to decide on
1. A good name.
2. A suitable header file to declare it in. It would make some sense to group it together with memxor, but memxor.h isn't a good name for that header file.
Regards, /Niels
On Sat, 2015-03-14 at 08:20 +0100, Niels Möller wrote:
> Now, applications using nettle are likely doing a lot of similar comparisons on hash and hmac digests.
They are :) so it would be nice to export them.
> So it would be good to make this function public. But then I'd need to decide on
> - A good name.
nettle_memcmp?
> - A suitable header file to declare it in. It would make some sense to group it together with memxor, but memxor.h isn't a good name for that header file.
nettle/mem.h, which will include memxor, memcmp, and memset?
regards, Nikos
Nikos Mavrogiannopoulos <nmav@gnutls.org> writes:
> nettle_memcmp?
I'd prefer to avoid "memcmp" in the name, since the return value is very different from libc memcmp. Nettle's function returns 1 for equality, and 0 for inequality. So the name should associate with equality.
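A small sketch of why the name matters, assuming a caller that checks a received digest against an expected one. digest_ok_memcmp, digest_ok_sec and convention_demo are hypothetical names used only for illustration; memeql_sec is repeated so the block is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same accumulate-and-test comparison as above: 1 for equal, 0 otherwise. */
static int
memeql_sec (const void *a, const void *b, size_t n)
{
  volatile const unsigned char *ap = (const unsigned char *) a;
  volatile const unsigned char *bp = (const unsigned char *) b;
  volatile unsigned char d;
  size_t i;
  for (d = i = 0; i < n; i++)
    d |= (ap[i] ^ bp[i]);
  return d == 0;
}

/* libc convention: memcmp returns 0 when the buffers are equal, so the
   result has to be compared against zero to get a truth value. */
static int
digest_ok_memcmp (const unsigned char *expected, const unsigned char *got,
                  size_t n)
{
  return memcmp (expected, got, n) == 0;
}

/* Equality convention: the return value is already the truth value. */
static int
digest_ok_sec (const unsigned char *expected, const unsigned char *got,
               size_t n)
{
  return memeql_sec (expected, got, n);
}

/* Hypothetical demo of the two conventions on the same inputs. */
static void
convention_demo (void)
{
  const unsigned char x[3] = { 1, 2, 3 };
  const unsigned char y[3] = { 1, 2, 4 };

  assert (memcmp (x, x, 3) == 0);       /* equal: memcmp gives 0 */
  assert (memeql_sec (x, x, 3) == 1);   /* equal: memeql_sec gives 1 */
  assert (memcmp (x, y, 3) != 0);       /* different: memcmp is non-zero */
  assert (memeql_sec (x, y, 3) == 0);   /* different: memeql_sec gives 0 */
}
```

A function called nettle_memcmp would invite callers to write `if (!nettle_memcmp (a, b, n))` for the equal case, which with the return values above would accept exactly the wrong inputs.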
"sec" in the name is a gmp convention for side-channel silent functions, where it's usually a prefix, e.g., mpn_sec_mul. But it is used mainly when there is some "non-sec" function computing the same thing but with side-channel leaks.
> nettle/mem.h, which will include memxor, memcmp, and memset?
Makes sense. Or memory.h, or memops.h.
Regards, /Niels
On 14/03/2015 8:20 p.m., Niels Möller wrote:
> I think there's only one sensitive use of memcmp within nettle, and that's the tag comparison in ccm_decrypt_message. I've now written a private function memeql_sec to do that comparison in a more side-channel silent fashion.
>
>   static int
>   memeql_sec (const void *a, const void *b, size_t n)
>   {
>     volatile const unsigned char *ap = (const unsigned char *) a;
>     volatile const unsigned char *bp = (const unsigned char *) b;
>     volatile unsigned char d;
>     size_t i;
>     for (d = i = 0; i < n; i++)
>       d |= (ap[i] ^ bp[i]);
>     return d == 0;
>   }
Is the compiler-optimized code for that for loop faster or slower than a loop summing the differentials?
  {
    volatile const unsigned char *ap = (const unsigned char *) a + n;
    volatile const unsigned char *bp = (const unsigned char *) b + n;
    volatile unsigned char d;
    for (d = 0; ap >= a; ap--, bp--)
      d += (*ap - *bp);
    return d == 0;
  }
Or do the subtract and add still leak timing through CPU-internal optimizations that the bitmasking avoids?
NP: That would allow this function to take the uint8_t that most of nettle operates with.
(Sorry if that's a dumb Q, it's been a long time since I worked on anything like this.)
AYJ
Amos Jeffries <squid3@treenet.co.nz> writes:
> Is the compiler-optimized code for that for loop faster or slower than a loop summing the differentials?
Not sure. But I don't think performance is very important here, the function is going to be used on pretty small inputs.
>   volatile unsigned char d;
>   for (d = 0; ap >= a; ap--, bp--)
>     d += (*ap - *bp);
I don't think that is correct, since d may wrap around to zero. One would need to accumulate into a larger variable, something like
  unsigned d;
  for (d = 0; ap >= a; ap--, bp--)
    d += (uint8_t)(*ap - *bp);
which, if unsigned int is 32 bits, would be correct for n up to 2^24. (I think the cast is necessary, to avoid values being promoted to *signed* int). Using | is simpler and more robust.
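The wrap-around can be seen with a two-byte input. The sketch below is illustrative only: sum8_equal, sum_wide_equal, or_equal and wrap_demo are hypothetical names, the loops run forward by index rather than reproducing the pointer loop quoted above, and volatile is left out because only the arithmetic is being shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of byte differences in an 8-bit accumulator: the sum is reduced
   mod 256, so differences that add up to a multiple of 256 cancel. */
static int
sum8_equal (const unsigned char *a, const unsigned char *b, size_t n)
{
  unsigned char d = 0;
  size_t i;
  for (i = 0; i < n; i++)
    d += (unsigned char) (a[i] - b[i]);
  return d == 0;
}

/* Same sum in an unsigned int.  Each term is at most 255, so with a
   32-bit unsigned int the sum cannot wrap for n up to 2^24. */
static int
sum_wide_equal (const unsigned char *a, const unsigned char *b, size_t n)
{
  unsigned d = 0;
  size_t i;
  for (i = 0; i < n; i++)
    d += (uint8_t) (a[i] - b[i]);
  return d == 0;
}

/* OR of XORs: a set bit can never be cleared again, for any n. */
static int
or_equal (const unsigned char *a, const unsigned char *b, size_t n)
{
  unsigned char d = 0;
  size_t i;
  for (i = 0; i < n; i++)
    d |= (unsigned char) (a[i] ^ b[i]);
  return d == 0;
}

/* Hypothetical demo: differences of +1 and -1 (255 as a byte) sum to 256. */
static void
wrap_demo (void)
{
  const unsigned char a[2] = { 1, 0 };
  const unsigned char b[2] = { 0, 1 };

  assert (sum8_equal (a, b, 2) == 1);      /* wrong: reports "equal" */
  assert (sum_wide_equal (a, b, 2) == 0);  /* 256 != 0, reports "different" */
  assert (or_equal (a, b, 2) == 0);        /* 1 | 1 != 0, reports "different" */
}
```

The 8-bit sum accepts { 1, 0 } and { 0, 1 } as equal because 1 + 255 is 0 mod 256; the wider accumulator only moves the problem out to larger n, whereas the OR version has no such limit.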
> NP: That would allow this function to take the uint8_t that most of nettle operates with.
Like memxor, this function tries to mimic the conventions of the libc mem* functions, not nettle's conventions.
Regards, /Niels
On 15/03/2015 12:39 a.m., Niels Möller wrote:
> Amos Jeffries writes:
>> Is the compiler-optimized code for that for loop faster or slower than a loop summing the differentials?
>
> Not sure. But I don't think performance is very important here, the function is going to be used on pretty small inputs.
>
>>   volatile unsigned char d;
>>   for (d = 0; ap >= a; ap--, bp--)
>>     d += (*ap - *bp);
>
> I don't think that is correct, since d may wrap around to zero. One would need to accumulate into a larger variable, something like
>
>   unsigned d;
>   for (d = 0; ap >= a; ap--, bp--)
>     d += (uint8_t)(*ap - *bp);
>
> which, if unsigned int is 32 bits, would be correct for n up to 2^24. (I think the cast is necessary, to avoid values being promoted to *signed* int). Using | is simpler and more robust.
Aha! Thank you.
AYJ
nettle-bugs@lists.lysator.liu.se