Hello,
I haven't made all the changes you requested but here's my situation.
Le 2021-03-06 à 04 h 45, Niels Möller a écrit :
Nicolas Mora nicolas@babelouest.org writes:
About the byteswapping comment, the code
// A = MSB(64, B) ^ t where t = (n*j)+i A64 = READ_UINT64(B.b); A64 ^= (n*j)+(i+1); WRITE_UINT64(A.b, A64);
could be replaced by something like
#if WORDS_BIGENDIAN A.u64 = B.u64 ^ (n*j)+(i+1); #elif HAVE_BUILTIN_BSWAP64 A.u64 = B.u64 ^ __builtin_bswap64((n*j)+(i+1)); #else ... READ_UINT64 / WRITE_UINT64 or some other workaround ... #endif
The problem I have with this is the type of A and B: nettle_block8 and nettle_block16, and in nettle_block16, u64 is declared as 'uint64_t u64[2];'
So do I need to choose B.u64[0] or B.u64[1] depending on the architecture?
Preferably encapsulated into a single macro, so it doesn't have to be duplicated in both the wrap and the unwrap function. There's another example of using __builtin_bswap64 in ctr.c.
I've started something like that:
#if WORDS_BIGENDIAN #define MSB_XOR_T_WRAP(A, B, xor) A.u64 = B.u64 ^ (xor); #elif HAVE_BUILTIN_BSWAP64 #define MSB_XOR_T_WRAP(A, B, xor) A.u64 = B.u64 ^ __builtin_bswap64((xor)); #else #define MSB_XOR_T_WRAP(A, B, xor) \ uint64_t A64_##A##; \ A64_##A## = READ_UINT64 (B.b); \ A64_##A## ^= (xor); \ WRITE_UINT64 (A.b, A64_##A##); \ #endif
First I need to fix my B.u64 issue
- Intialization: If you don't intend to use the initial values, omit
initialization in declarations like
union nettle_block16 I = {0}, B = {0}; union nettle_block8 A = {0};
Fixed
- Some or all memcpys in the main loop can be replaced by uint64_t
operations, e.g.,
I.u64 = A.u64;
instead of
memcpy(I.b, A.b, 8);
I have the same problem with B.u64 being an array
Therefore I removed 'uint8_t R[64]' to use TMP_GMP_DECL(R, uint8_t); instead.
Unfortunately, that doesn't work: This code should go into libnettle (not libhogweed), and then it can't depend on GMP. You could do plain malloc + free, but according to the README file, Nettle doesn't do memory allocation, so that's not ideal.
I think it should be doable to reuse the output buffer as temporary storage (R = ciphertext for wrap, R = cleartext for unwrap). In-place operation (ciphertext == cleartext) should be supported (but no partial overlap), so it's important to test that case.
I've updated the MR to reuse ciphertext or cleartext as R. The original test cases still pass, I'll have to complete the tests.
Using the output area directly has the drawback that it isn't aligned, so you'll need to keep some memcpys in the main loop. One could consider using an aligned pointer into output buffer and separate handling of first and/or last block, but if that's a lot of extra complexty, I wouldn't do it unless either (i) it gives a significant performance improvement, or (ii) it turns out to actually be reasonably nice and clean.
I'll rely on your wisdom about that if you don't mind
- And one more nit: Indentation. It's fine to use TAB characters, but
they must be assumed to be traditional TAB to 8 positions: changing the appearance of TAB to anything else in one's editor is wrong, because it makes the code look weird for everyone else (e.g., in gitlab's ui). And the visual appearance should follow GNU standards, braces on their own lines, indent steps of two spaces, which means usually SPC characters, with TAB only for large indentation.
I agree, I've updated the indentation using gnu indent with gnu style
/Nicolas