Michael Weiser michael@weiser.dinsnail.net writes:
Hi Niels,
On Wed, Feb 07, 2018 at 01:13:32PM +0100, Niels Möller wrote:
Can you check if it's detected correctly also when cross-compiling?
[...] Seems fine.
Good!
FAIL: memxor
This also does some tricks with word reads and rotate. (The C code does that too, but with conditions on WORDS_BIGENDIAN).
I think I got memxor, sha1 and sha256 sorted. Patch below.
Nice. Only some quick comments for now.
This one is still failing, even though memxor and sha are fixed. I've been looking at the code and can't find any apparent reason. In chacha-core-internal.c I see the following bit of code that does seem to do endianness handling:
dst[i] = LE_SWAP32 (t);
Would this apply to chacha-core-internal.asm, too?
That's right, it is expected to produce the output as an array of 16 32-bit words, stored in *little* endian order. So byteswap is needed before the final
vstm DST, {X0,X1,X2,X3}
instruction. The idea is that the buffer can be used directly with memxor, without the C code having to do any byte swaps.
FAIL: umac
Similar problem, I would guess. But this time, loading 64 bits at a time into neon registers.
I'm drawing a bit of a blank on this one. It fails on the very first test case of umac32 where only umac-nh is used and all the input is zeroes. So there does seem to be another endianness dependency in the actual computation code.
It could be that if the all-zero input is misaligned, the aligned read + rotation tricks gets non-zero data from outside of the input area into the computation.
Have I understood correctly, that vld1.8 reads a byte stream and should be endianness-neutral
Don't remember, have to consult the arm docs.
and the keys are in host endianness?
I think so.
I was wrong: While the compiler is able to output big-endian objects with -mbig-endian, it needs matching libs as well (e.g. libgcc_s). Debian doesn't have anything precompiled for armeb.
Maybe full-system qemu is easier then (assuming there's some dist supporting big-endian arm)?
Instead I augmented the default action (which is documented and shouldn't change) by setting ASM_WORDS_BIGENDIAN directly. Also this should make the explicit value checking in IF_BE redundant because we now know for sure configure will never emit anything other than yes and no. Documentation says that AC_C_BIGENDIAN will abort if endianness can't be determined.
[...]
--- a/configure.ac +++ b/configure.ac @@ -201,7 +201,9 @@ LSH_FUNC_STRERROR # getenv_secure is used for fat overrides, # getline is used in the testsuite AC_CHECK_FUNCS(secure_getenv getline) -AC_C_BIGENDIAN +AC_C_BIGENDIAN([AC_DEFINE([WORDS_BIGENDIAN], 1)
- [ASM_WORDS_BIGENDIAN=yes]],
- [ASM_WORDS_BIGENDIAN=no])
I think I'd indent differently, to group the two parts ACTION-IF-TRUE together, and drop the redundant square brackets in "[ASM_WORDS_BIGENDIAN=yes]".
You leave the ACTION-IS-UNIVERSAL as default, which I think is good. I hope that's not relevant for arm. Might still be good to set ASM_WORDS_BIGENDIAN to some default value before this check, and have IF_BE fail if used with an unknown endianness.
Regards, /Niels