Nikos Mavrogiannopoulos nmav@gnutls.org writes:
I was debugging an invalid memory access in gnutls and realized that the issue is in memxor3 of nettle (2.7.x branch).
I have now tried your new test case. I can reproduce the valgrind warning. But after stepping through the memxor3 function, I think it's harmless. This is what happens:
We get an memxor3 call with size 15. We have
dst area: 0x7fffffffe91d, ending before 0x7fffffffe92c src1 area: 0x6333d0, ending before 0x6333df src2 area: 0x6333f0, ending before 0x6333ff
Since the end of the destination area, 0x7fffffffe92c, is unaligned, the initial byte loop processes the last 4 bytes of the areas, and reduces the length accordingly, to 11. The remaining work stores to the destination area ending at the aligned address 0x7fffffffe928.
Next, we enter the special case that the src operands have the same alignment, but different from the destination. Pointers and shift counts are set up, and the code then makes aligned reads at
0x6333d0, 0x6333d8 (src1) 0x6333f0, 0x6333f8 (src2)
The reads at 0x6333d8 and 0x6333f8 include one byte beyond the end of the input areas, and one of them triggers a valgrind warning.
The result of these reads are xored together, then shifted and ored together, so that the last five bytes read are ignored (i.e., the byte beyond the end of input, as well as the 4 bytes already processed). And the result is then stored at the aligned address 0x7fffffffe920.
Then the code exits the main loop (actually, it never entered, the above operations were done by the loop setup for odd word count, since the main loop uses two-way unrolling).
And then there's a final byte loop, processing the first three bytes of the areas (doing some unnecessary rereads of the input bytes at addresses 0x6333d[0-2] and 0x6333f[0-2]).
My conclusions are:
1. There's no bug here.
2. We should use the --partial-loads-ok=yes valgrind option. (The manual says "Note that code that behaves in this way is in violation of the the ISO C/C++ standards, and should be considered broken.", but those standards clearly don't apply to assembly code).
3. memxor.c might also use "partial loads" in a way which violates C standards. I don't think that's a problem on any real system, and, e.g, glibc memcmp does similar tricks.
Regards, /Niels