Simo Sorce <simo@redhat.com> writes:
> On Wed, 2019-03-20 at 06:14 +0100, Niels Möller wrote:
>> And another possible trick for big-endian is to do an "opposite-endian" left shift as
>>
>>   ((x & 0x7f7f7f7f7f7f7f7f) << 1) | ((x & 0x8080808080808080) >> 15)
>>
>> where the top bit of the least significant byte (the last 0x80 in the mask) is the carry out.
>
> This would allow us to avoid copies at the cost of more complicated code.
>
> Which do you prefer? Using endian.h where available? Or having two separate code paths depending on the endianness of the machine?
If it matters for performance, use the fastest variant. Using separate implementations of xts_shift, with #if:s depending on endianness and compiler support, is fine.
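For illustration, a minimal sketch of what such an xts_shift with #if-selected variants could look like. It is untested on big-endian hardware and is not the actual patch. It assumes the GCC/Clang predefined macros __BYTE_ORDER__, __ORDER_BIG_ENDIAN__ and __ORDER_LITTLE_ENDIAN__ for detection (a configure-time check would work equally well), and it falls back to a bytewise loop when the byte order is unknown. The function multiplies the 16-byte tweak by alpha in place: a one-bit left shift of the little-endian value, then XOR of 0x87 into byte 0 if a bit was shifted out.

```c
#include <stdint.h>
#include <string.h>

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
/* Big-endian host: opposite-endian shift on natively loaded words,
   so no byte swapping is needed. */
static void
xts_shift(uint8_t *t)
{
  uint64_t w0, w1, c0, c1;
  memcpy(&w0, t, 8);            /* bytes 0..7, byte 0 is most significant */
  memcpy(&w1, t + 8, 8);        /* bytes 8..15 */
  c0 = (w0 >> 7) & 1;           /* top bit of byte 7, carries into byte 8 */
  c1 = (w1 >> 7) & 1;           /* top bit of byte 15, the final carry out */
  w0 = ((w0 & 0x7f7f7f7f7f7f7f7fULL) << 1)
    | ((w0 & 0x8080808080808080ULL) >> 15);
  w1 = ((w1 & 0x7f7f7f7f7f7f7f7fULL) << 1)
    | ((w1 & 0x8080808080808080ULL) >> 15);
  w1 |= c0 << 56;               /* low bit of byte 8 */
  w0 ^= (0x87 & -c1) << 56;     /* reduction goes into byte 0 */
  memcpy(t, &w0, 8);
  memcpy(t + 8, &w1, 8);
}
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
/* Little-endian host: the native word shift is already the right one. */
static void
xts_shift(uint8_t *t)
{
  uint64_t lo, hi, carry;
  memcpy(&lo, t, 8);
  memcpy(&hi, t + 8, 8);
  carry = hi >> 63;
  hi = (hi << 1) | (lo >> 63);
  lo = (lo << 1) ^ (0x87 & -carry);
  memcpy(t, &lo, 8);
  memcpy(t + 8, &hi, 8);
}
#else
/* Unknown byte order: portable bytewise fallback. */
static void
xts_shift(uint8_t *t)
{
  unsigned carry = 0;
  int i;
  for (i = 0; i < 16; i++)
    {
      unsigned next = t[i] >> 7;
      t[i] = (uint8_t) ((t[i] << 1) | carry);
      carry = next;
    }
  if (carry)
    t[0] ^= 0x87;
}
#endif
```

All three variants are meant to produce the same bytes in memory, so a test can compare any of them against the bytewise fallback.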
I'd expect the opposite-endian shift to be more efficient when bswap is particularly slow, e.g., when bswap is itself implemented in terms of shifting and masking.
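To make the comparison concrete, here is a small sketch of the two single-word variants side by side. The function names are made up for this example, and bswap64_portable is a stand-in for what a compiler might have to emit on a machine without a native byte-swap instruction. Both variants operate on plain 64-bit values, so the comparison does not depend on the byte order of the machine running it.

```c
#include <stdint.h>

/* Byte swap built from shifts and masks only (hypothetical helper). */
static uint64_t
bswap64_portable(uint64_t x)
{
  x = ((x & 0x00ff00ff00ff00ffULL) << 8) | ((x >> 8) & 0x00ff00ff00ff00ffULL);
  x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
  return (x << 32) | (x >> 32);
}

/* Variant 1: swap to the other byte order, shift left, swap back. */
static uint64_t
shift_via_bswap(uint64_t x, unsigned *carry)
{
  uint64_t y = bswap64_portable(x);
  *carry = (unsigned) (y >> 63);
  return bswap64_portable(y << 1);
}

/* Variant 2: opposite-endian shift directly on x.  The top bit of each
   byte moves to the low bit of the next less significant byte; the top
   bit of the least significant byte is the carry out. */
static uint64_t
shift_opposite_endian(uint64_t x, unsigned *carry)
{
  *carry = (unsigned) ((x >> 7) & 1);
  return ((x & 0x7f7f7f7f7f7f7f7fULL) << 1)
    | ((x & 0x8080808080808080ULL) >> 15);
}
```

With a shift-and-mask bswap, variant 1 pays for two such swaps plus the shift, while variant 2 needs only two masks, two shifts and an OR. Which one wins in practice would still have to be measured.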
That is a bit difficult to determine, though. Neither the existence of the endian.h macros nor that of __builtin_bswap64 implies that byte swapping is cheap. Are there any interesting platforms these days that are big-endian and lack an efficient bswap instruction? Does mips have a bswap instruction?
Regards,
/Niels