Hello Niels,
On Sun, Feb 11, 2018 at 11:03:41AM +0100, Niels Möller wrote:
> > dst[i] = LE_SWAP32 (t);
> >
> > Would this apply to chacha-core-internal.asm, too?
>
> That's right, it is expected to produce the output as an array of 16
> 32-bit words, stored in *little* endian order. So byteswap is needed.
Right. When this still didn't fix it, I compared little- and big-endian behaviour and found that a) vldm and vstm switch the two doublewords of a quadword register (for no reason I can see or find documented) and b) vext extracts from the top of the vector, not the bottom. Taking both into account, I now have chacha and salsa20 passing their tests.
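To make sure I have the output contract right: on BE the cores now do a vrev32.u8 on each result register before the vstm, which should match what the C implementation does. Roughly the following (an untested sketch, not the actual nettle code; LE_SWAP32 is approximated with __builtin_bswap32 and WORDS_BIGENDIAN is the configure result):

  #include <stdint.h>

  /* identity on little-endian, byte swap on big-endian */
  #ifdef WORDS_BIGENDIAN
  # define LE_SWAP32(v) __builtin_bswap32(v)
  #else
  # define LE_SWAP32(v) (v)
  #endif

  /* Add the input words back in and store the sums so that the
     uint32_t result array, seen as bytes, is little-endian on any
     host.  The asm core has to end up with the same byte order. */
  static void
  store_result (uint32_t *dst, const uint32_t x[16], const uint32_t src[16])
  {
    unsigned i;
    for (i = 0; i < 16; i++)
      {
        uint32_t t = x[i] + src[i];
        dst[i] = LE_SWAP32 (t);
      }
  }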
> > FAIL: umac
> >
> > I'm drawing a bit of a blank on this one. It fails on the very first
>
> It could be that if the all-zero input is misaligned, the aligned
> read + rotation trick gets non-zero data from outside of the input
> area into the computation.
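(For reference, this is how I understand the trick you mean; an untested C sketch, all names mine:)

  #include <stdint.h>

  /* Read a 32-bit word from a possibly misaligned pointer the way the
     asm loops do: load the two enclosing aligned words and combine
     them with shifts.  This is the little-endian variant; big-endian
     shifts the other way around.  Note that base[1] touches bytes
     beyond the requested range, which is where stray non-zero data
     could come from. */
  static uint32_t
  load_misaligned_le (const uint8_t *p)
  {
    uintptr_t a = (uintptr_t) p;
    const uint32_t *base = (const uint32_t *) (a & ~(uintptr_t) 3);
    unsigned shift = 8 * (a & 3);

    if (shift == 0)
      return base[0];
    return (base[0] >> shift) | (base[1] << (32 - shift));
  }

That doesn't seem to have been the problem in my case, though.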
Apparently, NEON adjusts for endianness, meaning shifts switch direction as well, which would explain why chacha and salsa20 didn't need more adjustment. All I needed to do in the end was change the order of registers for the 64-bit return value in umac-nh, and the checks passed.
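If I read the AAPCS right, a 64-bit return value is handed over in r0/r1 in memory order (as if loaded with ldm), so on BE r0 carries the high word. A tiny, untested illustration of that memory-order view:

  #include <inttypes.h>
  #include <stdio.h>

  int
  main (void)
  {
    /* The two 32-bit words a uint64_t occupies in memory: w[0] is the
       low half on little-endian but the high half on big-endian, which
       is why the vmov in umac-nh has to swap r0 and r1. */
    union { uint64_t v; uint32_t w[2]; } u = { .v = 0x1122334455667788ULL };
    printf ("w[0]=%08" PRIx32 " w[1]=%08" PRIx32 "\n", u.w[0], u.w[1]);
    return 0;
  }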
I now have the whole testsuite passing apart from these two:
PASS: cxx
./sexp-conv-test: line 17: ../tools/sexp-conv: No such file or directory
cmp: EOF on test1.out which is empty
FAIL: sexp-conv
SKIP: pkcs1-conv
./nettle-pbkdf2-test: line 18: ../tools/nettle-pbkdf2: No such file or directory
cmp: EOF on test1.out which is empty
FAIL: nettle-pbkdf2
PASS: symbols
PASS: dlopen
====================
2 of 93 tests failed
====================
They've been failing all along. Can they be ignored?
> > I was wrong: While the compiler is able to output big-endian objects
> > with -mbig-endian, it needs matching libs as well (e.g. libgcc_s).
> > Debian doesn't have anything precompiled for armeb.
>
> Maybe full-system qemu is easier then (assuming there's some dist
> supporting big-endian arm)?
Weeell, depends on what you consider easier: I haven't found any binary distribution that supports armeb. Yocto and buildroot seem to support it but still require compiling the whole thing.
> > -AC_C_BIGENDIAN
> > +AC_C_BIGENDIAN([AC_DEFINE([WORDS_BIGENDIAN], 1)
> > +                [ASM_WORDS_BIGENDIAN=yes]],
> > +               [ASM_WORDS_BIGENDIAN=no])
> I think I'd indent differently, to group the two parts ACTION-IF-TRUE
> together, and drop the redundant square brackets in
> "[ASM_WORDS_BIGENDIAN=yes]".
Done.
> You leave the ACTION-IF-UNIVERSAL as default, which I think is good.
> I hope that's not relevant for arm.
Ahem, it seems I was looking at old autoconf documentation which didn't have ACTION-IF-UNIVERSAL.
Apple does do arm, and someone could conceivably want to build a fat nettle that supports x86_64 and arm, or rather arm and arm64.
Does nettle currently support being compiled fat with assembly at all? I guess it would require building each platform's asm sources separately and then lipo-ing them together; I'm not clear on the specifics.
> Might still be good to set ASM_WORDS_BIGENDIAN to some default value
> before this check, and have IF_BE fail if used with an unknown
> endianness.
But then I want to have a nice error message so as not to leave the user with an aborted build and no apparent reason. :) Is this portable?
The patch has gotten quite large now. Would it be better if I made a series out of it?
From 67de31a70f8b8076681d6ddd221605365080103f Mon Sep 17 00:00:00 2001
From: Michael Weiser <michael.weiser@gmx.de>
Date: Wed, 7 Feb 2018 00:11:24 +0100
Subject: [PATCH] Support big-endian arm in assembly code

Introduce m4 macros to conditionally handle differences of little- and
big-endian arm in assembler code. Adjust sha1-compress, sha256-compress,
umac-nh, chacha-core-internal, salsa20-core-internal and memxor for arm
to work in big-endian mode.
---
 arm/memxor.asm                     | 21 ++++++++++++----
 arm/memxor3.asm                    | 49 +++++++++++++++++++++++++-------------
 arm/neon/chacha-core-internal.asm  | 37 +++++++++++++++++++++++-----
 arm/neon/salsa20-core-internal.asm | 43 ++++++++++++++++++++++++++++-----
 arm/neon/umac-nh.asm               |  4 +++-
 arm/v6/sha1-compress.asm           |  8 +++++--
 arm/v6/sha256-compress.asm         | 14 +++++++----
 asm.m4                             |  8 +++++++
 config.m4.in                       |  1 +
 configure.ac                       |  7 +++++-
 10 files changed, 149 insertions(+), 43 deletions(-)
diff --git a/arm/memxor.asm b/arm/memxor.asm
index a50e91bc..239a4034 100644
--- a/arm/memxor.asm
+++ b/arm/memxor.asm
@@ -44,6 +44,11 @@ define(<N>, <r2>)
 define(<CNT>, <r6>)
 define(<TNC>, <r12>)
 
+C little-endian and big-endian need to shift in different directions for
+C alignment correction
+define(<S0ADJ>, IF_LE(<lsr>, <lsl>))
+define(<S1ADJ>, IF_LE(<lsl>, <lsr>))
+
 	.syntax unified
 
 	.file "memxor.asm"
@@ -99,6 +104,8 @@ PROLOGUE(nettle_memxor)
 	C
 	C With little-endian, we need to do
 	C DST[i] ^= (SRC[i] >> CNT) ^ (SRC[i+1] << TNC)
+	C With big-endian, we need to do
+	C DST[i] ^= (SRC[i] << CNT) ^ (SRC[i+1] >> TNC)
 
 	push	{r4,r5,r6}
 
@@ -117,14 +124,14 @@ PROLOGUE(nettle_memxor)
 .Lmemxor_word_loop:
 	ldr	r5, [SRC], #+4
 	ldr	r3, [DST]
-	eor	r3, r3, r4, lsr CNT
-	eor	r3, r3, r5, lsl TNC
+	eor	r3, r3, r4, S0ADJ CNT
+	eor	r3, r3, r5, S1ADJ TNC
 	str	r3, [DST], #+4
 .Lmemxor_odd:
 	ldr	r4, [SRC], #+4
 	ldr	r3, [DST]
-	eor	r3, r3, r5, lsr CNT
-	eor	r3, r3, r4, lsl TNC
+	eor	r3, r3, r5, S0ADJ CNT
+	eor	r3, r3, r4, S1ADJ TNC
 	str	r3, [DST], #+4
 	subs	N, #8
 	bcs	.Lmemxor_word_loop
@@ -132,10 +139,14 @@ PROLOGUE(nettle_memxor)
 	beq	.Lmemxor_odd_done
 
 	C We have TNC/8 left-over bytes in r4, high end
-	lsr	r4, CNT
+	S0ADJ	r4, CNT
 	ldr	r3, [DST]
 	eor	r3, r4
 
+	C memxor_leftover does an LSB store
+	C so we need to reverse if actually BE
+IF_BE(<	rev	r3, r3>)
+
 	pop	{r4,r5,r6}
 
 	C Store bytes, one by one.
diff --git a/arm/memxor3.asm b/arm/memxor3.asm
index 139fd208..69598e1c 100644
--- a/arm/memxor3.asm
+++ b/arm/memxor3.asm
@@ -49,6 +49,11 @@ define(<ATNC>, <r10>)
 define(<BCNT>, <r11>)
 define(<BTNC>, <r12>)
 
+C little-endian and big-endian need to shift in different directions for
+C alignment correction
+define(<S0ADJ>, IF_LE(<lsr>, <lsl>))
+define(<S1ADJ>, IF_LE(<lsl>, <lsr>))
+
 	.syntax unified
 
 	.file "memxor3.asm"
@@ -124,6 +129,8 @@ PROLOGUE(nettle_memxor3)
 	C
 	C With little-endian, we need to do
 	C DST[i-i] ^= (SRC[i-i] >> CNT) ^ (SRC[i] << TNC)
+	C With big-endian, we need to do
+	C DST[i-i] ^= (SRC[i-i] << CNT) ^ (SRC[i] >> TNC)
 
 	rsb	ATNC, ACNT, #32
 	bic	BP, #3
@@ -138,14 +145,14 @@ PROLOGUE(nettle_memxor3)
 .Lmemxor3_au_loop:
 	ldr	r5, [BP, #-4]!
 	ldr	r6, [AP, #-4]!
-	eor	r6, r6, r4, lsl ATNC
-	eor	r6, r6, r5, lsr ACNT
+	eor	r6, r6, r4, S1ADJ ATNC
+	eor	r6, r6, r5, S0ADJ ACNT
 	str	r6, [DST, #-4]!
 .Lmemxor3_au_odd:
 	ldr	r4, [BP, #-4]!
 	ldr	r6, [AP, #-4]!
-	eor	r6, r6, r5, lsl ATNC
-	eor	r6, r6, r4, lsr ACNT
+	eor	r6, r6, r5, S1ADJ ATNC
+	eor	r6, r6, r4, S0ADJ ACNT
 	str	r6, [DST, #-4]!
 	subs	N, #8
 	bcs	.Lmemxor3_au_loop
@@ -154,7 +161,11 @@ PROLOGUE(nettle_memxor3)
 
 	C Leftover bytes in r4, low end
 	ldr	r5, [AP, #-4]
-	eor	r4, r5, r4, lsl ATNC
+	eor	r4, r5, r4, S1ADJ ATNC
+
+	C leftover does an LSB store
+	C so we need to reverse if actually BE
+IF_BE(<	rev	r4, r4>)
 
 .Lmemxor3_au_leftover:
 	C Store a byte at a time
@@ -247,21 +258,25 @@ PROLOGUE(nettle_memxor3)
 	ldr	r5, [AP, #-4]!
 	ldr	r6, [BP, #-4]!
 	eor	r5, r6
-	lsl	r4, ATNC
-	eor	r4, r4, r5, lsr ACNT
+	S1ADJ	r4, ATNC
+	eor	r4, r4, r5, S0ADJ ACNT
 	str	r4, [DST, #-4]!
 .Lmemxor3_uu_odd:
 	ldr	r4, [AP, #-4]!
 	ldr	r6, [BP, #-4]!
 	eor	r4, r6
-	lsl	r5, ATNC
-	eor	r5, r5, r4, lsr ACNT
+	S1ADJ	r5, ATNC
+	eor	r5, r5, r4, S0ADJ ACNT
 	str	r5, [DST, #-4]!
 	subs	N, #8
 	bcs	.Lmemxor3_uu_loop
 	adds	N, #8
 	beq	.Lmemxor3_done
 
+	C leftover does an LSB store
+	C so we need to reverse if actually BE
+IF_BE(<	rev	r4, r4>)
+
 	C Leftover bytes in a4, low end
 	ror	r4, ACNT
 .Lmemxor3_uu_leftover:
@@ -290,18 +305,18 @@ PROLOGUE(nettle_memxor3)
 .Lmemxor3_uud_loop:
 	ldr	r5, [AP, #-4]!
 	ldr	r7, [BP, #-4]!
-	lsl	r4, ATNC
-	eor	r4, r4, r6, lsl BTNC
-	eor	r4, r4, r5, lsr ACNT
-	eor	r4, r4, r7, lsr BCNT
+	S1ADJ	r4, ATNC
+	eor	r4, r4, r6, S1ADJ BTNC
+	eor	r4, r4, r5, S0ADJ ACNT
+	eor	r4, r4, r7, S0ADJ BCNT
 	str	r4, [DST, #-4]!
 .Lmemxor3_uud_odd:
 	ldr	r4, [AP, #-4]!
 	ldr	r6, [BP, #-4]!
-	lsl	r5, ATNC
-	eor	r5, r5, r7, lsl BTNC
-	eor	r5, r5, r4, lsr ACNT
-	eor	r5, r5, r6, lsr BCNT
+	S1ADJ	r5, ATNC
+	eor	r5, r5, r7, S1ADJ BTNC
+	eor	r5, r5, r4, S0ADJ ACNT
+	eor	r5, r5, r6, S0ADJ BCNT
 	str	r5, [DST, #-4]!
 	subs	N, #8
 	bcs	.Lmemxor3_uud_loop
diff --git a/arm/neon/chacha-core-internal.asm b/arm/neon/chacha-core-internal.asm
index 6f623106..43bacda6 100644
--- a/arm/neon/chacha-core-internal.asm
+++ b/arm/neon/chacha-core-internal.asm
@@ -90,31 +90,50 @@ PROLOGUE(_nettle_chacha_core)
 	vmov	S2, X2
 	vmov	S3, X3
 
-	C Input rows:
+	C Input rows little-endian:
 	C	 0  1  2  3	X0
 	C	 4  5  6  7	X1
 	C	 8  9 10 11	X2
 	C	12 13 14 15	X3
+	C Input rows big-endian:
+	C	 2  3  0  1	X0
+	C	 6  7  4  5	X1
+	C	10 11  8  9	X2
+	C	14 15 12 13	X3
+	C because vldm switches doublewords
+
 .Loop:
 	QROUND(X0, X1, X2, X3)
 
-	C Rotate rows, to get
+	C In little-endian rotate rows, to get
 	C	 0  1  2  3
 	C	 5  6  7  4  >>> 3
 	C	10 11  8  9  >>> 2
 	C	15 12 13 14  >>> 1
-	vext.32	X1, X1, X1, #1
+
+	C In big-endian rotate rows, to get
+	C	 2  3  0  1
+	C	 7  4  5  6  >>> 3
+	C	 8  9 10 11  >>> 2
+	C	13 14 15 12  >>> 1
+
+	C vext extracts from the top of the vector instead of bottom
+IF_LE(<	vext.32	X1, X1, X1, #1>)
+IF_BE(<	vext.32	X1, X1, X1, #3>)
 	vext.32	X2, X2, X2, #2
-	vext.32	X3, X3, X3, #3
+IF_LE(<	vext.32	X3, X3, X3, #3>)
+IF_BE(<	vext.32	X3, X3, X3, #1>)
 
 	QROUND(X0, X1, X2, X3)
 
 	subs	ROUNDS, ROUNDS, #2
 	C Inverse rotation
-	vext.32	X1, X1, X1, #3
+IF_LE(<	vext.32	X1, X1, X1, #3>)
+IF_BE(<	vext.32	X1, X1, X1, #1>)
 	vext.32	X2, X2, X2, #2
-	vext.32	X3, X3, X3, #1
+IF_LE(<	vext.32	X3, X3, X3, #1>)
+IF_BE(<	vext.32	X3, X3, X3, #3>)
 
 	bhi	.Loop
 
@@ -123,6 +142,12 @@ PROLOGUE(_nettle_chacha_core)
 	vadd.u32	X2, X2, S2
 	vadd.u32	X3, X3, S3
 
+	C caller expects result little-endian
+IF_BE(<	vrev32.u8	X0, X0
+	vrev32.u8	X1, X1
+	vrev32.u8	X2, X2
+	vrev32.u8	X3, X3>)
+
 	vstm	DST, {X0,X1,X2,X3}
 	bx	lr
 EPILOGUE(_nettle_chacha_core)
diff --git a/arm/neon/salsa20-core-internal.asm b/arm/neon/salsa20-core-internal.asm
index 34eb1fba..12a812d8 100644
--- a/arm/neon/salsa20-core-internal.asm
+++ b/arm/neon/salsa20-core-internal.asm
@@ -88,7 +88,7 @@ define(<QROUND>, <
 PROLOGUE(_nettle_salsa20_core)
 	vldm	SRC, {X0,X1,X2,X3}
 
-	C Input rows:
+	C Input rows little-endian:
 	C	 0  1  2  3	X0
 	C	 4  5  6  7	X1
 	C	 8  9 10 11	X2
@@ -99,6 +99,18 @@ PROLOGUE(_nettle_salsa20_core)
 	C	 8 13  2  7
 	C	12  1  6 11
 
+	C Input rows big-endian:
+	C	 2  3  0  1	X0
+	C	 6  7  4  5	X1
+	C	10 11  8  9	X2
+	C	14 15 12 13	X3
+	C because vldm switches doublewords
+	C Permuted to:
+	C	10 15  0  5
+	C	14  3  4  9
+	C	 2  7  8 13
+	C	 6 11 12  1
+
 	C FIXME: Construct in some other way?
 	adr	r12, .Lmasks
 	vldm	r12, {M0101, M0110, M0011}
@@ -112,6 +124,7 @@ PROLOGUE(_nettle_salsa20_core)
 	C	 4  1  6  3	T0 v
 	C	 8 13 10 15	T1 ^
 	C	12  9 14 11	X3 v
+	C same in big endian just with transposed double-rows
 	vmov	T0, X1
 	vmov	T1, X2
 	vbit	T0, X0, M0101
@@ -140,22 +153,34 @@
 .Loop:
 	QROUND(X0, X1, X2, X3)
 
-	C Rotate rows, to get
+	C In little-endian rotate rows, to get
 	C	 0  5 10 15
 	C	 3  4  9 14  >>> 1
 	C	 2  7  8 13  >>> 2
 	C	 1  6 11 12  >>> 3
-	vext.32	X1, X1, X1, #3
+
+	C In big-endian rotate rows, to get
+	C	10 15  0  5
+	C	 9 14  3  4  >>> 1
+	C	 8 13  2  7  >>> 2
+	C	11 12  1  6  >>> 3
+
+	C vext extracts from the top of the vector instead of bottom
+IF_LE(<	vext.32	X1, X1, X1, #3>)
+IF_BE(<	vext.32	X1, X1, X1, #1>)
 	vext.32	X2, X2, X2, #2
-	vext.32	X3, X3, X3, #1
+IF_LE(<	vext.32	X3, X3, X3, #1>)
+IF_BE(<	vext.32	X3, X3, X3, #3>)
 
 	QROUND(X0, X3, X2, X1)
 
 	subs	ROUNDS, ROUNDS, #2
 	C Inverse rotation
-	vext.32	X1, X1, X1, #1
+IF_LE(<	vext.32	X1, X1, X1, #1>)
+IF_BE(<	vext.32	X1, X1, X1, #3>)
 	vext.32	X2, X2, X2, #2
-	vext.32	X3, X3, X3, #3
+IF_LE(<	vext.32	X3, X3, X3, #3>)
+IF_BE(<	vext.32	X3, X3, X3, #1>)
 
 	bhi	.Loop
 
@@ -181,6 +206,12 @@ PROLOGUE(_nettle_salsa20_core)
 	vadd.u32	X2, X2, S2
 	vadd.u32	X3, X3, S3
 
+	C caller expects result little-endian
+IF_BE(<	vrev32.u8	X0, X0
+	vrev32.u8	X1, X1
+	vrev32.u8	X2, X2
+	vrev32.u8	X3, X3>)
+
 	vstm	DST, {X0,X1,X2,X3}
 	bx	lr
 EPILOGUE(_nettle_salsa20_core)
diff --git a/arm/neon/umac-nh.asm b/arm/neon/umac-nh.asm
index 158a5686..2b617202 100644
--- a/arm/neon/umac-nh.asm
+++ b/arm/neon/umac-nh.asm
@@ -97,6 +97,8 @@ PROLOGUE(_nettle_umac_nh)
 	bhi	.Loop
 
 	vadd.i64	D0REG(QY), D0REG(QY), D1REG(QY)
-	vmov	r0, r1, D0REG(QY)
+	C return values use memory endianness
+IF_LE(<	vmov	r0, r1, D0REG(QY)>)
+IF_BE(<	vmov	r1, r0, D0REG(QY)>)
 	bx	lr
 EPILOGUE(_nettle_umac_nh)
diff --git a/arm/v6/sha1-compress.asm b/arm/v6/sha1-compress.asm
index 59d6297e..8cc22be7 100644
--- a/arm/v6/sha1-compress.asm
+++ b/arm/v6/sha1-compress.asm
@@ -52,7 +52,7 @@ define(<LOAD>, <
 	sel	W, WPREV, T0
 	ror	W, W, SHIFT
 	mov	WPREV, T0
-	rev	W, W
+IF_LE(<	rev	W, W>)
 	str	W, [SP,#eval(4*$1)]
 >)
 define(<EXPN>, <
@@ -127,8 +127,12 @@ PROLOGUE(_nettle_sha1_compress)
 	lsl	SHIFT, SHIFT, #3
 	mov	T0, #0
 	movne	T0, #-1
-	lsl	W, T0, SHIFT
+IF_LE(<	lsl	W, T0, SHIFT>)
+IF_BE(<	lsr	W, T0, SHIFT>)
 	uadd8	T0, T0, W	C Sets APSR.GE bits
+	C on BE rotate right by 32-SHIFT bits
+	C because there is no rotate left
+IF_BE(<	rsb	SHIFT, SHIFT, #32>)
 
 	ldr	K, .LK1
 	ldm	STATE, {SA,SB,SC,SD,SE}
diff --git a/arm/v6/sha256-compress.asm b/arm/v6/sha256-compress.asm
index e6f4e1e9..324730c7 100644
--- a/arm/v6/sha256-compress.asm
+++ b/arm/v6/sha256-compress.asm
@@ -137,8 +137,12 @@ PROLOGUE(_nettle_sha256_compress)
 	lsl	SHIFT, SHIFT, #3
 	mov	T0, #0
 	movne	T0, #-1
-	lsl	I1, T0, SHIFT
+IF_LE(<	lsl	I1, T0, SHIFT>)
+IF_BE(<	lsr	I1, T0, SHIFT>)
 	uadd8	T0, T0, I1	C Sets APSR.GE bits
+	C on BE rotate right by 32-SHIFT bits
+	C because there is no rotate left
+IF_BE(<	rsb	SHIFT, SHIFT, #32>)
 
 	mov	DST, sp
 	mov	ILEFT, #4
@@ -146,16 +150,16 @@
 	ldm	INPUT!, {I1,I2,I3,I4}
 	sel	I0, I0, I1
 	ror	I0, I0, SHIFT
-	rev	I0, I0
+IF_LE(<	rev	I0, I0>)
 	sel	I1, I1, I2
 	ror	I1, I1, SHIFT
-	rev	I1, I1
+IF_LE(<	rev	I1, I1>)
 	sel	I2, I2, I3
 	ror	I2, I2, SHIFT
-	rev	I2, I2
+IF_LE(<	rev	I2, I2>)
 	sel	I3, I3, I4
 	ror	I3, I3, SHIFT
-	rev	I3, I3
+IF_LE(<	rev	I3, I3>)
 	subs	ILEFT, ILEFT, #1
 	stm	DST!, {I0,I1,I2,I3}
 	mov	I0, I4
diff --git a/asm.m4 b/asm.m4
index 4018c235..8c290551 100644
--- a/asm.m4
+++ b/asm.m4
@@ -51,6 +51,14 @@ define(<ALIGN>, <.align ifelse(ALIGN_LOG,yes,<m4_log2($1)>,$1)
 >)
 
+define(<IF_BE>, <ifelse(
+WORDS_BIGENDIAN,yes,<$1>,
+WORDS_BIGENDIAN,no,<$2>,
+<errprint(__file__:__line__:,<Unsupported endianness value>,WORDS_BIGENDIAN,<
+>)
+  m4exit(1)>)>)
+define(<IF_LE>, <IF_BE(<$2>, <$1>)>)
+
 dnl Struct defining macros
 
 dnl STRUCTURE(prefix)
diff --git a/config.m4.in b/config.m4.in
index e39c880c..11f90a40 100644
--- a/config.m4.in
+++ b/config.m4.in
@@ -7,6 +7,7 @@ define(<TYPE_PROGBITS>, <@ASM_TYPE_PROGBITS@>)dnl
 define(<ALIGN_LOG>, <@ASM_ALIGN_LOG@>)dnl
 define(<W64_ABI>, <@W64_ABI@>)dnl
 define(<RODATA>, <@ASM_RODATA@>)dnl
+define(<WORDS_BIGENDIAN>, <@ASM_WORDS_BIGENDIAN@>)dnl
 divert(1)
 @ASM_MARK_NOEXEC_STACK@
 divert
diff --git a/configure.ac b/configure.ac
index 41bf0998..b57d2fb8 100644
--- a/configure.ac
+++ b/configure.ac
@@ -201,7 +201,11 @@ LSH_FUNC_STRERROR
 # getenv_secure is used for fat overrides,
 # getline is used in the testsuite
 AC_CHECK_FUNCS(secure_getenv getline)
-AC_C_BIGENDIAN
+
+ASM_WORDS_BIGENDIAN=unknown
+AC_C_BIGENDIAN([AC_DEFINE([WORDS_BIGENDIAN], 1)
+                ASM_WORDS_BIGENDIAN=yes],
+               [ASM_WORDS_BIGENDIAN=no])
 
 AC_CACHE_CHECK([for __builtin_bswap64],
   nettle_cv_c_builtin_bswap64,
@@ -811,6 +815,7 @@ AC_SUBST(ASM_TYPE_PROGBITS)
 AC_SUBST(ASM_MARK_NOEXEC_STACK)
 AC_SUBST(ASM_ALIGN_LOG)
 AC_SUBST(W64_ABI)
+AC_SUBST(ASM_WORDS_BIGENDIAN)
 AC_SUBST(EMULATOR)
 
 AC_SUBST(LIBNETTLE_MAJOR)