Hello Mamone,
On Mon, Jan 11, 2021 at 11:39:43PM +0200, Maamoun TK wrote:
> I have tuned the ghash patch to support big-endian mode but I'm really having difficulties testing it out through emulating, I'll attach the patch here so you can test it but I'm not sure how I can fix the bugs on big-endian system if any, you can feel free to send debugging info or setup a remote ssh connection so we can get it work properly.
Out of curiosity as I can't seem to find the beginning of the discussion: Is there anyone but me with an actual use-case for big-endian arm64 here? If not, I'd hate to cause a lot of effort for you and would certainly put in the effort to get this going myself.
The patch is built on top of the master branch.
First it failed to compile gcm-hash.o with the error "No rule to make target", which turned out to be caused by a missing arm64/machine.m4. After I added an empty file there, it compiled fine on aarch64, and the testsuite succeeded on the actual hardware as well as under qemu-aarch64 user-mode emulation (both LE).
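For reference, the workaround is just this (paths relative to the top of the nettle source tree; the file only needs to exist):

```shell
# Satisfy make's missing prerequisite with an empty arm64/machine.m4.
mkdir -p arm64          # the directory already exists in a real checkout
touch arm64/machine.m4
```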
On aarch64_be it fails to compile with the following error message:
gcm-hash.s:113: Error: unknown mnemonic `zip' -- `zip v23.2d,v2.2d,v22.2d'
gcm-hash.s:119: Error: unknown mnemonic `zip' -- `zip v25.2d,v3.2d,v22.2d'
gcm-hash.s:129: Error: unknown mnemonic `zip' -- `zip v27.2d,v4.2d,v22.2d'
gcm-hash.s:137: Error: unknown mnemonic `zip' -- `zip v29.2d,v5.2d,v22.2d'
This happens with gcc 10.2.0 on my hardware board as well as cross gcc 9.3.0 of Buildroot 2020.11.1 in a container.
I did a search of the aarch64 instruction set and saw that there are zip1 and zip2 instructions but no plain zip. So as a first test I just changed zip to zip1, which made it compile. As was to be expected, the testsuite failed though.
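For reference, here's my own sketch (a Python model, not the ARM manual's pseudocode) of what zip1 and zip2 do for the .2d arrangement used in those lines — with only two 64-bit lanes per vector, zip1 pairs up the low lanes of the two sources and zip2 the high lanes, so blindly substituting zip1 can only ever be half right:

```python
# Model of AArch64 ZIP1/ZIP2 for the .2d arrangement: each vector is a
# pair of 64-bit lanes [d0, d1]. ZIP1 interleaves the low halves of the
# two source vectors, ZIP2 the high halves.

def zip1_2d(vn, vm):
    # ZIP1 Vd.2d, Vn.2d, Vm.2d  ->  Vd = [Vn.d[0], Vm.d[0]]
    return [vn[0], vm[0]]

def zip2_2d(vn, vm):
    # ZIP2 Vd.2d, Vn.2d, Vm.2d  ->  Vd = [Vn.d[1], Vm.d[1]]
    return [vn[1], vm[1]]

v2 = [1, 2]    # hypothetical lane values
v22 = [3, 4]
print(zip1_2d(v2, v22))  # [1, 3]  (low lanes of both sources)
print(zip2_2d(v2, v22))  # [2, 4]  (high lanes of both sources)
```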
Before you try to get me up to speed on what the routine is supposed to be doing, there's also the option of getting yourself a cross toolchain and emulator for your own tests without too much effort. Here's how I cross-compile nettle and run the testsuite using rootless podman (docker should do just as well) on my x86_64 box:
cd ~/Downloads
mkdir nettle
cd nettle
git clone https://git.lysator.liu.se/nettle/nettle
cd nettle
git apply ~/arm64_ghash.patch
./.bootstrap
podman run -it -v ~/Downloads/nettle:/nettle michaelweisernettleci/buildroot:2020.11.1-aarch64_be-glibc-gdb
cd /nettle/
mkdir build-aarch64_be
cd build-aarch64_be/
../nettle/configure --host=$(cat /buildroot/triple) --enable-armv8-a-crypto
make -j4
make -j4 check EMULATOR=/buildroot/qemu
Unfortunately, because in this case qemu-aarch64_be is running the testsuite binaries under emulation and doesn't support the ptrace syscall (and containers usually don't either), you can't just run them under an aarch64_be-native gdb to see what they're executing.
One option would be to boot a full BE system image with kernel in qemu-system-aarch64 including a native gdb. But that's a bit of a hassle (building a rootfs and kernel e.g. using buildroot, getting it to boot in qemu, accessing it via console or network, ...)
qemu-user can, however, serve as a gdb server similar to qemu-system[1].

[1] https://qemu.readthedocs.io/en/latest/system/gdb.html
As luck would have it, the above container image contains an x86_64-native gdb targeting aarch64_be. So you can start a testsuite binary under qemu with the -g option and a port to listen on for the gdb remote-debugging connection, then fire up gdb and connect there. After that you can debug as usual, single-step and look at register values:
root@6c85515d3939:/nettle/build-aarch64_be/testsuite# /buildroot/qemu -E LD_LIBRARY_PATH=../.lib -g 9000 ./gcm-test &
[1] 4205
root@6c85515d3939:/nettle/build-aarch64_be/testsuite# aarch64_be-buildroot-linux-gnu-gdb ./gcm-test
GNU gdb (GDB) 8.3.1
[...]
Reading symbols from ./gcm-test...
(gdb) break main
Breakpoint 1 at 0x4037b0: file ../../nettle/testsuite/testutils.c, line 123.
(gdb) target remote localhost:9000
Remote debugging using localhost:9000
warning: remote target does not support file transfer, attempting to access files from local filesystem.
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
0x0000004000802040 in ?? ()
(gdb) c
Continuing.
warning: Could not load shared library symbols for 3 libraries, e.g. /usr/lib64/libgmp.so.10.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
Breakpoint 1, main (argc=1, argv=0x4000800d58) at ../../nettle/testsuite/testutils.c:123
123       if (argc > 1)
(gdb) b _nettle_gcm_init_key
Breakpoint 2 at 0x40008b69f4: file gcm-hash.s, line 93.
(gdb) c
Continuing.
Breakpoint 2, _nettle_gcm_init_key () at gcm-hash.s:93
93        ldr q2,[x0,#16*128]
(gdb) s
94        dup v0.16b,v2.b[0]
(gdb)
96        mov x1,#0xC200000000000000
(gdb)
97        mov x2,#1
(gdb)
98        mov v6.d[0],x1
(gdb)
99        mov v6.d[1],x2
(gdb)
100       sshr v0.16b,v0.16b,#7
(gdb)
101       and v0.16b,v0.16b,v6.16b
(gdb)
102       ushr v1.2d,v2.2d,#63
(gdb)
103       and v1.16b,v1.16b,v6.16b
(gdb)
104       ext v1.16b,v1.16b,v1.16b,#8
(gdb)
105       shl v2.2d,v2.2d,#1
(gdb)
106       orr v2.16b,v2.16b,v1.16b
(gdb)
107       eor v2.16b,v2.16b,v0.16b
(gdb)
109       dup v6.2d,v6.d[0]
(gdb)
113       PMUL_PARAM v2,v23,v24
          ^--- doesn't seem to expand the macro here
(gdb)
115       PMUL v2,v23,v24
(gdb)
117       REDUCTION v3
(gdb) i r
x0             0x423390            4338576
x1             0xc200000000000000  -4467570830351532032
[...]
x30            0x406c44            4222020
sp             0x4000800ad0        0x4000800ad0
pc             0x40008b6a5c        0x40008b6a5c <_nettle_gcm_init_key+104>
cpsr           0x80000000          -2147483648
fpsr           0x0                 0
fpcr           0x0                 0
The trick to seeing and single-stepping the individual instructions of the macro seems to be disp/i $pc combined with stepi:
(gdb) disp/i $pc
1: x/i $pc
=> 0x40008b6a30 <_nettle_gcm_init_key+60>:      pmull2 v20.1q, v2.2d, v6.2d
(gdb) stepi
0x00000040008b6a34      113       PMUL_PARAM v2,v23,v24
1: x/i $pc
=> 0x40008b6a34 <_nettle_gcm_init_key+64>:      ext v22.16b, v2.16b, v2.16b, #8
(gdb)
0x00000040008b6a38      113       PMUL_PARAM v2,v23,v24
1: x/i $pc
=> 0x40008b6a38 <_nettle_gcm_init_key+68>:      eor v22.16b, v22.16b, v20.16b
(gdb)
0x00000040008b6a3c      113       PMUL_PARAM v2,v23,v24
1: x/i $pc
=> 0x40008b6a3c <_nettle_gcm_init_key+72>:      zip1 v23.2d, v2.2d, v22.2d
From here I would now continue to compare register contents after each instruction on LE and BE to see where it's going wrong.
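As a hint at what kind of difference to look for (my own sketch, using Python's struct module with arbitrary example bytes): the same bytes in memory come out as differently ordered 64-bit doubleword values depending on load endianness, which is exactly the sort of mismatch that tends to bite byte-order-sensitive code like GHASH on BE:

```python
import struct

# The same 8 bytes interpreted as one 64-bit doubleword by a
# little-endian vs. a big-endian load.
block = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])

le, = struct.unpack('<Q', block)  # little-endian interpretation
be, = struct.unpack('>Q', block)  # big-endian interpretation

print(hex(le))  # 0x807060504030201
print(hex(be))  # 0x102030405060708
```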
How would you like to proceed? Shall I dig into it or do you want to? :)
BTW: In case you want to build the image yourself, the diff to the Dockerfile.aarch64[3] is this:
diff --git a/Dockerfile.aarch64 b/Dockerfile.aarch64
index 36af2c5..5b51c17 100644
--- a/Dockerfile.aarch64
+++ b/Dockerfile.aarch64
@@ -41,6 +41,7 @@ RUN br_libc="${BR_LIBC}" ; \
   echo "BR2_TOOLCHAIN_BUILDROOT_${libcopt}=y" ; \
   echo 'BR2_KERNEL_HEADERS_4_19=y' ; \
   echo 'BR2_PACKAGE_GMP=y' ; \
+  echo 'BR2_PACKAGE_HOST_GDB=y' ; \
   echo 'BR2_PER_PACKAGE_DIRECTORIES=y' ; \
   ) > .config && \
   make olddefconfig && \
@@ -75,7 +76,7 @@ MAINTAINER Michael Weiser michael.weiser@gmx.de
 RUN apt-get update -qq -y && \
   apt-get dist-upgrade -y && \
   apt-get autoremove -y && \
-  apt-get install -y autoconf dash g++ make qemu-user && \
+  apt-get install -y autoconf dash g++ libncurses6 libexpat1 make qemu-user && \
   apt-get clean all && \
   rm -rf /var/lib/apt/lists/*
[3] https://github.com/michaelweiser-nettle-ci/docker-buildroot/blob/master/Dock...
The command to build the image is:
podman build -f Dockerfile.aarch64 --build-arg BR_LIBC=glibc -t buildroot:2020.11.1-aarch64_be-glibc-gdb .