Stephen R. van den Berg wrote:
In pikefarm, the devel2 host has both gcc-6 and gcc-7 (Debian 7.2.0-19); by default it picks gcc-7, and the build then fails. Compiling manually with gcc-6 works. I think I already tried gcc-7 with -O1, and that works as well.
Still trying to investigate why exactly.
The culprit appears to be pike_memory.c.
More specifically, in the gcc-7 case, reorder() uses the xmm0 register to copy elements in its 16-byte special case.
I'm not quite sure why this goes wrong. The most likely explanations are:
a. Somehow the rest of the system does not expect xmm0 to be clobbered.
b. The gcc-7 compiler gets some of the address calculations wrong.
Looking at the assembly code, I decode the following:
gcc-6
--- pike_memory6.s 2018-02-01 13:39:43.825976318 +0100

; case 16:
;   B16_T *from = (B16_T *) memory;
;   B16_T *to = (B16_T *) tmp;
-    leal    -1(%r13), %eax
-    leaq    4(,%rax,4), %rcx       ; nitems = 8 + nitems * 4 ??
-    xorl    %eax, %eax             ; e = 0

; for(e=0;e<nitems;e++)
; Register assignment %rbx : order
.L173:
-    movslq  (%rbx,%rax), %rdx      ; %rdx = order[e]
     salq    $4, %rdx               ; %rdx *= 16

; to[e]=from[order[e]];
; Register assignment %r14 : from
; Register assignment %r12 : to
-    movq    (%r14,%rdx), %rsi      ; %rsi = from[%rdx]
-    movq    8(%r14,%rdx), %rdi     ; %rdi = from[%rdx + 8]
-    movq    %rsi, (%r12,%rax,4)    ; to[e*4] = %rsi
-    movq    %rdi, 8(%r12,%rax,4)   ; to[e*4+8] = %rdi
-    addq    $4, %rax               ; e += 4
     cmpq    %rax, %rcx             ; while e != nitems
     jne     .L173

gcc-7
+++ pike_memory7.s 2018-02-01 13:40:11.505289854 +0100

; case 16:
;   B16_T *from = (B16_T *) memory;
;   B16_T *to = (B16_T *) tmp;
+    leal    -1(%r13), %edx
+    salq    $4, %rdx               ; * 16
+    leaq    16(%rax,%rdx), %rcx    ; nitems = 16 + nitems * 4 ??

; for(e=0;e<nitems;e++)
; Register assignment %rbx : order
.L173:
+    movslq  (%rbx), %rdx           ; %rdx = *order
+    addq    $16, %rax              ; to += 16
+    addq    $4, %rbx               ; order += 4
     salq    $4, %rdx               ; %rdx *= 16

; to[e]=from[order[e]];
; Register assignment %r14 : from
; Register assignment %r12 : to
+    movdqa  (%r14,%rdx), %xmm0     ; %xmm0 = from[%rdx]
+    movaps  %xmm0, -16(%rax)       ; to[-16] = %xmm0
     cmpq    %rax, %rcx             ; while to != to_end
     jne     .L173
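
For reference, the C loop that both listings implement can be sketched as below. This is only an illustration built from the C lines quoted in the comments above: b16_sketch_t is a hypothetical stand-in for B16_T (the real definition in pike_memory.c may differ), the function names are made up, and order is assumed to be an array of 32-bit signed ints because the listings load it with movslq. The second function mirrors the shape of the gcc-7 output, which walks the to and order pointers instead of keeping an index.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for B16_T: any 16-byte type works for this sketch.
 * The real definition in pike_memory.c may differ. */
typedef struct { uint64_t lo, hi; } b16_sketch_t;

/* Index form, matching the C lines quoted in the listings:
 *   to[e] = from[order[e]]   for e in [0, nitems) */
void reorder16_indexed(b16_sketch_t *to, const b16_sketch_t *from,
                       const int32_t *order, int32_t nitems)
{
  int32_t e;
  for (e = 0; e < nitems; e++)
    to[e] = from[order[e]];
}

/* Pointer-walking form, shaped like the gcc-7 listing: advance `to` by one
 * 16-byte element and `order` by one int per iteration, and stop when `to`
 * reaches a precomputed end pointer. */
void reorder16_walking(b16_sketch_t *to, const b16_sketch_t *from,
                       const int32_t *order, int32_t nitems)
{
  b16_sketch_t *to_end = to + nitems;
  while (to != to_end)
    *to++ = from[*order++];
}
```

Both forms should produce the same result for any valid order array. The sketch says nothing about alignment, though: movdqa and movaps with a memory operand require a 16-byte-aligned address, while the movq pairs in the gcc-6 output do not, so it may be worth checking whether memory and tmp are always aligned that way.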
I'm not entirely sure about the instructions annotated with "??". My x86 assembly-fu is not perfect.
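
One way to sanity-check the two sequences marked "??" is to model them as plain integer arithmetic. The following is a minimal sketch in C under a few assumptions that the excerpt does not confirm: %r13 holds nitems, nitems is at least 1 when this code is reached, and in the gcc-7 case %rax already holds the to pointer before the loop. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* gcc-6:  leal -1(%r13), %eax
 *         leaq 4(,%rax,4), %rcx
 * Writing a 32-bit register zero-extends the result into the 64-bit one. */
uint64_t gcc6_loop_bound(uint32_t nitems)
{
  uint64_t rax = (uint32_t)(nitems - 1);  /* leal: nitems - 1, 32-bit */
  uint64_t rcx = 4 + rax * 4;             /* leaq: 4 + 4 * (nitems - 1) */
  return rcx;                             /* equals 4 * nitems */
}

/* gcc-7:  leal -1(%r13), %edx
 *         salq $4, %rdx
 *         leaq 16(%rax,%rdx), %rcx
 * to_addr models the assumed initial value of %rax (the `to` pointer). */
uint64_t gcc7_loop_bound(uint64_t to_addr, uint32_t nitems)
{
  uint64_t rdx = (uint32_t)(nitems - 1);  /* leal: nitems - 1, 32-bit */
  rdx <<= 4;                              /* salq: 16 * (nitems - 1) */
  return 16 + to_addr + rdx;              /* leaq: to + 16 * nitems */
}
```

Under those assumptions, the gcc-6 bound works out to nitems * 4, which is the byte offset just past the end of order (the loop counter in %rax steps by 4 bytes), and the gcc-7 bound works out to to + nitems * 16, which is the address just past the last destination element. Read that way, both look like ordinary loop-end computations rather than miscalculated addresses.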
Anybody any ideas?