On Sat, Jun 20, 2020 at 11:54 AM Niels Möller <nisse(a)lysator.liu.se> wrote:
> Have you measured speedup when going from 4 to 8 blocks? We shouldn't
> add larger loops than needed.
>
The 8x loop gives about a 1.15x speedup over the 4x loop. If you think
it's not worth it, I can keep only the 4x loop to make the code simpler.
> Do you measure a speedup from this? Karatsuba usually pays off only for
> a bit larger sizes (but I guess overhead is a little less here than for
> standard multiplication).
>
Actually, I chose the Karatsuba algorithm not only for performance but
also to reduce the number of registers used. In any case, I believe that
Karatsuba performs as well as or better than classical multiplication
in my case.
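
To illustrate the trade-off, here is a rough sketch in plain C, not the
actual PowerPC assembly from the patch. It multiplies two 128-bit
polynomials over GF(2) into a 256-bit product, once with four 64x64
carry-less multiplies (classical) and once with three (Karatsuba). The
names poly128, clmul64, mul_classical and mul_karatsuba are made up for
this example, and clmul64 is a slow bit-by-bit stand-in for a hardware
carry-less multiply instruction. The final reduction modulo the GCM
polynomial is left out.

```c
#include <stdint.h>

/* A polynomial over GF(2) of degree < 128, as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } poly128;

/* Carry-less 64x64 -> 128 bit multiply, done bit by bit.
   Stand-in for a hardware carry-less multiply instruction. */
poly128 clmul64(uint64_t a, uint64_t b)
{
  poly128 r = { 0, 0 };
  for (unsigned i = 0; i < 64; i++)
    if ((b >> i) & 1)
      {
        r.lo ^= a << i;
        if (i > 0)
          r.hi ^= a >> (64 - i);
      }
  return r;
}

/* Classical method: four partial products. */
void mul_classical(poly128 a, poly128 b, uint64_t out[4])
{
  poly128 ll = clmul64(a.lo, b.lo);
  poly128 lh = clmul64(a.lo, b.hi);
  poly128 hl = clmul64(a.hi, b.lo);
  poly128 hh = clmul64(a.hi, b.hi);
  out[0] = ll.lo;
  out[1] = ll.hi ^ lh.lo ^ hl.lo;
  out[2] = hh.lo ^ lh.hi ^ hl.hi;
  out[3] = hh.hi;
}

/* Karatsuba: three partial products. The middle term is recovered as
   (a.lo + a.hi)(b.lo + b.hi) + a.lo*b.lo + a.hi*b.hi, where "+" is XOR. */
void mul_karatsuba(poly128 a, poly128 b, uint64_t out[4])
{
  poly128 ll = clmul64(a.lo, b.lo);
  poly128 hh = clmul64(a.hi, b.hi);
  poly128 mid = clmul64(a.lo ^ a.hi, b.lo ^ b.hi);
  mid.lo ^= ll.lo ^ hh.lo;
  mid.hi ^= ll.hi ^ hh.hi;
  out[0] = ll.lo;
  out[1] = ll.hi ^ mid.lo;
  out[2] = hh.lo ^ mid.hi;
  out[3] = hh.hi;
}
```

Both functions produce the same 256-bit result. Karatsuba trades one
multiply for a few extra XORs, so whether it wins depends on the relative
cost of the multiply instruction and on register pressure.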
> > - Since the functionality of gcm_set_key() is replaced with
> > gcm_init_key() for PowerPC64LE, two warnings will pop up:
> > [‘gcm_gf_shift’ defined but not used] and [‘gcm_gf_add’ defined
> > but not used]
>
When I applied the patch to the latest upstream, these warnings did not
appear. Some changes have been made to gcm.c; I will look into it.
> To test PPC code, I wonder if it's easy to add a PPC build to
> .gitlab-ci, in the same way as arm and mips tests. These are based on
> Debian packaged cross compilers and qemu-user. I'm also not that familiar
> with the variants within the Power and PowerPC family of processors.
>
I will see what I can do.