Re: PPC64LE optimizing AES and GHASH

20 Jun 2020

Maamoun TK <maamoun.tk@googlemail.com> writes:
...
> I added a PowerPC64LE optimized version of AES and GHASH to nettle.

Cool. I haven't yet looked at the patches, but some general comments:
...

> The main equation: The main equation for 4 blocks (128 bits each) can be
> seen in reference [1]:
>
>   Digest = (((((((Digest⊕C0)*H)⊕C1)*H)⊕C2)*H)⊕C3)*H
>          = ((Digest⊕C0)*H^4)⊕(C1*H^3)⊕(C2*H^2)⊕(C3*H)
>
> To achieve more parallelism, this equation can be modified to process 8
> blocks per loop iteration. It looks as follows:
>
>   Digest = ((Digest⊕C0)*H^8)⊕(C1*H^7)⊕(C2*H^6)⊕(C3*H^5)
>            ⊕(C4*H^4)⊕(C5*H^3)⊕(C6*H^2)⊕(C7*H)

Have you measured speedup when going from 4 to 8 blocks? We shouldn't
add larger loops than needed.
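
The equivalence of the serial and aggregated forms can be checked with a small Python sketch. This is not nettle code: it assumes a plain (non-reflected) bit order for GF(2^128), which is enough here because the identity holds in any field, and the names gf_mul, ghash_serial and ghash_aggregated are made up for illustration. The number of blocks is assumed to be a multiple of the group size n.

```python
import random

# x^128 + x^7 + x^2 + x + 1, in plain (non-reflected) bit order.
# Assumption for illustration only; real GHASH uses a reflected bit order.
R = (1 << 128) | 0x87

def gf_mul(a, b):
    """Multiply two elements of GF(2^128), reducing modulo R."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a >> 128:      # degree reached 128: subtract (XOR) the modulus
            a ^= R
    return p

def ghash_serial(digest, blocks, h):
    """One multiplication by H per block (Horner form)."""
    for c in blocks:
        digest = gf_mul(digest ^ c, h)
    return digest

def ghash_aggregated(digest, blocks, h, n):
    """Process n blocks per iteration using precomputed H^1 .. H^n."""
    powers = [h]                      # powers[i] == H^(i+1)
    for _ in range(n - 1):
        powers.append(gf_mul(powers[-1], h))
    for i in range(0, len(blocks), n):
        group = blocks[i:i + n]
        # The n products below are independent of each other.
        acc = gf_mul(digest ^ group[0], powers[n - 1])
        for j in range(1, n):
            acc ^= gf_mul(group[j], powers[n - 1 - j])
        digest = acc
    return digest

rng = random.Random(1)
h = rng.getrandbits(128)
blocks = [rng.getrandbits(128) for _ in range(16)]
assert ghash_aggregated(0, blocks, h, 4) == ghash_serial(0, blocks, h)
assert ghash_aggregated(0, blocks, h, 8) == ghash_serial(0, blocks, h)
```

The sketch only shows that the 4-block and 8-block forms give the same digest; whether 8 blocks per iteration is actually faster than 4 has to be measured on the target hardware.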
...

> Handling Bit-reflection of the multiplication product [1]: This
> technique moves part of the workload from inside the loop to the init
> function, so it is executed only once.

The "carry less" multiplication is symmetric under bit reversal. So
great to get it out of the main loops.
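
A minimal Python sketch of that symmetry, for illustration only (clmul and bitrev are made-up helper names, not nettle functions): multiplying the bit-reversed inputs gives the bit-reversed product, where the product of two n-bit inputs is reversed over 2n-1 bits.

```python
import random

def clmul(a, b):
    """Carry-less (polynomial over GF(2)) multiplication of two integers."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    return p

def bitrev(x, width):
    """Reverse the low `width` bits of x."""
    r = 0
    for _ in range(width):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r

n = 64
rng = random.Random(2)
for _ in range(100):
    a = rng.getrandbits(n)
    b = rng.getrandbits(n)
    p = clmul(a, b)                       # at most 2n-1 bits
    # Reversed inputs give the reversed product (over 2n-1 bits).
    assert clmul(bitrev(a, n), bitrev(b, n)) == bitrev(p, 2 * n - 1)
    # Seen in a full 2n-bit register, the reversed product is
    # the same value shifted left by one bit.
    assert bitrev(p, 2 * n) == bitrev(p, 2 * n - 1) << 1
```

The second assertion shows the fixed one-bit offset that appears when the reflected product is viewed in a 2n-bit register; it does not depend on the data, so it need not be corrected separately for every block.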
...

> Karatsuba Algorithm: This algorithm performs three multiplication
> instructions instead of four, in exchange for two additional XORs.
> This technique is well explained with figures in reference [1].

Do you measure a speedup from this? Karatsuba usually pays off only for
a bit larger sizes (but I guess overhead is a little less here than for
standard multiplication).
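
For reference, a minimal Python sketch of the Karatsuba split for a 128x128-bit carry-less product. It is illustrative only: clmul64 is a software stand-in for a 64x64-bit carry-less multiply instruction, all function names are made up, and the sketch says nothing about speed on real hardware.

```python
import random

MASK64 = (1 << 64) - 1

def clmul64(a, b):
    """64x64-bit carry-less multiply (software stand-in for one instruction)."""
    assert a <= MASK64 and b <= MASK64
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    return p

def clmul128_schoolbook(a, b):
    """Four 64x64 multiplications."""
    a0, a1 = a & MASK64, a >> 64
    b0, b1 = b & MASK64, b >> 64
    lo = clmul64(a0, b0)
    hi = clmul64(a1, b1)
    mid = clmul64(a0, b1) ^ clmul64(a1, b0)
    return (hi << 128) ^ (mid << 64) ^ lo

def clmul128_karatsuba(a, b):
    """Three 64x64 multiplications plus extra XORs."""
    a0, a1 = a & MASK64, a >> 64
    b0, b1 = b & MASK64, b >> 64
    lo = clmul64(a0, b0)
    hi = clmul64(a1, b1)
    # (a0^a1)*(b0^b1) = a0*b0 ^ a0*b1 ^ a1*b0 ^ a1*b1, so removing
    # lo and hi leaves exactly the two cross terms.
    mid = clmul64(a0 ^ a1, b0 ^ b1) ^ lo ^ hi
    return (hi << 128) ^ (mid << 64) ^ lo

rng = random.Random(3)
for _ in range(100):
    a = rng.getrandbits(128)
    b = rng.getrandbits(128)
    assert clmul128_karatsuba(a, b) == clmul128_schoolbook(a, b)
```

When one operand is a fixed key power, the XOR of its two halves can be computed once in advance, which reduces the per-block overhead; whether that is enough to beat four multiplications is the question to settle by measurement.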
...

> A 128-byte test is added to gcm-test in the testsuite to exercise the
> 8x loop in the optimized GHASH function.

Good!
...

> Since the functionality of gcm_set_key() is replaced with
> gcm_init_key() for PowerPC64LE, two warnings will pop up:
> [‘gcm_gf_shift’ defined but not used] and
> [‘gcm_gf_add’ defined but not used]

You can perhaps solve this by adding

#if HAVE_NATIVE_...
#endif

around the related functions.

To test PPC code, I wonder if it's easy to add a PPC build to
.gitlab-ci, in the same way as arm and mips tests. These are based on
Debian packaged cross compilers and qemu-user. I'm also not that familiar
with the variants within the Power and PowerPC family of processors.

Regards,
/Niels
-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
