Re: [PowerPC] GCM optimization

14 Nov 2020

Maamoun TK <maamoun.tk@googlemail.com> writes:
...
Lmod:

C --- process the modulo bytes, padding the low-order bytes with zeros

cmpldi         LENGTH,0
beq            Ldone

C load table elements
li             r8,1*TableElemAlign
lxvd2x         VSR(H1M),0,TABLE
lxvd2x         VSR(H1L),r8,TABLE

C push every modulo byte to the stack and load them with padding into vector register

vxor           ZERO,ZERO,ZERO
addi           r8,SP,-16
stvx           ZERO,0,r8

Lstb_loop:

subic.         LENGTH,LENGTH,1
lbzx           r7,LENGTH,DATA
stbx           r7,LENGTH,r8
bne            Lstb_loop
lxvd2x         VSR(C0),0,r8

It's always a bit annoying to have to deal with leftovers like this
in the assembly code. Can we avoid having to store it to memory and read
back? I can see three other approaches:
1. Loop, reading a byte at a time, and shift into a target register. I
   guess we would need to assemble the bytes in a regular register, and
   then transfer the final value to a vector register. Is that
   expensive?
2. Round the address down to make it aligned, read an aligned word and,
   only if needed, the next word, then shift and mask to get the needed
   bytes. I think it is fine to read a few bytes outside of the input
   area, as long as the reads do *not* cross any word boundary (and
   hence a potential page boundary). We do things like this in some
   other places, but there it is for reading unaligned data in general,
   not just leftover parts.
3. Adapt the internal C/asm interface, so that the assembly routine only
   needs to handle complete blocks. It could provide a gcm_gf_mul, and
   let the C code handle partial blocks using memxor + gcm_gf_mul.
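
To make approach (1) concrete, here is a minimal sketch in C rather than
PowerPC assembly. It is not taken from the patch: the function name
load_partial_block and the two-word representation are assumptions made for
illustration. It assembles the leftover bytes in ordinary integer registers,
first byte in the most significant position, leaving the low-order bytes
zero, which is the same padding the stack-based loop produces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): gather the final
   `length` bytes (fewer than 16) of the input into two 64-bit words.
   Byte 0 lands in the most significant byte of *hi; bytes that are
   not present stay zero, i.e. the block is zero-padded at the end. */
static void
load_partial_block(uint64_t *hi, uint64_t *lo,
                   const uint8_t *data, size_t length)
{
  uint64_t w[2] = { 0, 0 };
  size_t i;

  assert(length < 16);

  /* One byte per iteration, shifted into its big-endian position. */
  for (i = 0; i < length; i++)
    w[i / 8] |= (uint64_t) data[i] << (56 - 8 * (i % 8));

  *hi = w[0];
  *lo = w[1];
}
```

In assembly the two words would still have to be moved into a vector
register afterwards; whether that move is cheaper than the store and reload
is exactly the open question in (1).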
I would guess (1) or maybe (3) is the most reasonable. I don't think
performance is that important, since it looks like for each message,
this case can happen only for the last call to gcm_update and the last
call to gcm_encrypt/gcm_decrypt.
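
For approach (3), the C side could look roughly like the sketch below. All
names here are hypothetical, including the callback type and its signature;
it does not show the actual internal interface. The point is only that
XORing the leftover bytes into the start of the hash state is equivalent to
XORing in a zero-padded block, so the assembly side would need to provide
nothing beyond a single multiplication.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/* Hypothetical callback type: multiplies the state x in place by the
   hash key described by ctx. An assumption for this sketch only. */
typedef void gf_mul_func(void *ctx, uint8_t *x);

/* Fold a final partial block into the hash state x. XORing only the
   first `length` bytes is the same as XORing a zero-padded block. */
static void
hash_partial_block(uint8_t *x, const uint8_t *data, size_t length,
                   gf_mul_func *gf_mul, void *ctx)
{
  size_t i;

  assert(length > 0 && length < BLOCK_SIZE);

  for (i = 0; i < length; i++)
    x[i] ^= data[i];

  gf_mul(ctx, x);
}

/* Stand-in for the real multiplication, for demonstration only: it
   does no field arithmetic and just counts how often it is called. */
static void
count_only_mul(void *ctx, uint8_t *x)
{
  (void) x;
  ++*(unsigned *) ctx;
}
```

With this split, the assembly routine would be called only for complete
blocks, and the partial-block path would run at most once per message in
each of the update and encrypt/decrypt phases.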
What about test coverage? It looks like we have test cases for sizes up
to 8 blocks, and for partial blocks, so I guess that should be fine?
Regards,
/Niels
-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
