Re: [PATCH] "PowerPC64" GCM support

11 Oct 2020


      Thanks for the clarification, I just misunderstanded the division with the
partial reduction in a previous reply.
Ok, so you mean a polynomial division of  b_0(x) by P(x) where P(x) = X^128
+  X^127 +  X^126 +  X^121 + 1
b_0(x)/P(x) = (b_0(x)*(p^-1 mod P(x))) mod P(x)
b_0(x)/P(x) = (b_0(x)*(p')) mod P(x)
P(x) = X^64 +  X^63 +  X^62 +  X^57
P' = p^-1 mod P(x) = X^63 +  X^62 +  X^57
so the constant 0xC2
let me show you part of the new implementation of _nettle_gcm_init_key in
PPC
C --- calculate [H = H << 1 modulo polynomial] ---
vupkhsb        EMSB,H                        C extend most significant
bit to first byte
    vspltisb       B1,1                          C
0x01010101010101010101010101010101
    vspltb         EMSB,EMSB,0                   C first byte
quadword-extend
    vsl            H,H,B1                        C H = H << 1
    vand           EMSB,EMSB,POLY                C EMSB &=
0xC2000000000000000000000000000001
    vxor           ZERO,ZERO,ZERO                C
0x00000000000000000000000000000000
    vxor           H,H,EMSB                      C H ^= EMSB
xxmrghd        VSR(POLY_L),VSR(ZERO),VSR(POLY) C
0x0000000000000000C200000000000000
    xxmrghd        VSR(POLY_H),VSR(POLY),VSR(ZERO) C
0xC2000000000000000000000000000000
C --- calculate [H^2 = H*H] ---
xxswapd        VSR(Hm),VSR(H)
    vpmsumd        HP,H,POLY_L
    vxor           L,HP,Hm
    xxmrghd        VSR(M),VSR(H),VSR(HP)
    xxmrgld        VSR(L),VSR(H),VSR(L)
vpmsumd        R,M,H
    vpmsumd        F,L,H
vpmsumd        T,F,POLY_H
    xxswapd        VSR(F),VSR(F)
    vxor           R,R,T
    vxor           R,R,F
R still yields the wrong value of H^2, did I miss something in the
implementation or did it wrong?
On Sun, Oct 11, 2020 at 8:14 PM Niels Möller nisse@lysator.liu.se wrote:
...
Maamoun TK maamoun.tk@googlemail.com writes:
...
Hi Niels,
I tried to apply your method but can't get it work,
Hmm, do you think I've missed something in the math, or are there other
difficulties?
...
while applying it one
question came to my mind.
...
First, compute b_0(x) / x^64 (mod P(x)), which expands it from 64 bits
to
...
...
128,
c_1(x) x^64 + c_0(x) = b_0(x) / x^64 (mod P(x))
Here you are trying to get partially reduced product by computing b_0(x)
/
...
x^64 (mod P(x)) but since the degree of input is 127,  we can use the
polynomial defining the finite field with x^64 elements, in this case
P(x)
...
= X^64+X^4+X^3+X+1 and P' = P^-1 (mod X^64) = X^63+X^61+X^60+1 which is
the
...
same constant 0xB0 and the function now: c_1(x) x^64 + c_0(x) = ((b_0 mod
X^64) * p') mod X^64
For correctness, I think it is important that the computation b_0(x) /
x^64 is done modulo the gcm polynomial (originally, x^128 + x^7 + x^2 +
x + 1, but after bit reflection, P(x) = x^128 + x^127 + x^126 + x^125 +
1).
I don't see how one can do part of the computation in GF(2^64), or how
your degree-64 polynomial relates the the original degree-128
polynomial. If there's some useful embedding of GF(2^64) as a subfield
of GF(2^128), please explain?
That said, division by x^64 is fairly cheap, since P(x) = 1 (mod x^64).
I think we get
b_0(x) / x^64 (mod P(x)) = b_0(x) (1 + P(x)) / x^64 (mod P(x)
where we can simplify (P(x) + 1) / x^64 to x^64 + x^63 + x^62 + x^58, or
b_0(x) / x^64 (mod P(x)) = b_0(x) (x^64 to x^64 + x^63 + x^62 + x^58)
So no reduction needed, just split the product in high and low part to
get c_1 and c_0.
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Re: [PATCH] "PowerPC64" GCM support