nisse@lysator.liu.se (Niels Möller) writes:
I've written a first version of a gcm_hash for x86_64, using the pclmulqdq (carryless mul) instructions. With only a single block at a time, no interleaving, this gives 4.3 GByte/s.
I've added proper config and fat setup and merged this. It could surely be improved further, but it's already much faster than the C version on processors that support these instructions.
I'm considering reorganizing the internal gcm functions. I think I'd like to have
void _nettle_ghash_set_key (struct gcm_key *gcm, const union nettle_block16 *key);
which sets the key (typically, the key is the all-zero block encrypted using AES).
void _nettle_ghash_update (const struct gcm_key *key, union nettle_block16 *x, size_t length, const uint8_t *data);
where the input is complete blocks (padding done in the calling C code). Not sure if length should be block count or byte count.
void _nettle_ghash_digest (union nettle_block16 *digest, const union nettle_block16 *x);
which xors the final state into the digest block. The main point of this function is that the implementation can choose its internal byte order, eliminating byteswaps at the start and end of the update function.
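To make the intended semantics of the proposed split concrete, here is a minimal, self-contained sketch of GHASH over complete blocks, using the bit-at-a-time GF(2^128) multiplication from NIST SP 800-38D. The block16 type and the function names gf128_mul, ghash_update, ghash_digest are illustrative stand-ins, not Nettle's actual internals; the real table-driven and pclmulqdq code is of course much faster.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative stand-in for union nettle_block16: bytes 0..7 in hi,
   bytes 8..15 in lo, both interpreted big-endian. */
typedef struct { uint64_t hi, lo; } block16;

/* GF(2^128) multiplication with GHASH's bit ordering and reduction
   polynomial x^128 + x^7 + x^2 + x + 1 (R = 0xe1 || 0^120). */
static block16
gf128_mul (block16 x, block16 y)
{
  block16 z = { 0, 0 };
  block16 v = x;
  int i;
  for (i = 0; i < 128; i++)
    {
      /* Bit i of y, most significant bit of byte 0 first. */
      uint64_t bit = (i < 64) ? (y.hi >> (63 - i)) & 1
                              : (y.lo >> (127 - i)) & 1;
      if (bit)
        { z.hi ^= v.hi; z.lo ^= v.lo; }
      /* Multiply v by x (a right shift in this bit order), reducing
         mod the polynomial when a bit falls off the end. */
      uint64_t carry = v.lo & 1;
      v.lo = (v.lo >> 1) | (v.hi << 63);
      v.hi >>= 1;
      if (carry)
        v.hi ^= 0xe100000000000000ULL;
    }
  return z;
}

/* Complete blocks only; any padding is the caller's job, as proposed. */
static void
ghash_update (block16 h, block16 *x, size_t blocks, const block16 *data)
{
  size_t i;
  for (i = 0; i < blocks; i++)
    {
      x->hi ^= data[i].hi;
      x->lo ^= data[i].lo;
      *x = gf128_mul (*x, h);
    }
}

/* Xor the final state into the digest block. */
static void
ghash_digest (block16 *digest, const block16 *x)
{
  digest->hi ^= x->hi;
  digest->lo ^= x->lo;
}
```

Since digest is a plain xor at the end, an implementation is free to keep x (and the key tables) in whatever internal byte order suits it, converting only once in digest; that's the motivation for splitting _nettle_ghash_digest out of the update function.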
Would perhaps be good to also delete the code for GCM_TABLE_BITS != 8, which isn't enabled and hasn't been tested in years.
Regards, /Niels