Re: [S390x] Optimize AES modes

1 Apr 2021


Maamoun TK <maamoun.tk@googlemail.com> writes:
...
...
I've tried out a split, see below patch. It's a rather large change,
moving pieces to new places, but nothing difficult. I'm considering
committing this to the s390x branch, what do you think?
I agree, I'll modify the patch of basic AES-128 optimized functions to be
built on top of the split AES functions.
Ok, pushed to the s390x branch now.
...
memxor performs the same in C and assembly, since the s390 architecture
offers a memory xor instruction, "xc"; see the xor_len macro in machine.m4
of the original patch for an implementation example.
But the C implementation is somewhat complicated, splitting into
several cases depending on alignment, and shifting data around to be able
to do word operations. If it can be done simpler with the xc
instruction, that would at least cut some overhead. (Note that memxor3
must support the overlap case needed by cbc decrypt).
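
To illustrate the overlap case mentioned above: in an in-place CBC decrypt,
the destination starts one block past the previous ciphertext that it is
xored with, so the xor has to read each source byte before overwriting it.
A minimal sketch, assuming a plain byte-wise loop that walks from the end;
the name xor3_backward is hypothetical and this is not nettle's actual
memxor3:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not nettle's memxor3): dst[i] = a[i] ^ b[i],
   processed from the last byte down to the first. Because the highest
   index is handled first, b may overlap dst as long as dst starts at a
   higher address than b: every b[i] is read before the write that
   would clobber it. */
static void
xor3_backward(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
  while (n-- > 0)
    dst[n] = a[n] ^ b[n];
}
```

With dst = buf + block_size and b = buf this gives the same result as
xoring against an untouched copy of buf, which a forward loop would not.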
...
However, the s390x AES accelerators offer a considerable speedup over the C
implementation with optimized internal AES. The following table
demonstrates the idea more clearly:
Function               S390x accelerator   C implementation with optimized internal AES
                                           (only aes128.asm, aes192.asm, aes256.asm enabled)

[...]
...
CBC AES128 Decrypt  0.647008 cpb  3.131405 cpb
[...]
...
CTR AES128 Crypt    0.710237 cpb  4.767290 cpb
For these two, the speed difference should essentially be the time for
the C implementation of memxor. "cpb" means cycles per byte, right? 2-4
cycles per byte for memxor is quite slow. On my x86_64 laptop (ok,
comparing apples to oranges), memxor, for the aligned case, is 0.08 cpb,
and memxor3 twice as much. And even the C implementation is not that much
slower.
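
For reference, the "2-4 cycles per byte" estimate follows from subtracting
the two columns in the quoted rows: 3.131405 - 0.647008 is about 2.48 cpb
for CBC decrypt, and 4.767290 - 0.710237 is about 4.06 cpb for CTR. A cpb
figure itself can be derived from a timing run roughly as sketched below.
The function name and the assumption of a known, fixed clock frequency are
illustrative only, not taken from the benchmark used in this thread:

```c
#include <stddef.h>

/* Hypothetical helper: convert a measured wall-clock time into cycles
   per byte, assuming the CPU ran at a fixed, known frequency (clock_hz)
   for the whole measurement. */
static double
cycles_per_byte(double seconds, double clock_hz, size_t bytes)
{
  return seconds * clock_hz / (double) bytes;
}
```

For example, processing 1000000 bytes in 0.5 seconds at 2 GHz corresponds
to 1000 cycles per byte.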
...
GCM AES128 Encrypt  0.630504 cpb  15.473187 cpb
For GCM, are there instructions that combine AES-CTR and GCM HASH? Or
are those done separately? It would be nice to have GCM HASH being fast
by itself, for performance with other ciphers than aes.
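
To make the "GCM HASH by itself" point concrete: the hash only depends on
the hash subkey H and the data blocks, not on which block cipher produced
H. A minimal, unoptimized sketch of the GF(2^128) multiply and the block
update, following the bit-by-bit description in NIST SP 800-38D; the
function names are hypothetical and this is not nettle's implementation:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x = x * y in GF(2^128), using GCM's bit order: the most significant
   bit of byte 0 is the coefficient of the lowest-degree term. */
static void
gf128_mul(uint8_t x[16], const uint8_t y[16])
{
  uint8_t z[16] = {0};
  uint8_t v[16];
  memcpy(v, y, 16);

  for (int i = 0; i < 128; i++)
    {
      /* If bit i of x is set, accumulate the current multiple of y. */
      if ((x[i / 8] >> (7 - i % 8)) & 1)
        for (int j = 0; j < 16; j++)
          z[j] ^= v[j];

      /* Multiply v by the field's "x": shift right by one bit and
         reduce with R = 0xe1 || 0^120 if a bit fell off the end. */
      int lsb = v[15] & 1;
      for (int j = 15; j > 0; j--)
        v[j] = (uint8_t) ((v[j] >> 1) | (v[j - 1] << 7));
      v[0] >>= 1;
      if (lsb)
        v[0] ^= 0xe1;
    }
  memcpy(x, z, 16);
}

/* Absorb whole 16-byte blocks: state = (state ^ block) * h. */
static void
ghash_update(uint8_t state[16], const uint8_t h[16],
             const uint8_t *data, size_t blocks)
{
  for (; blocks > 0; blocks--, data += 16)
    {
      for (int j = 0; j < 16; j++)
        state[j] ^= data[j];
      gf128_mul(state, h);
    }
}
```

Nothing here refers to AES; a fast version of this loop would benefit any
cipher used in GCM mode.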
Regards,
/Niels
-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
