Re: [S390x] Optimize SHA1 with fat build support

20 Sep 2021


Maamoun TK <maamoun.tk@googlemail.com> writes:

> ...
> I got almost 12% speedup of optimizing the sha3_permute() function using
> the SHA hardware accelerator of s390x, is it worth adding that assembly
> implementation?

For such a small assembly function, I think it's worth the effort (more
questionable if it was worth adding the special instructions for it...).

If you have the time, you could also try out doing it with vector
registers, like on x86_64 and arm/neon. Some difficulties in the x86_64
implementation were (i) xmm register shortage, (ii) moving 64-bit pieces
between the 128-bit xmm registers, and (iii) rotating the 64-bit pieces
of an xmm register by different shift counts.

Regards,
/Niels
-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
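
As a rough illustration of difficulty (iii) above, the following portable C
sketch models a 128-bit register as two 64-bit lanes. It assumes, as is the
case for the basic SSE2 64-bit shifts, that a vector shift applies one count
to both lanes, so rotating the lanes by different counts takes two uniform
rotations plus a lane select. The names vec128, rotl_both and rotl_lanes are
hypothetical and are not taken from any existing implementation.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register: two 64-bit lanes. */
typedef struct { uint64_t lane[2]; } vec128;

/* Rotate a 64-bit word left by n bits; n is reduced modulo 64. */
static uint64_t rotl64(uint64_t x, unsigned n)
{
  n &= 63;
  return (x << n) | (x >> ((64 - n) & 63));
}

/* Models a uniform vector rotate: the same count is applied to both lanes. */
static vec128 rotl_both(vec128 v, unsigned n)
{
  vec128 r = {{ rotl64(v.lane[0], n), rotl64(v.lane[1], n) }};
  return r;
}

/* Rotate lane 0 by r0 and lane 1 by r1 using only uniform rotates:
   rotate the whole register twice, then keep one lane from each result. */
static vec128 rotl_lanes(vec128 v, unsigned r0, unsigned r1)
{
  vec128 a = rotl_both(v, r0);   /* only lane 0 of a is wanted */
  vec128 b = rotl_both(v, r1);   /* only lane 1 of b is wanted */
  vec128 r = {{ a.lane[0], b.lane[1] }};
  return r;
}
```

In this model each pair of differently rotated lanes costs twice the rotate
work plus a select, which gives a feel for why per-lane rotation counts are
awkward when only uniform shifts are available.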
