Re: [AArch64] Optimize GHASH

25 Jan 2021


Hello Mamone,

On Sun, Jan 24, 2021 at 06:44:33PM +0200, Maamoun TK wrote:
...
...
> > representation. As little-endian is the default for arm and aarch64,
> > do you think the routine could be changed to move the special
> > endianness treatment using rev64 to BE mode, i.e. avoid them in the
> > standard LE case? It's certainly beyond me but it might give some
> > additional speedup.
> > Or would it be irrelevant compared to the speedup already given by
> > using pmull in the first place?
>
> I don't know how it's going to affect the performance, but it's an
> irrelevant margin indeed. TBH, I liked the patch with the special
> endianness treatment, but it's up to you to decide!

As you might expect, I like the one where doubleword vectors are used
throughout and stored in host endianness in TABLE because to me it's
most intuitive. For DATA my rationale is that if we want to *treat* it
as big-endian doublewords we should load it as doublewords to make it
clearer why and what we need to adjust afterwards. It also avoids the
rev64s with BE. I've added some comments with rationale. I've also
added a README with an excerpt of the last email. Attached are the
current patches, the first being your original. What do you think?
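
To illustrate the load-then-adjust idea for DATA, here is a minimal C
sketch. It is not code from the attached patches: the helper names
bswap64, host_is_little_endian and load_block_be are made up for this
example, and plain C stands in for the NEON loads. bswap64 plays the
role that rev64 has for each doubleword lane of a byte vector.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reverse the bytes of one 64-bit doubleword,
   i.e. what rev64 does to each doubleword lane of a byte vector. */
static uint64_t
bswap64 (uint64_t x)
{
  x = ((x & 0x00ff00ff00ff00ffULL) << 8)
    | ((x >> 8) & 0x00ff00ff00ff00ffULL);
  x = ((x & 0x0000ffff0000ffffULL) << 16)
    | ((x >> 16) & 0x0000ffff0000ffffULL);
  return (x << 32) | (x >> 32);
}

/* Hypothetical helper: detect host byte order at run time. */
static int
host_is_little_endian (void)
{
  const uint16_t probe = 1;
  uint8_t first;
  memcpy (&first, &probe, 1);
  return first == 1;
}

/* Load a 16-byte block as two doublewords in host byte order, then
   adjust so that each doubleword holds the big-endian value of its
   eight bytes. On a BE host the load alone is enough; only an LE host
   needs the byte reversal. */
static void
load_block_be (uint64_t out[2], const uint8_t data[16])
{
  memcpy (out, data, 16);
  if (host_is_little_endian ())
    {
      out[0] = bswap64 (out[0]);
      out[1] = bswap64 (out[1]);
    }
}
```

Calling load_block_be on the bytes 00 01 ... 0f gives the doublewords
0x0001020304050607 and 0x08090a0b0c0d0e0f on both LE and BE hosts,
which is the sense in which the adjustment is only needed in one mode.
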
As said, I'm up for looking into endianness-specific versions of the
macros again. But what were supposed to be the LE versions of PMUL and
friends have now become the BE-native versions, and we'd need to come
up with variants of them that make the rev64s unnecessary. Any ideas?
-- 
Thanks!
Michael
