lists.lysator.liu.se
nettle-bugs
June 2021
nettle-bugs@lists.lysator.liu.se
3 participants
11 discussions
[Aarch64] Optimize SHA1 Compress
by Maamoun TK
01 Jun '21
This patch optimizes the SHA1 compress function for the arm64 architecture
by taking advantage of the SHA-1 instructions of the Armv8 crypto extension.

The SHA-1 instructions:
SHA1C: SHA1 hash update (choose)
SHA1H: SHA1 fixed rotate
SHA1M: SHA1 hash update (majority)
SHA1P: SHA1 hash update (parity)
SHA1SU0: SHA1 schedule update 0
SHA1SU1: SHA1 schedule update 1

The patch is based on sha1-arm.c (ARMv8 SHA extensions using C intrinsics)
from the repository https://github.com/noloader/SHA-Intrinsics by Jeffrey
Walton.

The patch passes the testsuite of the nettle library and the benchmark
numbers are considerably improved, but the performance of the overall sha1
hash function doesn't surpass the corresponding openssl numbers.

Benchmark on gcc117 instance of CFarm before applying the patch:
         Algorithm         mode  Mbyte/s
              sha1       update   214.16
      openssl sha1       update   849.44
         hmac-sha1     64 bytes    61.69
         hmac-sha1    256 bytes   131.50
         hmac-sha1   1024 bytes   185.20
         hmac-sha1   4096 bytes   204.55
         hmac-sha1   single msg   210.97

Benchmark on gcc117 instance of CFarm after applying the patch:
         Algorithm         mode  Mbyte/s
              sha1       update   795.57
      openssl sha1       update   849.25
         hmac-sha1     64 bytes   167.65
         hmac-sha1    256 bytes   408.24
         hmac-sha1   1024 bytes   636.68
         hmac-sha1   4096 bytes   739.42
         hmac-sha1   single msg   775.89

---
 arm64/crypto/sha1-compress.asm | 245 +++++++++++++++++++++++++++++++++++++++++
 arm64/machine.m4               |   7 ++
 2 files changed, 252 insertions(+)
 create mode 100644 arm64/crypto/sha1-compress.asm

diff --git a/arm64/crypto/sha1-compress.asm b/arm64/crypto/sha1-compress.asm
new file mode 100644
index 00000000..bb3f1d35
--- /dev/null
+++ b/arm64/crypto/sha1-compress.asm
@@ -0,0 +1,245 @@
+C arm64/crypto/sha1-compress.asm
+
+ifelse(`
+   Copyright (C) 2021 Mamone Tarsha
+
+   Based on sha1-arm.c - ARMv8 SHA extensions using C intrinsics of
+   repository https://github.com/noloader/SHA-Intrinsics
+   sha1-arm.c is written and placed in public domain by Jeffrey Walton,
+   based on code from ARM, and by Johannes Schneiders, Skip
+   Hovsmith and Barry O'Rourke for the mbedTLS project.
+
+   This file is part of GNU Nettle.
+
+   GNU Nettle is free software: you can redistribute it and/or
+   modify it under the terms of either:
+
+     * the GNU Lesser General Public License as published by the Free
+       Software Foundation; either version 3 of the License, or (at your
+       option) any later version.
+
+   or
+
+     * the GNU General Public License as published by the Free
+       Software Foundation; either version 2 of the License, or (at your
+       option) any later version.
+
+   or both in parallel, as here.
+
+   GNU Nettle is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received copies of the GNU General Public License and
+   the GNU Lesser General Public License along with this program.  If
+   not, see http://www.gnu.org/licenses/.
+')
+
+.file "sha1-compress.asm"
+.arch armv8-a+crypto
+
+.text
+
+C Register usage:
+
+define(`STATE', `x0')
+define(`INPUT', `x1')
+
+define(`CONST0', `v0')
+define(`CONST1', `v1')
+define(`CONST2', `v2')
+define(`CONST3', `v3')
+define(`MSG0', `v4')
+define(`MSG1', `v5')
+define(`MSG2', `v6')
+define(`MSG3', `v7')
+define(`ABCD', `v16')
+define(`ABCD_SAVED', `v17')
+define(`E0', `v18')
+define(`E0_SAVED', `v19')
+define(`E1', `v20')
+define(`TMP0', `v21')
+define(`TMP1', `v22')
+
+C void nettle_sha1_compress(uint32_t *state, const uint8_t *input)
+
+PROLOGUE(nettle_sha1_compress)
+    C Initialize constants
+    mov     w2,#0x7999
+    movk    w2,#0x5A82,lsl #16
+    dup     CONST0.4s,w2
+    mov     w2,#0xEBA1
+    movk    w2,#0x6ED9,lsl #16
+    dup     CONST1.4s,w2
+    mov     w2,#0xBCDC
+    movk    w2,#0x8F1B,lsl #16
+    dup     CONST2.4s,w2
+    mov     w2,#0xC1D6
+    movk    w2,#0xCA62,lsl #16
+    dup     CONST3.4s,w2
+
+    C Load state
+    add     x2,STATE,#16
+    movi    E0.4s,#0
+    ld1     {ABCD.4s},[STATE]
+    ld1     {E0.s}[0],[x2]
+
+    C Save state
+    mov     ABCD_SAVED.16b,ABCD.16b
+    mov     E0_SAVED.16b,E0.16b
+
+    C Load message
+    ld1     {MSG0.16b,MSG1.16b,MSG2.16b,MSG3.16b},[INPUT]
+
+    C Reverse for little endian
+    rev32   MSG0.16b,MSG0.16b
+    rev32   MSG1.16b,MSG1.16b
+    rev32   MSG2.16b,MSG2.16b
+    rev32   MSG3.16b,MSG3.16b
+
+    add     TMP0.4s,MSG0.4s,CONST0.4s
+    add     TMP1.4s,MSG1.4s,CONST0.4s
+
+    C Rounds 0-3
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1c   QFP(ABCD),SFP(E0),TMP0.4s
+    add     TMP0.4s,MSG2.4s,CONST0.4s
+    sha1su0 MSG0.4s,MSG1.4s,MSG2.4s
+
+    C Rounds 4-7
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1c   QFP(ABCD),SFP(E1),TMP1.4s
+    add     TMP1.4s,MSG3.4s,CONST0.4s
+    sha1su1 MSG0.4s,MSG3.4s
+    sha1su0 MSG1.4s,MSG2.4s,MSG3.4s
+
+    C Rounds 8-11
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1c   QFP(ABCD),SFP(E0),TMP0.4s
+    add     TMP0.4s,MSG0.4s,CONST0.4s
+    sha1su1 MSG1.4s,MSG0.4s
+    sha1su0 MSG2.4s,MSG3.4s,MSG0.4s
+
+    C Rounds 12-15
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1c   QFP(ABCD),SFP(E1),TMP1.4s
+    add     TMP1.4s,MSG1.4s,CONST1.4s
+    sha1su1 MSG2.4s,MSG1.4s
+    sha1su0 MSG3.4s,MSG0.4s,MSG1.4s
+
+    C Rounds 16-19
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1c   QFP(ABCD),SFP(E0),TMP0.4s
+    add     TMP0.4s,MSG2.4s,CONST1.4s
+    sha1su1 MSG3.4s,MSG2.4s
+    sha1su0 MSG0.4s,MSG1.4s,MSG2.4s
+
+    C Rounds 20-23
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E1),TMP1.4s
+    add     TMP1.4s,MSG3.4s,CONST1.4s
+    sha1su1 MSG0.4s,MSG3.4s
+    sha1su0 MSG1.4s,MSG2.4s,MSG3.4s
+
+    C Rounds 24-27
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E0),TMP0.4s
+    add     TMP0.4s,MSG0.4s,CONST1.4s
+    sha1su1 MSG1.4s,MSG0.4s
+    sha1su0 MSG2.4s,MSG3.4s,MSG0.4s
+
+    C Rounds 28-31
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E1),TMP1.4s
+    add     TMP1.4s,MSG1.4s,CONST1.4s
+    sha1su1 MSG2.4s,MSG1.4s
+    sha1su0 MSG3.4s,MSG0.4s,MSG1.4s
+
+    C Rounds 32-35
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E0),TMP0.4s
+    add     TMP0.4s,MSG2.4s,CONST2.4s
+    sha1su1 MSG3.4s,MSG2.4s
+    sha1su0 MSG0.4s,MSG1.4s,MSG2.4s
+
+    C Rounds 36-39
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E1),TMP1.4s
+    add     TMP1.4s,MSG3.4s,CONST2.4s
+    sha1su1 MSG0.4s,MSG3.4s
+    sha1su0 MSG1.4s,MSG2.4s,MSG3.4s
+
+    C Rounds 40-43
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1m   QFP(ABCD),SFP(E0),TMP0.4s
+    add     TMP0.4s,MSG0.4s,CONST2.4s
+    sha1su1 MSG1.4s,MSG0.4s
+    sha1su0 MSG2.4s,MSG3.4s,MSG0.4s
+
+    C Rounds 44-47
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1m   QFP(ABCD),SFP(E1),TMP1.4s
+    add     TMP1.4s,MSG1.4s,CONST2.4s
+    sha1su1 MSG2.4s,MSG1.4s
+    sha1su0 MSG3.4s,MSG0.4s,MSG1.4s
+
+    C Rounds 48-51
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1m   QFP(ABCD),SFP(E0),TMP0.4s
+    add     TMP0.4s,MSG2.4s,CONST2.4s
+    sha1su1 MSG3.4s,MSG2.4s
+    sha1su0 MSG0.4s,MSG1.4s,MSG2.4s
+
+    C Rounds 52-55
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1m   QFP(ABCD),SFP(E1),TMP1.4s
+    add     TMP1.4s,MSG3.4s,CONST3.4s
+    sha1su1 MSG0.4s,MSG3.4s
+    sha1su0 MSG1.4s,MSG2.4s,MSG3.4s
+
+    C Rounds 56-59
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1m   QFP(ABCD),SFP(E0),TMP0.4s
+    add     TMP0.4s,MSG0.4s,CONST3.4s
+    sha1su1 MSG1.4s,MSG0.4s
+    sha1su0 MSG2.4s,MSG3.4s,MSG0.4s
+
+    C Rounds 60-63
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E1),TMP1.4s
+    add     TMP1.4s,MSG1.4s,CONST3.4s
+    sha1su1 MSG2.4s,MSG1.4s
+    sha1su0 MSG3.4s,MSG0.4s,MSG1.4s
+
+    C Rounds 64-67
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E0),TMP0.4s
+    add     TMP0.4s,MSG2.4s,CONST3.4s
+    sha1su1 MSG3.4s,MSG2.4s
+    sha1su0 MSG0.4s,MSG1.4s,MSG2.4s
+
+    C Rounds 68-71
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E1),TMP1.4s
+    add     TMP1.4s,MSG3.4s,CONST3.4s
+    sha1su1 MSG0.4s,MSG3.4s
+
+    C Rounds 72-75
+    sha1h   SFP(E1),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E0),TMP0.4s
+
+    C Rounds 76-79
+    sha1h   SFP(E0),SFP(ABCD)
+    sha1p   QFP(ABCD),SFP(E1),TMP1.4s
+
+    C Combine state
+    add     E0.4s,E0.4s,E0_SAVED.4s
+    add     ABCD.4s,ABCD.4s,ABCD_SAVED.4s
+
+    C Store state
+    st1     {ABCD.4s},[STATE]
+    st1     {E0.s}[0],[x2]
+
+    ret
+EPILOGUE(nettle_sha1_compress)
diff --git a/arm64/machine.m4 b/arm64/machine.m4
index e69de29b..7df62bcc 100644
--- a/arm64/machine.m4
+++ b/arm64/machine.m4
@@ -0,0 +1,7 @@
+C Get 32-bit floating-point register from vector register
+C SFP(VR)
+define(`SFP',``s'substr($1,1,len($1))')
+
+C Get 128-bit floating-point register from vector register
+C QFP(VR)
+define(`QFP',``q'substr($1,1,len($1))')
-- 
2.25.1
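
For readers who want to cross-check the round structure of the patch without
arm64 hardware, here is a minimal scalar sketch in Python of the same
compression step. It is not part of the patch and not nettle code. The names
sha1_compress, sha1_single_block, ROUND_FUNCS and the other helpers are
hypothetical, chosen for this illustration. The sketch assumes the standard
SHA-1 definition: 80 rounds in four groups of 20, using choose, parity,
majority, parity, which is the same order in which the assembly issues sha1c,
sha1p, sha1m and sha1p. The constants are put together from 16-bit halves the
way the mov/movk pairs do it.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

# Round constants, assembled like the patch's mov/movk pairs:
# low 16 bits first, then the high 16 bits shifted left by 16.
K = [
    0x7999 | (0x5A82 << 16),  # rounds 0-19  (CONST0 in the patch)
    0xEBA1 | (0x6ED9 << 16),  # rounds 20-39 (CONST1)
    0xBCDC | (0x8F1B << 16),  # rounds 40-59 (CONST2)
    0xC1D6 | (0xCA62 << 16),  # rounds 60-79 (CONST3)
]

# Initial SHA-1 state (A, B, C, D, E) from the SHA-1 specification.
IV = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]


def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def choose(b, c, d):
    return (b & c) | (~b & d & MASK)


def parity(b, c, d):
    return b ^ c ^ d


def majority(b, c, d):
    return (b & c) | (b & d) | (c & d)


# One function per group of 20 rounds, in the order the assembly
# uses sha1c, sha1p, sha1m, sha1p.
ROUND_FUNCS = [choose, parity, majority, parity]


def sha1_compress(state, block):
    """Hypothetical scalar model: update a 5-word state with one 64-byte block."""
    # Big-endian load; the assembly gets the same effect with rev32.
    w = list(struct.unpack(">16I", block))
    # Message schedule; sha1su0/sha1su1 compute this four words at a time.
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        f = ROUND_FUNCS[t // 20]
        tmp = (rotl(a, 5) + f(b, c, d) + e + K[t // 20] + w[t]) & MASK
        a, b, c, d, e = tmp, a, rotl(b, 30), c, d
    # "Combine state": add the working variables to the saved state.
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e))]


def sha1_single_block(msg):
    """Hash a message of at most 55 bytes, i.e. one padded block."""
    assert len(msg) <= 55
    padded = (msg + b"\x80" + b"\x00" * (55 - len(msg))
              + struct.pack(">Q", 8 * len(msg)))
    return struct.pack(">5I", *sha1_compress(IV, padded))


if __name__ == "__main__":
    print(sha1_single_block(b"abc").hex())
    print(hashlib.sha1(b"abc").hexdigest())
```

Run as a script, it prints the digest of "abc" from the sketch and from
hashlib for comparison. The sketch only models what the function computes; it
says nothing about the performance figures quoted above.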