Re: Micro optimizations of the umac context structs

16 Apr 2013


      Nikos Mavrogiannopoulos n.mavrogiannopoulos@gmail.com writes:
...
On Tue, Apr 16, 2013 at 1:08 PM, Niels Möller nisse@lysator.liu.se wrote:
...
>> And I'm not sure how much difference to performance it would really
>> make. I guess it's not worth doing unless there's a large demonstrated
>> gain in performance.
>
> The results will be very CPU-specific. If you have any benchmark or test
> code, I could test on i7 and amd 64 cpus.

No, I don't have any good benchmark. But maybe it matters mostly for
code which is close to memory bandwidth limits.

Speaking of benchmarks, I've written some more umac assembly (not yet in
the public repo, I'll try to get it in later today).

x86_64 (Intel i5, 3.4 GHz):

  Algorithm  mode     Mbyte/s
  sha256     update    286.04
  sha512     update    433.52
  umac32     update  17837.65
  umac64     update   8364.80
  umac96     update   6447.72
  umac128    update   5270.74

ARM (Cortex-A9, 1 GHz):

  Algorithm  mode     Mbyte/s
  sha256     update     31.69
  sha512     update     30.38
  umac32     update    937.02
  umac64     update    464.81
  umac96     update    383.02
  umac128    update    350.13

So umac128 seems to be an order of magnitude faster than sha2, at least on
machines with decent multiplication performance.
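
As a rough check of the "order of magnitude" remark, the ratios can be
computed directly from the figures quoted in this message. This is only a
minimal sketch: the dictionary and function names are made up for
illustration and are not part of nettle or its benchmark program.

```python
# Throughput figures in Mbyte/s, copied from the two benchmark runs
# quoted in this message. The names X86_64, ARM and speedup() are
# illustrative only; they do not come from nettle.
X86_64 = {
    "sha256": 286.04,
    "sha512": 433.52,
    "umac32": 17837.65,
    "umac64": 8364.80,
    "umac96": 6447.72,
    "umac128": 5270.74,
}

ARM = {
    "sha256": 31.69,
    "sha512": 30.38,
    "umac32": 937.02,
    "umac64": 464.81,
    "umac96": 383.02,
    "umac128": 350.13,
}


def speedup(table, fast, slow):
    """Return how many times higher the throughput of `fast` is than `slow`."""
    return table[fast] / table[slow]


if __name__ == "__main__":
    for name, table in (("x86_64", X86_64), ("ARM", ARM)):
        for sha in ("sha256", "sha512"):
            ratio = speedup(table, "umac128", sha)
            print(f"{name}: umac128 vs {sha}: {ratio:.1f}x")
```

With these numbers the ratio is above 10 in all four cases (roughly 18x
and 12x on the i5, roughly 11x and 11.5x on the Cortex-A9).
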
Regards,
/Niels
-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
