Jeffrey Walton noloader@gmail.com writes:
On Mon, Mar 12, 2018 at 2:40 PM, Niels Möller nisse@lysator.liu.se wrote:
nisse@lysator.liu.se (Niels Möller) writes: ...
Now wired up for fat builds, changes pushed to the same branch.
Looks good on a Celeron J3455 (https://www.amazon.com/dp/B01LYCDG4H):
Without --enable-fat
           md2 update     6.88
           md4 update   570.47
           md5 update   383.59
   openssl md5 update   444.94
          sha1 update   238.53
  openssl sha1 update  1323.53
        sha224 update   110.07
        sha256 update   110.25
        sha384 update   173.90
        sha512 update   174.35
    sha512-224 update   174.30
    sha512-256 update   174.08
With --enable-fat
           md2 update     6.89
           md4 update   569.68
           md5 update   382.82
   openssl md5 update   444.76
          sha1 update  1192.25
  openssl sha1 update  1324.47
        sha224 update   494.33
        sha256 update   495.22
        sha384 update   173.87
        sha512 update   174.33
So you get a 5 times speedup for sha1 and 4.5 times for sha256. Nice!
On gcc67 (AMD Ryzen 5 2400G), I measure 3 times and 4.8 times speedup, respectively.
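For anyone who wants to check the Celeron ratios, here is a small sketch that recomputes them from the numbers quoted above. The dictionary and function names are only illustrative and are not part of nettle or its benchmark tool; the figures are copied as printed, in whatever unit the benchmark reports.

```python
# Throughput figures copied from the quoted benchmark output (Celeron J3455).
# Only a few rows are included; names below are illustrative, not nettle APIs.
without_fat = {"sha1": 238.53, "sha256": 110.25, "sha512": 174.35}
with_fat = {"sha1": 1192.25, "sha256": 495.22, "sha512": 174.33}


def speedup(before, after):
    """Return after/before for every algorithm present in both runs."""
    return {name: after[name] / before[name] for name in before if name in after}


ratios = speedup(without_fat, with_fat)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

sha512 is included as a control: it has no sha_ni code path in these runs, so its ratio should stay close to 1.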
Now, I think there are also opportunities for improving sha1 and sha256 without sha_ni, but that's a more difficult project: it means carefully taking data dependencies into account and dealing with hard-to-predict x86 scheduling.
Regards, /Niels