---------- Forwarded message ---------
From: Maamoun TK <maamoun.tk(a)googlemail.com>
Date: Thu, Nov 12, 2020 at 7:42 PM
Subject: Re: [PowerPC] GCM optimization
To: Niels Möller <nisse(a)lysator.liu.se>
On Thu, Nov 12, 2020 at 6:40 PM Niels Möller <nisse(a)lysator.liu.se> wrote:
> I gave it a test run on gcc112 in the gcc compile farm, and speedup of
> gcm update seems to be 26 times(!) compared to the C version.
>
That's reasonable; I got a similar speedup on more stable POWER instances
than the gcc compile farm.
> Where would that documentation be published? In the Nettle manual, as
> some IBM white paper, or as a more-or-less academic paper, e.g., on
> arxiv? I will not be able to spend much time on writing, but I'd be
> happy to review.
>
I'll start writing the paper once I get more details from IBM. Like the
Intel documents, it will be academic and practical at the same time: I'll
go through the finite field equations to show how we get there, and add a
practical example to clarify the advantages of this method along with its
expected speedup. My intention is that other crypto libraries could take
advantage of this document, or that it could be a starting point for
further improvements to the algorithm, so I'm checking whether IBM would
publish or approve such a document, the same as Intel.
> I have a sketch of ARM Neon code doing the equivalent of two vpmsumd,
> with reasonable parallelism. Quite a lot of instructions needed.
>
If you don't have much time, you can send it here and I'll continue from
that point. I'm planning to compare the new method with the usual method,
both with and without the Karatsuba algorithm.
> +C Alignment of gcm_key table elements, which is declared in gcm.h
> > +define(`TableElemAlign', `0x100')
>
> I still find this large constant puzzling. If I try
>
> struct gcm_key key;
> printf("sizeof (key): %zd, sizeof(key.h[0]): %zd\n", sizeof(key),
> sizeof(key.h[0]));
>
> (I added it to the start of test_main in gcm-test.c) and run on the
> gcc112 machine, I get
>
> sizeof (key): 4096, sizeof(key.h[0]): 16
>
> Which is what I'd expect, with elements of size 16 bytes, not 256 bytes.
>
> I haven't yet had the time to read the code carefully.
>
You see, the alignment of each element is 0x100 (256). The table has 16
elements, and the table size you got, 4096, is reasonable because
16*256 = 4096.
regards,
Mamone
Hi,
Nettle includes a function for side-channel silent modular inversion,
which asymptotically is O(n^2), like binary gcd but slower by a pretty
large constant factor.
For prime moduli, inversion can also be done via a^{-1} = a^{p-2} (mod
p). That's asymptotically O(n^3) (if the underlying multiplies are done
with the basic O(n^2) algorithm), but it may nevertheless be faster than
the other method for numbers of the size used for Nettle's elliptic
curves.
The code for curve25519 and curve448 has been using powering to invert
for a long time. I've now spent some time writing specific powering code
for the five secp curves as well. I've found fairly efficient addition
chains where powering for a prime of n bits needs n-1 squarings and
about a dozen multiplies. (I don't know what the *optimal* addition
chains are, if you know of tools for that, let me know).
Code is on the branch optimize-ecc-invert. However, there's some risk
the new code is slower on some platforms, in particular platforms
with slow multiplication.
The main benchmark is
./examples/hogweed-benchmark ecdsa
To get numbers for just the changed function, one can run
./examples/ecc-benchmark
and look at the modinv column. Please compare performance between this
branch and master, on the platforms that are important to you.
And to get relevant numbers, make sure to build using a recent GMP
library; performance when built with mini-gmp is not so important.
I will merge these changes to master in a week or two, if no problems
show up.
I've benchmarked on one recent and one older x86_64 machine, and on
raspberry-pi version 1 (the slowest ARM machine I had easy access to).
I've seen improvements in ecdsa signing performance for all the secp
curves, ranging from 5% up to 20% depending on curve and platform.
Not all inversions are rewritten. I haven't changed the modq inversion
which is needed for ecdsa, and I haven't changed the code for the two
supported gost curves.
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
Hi,
It's well known that SHA-1 is broken. I don't want to save it. But,
particularly when dealing with data at rest, there are cases where one
has to use SHA-1. It would be nice if Nettle integrated SHA-1
collision detection to make that a tiny bit safer:
https://github.com/cr-marcstevens/sha1collisiondetection
That library is under the MIT license, and apparently detects known
attacks against SHA-1:
[The routines] will compute the SHA-1 hash of any given file and
additionally will detect cryptanalytic collision attacks against
SHA-1 present in each file. It is very fast and takes less than
twice the amount of time as regular SHA-1.
More specifically they will detect any cryptanalytic collision
attack against SHA-1 using any of the top 32 SHA-1 disturbance
vectors with probability 1: ...
The possibility of false positives can be neglected as the
probability is smaller than 2^-90.
Thanks,
:) Neal
Hi all,
My project is Cryptofuzz (https://github.com/guidovranken/cryptofuzz) which
uses differential fuzzing to find correctness bugs (and memory bugs as
well) in popular cryptographic libraries.
It has bindings for Nettle and it tests many of the library's features:
https://github.com/guidovranken/cryptofuzz/blob/master/modules/nettle/modul…
I've been running this on Google's OSS-Fuzz (
https://github.com/google/oss-fuzz) for a while, and today it found a bug
in the blowfish key setter function.
Niels and other maintainers (if any), if you would like to be notified by
e-mail of bugs found by Cryptofuzz, please send me your e-mail address and
I will add you to the project. The e-mail address needs to be linked to a
Google account in order to access the dashboard at oss-fuzz.com.
Bug reproducer below:
-------
#include <nettle/blowfish.h>
int main(void)
{
    const unsigned char key[] = {0xec, 0x00, 0x3a, 0x06, 0x73, 0x61, 0x74,
                                 0x20, 0x74, 0xab, 0xe2, 0xc6, 0x61, 0x8b, 0x98, 0x89};
    struct blowfish_ctx ctx;
    blowfish_set_key(&ctx, sizeof(key), key);
    return 0;
}
-------
If you compile with Clang and -fsanitize=undefined, this will print:
blowfish.c:388:22: runtime error: left shift of 236 by 24 places cannot be
represented in type 'int'
Explicit casting around the shifted values will fix this.