I've started to read up on ECC again (and I've also spent some time reading up on ARM architecture, but that's for a different message).
As I wrote earlier, for a start I'll limit the support to the standard NIST curves. So the first question is naming: what are the most established names for these curves? It seems gnutls uses, e.g., "SECP256R1"; I guess we can just follow that (rather than, e.g., "P-256", as it's called in the DSA spec FIPS 186-3)?
My current plan is as follows:
1. The top-level functions for ECDSA and the like will use mpz_t to represent point coordinates, exponents, etc, just like the current DSA code.
2. Internally, I think I want to use GMP's lower-level mpn interface, for two reasons: (i) that should make it possible, with some effort, to use ecc functions without any allocation inside Nettle. (ii) For performance, I'd like to do specific mod p functions for the various curves, and this is done best on top of functions such as mpn_add_n and mpn_addmul_1. After all, the primes p used for the curves are carefully chosen to have special structure, to enable optimizations.
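To illustrate why the special structure of these primes matters, here's a toy sketch (my own illustration, not Nettle code) using the single-word Mersenne prime 2^61 - 1, where reduction is just folding the high bits back onto the low bits, with no division. The NIST primes admit the same kind of folding, spread over several limbs with mpn_add_n, mpn_addmul_1 and friends (e.g. secp192r1's p = 2^192 - 2^64 - 1). Requires a compiler with unsigned __int128.

```c
#include <stdint.h>

/* p = 2^61 - 1, so 2^61 == 1 (mod p): a wide product is reduced
   by folding its high bits onto the low bits instead of dividing. */
#define P61 ((UINT64_C(1) << 61) - 1)

/* Multiply mod p without any division instruction.
   Requires a, b < p. */
uint64_t mulmod_p61 (uint64_t a, uint64_t b)
{
  unsigned __int128 t = (unsigned __int128) a * b;
  uint64_t lo = (uint64_t) t & P61;    /* t mod 2^61 */
  uint64_t hi = (uint64_t) (t >> 61);  /* t div 2^61, == its value mod p */
  uint64_t r = lo + hi;
  /* two branch-free folds bring r into [0, p] */
  r = (r & P61) + (r >> 61);
  r = (r & P61) + (r >> 61);
  return r == P61 ? 0 : r;
}
```

The multi-limb versions for the real curves would do the analogous folds with mpn primitives, but the principle is the same: the prime's structure turns reduction into a few additions.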
3. Representation of a curve will for now be an opaque struct. The header will just declare
struct ecc_curve;
extern const struct ecc_curve nettle_secp256r1;
Tables for point multiplication will be compiled into the library. They will be somewhat machine dependent, depending on the GMP limb size, so they have to be generated at build time.
4. I think I'll use Jacobian coordinates. These have a penalty for addition of arbitrary points, but we can try to use multiplication algorithms where one of the points being added always is a constant, which can be represented in normalized form. Then there's no penalty over plain homogeneous coordinates (this is the operation called "madd" in gnutls sources).
I've been intrigued by the recent (2007) breakthrough of Edwards curves, but as far as I understand, that can't easily be applied to the standard NIST curves. See http://cr.yp.to/newelliptic/newelliptic.html.
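To make the coordinate choice concrete, here's a toy sketch of Jacobian doubling over a tiny field, checked against the affine formula. The curve parameters (a=2, b=3 over GF(97), with the point (0,10)) are purely illustrative, not from any standard curve; the point is that the Jacobian version needs no field inversion.

```c
#include <stdint.h>

/* Toy curve y^2 = x^3 + A*x + b over GF(P); a Jacobian point
   (X,Y,Z) represents the affine point (X/Z^2, Y/Z^3). */
#define P 97
#define A 2

uint32_t powmod (uint32_t b, uint32_t e)
{
  uint32_t r = 1;
  for (; e; e >>= 1, b = b * b % P)
    if (e & 1) r = r * b % P;
  return r;
}

uint32_t inv (uint32_t x) { return powmod (x, P - 2); }  /* Fermat */

/* Affine doubling: lambda = (3x^2 + a) / (2y), needs an inversion. */
void dup_affine (uint32_t x, uint32_t y, uint32_t *x3, uint32_t *y3)
{
  uint32_t l = (3 * x % P * x + A) % P * inv (2 * y % P) % P;
  *x3 = (l * l % P + 2 * (P - x)) % P;
  *y3 = (l * ((x + P - *x3) % P) % P + P - y) % P;
}

/* Jacobian doubling: S = 4XY^2, M = 3X^2 + aZ^4,
   X' = M^2 - 2S, Y' = M(S - X') - 8Y^4, Z' = 2YZ.
   Only field mul/add, no inversion. */
void dup_jacobian (uint32_t x, uint32_t y, uint32_t z,
                   uint32_t *x3, uint32_t *y3, uint32_t *z3)
{
  uint32_t yy = y * y % P;
  uint32_t s = 4 * x % P * yy % P;
  uint32_t zz = z * z % P;
  uint32_t m = (3 * x % P * x + A * zz % P * zz) % P;
  *x3 = (m * m % P + 2 * (P - s)) % P;
  *y3 = (m * ((s + P - *x3) % P) % P + P - 8 * yy % P * yy % P) % P;
  *z3 = 2 * y % P * z % P;
}
```

Converting back to affine costs one inversion at the very end (normalization), which is the whole point: one inversion per multiplication instead of one per group operation.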
5. For point multiplication with the fixed generator, I think I'll use the L=1 special case of Pippenger's algorithm, described in Daniel Bernstein's paper http://cr.yp.to/papers/pippenger.pdf. I think this is more or less the same as the "comb" method or "Lim-Lee"-method, but predating any bogus patents on the algorithm.
(I can try to summarize the algorithm later, but it's pretty simple once one gets past the notation).
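Here is a sketch of the comb idea, using modular multiplication as a stand-in for point addition and squaring as a stand-in for doubling (all names and parameters are mine, for illustration only). The N-bit exponent is split into K rows of D = N/K columns; a precomputed table covers all 2^K tooth patterns, so the main loop needs only D "doublings" instead of N.

```c
#include <stdint.h>

#define PMOD 1000003u  /* toy prime modulus, illustrative only */

uint64_t mulm (uint64_t a, uint64_t b) { return a * b % PMOD; }

#define N 20   /* exponent bits */
#define K 4    /* comb teeth (rows) */
#define D 5    /* columns, D = N/K */

/* table[m] = g^(sum over set bits i of m of 2^(i*D)),
   i.e. one entry per tooth pattern. */
void comb_precompute (uint64_t g, uint64_t table[1 << K])
{
  uint64_t pow2id[K];  /* pow2id[i] = g^(2^(i*D)) */
  int i, j, m;

  table[0] = 1;
  for (i = 0; i < K; i++)
    {
      pow2id[i] = g;
      for (j = 0; j < i * D; j++)
        pow2id[i] = mulm (pow2id[i], pow2id[i]);
    }
  for (m = 1; m < (1 << K); m++)
    {
      for (i = 0; !((m >> i) & 1); i++)  /* lowest set tooth */
        ;
      table[m] = mulm (table[m & (m - 1)], pow2id[i]);
    }
}

uint64_t comb_powm (const uint64_t table[1 << K], uint64_t e)
{
  uint64_t acc = 1;
  int i, j;

  for (j = D - 1; j >= 0; j--)
    {
      unsigned digit = 0;
      acc = mulm (acc, acc);         /* one shared "doubling" per column */
      for (i = 0; i < K; i++)        /* gather bit j of each row */
        digit |= ((e >> (i * D + j)) & 1) << i;
      acc = mulm (acc, table[digit]);
    }
  return acc;
}
```

With elliptic curve points instead of residues, the D iterations each cost one doubling and one (table) addition, versus N doublings for the plain binary method; larger K trades table size for fewer iterations.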
I think this algorithm is fairly easy to make silent with regard to timing and cache side-channels. One nice feature is that even with pretty large tables, not all of the table index bits are data-dependent (extracted from the exponent). Then it should be practical to read all possible table entries and use masking to copy the selected one; see http://gmplib.org:8000/gmp/file/eee6b8e54765/mpn/generic/tabselect.c.
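The masking trick looks something like this (a simplified sketch in the style of the linked tabselect.c, not the actual GMP code): every entry is read, and a branch-free mask keeps only the wanted one, so the memory access pattern is independent of the secret index.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy table entry number secret_idx (entry_size words each) into rp,
   touching every entry so the access pattern leaks nothing. */
void sec_tabselect (uint64_t *rp, const uint64_t *table,
                    size_t entries, size_t entry_size, size_t secret_idx)
{
  size_t i, j;

  for (j = 0; j < entry_size; j++)
    rp[j] = 0;
  for (i = 0; i < entries; i++)
    {
      /* mask is all-ones iff i == secret_idx, computed without a branch:
         (d-1) & ~d has its top bit set exactly when d == 0. */
      uint64_t d = (uint64_t) (i ^ secret_idx);
      uint64_t mask = - (uint64_t) (((d - 1) & ~d) >> 63);
      for (j = 0; j < entry_size; j++)
        rp[j] |= table[i * entry_size + j] & mask;
    }
}
```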
This operation is the workhorse for ECDSA signing, and for the first half of ECDH.
I'm not sure how large tables it makes sense to create for each curve, but I guess somewhere between a few KB and at most 100 KB. This could be made configurable at compile time, but I'd expect almost all users to stick to the default tables.
6. For general point multiplication, I tend towards using a simple non-sliding window, with a small point-specific table computed at run time. If the table is small enough, it might make sense to normalize the entries. As far as I understand, one can't beat the plain binary algorithm by a large factor; the algorithms I'm aware of only reduce the number of adds, but not the number of dups.
This operation is the workhorse for the second half of ECDH.
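A sketch of such a fixed (non-sliding) window, again with modular multiplication standing in for the group operation (modulus, names and window width are illustrative): every window is processed, zero or not, so the sequence of squarings and multiplications depends only on the exponent's size. (The table lookup here is still indexed by secret data; a fully silent version would additionally use masked table access.)

```c
#include <stdint.h>

#define Q 1000003u  /* toy prime modulus, illustrative only */
#define W 4         /* window width in bits */

/* Fixed-window exponentiation with a small per-base table built at
   run time, mirroring the per-point table for general point
   multiplication. ebits bounds the exponent's bit length. */
uint64_t window_powm (uint64_t g, uint64_t e, unsigned ebits)
{
  uint64_t table[1 << W];
  uint64_t acc = 1;
  int i, j;

  table[0] = 1;
  for (i = 1; i < (1 << W); i++)
    table[i] = table[i - 1] * g % Q;

  /* walk the exponent in W-bit windows, most significant first */
  for (j = (int) ((ebits + W - 1) / W) * W - W; j >= 0; j -= W)
    {
      for (i = 0; i < W; i++)
        acc = acc * acc % Q;                          /* W "doublings" */
      acc = acc * table[(e >> j) & ((1u << W) - 1)] % Q;  /* always multiply */
    }
  return acc;
}
```

A sliding window would skip the multiplication for zero windows, saving some additions on average; the fixed window gives up exactly that saving to keep the operation sequence constant.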
7. For verifying ECDSA signatures, one needs to operate on two points, computing e_1 G + e_2 Y, where G is the generator (and hence offers possibilities for precomputation), and Y is the public key.
I haven't yet looked into this in detail, but it seems clear this should be done as a single operation, with only one chain of point duplications. Preferably, it should be implemented in such a way that the tables for G (from 5 above) can be reused.
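The standard way to share one doubling chain between the two scalars is Shamir's trick. Here's a toy sketch in a multiplicative group (my own illustrative code; modular multiplication again stands in for point addition): a four-entry table of g^a * y^b for a,b in {0,1} lets both exponents be consumed by a single chain of squarings.

```c
#include <stdint.h>

#define M 1000003u  /* toy prime modulus, illustrative only */

/* Compute g^e1 * y^e2 with one shared squaring chain, the analogue
   of computing e1*G + e2*Y with one chain of point doublings.
   bits bounds the length of both exponents. */
uint64_t dual_powm (uint64_t g, uint64_t e1, uint64_t y, uint64_t e2,
                    unsigned bits)
{
  uint64_t t[4];
  uint64_t acc = 1;
  int j;

  t[0] = 1; t[1] = g; t[2] = y; t[3] = g * y % M;
  for (j = (int) bits - 1; j >= 0; j--)
    {
      unsigned d = ((e1 >> j) & 1) | (((e2 >> j) & 1) << 1);
      acc = acc * acc % M;    /* shared "doubling" */
      acc = acc * t[d] % M;
    }
  return acc;
}
```

For the ecc case, the entries involving G could be taken from the fixed-generator tables, which is the reuse hinted at above. Verification operates only on public data, so no constant-time care is needed here.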
Comments? Does this make sense, or is there anything which ought to be done differently?
I have had a quick look at the gnutls code. wmNAF makes me a bit uneasy when I think about side channels; there seems to be a lot of branches depending on exponent bits.
In the point addition, it seems hard to avoid special cases when adding a point to itself or to its negation, or adding the zero point, which will cause side-channel leaks. I'm not sure what to do about that, but there are some papers on "unified" ecc operations, or maybe one can do something analogous to RSA blinding.
As far as I understand, for point multiplication in both ECDSA and ECDH, it's only the scalar that is ever secret; the points multiplied are always public. Are there other uses of point multiplication where the point is also secret?
Regards, /Niels
On 01/17/2013 10:00 AM, Niels Möller wrote:
As I wrote earlier, for a start I'll limit the support to the standard NIST curves. So the first question is naming: what are the most established names for these curves? It seems gnutls uses, e.g., "SECP256R1"; I guess we can just follow that (rather than, e.g., "P-256", as it's called in the DSA spec FIPS 186-3)?
Any would do. In any case both are used, so aliases from one to another should be good. In the context of TLS the SECP names are used.
- For point multiplication with the fixed generator, I think I'll use the L=1 special case of Pippenger's algorithm, described in Daniel Bernstein's paper http://cr.yp.to/papers/pippenger.pdf. I think this is more or less the same as the "comb" method or "Lim-Lee"-method, but predating any bogus patents on the algorithm.
How does this algorithm compare to others? The reason I insist on performance is that it would be hard for gnutls to switch to the new functions if they are significantly slower than the current ones, and I'd like to switch eventually to nettle for all operations including ECC.
- For general point multiplication, I tend towards using a simple
non-sliding window, with a small point-specific table computed at run time. If the table is small enough, it might make sense to normalize the entries. As far as I understand, one can't beat the plain binary algorithm by a large factor; the algorithms I'm aware of only reduce the number of adds, but not the number of dups.
I really cannot answer with any facts since I need to check the details, but I've not seen other efficient implementations using a non-sliding window for that. Are you sure that the effect is insignificant? Additions aren't that negligible. Also for that part you don't need constant time. In ECDSA you only multiply the fixed generator with the key (in ECDH you also don't care unless it is fixed key ECDH).
- For verifying ECDSA signatures, one needs to operate on two points, computing e_1 G + e_2 Y, where G is the generator (and hence offers possibilities for precomputation), and Y is the public key.
I would put that at lower priority than signing. Servers (where performance matters) rarely verify ECDSA signatures.
I have had a quick look at the gnutls code. wmNAF makes me a bit uneasy when I think about side channels; there seems to be a lot of branches depending on exponent bits.
Why is that? Did you see any issues on the timing resistant version of the function?
In the point addition, it seems hard to avoid special cases when adding a point to itself or to its negation, or adding the zero point.
Do you mean the infinity point or the point (0,0,0)? I think it can be caught by checking whether a point lies on the curve prior to the verification operation ((0,0,0) isn't a point in the projective space either).
As far as I understand, for point multiplication in both ECDSA and ECDH, it's only the scalar that ever is secret, the points multiplied are always public.
Yes.
Are there other uses of point multiplication where also the point is secret?
None that I'm aware of in ECDH or ECDSA.
regards, Nikos
On 01/17/2013 08:54 PM, Nikos Mavrogiannopoulos wrote:
- For general point multiplication, I tend towards using a simple non-sliding window, with a small point-specific table computed at run time. If the table is small enough, it might make sense to normalize the entries. As far as I understand, one can't beat the plain binary algorithm by a large factor; the algorithms I'm aware of only reduce the number of adds, but not the number of dups.
btw. moving from the sliding window version (the one I submitted last year) to wmnaf gave a 7-16% performance boost in transactions/sec on ECDH/ECDSA, depending on the size of the curve. So given that the sliding window method is faster than the simple one, it could make quite a difference.
regards, Nikos
Many thanks for your comments.
Nikos Mavrogiannopoulos n.mavrogiannopoulos@gmail.com writes:
On 01/17/2013 10:00 AM, Niels Möller wrote:
So first question is naming; what's the most established names for these curves?
Any would do. In any case both are used, so aliases from one to another should be good. In the context of TLS the SECP names are used.
I'll stick to the secp names for now then. And I think aliases are a matter for the documentation.
- For point multiplication with the fixed generator, I think I'll use the L=1 special case of Pippenger's algorithm, described in Daniel Bernstein's paper http://cr.yp.to/papers/pippenger.pdf. I think this is more or less the same as the "comb" method or "Lim-Lee"-method, but predating any bogus patents on the algorithm.
How does this algorithm compare to others?
If I read the gnutls code right, ecc_mulmod_cache makes the same number of ecc_projective_dbl_point calls as the non-cached version? If that is right, any comb-like algorithm should be able to beat it, since those algorithms reduce the number of point doublings and the number of iterations in the outer loop. Depending on selected table size, of course.
Do you have any ecc benchmarking code I could borrow?
The reason I insist on performance is that it would be hard for gnutls to switch to the new functions if they are significantly slower than the current ones,
I understand that. But there's going to be some cost to make operations side-channel silent (a lot more on what I mean by that below).
- For general point multiplication, I tend towards using a simple
non-sliding window, with a small point-specific table computed at run time. If the table is small enough, it might make sense to normalize the entries. As far as I understand, one can't beat the plain binary algorithm by a large factor; the algorithms I'm aware of only reduce the number of adds, but not the number of dups.
I really cannot answer with any facts since I need to check the details, but I've not seen other efficient implementations using a non-sliding window for that.
GMP's mpz_powm_sec uses a non-sliding window (and other alternative algorithms to make running time depend only on input size, not on actual values; the most severe slowdown is for division, used when converting inputs to Montgomery representation).
It seems to be 20% slower than mpz_powm for sizes comparable to ecc parameters (I tried with limb size 16):
~/hack/gmp/tune$ ./speed -s16 -r mpz_powm mpz_powm_sec
overhead 0.000000003 secs, precision 10000 units of 6.25e-10 secs, CPU freq 1600.00 MHz
            mpz_powm  mpz_powm_sec
16      #0.000414344        1.2084
Not all of the slowdown is from the non-sliding window.
Also for that part you don't need constant time. [...] (in ECDH you also don't care unless it is fixed key ECDH).
In ECDH, any data dependent timing does leak some information about your secret exponent. What you're saying, if I understand you correctly, is that since you usually use each secret exponent only once, an attacker will in most cases not be able to collect enough information to do any damage.
Anyway, the algorithm choice for this operation is not set in stone, and if we do a version which doesn't promise constant time, there's more freedom in choosing the algorithm.
Additions aren't that negligible. Also for that part you don't need constant time. In ECDSA you only multiply the fixed generator with the key (in ECDH you also don't care unless it is fixed key ECDH).
- For verifying ECDSA signatures, one needs to operate on two points, computing e_1 G + e_2 Y, where G is the generator (and hence offers possibilities for precomputation), and Y is the public key.
I would put that at lower priority than signing. Servers (where performance matters) rarely verify ECDSA signatures.
Noted (although I'm not sure I agree it's less important than signing performance).
I have had a quick look at the gnutls code. wmNAF makes me a bit uneasy when I think about side channels; there seems to be a lot of branches depending on exponent bits.
Why is that? Did you see any issues on the timing resistant version of the function?
Let me first explain what I mean when I talk about a "side-channel silent" function. That means that if we call the function with operands which are of the same size, but otherwise different, the function should execute exactly the same sequence of instructions in both cases, and access memory in exactly the same pattern.
If one then assumes that the underlying machine instructions have data-independent timing (true for most current CPUs), we leak no side information from timing or cache behaviour. We may still leak information through power analysis if, e.g., executing a multiplication instruction consumes different amounts of energy depending on the input bit patterns.
Maybe this is unnecessarily strict (I haven't really thought about what the side-channel attack models are for particular cryptographic uses of ecc), but I think this is what one should aim for. And I think it's possible to achieve, at a moderate cost in performance.
I'm not intimately familiar with wmnaf, but it looks like the function ecc_wMNAF has a lot of branches depending on the exponent bits. I also suspect (but please correct me if I'm wrong) that the number of generated digits is not determined by the exponent size only; it also depends on the actual bits?
Next, in the main ecc_mulmod functions, the timing resistant versions call the madd primitive the same number of times, independent of the values of the digits. But there are some small remaining leaks (by the above pretty high standard):
1. If wmnaf_len depends on the exponent bit pattern, then so does the number of iterations through the loop.
2. Branches depending on whether digit is > 0, < 0 or == 0. Those will have slightly different timing. Since digit == 0 is unlikely, that branch will be badly predicted. And it also affects instruction cache.
3. For the data cache, accesses to the wmnaf table are obviously data dependent. Also, data is written to different locations in memory (R or T) depending on whether or not the current digit is non-zero.
In the point addition, it seems hard to avoid special cases when adding a point to itself or to its negation, or adding the zero point.
Do you mean the infinity points or the point (0,0,0)?
The infinity point, which is the unit element of the group. I denote it 0 below (P + 0 = P for all P).
The core operation of a typical point multiplication algorithm is
R += T[digit];
where R is where the result is accumulated, and T is a table indexed by current digit (plain window, wmnaf, pippenger, whatever). Now there are a couple of cases where this operation will not end up in the common-case point-addition code:
1. R = 0. If the scalar n is in the range 0 < n < group order, this should happen only for the very first iterations of the loop.
2. R = -T[digit] (so R = 0 on output). For the above range of n, this should also not happen (except as R = T[digit] = 0 initially in the loop).
3. T[digit] = 0. This is the digit == 0 case. For best average performance, one wants to handle this case specially and not do any addition at all (and much of the "exponent recoding" tricks are aimed at making this a more common case).
But for constant time, one wants to do an addition just as for any other value. I have to double-check how Jacobian coordinates work, but I think the usual formulas for ecc point addition don't work in this case.
4. R = T[digit]. The point addition code must check for this case, and have a special case branching to point doubling instead. I'm not sure if this can happen; it clearly depends on which point multiplication algorithm is used. E.g., if the tabulated values in T are i*P for 0 <= i < M, and we know that R = j*P for some j in the range M <= j < prime group order, then this case can be excluded.
And in any case, I imagine it will be extremely unlikely for a random scalar.
Regards, /Niels
nisse@lysator.liu.se (Niels Möller) writes:
Let me first explain what I mean when I talk about a "side-channel silent" function. That means that if we call the function with operands which are of the same size, but otherwise different, the function should execute exactly the same sequence of instructions in both cases, and access memory in exactly the same pattern.
To give a flavor of what this means, here's some code (totally untested) I wrote earlier today. It's basic binary point multiplication, but intended to have running time independent of the exponent bits.
Regards, /Niels
/* nettle, low-level cryptographics library
 *
 * Copyright (C) 2013 Niels Möller
 *
 * The nettle library is free software; you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation; either version 2.1 of the License, or (at your
 * option) any later version.
 *
 * The nettle library is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
 * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
 * License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with the nettle library; see the file COPYING.LIB. If not, write to
 * the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
 * MA 02111-1301, USA.
 */
/* Development of Nettle's ECC support was funded by Internetfonden. */
#include <assert.h>
#include "ecc.h"
void
cnd_copy (int cnd, mp_limb_t *rp, const mp_limb_t *ap, mp_size_t n)
{
  mp_limb_t mask, keep;
  mp_size_t i;

  mask = -(mp_limb_t) (cnd != 0);
  keep = ~mask;

  for (i = 0; i < n; i++)
    rp[i] = (rp[i] & keep) + (ap[i] & mask);
}
void
ecc_mul_binary (const struct ecc_curve *ecc,
                mp_limb_t *r, const mp_limb_t *np, const mp_limb_t *p)
{
  mp_limb_t tp[9*ecc->size];
  mp_limb_t *rj = tp + 3*ecc->size;
  mp_limb_t *pj = tp + 6*ecc->size;
  int is_zero;
  unsigned i;

  mpn_zero (rj, 3*ecc->size);

  /* Extend p to jacobian coordinates, with z = 1 */
  mpn_copyi (pj, p, 2*ecc->size);
  pj[2*ecc->size] = 1;
  mpn_zero (pj + 2*ecc->size + 1, ecc->size - 1);

  for (i = ecc->size, is_zero = 1; i-- > 0; )
    {
      mp_limb_t w = np[i];
      mp_limb_t bit;

      for (bit = (mp_limb_t) 1 << (GMP_NUMB_BITS - 1); bit > 0; bit >>= 1)
        {
          int digit;

          ecc_dup_jj (ecc, rj, rj);
          ecc_add_jja (ecc, tp, rj, p);

          digit = (w & bit) > 0;
          /* If is_zero is set, rj is the zero point, and
             ecc_add_jja produced garbage. */
          cnd_copy (is_zero, tp, pj, 3*ecc->size);
          is_zero &= ~digit;
          /* If we had a one-bit, use the sum. */
          cnd_copy (digit, rj, tp, 3*ecc->size);
        }
    }
  ecc_normalize (ecc, r, rj);
}
On Fri, Jan 18, 2013 at 9:35 AM, Niels Möller nisse@lysator.liu.se wrote:
How does this algorithm compare to others?
If I read the gnutls code right, ecc_mulmod_cache makes the same number of ecc_projective_dbl_point calls as the non-cached version? If that is right, any comb-like algorithm should be able to beat it, since those algorithms reduce the number of point doublings and the number of iterations in the outer loop. Depending on selected table size, of course.
In the main loop there is no difference in the wmnaf code, but outside the loop the cached version does one doubling less and several additions less.
Do you have any ecc benchmarking code I could borrow?
I use gnutls-cli's benchmark-tls-kx option. It measures TLS key exchange time (one of them is ECDHE-ECDSA using a 192-bit curve).
Also for that part you don't need constant time. [...] (in ECDH you also don't care unless it is fixed key ECDH).
In ECDH, any data dependent timing does leak some information about your secret exponent. What you're saying, if I understand you correctly, is that since you usually use each secret exponent only once, an attacker will in most cases not be able to collect enough information to do any damage.
Yes. That's the way ECDHE is used in gnutls. No key is reused and the timing information from a single session isn't sufficient to recover it.
I would put that at lower priority than signing. Servers (where performance matters) rarely verify ECDSA signatures.
Noted (although I'm not sure I agree it's less important than signing performance).
Sorry, indeed, that's for TLS only. In TLS usually only the server is authenticated. If there is client authentication too, then this is also important.
Why is that? Did you see any issues on the timing resistant version of the function?
Let me first explain what I mean when I talk about a "side-channel silent" function. That means that if we call the function with operands which are of the same size, but otherwise different, the function should execute exactly the same sequence of instructions in both cases, and access memory in exactly the same pattern.
If one then assumes that the underlying machine instructions have data-independent timing (true for most current CPUs), we leak no side information from timing or cache behaviour. We may still leak information through power analysis if, e.g., executing a multiplication instruction consumes different amounts of energy depending on the input bit patterns.
You are correct, but power analysis is mainly a concern for dedicated devices such as smart cards; a general-purpose library will not be used there anyway. Protection from timing analysis is important, and while its effect has not been precisely measured, the gnutls wmnaf code contains counter-measures for it. If you check the main loop in ecc_mulmod_cached_timing() there is an "else" case where a dummy addition is performed. That is to make the number of operations independent of the input data.
I'm not intimately familiar with wmnaf, but it looks like the function ecc_wMNAF has a lot of branches depending on the exponent bits.
Could be, but ecc_wMNAF is a fixed conversion that doesn't depend on any input from an attacker. That is, the best one can do is associate the key with the time needed for its wmNAF conversion. I don't think this is better than the association with its public key.
Next, in the main ecc_mulmod functions, the timing resistant versions call the madd primitive the same number of times, independent of the values of the digits. But there are some small remaining leaks (by the above pretty high standard):
- If wmnaf_len depends on the exponent bit pattern, then so does the number of iterations through the loop.
This is a fixed association again. An attacker cannot control wmnaf_len or the number of iterations. Timing attacks usually work by the attacker trying different inputs and measuring the different timings. In the cases you describe, as far as I understand, different inputs will not provide additional timing information.
- Branches depending on whether digit is > 0, < 0 or == 0. Those will have slightly different timing. Since digit == 0 is unlikely, that branch will be badly predicted. And it also affects instruction cache.
- For the data cache, accesses to the wmnaf table are obviously data dependent. Also, data is written to different locations in memory (R or T) depending on whether or not the current digit is non-zero.
Could be; these are very cpu dependent. If your argument is that a really safe timing-resistant multiplication has to be used, then I wouldn't disagree, as long as it is not so slow that no one uses it. However, there is no reason for code that does not need to be timing resistant not to be as fast as possible.
In the point addition, it seems hard to avoid special cases when adding a point to itself or to its negation, or adding the zero point.
Do you mean the infinity points or the point (0,0,0)?
The infinity point, which is the unit element of the group. I denote it 0 below (P + 0 = P for all P).
I've seen some implementations that do not handle that at all, but I agree that it has to be handled properly.
regards, Nikos
Nikos Mavrogiannopoulos n.mavrogiannopoulos@gmail.com writes:
On Fri, Jan 18, 2013 at 9:35 AM, Niels Möller nisse@lysator.liu.se wrote:
- If wmnaf_len depends on the exponent bit pattern, then so does the number of iterations through the loop.
This is a fixed association again.
Now I'm getting confused. In ECDSA signing, the point multiplication is
k * G
where k is a nonce (used only once), and G is the (public) group generator. If you are saying that leaking a little information about k via the timing of this multiplication is no problem, then the conclusion seems to be that, for ECDSA, there's no need whatsoever to make the point multiplication timing resistant? Right?
What is the main argument here, that the attacker has no control over k, or that k is used only once?
For each signature (r, s) on hash h using secret key z, the attacker gets to observe the values
h, r, s, with s = (h + z r)/k
I think there *is* a problem with leaking just a little bit of information about each k, since z = (k s - h) / r, and hence every piece of information about k implies a piece of information about z, and the information about z accumulates the more signatures you get.
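To spell out that algebra with toy numbers (a small prime standing in for the group order q; no real curve involved, values chosen arbitrarily): a fully known k gives z immediately, and partial knowledge of many k's leaks z gradually by the same relation.

```c
#include <stdint.h>

#define QORD 1000003u  /* small prime standing in for the group order q */

uint64_t powm_q (uint64_t b, uint64_t e)
{
  uint64_t r = 1;
  for (; e; e >>= 1, b = b * b % QORD)
    if (e & 1) r = r * b % QORD;
  return r;
}

uint64_t inv_q (uint64_t x) { return powm_q (x, QORD - 2); }  /* Fermat */

/* s = (h + z*r) / k mod q, the signing equation of (EC)DSA. */
uint64_t sign_s (uint64_t h, uint64_t z, uint64_t r, uint64_t k)
{
  return (h + z * r % QORD) % QORD * inv_q (k) % QORD;
}

/* An observer who learns k recovers the long-term key:
   z = (k*s - h) / r mod q. */
uint64_t recover_z (uint64_t h, uint64_t r, uint64_t s, uint64_t k)
{
  return (k * s % QORD + QORD - h % QORD) % QORD * inv_q (r) % QORD;
}
```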
And this issue is the same for plain DSA. Daniel Bleichenbacher's attack on biased generation of k, from back in 2001, could be related, even though you would get a different type of partial information about k from timing attacks.
If it's not already done, this type of timing attack on DSA should be a great topic for a side-channel cryptanalysis paper (I'm not really following current cryptanalysis research, unfortunately).
Regards, /Niels
On Wed, Jan 23, 2013 at 12:45 PM, Niels Möller nisse@lysator.liu.se wrote:
This is a fixed association again.
Now I'm getting confused. In ECDSA signing, the point multiplication is k * G where k is a nonce (used only once), and G is the (public) group generator. If you are saying that leaking a little information about k via the timing of this multiplication is no problem, then the conclusion seem to be that, for ECDSA, there's no need whatsoever to make the point multiplication timing resistant? Right?
My comments were not for ECDSA specifically. ECDSA is pretty fragile: although k is used only once per signature, if it becomes known it can be used to obtain the (long-term) private key. Because it is the private key you want to protect, any value that may leak information about it should be protected as well.
What is the main argument here, that the attacker has no control over k, or that k is used only once?
It is used only once, and for this reason the attacker can only get a single timing for it. However, because the security of the ECDSA private key depends on that value, it is reasonable to protect it.
I think there *is* a problem with leaking just a little bit of information about each k, since z = (k s - h) / r, and hence every piece of information about k implies a piece of information about z, and the information about z accumulates the more signatures you get.
Could be. Unfortunately, unlike RSA, there is not much published on timing attacks on (EC)DSA or their prevention (or at least not known to me).
regards, Nikos
Nikos Mavrogiannopoulos n.mavrogiannopoulos@gmail.com writes:
My comments were not for ECDSA specifically. ECDSA is pretty fragile.
For me, ECDSA is the primary application of elliptic curves. So then it seems important to use a point multiplication k * G which has a running time independent of the bits in k. That's why I find the method used in gnutls a bit worrying.
Could be. Unfortunately, unlike RSA, there is not much published on timing attacks on (EC)DSA or their prevention (or at least not known to me).
Maybe one could get some insights from Bleichenbacher's paper. I have not read up on the details, but as far as I understand, the point was that k in the range 1 <= k < 2^bit_size - q were generated with a higher probability density than numbers in the range 2^bit_size - q < k < q.
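That kind of bias is easy to demonstrate exhaustively with small numbers (all values illustrative, chosen only so the count fits in a loop): reducing a uniform n-bit value mod q makes the small residues land twice as often as the large ones whenever q < 2^n < 2q.

```c
#define QB 787      /* small prime standing in for the group order */
#define NBITS 10    /* 2^10 = 1024, so QB < 2^NBITS < 2*QB */

/* Exhaustively count how often each residue appears when every
   NBITS-bit value is reduced mod QB. Residues 0 .. 2^NBITS - QB - 1
   occur twice; the rest occur once. */
void count_residues (unsigned counts[QB])
{
  unsigned x, i;

  for (i = 0; i < QB; i++)
    counts[i] = 0;
  for (x = 0; x < (1u << NBITS); x++)
    counts[x % QB]++;
}
```

For real parameter sizes the bias is tiny per nonce, but it is systematic, which is what made it exploitable across many signatures.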
And for a wnaf timing attack, one idea would be to divide signatures into two classes, those generated with a k expanding to n wnaf digits, and those generated with a k expanding to n+1 digits. Each class corresponds to a different set of possible k, and if the classification is not perfect, you still have a bias in each guessed class.
And I don't think prevention is too difficult. Just write the algorithm to not include any branches depending on bits in k, and preferably (for side-channels related to caches) also with all memory accesses independent of the k bits. There's going to be some performance penalty, of course.
Regards, /Niels
On 01/23/2013 04:37 PM, Niels Möller wrote:
Nikos Mavrogiannopoulos n.mavrogiannopoulos@gmail.com writes:
My comments were not for ECDSA specifically. ECDSA is pretty fragile.
For me, ECDSA is the primary application of elliptic curves. So then it seems important to use a point multiplication k * G which has a running time independent of the bits in k. That's why I find the method used in gnutls a bit worrying.
Not for TLS. In TLS, ECDHE is the primary application of elliptic curves. In the last SSL observatory data, the RSA keys on the internet were 4 million+, whereas the ECDSA keys were only 6 (that's three years ago, but I don't think there has been a radical change since then). Typically one uses ECDHE with RSA keys in TLS.
Nevertheless, the method that you use for the timing-sensitive parts of the code doesn't need to match the optimized version.
regards, Nikos
Nikos Mavrogiannopoulos n.mavrogiannopoulos@gmail.com writes:
Not for TLS. In TLS ECDHE is the primary application of elliptic curves. [...] Typically one uses ECDHE with RSA keys in TLS.
I see. What's the motivation, saving cycles in the key exchange, or smaller messages, or higher security, or something else?
Nevertheless, the method that you use for the timing sensitive parts of the code, doesn't need to match the optimized version.
If the performance penalty for the timing-resistant functions is more than, say, 10%, it would make some sense to include a separate set of functions optimized for speed. But I think I want to do the timing-resistant things first.
Here are some benchmarks for my first working version (the unit is operations per millisecond):
On matrix (Intel i5, 3.4 GHz):
name    size    sign / ms   verify / ms
rsa     1024       6.3145      104.0742
rsa     2048       0.9555       29.4275
dsa     1024      11.1746       5.7460
ecdsa    192       2.1167       1.5355
ecdsa    224       1.2371       1.0234
ecdsa    256       1.3845       1.0182
ecdsa    384       0.6415       0.4751
ecdsa    521       0.2515       0.2037
On pandix (ARM Cortex A9, 1 GHz):
name    size    sign / ms   verify / ms
rsa     1024       0.2626       4.5527
rsa     2048       0.0392       1.2490
dsa     1024       0.4694       0.2377
ecdsa    192       0.1906       0.1445
ecdsa    224       0.1418       0.1058
ecdsa    256       0.1050       0.0796
ecdsa    384       0.0431       0.0327
ecdsa    521       0.0173       0.0135
For the verify operation, RSA is by far fastest. I benchmark RSA using a small public exponent, 65537, which I think is typical.
For signing, plain DSA is currently fastest (I'd like to also include "DSA2", with 2048-bit p and 256-bit q, but the key generation programs I had at hand didn't support that).
Still, signing with 256-bit ecdsa is a bit faster than 2048 bit RSA.
When optimizing, I think one can aim for a factor 10 speedup for signing and maybe a factor 2 for verification (the higher potential for signing is because it uses point multiplication only with the fixed generator, which can be sped up a lot using precomputed tables). Current code uses the binary algorithm I posted the other day, and no serious curve-specific optimizations.
I could set up a public repository for my work-in-progress code, if there's some interest.
Regards, /Niels
On Thu, Jan 24, 2013 at 8:38 AM, Niels Möller nisse@lysator.liu.se wrote:
Not for TLS. In TLS ECDHE is the primary application of elliptic curves. [...] Typically one uses ECDHE with RSA keys in TLS.
I see. What's the motivation, saving cycles in the key exchange, or smaller messages, or higher security, or something else?
If perfect forward secrecy is required then DHE is pretty inefficient on security levels of 96 bits or more. ECDHE provides a fast equivalent.
On matrix (Intel i5, 3.4 GHz):

name    size    sign / ms   verify / ms
rsa     1024       6.3145      104.0742
rsa     2048       0.9555       29.4275
dsa     1024      11.1746       5.7460
ecdsa    192       2.1167       1.5355
ecdsa    224       1.2371       1.0234
ecdsa    256       1.3845       1.0182
ecdsa    384       0.6415       0.4751
ecdsa    521       0.2515       0.2037
I usually use the table from ECRYPT II to compare equivalent security levels: http://www.keylength.com/en/3/. Otherwise the comparison may be unfair to ECDSA. RSA-1024 is still widely used today, but it is roughly equivalent to an ECDSA key of 144 bits.
For signing, plain DSA is currently fastest (I'd like to also include "DSA2", with 2048 bit p and 256 bit q, but my closest key generation programs didn't support that).
certtool does that.
I could set up a public repository for my work-in-progress code, if there's some interest.
Why not on the main repository as a branch?
regards, Nikos
Nikos Mavrogiannopoulos n.mavrogiannopoulos@gmail.com writes:
Why not on the main repository as a branch?
I already started working in a separate repository, and I'd like to have the overall structure decided before adding any files to the main repo. And I see no real gain from using branches as long as the sets of files are disjoint.
Regards, /Niels
"NM" == Niels Möller nisse@lysator.liu.se writes:
NM> So first question is naming; what's the most established names for NM> these curves? It seems gnutls uses, e.g., "SECP256R1", I guess we NM> can just follow that (rather than, e.g, "P-256", as it's called in NM> the DSA spec fips-186-3)?
RFC 6239 (Suite B Crypto Suites for SSH) has this to say about naming:
,----< excerpt from rfc6239.txt >
| Curve  NIST name  SECG name  OID [SEC2]
| ---------------------------------------------------------------
| P-256  nistp256   secp256r1  1.2.840.10045.3.1.7
| P-384  nistp384   secp384r1  1.3.132.0.34
`----
(Based on [SEC2], I presume the next line would be:
?-??? nistp521 secp521r1 1.3.132.0.35 )
So it looks like the secp...r1 names are as good as any.
-JimC