nisse@lysator.liu.se (Niels Möller) writes:
The code for curve25519 and curve448 has been using powering to invert for a long time. I've now spent some time writing specific powering code for the five secp curves as well. I've found fairly efficient addition chains where powering for a prime of n bits needs n-1 squarings and about a dozen multiplies. (I don't know what the *optimal* addition chains are; if you know of tools for that, let me know.)
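For readers who have not seen inversion by powering: by Fermat's little theorem, a^(p-2) mod p is the inverse of a modulo a prime p, so inversion reduces to a fixed sequence of squarings and multiplies. The sketch below uses the widely known addition chain for p = 2^255 - 19 in plain Python integers. It illustrates the technique only; it is not taken from the Nettle sources, and the function and variable names are made up for this example.

```python
# Illustration only: inversion mod p = 2^255 - 19 by computing a^(p-2)
# with a fixed addition chain. Names are hypothetical, not Nettle's.
P25519 = 2**255 - 19

def invert_25519(a, p=P25519):
    """Return (a^(p-2) mod p, number of squarings, number of multiplies)."""
    counts = {"sqr": 0, "mul": 0}

    def sqr_n(x, n):
        # n successive modular squarings: x -> x^(2^n)
        for _ in range(n):
            x = x * x % p
        counts["sqr"] += n
        return x

    def mul(x, y):
        counts["mul"] += 1
        return x * y % p

    a2 = sqr_n(a, 1)                      # a^2
    a9 = mul(a, sqr_n(a2, 2))             # a^9
    a11 = mul(a2, a9)                     # a^11
    t5 = mul(a9, sqr_n(a11, 1))           # a^(2^5 - 1)
    t10 = mul(sqr_n(t5, 5), t5)           # a^(2^10 - 1)
    t20 = mul(sqr_n(t10, 10), t10)        # a^(2^20 - 1)
    t40 = mul(sqr_n(t20, 20), t20)        # a^(2^40 - 1)
    t50 = mul(sqr_n(t40, 10), t10)        # a^(2^50 - 1)
    t100 = mul(sqr_n(t50, 50), t50)       # a^(2^100 - 1)
    t200 = mul(sqr_n(t100, 100), t100)    # a^(2^200 - 1)
    t250 = mul(sqr_n(t200, 50), t50)      # a^(2^250 - 1)
    inv = mul(sqr_n(t250, 5), a11)        # a^(2^255 - 21) = a^(p - 2)
    return inv, counts["sqr"], counts["mul"]
```

For this 255-bit prime the chain uses 254 squarings and 11 multiplies, which matches the "n-1 squarings and about a dozen multiplies" pattern described above.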
[...]
I will merge these changes to master in a week or two, if no problems show up.
Before doing this merge, I've made some changes to the modulo p reduce functions (mod and redc, with both C and assembly implementations). They can now store the final result at a different location than the clobbered input area. The ecc_mod_mul and ecc_mod_sqr functions have also been changed to take a separate result area, different from the (larger) scratch area. This makes the allocation puzzle when using the ecc_mod_* functions a lot simpler, resulting in reduced scratch need for lots of functions, and elimination of a few copy operations.
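To make the allocation point concrete, here is a toy model in Python, using lists of 8-bit limbs and the modulus 2^32 - 5. It is only a sketch of the idea: the names toy_mod and toy_mod_mul, their argument order, and the limb size are assumptions made for this example and are not Nettle's actual interfaces. The reduction clobbers its double-size input and writes the n-limb result to a separate area, so a multiply can put its result directly on top of one of its operands without a copy out of scratch.

```python
# Toy model (hypothetical names, not Nettle's API): reduction that clobbers
# its 2N-limb input xp and stores the N-limb result in a separate area rp.
LIMB_BITS = 8
BASE = 1 << LIMB_BITS
N = 4                     # limbs per field element
C = 5
P = BASE**N - C           # modulus 2^32 - 5

def to_limbs(v, n):
    return [(v >> (LIMB_BITS * i)) % BASE for i in range(n)]

def from_limbs(limbs):
    return sum(x << (LIMB_BITS * i) for i, x in enumerate(limbs))

def toy_mod(rp, xp):
    # Fold the high half into the low half: BASE^N = C (mod P).
    carry = 0
    for i in range(N):
        t = xp[i] + C * xp[i + N] + carry
        xp[i] = t % BASE
        carry = t // BASE
    while carry:                      # fold any remaining carry the same way
        t = carry * C
        for i in range(N):
            t += xp[i]
            xp[i] = t % BASE
            t //= BASE
        carry = t
    # Canonical result: compute v + C in the (now free) high half of xp;
    # a carry out means v >= P, and then v + C - BASE^N = v - P.
    t = C
    for i in range(N):
        t += xp[i]
        xp[N + i] = t % BASE
        t //= BASE
    rp[0:N] = xp[N:2 * N] if t else xp[0:N]

def toy_mod_mul(rp, ap, bp, tp):
    # Schoolbook product into the 2N-limb scratch area tp, then reduce
    # into rp. Since ap and bp are fully read first, rp may alias either.
    for i in range(2 * N):
        tp[i] = 0
    for i in range(N):
        carry = 0
        for j in range(N):
            t = tp[i + j] + ap[i] * bp[j] + carry
            tp[i + j] = t % BASE
            carry = t // BASE
        tp[i + N] = carry
    toy_mod(rp, tp)
```

In this model, an in-place update x = x*y mod p is just toy_mod_mul(x, x, y, scratch). If the result instead had to land in the scratch area, the caller would need an extra copy back into x.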
The merge of those changes is now on the master-updates branch. When this is in, the new code on the optimize-ecc-invert branch can likely be simplified a bit before merging.
Regards, /Niels