nisse@lysator.liu.se (Niels Möller) writes:
The point multiplication with the random point takes much longer than the point multiplication involving the generator.
I added some more functions to the benchmark (at the cost of somewhat overlong lines):
size    modp    redc    modq    modinv  dup_jj add_jja add_jjj     mul_g     mul_a (us)
 192  0.0250  0.0261  0.0442   13.7320  0.4994  0.6127  0.6138   37.7651  147.9455
 224  0.0546  0.0435  0.0665   22.8033  0.7316  0.9063  0.9075   75.1715  256.2235
 256  0.0714  0.0391  0.0798   23.1066  0.6377  0.7910  0.7912   80.2991  254.6769
 384  0.0834  0.0000  0.0614   47.6998  1.2428  1.6319  1.6311  241.0215  734.5506
 521  0.0344  0.0533  0.1296  105.0090  1.1316  1.4764  1.4775  343.3341  915.2812
So mul_a appears to be about 3 times slower than mul_g. And modinv is awful: at about 1/3 of the time of a mul_g, signing (roughly one mul_g plus one modinv) will spend about 1/4 of its time in modinv. (I have some ideas on how to simplify modinv and make it a little less slow).
(BTW, the 0.0 for 384-bit redc just means that no redc function is implemented for that prime. The more notable anomaly is that modp is slower than modq at 384 bits; that's an opportunity for optimization).
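For anyone who wants to recheck the ratios quoted above, here is a minimal Python sketch that recomputes them from the benchmark figures. The numbers are copied from the table; the function name `ratios` and the cost model for signing (one mul_g plus one modinv, ignoring everything else) are assumptions made for illustration, not something the benchmark itself measures.

```python
# Timings in microseconds, copied from the benchmark output above.
COLS = ("modp", "redc", "modq", "modinv", "dup_jj",
        "add_jja", "add_jjj", "mul_g", "mul_a")

TIMINGS = {
    192: (0.0250, 0.0261, 0.0442, 13.7320, 0.4994, 0.6127, 0.6138, 37.7651, 147.9455),
    224: (0.0546, 0.0435, 0.0665, 22.8033, 0.7316, 0.9063, 0.9075, 75.1715, 256.2235),
    256: (0.0714, 0.0391, 0.0798, 23.1066, 0.6377, 0.7910, 0.7912, 80.2991, 254.6769),
    384: (0.0834, 0.0000, 0.0614, 47.6998, 1.2428, 1.6319, 1.6311, 241.0215, 734.5506),
    521: (0.0344, 0.0533, 0.1296, 105.0090, 1.1316, 1.4764, 1.4775, 343.3341, 915.2812),
}


def ratios(bits):
    """Return the derived ratios for one curve size (hypothetical helper)."""
    t = dict(zip(COLS, TIMINGS[bits]))
    return {
        "mul_a/mul_g": t["mul_a"] / t["mul_g"],
        "modinv/mul_g": t["modinv"] / t["mul_g"],
        # Assumption: signing time is approximated as mul_g + modinv.
        "modinv share": t["modinv"] / (t["mul_g"] + t["modinv"]),
        # A value above 1 means modp is slower than modq for this size.
        "modp/modq": t["modp"] / t["modq"],
    }


if __name__ == "__main__":
    for bits in sorted(TIMINGS):
        line = "  ".join(f"{k}={v:.2f}" for k, v in ratios(bits).items())
        print(f"{bits}: {line}")
```

Under that cost model, the 384-bit row is the outlier: there modinv is closer to 1/5 of a mul_g, and it is also the only size where modp/modq comes out above 1.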
Regards,
/Niels