Nettle uses precomputed tables for ecc scalar multiplication involving the fixed base point. I've found that parameters for those tables were somewhat poorly chosen.
I've just pushed updated parameters to the branch ecc-params-tweak. The size of the tables is unchanged (except for the 192-bit curve, where it is reduced from 15 KB to 12 KB). In my benchmarking, I see a 10%-15% speedup of ecdsa with the 256-, 384- and 521-bit curves, and a 24% speedup of eddsa25519 (all for the signature operation; changes for the verify operation are more modest).
I target table sizes of around 16 KB per curve. Would it be useful to have a configure option to select a smaller or larger table size? For the larger curves, one could perhaps get an additional 50% speedup with considerably larger tables. (But due to the requirements for side-channel silence, huge tables are impractical, since each access reads the entire table.)
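
To illustrate why the cost of each access grows with the table size, here is a minimal sketch of a side-channel silent table lookup. It is not the actual Nettle code; the name table_select, the 64-bit word type and the flat table layout are assumptions made for this example only. Every entry is read, and a mask decides which one is kept, so the memory access pattern does not depend on the secret index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only (not Nettle's implementation).
   Copies entry WHICH from TABLE into OUT. The table holds NENTS entries of
   N words each, stored back to back. All NENTS * N words are read on every
   call, regardless of WHICH. */
void
table_select (uint64_t *out, const uint64_t *table,
              size_t n, size_t nents, size_t which)
{
  size_t i, j;

  for (j = 0; j < n; j++)
    out[j] = 0;

  for (i = 0; i < nents; i++)
    {
      /* diff is zero only when i == which. */
      uint64_t diff = (uint64_t) (i ^ which);
      /* For non-zero diff, the top bit of (diff | -diff) is set, so the
         shift gives 1 and the mask is 0. For diff == 0 the shift gives 0
         and the mask is all ones. No branch on the secret index. */
      uint64_t mask = ((diff | (0 - diff)) >> 63) - 1;

      for (j = 0; j < n; j++)
        out[j] |= table[i * n + j] & mask;
    }
}
```

With a lookup of this kind, the work per access is proportional to the total table size, which is why doubling the table does not come for free.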
Regards, /Niels