I think it may be possible to do the SHA-1 compression function entirely in x86 registers: five registers for the state, one pointing to the input, and one free temporary. That uses seven of the eight general-purpose registers, leaving only the stack pointer.
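To make the register budget concrete, here is a rough C sketch of the idea, not Nettle's actual code. The function name sha1_compress_sketch is made up for illustration. It assumes the 64-byte block has already been converted to sixteen host-order 32-bit words, and that the block may be overwritten, since the message schedule is expanded in place in memory rather than in registers. C cannot force register allocation, so this only shows that the data flow gets by with five state words, one pointer and a single temporary. The loop counter is extra, but it disappears in a fully unrolled assembler version, where the state rotation at the end of each round becomes register renaming instead of moves.

```c
#include <stdint.h>

#define ROTL(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

/* Hypothetical sketch: one SHA-1 compression step.
   state: the five chaining words.
   w:     the input block as 16 host-order words; clobbered, because
          the message schedule is expanded in place (mod 16). */
static void
sha1_compress_sketch(uint32_t state[5], uint32_t w[16])
{
  uint32_t a = state[0], b = state[1], c = state[2];
  uint32_t d = state[3], e = state[4];
  uint32_t t;   /* the single free temporary */
  int i;        /* would vanish in an unrolled version */

  for (i = 0; i < 80; i++)
    {
      if (i >= 16)
        {
          /* W[i] = rotl1(W[i-3] ^ W[i-8] ^ W[i-14] ^ W[i-16]),
             with all indices taken mod 16. */
          t = w[(i + 13) & 15] ^ w[(i + 8) & 15]
            ^ w[(i + 2) & 15] ^ w[i & 15];
          w[i & 15] = ROTL(t, 1);
        }

      /* Accumulate the new value directly into e. */
      e += ROTL(a, 5) + w[i & 15];

      if (i < 20)
        {
          /* Choice: (b & c) | (~b & d) == d ^ (b & (c ^ d)) */
          t = c ^ d; t &= b; t ^= d;
          e += t + 0x5A827999;
        }
      else if (i < 40)
        {
          t = b ^ c ^ d;
          e += t + 0x6ED9EBA1;
        }
      else if (i < 60)
        {
          /* Majority: (b & c) and (d & (b ^ c)) never share a set
             bit, so they can be added into e one after the other,
             and one temporary is enough. */
          t = b & c; e += t;
          t = b ^ c; t &= d;
          e += t + 0x8F1BBCDC;
        }
      else
        {
          t = b ^ c ^ d;
          e += t + 0xCA62C1D6;
        }

      /* Rotate the roles of the state words. */
      t = e; e = d; d = c; c = ROTL(b, 30); b = a; a = t;
    }

  state[0] += a; state[1] += b; state[2] += c;
  state[3] += d; state[4] += e;
}
```

The only non-obvious part is the majority function in rounds 40 to 59: splitting it into two disjoint terms and adding them separately is what avoids the need for a second temporary.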
Benchmarks for the C implementations of various hashes:
md2 (Update): 2.327MB/s
md4 (Update): 171.846MB/s
md5 (Update): 114.488MB/s
sha1 (Update): 81.916MB/s
sha256 (Update): 43.055MB/s
/ Niels Möller (vässar rödpennan)
Previous text:
2004-02-05 17:43: Subject: Nettle
If you are in a good hacking mood, any speedup in SHA1 would be most welcome. Since it is used in BitTorrent, quite a few gigabytes run through it.
/ Martin Nilsson (saturator)