On Sat, Aug 27, 2011 at 3:05 PM, Niels Möller nisse@lysator.liu.se wrote:
Andres Mejia mcitadel@gmail.com writes:
Here's a patch that will implement RIPEMD-160 for nettle. It's code ported from libgcrypt.
Thanks!
There's no documentation update in this patch. I didn't look into how documentation was generated in nettle.
The documentation "master file" is nettle.texinfo. info, html and pdf files are all generated from that.
Also, this patch is against the nettle-2.2 release. I couldn't get the CVS snapshot of nettle building, particularly because of the integration of LSH.
Did you follow the instructions? If so, I'd be interested to hear how it failed. Anyway, basing patches on the 2.2 release is fine too.
It was when I tried to run autoreconf. I'll try again with the instructions.
Some initial comments on the implementation added below. Mostly cosmetic. Let me know if you plan to make an updated patch, or if I should address this myself when I integrate the code.
- rmd160.c rmd160-compress.c rmd160-meta.c \
Naming: Is "rmd" a commonly used abbreviation? Otherwise, I think I'd prefer to write out "ripemd" both in filenames and C symbols.
Either is fine by me.
Are any other variants of ripemd in use? (According to Wikipedia, there's the original ripemd, ripemd-128, ripemd-160, ripemd-256 and ripemd-320, but I have no idea which of them are in use today or are likely to be used in the future).
RIPEMD-160 was the only implementation I found in libgcrypt. I don't know whether any of the other algorithms are in use.
--- /dev/null
+++ b/rmd160-compress.c
@@ -0,0 +1,277 @@
+/* rmd160-compress.c - RIPE-MD160 (Transform function)
- Copyright (C) 1998, 2001, 2002, 2003 Free Software Foundation, Inc.
- The nettle library is free software; you can redistribute it and/or modify
- it under the terms of the GNU Lesser General Public License as published by
- the Free Software Foundation; either version 2.1 of the License, or (at your
- option) any later version.
- The nettle library is distributed in the hope that it will be useful, but
- WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
- or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
- License for more details.
- You should have received a copy of the GNU Lesser General Public License
- along with the nettle library; see the file COPYING.LIB. If not, write to
- the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
- MA 02111-1307, USA.
- */
+#include <string.h>
+#include "rmd.h"
+/****************
- Rotate the 32 bit unsigned integer X by N bits left/right
- */
+#if defined(__GNUC__) && defined(__i386__)
+static inline uint32_t
+rol(uint32_t x, int n)
+{
+  __asm__("roll %%cl,%0"
+          :"=r" (x)
+          :"0" (x),"c" (n));
+  return x;
+}
+#else
+#define rol(x,n) ( ((x) << (n)) | ((x) >> (32-(n))) )
+#endif
I think current gcc may recognize (x << n) | (x >> (wordsize - n)) and generate a rotate instruction (but I haven't tested). I don't usually put very machine dependent things like this in the C source files; if performance for this function is that important, one can write an assembler implementation of the compression function.
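A minimal sketch of the portable rotate described above. The name rotl32 is made up for illustration and is not part of the patch or of nettle. Masking the count is an added precaution so that a count of 0 never turns into a shift by 32, which is undefined in C. Whether a given compiler turns this into a single rotate instruction is untested here.

```c
#include <stdint.h>

/* Hypothetical helper: rotate the 32-bit value x left by n bits.
   Both shift counts are kept in the range 0..31, so n == 0 is safe. */
static inline uint32_t
rotl32(uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}
```

With a helper like this, the `#if defined(__GNUC__) && defined(__i386__)` branch could be dropped and the same C code used on every platform.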
+/****************
- Transform the message X which consists of 16 32-bit-words
- */
+void +_nettle_rmd160_compress(uint32_t *state, const uint8_t *data) +{
- register uint32_t a,b,c,d,e;
- uint32_t aa,bb,cc,dd,ee,t;
+#ifdef WORDS_BIGENDIAN
- uint32_t x[16];
- {
- int i;
- uint8_t *p2, *p1;
- for (i=0, p1=data, p2=(uint8_t*)x; i < 16; i++, p2 += 4 )
- {
- p2[3] = *p1++;
- p2[2] = *p1++;
- p2[1] = *p1++;
- p2[0] = *p1++;
- }
- }
I'd just use READ_UINT32. I'd expect that explicit shifting is faster than repeated byte writes to memory.
I would also do the little endian special case (with memcpy) only if memcpy really is measured to be faster (it may well be, I can't guess).
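As a rough sketch of the shift-based approach: nettle's READ_UINT32 reads big-endian words, while this code needs little-endian ones, so the example below uses a little-endian reader written out by hand. The names read_le32 and load_block_le are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: assemble one little-endian 32-bit word from
   four bytes using explicit shifts, independent of host byte order. */
static uint32_t
read_le32(const uint8_t *p)
{
  return  (uint32_t) p[0]
       | ((uint32_t) p[1] << 8)
       | ((uint32_t) p[2] << 16)
       | ((uint32_t) p[3] << 24);
}

/* Load one 64-byte input block into sixteen 32-bit words. */
static void
load_block_le(uint32_t x[16], const uint8_t *data)
{
  unsigned i;
  for (i = 0; i < 16; i++, data += 4)
    x[i] = read_le32(data);
}
```

The same loop works on both big- and little-endian hosts, so the WORDS_BIGENDIAN conditional is only needed if a memcpy fast path turns out to be measurably faster.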
--- /dev/null
+++ b/rmd160-meta.c
@@ -0,0 +1,32 @@
+/* rmd160-meta.c */
+/* nettle, low-level cryptographics library
- Copyright (C) 2002 Niels Möller
This copyright notice seems to be wrong (even if the contents of this file are barely copyrightable).
- RIPEMD-160 is not patented, see (as of 25.10.97)
I'd prefer to write the date either in words, or using ISO-format yyyy-mm-dd (1997-10-25).
- Note that the code uses Little Endian byteorder, which is good for
- 386 etc, but we must add some conversion when used on a big endian box.
Would it be possible to write efficient code using big-endian byteorder for all intermediate values? My initial guess is no: even if we use inverse byteorder for input, output, and constants, the rotate operations will be a problem, and that will kill any performance gain from omitted byteswapping of inputs and outputs.
+/* The routine terminates the computation */ +static void +rmd160_final(struct rmd160_ctx *ctx) +{
- uint32_t t, msb, lsb;
- uint8_t *p;
- rmd160_update(ctx, 0, NULL); /* flush */;
- t = ctx->nblocks;
- /* multiply by 64 to make a byte count */
- lsb = t << 6;
- msb = t >> 26;
- /* add the count */
- t = lsb;
- if( (lsb += ctx->index) < t )
- msb++;
- /* multiply by 8 to make a bit count */
- t = lsb;
- lsb <<= 3;
- msb <<= 3;
- msb |= t >> 29;
If it's really a 64-bit bit count, then I think the context struct needs a larger counter. Also the logic for adding the counter and padding could perhaps be borrowed from the corresponding sha1 or md5 code (but I haven't read the ripemd160 spec, so maybe it's really doing something different).
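To illustrate what a wider counter buys, here is a sketch that computes the bit count in one step, assuming the block counter is widened to 64 bits. The function name bit_count is hypothetical; nblocks and index mirror the context fields used in the quoted code.

```c
#include <stdint.h>

/* Hypothetical helper: total message length in bits, given the number
   of 64-byte blocks already compressed and the number of bytes still
   buffered. With a 64-bit counter no manual carry handling is needed. */
static uint64_t
bit_count(uint64_t nblocks, unsigned index)
{
  return ((nblocks << 6) + index) << 3;
}
```

The low and high 32-bit halves (lsb and msb in the quoted code) would then be `(uint32_t) count` and `(uint32_t) (count >> 32)`.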
- /* append the 64 bit count */
- ctx->block[56] = lsb;
- ctx->block[57] = lsb >> 8;
- ctx->block[58] = lsb >> 16;
- ctx->block[59] = lsb >> 24;
- ctx->block[60] = msb;
- ctx->block[61] = msb >> 8;
- ctx->block[62] = msb >> 16;
- ctx->block[63] = msb >> 24;
Use WRITE_UINT32 (or one could have the input to the compression function be an array of uint32_t rather than uint8_t, but that has the drawback that an assembler implementation of the compression function can no longer optimize the byteswapping of the input).
- _nettle_rmd160_compress(ctx->digest, ctx->block);
- p = ctx->block;
+#ifdef WORDS_BIGENDIAN
+#define X(a) do { *p++ = ctx->digest[a]      ; *p++ = ctx->digest[a] >> 8;  \
+                  *p++ = ctx->digest[a] >> 16; *p++ = ctx->digest[a] >> 24; } while(0)
+#else /* little endian */
+#define X(a) do { *(uint32_t*)p = ctx->digest[a] ; p += 4; } while(0)
+#endif
This looks more obscure than it should be. I think it would be better to have the _final function not do any byteswapping, but leave it to the _digest function (which also knows how much data is needed, and which is the function responsible for converting the internal state to a byte sequence). And then write a new _nettle_write_le32 function (analogous to write-be32.c:_nettle_write_be32) and share it with md5_digest.
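A possible shape for such a helper, as a sketch only. The name write_le32 and its signature are assumptions for illustration and are not copied from write-be32.c. It writes the first length bytes of the little-endian serialization of the word array, so a truncated digest falls out naturally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store the first `length` bytes of the
   little-endian byte sequence for the words in src into dst.
   Byte i comes from word i / 4, shifted right by 8 * (i % 4) bits. */
static void
write_le32(size_t length, uint8_t *dst, const uint32_t *src)
{
  size_t i;
  for (i = 0; i < length; i++)
    dst[i] = (uint8_t) (src[i / 4] >> (8 * (i % 4)));
}
```

A _digest function could then call something like `write_le32(length, digest, ctx->digest)` with no byte-order conditionals, and md5_digest could share the same routine.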
All this code is ported from libgcrypt. I'll leave it to you what modifications you want to do with the code.
Regards, /Niels
-- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance.
I'm curious, why are the internal transform functions exposed to the public?