Dmitry Eremin-Solenikov dbaryshkov@gmail.com writes:
Hello colleagues,
I gave a little thought to Niels' idea:
- Revamp hmac and underlying hash functions with a separate state struct. Probably low priority, but it is a bit silly that, e.g., hmac_sha512_ctx includes three 128-byte large block buffers.
I've implemented the new approach using an hmac2 prefix, but if you like it I can switch the hmac2 prefix to just hmac and drop the older API.
Nice!
diff --git a/hmac.c b/hmac.c
index 6ac5e11a0686..6d57f8c9197c 100644
--- a/hmac.c
+++ b/hmac.c
@@ -115,3 +115,69 @@ hmac_digest(const void *outer, const void *inner, void *state,
   memcpy(state, inner, hash->context_size);
 }
+static void
+hmac2_reinit_state(void *state, void *derived_key,
+                   const struct nettle_hash *hash,
+                   uint8_t padc)
+{
+  TMP_DECL(pad, uint8_t, NETTLE_MAX_HASH_BLOCK_SIZE);
+  TMP_ALLOC(pad, hash->block_size);
+  memset(pad, padc, hash->block_size);
+  memxor(pad, derived_key, hash->block_size);
+  hash->init(state);
+  hash->update(state, hash->block_size, pad);
+}
This reinit function is used instead of a plain memcpy (of the complete ctx, including buffer). That's less efficient, since we'll get more calls to the heavy compression function for each message.
In principle, it should be possible to replace derived_key with the relevant part of hash context, except the buffer, and memcpy that. If it's possible to arrange it in that way without things getting too ugly, I think that might be worth the effort.
A typical context struct looks like
  struct sha256_ctx
  {
    uint32_t state[_SHA256_DIGEST_LENGTH]; /* State variables */
    uint64_t count;                        /* 64-bit block count */
    uint8_t block[SHA256_BLOCK_SIZE];      /* SHA256 data buffer */
    unsigned int index;                    /* index into buffer */
  };
Here, we should first reorder fields so that the block buffer is last,
  struct sha256_ctx
  {
    uint32_t state[_SHA256_DIGEST_LENGTH]; /* State variables */
    uint64_t count;                        /* 64-bit block count */
    unsigned int index;                    /* index into buffer */
    uint8_t block[SHA256_BLOCK_SIZE];      /* SHA256 data buffer */
  };
(and we can do that, since we're planning an abi break).
Then at the time reinit is called, we would memcpy the first three fields. state here depends on the key, while count will always be 1 and index always zero (but it's likely not a useful optimization to handle the constant part separately). To make this reasonably clean, we may have to take out the non-block fields to a separate struct, say
  struct sha256_state
  {
    uint32_t state[_SHA256_DIGEST_LENGTH]; /* State variables */
    uint64_t count;                        /* 64-bit block count */
    unsigned int index;                    /* index into buffer */
  };

  struct sha256_ctx
  {
    struct sha256_state state;
    uint8_t block[SHA256_BLOCK_SIZE];      /* SHA256 data buffer */
  };
and let
  struct hmac_sha256
  {
    struct sha256_state inner;
    struct sha256_state outer;
    struct sha256_ctx hash_ctx; /* Initialized from key, updated as the message is processed */
  };
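To get a rough feel for the saving, here is a minimal sketch comparing the two layouts. The old_ prefixed structs are hypothetical stand-ins for the current arrangement (three complete hash contexts per hmac context), SHA256_DIGEST_WORDS stands in for _SHA256_DIGEST_LENGTH, and the helper function is made up for illustration. Exact sizes depend on platform padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHA256_DIGEST_WORDS 8   /* assumption: stands in for _SHA256_DIGEST_LENGTH */
#define SHA256_BLOCK_SIZE 64

/* Hypothetical stand-in for the current layout: the block buffer
   lives inside every context. */
struct old_sha256_ctx
{
  uint32_t state[SHA256_DIGEST_WORDS];
  uint64_t count;
  uint8_t block[SHA256_BLOCK_SIZE];
  unsigned int index;
};

/* Three full contexts, hence three block buffers. */
struct old_hmac_sha256_ctx
{
  struct old_sha256_ctx outer;
  struct old_sha256_ctx inner;
  struct old_sha256_ctx state;
};

/* Proposed layout: small key-dependent state split from the buffer. */
struct sha256_state
{
  uint32_t state[SHA256_DIGEST_WORDS];
  uint64_t count;
  unsigned int index;
};

struct sha256_ctx
{
  struct sha256_state state;
  uint8_t block[SHA256_BLOCK_SIZE];
};

struct hmac_sha256
{
  struct sha256_state inner;
  struct sha256_state outer;
  struct sha256_ctx hash_ctx;
};

/* Bytes saved per HMAC context by keeping a single block buffer. */
size_t
hmac_sha256_bytes_saved(void)
{
  /* The state must sit at the very start of the context, otherwise
     copying a state-sized prefix into the context would not work. */
  assert(offsetof(struct sha256_ctx, state) == 0);
  return sizeof(struct old_hmac_sha256_ctx) - sizeof(struct hmac_sha256);
}
```

The saving should be at least the two dropped block buffers (2 * 64 bytes for sha256), give or take padding.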
We'd need to add a state_size field to struct nettle_hash, and then reinit would be
  memcpy(&hmac_ctx->hash_ctx, &hmac_ctx->inner /* or outer */, hash->state_size);
And the nice thing is that any hash function not matching this internal structure can let state_size == context_size, and things will keep working.
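A minimal sketch of that reinit, under some loud assumptions: struct hash_sizes is a hypothetical stand-in for struct nettle_hash reduced to the two sizes involved (state_size being the proposed new field), and the toy_ structs and hmac_reinit are illustrative names, not existing API. It only shows the copying behaviour, with no real hashing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct nettle_hash; state_size is the
   proposed new field. */
struct hash_sizes
{
  size_t context_size;
  size_t state_size;
};

/* Toy hash context with the state placed first and the buffer last. */
struct toy_state
{
  uint32_t state[8];
  uint64_t count;
  unsigned int index;
};

struct toy_ctx
{
  struct toy_state state;
  uint8_t block[64];
};

/* Restore a saved inner or outer state into the working context.
   Only the first state_size bytes are copied, so the block buffer is
   left alone whenever state_size < context_size. */
void
hmac_reinit(void *ctx, const void *saved, const struct hash_sizes *hash)
{
  assert(hash->state_size <= hash->context_size);
  memcpy(ctx, saved, hash->state_size);
}

/* A hash using the split layout. */
const struct hash_sizes toy_split =
  { sizeof(struct toy_ctx), sizeof(struct toy_state) };

/* A hash that does not fit the pattern: state_size == context_size,
   so the saved copy must be a complete context and all of it is copied. */
const struct hash_sizes toy_whole =
  { sizeof(struct toy_ctx), sizeof(struct toy_ctx) };
```

With toy_split the saved copy only needs to be a struct toy_state; with toy_whole it has to be a full struct toy_ctx, which is the same behaviour as the current memcpy of the complete context.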
What do you think?
Regards,
/Niels