Ron Frederick <ronf@timeheart.net> writes:
> Understood, and I’ve already implemented a copy() method on the wrapper which allocates a new context structure and initializes it with the contents of the existing context. However, it seems very expensive to do that malloc & memcpy on _every_ call to digest(), since as you’ve said this is the uncommon case. Given the documented behavior of the digest() function in PEP 452 though, that’s what I would have to do, since there’s no way in the wrapper to know if the caller might want to continue feeding data after digest() is called.
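For reference, the PEP 452 behavior in question is that digest() must not consume the internal state, so a caller may keep feeding data afterwards. A minimal illustration using hashlib, which follows the same convention:

```python
import hashlib

h = hashlib.sha256(b"foo")
first = h.digest()          # must leave the object usable

h.update(b"bar")            # caller may continue feeding data
combined = h.digest()

# digest() was non-destructive: continuing gives the same result as
# hashing the concatenated message in one go.
assert first == hashlib.sha256(b"foo").digest()
assert combined == hashlib.sha256(b"foobar").digest()
```

A wrapper over a C context whose digest routine *does* clobber its state has to copy the context before every digest() call to provide these semantics, which is the cost being discussed.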
One problem is that many hash functions apply an end-of-data padding which, implemented in the simple way, clobbers the block buffer. So if digest() weren't allowed to modify the context, that would introduce extra copying or complexity for the common case where digest is the final operation on the data.
And most hashes have a pretty small context, so that an extra copy in the uncommon case isn't a big deal. Now, umac differs from plain hash functions in that it has a much larger context, making copying more expensive.
The nonce auto-increment is less of a problem.
I would also like to say that for a MAC which depends on a nonce (in contrast to plain hash functions and HMAC), the Python "PEP 452" API allowing multiple calls to digest() seems dangerous. I'd expect that the key could be attacked if you expose both UMAC(key, nonce, "foo") and UMAC(key, nonce, "foobar"), since the nonce is supposed to be unique for each message.
Maybe one reasonable way to implement the Python API would be to require an explicit set_nonce, and raise an exception if digest is called without a corresponding set_nonce? I.e., if set_nonce was never called, or if there are two digest calls without an intervening set_nonce. And then provide whatever helper methods you want for managing the nonce value in the Python code?
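A minimal sketch of that guard, as a toy construction only: the class and method names are invented for illustration, and truncated HMAC-SHA256 over nonce-plus-message stands in for a real nonce-based MAC like UMAC.

```python
import hashlib
import hmac


class NonceGuardedMAC:
    """Toy nonce-based MAC: digest() demands a fresh set_nonce()."""

    def __init__(self, key):
        self._key = key
        self._nonce = None
        self._data = b""

    def set_nonce(self, nonce):
        self._nonce = nonce

    def update(self, data):
        self._data += data

    def digest(self):
        if self._nonce is None:
            raise ValueError("set_nonce() required before digest()")
        tag = hmac.new(self._key, self._nonce + self._data,
                       hashlib.sha256).digest()[:8]
        # Invalidate the nonce so reusing it for another message
        # (the dangerous case above) raises instead of silently working.
        self._nonce = None
        self._data = b""
        return tag


m = NonceGuardedMAC(b"secret key")
m.set_nonce(b"\x00" * 8)
m.update(b"message")
tag = m.digest()

try:
    m.digest()          # second digest without an intervening set_nonce
except ValueError:
    pass                # raises, as proposed
```

Helper methods for auto-incrementing the nonce could then be layered on top in Python, without the C level ever reusing a nonce by accident.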
> It makes sense to me to have Nettle default to doing the reset & auto-increment of the nonce, as I agree that would be the common case for someone using Nettle directly. However, if this could be made configurable when the context is created, it would make it possible to avoid the cost of the malloc & memcpy in the Python wrapper unless the caller actually wanted to “fork” a context and hash multiple independent streams of data which had a common prefix.
If you'd like to experiment, you could try writing a
  umac32_digest_pure (const struct umac32_ctx *ctx, ...)
which doesn't modify the context, to see what it takes. Probably can't use the "padding cache" optimization, though.
I'd prefer a separate function (naming is a bit difficult, as usual) over a flag in the context.
What would probably be better (but a larger reorg) is to separate the state which depends on the key from the state which depends on message and nonce. The only MAC-like algorithm currently done like that is GCM, which also has a large key-dependent table. That allows several contexts to share the same key-dependent tables.
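As a sketch of that split — a toy construction, not Nettle's actual GCM or UMAC code; the class names and the SHA256-based "MAC" are invented for illustration:

```python
import hashlib


class MacKey:
    """Key-dependent state: computed once, shareable (like GCM's table)."""

    def __init__(self, key):
        # stand-in for an expensive key-dependent precomputation
        self.subkey = hashlib.sha256(b"derive" + key).digest()


class MacContext:
    """Per-message state: cheap to create, references the shared key state."""

    def __init__(self, mackey, nonce):
        self._h = hashlib.sha256(mackey.subkey + nonce)

    def update(self, data):
        self._h.update(data)

    def digest(self):
        return self._h.digest()[:8]


k = MacKey(b"secret")            # done once per key
c1 = MacContext(k, b"nonce-1")   # per-message contexts share the key state
c2 = MacContext(k, b"nonce-2")
c1.update(b"foo")
c2.update(b"bar")
assert c1.digest() != c2.digest()
```

With this shape, "forking" a stream means creating another small per-message context, not copying the large key-dependent tables.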
It would make sense to do something similar for HMAC too. Currently, an hmac_sha256_ctx consists of three sha256_ctx structures, i.e., three SHA256 state vectors of 32 bytes each plus three block buffers of 64 bytes each, for a total of 300 bytes or so. But it really needs only one block buffer, so if state vectors and block buffers were better separated, it could be trimmed down to about 170 bytes.
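Ignoring the per-context length counters and index fields, the arithmetic behind those estimates is roughly:

```python
STATE = 32   # SHA256 state vector, bytes
BLOCK = 64   # SHA256 block buffer, bytes

current = 3 * (STATE + BLOCK)   # three full sha256_ctx
trimmed = 3 * STATE + BLOCK     # three state vectors, one shared buffer

print(current, trimmed)         # -> 288 160
```

which lands near the "300 bytes or so" and "about 170 bytes" figures once counters and bookkeeping fields are counted in.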
Regards, /Niels