On Dec 23, 2016, at 11:00 PM, Niels Möller <nisse@lysator.liu.se> wrote:
Ron Frederick <ronf@timeheart.net> writes:
While this wouldn’t really be a problem for my use case, the Python cryptographic hash standard API (defined in PEP 452) states the following about the digest() method:
digest()
Return the hash value of this hashing object as a bytes containing 8-bit data. The object is not altered in any way by this function; you can continue updating the object after calling this function.
So, if I wanted to provide a Python module which adhered to this API, the automatic reset of the context and increment of the nonce would be a problem.
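For reference, the PEP 452 behavior quoted above can be seen with Python's standard hashlib module. This is only a small illustration; SHA-256 is an arbitrary choice here and is not the algorithm under discussion.

```python
import hashlib

h = hashlib.sha256()
h.update(b"common prefix, ")

# digest() does not alter the object, so two calls in a row agree.
first = h.digest()
second = h.digest()

# Updating after digest() simply continues the same message.
h.update(b"rest of the message")
full = h.digest()

# Hashing the whole message in one go gives the same result as `full`.
expected = hashlib.sha256(b"common prefix, rest of the message").digest()
```

A context that resets itself when the digest is extracted cannot satisfy the second half of this example without help from the wrapper.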
The way you'd do this with Nettle is to make a copy (plain memcpy or struct assignment) of the context struct, and extract the digest from the copy.
The nettle design is based on the assumption that it's an uncommon use case to hash (or mac) both a string and a prefix thereof. So it's possible, but not optimized for.
Understood, and I’ve already implemented a copy() method on the wrapper which allocates a new context structure and initializes it with the contents of the existing context. However, it seems very expensive to do that malloc & memcpy on _every_ call to digest(), since, as you’ve said, continuing to hash after extracting a digest is the uncommon case. Given the documented behavior of the digest() function in PEP 452, though, that’s what I would have to do, since there’s no way in the wrapper to know if the caller might want to continue feeding data after digest() is called.
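A rough sketch of what that copy-on-every-digest wrapper looks like, written in pure Python so it can be run without any C library. ResettingContext is a hypothetical stand-in for a context whose digest call also resets it; it is not Nettle's API, and its copy() merely plays the role of the malloc & memcpy.

```python
import hashlib


class ResettingContext:
    """Toy stand-in for a C context whose digest() also resets it (hypothetical)."""

    def __init__(self):
        self._h = hashlib.sha256()

    def update(self, data):
        self._h.update(data)

    def digest(self):
        out = self._h.digest()
        self._h = hashlib.sha256()  # reset, as the underlying library would
        return out

    def copy(self):
        # Plays the role of malloc + memcpy of the context struct.
        dup = ResettingContext()
        dup._h = self._h.copy()
        return dup


class Pep452Wrapper:
    """Wrapper that keeps its state intact across digest() calls."""

    def __init__(self):
        self._ctx = ResettingContext()

    def update(self, data):
        self._ctx.update(data)

    def digest(self):
        # Copy on every call, since the caller may keep updating afterwards.
        return self._ctx.copy().digest()

    def copy(self):
        dup = Pep452Wrapper()
        dup._ctx = self._ctx.copy()
        return dup
```

The copy in digest() is paid on every call, even when the caller never touches the object again.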
It makes sense to me to have Nettle default to doing the reset & auto-increment of the nonce, as I agree that would be the common case for someone using Nettle directly. However, if this could be made configurable when the context is created, it would make it possible to avoid the cost of the malloc & memcpy in the Python wrapper unless the caller actually wanted to “fork” a context and hash multiple independent streams of data which had a common prefix.
What I’d suggest is to split out the nonce increment into its own externally callable function, and add a flag on the context that controls whether that function is called automatically. The flag would default to true to preserve existing behavior, but a function could be provided to disable it for callers that want to be able to do partial hashing. They could then call the increment manually when they want to reuse a context for multiple messages, avoiding the malloc & memcpy in both cases. The copy would only be needed in the case I mention above, when hashing multiple streams that start with a common prefix.
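To make the proposal concrete, here is a toy model of the behavior I have in mind, in Python rather than C. Everything in it is hypothetical: ToyNonceMac, auto_increment, set_auto_increment() and increment_nonce() are names made up for illustration, and HMAC-SHA256 over nonce plus message is just a stand-in so the sketch runs. None of this is Nettle's API or the actual MAC construction.

```python
import hashlib
import hmac


class ToyNonceMac:
    """Toy nonce-based MAC context (hypothetical, for illustration only)."""

    def __init__(self, key, nonce=0, auto_increment=True):
        self._key = key
        self._nonce = nonce
        self._auto_increment = auto_increment  # defaults to existing behavior
        self._data = bytearray()  # a real MAC keeps running state, not a buffer

    def set_auto_increment(self, enabled):
        self._auto_increment = enabled

    def update(self, data):
        self._data += data

    def increment_nonce(self):
        """Start a new message: clear the message state and bump the nonce."""
        self._data.clear()
        self._nonce += 1

    def digest(self):
        msg = self._nonce.to_bytes(8, "big") + bytes(self._data)
        tag = hmac.new(self._key, msg, hashlib.sha256).digest()
        if self._auto_increment:
            self.increment_nonce()
        return tag


# Default mode: each digest() ends the message and moves to the next nonce.
auto = ToyNonceMac(b"key")
auto.update(b"a")
tag_nonce0 = auto.digest()
auto.update(b"a")
tag_nonce1 = auto.digest()

# Manual mode: digest() leaves the context alone, as PEP 452 requires.
manual = ToyNonceMac(b"key", auto_increment=False)
manual.update(b"ab")
partial = manual.digest()
manual.update(b"c")       # keep feeding the same message
complete = manual.digest()
manual.increment_nonce()  # caller explicitly moves on to the next message
manual.update(b"a")
next_tag = manual.digest()
```

In manual mode neither the partial digest nor the move to the next message needs a copy of the context; a copy is only needed to fork two streams off a common prefix.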