On Dec 23, 2016, at 8:07 AM, Niels Möller nisse@lysator.liu.se wrote:
Ron Frederick ronf@timeheart.net writes:
[Ron] There are other ways to interface to C code from Python that can auto-generate shims to do something like this, but they involve generating and compiling additional C code at build time and shipping an extra C shared library with the Python code, which means multiple versions of the package are needed to cover each supported architecture.
That's the way I would expect language bindings to work. And I don't think any other way can work if one aims to have Python glue for everything in Nettle.
That said, for umac and other mac algorithms, I think it would make sense to provide structs similar to nettle_hash which includes all needed sizes and function pointers.
Would that solve your problem? You could try to see if you can make a 100% Python wrapper for the supported hash functions in the nettle_hashes list (I think Daniel Kahn Gillmor did something similar in his Perl bindings). I guess you'd still need to access struct members, though. If useful, it's also possible to add accessor functions like
size_t nettle_hash_ctx_size(const struct nettle_hash *hash) { return hash->ctx_size; }
nettle_hash_init_func *nettle_hash_init(const struct nettle_hash *hash) { return hash->init; }
etc. (Other variants are possible, if one aims for a function-call-only api).
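To illustrate the trade-off from the Python side, here is a sketch of what direct struct access looks like with ctypes: a mirror of struct nettle_hash (field names as in nettle/nettle-meta.h) that silently breaks if the C layout ever changes, which is exactly the fragility the accessor functions above would remove.

```python
import ctypes

# Sketch: a ctypes mirror of struct nettle_hash, with field names taken
# from nettle/nettle-meta.h.  Any change to the C struct's layout breaks
# this mirror without warning -- the wrapper has to spell out every field.
class NettleHash(ctypes.Structure):
    _fields_ = [
        ("name", ctypes.c_char_p),
        ("context_size", ctypes.c_uint),
        ("digest_size", ctypes.c_uint),
        ("block_size", ctypes.c_uint),
        ("init", ctypes.c_void_p),      # function pointers, kept opaque here
        ("update", ctypes.c_void_p),
        ("digest", ctypes.c_void_p),
    ]

# With accessor functions, the wrapper would instead call something like
#   lib.nettle_hash_ctx_size(hash_ptr)
# and would never need to declare _fields_ at all.
```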
Thanks - a function-call based API that didn’t rely on the Python code needing to access any Nettle-specific structures sounds good, and would be preferred to something that involves declaring a struct to access function pointers and other static data.
I’ve confirmed that for UMAC, the only such function I would need in order to build a Python wrapper is one that returns the context size. In fact, I have a first cut of such a module written and working, with a “try” clause that shows how I could call such a function to request the size if it existed, and a fallback to a hard-coded size if it’s not present.
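That try/fallback pattern can be sketched as follows; the accessor name and the fallback value here are hypothetical, for illustration only (a real wrapper would hard-code the size measured for the Nettle version it was tested against):

```python
import ctypes

# Hypothetical fallback value, for illustration only.
FALLBACK_CTX_SIZE = 4096

def umac_ctx_size(lib, accessor="nettle_umac128_ctx_size",
                  fallback=FALLBACK_CTX_SIZE):
    """Ask the loaded library for the context size via an accessor
    function (hypothetical name) if present; otherwise fall back to a
    hard-coded size."""
    try:
        fn = getattr(lib, accessor)  # AttributeError if the symbol is absent
    except AttributeError:
        return fallback
    fn.restype = ctypes.c_size_t
    fn.argtypes = []
    return fn()
```

ctypes.CDLL objects raise AttributeError on missing symbols, so the same lookup works whether the loaded libnettle provides the accessor or not.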
Regarding the init function, that shouldn’t be necessary if Nettle guarantees that a call to set_key() resets the context structure and performs all necessary initialization. I can see where init() would be needed for the key-less hash functions, but it may not be needed here.
That actually leads to one other wrinkle I’ve run into. According to the Nettle docs:
Function: void umac32_digest (struct umac32_ctx *ctx, size_t length, uint8_t *digest)
Function: void umac64_digest (struct umac64_ctx *ctx, size_t length, uint8_t *digest)
Function: void umac96_digest (struct umac96_ctx *ctx, size_t length, uint8_t *digest)
Function: void umac128_digest (struct umac128_ctx *ctx, size_t length, uint8_t *digest)
Extracts the MAC of the message, writing it to digest. length is usually equal to the specified output size, but if you provide a smaller value, only the first length octets of the MAC are written. These functions reset the context for processing of a new message with the same key. The nonce is incremented as described above; the new value is used unless you call the _set_nonce function explicitly for each message.
While this wouldn’t really be a problem for my use case, the Python cryptographic hash standard API (defined in PEP 452) states the following about the digest() method:
digest()
Return the hash value of this hashing object as a bytes containing 8-bit data. The object is not altered in any way by this function; you can continue updating the object after calling this function.
So, if I wanted to provide a Python module which adhered to this API, the automatic reset of the context and increment of the nonce would be a problem. Mind you, I think this is a very useful feature in Nettle and wouldn’t want to see it go away. However, do you think it would be possible to make it configurable? There could be something like a umac32_auto_increment_nonce() function that takes a context and a boolean argument determining whether calling digest() performs the reset and nonce increment. If it didn’t, you could continue to append to a message even after requesting a MAC of the data provided so far, without the need to make a copy of the context structure first.
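Absent such a switch, a wrapper can approximate the PEP 452 semantics by running the destructive _digest call on a byte-for-byte copy of the context. A sketch, with digest_fn standing in for a ctypes-wrapped umac*_digest and ctx for a ctypes context buffer:

```python
import ctypes

def nondestructive_digest(digest_fn, ctx, ctx_size, digest_size):
    """PEP 452-style digest(): run Nettle's destructive _digest() on a
    byte-for-byte copy of the context, leaving the original untouched so
    update() can continue afterwards."""
    ctx_copy = ctypes.create_string_buffer(ctx_size)
    ctypes.memmove(ctx_copy, ctx, ctx_size)    # snapshot the current state
    out = ctypes.create_string_buffer(digest_size)
    digest_fn(ctx_copy, digest_size, out)      # resets only the throwaway copy
    return out.raw
```

The cost is one context-sized copy per digest() call, which is the overhead the proposed configuration switch would avoid.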
And then there's a known ABI problem with exporting that list, discussed in another mail to the list, which needs to be solved in one way or the other.
Yeah, I saw the earlier e-mail on that. My vote would be for option 2 in your list where a function is defined that returns a pointer to the list, and I think that would probably be a good approach for accessing any static data in the library.
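With option 2, consuming such a list from Python would be straightforward. A sketch of walking a NULL-terminated array of pointers with ctypes (demonstrated here with a locally built array rather than a real libnettle call):

```python
import ctypes

def iter_null_terminated(ptr_array):
    """Walk a NULL-terminated array of pointers, the shape a
    hypothetical list-returning accessor function might hand back."""
    i = 0
    while ptr_array[i]:        # ctypes yields None for a NULL entry
        yield ptr_array[i]
        i += 1
```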
That said, I like the suggestion in your other e-mail about providing both an ABI and API level interface, leaving it up to callers to decide if they’re willing to deal with the upgrade/downgrade issues of replacing the libnettle library without recompiling the calling code against the latest header file. More on that shortly in a follow-up to that other e-mail.
[Ron] This is a good point. Based on some simple tests, the returned memory always seems to be at least 16-byte aligned similar to malloc(), but I can’t actually find documentation that explicitly promises this. I’ll have to do more research on this.
If you don't find any better way, maybe you can use ctypes to call libc malloc directly?
Yes - this definitely seems like it would work if the Python allocator didn’t already provide such a guarantee.
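For reference, a minimal sketch of the ctypes-to-malloc route, assuming a glibc-style libc where malloc() results are aligned to alignof(max_align_t) (16 bytes on x86-64):

```python
import ctypes

# Load the C runtime already linked into the current process.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

ptr = libc.malloc(1024)
aligned = (ptr % 16 == 0)   # expected to hold on glibc/x86-64 (assumption)
libc.free(ptr)
```

Setting restype/argtypes explicitly matters here: without restype = c_void_p, ctypes truncates the returned pointer to a 32-bit int on 64-bit platforms.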