On Dec 22, 2016, at 12:16 PM, Niels Möller nisse@lysator.liu.se wrote:
Ron Frederick ronf@timeheart.net writes:
Rather than adding a function to return the structure size here, what do you think about adding functions such as umac32_new() which would do the allocation of the structure and return a pointer to it?
Actually, I think I'd prefer a way to get the struct size at run time; otherwise I'd have to consider interfaces to let an application override the allocation function used by the *_new functions.
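As a rough sketch of how a run-time size query might be used from the Python side: the helper below assumes the library exports some size function. No such function is confirmed to exist in nettle; `alloc_ctx` and the symbol name in the usage note are hypothetical.

```python
import ctypes


def alloc_ctx(lib, size_func_name):
    """Allocate a context buffer whose size is reported by the library.

    `lib` is a loaded shared library (e.g. a ctypes.CDLL instance) and
    `size_func_name` names an exported function taking no arguments and
    returning a size_t. Both the helper and the idea that the library
    exports such a function are assumptions made for illustration.
    """
    size_func = getattr(lib, size_func_name)
    size_func.restype = ctypes.c_size_t
    size_func.argtypes = []
    size = size_func()
    # The caller owns the buffer; no library-side allocator is involved.
    return ctypes.create_string_buffer(size)
```

With a real library this might be called as `alloc_ctx(ctypes.CDLL(path), "nettle_umac32_ctx_size")`, where both `path` and the symbol name are placeholders. The Python code never needs to know the struct layout, and the application keeps control over allocation.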
Are you sure the ctypes package you use doesn't provide any general way to extract struct sizes from a C interface? I know it's common practice to design libraries to have only opaque types, with function calls for allocation, use, and deallocation. But I wouldn't expect nettle to be the only library which tries to be more low-level and expose some internals.
[Ron] Keep in mind that the binding we’re talking about here is all happening dynamically at run time of the Python program. At that point in time, source code to the C module wouldn’t even be available on the system (not even header files) in many cases, and there’d be no C compiler or preprocessor available to parse them even if they were. The only thing available is the information in the dynamic library itself, readable by the linker.
There is a “sizeof” function in ctypes for structures, but the exact types and ordering of the members of the structure would have to be provided in the Python code, which in this case would be translating the C structure definitions (and CPP macros in this case) into the ctypes equivalent, meaning the structure is no longer opaque.
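To illustrate the point, here is a minimal sketch of what ctypes.sizeof() requires. The field layout is invented for the example and is not nettle's actual UMAC context definition; the point is only that every member has to be spelled out in Python before a size can be computed.

```python
import ctypes


# Hypothetical layout, NOT the real nettle struct. Every member type and
# its order must be mirrored by hand for sizeof() to give a useful answer.
class ExampleCtx(ctypes.Structure):
    _fields_ = [
        ("key", ctypes.c_uint32 * 4),
        ("count", ctypes.c_uint64),
        ("index", ctypes.c_uint),
    ]


# The result includes any padding ctypes adds for the current platform,
# so the exact value can differ between ABIs.
size = ctypes.sizeof(ExampleCtx)
```

If the C definition changes in a later library release, this Python mirror silently goes out of date, which is the maintenance problem described above.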
For example, if one ever wants to let the python code read or write fields of a struct used in a C interface, then, I imagine, the python glue magic needs to either extract the layout from the header file, or generate a little C code, including the header file, to get proper sizes and offsets or accessor functions.
[Ron] There are other ways to interface to C code in Python that can auto-generate shims to do something like this, but they would involve actually generating and compiling additional C code at build time and shipping an additional C shared library with the Python code, which means multiple versions of the package are needed to cover each of the supported architectures. That’s what I’m trying to avoid here, as up until now my package is 100% pure Python and the exact same code runs on all architectures and OSes (MacOS/Windows/UNIX). The only calls out to C code are either provided by other Python modules or involve dynamic linking against third-party C libraries using ctypes operating at the ABI level.
I think that, in general, it makes little sense to use C code in a shared library without also using the corresponding header file in some way or the other.
In principle, the compiler could insert information about sizes and struct layouts (for more general use than just debug info) into the object files, so that the python glue code could extract it from the shared library itself. But as far as I'm aware, common compilers and linkers don't do that.
[Ron] Yeah - I think that sort of information may be available when libraries are compiled with debug information, but it’s not something available to the linker as far as I know and I’ve never seen anything to allow languages like Python to use that data to access structures like this.
ctx = ctypes.create_string_buffer(UMAC_CTX_SIZE)
Note that nettle's context structs have stricter alignment requirements than a string. Maybe you need to use a different allocation method, to guarantee you at least as much alignment as the system's malloc?
[Ron] This is a good point. Based on some simple tests, the returned memory always seems to be at least 16-byte aligned, similar to malloc(), but I can’t actually find documentation that explicitly promises this. I’ll have to do more research on this.
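One possible way to avoid depending on undocumented behavior is to over-allocate and round the address up by hand. The sketch below assumes 16 bytes is sufficient alignment, which is a guess rather than a documented nettle requirement; `aligned_buffer` is a hypothetical helper and the size value is a placeholder.

```python
import ctypes


def aligned_buffer(size, alignment=16):
    """Return (raw, addr) where addr is an aligned address inside raw.

    `raw` must be kept alive for as long as `addr` is in use, since it
    owns the underlying memory.
    """
    # Over-allocate so an aligned region of `size` bytes always fits.
    raw = ctypes.create_string_buffer(size + alignment - 1)
    base = ctypes.addressof(raw)
    # Number of bytes to skip to reach the next multiple of `alignment`.
    offset = (-base) % alignment
    return raw, base + offset


UMAC_CTX_SIZE = 2048  # placeholder value, not the real structure size
raw, addr = aligned_buffer(UMAC_CTX_SIZE)
ctx = ctypes.c_void_p(addr)  # pass this where the C API expects the context
```

This costs at most `alignment - 1` extra bytes per context and does not depend on how create_string_buffer() happens to allocate memory.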
I really don’t want to rely on a hard-coded size like this,
I totally agree that's a bad workaround.
[Ron] Unfortunately, my options at the moment seem to be to either do this or keep looking for alternate implementations of UMAC I can attempt to link against if I want to continue to keep my code 100% Python, and the Nettle version of UMAC is by far the cleanest thing I’ve found so far.