nettle-bugs December 2016

nettle-bugs@lists.lysator.liu.se

4 participants
3 discussions

Subtle ABI problem with nettle_hashes
by nisse＠lysator.liu.se 13 Jan '17


The idea of the interface

  /* null-terminated list of digests implemented by this version of nettle */
  extern const struct nettle_hash * const nettle_hashes[];

was that the size of the array isn't part of the ABI; a new Nettle version should be able to extend it with more entries, without breaking the ABI.

However, I've recently learnt that it does break, in a subtle way. (See https://bugs.gentoo.org/show_bug.cgi?id=601512 and http://trofi.github.io/posts/195-dynamic-linking-ABI-is-hard.html).

The problematic case is a traditional non-PIC executable linking with a nettle shared library; in this post, I care exclusively about this case. All symbol references in the executable are resolved at link time, and at load time it is mapped at a fixed address (traditionally 0, I don't know if current systems skip the first page), without any relocations.

Now, if the executable links with libnettle.so, and contains a reference to the symbol nettle_hashes, the address where libnettle.so is going to be mapped isn't known at link time. So how can the executable refer to it without load-time relocation of its references?

The linker solves that problem using the curious relocation type R_X86_64_COPY. The linker allocates space for a copy of the data in the BSS segment of the executable, and resolves all references to point to that copy. At the time libnettle.so is loaded, the list in the library is copied (after relocating it, but that's a minor complication in this context) to the space in the BSS. I imagine the dynamic linker also adjusts the pointers in libnettle.so's GOT to refer to the copy rather than the original.

Now the problem is that the allocation in the BSS segment, as well as the copying operation, are based on the size of the data object as recorded in the version of libnettle.so available at the time the executable was linked.
If the array size is larger in the version of libnettle.so actually loaded, the copy operation truncates it, which is particularly bad when it's NULL-terminated. So the array size, which was intended to not be part of the ABI, creeps into the ABI.

So what to do about this? We have to break the ABI, but I'd prefer if we keep the API unchanged. Some alternatives:

1. Define nettle_hashesp as a constant pointer to the current nettle_hashes list, and

     #define nettle_hashes (*nettle_hashesp)

   Then nettle_hashesp will still get a R_X86_64_COPY relocation, but now the size is always a single pointer, regardless of the array size. At load time, it will be set to point directly to the list in the data segment of the loaded libnettle.so.

2. Define a function get_nettle_hashes returning a pointer to the list, and

     #define nettle_hashes (get_nettle_hashes())

   In this case, the indirection is via a PLT entry in the executable.

3. Define the array with a size explicitly part of the ABI,

     extern const struct nettle_hash * const nettle_hashes[17];

   Add some reasonable number of reserved NULL entries at the end, and make an ABI break whenever we run out of reserved places and have to increase the size.

We also have other public data, e.g., individual nettle_hash structs, like

  extern const struct nettle_hash nettle_sha256;

These will also get a R_X86_64_COPY relocation if referenced (and all internal references within libnettle.so will be relocated to the copy in the executable's BSS, I guess). But that's less of a problem, since the size and layout is already part of the ABI.

More problematic are the objects declared in ecc-curve.h; the size and layout of struct ecc_curve was intended to be an implementation detail, not part of the ABI, but it becomes part of the ABI when R_X86_64_COPY is involved.

Advice appreciated.

I'd also like to hear if anyone on the list knows how these things work with Windows DLLs.
I've read sometime that exporting data (in contrast to functions) from a DLL is extra tricky, but I don't remember any details. If we make any changes to fix the exported-data issues with ELF, it would be good if we could ensure that we solve any related problems for libnettle.dll too.

I understand why R_X86_64_COPY is needed, but it works in a way that was pretty counter-intuitive to me. The effect is, more or less, that the library's data is *statically* linked into the executable, and then initialized at load time based on the contents of the loaded library. We then get a mix of statically linked and dynamically linked parts which might originate in different versions of the library.

It would be prettier if we could force the executable to always access library data via the GOT (like it works for PIC code), and never use R_X86_64_COPY. But I guess that has to be known at code generation time, and it is likely too late to fix up at link time, which is when we know which external data objects are defined by some shared library, and which aren't. But if anyone knows how to fix the issue in this way, I'd be delighted.

Regards,
/Niels

--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
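To make the truncation problem from the message above concrete, here is a minimal single-file sketch in C. It does not involve the real dynamic linker: a fixed-size memcpy stands in for the copy step, and every name (struct demo_hash, new_list, OLD_SLOTS, demo_hashesp, demo_get_hashes) is hypothetical and not part of Nettle. The pointer variant is a simplified take on alternative 1 (a pointer to the first list element rather than the macro form suggested in the message), and the function variant follows the shape of alternative 2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct nettle_hash; only a name is kept. */
struct demo_hash { const char *name; };

static const struct demo_hash demo_md5 = { "md5" };
static const struct demo_hash demo_sha1 = { "sha1" };
static const struct demo_hash demo_sha256 = { "sha256" };

/* The list as exported by the "new" library: three entries plus NULL. */
static const struct demo_hash *const new_list[] = {
  &demo_md5, &demo_sha1, &demo_sha256, NULL
};

/* Assumption for the demo: the "old" library had two entries plus NULL,
   so the executable reserved only three slots for its copy. */
#define OLD_SLOTS 3

/* Model of the copy step: only the size known at link time is copied. */
static void
simulate_copy (const struct demo_hash **dst, size_t slots)
{
  assert (dst != NULL);
  memcpy (dst, new_list, slots * sizeof (*dst));
}

/* Returns 1 if the truncated copy still contains a NULL terminator. */
int
truncated_copy_is_terminated (void)
{
  const struct demo_hash *copy[OLD_SLOTS];
  size_t i;

  simulate_copy (copy, OLD_SLOTS);
  for (i = 0; i < OLD_SLOTS; i++)
    if (copy[i] == NULL)
      return 1;
  return 0;
}

/* Sketch of alternative 1: export a single pointer to the first element.
   Its size is that of one pointer, however long the list grows. */
static const struct demo_hash *const *const demo_hashesp = new_list;

/* Sketch of alternative 2: hand out the list through a function. */
const struct demo_hash *const *
demo_get_hashes (void)
{
  return new_list;
}

/* Counts entries up to the NULL terminator. */
size_t
count_hashes (const struct demo_hash *const *list)
{
  size_t n = 0;
  while (list[n] != NULL)
    n++;
  return n;
}

/* Copying just the pointer, as a fixed-size copy would, loses nothing. */
size_t
count_via_copied_pointer (void)
{
  const struct demo_hash *const *copied;

  memcpy (&copied, &demo_hashesp, sizeof (copied));
  return count_hashes (copied);
}
```

In this model the three-slot copy of the four-slot list ends up without a terminator, so walking it would run off the end, while both the copied pointer and the function return still reach the complete list in the "library".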

2 1

Language bindings question
by Ron Frederick 28 Dec '16


Hello,

I recently discovered the Nettle crypto library and noticed that it has an implementation of UMAC in it, which I haven’t been able to find in most of the other popular crypto libraries out there. I was hoping to call into the UMAC functions from Python, and I see Python mentioned on the Nettle home page at http://www.lysator.liu.se/~nisse/nettle/, but there’s no link for it down in the “Language bindings” section, and in fact many of the other links in that section seem to point at dead pages. I tried searching for Python bindings elsewhere, but haven’t had any luck so far. Does anyone know of any?

Assuming there aren’t Python bindings yet, I would be tempted to try to craft my own (at least for UMAC for now). However, I’d like to be able to do it using the Python ‘ctypes’ library, and one difficulty I can see with doing that is knowing the amount of memory I should allocate for some of the structures used by this code. Specifically, for UMAC, I’d need to know the size of things like “struct umac32_ctx”, “struct umac64_ctx”, “struct umac96_ctx”, and “struct umac128_ctx”. These sizes are trivial to get from C code using sizeof(), but I don’t believe there’s any good way to get them from Python using ctypes unless there’s an API call that can be made to a C function in the library which returns this information.

As an example of this, libsodium provides functions like crypto_stream_chacha20_keybytes() and crypto_stream_chacha20_noncebytes() which can be called to return the size of a Chacha20 key and nonce, respectively. In the C header file, these look like:

  #define crypto_stream_chacha20_KEYBYTES 32U
  SODIUM_EXPORT
  size_t crypto_stream_chacha20_keybytes(void);

  #define crypto_stream_chacha20_NONCEBYTES 8U
  SODIUM_EXPORT
  size_t crypto_stream_chacha20_noncebytes(void);

The corresponding functions in the code just return the constants defined in the header file shown above.
For this UMAC example in Nettle, I could imagine similar functions which returned the value computed by sizeof() for these context structures, and also things like the UMAC key size, digest sizes, block size, and allowed nonce sizes.

Has a use-case like this been considered? Is there some other way to call into Nettle from Python without writing additional glue code in C?

--
Ron Frederick
ronf(a)timeheart.net
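The kind of size-reporting functions asked about above could look roughly like the following C sketch, modelled on the libsodium pattern quoted in the message. Nothing here is an existing Nettle API: struct demo_umac32_ctx is a made-up stand-in with an arbitrary layout, and the demo_ function names and DEMO_ constants are hypothetical, chosen only to illustrate the idea.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a library context struct. The layout is invented for
   this sketch and is NOT that of Nettle's struct umac32_ctx; the point
   is that a binding should not have to know the layout at all. */
struct demo_umac32_ctx
{
  uint32_t l1_key[256];
  uint64_t count;
  unsigned char block[1024];
};

/* Illustrative constants (hypothetical names); a real library would
   return whatever its own header defines. */
#define DEMO_UMAC_KEY_SIZE 16
#define DEMO_UMAC32_DIGEST_SIZE 4

/* Reports how many bytes a caller must allocate for one context. */
size_t
demo_umac32_ctx_size (void)
{
  return sizeof (struct demo_umac32_ctx);
}

/* Reports the key size in bytes. */
size_t
demo_umac_key_size (void)
{
  return DEMO_UMAC_KEY_SIZE;
}

/* Reports the digest size in bytes. */
size_t
demo_umac32_digest_size (void)
{
  return DEMO_UMAC32_DIGEST_SIZE;
}
```

With functions of this shape exported from a shared library, a ctypes caller could ask for the context size at run time and pass the result to ctypes.create_string_buffer() to get suitably sized storage, with no C glue code and no hard-coded struct sizes.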

3 23

Some ideas for new algorithms in Nettle
by Joachim Strömbergson 04 Dec '16


Aloha!

I've thought about some algorithms and constructs that I think would be useful and good to add to Nettle.

We are seeing an interest in using EC keys for both DH and DSA operations, especially in embedded systems. One should be careful about reusing keys for more than one purpose. But for EC there seem to be some circumstances in which using the keys for the two constructions does not harm either of them, see:

“On the Joint Security of Encryption and Signature in EMV.” Cryptology ePrint Archive, Report 2011/615, 2011.
http://eprint.iacr.org/2011/6

Recently, Trevor Perrin from Open Whisper Systems wrote a paper that describes how a given Curve25519 (or Curve448) key pair can be reused in a specific DSA construction called XEdDSA. XEdDSA is in fact a way to convert the Curve keys and then use them with Ed25519 or Ed448 to sign or verify messages. Open Whisper Systems also have code for XEd25519 on GitHub. I've looked at it and compared it to the Curve code in Nettle. It seems that we could add this algorithm with basically a small wrapper.

https://whispersystems.org/docs/specifications/xeddsa/xeddsa.pdf
https://github.com/WhisperSystems/curve25519-java/blob/master/android/jni/e…
https://github.com/WhisperSystems/curve25519-java/blob/master/android/jni/e…

Another algorithm that I've seen used in the embedded space is the SipHash PRF/keyed hash function. It is very fast on Cortex-M devices and has low code and RAM requirements. If implemented in Nettle I think we should support both 64 and 128 bit digests.

https://131002.net/siphash/
https://github.com/veorq/SipHash

When it comes to block cipher modes, CMAC and OCB are two modes that are very interesting for the embedded space. CMAC is a "better CBC-MAC" that can be (and is) used as a KDF, MAC etc.

http://csrc.nist.gov/publications/nistpubs/800-38B/SP_800-38B.pdf

OCB is an AEAD construction that has seen little use until now due to licensing issues. But the licensing has been changed by Rogaway et al and there is an RFC for OCB. The cost for OCB goes asymptotically towards one cipher block operation per message block.

https://www.rfc-editor.org/rfc/rfc7253.txt
http://web.cs.ucdavis.edu/~rogaway/ocb/

I don't know what the plan is for password hashing and memory-hard or computationally hard functions. PBKDF2 is in Nettle, but not bcrypt, scrypt or the PHC winner Argon2. Is there any interest in adding them to Nettle?

https://github.com/P-H-C/phc-winner-argon2

Finally, since Skein was being developed, how about adding Blake2? Blake2 is derived from BLAKE, one of the SHA-3 finalists, and is faster than Keccak. There are also versions of Blake2 suitable for embedded systems.

https://blake2.net/

--
Med vänlig hälsning, Yours

Joachim Strömbergson - Alltid i harmonisk svängning.
========================================================================
 Joachim Strömbergson          Secworks AB          joachim(a)secworks.se
========================================================================

4 6
