Hashing with short input assertion question

List overview All Threads
Download

newer

older

using gitlab CI for nettle

Miscalculations on secp384 curve

Tim Ruehsen

7 Jan 2016 7 Jan '16

12:11 p.m.

Hi,

just experienced a crash (assertion) when using md5 on short input.

test: md5.c:81: nettle_md5_digest: Assertion `length <= 16' failed.

The code is struct md5_ctx md5; md5_init(&md5); md5_update(&md5, "moin", 4); md5_digest(&md5, 20, digest);

My system is Debian SID (libnettle6:amd64 3.1.1-4).

The example (http://www.lysator.liu.se/~nisse/nettle/nettle.html#Example) does not mention special handling for short input (nettle_sha1_digest() has a similar assert()). I searched the document for 'padding' but found nothing relevant. How should I proceed ? Pad with 0 bytes ?

Is it reasonable to open a Debian bug suggesting to use -DNDEBUG ? IMO, these assertions needs to be disabled on a production system or replaced by proper error handling. It would be fine to have them in the respective -dbg (debugging) package though.

Best regards,

Tim

Attachments:

signature.asc (application/pgp-signature — 819 bytes)

Show replies by date

nisse＠lysator.liu.se

7 Jan 7 Jan

12:26 p.m.

Tim Ruehsen tim.ruehsen@gmx.de writes:

...

just experienced a crash (assertion) when using md5 on short input.

test: md5.c:81: nettle_md5_digest: Assertion `length <= 16' failed.

The code is struct md5_ctx md5; md5_init(&md5); md5_update(&md5, "moin", 4); md5_digest(&md5, 20, digest);

The problem isn't the *input* length, but the output length. The length argument for md5_digest can be at most MD5_DIGEST_SIZE, i.e., 16.

...

Is it reasonable to open a Debian bug suggesting to use -DNDEBUG ?

IMO, no.

Asserts in nettle are triggered when bugs in nettle or in the application violate necessary assumptions required for correct operation. In most cases, an immediate crash is preferable to follow on problems such data corruption or invalid memory accesses. Right, I know there are some exceptional production systems (e.g, the Ariadne rocket...) where it *might* be preferable to ignore problems and hope for the best, and I also know there are different opinions. But my view is that in general, it makes sense to keep asserts also in production code.

Best regards, /Niels

-- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance.

Tim Ruehsen

1:51 p.m.

On Thursday 07 January 2016 13:26:37 Niels Möller wrote:

...

Tim Ruehsen tim.ruehsen@gmx.de writes:

...
just experienced a crash (assertion) when using md5 on short input.

test: md5.c:81: nettle_md5_digest: Assertion `length <= 16' failed.

The code is

struct md5_ctx md5; md5_init(&md5); md5_update(&md5, "moin", 4); md5_digest(&md5, 20, digest);
                     ^^
The problem isn't the *input* length, but the output length. The length argument for md5_digest can be at most MD5_DIGEST_SIZE, i.e., 16.

When walking back from lunch everything sorted out in my brain. I was too hungry when I wrote that text. Sorry for the noise.

But what's wrong with providing a larger buffer than needed ? Imagine snprintf would throw an assertion if the provided buffer is 'too large'.

...

Asserts in nettle are triggered when bugs in nettle or in the application violate necessary assumptions required for correct operation. In most cases, an immediate crash is preferable to follow on problems such data corruption or invalid memory accesses. Right, I know there are some exceptional production systems (e.g, the Ariadne rocket...) where it *might* be preferable to ignore problems and hope for the best, and I also know there are different opinions. But my view is that in general, it makes sense to keep asserts also in production code.

Some admins have their jobs due to 'stop-by-assertion' software - at least that is good thing about assertions ;)

Best Regards

Tim

nisse＠lysator.liu.se

2:05 p.m.

Tim Ruehsen tim.ruehsen@gmx.de writes:

...

But what's wrong with providing a larger buffer than needed ?

I don't think about it as the size of the provided buffer, but as the requested size of the digest (intended for the usecase of truncated digests). And it's not defined how to produce a 20-byte md5 digest. If

md5_digest(&md5, 20, digest);

were allowed, what should it do? Write 16 bytes, and leave the remaining 4 bytes untouched?

...

Some admins have their jobs due to 'stop-by-assertion' software - at least that is good thing about assertions ;)

I suspect that 'continue-with-silent-data-corruption' software would give them even more work...

Regards, /Niels

-- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance.

Tim Ruehsen

3:30 p.m.

On Thursday 07 January 2016 15:05:38 Niels Möller wrote:

...

Tim Ruehsen tim.ruehsen@gmx.de writes:

...
But what's wrong with providing a larger buffer than needed ?

I don't think about it as the size of the provided buffer, but as the requested size of the digest (intended for the usecase of truncated digests). And it's not defined how to produce a 20-byte md5 digest. If

md5_digest(&md5, 20, digest);

were allowed, what should it do? Write 16 bytes, and leave the remaining 4 bytes untouched?

You put the answer into my mouth... yes, that seems intuitive to me.

...

...
Some admins have their jobs due to 'stop-by-assertion' software - at least that is good thing about assertions ;)

I suspect that 'continue-with-silent-data-corruption' software would give them even more work...

Definitely. That's why I try to avoid either of them.

An assertion doesn't give the higher layers a chance to intervene. The process is being killed, eventually leaving corrupted data in persistent memory while all information to recover anything is gone with the killed process. Returning a proper error value at least gives the ones who care for the chance to recover. I am working in the telephony area where recovering from faults is a basic design (even on protocol level). Also in mind that a single process may control hundreds of connections - a sudden stop may kill people (just calling the ambulance / police, maybe no chance to call again).

Regards

Tim

Amos Jeffries

5:57 p.m.

On 8/01/2016 4:30 a.m., Tim Ruehsen wrote:

...

On Thursday 07 January 2016 15:05:38 Niels Möller wrote:

...
Tim Ruehsen writes:

...
But what's wrong with providing a larger buffer than needed ?

I don't think about it as the size of the provided buffer, but as the requested size of the digest (intended for the usecase of truncated digests). And it's not defined how to produce a 20-byte md5 digest. If

md5_digest(&md5, 20, digest);

were allowed, what should it do? Write 16 bytes, and leave the remaining 4 bytes untouched?

You put the answer into my mouth... yes, that seems intuitive to me.

Then what will the application do with 4 random bytes of output? It clearly is not aware of the 16-byte digest size limit. So what basis do we have to be certain it will output or use only those 16 bytes, and not doing something such as sending the buffer to snprintf with length parameter of 20 there as well. Or worse, passing just the (unterminated) buffer start pointer to sprintf.

It is a very bad policy for any code, let alone security code, to just blindly trust that the external software will operate correctly.

The choice is also not so black and white. There are a bunch of other "intuitive" actions that could be performed: * wipe the entire digest and emit 20 bytes of 0's, or * wipe the entire digest and emit 20 bytes of 1's, or * wipe the extra 4 bytes with 0's, or * wipe the extra 4 bytes with 1's, or * wipe the 1st of the extra bytes with a null byte, or * throw an exception.

Each of which has its own set of problems and nasty side effects depending on what the external software is doing or assuming.

...

...
...
Some admins have their jobs due to 'stop-by-assertion' software - at least that is good thing about assertions ;)

I suspect that 'continue-with-silent-data-corruption' software would give them even more work...

Definitely. That's why I try to avoid either of them.

An assertion doesn't give the higher layers a chance to intervene.

The higher layers are clearly broken in their designed use of the nettle API. This is not a dynamic limit being checked, but an explicit and fixed global value of MD5_DIGEST_SIZE.

Amos

Tim Ruehsen

8 Jan 8 Jan

10:08 a.m.

On Friday 08 January 2016 06:57:54 Amos Jeffries wrote:

...

On 8/01/2016 4:30 a.m., Tim Ruehsen wrote:

...
On Thursday 07 January 2016 15:05:38 Niels Möller wrote:

...
Tim Ruehsen writes:

...
But what's wrong with providing a larger buffer than needed ?

I don't think about it as the size of the provided buffer, but as the requested size of the digest (intended for the usecase of truncated digests). And it's not defined how to produce a 20-byte md5 digest. If

md5_digest(&md5, 20, digest);

were allowed, what should it do? Write 16 bytes, and leave the remaining 4 bytes untouched?

You put the answer into my mouth... yes, that seems intuitive to me.

Then what will the application do with 4 random bytes of output? It clearly is not aware of the 16-byte digest size limit. So what basis do we have to be certain it will output or use only those 16 bytes, and not doing something such as sending the buffer to snprintf with length parameter of 20 there as well. Or worse, passing just the (unterminated) buffer start pointer to sprintf.

???

...

It is a very bad policy for any code, let alone security code, to just blindly trust that the external software will operate correctly.

Sorry, that is IMO bullshit. You say a low-level function is responsible for what the (high-level caller | application) is doing. *head shake* It is absolutely in the callers responsibility to use the output of any function correctly - and his basis is the documentation.

...

The choice is also not so black and white. There are a bunch of other "intuitive" actions that could be performed:

You are right, 'intuition' is not the point. The point is what is documented and what not.

Here on Debian I have no man pages for any nettle functions - so I looked into the header files to find the function prototypes and used them 'intuitive'. Which worked immediately until I started to use a larger digest buffer than 'allowed'. I am using the same buffer for other algorithms/hash functions as well, so using 'sizeof' was straight forward. Nettle functions unexpectedly crashed of my application instead of a sane return value(all those function are void, *_digest() could return error or the number of bytes written).

I meanwhile found the nettle docs on the web (http://www.lysator.liu.se/~nisse/nettle/nettle.html#MD5):

########### Function: void md5_digest (struct md5_ctx *ctx, size_t length, uint8_t *digest)

Performs final processing and extracts the message digest, writing it to digest. length may be smaller than MD5_DIGEST_SIZE, in which case only the first length octets of the digest are written. ###########

Not a simple word saying 'If length is larger than MD5_DIGEST_SIZE' the function does not return - it calls abort()'.

I would like to see that documented for every *_digest() function !

...

...
...
...
Some admins have their jobs due to 'stop-by-assertion' software - at least that is good thing about assertions ;)

I suspect that 'continue-with-silent-data-corruption' software would give them even more work...

Definitely. That's why I try to avoid either of them.

An assertion doesn't give the higher layers a chance to intervene.

The higher layers are clearly broken in their designed use of the nettle API. This is not a dynamic limit being checked, but an explicit and fixed global value of MD5_DIGEST_SIZE.

See the excerpt from the docs above. What you say is not correct, you can provide a smaller digest buffer than MD5_DIGEST_SIZE. If you were right, the 'length' param wouldn't make sense. In my eyes it makes sense as saying: 'digest' is 'length' bytes large - please do not overwrite (= put at most 'length' bytes into 'digest'). The typical C buffer over flow prevention. But any such functions generally are fine with larger buffers - they put their payload into the buffer and leave the rest alone (except otherwise documented). Returning the actual number of bytes put into 'digest' would be of large help as well.

Just an impression for you what I am doing and why fixed digest sizes doesn't matter resp. are uninteresting here:

For my high-level hash function I use GnuTLS (wrapper around nettle|other), Nettle or Gcrypt (configurable).

unsigned char digest[128]; // *should* be large enough, if not we get an error char *algo = <user input> if ((len=hash(algo,text,textlen,digest,digestsize)) < 0) { // error like 'unknown algo', 'digest size too small', ... } else { // got <len> bytes in digest, e.g. print as hex values }

At this level MD5_DIGEST_SIZE etc. is unknown and not interesting.

(For nettle i found the nettle_hash structure variables and it's digest_size variable. Everything is working fine.)

Regards, Tim

nisse＠lysator.liu.se

12:16 p.m.

Tim Ruehsen tim.ruehsen@gmx.de writes:

...

I meanwhile found the nettle docs on the web (http://www.lysator.liu.se/~nisse/nettle/nettle.html#MD5):

I'm happy you found the documentation. The manual should also be included in debian's nettle-dev package, in info, html and pdf formats.

...

########### Function: void md5_digest (struct md5_ctx *ctx, size_t length, uint8_t *digest)

Performs final processing and extracts the message digest, writing it to digest. length may be smaller than MD5_DIGEST_SIZE, in which case only the first length octets of the digest are written. ###########

Sorry you find this unclear. Can you suggest a concise improvement?

...

unsigned char digest[128]; // *should* be large enough, if not we get an error char *algo = <user input> if ((len=hash(algo,text,textlen,digest,digestsize)) < 0) { // error like 'unknown algo', 'digest size too small', ... } else { // got <len> bytes in digest, e.g. print as hex values }

For simplicity, let's assume that algoriths are represented by struct nettle_hash, rather than an ascii name. Then a hash function in that style should be implemented like

int hash (const struct nettle_hash *hash, const uint8_t *msg, size_t length, uint8_t *digest, size_t digest_size) { void *ctx; if (digest_size < hash->digest_size) /* Too small */ return -1;

ctx = alloca(hash->context_size); hash->init(ctx); hash->update(ctx, length, msg); hash->digest(ctx, hash->digest_size, digest); return hash->digest_size; }

I don't think that passing a valid digest size to hash->digest here should be a big deal.

Regards, /Niels

-- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance.

Tim Ruehsen

2:14 p.m.

On Friday 08 January 2016 13:16:27 Niels Möller wrote:

...

Tim Ruehsen tim.ruehsen@gmx.de writes:

...
I meanwhile found the nettle docs on the web

...
(http://www.lysator.liu.se/~nisse/nettle/nettle.html#MD5):

I'm happy you found the documentation. The manual should also be included in debian's nettle-dev package, in info, html and pdf formats.

...
########### Function: void md5_digest (struct md5_ctx *ctx, size_t length, uint8_t *digest)

Performs final processing and extracts the message digest, writing it to digest. length may be smaller than MD5_DIGEST_SIZE, in which case only the first length octets of the digest are written. ###########

Sorry you find this unclear. Can you suggest a concise improvement?

Never said the text is unclear. It just misses the case when 'length' is too large. Citing my proposal from my previous email:

...

'If length is larger than MD5_DIGEST_SIZE the function does not return - it calls abort()'

Or more precise If length is larger than MD5_DIGEST_SIZE an assertion is generated and by that the program is terminated by abort(). To avoid this behavior nettle has to be compiled with -DNDEBUG (additional bytes will not be touched).

...

ctx = alloca(hash->context_size);
hash->init(ctx);
hash->update(ctx, length, msg);
hash->digest(ctx, hash->digest_size, digest);
return hash->digest_size;
}

I don't think that passing a valid digest size to hash->digest here should be a big deal.

Thanks Niels. This part is already up and running. My writing was for Amos to make clear that MD5_DIGEST_SIZE and the like are not needed from a higher level perspective.

Regards, Tim

3500

Age (days ago)

3501

Last active (days ago)

nettle-bugs@lists.lysator.liu.se

8 comments

3 participants

tags (0)

participants (3)

Amos Jeffries
nisse＠lysator.liu.se
Tim Ruehsen