Hello Niels,
Niels Möller nisse@lysator.liu.se writes:
Daiki Ueno ueno@gnu.org writes:
When I'm trying to implement ML-KEM (Kyber), I realized that the current API for SHAKE (sha3_256_shake) is a bit too limited: while ML-KEM uses SHAKE128 as a source of pseudorandom samples[1], the the current API requires the total number of bytes are determined prior to the call, and after the call the hash context is reset.
I vaguely recall discussing that when shake256 was added, and we concluded it was good enough as a start, and could be extended later.
I think it would be nice if one could support the streaming case with the existing struct sha3_256_ctx, and little extra wrapping. Question is what the interface should be. I see a few variants:
void /* Essentially the same as _sha3_pad_shake */ sha3_256_shake_start (struct sha3_256_ctx *ctx);
void /* Unbuffered, length must be a multiple of SHA3_256_BLOCK_SIZE */ sha3_256_shake_output (struct sha3_256_ctx *ctx size_t length, uint8_t *dst);
void /* Last call, length can be arbitrary, context reinitialized */ sha3_256_shake_end (struct sha3_256_ctx *ctx size_t length, uint8_t *dst);
Requiring all calls but the last to be full blocks is consistent with nettle's funtions for block ciphers. But since we anyway have a buffer available (to support arbitrary sizes for streaming the input), we could perhaps just as well reuse that buffer.
void /* Essentially the same as _sha3_pad_shake */ sha3_256_shake_start (struct sha3_256_ctx *ctx);
void /* Arbitrary length, no need to signal end of data */ sha3_256_shake_output (struct sha3_256_ctx *ctx size_t length, uint8_t *dst);
void /* Explicit init call needed to start a new input message */ sha3_256_init (struct sha3_256_ctx *ctx);
In this case, sha3_256_shake_output would use ctx->index and ctx->buffer for partial blocks.
With some hacking (say, using the unused high bit of ctx->index to signal that shake is in output mode), then we could have just
void /* Arbitrary length, no need to signal start or end of output */ sha3_256_shake_output (struct sha3_256_ctx *ctx size_t length, uint8_t *dst);
void /* Explicit init call needed to start a new input message */ sha3_256_init (struct sha3_256_ctx *ctx);
As always, naming is also a crucial question. Is _shake_output a good name? Or _shake_read, or _shake_generate? From the terminology in the spec (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf), I think "_shake_output" is reasonable.
When deciding on naming and conventions, we should strive to define somthing that can be reused for later hash functions with variable output size (called extendable-output functions, "XOF", in the spec).
So what do you think makes most sense?
Thank you. The option (3) sounds like a great idea as it only need one more function to be added for streaming. I tried to implement it as the attached patch.
Regards,