Re: additional API for SHAKE streaming read

10 Mar 2024


Hello Niels,
Niels Möller nisse@lysator.liu.se writes:
...
Daiki Ueno ueno@gnu.org writes:
...
While trying to implement ML-KEM (Kyber), I realized that the current
API for SHAKE (sha3_256_shake) is a bit too limited: while ML-KEM uses
SHAKE128 as a source of pseudorandom samples[1], the current API
requires that the total number of output bytes be determined prior to
the call, and after the call the hash context is reset.
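To illustrate why the output length cannot be fixed up front, here is a minimal sketch of the kind of rejection sampling ML-KEM performs on XOF output. The callback type xof_read_func, the function sample_uniform, and the counter_read byte source are hypothetical names invented for this example; counter_read is a deterministic stand-in, not a real XOF, and none of this is nettle API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MLKEM_Q 3329   /* ML-KEM modulus */
#define MLKEM_N 256    /* coefficients per polynomial */

/* Hypothetical streaming-read callback: writes `length` more output
   bytes to dst, continuing where the previous call stopped. */
typedef void xof_read_func (void *ctx, size_t length, uint8_t *dst);

/* Stand-in byte source for demonstration only (NOT a real XOF):
   emits 0, 1, 2, ... wrapping at 256. */
struct counter_ctx { uint8_t next; };

void
counter_read (void *ctx, size_t length, uint8_t *dst)
{
  struct counter_ctx *c = ctx;
  size_t i;
  for (i = 0; i < length; i++)
    dst[i] = c->next++;
}

/* Fill coeffs[0..MLKEM_N-1] by rejection sampling. Returns the number
   of bytes consumed, which depends on the data itself. */
size_t
sample_uniform (void *ctx, xof_read_func *f, uint16_t *coeffs)
{
  size_t n = 0, used = 0;
  while (n < MLKEM_N)
    {
      uint8_t b[3];
      uint16_t d1, d2;
      f (ctx, sizeof (b), b);
      used += sizeof (b);
      /* Two 12-bit candidates from each three-byte group. */
      d1 = b[0] | ((uint16_t) (b[1] & 0x0f) << 8);
      d2 = (b[1] >> 4) | ((uint16_t) b[2] << 4);
      if (d1 < MLKEM_Q)
        coeffs[n++] = d1;
      if (d2 < MLKEM_Q && n < MLKEM_N)
        coeffs[n++] = d2;
    }
  assert (n == MLKEM_N);
  return used;
}
```

Since candidates of 3329 or more are discarded, the loop needs at least 384 bytes and possibly more, so the caller has to be able to keep reading from the same context.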
I vaguely recall discussing that when shake256 was added, and we
concluded it was good enough as a start, and could be extended later.
I think it would be nice if one could support the streaming case with
the existing struct sha3_256_ctx, and little extra wrapping. Question is
what the interface should be. I see a few variants:


void /* Essentially the same as _sha3_pad_shake */
  sha3_256_shake_start (struct sha3_256_ctx *ctx);
void /* Unbuffered, length must be a multiple of SHA3_256_BLOCK_SIZE */
  sha3_256_shake_output (struct sha3_256_ctx *ctx,
                         size_t length, uint8_t *dst);
void /* Last call, length can be arbitrary, context reinitialized */
  sha3_256_shake_end (struct sha3_256_ctx *ctx,
                      size_t length, uint8_t *dst);
Requiring all calls but the last to be full blocks is consistent with
nettle's functions for block ciphers. But since we have a buffer
available anyway (to support arbitrary sizes for streaming the input),
we could perhaps just as well reuse that buffer.


void /* Essentially the same as _sha3_pad_shake */
  sha3_256_shake_start (struct sha3_256_ctx *ctx);
void /* Arbitrary length, no need to signal end of data */
  sha3_256_shake_output (struct sha3_256_ctx *ctx,
                         size_t length, uint8_t *dst);
void /* Explicit init call needed to start a new input message */
  sha3_256_init (struct sha3_256_ctx *ctx);
In this case, sha3_256_shake_output would use ctx->index and ctx->buffer
for partial blocks.
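A rough sketch of how such buffered output could work is given here. It is a toy model, not nettle code: struct toy_ctx, toy_permute, toy_shake_start and toy_shake_output are hypothetical names, the block size is shrunk to 8 bytes, and a simple 64-bit mixing step stands in for the Keccak permutation. Only the handling of index and buffer is meant to carry over.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK_SIZE 8   /* stand-in for SHA3_256_BLOCK_SIZE */

/* Hypothetical toy context with only the fields discussed above. */
struct toy_ctx
{
  uint64_t state;                  /* stand-in for the Keccak state */
  unsigned index;                  /* bytes of buffer already handed out */
  uint8_t buffer[TOY_BLOCK_SIZE];  /* current output block */
};

/* Toy mixing step; NOT Keccak, for illustration only. */
static void
toy_permute (struct toy_ctx *ctx)
{
  ctx->state = ctx->state * 6364136223846793005ULL + 1442695040888963407ULL;
}

/* Switch to output mode. The seed stands in for the absorbed and padded
   input; index == TOY_BLOCK_SIZE marks the buffer as empty. */
void
toy_shake_start (struct toy_ctx *ctx, uint64_t seed)
{
  ctx->state = seed;
  ctx->index = TOY_BLOCK_SIZE;
}

/* Arbitrary length; leftover bytes stay in ctx->buffer for the next call. */
void
toy_shake_output (struct toy_ctx *ctx, size_t length, uint8_t *dst)
{
  assert (ctx->index <= TOY_BLOCK_SIZE);
  while (length > 0)
    {
      size_t n;
      if (ctx->index == TOY_BLOCK_SIZE)
        {
          /* Buffer exhausted: generate the next block. */
          unsigned i;
          toy_permute (ctx);
          for (i = 0; i < TOY_BLOCK_SIZE; i++)
            ctx->buffer[i] = (uint8_t) (ctx->state >> (8 * i));
          ctx->index = 0;
        }
      n = TOY_BLOCK_SIZE - ctx->index;
      if (n > length)
        n = length;
      memcpy (dst, ctx->buffer + ctx->index, n);
      ctx->index += (unsigned) n;
      dst += n;
      length -= n;
    }
}
```

The property that matters is that splitting a read into several calls yields the same bytes as one large call, since the partial block is kept in the context between calls.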
With some hacking (say, using the unused high bit of ctx->index to
signal that shake is in output mode), we could have just


void /* Arbitrary length, no need to signal start or end of output */
  sha3_256_shake_output (struct sha3_256_ctx *ctx,
                         size_t length, uint8_t *dst);
void /* Explicit init call needed to start a new input message */
  sha3_256_init (struct sha3_256_ctx *ctx);
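The high-bit trick might look roughly like the following toy model. Again this is only a sketch under stated assumptions: all toy_* names and TOY_OUTPUT_FLAG are hypothetical, the mixing step is not Keccak, input is absorbed unbuffered for brevity, and the real padding step is reduced to a comment. The point is how a single output function can detect its own first call.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK_SIZE 8
/* High bit of the index field, never needed for a real offset. */
#define TOY_OUTPUT_FLAG (UINT_MAX ^ (UINT_MAX >> 1))

struct toy_ctx
{
  uint64_t state;                  /* stand-in for the Keccak state */
  unsigned index;                  /* 0, or TOY_OUTPUT_FLAG | output offset */
  uint8_t buffer[TOY_BLOCK_SIZE];
};

/* Toy mixing step; NOT Keccak, for illustration only. */
static void
toy_permute (struct toy_ctx *ctx)
{
  ctx->state = ctx->state * 6364136223846793005ULL + 1442695040888963407ULL;
}

void
toy_init (struct toy_ctx *ctx)
{
  memset (ctx, 0, sizeof (*ctx));
}

void
toy_update (struct toy_ctx *ctx, size_t length, const uint8_t *src)
{
  size_t i;
  /* Absorbing after output has started requires an explicit toy_init. */
  assert (!(ctx->index & TOY_OUTPUT_FLAG));
  for (i = 0; i < length; i++)
    {
      ctx->state ^= src[i];
      toy_permute (ctx);
    }
}

void
toy_shake_output (struct toy_ctx *ctx, size_t length, uint8_t *dst)
{
  unsigned offset;
  if (!(ctx->index & TOY_OUTPUT_FLAG))
    /* First output call: real code would pad here. Mark buffer empty. */
    ctx->index = TOY_OUTPUT_FLAG | TOY_BLOCK_SIZE;
  offset = ctx->index & ~TOY_OUTPUT_FLAG;
  while (length > 0)
    {
      size_t n;
      if (offset == TOY_BLOCK_SIZE)
        {
          unsigned i;
          toy_permute (ctx);
          for (i = 0; i < TOY_BLOCK_SIZE; i++)
            ctx->buffer[i] = (uint8_t) (ctx->state >> (8 * i));
          offset = 0;
        }
      n = TOY_BLOCK_SIZE - offset;
      if (n > length)
        n = length;
      memcpy (dst, ctx->buffer + offset, n);
      offset += (unsigned) n;
      dst += n;
      length -= n;
    }
  ctx->index = TOY_OUTPUT_FLAG | offset;
}
```

With this layout the caller needs no start or end call: the first toy_shake_output call flips the flag, later calls continue from the stored offset, and toy_init clears the flag before a new message.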
As always, naming is also a crucial question. Is _shake_output a good
name? Or _shake_read, or _shake_generate? From the terminology in the
spec (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf), I think
"_shake_output" is reasonable.
When deciding on naming and conventions, we should strive to define
something that can be reused for later hash functions with variable
output size (called extendable-output functions, "XOF", in the spec).
So what do you think makes most sense?
Thank you.  Option (3) sounds like a great idea, as it needs only one
more function to be added for streaming.  I tried to implement it in the
attached patch.
Regards,
-- 
Daiki Ueno
