Hi Niels, please find attached two patches; comments inline.
On Tue, 2019-03-19 at 07:31 +0100, Niels Möller wrote:
Simo Sorce <simo@redhat.com> writes:
New patch attached; the diff has also been applied as an additional commit to my xts_exploded tree in gitlab.
Thanks. Looks pretty good, a few more comments below.
I didn't look so closely at the tests, but it would be good to test in-place operation also with the aes-specific functions.
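Will do. E.g., a known-answer check where src == dst, roughly like this (a sketch only; key/tweak/ptext/ctext/MSG_SIZE stand in for a test vector, ASSERT/MEMEQ as in the testsuite):

  struct xts_aes128_ctx ctx;
  uint8_t buf[MSG_SIZE];

  xts_aes128_set_encrypt_key(&ctx, key);
  memcpy(buf, ptext, MSG_SIZE);
  /* src == dst exercises in-place operation of the aes-specific function */
  xts_aes128_encrypt_message(&ctx, tweak, MSG_SIZE, buf, buf);
  ASSERT(MEMEQ(MSG_SIZE, buf, ctext));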
For performance (which we don't need to focus that much on for the initial version), one thing to do would be to fill a buffer with consecutive T_k values, and do the encrypt/decrypt and memxor operations on several blocks at a time.
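Right, something along these lines might work later on (a sketch of the idea only; XTS_STRIDE, buf and the enclosing loop are made up, T/xts_shift/memxor/memxor3 as in the patch):

  #define XTS_STRIDE 8   /* illustrative batch size */

  union nettle_block16 Ts[XTS_STRIDE];
  uint8_t buf[XTS_STRIDE * XTS_BLOCK_SIZE];
  size_t i;

  /* fill a buffer with consecutive T_k values */
  Ts[0] = T;
  for (i = 1; i < XTS_STRIDE; i++)
    xts_shift(&Ts[i], &Ts[i-1]);

  /* one memxor3/encf/memxor call per strip instead of per block */
  memxor3(buf, src, (const uint8_t *) Ts, sizeof(buf));
  encf(enc_ctx, sizeof(buf), dst, buf);
  memxor(dst, (const uint8_t *) Ts, sizeof(buf));

  /* first tweak of the next strip; tail blocks handled as before */
  xts_shift(&T, &Ts[XTS_STRIDE-1]);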
+For a plaintext length that is a perfect multiple of the XTS block size:
+@example
+T_1 = E_k2(IV) MUL a^0
+C_1 = E_k1(P_1 XOR T_1) XOR T_1
+@dots{}
+T_n = E_k2(IV) MUL a^(n-1)
+C_n = E_k1(P_n XOR T_n) XOR T_n
+@end example
Nit: I'd write the T updates as
  T_1 = E_k2(IV)
  T_k = T_(k-1) MUL a
since this is how to implement it, and no powering operation is needed.
Ok, changed it.
+The functions @var{encf} @var{decf} are of type
+@code{void f (void *@var{ctx}, size_t @var{length}, uint8_t *@var{dst},
+const uint8_t *@var{src})},
It seems the manual doesn't really have an entry for the nettle_cipher_func type. Maybe it should. Anyway, the type for the first argument is _const_ void *.
Fixed.
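For reference, the declaration in nettle-types.h is:

  typedef void nettle_cipher_func(const void *ctx,
                                  size_t length, uint8_t *dst,
                                  const uint8_t *src);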
+/* shift one and XOR with 0x87. */
+/* src and dest can point to the same buffer for in-place operations */
+static void
+xts_shift(union nettle_block16 *dst,
+          const union nettle_block16 *src)
+{
+  uint8_t carry = src->b[15] >> 7;
+  uint64_t b0 = LE_READ_UINT64(src->b);
+  uint64_t b1 = LE_READ_UINT64(src->b+8);
+  b1 = (b1 << 1) | (b0 >> 63);
+  b0 = b0 << 1;
+  LE_WRITE_UINT64(dst->b, b0);
+  LE_WRITE_UINT64(dst->b+8, b1);
+  dst->b[0] ^= 0x87 & -carry;
+}
Looks right. It can be optimized later on.
Just out of curiosity, would it be ok to change LE_READ_UINT64/LE_WRITE_UINT64 to use uint64_t and have the macro use __bswap64 if available, falling back to the original unoptimized code otherwise?
(__bswap64 already knows how to use __builtin_bswap64 or falls back to similar code as in LE_READ_UINT64)
Or do we need a new macro that expects aligned input/output and leave the existing macro alone, to be used for unaligned input/output ?
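I.e., something like this for the aligned case (a sketch; the function name is made up, WORDS_BIGENDIAN is the usual autoconf macro):

  static inline uint64_t
  le_read_uint64_aligned(const uint64_t *p)
  {
  #ifdef WORDS_BIGENDIAN
    return __builtin_bswap64(*p);  /* or __bswap64 where available */
  #else
    return *p;  /* memory is already little-endian */
  #endif
  }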
+/*
+ prev is the block to steal from
+ curr is the input block to the last step
+ length is the partial block length
+ dst is the destination partial block
+ src is the source partial block
+
+ In the encryption case:
+   prev -> the output of the N-1 encryption step
+   curr -> the input to the Nth step (will be encrypted as Cn-1)
+   dst  -> the final Cn partial block
+   src  -> the final Pn partial block
+
+ In the decryption case:
+   prev -> the output of the N-1 decryption step
+   curr -> the input to the Nth step (will be decrypted as Pn-1)
+   dst  -> the final Pn partial block
+   src  -> the final Cn partial block
+ */
+static void
+xts_steal(uint8_t *prev, uint8_t *curr,
+          size_t length, uint8_t *dst, const uint8_t *src)
+{
+  /* copy the remaining bytes into the current input block */
+  memcpy(curr, src, length);
+  /* fill the rest of the current block with the last
+     XTS_BLOCK_SIZE - length bytes of the previous block */
+  memcpy(&curr[length], &prev[length], XTS_BLOCK_SIZE - length);
+
+  /* This must be last or in-place operations will break:
+     copy 'length' bytes of the previous block into the
+     destination block, which is the final partial block
+     returned to the caller */
+  memcpy(dst, prev, length);
+}
This is a bit confusing; first we write to the curr block, using data from src and prev. And next we copy from prev to dst; is that the swapping of the last two blocks? Maybe this could be made simpler, and a copy eliminated, if the main loops were

  for (; length >= 2*XTS_BLOCK_SIZE; ...)

so that the last full block and any trailing partial block are handled together after the loop.
For the arguments, it looks like prev could be const, and curr could be nettle_block16 (but changing the latter is perhaps a bit pointless, since we only treat it as a byte array with no obvious benefits from the alignment).
Ok, I took a stab at removing xts_steal completely in the second patch; let me know what you think. I think I may like it better than my original code, and it uses nettle_block16 for temporary storage to avoid a copy.
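The shape of such a rework is roughly this (a sketch, not the literal second patch: the loop stops while at least two blocks' worth of input remain, so the final full block and any trailing partial block are handled after it without an xts_steal swap; assumes length >= XTS_BLOCK_SIZE, plus xts_shift and memxor/memxor3 as above):

  static void
  xts_encrypt_sketch(const void *enc_ctx, const void *twk_ctx,
                     nettle_cipher_func *encf,
                     const uint8_t *tweak, size_t length,
                     uint8_t *dst, const uint8_t *src)
  {
    union nettle_block16 T, P, C;

    encf(twk_ctx, XTS_BLOCK_SIZE, T.b, tweak);

    for (; length >= 2*XTS_BLOCK_SIZE;
         length -= XTS_BLOCK_SIZE, src += XTS_BLOCK_SIZE, dst += XTS_BLOCK_SIZE)
      {
        memxor3(P.b, src, T.b, XTS_BLOCK_SIZE);   /* P -> PP */
        encf(enc_ctx, XTS_BLOCK_SIZE, dst, P.b);
        memxor(dst, T.b, XTS_BLOCK_SIZE);         /* CC -> C */
        xts_shift(&T, &T);
      }

    /* last full block goes into a temporary, since we may steal from it */
    memxor3(P.b, src, T.b, XTS_BLOCK_SIZE);
    encf(enc_ctx, XTS_BLOCK_SIZE, C.b, P.b);
    memxor(C.b, T.b, XTS_BLOCK_SIZE);
    length -= XTS_BLOCK_SIZE;

    if (length > 0)
      {
        xts_shift(&T, &T);
        /* Pn padded with the tail of Cn-1; src is read before dst is
           written, so in-place operation still works */
        memcpy(P.b, C.b, XTS_BLOCK_SIZE);
        memcpy(P.b, src + XTS_BLOCK_SIZE, length);
        /* Cn is the head of Cn-1 */
        memcpy(dst + XTS_BLOCK_SIZE, C.b, length);
        memxor(P.b, T.b, XTS_BLOCK_SIZE);
        encf(enc_ctx, XTS_BLOCK_SIZE, dst, P.b);
        memxor(dst, T.b, XTS_BLOCK_SIZE);
      }
    else
      memcpy(dst, C.b, XTS_BLOCK_SIZE);
  }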
+static void
+check_length(size_t length, uint8_t *dst)
+{
+  assert(length >= XTS_BLOCK_SIZE);
+  /* asserts may be compiled out; try to save the user by zeroing dst in
+     case the buffer contains sensitive data (like the clear text for
+     in-place encryption) */
+  if (length < XTS_BLOCK_SIZE)
+    memxor(dst, dst, length);
+}
+/* works also for in-place encryption/decryption */
+void
+xts_encrypt_message(const void *enc_ctx, const void *twk_ctx,
+                    nettle_cipher_func *encf,
+                    const uint8_t *tweak, size_t length,
+                    uint8_t *dst, const uint8_t *src)
+{
+  union nettle_block16 T;
+  union nettle_block16 P;
+
+  check_length(length, dst);
+
+  encf(twk_ctx, XTS_BLOCK_SIZE, T.b, tweak);
+
+  /* the zeroth power of alpha is the initial ciphertext value itself, so we
+     skip shifting and do it at the end of each block operation instead */
+  for (; length >= XTS_BLOCK_SIZE;
+       length -= XTS_BLOCK_SIZE, src += XTS_BLOCK_SIZE, dst += XTS_BLOCK_SIZE)
+    {
+      memcpy(P.b, src, XTS_BLOCK_SIZE);
+      memxor(P.b, T.b, XTS_BLOCK_SIZE);  /* P -> PP */
This is what memxor3 is for. (Whether it's more efficient in this case is not entirely obvious, though.)
I did not know about memxor3; it makes the code better, especially in the second patch, I think.
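I.e., the memcpy+memxor pair above becomes a single call:

  memxor3(P.b, src, T.b, XTS_BLOCK_SIZE);  /* P = src XOR T, no separate copy */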
+/* XTS Mode with AES-128 */
+struct xts_aes128_ctx {
+    struct aes128_ctx cipher;
+    struct aes128_ctx tweak_cipher;
+};
Could consider renaming it to xts_aes128_key, somewhat analogous to struct eax_key and struct gcm_key. This represents message-independent data, and then the name xts_aes128_ctx could be used in case we add an interface with a buffer and incremental encryption and decryption. Not clear to me whether that's ever likely to be useful, though.
I did this; the reasoning makes sense to me.
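I.e., the struct quoted above is now (members presumably unchanged):

  /* XTS Mode with AES-128 */
  struct xts_aes128_key {
      struct aes128_ctx cipher;
      struct aes128_ctx tweak_cipher;
  };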
HTH,
Simo.