On Fri, 2019-03-15 at 13:14 +0100, Niels Möller wrote:
Simo Sorce <simo@redhat.com> writes:
The attached patch implements the XTS block cipher mode, as specified in IEEE P1619. The interface is split into a generic pair of functions for encryption and decryption, plus additional AES-128/AES-256 variants.
Thanks. Sorry for the late response.
The function signatures follow the same pattern used by other block-cipher modes like ctr, cfb, ccm, etc.
But it looks like one has to pass the complete message to one call?
Yes, due to ciphertext stealing, XTS needs to know which blocks are the last two, or at the very least it needs to withhold the last processed block so that it can be changed if a final partial block follows. This means inputs and outputs would not be symmetrical, which I felt would make the API somewhat hard to deal with. In general XTS is used for block storage, where the input is always fully available (and relatively small, either around 512 bytes or 4k).
Other modes support incremental encryption (with the requirement that all calls but the last must pass an integral number of blocks). I.e., a calling sequence like
xts_aes128_set_key
xts_aes128_set_iv
xts_aes128_encrypt ...   // 1 or more times
xts_aes128_set_iv        // Start new message
xts_aes128_encrypt ...   // 1 or more times
+The @code{n} plaintext blocks are transformed into @code{n} ciphertext blocks
+@code{C_1},@dots{} @code{C_n} as follows.
+For a plaintext length that is a perfect multiple of the XTS block size:
+@example
+T_1 = E_k2(IV) MUL a^0
+C_1 = E_k1(P_1 XOR T_1) XOR T_1
+@dots{}
+T_n = E_k2(IV) MUL a^(n-1)
+C_n = E_k1(P_n XOR T_n) XOR T_n
+@end example
+For any other plaintext lengths:
+@example
+T_1 = E_k2(IV) MUL a^0
+C_1 = E_k1(P_1 XOR T_1) XOR T_1
+@dots{}
+T_(n-2) = E_k2(IV) MUL a^(n-3)
+C_(n-2) = E_k1(P_(n-2) XOR T_(n-2)) XOR T_(n-2)
+T_(n-1) = E_k2(IV) MUL a^(n-2)
+CC_(n-1) = E_k1(P_(n-1) XOR T_(n-1)) XOR T_(n-1)
+T_n = E_k2(IV) MUL a^(n-1)
+PP = [1..m]P_n | [m+1..128]CC_(n-1)
+C_(n-1) = E_k1(PP XOR T_n) XOR T_n
+C_n = [1..m]CC_(n-1)
+@end example
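For concreteness, the stealing step in those last formulas could be sketched in C roughly as below, assuming byte granularity (a final partial block of m bytes, 0 < m < 16); the function and variable names here are hypothetical and not part of the patch.

#include <string.h>
#include <stdint.h>
#include "nettle-types.h"
#include "memxor.h"

/* Sketch of the ciphertext-stealing step, following the formulas above.
   cc holds CC_(n-1), tweak holds T_n, p_last holds the m-byte P_n. */
static void
xts_steal_sketch(const void *enc_ctx, nettle_cipher_func *encf,
                 const uint8_t *tweak, const uint8_t *cc,
                 const uint8_t *p_last, size_t m,
                 uint8_t *c_prev, uint8_t *c_last)
{
  uint8_t pp[16];

  memcpy(pp, p_last, m);           /* PP = [1..m]P_n ...           */
  memcpy(pp + m, cc + m, 16 - m);  /*      | [m+1..128]CC_(n-1)    */
  memxor(pp, tweak, 16);           /* PP XOR T_n                   */
  encf(enc_ctx, 16, c_prev, pp);   /* E_k1(PP XOR T_n)             */
  memxor(c_prev, tweak, 16);       /* ... XOR T_n gives C_(n-1)    */
  memcpy(c_last, cc, m);           /* C_n = [1..m]CC_(n-1)         */
}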
So the second key, with E_k2, is only ever used to encrypt the IV? If you add a set_iv function, that could do this encryption and only store E_k2(IV).
What would be the advantage? I guess it may make sense if we were to allow calling the encryption function multiple times, but as explained above I am not sure that is desirable. It also risks misuse where people set the same IV for all encryption operations, which would be catastrophic; that could probably be handled by clearing the stored IV when the encryption is finalized.
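As a rough illustration of the idea being discussed, a set_iv could encrypt the IV with the tweak key once and store only the result in the context; the names below are hypothetical and not the patch's interface.

#include <stdint.h>
#include "aes.h"
#include "nettle-types.h"

/* Hypothetical context layout: k1 keys the data cipher, k2 the tweak
   cipher, and T holds the running tweak, starting at E_k2(IV). */
struct xts_aes128_sketch_ctx
{
  struct aes128_ctx cipher;        /* keyed with k1 */
  struct aes128_ctx tweak_cipher;  /* keyed with k2 */
  union nettle_block16 T;
};

/* Encrypt the IV with k2 once, up front, so only E_k2(IV) is stored. */
static void
xts_aes128_sketch_set_iv(struct xts_aes128_sketch_ctx *ctx,
                         const uint8_t *iv)
{
  aes128_encrypt(&ctx->tweak_cipher, AES_BLOCK_SIZE, ctx->T.b, iv);
}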
--- /dev/null
+++ b/xts.c
@@ -0,0 +1,219 @@
[...]
+static void
+xts_shift(uint8_t *T)
+{
+  uint8_t carry;
+  uint8_t i;
+  for (i = 0, carry = 0; i < XTS_BLOCK_SIZE; i++)
+    {
+      uint8_t msb = T[i] & 0x80;
+      T[i] = T[i] << 1;
+      T[i] |= carry;
+      carry = msb >> 7;
+    }
+  if (carry)
+    T[0] ^= 0x87;
+}
I think this is the same as block_mulx, in cmac.c. (Also same byte order, right?)
It looks the same indeed. Should I share it? Just copy it from cmac? Something else?
Since the block size is fixed to 128 bits, I think it makes sense to use the nettle_block16 type for all blocks but the application's src and destination areas. Then we get proper alignment, and can easily use operations on larger units.
Ok.
BTW, for side-channel silence, we should change
  if (carry)
    T[0] ^= 0x87;
to something like
T[0] ^= 0x87 & - carry;
(and similarly for the cmac version).
I can do it for xts.c, and provide a separate patch for cmac.c too, or use a common function for both and handle it there.
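A minimal sketch of that branch-free shift, keeping the byte order of the xts_shift above and using the nettle_block16 type suggested earlier; whether it stays private to xts.c or is shared with cmac.c is left open.

#include <stdint.h>
#include "nettle-types.h"

/* Branch-free multiply by alpha in GF(2^128), same byte order as the
   xts_shift above: the reduction is folded in with a mask instead of
   a conditional branch on secret data. */
static void
xts_shift_sketch(union nettle_block16 *T)
{
  uint8_t carry = 0;
  unsigned i;

  for (i = 0; i < 16; i++)
    {
      uint8_t msb = T->b[i] >> 7;
      T->b[i] = (T->b[i] << 1) | carry;
      carry = msb;
    }
  /* 0x87 & -carry is 0x87 when carry == 1, and 0 when carry == 0. */
  T->b[0] ^= 0x87 & -carry;
}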
+  fblen = length - (length % XTS_BLOCK_SIZE);
+  XTSENC(twk_ctx, T, tweak);
+  /* the zeroth power of alpha is the initial ciphertext value itself, so we
+     skip shifting and do it at the end of each block operation instead */
+  for (i = 0; i < fblen; i += XTS_BLOCK_SIZE)
+    {
In other places, loops like this are often written as
  for (; length >= BLOCK_SIZE;
       length -= BLOCK_SIZE, src += BLOCK_SIZE, dst += BLOCK_SIZE)
Then there's no need for the up-front length % BLOCK_SIZE. It doesn't matter much in this case, since the block size is a constant power of two, but in general, division is quite expensive.
Ok, I can change that.
+      C = &dst[i];
+      XTSCPY(P, &src[i]);
+      XTSXOR(P, T);          /* P -> PP */
+      XTSENC(enc_ctx, C, P); /* CC */
+      XTSXOR(C, T);          /* CC -> C */
I think it would be clearer with encf being an explicit argument to the macros that need it (or maybe do it without the macros, if they expand to only a single call each).
Ok, will drop the macros. They seemed clearer, but now that I am rereading the code I find myself looking at their implementation more often than I thought necessary.
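Putting the two suggestions together, the full-block loop could end up looking roughly like the sketch below, with the macros gone, encf explicit, and the loop counting length down instead of computing an up-front fblen. xts_shift_sketch is the helper sketched earlier, the other names are hypothetical, and the ciphertext-stealing tail is omitted.

#include <stdint.h>
#include "nettle-types.h"
#include "memxor.h"

#define XTS_BLOCK_SIZE 16

static void xts_shift_sketch(union nettle_block16 *T);  /* sketched above */

/* Full-block loop only; handling of a trailing partial block via
   ciphertext stealing is not shown here. */
static void
xts_encrypt_blocks_sketch(const void *enc_ctx, nettle_cipher_func *encf,
                          union nettle_block16 *T,
                          size_t length, uint8_t *dst, const uint8_t *src)
{
  union nettle_block16 P;

  for (; length >= XTS_BLOCK_SIZE;
       length -= XTS_BLOCK_SIZE, src += XTS_BLOCK_SIZE, dst += XTS_BLOCK_SIZE)
    {
      memxor3(P.b, src, T->b, XTS_BLOCK_SIZE);  /* P -> PP              */
      encf(enc_ctx, XTS_BLOCK_SIZE, dst, P.b);  /* PP -> CC             */
      memxor(dst, T->b, XTS_BLOCK_SIZE);        /* CC -> C              */
      xts_shift_sketch(T);                      /* next power of alpha  */
    }
}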
Simo.