I also have a few comments on the code.
Dmitry Eremin-Solenikov dbaryshkov@gmail.com writes:
+void +cfb_encrypt(const void *ctx, nettle_cipher_func *f,
size_t block_size, uint8_t *iv,size_t length, uint8_t *dst,const uint8_t *src)+{
- if (src != dst)
- {
uint8_t *p;for (p = iv; length >= block_size; p = dst, dst += block_size, src += block_size, length -= block_size)- {
f(ctx, block_size, dst, p);memxor(dst, src, block_size);- }
if (p != iv)- memcpy(iv, p, block_size);
if (length)- {
TMP_DECL(buffer, uint8_t, NETTLE_MAX_CIPHER_BLOCK_SIZE);TMP_ALLOC(buffer, block_size);f(ctx, block_size, buffer, iv);memxor3(dst, buffer, src, length);memcpy(iv, dst, length);- }
- }
- else
- {
TMP_DECL(buffer, uint8_t, NETTLE_MAX_CIPHER_BLOCK_SIZE);TMP_ALLOC(buffer, block_size);uint8_t *p;for (p = iv; length >= block_size; p = dst, dst += block_size, length -= block_size)- {
f(ctx, block_size, buffer, p);memxor(dst, buffer, block_size);- }
if (p != iv)- memcpy(iv, p, block_size);
if (length)- {
f(ctx, block_size, buffer, iv);memxor(dst, buffer, length);memcpy(iv, dst, length);- }
- }
+}
It may be nice to reduce the code duplication for the two cases; only the block loop really needs to be different. I think it's fine to move allocation of buffer up front even if it's unnecessary in the case src != dst and there's no left-over; it's only an alloca so it's pretty check. Also unconditionally using memxor3 rather than memxor for the left-over is a small overhead. Do you think those operations are performance critical?
Also for cfb_decrypt it would be nice to take common parts out of the src != dst condition, but not quite obvious how to do it.
+/* Don't allocate any more space than this on the stack */ +#define CFB_BUFFER_LIMIT 512
+void +cfb_decrypt(const void *ctx, nettle_cipher_func *f,
size_t block_size, uint8_t *iv,size_t length, uint8_t *dst,const uint8_t *src)+{
- if (src != dst)
- {
size_t left = length % block_size;
Avoiding % would be nice, division by non-constants is expensive (might apply to the old cbc code too). But perhaps tricky, since for the final part we need to call f with a length which is rounded down to an integral number of blocks.
while (length > 0)- {
uint32_t part = length > buffer_size ? buffer_size : length;
Should be size_t, not uint32_t.
+void +cfb8_encrypt(const void *ctx, nettle_cipher_func *f,
size_t block_size, uint8_t *iv,size_t length, uint8_t *dst,const uint8_t *src)+{
As I wrote in my other mail, I don't think we need a separate function for this. My understanding is that the main use case of CFB8 is when we get one octet at a time to encrypt and transmit, i.e., tis function called with length == 1. If that's right, there's little use to optimize it's performance for larger length.
Regards, /Niels