Simon Josefsson simon@josefsson.org writes:
I don't know of any reason today, e.g., no important Internet applications use it. It has some interesting properties though, so maybe there is uptake.
It seems the main selling point is speed. Internally, it seems a bit unorthodox,
* hash function in counter mode (when everybody seems to move to block ciphers in counter mode, the invertibility of the underlying block cipher is an irrelevant feature).
* No sboxes or other "large" non-linearities. Just a mix of xor and mod 2^32 add, and ten rotations to speed up diffusion. Everybody else seems to think non-linear sboxes (or things like the majority function in sha hashes) are essential. Possibly with the exception of idea? Which uses similar primitives, iirc, but with mod (2^16 + 1) multiplies instead of simple rotations to help the bit diffusion.
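To make the "add, xor, rotate" point concrete, here is a small sketch of the Salsa20 quarter-round as I read it from the public specification. The function names are mine, for illustration only, and not taken from the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; only valid for 0 < n < 32. */
static uint32_t
rotl32(uint32_t x, unsigned n)
{
  assert(n > 0 && n < 32);
  return (x << n) | (x >> (32 - n));
}

/* Salsa20 quarter-round on four words, in place.  The only operations
   are addition mod 2^32, xor, and rotation by a fixed count; there are
   no table lookups.  (Illustrative name, not from the patch.) */
static void
quarterround_sketch(uint32_t y[4])
{
  y[1] ^= rotl32(y[0] + y[3], 7);
  y[2] ^= rotl32(y[1] + y[0], 9);
  y[3] ^= rotl32(y[2] + y[1], 13);
  y[0] ^= rotl32(y[3] + y[2], 18);
}
```

Each step feeds the word just updated into the next addition, which is how a single input bit spreads to all four words within one quarter-round.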
Somewhat, although I think the distinction between stream ciphers and block ciphers is not a good one to make in an API. Usually applications want to use ciphers in some mode, and some modes used with block ciphers make them essentially stream ciphers. I think a good API should be modeled around "symmetric encryption" as the concept.
Such an API would make some sense, but I don't think it's a substitute for a block cipher API. E.g., in ssh, the application has to be aware of the block size, and the message padding is application specific (isn't it the same in tls?). An API for encrypting a stream would not be of much help here. And when using cbc, I think it's not even possible.
Including the arcfour stream cipher there may well be a mistake. One annoying problem with the current design is that for block ciphers, the ctx argument of the encrypt and decrypt functions naturally is const, but due to arcfour being fitted in the same nettle_cipher abstraction, the function typedef nettle_crypt_func can't use a const ctx.
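A toy sketch of the constness problem, in case it helps. The typedefs and the two "ciphers" below are hypothetical and only illustrate the shape of the issue; they are not the actual Nettle declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical typedefs, for illustration only. */
typedef void block_crypt_func(const void *ctx, size_t length,
                              uint8_t *dst, const uint8_t *src);
typedef void stream_crypt_func(void *ctx, size_t length,
                               uint8_t *dst, const uint8_t *src);

/* Toy "block cipher": the key schedule is only read during encryption. */
struct toy_block_ctx { uint8_t key; };

static void
toy_block_encrypt(const void *ctx, size_t length,
                  uint8_t *dst, const uint8_t *src)
{
  const struct toy_block_ctx *c = ctx;
  size_t i;
  assert(c != NULL);
  for (i = 0; i < length; i++)
    dst[i] = src[i] ^ c->key;
}

/* Toy "stream cipher": the position advances on every call, so the
   context has to be writable and cannot match block_crypt_func. */
struct toy_stream_ctx { uint8_t key; uint8_t pos; };

static void
toy_stream_crypt(void *ctx, size_t length,
                 uint8_t *dst, const uint8_t *src)
{
  struct toy_stream_ctx *c = ctx;
  size_t i;
  assert(c != NULL);
  for (i = 0; i < length; i++)
    dst[i] = src[i] ^ (uint8_t) (c->key + c->pos++);
}
```

A single function typedef covering both kinds has to take the weaker, non-const form, which is the annoyance described above.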
I always found rc4 the odd cipher in nettle.
That the nettle_cipher abstraction treats the stream ciphers (arcfour being the only supported one) as a block cipher with block size zero goes back to my and Henrik Grubbström's design for Pike's crypto library, some 15 years ago...
Is that what you're finding odd, or is there anything else which is strange with arcfour?
Then there's also the tentative (nettle-internal.h) nettle_aead abstraction. Salsa20 could perhaps fit there, if we allow algorithms with no authentication (NULL digest function pointer).
Possibly... or just have one abstract "symmetric encryption" that embodies all these variants. Or does that lead to other disadvantages?
In general, there's not much point in having a general interface if the application has to query particulars before using it (e.g., does this mechanism provide any authentication, or do I need to combine it with some other MAC?). So I don't think what I sketched is a good idea.
--- /dev/null
+++ b/salsa20.c
+#define ROTL32(x,n) ((((x))<<(n)) | (((x))>>(32-(n))))
There are several different variants of that macro. It would be nice with a unified one in macros.h.
I agree.
Done.
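For readers following along, one possible shape for such a unified macro is sketched below. This is an assumption for illustration and not necessarily the definition that went into macros.h; in particular the argument order and the casts are my own choices here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unified rotate macros, valid only for 0 < n < 32.
   The casts keep a wider argument type (e.g. a 64-bit unsigned long)
   from leaking bits above bit 31 into the result. */
#define ROTL32(x, n) \
  ((uint32_t) (((uint32_t) (x) << (n)) | ((uint32_t) (x) >> (32 - (n)))))
#define ROTR32(x, n) ROTL32((x), 32 - (n))

/* Small usage check: the top bit wraps around to bit 0. */
static void
rotl32_selfcheck(void)
{
  assert(ROTL32(0x80000001UL, 1) == 0x00000003UL);
}
```

A rotation count of 0 or 32 would shift by the full word width, which is undefined in C, so callers are assumed to pass constants strictly between those.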
+#define SWAP32(v) \
+  ((ROTL32(v, 8) & 0x00FF00FFUL) | \
+   (ROTL32(v, 24) & 0xFF00FF00UL))
[...]
That's a clever byte swapping trick (at least if a true rot instruction is available). In Nettle conversion between bytes and integers is usually done with READ_UINT32, LE_READ_UINT32 and friends. It's usually not very performance critical, and it deals naturally with any alignment. I suspect U8TO32_LITTLE above breaks if the input is unaligned but the architecture doesn't allow unaligned word reads.
Using READ_UINT32 etc is probably better.
I ended up keeping that kind of byte swapping, to be able to stick to word accesses to memory until the final plaintext/cryptotext xor.
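To spell out the two approaches side by side, here is a sketch. SWAP32 follows the quoted patch; le_read_u32 is a hypothetical helper I made up for this mail, playing the role of a byte-wise reader in the style of LE_READ_UINT32, not the real macro.

```c
#include <assert.h>
#include <stdint.h>

#define ROTL32(x, n) \
  ((uint32_t) (((uint32_t) (x) << (n)) | ((uint32_t) (x) >> (32 - (n)))))

/* Byte swap using two rotations and two masks, as in the quoted patch:
   rotating by 8 puts bytes 0 and 2 in their swapped places, rotating
   by 24 does the same for bytes 1 and 3. */
#define SWAP32(v) \
  ((ROTL32((v), 8) & 0x00FF00FFUL) | \
   (ROTL32((v), 24) & 0xFF00FF00UL))

/* Hypothetical byte-wise little-endian read.  It never does a word
   load, so it works for any alignment of p. */
static uint32_t
le_read_u32(const uint8_t *p)
{
  assert(p != 0);
  return ((uint32_t) p[3] << 24) | ((uint32_t) p[2] << 16)
    | ((uint32_t) p[1] << 8) | (uint32_t) p[0];
}
```

The swap variant only makes sense on words that are already loaded, so it is the word load itself, not the swap, that needs suitably aligned input.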
I put the majority of my changes above that header, to make it easy to sync and compare the code
I'm afraid that's not easy any more... I made quite a lot of changes.
- Try an sse2 assembly implementation (djb's papers outline how to do that). Or copy some existing implementation.
Take a look at the link above, most likely there exists something. I'm not sure how important it is though.
I had a look, but I found that assembly code hard to read (apparently automatically generated by some tool of djb's).
- One advertised feature of the cipher is random access. I think we should have something like a salsa20_set_pos, taking a block count as argument.
Yes.
Suggestions for name?
I think set_iv should still set the count to zero, so you need to use the new function only if you want to do seeks in the data.
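A rough sketch of what I have in mind, assuming the context keeps the 16-word Salsa20 input block with the nonce in words 6-7 and the 64-bit block counter in words 8-9 (the layout in the public specification). The struct and function names are placeholders, not a proposal for the final names.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder context: only the 16-word input block matters here. */
struct salsa20_sketch_ctx
{
  uint32_t input[16];
};

/* Byte-wise little-endian read of one 32-bit word. */
static uint32_t
sketch_le32(const uint8_t *p)
{
  return ((uint32_t) p[3] << 24) | ((uint32_t) p[2] << 16)
    | ((uint32_t) p[1] << 8) | (uint32_t) p[0];
}

/* Set the 8-byte nonce and reset the block counter to zero. */
static void
salsa20_sketch_set_iv(struct salsa20_sketch_ctx *ctx, const uint8_t *iv)
{
  assert(ctx != 0 && iv != 0);
  ctx->input[6] = sketch_le32(iv);
  ctx->input[7] = sketch_le32(iv + 4);
  ctx->input[8] = 0;
  ctx->input[9] = 0;
}

/* Seek to a given 64-byte block; needed only for random access. */
static void
salsa20_sketch_set_pos(struct salsa20_sketch_ctx *ctx, uint64_t block)
{
  assert(ctx != 0);
  ctx->input[8] = (uint32_t) (block & 0xffffffffUL);
  ctx->input[9] = (uint32_t) (block >> 32);
}
```

Since each block produces 64 bytes of key stream, a byte offset into the data corresponds to block number offset / 64, and an application doing plain sequential encryption never has to call the second function.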
Do any of you know of any protocols which specify use of salsa20? Is it usually combined with some *fast* MAC algorithm?
I suspect people who like Salsa are inclined to also like CubeHash, which could be used in HMAC variants. CubeHash is fast with optimistic parameters, but the "default" is pretty conservative making it not so fast.
Maybe it's possible to do something in the style of ocb, to get authentication cheaply as a side effect (haven't looked very closely at ocb, though, and ocb itself seems to be unusable in Nettle due to patents). Maybe I should mail djb and ask.
Regards, /Niels