If I do any assembler at all, I think I'd prefer a generic x86, at least for a start.
What could you win by using MMX instructions? One could generate some k bytes of the keystream (k = 16 or so), put them into a register, and then xor them all into the source stream at once, assuming source and destination are properly aligned. But most instructions are spent generating the keystream, and it seems non-trivial to get any parallelization there.
(My understanding of MMX, SSE, VIS etc. is quite vague.)
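To make the xor step concrete, here is a rough sketch in plain C with no MMX at all. It assumes the keystream bytes have already been generated into a buffer; the name xor_keystream and its signature are made up for illustration and are not an existing Nettle function. It only shows the "xor a whole word at once" part and does nothing about the keystream generation, which is where most of the time goes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of Nettle: xor LENGTH bytes of an
   already generated keystream with SRC and store the result in DST. */
static void
xor_keystream(uint8_t *dst, const uint8_t *src,
              const uint8_t *keystream, size_t length)
{
  size_t i = 0;

  /* Bulk part: 8 bytes per step. Going through memcpy keeps this
     valid C for any alignment of the three buffers. */
  for (; i + 8 <= length; i += 8)
    {
      uint64_t s, k;
      memcpy(&s, src + i, sizeof(s));
      memcpy(&k, keystream + i, sizeof(k));
      s ^= k;
      memcpy(dst + i, &s, sizeof(s));
    }

  /* Tail: the remaining 0..7 bytes, one at a time. */
  for (; i < length; i++)
    dst[i] = src[i] ^ keystream[i];
}
```

Since each word is read into a local before anything is written, calling it with dst == src (in-place operation) should also be fine.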
/ Niels Möller (sharpening the red pen)
Previous text:
2004-02-05 14:32: Subject: Nettle
It would be possible to use MMX. If nothing else you then have more registers. However, it can only access memory on even 8-byte boundaries.
/ Per Hedbor