On 29/01/2015 9:08 p.m., Niels Möller wrote:
> Amos Jeffries writes:
>
>> The attached patch implements a proposed API/ABI extension adding
>> support for RFC 4648 section 5, "Base 64 Encoding with URL and
>> Filename Safe Alphabet".
>
> Thanks. I have some comments below. Also, test cases are essential
> (and I suspect that the definition of base64_encode_group is not
> really accepted by the compiler).
I have added a fuzz tester that runs a set of randomly generated
strings through the default encoder and decoder, then through the
extended one.

I am using rand() to generate 1020 input octets on each loop, then for
each alphabet: encode, checking that the output length matches the
expected encoded string length; decode, checking for success; then
compare the round-trip output against the original input. If you have a
better random generator you would like me to use, please let me know
where to find a code example or docs for it.
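In outline, each iteration does roughly the following. This is only a
sketch: the base64url_encode_init/base64url_decode_init names stand in
for the patch's per-alphabet init functions, and I am assuming the
existing streaming update/final calls and length macros from
nettle/base64.h with their current signatures.

  #include <assert.h>
  #include <stdlib.h>
  #include <string.h>
  #include <nettle/base64.h>

  #define FUZZ_LEN 1020  /* A multiple of 3, so no '=' padding. */

  static void
  fuzz_one_iteration(void)
  {
    struct base64_encode_ctx enc;
    struct base64_decode_ctx dec;
    uint8_t in[FUZZ_LEN];
    uint8_t b64[BASE64_ENCODE_LENGTH(FUZZ_LEN) + BASE64_ENCODE_FINAL_LENGTH];
    uint8_t out[BASE64_DECODE_LENGTH(sizeof(b64))];
    size_t b64_len, out_len = 0;
    size_t i;

    for (i = 0; i < FUZZ_LEN; i++)
      in[i] = rand() & 0xff;

    /* Per-alphabet init as proposed in the patch (name illustrative). */
    base64url_encode_init(&enc);
    b64_len = base64_encode_update(&enc, b64, FUZZ_LEN, in);
    b64_len += base64_encode_final(&enc, b64 + b64_len);
    assert(b64_len == BASE64_ENCODE_RAW_LENGTH(FUZZ_LEN));

    base64url_decode_init(&dec);
    assert(base64_decode_update(&dec, &out_len, out, b64_len, b64));
    assert(base64_decode_final(&dec));

    /* The round trip must reproduce the input exactly. */
    assert(out_len == FUZZ_LEN);
    assert(memcmp(out, in, FUZZ_LEN) == 0);
  }

The default alphabet goes through the same steps with the existing
base64_encode_init/base64_decode_init.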
I've tested the unit test itself in a few ways:

* eyeball comparison of several base64 vs base64url encoded strings to
  verify they match modulo the different alphabet characters.
* eyeball check to ensure the fuzz input does include the characters
  unique to each alphabet fairly often.
* inverting test assertions to verify the checks are asserting on the
  right boundary conditions.
* growing the fuzz length and count from small to large values,
  checking for consistent pass/fail.
* seeding rand() with different fixed values, and leaving the default
  seed value.
The test and the coders seem to be working fine, except that when I
increase the *count* of fuzz test runs it asserts, regular as
clockwork, on the 950th loop:
base64-test: base64-encode.c:96: _base64_encode_raw: Assertion `in == src' failed.
>> For the standardized alphabets, external code simply calls the
>> init() function relevant to the alphabet it needs to encode/decode
>> with. The library internally uses the context to select which lookup
>> table to use for later base64 function calls.
>
> Or we could introduce a generalized init function, if that need
> arises.
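Roughly, the per-alphabet inits amount to this sketch (all names here
are illustrative rather than necessarily the patch's actual ones):

  /* The two tables differ only in characters 62 and 63 ('+/' vs '-_'). */
  static const uint8_t encode_table[64] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    "0123456789+/";
  static const uint8_t encode_url_table[64] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    "0123456789-_";

  /* struct base64_encode_ctx gains: const uint8_t *alphabet; */

  void
  base64_encode_init(struct base64_encode_ctx *ctx)
  {
    ctx->word = ctx->bits = 0;       /* Clear leftover-bit state. */
    ctx->alphabet = encode_table;
  }

  void
  base64url_encode_init(struct base64_encode_ctx *ctx)
  {
    ctx->word = ctx->bits = 0;
    ctx->alphabet = encode_url_table;
  }

Later calls then pick up ctx->alphabet, which is what the ENCODE change
below does.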
>> -#define ENCODE(x) (encode_table[0x3F & (x)])
>> +#define ENCODE(x) (ctx->alphabet[0x3F & (x)])
>
> I think
>
>   #define ENCODE(alphabet, x) ((alphabet)[0x3F & (x)])
>
> is better, not relying on a "ctx" variable being defined where the
> macro is used.
>>  void
>>  base64_encode_raw(uint8_t *dst, size_t length, const uint8_t *src)
>>  {
>> -  struct base64_encode_ctx ctx;
>> -  base64_encode_init(&ctx);
>> -  _base64_encode_raw(&cxt, dst, length, src);
>> +}
> I think it would be better to let _base64_encode_raw take the alphabet
> as an argument. base64_encode_raw would then be simply
>
>   void
>   base64_encode_raw(uint8_t *dst, size_t length, const uint8_t *src)
>   {
>     _base64_encode_raw(default_encode_table, dst, length, src);
>   }
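If I read that right, the change amounts to roughly the following toy
sketch (forward-filling for brevity; as far as I can tell the real
_base64_encode_raw fills the buffers from the end, which is where the
`in == src' assertion above comes from):

  #include <stddef.h>
  #include <stdint.h>

  /* The two-argument macro suggested above. */
  #define ENCODE(alphabet, x) ((alphabet)[0x3F & (x)])

  static void
  _base64_encode_raw(const uint8_t *alphabet,
                     uint8_t *dst, size_t length, const uint8_t *src)
  {
    size_t i;

    /* Each 3 input octets form one 24-bit group, emitted as four
       6-bit symbols looked up in the caller-supplied alphabet. */
    for (i = 0; i + 3 <= length; i += 3)
      {
        uint32_t group = ((uint32_t) src[i] << 16)
          | ((uint32_t) src[i + 1] << 8)
          | src[i + 2];
        *dst++ = ENCODE(alphabet, group >> 18);
        *dst++ = ENCODE(alphabet, group >> 12);
        *dst++ = ENCODE(alphabet, group >> 6);
        *dst++ = ENCODE(alphabet, group);
      }
    /* Tail of 1 or 2 leftover octets plus '=' padding omitted here. */
  }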
>> @@ -144,6 +111,7 @@ void
>>  base64_encode_group(uint8_t *dst, uint32_t group)
>>  {
>> -  struct base64_encode ctx;
>> -  base64_encode_init(&ctx);
>> -  *dst++ = ENCODE(group >> 18);
>> -  *dst++ = ENCODE(group >> 12);
>> -  *dst++ = ENCODE(group >> 6);
>
> And similarly here, I think it would be better to not use any ctx
> struct, and instead do
>
>   *dst++ = ENCODE(default_encode_table, group >> 18);
>   *dst++ = ENCODE(default_encode_table, group >> 12);
>   *dst++ = ENCODE(default_encode_table, group >> 6);
>
> (And this function is of questionable utility...)

Okay, done all that.
Do you wish me to remove base64_encode_group entirely?
Cheers
Amos