On Wed, Nov 03, 2004 at 04:50:02AM +0100, Martin Nilsson (DivX Networks) @ Pike (-) developers forum wrote:
> If I UTF8-encode a string three times and UTF8-decode a string three times I expect to get the original string.
Right, but if (in the case of SQLite, for example) I pass a UTF8 string to big_query(), it will be encoded a second time, so the value in the database will be incorrect (and longer than the original) from the point of view of an external application.
If I use bound parameters, everything is OK (no conversion is done for 8-bit strings), but the query text itself may be a UTF8 string, and it will be converted by the unconditional call to f_string_to_utf8().
Basically, any call to f_string_to_utf8() on a string that is already UTF8 encodes it again. The value will (obviously) be decoded correctly as long as only Pike is used, since Pike always decodes it, but anything other than Pike that reads values stored by Pike may fail.
Example:
  ...
  query = string_to_utf8("INSERT INTO x VALUES('\x1234')");
  ... 100 lines later
  big_query(query)->fetch_row();
So now we have an incorrectly encoded value in the table. An external application will read it and do no conversion, since it expects the value to be in UTF8 already. The stored bytes are (still) valid UTF8, but they no longer represent the original character, so the result makes no sense.
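
To make the effect concrete, here is a minimal sketch in plain Pike, without any database. It assumes that the second string_to_utf8() call stands in for the unconditional encode done inside big_query(); the variable names are only for illustration.

```pike
// Sketch only: models double UTF8 encoding with the character from the
// example above. No SQLite involved.
int main()
{
  string original = "\x1234";               // one wide character
  string once  = string_to_utf8(original);  // what the caller already did
  string twice = string_to_utf8(once);      // assumed: the unconditional second encode

  write("once:  %d bytes\n", sizeof(once));
  write("twice: %d bytes\n", sizeof(twice));

  // Pike undoes both layers, so the value round-trips inside Pike.
  write("double decode matches: %d\n",
        utf8_to_string(utf8_to_string(twice)) == original);

  // A reader that decodes only once does not get the original back.
  write("single decode matches: %d\n",
        utf8_to_string(twice) == original);
  return 0;
}
```

The single encode gives 3 bytes and the double encode gives 6, which is the "longer than original" value an external application would see. Only the double decode compares equal to the original string.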
Regards, /Al