So your array must end up being encoded as:
"=?us-ascii?q?Hello?= =?iso-8859-1?q?Wor?= =?iso-8859-2?q?ld?= =?us-ascii?q?!?="
That's not correct. By setting the charset for "Hello" to 0, rather than "us-ascii", I have requested that the Hello part is encoded literally, and not as an encoded-word.
or, if you choose to "extend" the charset into the adjacent string (which only works if the charset is a superset of us-ascii):
"=?iso-8859-1?q?HelloWor?= =?iso-8859-2?q?ld!?="
That's not correct either. Since no charset is provided for the "Hello" part, I can't assume it's a subset of <whatever the encoding for "Wor" is>, and I can't even assume that it is a subset of "us-ascii" as you did in the first suggestion.
/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)
Previous text:
2002-12-12 23:46: Subject: Re: incorrect rfc2047 MIME decoding?
In the last episode (Dec 12), Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum said:
Probably the most conformant way is to only eat whitespace between two encoded words. Since the RFC doesn't seem to mention any other kinds of whitespace, the intention might be that they should be left alone.
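To make the rule above concrete, here is a minimal sketch in Python (not Pike, and not the MIME module's actual implementation). The names ENCODED_WORD, decode_word and decode_rfc2047_text are hypothetical. As a simplification, the pattern matches encoded-words anywhere in the string instead of checking that they are delimited by whitespace.

```python
import base64
import re

# One RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def decode_word(match):
    """Decode a single encoded-word match to a str."""
    charset = match.group(1)
    encoding = match.group(2).lower()
    payload = match.group(3)
    if encoding == "b":
        raw = base64.b64decode(payload)
    else:
        # "Q" encoding: "_" stands for a space, "=XX" for the byte 0xXX.
        raw = re.sub(
            rb"=([0-9A-Fa-f]{2})",
            lambda m: bytes([int(m.group(1), 16)]),
            payload.replace("_", " ").encode("ascii"),
        )
    return raw.decode(charset)


def decode_rfc2047_text(text):
    """Decode encoded-words, eating whitespace only between two of them."""
    out = []
    pos = 0
    prev_was_encoded = False
    for m in ENCODED_WORD.finditer(text):
        gap = text[pos:m.start()]
        # Whitespace-only gaps between two encoded-words are dropped;
        # every other gap (including leading text) is kept verbatim.
        if not (prev_was_encoded and gap.strip() == ""):
            out.append(gap)
        out.append(decode_word(m))
        pos = m.end()
        prev_was_encoded = True
    out.append(text[pos:])  # trailing literal text, whitespace included
    return "".join(out)


print(decode_rfc2047_text("=?iso-8859-1?q?Wor?= =?iso-8859-2?q?ld?="))
print(decode_rfc2047_text("Hello =?iso-8859-1?q?Wor?= =?iso-8859-2?q?ld?= !"))
```

With this rule the space between the two encoded-words disappears, while the spaces next to the literal "Hello" and "!" survive decoding.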
It does pose something of a semantic problem for _encode_ though: Given the input
x = ({ ({ "Hello", 0 }), ({ "Wor", "iso-8859-1" }), ({ "ld", "iso-8859-2" }), ({ "!", 0 }) });
what should MIME.encode_words_text(x, "q") produce? It is not possible to put the first encoded word directly after the "o", but if a space is inserted the resulting string will decode to
Hello World !
and not
HelloWorld!
as intended. Tricky...
RFC2047 says that "An 'encoded-word' that appears within a 'phrase' MUST be separated from any adjacent 'word', 'text' or 'special' by 'linear-white-space'". That means any strings adjacent to a string that gets encoded must also get encoded, unless they contain a leading (or trailing) space. So your array must end up being encoded as:
"=?us-ascii?q?Hello?= =?iso-8859-1?q?Wor?= =?iso-8859-2?q?ld?= =?us-ascii?q?!?="
or, if you choose to "extend" the charset into the adjacent string (which only works if the charset is a superset of us-ascii):
"=?iso-8859-1?q?HelloWor?= =?iso-8859-2?q?ld!?="
If element 0 were "Hello " and element 3 were " !", only then could you leave them unencoded, and the result would be
"Hello =?iso-8859-1?q?Wor?= =?iso-8859-2?q?ld?= !"
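The first option above, together with the leading/trailing-space exception, can be sketched as follows. This is Python rather than Pike, and q_encode, encode_chunks and the fallback_charset parameter are hypothetical names. Labelling an unlabelled chunk as us-ascii is an assumption of this sketch, and the reply at the top of the thread disputes exactly that assumption. Adjacent unlabelled chunks are not merged first, which a fuller version would need to handle.

```python
import string

# Characters that may appear unescaped in a "Q" encoded-word inside a phrase.
Q_SAFE = set(string.ascii_letters + string.digits + "!*+-/")


def q_encode(raw):
    """Q-encode bytes: safe ASCII as-is, space as "_", the rest as =XX."""
    out = []
    for byte in raw:
        ch = chr(byte)
        if byte < 128 and ch in Q_SAFE:
            out.append(ch)
        elif ch == " ":
            out.append("_")
        else:
            out.append("=%02X" % byte)
    return "".join(out)


def encode_chunks(chunks, fallback_charset="us-ascii"):
    """Encode (text, charset) pairs; charset None means "no charset given"."""
    parts = []  # list of (string, is_encoded_word)
    for i, (text, charset) in enumerate(chunks):
        if charset is None:
            prev_enc = i > 0 and chunks[i - 1][1] is not None
            next_enc = i + 1 < len(chunks) and chunks[i + 1][1] is not None
            # A literal chunk may stay literal only if whitespace separates
            # it from every neighbouring encoded-word.
            touches_prev = prev_enc and not text.startswith(" ")
            touches_next = next_enc and not text.endswith(" ")
            if not (touches_prev or touches_next):
                parts.append((text, False))
                continue
            charset = fallback_charset  # assumption: treat it as us-ascii
        word = "=?%s?q?%s?=" % (charset, q_encode(text.encode(charset)))
        parts.append((word, True))

    result = []
    for i, (part, is_encoded) in enumerate(parts):
        # Two adjacent encoded-words need a space, which a decoder drops.
        if i > 0 and is_encoded and parts[i - 1][1]:
            result.append(" ")
        result.append(part)
    return "".join(result)


tight = [("Hello", None), ("Wor", "iso-8859-1"), ("ld", "iso-8859-2"), ("!", None)]
spaced = [("Hello ", None), ("Wor", "iso-8859-1"), ("ld", "iso-8859-2"), (" !", None)]
print(encode_chunks(tight))
print(encode_chunks(spaced))
```

The first call reproduces the all-encoded form and the second the form with the literal "Hello " and " !" left unencoded.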
/ Brevbäraren