In the last episode (Dec 12), Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum said:
Probably the most conformant way is to only eat whitespace between two encoded words. Since the RFC doesn't seem to mention any other kinds of whitespace, the intention might be that they should be left alone.
It does pose something of a semantic problem for _encode_ though: Given the input
x = ({ ({ "Hello", 0 }), ({ "Wor", "iso-8859-1" }), ({ "ld", "iso-8859-2" }), ({ "!", 0 }) });
what should MIME.encode_words_text(x, "q") produce? It is not possible to put the first encoded word directly after the "o", but if a space is inserted the resulting string will decode to
Hello World !
and not
HelloWorld!
as intended. Tricky...
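To make the decoding side of the problem concrete, here is a minimal sketch of the "only eat whitespace between two encoded words" rule. It is written in Python rather than Pike, handles only the "q" encoding, and the names decode_q and decode_text are made up for illustration; it is not how the MIME module is implemented.

```python
import re

# One RFC 2047 encoded-word, "q" encoding only (an assumption of this sketch).
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[qQ]\?([^?\s]*)\?=")


def decode_q(payload: str, charset: str) -> str:
    """Decode a "q" payload: '_' means space, '=XX' is one hex-encoded byte."""
    raw = re.sub(
        r"=([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        payload.replace("_", " "),
    ).encode("latin-1")
    return raw.decode(charset)


def decode_text(s: str) -> str:
    """Decode encoded-words in s, dropping whitespace only between two of them."""
    out = []
    pos = 0
    prev_was_encoded = False
    for m in ENCODED_WORD.finditer(s):
        gap = s[pos:m.start()]
        # A whitespace-only gap between two encoded-words is eaten;
        # every other gap is ordinary text and is kept as-is.
        if not (prev_was_encoded and gap.strip() == ""):
            out.append(gap)
        out.append(decode_q(m.group(2), m.group(1)))
        pos = m.end()
        prev_was_encoded = True
    out.append(s[pos:])  # trailing plain text after the last encoded-word
    return "".join(out)


# Spaces next to plain text survive; the one between the encoded-words does not.
print(decode_text("Hello =?iso-8859-1?q?Wor?= =?iso-8859-2?q?ld?= !"))  # Hello World !
```

The space between "Wor" and "ld" disappears, but the ones inserted next to "Hello" and "!" stay, which is exactly the unwanted result shown above.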
RFC 2047 says that "An 'encoded-word' that appears within a 'phrase' MUST be separated from any adjacent 'word', 'text' or 'special' by 'linear-white-space'". That means any string adjacent to a string that gets encoded must also get encoded, unless it already has whitespace on the side that touches the encoded word (a trailing space if it comes before, a leading space if it comes after). So your array must end up being encoded as:
"=?us-ascii?q?Hello?= =?iso-8859-1?q?Wor?= =?iso-8859-2?q?ld?= =?us-ascii?q?!?="
or, if you choose to "extend" the charset into the adjacent string (which only works if the charset is a superset of us-ascii):
"=?iso-8859-1?q?HelloWor?= =?iso-8859-2?q?ld!?="
If element 0 were "Hello " and element 3 were " !", only then could you leave them unencoded, and the result would be
"Hello =?iso-8859-1?q?Wor?= =?iso-8859-2?q?ld?= !"
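The rule above can be sketched as follows. This is a rough Python illustration under a few assumptions: only the "q" encoding is handled, None stands in for Pike's 0 (no charset), forced-encoded plain strings are labelled us-ascii, and the names q_encode and encode_text are hypothetical. It implements the first variant (each string keeps its own encoded word), not the "extend the charset" variant.

```python
import string

# Bytes that may appear literally in a "q" payload inside a phrase.
SAFE = set((string.ascii_letters + string.digits + "!*+-/").encode("ascii"))


def q_encode(text: str, charset: str) -> str:
    """Wrap text as a single "q" encoded-word in the given charset."""
    payload = "".join(
        "_" if b == 0x20 else (chr(b) if b in SAFE else "=%02X" % b)
        for b in text.encode(charset)
    )
    return "=?%s?q?%s?=" % (charset, payload)


def encode_text(parts, default_charset="us-ascii"):
    """Encode a list of (text, charset-or-None) pairs into one header string."""
    n = len(parts)
    must_encode = [charset is not None for _, charset in parts]
    # A plain string must be encoded too if it touches an encoded neighbour
    # without whitespace on that side. Repeat until nothing changes, since
    # encoding one string can force its other neighbour as well.
    changed = True
    while changed:
        changed = False
        for i, (text, _) in enumerate(parts):
            if must_encode[i]:
                continue
            left = i > 0 and must_encode[i - 1] and not text[:1].isspace()
            right = i + 1 < n and must_encode[i + 1] and not text[-1:].isspace()
            if left or right:
                must_encode[i] = True
                changed = True
    out = ""
    for i, (text, charset) in enumerate(parts):
        if must_encode[i]:
            # The separator between two encoded-words is eaten by the decoder.
            if i > 0 and must_encode[i - 1]:
                out += " "
            out += q_encode(text, charset or default_charset)
        else:
            out += text
    return out


x = [("Hello", None), ("Wor", "iso-8859-1"), ("ld", "iso-8859-2"), ("!", None)]
print(encode_text(x))

y = [("Hello ", None), ("Wor", "iso-8859-1"), ("ld", "iso-8859-2"), (" !", None)]
print(encode_text(y))
```

The first call prints the fully encoded form and the second leaves "Hello " and " !" alone, matching the two strings given above.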