Re: MIME.Message and RFC 2047 encodings

30 Oct 2016


On Sun, Oct 30, 2016 at 9:17 PM, Chris Angelico <rosuav@gmail.com> wrote:
> On Sun, Oct 30, 2016 at 4:17 AM, Martin Karlgren <marty@roxen.com> wrote:
>> Adding charset decoding to MIME.Message sounds good to me, perhaps with a flag to enable it on decoding?
>> (A compat problem I can think of is that applications may assume that decoded data is 8bit strings and fail to apply proper encoding before writing to file, causing an exception.)
>
> I agree about backward compat, and that's a bit problematic. So here's
> my thinking: MIME.UnicodeMessage will be a subclass of MIME.Message
> with the express goal of making everything use 21-bit strings. Any
> time it returns an eight-bit string, that is a bug to be fixed. So
> future incompatibility won't be a problem, as it's expressly
> documented that way; and past compatibility is fine, because
> MIME.Message itself isn't changing. Methods like
> MIME.Message()->get_filename, which currently do the decoding at that
> late point, can simply be overridden in UnicodeMessage.
> Does that seem like a reasonable API?

I've pushed a change to 8.1 that ought to be 100% backward compatible.
If there's a problem, I can revert it, but there shouldn't be. (Just
in case, it's not in 8.0.) The two notable features are:
1) MIME.UnicodeMessage, as described above
2) MIME.parse_headers() now takes an additional parameter 'unicode'.
Everything else should be completely invisible to most programs, and
both of these can be ignored.
ChrisA
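
For readers unfamiliar with the compat concern raised above, here is a minimal sketch of the same idea in Python rather than Pike. It uses Python's standard `email.header.decode_header`, not the Pike MIME API. The helper name `decode_rfc2047` and the sample header value are made up for illustration. It shows that RFC 2047 decoding yields text outside the 8-bit byte domain, which then has to be encoded explicitly before it is written out as bytes.

```python
from email.header import decode_header


def decode_rfc2047(value: str) -> str:
    """Decode RFC 2047 encoded-words in a header value to Unicode text.

    Hypothetical helper for illustration only; not part of any Pike API.
    """
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Encoded words come back as bytes plus the declared charset;
            # unencoded runs in a mixed header come back with charset None.
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            # A header with no encoded words is returned as plain str.
            parts.append(chunk)
    return "".join(parts)


# Sample header value (an assumption, chosen for illustration).
raw_subject = "=?iso-8859-1?q?Bj=F6rn?="

# After decoding, the value is Unicode text, not raw bytes.
subject = decode_rfc2047(raw_subject)

# Writing to a byte-oriented file or socket needs an explicit encoding step;
# skipping it is the kind of failure the compat concern describes.
as_bytes = subject.encode("utf-8")
```

The point carries over loosely: code that assumed decoded header data was always byte-sized has to add an encoding step once the decoder starts returning Unicode, which is why keeping MIME.Message unchanged and opting in via a separate class avoids breaking existing callers.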
