This post about Java's variant of UTF-8 is interesting: http://mail.nl.linux.org/linux-utf8/2002-12/msg00306.html
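It's not only the overlong encoding of NUL that's special: characters outside the BMP are also written as two surrogates of three bytes each instead of a single four-byte sequence. I think the way to handle this variant is to add a dedicated encoding for it to the Charset module.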
(Note that according to the author of that text, this "Java modified UTF-8" isn't intended to be used for generic I/O of UTF-8 strings, but rather for object serialization. There are other Java libraries that read and write correct UTF-8.)
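For anyone who wants to see the differences concretely, here is a minimal sketch in Java. It assumes that DataOutputStream.writeUTF is a fair representative of the "modified UTF-8" writer (it prepends a two-byte length, which the sketch strips) and compares its output with String.getBytes(StandardCharsets.UTF_8). The class and helper names (ModifiedUtf8Demo, modifiedUtf8, standardUtf8, hex) are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    // Encode with DataOutputStream.writeUTF and drop its 2-byte length prefix,
    // leaving only the modified UTF-8 payload.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Encode with the standard UTF-8 charset for comparison.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Format bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String nul = "\0";                                        // U+0000
        String outsideBmp = new String(Character.toChars(0x10400)); // U+10400

        System.out.println(hex(modifiedUtf8(nul)));        // C0 80
        System.out.println(hex(standardUtf8(nul)));        // 00
        System.out.println(hex(modifiedUtf8(outsideBmp))); // ED A0 81 ED B0 80
        System.out.println(hex(standardUtf8(outsideBmp))); // F0 90 90 80
    }
}
```

A decoder for this variant in the Charset module would have to accept exactly these two deviations (C0 80 for NUL, and paired three-byte surrogates) while a strict UTF-8 decoder should keep rejecting both.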