For the Charset module, I believe that the decoder should be lenient. The reason is that the module handles more than UTF-8; it also handles, for example, EBCDIC and UTF-7, which do _not_ share the design goal of UTF-8 that you should be able to do ASCII processing of the "encoded" form. If you look for "/" in an EBCDIC string, for example, you will not find any slashes, because they are encoded as the byte that ASCII reads as "a". So the general operating principle for the Charset module is that you decode the string _first_, and _then_ you look for specific characters. If you deliberately violate this principle because you _know_ you are dealing with UTF-8, which lets you get away with it, you can just as well use utf8_to_string. That way you know that you have to rewrite the code anyway if you want to change to a different character encoding.
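
To make the EBCDIC point concrete, here is a rough sketch in Python rather than Pike. It assumes Python's built-in "cp037" codec (one EBCDIC variant) as a stand-in for the Charset module's EBCDIC decoder, and the helper names find_slashes_raw and find_slashes_decoded are made up for the illustration. It is not meant to show the Charset API itself.

```python
# Illustration only: Python's "cp037" codec (an EBCDIC variant) stands in
# for the Charset module. The helper names below are hypothetical.

def find_slashes_raw(encoded):
    """Search the still-encoded bytes for the ASCII slash byte (0x2F)."""
    return [i for i, b in enumerate(encoded) if b == 0x2F]


def find_slashes_decoded(encoded, charset):
    """Decode first, then search the resulting string for '/'."""
    text = encoded.decode(charset)
    return [i for i, ch in enumerate(text) if ch == "/"]


path = "usr/local/bin"
ebcdic = path.encode("cp037")
utf8 = path.encode("utf-8")

# Searching the raw EBCDIC bytes finds nothing: "/" is stored as 0x61,
# which is the byte ASCII uses for "a".
print(find_slashes_raw(ebcdic))               # []
print(ebcdic[3] == ord("a"))                  # True

# Decoding first gives the right answer for any charset.
print(find_slashes_decoded(ebcdic, "cp037"))  # [3, 9]

# UTF-8 is the special case where searching the raw bytes happens to work.
print(find_slashes_raw(utf8))                 # [3, 9]
```

The raw search only works in the UTF-8 case, which is exactly the shortcut described above: code that relies on it is tied to UTF-8 and would have to be rewritten for any other encoding.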