utf8_to_string crippling

5 Apr 2005


For the Charset module, I believe that the decoder should be lenient.
The reason is that the module handles more than UTF-8; it also handles
e.g. EBCDIC and UTF-7, which do _not_ share the design goal of
UTF-8 that you should be able to do ASCII processing of the "encoded"
form.  If you look for "/" in an EBCDIC string, for example, you will
not find any slashes, as they are encoded as "a".  So the general
operating principle for the Charset module is that you decode the
string _first_, _then_ you look for specific characters.  If you
deliberately violate this principle because you _know_ you are dealing
with UTF-8, which lets you get away with it, you can just as well use
utf8_to_string.  That way you know that you have to rewrite the code
anyway if you want to change to a different character encoding.
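
To illustrate the EBCDIC point, here is a minimal sketch in Python rather
than Pike. It assumes Python's built-in "cp037" codec as a representative
EBCDIC code page; the sample string and the variable names are made up for
the example and are not part of the Charset module.

```python
# Hypothetical sample string; "cp037" is one common EBCDIC code page.
path = "usr/lib"
encoded = path.encode("cp037")

# In this code page "/" is stored as byte 0x61, which is "a" in ASCII,
# so an ASCII-style search of the encoded form finds no slash.
slash_found_in_raw = b"/" in encoded
ascii_a_found_in_raw = b"a" in encoded

# Decode first, then look for the character.
decoded = encoded.decode("cp037")
slash_found_after_decode = "/" in decoded

print(slash_found_in_raw, ascii_a_found_in_raw, slash_found_after_decode)
```

Searching the raw bytes misses the slash and would instead match an ASCII
"a" that is not in the text at all; only the decoded string gives the
right answer.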
