OK, I'm a little confused now. So you are looking for a particular ASCII character that terminates the UTF-8 encoded part. Is this character itself part of the UTF-8 encoded data or not? If it is, then you should decode before looking for it. If it is not, then there is only one possible encoding of it (the single ASCII byte), and any overlong UTF-8 representation of it is invalid UTF-8 and clearly not the end marker.
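To make the second case concrete, here is a rough Python sketch. It assumes the terminator is a single ASCII byte sitting outside the encoded text; the name `split_at_terminator` and the `;` terminator in the usage lines are made up for illustration. In well-formed UTF-8 every byte of a multi-byte sequence is 0x80 or higher, so a plain byte search cannot hit the middle of a character, and a strict decode of the prefix rejects overlong forms.

```python
from typing import Tuple


def split_at_terminator(data: bytes, terminator: bytes = b"\x00") -> Tuple[str, bytes]:
    """Split at the first terminator byte and strictly decode the prefix as UTF-8.

    Hypothetical helper: assumes the terminator is one ASCII byte (< 0x80)
    that is not part of the encoded text itself.
    """
    if len(terminator) != 1 or terminator[0] >= 0x80:
        raise ValueError("terminator must be a single ASCII byte")

    # Lead and continuation bytes of multi-byte UTF-8 sequences are all >= 0x80,
    # so this search can only match a real, single-byte occurrence.
    end = data.find(terminator)
    if end == -1:
        raise ValueError("terminator not found")

    # Strict decoding raises UnicodeDecodeError on malformed input,
    # including overlong encodings such as b"\xc0\xbb" for ";".
    text = data[:end].decode("utf-8", errors="strict")
    return text, data[end + 1:]


if __name__ == "__main__":
    payload = "héllo €".encode("utf-8") + b";rest"
    print(split_at_terminator(payload, b";"))
```

If the terminator can also appear inside the text, this does not apply and you are back to the first case: decode first, then search the decoded string.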