The parts that might or might not be UTF-8 encoded can still very well contain structural (or "protocol") information. That structural info can very well be interpreted and reinterpreted on several levels, some before decoding, some afterwards. The whole multilevel thing might very well be streaming.
And to reiterate, I think this whole line of discussion isn't particularly important compared to the simple argument that the encoding called "utf-8" in the Charset module should comply to the UTF-8 standard.