New subject: Clean-room Engine.IO implementation committed to git 8.0/8.1

23 Nov 2016


On Wed, Nov 23, 2016 at 11:10 PM, Marcus Comstedt (ACROSS) (Hail
Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se
wrote:
...
...
I agree, but using string(8bit) to mean "binary data" is something
that's 100% backward compatible.
It would not be backwards compatible, since that is not what
string(8bit) means today.
By "binary data", I mean eight-bit strings of arbitrary bytes - like
you'd read from a file or something. Currently, functions like
Stdio.read_file are simply declared to return "string", but in effect
they return string(8bit).
...
...
Unicode text would always be referred
to as string(21bit), even if it happens to contain nothing but Latin-1
characters.
That doesn't really make sense.  So you say that "R\xe4ksm\xf6rg\xe5s"
would have type string(21bit)?  What type would "\U12345678" have?
\U12345678 should possibly be an error, as it's not valid Unicode.
Maybe the Pike string type can be used for other things, but they're
not Unicode text - so you could use string(32bit) for those sorts of
non-textual strings. (I don't know of any use cases, so I can't say
beyond that.) My statement about Unicode text specifically excludes
anything that isn't valid according to the Unicode standard.
...
What type would "Foo" have?  How would you specify a UTF-8 encoded
literal?
Now, these are questions that can't truly be answered with the current
system. I would like the former to be string(7bit), and the latter
would be either string(7bit) or string(8bit) depending on whether
there are non-ASCII characters in it. But they're probably both just
type 'string' at the moment.
ChrisA
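
To make the width question above concrete, here is a rough sketch in Python (not Pike) of inferring the narrowest of the widths named in this thread from the values a string actually contains. The helper name narrowest_width is hypothetical, and this is only an illustration of value-based inference under the assumption that 7, 8, 21 and 32 bits are the candidate widths; it is not a description of how the Pike compiler types strings.

```python
# Illustrative sketch only: narrowest_width is a hypothetical helper,
# not part of Pike or of any library.

MAX_UNICODE = 0x10FFFF  # highest code point defined by the Unicode standard


def narrowest_width(values):
    """Return the smallest of 7, 8, 21 or 32 bits that holds every value.

    `values` is any iterable of non-negative integers: the bytes of
    binary data, or the code points of a text string.
    """
    top = max(values, default=0)
    if top < 0x80:
        return 7   # pure ASCII
    if top < 0x100:
        return 8   # fits in one byte (Latin-1 range or arbitrary bytes)
    if top <= MAX_UNICODE:
        return 21  # anywhere else in the Unicode code space
    return 32      # outside Unicode altogether


text = "R\xe4ksm\xf6rg\xe5s"

print(narrowest_width(b"Foo"))                   # 7
print(narrowest_width(text.encode("utf-8")))     # 8
print(narrowest_width([ord(c) for c in text]))   # 8
print(narrowest_width([0x12345678]))             # 32
```

By value alone, the Latin-1 text and its UTF-8 encoding both come out as 8-bit, so a distinction such as "Unicode text is string(21bit)" would have to be declared by the programmer rather than inferred from the contents.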
