Re: Clean-room Engine.IO implementation committed to git 8.0/8.1

23 Nov 2016


      ...
> By "binary data", I mean eight-bit strings of arbitrary bytes - like
> you'd read from a file or something. Currently, functions like
> Stdio.read_file simply return "string", but they'll effectively be
> returning string(8bit).

No, Stdio.read_file currently returns string(8bit).  That simply means
that each element will be in the range 0-255.  If you were to change
the meaning to something else, you would create compatibility issues
by making some currently valid assignments involving string(8bit)
invalid.

...
> \U12345678 possibly should be an error, as it's not valid Unicode.

It's valid Pike.  Pike supports the full ISO/IEC 10646 31-bit range,
plus an equally large negative range.

...
> so you could use string(32bit) for those sorts of
> non-textual strings.

Not string(31bit)?

...
> My statement about Unicode text specifically excludes
> anything that isn't valid according to the Unicode standard.

Which makes it even worse, since the set of valid characters changes
with each release of the Unicode standard...

...
...
>> What type would "Foo" have?  How would you specify a UTF-8 encoded
>> literal?
>
> Now, these are questions that can't truly be answered with the current
> system. I would like the former to be string(7bit),

Then you are contradicting yourself, since you claimed that Unicode
text would _always_ be referred to as string(21bit), and "Foo" is
definitely Unicode text (both 'F' and 'o' have been part of the
Unicode standard since the first version).

...
> and the latter
> would be either string(7bit) or string(8bit) depending on whether
> there are non-ASCII characters in it.

But how would the compiler know that the characters are UTF-8 encoded,
so that it does not assign a type of string(21bit) instead?
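
To make the point concrete, here is a small sketch of what a purely
range-based width says about a string. It is written in Python rather
than Pike, only as a neutral illustration; min_width and as_octets are
hypothetical helper names, not anything from the Pike API.

```python
def min_width(s: str) -> int:
    """Smallest n such that every element of s fits in n bits."""
    return max((ord(c).bit_length() for c in s), default=0)


def as_octets(data: bytes) -> str:
    """View raw bytes as a string whose elements are all in 0-255."""
    return data.decode("latin-1")


text = "Fo\u00f6 \u20ac"                    # decoded text, one non-Latin-1 char
encoded = as_octets(text.encode("utf-8"))  # the same text as UTF-8 octets

print(min_width("Foo"))                              # 7
print(min_width(as_octets("Foo".encode("utf-8"))))   # 7
print(min_width(text))                               # 14
print(min_width(encoded))                            # 8
```

The width only describes the range of the element values. "Foo" fits in
7 bits whether or not it has been UTF-8 encoded, and the encoded form of
the longer string fits in 8 bits, but nothing in those values records
that an encoding was applied.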
