Danger Will Robinson - Pike-devel - lists.lysator.liu.se

Martin Nilsson (Opera Mini - AFK!) ＠ Pike (-) developers forum

9 May 2008

10:45 p.m.

In Unicode 5.1.0 there is an upper/lower case pair with a greater distance than fits in a short... Kind of a design flaw in Unicode.
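A minimal sketch of the arithmetic behind the complaint. The thread never names the pair; the code assumes it is U+A77D / U+1D79 (LATIN CAPITAL / SMALL LETTER INSULAR G), and it assumes "short" means a signed 16-bit integer. The helper name fits_in_int16 is made up for this illustration.

```python
import struct

# Assumption: the pair in question is U+A77D (capital insular G) and
# U+1D79 (small insular G). The thread itself does not name it.
UPPER = 0xA77D
LOWER = 0x1D79


def fits_in_int16(value):
    """Return True if value can be stored in a signed 16-bit integer."""
    try:
        struct.pack("<h", value)  # "h" is a signed 16-bit short
        return True
    except struct.error:
        return False


# Distance between the upper-case and lower-case code points.
delta = UPPER - LOWER

# A typical small delta (ASCII 'A' -> 'a') for comparison.
ascii_delta = ord("a") - ord("A")

print(f"delta = {delta} (0x{delta:X}), fits in int16: {fits_in_int16(delta)}")
print(f"ascii delta = {ascii_delta}, fits in int16: {fits_in_int16(ascii_delta)}")
```

Under these assumptions the distance exceeds 32767, the largest value a signed 16-bit field can hold, so a table that stores case deltas as shorts cannot represent this pair.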

Johan Sundström (Achtung Liebe!) ＠ Pike (-) developers forum

9 May

10:50 p.m.

You should be on the Unicode commission, if there is one. Any idea of how to position oneself there? The usual corporate $$$ bribery thing, for bulky standardization organs, or something more meritocratic, like the IETF?

Martin Nilsson (Opera Mini - AFK!) ＠ Pike (-) developers forum

10:55 p.m.

I sent a technical note, but this has already been released, since 2008-03-25.

Johan Sundström (Achtung Liebe!) ＠ Pike (-) developers forum

11 p.m.

I hope you did not bypass the opportunity to sign it with some impressive-and-influential-sounding title. ;-)

Martin Nilsson (Opera Mini - AFK!) ＠ Pike (-) developers forum

11 p.m.

Martin Nilsson, Humbug in Philately at the University of Düsseldorf.

Martin Bähr

10 May

1:23 p.m.

for the totally clueless, could you elaborate what the effect of this is, and why it is bad?

greetings, martin.

Martin Nilsson (Opera Mini - AFK!) ＠ Pike (-) developers forum

4:55 p.m.

It means that the code we have for upper/lower case in Pike needs to be rewritten to handle Unicode 5.1.0. Or ignore the specific character pair in question, which is what Ken Whistler (co-author of Unicode) suggested when I asked about this.

Martin Stjernholm, Roxen IS ＠ Pike developers forum

6:35 p.m.

Or perhaps just put an ugly special case in front of the normal algorithm.
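A rough sketch of what such a special case could look like. It is not Pike's code: normal_lower is a toy stand-in for the existing delta-based algorithm (ASCII only here), and the one-entry exception table assumes the pair is U+A77D / U+1D79, which the thread does not confirm.

```python
# Assumption: the only pair that needs special treatment is
# U+A77D -> U+1D79. The thread does not name the code points.
SPECIAL_LOWER = {0xA77D: 0x1D79}


def normal_lower(cp):
    """Toy stand-in for the regular delta-based algorithm (ASCII only)."""
    if 0x41 <= cp <= 0x5A:  # 'A'..'Z'
        return cp + 0x20
    return cp


def lower_codepoint(cp):
    """Check the exception table first, then fall through to the normal path."""
    special = SPECIAL_LOWER.get(cp)
    if special is not None:
        return special
    return normal_lower(cp)


def lower_string(s):
    """Lower-case a string one code point at a time."""
    return "".join(chr(lower_codepoint(ord(c))) for c in s)
```

The extra cost is one dictionary lookup per character, which is the overhead the following replies weigh against complete correctness.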

Johan Sundström (Achtung Liebe!) ＠ Pike (-) developers forum

11:55 p.m.

Doubtful if it's worth the overhead in practice, even though complete correctness is not to be frowned upon. Hmm. Did strings end up being type annotated with a [lower..upper] bound without extra work penalty? And what code points are we talking about, by the way?

If it can be done inexpensively with a pre- or post-process pass (not changing ordos on the functions in the general case, which IMO ought to be considered "when this code point pair was not represented in the string" here), I'm all for it. :-)

Martin Nilsson (Opera Mini - AFK!) ＠ Pike (-) developers forum

11 May

12:30 a.m.

Adding another case_info-type doesn't appear to change the performance.
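To illustrate why an extra entry type need not change the lookup cost, here is a sketch of a range table where each entry carries a mode. The layout, the mode names and the LONG_DELTAS side table are all hypothetical and are not taken from Pike's case_info implementation; the table is a toy that covers only ASCII upper case and the assumed pair U+A77D / U+1D79, and treats everything else as caseless.

```python
import bisect

# Hypothetical modes; not Pike's actual case_info constants.
MODE_NONE = 0              # no case mapping in this range
MODE_UPPER_DELTA = 1       # upper-case range; data is a small delta to add
MODE_LONG_UPPER_DELTA = 2  # upper-case range; data indexes LONG_DELTAS

# Side table for deltas too large for a 16-bit field.
LONG_DELTAS = [0x1D79 - 0xA77D]

# Toy table of (range start, mode, data), sorted by range start.
CASE_INFO = [
    (0x0000, MODE_NONE, 0),
    (0x0041, MODE_UPPER_DELTA, 0x20),    # 'A'..'Z'
    (0x005B, MODE_NONE, 0),
    (0xA77D, MODE_LONG_UPPER_DELTA, 0),  # assumed problem code point
    (0xA77E, MODE_NONE, 0),
]
STARTS = [entry[0] for entry in CASE_INFO]


def to_lower(cp):
    """Binary-search the range containing cp, then apply that range's mode."""
    _, mode, data = CASE_INFO[bisect.bisect_right(STARTS, cp) - 1]
    if mode == MODE_UPPER_DELTA:
        return cp + data
    if mode == MODE_LONG_UPPER_DELTA:
        return cp + LONG_DELTAS[data]
    return cp
```

In this sketch the new mode adds one more branch after the same binary search, so the per-character work for ordinary text is unchanged.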

pike-devel@lists.lysator.liu.se