>Message: 1
>Date: Tue, 03 Aug 2004 18:52:18 +0200
>From: Anders Thulin <ath(a)algonet.se>
>To: runeberg(a)lists.lysator.liu.se
>Subject: [Runeberg] Re: How do I make sure I use ISO 8859-1
>
>"Ingemar Olson" <bio2935c(a)hotmail.com> asks:
>
> > I would like to start proofreading but I'm stuck on the difference between
> > ASCII and ISO 8859-1. For example, I am used to keying <alt-148> to write an
> > ö, but the instructions ("Olika streck och andra specialtecken" at
> > //runeberg.org/wiki/Instruktioner_för_korrekturläsare) state clearly to NOT
> > use this technique.
>
> As you say 'alt-148', the difference you are worried about is really the one
> between ISO 8859-1 and the character code used by your Windows system -- which
> probably is CP1252, which is just a superset of 8859-1. (ASCII is, as far as I
> understand, a 7-bit character set, related to the 7-bit ISO 646 character sets).
>
> The difference between the two, however, is not in the 0xC0-0xFF area, where
> most of the accented letters have been placed. But if you try to produce S/s or
> Z/z with caron, the OE/oe ligature, and y with diaeresis, and the various left
> and right single and double quotation marks, the different dashes, and several
> other special characters in the 0x80-0x9F area, it won't work.
Thanks, Anders.
It looks as if I can carry on using alt-xxx for åäö at any rate, as long as
I avoid 0x80-0x9F. I only plan to work on Swedish pages, so I can probably
avoid most of the 'odd' letters.
It would be good if there were somewhere one could see, and copy, the
individual problematic ISO 8859-1 characters, to paste into the text later.
But now I have another question (for anyone who cares to answer):
I see that there are quite a lot of what I would call "quotation marks" in
English. I mean the thing that looks like a comma, or rather two commas,
printed in the _middle_ of the line. If it had been a little higher I would
have called it a "closing quotation mark". But when it sits in the middle of
the line, the OCR program seems to interpret it as a pair of Vs (or
arrowheads) pointing to the right (»). What should one do with it? Change it
to " or leave it as »? It does NOT look like arrowheads in the original!
Which is right?
Ingemar
"Ingemar Olson" <bio2935c(a)hotmail.com> asks:
> I would like to start proofreading but I'm stuck on the difference between
> ASCII and ISO 8859-1. For example, I am used to keying <alt-148> to write an
> ö, but the instructions ("Olika streck och andra specialtecken" at
> //runeberg.org/wiki/Instruktioner_för_korrekturläsare) state clearly to NOT
> use this technique.
As you say 'alt-148', the difference you are worried about is really the one
between ISO 8859-1 and the character code used by your Windows system -- which
probably is CP1252, which is just a superset of 8859-1. (ASCII is, as far as I
understand, a 7-bit character set, related to the 7-bit ISO 646 character sets).
The difference between the two, however, is not in the 0xC0-0xFF area, where
most of the accented letters have been placed. But if you try to produce S/s or
Z/z with caron, the OE/oe ligature, and y with diaeresis, and the various left
and right single and double quotation marks, the different dashes, and several
other special characters in the 0x80-0x9F area, it won't work.
That is, in principle, the alt-xxx method won't work, as it produces characters
in a different character set. However, if you know the difference between the
two character sets, you can get by. The risk of making errors can be high,
especially if you acquire bad keyboarding habits, and there are no safety nets.
For a full description of the differences, see
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
Any character with the same CP1252 and Unicode code point is safe:
    0xC5    0x00C5    #LATIN CAPITAL LETTER A WITH RING ABOVE
If they differ, you have to use another method:
    0x8E    0x017D    #LATIN CAPITAL LETTER Z WITH CARON
Strictly speaking, this describes the mapping from CP1252 to Unicode, but as
the first 256 code points of Unicode are the same as those of ISO 8859-1
(an 8859-1 to Unicode mapping table can be found under .../MAPPINGS) the
difference is only superficial.
In the same directory (.../WINDOWS/) other code mapping tables can be found.
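The "same code point" test above can be sketched in a few lines of Python. This is only an illustration: it assumes Python 3 and its built-in "cp1252" codec rather than the mapping file itself, and the function name is made up for the example.

```python
def cp1252_unsafe_bytes():
    """Map each CP1252 byte whose Unicode code point differs from the
    byte value to the character it stands for."""
    unsafe = {}
    for b in range(256):
        try:
            ch = bytes([b]).decode("cp1252")
        except UnicodeDecodeError:
            # Python's codec leaves a few bytes in 0x80-0x9F undefined.
            continue
        if ord(ch) != b:
            unsafe[b] = ch
    return unsafe


if __name__ == "__main__":
    for b, ch in sorted(cp1252_unsafe_bytes().items()):
        print(f"0x{b:02X}  U+{ord(ch):04X}  {ch}")
```

Every byte it reports lies in the 0x80-0x9F area; 0xC5 (Å) and the other Swedish letters are not listed.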
> So I'm confused. How DO I make sure I generate the ISO characters?
I'm sorry -- I'm not much of a Windows expert.
I would use WordPad, save as 'ANSI', and then run the GNU recode program
myself (under Cygwin -- assuming it can be compiled). I had better leave it to
a Windows expert to say whether the same effect can be produced with pure
Windows tools.
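For anyone without recode at hand, the same conversion step can be approximated in Python. This is a sketch, not a description of what recode does internally: it assumes the 'ANSI' file really is CP1252, and both function names are invented for the example.

```python
def unrepresentable(text):
    """Characters in text that have no ISO 8859-1 encoding."""
    return sorted({ch for ch in text if ord(ch) > 0xFF})


def cp1252_to_latin1(data):
    """Convert CP1252 bytes to ISO 8859-1 bytes, refusing to guess
    when a character does not exist in ISO 8859-1."""
    text = data.decode("cp1252")
    bad = unrepresentable(text)
    if bad:
        raise ValueError(f"not in ISO 8859-1: {bad}")
    return text.encode("iso-8859-1")
```

Calling cp1252_to_latin1 on the raw bytes of the saved file either returns bytes that are safe to submit or raises an error naming the offending characters (curly quotes, dashes and so on), which gives the safety net that plain keyboarding lacks.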
I've been told that there are two forms of the ALT- keyboarding method:
the ALT-xxx method and the ALT-0xxx method, and that the difference can be
useful when you know exactly how they work. There seem to be one or two
web sites describing it (search for ALT-0xxx); you might want to investigate.
Personally, I regard ALT- as unfit for human use.
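As far as I know, the two forms differ in which code page Windows uses to look up the number: ALT-xxx uses the OEM code page (CP437 or CP850 on most Western systems) and ALT-0xxx uses the 'ANSI' code page (CP1252). The sketch below models that with Python's codecs; the function name and the default code pages are assumptions for illustration, not a Windows API.

```python
def alt_code(keys, oem="cp850", ansi="cp1252"):
    """Return the character an ALT-<keys> sequence would produce,
    assuming a leading zero selects the ANSI code page and its
    absence selects the OEM code page."""
    codec = ansi if keys.startswith("0") else oem
    return bytes([int(keys)]).decode(codec)


if __name__ == "__main__":
    for keys in ("148", "0148", "0246"):
        ch = alt_code(keys)
        print(f"ALT-{keys}: {ch} (U+{ord(ch):04X})")
```

Under these assumptions ALT-148 gives ö (U+00F6), which is in ISO 8859-1, while ALT-0148 gives the right double quotation mark (U+201D), which is not; ALT-0246 also gives ö.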
best wishes,
--
Anders Thulin ath*algonet.se http://www.algonet.se/~ath
Hello everyone (or maybe only Lars?).
I would like to start proofreading but I'm stuck on the difference between
ASCII and ISO 8859-1. For example, I am used to keying <alt-148> to write an
ö, but the instructions ("Olika streck och andra specialtecken" at
//runeberg.org/wiki/Instruktioner_för_korrekturläsare) state clearly to NOT
use this technique.
I have created other web pages containing åäöÅÄÖ (generated with the alt-nnn
keystrokes) written with "charset=iso-8859-1" and they pass the W3C HTML
validator check _and_ they display correctly (for me anyway), implying that
the characters I generated are part of the ISO 8859-1 character set.
So I'm confused. How DO I make sure I generate the ISO characters? Or did I
misunderstand something in the instructions?
Ingemar