gnupg-1.0.2 patch: LC_CTYPE needs to be imported
Daniel Resare
noa@metamatrix.se
Wed, 16 Aug 2000 20:20:34 +0200
On Wed, Aug 16, 2000 at 04:32:06PM +0100, Edmund GRIMLEY EVANS wrote:
> (What's it doing in fact? Does
> it change Ä to AE?)
Yes, the Swedish characters I have in my translation get mangled:
å -> aa
ä -> ae
ö -> oe
>
> By the way, I think there's no guarantee that the charset of the
> portable "C" locale is US-ASCII. Today it usually is, but in the
> future it might more often be UTF-8.
I think the glibc info pages prove you wrong here, at least if we
work with a system based on ISO C.
(libc.info.gz)Standard Locales:
`"C"'
This is the standard C locale. The attributes and behavior it
provides are specified in the ISO C standard. When your program
starts up, it initially uses this locale by default.
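As a rough illustration of what that default means for the ctype.h
functions: ISO C says that in the "C" locale isalpha() is true only for
the basic upper- and lowercase letters. The helper names below are made
up for this sketch, and comparing the name returned by setlocale() with
"C" is an assumption that holds on glibc but is not strictly guaranteed
by the standard.

```c
#include <ctype.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the current LC_CTYPE locale is
 * reported as "C". Passing NULL only queries the locale, it does not
 * change it. */
int ctype_locale_is_c(void)
{
    const char *name = setlocale(LC_CTYPE, NULL);
    return name != NULL && strcmp(name, "C") == 0;
}

/* Hypothetical helper: classify one byte with isalpha(). Taking an
 * unsigned char avoids passing a negative value to isalpha(), which
 * would be undefined behaviour. */
int byte_is_alpha(unsigned char byte)
{
    return isalpha(byte) != 0;
}
```

In a program that never calls setlocale(LC_CTYPE, ""), byte_is_alpha(0xE4)
(the ISO 8859-1 code for ä) should be false, while byte_is_alpha('a')
is true.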
>
> > 2) wrap all calls to the ctype.h functions (isalpha() and friends)
> > in setlocale(LC_CTYPE, "C") (some grep'ing shows 56 occurrences)
>
> Yuck.
>
> > 3) review all uses of the ctype.h functions and perhaps use the
> > isascii() function instead where applicable (according to
> > the manpage isascii() is a BSD and SVID extension and should
> > be quite widely available)
> >
> > 4) write platform-independent replacements of the used ctype.h
> > functions that check against the US-ASCII charset. (Shouldn't
> > be difficult)
>
> To me, these solutions look best. You could use a configure test to
> choose between them. I assume that either way you're assuming that the
> locale charset is compatible with US-ASCII, so GnuPG won't work in an
> EBCDIC locale, but who cares.
>
Even though Werner Koch mailed me privately saying 'please no' to
alternative 4, I fail to see the problem with it. The US-ASCII definition
(as found in ISO 646) is set in stone and will never change; it defines
the values a char (as defined in ISO C) can have that map to glyphs.
A completely portable, clear, bug-free and efficient implementation of
an isascii() function could be written in about one hour.
Benefits:
1) no dependency on the layout of the C locale (who knows, AIX
or someone else might have gotten it wrong)
2) no dependency on the LC_CTYPE setting (I fooled someone at Red Hat
into accepting my patch to change LC_CTYPE to "" before I was caught by
Werner). What can happen once usually happens twice.
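A minimal sketch of what alternative 4 could look like. The ascii_*
names are hypothetical and not taken from the GnuPG sources, and the
set of functions is only a guess at what the existing ctype.h calls
would need. Code points are written as numbers so that the result
depends neither on LC_CTYPE nor on the execution character set.

```c
/* Locale-independent classification against US-ASCII (ISO 646).
 * All names here are hypothetical; this is a sketch, not GnuPG code. */

int ascii_isascii(int c) { return c >= 0x00 && c <= 0x7f; }
int ascii_isupper(int c) { return c >= 0x41 && c <= 0x5a; }  /* A-Z */
int ascii_islower(int c) { return c >= 0x61 && c <= 0x7a; }  /* a-z */
int ascii_isdigit(int c) { return c >= 0x30 && c <= 0x39; }  /* 0-9 */
int ascii_isalpha(int c) { return ascii_isupper(c) || ascii_islower(c); }
int ascii_isalnum(int c) { return ascii_isalpha(c) || ascii_isdigit(c); }

/* Space plus HT, LF, VT, FF and CR (0x09..0x0d). */
int ascii_isspace(int c) { return c == 0x20 || (c >= 0x09 && c <= 0x0d); }

/* Case conversion only touches the 26 ASCII letters; every other
 * value, including bytes above 0x7f, is returned unchanged. */
int ascii_toupper(int c) { return ascii_islower(c) ? c - 0x20 : c; }
int ascii_tolower(int c) { return ascii_isupper(c) ? c + 0x20 : c; }
```

With functions like these, a byte such as 0xE4 (ä in ISO 8859-1) is
never classified as a letter, whatever LC_CTYPE is set to.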
So, please do, until I (or someone else) have enough time to convert
everything to UTF-8.
cheers/daniel
--
nuclear cia fbi spy password code president bomb
8D97 F297 CA0D 8751 D8EB 12B6 6EA6 727F 9B8D EC2A