gnupg-1.0.2 patch: LC_CTYPE needs to be imported

Daniel Resare noa@metamatrix.se
Wed, 16 Aug 2000 20:20:34 +0200


On Wed, Aug 16, 2000 at 04:32:06PM +0100, Edmund GRIMLEY EVANS wrote:

> (What's it doing in fact? Does
> it change Ä to AE?)
Yes, the Swedish characters in my translation get mangled:
å -> aa, ä -> ae, ö -> oe
>
> By the way, I think there's no guarantee that the charset of the
> portable "C" locale is US-ASCII. Today it usually is, but in the
> future it might more often be UTF-8.
I think the glibc info pages prove you wrong here, at least if we work
with a system based on ISO C. From (libc.info.gz)Standard Locales:

  `"C"'
       This is the standard C locale. The attributes and behavior it
       provides are specified in the ISO C standard. When your program
       starts up, it initially uses this locale by default.
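
The start-up behaviour described in that passage can be checked with a
small sketch. The helper names are made up for illustration and are not
part of GnuPG or glibc. It assumes the program has not yet called
setlocale() with anything other than a query.

```c
#include <ctype.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: reports whether LC_CTYPE is currently the "C"
   locale. Passing NULL to setlocale() only queries the setting; it
   does not change it. */
int ctype_locale_is_c(void)
{
    const char *name = setlocale(LC_CTYPE, NULL);
    return name != NULL && strcmp(name, "C") == 0;
}

/* Hypothetical helper: in the "C" locale ISO C lets isalpha() accept
   only the 52 unaccented letters, so the Latin-1 code for a-ring
   (0xE5) is not alphabetic there. */
int a_ring_is_alpha(void)
{
    return isalpha(0xE5) != 0;
}
```

Once a program calls setlocale(LC_CTYPE, ""), both answers may differ,
depending on the user's environment.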
>
> > 2) wrap all calls to the ctype.h functions (isalpha() and friends)
> > in setlocale(LC_CTYPE, "C") (some grep'ing shows 56 occurrences)
>
> Yuck.
>
> > 3) review all uses of the ctype.h functions and perhaps use the
> > isascii() function instead where applicable (according to
> > the manpage isascii() is a BSD and SVID extension and should
> > be quite widely available)
> >
> > 4) write platform independent replacements of the used ctype.h
> > functions that check against the US-ASCII charset. (Shouldn't
> > be difficult)
>
> To me, these solutions look best. You could use a configure test to
> choose between them. I assume that either way you're assuming that the
> locale charset is compatible with US-ASCII, so GnuPG won't work in an
> EBCDIC locale, but who cares.
>
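To make alternative 4 from the quoted list concrete, here is a rough
sketch of locale-independent replacements. The ascii_* names are
hypothetical and are not taken from the GnuPG source. The comparisons use
numeric US-ASCII code points instead of character literals, so the result
does not depend on the compiler's execution character set or on
LC_CTYPE.

```c
/* Hypothetical ASCII-only replacements for some ctype.h functions.
   Every test compares against fixed US-ASCII (ISO 646) code points. */

/* True for the 128 US-ASCII code points, false for anything else
   (including EOF and bytes with the high bit set). */
int ascii_isascii(int c) { return c >= 0x00 && c <= 0x7F; }

/* 'A'..'Z' are 0x41..0x5A in US-ASCII. */
int ascii_isupper(int c) { return c >= 0x41 && c <= 0x5A; }

/* 'a'..'z' are 0x61..0x7A in US-ASCII. */
int ascii_islower(int c) { return c >= 0x61 && c <= 0x7A; }

/* '0'..'9' are 0x30..0x39 in US-ASCII. */
int ascii_isdigit(int c) { return c >= 0x30 && c <= 0x39; }

int ascii_isalpha(int c) { return ascii_isupper(c) || ascii_islower(c); }

int ascii_isalnum(int c) { return ascii_isalpha(c) || ascii_isdigit(c); }

/* Space (0x20) plus the control characters TAB, LF, VT, FF and CR
   (0x09..0x0D), the same set isspace() accepts in the "C" locale. */
int ascii_isspace(int c) { return c == 0x20 || (c >= 0x09 && c <= 0x0D); }

/* Upper- and lowercase letters differ by 0x20 in US-ASCII. */
int ascii_toupper(int c) { return ascii_islower(c) ? c - 0x20 : c; }
```

With these, a byte such as 0xE5 (å in ISO 8859-1) is never treated as a
letter, whatever LC_CTYPE is set to.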
Even though Werner Koch mailed me privately saying 'please no' to
alternative 4, I fail to see the problem with it. The US-ASCII definition
(as found in ISO 646) is set in stone and will never change; it defines
the values that a char (as defined in ISO C) can have that map to glyphs.
A completely portable, clear, bug-free and efficient implementation of an
isascii() function could be written in about one hour.

Benefits:

1) no dependency on the layout of the C locale (who knows, AIX or
   someone might have gotten it wrong)

2) no dependency on the LC_CTYPE setting (I fooled some Red Hat person
   into accepting my patch to change LC_CTYPE to "" before I was caught
   by Werner). What can happen once usually happens twice.

So, please do, at least until I (or someone else) have time enough to
convert everything to UTF-8.

cheers/daniel

--
nuclear cia fbi spy password code president bomb
8D97 F297 CA0D 8751 D8EB 12B6 6EA6 727F 9B8D EC2A