UTF-8 support

Alain Bench messtic at oreka.com
Tue Jul 5 14:29:09 CEST 2005


Hello Bruno,

 On Monday, July 4, 2005 at 1:28:35 PM +0200, Bruno Haible wrote:

> Please use the appended patch, which I'll also use in libiconv-1.10.
>| localcharset.c (get_charset_aliases) [WIN32]: Add CP65001 and others.
>| Reported by <mus1876 at gmx.info> via Alain Bench <messtic at oreka.com>.
>| "CP20936" "\0" "GB2312" "\0"
>| "CP38598" "\0" "ISO-8859-8" "\0"
>| "CP51932" "\0" "EUC-JP" "\0"
>| "CP51936" "\0" "GB2312" "\0"
>| "CP51949" "\0" "EUC-KR" "\0"
>| "CP51950" "\0" "EUC-TW" "\0"
>| "CP54936" "\0" "GB18030" "\0"
>| "CP65001" "\0" "UTF-8" "\0";

    Much thanks, Bruno!


    BTW how is a Win32 console app supposed to use libcharset? I mean
libcharset uses only GetACP(), which returns the default charset of
graphic mode (typically 1252), while console apps use a usually
different text mode default charset (typically 850, as returned by
GetConsoleOutputCP()).

    It gets more complicated yet for apps like GnuPG, usable both
directly in a console with 850, or through a graphic frontend
interacting in 1252. GnuPG doesn't use libcharset: on Win32 it calls
GetConsoleOutputCP() directly, falls back to GetACP() if that fails,
then canonicalizes names (28591 ==> ISO-8859-1) with the same table as
libcharset. There are still cases where forcing the charset with the
--charset option becomes necessary.
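
    To make the fallback and the table lookup concrete, here is a rough
sketch in portable C. It is not the actual GnuPG or libcharset code: the
helper names pick_codepage() and charset_name() are made up for
illustration, the table holds only three entries, and the code pages are
passed in as plain numbers (on Win32 they would come from
GetConsoleOutputCP() and GetACP()). It also assumes that a failed
console query shows up as 0.

```c
#include <stdio.h>
#include <string.h>

/* Alias table in the same layout as the quoted patch: alias, NUL,
   canonical name, NUL, and so on. The implicit NUL that ends the
   string literal leaves an empty alias, which marks the end of the
   table. Only a few illustrative entries are listed. */
static const char alias_table[] =
    "CP28591" "\0" "ISO-8859-1" "\0"
    "CP51932" "\0" "EUC-JP" "\0"
    "CP65001" "\0" "UTF-8" "\0";

/* Hypothetical helper: prefer the console code page, and fall back to
   the ANSI one when the console query failed (assumed to yield 0). */
static unsigned pick_codepage(unsigned console_cp, unsigned ansi_cp)
{
    return console_cp != 0 ? console_cp : ansi_cp;
}

/* Hypothetical helper: format the code page as "CP<number>" in buf and
   map it through the table. Returns the canonical name if an entry
   matches, otherwise the "CP<number>" string itself. */
static const char *charset_name(unsigned codepage, char *buf, size_t buflen)
{
    const char *p;

    snprintf(buf, buflen, "CP%u", codepage);
    for (p = alias_table; *p != '\0'; ) {
        const char *canonical = p + strlen(p) + 1;
        if (strcmp(p, buf) == 0)
            return canonical;
        p = canonical + strlen(canonical) + 1;
    }
    return buf;
}
```

    With these, charset_name(pick_codepage(0, 28591), buf, sizeof buf)
gives "ISO-8859-1", while a code page missing from the table, such as
850, comes back unchanged as "CP850".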

    I keep the crosspost to gnupg-users, because I believe it's not way
off-topic, being a continuation of an old January 2005 "current charset
guessing" thread.


Bye!	Alain.
-- 
Hotmail users break umlauts for everyone else on a mailing list!
They should stop doing so immediately!
	« MSN considered HARMFUL » PCC CB on MU. © June 2002



