GPG-Charset-Problems (Windows)

Andreas John ajgpgml at tesla.inka.de
Sun May 11 01:20:01 CEST 2003


Hi!


If I generate a UserID in the Windows console with non-7-bit characters (e.g. German umlauts), the current charset is not taken into account (on a German Windows it's usually CP850), so the newly generated UserID does not get the right UTF-8 codes for the umlauts.
As long as GPG is used on a German Windows console this is no problem, because the output will look right within this "CP850-UTF8" (if I can say so).
But it does not work for all other users (that is, everyone not on a German Windows).
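
To make the mismatch concrete, here is a minimal sketch in C. It assumes that the console bytes are treated as ISO-8859-1 when they are turned into UTF-8 (an assumption for illustration, not something checked against the GPG source), and it uses a hypothetical partial CP850 table that covers only the German letters. All function names are made up for this example.

```c
#include <stddef.h>

/* Encode one code point below U+0800 as UTF-8; returns the bytes written. */
static size_t put_utf8(unsigned int cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
}

/* Hypothetical partial table: CP850 byte -> Unicode, German letters only. */
static unsigned int cp850_to_unicode(unsigned char b)
{
    switch (b) {
    case 0x81: return 0x00FC; /* u umlaut */
    case 0x84: return 0x00E4; /* a umlaut */
    case 0x8E: return 0x00C4; /* A umlaut */
    case 0x94: return 0x00F6; /* o umlaut */
    case 0x99: return 0x00D6; /* O umlaut */
    case 0x9A: return 0x00DC; /* U umlaut */
    case 0xE1: return 0x00DF; /* sharp s */
    default:   return b;      /* ASCII passes through; other high bytes
                                 are NOT handled in this sketch */
    }
}

/* Assumed current behaviour: the byte value is used as the code point. */
static size_t encode_as_latin1(unsigned char b, unsigned char *out)
{
    return put_utf8(b, out);
}

/* Charset-aware behaviour: map the CP850 byte to Unicode first. */
static size_t encode_as_cp850(unsigned char b, unsigned char *out)
{
    return put_utf8(cp850_to_unicode(b), out);
}
```

Under these assumptions the CP850 byte 0x81 (u umlaut) ends up as C2 81 instead of the correct C3 BC, which only looks right when it is decoded again with the same wrong mapping.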

And a quick look into simple-gettext.c told me:
    #if 0 /* Mapping is not used any more.  Instead we convert the files when
             Creating the binary distribution. */

So this is obviously a kludge that allows German umlauts in the translated strings that GPG outputs, and it also explains why the input/output of general translated texts and of UserIDs behaves so inconsistently.

As far as I know, in Windows terms the OEM charset means the console charset, whereas the ANSI charset is the Windows charset.
So the right way to handle this would be to use the MultiByteToWideChar/WideCharToMultiByte API from Windows to convert between CP_OEMCP and Unicode (16-bit).
Unfortunately, direct UTF-8 support in this API is only available with Win NT 4.0+, so some UTF-8<->Unicode (16-bit) "translation" is also required (which is of course no big deal, but nevertheless some additional overhead).
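
A rough sketch of the input direction (console bytes to UTF-8) could look like the following. The hand-written utf16_to_utf8() stands in for the "translation" step and is portable C; the Windows-only wrapper oem_to_utf8() is guarded by _WIN32. Both function names and the fixed 256-unit buffer are assumptions made for this example, not existing GPG code.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Convert n UTF-16 code units to UTF-8.  Returns the number of bytes
   written (no NUL terminator), or (size_t)-1 on an unpaired surrogate
   or when the output buffer is too small. */
static size_t utf16_to_utf8(const unsigned short *in, size_t n,
                            unsigned char *out, size_t outsize)
{
    size_t i, o = 0;

    for (i = 0; i < n; i++) {
        unsigned long cp = in[i];
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF) {       /* high surrogate */
            if (i + 1 >= n || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) { /* stray low surrogate */
            return (size_t)-1;
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (need > outsize - o)
            return (size_t)-1;

        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}

#ifdef _WIN32
/* Windows only: NUL-terminated console (OEM) string -> UTF-8 bytes.
   The 256-unit scratch buffer is an arbitrary limit for this sketch. */
static size_t oem_to_utf8(const char *in, unsigned char *out, size_t outsize)
{
    wchar_t wide[256];
    int n = MultiByteToWideChar(CP_OEMCP, 0, in, -1, wide, 256);

    if (n <= 0)
        return (size_t)-1;
    /* n counts the terminating NUL; wchar_t is 16 bits wide on Windows. */
    return utf16_to_utf8((const unsigned short *)wide, (size_t)(n - 1),
                         out, outsize);
}
#endif
```

Doing the UTF-8 step by hand avoids relying on CP_UTF8 in the Windows API, at the cost of the few extra lines above.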

For the po-file translation there is CharToOem (AnsiToOem is deprecated), but I would guess this will be problematic if the current Windows charset is not ISO-8859-1 (or whatever was used to create the po file).
So I suggest using UTF-8 (or 16-bit Unicode) pre-translation (like in the present kludge), but converting it to the currently used OEM charset with WideCharToMultiByte.
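
For the output direction, a sketch along the same lines: the translated string is kept as UTF-8, decoded to 16-bit Unicode by a small portable routine, and then handed to WideCharToMultiByte with CP_OEMCP. The names utf8_to_utf16() and utf8_to_oem() and the 256-unit scratch buffer are hypothetical, chosen only for this example.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Decode n bytes of UTF-8 into UTF-16 code units.  Returns the number of
   units written, or (size_t)-1 on malformed input or a too-small buffer.
   Overlong encodings are not rejected in this sketch. */
static size_t utf8_to_utf16(const unsigned char *in, size_t n,
                            unsigned short *out, size_t outsize)
{
    size_t i = 0, o = 0;

    while (i < n) {
        unsigned long cp;
        size_t extra, k;
        unsigned char b = in[i++];

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1;      /* stray continuation or invalid byte */

        if (extra > n - i)
            return (size_t)-1;       /* sequence is cut off */
        for (k = 0; k < extra; k++) {
            if ((in[i] & 0xC0) != 0x80)
                return (size_t)-1;
            cp = (cp << 6) | (in[i++] & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp < 0x10000) {
            if (outsize - o < 1)
                return (size_t)-1;
            out[o++] = (unsigned short)cp;
        } else {                     /* needs a surrogate pair */
            if (outsize - o < 2)
                return (size_t)-1;
            cp -= 0x10000;
            out[o++] = (unsigned short)(0xD800 | (cp >> 10));
            out[o++] = (unsigned short)(0xDC00 | (cp & 0x3FF));
        }
    }
    return o;
}

#ifdef _WIN32
/* Windows only: UTF-8 bytes -> NUL-terminated string in the OEM (console)
   charset.  Returns the length written, or -1 on error. */
static int utf8_to_oem(const unsigned char *in, size_t n,
                       char *out, int outsize)
{
    unsigned short wide[256];        /* arbitrary limit for this sketch */
    size_t w = utf8_to_utf16(in, n, wide, 256);
    int r = 0;

    if (w == (size_t)-1 || outsize < 1)
        return -1;
    if (w > 0) {
        /* wchar_t is 16 bits wide on Windows, so the cast is safe there. */
        r = WideCharToMultiByte(CP_OEMCP, 0, (const wchar_t *)wide, (int)w,
                                out, outsize - 1, NULL, NULL);
        if (r <= 0)
            return -1;
    }
    out[r] = '\0';
    return r;
}
#endif
```

Characters that do not exist in the active OEM code page are replaced by the system default character when the last two arguments are NULL, so the output can still be lossy on some consoles.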


This is not an extremely smooth approach (once again special code for Windows), but I strongly advise supporting this! It's simply too annoying to have charset inconsistencies...


Bye!