Japanese and UTF8
Edmund GRIMLEY EVANS
edmundo@rano.org
Thu, 17 Feb 2000 12:03:54 +0000
Werner Koch <wk@gnupg.org>:
> now that we have a Japanese translation, we have to do a conversion from
> EUC-JP to UTF-8, because UTF-8 is the required encoding for user IDs
> and some other strings in OpenPGP.
>
> I don't think that the currently used simple mapping approach works
> with that character set, because it is a simple one-to-one mapping and
> I expect that EUC-JP uses state shifting.
Probably.
> What is a portable way to do this conversion? I had some talks about
> that in Tokyo and it boiled down to let the OS/libc do it. Okay, how?
The official API uses iconv_open, iconv and iconv_close and is defined
in iconv.h. The version in glibc-2.1 doesn't do Japanese and deviates
from the standard. I hope glibc-2.2 will have a more correct and
complete implementation. Bruno Haible has a portable libiconv that
provides the same functions and does do Japanese. (I'm using it now,
linked with mutt, on a glibc-2.1 machine.)
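For illustration, here is a minimal sketch of how those three calls fit together. The function name to_utf8 and the 4x output-buffer estimate are my own assumptions, not anything taken from GnuPG or mutt. The second iconv call with a NULL input flushes any pending shift state, which matters for stateful encodings.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string `in` from
   `from_charset` to UTF-8. Returns a malloc'd string that the caller
   must free, or NULL if the conversion is unsupported or fails. */
char *to_utf8(const char *from_charset, const char *in)
{
    /* iconv_open takes the target charset first, then the source. */
    iconv_t cd = iconv_open("UTF-8", from_charset);
    if (cd == (iconv_t)-1)
        return NULL;

    size_t inleft = strlen(in);
    size_t outleft = inleft * 4;        /* assumed generous upper bound */
    char *out = malloc(outleft + 1);    /* +1 for the terminating NUL */
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    /* POSIX declares the input as char **; some older implementations
       use const char ** instead, so this cast may need adjusting. */
    char *inp = (char *)in;
    char *outp = out;

    /* Convert the input, then flush any pending shift sequence. */
    if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1
        || iconv(cd, NULL, NULL, &outp, &outleft) == (size_t)-1) {
        free(out);
        iconv_close(cd);
        return NULL;
    }

    *outp = '\0';
    iconv_close(cd);
    return out;
}
```

A call such as to_utf8("EUC-JP", user_id) would then return the UTF-8 form, provided the installed iconv knows that charset name. With a separate libiconv you would typically also link with -liconv.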
There's concise info and relevant links at:
ftp://ftp.ilog.fr/pub/Users/haible/utf8/Unicode-HOWTO-5.html#ss5.1
If you're going to use iconv, then you might want to get rid of the
charset tables in util/strgutil.c. On the other hand, you might want
to do what I did with mutt: leave them in but use them only if
configure fails to detect iconv. I can send you my configure.in for
mutt, if you want; it's mostly adapted from Bruno Haible's
configure.in for clisp, if I remember correctly. You probably know
more about autoconf and can tell me what I did wrong ...
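Roughly, the "leave them in but use them only if configure fails to detect iconv" arrangement looks like the sketch below. HAVE_ICONV is assumed to be the macro that configure defines, native_to_utf8 is a hypothetical name, and the Latin-1 branch is only a stand-in for the existing tables (Latin-1 needs no table because its byte values equal the Unicode code points).

```c
#include <stdlib.h>
#include <string.h>

#ifdef HAVE_ICONV   /* assumed to be defined by configure when iconv is found */
#include <iconv.h>

/* iconv path: handles whatever charsets the installed iconv supports. */
char *native_to_utf8(const char *charset, const char *in)
{
    iconv_t cd = iconv_open("UTF-8", charset);
    if (cd == (iconv_t)-1)
        return NULL;

    size_t inleft = strlen(in);
    size_t outleft = inleft * 4;        /* assumed generous upper bound */
    char *out = malloc(outleft + 1);
    char *inp = (char *)in;
    char *outp = out;

    if (out == NULL
        || iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1
        || iconv(cd, NULL, NULL, &outp, &outleft) == (size_t)-1) {
        free(out);                      /* free(NULL) is harmless */
        iconv_close(cd);
        return NULL;
    }
    *outp = '\0';
    iconv_close(cd);
    return out;
}

#else   /* no iconv: fall back to a built-in one-to-one mapping */

/* Fallback path: only ISO-8859-1 is handled in this sketch. */
char *native_to_utf8(const char *charset, const char *in)
{
    if (strcmp(charset, "ISO-8859-1") != 0)
        return NULL;                    /* multibyte charsets need iconv */

    char *out = malloc(strlen(in) * 2 + 1);   /* each byte becomes 1 or 2 */
    if (out == NULL)
        return NULL;

    char *p = out;
    for (const unsigned char *s = (const unsigned char *)in; *s; s++) {
        if (*s < 0x80) {
            *p++ = (char)*s;                    /* ASCII passes through */
        } else {
            *p++ = (char)(0xC0 | (*s >> 6));    /* two-byte UTF-8 lead */
            *p++ = (char)(0x80 | (*s & 0x3F));  /* continuation byte */
        }
    }
    *p = '\0';
    return out;
}

#endif
```

Callers see the same function either way; without iconv a Japanese charset simply yields NULL, so the caller can report that the conversion is unavailable.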
Edmund