Japanese and UTF8

Thu, 17 Feb 2000 12:03:54 +0000

Werner Koch <wk@gnupg.org>:

> now that we have a Japanese translation, we have to do a conversion from
> EUC-JP to UTF-8, because UTF-8 is the required encoding for user IDs
> and some other strings in OpenPGP.
> 
> I don't think that the currently used simple mapping approach works
> with that character set, because it is a simple one-to-one mapping and
> I expect that EUC-JP uses state shifting.

Probably.

> What is a portable way to do this conversion?  I had some talks about
> that in Tokyo and it boiled down to letting the OS/libc do it.  Okay, how?

The official API uses iconv_open, iconv and iconv_close and is defined
in iconv.h. The version in glibc-2.1 doesn't do Japanese and deviates
from the standard. I hope glibc-2.2 will have a more correct and
complete implementation. Bruno Haible has a portable libiconv that
provides the same functions and does do Japanese. (I'm using it now,
linked with mutt, on a glibc-2.1 machine.)
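
As a rough illustration of those three calls, here is a minimal sketch
of a one-shot conversion to UTF-8. The helper name to_utf8 and the
4-bytes-per-input-byte buffer bound are assumptions made for this
example, not code from GnuPG, mutt or libiconv, and the cast on the
input pointer assumes the "char **" prototype (some older systems
declare "const char **" instead).

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string SRC from the
 * charset named FROM into UTF-8.  Returns a malloc'ed string that the
 * caller must free, or NULL if the conversion is unavailable or fails. */
char *
to_utf8 (const char *from, const char *src)
{
  iconv_t cd = iconv_open ("UTF-8", from);   /* (tocode, fromcode) */
  if (cd == (iconv_t) -1)
    return NULL;                 /* this iconv cannot do FROM -> UTF-8 */

  size_t inleft = strlen (src);
  size_t outsize = 4 * inleft + 1;   /* assumed upper bound, see lead-in */
  char *out = malloc (outsize);
  if (!out)
    {
      iconv_close (cd);
      return NULL;
    }

  char *inp = (char *) src;      /* iconv advances the pointer, not the data */
  char *outp = out;
  size_t outleft = outsize - 1;  /* keep one byte for the terminator */

  /* First call converts the input; second call (NULL input) resets the
   * conversion state and emits any pending output. */
  if (iconv (cd, &inp, &inleft, &outp, &outleft) == (size_t) -1
      || iconv (cd, NULL, NULL, &outp, &outleft) == (size_t) -1)
    {
      free (out);
      iconv_close (cd);
      return NULL;               /* invalid input or buffer too small */
    }

  *outp = '\0';
  iconv_close (cd);
  return out;
}
```

A caller would then write something like to_utf8 ("EUC-JP", name) and
treat a NULL result as "no conversion available". Whether the charset
name "EUC-JP" is accepted depends on the iconv implementation in use.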

There's concise info and relevant links at:

ftp://ftp.ilog.fr/pub/Users/haible/utf8/Unicode-HOWTO-5.html#ss5.1

If you're going to use iconv, then you might want to get rid of the
charset tables in util/strgutil.c. On the other hand, you might want
to do what I did with mutt: leave them in but use them only if
configure fails to detect iconv. I can send you my configure.in for
mutt, if you want; it's mostly adapted from Bruno Haible's
configure.in for clisp, if I remember correctly. You probably know
more about autoconf and can tell me what I did wrong ...
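
The "keep the tables, use them only without iconv" arrangement could
look roughly like the sketch below. The macro name HAVE_ICONV, the
function names, and the choice of ISO-8859-1 as the only built-in
charset are all assumptions for illustration; the real tables in
util/strgutil.c and the mutt configure.in are not reproduced here.

```c
#include <stdlib.h>
#include <string.h>
#ifdef HAVE_ICONV            /* assumed name of the macro set by configure */
# include <iconv.h>
#endif

/* Stand-in for a built-in charset table: ISO-8859-1 maps one-to-one
 * onto U+0000..U+00FF, so this sketch needs no lookup table. */
static char *
latin1_to_utf8 (const char *src)
{
  char *out = malloc (2 * strlen (src) + 1);  /* at most 2 bytes per byte */
  char *p = out;
  if (!out)
    return NULL;
  for (; *src; src++)
    {
      unsigned char c = (unsigned char) *src;
      if (c < 0x80)
        *p++ = (char) c;                      /* ASCII passes through */
      else
        {
          *p++ = (char) (0xC0 | (c >> 6));    /* two-byte UTF-8 sequence */
          *p++ = (char) (0x80 | (c & 0x3F));
        }
    }
  *p = '\0';
  return out;
}

/* Hypothetical entry point: returns a malloc'ed UTF-8 string or NULL. */
char *
native_to_utf8 (const char *charset, const char *src)
{
#ifdef HAVE_ICONV
  /* configure found iconv: let it handle every charset it knows */
  iconv_t cd = iconv_open ("UTF-8", charset);
  if (cd == (iconv_t) -1)
    return NULL;
  size_t inleft = strlen (src), outleft = 4 * inleft;  /* assumed bound */
  char *out = malloc (outleft + 1);
  char *inp = (char *) src, *outp = out;
  size_t rc = out ? iconv (cd, &inp, &inleft, &outp, &outleft)
                  : (size_t) -1;
  iconv_close (cd);
  if (rc == (size_t) -1)
    {
      free (out);
      return NULL;
    }
  *outp = '\0';
  return out;
#else
  /* no iconv: only the built-in one-to-one mappings are available */
  if (!strcmp (charset, "ISO-8859-1"))
    return latin1_to_utf8 (src);
  return NULL;
#endif
}
```

Built without HAVE_ICONV, this handles only the built-in mapping and
returns NULL for a multibyte charset such as EUC-JP, which is the
limitation that motivates using iconv where it is available.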

Edmund