[Patch] international domain names for email addresses

Simon Josefsson jas at extundo.com
Fri Jan 7 00:02:00 CET 2005


David Shaw <dshaw at jabberwocky.com> writes:

>> UTF-8 only encodes codepoints (0x00FC, 0x0075 and 0x0308).
>> 
>> Unicode defines how those codepoints are interpreted.
>> Some "characters" can be represented via different codepoint
>> representations.
>> 
>> A simple example is "ü"
>> 
>> U+00FC LATIN SMALL LETTER U WITH DIAERESIS
>> or
>> U+0075 LATIN SMALL LETTER U
>> U+0308 COMBINING DIAERESIS
>> 
>> There are much more complicated cases in polytonic Greek, Hangul (Korean)
>> and Hebrew.
>> 
>> One way to ease the problem would be to specify one of the 4 so-called
>> normalization forms in RFC 2440, section 3.4 (Text).
>
> Ah, I did not understand.  Wow, that's a massive headache.

Using the Punycode form in the OpenPGP User-ID is another option.  And
from what I can tell, it is the only option that can be used without
changing OpenPGP.  IDNA takes care of normalization, so that both
inputs, U+00FC and U+0075 U+0308, produce the same output: xn--tda.
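
A minimal sketch of that normalization behaviour, assuming Python's
built-in "idna" codec (which implements IDNA 2003 with Nameprep) as a
stand-in for whatever IDNA library would actually be used.  The helper
name to_ace is hypothetical and only for illustration:

```python
import unicodedata

# Two codepoint sequences that render as the same character.
precomposed = "\u00fc"    # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"    # U+0075 followed by U+0308 COMBINING DIAERESIS

# The raw strings, and therefore their UTF-8 encodings, differ.
assert precomposed != decomposed
assert precomposed.encode("utf-8") != decomposed.encode("utf-8")

# Unicode normalization (NFC here) maps both to a single form.
assert unicodedata.normalize("NFC", decomposed) == precomposed


def to_ace(label: str) -> str:
    """Convert one domain label to its ASCII (Punycode) form.

    Hypothetical helper: relies on Python's "idna" codec, which
    applies Nameprep (including normalization) before Punycode.
    """
    return label.encode("idna").decode("ascii")


print(to_ace(precomposed))  # xn--tda
print(to_ace(decomposed))   # xn--tda
```

Because both inputs yield the same ASCII label, a User-ID that stores
the Punycode form compares equal byte for byte regardless of which
codepoint sequence was typed.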

Regards,
Simon


