Encoding of user ID strings
Werner Koch
wk at gnupg.org
Tue May 24 08:26:54 CEST 2016
On Mon, 23 May 2016 20:19, rjh at sixdemonbag.org said:
> At first blush it appears the answer is "no, but most people use UTF-8."
> If so that's fine, but I'll have to silently discard a number of user
OpenPGP requires that the user ID be UTF-8 encoded. Older PGP versions
did not care about encoding and used whatever the system provided. Thus
there are lots of user IDs (e.g. Müller) with the wrong encoding. This is
what GPA uses to fix the problem for most of the western world's PGP users:
--8<---------------cut here---------------start------------->8---
  /* Due to a bug in old and not so old PGP versions user IDs have
     been copied verbatim into the key.  Thus many users with Umlauts
     et al. in their name will see their names garbled.  Although this
     is not an issue for me (;-)), I have a couple of friends with
     Umlauts in their name, so let's try to make their life easier by
     detecting invalid encodings and convert that to Latin-1.  We use
     this even for X.509 because it may make things even better given
     all the invalid encodings often found in X.509 certificates.  */
  for (s = string; *s && !(*s & 0x80); s++)
    ;
  if (*s && ((s[1] & 0xc0) == 0x80) && ( ((*s & 0xe0) == 0xc0)
                                         || ((*s & 0xf0) == 0xe0)
                                         || ((*s & 0xf8) == 0xf0)
                                         || ((*s & 0xfc) == 0xf8)
                                         || ((*s & 0xfe) == 0xfc)) )
    {
      /* Possible utf-8 character followed by continuation byte.
         Although this might still be Latin-1 we better assume that it
         is valid utf-8. */
      return g_strdup (string);
    }
  else if (*s && !strchr (string, 0xc3))
    {
      /* No 0xC3 character in the string; assume that it is Latin-1.  */
      return g_convert (string, -1, "UTF-8", "ISO-8859-1", NULL, NULL, NULL);
    }
  else
    {
      /* Everything else is assumed to be UTF-8.  We do this even that
         we know the encoding is not valid.  However as we only test
         the first non-ascii character, valid encodings might
         follow.  */
      return g_strdup (string);
    }
--8<---------------cut here---------------end--------------->8---
[ Feel free to reuse - I put this snippet into the public-domain or under
CC-0 ]
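
For readers who want to try the heuristic without GLib, here is a minimal
standalone sketch of the same three-way decision. It is not the GPA code:
the names fix_user_id_encoding, copy_string and latin1_to_utf8 are
hypothetical, and the hand-written Latin-1 conversion stands in for
g_convert on the assumption that the input really is ISO-8859-1. Like the
snippet above, it inspects only the first non-ASCII byte and still accepts
the obsolete 5- and 6-byte lead bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'ed copy of STRING.  */
static char *
copy_string (const char *string)
{
  size_t n = strlen (string) + 1;
  char *p = malloc (n);

  if (p)
    memcpy (p, string, n);
  return p;
}

/* Hypothetical helper: treat STRING as ISO-8859-1 and return a
   malloc'ed UTF-8 version.  Each byte >= 0x80 becomes two bytes.  */
static char *
latin1_to_utf8 (const char *string)
{
  const unsigned char *s;
  size_t n = 0;
  char *result, *d;

  for (s = (const unsigned char *)string; *s; s++)
    n += (*s & 0x80) ? 2 : 1;
  result = malloc (n + 1);
  if (!result)
    return NULL;
  d = result;
  for (s = (const unsigned char *)string; *s; s++)
    {
      if (*s & 0x80)
        {
          *d++ = (char)(0xc0 | (*s >> 6));
          *d++ = (char)(0x80 | (*s & 0x3f));
        }
      else
        *d++ = (char)*s;
    }
  *d = 0;
  return result;
}

/* Return a malloc'ed string that is hopefully UTF-8.  The caller
   must free the result; NULL is returned if malloc fails.  */
static char *
fix_user_id_encoding (const char *string)
{
  const unsigned char *s;

  assert (string != NULL);

  /* Skip the leading ASCII part.  */
  for (s = (const unsigned char *)string; *s && !(*s & 0x80); s++)
    ;

  if (*s && (s[1] & 0xc0) == 0x80
      && ((*s & 0xe0) == 0xc0 || (*s & 0xf0) == 0xe0
          || (*s & 0xf8) == 0xf0 || (*s & 0xfc) == 0xf8
          || (*s & 0xfe) == 0xfc))
    return copy_string (string);    /* Lead byte + continuation byte.  */
  else if (*s && !strchr (string, '\xc3'))
    return latin1_to_utf8 (string); /* No 0xC3 anywhere: assume Latin-1.  */
  else
    return copy_string (string);    /* Pure ASCII, or give up and copy.  */
}
```

With this sketch, the Latin-1 bytes "M\xfcller" come back as the UTF-8 bytes
"M\xc3\xbcller", while a user ID that is already valid UTF-8 or pure ASCII
is returned as an unchanged copy.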
Shalom-Salam,
Werner
--
Thoughts are free. Exceptions are regulated by a federal law.
/* EFH in Erkrath: https://alt-hochdahl.de/haus */