Why hashed User IDs is not the solution to User ID enumeration (was: Re: Creating a key bearing no user ID)
Daniel Kahn Gillmor
dkg at fifthhorseman.net
Tue Jan 24 16:21:35 CET 2012
On 01/23/2012 06:23 PM, MFPA wrote:
> It sounds like you value the flavour of privacy that could be afforded
> by a scheme involving the use of hashes in UIDs to protect names and
> email addresses. Such a scheme would (for example) allow somebody with
> one of your email addresses to locate your key, but would not allow
> somebody to divine your names or email addresses by inspecting your
> key. An extension would be required to allow GnuPG to locate keys
> using both the hash and the plaintext string simultaneously.
What you're looking to do with this proposed hashed-user-id scheme is to
find a way to avoid allowing people to enumerate e-mail addresses or
User IDs from the data contained on the keyservers. Right? I'd also
like to be able to do that, but i don't think hashed-user-ids is an
effective way. Here's why:
I worked for a while with a group of people (several of the other
monkeysphere devs) to spec something like this out, to try to address
this very issue.
However, after thinking about the various possible solutions, and
reading more, i started to think this all smelled very similar to
another problem: DNSSEC zone enumeration.
DNSSEC zone enumeration is a byproduct of the way that NXDOMAIN
responses must be signed in order to be provable; the original proposal
required the signed NXDOMAIN response to indicate the range of names
which were excluded. This makes it easy for an attacker to jump from
name to name via NXDOMAIN records, and enumerate all records in the
zone. So far, this looks very much like the current keyservers, which
allow for trivial enumeration of IDs.
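To make the walk concrete, here is a minimal Python sketch of NSEC-style
enumeration. It is only a toy model: the zone contents, nsec_for() and
walk() are made up for illustration, plain string sorting stands in for
DNSSEC's canonical name ordering, and no real DNS queries are made.

```python
import bisect

# Hypothetical zone contents, invented for this illustration.
ZONE = sorted([
    "bar.example.com",
    "foo.example.com",
    "mail.example.com",
    "www.example.com",
])


def nsec_for(qname):
    """Toy signed denial: return the (previous, next) existing names
    that bracket qname, wrapping around at the end of the zone."""
    i = bisect.bisect_right(ZONE, qname)
    return ZONE[i - 1], ZONE[i % len(ZONE)]


def walk(first_guess):
    """Enumerate the zone by repeatedly asking for a name that sorts
    just after the last name learned from a denial."""
    found = []
    _, nxt = nsec_for(first_guess)
    first = nxt
    while True:
        found.append(nxt)
        # "\x00" appended to a name sorts immediately after that name.
        _, nxt = nsec_for(nxt + "\x00")
        if nxt == first:  # wrapped around: every name has been seen
            break
    return found


names = walk("aaa.example.com")
```

Each denial hands the attacker the next existing name, so a zone with N
names falls out after roughly N queries.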
DNSSEC tried to fix this with NSEC3 records, which work differently;
instead of listing the boundaries of the requested NXDOMAIN range, they
listed the boundaries in a hashed space. That is, instead of saying
"there are no records of any type between bar.example.com and
foo.example.com", they say "there are no records of any type whose
labels hash to something between 8a367d741d7a9a904ef6f92fd99de3d57ded1203
and cb17eb75226ca198afec4ea1170f02fade354e3e". So now, the attacker who
wants to enumerate the zone has to reverse the hash to uncover the names.
The trouble is that domain names (and e-mail addresses, and human names)
are very low-entropy things, and actually are pretty easy to enumerate
and test. Dan Bernstein wrote a tool called NSEC3walker that can
practically enumerate a DNSSEC-signed zone that uses NSEC3 records,
using pretty low-end hardware, and doing few network queries.
A comparable tool could be made to attack any sort of hashed-user-ids
scheme, which means that anyone who wants to harvest or enumerate
addresses this way could probably do it. Certainly, the bar is raised
for User ID enumeration, but only slightly.
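A rough Python sketch of what such an attack could look like follows.
This is not NSEC3walker, and no concrete hashed-user-id format has been
specified in this thread, so the digest used here (SHA-1 over the
trimmed, lowercased address) is purely an assumption, as are the
function names and the tiny candidate lists.

```python
import hashlib
from itertools import product


def digest(user_id):
    # Assumption: the scheme publishes SHA-1 of the lowercased address.
    normalized = user_id.strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def reverse_digests(published, local_parts, domains):
    """Offline dictionary attack: hash every candidate address and
    keep the ones whose digest appears in the published set."""
    wanted = set(published)
    recovered = {}
    for local, domain in product(local_parts, domains):
        candidate = f"{local}@{domain}"
        d = digest(candidate)
        if d in wanted:
            recovered[d] = candidate
    return recovered


# Digests as they might be harvested from a keyserver dump.
published = [
    digest("alice@example.org"),
    digest("zq7x-unlikely@example.net"),
]
recovered = reverse_digests(
    published,
    local_parts=["alice", "bob", "carol"],
    domains=["example.org", "example.com"],
)
```

Only the guessable address is recovered here, but real candidate lists
(common names crossed with popular mail domains) are cheap to build and
cheap to hash, and the whole search runs offline.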
So, as someone who was similarly eager for such a scheme, i have to ask
myself: does the marginal gain in address-enumeration-protection
outweigh the costs in complexity and confusion that the scheme adds?
Certainly, the keyservers will continue to support non-digested User
IDs, so now tools will need to be able to handle both forms; we'll
also need a policy for end-user agents to answer questions like: when
looking up this e-mail address, do i send it only in digested form to
the keyservers for lookup? Or do i send it in cleartext form as well,
thereby leaking the e-mail address to the keyserver operators (and to
anyone on the network path)? How do we explain or expose policy
questions like that to users who already struggle with the concepts
behind OpenPGP? Or do we modify the keyservers themselves to index
digested forms of cleartext User IDs, and respond to digest lookups with
cleartext responses, thereby turning the keyservers into a
digest-reversing oracle for those non-hashed User IDs which exist?
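The oracle concern in that last question can be shown with a toy
Python model. Nothing here reflects how any real keyserver works:
ToyKeyserver, its methods, and the SHA-1 digest are all hypothetical,
chosen only to illustrate the problem.

```python
import hashlib


def digest(user_id):
    # Assumption: same hypothetical digest the hashed User IDs would use.
    return hashlib.sha1(user_id.strip().lower().encode("utf-8")).hexdigest()


class ToyKeyserver:
    """Indexes cleartext User IDs by digest so that digest lookups
    can find ordinary (non-hashed) keys."""

    def __init__(self):
        self._by_digest = {}

    def upload(self, cleartext_user_id):
        self._by_digest[digest(cleartext_user_id)] = cleartext_user_id

    def lookup_by_digest(self, d):
        # Answers a digest query with the cleartext User ID, if known.
        return self._by_digest.get(d)


server = ToyKeyserver()
server.upload("alice@example.org")  # an older key with a cleartext UID

# Someone who only ever saw the digest (say, on a hashed-UID key)
# can now ask the server to reverse it.
seen_on_hashed_key = digest("alice@example.org")
revealed = server.lookup_by_digest(seen_on_hashed_key)
```

In this model, one cleartext key for an address is enough to undo the
hashing on every other key that carries the same address.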
Ultimately, i don't think the tradeoffs for this scheme are worthwhile
for the marginal and limited gain that the proposal provides. I'd love
to find a solution to the User ID enumeration problem, but i don't think
hashed-user-ids is it.