WOT and Authentication Research

Sun Dec 2 03:24:37 CET 2012

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hi all,

I have a couple of research ideas dealing with authentication and the WOT. I'm
looking for any criticism, opinions, or thoughts on my current directions.
Mostly, I want to make sure I'm not barking at the wrong problems or that there
are not any reasons why some of my current ideas fundamentally would never work
either in practice or in theory. It'd be ideal to incrementally deploy any of
these solutions, but I'm not assuming that redesigning things is out of the
question, particularly when it comes to the role of keyservers.

One way to describe the general problem of authentication that I think was
stated well was in "Distributed Identity Management in the PGP Web of Trust"[1]:

>we are trying to map unique and unforgeable identifiers (in this case public
>keys) to ambiguous and potentially contested identifiers (names, e-mail
>addresses, etc.)

I'd state this one step further and say that really you want to know that an
identity controls both this forgeable identity and sole access to the private
key. Correctly mapping a public key to a communication domain is important but
it is better when you can also capture the identity behind it. Google should
control its own domain name but also the private key of the public key
associated with it. Alice should actually control alice at alice.tld and her
private key.

When I say communication domain I am mainly considering any sort of unique
namespace as part of a communication medium that needs authentication (but may
not always require it). Usually these are handed out by single authorities. The
obvious cases are e-mail accounts, domain names, XMPP accounts. All have common
ways to encrypt and sign information (straight PGP, HTTPS, OTR, respectively).
The actual keys used for each encryption scheme can be subkeys of the master
identity.

Ultimately though you must also consider that private keys can be stolen or
lost. There are two independent failures involved: stolen/lost private keys
and/or lost control of a communication domain. To account for these failures
you can't make knowledge of this mapping immutable, but you also don't want to
make it trivial for someone to claim a new public key for an email address
whose real owner has already published a key. Malicious people can always of
course publish new mappings, so instead I want a strategy to generally restrict
publishing of bad mappings while balancing the ability to update mappings for
legitimate reasons.

I'd like to bring the discussion to find a more sane way to balance these
constraints. I have some ideas for improvements in two areas, but keep in mind
none of this is tested or fully worked out. First, what I see as the problems
with current infrastructure.

Problem 1: WOT

I see the main function of the WOT is to maximize the chances that, when
introduced to a new party, a user can verify the key through a path in the
WOT. It'd be nice to have some sense of reliability in this validity and to
detect whether this new party is malicious or not (their public key -> domain
mapping is not the identity you expect).

1) Trust metrics don't scale in finding valid paths, other proposed trust
metrics return non-binary results. When is a WOT path good enough?

Current GPG uses a trust metric to say whether or not a key you signed is
trusted to introduce another key. Although it makes sense to add a metric for a
key's trustworthiness to introduce new keys[2], I don't think this scales
logically. Even in the strongly connected WOT [3] the average distance is
around 6. This means that to introduce yourself to a new key you need to have
already determined the trustworthiness of 5 arbitrary nodes as introducers.
You want to use the web to validate new people, and I think in most cases it's
very unlikely you are capable of making this many trust decisions for some
random path.

You can make a trade off by using the trust metric of each node only to the
keys it signs. Now you might be able to validate longer paths, but the trust
metric is less meaningful.

Instead users often use the presence of a key on a keyserver for validation
(bad) or they attend many key signing parties and have directly signed the keys
they talk to. There is a lot of room for improvement; even the existence of
one valid path in the WOT (without trust metrics) is better than just taking a
key from a keyserver.
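
To make "the existence of one valid path" concrete, here is a minimal Python
sketch of a breadth-first search over a signature graph. The graph layout (a
dict from signer to the set of keys it signed), the function name, and the toy
key labels are all assumptions made for illustration; a real tool would work
on fingerprints and cryptographically checked signatures.

```python
from collections import deque


def find_signature_path(signatures, source, target):
    """Return the shortest chain of keys from source to target, or None.

    signatures maps a signer's key ID to the set of key IDs it has signed.
    No trust metric is applied: any signature counts as an edge.
    """
    if source == target:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        signer = queue.popleft()
        for signed in sorted(signatures.get(signer, ())):
            if signed in parent:
                continue
            parent[signed] = signer
            if signed == target:
                # Walk the parent links back to the source.
                path = [signed]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(signed)
    return None


# Hypothetical toy web: plain labels stand in for real key fingerprints.
toy_web = {
    "me": {"alice", "bob"},
    "alice": {"carol"},
    "bob": {"carol"},
    "carol": {"dave"},
}
path_to_dave = find_signature_path(toy_web, "me", "dave")
```

The search only answers whether some chain of signatures exists; it says
nothing about how trustworthy the intermediate signers are.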

2) Multiple keys can contain the same UID.

This is a problem because multiple keys can claim ownership of a single name
or email address. This is allowed because that email address might have
multiple keys used with it. I think this is a huge area for improvement. I'd
like to impose some sort of structure or restrictions on how non-malicious
nodes maintain a WOT. This way there must be some consensus on a single
established mapping so that it is not easy for Mallory to create a new key and
claim it belongs to an email address they don't own.
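
As a small illustration of the problem, the following Python sketch groups a
toy keyring by UID and reports every UID claimed by more than one key. The
(key ID, UID) pair layout, the key labels, and the choice to compare addresses
case-insensitively are assumptions for the example, not properties of any
existing keyserver.

```python
from collections import defaultdict


def find_contested_uids(keyring):
    """Map each UID claimed by more than one key to its sorted key IDs.

    keyring is an iterable of (key_id, uid) pairs.
    """
    claims = defaultdict(set)
    for key_id, uid in keyring:
        # Assumption: UIDs are compared after trimming and lower-casing.
        claims[uid.strip().lower()].add(key_id)
    return {uid: sorted(ids) for uid, ids in claims.items() if len(ids) > 1}


# Hypothetical keyring; KEY_M is Mallory's competing claim.
toy_keyring = [
    ("KEY_A", "alice@alice.tld"),
    ("KEY_M", "Alice@alice.tld"),
    ("KEY_B", "bob@bob.tld"),
]
contested = find_contested_uids(toy_keyring)
```

Detecting the conflict is the easy part; deciding which claim should win is
what the consensus or proof of control discussed below would have to settle.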

Problem 2: End-user Usability and Accessibility

The current WOT and PGP infrastructure is limited to advanced users. It
requires manual signing and knowledge to make abstract verification and trust
decisions.  This is the usability problem. I think a lot of this can be solved
through implementation and my current idea revolves around building signatures
for a WOT by automatically signing keys you've 'verified'. There are lots of
ways to capture out of band user interactions to build a web.

User-to-user interaction provides many ways of remembering and establishing
validity with identities. People close to you in your social network should be
strongly validated and remembered. No path through the WOT is necessary. Users
need a structured approach to both remembering and validating keys but also a
way to sign keys and participate in the WOT.

The Guardian Project has already started work on this end of things in their
PSST project [6]. I interned for them this past summer, and their PSST project
has largely inspired (or overlapped with) the things I am writing now.

General Solution:

Basically, I think there should be a single publicly established mapping of
public keys to domain identifiers. One of each type of domain identifier per
public key. It is assumed that each of these domains can provide some sort of
proof of control and that as long as you control your domain identifiers and
sole access to your private key you can always keep your mapping valid and
prevent others from claiming control. Ideally, if private keys couldn't be
lost, you could always use your private key to prove or reestablish ultimate
control of something like an email account. However, private keys can be lost
or stolen too and this must be considered.

Should a failure of lost access/control of EITHER a private key or domain
identifier occur there needs to be a best effort approach to resolving this
issue. If you lose everything, nothing can be done and you now no longer have
control of those accounts or communication channels. This is already true.
However, if this mapping also captured some real-world identity of yours, you
may be able to appeal through out of band channels to those who validated you
based on this knowledge.

The big picture is that I want to use some sort of consensus or proof of
control to publicly bind this single mapping. A keyserver infrastructure to
support this will be necessary but can be designed to require consensus on a
single WOT across many keyservers. Of course, all verification of paths should
take place offline and we want the servers to have as little authority as
possible. They ultimately will have some authority on what mappings are valid.
This places a stricter requirement on how we spread WOT information, but will
place a higher confidence in the ability to use that information.

Improving the role of the WOT:

In general, I think the WOT is necessary to provide introductions to new
identities. There needs to be some established guarantee of what it can provide
and what it cannot. I'd like to see existing paths in the WOT better leveraged
for new introductions.

1) 1-1 Mapping and Proof of Identity:

I mentioned this mapping above in that the WOT data structure requires a
'consensus' on a single mapping for each public key to a domain (there can
only be one!). Also a proof of control is necessary to create or change
mappings.
Much of the details of this still need to be worked out.

There needs to be some cost to changing this identity, even for legitimate
reasons, so that malicious users can neither present a different public key
for a domain identity they don't own nor, by gaining control of a domain
identity, remove the public key the legitimate user controls.

There may be some practical solutions that mitigate this problem. For email,
for example, when you wish to change your master key but keep your email
address, there could be a lead time on making the change if you lose your key.
So when a token is sent to the email address to confirm the change, even if
it's intercepted, as long as it isn't censored, the user in control of the
email address will have time to react and prove the key isn't actually lost. A
different set of criteria could be determined for each domain of potentially
contested mappings. In the future, if a user has a stronger ability to
guarantee their private keys are not stolen (i.e., they have a trusted
security token with backups) or that they will at least have access to
revocation, stricter requirements could be added to update a keyserver
identity.
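
The lead-time idea can be sketched as a small state machine. This is only a
rough Python illustration under stated assumptions: the class and method
names are hypothetical, the one-week delay is an arbitrary placeholder, time
is passed in explicitly as seconds, and "vetoed by the current key" stands in
for a properly verified signature from that key.

```python
from dataclasses import dataclass, field

# Assumed placeholder; the right delay per domain is an open question.
LEAD_TIME_SECONDS = 7 * 24 * 3600


@dataclass
class MappingRegistry:
    current: dict = field(default_factory=dict)  # identifier -> key ID
    pending: dict = field(default_factory=dict)  # identifier -> (key ID, time)

    def request_change(self, identifier, new_key, now):
        """Record a change request, e.g. after an emailed token was confirmed."""
        if identifier not in self.current:
            # The first claim for an identifier takes effect immediately.
            self.current[identifier] = new_key
            return
        self.pending[identifier] = (new_key, now + LEAD_TIME_SECONDS)

    def veto(self, identifier, signing_key):
        """Cancel a pending change if the veto comes from the mapped key."""
        if (identifier in self.pending
                and self.current.get(identifier) == signing_key):
            del self.pending[identifier]
            return True
        return False

    def lookup(self, identifier, now):
        """Return the key mapped at time now, applying matured changes."""
        if identifier in self.pending:
            new_key, effective = self.pending[identifier]
            if now >= effective:
                self.current[identifier] = new_key
                del self.pending[identifier]
        return self.current.get(identifier)


registry = MappingRegistry()
registry.request_change("alice@alice.tld", "KEY_A", now=0)
registry.request_change("alice@alice.tld", "KEY_M", now=100)  # contested
```

If Alice still holds KEY_A she can call veto() during the lead time and the
mapping stays put; if she really lost the key, the change simply matures.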

An important point here, I think, is also drawing the line between when it's
good to be flexible in allowing keys to change and when to fail to prevent MITM
attacks. Different domains will have different requirements. Also, there are a
lot of advantages to a central key infrastructure, but at some point it is
still a central point of failure. Maybe if this *better* keyserver thing plays
out, work could be done on a distributed WOT data structure that supplements
the keyservers. There would need to be a meaningful way to solve
conflicts between what a user learns from another user and what the keyserver
shows.

Initial links into this WOT and identities of the keyservers should be
distributed with user software that manages these keys. This way they have
secure HTTPS communication with the server and eventually they can move to
verifying their own seeds into the WOT. This is essentially key pinning and is
where the bootstrapping will occur. Since you're often already required to
trust that software is non-maliciously distributed, why not reduce the
bootstrap process to a single point? This reduces the opportunities for
malicious attacks to one. If people had physical keyrings (I think this is one
day ideal, something like GPG smart cards that store more data) you could also
trust an entity to supply some initial pinned keys through snail mail rather
than the network. That may or may not be better.

2) WOT structure, path computability, and utility

A lot of interesting statistics and research has been done on the current
strong set of PGP keys. The most recent attempt on improving trust metrics I've
found is in the paper cited as [1]. They argue that current users often simply
validate a key by its existence on a public keyserver and that "without
automated tools for determining the trust of keys in the web of trust in an
established way, users have no feasible alternatives to these bad practices."
This paper assigns a trust rating for 3 different methods of determining trust:
Disjoint paths, Network Flow, and Strongest Path. I think these are interesting
methods, but in the paper, random weights of introduction trust are assigned to
each edge in the WOT. This is at least scalable compared to the current GPG
solution because you only need a trust metric for keys you signed. I'm not
convinced trust introduction metrics are useful beyond a localized scope, but I
think path analysis will be.

Other statistics and analysis of paths in the WOT: [4][5]. It is not clear to
me what is the best way to validate in the WOT and what decision to make when
you receive a non-binary metric.

For the moment, I think using strongest or disjoint paths would be fine to
validate (but not necessarily sign) keys in the WOT once we have some better
guarantees that the mapping is correct. So you are essentially trusting the
maintenance of the WOT by keyservers combined with an existing valid path that
you compute yourself.
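
For the disjoint-path option, one standard way to count vertex-disjoint paths
is unit-capacity max flow with every intermediate key split into an "in" and
an "out" node. The Python sketch below does that with breadth-first
augmenting paths. It is an illustration only, not the method used in [1]; the
function name, graph layout, and toy key labels are assumptions.

```python
from collections import defaultdict, deque


def count_disjoint_paths(signatures, source, target):
    """Count vertex-disjoint signature paths from source to target."""
    if source == target:
        raise ValueError("source and target must differ")
    capacity = defaultdict(int)
    neighbours = defaultdict(set)

    def add_edge(u, v):
        capacity[(u, v)] += 1
        neighbours[u].add(v)
        neighbours[v].add(u)  # lets the search follow residual edges

    def node_in(key):
        return key if key in (source, target) else (key, "in")

    def node_out(key):
        return key if key in (source, target) else (key, "out")

    keys = set(signatures)
    for signed_keys in signatures.values():
        keys |= set(signed_keys)
    for key in keys:
        if key not in (source, target):
            # Capacity 1 through each intermediate key: one path at most.
            add_edge((key, "in"), (key, "out"))
    for signer, signed_keys in signatures.items():
        for signed in signed_keys:
            if signer != signed:
                add_edge(node_out(signer), node_in(signed))

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in parent and capacity[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return flow
        v = target
        while parent[v] is not None:
            u = parent[v]
            capacity[(u, v)] -= 1
            capacity[(v, u)] += 1
            v = u
        flow += 1


# Hypothetical toy web: every path from "me" to "dave" runs through "carol".
toy_web = {
    "me": {"alice", "bob"},
    "alice": {"carol"},
    "bob": {"carol"},
    "carol": {"dave"},
}
paths_to_carol = count_disjoint_paths(toy_web, "me", "carol")
paths_to_dave = count_disjoint_paths(toy_web, "me", "dave")
```

In the toy web there are two independent routes to carol but only one to
dave, because carol is a single point of failure for that introduction.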

One interesting way to structure the WOT is by creating a Hypercube of
Trust[7].  Based on the size of the Hypercube you have a strict # of bounded
disjoint paths that are easily computed and some guarantees on malicious nodes.
Quite interesting, but the obvious problem is that this structure doesn't
reflect the reality of how the WOT necessarily must be created
(socially/randomly). It may be possible to use existing signatures in the WOT
to create hypercubes, but that's just another research path. Maybe you could
detect what specific signatures you need to create a hypercube and ask those
users to validate each other. The hypercube would sort of be an overlay
structure used for computability and possibly for better guarantees on
malicious nodes (people who make bad signatures on purpose or not).

Improving WOT building and user accessibility:

A user's own keyring provides the strongest level of verification. Certain
types of validation interactions with people they know can be captured as key
signatures. Manual signing of course would still be possible to help build
connectivity in the WOT. Also, it will be necessary that a user has access to
their private keys whenever possible. Most domains of communications don't
require authentication, but it's ideal that a user always has the option. This
of course ties directly into the "Portable Shared Security Tokens" research by
the Guardian Project[6].

At some point a user may not be able to verify or validate through the WOT.
Then you must fall back to the SSH model of simply remembering keys, otherwise
known as TOFU/POP (trust on first use / persistence of pseudonym). Verifying a
user and keeping a signature is simply the stronger model of this and will also
contribute signatures to the WOT. Trusting on first use is bad, so this should
be minimized whenever possible or even disallowed for certain use cases.
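
A minimal Python sketch of the TOFU fallback, assuming an in-memory store and
hypothetical identifier and fingerprint strings, might look like this. It
only remembers the first key seen and flags later changes; it does not
attempt to decide what to do about a mismatch.

```python
class PinStore:
    """Remember the first key seen for each identifier (TOFU)."""

    def __init__(self):
        self._pins = {}

    def check(self, identifier, fingerprint):
        """Return 'new', 'match', or 'mismatch' for the presented key."""
        pinned = self._pins.get(identifier)
        if pinned is None:
            # First use: nothing to compare against, so trust and remember.
            self._pins[identifier] = fingerprint
            return "new"
        return "match" if pinned == fingerprint else "mismatch"


pins = PinStore()
first = pins.check("bob@jabber.tld", "FPR_1")
second = pins.check("bob@jabber.tld", "FPR_1")
third = pins.check("bob@jabber.tld", "FPR_2")
```

The "new" result is exactly the blind trust decision the rest of this
proposal tries to minimize.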

More details will need to be worked out. What if you are forced to trust
someone on first use, but are MITM'd? Then the next time you talk to this
person (still MITM'd) you discover a valid link through the WOT to the real
public key. It'd be nice to say that there is enough knowledge to abort your
MITM connection, but I'm not yet convinced there is a clear hierarchy of
validity here.

1) Automatically building WOT signatures by capturing validation events

It'd be nice if every user could contribute to the WOT but also easily verify
their own close friends they talk to. Protocols like OTR employ the socialist
millionaire protocol for verification, and ZRTP can leverage knowledge of a
user's voice to verify each other's keys. This is a pretty strong
verification, I would argue. These exchanges should be captured as signatures.
The more ways to capture these interactions the better. Android NFC 'key
bumping' might be another valid out of band signing process and you could
always find new ways.

2) Minimizing Trust Decisions through portability and use of key hierarchy

Ideally, if Alice validates Bob through the OTR socialist millionaire
protocol, it'd be nice if this translates over to email as well. By tying
multiple accounts to a single identity you can minimize the number of
verification or blind trust decisions a user must make. Thus you have a master
key that has signatures for subkeys of each communication domain. So Alice and
Bob talk on Jabber and validate each other's OTR keys. Then, since they both
have their master key present, these keys are exchanged after the OTR sequence
and they both sign and verify each other's keys. Now when they use their email
encryption and signing subkeys they can validate them through this same
signature. Even if they are sending email on a different computer that doesn't
have access to their private master key, as long as they know the public key
of their master key, this original verification signature can be passed along
safely to their devices that have email, since signatures are not private and
cannot be forged. This improves portability as well.
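
The chain of reasoning here (verified master key, master certifies subkey,
therefore subkey is validated) can be sketched in Python as below. No real
cryptography is performed: each (master, subkey) pair in the set is an
assumed stand-in for a signature that has already been checked, and all
names and key labels are hypothetical.

```python
def subkey_is_validated(subkey, verified_masters, certifications):
    """Return the verified master key that certified subkey, or None.

    verified_masters: master key IDs the user has verified (e.g. via OTR).
    certifications: set of (master_id, subkey_id) pairs, each standing in
    for an already-checked signature by the master over the subkey.
    """
    for master, certified in sorted(certifications):
        if certified == subkey and master in verified_masters:
            return master
    return None


# Alice verified Bob's master key once, during an OTR session.
alice_verified = {"BOB_MASTER"}
published = {
    ("BOB_MASTER", "BOB_OTR_SUBKEY"),
    ("BOB_MASTER", "BOB_EMAIL_SUBKEY"),
    ("MALLORY_MASTER", "FAKE_BOB_EMAIL_SUBKEY"),
}
email_check = subkey_is_validated("BOB_EMAIL_SUBKEY", alice_verified, published)
fake_check = subkey_is_validated("FAKE_BOB_EMAIL_SUBKEY", alice_verified, published)
```

One verification of the master key covers every subkey it certifies, which is
the reduction in trust decisions described above.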

Open Problems:

How can we minimize the authority of the keyservers? Users will have strong
guarantees of people signed in their keyring but must trust the keyservers to
establish single mappings between public key and unique communication domain
identifier. Also, how would you handle discrepancies between user and
keyserver if they wish to build and maintain their own WOT? Is it practical
for users to initiate their own proof of control? Can everyone be a keyserver?

Also one thing I realized about this mapping I'd want to enforce is that this
is precisely what certificate authorities are supposed to be doing when they
are doing their job. Except we want to capture this without centralized
failures and retain the ability to create our own keys and signatures.

So thank you for reading any of this email. I apologize for the verbosity, much
of this writing I kept to help organize my own thoughts. Some of this material
revolves around trying to develop some sort of community consensus around best
practices for authentication infrastructure and some of this is research that
still needs exploring, thinking, and results.

I am a new grad student at UCSB and interned for the Guardian Project this past
summer. I have been interested in authentication for some time, and discussions
about Guardian Project's PSST project have been an initial building block for
some of these thoughts.

Regards,

Patrick Baxter

References

[1] "Distributed Identity Management in the PGP Web of Trust"
http://www.seas.upenn.edu/~cse400/CSE400_2005_2006/Margolis/paper.pdf

[2] GNU Privacy Handbook: Using Trust to Validate Keys
http://www.gnupg.org/gph/en/manual.html#AEN385

[3] Analysis of the Strong Set in the PGP Web of Trust
http://pgp.cs.uu.nl/plot/

[4] Neal McBurnett, PGP Web of Trust Statistics
http://bcn.boulder.co.us/~neal/pgpstat/

[5] pgp keys, pathfinder, statistics, references
http://www.staff.science.uu.nl/~penni101/henkp/pgp/

[6] Portable Shared Security Token
http://guardianproject.info/wiki/PSST

[7] Improving Webs of Trust Through Predetermined Graph Structure
http://repository.lib.ncsu.edu/ir/bitstream/1840.16/617/1/etd.pdf
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQGcBAEBCgAGBQJQurtXAAoJEByqL5nxaJPJGsoL/jcVEbx4qBwMdOTlFb0xYqUu
Y8+i9x36GbB2MIuvoIKj9erKwrT/h+wzcuGVAXb2SaGRgCStPWxGjxrcE9y5dI/z
g7Q8d77LeU6QjdLzr0arhLajuRkL9/E4KQfdJDUVtdqWPrRitx15mX50VXKwS36v
AsJuFKXGqqncGXRERTQbGxqngBXMDME27a4emHcidi40ftd1yec3ESEBmLfNSPMx
Nu49VnytjD8XguKC7gF7E1qnIQ5toFU2VlioZqfGFbrVnJf4wmSEU/aiMAYurLcx
oYYGTNia9c66E6uUpTd4gXYMQOgEgRwhOfE1diQj3w9TXyaOzuFQDxvHTiID5Geg
tJFqZxkOVzuragDrRAhwBaf3RoBwhDSgym8DkneRDq8EqgMHiYiQL240Jsp9j9RO
cMOEpRgt4WjyMpy5iKNiU31W3OodsF+npX1MVGhWo7pIR3BFO+ee45wuOwnNjhOv
SA9WE5iJ5sxTe+Ibjip9Yfr9llQIl+6XRrFALVAGcg==
=a9bQ
-----END PGP SIGNATURE-----