Passphrase entropy (was Re: Symmetric encryption)

Ryan Malayter rmalayter at bai.org
Tue Oct 26 21:51:21 CEST 2004


[Chris De Young]
> I had thought that English has only somewhere around 1.5 bits worth of
> entropy per character.  A passphrase certainly could have more than
> that because it's not necessarily real English, uses a wider character
> set, and so on... is that difference really enough?  19.5 8-bit
> characters is 156 bits; that seems (intuitively, which granted can be
> misleading) to be getting closer to real randomness than a passphrase
> would allow.  At least, any passphrase that someone could
> remember. :-)  It's only 3.5 characters longer than 128 bits, after
> all. 

There are 95 "printable" characters on a US keyboard, including the
space character. (We dumb Americans can't deal with those crazy accented
characters.)

95 ~= 2^6.57

So we have 6.57 bits of entropy per character, assuming we select our
characters totally randomly.

128/6.57 ~= 19.48

So we need 19.5 characters to get 128 bits of entropy in our pass
phrase. Since you can't enter half-characters, you really need a 20
character pass phrase.
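
For anyone who wants to check the arithmetic, here is a minimal Python
sketch. It assumes "printable" means ASCII codes 32 through 126, and it
uses the standard library's secrets module as one way to get the
"totally random" selection. The names ALPHABET and TARGET_BITS are
made up for illustration.

```python
import math
import secrets

# Assumption: "printable" means ASCII codes 32 (space) through 126 (~).
ALPHABET = [chr(code) for code in range(32, 127)]
TARGET_BITS = 128

bits_per_char = math.log2(len(ALPHABET))         # about 6.57
length = math.ceil(TARGET_BITS / bits_per_char)  # 19.48 rounded up

# secrets.choice draws from the OS random source, one independent
# uniform pick per character.
passphrase = "".join(secrets.choice(ALPHABET) for _ in range(length))

print(f"{len(ALPHABET)} symbols, {bits_per_char:.2f} bits each, "
      f"{length} characters needed")
print(passphrase)
```

Note that the result can start or end with a space, since the space is
one of the 95 symbols.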

Any non-randomness you add to the process (using real words, using the
first letters of each word from a sentence, for example) severely
decreases the amount of entropy per character. English prose has about
1.5 bits of entropy per character, which means you'd need an
86-character pass phrase of English text to get 128 bits of entropy.
That's a lot to remember.
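
The same division gives both figures. A short sketch, where
chars_needed is a hypothetical helper and 1.5 bits per character is
simply the rule-of-thumb value quoted above, not something measured:

```python
import math

def chars_needed(target_bits: float, bits_per_char: float) -> int:
    """Smallest whole number of characters carrying at least target_bits."""
    return math.ceil(target_bits / bits_per_char)

random_printable = chars_needed(128, math.log2(95))  # uniform over 95 symbols
english_prose = chars_needed(128, 1.5)               # rule-of-thumb estimate

print(random_printable, english_prose)
```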

Incidentally, the entropy of English text has usually been estimated
from its compressibility. The very best compression algorithms (e.g.
PPM with arithmetic coding) will compress a large body of English text
down to about 20% of its original size. At 8 bits per character, that
is 8 * 0.2 = 1.6 bits per character, which is why we frequently see the
"1.5 bits per character" entropy number for English.
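
A rough way to try the compressibility estimate yourself is sketched
below. Python's standard library has no PPM compressor, so this swaps
in bz2, which generally does not compress English as tightly and so
should give a somewhat higher figure. estimated_bits_per_char is a
hypothetical helper name, and the input needs to be a large body of
text, since on short samples the compressor's fixed overhead dominates.

```python
import bz2

def estimated_bits_per_char(text: str) -> float:
    """Upper-bound estimate: compressed size in bits per input character."""
    if not text:
        raise ValueError("need some text to measure")
    raw = text.encode("utf-8")
    packed = bz2.compress(raw, 9)  # 9 = largest block size
    return 8 * len(packed) / len(text)

# Usage, with a hypothetical file holding a large body of English text:
#   with open("corpus.txt", encoding="utf-8") as handle:
#       print(estimated_bits_per_char(handle.read()))
```

Any compressor only gives an upper bound on the entropy, since a
better model of English would squeeze the same text further.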

Regards,
	Ryan


