Passphrase Encoding and Entropy

Roscoe eocsor at gmail.com
Wed Jun 8 07:03:00 CEST 2005


Bit early in the morning to try to comprehend your main question, but
I can answer the question about the string-to-key (S2K) part :).

The OpenPGP RFC outlines three ways to generate a key from a passphrase:

1. Merely hashing the passphrase.

2. Salting then hashing.

3. Salting then hashing, then hashing again some configurable number of
times (the iterated and salted form).


The third is the best; I [believe] gpg uses that (a rough sketch follows below).
Unless there is some [very] good reason not to, you should always use a salt.
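
Roughly, and just to illustrate the difference between the three (this is
not the exact octet-count construction the RFC specifies, and the hash
choice, iteration count and function names below are made up for the
example):

import hashlib, os

def simple_s2k(passphrase):
    # 1. Merely hash the passphrase.
    return hashlib.sha1(passphrase).digest()

def salted_s2k(passphrase, salt):
    # 2. Salt, then hash; the salt defeats precomputed dictionaries.
    return hashlib.sha1(salt + passphrase).digest()

def iterated_salted_s2k(passphrase, salt, count=65536):
    # 3. Salt, then keep hashing.  The real iterated-and-salted S2K feeds
    # the hash a configurable number of octets of repeated salt+passphrase;
    # this loop only approximates that idea.
    h = hashlib.sha1()
    block = salt + passphrase
    fed = 0
    while fed < count:
        h.update(block)
        fed += len(block)
    return h.digest()

salt = os.urandom(8)   # OpenPGP uses an 8-octet salt
key = iterated_salted_s2k(b"my passphrase", salt)

The point of the iteration count is to make every guess cost the attacker
that much hashing work as well.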

On 6/8/05, Oskar L. <oskar at rbgi.net> wrote:
> "Martin Geisler" <mgeisler at mgeisler.net> wrote:
> 
> > When you have 64 different possibilities, all of equal likelihood,
> > then you can code them using 6 bits. This is what the entropy tells
> > you.
> >
> > The fact that A in the 7-bit ASCII standard is 01000001 is just a
> > coincidence --- they could just as well have put your chosen 64
> > characters into the lower 6 bits, and then have the other 64 available
> > characters use the high bit.
> >
> > In general it doesn't change anything if you encode your message (a
> > passphrase in your example) in a different encoding: the amount of
> > information stays the same if you still just select your characters
> > from the same subset.
> >
> >
> > So making a passphrase of ASCII characters, and then encoding it using
> > UTF-16 doesn't make it more secure. Sure, UTF-16 gives you the
> > potential to encode something like 2^16-1 characters, but to calculate
> > the entropy you can disregard all characters which you will never
> > choose.
> >
> > The formula for entropy explains this:
> >
> >   H(X) = - sum_{i=1}^n p(i) * log_2(p(i))
> >
> > Here the p(i)'s are the probability that your message will be "i".
> > With a bigger space of possible messages (a bigger n) then the sum
> > contains more terms, but if you still select your message from the
> > same small set, then most of the terms will be zero. So if a message
> > of value "j" is, say, a Chinese passphrase, then p(j) = 0 would mean
> > that you know you'll never choose such a passphrase. And thus the term
> > disappears from the sum (well, actually you get a problem with taking
> > the logarithm of zero, but let's not go into that).
> >
> >
> > See http://en.wikipedia.org/wiki/Information_entropy for more on how
> > to calculate the entropy, but I hope this helped a bit.
> 
> Thanks for your answer, but I'm afraid you mostly told me what I already
> know. What I don't understand is how this relates to breaking passphrases.
> For example, say I use the passphrase foobar. It has 6 characters, each
> represented by 8 bits, so it will be represented by 48 bits. These 48 bits
> are then the key used to symmetrically encrypt/decrypt my secret key,
> right? (Another question; is salt added to it, and/or is it hashed?)
> 
> Now if the attacker knows that I have only used the 26 characters a-z in
> the passphrase, then she/he can represent all of them using 5 bits. But I
> don't understand how this helps the attacker, since foobar represented
> this way would obviously produce a completely different key (of only 30
> bits).
> 
> I thought that all this was about time and the amount of data you have to
> deal with. That, for example, when someone is brute forcing a passphrase
> it would be 25% faster if all the characters used were represented with
> only 6 bits instead of 8. I understand why this would be faster, but not
> how it could possibly work.
> 
> Oskar
> 
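
As for the entropy numbers in the quoted discussion, the arithmetic for the
"foobar" example works out roughly like this (just an illustration -- the
attacker never re-encodes anything, they simply enumerate the smaller set
of passphrases you might plausibly have chosen):

import math

length = 6          # "foobar" is 6 characters long
lowercase = 26      # a-z
full_byte = 256     # every possible 8-bit value

print(length * math.log2(lowercase))  # ~28.2 bits: the real search space
print(length * math.log2(full_byte))  # 48 bits: what the raw encoding can hold
print(lowercase ** length)            # 308915776 candidates to brute force
print(full_byte ** length)            # ~2.8e14 candidates if any byte were possible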


