Differentiating GPG data from random data

Ted ted at 16systems.com
Mon Nov 24 23:19:32 CET 2008


Hi,

Hope this is not off-topic here.

I'm writing a program that searches for files made up of random data.
The program consistently flags GPG data that is not ASCII-armored.
That's expected, since GPG-encrypted output looks very random and
passes the randomness tests. However, I'm not interested in finding
GPG-encrypted files, so I wrote a routine that excludes them based on
the first few bytes of the file. I'm not comfortable with that
approach: it's not ideal, though it seems to work OK. Basically, I
skip random-data files that begin with one of these byte sequences:

Symmetric:
 Hex(8c 0d 04 03)
 Dec(140 13 4 3)

Asymmetric:
 Hex(85 02 0e 03)
 Dec(133 2 14 3)
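
A rough sketch of that check in Python, for illustration only. The
function names are made up for this example, and the two prefixes are
just the ones listed above, not an exhaustive list. The packet_tag()
helper reflects my reading of the packet header layout in RFC 4880,
section 4.2, under which 0x8c decodes to tag 3 (symmetric-key
encrypted session key) and 0x85 to tag 1 (public-key encrypted
session key); treat that as an assumption rather than a verified
parser.

```python
# Sketch only: these are the two prefixes observed above, nothing more.
SYMMETRIC_PREFIX = bytes([0x8C, 0x0D, 0x04, 0x03])
ASYMMETRIC_PREFIX = bytes([0x85, 0x02, 0x0E, 0x03])


def looks_like_gpg(header: bytes) -> bool:
    """Return True if the data starts with one of the two observed prefixes."""
    return header.startswith((SYMMETRIC_PREFIX, ASYMMETRIC_PREFIX))


def packet_tag(first_octet: int):
    """Decode the tag from an OpenPGP packet header octet.

    Returns None if bit 7 is clear, i.e. the octet cannot start a packet.
    """
    if not first_octet & 0x80:
        return None
    if first_octet & 0x40:
        # New-format header: tag is in bits 5-0.
        return first_octet & 0x3F
    # Old-format header: tag is in bits 5-2, length type in bits 1-0.
    return (first_octet >> 2) & 0x0F


def file_looks_like_gpg(path: str) -> bool:
    """Read the first four bytes of a file and apply the prefix check."""
    with open(path, "rb") as f:
        return looks_like_gpg(f.read(4))
```

The remaining bytes in each prefix (length, version, algorithm) can
vary with key size and cipher choice, which is why matching all four
bytes is fragile.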

This works well in informal testing on multiple systems running
various versions of GPG, but after reading the RFCs I bet it will
fail a lot in the real world. That's why I thought I would pose the
question to this list: is there a simple way to skip most
GPG-encrypted files without implementing RFC 4880? It does not have
to be perfect, but perhaps there is something better than what I have
described above.

Thanks for any suggestions,

Ted


