10x compression factor

Steve Butler sbutler at fchn.com
Tue Feb 10 08:32:24 CET 2004


Plain text should have lots (and even more than lots) of spaces.  Many of
them consecutively spaced <<grin>>.  There should be lots of other
characters (even strings of characters) that repeat throughout the file.

Gzip should be able to do that trick fairly easily.  Pkzip could also do it.
Since GnuPG uses similar compression routines (ZIP and ZLIB), it should do
just as well.

I just generated a 1 Gbyte file that contains only spaces (no newline; no
<CR><LF>).  That's 1,073,741,824 spaces (all in a row).

PKZIP compressed this down to 1,048,417 bytes (1024:1), 1 Gbyte to just under 1 Mbyte.
gzip  compressed this down to 1,042,077 bytes (1030:1).
GnuPG encrypted  this down to 1,044,357 bytes (1028:1), even with the key overhead.
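
A scaled-down version of this experiment can be sketched in Python. This is only an illustration, not the commands used for the numbers above: it assumes Python's standard zlib module (the same DEFLATE family as gzip and ZIP), uses 16 MiB instead of 1 Gbyte so it runs quickly, and the function name compress_spaces is made up for the example.

```python
import zlib


def compress_spaces(total_bytes, chunk_size=1 << 20, level=6):
    """Stream `total_bytes` space characters through zlib.

    Returns the total compressed size in bytes. The input is fed in
    chunks so the whole run of spaces never has to sit in memory.
    """
    compressor = zlib.compressobj(level)
    chunk = b" " * chunk_size
    compressed_size = 0
    remaining = total_bytes
    while remaining > 0:
        piece = chunk if remaining >= chunk_size else chunk[:remaining]
        compressed_size += len(compressor.compress(piece))
        remaining -= len(piece)
    # flush() emits whatever the compressor is still buffering
    compressed_size += len(compressor.flush())
    return compressed_size


if __name__ == "__main__":
    total = 16 * 1024 * 1024  # 16 MiB, scaled down from the 1 Gbyte test
    size = compress_spaces(total)
    print(f"{total:,} spaces -> {size:,} bytes ({total / size:.0f}:1)")
```

The ratio should land in the same neighbourhood as the figures above, since DEFLATE tops out at roughly 1000:1 on a single repeated byte. Passing 1 << 30 as total_bytes repeats the test at full size.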

So, 10:1 compression just means that the text file had more random text
than a string of spaces.  But it wasn't completely random.  Most text files
will fail almost any statistical test for randomness.  Ergo, they compress
really well!
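
To make the spaces / text / random comparison concrete, here is a rough sketch, again assuming Python's zlib module. The "wordy text" is a stand-in built from a small made-up vocabulary, not a real document, so its ratio is only indicative. The helper names ratio and sample_inputs are invented for the example.

```python
import os
import random
import zlib


def ratio(data):
    """Return the original-size : compressed-size ratio under zlib."""
    return len(data) / len(zlib.compress(data))


def sample_inputs(size=1 << 20, seed=0):
    """Build three inputs of roughly `size` bytes with very different redundancy."""
    rng = random.Random(seed)
    # Hypothetical vocabulary; real text has a far larger one.
    vocab = ["the", "file", "key", "compress", "encrypt", "text", "space",
             "random", "client", "message", "and", "of", "to", "with"]
    words = []
    length = 0
    while length < size:
        word = rng.choice(vocab)
        words.append(word)
        length += len(word) + 1  # +1 for the separating space
    text = " ".join(words).encode("ascii")[:size]
    return {
        "spaces": b" " * size,            # one byte repeated: maximally redundant
        "wordy text": text,               # repeated words: moderately redundant
        "random bytes": os.urandom(size)  # no redundancy for DEFLATE to exploit
    }


if __name__ == "__main__":
    for name, data in sample_inputs().items():
        print(f"{name:12s} {ratio(data):8.1f}:1")
```

Spaces should come out far ahead, the wordy text somewhere in the single digits, and the random bytes at about 1:1 or slightly worse, which is the ordering described above.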


-----Original Message-----
From: Hasnain Mujtaba [mailto:hmujtaba at forumsys.com]
Sent: Monday, February 09, 2004 7:06 PM
To: gnupg-users at gnupg.org
Subject: 10x compression factor


Hi,

A client has presented an astonishing metric. They say they have a 1Gig
plaintext file which they can compress and PGP encrypt down to 80Mb! I
don't know what tool they are using. 

How is this possible? What compression algorithm could they be using?
Can I achieve this sort of compression using GPG?

Thanks
Hasnain.

