integrating GPG with deniable steganography

Tue Mar 20 22:35:17 CET 2001

"Matthias Urlichs" <smurf at noris.de> writes:

> For instance, I might take an MP3 encoder or a JPEG compressor. I might
> then let the computer fiddle with the data of each basic unit (the
> image's 8x8 block, for instance) until the decompressed data stream's
> low-order bits (or the result of hashing them with MD5 or whatever)
> "just happens to" yield my crypted data. IMHO, it's impossible to prove
> that there exists no combination of encoder parameters plus input data
> which emits exactly this MP3 data stream or JPEG image under "normal"
> circumstances -- it's a lossy compression, after all, thus you can't get
> the original data back no matter how hard you try because you cannot
> know which bits the algorithm threw away. Thus, the NSA's fancy
> randomness detection algorithms are 100% worthless.

Your scheme needs a method to extract the hidden bytes from the hiding
data stream.  We're talking about hard steganography, so we can assume
that the security of the scheme is a result of its properties, and
not of its secrecy, and it shouldn't matter if the attacker knows
the extraction method or not.  Now if we run your extraction method
on some of your (presumably tampered) data streams, and on several
reference data streams (which use the same encoding method, but are
otherwise completely different), we will get a derived data stream
for each data stream (obviously, it's a good idea to leave out any
cryptographic hash functions which have been used in the extraction
method).  These derived data streams will look like noise, but they
have some statistical properties anyway, and if you've got enough
samples and a proper noise model, you can tell which data streams
have a hidden message in them and which do not with a high degree of
certainty.  You can improve your attack if you look at the way the
data stream is manipulated to hide the message.  This will change some
statistical properties of the data stream as well.
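
The comparison against reference streams described above can be
sketched roughly as follows.  This is a toy illustration only, not
anyone's actual detector: the extraction method (low-order bit of each
sample), the biased cover model in make_cover(), and the single
statistic (fraction of one bits) are all assumptions made up for the
example.  A real noise model would be far richer.

```python
import random
import statistics


def extract_lsbs(samples):
    """Hypothetical extraction method: low-order bit of each sample."""
    return [s & 1 for s in samples]


def ones_fraction(bits):
    """The one statistic this toy noise model looks at."""
    return sum(bits) / len(bits)


def make_cover(rng, n, p_one=0.45):
    """Toy cover stream whose low-order bits are slightly biased.

    The bias value is an assumption for illustration, not a measurement
    of any real encoder.
    """
    return [(rng.randrange(128) << 1) | int(rng.random() < p_one)
            for _ in range(n)]


def embed(rng, samples):
    """Overwrite every low-order bit with a uniform, ciphertext-like bit."""
    return [(s & ~1) | rng.getrandbits(1) for s in samples]


def z_score(stat, reference_stats):
    """Distance of one stream's statistic from the reference streams,
    measured in reference standard deviations."""
    mu = statistics.mean(reference_stats)
    sigma = statistics.stdev(reference_stats)
    return (stat - mu) / sigma


rng = random.Random(2001)

# Reference streams: same (toy) encoding, no hidden message.
references = [make_cover(rng, 20000) for _ in range(20)]
ref_stats = [ones_fraction(extract_lsbs(r)) for r in references]

clean = make_cover(rng, 20000)
stego = embed(rng, make_cover(rng, 20000))

clean_z = z_score(ones_fraction(extract_lsbs(clean)), ref_stats)
stego_z = z_score(ones_fraction(extract_lsbs(stego)), ref_stats)

print("clean stream z-score:", round(clean_z, 2))
print("stego stream z-score:", round(stego_z, 2))
```

In this toy model the tampered stream stands out because the hidden
message makes the derived bits more uniform than the reference
streams' bits are.  The result is statistical evidence, not a proof.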

An alternative attack works by looking at the decoded data stream
and analyzing the noise in it.  It doesn't require knowledge of the
extraction method, but it is infeasible if the amount of data hidden
is extremely small (for example, 1 ppm and below), unless your noise
model is exceptionally good.
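
Why a very small amount of hidden data is so hard to detect can be
estimated with a back-of-the-envelope calculation.  The sketch below
assumes a deliberately simple noise model that is not taken from any
real analysis: cover bits are one with probability 0.5 - bias, hidden
bits are uniform, and only a fraction "rate" of the bits is
overwritten.  The function name and the threshold z are hypothetical.

```python
def samples_needed(rate, bias, z=3.0):
    """Rough number of bits needed to see the embedding at z sigma.

    Overwriting a fraction `rate` of the bits shifts the mean of the
    bit stream by about rate * bias.  The standard error of the mean
    over n bits is at most 0.5 / sqrt(n), so detection needs roughly
    rate * bias >= z * 0.5 / sqrt(n).
    """
    shift = rate * bias
    return (z * 0.5 / shift) ** 2


full = samples_needed(rate=1.0, bias=0.05)     # every bit overwritten
sparse = samples_needed(rate=1e-6, bias=0.05)  # 1 ppm overwritten

print("bits needed at full rate:", full)
print("bits needed at 1 ppm:    ", sparse)
print("ratio:                   ", sparse / full)
```

Under these assumptions the required amount of data grows with the
inverse square of the embedding rate, which is why only a much
sharper noise model helps at 1 ppm.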

Of course, the attacker will never obtain a complete proof that a
secret message was hidden, but he will be able to obtain enough
evidence to get you into trouble.