dash-escape - a potential improvement?

Thu Apr 1 18:30:34 CEST 2004

Hi,

RFC 2440, section 7.1, defines what dash-escaping means for the OpenPGP
Message Format. Many people don't understand why a line starting with
"-" ends up as "- -". I propose a change which confuses fewer people
and breaks nothing (apart from violating RFC 2440):

preface:
I often hear that a line beginning with "-" could
interfere with the header parsing. This is not true. Only a line
beginning with "-----" can break the header parsing. The natural way of
escaping is to escape as rarely as possible, so a line such as "-- ",
which is no problem anyway, should not be escaped.

notation:
"_rfc": rfc 2440
"_de": the dash escape method described in _rfc
"_!de": unescaping _de
"_p": a program which implements _rfc (digest, _de, verify, _!de)
"_h": this proposal (here)
"_4de": the dash escape method of _h
"_!4de": unescaping _4de
"_p4": a program which implements _h (digest, _4de, verify, _!4de)

the potential improvement: ("^" means beginning of line)
_4de:
1) escape "^----" with "^----+"
2) remove trailing spaces
3) escape "^- -" with "^+ -" (for backward compatibility only)
digest:
calculate the digest on the escaped text instead of the cleartext.
verification:
if verification fails, do _!de and retry. If this succeeds, the
signature is good, else it is bad.
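
A minimal sketch of the three _4de rules above, written in Python. The
function name escape_4de is made up for illustration. The proposal does
not say whether rule 2 changes the transmitted text or only the digest
input; this sketch assumes the former and applies the rules in the order
listed.

```python
import re

def escape_4de(text: str) -> str:
    """Apply the three _4de rules to every line of text (hypothetical helper)."""
    out = []
    for line in text.split("\n"):
        # Rule 1: "^----" becomes "^----+", so no line can start with "-----".
        line = re.sub(r"^----", "----+", line)
        # Rule 2: remove trailing spaces (assumed to alter the text itself).
        line = line.rstrip(" ")
        # Rule 3: "^- -" becomes "^+ -" (backward compatibility only).
        line = re.sub(r"^- -", "+ -", line)
        out.append(line)
    return "\n".join(out)

print(escape_4de("-----BEGIN PGP MESSAGE-----"))  # ----+-BEGIN PGP MESSAGE-----
```

A line with fewer than four leading dashes is left alone by rule 1, which
is the point of escaping as rarely as possible.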

backwards compatibility:
a) verifying a _h message using _rfc:
_p would _!de the message before verifying. In Perl, this could be done
by: "$line =~ s/^- (-.*)/$1/". No line would match in a _h message, so
_p would actually verify the escaped _h message, which is exactly what _h
calculates the digest over. -> This works!
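
To illustrate case a), here is a small Python rendering of the Perl
substitution above, run over a few lines as they might appear in a _h
message. The name unescape_de and the sample lines are assumptions made
for this sketch, not taken from any real implementation.

```python
import re

def unescape_de(line: str) -> str:
    """Python form of the Perl one-liner s/^- (-.*)/$1/ (the _!de step)."""
    return re.sub(r"^- (-.*)", r"\1", line)

# Sample lines as _4de might leave them: no line starts with "- -".
lines_4de = ["----+-BEGIN", "+ -quoted", "--", "- item", "plain text"]

# _!de leaves every one of them unchanged, so _p digests the escaped text.
unchanged = all(unescape_de(line) == line for line in lines_4de)
```

Only a line that really starts with "- -" is touched, and rule 3 of _4de
makes sure no such line is left in a _h message.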

b) verifying an _rfc message using _h:
_p4 would detect a verification failure because it calculates the digest
on the escaped message. But it then retries after _!de and therefore
calculates the digest over the original cleartext message. -> This also
works!
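
The verify-then-retry logic of _p4 could be sketched as follows. This is
only an illustration: a plain SHA-256 comparison stands in for a real
OpenPGP signature check, and the names digest, unescape_de and verify_p4
are hypothetical.

```python
import hashlib
import re

def digest(text: str) -> str:
    # Stand-in for the signature digest; a real _p4 would verify a signature.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def unescape_de(text: str) -> str:
    # _!de applied line by line: Python form of s/^- (-.*)/$1/.
    return "\n".join(re.sub(r"^- (-.*)", r"\1", l) for l in text.split("\n"))

def verify_p4(received: str, signed_digest: str) -> bool:
    # First try: _h message, digest was calculated on the escaped text.
    if digest(received) == signed_digest:
        return True
    # Retry: _rfc message, digest was calculated on the unescaped cleartext.
    return digest(unescape_de(received)) == signed_digest

# _rfc message: signer digests the cleartext, transmits the escaped text.
rfc_ok = verify_p4("- -item\nhello", digest("-item\nhello"))
# _h message: signer digests the escaped text itself.
h_ok = verify_p4("----+-x\nhello", digest("----+-x\nhello"))
```

Both rfc_ok and h_ok come out true here, while a message whose text was
altered fails both attempts.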

potential problems:
a) _h violates _rfc twice: first, it calculates the digest over the
escaped instead of the cleartext message. It also uses a different
dash-escaping mechanism. If _rfc is an axiom to you (i.e. it cannot be
questioned), then this is a problem. For everybody else, this is not a
real problem.

b) _p might implement _!de in a dangerous way. It has to be checked that
no _p does something like (Perl): "$line =~ s/^-.(.*)/$1/", which strips
the first two characters of any line starting with "-".
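
The difference between a safe and a dangerous _!de can be shown with the
two substitutions side by side. This is a Python rendering of the Perl
one-liners quoted in this mail; the function names are made up for the
example.

```python
import re

def unescape_safe(line: str) -> str:
    # s/^- (-.*)/$1/ : only strips "- " when a dash follows it.
    return re.sub(r"^- (-.*)", r"\1", line)

def unescape_dangerous(line: str) -> str:
    # s/^-.(.*)/$1/ : strips "-" plus ANY following character.
    return re.sub(r"^-.(.*)", r"\1", line)

for line in ["- -escaped", "-- sig", "----+-BEGIN"]:
    print(repr(line), repr(unescape_safe(line)), repr(unescape_dangerous(line)))
# '- -escaped' '-escaped' '-escaped'
# '-- sig' '-- sig' ' sig'
# '----+-BEGIN' '----+-BEGIN' '--+-BEGIN'
```

Both variants agree on a genuinely escaped line, but the dangerous one
also mangles unescaped lines of a _h message, so the digest would no
longer match.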

benefits:
1) Users would be less surprised by what happens to their cleartext
because in almost all cases, nothing would be escaped. Today, almost all
mails have at least one escaped line (the signature identifier "-- ").
As a result, _rfc-unaware MUAs fail to detect the mail signature
correctly. With _h, this problem is solved.

2) _h is backward compatible. Only bad MUAs which don't implement _!de
safely would break. The fix for all those programs is to do something
like "$line =~ s/^- (-.*)/$1/" (Perl) to implement _!de.

What is your opinion on that? Is this a good proposal?
Did I miss an important point?

cheers,
tom