[PATCH 5/5] gpg: Fix regexp sanitization.

Wed Jul 19 13:22:29 CEST 2017

Damien Goutte-Gattat <dgouttegattat at incenp.org> writes:

> * g10/trustdb.c (sanitize_regexp): Do not escape normal characters.
> --
>
> The current sanitization code escapes ALL characters in the
> regular expression, including characters that do not have any
> special meaning and only match themselves. Only the dot (.)
> is not escaped.

That is odd indeed.

> This leads to, e.g., 'example.com' being sanitized into
> '\e\x\a\m\p\l\e.\c\o\m', which will then fail to match against
> 'alice@example.com'.

And rightfully so, because POSIX says:
> The interpretation of an ordinary character preceded by a backslash (
> '\' ) is undefined.

http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_04_02
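
To make the behaviour described in the patch concrete, here is a minimal sketch of "escape everything except the dot". It is not the real sanitize_regexp() from g10/trustdb.c; the function name escape_all_but_dot is hypothetical and only mirrors the description quoted above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration, NOT the code from g10/trustdb.c.
 * Prefixes every character except '.' with a backslash, as the
 * patch description says the current sanitization does.
 * Returns a malloc'ed string (caller frees) or NULL on failure. */
char *escape_all_but_dot(const char *s)
{
    size_t n = strlen(s);
    char *out = malloc(2 * n + 1);   /* worst case: every char escaped */
    char *p = out;

    if (!out)
        return NULL;
    for (; *s; s++) {
        if (*s != '.')
            *p++ = '\\';             /* escape ordinary characters too */
        *p++ = *s;
    }
    *p = '\0';
    return out;
}
```

Calling escape_all_but_dot("example.com") yields the string \e\x\a\m\p\l\e.\c\o\m, where every escaped letter is an "ordinary character preceded by a backslash" in the sense of the POSIX text quoted above.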

> This patch updates the function to escape only the meaningful
> characters (minus the dot).
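
A rough sketch of what "escape only the meaningful characters (minus the dot)" could look like follows. The function name escape_special_but_dot and the exact set of characters treated as special are assumptions made for illustration; the patch itself may choose a different set.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration, NOT the patched sanitize_regexp().
 * Escapes only characters assumed to be special in a POSIX extended
 * regular expression, leaving the dot and ordinary characters alone.
 * Returns a malloc'ed string (caller frees) or NULL on failure. */
char *escape_special_but_dot(const char *s)
{
    /* Assumed set of metacharacters; the dot is deliberately absent. */
    static const char special[] = "\\[]()|*+?{}^$";
    size_t n = strlen(s);
    char *out = malloc(2 * n + 1);
    char *p = out;

    if (!out)
        return NULL;
    for (; *s; s++) {
        if (strchr(special, *s))
            *p++ = '\\';
        *p++ = *s;
    }
    *p = '\0';
    return out;
}
```

Under these assumptions 'example.com' passes through unchanged, while an expression that already uses operators, such as 'foo.*\.org', comes out as 'foo.\*\\.org'.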

I'm not convinced that this patch is correct.  I'm not convinced that we
should attempt any kind of sanitization at all.  This happens right
before the expression is fed to the regex engine.  I could understand
sanitizing, or suggesting improvements, when users enter an expression,
but not when interpreting expressions found in keys.

Imagine I have for some reason the expression 'foo.*\.org' in a trust
signature, because I only want to match organisations starting with foo.
RFC 4880 seems to allow that, as I understand it.  Your patch breaks that,
but so does sanitize_regexp as it is now.
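
The breakage can be checked with a small sketch built on POSIX regcomp()/regexec(). The helper name regex_matches is hypothetical, and compiling with REG_EXTENDED | REG_ICASE is an assumption about how such an expression would be evaluated, not a statement about GnuPG's actual call.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` matches `pattern`,
 * 0 if it does not, and -1 if the pattern fails to compile.
 * The flags are an assumption (POSIX extended, case-insensitive). */
int regex_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With this helper, the untouched expression foo.*\.org matches "foobar.org". Once the '*' and the backslash have themselves been escaped, giving foo.\*\\.org, the same input no longer matches, because the expression now demands a literal '*' and a literal backslash in the text.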

I don't see how and why we should do any sane sanitization at all.

Discuss!

Cheers,
Justus