Signal handling: what you can't do.

Mon Aug 23 14:02:23 CEST 1999

I was reviewing gnupg prior to installing it (not a proper "is this a
safe crypto program", just a "is this a safe setuid program" review).

Everything looked OK, with one exception.

In 'init_signals', there is:

    nact.sa_handler = got_fatal_signal;
...
    do_sigaction( SIGINT, &nact );

and got_fatal_signal looks like this:

static RETSIGTYPE
got_fatal_signal( int sig )
{
    if( caught_fatal_sig )
        raise( sig );
    caught_fatal_sig = 1;

    fprintf( stderr, "\n%s: %s caught ... exiting\n",
              log_get_name(), signal_name(sig) );
    secmem_term();
    exit( 8 );
}

now, this is not good.  There are three issues here, but most flow
from the same important rule:

In a signal handler, under Linux (and most other unixes), you cannot:

1. Assume that the values in non-volatile variables are stored in the same
   order as implied by the code (or in fact, that they are ever stored
   at all);
2. Call non-reentrant procedures.

For instance, even though 'fprintf' is thread-safe, it is not
reentrant, because it achieves thread-safety by using locks; since
gnupg doesn't link with the thread library, the locking calls do
nothing.  Likewise, atexit()/exit() are not reentrant.

Also for instance, in secmem_init, there is this code:

    else {
        pool_is_mmapped = 1;
        pool_okay = 1;
    }

there is absolutely no guarantee that `pool_is_mmapped' is physically
set to 1 before `pool_ok', or that `pool_ok' is set to 1 after the
previous store to `pool'.  secmem_term() uses the values of these
variables.

How could someone exploit this?  Well, as an example, just after the
init_signals call in main(), there is 'create_dotlock (NULL)', which
calls atexit().  atexit() does this:

int
atexit (void (*func) (void))
{
  struct exit_function *new = __new_exitfn ();

  if (new == NULL)
    return -1;

  new->flavor = ef_at;
  new->func.at = func;
  return 0;
}

where __new_exitfn() returns something which was produced by calling
malloc().

Suppose a SIGINT happens between the assign to 'new->flavor' and the
one to 'new->func.at', and exit() gets called.  It will end up calling
whatever address was previously in 'new->func.at'.  If the user could
control this address, perhaps by convincing ld.so to allocate then
deallocate something in memory which will be re-used by malloc(), and
the user could trigger a SIGINT at the right time (which would require
persistence but would not be impossible), the user gets root.

As it happens, this isn't possible.  On my system, the scheduler
rearranges the two stores so they happen in the opposite order to the
one written, so everything is OK.  This is pretty much blind luck.

I should mention that 'my system' has glibc 2.2.2pre2 compiled using
gcc-2.96 snapshot of 19990721 targetting powerpc-linux with
'-mtune=750'.  People using other compilers or other CPUs might want
to check that they are still not vulnerable.

Finally, a minor nit:  in signal_name(), there is this:

  #if SYS_SIGLIST_DECLARED
    return sys_siglist[signum];
  #else
    static char buf[20];
    sprintf(buf, "signal %d", signum );
    return buf;
  #endif

it should check that `signum' is less than NSIG, or better yet it
should use strsignal() which does the check for you and will still
work even if someone decides, say, to add 32 new real-time signals.
Oh, sorry, _another_ 32 new real-time signals :-).  

Again on my system this won't happen because NSIG is greater than the
signal number of all the signals for which signal_name is actually
called; but it's not robust code.

-- 
Geoffrey Keating <geoffk at cygnus.com>