[gpgme] fork() problem
Stephan Menzel
smenzel at gmx-gmbh.de
Wed Feb 21 11:42:02 CET 2007
Hi Marcus,
On Wednesday, 21 February 2007 01:56:07, Marcus Brinkmann wrote:
> Did you serialize *all* calls into GPGME properly, with memory barrier
> (a single global mutex locked around all calls should do the trick)?
Yes and no.
I just double-checked once again to make sure. I think I can say that it's
impossible for two calls into the lib to happen simultaneously. It's all
protected by a scoped_lock around each of the objects.
So I don't have a global mutex. The wrapper objects themselves hold boost
mutexes and lock within themselves, which I prefer.
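To make that concrete, here is a minimal sketch of the per-object locking just described (std::mutex standing in for boost::mutex, std::lock_guard for boost's scoped_lock; the class and method names are illustrative, not the actual wrapper code):

```cpp
#include <mutex>

// Sketch of per-object locking: each wrapper owns its own mutex, so only
// calls on the *same* object are serialized. GPGWrapper is a hypothetical
// stand-in for the real wrapper class.
class GPGWrapper {
public:
    int verify()
    {
        std::lock_guard<std::mutex> guard(mutex_);  // locks *this* object only
        return 0;  // placeholder for the real gpgme_op_verify() call
    }

private:
    std::mutex mutex_;  // one mutex per wrapper object, no global lock
};
```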
However, it is theoretically possible for subsequent calls to happen to
different contexts (objects for me). Meaning something like this
(simplified):
GPGObject a;  // does initialization routines using gpgme_new()
GPGObject b;
a.verify();
b.verify();
That would mean context switches in the engine. I think this is not too likely,
but possible. Btw, all the coredumps I got seemed to happen within gpgme_new(),
none within the actual verify. But I was instantiating a bit unnecessarily at
times, so that could explain it. I'm not doing that anymore, though.
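By contrast, the single global mutex Marcus suggests would serialize even calls on different contexts. A minimal sketch, with fake_gpgme_call as a placeholder for any gpgme_* function:

```cpp
#include <mutex>

// Sketch only: one global mutex held around *every* call into GPGME, so
// calls on different contexts (a, then b) can never interleave.
static std::mutex gpgme_global_lock;

static int fake_gpgme_call(int x) { return x; }  // stand-in for a gpgme_* call

int serialized_gpgme_call(int arg)
{
    std::lock_guard<std::mutex> guard(gpgme_global_lock);  // held for the whole call
    return fake_gpgme_call(arg);
}
```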
> > #0 0xffffe410 in __kernel_vsyscall ()
> > (gdb) backtrace
> > #0 0xffffe410 in __kernel_vsyscall ()
> > #1 0xb711d885 in raise () from /lib/tls/i686/cmov/libc.so.6
> > #2 0xb711f002 in abort () from /lib/tls/i686/cmov/libc.so.6
> > #3 0xb7117318 in __assert_fail () from /lib/tls/i686/cmov/libc.so.6
> > #4 0xb6731e03 in _gpgme_ath_mutex_lock (lock=0x0) at ath.c:71
> > #5 0xb6741e2f in _gpgme_sema_cs_enter (s=0xb674db40) at posix-sema.c:48
> > #6 0xb673bd3b in _gpgme_engine_info_copy (r_info=0x0) at engine.c:225
> > #7 0xb6743070 in gpgme_new (r_ctx=0x0) at gpgme.c:58
> > #8 0xb732e9f3 in MyWrapperClass (this=0xb1782768) at
> > MyWrapperClass.cc:187
> >
> > It still doesn't crash all the time though. It mostly works so I think
> > it's some strange race condition.
> > Maybe this helps.
>
> I don't trust the "lock=0x0" parameter in the debug output, it is
> clearly bogus which indicates optimization (likely) or a corrupt stack
> (less likely).
Of course. I often get output like this and it's never to be trusted. We
build with -O2, btw.
> above. So I assume that this is what it actually is, because
> otherwise you would get a segmentation fault and not an assertion
> failure.
Yes. I didn't get any of those. All the crashes I noticed were SIGABRT (signal 6).
> The non-threaded version of GPGME has some rudimentary error checking:
> It makes the same mutex calls as the threaded version, but just checks
> if the locks are taken and released properly. This can catch some
> simple bugs where locks are not unlocked when they should be or used
> after they are destroyed.
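A rough sketch of what such a check might look like (hypothetical, not the actual ath.c code): the non-threaded stubs keep a flag instead of a real lock and assert on double-lock or double-unlock.

```cpp
#include <cassert>

// Hypothetical debug mutex in the spirit of the non-threaded checks
// described above: no real locking, just state assertions.
struct debug_mutex { bool locked = false; };

void debug_lock(debug_mutex *m)
{
    assert(!m->locked && "mutex taken twice without unlock");  // the failure mode in the trace
    m->locked = true;
}

void debug_unlock(debug_mutex *m)
{
    assert(m->locked && "mutex unlocked but never locked");
    m->locked = false;
}
```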
This is ath.c right?
> The above assertion failure means that it was attempted to take the
> engine_info_lock in engine.c twice without unlocking it inbetween.
I patched around a bit and at times had a version running with this mutex
removed altogether, relying on my own mutex instead. I tried this while
linking against the non-pthread version.
The result was that I didn't get those crashes around this mutex
(engine_info_lock) anymore, but different ones. I just looked, but I don't
have the stacktraces anymore :-(
> Frankly, aside from a "spurious writes to random memory locations"
> (and accidentally hitting the above PRIVATE member), I can not
> imagine what might cause this assertion failure if all calls into
> GPGME are properly serialized. It's a mystery.
It sure is to me.
I looked briefly into it too, but I came to the same conclusion. However, since
we have run the daemon under valgrind a lot, I trust that we don't have any
stray writes into its memory like you describe. Given our use cases that would
be quite disastrous, and I really think we would have noticed it by now; it
would have to result in segfault crashes.
> Mmh. There is one issue in GPGME which may not be handled properly.
> We set a null handler for SIGPIPE at initialization time, and one
> should make sure that this is also the case for all GPGME-using
> threads. You may want to check if this could be a potential cause for
> trouble in your application. I would expect this to show up
> differently though.
Is there anything I could do, with my limited time, to prove that right or wrong?
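One quick experiment (a sketch under the assumption that a missing SIGPIPE handler is the culprit): explicitly ignore SIGPIPE process-wide before any GPGME use, then see whether the crash behaviour changes. The function name here is illustrative.

```cpp
#include <signal.h>

// Sketch: install SIG_IGN for SIGPIPE before touching GPGME. With the
// signal ignored, writes to broken pipes return EPIPE instead of killing
// the process.
bool ignore_sigpipe()
{
    struct sigaction sa {};
    sa.sa_handler = SIG_IGN;       // "null handler" as Marcus describes
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGPIPE, &sa, nullptr) == 0;  // true on success
}
```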
Many thanks and greetings,
Stephan
More information about the Gnupg-devel mailing list