[PATCH v3 3/5] gpg-agent: Implement --supervised command (for systemd, etc).

Daniel Kahn Gillmor dkg at fifthhorseman.net
Wed Oct 5 06:51:25 CEST 2016


Hi Werner et al--

On Tue 2016-10-04 11:23:06 -0400, Werner Koch wrote:

> I have applied your patch as well as a few other updates regarding that
> new feature.  For example it is now required that getsockopt works and
> returns a valid socket name.  I have done a few cursory tests by
> creating a few sockets inside gpg-agent to test the
> map_supervised_sockets functions but not much more.  I would appreciate
> if you can test this with systemd; I would then use the same code for
> dirmngr.
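
For readers unfamiliar with the handoff being discussed, here is a minimal
sketch of how a supervised daemon can discover the sockets it inherited.
It assumes the systemd convention documented in sd_listen_fds(3)
(LISTEN_PID and LISTEN_FDS in the environment, descriptors starting at 3)
and assumes that the "socket name" requirement refers to getsockname(2).
The function names are made up for illustration; this is not the code in
map_supervised_sockets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

#define FIRST_INHERITED_FD 3   /* systemd passes descriptors starting at 3 */

/* Return the number of descriptors the supervisor handed over, or 0 if
   none were passed or LISTEN_PID names a different process.  */
int
count_inherited_fds (void)
{
  const char *pid_s = getenv ("LISTEN_PID");
  const char *fds_s = getenv ("LISTEN_FDS");
  char *end;
  long n;

  if (!pid_s || !fds_s)
    return 0;
  if (strtol (pid_s, &end, 10) != (long) getpid () || *end)
    return 0;
  n = strtol (fds_s, &end, 10);
  if (*end || n < 0 || n > 1024)   /* 1024 is an arbitrary sanity cap */
    return 0;
  return (int) n;
}

/* Copy the filesystem path FD is bound to into BUF.  Return 0 on
   success, -1 if getsockname fails or FD is not a named AF_UNIX socket.  */
int
inherited_socket_name (int fd, char *buf, size_t buflen)
{
  struct sockaddr_un addr;
  socklen_t len = sizeof addr;
  size_t pathlen;

  assert (buf && buflen);
  memset (&addr, 0, sizeof addr);
  if (getsockname (fd, (struct sockaddr *) &addr, &len))
    return -1;
  if (addr.sun_family != AF_UNIX
      || len <= offsetof (struct sockaddr_un, sun_path)
      || !addr.sun_path[0])
    return -1;                     /* unnamed or abstract socket */
  pathlen = strnlen (addr.sun_path, sizeof addr.sun_path);
  if (pathlen >= buflen)
    return -1;
  memcpy (buf, addr.sun_path, pathlen);
  buf[pathlen] = 0;
  return 0;
}

/* Example driver: list what the supervisor handed over.  */
void
print_inherited_sockets (void)
{
  char name[sizeof ((struct sockaddr_un *) 0)->sun_path + 1];
  int i, n = count_inherited_fds ();

  for (i = 0; i < n; i++)
    if (!inherited_socket_name (FIRST_INHERITED_FD + i, name, sizeof name))
      printf ("fd %d -> %s\n", FIRST_INHERITED_FD + i, name);
    else
      printf ("fd %d -> (no usable name)\n", FIRST_INHERITED_FD + i);
}
```

Refusing to continue when a descriptor has no usable name is what makes
it possible to map each inherited socket to its role by path.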

I've tested this, and a few problems remain beyond what the patch i've
sent to the list addresses.

0) when running --supervised with a logfile directive, gpg-agent aborts
   at startup.  Removing the logfile directive allows the agent to start
   up fine.  I haven't diagnosed this further, because i ran into the
   next problem …

1) when running --supervised, i see regular hangs of the agent which are
   not seen when it is auto-launched.

When using gpg-agent not under systemd supervision (auto-launched),
strace on the main process looks like the following (we see several
select()-based timeouts, followed by a few quick forks, followed by more
select()s etc):

…more pselect6()'s…
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1762}, {[], 8}) = 0 (Timeout)
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1204}, {[], 8}) = 0 (Timeout)
clone(child_stack=0x7f933435bff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f933435c9d0, tls=0x7f933435c700, child_tidptr=0x7f933435c9d0) = 10593
futex(0x7f9334cff200, FUTEX_WAKE_PRIVATE, 1) = 1
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 217785}, {[], 8}) = 1 (in [3], left {2, 211320})
accept(3, {sa_family=AF_LOCAL, NULL}, [2]) = 9
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f933335b000
mprotect(0x7f933335b000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f9333b5aff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f9333b5b9d0, tls=0x7f9333b5b700, child_tidptr=0x7f9333b5b9d0) = 10594
futex(0x7f9334cff200, FUTEX_WAKE_PRIVATE, 1) = 1
pselect6(8, [3 4 5 6 7], NULL, NULL, {1, 999208864}, {[], 8}) = 0 (Timeout)
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1184}, {[], 8}) = 0 (Timeout)
…more pselect6()'s…

I suspect this is the whole "check if my socket is reachable" business.
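
The clone() followed by accept() on fd 3 in the trace above is consistent
with a helper thread connecting to the agent's own listening socket.  A
minimal sketch of such a reachability probe follows.  The helper name is
hypothetical and this is an assumption about the general technique, not
the agent's actual check.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if a client can connect to the AF_UNIX
   stream socket at PATH, 0 otherwise.  */
int
socket_is_reachable (const char *path)
{
  struct sockaddr_un addr;
  int fd, ok;

  assert (path);
  if (strlen (path) >= sizeof addr.sun_path)
    return 0;                      /* path does not fit in sun_path */
  memset (&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;
  strcpy (addr.sun_path, path);

  fd = socket (AF_UNIX, SOCK_STREAM, 0);
  if (fd == -1)
    return 0;
  /* connect() succeeds only if something is listening on PATH.  */
  ok = !connect (fd, (struct sockaddr *) &addr, sizeof addr);
  close (fd);
  return ok;
}
```

A daemon that owns its socket file can use a probe like this to notice
that the file was removed or replaced and shut itself down.  Under
supervision the socket belongs to the supervisor, so the same probe may
not be appropriate there.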

But when running --supervised, the self-triggered timeout does several
different syscalls, and then hangs in a futex:

…more pselect6()'s…
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1806}, {[], 8}) = 0 (Timeout)
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1802}, {[], 8}) = 0 (Timeout)
getuid()                                = 1000
stat("/run/user/1000", {st_mode=S_IFDIR|0700, st_size=180, ...}) = 0
getuid()                                = 1000
stat("/run/user/1000/gnupg", {st_mode=S_IFDIR|0700, st_size=140, ...}) = 0
getuid()                                = 1000
clone(child_stack=0x7fb4bfd25ff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb4bfd269d0, tls=0x7fb4bfd26700, child_tidptr=0x7fb4bfd269d0) = 10473
futex(0x7fb4c06c9200, FUTEX_WAKE_PRIVATE, 1) = 1
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 761217}, {[], 8}) = 1 (in [5], left {2, 723370})
futex(0x7fb4c06c9200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff


When it's hung in this state, i see the following backtrace:

0x00007fdf1586c577 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7fdf15c7e200) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
(gdb) bt
#0  0x00007fdf1586c577 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7fdf15c7e200) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x7fdf15c7e200, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007fdf1586c624 in __new_sem_wait_slow (sem=0x7fdf15c7e200, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007fdf15a7bcf9 in ?? () from /usr/lib/x86_64-linux-gnu/libnpth.so.0
#4  0x00007fdf15a7c303 in npth_pselect () from /usr/lib/x86_64-linux-gnu/libnpth.so.0
#5  0x000055ff7d27d3c2 in handle_connections (listen_fd=<optimized out>, listen_fd_extra=<optimized out>, listen_fd_browser=<optimized out>, listen_fd_ssh=<optimized out>) at ../../agent/gpg-agent.c:2920
#6  0x000055ff7d27a43e in main (argc=<optimized out>, argv=<optimized out>) at ../../agent/gpg-agent.c:1501

(sorry, these line numbers aren't the same as git master, i'm working
from slightly modified source).

I'm using npth 1.2-3 in debian, fwiw.  Any thoughts about what might be
going wrong here?
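
Frames #1 to #4 of the backtrace show npth_pselect waiting on a semaphore
with no timeout (abstime=0x0).  One pattern that would produce this
picture is a wrapper that drops a process-wide lock around the blocking
call and takes it back afterwards.  The sketch below illustrates that
pattern only.  The names big_lock and wrapped_pselect are made up, and
this is an assumption about the general technique rather than nPth's
actual code.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

/* Illustrative stand-in for a library-wide lock.  */
sem_t big_lock;

/* Call once from the initial thread.  An initial value of 0 means the
   caller starts out holding the lock.  */
void
big_lock_init (void)
{
  int rc = sem_init (&big_lock, 0, 0);
  assert (!rc);
}

/* Wait for readable descriptors without holding the lock while blocked.  */
int
wrapped_pselect (int nfds, fd_set *rfds, const struct timespec *timeout)
{
  int ret;

  sem_post (&big_lock);            /* let other threads run while we block */
  ret = pselect (nfds, rfds, NULL, NULL, timeout, NULL);
  /* Take the lock back before returning; retry if a signal interrupts.
     If another thread holds the lock and never releases it, this wait
     never returns, even though pselect itself completed.  */
  while (sem_wait (&big_lock) == -1 && errno == EINTR)
    ;
  return ret;
}
```

Under that assumption, the trace would mean that pselect6 reported fd 5 as
ready and the main thread then blocked while re-acquiring the lock, which
would point at whichever thread was holding it at that moment.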

    --dkg

