[PATCH v3 3/5] gpg-agent: Implement --supervised command (for systemd, etc).
Daniel Kahn Gillmor
dkg at fifthhorseman.net
Wed Oct 5 06:51:25 CEST 2016
Hi Werner et al--
On Tue 2016-10-04 11:23:06 -0400, Werner Koch wrote:
> I have applied your patch as well as a few other updates regarding that
> new feature. For example it is now required that getsockopt works and
> returns a valid socket name. I have done a few cursory tests by
> creating a few sockets inside gpg-agent to test the
> map_supervised_sockets functions but not much more. I would appreciate
> if you can test this with systemd; I would then use the same code for
> dirmngr.
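As an aside on the socket-name requirement quoted above: a minimal sketch of recovering the path an inherited Unix socket is bound to. It is a Python illustration under stated assumptions, not the map_supervised_sockets code. The helper name bound_path is hypothetical, and it assumes the descriptor passed in really is a socket.

```python
import os
import socket


def bound_path(fd):
    """Return the filesystem path an AF_UNIX socket fd is bound to, or None."""
    # Duplicate the descriptor so closing our wrapper leaves the original open.
    sock = socket.socket(fileno=os.dup(fd))
    try:
        if sock.family != socket.AF_UNIX:
            return None
        name = sock.getsockname()
    finally:
        sock.close()
    # Unbound sockets report an empty name; abstract ones come back as bytes.
    return name if isinstance(name, str) and name else None
```

A supervised process would call something like this once per inherited descriptor and refuse to start if any of them yields no usable name.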
I've tested this and there are a few problems still aside from the
patch i've sent to the list.
0) when running --supervised with a logfile directive, gpg-agent aborts
at startup. Removing the logfile directive allows the agent to start
up fine. I haven't diagnosed this further, because i ran into the
next problem …
1) when running --supervised, i see regular hangs of the agent which are
not seen when it is auto-launched.
When using gpg-agent not under systemd supervision (auto-launched),
strace on the main process looks like the following (we see several
select()-based timeouts, followed by a few quick forks, followed by more
select()s etc):
…more pselect6()'s…
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1762}, {[], 8}) = 0 (Timeout)
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1204}, {[], 8}) = 0 (Timeout)
clone(child_stack=0x7f933435bff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f933435c9d0, tls=0x7f933435c700, child_tidptr=0x7f933435c9d0) = 10593
futex(0x7f9334cff200, FUTEX_WAKE_PRIVATE, 1) = 1
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 217785}, {[], 8}) = 1 (in [3], left {2, 211320})
accept(3, {sa_family=AF_LOCAL, NULL}, [2]) = 9
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f933335b000
mprotect(0x7f933335b000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f9333b5aff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f9333b5b9d0, tls=0x7f9333b5b700, child_tidptr=0x7f9333b5b9d0) = 10594
futex(0x7f9334cff200, FUTEX_WAKE_PRIVATE, 1) = 1
pselect6(8, [3 4 5 6 7], NULL, NULL, {1, 999208864}, {[], 8}) = 0 (Timeout)
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1184}, {[], 8}) = 0 (Timeout)
…more pselect6()'s…
I suspect this is the whole "check if my socket is reachable" business.
But when running --supervised, the self-triggered timeout does several
different syscalls, and then hangs in a futex:
…more pselect6()'s…
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1806}, {[], 8}) = 0 (Timeout)
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 1802}, {[], 8}) = 0 (Timeout)
getuid() = 1000
stat("/run/user/1000", {st_mode=S_IFDIR|0700, st_size=180, ...}) = 0
getuid() = 1000
stat("/run/user/1000/gnupg", {st_mode=S_IFDIR|0700, st_size=140, ...}) = 0
getuid() = 1000
clone(child_stack=0x7fb4bfd25ff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb4bfd269d0, tls=0x7fb4bfd26700, child_tidptr=0x7fb4bfd269d0) = 10473
futex(0x7fb4c06c9200, FUTEX_WAKE_PRIVATE, 1) = 1
pselect6(8, [3 4 5 6 7], NULL, NULL, {2, 761217}, {[], 8}) = 1 (in [5], left {2, 723370})
futex(0x7fb4c06c9200, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff
When it's hung in this state, i see the following backtrace:
0x00007fdf1586c577 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7fdf15c7e200) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205 ../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
(gdb) bt
#0 0x00007fdf1586c577 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7fdf15c7e200) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 do_futex_wait (sem=sem@entry=0x7fdf15c7e200, abstime=0x0) at sem_waitcommon.c:111
#2 0x00007fdf1586c624 in __new_sem_wait_slow (sem=0x7fdf15c7e200, abstime=0x0) at sem_waitcommon.c:181
#3 0x00007fdf15a7bcf9 in ?? () from /usr/lib/x86_64-linux-gnu/libnpth.so.0
#4 0x00007fdf15a7c303 in npth_pselect () from /usr/lib/x86_64-linux-gnu/libnpth.so.0
#5 0x000055ff7d27d3c2 in handle_connections (listen_fd=<optimized out>, listen_fd_extra=<optimized out>, listen_fd_browser=<optimized out>, listen_fd_ssh=<optimized out>) at ../../agent/gpg-agent.c:2920
#6 0x000055ff7d27a43e in main (argc=<optimized out>, argv=<optimized out>) at ../../agent/gpg-agent.c:1501
(sorry, these line numbers aren't the same as git master, i'm working
from slightly modified source).
I'm using npth 1.2-3 in debian, fwiw. Any thoughts about what might be
going wrong here?
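The "check if my socket is reachable" pattern suspected above can be sketched outside gpg-agent roughly like this. It is a hypothetical Python illustration, not the agent's actual code: the helper name socket_is_reachable and the timeout value are assumptions made for the example. It only shows a process connecting to a Unix socket path to see whether anything is still listening there.

```python
import socket


def socket_is_reachable(path, timeout=1.0):
    """Return True if something accepts connections on the Unix socket at path."""
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    probe.settimeout(timeout)
    try:
        # connect() succeeds as soon as the listener's backlog takes the
        # connection; it fails if the path is missing or nobody is listening.
        probe.connect(path)
        return True
    except OSError:
        return False
    finally:
        probe.close()
```

A probe like this shows up on the listening side as the fd becoming readable followed by an accept(), which matches the shape of the auto-launched trace.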
--dkg