libgcrypt performance issue, diagnostic & workaround

Christian Grothoff grothoff at gnunet.org
Tue Aug 14 10:51:19 CEST 2018


Hi!

Running the GNU Taler exchange (which uses libgcrypt) on a many-core
(32-thread) system, I noticed that the futex syscall was dominating CPU
utilization.  Digging deeper, the cause turned out to be the
SECMEM_LOCK/SECMEM_UNLOCK logic, even though I (properly)
disable the use of SECMEM via gcry_control on startup
(gnunet/src/util/crypto_random.c):


/**
 * Initialize libgcrypt.
 */
void __attribute__ ((constructor))
GNUNET_CRYPTO_random_init ()
{
  gcry_error_t rc;

  if (! gcry_check_version (NEED_LIBGCRYPT_VERSION))
  {
    FPRINTF (stderr,
             _("libgcrypt has not the expected version (version %s is required).\n"),
             NEED_LIBGCRYPT_VERSION);
    GNUNET_assert (0);
  }
  /* Disable use of secure memory */
  if ((rc = gcry_control (GCRYCTL_DISABLE_SECMEM, 0)))
    FPRINTF (stderr,
             "Failed to set libgcrypt option %s: %s\n",
             "DISABLE_SECMEM",
             gcry_strerror (rc));
  /* Otherwise gnunet-ecc takes forever to complete, besides
     we are fine with "just" using GCRY_STRONG_RANDOM */
  if ((rc = gcry_control (GCRYCTL_ENABLE_QUICK_RANDOM, 0)))
    FPRINTF (stderr,
             "Failed to set libgcrypt option %s: %s\n",
             "ENABLE_QUICK_RANDOM",
             gcry_strerror (rc));
  gcry_control (GCRYCTL_INITIALIZATION_FINISHED, 0);
}


Sample stack trace:

#2  0x00007fbab01077ea in _gcry_secmem_free (a=0x7fba4000faf0) at
secmem.c:753
#3  0x00007fbab01068f6 in _gcry_private_free (a=0x7fba4000faf0) at
stdmem.c:238
#4  0x00007fbab0100c8d in _gcry_free (p=0x7fba4000faf0) at global.c:1033
#5  0x00007fbab01018e9 in _gcry_sexp_release (sexp=0x7fba4000faf0) at
sexp.c:350
#6  0x00007fbab01025e1 in _gcry_sexp_cadr (list=0x7fba40005dc0) at
sexp.c:947
#7  0x00007fbab00fbc6a in gcry_sexp_cadr (list=0x7fba40005dc0) at
visibility.c:226
#8  0x00007fbab0052a1e in key_from_sexp (array=0x7fba4fffe4b0,
sexp=0x7fba4001a5f0, topname=0x7fbab009244d "public-key",
elems=0x7fbab00924b6 "n") at crypto_rsa.c:103
#9  0x00007fbab00534d0 in GNUNET_CRYPTO_rsa_public_key_decode (
    buf=0x7fba40003630 "(public-key \n (rsa \n  (n
#00C8B6DCDE035F1BD788CE1062F7F4D5DF64AF0C87C6B080B17E5313BEBC0FDAA1A711C03B55447778D53E70BF9CEBA198707D37E5B206C0A6B9EFD4DBBA79B6426487513F7D838EB16D1346D80A
6758C9739EA5B94C51"..., len=310) at crypto_rsa.c:380
#10 0x00007fbab0017129 in parse_rsa_public_key (cls=0x0,
root=0x7fba4001dd60, spec=0x7fba4fffe680) at json_helper.c:799
#11 0x00007fbab00144f6 in GNUNET_JSON_parse (root=0x7fba4001cb40,
spec=0x7fba4fffe610, error_json_name=0x7fba4fffe5c0,
error_line=0x7fba4fffe5bc) at json.c:62
#12 0x000055680f9af868 in TEH_PARSE_json_data
(connection=0x7fba4001e370, root=0x7fba4001cb40, spec=0x7fba4fffe610) at
taler-exchange-httpd_parsing.c:190
#13 0x000055680f9a96ea in TEH_DEPOSIT_handler_deposit (rh=0x55680f9c7988
<handlers+616>, connection=0x7fba4001e370,
connection_cls=0x7fba4001e3c8, upload_data=0x0,
upload_data_size=0x7fba4fffeb70)


Note that 'mainpool' clearly was never set up:

(gdb) print mainpool
$4 = {next = 0x0, mem = 0x0, size = 0, okay = 0, is_mmapped = 0,
cur_alloced = 0, cur_blocks = 0}
(gdb) print no_secure_memory
$5 = 1


My understanding is that the bug is in stdmem.c::_gcry_private_free(),
which first calls _gcry_secmem_free() before calling free(), thus
needlessly grabbing the lock.  A test for whether secmem is disabled
ought to be inserted here, so that the lock is not taken in that case.
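
To illustrate the kind of test I mean, here is a toy model, not
libgcrypt's actual code.  All demo_* names are made up for this sketch,
and it assumes the "disabled" flag is set once during initialization,
before any worker threads exist, so reading it without the lock is safe:

```c
#include <pthread.h>
#include <stdlib.h>

/* Toy model only: every demo_* name is hypothetical. */
pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
int demo_secmem_disabled = 1;    /* assumed to be set once at startup */
unsigned long demo_lock_count;   /* counts lock acquisitions */

/* Models a secure-pool free: takes the lock, looks up the pointer and
   returns 1 if it belonged to the pool.  The pool is empty here, so
   the lookup always fails. */
int
demo_secmem_free (void *p)
{
  int found = 0;

  pthread_mutex_lock (&demo_lock);
  demo_lock_count++;
  (void) p;                      /* pool lookup would go here */
  pthread_mutex_unlock (&demo_lock);
  return found;
}

/* Reported behavior: every free takes the lock first. */
void
demo_free_slow (void *p)
{
  if (! demo_secmem_free (p))
    free (p);
}

/* Suggested behavior: skip the lock when secmem is disabled. */
void
demo_free_fast (void *p)
{
  if (demo_secmem_disabled || ! demo_secmem_free (p))
    free (p);
}
```

With secmem disabled, demo_free_fast() never touches the lock, while
demo_free_slow() acquires it on every call, which is where the futex
contention would come from once many threads free memory concurrently.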


As a workaround, I've added this call to our libgcrypt initialization
sequence:

  /* set custom allocators */
  gcry_set_allocation_handler (&w_malloc, // wrapper around calloc()
                               &w_malloc, // wrapper around calloc()
                               &w_check,  // return 'false'
                               &realloc,
                               &free);
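
For reference, the two wrappers can be as small as the sketch below.
This is an illustration matching the comments in the call above (a
calloc() wrapper and a check that always returns false), not
necessarily the exact code in GNUnet:

```c
#include <stdlib.h>

/* Allocation wrapper: zero-initialized memory via calloc().
   Used for both the normal and the "secure" allocator slot. */
void *
w_malloc (size_t n)
{
  return calloc (1, n);
}

/* Secure-memory check: no pointer is ever in secure memory,
   so always report false (0). */
int
w_check (const void *p)
{
  (void) p;
  return 0;
}
```

These match the handler signatures gcry_set_allocation_handler()
expects: the allocators take a size_t and return void *, and the check
takes a const void * and returns int.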

After that, the GNU Taler exchange could do more than 10x the number of
operations per second on my system (it might be more, but the bottleneck
is now elsewhere).


Happy hacking!

Christian


