[gnutls-devel] GnuTLS | SEGFAULT in libgnutls30 during multithreaded call of `gnutls_record_send` (#1567)

Read-only notification of GnuTLS library development activities gnutls-devel at lists.gnutls.org
Tue Aug 6 17:26:09 CEST 2024



Moritz Schneider created an issue: https://gitlab.com/gnutls/gnutls/-/issues/1567



## Description of problem:

As described in the documentation, we use libgnutls in a multi-threaded environment: one thread for reading and one thread for sending, both on the same socket. This worked well in the past (at least on Debian Buster), but starting with Debian Bookworm (or possibly Debian Bullseye) we occasionally see a segfault. The core file shows where the segfault occurs:

```
#0  0x00007f38af43baea in _gnutls_send_tlen_int (session=session@entry=0x7f387fea3000, type=type@entry=GNUTLS_APPLICATION_DATA,
    htype=htype@entry=4294967295, epoch_rel=epoch_rel@entry=70001, _data=<optimized out>, data_size=<optimized out>, min_pad=0, mflags=1)
    at ../../lib/record.c:611
#1  0x00007f38af43ef66 in _gnutls_send_int (mflags=1, data_size=<optimized out>, _data=<optimized out>, epoch_rel=70001, htype=4294967295,
    type=GNUTLS_APPLICATION_DATA, session=0x7f387fea3000) at ../../lib/record.h:43
#2  gnutls_record_send2 (session=0x7f387fea3000, data=0x7f386f420c40, data_size=<optimized out>, pad=0, flags=<optimized out>)
    at ../../lib/record.c:2068
```

The source code for the top frame is the following condition:

```
611          if (vers->tls13_sem && !(session->internals.flags & GNUTLS_NO_AUTO_REKEY) &&
612              !(record_params->cipher->flags & GNUTLS_CIPHER_FLAG_NO_REKEY)) {
```

and from the disassembly I can tell that this is a null pointer dereference at `record_params->cipher`:

```
  0x7f38af43bae1 <_gnutls_send_tlen_int+993>      mov    0x20(%rsp),%rax
  0x7f38af43bae6 <_gnutls_send_tlen_int+998>      mov    0x8(%rax),%rax
> 0x7f38af43baea <_gnutls_send_tlen_int+1002>     testb  $0x4,0x1c(%rax)
```

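since `GNUTLS_CIPHER_FLAG_NO_REKEY` is equal to `0x4`. The syslog agrees: the faulting address `1c` is the `0x1c` displacement used by the `testb` instruction, i.e. a read through a null base pointer: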

```
segfault at 1c ip 00007f38af43baea sp 00007f38a209a0d0 error 4 in libgnutls.so.30.34.3[7f38af437000+133000] likely on CPU 29 (core 4, socket 1)
```
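The offset arithmetic can be checked with a small sketch. The struct below is a hypothetical stand-in, not the GnuTLS definition: its field names and order follow the gdb output of `*record_params->cipher` in this report, while the field types (and the `sketch_` names) are assumptions. On a typical LP64 platform such as x86-64 Linux, `flags` then lands at offset `0x1c`, so reading `flags` through a null `cipher` pointer touches address `0x1c`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enums standing in for the cipher id and cipher type. */
typedef enum { SKETCH_CIPHER_UNKNOWN = 0 } sketch_cipher_id_t;
typedef enum { SKETCH_CIPHER_AEAD = 0 } sketch_cipher_type_t;

/* Hypothetical mirror of the cipher entry: names and order are taken
 * from the gdb output in this report, the types are assumptions. */
struct sketch_cipher_entry {
	const char *name;          /* offset 0x00 on LP64 */
	sketch_cipher_id_t id;     /* offset 0x08 */
	uint16_t blocksize;        /* offset 0x0c */
	uint16_t keysize;          /* offset 0x0e */
	sketch_cipher_type_t type; /* offset 0x10 */
	uint16_t implicit_iv;      /* offset 0x14 */
	uint16_t explicit_iv;      /* offset 0x16 */
	uint16_t cipher_iv;        /* offset 0x18 */
	uint16_t tagsize;          /* offset 0x1a */
	unsigned flags;            /* offset 0x1c */
};

/* Address that a load of `flags` touches for a given base pointer value. */
static inline uintptr_t sketch_flags_address(uintptr_t cipher_base)
{
	return cipher_base + offsetof(struct sketch_cipher_entry, flags);
}
```

With a base of `0`, `sketch_flags_address` yields `0x1c` under these layout assumptions, which matches both the `0x1c(%rax)` operand and the `segfault at 1c` line.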

From my analysis of the code, the only place in libgnutls where the `cipher` field is set to `NULL` is `_gnutls_epoch_setup_next` (`constate.c:1015`):

```
	if (null_epoch) {
		(*slot)->cipher = cipher_to_entry(GNUTLS_CIPHER_NULL);
		(*slot)->mac = mac_to_entry(GNUTLS_MAC_NULL);
		(*slot)->initialized = 1;
	} else {
		(*slot)->cipher = NULL;
		(*slot)->mac = NULL;
	}
```

However, this function is called from many places.

The parallel reading thread (via `gnutls_record_recv`) received `GNUTLS_E_AGAIN` eight times and then `GNUTLS_E_TOO_MANY_HANDSHAKE_PACKETS` once. My guess is that `GNUTLS_E_TOO_MANY_HANDSHAKE_PACKETS` was returned by `_gnutls13_recv_key_update` (`lib/tls13/key_update.c`). I base this on data from the core file, but of course the data might have changed between calls.

For some reason I don't understand, I can access `record_params->cipher` in GDB just fine. It has the following content:
```
(gdb) p *record_params->cipher
$9 = {name = 0x7f38af599c4c "AES-256-GCM", id = GNUTLS_CIPHER_AES_256_GCM, blocksize = 16, keysize = 32, type = CIPHER_AEAD, implicit_iv = 4, 
  explicit_iv = 8, cipher_iv = 12, tagsize = 16, flags = 0}
```

Frame #2 of the backtrace is in a `switch` statement, in the `case RECORD_SEND_KEY_UPDATE_3` block (most probably after falling through from `RECORD_SEND_KEY_UPDATE_1` and `RECORD_SEND_KEY_UPDATE_2`). While analyzing where those `rsend_state` values are set, I noticed that

```
session->internals.rsend_state = RECORD_SEND_KEY_UPDATE_1
```

is

* written in function `_gnutls13_recv_key_update` in `lib/tls13/key_update.c:117`, possibly without any synchronization, and
* read in `gnutls_record_send2` in `lib/record.c:2038`, definitely without any synchronization.

I would have expected that the read/write access to those fields must be synchronized.
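To illustrate the kind of synchronization I mean, here is a minimal sketch using C11 atomics. It is not GnuTLS code and not a proposed patch: the `sketch_` types and functions are hypothetical, and only the state names echo the ones quoted above. The receiving thread publishes the pending key update with a release store, and the sending thread reads and resets the state in one atomic exchange, so a request is neither lost nor handled twice.

```c
#include <stdatomic.h>

/* Hypothetical states; the names only echo the ones quoted in this report. */
enum sketch_rsend_state {
	SKETCH_SEND_NORMAL = 0,
	SKETCH_SEND_KEY_UPDATE_1 = 1
};

/* Hypothetical session fragment holding the shared state as an atomic. */
struct sketch_session {
	_Atomic int rsend_state;
};

/* Receiving thread: publish that a key update has to be answered. */
static inline void sketch_request_key_update(struct sketch_session *s)
{
	atomic_store_explicit(&s->rsend_state, SKETCH_SEND_KEY_UPDATE_1,
			      memory_order_release);
}

/* Sending thread: fetch the current state and reset it to normal in a
 * single atomic step; returns the state that was pending. */
static inline int sketch_take_send_state(struct sketch_session *s)
{
	return atomic_exchange_explicit(&s->rsend_state, SKETCH_SEND_NORMAL,
					memory_order_acq_rel);
}
```

This only makes the hand-off of the state value itself well-defined. If the receiving thread also replaces the epoch parameters that the sending thread dereferences, those accesses would presumably need their own protection (for example a lock around both paths).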

If I can help in any way to resolve this segfault please ask for additional details (but I will be on vacation until 2024-08-26).

## Version of gnutls used:

libgnutls30 3.7.9-2+deb12u3

## Distributor of gnutls (e.g., Ubuntu, Fedora, RHEL)

Debian Bookworm 12.3

## How reproducible:

Unfortunately I am not able to reproduce it (yet?).
