<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html lang="en">
<head>
<meta content="text/html; charset=US-ASCII" http-equiv="Content-Type">
<title>
GitLab
</title>
<style>img {
max-width: 100%; height: auto;
}
</style>
</head>
<body>
<div class="content">
<p class="details" style="font-style: italic; color: #666;">
<a href="https://gitlab.com/rgacogne">Remi Gacogne</a> created an issue: <a href="https://gitlab.com/gnutls/gnutls/-/issues/1277">#1277</a>
</p>
<div></div>
<h2 dir="auto">
<a id="user-content-description-of-problem" class="anchor" href="#description-of-problem" aria-hidden="true"></a>Description of problem:</h2>
<p dir="auto">Possible race condition leading to memory corruption in <code>trust_list_add_compat</code>, called indirectly from <code>gnutls_x509_trust_list_verify_crt2</code> (see below), when handling outgoing (client) TLS connections from multiple threads. Alternatively, I may be using GnuTLS incorrectly.</p>
<h2 dir="auto">
<a id="user-content-version-of-gnutls-used" class="anchor" href="#version-of-gnutls-used" aria-hidden="true"></a>Version of gnutls used:</h2>
<p dir="auto">3.7.2.</p>
<h2 dir="auto">
<a id="user-content-distributor-of-gnutls-eg-ubuntu-fedora-rhel" class="anchor" href="#distributor-of-gnutls-eg-ubuntu-fedora-rhel" aria-hidden="true"></a>Distributor of gnutls (e.g., Ubuntu, Fedora, RHEL)</h2>
<p dir="auto">Arch Linux and compiled from source.</p>
<h2 dir="auto">
<a id="user-content-how-reproducible" class="anchor" href="#how-reproducible" aria-hidden="true"></a>How reproducible:</h2>
<p dir="auto">I'm seeing a race condition that leads to memory corruption in dnsdist 1.7.0-alpha1 (I'm one of its developers) when using GnuTLS 3.7.2 to handle outgoing (client) TLS connections from multiple threads. I'm trying to understand whether I'm using GnuTLS incorrectly or whether this needs to be fixed in GnuTLS itself.</p>
<p dir="auto">Our design is to create a single <code>gnutls_certificate_credentials_t</code> object while parsing the configuration, in this particular case calling <code>gnutls_certificate_set_x509_system_trust</code> to use the system CA store. PKCS#11 support is enabled in this GnuTLS build, which will be important later.</p>
<p dir="auto">Later, several worker threads each create several new TLS connections. Each <code>gnutls_session_t</code> is only ever accessed by a single thread, but the <code>gnutls_certificate_credentials_t</code> is shared by all connections, by calling <code>gnutls_credentials_set</code> with <code>GNUTLS_CRD_CERTIFICATE</code>. My understanding after reading the "Thread safety" and "gnutls_credentials_set" parts of the documentation is that this should be safe, but perhaps I'm wrong and this is the root cause of my issue.
We also require certificate verification by calling <code>gnutls_session_set_verify_cert</code> with <code>GNUTLS_VERIFY_ALLOW_UNSORTED_CHAIN</code>.</p>
<p dir="auto">We then see memory corruption in the certificate verification code when several handshakes are processed at the same time from different threads:</p>
<pre class="code highlight js-syntax-highlight language-plaintext" lang="plaintext" v-pre="true"><code><span id="LC1" class="line" lang="plaintext">=================================================================</span>
<span id="LC2" class="line" lang="plaintext">==82302==ERROR: AddressSanitizer: attempting double-free on 0x627000085100 in thread T19 (dnsdist/healthC):</span>
<span id="LC3" class="line" lang="plaintext"> #0 0x5610da6eada2 in realloc (/work/pdns/pdns/dnsdistdist/dnsdist+0x12fcda2)</span>
<span id="LC4" class="line" lang="plaintext"> #1 0x7fc1b573ab14 in _gnutls_reallocarray_fast /data/sources/gnutls-3.7.2/lib/mem.c:63:8</span>
<span id="LC5" class="line" lang="plaintext"> #2 0x7fc1b57c6b03 in trust_list_add_compat /data/sources/gnutls-3.7.2/lib/x509/verify-high.c:310:3</span>
<span id="LC6" class="line" lang="plaintext"> #3 0x7fc1b57c6b03 in gnutls_x509_trust_list_get_issuer /data/sources/gnutls-3.7.2/lib/x509/verify-high.c:1165:10</span>
<span id="LC7" class="line" lang="plaintext"> #4 0x7fc1b57c732b in gnutls_x509_trust_list_verify_crt2 /data/sources/gnutls-3.7.2/lib/x509/verify-high.c:1521:7</span>
<span id="LC8" class="line" lang="plaintext"> #5 0x7fc1b5755208 in _gnutls_x509_cert_verify_peers /data/sources/gnutls-3.7.2/lib/cert-session.c:597:10</span>
<span id="LC9" class="line" lang="plaintext"> #6 0x7fc1b57541c0 in auto_verify_cb /data/sources/gnutls-3.7.2/lib/auto-verify.c:40:9</span>
<span id="LC10" class="line" lang="plaintext"> #7 0x7fc1b5719148 in _gnutls_run_verify_callback /data/sources/gnutls-3.7.2/lib/handshake.c:2972:10</span>
<span id="LC11" class="line" lang="plaintext"> #8 0x7fc1b5719148 in _gnutls_run_verify_callback /data/sources/gnutls-3.7.2/lib/handshake.c:2938:5</span>
<span id="LC12" class="line" lang="plaintext"> #9 0x7fc1b571156c in _gnutls13_handshake_client /data/sources/gnutls-3.7.2/lib/handshake-tls13.c:132:9</span>
<span id="LC13" class="line" lang="plaintext"> #10 0x7fc1b571cf41 in handshake_client /data/sources/gnutls-3.7.2/lib/handshake.c:3012:10</span>
<span id="LC14" class="line" lang="plaintext"> #11 0x7fc1b571cf41 in gnutls_handshake /data/sources/gnutls-3.7.2/lib/handshake.c:2855:10</span>
<span id="LC15" class="line" lang="plaintext"> #12 0x5610db811868 in GnuTLSConnection::tryHandshake() /work/pdns/pdns/dnsdistdist/tcpiohandler.cc:1103:13</span>
<span id="LC16" class="line" lang="plaintext"> #13 0x5610db81396b in GnuTLSConnection::tryWrite(std::vector<unsigned char, noinit_adaptor<std::allocator<unsigned char> > > const&, unsigned long&, unsigned long) /work/pdns/pdns/dnsdistdist/tcpiohandler.cc:1145:20</span>
<span id="LC17" class="line" lang="plaintext"> #14 0x5610da8af7ab in TCPIOHandler::tryWrite(std::vector<unsigned char, noinit_adaptor<std::allocator<unsigned char> > > const&, unsigned long&, unsigned long) /work/pdns/pdns/dnsdistdist/./tcpiohandler.hh:402:22</span>
<span id="LC18" class="line" lang="plaintext"> #15 0x5610da8a88b3 in healthCheckTCPCallback(int, boost::any&) /work/pdns/pdns/dnsdistdist/dnsdist-healthchecks.cc:261:37</span>
<span id="LC19" class="line" lang="plaintext"> #16 0x5610db7d3bb4 in boost::function2<void, int, boost::any&>::operator()(int, boost::any&) const /usr/include/boost/function/function_template.hpp:763:14</span>
<span id="LC20" class="line" lang="plaintext"> #17 0x5610db84be27 in EpollFDMultiplexer::run(timeval*, int) /work/pdns/pdns/dnsdistdist/epollmplexer.cc:193:9</span>
<span id="LC21" class="line" lang="plaintext"> #18 0x5610da8a9f64 in handleQueuedHealthChecks(FDMultiplexer&, bool) /work/pdns/pdns/dnsdistdist/dnsdist-healthchecks.cc:451:23</span>
<span id="LC22" class="line" lang="plaintext"> #19 0x5610db6d0ed9 in healthChecksThread() /work/pdns/pdns/dnsdistdist/dnsdist.cc:1907:5</span>
<span id="LC23" class="line" lang="plaintext"> #20 0x7fc1b55433c3 in execute_native_thread_routine /build/gcc/src/gcc/libstdc++-v3/src/c++11/thread.cc:82:18</span>
<span id="LC24" class="line" lang="plaintext"> #21 0x7fc1b568f258 in start_thread (/usr/lib/libpthread.so.0+0x9258)</span>
<span id="LC25" class="line" lang="plaintext"> #22 0x7fc1b522f5e2 in clone (/usr/lib/libc.so.6+0xfe5e2)</span>
<span id="LC26" class="line" lang="plaintext"></span>
<span id="LC27" class="line" lang="plaintext">0x627000085100 is located 0 bytes inside of 14000-byte region [0x627000085100,0x6270000887b0)</span>
<span id="LC28" class="line" lang="plaintext">freed by thread T6 (dnsdist/tcpClie) here:</span>
<span id="LC29" class="line" lang="plaintext"> #0 0x5610da6eada2 in realloc (/work/pdns/pdns/dnsdistdist/dnsdist+0x12fcda2)</span>
<span id="LC30" class="line" lang="plaintext"> #1 0x7fc1b573ab14 in _gnutls_reallocarray_fast /data/sources/gnutls-3.7.2/lib/mem.c:63:8</span>
<span id="LC31" class="line" lang="plaintext"></span>
<span id="LC32" class="line" lang="plaintext">previously allocated by thread T3 (dnsdist/tcpClie) here:</span>
<span id="LC33" class="line" lang="plaintext"> #0 0x5610da6eada2 in realloc (/work/pdns/pdns/dnsdistdist/dnsdist+0x12fcda2)</span>
<span id="LC34" class="line" lang="plaintext"> #1 0x7fc1b573ab14 in _gnutls_reallocarray_fast /data/sources/gnutls-3.7.2/lib/mem.c:63:8</span></code></pre>
<p dir="auto">The trace shows the certificate verification code reallocating an array inside the credentials' trust list (<code>tlist</code>) in <code>trust_list_add_compat</code>, after being called by <code>gnutls_x509_trust_list_get_issuer</code>.
That happens only if PKCS#11 support is enabled and the trust list's <code>pkcs11_token</code> field is set.</p>
<p dir="auto">The documentation for <code>gnutls_x509_trust_list_get_issuer</code> states that "the flag <code>GNUTLS_TL_GET_COPY</code> is required for this function to work with PKCS#11 trust lists in a thread-safe way", but <code>gnutls_x509_trust_list_verify_crt2</code> does not set that flag.</p>
<p dir="auto">Unfortunately, that means another thread might be accessing the array at the same time, or even reallocating it, which leads to memory corruption (a use-after-free, or the double-free reported above).</p>
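<p dir="auto">To make the suspected pattern concrete, here is a minimal, single-threaded sketch of the difference between caching a looked-up issuer in a shared list and handing a copy to the caller. It is only loosely modelled on the behaviour described above and is not GnuTLS code: <code>cert_t</code>, <code>trust_list_t</code>, <code>TL_GET_COPY</code>, <code>get_issuer</code> and <code>trust_list_clear</code> are all hypothetical names invented for illustration.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a certificate; not a GnuTLS type. */
typedef struct {
    char subject[32];
} cert_t;

/* Hypothetical stand-in for a trust list shared by every session. */
typedef struct {
    cert_t **kept;    /* issuers cached for the lifetime of the list */
    size_t kept_len;  /* number of entries in kept */
} trust_list_t;

/* Hypothetical flag, named after the idea behind GNUTLS_TL_GET_COPY. */
#define TL_GET_COPY 1u

/* Allocate a certificate carrying the given subject (truncated to fit). */
cert_t *cert_new(const char *subject)
{
    cert_t *c = calloc(1, sizeof(*c));
    if (c != NULL)
        strncpy(c->subject, subject, sizeof(c->subject) - 1);
    return c;
}

/*
 * Simulate fetching an issuer from a token.
 * With TL_GET_COPY the caller owns the result and the shared list is
 * left untouched. Without it the result is appended to the shared list,
 * which grows the array with realloc and takes no lock.
 */
cert_t *get_issuer(trust_list_t *list, const char *subject, unsigned flags)
{
    assert(list != NULL);

    cert_t *issuer = cert_new(subject);
    if (issuer == NULL)
        return NULL;

    if (flags & TL_GET_COPY)
        return issuer; /* caller must free it */

    /* Unsynchronised write to shared state: unsafe if two threads get here. */
    cert_t **grown = realloc(list->kept,
                             (list->kept_len + 1) * sizeof(*list->kept));
    if (grown == NULL) {
        free(issuer);
        return NULL;
    }
    list->kept = grown;
    list->kept[list->kept_len++] = issuer;
    return issuer; /* owned by the list */
}

/* Release everything the list has cached. */
void trust_list_clear(trust_list_t *list)
{
    assert(list != NULL);
    for (size_t i = 0; i < list->kept_len; i++)
        free(list->kept[i]);
    free(list->kept);
    list->kept = NULL;
    list->kept_len = 0;
}
```

<p dir="auto">In this model, two threads calling <code>get_issuer</code> without <code>TL_GET_COPY</code> on the same list could both pass the same old <code>kept</code> pointer to <code>realloc</code>, which would be consistent with the double-free in the trace. Calls made with <code>TL_GET_COPY</code> never write to the shared list.</p>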
<p dir="auto">Note that <code>gnutls_x509_trust_list_get_issuer</code> was not called before <a href="https://gitlab.com/gnutls/gnutls/-/commit/e97a5f07bc9d9394424c6520656e902019fcb380" data-original="e97a5f07bc9d9394424c6520656e902019fcb380" data-link="false" data-link-reference="false" data-project="179611" data-commit="e97a5f07bc9d9394424c6520656e902019fcb380" data-reference-type="commit" data-container="body" data-placement="top" title="gnutls_x509_trust_list_verify_crt2: skip duped certs for PKCS11 too" class="gfm gfm-commit has-tooltip">e97a5f07</a>, so this behaviour might have been introduced in 3.7.1.</p>
<h2 dir="auto">
<a id="user-content-actual-results" class="anchor" href="#actual-results" aria-hidden="true"></a>Actual results:</h2>
<p dir="auto">Memory corruption.</p>
<h2 dir="auto">
<a id="user-content-expected-results" class="anchor" href="#expected-results" aria-hidden="true"></a>Expected results:</h2>
<p dir="auto">No memory corruption.</p>
<p dir="auto">I would welcome some help understanding whether I should be doing things differently in dnsdist in order to prevent this. Many thanks in advance :)</p>
</div>
</body>
</html>