From hkario at redhat.com Thu Mar 7 14:46:22 2024
From: hkario at redhat.com (Hubert Kario)
Date: Thu, 07 Mar 2024 14:46:22 +0100
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
Message-ID: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com>

Hello,

I've tested libgcrypt against the Marvin Attack[1] and have verified it to
be vulnerable.

Running the test harness from marvin-toolkit[2] I got the following result:

tlsfuzzer analyse.py version 5 analysis
Sign test mean p-value: 0.1769, median p-value: 0.01503, min p-value: 4.587e-55
Friedman test (chisquare approximation) for all samples
p-value: 2.0539047856632484e-85
Worst pair: 2(no_padding_48), 4(signature_padding_8)
Mean of differences: 2.09765e-07s, 95% CI: 1.83451e-07s, 2.311208e-07s (±2.384e-08s)
Median of differences: 2.09797e-07s, 95% CI: 1.81122e-07s, 2.323270e-07s (±2.560e-08s)
Trimmed mean (5%) of differences: 2.09885e-07s, 95% CI: 1.84586e-07s, 2.308092e-07s (±2.311e-08s)
Trimmed mean (25%) of differences: 2.10169e-07s, 95% CI: 1.84646e-07s, 2.302561e-07s (±2.281e-08s)
Trimmed mean (45%) of differences: 2.09076e-07s, 95% CI: 1.82240e-07s, 2.321705e-07s (±2.497e-08s)
Trimean of differences: 2.08114e-07s, 95% CI: 1.80188e-07s, 2.266213e-07s (±2.322e-08s)

Looking more closely at results, the side-channel from removal of blinding
or conversion of the integer returned from the RSADP() operation[3] to a
byte string is the most significant source of leakage.
That means that all padding modes that use RSA will be vulnerable: raw RSA
(RSASVE), PKCS#1v1.5, and RSA-OAEP.

But even with this code fixed, because the API of the decryption operation
doesn't permit a side-channel free returning of error messages (as the
returned object has different type and size depending on error or size of
the decrypted message), fixing it will require either implementing implicit
rejection or providing API specifically for PKCS#1v1.5 decryption.
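
To illustrate the integer-to-byte-string part of the leak, here is a
minimal sketch in Python (not libgcrypt code). The helper names
i2osp_minimal and i2osp_fixed and the 2048-bit modulus size are assumptions
made for the example. It only shows that a minimal-length encoding exposes
the number of leading zero bytes of the RSADP() result, while a
fixed-length encoding in the style of I2OSP (RFC 8017, section 4.1) does
not. Python integers are not constant-time, so this demonstrates the shape
of the problem, not a fix.

```python
# Illustrative sketch only; i2osp_minimal and i2osp_fixed are hypothetical
# helper names, not libgcrypt functions.


def i2osp_minimal(x: int) -> bytes:
    """Encode x without leading zero bytes: the length depends on the value."""
    length = max(1, (x.bit_length() + 7) // 8)
    return x.to_bytes(length, "big")


def i2osp_fixed(x: int, k: int) -> bytes:
    """Encode x into exactly k bytes, whatever its value (I2OSP style)."""
    return x.to_bytes(k, "big")


k = 256                     # byte length of a 2048-bit modulus (assumed)
m_small = 1 << 2000         # stand-in RSADP result with leading zero bytes
m_large = (1 << 2047) + 1   # stand-in RSADP result that fills all k bytes

# Minimal encoding: the two lengths differ, so work done downstream differs.
print(len(i2osp_minimal(m_small)), len(i2osp_minimal(m_large)))
# Fixed encoding: both lengths equal k.
print(len(i2osp_fixed(m_small, k)), len(i2osp_fixed(m_large, k)))
```

Anything that copies, allocates or parses the minimal encoding then takes
a value-dependent amount of time.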

This issue has been assigned CVE-2024-2236

 1 - https://people.redhat.com/~hkario/marvin/
 2 - https://github.com/tomato42/marvin-toolkit/tree/master/example/libgcrypt
 3 - https://datatracker.ietf.org/doc/html/rfc8017#section-5.1.2
-- 
Regards,
Hubert Kario
Principal Quality Engineer, RHEL Crypto team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 115, 612 00, Brno, Czech Republic

From gniibe at fsij.org Fri Mar 8 03:55:55 2024
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Fri, 08 Mar 2024 11:55:55 +0900
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com>
Message-ID: <87ttlhctz8.fsf@akagi.fsij.org>

Hello,

Hubert Kario wrote:
> I've tested libgcrypt against the Marvin Attack[1] and have verified it to
> be vulnerable.

Thank you for your report.

My understanding is that libgcrypt exposes timing differences against
chosen cipher texts by your timing analysis.

> Looking more closely at results, the side-channel from removal of blinding
> or conversion of the integer returned from the RSADP() operation[3] to a
> byte string is the most significant source of leakage.
> That means that all padding modes that use RSA will be vulnerable: raw RSA
> (RSASVE), PKCS#1v1.5, and RSA-OAEP.

The major possible causes of timing differences in libgcrypt are:

  an old fork of GNU MP Bignum library for multi precision integer
  arithmetic.

  S-expression handling for multi precision integer representation.

I'd agree that we need documentation update of libgcrypt to explain
possible timing differences of libgcrypt RSA implementation; Well,
libgcrypt users should know that RSA private key may be at risk when
implementing decryption network service if timing information is
available to remote side.

If possible, could you give us some concrete information how large the
side-channel to compose a possible attack?
It would be good for us to know the impact of timing differences.
-- 

From gniibe at fsij.org Fri Mar 8 07:28:56 2024
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Fri, 08 Mar 2024 15:28:56 +0900
Subject: Exposing the gcry_md_hash_buffers_extract function
Message-ID: <87le6tck47.fsf@akagi.fsij.org>

Hello,

While we have gcry_md_hash_buffers in libgcrypt, and
_gcry_md_hash_buffers_extract internally, we don't have the
gcry_md_hash_buffers_extract function exposed.

For an extendable output function like SHAKE, it is good to have
gcry_md_hash_buffers_extract.

Shall I do that for next libgcrypt release?
-- 

From wk at gnupg.org Fri Mar 8 10:24:02 2024
From: wk at gnupg.org (Werner Koch)
Date: Fri, 08 Mar 2024 10:24:02 +0100
Subject: Exposing the gcry_md_hash_buffers_extract function
In-Reply-To: <87le6tck47.fsf@akagi.fsij.org> (NIIBE Yutaka's message of "Fri, 08 Mar 2024 15:28:56 +0900")
References: <87le6tck47.fsf@akagi.fsij.org>
Message-ID: <87plw5qdot.fsf@jacob.g10code.de>

On Fri, 8 Mar 2024 15:28, NIIBE Yutaka said:

> While we have gcry_md_hash_buffers in libgcrypt, and
> _gcry_md_hash_buffers_extract internally, we don't have the
> gcry_md_hash_buffers_extract function exposed.

I think we should use a better name for that function.
gcry_md_hash_buffers_extract is the same as gcry_md_hash_buffers with
the addition of the size of the provided result buffer. Thus
gcry_md_hash_buffers_ext sounds like a better name to me

> Shall I do that for next libgcrypt release?

Yep.

Shalom-Salam,

   Werner

-- 
The pioneers of a warless world are the youth that
refuse military service. - A. Einstein

From hkario at redhat.com Fri Mar 8 11:27:30 2024
From: hkario at redhat.com (Hubert Kario)
Date: Fri, 08 Mar 2024 11:27:30 +0100
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <87ttlhctz8.fsf@akagi.fsij.org>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com> <87ttlhctz8.fsf@akagi.fsij.org>
Message-ID: <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com>

On Friday, 8 March 2024 03:55:55 CET, NIIBE Yutaka wrote:
> Hello,
>
> Hubert Kario wrote:
>> I've tested libgcrypt against the Marvin Attack[1] and have verified it to
>> be vulnerable.
>
> Thank you for your report.
>
> My understanding is that libgcrypt exposes timing differences against
> chosen cipher texts by your timing analysis.

Correct, it's vulnerable to the chosen ciphertext attacks using timing
as a side-channel.

>> Looking more closely at results, the side-channel from removal of blinding
>> or conversion of the integer returned from the RSADP() operation[3] to a
>> byte string is the most significant source of leakage.
>> That means that all padding modes that use RSA will be vulnerable: raw RSA
>> (RSASVE), PKCS#1v1.5, and RSA-OAEP.
>
> The major possible causes of timing differences in libgcrypt are:
>
>   an old fork of GNU MP Bignum library for multi precision integer
>   arithmetic.
>
>   S-expression handling for multi precision integer representation.

not only for integers, the result of the PKCS#1v1.5 decryption is also
returned as an S-expression and it includes memory allocation that is
exactly the size of the message. Combined with no memory allocation in
case of padding check failure, that gives a very clear signal.

So it's general S-expression handling of data that has secret lengths.
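
The allocation signal described above can be sketched as follows. This is
a textbook EME-PKCS1-v1_5 unpadding routine (RFC 8017, section 7.2.2)
written in Python for illustration, not libgcrypt's implementation, and it
is deliberately not constant-time. The function names unpad_pkcs1_v15 and
observable_allocation are hypothetical. The point is only that the size of
the returned object is a function of secret data: zero on a padding
failure, otherwise exactly the message length.

```python
from typing import Optional

# Illustrative sketch only; hypothetical names, not libgcrypt code, and
# intentionally NOT constant-time.


def unpad_pkcs1_v15(em: bytes) -> Optional[bytes]:
    """Return the message from 0x00 0x02 PS 0x00 M, or None on bad padding."""
    if len(em) < 11 or em[0] != 0x00 or em[1] != 0x02:
        return None
    try:
        sep = em.index(0x00, 2)   # first zero byte after the 0x00 0x02 header
    except ValueError:
        return None
    if sep < 10:                  # PS must be at least 8 non-zero bytes
        return None
    return em[sep + 1:]           # allocation sized exactly to the message


def observable_allocation(em: bytes) -> int:
    """Bytes allocated for the result: what an attacker indirectly times."""
    msg = unpad_pkcs1_v15(em)
    return 0 if msg is None else len(msg)


em_long = b"\x00\x02" + b"\xff" * 21 + b"\x00" + b"A" * 8
em_short = b"\x00\x02" + b"\xff" * 27 + b"\x00" + b"A" * 2
em_bad = b"\x00\x01" + b"\xff" * 30

for em in (em_long, em_short, em_bad):
    print(observable_allocation(em))
```

An API that always fills a caller-provided, fixed-size buffer and reports
the length separately would at least remove the allocation size from the
picture; whether that matches the eventual libgcrypt fix is not something
this sketch claims.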

> I'd agree that we need documentation update of libgcrypt to explain
> possible timing differences of libgcrypt RSA implementation; Well,
> libgcrypt users should know that RSA private key may be at risk when
> implementing decryption network service if timing information is
> available to remote side.

+1 to that. Not every implementation needs to be side-channel safe, but
if it isn't, then the covered threat model needs to be documented so
that users can make informed decisions.

My reading of current https://gnupg.org/documentation/security.html
is that remote timing attacks are in scope. Only microarchitectural
attacks, like SPECTRE, are outside the threat model.

> If possible, could you give us some concrete information how large the
> side-channel to compose a possible attack? It would be good for us to
> know the impact of timing differences.

Not sure if I understand the question...

The size of the side-channel in libgcrypt is about 200 ns.
The smallest side-channel I was able to successfully differentiate
over the network, across 5 router hops (2 physically separate data centres
in the same city), is about 1 ns.

So in practice, to _fix_ a timing side-channel, the leakage needs to be
completely eliminated. Otherwise the attack is just a question of
attacker's persistence, not size of the side-channel.
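
A rough way to see why a smaller side-channel only costs more samples: if
one assumes the paired timing differences carry independent Gaussian noise
with standard deviation sigma (a simplification; the tlsfuzzer analysis
quoted earlier uses the sign test and the Friedman test, not this z-test),
the number of samples needed to detect a mean difference delta grows as
(sigma / delta)^2. The sketch below uses the usual one-sample z-test
sample-size formula. The value of sigma is a made-up placeholder, not a
measurement, and required_samples is a hypothetical helper name.

```python
import math
from statistics import NormalDist

# Illustrative sketch only. sigma is a hypothetical noise level, and the
# Gaussian z-test is a simplification of the analysis actually used.


def required_samples(delta_s: float, sigma_s: float,
                     alpha: float = 0.05, power: float = 0.8) -> int:
    """Samples needed to detect a mean difference delta_s (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sigma_s / delta_s) ** 2)


sigma = 10e-6  # 10 microseconds of measurement noise (assumed)
for delta in (200e-9, 1e-9):
    print(f"delta = {delta:.0e} s -> about {required_samples(delta, sigma)} samples")
```

Under these assumptions, shrinking the difference from 200 ns to 1 ns
multiplies the required sample count by roughly 200^2 = 40000, but never
makes it infinite.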
-- 
Regards,
Hubert Kario
Principal Quality Engineer, RHEL Crypto team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 115, 612 00, Brno, Czech Republic

From gniibe at fsij.org Fri Mar 15 07:42:24 2024
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Fri, 15 Mar 2024 15:42:24 +0900
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com> <87ttlhctz8.fsf@akagi.fsij.org> <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com>
Message-ID: <87jzm49esv.fsf@akagi.fsij.org>

Hello, again,

Hubert Kario wrote:
> Correct, it's vulnerable to the chosen ciphertext attacks using timing
> as a side-channel.

Thank you for your confirmation.

>>> Looking more closely at results, the side-channel from removal of blinding
>>> or conversion of the integer returned from the RSADP() operation[3] to a
>>> byte string is the most significant source of leakage.
>>> That means that all padding modes that use RSA will be vulnerable: raw RSA
>>> (RSASVE), PKCS#1v1.5, and RSA-OAEP.
>>
>> The major possible causes of timing differences in libgcrypt are:
>>
>>   an old fork of GNU MP Bignum library for multi precision integer
>>   arithmetic.
>>
>>   S-expression handling for multi precision integer representation.
>
> not only for integers, the result of the PKCS#1v1.5 decryption is also
> returned as an S-expression and it includes memory allocation that is
> exactly the size of the message. Combined with no memory allocation in
> case of padding check failure, that gives a very clear signal.
>
> So it's general S-expression handling of data that has secret lengths.

Thank you for your clarification.

>> I'd agree that we need documentation update of libgcrypt to explain
>> possible timing differences of libgcrypt RSA implementation; Well,
>> libgcrypt users should know that RSA private key may be at risk when
>> implementing decryption network service if timing information is
>> available to remote side.
>
> +1 to that. Not every implementation needs to be side-channel safe, but
> if it isn't, then the covered threat model needs to be documented so
> that users can make informed decisions.
>
> My reading of current https://gnupg.org/documentation/security.html
> is that remote timing attacks are in scope. Only microarchitectural
> attacks, like SPECTRE, are outside the threat model.

I think that we need to update the document to avoid possible confusion.

>> If possible, could you give us some concrete information how large the
>> side-channel to compose a possible attack? It would be good for us to
>> know the impact of timing differences.
>
> Not sure if I understand the question...
>
> The size of the side-channel in libgcrypt is about 200 ns.
> The smallest side-channel I was able to successfully differentiate
> over the network, across 5 router hops (2 physically separate data centres
> in the same city), is about 1 ns.
>
> So in practice, to _fix_ a timing side-channel, the leakage needs to be
> completely eliminated. Otherwise the attack is just a question of
> attacker's persistence, not size of the side-channel.

Again, thank you for your clarification.

I tried hard to consider a possible scenario we could imagine. I
managed to come up with something like:

* A crypto developer uses libgcrypt to build a secure enclave
  implementation which offers RSADP service.

* He is not dumb, so, error return of gcry_pk_decrypt is hidden in the
  service.

* But he is kind enough to attackers not having protection measures
  against abuse for the service.

* In the service, side-channel of timing information is available to
  attackers.
  Suppose there is no limitation for number of invocations.

In this scenario, yes, you are right that it's a matter of attacker's
persistence (or his/product lifespan). In theory, even with terrible
S/N ratio, information can be transmitted over a channel.

Thus, it means that RSA private key may be at risk, in this kind of
hypothetical scenario.

My original question was... about quantitative evaluation and
possibility in real cases. In other words, my interest is: if there are
any existing applications/services/products/etc., and the degree of how
likely are these problems and how much effort/time is needed to recover
RSA private key, in such a possible scenario.

Well, I think that there are other protection measures than eliminating
timing difference in the crypto library. We can consider the case of a
USB device. The communication from a device to host is basically
time-slotted (or device implementation can wait its response to the next
Start-Of-Frame, if needed). Timing information for the computation of
crypto operation can be hidden by time-slotted response.

Let us see how we can change our documentation.
-- 

From verbuecheln at posteo.de Fri Mar 15 13:37:16 2024
From: verbuecheln at posteo.de (Stephan Verbücheln)
Date: Fri, 15 Mar 2024 12:37:16 +0000
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com>
Message-ID:

Hello

Thank you for your work and sharing your results!

How about the use case of interactively authenticating to a server
which is not controlled by oneself and therefore not fully trusted?
Since the authentication is interactive, the timing could matter.

For example, I am using my PGP key for SSH public-key authentication to
github.com and alike.

Regards
Stephan

From hkario at redhat.com Fri Mar 15 15:06:51 2024
From: hkario at redhat.com (Hubert Kario)
Date: Fri, 15 Mar 2024 15:06:51 +0100
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To:
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com>
Message-ID: <30d9dfee-ac6e-41ca-bff9-f97b5db1af52@redhat.com>

On Friday, 15 March 2024 13:37:16 CET, Stephan Verbücheln via Gcrypt-devel wrote:
> Hello
>
> Thank you for your work and sharing your results!
>
> How about the use case of interactively authenticating to a server
> which is not controlled by oneself and therefore not fully trusted?
> Since the authentication is interactive, the timing could matter.
>
> For example, I am using my PGP key for SSH public-key authentication to
> github.com and alike.

Authentication uses signing, not decryption. While there are also timing
attacks on signing operations (see Kocher 1996 as the first example of
those), that's not what I have been testing or tried to exploit.

While presence of timing attacks in decryption is a red flag, it's not
a guarantee that timing attacks in signing are exploitable. Or vice versa.
An implementation vulnerable to Bleichenbacher may be completely immune to
Kocher-like attacks and an implementation vulnerable to Kocher can be
completely immune to Bleichenbacher like attacks.
(though do note that Kocher allows for private key extraction, so if a
Kocher like attack is possible, decryption of captured ciphertexts is also
possible)
-- 
Regards,
Hubert Kario
Principal Quality Engineer, RHEL Crypto team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 115, 612 00, Brno, Czech Republic

From hkario at redhat.com Fri Mar 15 15:14:34 2024
From: hkario at redhat.com (Hubert Kario)
Date: Fri, 15 Mar 2024 15:14:34 +0100
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <87jzm49esv.fsf@akagi.fsij.org>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com> <87ttlhctz8.fsf@akagi.fsij.org> <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com> <87jzm49esv.fsf@akagi.fsij.org>
Message-ID:

On Friday, 15 March 2024 07:42:24 CET, NIIBE Yutaka wrote:
> Hello, again,
>
> Hubert Kario wrote:
>>> I'd agree that we need documentation update of libgcrypt to explain
>>> possible timing differences of libgcrypt RSA implementation; Well,
>>> libgcrypt users should know that RSA private key may be at risk when
>>> implementing decryption network service if timing information is
>>> available to remote side.
>>
>> +1 to that. Not every implementation needs to be side-channel safe, but
>> if it isn't, then the covered threat model needs to be documented so
>> that users can make informed decisions.
>>
>> My reading of current https://gnupg.org/documentation/security.html
>> is that remote timing attacks are in scope. Only microarchitectural
>> attacks, like SPECTRE, are outside the threat model.
>
> I think that we need to update the document to avoid possible confusion.

Please do.

>>> If possible, could you give us some concrete information how large the
>>> side-channel to compose a possible attack? It would be good for us to
>>> know the impact of timing differences.
>>
>> Not sure if I understand the question...
>>
>> The size of the side-channel in libgcrypt is about 200 ns.
>> The smallest side-channel I was able to successfully differentiate
>> over the network, across 5 router hops (2 physically separate data centres
>> in the same city), is about 1 ns.
>>
>> So in practice, to _fix_ a timing side-channel, the leakage needs to be
>> completely eliminated. Otherwise the attack is just a question of
>> attacker's persistence, not size of the side-channel.
>
> Again, thank you for your clarification.
>
>
> I tried hard to consider a possible scenario we could imagine. I
> managed to come up with something like:
>
> * A crypto developer uses libgcrypt to build a secure enclave
>   implementation which offers RSADP service.
>
> * He is not dumb, so, error return of gcry_pk_decrypt is hidden in the
>   service.
>
> * But he is kind enough to attackers not having protection measures
>   against abuse for the service.
>
> * In the service, side-channel of timing information is available to
>   attackers. Suppose there is no limitation for number of invocations.
>
> In this scenario, yes, you are right that it's a matter of attacker's
> persistence (or his/product lifespan). In theory, even with terrible
> S/N ratio, information can be transmitted over a channel.
>
> Thus, it means that RSA private key may be at risk, in this kind of
> hypothetical scenario.
>
> My original question was... about quantitative evaluation and
> possibility in real cases. In other words, my interest is: if there are
> any existing applications/services/products/etc., and the degree of how
> likely are these problems and how much effort/time is needed to recover
> RSA private key, in such a possible scenario.

The current threat model specifies "Libgcrypt has been developed for use
in a wide variety of platforms with different security needs."
My reading of that is that it's an appropriate library for general purpose
cryptography.

So yes, that means that Marvin requires such hypothetical scenario as you
talk about.
But I'd say that using libgcrypt to implement a network API end point
that accepts JSON Web Encryption tokens is not outside the realm of
possibility, or even a far fetched idea. And definitely within the threat
model as currently documented.

> Well, I think that there are other protection measures than eliminating
> timing difference in the crypto library. We can consider the case of a
> USB device. The communication from a device to host is basically
> time-slotted (or device implementation can wait its response to the next
> Start-Of-Frame, if needed). Timing information for the computation of
> crypto operation can be hidden by time-slotted response.

Actually no. If the time slots are consistent (say, the USB device returns
the message only on the second, on the dot), then the attacker can tune the
time when it _starts_ the operation so that it ends exactly at the second.
Then quicker operations will be returned earlier, while slower will be
returned a second later.

The only way to effectively hide timing side-channel is to have operation
take the same amount of time always, but make it very long, so that
additional load can't really push it towards that time. I mean like 30 to
60 seconds per operation.

That won't mask any other side-channels, but may be enough to protect
against simple timing attacks.
-- 
Regards,
Hubert Kario
Principal Quality Engineer, RHEL Crypto team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 115, 612 00, Brno, Czech Republic

From gniibe at fsij.org Sat Mar 16 00:43:58 2024
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Sat, 16 Mar 2024 08:43:58 +0900
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To:
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com> <87ttlhctz8.fsf@akagi.fsij.org> <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com> <87jzm49esv.fsf@akagi.fsij.org>
Message-ID: <87h6h79i2p.fsf@akagi.fsij.org>

Hubert Kario wrote:
> Actually no.
> If the time slots are consistent (say, the USB device returns
> the message only on the second, on the dot), then the attacker can tune the
> time when it _starts_ the operation so that it ends exactly at the second.
> Then quicker operations will be returned earlier, while slower will be
> returned a second later.

This is not the communication of USB bus. The request from host is also
time-slotted. Your claim above would be only valid if the attacker can
start the request of the crypto operation from another channel where
timing can be accurately controlled, and the response is on USB bus (for
some reason).

I don't think this is a general scenario in the real world.
-- 

From cllang at redhat.com Mon Mar 18 13:41:42 2024
From: cllang at redhat.com (Clemens Lang)
Date: Mon, 18 Mar 2024 13:41:42 +0100
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <87jzm49esv.fsf@akagi.fsij.org>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com> <87ttlhctz8.fsf@akagi.fsij.org> <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com> <87jzm49esv.fsf@akagi.fsij.org>
Message-ID:

Hi,

> On 15. Mar 2024, at 07:42, NIIBE Yutaka wrote:
>
> My original question was... about quantitative evaluation and
> possibility in real cases. In other words, my interest is: if there are
> any existing applications/services/products/etc., and the degree of how
> likely are these problems and how much effort/time is needed to recover
> RSA private key, in such a possible scenario.

Just to give you a rough ball park of some numbers:

I looked at the same vulnerability in Apple's CoreCrypto library.
CVE-2024-23218 was assigned for that.

When directly measuring the affected decrypt operation, decryption of a
cipher text without using the private key just by making calls to the
timing oracle needed about 24 hours. I didn't bother attempting to
optimize this, and didn't parallelize it.

Now, over the network, you'll need more samples due to the noise.
Hubert can probably guesstimate how many more samples, but let's say you'd
need 100 times of what you'd need locally. You can assume the attacker
isn't halfway around the world, but a few hops next to you in some Amazon
or Google cloud datacenter.

That would still mean the attacker would need 100 days to decrypt a single
cipher text. However, this entire attack can be run in parallel. You don't
need to always talk to the same server. If somebody were running
a distributed service that does RSA decryption with an observable timing
channel across 100 nodes, we're back at 24 hours.

Sending this many requests might be detected as abuse, so an attacker
would likely have to adequately reduce the number of queries to hide them
in the noise.

Overall, definitely not something somebody would do for all captured
cipher texts, but for a high-value target in some bigger cloud deployment,
it certainly sounds a lot more doable.

-- 
Clemens Lang
RHEL Crypto Team
Red Hat

From hkario at redhat.com Tue Mar 19 14:22:44 2024
From: hkario at redhat.com (Hubert Kario)
Date: Tue, 19 Mar 2024 14:22:44 +0100
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To:
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com> <87ttlhctz8.fsf@akagi.fsij.org> <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com> <87jzm49esv.fsf@akagi.fsij.org>
Message-ID:

On Monday, 18 March 2024 13:41:42 CET, Clemens Lang wrote:
> Hi,
>
>> On 15. Mar 2024, at 07:42, NIIBE Yutaka wrote:
>>
>> My original question was... about quantitative evaluation and
>> possibility in real cases. In other words, my interest is: if there are
>> any existing applications/services/products/etc., and the degree of how
>> likely are these problems and how much effort/time is needed to recover
>> RSA private key, in such a possible scenario.
>
> Just to give you a rough ball park of some numbers:
>
> I looked at the same vulnerability in Apple's CoreCrypto
> library. CVE-2024-23218 was assigned for that.
>
> When directly measuring the affected decrypt operation,
> decryption of a cipher text without using the private key just
> by making calls to the timing oracle needed about 24 hours. I
> didn't bother attempting to optimize this, and didn't
> parallelize it.
>
> Now, over the network, you'll need more samples due to the
> noise. Hubert can probably guesstimate how many more samples,
> but let's say you'd need 100 times of what you'd need locally.
> You can assume the attacker isn't halfway around the world, but
> a few hops next to you in some Amazon or Google cloud
> datacenter.

For same-switch attack vs loopback attack it's a factor of 4.
For more remote connections there are too many variables to provide a good
estimate.

-- 
Regards,
Hubert Kario
Principal Quality Engineer, RHEL Crypto team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 115, 612 00, Brno, Czech Republic

From hkario at redhat.com Tue Mar 19 14:28:02 2024
From: hkario at redhat.com (Hubert Kario)
Date: Tue, 19 Mar 2024 14:28:02 +0100
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <87h6h79i2p.fsf@akagi.fsij.org>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com> <87ttlhctz8.fsf@akagi.fsij.org> <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com> <87jzm49esv.fsf@akagi.fsij.org> <87h6h79i2p.fsf@akagi.fsij.org>
Message-ID:

On Saturday, 16 March 2024 00:43:58 CET, NIIBE Yutaka wrote:
> Hubert Kario wrote:
>> Actually no. If the time slots are consistent (say, the USB device returns
>> the message only on the second, on the dot), then the attacker can tune the
>> time when it _starts_ the operation so that it ends exactly at the second.
>> Then quicker operations will be returned earlier, while slower will be
>> returned a second later.
>
> This is not the communication of USB bus. The request from host is also
> time-slotted.
> Your claim above would be only valid if the attacker can
> start the request of the crypto operation from another channel where
> timing can be accurately controlled, and the response is on USB bus (for
> some reason).

If the communication is like that in both directions, then yes, it's more
problematic. But as long as there is a variability in the responses, the
statistical tests I'm using will still work.

Like, if the operation normally takes between 1.8 and 2.2 s, and the
communication can happen every 0.1 s, then the attack is still possible.

It only won't be possible if the inherent variability is completely hidden
by the quantization, like if in the above example the communication could
happen only every 10 s.

-- 
Regards,
Hubert Kario
Principal Quality Engineer, RHEL Crypto team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 115, 612 00, Brno, Czech Republic

From jcb62281 at gmail.com Wed Mar 20 02:44:51 2024
From: jcb62281 at gmail.com (Jacob Bachmeyer)
Date: Tue, 19 Mar 2024 20:44:51 -0500
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To:
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com> <87ttlhctz8.fsf@akagi.fsij.org> <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com> <87jzm49esv.fsf@akagi.fsij.org> <87h6h79i2p.fsf@akagi.fsij.org>
Message-ID: <65FA3F93.2050502@gmail.com>

Hubert Kario via Gcrypt-devel wrote:
> On Saturday, 16 March 2024 00:43:58 CET, NIIBE Yutaka wrote:
>> Hubert Kario wrote:
>>> Actually no. If the time slots are consistent (say, the USB device
>>> returns the message only on the second, on the dot), then the attacker
>>> can tune the time when it _starts_ the operation so that it ends
>>> exactly at the second.
>>> Then quicker operations will be returned earlier, while slower will be
>>> returned a second later.
>>
>> This is not the communication of USB bus. The request from host is also
>> time-slotted.
>> Your claim above would be only valid if the attacker can
>> start the request of the crypto operation from another channel where
>> timing can be accurately controlled, and the response is on USB bus (for
>> some reason).
>
> If the communication is like that in both directions, then yes, it's more
> problematic. But as long as there is a variability in the responses, the
> statistical tests I'm using will still work.
>
> Like, if the operation normally takes between 1.8 and 2.2 s, and the
> communication can happen every 0.1 s, then the attack is still possible.
>
> It only won't be possible if the inherent variability is completely
> hidden by the quantization, like if in the above example the
> communication could happen only every 10 s.

The method to harden a USB device against this type of attack is to work
out the worst-case computation time, and always hold the response until
that time (measured in USB time slots) has elapsed. To use numbers from
your example, the device performs the operation, completing it in 18 to 22
time slots, but holds the response until 24 time slots have elapsed from
the request.

This of course requires actually knowing how your program works and its
worst-case running time, which sadly is probably rare in modern commercial
programming.

The device also must guard against a malicious host by having its own
clock (which is needed for its processor and USB interface anyway) and
shutting down if the time slots it sees on the bus do not align with the
USB spec. (If I remember correctly, the USB spec requires each time slot
to be some number of milliseconds, but the USB host determines the precise
timing.) Otherwise, a non-standard malicious host could "bend" the slot
timing enough that the fixed response delay is not always sufficient for
the operation to complete.
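
The numbers used in this exchange can be checked with a small model. This
is a sketch in Python, not device firmware. The function name release_ms
and the use of integer milliseconds are choices made for the example, and
the model assumes the request arrives exactly on a slot boundary and that
the response leaves on the first slot boundary after the computation
finishes, or after hold_slots slots if that is later.

```python
# Illustrative model only; release_ms is a hypothetical name and the
# slot-aligned request is an assumption of the sketch.


def release_ms(op_ms: int, slot_ms: int, hold_slots: int = 0) -> int:
    """Time (ms after the request) at which the response appears on the bus."""
    slots_needed = -(-op_ms // slot_ms)          # ceiling division
    return max(slots_needed, hold_slots) * slot_ms


fast, slow = 1800, 2200   # operation takes 1.8 s to 2.2 s, as in the thread

# 0.1 s slots: fast and slow operations are released in different slots.
print(release_ms(fast, 100), release_ms(slow, 100))

# 10 s slots: the variability is hidden by the quantization.
print(release_ms(fast, 10_000), release_ms(slow, 10_000))

# 0.1 s slots, but hold every response until 24 slots have elapsed.
print(release_ms(fast, 100, hold_slots=24), release_ms(slow, 100, hold_slots=24))
```

The hold only helps while 24 slots really is an upper bound: in this model
an operation of 2500 ms would still be released late, at slot 25, which is
the case a host that stretches or shortens the slot timing tries to
provoke.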

-- Jacob

From hkario at redhat.com Wed Mar 20 14:18:28 2024
From: hkario at redhat.com (Hubert Kario)
Date: Wed, 20 Mar 2024 14:18:28 +0100
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <65FA3F93.2050502@gmail.com>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com> <87ttlhctz8.fsf@akagi.fsij.org> <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com> <87jzm49esv.fsf@akagi.fsij.org> <87h6h79i2p.fsf@akagi.fsij.org> <65FA3F93.2050502@gmail.com>
Message-ID:

On Wednesday, 20 March 2024 02:44:51 CET, Jacob Bachmeyer wrote:
> Hubert Kario via Gcrypt-devel wrote:
>> On Saturday, 16 March 2024 00:43:58 CET, NIIBE Yutaka wrote: ...
>
> The method to harden a USB device against this type of attack
> is to work out the worst-case computation time, and always hold
> the response until that time (measured in USB time slots) has
> elapsed. To use numbers from your example, the device performs
> the operation, completing it in 18 to 22 time slots, but holds
> the response until 24 time slots have elapsed from the request.
>
> This of course requires actually knowing how your program works
> and its worst-case running time, which sadly is probably rare in
> modern commercial programming.
>
> The device also must guard against a malicious host by having
> its own clock (which is needed for its processor and USB
> interface anyway) and shutting down if the time slots it sees on
> the bus do not align with the USB spec. (If I remember
> correctly, the USB spec requires each time slot to be some
> number of milliseconds, but the USB host determines the precise
> timing.) Otherwise, a non-standard malicious host could "bend"
> the slot timing enough that the fixed response delay is not
> always sufficient for the operation to complete.

IIUC, there are ways to do polling more often... some gaming mice
advertise that as a feature.

But, yes, if there is no differentiation in the reply times, or they
don't depend on the secret data, then you will fix the timing
side-channel.

It should be noted that this will protect only against timing
side-channel. There are other side-channels, like sound:
https://arstechnica.com/information-technology/2013/12/new-attack-steals-e-mail-decryption-keys-by-capturing-computer-sounds/
or power related (using remote CCTV cameras):
https://arstechnica.com/information-technology/2023/06/hackers-can-steal-cryptographic-keys-by-video-recording-connected-power-leds-60-feet-away/

Fixing it so that the timing of the actual operation is actually
independent of secret data is a first step in fixing the power side
channels.

-- 
Regards,
Hubert Kario
Principal Quality Engineer, RHEL Crypto team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 115, 612 00, Brno, Czech Republic

From gniibe at fsij.org Fri Mar 22 00:51:06 2024
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Fri, 22 Mar 2024 08:51:06 +0900
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <65FA3F93.2050502@gmail.com>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com>
 <87ttlhctz8.fsf@akagi.fsij.org>
 <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com>
 <87jzm49esv.fsf@akagi.fsij.org> <87h6h79i2p.fsf@akagi.fsij.org>
 <65FA3F93.2050502@gmail.com>
Message-ID: <875xxf9mad.fsf@akagi.fsij.org>

Hello,

Jacob Bachmeyer wrote:
> Otherwise, a non-standard malicious host could "bend" the slot timing
> enough that the fixed response delay is not always sufficient for the
> operation to complete.

True. Thank you for showing this possibility. I didn't consider this
point.

Well, in general, I suggest not keeping a USB token inserted into a host.

There are possibilities (in theory, or in history) that a decryption
service by a USB token might be providing a decryption oracle to an
attacker by some channel(s).

When a user has a practice of only powering the device when needed,
bandwidth to an attacker could be small, hopefully small enough. Slower
service is better too, for smaller bandwidth.

It is a bit off-topic (from the original report). Sorry, It was me who
addressed USB communication.

And... yes, it's true that it's hard for programming to estimate
worst-case running time, it's also hard to guarantee constant-time
running time, in a given situation of programming environment and
hardware architecture.

For me, the question is how we can live with that.
-- 

From jcb62281 at gmail.com Fri Mar 22 02:40:05 2024
From: jcb62281 at gmail.com (Jacob Bachmeyer)
Date: Thu, 21 Mar 2024 20:40:05 -0500
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <875xxf9mad.fsf@akagi.fsij.org>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com>
 <87ttlhctz8.fsf@akagi.fsij.org>
 <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com>
 <87jzm49esv.fsf@akagi.fsij.org> <87h6h79i2p.fsf@akagi.fsij.org>
 <65FA3F93.2050502@gmail.com> <875xxf9mad.fsf@akagi.fsij.org>
Message-ID: <65FCE175.2070109@gmail.com>

NIIBE Yutaka wrote:
> Hello,
>
> Jacob Bachmeyer wrote:
>
>> Otherwise, a non-standard malicious host could "bend" the slot timing
>> enough that the fixed response delay is not always sufficient for the
>> operation to complete.
>>
>
> True. Thank you for showing this possibility. I didn't consider this
> point.
>

To be fair, when I said "non-standard", I meant a USB host implemented
either using a programmable logic device or by bit-banging on MCU GPIO
pins. I do not believe that a PC's USB host controller, for example,
will allow that. I would expect standard hardware USB hosts to enforce
spec-compliant bus timing beyond software control. Therefore, the token
should shut down or possibly even wipe itself if it detects such
incorrect timing.

> Well, in general, I suggest not keeping a USB token inserted into a host.
>
> There are possibilities (in theory, or in history) that a decryption
> service by a USB token might be providing a decryption oracle to an
> attacker by some channel(s).
>
> When a user has a practice of only powering the device when needed,
> bandwidth to an attacker could be small, hopefully small enough. Slower
> service is better too, for smaller bandwidth.
>

Leaving the token connected to the host and active (unlocked) raises a
different hazard: the token would then be available for malware on the
host to abuse at will. This largely defeats the purpose of using a token
in the first place, as the security benefits of tokens center around
preserving the security of the key even if the host is compromised.
Obvious decryption oracle: just ask the token to decrypt the ciphertext.
If it is connected and unlocked, it will. Oops.

> It is a bit off-topic (from the original report). Sorry, It was me who
> addressed USB communication.
>
> And... yes, it's true that it's hard for programming to estimate
> worst-case running time, it's also hard to guarantee constant-time
> running time, in a given situation of programming environment and
> hardware architecture.
>

The basic method is to do both sides of every branch and select the
result with the equivalent of a multiplexer. This obviously does not
work for loop-test branches, since loops must eventually terminate, and
still requires care at higher-levels that the set (and possibly
sequence) of operations performed is invariant with respect to secret
data, but it is possible as I understand.

This may also require avoiding the use of sufficiently-advanced
processors, if any exist that can detect that the result of a
speculative execution chain will not be used and elide the chain. This
could also be a good application for tokens containing simpler
processors intended for security over performance, if main processors
get advanced enough to do that.
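The "compute both sides, then select with a multiplexer" idea above can
be sketched in C along these lines. The helper names `ct_mask`,
`ct_select` and `ct_select_buf` are hypothetical and are not taken from
libgcrypt or any other library; this is a minimal illustration of the
technique, not a vetted implementation. One assumption to state loudly:
a C compiler is free to turn such code back into a branch, so real
implementations typically check the generated machine code as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand the low bit of BIT into a full mask: 1 -> 0xFFFFFFFF,
   0 -> 0x00000000.  No branch is taken on the (secret) bit. */
uint32_t ct_mask(uint32_t bit)
{
  return (uint32_t)0 - (bit & 1u);
}

/* Multiplexer: returns A if BIT is 1 and B if BIT is 0.  Both inputs
   have already been computed by the caller, so the work done does not
   depend on which one is chosen. */
uint32_t ct_select(uint32_t bit, uint32_t a, uint32_t b)
{
  uint32_t m = ct_mask(bit);

  return (a & m) | (b & ~m);
}

/* Same selection over byte buffers.  The loop bound N is public data;
   only the mask depends on the secret bit. */
void ct_select_buf(uint32_t bit, uint8_t *out,
                   const uint8_t *a, const uint8_t *b, size_t n)
{
  uint8_t m = (uint8_t)ct_mask(bit);
  size_t i;

  for (i = 0; i < n; i++)
    out[i] = (uint8_t)((a[i] & m) | (b[i] & (uint8_t)~m));
}
```

A caller would compute the result of both the "taken" and "not taken"
paths, then pass the secret condition as `bit` to pick one of them, so
that neither the instruction sequence nor the memory access pattern
depends on the condition.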
-- 
Jacob

From hkario at redhat.com Fri Mar 22 13:24:41 2024
From: hkario at redhat.com (Hubert Kario)
Date: Fri, 22 Mar 2024 13:24:41 +0100
Subject: Side-channel vulnerability in libgcrypt - the Marvin Attack
In-Reply-To: <875xxf9mad.fsf@akagi.fsij.org>
References: <3b78ead2-16c9-4177-a8d0-434f0a27de70@redhat.com>
 <87ttlhctz8.fsf@akagi.fsij.org>
 <9f06c8ea-6019-446d-9a3d-e3cd3ff7039b@redhat.com>
 <87jzm49esv.fsf@akagi.fsij.org> <87h6h79i2p.fsf@akagi.fsij.org>
 <65FA3F93.2050502@gmail.com> <875xxf9mad.fsf@akagi.fsij.org>
Message-ID: <4aeddaf3-18de-46c4-a0ef-5a63e1f9bf0d@redhat.com>

On Friday, 22 March 2024 00:51:06 CET, NIIBE Yutaka wrote:
> Hello,
>
> And... yes, it's true that it's hard for programming to estimate
> worst-case running time, it's also hard to guarantee constant-time
> running time, in a given situation of programming environment and
> hardware architecture.

OpenSSL, BoringSSL (they have different code for RSA than OpenSSL now),
Go, NSS, GnuTLS, Apple corecrypto, and WolfSSL were all able to do this
operation in constant time in software, and those are only the ones
for which I have directly seen evidence that the fixes were successful,
so while it may not be simple, it's clearly not impossible.

-- 
Regards,
Hubert Kario
Principal Quality Engineer, RHEL Crypto team
Web: www.cz.redhat.com
Red Hat Czech s.r.o., Purkyňova 115, 612 00, Brno, Czech Republic

From gniibe at fsij.org Tue Mar 26 06:03:28 2024
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Tue, 26 Mar 2024 14:03:28 +0900
Subject: Adding cSHAKE digest
Message-ID: <877chpoa8v.fsf@akagi.fsij.org>

Hello,

In the task T6637, adding cSHAKE and KMAC is proposed.

I read the patch; while it works somehow, it is not easy to merge it
directly. Thus, I implemented the cSHAKE part, with minimal change.

Attached is my try.

I plan to take the test vectors for cSHAKE from the patch in T6637 and
add them.
-- 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-md-Add-cSHAKE-digest-algorithm-and-the-implementatio.patch
Type: text/x-diff
Size: 17455 bytes
Desc: not available
URL:

From gniibe at fsij.org Thu Mar 28 05:30:42 2024
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Thu, 28 Mar 2024 13:30:42 +0900
Subject: Adding ECC KEM
Message-ID: <871q7vugel.fsf@akagi.fsij.org>

Hello,

In the task T6755, we introduced KEM API. ML-KEM is added.

Today, I'd like to propose adding ECC KEM implementation in the API.
My intention is to use it in gpg-agent to support PQC (task T7014).

Attached is a patch adding ECC KEM for X25519.
-- 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-cipher-kem-Add-ECC-KEM-for-X25519.patch
Type: text/x-diff
Size: 36422 bytes
Desc: not available
URL:

From simon at josefsson.org Thu Mar 28 10:45:53 2024
From: simon at josefsson.org (Simon Josefsson)
Date: Thu, 28 Mar 2024 10:45:53 +0100
Subject: Adding ECC KEM
In-Reply-To: <871q7vugel.fsf@akagi.fsij.org> (NIIBE Yutaka's message of
 "Thu, 28 Mar 2024 13:30:42 +0900")
References: <871q7vugel.fsf@akagi.fsij.org>
Message-ID: <87o7ayu1ta.fsf@kaka.sjd.se>

NIIBE Yutaka writes:

> Hello,
>
> In the task T6755, we introduced KEM API. ML-KEM is added.
>
> Today, I'd like to propose adding ECC KEM implementation in the API.
> My intention is to use it in gpg-agent to support PQC (task T7014).
>
> Attached is a patch adding ECC KEM for X25519.

Nice! Is this intended to be compatible with HPKE ECC KEM?

https://www.rfc-editor.org/rfc/rfc9180.html#name-dh-based-kem-dhkem

Did you validate test vectors?

/Simon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 255 bytes
Desc: not available
URL:

From wk at gnupg.org Thu Mar 28 11:18:08 2024
From: wk at gnupg.org (Werner Koch)
Date: Thu, 28 Mar 2024 11:18:08 +0100
Subject: Adding ECC KEM
In-Reply-To: <871q7vugel.fsf@akagi.fsij.org> (NIIBE Yutaka's message of
 "Thu, 28 Mar 2024 13:30:42 +0900")
References: <871q7vugel.fsf@akagi.fsij.org>
Message-ID: <87le62fyn3.fsf@jacob.g10code.de>

Hi!

Thanks for working on this. I understand that you started with X25519.
However we also need to support BrainpoolP384r1 and P512r1 because they
will likely be the default in GnuPG for Kyber (ML-KEM). There are also
requests for other curves.

Thus I wondered whether we really want to have a whole bunch of
GCRY_KEM_* constants or whether it would be possible to define another
parameter which can be shared by similar algorithms/curves.


Salam-Shalom,

   Werner

-- 
The pioneers of a warless world are the youth that
refuse military service. - A. Einstein
-------------- next part --------------
A non-text attachment was scrubbed...
Name: openpgp-digital-signature.asc
Type: application/pgp-signature
Size: 247 bytes
Desc: not available
URL:

From wk at gnupg.org Thu Mar 28 14:08:09 2024
From: wk at gnupg.org (Werner Koch)
Date: Thu, 28 Mar 2024 14:08:09 +0100
Subject: Adding cSHAKE digest
In-Reply-To: <877chpoa8v.fsf@akagi.fsij.org> (NIIBE Yutaka's message of
 "Tue, 26 Mar 2024 14:03:28 +0900")
References: <877chpoa8v.fsf@akagi.fsij.org>
Message-ID: <87frwafqrq.fsf@jacob.g10code.de>

Hi,

I looked at the cSHAKE changes and wondered whether we should replace
the

  struct gcry_cshake_customization
  {
    const void *n;
    unsigned int n_len;
    const void *s;
    unsigned int s_len;
  };

by the already existing

  typedef struct
  {
    size_t size;  /* The allocated size of the buffer or 0.  */
    size_t off;   /* Offset into the buffer.  */
    size_t len;   /* The used length of the buffer.  */
    void *data;   /* The buffer.  */
  } gcry_buffer_t;

Or a new

  typedef struct
  {
    size_t size;       /* The allocated size of the buffer or 0.  */
    size_t off;        /* Offset into the buffer.  */
    size_t len;        /* The used length of the buffer.  */
    const void *data;  /* The buffer.  */
  } gcry_cbuffer_t;

The only disadvantage I see is that it won't be possible to have a
sanity check like

  if (buflen != sizeof (struct gcry_cshake_customization))
    rc = GPG_ERR_INV_ARG;

But this check could be done if we also define a

  typedef struct
  {
    size_t count;
    union
    {
      gcry_buffer_t v[1];
      gcry_cbuffer_t c[1];
    } io;
  } gcry_buffer_desc_t;

Or if one prefers better checks drop the union.

What do you think?


Shalom-Salam,

   Werner

-- 
The pioneers of a warless world are the youth that
refuse military service. - A. Einstein
-------------- next part --------------
A non-text attachment was scrubbed...
Name: openpgp-digital-signature.asc
Type: application/pgp-signature
Size: 247 bytes
Desc: not available
URL:

From gniibe at fsij.org Fri Mar 29 02:10:24 2024
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Fri, 29 Mar 2024 10:10:24 +0900
Subject: Adding ECC KEM
In-Reply-To: <87o7ayu1ta.fsf@kaka.sjd.se>
References: <871q7vugel.fsf@akagi.fsij.org> <87o7ayu1ta.fsf@kaka.sjd.se>
Message-ID: <87v855u9kv.fsf@akagi.fsij.org>

Hello,

Simon Josefsson wrote:
> Nice! Is this intended to be compatible with HPKE ECC KEM?
>
> https://www.rfc-editor.org/rfc/rfc9180.html#name-dh-based-kem-dhkem

Yes. GCRY_KEM_DHKEM25519 is for DHKEM with X25519, HKDF, and SHA256
described in RFC 9180.

> Did you validate test vectors?

In my working branch of last year, I added test vectors from RFC 9180.

https://dev.gnupg.org/source/libgcrypt/browse/gniibe%252Fkem2/tests/t-kem.c;2f93e53f6525155b4c78419d55b35a35cde84907$349

It was tested at that time.

I plan to merge this test into master. (Currently, tests/t-kem only has
generate-encap-decap tests.)

So, the answer is: yes, I did, but not yet with master.
-- 

From gniibe at fsij.org Fri Mar 29 02:26:34 2024
From: gniibe at fsij.org (NIIBE Yutaka)
Date: Fri, 29 Mar 2024 10:26:34 +0900
Subject: Adding ECC KEM
In-Reply-To: <87le62fyn3.fsf@jacob.g10code.de>
References: <871q7vugel.fsf@akagi.fsij.org> <87le62fyn3.fsf@jacob.g10code.de>
Message-ID: <87sf09u8tx.fsf@akagi.fsij.org>

Hello,

Werner Koch wrote:
> However we also need to support BrainpoolP384r1 and P512r1 because they
> will likely be the default in GnuPG for Kyber (ML-KEM). There are also
> requests for other curves.

I will add other curves, too.

> Thus I wondered whether we really want to have a whole bunch of
> GCRY_KEM_* constants or whether it would be possible to define another
> parameter which can be shared by similar algorithms/curves.

Let's see by using the API.

While I added the ECC KEM API, I'm not sure if gpg-agent should use the
ECC KEM API for all of its uses of ECC. Possibly, the ECC KEM API will
only be used for PQC. In this case, gpg-agent uses the gcry_kem_* API
for PQC hybrid, and keeps using the gcry_pk_* API for existing
non-hybrid use of ECC.
-- 