From gniibe at fsij.org Mon Feb 3 01:31:20 2025 From: gniibe at fsij.org (NIIBE Yutaka) Date: Mon, 03 Feb 2025 09:31:20 +0900 Subject: [PATCH] MPI helper of multiplication, Least Leak Intended In-Reply-To: <877c6b8pp8.fsf@akagi.fsij.org> References: <877c6b8pp8.fsf@akagi.fsij.org> Message-ID: <87seovj2cn.fsf@akagi.fsij.org> NIIBE Yutaka wrote: > Honestly speaking, it's "Least Leak Intended", and I couldn't declare > it constant-time. I pushed the change for _gcry_mpih_mul_lli. And I also pushed the change for _gcry_mpih_mod_lli. The implementation was already there, it's renaming _gcry_mpih_mod_lli from _gcry_mpih_mod. -- From jussi.kivilinna at iki.fi Mon Feb 3 20:22:07 2025 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 3 Feb 2025 21:22:07 +0200 Subject: [PATCH 1/3] t-fips-service-ind: fix broken fail print Message-ID: <20250203192209.3072952-1-jussi.kivilinna@iki.fi> * tests/t-fips-service-ind.c (check_cipher_o_s_e_d_c): Fix typo '<' to ','. -- Signed-off-by: Jussi Kivilinna --- tests/t-fips-service-ind.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/t-fips-service-ind.c b/tests/t-fips-service-ind.c index 74521bb3..ed5f8d3f 100644 --- a/tests/t-fips-service-ind.c +++ b/tests/t-fips-service-ind.c @@ -767,7 +767,7 @@ check_cipher_o_s_e_d_c (int reject) err = gcry_cipher_set_decryption_tag (h, tag, 16); if (err) - fail ("gcry_cipher_set_decryption_tag %d failed: %s\n", tvidx< + fail ("gcry_cipher_set_decryption_tag %d failed: %s\n", tvidx, gpg_strerror (err)); } -- 2.45.2 From jussi.kivilinna at iki.fi Mon Feb 3 20:22:08 2025 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 3 Feb 2025 21:22:08 +0200 Subject: [PATCH 2/3] mpih-const-time: avoid branches in _gcry_mpih_cmp_ui In-Reply-To: <20250203192209.3072952-1-jussi.kivilinna@iki.fi> References: <20250203192209.3072952-1-jussi.kivilinna@iki.fi> Message-ID: <20250203192209.3072952-2-jussi.kivilinna@iki.fi> * mpi/mpih-const-time.c (_gcry_mpih_cmp_ui): Avoid 
conditional branches for return value selection. -- Signed-off-by: Jussi Kivilinna --- mpi/mpih-const-time.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/mpi/mpih-const-time.c b/mpi/mpih-const-time.c index e684b956..d8b66c46 100644 --- a/mpi/mpih-const-time.c +++ b/mpi/mpih-const-time.c @@ -222,20 +222,15 @@ _gcry_mpih_mod_lli (mpi_ptr_t vp, mpi_size_t vsize, int _gcry_mpih_cmp_ui (mpi_ptr_t up, mpi_size_t usize, unsigned long v) { - int is_all_zero = 1; + unsigned long is_all_zero = ct_ulong_gen_mask(1); + int cmp0; mpi_size_t i; + cmp0 = -mpih_ct_limb_less_than (up[0], v); + cmp0 |= mpih_ct_limb_greater_than (up[0], v); + for (i = 1; i < usize; i++) - is_all_zero &= mpih_limb_is_zero (up[i]); + is_all_zero &= ct_ulong_gen_mask(mpih_limb_is_zero (up[i])); - if (is_all_zero) - { - if (up[0] < v) - return -1; - else if (up[0] > v) - return 1; - else - return 0; - } - return 1; + return cmp0 & (int)is_all_zero; } -- 2.45.2 From jussi.kivilinna at iki.fi Mon Feb 3 20:22:09 2025 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 3 Feb 2025 21:22:09 +0200 Subject: [PATCH 3/3] mpi/longlong: prevent optimization of carry instructions to branches In-Reply-To: <20250203192209.3072952-1-jussi.kivilinna@iki.fi> References: <20250203192209.3072952-1-jussi.kivilinna@iki.fi> Message-ID: <20250203192209.3072952-3-jussi.kivilinna@iki.fi> * mpi/longlong.h: Include "const-time.h" (add_ssaaaa, sub_ddmmss): Prevent optimization of carry handling to conditional branches in generic variant of double width addition and subtraction as was seen with GCC on riscv64. (umul_ppmm): Avoid conditional branch in generic 16x16=>32bit multiplication version of umul_ppmm. * src/const-time.h (CT_DEOPTIMIZE_VAR): New. -- RISC-V has "sltu" instruction for generating carry value and generic version of add_ssaaaa and sub_ddmmss typically used this instruction. 
However, sometimes compiler gets too clever and instead generates code with conditional branch, which is not good for constant time code. Commit changes add_ssaaaaa and sub_ddmmss to clobber high word of calculation in a way that prevents such optimizations. Signed-off-by: Jussi Kivilinna --- mpi/longlong.h | 47 +++++++++++++++++++++++++++++++---------------- src/const-time.h | 8 ++++++++ 2 files changed, 39 insertions(+), 16 deletions(-) diff --git a/mpi/longlong.h b/mpi/longlong.h index 21bd1a7e..7dc67591 100644 --- a/mpi/longlong.h +++ b/mpi/longlong.h @@ -20,6 +20,8 @@ along with this file; see the file COPYING.LIB. If not, see > W_TYPE_SIZE); \ - (sl) = (UWtype)(__audw); \ + __auwh = (UWtype)(__audw >> W_TYPE_SIZE); \ + __auwl = (UWtype)(__audw); \ + CT_DEOPTIMIZE_VAR(__auwh); \ + (sh) = __auwh; \ + (sl) = __auwl; \ } while (0) #elif !defined (add_ssaaaa) # define add_ssaaaa(sh, sl, ah, al, bh, bl) \ do { \ - UWtype __x; \ - __x = (al) + (bl); \ - (sh) = (ah) + (bh) + (__x < (al)); \ - (sl) = __x; \ + UWtype __xl, __xh; \ + __xl = (al) + (bl); \ + __xh = __xl < (al); \ + __xh = (ah) + (bh) + __xh; \ + CT_DEOPTIMIZE_VAR(__xh); \ + (sh) = __xh; \ + (sl) = __xl; \ } while (0) #endif @@ -1606,22 +1615,29 @@ typedef unsigned int UTItype __attribute__ ((mode (TI))); # define sub_ddmmss(sh, sl, ah, al, bh, bl) \ do { \ UDWtype __audw = (ah); \ + UWtype __auwh, __auwl; \ UDWtype __budw = (bh); \ __audw <<= W_TYPE_SIZE; \ __audw |= (al); \ __budw <<= W_TYPE_SIZE; \ __budw |= (bl); \ __audw -= __budw; \ - (sh) = (UWtype)(__audw >> W_TYPE_SIZE); \ - (sl) = (UWtype)(__audw); \ + __auwh = (UWtype)(__audw >> W_TYPE_SIZE); \ + __auwl = (UWtype)(__audw); \ + CT_DEOPTIMIZE_VAR(__auwh); \ + (sh) = __auwh; \ + (sl) = __auwl; \ } while (0) #elif !defined (sub_ddmmss) # define sub_ddmmss(sh, sl, ah, al, bh, bl) \ do { \ - UWtype __x; \ - __x = (al) - (bl); \ - (sh) = (ah) - (bh) - (__x > (al)); \ - (sl) = __x; \ + UWtype __xl, __xh; \ + __xl = (al) - (bl); \ + __xh = (__xl > 
(al)); \ + __xh = (ah) - (bh) - __xh; \ + CT_DEOPTIMIZE_VAR(__xh); \ + (sh) = __xh; \ + (sl) = __xl; \ } while (0) #endif @@ -1651,10 +1667,9 @@ typedef unsigned int UTItype __attribute__ ((mode (TI))); __x3 = (UWtype) __uh * __vh; \ \ __x1 += __ll_highpart (__x0);/* this can't give carry */ \ - __x1 += __x2; /* but this indeed can */ \ - if (__x1 < __x2) /* did we get it? */ \ - __x3 += __ll_B; /* yes, add it in the proper pos. */ \ - \ + /* but this indeed can, and if so, add it in the proper pos: */ \ + add_ssaaaa(__x2, __x1, 0, __x1, 0, __x2); \ + __x3 += __x2 << (W_TYPE_SIZE / 2); \ (w1) = __x3 + __ll_highpart (__x1); \ (w0) = (__ll_lowpart (__x1) << W_TYPE_SIZE/2) + __ll_lowpart (__x0);\ } while (0) diff --git a/src/const-time.h b/src/const-time.h index 46eb187d..c2acbb73 100644 --- a/src/const-time.h +++ b/src/const-time.h @@ -82,6 +82,14 @@ unsigned int _gcry_ct_not_memequal (const void *b1, const void *b2, size_t len); any structure. */ unsigned int _gcry_ct_memequal (const void *b1, const void *b2, size_t len); +/* Prevent compiler from assuming value of variable and from making + non-constant time optimizations. */ +#ifdef HAVE_GCC_ASM_VOLATILE_MEMORY +# define CT_DEOPTIMIZE_VAR(var) asm volatile ("\n" : "+r" (var) :: "memory") +#else +# define CT_DEOPTIMIZE_VAR(var) (void)((var) += _gcry_ct_vzero) +#endif + /* * Return all bits set if A is 1 and return 0 otherwise. */ -- 2.45.2 From harmen at stoppels.ch Wed Feb 5 09:52:02 2025 From: harmen at stoppels.ch (Harmen Stoppels) Date: Wed, 05 Feb 2025 09:52:02 +0100 Subject: [PATCH] Simplify flag munging for rndjent.c Message-ID: * random/Makefile.am (o_flag_munging): append -O0 Replace `echo ... | sed` idiom with simply appending -O0. This overrides previous optimization flags. Hopefully that ends the series of patches to these lines. 
--- random/Makefile.am | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/random/Makefile.am b/random/Makefile.am index 41041e8a..b6487192 100644 --- a/random/Makefile.am +++ b/random/Makefile.am @@ -55,9 +55,9 @@ jitterentropy-base.c jitterentropy.h jitterentropy-base-user.h # The rndjent module needs to be compiled without optimization. */ if ENABLE_O_FLAG_MUNGING -o_flag_munging = sed -e 's/[[:blank:]]-O\([1-9sgz][1-9sgz]*\)/ -O0 /g' -e 's/[[:blank:]]-Ofast/ -O0 /g' +o_flag_munging = -O0 else -o_flag_munging = cat +o_flag_munging = endif rndjent.o: $(srcdir)/rndjent.c jitterentropy-base-user.h \ @@ -67,7 +67,7 @@ rndjent.o: $(srcdir)/rndjent.c jitterentropy-base-user.h \ $(srcdir)/jitterentropy-sha3.c $(srcdir)/jitterentropy-sha3.h \ $(srcdir)/jitterentropy-timer.c $(srcdir)/jitterentropy-timer.h \ $(srcdir)/jitterentropy-base.c $(srcdir)/jitterentropy.h - `echo $(COMPILE) -c $(srcdir)/rndjent.c | $(o_flag_munging) ` + $(COMPILE) $(o_flag_munging) -c $(srcdir)/rndjent.c rndjent.lo: $(srcdir)/rndjent.c jitterentropy-base-user.h \ $(srcdir)/jitterentropy-gcd.c $(srcdir)/jitterentropy-gcd.h \ @@ -76,4 +76,4 @@ rndjent.lo: $(srcdir)/rndjent.c jitterentropy-base-user.h \ $(srcdir)/jitterentropy-sha3.c $(srcdir)/jitterentropy-sha3.h \ $(srcdir)/jitterentropy-timer.c $(srcdir)/jitterentropy-timer.h \ $(srcdir)/jitterentropy-base.c $(srcdir)/jitterentropy.h - `echo $(LTCOMPILE) -c $(srcdir)/rndjent.c | $(o_flag_munging) ` + $(LTCOMPILE) $(o_flag_munging) -c $(srcdir)/rndjent.c -- 2.43.0 From gniibe at fsij.org Thu Feb 6 08:22:15 2025 From: gniibe at fsij.org (NIIBE Yutaka) Date: Thu, 06 Feb 2025 16:22:15 +0900 Subject: [PATCH] Simplify flag munging for rndjent.c In-Reply-To: References: Message-ID: <87jza3y1ug.fsf@akagi.fsij.org> Hello, "Harmen Stoppels" wrote: > * random/Makefile.am (o_flag_munging): append -O0 > > Replace `echo ... | sed` idiom with simply appending -O0. This overrides > previous optimization flags. 
Hopefully that ends the series of patches > to these lines. Yes, GCC and Clang have this behavior; In GCC manual, it says: If you use multiple '-O' options, with or without level numbers, the last such option is the one that is effective. However, not all compilers have this behavior. So, please don't change. Well, I would understand your intention to prefer simpler things. Thank you for your attempt. -- From harmen at stoppels.ch Thu Feb 6 10:51:33 2025 From: harmen at stoppels.ch (Harmen Stoppels) Date: Thu, 06 Feb 2025 10:51:33 +0100 Subject: [PATCH] Simplify flag munging for rndjent.c In-Reply-To: <87jza3y1ug.fsf@akagi.fsij.org> References: <87jza3y1ug.fsf@akagi.fsij.org> Message-ID: <0ba70378-ab66-49cd-9055-ea73e02dce3c@app.fastmail.com> Any examples of compilers that do not do that? On Thu, Feb 6, 2025, at 8:22 AM, NIIBE Yutaka wrote: > Hello, > > "Harmen Stoppels" wrote: >> * random/Makefile.am (o_flag_munging): append -O0 >> >> Replace `echo ... | sed` idiom with simply appending -O0. This overrides >> previous optimization flags. Hopefully that ends the series of patches >> to these lines. > > Yes, GCC and Clang have this behavior; In GCC manual, it says: > > If you use multiple '-O' options, with or without level numbers, > the last such option is the one that is effective. > > However, not all compilers have this behavior. So, please don't change. > > Well, I would understand your intention to prefer simpler things. Thank > you for your attempt. > -- From gniibe at fsij.org Fri Feb 7 01:08:30 2025 From: gniibe at fsij.org (NIIBE Yutaka) Date: Fri, 07 Feb 2025 09:08:30 +0900 Subject: [PATCH] Simplify flag munging for rndjent.c In-Reply-To: <0ba70378-ab66-49cd-9055-ea73e02dce3c@app.fastmail.com> References: <87jza3y1ug.fsf@akagi.fsij.org> <0ba70378-ab66-49cd-9055-ea73e02dce3c@app.fastmail.com> Message-ID: <87v7tm4nwh.fsf@akagi.fsij.org> "Harmen Stoppels" writes: > Any examples of compilers that do not do that? 
Since we occasionally got reports from IBM, this time, I tested on AIX, specifically, on the cfarm111 machine [0]. For the IBM XL C compiler on that machine, appending -O0 doesn't work as you intended. While it allows multiple -O[number] options, it seems to apply the highest optimization level among them. [0] https://portal.cfarm.net/machines/list/ -- From gniibe at fsij.org Fri Feb 7 06:28:43 2025 From: gniibe at fsij.org (NIIBE Yutaka) Date: Fri, 07 Feb 2025 14:28:43 +0900 Subject: [PATCH] MPI helper of comparison, Least Leak Intended (was: [PATCH] MPI helper of multiplication, Least Leak Intended) In-Reply-To: <87seovj2cn.fsf@akagi.fsij.org> References: <877c6b8pp8.fsf@akagi.fsij.org> <87seovj2cn.fsf@akagi.fsij.org> Message-ID: <87plju492s.fsf@akagi.fsij.org> Hello, This is not related to modular exponentiation, but another function for constant-time; MPI comparison by a helper function. I think that this implementation could be improved. Anyhow, let us start having the function for comparison. diff --git a/mpi/mpi-internal.h b/mpi/mpi-internal.h index ffe8140a..0840d1fd 100644 --- a/mpi/mpi-internal.h +++ b/mpi/mpi-internal.h @@ -304,6 +304,7 @@ void _gcry_mpih_abs_cond (mpi_ptr_t wp, mpi_ptr_t up, mpi_ptr_t _gcry_mpih_mod_lli (mpi_ptr_t vp, mpi_size_t vsize, mpi_ptr_t up, mpi_size_t usize); int _gcry_mpih_cmp_ui (mpi_ptr_t up, mpi_size_t usize, unsigned long v); +int _gcry_mpih_cmp_lli (mpi_ptr_t up, mpi_ptr_t vp, mpi_size_t size); /* Define stuff for longlong.h. */ diff --git a/mpi/mpih-const-time.c b/mpi/mpih-const-time.c index e684b956..4549ebca 100644 --- a/mpi/mpih-const-time.c +++ b/mpi/mpih-const-time.c @@ -239,3 +239,25 @@ _gcry_mpih_cmp_ui (mpi_ptr_t up, mpi_size_t usize, unsigned long v) } return 1; } + +/* Do same calculation as _gcry_mpih_cmp does, but Least Leak Intended. + * Return 1 if U > V, 0 if they are equal, and -1 if U < V. 
*/ +int +_gcry_mpih_cmp_lli (mpi_ptr_t up, mpi_ptr_t vp, mpi_size_t size) +{ + mpi_size_t i; + mpi_limb_t gt, lt; + mpi_limb_t result = 0; + + for (i = 0; i < size ; i++) + { + gt = mpih_ct_limb_greater_than (up[i], vp[i]); + lt = mpih_ct_limb_less_than (up[i], vp[i]); + /* result = gt ? 1 : result; */ + result = (result & (- mpih_limb_is_zero (gt))) | gt; + /* result = lt ? -1 : result; */ + result = (result & (- mpih_limb_is_zero (lt))) | -lt; + } + + return result; +} -- From gniibe at fsij.org Sat Feb 8 03:05:08 2025 From: gniibe at fsij.org (NIIBE Yutaka) Date: Sat, 08 Feb 2025 11:05:08 +0900 Subject: [PATCH] MPI helper of comparison, Least Leak Intended (was: [PATCH] MPI helper of multiplication, Least Leak Intended) In-Reply-To: <87plju492s.fsf@akagi.fsij.org> References: <877c6b8pp8.fsf@akagi.fsij.org> <87seovj2cn.fsf@akagi.fsij.org> <87plju492s.fsf@akagi.fsij.org> Message-ID: <87frkpmbsb.fsf@haruna.fsij.org> NIIBE Yutaka wrote: > I think that this implementation could be improved. I should use ct_limb_gen_inv_mask function instead of directly use unary minus operator. -- -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-mpi-Add-_gcry_mpih_cmp_lli-for-Least-Leak-Intended-c.patch Type: text/x-diff Size: 1950 bytes Desc: not available URL: From jcb62281 at gmail.com Sat Feb 8 02:49:17 2025 From: jcb62281 at gmail.com (Jacob Bachmeyer) Date: Fri, 7 Feb 2025 19:49:17 -0600 Subject: [PATCH] MPI helper of comparison, Least Leak Intended In-Reply-To: <87plju492s.fsf@akagi.fsij.org> References: <877c6b8pp8.fsf@akagi.fsij.org> <87seovj2cn.fsf@akagi.fsij.org> <87plju492s.fsf@akagi.fsij.org> Message-ID: On 2/6/25 23:28, NIIBE Yutaka via Gcrypt-devel wrote: > Hello, > > This is not related to modular exponentiation, but another function for > constant-time; MPI comparison by a helper function. > > I think that this implementation could be improved. Anyhow, let us > start having the function for comparison. 
While I am not entirely familiar with the details of the Gcrypt MPI implementation, I am unsure of the equivalence some of the comments imply. Details inline below. > diff --git a/mpi/mpi-internal.h b/mpi/mpi-internal.h > index ffe8140a..0840d1fd 100644 > [...] > diff --git a/mpi/mpih-const-time.c b/mpi/mpih-const-time.c > index e684b956..4549ebca 100644 > --- a/mpi/mpih-const-time.c > +++ b/mpi/mpih-const-time.c > @@ -239,3 +239,25 @@ _gcry_mpih_cmp_ui (mpi_ptr_t up, mpi_size_t usize, unsigned long v) > } > return 1; > } > + > +/* Do same calculation as _gcry_mpih_cmp does, but Least Leak Intended. > + * Return 1 if U > V, 0 if they are equal, and -1 if U < V. */ > +int > +_gcry_mpih_cmp_lli (mpi_ptr_t up, mpi_ptr_t vp, mpi_size_t size) > +{ > + mpi_size_t i; > + mpi_limb_t gt, lt; > + mpi_limb_t result = 0; If you can initialize an mpi_limb_t to literal zero, then I know that mpi_limb_t is an integer type. > + > + for (i = 0; i < size ; i++) > + { > + gt = mpih_ct_limb_greater_than (up[i], vp[i]); > + lt = mpih_ct_limb_less_than (up[i], vp[i]); To check my understanding: at most one of GT, LT can be non-zero; both are zero if UP[I]==VP[I]. I assume that the comparisons are done using function calls because "<" and ">" are not guaranteed to be constant-time? > + /* result = gt ? 1 : result; */ > + result = (result & (- mpih_limb_is_zero (gt))) | gt; > + /* result = lt ? -1 : result; */ > + result = (result & (- mpih_limb_is_zero (lt))) | -lt; Why are these using mpih_limb_is_zero when mpi_limb_t is an integer type? 
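For readers who want to run the quoted loop, here is a minimal sketch of it with stand-in helpers. Everything named `limb_t`, `limb_is_zero`, `limb_less_than`, `limb_greater_than` and `cmp_sketch` is hypothetical and only illustrates the idea; the real helpers live in Libgcrypt and may be written differently. The sketch assumes 64-bit limbs with limb 0 least significant, and it replaces the final `return result;` with an explicit mapping so that it does not depend on an implementation-defined unsigned-to-int conversion. Whether a given compiler keeps such code free of branches is not guaranteed by the C source alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* stand-in for mpi_limb_t (assumed 64-bit here) */
#define LIMB_BITS 64

/* Hypothetical stand-in: 1 if a == 0, else 0, written without a branch. */
static limb_t
limb_is_zero (limb_t a)
{
  return (~a & (a - 1)) >> (LIMB_BITS - 1);
}

/* Hypothetical stand-in: 1 if a < b (unsigned), else 0.  The top bit of
   the expression is the borrow out of a - b.  */
static limb_t
limb_less_than (limb_t a, limb_t b)
{
  return ((~a & b) | ((~a | b) & (a - b))) >> (LIMB_BITS - 1);
}

static limb_t
limb_greater_than (limb_t a, limb_t b)
{
  return limb_less_than (b, a);
}

/* The loop from the quoted patch: the most significant unequal limb
   decides, because later iterations overwrite earlier results.  */
int
cmp_sketch (const limb_t *up, const limb_t *vp, size_t size)
{
  limb_t result = 0;
  size_t i;

  for (i = 0; i < size; i++)
    {
      limb_t gt = limb_greater_than (up[i], vp[i]);
      limb_t lt = limb_less_than (up[i], vp[i]);

      /* result = gt ? 1 : result; */
      result = (result & ((limb_t)0 - limb_is_zero (gt))) | gt;
      /* result = lt ? -1 : result; */
      result = (result & ((limb_t)0 - limb_is_zero (lt))) | ((limb_t)0 - lt);
    }

  /* result is 0, 1 or all-ones here; map that to 0, 1, -1.  */
  return (int)(result & 1) - 2 * (int)(result >> (LIMB_BITS - 1));
}

/* Small self-check of the sketch. */
void
cmp_sketch_selftest (void)
{
  const limb_t u[2] = { 5, 2 }, v[2] = { 9, 1 };

  assert (cmp_sketch (u, v, 2) == 1);    /* high limb decides: 2 > 1 */
  assert (cmp_sketch (v, u, 2) == -1);
  assert (cmp_sketch (u, u, 2) == 0);
}
```

The mask `0 - limb_is_zero (x)` is all-ones when the flag `x` is clear, so the old `result` passes through, and zero when the flag is set, so the new value replaces it.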
Assuming that mpih_limb_is_zero returns 1 if its argument is zero and 0 otherwise, in constant time, and we work from least-significant to most-significant, such that the last non-equal result determines the overall result, should these two lines instead be: result = (result & (- mpih_limb_is_zero (lt))) | gt; result = (result & (- mpih_limb_is_zero (gt))) | -lt; Since at most one of the flags can be set, each result line should pass the old value iff the /other/ flag is clear/zero. > + } > + > + return result; > +} > Overall comments and questions: Could this be made more efficient by defining an mpih_ct_limb_cmp function and then only needing to reduce it in constant time? Then we could work from the least-significant to most-significant limb and only need to find a constant-time evaluation of ({previous, this}) {X, -1} -> -1, {X, 1} -> 1, {X, 0} -> X. There might be a potential power-usage leak between setting 1 and -1 (the population counts radically differ); could we instead use 1 and 2 (adjacent bits, each one-hot) as the running flag values or even as the result codes? (Maybe 1, 2, and 4 for one-hot encodings of less, equal, greater?) -- Jacob From jcb62281 at gmail.com Sat Feb 8 23:50:42 2025 From: jcb62281 at gmail.com (Jacob Bachmeyer) Date: Sat, 8 Feb 2025 16:50:42 -0600 Subject: [PATCH] MPI helper of comparison, Least Leak Intended In-Reply-To: <87frkpmbsb.fsf@haruna.fsij.org> References: <877c6b8pp8.fsf@akagi.fsij.org> <87seovj2cn.fsf@akagi.fsij.org> <87plju492s.fsf@akagi.fsij.org> <87frkpmbsb.fsf@haruna.fsij.org> Message-ID: <89b5ea2f-03b5-43cb-ad85-645ccdb42c60@gmail.com> On 2/7/25 20:05, NIIBE Yutaka via Gcrypt-devel wrote: > NIIBE Yutaka wrote: >> I think that this implementation could be improved. > I should use ct_limb_gen_inv_mask function instead of directly use unary > minus operator. 
Could it make more sense to write: result &= ct_limb_gen_inv_mask (gt) & ct_limb_gen_inv_mask (lt); result |= gt | -lt; Assuming that ct_limb_gen_inv_mask returns all-bits-set if its argument is zero and all-bits-clear otherwise, the first line clears result if a previous value is to be overwritten and the second sets the new value. I also still suggest considering an alternate encoding for the comparison result. The Hamming distance between 0 and 1 is 1, but the Hamming distance between 0 and -1 is the maximum on a 2's complement machine, which means that any information leakage on the power rail will be at its strongest when the comparison result is "less than". A one-hot encoding would have a constant Hamming distance (of 2) between any pair of valid values. I remember reading a paper some years ago by an academic research group that was able to recover private keys by observing noise on a laptop's ground (a USB port shield connection, if I recall correctly). -- Jacob From jussi.kivilinna at iki.fi Sun Feb 9 15:06:22 2025 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Sun, 9 Feb 2025 16:06:22 +0200 Subject: [PATCH] MPI helper of comparison, Least Leak Intended In-Reply-To: <89b5ea2f-03b5-43cb-ad85-645ccdb42c60@gmail.com> References: <877c6b8pp8.fsf@akagi.fsij.org> <87seovj2cn.fsf@akagi.fsij.org> <87plju492s.fsf@akagi.fsij.org> <87frkpmbsb.fsf@haruna.fsij.org> <89b5ea2f-03b5-43cb-ad85-645ccdb42c60@gmail.com> Message-ID: Hello, On 9.2.2025 0.50, Jacob Bachmeyer via Gcrypt-devel wrote: > On 2/7/25 20:05, NIIBE Yutaka via Gcrypt-devel wrote: >> NIIBE Yutaka wrote: >>> I think that this implementation could be improved. >> I should use ct_limb_gen_inv_mask function instead of directly use unary >> minus operator. 
> > Could it make more sense to write: > > result &= ct_limb_gen_inv_mask (gt) & ct_limb_gen_inv_mask (lt); > result |= gt | -lt; > > Assuming that ct_limb_gen_inv_mask returns all-bits-set if its argument is zero and all-bits-clear otherwise, the first line clears result if a previous value is to be overwritten and the second sets the new value. > > I also still suggest considering an alternate encoding for the comparison result. The Hamming distance between 0 and 1 is 1, but the Hamming distance between 0 and -1 is the maximum on a 2's complement machine, which means that any information leakage on the power rail will be at its strongest when the comparison result is "less than". I'd move final result generation outside from the loop and instead generate separate result_lt and result_gt values in loop. These would then be combined at the end of function to form final result code. That should mostly mitigate the 0/1/-1 hamming distance EM leakage from inside the loop. int _gcry_mpih_cmp_lli (mpi_ptr_t up, mpi_ptr_t vp, mpi_size_t size) { mpi_size_t i; mpi_limb_t res_gt = 0; mpi_limb_t res_lt = 0; for (i = 0; i < size ; i++) { mpi_limb_t gt, lt, eq, neq; gt = mpih_ct_limb_greater_than (up[i], vp[i]); lt = mpih_ct_limb_less_than (up[i], vp[i]); neq = ct_limb_gen_mask(gt | lt); eq = ct_limb_gen_inv_mask(gt | lt); res_gt = (eq & res_gt) | (neq & gt); res_lt = (eq & res_lt) | (neq & lt); } return (int)(res_gt - res_lt); /* return 0 if U==V, 1 if U>V, -1 if U<V */ } > A one-hot encoding would have a constant Hamming distance (of 2) between any pair of valid values. 
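The two-accumulator function above can be tried in isolation with a sketch like the following. The helper names (`limb_less_than`, `gen_mask`, `gen_inv_mask`, `cmp_two_acc`) are hypothetical stand-ins, not the Libgcrypt functions; they only assume the semantics discussed in this thread: a mask helper that returns all bits set for 1 and zero for 0, and an inverse mask helper that does the opposite. Limbs are assumed to be 64 bits wide with limb 0 least significant, and the final subtraction is done on `int` values to stay clear of implementation-defined conversions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* stand-in for mpi_limb_t (assumed 64-bit here) */
#define LIMB_BITS 64

/* Hypothetical stand-in: 1 if a < b (unsigned), else 0, taken from the
   borrow out of a - b.  */
static limb_t
limb_less_than (limb_t a, limb_t b)
{
  return ((~a & b) | ((~a | b) & (a - b))) >> (LIMB_BITS - 1);
}

/* Hypothetical stand-in: all bits set if x == 1, zero if x == 0. */
static limb_t
gen_mask (limb_t x)
{
  return (limb_t)0 - x;
}

/* Hypothetical stand-in: all bits set if x == 0, zero if x == 1. */
static limb_t
gen_inv_mask (limb_t x)
{
  return x - (limb_t)1;
}

/* Keep separate "greater" and "less" flags inside the loop and combine
   them only once, after the last limb.  */
int
cmp_two_acc (const limb_t *up, const limb_t *vp, size_t size)
{
  limb_t res_gt = 0;
  limb_t res_lt = 0;
  size_t i;

  for (i = 0; i < size; i++)
    {
      limb_t gt = limb_less_than (vp[i], up[i]);
      limb_t lt = limb_less_than (up[i], vp[i]);
      limb_t neq = gen_mask (gt | lt);
      limb_t eq = gen_inv_mask (gt | lt);

      /* Equal limbs keep the old flags; unequal limbs replace both. */
      res_gt = (eq & res_gt) | (neq & gt);
      res_lt = (eq & res_lt) | (neq & lt);
    }

  /* Both flags are 0 or 1, so this yields 0, 1 or -1. */
  return (int) res_gt - (int) res_lt;
}

/* Small self-check of the sketch. */
void
cmp_two_acc_selftest (void)
{
  const limb_t u[2] = { 5, 2 }, v[2] = { 9, 1 };

  assert (cmp_two_acc (u, v, 2) == 1);    /* high limb decides: 2 > 1 */
  assert (cmp_two_acc (v, u, 2) == -1);
  assert (cmp_two_acc (u, u, 2) == 0);
}
```

Inside the loop every intermediate value is either a 0/1 flag or a full mask selected by `gt | lt`, so the 1 versus -1 distinction only appears in the final return expression.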
If returned value (0 vs 1 vs -1) could cause EM leakage, last line of function could be changed to something like: return (int)(res_gt | (res_lt << 1)); /* return 0 if U==V, 1 if U>V, 2 if U<V */ Or if having sign-bit set is important but we want to avoid "set all bits to ones" case, then only set sign-bit for "U<V": return (int)(res_gt | (res_lt << (sizeof(int) * CHAR_BIT - 1))); /* return 0 if U==V, 1 if U>V, INT_MIN if U<V */ References: <877c6b8pp8.fsf@akagi.fsij.org> <87seovj2cn.fsf@akagi.fsij.org> <87plju492s.fsf@akagi.fsij.org> <87frkpmbsb.fsf@haruna.fsij.org> <89b5ea2f-03b5-43cb-ad85-645ccdb42c60@gmail.com> Message-ID: On 2/9/25 08:06, Jussi Kivilinna wrote: > Hello, > > On 9.2.2025 0.50, Jacob Bachmeyer via Gcrypt-devel wrote: >> On 2/7/25 20:05, NIIBE Yutaka via Gcrypt-devel wrote: >>> NIIBE Yutaka wrote: >>>> I think that this implementation could be improved. >>> I should use ct_limb_gen_inv_mask function instead of directly use >>> unary >>> minus operator. >> >> Could it make more sense to write: >> >> result &= ct_limb_gen_inv_mask (gt) & ct_limb_gen_inv_mask (lt); >> result |= gt | -lt; >> >> Assuming that ct_limb_gen_inv_mask returns all-bits-set if its >> argument is zero and all-bits-clear otherwise, the first line clears >> result if a previous value is to be overwritten and the second sets >> the new value. >> >> I also still suggest considering an alternate encoding for the >> comparison result. The Hamming distance between 0 and 1 is 1, but >> the Hamming distance between 0 and -1 is the maximum on a 2's >> complement machine, which means that any information leakage on the >> power rail will be at its strongest when the comparison result is >> "less than". > > I'd move final result generation outside from the loop and instead > generate separate result_lt and result_gt values in loop. These would > then be combined at the end of function to form final result code. > That should mostly mitigate the 0/1/-1 hamming distance EM leakage > from inside the loop. > > [...] I had not thought of that. Thank you. >> >> A one-hot encoding would have a constant Hamming distance (of 2) >> between any pair of valid values. 
> > If returned value (0 vs 1 vs -1) could cause EM leakage, last line of > function could be changed to something like: > > return (int)(res_gt | (res_lt << 1)); /* return 0 if U==V, 1 if U>V, > 2 if U<V */ > > Or if having sign-bit set is important but we want to avoid "set all > bits to ones" case, then only set sign-bit for "U<V": > > return (int)(res_gt | (res_lt << (sizeof(int) * CHAR_BIT - 1))); /* > return 0 if U==V, 1 if U>V, INT_MIN if U