[PATCH] MPI helper of table lookup, Least Leak Intended

NIIBE Yutaka gniibe at fsij.org
Tue Feb 18 06:30:49 CET 2025


Hello,

Thank you for your comments.

Jacob Bachmeyer <jcb62281 at gmail.com> wrote:
> The obvious comment to me is that the function name should probably 
> contain either "_ct" or "_lli" to denote that this is a slow function 
> for leak minimization.

Ack.  I had a look at OpenSSL and GNU MP.

OpenSSL has:

    static ossl_inline void constant_time_lookup(void *out,
                                                 const void *table,
                                                 size_t rowsize,
                                                 size_t numrows,
                                                 size_t idx);

GNU MP has:

    void
    mpn_sec_tabselect (volatile mp_limb_t *rp, volatile const mp_limb_t *tab,
		       mp_size_t n, mp_size_t nents, mp_size_t which);


Thus, I'm going to use the name "_gcry_mpih_lookup_lli" for libgcrypt,
adding the _lli suffix and omitting "table", since that is easy to infer.
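
To illustrate the general technique behind such a helper (this is not
the patch itself, only a minimal sketch): every entry of the table is
read, and the wanted one is selected with a mask instead of a branch or
an index-dependent address.  The name lookup_lli_sketch and the limb_t
type are hypothetical stand-ins, and a real implementation would also
need to stop the compiler from turning the mask back into a branch
(GNU MP uses volatile in its signature above).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's limb type.  */
typedef unsigned long limb_t;

/* Sketch only: copy entry IDX (N limbs) of TABLE, which holds NENTS
   entries, into RP.  All entries are read regardless of IDX.  */
static void
lookup_lli_sketch (limb_t *rp, const limb_t *table,
                   size_t n, size_t nents, size_t idx)
{
  size_t i, k;

  /* Sizes are public values, so checking them leaks nothing.  */
  assert (n > 0 && nents > 0);

  for (i = 0; i < n; i++)
    rp[i] = 0;

  for (k = 0; k < nents; k++)
    {
      /* neq is 1 when k != idx and 0 when k == idx, without a branch.  */
      size_t diff = k ^ idx;
      size_t neq = (diff | ((size_t)0 - diff))
                   >> (sizeof (size_t) * CHAR_BIT - 1);
      /* mask is all ones for the wanted entry, all zeros otherwise.  */
      limb_t mask = (limb_t)neq - 1;

      for (i = 0; i < n; i++)
        rp[i] |= table[k * n + i] & mask;
    }
}
```

The cost is proportional to the whole table (n * nents limbs) for each
lookup, which is the "slow" part that the _lli suffix is meant to signal.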

> There might also be architecture-specific instructions that can be used 
> to retrieve a table row without polluting the data cache; allowing 
> architecture-specific overrides here could make a very significant 
> performance difference, as the basic implementation could easily flush 
> the entire data cache if used on a large table.
>
> For the base case, reading the entire table is probably the best that 
> you can do, but if you have a "load without temporal locality" 
> instruction (I believe that there are such instructions in SSE, for 
> example), you can avoid the problem, while accessing only a single table 
> row.  (The memory bus is assumed to not be visible to an attacker.)

Ah, I didn't consider that.

IIUC, you mean something like the _mm*_stream_load_si* functions in the
Intel Intrinsics Guide (to access an entry in a table).

    https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#cats=Load

That is interesting to try, and it could be effective when the table is
large and read-only.  (But when the table is larger than a page, it
might be a target of a TLB flush attack to determine which page was
accessed.)
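
As a rough sketch of how I understand the suggestion (untested for
effectiveness, and not a proposal for the actual code): fetch only the
wanted row with _mm_stream_load_si128 where SSE4.1 is available, and
fall back to a plain copy elsewhere.  The name fetch_row_stream_sketch
is hypothetical, and the sketch assumes a 16-byte-aligned table with an
entry size that is a multiple of 16.  Whether the non-temporal hint
really avoids a cache footprint on ordinary memory would need to be
checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE4_1__)
#include <smmintrin.h>
#endif

/* Sketch only: copy row IDX (ENTRY_SIZE bytes) of TABLE into OUT.
   Unlike the read-everything approach, the address accessed here
   depends on IDX.  */
static void
fetch_row_stream_sketch (void *out, const void *table,
                         size_t entry_size, size_t idx)
{
  const unsigned char *src
    = (const unsigned char *)table + idx * entry_size;
  unsigned char *dst = out;

  /* Assumptions of this sketch, both on public values.  */
  assert (entry_size % 16 == 0);
  assert ((uintptr_t)table % 16 == 0);

#if defined(__SSE4_1__)
  /* Load 16 bytes at a time with the non-temporal hint.  */
  for (size_t off = 0; off < entry_size; off += 16)
    {
      __m128i v = _mm_stream_load_si128 ((__m128i *)(src + off));
      _mm_storeu_si128 ((__m128i *)(dst + off), v);
    }
#else
  /* Without SSE4.1 this is only an ordinary, cache-visible copy.  */
  memcpy (dst, src, entry_size);
#endif
}
```

This kind of variant would be the architecture-specific override, with
the read-everything version staying as the generic default.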

Note that in this particular case of modular exponentiation, the table
size is typically 4 KiB and the entry size is 256 bytes (that is, 16
entries).  The table is computed in _gcry_mpih_powm_lli before the loop
which uses the table.

For now, let me apply and push _gcry_mpih_lookup_lli; possible
improvements can be made in the future.