[PATCH 2/2] Avoid slow integer multiplication and division with blocksize calculations.

Jussi Kivilinna jussi.kivilinna at mbnet.fi
Thu Nov 29 20:09:03 CET 2012


Quoting Werner Koch <wk at gnupg.org>:

> On Thu, 29 Nov 2012 16:37, jussi.kivilinna at mbnet.fi said:
>
>> Currently all blocksizes are powers of two (and most likely will be in the
>> future), so we can avoid using integer multiplication and division/modulo
>> operations (which are slow on architectures without hardware units for
>> mul/div/mod).
>
> Do you have some of your cool benchmarks?
>

Well, I currently only have access to x86 machines, and there this change
had no easily measurable effect.

However, if I leave the 'buffer xor' patch out and change the cipher-ctr.c
loop to use '& c->blockmask' instead of '% blocksize', an AMD Phenom
II/x86-64 sees the following improvement:

Before (% blocksize):

Running each test 20 times.
                   CTR
              ---------------
IDEA           490ms   500ms
3DES          1040ms  1010ms
CAST5          370ms   370ms
BLOWFISH       390ms   380ms
AES            160ms   170ms
AES192         190ms   190ms
AES256         220ms   220ms
TWOFISH        310ms   310ms
DES            520ms   530ms
TWOFISH128     320ms   320ms
SERPENT128     460ms   460ms
SERPENT192     460ms   460ms
SERPENT256     440ms   450ms
RFC2268_40     600ms   590ms
SEED           400ms   410ms
CAMELLIA128    340ms   350ms
CAMELLIA192    380ms   370ms
CAMELLIA256    380ms   380ms

After (& c->blockmask):

Running each test 20 times.
                   CTR
              ---------------
IDEA           350ms   350ms
3DES           850ms   840ms
CAST5          220ms   220ms
BLOWFISH       220ms   220ms
AES            160ms   160ms
AES192         180ms   190ms
AES256         220ms   210ms
TWOFISH        170ms   160ms
DES            370ms   370ms
TWOFISH128     160ms   170ms
SERPENT128     300ms   310ms
SERPENT192     300ms   300ms
SERPENT256     310ms   300ms
RFC2268_40     430ms   420ms
SEED           270ms   250ms
CAMELLIA128    200ms   190ms
CAMELLIA192    240ms   230ms
CAMELLIA256    230ms   230ms

With 'buffer xor':

Running each test 20 times.
                   CTR
              ---------------
IDEA           310ms   320ms
3DES           810ms   850ms
CAST5          190ms   200ms
BLOWFISH       190ms   200ms
AES            140ms   130ms
AES192         160ms   160ms
AES256         190ms   190ms
TWOFISH        140ms   140ms
DES            320ms   340ms
TWOFISH128     130ms   130ms
SERPENT128     280ms   280ms
SERPENT192     270ms   270ms
SERPENT256     280ms   270ms
RFC2268_40     380ms   390ms
SEED           230ms   220ms
CAMELLIA128    170ms   160ms
CAMELLIA192    200ms   200ms
CAMELLIA256    200ms   210ms
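
For reference, here is a minimal sketch of the substitution measured above.
It assumes the block size is a power of two and is only known at run time,
in which case the compiler generally cannot turn the modulo into a mask by
itself. The struct and function names are made up for illustration and are
not the actual libgcrypt code; only the '& c->blockmask' and '% blocksize'
expressions come from the patch discussion.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cipher handle.  Only the two fields
   needed for the illustration are present. */
struct ctr_sketch
{
  size_t blocksize;   /* assumed to be a power of two, e.g. 8 or 16 */
  size_t blockmask;   /* always blocksize - 1 */
};

/* Set up the handle.  Returns -1 if BLOCKSIZE is zero or not a power
   of two, because the mask trick is only valid for powers of two. */
int
ctr_sketch_init (struct ctr_sketch *c, size_t blocksize)
{
  if (blocksize == 0 || (blocksize & (blocksize - 1)) != 0)
    return -1;
  c->blocksize = blocksize;
  c->blockmask = blocksize - 1;
  return 0;
}

/* Number of bytes left over after the last full block, computed with
   a runtime modulo (the "before" variant). */
size_t
tail_bytes_mod (const struct ctr_sketch *c, size_t nbytes)
{
  return nbytes % c->blocksize;
}

/* Same value computed with a bitwise AND against the precomputed mask
   (the "after" variant). */
size_t
tail_bytes_mask (const struct ctr_sketch *c, size_t nbytes)
{
  return nbytes & c->blockmask;
}
```

Both helpers return the same value for any power-of-two block size; the
second one just avoids the division instruction.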

-Jussi


>
> Salam-Shalom,
>
>    Werner
>
> --
> Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.
>
>
>





