Profiling (was: Faster mutex lock() and unlock())

Werner Koch wk at gnupg.org
Wed Jul 19 14:39:50 CEST 2006


On Thu, 13 Jul 2006 15:40, Nikos Mavrogiannopoulos said:

>  time   samples   samples    calls  T1/call  T1/call  name    
>  23.55   1247.00  1247.00                             Loop
>  16.37   2114.00   867.00                             Loop
>  14.41   2877.00   763.00                             gcry_mpih_divrem

This shows that the big number operations are taking up most of the
time.  This is expected.  If someone is really familiar with modern
ia32 CPUs, this can be optimized.  I know that GMP has better
optimizations by now, but because they changed the configuration of
their assembler functions, it is not straightforward to port them back
to libgcrypt.  It is of course possible and should be done.  Before
you ask: No, libgcrypt's current configuration scheme is not subject
to change because we know that it works and is portable over a wide
range of platforms.

>   8.63   3334.00   457.00                             rijndael_encrypt

The current AES code is pretty much the standard reference
implementation.  It should be possible to squeeze out more performance
and maybe even make cache timing attacks harder.  Brian Gladman put
quite some work into optimized implementations
(http://fp.gladman.plus.com/AES/index.htm).

I just noticed that the new license terms allow distribution under the
GPL - so we could take the code and add it to libgcrypt, using a
configure option to still allow building an LGPL version of libgcrypt.

If we use this code as an alternative AES implementation, I think it
will be okay with the GNU coding standards to go without a copyright
disclaimer in this case.  That alternative code should be clearly
separated, though.  Needs a volunteer of course ;-)

>   5.44   3622.00   288.00                             transform

Ah well, SHA-1.  We should definitely look for an optimized version as
SHA-1 is used really often and can become a real performance problem.
For example, GnuPG runs not only AES but also SHA-1 over the bulk data.

The benchmark tool could give some hints on the current performance.
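
As a rough illustration of the kind of measurement meant here, the
following sketch times a hash over bulk data and reports throughput.
It is not libgcrypt's benchmark tool: it uses Python's hashlib as a
stand-in, and the helper name hash_throughput, the buffer size and the
round count are all made up for the example.

```python
import hashlib
import time


def hash_throughput(algo="sha1", size=1 << 20, rounds=20):
    """Return (hex digest, MiB/s) for hashing `rounds` buffers of `size` bytes.

    Hypothetical helper for illustration only; not part of libgcrypt.
    """
    data = bytes(size)                 # zero-filled bulk data
    md = hashlib.new(algo)
    start = time.perf_counter()
    for _ in range(rounds):
        md.update(data)                # feed the same buffer repeatedly
    elapsed = time.perf_counter() - start
    mib = size * rounds / (1 << 20)
    # Guard against a zero reading on very coarse clocks.
    rate = mib / elapsed if elapsed > 0 else float("inf")
    return md.hexdigest(), rate


if __name__ == "__main__":
    for algo in ("sha1", "sha256"):
        digest, rate = hash_throughput(algo)
        print(f"{algo:8s} {rate:10.1f} MiB/s  {digest[:16]}...")
```

The absolute numbers depend on the machine and on the underlying
library, so they only make sense as a comparison between algorithms or
between two builds on the same box.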

>   3.76   3821.00   199.00                             transform

I guess this is another hash algorithm - probably RIPEMD-160 as used
by the RNG.



Salam-Shalom,

   Werner



