Underscore prefix detection fix

Gregor Riepl seto-kun at freesurf.ch
Sun Jul 29 01:28:52 CEST 2007


> The other oddity is, when I build with i586 assembly, the checks  
> run _slower_ than in i386 mode.
> I get 1min 19sec vs. 2min 14sec on a MacBook CoreDuo 1.83GHz with  
> 1GB RAM.
> Even when using aggressive optimisation
> (CFLAGS="-arch i586 -march=yonah -O3 -ffast-math -mfpmath=sse -msse -msse2"),
> I still only get 1min 47secs. For i386, I didn't use any special compiler
> flags.
>
> What are me and my Mac messing up here?

I think I've found the problem.
In mpi/config.links, there's a rule for i586-* that sets the macro  
ELF_SYNTAX in asm-syntax.h. This in turn causes the assembler to see  
the line
.align (1<<3)
in front of the Loop: label in mpih-sub1-asm.S and mpih-add1-asm.S.
At least with the Apple assembler, this is interpreted as "align
the next instruction on a 2^(1<<3) byte boundary", which is BSD syntax.
I'm not quite sure, but I think I read somewhere that this 2^(align
size) syntax is even used in recent gas versions. In any case,
the resulting 1<<(1<<3) = 0x100 = 256 byte alignment produces 200+ nops,
which slow the routine down considerably.
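
To make the arithmetic concrete, here is a small shell sketch of the two ways the operand of .align can be read. The variable names are purely illustrative and do not come from the gcrypt sources; the padding figures are the worst case, the actual count depends on where the label happens to fall.

```shell
#!/bin/sh
# Operand of the directive as the assembler sees it: .align (1<<3)
operand=$((1 << 3))

# ELF-style reading: the operand is a byte count.
elf_bytes=$operand

# BSD-style reading: the operand is a power of two.
bsd_bytes=$((1 << operand))

# Worst-case padding in front of the label is (alignment - 1) bytes.
echo "ELF-style: align to $elf_bytes bytes, up to $((elf_bytes - 1)) bytes of padding"
echo "BSD-style: align to $bsd_bytes bytes, up to $((bsd_bytes - 1)) bytes of padding"
```

With the BSD-style reading the padding can grow to 255 bytes, which fits the 200+ nops I saw.
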
I fixed this by adding the darwin triplets to the djgpp triplets in  
config.links:
     i[3467]86*-msdosdjgpp* | \
     i[34]86*-apple-darwin*)
         echo '#define BSD_SYNTAX'        >>./mpi/asm-syntax.h
         cat  $srcdir/mpi/i386/syntax.h   >>./mpi/asm-syntax.h
         path="i386"
         ;;
     i586*-msdosdjgpp* | \
     i[567]86*-apple-darwin*)
         echo '#define BSD_SYNTAX'        >>./mpi/asm-syntax.h
         cat  $srcdir/mpi/i386/syntax.h   >>./mpi/asm-syntax.h
         path="i586 i386"
         ;;
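
For anyone who wants to check which branch a given host triplet ends up in, the patterns can be tried in isolation. This is only a sketch: pick_path is a hypothetical helper, not part of config.links, and the triplets in the loop are made-up examples. As in any case statement, the first matching pattern wins.

```shell
#!/bin/sh
# Hypothetical helper: prints the mpi path that the two rules above
# would select for a host triplet, or "unmatched" if neither applies.
pick_path() {
    case "$1" in
        i[3467]86*-msdosdjgpp* | \
        i[34]86*-apple-darwin*)
            echo "i386"
            ;;
        i586*-msdosdjgpp* | \
        i[567]86*-apple-darwin*)
            echo "i586 i386"
            ;;
        *)
            echo "unmatched"
            ;;
    esac
}

# Example triplets (assumed, for illustration only).
for host in i386-apple-darwin8 i686-apple-darwin8 i586-pc-msdosdjgpp powerpc-apple-darwin8; do
    echo "$host -> $(pick_path "$host")"
done
```
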

This takes out the nops, but the i586 build is still slower.
Using the aggressive optimisation flags mentioned earlier, the benchmark
runs in 49secs with i386 assembly and in 68secs with i586 assembly.
Disabling assembly altogether yields 65secs, by the way.

I think I'll give up on this for now - it's fast enough, and I'm happy
that gcrypt builds with a little bit of speed improvement on OSX. :)

Thanks for all your work,
Gregor

