[PATCH 1/2] Optimize buffer xoring.

Sat Dec 1 12:21:15 CET 2012

Quoting Werner Koch <wk at gnupg.org>:

>> +#if defined(__i386__) || defined(__x86_64__)
>> +/* These architechtures are able of unaligned memory accesses and can
>> +   handle those fast.
>> + */
>
> Really?  All of them?
>

I've now tested AMD Phenom II (32/64-bit), Intel Core2 (32/64-bit),
Intel Sandy Bridge (32/64-bit) and Intel Atom (32-bit) for unaligned
accesses/buf_xor, and all do reasonably well. Intel Core2 seems to
have the highest penalty (2.0x more time) for unaligned buf_xor on
small buffers (16 bytes). However, it's still faster than byte-xor,
which takes 4.0x more time than aligned buf_xor with 16-byte buffers.
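
For readers without the attachment, the two variants being compared
can be sketched roughly as below. This is only an illustration, not
the code from the patch or from test_unaligned.c: the names
buf_xor_bytes and buf_xor_words are made up for this example, and the
word-wise version uses memcpy() to express the unaligned loads and
stores portably, on the assumption that the compiler turns those
fixed-size copies into plain moves on i386/x86_64.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical baseline: XOR one byte at a time.  Works for any
   pointer alignment, but does one load/xor/store per byte. */
static void buf_xor_bytes(unsigned char *dst, const unsigned char *a,
                          const unsigned char *b, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        dst[i] = a[i] ^ b[i];
}

/* Hypothetical word-wise variant: XOR sizeof(uintptr_t) bytes per
   iteration without requiring aligned pointers.  The fixed-size
   memcpy() calls stand in for unaligned word loads and stores. */
static void buf_xor_words(unsigned char *dst, const unsigned char *a,
                          const unsigned char *b, size_t len)
{
    size_t i = 0;

    /* Bulk of the buffer, one machine word at a time. */
    for (; i + sizeof(uintptr_t) <= len; i += sizeof(uintptr_t)) {
        uintptr_t wa, wb;

        memcpy(&wa, a + i, sizeof wa);
        memcpy(&wb, b + i, sizeof wb);
        wa ^= wb;
        memcpy(dst + i, &wa, sizeof wa);
    }

    /* Remaining tail bytes, if len is not a multiple of the word size. */
    for (; i < len; i++)
        dst[i] = a[i] ^ b[i];
}
```

Calling buf_xor_words (dst, src1 + 1, src2 + 3, 16) exercises the
unaligned case; both functions should produce identical output for
any offsets and lengths, so only the timing differs.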

I have attached the source of the tool I used to do the measurements.

-Jussi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_unaligned.c
Type: text/x-csrc
Size: 6998 bytes
Desc: not available
URL: </pipermail/attachments/20121201/7fed12b9/attachment.c>