[PATCH 1/2] sha512-avx512: enable only on Intel CPUs for now

Jussi Kivilinna jussi.kivilinna at iki.fi
Sat Oct 22 16:14:25 CEST 2022


* cipher/sha512.c (sha512_init_common): Enable AVX512 implementation
only for Intel CPUs.
--

The SHA512-AVX512 implementation is slightly slower than the AVX2
variant on AMD Zen4 (AVX512 4.88 cpb, AVX2 4.35 cpb). This is likely
because the AVX512 implementation uses vector registers for the
round function, whereas the AVX2 implementation uses general purpose
registers. On Zen4, message expansion and the round function then
compete for the narrower vector execution bandwidth, which results
in slower performance.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna at iki.fi>
---
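Note (not part of the commit message): the selection logic in
sha512_init_common assigns ctx->bctx.bwrite in sequence, so a later
matching check overrides an earlier one. The sketch below only
illustrates that pattern together with the new vendor gate. All
DEMO_* names, the flag values, and demo_select_impl are hypothetical
and are not libgcrypt identifiers. The real AVX2 condition may also
check further feature bits.

```c
/* Hypothetical flag values for illustration only; the real HWF_*
   constants are defined by libgcrypt and differ from these.  */
#define DEMO_HWF_AVX2       (1u << 0)
#define DEMO_HWF_AVX512     (1u << 1)
#define DEMO_HWF_CPU_INTEL  (1u << 2)

enum demo_impl
{
  DEMO_IMPL_GENERIC,
  DEMO_IMPL_AVX2,
  DEMO_IMPL_AVX512
};

/* Pick an implementation from a feature bitmask.  Checks run in
   order and a later match replaces the earlier choice.  */
static enum demo_impl
demo_select_impl (unsigned int features)
{
  enum demo_impl impl = DEMO_IMPL_GENERIC;

  if ((features & DEMO_HWF_AVX2) != 0)
    impl = DEMO_IMPL_AVX2;

  /* AVX512 is used only when the vendor bit is also set, so a
     non-Intel CPU with AVX512 keeps the AVX2 choice made above.  */
  if ((features & DEMO_HWF_AVX512) && (features & DEMO_HWF_CPU_INTEL))
    impl = DEMO_IMPL_AVX512;

  return impl;
}
```

Under these assumptions, a mask with AVX2 and AVX512 but without the
Intel bit (the Zen4 case described above) selects the AVX2 variant,
and adding the Intel bit selects AVX512.
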
 cipher/sha512.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cipher/sha512.c b/cipher/sha512.c
index 9ac412b3..492d021a 100644
--- a/cipher/sha512.c
+++ b/cipher/sha512.c
@@ -466,7 +466,7 @@ sha512_init_common (SHA512_CONTEXT *ctx, unsigned int flags)
     ctx->bctx.bwrite = do_sha512_transform_amd64_avx2;
 #endif
 #ifdef USE_AVX512
-  if ((features & HWF_INTEL_AVX512) != 0)
+  if ((features & HWF_INTEL_AVX512) && (features & HWF_INTEL_CPU))
     ctx->bctx.bwrite = do_sha512_transform_amd64_avx512;
 #endif
 #ifdef USE_PPC_CRYPTO
-- 
2.37.2



