[PATCH] poly1305: add fast addition macro for ppc64
Jussi Kivilinna
jussi.kivilinna at iki.fi
Fri Sep 6 21:46:05 CEST 2019
* cipher/poly1305.c [USE_MPI_64BIT && __powerpc__] (ADD_1305_64): New.
--
Benchmark on POWER8 (~3.8 GHz):
Before:
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |     0.547 ns/B      1742 MiB/s      2.08 c/B

After (~8% faster):
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |     0.502 ns/B      1901 MiB/s      1.91 c/B

Benchmark on POWER9 (~3.8 GHz):
Before:
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |     0.493 ns/B      1934 MiB/s      1.87 c/B

After (~7% faster):
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |     0.459 ns/B      2077 MiB/s      1.74 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna at iki.fi>
---
0 files changed
diff --git a/cipher/poly1305.c b/cipher/poly1305.c
index cded7cb2e..698050851 100644
--- a/cipher/poly1305.c
+++ b/cipher/poly1305.c
@@ -99,6 +99,19 @@ static void poly1305_init (poly1305_context_t *ctx,
#endif /* __x86_64__ */
+#if defined (__powerpc__) && __GNUC__ >= 4
+
+/* A += B (ppc64) */
+#define ADD_1305_64(A2, A1, A0, B2, B1, B0) \
+ __asm__ ("addc %0, %3, %0\n" \
+ "adde %1, %4, %1\n" \
+ "adde %2, %5, %2\n" \
+ : "+r" (A0), "+r" (A1), "+r" (A2) \
+ : "r" (B0), "r" (B1), "r" (B2) \
+ : "cc" )
+
+#endif /* __powerpc__ */
+
#ifndef ADD_1305_64
/* A += B (generic, mpi) */
# define ADD_1305_64(A2, A1, A0, B2, B1, B0) do { \
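
For readers unfamiliar with the ppc64 carry chain: `addc` adds two registers and records the carry-out in the CA bit, and each following `adde` adds two registers plus CA and updates CA again. The sketch below shows the same 192-bit addition in portable C. It is only an illustration of the arithmetic, not the generic macro from poly1305.c; the function name `add_3x64_portable` and the array layout (index 0 is the least significant limb) are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libgcrypt: computes A += B where A and B
   are three 64-bit limbs each, a[0]/b[0] being the least significant. */
static void add_3x64_portable(uint64_t a[3], const uint64_t b[3])
{
  uint64_t carry, t, c1;

  /* Limb 0: plain add; a wrapped result means a carry-out
     (the role of "addc"). */
  a[0] += b[0];
  carry = a[0] < b[0];

  /* Limb 1: add with carry-in, tracking carry-out from either step
     (the role of the first "adde").  At most one of the two steps
     can wrap, so OR-ing the flags is sufficient. */
  t = a[1] + b[1];
  c1 = t < b[1];
  a[1] = t + carry;
  c1 |= a[1] < t;

  /* Limb 2: add with carry-in; the final carry-out is discarded,
     as in the three-instruction asm sequence. */
  a[2] += b[2] + c1;
}
```

The inline asm version lets the hardware carry flag do the work that the comparisons do here, which is presumably where the measured gain comes from.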