[PATCH] poly1305: add fast addition macro for ppc64

Jussi Kivilinna jussi.kivilinna at iki.fi
Fri Sep 6 21:46:05 CEST 2019


* cipher/poly1305.c [USE_MPI_64BIT && __powerpc__] (ADD_1305_64): New.
--

Benchmark on POWER8 (~3.8 GHz):

Before:
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |     0.547 ns/B      1742 MiB/s      2.08 c/B
After (~8% faster):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |     0.502 ns/B      1901 MiB/s      1.91 c/B

Benchmark on POWER9 (~3.8 GHz):
Before:
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |     0.493 ns/B      1934 MiB/s      1.87 c/B
After (~7% faster):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305           |     0.459 ns/B      2077 MiB/s      1.74 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna at iki.fi>
---
 0 files changed

diff --git a/cipher/poly1305.c b/cipher/poly1305.c
index cded7cb2e..698050851 100644
--- a/cipher/poly1305.c
+++ b/cipher/poly1305.c
@@ -99,6 +99,19 @@ static void poly1305_init (poly1305_context_t *ctx,
 
 #endif /* __x86_64__ */
 
+#if defined (__powerpc__) && __GNUC__ >= 4
+
+/* A += B (ppc64) */
+#define ADD_1305_64(A2, A1, A0, B2, B1, B0) \
+      __asm__ ("addc %0, %3, %0\n" \
+	       "adde %1, %4, %1\n" \
+	       "adde %2, %5, %2\n" \
+	       : "+r" (A0), "+r" (A1), "+r" (A2) \
+	       : "r" (B0), "r" (B1), "r" (B2) \
+	       : "cc" )
+
+#endif /* __powerpc__ */
+
 #ifndef ADD_1305_64
 /* A += B (generic, mpi) */
 #  define ADD_1305_64(A2, A1, A0, B2, B1, B0) do { \
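
For readers unfamiliar with the PowerPC carry instructions, the effect of the
new macro can be sketched in portable C. In the asm, `addc` adds the low limbs
and records the carry-out, and each `adde` adds the next pair of limbs plus
that carry, so the three instructions perform one 192-bit addition
(A2:A1:A0) += (B2:B1:B0). The function name `add_1305_64_ref` is hypothetical
and is not part of the patch or of libgcrypt; this is only an illustration of
the arithmetic, not the generic fallback used in cipher/poly1305.c.

```c
#include <stdint.h>

/* Hypothetical reference (not in libgcrypt): (A2:A1:A0) += (B2:B1:B0),
 * with the carry propagated by hand instead of by addc/adde. */
static void
add_1305_64_ref (uint64_t *a2, uint64_t *a1, uint64_t *a0,
                 uint64_t b2, uint64_t b1, uint64_t b0)
{
  uint64_t carry0, carry1, t;

  /* Low limb: plain add, remember the carry-out (what addc does). */
  *a0 += b0;
  carry0 = (*a0 < b0);

  /* Middle limb: add with carry-in, compute carry-out (first adde).
   * At most one of the two partial additions can wrap. */
  t = *a1 + b1;
  carry1 = (t < b1);
  t += carry0;
  carry1 |= (t < carry0);
  *a1 = t;

  /* High limb: add with carry-in; the final carry-out is dropped,
   * as it is in the asm version (second adde). */
  *a2 += b2 + carry1;
}
```

For example, adding (0, 0, 1) to (0, 2^64-1, 2^64-1) ripples the carry through
both lower limbs and leaves (1, 0, 0). The asm version gets the same result
without the explicit comparisons, which is presumably where the measured gain
comes from.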



