From cvs at cvs.gnupg.org Wed Jun 6 19:07:48 2018 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Wed, 06 Jun 2018 19:07:48 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.8.1-70-g7b6c2af Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 7b6c2afd699e889f5f054cc3d202a61bd0ee1dcf (commit) via 6606ae44e0de1069b29dd4215ee9748280940e1b (commit) from 61dbb7c08ab11c10060e193b52e3e1d2ec6dd062 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 7b6c2afd699e889f5f054cc3d202a61bd0ee1dcf Author: Werner Koch Date: Tue Jun 5 14:33:01 2018 +0200 ecc: Improve gcry_mpi_ec_curve_point * mpi/ec.c (_gcry_mpi_ec_curve_point): Check range of coordinates. * tests/t-mpi-point.c (point_on_curve): New. -- Due to the conversion to affine coordinates we didn't detected points with values >= P. The solution here might not be the best according to the NIST standard (it is done there at an earlier opportunity) but it reliably detects points we do not expect to receive. The new test vectors have been compared against gnutls/nettle. Reported-by: Stephan M?ller Signed-off-by: Werner Koch diff --git a/mpi/ec.c b/mpi/ec.c index 2c396a7..97afbfe 100644 --- a/mpi/ec.c +++ b/mpi/ec.c @@ -1731,6 +1731,15 @@ _gcry_mpi_ec_curve_point (gcry_mpi_point_t point, mpi_ec_t ctx) y = mpi_new (0); w = mpi_new (0); + /* Check that the point is in range. This needs to be done here and + * not after conversion to affine coordinates. */ + if (mpi_cmpabs (point->x, ctx->p) >= 0) + goto leave; + if (mpi_cmpabs (point->y, ctx->p) >= 0) + goto leave; + if (mpi_cmpabs (point->z, ctx->p) >= 0) + goto leave; + switch (ctx->model) { case MPI_EC_WEIERSTRASS: diff --git a/tests/t-mpi-point.c b/tests/t-mpi-point.c index 1eaa08a..f2378bf 100644 --- a/tests/t-mpi-point.c +++ b/tests/t-mpi-point.c @@ -1045,6 +1045,271 @@ twistededwards_math (void) } +/* Check the point on curve function. */ +static void +point_on_curve (void) +{ + static struct { + const char *curve; + int oncurve; /* Point below is on the curve. 
*/ + const char *qx; + const char *qy; + } t[] = { + { + "NIST P-256", 0, + "015B4F6775D68D4D2E2192C6B8027FC5A3D49957E453CB251155AA3FF5D3EC9974", + "4BC4C87B57A25E1056831208AB5B8F091142F891E9FF19F1E090B030DF1087B3" + }, { + "NIST P-256", 0, + "D22C316E7EBE7B293BD66808E000806F0754398A5D72A4F9BBC21C26EAC0A651", + "3C8DB80CC3CDE5E530D040536E6A58AAB41C33FA70B30896943513FF3690132D" + }, { + "NIST P-256", 0, + "0130F7E7BC52854CA493A0DE87DC4AB3B4343758F2B634F15B10D70DBC0A5A5291", + "86F9CA73C25CE86D54CB21C181AECBB52A5971334FF5040F76CAE9845ED46023" + }, { + "NIST P-256", 1, + "14957B602C7849F28858C7407696F014BC091D6D68C449560B7A38147D6E6A9B", + "A8E09EFEECFE00C797A0848F38B61992D30C61FAB13021E88C8BD3545B3A6C63" + }, { + "NIST P-256", 0, + "923DE4957241DD97780841C76294DB0D4F5DC04C3045081174764D2D32AD2D53", + "01B4B1A2027C02F0F520A3B01E4CE3C668BF481346A74499C5D1044A53E210B600" + }, { + "NIST P-256", 1, + "9021DFAB8B4DAEAADA634AAA26D6E5FFDF8C0476FF5CA31606C870A1B933FB36", + "9AFC65EEB24E46C7B75712EF29A981CB09FAC56E2B81D3ED024748CCAB1CB77E" + }, { + "NIST P-256", 0, + "011529F0B26DE5E0EB2DA4BFB6C149C802CB52EE479DD666553286928A4005E990", + "0EBC63DB2104884456DC0AA81A3F4E99D93B7AE2CD4B1489655EA9BE6289CF9E" + }, { + "NIST P-256", 1, + "216EC5DE8CA989199D31F0DFCD381DCC9270A0785365EC3E34CA347C070A87BE", + "87A88897BA763509ECC1DBE28D9D37F6F4E70E3B99B1CD3C0B934D4190968A6D" + }, { + "NIST P-256", 1, + "7ABAA44ACBC6016FDB52A6F45F6178E65CBFC35F9920D99149CA9999612CE945", + "88F7684BDCDA31EAFB6CAD859F8AB29B5D921D7DB2B34DF7E40CE36235F45B63" + }, { + "NIST P-256", 0, + "E765B4272D211DD0064189B55421FB76BB3A7756364A6CB1627FAED848157A84", + "C13171CFFB243E06B203F0996BBDD16F52292AD11F2DA81106E9C2FD87F4FA0F" + }, { + "NIST P-256", 0, + "EE4999DFC3A1871EE7A592BE26A09BEC9D9B561613EE9EFB6ED42F17985C9CDC", + "8399E967338A7A618336AF70DA67D9CAC1C19267809652F5C5183C8B129E0902" + }, { + "NIST P-256", 0, + "F755D0CF2642A2C7FBACCC8E9E442B8B047A99C6E052B2FA5AB0544B36B4D51C", + "AA080F17657B6565D9A4D94BD260B54D92FEE8DC4A78C4FC9C19209933AF39B0" + } , { + "NIST P-384", 0, + "CBFC7DBEBF15BEAD682549757F9BBA0E3F67669DF13FCE0EBE8024B725B38B00" + "83EC46A8F2FF3203C5C7F8C7E722A5EF", + "0548FE281BEAB18FD1AB86F59B0CA524479A4A81373C83B78AFFD801FAC75922" + "96470753DCF46173C9AA4A8A4C2FBE51" + }, { + "NIST P-384", 0, + "1DC8E054A883DB81EAEDE6C487B26816C927B8196780525A6CA8F675D2557752" + "02CE06CCBE705EA8A38AA2894D4BEEE6", + "010191050E867AFAA96A199FE9C591CF8B853D81486786DA889124881FB39D2F" + "8E0875F4C4BB1E3D0F8535C7A52306FB82" + }, { + "NIST P-384", 1, + "2539FC368CE1D5E464B6C0FBB12D557B712327DB086975255AD7D17F7E7E4F23" + "D719ED4116E2CC907AEB92CF22331A60", + "8843FDBA742CB64323E49CEBE8DD74908CFC9C3AA0015662DFBB7219E92CF32E" + "9FC63F61EF19DE9B3CEA98D163ABF254" + }, { + "NIST P-384", 0, + "0B786DACF400D43575394349EDD9F9CD145FC7EF737A3C5F69B253BE7639DB24" + "EC2F0CA62FF1F90B6515DE356EC2A404", + "225D6B2939CC7F7133F43353946A682C68DAC6BB75EE9CF6BD9A1609FA915692" + "72F4D3A87E88529754E109BB9B61B03B" + }, { + "NIST P-384", 0, + "76C660C9F58CF2051F9F8B06049694AB6FE418009DE6F0A0833BC690CEC06CC2" + "9A440AD51C94CF5BC28817C8C6E2D302", + "012974E5D9E55304ED294AB6C7A3C65B663E67ABC5E6F6C0F6498B519F2F6CA1" + "8306976291F3ADC0B5ABA42DED376EA9A5" + }, { + "NIST P-384", 0, + "23D758B1EDB8E12E9E707C53C131A19D9464B20EE05C99766F5ABDF9F906AD03" + "B958BF28B022E54E320672C4BAD4EEC0", + "01E9E72870C88F4C82A5AB3CC8A3398E8F006BF3EC05FFBB1EFF8AEE88020FEA" + "9E558E9F58ED1D324C9DCBCB4E8F2A5970" + }, { + "NIST P-384", 0, + 
"D062B96D5A10F715ACF361F99262ABF0F7693A8BB60ECB1DF459CF95750E4293" + "18BCB9FC60499D009F949298F3F9F47B", + "9089C6328E4B39A73D7EE6FAE1A77E48CE354B83BBCE432082C32C8FD6784B86" + "CFE9C552E2E720F5DA5806503D3784CD" + }, { + "NIST P-384", 0, + "2A951D4D6EB35C43D94866280D37365B82441BC84D62CBFF3365CAB1FD0A3E20" + "823CA8F84D2BBF4EA687885437DE7839", + "01CC7D762AFE613F7B5568BC516568A421159C40599E8D52DE10E8F9488931E1" + "69F3656C322DE45C4A70DC6DB9A661E599" + }, { + "NIST P-384", 1, + "A4BAEE6CDAF3AEB69032B3FBA811707C54F5753670DA5173D891547E8CBAEEF3" + "89B92C9A55573A596123415FBFA26991", + "3241EA716583C11C71BB30AF6C5E3A6637956F17ADBBE641BAB52E8539F9FC7B" + "F3B04F46DBFFE08151E0F0950CC70081" + }, { + "NIST P-384", 0, + "5C0E18B0DE3261BCBCFC7B702C2D75CF481336BFBADF420BADC616235C1966AB" + "4C0F876575DDEC1BDB3F3F04061C9AE4", + "E90C78550D1C922F1D8161D8C9C0576E29BD09CA665376FA887D13FA8DF48352" + "D7BBEEFB803F6CC8FC7895E47F348D33" + }, { + "NIST P-384", 1, + "2015864CD50F0A1A50E6401F44191665C19E4AD4B4903EA9EB464E95D1070E36" + "F1D8325E45734D5A0FDD103F4DF6F83E", + "5FB3E9A5C59DD5C5262A8176CB7032A00AE33AED08485884A3E5D68D9EEB990B" + "F26E8D87EC175577E782AD51A6A12C02" + }, { + "NIST P-384", 1, + "56EBF5310EEF5A5D8D001F570A18625383ECD4882B3FC738A69874E7C9D8F89C" + "187BECA23369DFD6C15CC0DA0629958F", + "C1230B349FB662CB762563DB8F9FCB32D5CCA16120681C474D67D279CCA6F6DB" + "73DE6AA96140B5C457B7486E06D318CE" + }, { + "NIST P-521", 0, + "01E4D82EE5CD6DA37080252295EFA273BBBA6952012D0120EAF131E73F1E5024" + "36E3324624471040030E1C345D65490ECEE9B64E03B15B6C7EB69A39C618BAFEED70", + "03EE3A3C88A6933B7B16016BE4CC4E3BF5EA0625CB3DB2604CDCBBD02CABBC90" + "8904D9DB42998F6C5101D4D4318ACFC9643C9CD641F636D1810ED86F1840EA74F3C0" + }, { + "NIST P-521", 0, + "01F3DFCB5433387B6B2E3F74177F4F3D7300F05E1AD49DE112630E27B1C8A437" + "1E742CB020E0039B5477FC897D17332034F9660B3066764EFF5FB440EB8856E782E3", + "02D337616C9D202DC5E290C486F5855CBD6A8470AE62CA96245834CF49257D8D" + "96D4041B15007650DEE668C00DDBF749054256C571F60980AC74D0DBCA7FB96C2F48" + }, { + "NIST P-521", 1, + "822A846606DC9E96452CAC373567A8B57D9ACA15B177F75DD7EF10C635F52CE4" + "EF6ABEEDB90D3F48F50A0C9015A95C955A25C45DE8413DE3BF899B6B1E62CF7CB8", + "0102771B5F3EC8C36838CEC04DCBC28AD1E38C37DAB0EA89B5EE92D21F7A35CE" + "ABC8B155EDC70154D6DFA2E77EC1D8C4A3406A6BD0ECF8F1EE2AC33A02464CB70C97" + }, { + "NIST P-521", 0, + "F733D48467912D1FFE46CF442F27FDD218D190E7B8A829D822DA3B6BAF9B987E" + "5B4BCCE34499248F59EEAF74F63ED15FF73F243C6FC3FD5E5842F6A3BA34C2022D", + "0281AAAD1B7EEBABEB6EC67932CB7E95717AFA3B4CF7A2DB151CD537C419C3A5" + "156ED9160758190B47696CDC15E81BBAD12975283907A571604DB23F702AEA4B38FF" + }, { + "NIST P-521", 0, + "03B1B274175AAEB5907152E5114CCAEADA28A7ADD4A2B1831C3D8302E8596489" + "E2C98B9B8D0CAE98C03BB11E28CE66D4736449758AF58BAFE40EF5A5FA22C9A43117", + "94C5951F81D544E959EDFC5DC1D5F42FE427871D4FB91A43A0B4A6BEA6B35B9E" + "BC5FB444C70BE4FD47B4ED16704F8C86EF019FC47C7FF2271F8B0DDEA9E2D3BCDD" + }, { + "NIST P-521", 1, + "F2248C318055DE37CD706D4FCAF7E7D96737A4A7B6B8067A66DCD58B6B8DFC55" + "90ECE67F6AA67F9C51B57E7B023075F2F42909BF47361CB6881C10F55FB7215B56", + "0162F735CE6A2ADA54CAF96A12D6888C02DE0A74638CF34CE39DABBACA4D651B" + "7E6ED1A65B551B36BAE7BE474BB6E6905ED0E33C7BA2021885027C7C6E40C5613004" + }, { + "NIST P-521", 0, + "9F08E97FEADCF0A391CA1EA4D97B5FE62D3B164593E12027EB967BD6E1FA841A" + "9831158DF164BCAD0BF3ADA96127745E25F349BDDD52EEA1654892B35960C9C023", + "AE2A25F5440F258AFACA6925C4C9F7AEAD3CB67153C4FACB31AC33F58B43A78C" + 
"B14F682FF726CEE2A6B6F6B481AEEB29A9B3150F02D1CFB764672BA8294C477291" + }, { + "NIST P-521", 0, + "01047B52014748C904980716953206A93F0D01B34CA94A997407FA93FE304F86" + "17BB6E402B2BB8B434C2671ECE953ABE7BADB75713CD9DF950943A33A9A19ACCDABE", + "7433533F098037DEA616337986887D01C5CC8DEC3DC1FDB9CDF7287EF27CC125" + "54FCF3A5E212DF9DAD9F8A3A7173B23FC6E15930704F3AEE1B074BDDB0ED6823E4" + }, { + "NIST P-521", 0, + "01C2A9EBF51592FE6589F618EAADA1697D9B2EC7CE5D48C9E80FC597642B23F1" + "F0EBE953449762BD3F094F57791D9850AFE98BBDA9872BE399B7BDD617860076BB03", + "0B822E27692F63DB8E12C59BB3CCA172B9BBF613CAE5F9D1474186E45E8B26FF" + "962084E1C6BE74821EDBB60941A3B75516F603719563433383812BFEA89EC14B89" + }, { + "NIST P-521", 0, + "99390F342C3F0D46E80C5B65C61E8AA8ACA0B6D4E1352404586364A05D8398E9" + "2BC71A644E8663F0A9B87D0B3ACAEE32F2AB9B321317AD23059D045EBAB91C5D93", + "82FCF93AE4467EB57766F2B150E736636727E7282500CD482DA70D153D195F2B" + "DF9B96D689A0DC1BB9137B41557A33F202F1B71840544CBEFF03072E77E4BB6F0B" + }, { + "NIST P-521", 1, + "018E48E80594FF5496D8CC7DF8A19D6AA18805A4EF4490038AED6A1E9AA18056" + "D0244A97DCF6D132C6804E3F4F369922119544B4C057D783C848FB798B48730A382C", + "01AF510B4F5E1C40BC9C110216D35E7C6D7A2BEE52914FC98258676288449901" + "F27A07EE91DF2D5D79259712906C3E18A990CBF35BCAC41A952820CE2BA8D0220080" + }, { + "NIST P-521", 1, + "ADCEF3539B4BC831DC0AFD173137A4426152058AFBAE06A17FCB89F4DB6E48B5" + "335CB88F8E4DB475A1E390E5656072F06605BFB84CBF9795B7992ECA04A8E10CA1", + "01BCB985AFD6404B9EDA49B6190AAA346BF7D5909CA440C0F7E505C62FAC8635" + "31D3EB7B2AC4DD4F4404E4B12E9D6D3C596179587F3724B1EFFF684CFDB4B21826B9" + } + }; + gpg_error_t err; + int tidx; + const char *lastcurve = NULL; + gcry_ctx_t ctx = NULL; + gcry_mpi_t qx = NULL; + gcry_mpi_t qy = NULL; + gcry_mpi_point_t Q; + int oncurve; + + wherestr = "point_on_curve"; + for (tidx=0; tidx < DIM (t); tidx++) + { + if (!t[tidx].curve) + { + if (!lastcurve || !ctx) + die ("invalid test vectors at idx %d\n", tidx); + } + else if (!ctx || !lastcurve || strcmp (t[tidx].curve, lastcurve)) + { + lastcurve = t[tidx].curve; + gcry_ctx_release (ctx); + err = gcry_mpi_ec_new (&ctx, NULL, lastcurve); + if (err) + die ("error creating context for curve %s at idx %d: %s\n", + lastcurve, tidx, gpg_strerror (err)); + + info ("checking points on curve %s\n", lastcurve); + } + + gcry_mpi_release (qx); + gcry_mpi_release (qy); + qx = hex2mpi (t[tidx].qx); + qy = hex2mpi (t[tidx].qy); + + Q = gcry_mpi_point_set (NULL, qx, qy, GCRYMPI_CONST_ONE); + if (!Q) + die ("gcry_mpi_point_set(Q) failed at idx %d\n", tidx); + + oncurve = gcry_mpi_ec_curve_point (Q, ctx); + + if (t[tidx].oncurve && !oncurve) + { + fail ("point expected on curve but not identified as such (i=%d):\n", + tidx); + print_point (" Q", Q); + } + else if (!t[tidx].oncurve && oncurve) + { + fail ("point not expected on curve but identified as such (i=%d):\n", + tidx); + print_point (" Q", Q); + } + gcry_mpi_point_release (Q); + } + + gcry_mpi_release (qx); + gcry_mpi_release (qy); + gcry_ctx_release (ctx); +} + + int main (int argc, char **argv) { @@ -1067,6 +1332,7 @@ main (int argc, char **argv) context_alloc (); context_param (); basic_ec_math (); + point_on_curve (); /* The tests are for P-192 and ed25519 which are not supported in FIPS mode. */ commit 6606ae44e0de1069b29dd4215ee9748280940e1b Author: Werner Koch Date: Tue Jun 5 14:29:53 2018 +0200 mpi: New internal function _gcry_mpi_cmpabs. * mpi/mpi-cmp.c (_gcry_mpi_cmp): Factor out to ... (do_mpi_cmp): New. Add arg absmode. (_gcry_mpi_cmpabs): New. 
* src/gcrypt-int.h (mpi_cmpabs): New macro. Signed-off-by: Werner Koch diff --git a/mpi/mpi-cmp.c b/mpi/mpi-cmp.c index 838a7c9..66e0961 100644 --- a/mpi/mpi-cmp.c +++ b/mpi/mpi-cmp.c @@ -54,15 +54,19 @@ _gcry_mpi_cmp_ui (gcry_mpi_t u, unsigned long v) } -int -_gcry_mpi_cmp (gcry_mpi_t u, gcry_mpi_t v) +/* Helper for _gcry_mpi_cmp and _gcry_mpi_cmpabs. */ +static int +do_mpi_cmp (gcry_mpi_t u, gcry_mpi_t v, int absmode) { mpi_size_t usize; mpi_size_t vsize; + int usign; + int vsign; int cmp; if (mpi_is_opaque (u) || mpi_is_opaque (v)) { + /* We have no signan and thus ABSMODE has no efeect here. */ if (mpi_is_opaque (u) && !mpi_is_opaque (v)) return -1; if (!mpi_is_opaque (u) && mpi_is_opaque (v)) @@ -82,26 +86,42 @@ _gcry_mpi_cmp (gcry_mpi_t u, gcry_mpi_t v) usize = u->nlimbs; vsize = v->nlimbs; + usign = absmode? 0 : u->sign; + vsign = absmode? 0 : v->sign; /* Compare sign bits. */ - if (!u->sign && v->sign) + if (!usign && vsign) return 1; - if (u->sign && !v->sign) + if (usign && !vsign) return -1; /* U and V are either both positive or both negative. */ - if (usize != vsize && !u->sign && !v->sign) + if (usize != vsize && !usign && !vsign) return usize - vsize; - if (usize != vsize && u->sign && v->sign) + if (usize != vsize && usign && vsign) return vsize + usize; if (!usize ) return 0; if (!(cmp = _gcry_mpih_cmp (u->d, v->d, usize))) return 0; - if ((cmp < 0?1:0) == (u->sign?1:0)) + if ((cmp < 0?1:0) == (usign?1:0)) return 1; } return -1; } + + +int +_gcry_mpi_cmp (gcry_mpi_t u, gcry_mpi_t v) +{ + return do_mpi_cmp (u, v, 0); +} + +/* Compare only the absolute values. */ +int +_gcry_mpi_cmpabs (gcry_mpi_t u, gcry_mpi_t v) +{ + return do_mpi_cmp (u, v, 1); +} diff --git a/src/gcrypt-int.h b/src/gcrypt-int.h index e88f868..7934f14 100644 --- a/src/gcrypt-int.h +++ b/src/gcrypt-int.h @@ -368,6 +368,7 @@ int _gcry_mpi_is_neg (gcry_mpi_t a); void _gcry_mpi_neg (gcry_mpi_t w, gcry_mpi_t u); void _gcry_mpi_abs (gcry_mpi_t w); int _gcry_mpi_cmp (const gcry_mpi_t u, const gcry_mpi_t v); +int _gcry_mpi_cmpabs (const gcry_mpi_t u, const gcry_mpi_t v); int _gcry_mpi_cmp_ui (const gcry_mpi_t u, unsigned long v); gpg_err_code_t _gcry_mpi_scan (gcry_mpi_t *ret_mpi, enum gcry_mpi_format format, const void *buffer, size_t buflen, @@ -469,6 +470,7 @@ int _gcry_mpi_get_flag (gcry_mpi_t a, enum gcry_mpi_flag flag); #define mpi_abs( w ) _gcry_mpi_abs( (w) ) #define mpi_neg( w, u) _gcry_mpi_neg( (w), (u) ) #define mpi_cmp( u, v ) _gcry_mpi_cmp( (u), (v) ) +#define mpi_cmpabs( u, v ) _gcry_mpi_cmpabs( (u), (v) ) #define mpi_cmp_ui( u, v ) _gcry_mpi_cmp_ui( (u), (v) ) #define mpi_is_neg( a ) _gcry_mpi_is_neg ((a)) ----------------------------------------------------------------------- Summary of changes: mpi/ec.c | 9 ++ mpi/mpi-cmp.c | 34 +++++-- src/gcrypt-int.h | 2 + tests/t-mpi-point.c | 266 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 304 insertions(+), 7 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From stefbon at gmail.com Sun Jun 10 11:25:27 2018 From: stefbon at gmail.com (Stef Bon) Date: Sun, 10 Jun 2018 11:25:27 +0200 Subject: Low level ops? Message-ID: Hi, I've got a ssh client to access sftp via fuse. Now I'm working on making parallel encryption and decryption work. I hope I can achieve some performance improvements. 
Now I'm asking whether "low level" function calls in gcrypt can make things run faster. Let me explain what I mean. When I look in cipher-cbc at the functions to encrypt and decrypt, I see that these functions first check the blocksize and the buffer (both of them). These checks are done over and over again, for every message. Does it slow things a bit? If so, it may be worth the effort to create encrypt/decrypt calls without these checks. In my application the length of the output buffer is always equal to the length of the input buffer. And the blocksize is always the default blocksize for the cipher. And in ssh the input buffer length is always a multiple of the blocksize (padding is done). It's also possible that these checks do not cost anything. I don't know. Stef Bon the Netherlands
From r030t1 at gmail.com Sun Jun 10 17:55:04 2018 From: r030t1 at gmail.com (R0b0t1) Date: Sun, 10 Jun 2018 10:55:04 -0500 Subject: Low level ops? In-Reply-To: References: Message-ID: On Sun, Jun 10, 2018 at 4:25 AM, Stef Bon wrote: > Hi, > > I've got a ssh client to access sftp via fuse. > Now I'm working on making parallel encryption and decryption work. I > hope I can achieve some performance improvements. > > Now I'm asking whether "low level" function calls in gcrypt can make > things run faster. Let me explain what I mean. When I look in > cipher-cbc at the functions to encrypt and decrypt, I see that these functions > first check the blocksize and the buffer (both of them). These checks are > done over and over again, for every message. Does it slow things a > bit? If so, it may be worth the effort to create encrypt/decrypt calls > without these checks. In my application the length of the output buffer > is always equal to the length of the input buffer. And the blocksize is > always the default blocksize for the cipher. And in ssh the input > buffer length is always a multiple of the blocksize (padding is > done). > > It's also possible that these checks do not cost anything. I don't know. > > You may want to look at https://panthema.net/2008/0714-cryptography-speedtest-comparison/. From memory, the conclusion is "don't use gcrypt." But yes, the change you are describing could save some time. A few libraries offer two API functions - one which is a wrapper, and then one which does not wrap the algorithm at all, should you always supply compliant input. Cheers, R0b0t1
From stefbon at gmail.com Sun Jun 10 20:18:10 2018 From: stefbon at gmail.com (Stef Bon) Date: Sun, 10 Jun 2018 20:18:10 +0200 Subject: Low level ops? In-Reply-To: References: Message-ID: Thank you, this gives a good overview, but it is also very dated. It may be outdated, don't you think? gcrypt has cipher code in assembly. I do not know about others, but I guess that's very good. Stef
From jussi.kivilinna at iki.fi Sun Jun 10 23:04:49 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 11 Jun 2018 00:04:49 +0300 Subject: Low level ops? In-Reply-To: References: Message-ID: <01be53bf-bdeb-29a8-caa8-d62e500d7a76@iki.fi> On 10.06.2018 18:55, R0b0t1 wrote: > On Sun, Jun 10, 2018 at 4:25 AM, Stef Bon wrote: >> Hi, >> >> I've got a ssh client to access sftp via fuse. >> Now I'm working on making parallel encryption and decryption work. I >> hope I can achieve some performance improvements. >> >> Now I'm asking whether "low level" function calls in gcrypt can make >> things run faster. Let me explain what I mean. When I look in >> cipher-cbc at the functions to encrypt and decrypt, I see that these functions >> first check the blocksize and the buffer (both of them).
These checks are >> done over and over again, for every message. Does it slow things a >> bit? If so, it may be worth the effort to create encrypt/decrypt calls >> without these checks. In my application the length of the output buffer >> is always equal to the length of the input buffer. And the blocksize is >> always the default blocksize for the cipher. And in ssh the input >> buffer length is always a multiple of the blocksize (padding is >> done). >> >> It's also possible that these checks do not cost anything. I don't know. >> > > You may want to look at > https://panthema.net/2008/0714-cryptography-speedtest-comparison/. > From memory, the conclusion is "don't use gcrypt." > That comparison is outdated, gcrypt 1.6 fixed many of the overhead issues in encryption/decryption/digest code paths. I reran part of the tests from panthema.net with libgcrypt 1.5 and 1.6 in 2013, the result pdfs are available at: http://jukivili.kapsi.fi/gcrypt/ -Jussi
From stefbon at gmail.com Mon Jun 11 08:37:59 2018 From: stefbon at gmail.com (Stef Bon) Date: Mon, 11 Jun 2018 08:37:59 +0200 Subject: Low level ops? In-Reply-To: <01be53bf-bdeb-29a8-caa8-d62e500d7a76@iki.fi> References: <01be53bf-bdeb-29a8-caa8-d62e500d7a76@iki.fi> Message-ID: Op zo 10 jun. 2018 om 23:05 schreef Jussi Kivilinna : > > > That comparison is outdated, gcrypt 1.6 fixed many of the overhead issues in encryption/decryption/digest code paths. I reran part of the tests from panthema.net with libgcrypt 1.5 and 1.6 in 2013, the result pdfs are available at: http://jukivili.kapsi.fi/gcrypt/ > Ah. Very good. Now that that's clear, should someone look at the original question? Can "low level" calls (stripped of checks) be a performance improvement? (leaving these checks to the application programmer) Stef
From jussi.kivilinna at iki.fi Mon Jun 11 22:09:55 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 11 Jun 2018 23:09:55 +0300 Subject: Low level ops? In-Reply-To: References: Message-ID: <534aeb69-8bf6-b8a9-6918-422b39cf956a@iki.fi> Hello, On 10.06.2018 12:25, Stef Bon wrote: > Hi, > > I've got a ssh client to access sftp via fuse. > Now I'm working on making parallel encryption and decryption work. I > hope I can achieve some performance improvements. > > Now I'm asking whether "low level" function calls in gcrypt can make > things run faster. Let me explain what I mean. When I look in > cipher-cbc at the functions to encrypt and decrypt, I see that these functions > first check the blocksize and the buffer (both of them). These checks are > done over and over again, for every message. Does it slow things a > bit? If so, it may be worth the effort to create encrypt/decrypt calls > without these checks. In my application the length of the output buffer > is always equal to the length of the input buffer. And the blocksize is > always the default blocksize for the cipher. And in ssh the input > buffer length is always a multiple of the blocksize (padding is > done). > This depends on the size of buffers that are passed to gcrypt. For best performance one should pass data in as large buffers as possible, although approx ~1024-2048 byte buffers give close to maximum throughput. I made quick tests for overhead by splitting input buffer in bench-slope to multiples of 16.
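(If you want to reproduce something like this yourself without bench-slope, a rough sketch of such a chunked measurement is below. It is only an illustration, with an arbitrary key, buffer size and timing method and no error checking; it is not the code that produced the numbers that follow.)

/* Sketch: measure gcry_cipher_encrypt() throughput when one large
 * buffer is pushed through in fixed-size chunks. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <gcrypt.h>

int main (void)
{
  enum { BUFSIZE = 1024 * 1024 };
  static const size_t chunks[] = { 16, 32, 64, 128, 256, 512, 1024, 2048 };
  unsigned char key[16] = { 0 }, iv[16] = { 0 };
  unsigned char *buf = calloc (1, BUFSIZE);
  gcry_cipher_hd_t hd;
  struct timespec t0, t1;
  size_t i, off;

  if (!gcry_check_version (GCRYPT_VERSION))
    return 1;
  gcry_control (GCRYCTL_INITIALIZATION_FINISHED, 0);

  for (i = 0; i < sizeof (chunks) / sizeof (chunks[0]); i++)
    {
      gcry_cipher_open (&hd, GCRY_CIPHER_AES128, GCRY_CIPHER_MODE_CBC, 0);
      gcry_cipher_setkey (hd, key, sizeof (key));
      gcry_cipher_setiv (hd, iv, sizeof (iv));

      clock_gettime (CLOCK_MONOTONIC, &t0);
      for (off = 0; off + chunks[i] <= BUFSIZE; off += chunks[i])
        gcry_cipher_encrypt (hd, buf + off, chunks[i], NULL, 0); /* in-place */
      clock_gettime (CLOCK_MONOTONIC, &t1);

      printf ("%4zu byte chunks: %.2f ns/byte\n", chunks[i],
              ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec))
              / (double) BUFSIZE);

      gcry_cipher_close (hd);
    }
  free (buf);
  return 0;
}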
Here's result for AES-CBC encryption with all HW acceleration disabled: AES-CBC/ENC: no-overhead test, full benchmark buffers, speed 13.25 cycles/byte AES-CBC/ENC: overhead test, benchmark buffers processed in 16 byte chunks, speed: 19.47 cycles/byte, overhead +46.9% AES-CBC/ENC: overhead test, benchmark buffers processed in 32 byte chunks, speed: 16.36 cycles/byte, overhead +23.4% AES-CBC/ENC: overhead test, benchmark buffers processed in 64 byte chunks, speed: 14.91 cycles/byte, overhead +12.5% AES-CBC/ENC: overhead test, benchmark buffers processed in 128 byte chunks, speed: 14.12 cycles/byte, overhead +6.6% AES-CBC/ENC: overhead test, benchmark buffers processed in 256 byte chunks, speed: 13.69 cycles/byte, overhead +3.3% AES-CBC/ENC: overhead test, benchmark buffers processed in 512 byte chunks, speed: 13.55 cycles/byte, overhead +2.3% AES-CBC/ENC: overhead test, benchmark buffers processed in 1024 byte chunks, speed: 13.42 cycles/byte, overhead +1.3% AES-CBC/ENC: overhead test, benchmark buffers processed in 2048 byte chunks, speed: 13.36 cycles/byte, overhead +0.8% Absolute overhead per gcry_cipher_encrypt call here is (19.47-13.25)*16 ? 100 cycles. Software AES implementation has higher overhead than other SW cipher implementations as AES needs to prefetch look-up tables for every gcrypt_encrypt call (side-channel mitigation). Also, with slower cipher, overhead will be smaller compared to encryption. For example, here's results for Serpent: SERPENT128-CBC/ENC: no-overhead test, full benchmark buffers, speed 38.23 cycles/byte SERPENT128-CBC/ENC: overhead test, benchmark buffers processed in 16 byte chunks, speed: 42.46 cycles/byte, overhead +11.1% SERPENT128-CBC/ENC: overhead test, benchmark buffers processed in 32 byte chunks, speed: 40.62 cycles/byte, overhead +6.2% SERPENT128-CBC/ENC: overhead test, benchmark buffers processed in 64 byte chunks, speed: 39.42 cycles/byte, overhead +3.1% SERPENT128-CBC/ENC: overhead test, benchmark buffers processed in 128 byte chunks, speed: 38.83 cycles/byte, overhead +1.6% SERPENT128-CBC/ENC: overhead test, benchmark buffers processed in 256 byte chunks, speed: 38.53 cycles/byte, overhead +0.8% SERPENT128-CBC/ENC: overhead test, benchmark buffers processed in 512 byte chunks, speed: 38.46 cycles/byte, overhead +0.6% SERPENT128-CBC/ENC: overhead test, benchmark buffers processed in 1024 byte chunks, speed: 38.36 cycles/byte, overhead +0.3% SERPENT128-CBC/ENC: overhead test, benchmark buffers processed in 2048 byte chunks, speed: 38.27 cycles/byte, overhead +0.1% Here overhead per call is ~70 cycles. With AES-NI accelerated AES, overhead shows up larger since encryption is fast. 
But with sufficiently large buffers overhead becomes insignificant: AES-CBC/ENC: no-overhead test, full benchmark buffers, speed 4.50 cycles/byte AES-CBC/ENC: overhead test, benchmark buffers processed in 16 byte chunks, speed: 7.09 cycles/byte, overhead +57.5% AES-CBC/ENC: overhead test, benchmark buffers processed in 32 byte chunks, speed: 5.82 cycles/byte, overhead +29.2% AES-CBC/ENC: overhead test, benchmark buffers processed in 64 byte chunks, speed: 5.16 cycles/byte, overhead +14.5% AES-CBC/ENC: overhead test, benchmark buffers processed in 128 byte chunks, speed: 4.83 cycles/byte, overhead +7.2% AES-CBC/ENC: overhead test, benchmark buffers processed in 256 byte chunks, speed: 4.67 cycles/byte, overhead +3.6% AES-CBC/ENC: overhead test, benchmark buffers processed in 512 byte chunks, speed: 4.58 cycles/byte, overhead +1.8% AES-CBC/ENC: overhead test, benchmark buffers processed in 1024 byte chunks, speed: 4.54 cycles/byte, overhead +0.9% AES-CBC/ENC: overhead test, benchmark buffers processed in 2048 byte chunks, speed: 4.52 cycles/byte, overhead +0.4% Per call overhead for AES-NI CBC/ENC, ~40 cycles. With parallelizable modes, such as CBC decryption and CTR, this test no longer measure actual overhead as underlying algorithm changes with different chunk sizes. With AES-NI on x86_64, CBC decryption is done with 8 parallel blocks (128 bytes), so results below for 16 to 64 chunks sizes show how slow non-parallel code is compared to parallel code. Results starting with 128 chunks sizes show overhead for parallel code: AES-CBC/DEC: no-overhead test, full benchmark buffers, speed 0.632 cycles/byte AES-CBC/DEC: overhead test, benchmark buffers processed in 16 byte chunks, speed: 6.56 cycles/byte, overhead +937.4% AES-CBC/DEC: overhead test, benchmark buffers processed in 32 byte chunks, speed: 3.94 cycles/byte, overhead +523.3% AES-CBC/DEC: overhead test, benchmark buffers processed in 64 byte chunks, speed: 1.88 cycles/byte, overhead +197.6% AES-CBC/DEC: overhead test, benchmark buffers processed in 128 byte chunks, speed: 1.26 cycles/byte, overhead +99.4% AES-CBC/DEC: overhead test, benchmark buffers processed in 256 byte chunks, speed: 0.946 cycles/byte, overhead +49.7% AES-CBC/DEC: overhead test, benchmark buffers processed in 512 byte chunks, speed: 0.794 cycles/byte, overhead +25.5% AES-CBC/DEC: overhead test, benchmark buffers processed in 1024 byte chunks, speed: 0.711 cycles/byte, overhead +12.5% AES-CBC/DEC: overhead test, benchmark buffers processed in 2048 byte chunks, speed: 0.664 cycles/byte, overhead +5.0% Per call overhead for 128 byte chunks, ~80 cycles. > It's also possibe that these checks do not cost anything. I don't know. Checks should not cost that much .. stack burning, and prefetching (SW AES) cost more, but you'd probably wont want to remove those. -Jussi > > Stef Bon > the Netherlands > > _______________________________________________ > Gcrypt-devel mailing list > Gcrypt-devel at gnupg.org > http://lists.gnupg.org/mailman/listinfo/gcrypt-devel > From cvs at cvs.gnupg.org Wed Jun 13 09:00:15 2018 From: cvs at cvs.gnupg.org (by NIIBE Yutaka) Date: Wed, 13 Jun 2018 09:00:15 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.8.1-71-g9010d15 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". 
The branch, master has been updated via 9010d1576e278a4274ad3f4aa15776c28f6ba965 (commit) from 7b6c2afd699e889f5f054cc3d202a61bd0ee1dcf (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 9010d1576e278a4274ad3f4aa15776c28f6ba965 Author: NIIBE Yutaka Date: Wed Jun 13 15:28:58 2018 +0900 ecc: Add blinding for ECDSA. * cipher/ecc-ecdsa.c (_gcry_ecc_ecdsa_sign): Blind secret D with randomized nonce B. -- Reported-by: Keegan Ryan CVE-id: CVE-2018-0495 Signed-off-by: NIIBE Yutaka diff --git a/cipher/ecc-ecdsa.c b/cipher/ecc-ecdsa.c index 1484830..140e8c0 100644 --- a/cipher/ecc-ecdsa.c +++ b/cipher/ecc-ecdsa.c @@ -50,6 +50,8 @@ _gcry_ecc_ecdsa_sign (gcry_mpi_t input, ECC_secret_key *skey, const void *abuf; unsigned int abits, qbits; mpi_ec_t ctx; + gcry_mpi_t b; /* Random number needed for blinding. */ + gcry_mpi_t bi; /* multiplicative inverse of B. */ if (DBG_CIPHER) log_mpidump ("ecdsa sign hash ", input ); @@ -61,6 +63,15 @@ _gcry_ecc_ecdsa_sign (gcry_mpi_t input, ECC_secret_key *skey, if (rc) return rc; + b = mpi_snew (qbits); + bi = mpi_snew (qbits); + do + { + _gcry_mpi_randomize (b, qbits, GCRY_WEAK_RANDOM); + mpi_mod (b, b, skey->E.n); + } + while (!mpi_invm (bi, b, skey->E.n)); + k = NULL; dr = mpi_alloc (0); sum = mpi_alloc (0); @@ -115,8 +126,11 @@ _gcry_ecc_ecdsa_sign (gcry_mpi_t input, ECC_secret_key *skey, } while (!mpi_cmp_ui (r, 0)); - mpi_mulm (dr, skey->d, r, skey->E.n); /* dr = d*r mod n */ - mpi_addm (sum, hash, dr, skey->E.n); /* sum = hash + (d*r) mod n */ + mpi_mulm (dr, b, skey->d, skey->E.n); + mpi_mulm (dr, dr, r, skey->E.n); /* dr = d*r mod n (blinded with b) */ + mpi_mulm (sum, b, hash, skey->E.n); + mpi_addm (sum, sum, dr, skey->E.n); /* sum = hash + (d*r) mod n (blinded with b) */ + mpi_mulm (sum, bi, sum, skey->E.n); /* undo blinding by b^-1 */ mpi_invm (k_1, k, skey->E.n); /* k_1 = k^(-1) mod n */ mpi_mulm (s, k_1, sum, skey->E.n); /* s = k^(-1)*(hash+(d*r)) mod n */ } @@ -129,6 +143,8 @@ _gcry_ecc_ecdsa_sign (gcry_mpi_t input, ECC_secret_key *skey, } leave: + mpi_free (b); + mpi_free (bi); _gcry_mpi_ec_free (ctx); point_free (&I); mpi_free (x); ----------------------------------------------------------------------- Summary of changes: cipher/ecc-ecdsa.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Wed Jun 13 10:37:37 2018 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Wed, 13 Jun 2018 10:37:37 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.8.1-72-g0d51ea9 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 0d51ea9b88b618368a7b916f26ebfe61bdf70503 (commit) from 9010d1576e278a4274ad3f4aa15776c28f6ba965 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. 
- Log ----------------------------------------------------------------- commit 0d51ea9b88b618368a7b916f26ebfe61bdf70503 Author: Werner Koch Date: Wed Jun 13 10:37:59 2018 +0200 Add NEWS from the 1.8 and 1.7 branches. -- diff --git a/AUTHORS b/AUTHORS index 49ab941..eb24236 100644 --- a/AUTHORS +++ b/AUTHORS @@ -21,7 +21,7 @@ year that would otherwise be listed individually. List of Copyright holders ========================= - Copyright (C) 1989,1991-2017 Free Software Foundation, Inc. + Copyright (C) 1989,1991-2018 Free Software Foundation, Inc. Copyright (C) 1994 X Consortium Copyright (C) 1996 L. Peter Deutsch Copyright (C) 1997 Werner Koch @@ -30,14 +30,14 @@ List of Copyright holders Copyright (C) 1996-2006 Peter Gutmann, Matt Thomlinson and Blake Coverett Copyright (C) 2003 Nikos Mavroyanopoulos Copyright (C) 2006-2007 NTT (Nippon Telegraph and Telephone Corporation) - Copyright (C) 2012-2017 g10 Code GmbH + Copyright (C) 2012-2018 g10 Code GmbH Copyright (C) 2012 Simon Josefsson, Niels M?ller Copyright (c) 2012 Intel Corporation Copyright (C) 2013 Christian Grothoff Copyright (C) 2013-2017 Jussi Kivilinna Copyright (C) 2013-2014 Dmitry Eremin-Solenikov Copyright (C) 2014 Stephan Mueller - Copyright (C) 2017 Bundesamt f?r Sicherheit in der Informationstechnik + Copyright (C) 2018 Bundesamt f?r Sicherheit in der Informationstechnik Authors with a FSF copyright assignment diff --git a/NEWS b/NEWS index 8049d7d..a4841b3 100644 --- a/NEWS +++ b/NEWS @@ -1,12 +1,42 @@ Noteworthy changes in version 1.9.0 (unreleased) [C22/A3/R0] ------------------------------------------------ + * Bug fixes + + - Use blinding for ECDSA signing to mitigate a novel side-channel + attack. [#4011,CVE-2018-0495] [also in 1.8.3, 1.7.10] + + - Fix incorrect counter overflow handling for GCM when using an IV + size other than 96 bit. [#3764] [also in 1.8.3, 1.7.10] + + - Fix incorrect output of AES-keywrap mode for in-place encryption + on some platforms. [also in 1.8.3, 1.7.10] + + - Fix the gcry_mpi_ec_curve_point point validation function. + [also in 1.8.3, 1.7.10] + + - Fix rare assertion failure in gcry_prime_check. [also in 1.8.3] + + - Do not use /dev/srandom on OpenBSD. [also in 1.8.2] + + - Fix test suite failure on systems with large pages. [#3351] + [also in 1.8.2] + + - Fix test suite to not use mmap on Windows. [also in 1.8.2] + + - Fix fatal out of secure memory status in the s-expression parser + on heavy loaded systems. [also in 1.8.2] * Interface changes relative to the 1.8.0 release: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ gcry_mpi_get_ui NEW function. GCRYCTL_AUTO_EXPAND_SECMEM NEW control code. 
+ * Release dates of 1.8.x versions: + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Version 1.8.2 (2017-12-13) + Version 1.8.3 (2018-06-13) + Noteworthy changes in version 1.8.1 (2017-08-27) [C22/A2/R1] ------------------------------------------------ @@ -129,11 +159,13 @@ Noteworthy changes in version 1.8.0 (2017-07-18) [C22/A2/R0] * Release dates of 1.7.x versions: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Version 1.7.8 (2017-06-29) [C21/A1/R8] - Version 1.7.7 (2017-06-02) [C21/A1/R7] - Version 1.7.6 (2017-01-18) [C21/A1/R6] - Version 1.7.5 (2016-12-15) [C21/A1/R5] - Version 1.7.4 (2016-12-09) [C21/A1/R4] + Version 1.7.10 (2018-06-13) [C21/A1/R10] + Version 1.7.9 (2017-08-27) [C21/A1/R9] + Version 1.7.8 (2017-06-29) [C21/A1/R8] + Version 1.7.7 (2017-06-02) [C21/A1/R7] + Version 1.7.6 (2017-01-18) [C21/A1/R6] + Version 1.7.5 (2016-12-15) [C21/A1/R5] + Version 1.7.4 (2016-12-09) [C21/A1/R4] Noteworthy changes in version 1.7.3 (2016-08-17) [C21/A1/R3] diff --git a/README b/README index c14181a..7ac8e4a 100644 --- a/README +++ b/README @@ -1,10 +1,10 @@ Libgcrypt - The GNU Crypto Library ------------------------------------ - Version 1.7 + Version 1.8 - Copyright (C) 1989,1991-2017 Free Software Foundation, Inc. - Copyright (C) 2012-2017 g10 Code GmbH - Copyright (C) 2013-2017 Jussi Kivilinna + Copyright (C) 1989,1991-2018 Free Software Foundation, Inc. + Copyright (C) 2012-2018 g10 Code GmbH + Copyright (C) 2013-2018 Jussi Kivilinna Libgcrypt is free software. See the file AUTHORS for full copying notices, and LICENSES for notices about contributions that require diff --git a/compat/compat.c b/compat/compat.c index b835293..8b001de 100644 --- a/compat/compat.c +++ b/compat/compat.c @@ -30,9 +30,9 @@ _gcry_compat_identification (void) static const char blurb[] = "\n\n" "This is Libgcrypt " PACKAGE_VERSION " - The GNU Crypto Library\n" - "Copyright (C) 2000-2017 Free Software Foundation, Inc.\n" - "Copyright (C) 2012-2017 g10 Code GmbH\n" - "Copyright (C) 2013-2017 Jussi Kivilinna\n" + "Copyright (C) 2000-2018 Free Software Foundation, Inc.\n" + "Copyright (C) 2012-2018 g10 Code GmbH\n" + "Copyright (C) 2013-2018 Jussi Kivilinna\n" "\n" "(" BUILD_REVISION " " BUILD_TIMESTAMP ")\n" "\n\n"; diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index a1cb15a..d2dfe80 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -1,6 +1,6 @@ /* gcrypt.h - GNU Cryptographic Library Interface -*- c -*- - * Copyright (C) 1998-2017 Free Software Foundation, Inc. - * Copyright (C) 2012-2017 g10 Code GmbH + * Copyright (C) 1998-2018 Free Software Foundation, Inc. + * Copyright (C) 2012-2018 g10 Code GmbH * * This file is part of Libgcrypt. 
* ----------------------------------------------------------------------- Summary of changes: AUTHORS | 6 +++--- NEWS | 42 +++++++++++++++++++++++++++++++++++++----- README | 8 ++++---- compat/compat.c | 6 +++--- src/gcrypt.h.in | 4 ++-- 5 files changed, 49 insertions(+), 17 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From ametzler at bebt.de Wed Jun 13 19:13:05 2018 From: ametzler at bebt.de (Andreas Metzler) Date: Wed, 13 Jun 2018 19:13:05 +0200 Subject: Libgcrypt 1.8.3 and 1.7.10 to fix CVE-2018-0495 In-Reply-To: <87efhabqhc.fsf@wheatstone.g10code.de> References: <87efhabqhc.fsf@wheatstone.g10code.de> Message-ID: <20180613171305.GA1391@argenau.bebt.de> Hello, I think AUTHORS needs a tiny update: - Copyright (C) 2013-2017 Jussi Kivilinna + Copyright (C) 2013-2012 Jussi Kivilinna See the header of libgcrypt-1.8.3/compat/compat.c. cu Andreas -- `What a good friend you are to him, Dr. Maturin. His other friends are so grateful to you.' `I sew his ears on from time to time, sure' From stefbon at gmail.com Wed Jun 13 21:21:19 2018 From: stefbon at gmail.com (Stef Bon) Date: Wed, 13 Jun 2018 21:21:19 +0200 Subject: Low level ops? In-Reply-To: <534aeb69-8bf6-b8a9-6918-422b39cf956a@iki.fi> References: <534aeb69-8bf6-b8a9-6918-422b39cf956a@iki.fi> Message-ID: Op ma 11 jun. 2018 om 22:10 schreef Jussi Kivilinna : > > Hello, > > > With parallelizable modes, such as CBC decryption and CTR, this test > no longer measure actual overhead as underlying algorithm changes with > different chunk sizes. With AES-NI on x86_64, CBC decryption is done > with 8 parallel blocks (128 bytes), so results below for 16 to 64 chunks > sizes show how slow non-parallel code is compared to parallel code. > Results starting with 128 chunks sizes show overhead for parallel code: > > AES-CBC/DEC: no-overhead test, full benchmark buffers, speed 0.632 cycles/byte > AES-CBC/DEC: overhead test, benchmark buffers processed in 16 byte chunks, speed: 6.56 cycles/byte, overhead +937.4% > AES-CBC/DEC: overhead test, benchmark buffers processed in 32 byte chunks, speed: 3.94 cycles/byte, overhead +523.3% > AES-CBC/DEC: overhead test, benchmark buffers processed in 64 byte chunks, speed: 1.88 cycles/byte, overhead +197.6% > AES-CBC/DEC: overhead test, benchmark buffers processed in 128 byte chunks, speed: 1.26 cycles/byte, overhead +99.4% > AES-CBC/DEC: overhead test, benchmark buffers processed in 256 byte chunks, speed: 0.946 cycles/byte, overhead +49.7% > AES-CBC/DEC: overhead test, benchmark buffers processed in 512 byte chunks, speed: 0.794 cycles/byte, overhead +25.5% > AES-CBC/DEC: overhead test, benchmark buffers processed in 1024 byte chunks, speed: 0.711 cycles/byte, overhead +12.5% > AES-CBC/DEC: overhead test, benchmark buffers processed in 2048 byte chunks, speed: 0.664 cycles/byte, overhead +5.0% > > Per call overhead for 128 byte chunks, ~80 cycles. > Thanks for the results! I had to take some time to read the results. I'm not used to interpret these numbers. As I understand it the size of the data to be en/decrypted makes the percentage of the overhead lower. That makes sense. It looks a lot like the overhead takes a fixed amount of time (or cycles), independent of the chunk size and that the overall time is linear increasing with the chunk size, since the overhead is divided by two when the chunk size is multiplied by 2. 
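(Putting rough numbers on that, using the AES-CBC/ENC figures from earlier in the thread: a fixed cost of roughly 100 cycles per call spread over a 16-byte chunk adds about 100/16 = 6.25 cycles/byte, which matches the jump from 13.25 to 19.47 cycles/byte, while spread over a 1024-byte chunk the same fixed cost adds only about 0.1 cycles/byte.)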
With ssh/sftp you cannot control the size of the data. Mostly (I'm not sure) they vary from 128 bytes for a stat to 4096 for directory listings and reading files. So as I see it, it would be worth trying to bring down the overhead for AES-CBC/DEC, since it varies from 99% to 12.5% and the size of most ssh messages is between 128 and 1024 bytes. You mention parallel mode for AES-CBC/DEC. Is it possible to use this from the api? And do you know how this applies to chacha20-poly1305 at openssh.com? I'm working on parallel en/decryption of two or more messages at the same time. Stef > > It's also possible that these checks do not cost anything. I don't know.
From wk at gnupg.org Thu Jun 14 08:29:24 2018 From: wk at gnupg.org (Werner Koch) Date: Thu, 14 Jun 2018 08:29:24 +0200 Subject: Libgcrypt 1.8.3 and 1.7.10 to fix CVE-2018-0495 In-Reply-To: <20180613171305.GA1391@argenau.bebt.de> (Andreas Metzler's message of "Wed, 13 Jun 2018 19:13:05 +0200") References: <87efhabqhc.fsf@wheatstone.g10code.de> <20180613171305.GA1391@argenau.bebt.de> Message-ID: <87602lc24b.fsf@wheatstone.g10code.de> On Wed, 13 Jun 2018 19:13, ametzler at bebt.de said: > I think AUTHORS needs a tiny update: > - Copyright (C) 2013-2017 Jussi Kivilinna > + Copyright (C) 2013-2012 Jussi Kivilinna Ooops. It should be 2018 of course. Sorry. Shalom-Salam, Werner -- # Please read: Daniel Ellsberg - The Doomsday Machine # Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
From rms at gnu.org Sat Jun 16 00:25:40 2018 From: rms at gnu.org (Richard Stallman) Date: Fri, 15 Jun 2018 18:25:40 -0400 Subject: Libgcrypt 1.8.3 and 1.7.10 to fix CVE-2018-0495 In-Reply-To: <87efhabqhc.fsf@wheatstone.g10code.de> (message from Werner Koch on Wed, 13 Jun 2018 18:28:31 +0200) References: <87efhabqhc.fsf@wheatstone.g10code.de> Message-ID: [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] Congratulations on the new release. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
From cvs at cvs.gnupg.org Tue Jun 19 05:17:04 2018 From: cvs at cvs.gnupg.org (by Will Dietz) Date: Tue, 19 Jun 2018 05:17:04 +0200 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.8.1-73-g355f5b7 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 355f5b7f69075c010fe33aa5b10ac60c08fae0c7 (commit) from 0d51ea9b88b618368a7b916f26ebfe61bdf70503 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 355f5b7f69075c010fe33aa5b10ac60c08fae0c7 Author: Will Dietz Date: Sun Jun 17 18:53:58 2018 -0500 random: Fix hang of _gcry_rndjent_get_version. * random/rndjent.c (_gcry_rndjent_get_version): Move locking. -- While the protection for jent_rng_collector is needed, _gcry_rndjent_poll is also acquiring the lock for the variable. Thus, it hangs.
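As a minimal illustration of the hang pattern described above (a simplified, hypothetical stand-in, not the actual rndjent code), consider a non-recursive mutex taken again by a function that is called while the lock is already held:

/* Hypothetical sketch of the double-lock hang; with a default
 * (non-recursive) pthread mutex the second lock never succeeds. */
#include <pthread.h>

static pthread_mutex_t rng_lock = PTHREAD_MUTEX_INITIALIZER;

static void
poll_rng (void)            /* stands in for _gcry_rndjent_poll() */
{
  pthread_mutex_lock (&rng_lock);    /* second acquisition: blocks forever */
  /* ... gather entropy ... */
  pthread_mutex_unlock (&rng_lock);
}

static void
get_version (void)         /* stands in for _gcry_rndjent_get_version() */
{
  pthread_mutex_lock (&rng_lock);    /* first acquisition */
  poll_rng ();                       /* never returns */
  pthread_mutex_unlock (&rng_lock);
}

int main (void) { get_version (); return 0; }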
This change is sub-optimal, the lock is once released after the call of _gcry_rndjent_poll. It might be good to modify the API of _gcry_rndjent_poll to explicitly allow this use case of forcing initialization keeping the lock. Comments and change log entry by gniibe. GnuPG-bug-id: 4034 Fixes-commit: 0de2a22fcf6607d0aecb550feefa414cee3731b2 diff --git a/random/rndjent.c b/random/rndjent.c index 0c5a820..3740ddd 100644 --- a/random/rndjent.c +++ b/random/rndjent.c @@ -334,9 +334,10 @@ _gcry_rndjent_get_version (int *r_active) { if (r_active) { - lock_rng (); /* Make sure the RNG is initialized. */ _gcry_rndjent_poll (NULL, 0, 0); + + lock_rng (); /* To ease debugging we store 2 for a clock_gettime based * implementation and 1 for a rdtsc based code. */ *r_active = jent_rng_collector? is_rng_available () : 0; ----------------------------------------------------------------------- Summary of changes: random/rndjent.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From stefbon at gmail.com Tue Jun 19 07:27:22 2018 From: stefbon at gmail.com (Stef Bon) Date: Tue, 19 Jun 2018 07:27:22 +0200 Subject: Low level ops? In-Reply-To: References: <534aeb69-8bf6-b8a9-6918-422b39cf956a@iki.fi> Message-ID: Op wo 13 jun. 2018 om 21:21 schreef Stef Bon : > > > So as I see it it would be worth to try to bring back the overhead for > AES-CBC//DEC since they vary from 99% to 12,5%, since the size most > ssh messages is between 128 and 1024 bytes. > > You mention parallel mode for AES-CBC/DEC. Is it possible to use this > from the api? > And do you know what counts for chacha20-poly1305 at openssh.com? > Hi, can you please take a look at my remarks. I think that it's usefull to reduce the overhead for the mentioned ciphers. And what about chacha20-poly1305 at openssh.com? An about controlling the parallel handling through the api? Thanks, Stef From jussi.kivilinna at iki.fi Tue Jun 19 17:44:44 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 18:44:44 +0300 Subject: [PATCH 1/9] tests/basic: silence GCC-8 warning Message-ID: <20180619154444.6130-1-jussi.kivilinna@iki.fi> * tests/basic.c (check_ofb_cipher, check_stream_cipher): Change tv[].data[].inlen type from signed to unsigned integer. 
-- Patch silences new GCC-8 compiler warning: '__builtin_memcmp_eq' specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=] Signed-off-by: Jussi Kivilinna --- tests/basic.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/basic.c b/tests/basic.c index 42ee819e7..f3d895153 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -1112,7 +1112,7 @@ check_ofb_cipher (void) struct data { unsigned char plaintext[MAX_DATA_LEN]; - int inlen; + unsigned int inlen; char out[MAX_DATA_LEN]; } data[MAX_DATA_LEN]; @@ -5660,7 +5660,7 @@ check_stream_cipher (void) const char *iv; struct data { - int inlen; + unsigned int inlen; const char *plaintext; const char *out; } data[MAX_DATA_LEN]; -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 17:51:14 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 18:51:14 +0300 Subject: [PATCH 2/9] Fix CBC-CTS+CBC-MAC flag check Message-ID: <20180619155121.7122-1-jussi.kivilinna@iki.fi> * cipher/cipher.c (_gcry_cipher_open_internal): Check flags separately instead of AND masking two flags to zero. -- Signed-off-by: Jussi Kivilinna --- cipher/cipher.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cipher/cipher.c b/cipher/cipher.c index d6cd0b42e..1b547a4b7 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -503,7 +503,7 @@ _gcry_cipher_open_internal (gcry_cipher_hd_t *handle, | GCRY_CIPHER_ENABLE_SYNC | GCRY_CIPHER_CBC_CTS | GCRY_CIPHER_CBC_MAC)) - || (flags & GCRY_CIPHER_CBC_CTS & GCRY_CIPHER_CBC_MAC))) + || ((flags & GCRY_CIPHER_CBC_CTS) && (flags & GCRY_CIPHER_CBC_MAC)))) err = GPG_ERR_CIPHER_ALGO; /* check that a valid mode has been requested */ -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 17:51:15 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 18:51:15 +0300 Subject: [PATCH 3/9] Avoid division by spec->blocksize in cipher mode handlers In-Reply-To: <20180619155121.7122-1-jussi.kivilinna@iki.fi> References: <20180619155121.7122-1-jussi.kivilinna@iki.fi> Message-ID: <20180619155121.7122-2-jussi.kivilinna@iki.fi> * cipher/cipher-internal.h (_gcry_blocksize_shift): New. * cipher/cipher-cbc.c (_gcry_cipher_cbc_encrypt) (_gcry_cipherp_cbc_decrypt): Use bit-level operations instead of division to get number of blocks and check input length against blocksize. * cipher/cipher-cfb.c (_gcry_cipher_cfb_encrypt) (_gcry_cipher_cfb_decrypt): Ditto. * cipher/cipher-cmac.c (_gcry_cmac_write): Ditto. * cipher/cipher-ctr.c (_gcry_cipher_ctr_crypt): Ditto. * cipher/cipher-ofb.c (_gcry_cipher_ofb_encrypt) (_gcry_cipher_ofb_decrypt): Ditto. -- Integer division was causing 10 to 20 cycles per call overhead for cipher modes on x86-64. 
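The idea behind the change, sketched in isolation (the cipher modes only ever use 8- or 16-byte blocks, so the divisor can be replaced by a shift of 3 or 4 and the remainder by a mask; variable names here are illustrative):

/* Division/modulo by a runtime blocksize replaced by shift/mask;
 * valid because blocksize is known to be either 8 or 16. */
size_t shift   = (blocksize == 8) ? 3 : 4;
size_t nblocks = inbuflen >> shift;            /* was: inbuflen / blocksize */
size_t tail    = inbuflen & (blocksize - 1);   /* was: inbuflen % blocksize */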
Signed-off-by: Jussi Kivilinna --- cipher/cipher-cbc.c | 50 ++++++++++++++++++---------------------- cipher/cipher-cfb.c | 32 ++++++++++--------------- cipher/cipher-cmac.c | 16 +++++-------- cipher/cipher-ctr.c | 16 +++++-------- cipher/cipher-internal.h | 10 ++++++++ cipher/cipher-ofb.c | 8 ++----- 6 files changed, 58 insertions(+), 74 deletions(-) diff --git a/cipher/cipher-cbc.c b/cipher/cipher-cbc.c index 95c49b2b6..7951f34b6 100644 --- a/cipher/cipher-cbc.c +++ b/cipher/cipher-cbc.c @@ -39,20 +39,17 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, size_t n; unsigned char *ivp; int i; - size_t blocksize = c->spec->blocksize; + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; + size_t blocksize_mask = blocksize - 1; gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; - size_t nblocks = inbuflen / blocksize; + size_t nblocks = inbuflen >> blocksize_shift; unsigned int burn, nburn; - /* Tell compiler that we require a cipher with a 64bit or 128 bit block - * length, to allow better optimization of this function. */ - if (blocksize > 16 || blocksize < 8 || blocksize & (8 - 1)) - return GPG_ERR_INV_LENGTH; - - if (outbuflen < ((c->flags & GCRY_CIPHER_CBC_MAC)? blocksize : inbuflen)) + if (outbuflen < ((c->flags & GCRY_CIPHER_CBC_MAC) ? blocksize : inbuflen)) return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % blocksize) + if ((inbuflen & blocksize_mask) && !(inbuflen > blocksize && (c->flags & GCRY_CIPHER_CBC_CTS))) return GPG_ERR_INV_LENGTH; @@ -61,7 +58,7 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, if ((c->flags & GCRY_CIPHER_CBC_CTS) && inbuflen > blocksize) { - if ((inbuflen % blocksize) == 0) + if ((inbuflen & blocksize_mask) == 0) nblocks--; } @@ -69,9 +66,9 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, { c->bulk.cbc_enc (&c->context.c, c->u_iv.iv, outbuf, inbuf, nblocks, (c->flags & GCRY_CIPHER_CBC_MAC)); - inbuf += nblocks * blocksize; + inbuf += nblocks << blocksize_shift; if (!(c->flags & GCRY_CIPHER_CBC_MAC)) - outbuf += nblocks * blocksize; + outbuf += nblocks << blocksize_shift; } else { @@ -83,7 +80,7 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, nburn = enc_fn ( &c->context.c, outbuf, outbuf ); burn = nburn > burn ? nburn : burn; ivp = outbuf; - inbuf += blocksize; + inbuf += blocksize; if (!(c->flags & GCRY_CIPHER_CBC_MAC)) outbuf += blocksize; } @@ -99,10 +96,10 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, size_t restbytes; unsigned char b; - if ((inbuflen % blocksize) == 0) + if ((inbuflen & blocksize_mask) == 0) restbytes = blocksize; else - restbytes = inbuflen % blocksize; + restbytes = inbuflen & blocksize_mask; outbuf -= blocksize; for (ivp = c->u_iv.iv, i = 0; i < restbytes; i++) @@ -133,20 +130,17 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, { size_t n; int i; - size_t blocksize = c->spec->blocksize; + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; + size_t blocksize_mask = blocksize - 1; gcry_cipher_decrypt_t dec_fn = c->spec->decrypt; - size_t nblocks = inbuflen / blocksize; + size_t nblocks = inbuflen >> blocksize_shift; unsigned int burn, nburn; - /* Tell compiler that we require a cipher with a 64bit or 128 bit block - * length, to allow better optimization of this function. 
*/ - if (blocksize > 16 || blocksize < 8 || blocksize & (8 - 1)) - return GPG_ERR_INV_LENGTH; - if (outbuflen < inbuflen) return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen % blocksize) + if ((inbuflen & blocksize_mask) && !(inbuflen > blocksize && (c->flags & GCRY_CIPHER_CBC_CTS))) return GPG_ERR_INV_LENGTH; @@ -156,7 +150,7 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, if ((c->flags & GCRY_CIPHER_CBC_CTS) && inbuflen > blocksize) { nblocks--; - if ((inbuflen % blocksize) == 0) + if ((inbuflen & blocksize_mask) == 0) nblocks--; buf_cpy (c->lastiv, c->u_iv.iv, blocksize); } @@ -164,8 +158,8 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, if (c->bulk.cbc_dec) { c->bulk.cbc_dec (&c->context.c, c->u_iv.iv, outbuf, inbuf, nblocks); - inbuf += nblocks * blocksize; - outbuf += nblocks * blocksize; + inbuf += nblocks << blocksize_shift; + outbuf += nblocks << blocksize_shift; } else { @@ -186,10 +180,10 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, { size_t restbytes; - if ((inbuflen % blocksize) == 0) + if ((inbuflen & blocksize_mask) == 0) restbytes = blocksize; else - restbytes = inbuflen % blocksize; + restbytes = inbuflen & blocksize_mask; buf_cpy (c->lastiv, c->u_iv.iv, blocksize ); /* Save Cn-2. */ buf_cpy (c->u_iv.iv, inbuf + blocksize, restbytes ); /* Save Cn. */ diff --git a/cipher/cipher-cfb.c b/cipher/cipher-cfb.c index c888e70a8..7f00aee5c 100644 --- a/cipher/cipher-cfb.c +++ b/cipher/cipher-cfb.c @@ -37,15 +37,11 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, { unsigned char *ivp; gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; - size_t blocksize = c->spec->blocksize; + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; size_t blocksize_x_2 = blocksize + blocksize; unsigned int burn, nburn; - /* Tell compiler that we require a cipher with a 64bit or 128 bit block - * length, to allow better optimization of this function. */ - if (blocksize > 16 || blocksize < 8 || blocksize & (8 - 1)) - return GPG_ERR_INV_LENGTH; - if (outbuflen < inbuflen) return GPG_ERR_BUFFER_TOO_SHORT; @@ -77,11 +73,11 @@ _gcry_cipher_cfb_encrypt (gcry_cipher_hd_t c, also allows to use a bulk encryption function if available. */ if (inbuflen >= blocksize_x_2 && c->bulk.cfb_enc) { - size_t nblocks = inbuflen / blocksize; + size_t nblocks = inbuflen >> blocksize_shift; c->bulk.cfb_enc (&c->context.c, c->u_iv.iv, outbuf, inbuf, nblocks); - outbuf += nblocks * blocksize; - inbuf += nblocks * blocksize; - inbuflen -= nblocks * blocksize; + outbuf += nblocks << blocksize_shift; + inbuf += nblocks << blocksize_shift; + inbuflen -= nblocks << blocksize_shift; } else { @@ -139,15 +135,11 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, { unsigned char *ivp; gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; - size_t blocksize = c->spec->blocksize; + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; size_t blocksize_x_2 = blocksize + blocksize; unsigned int burn, nburn; - /* Tell compiler that we require a cipher with a 64bit or 128 bit block - * length, to allow better optimization of this function. */ - if (blocksize > 16 || blocksize < 8 || blocksize & (8 - 1)) - return GPG_ERR_INV_LENGTH; - if (outbuflen < inbuflen) return GPG_ERR_BUFFER_TOO_SHORT; @@ -179,11 +171,11 @@ _gcry_cipher_cfb_decrypt (gcry_cipher_hd_t c, also allows to use a bulk encryption function if available. 
*/ if (inbuflen >= blocksize_x_2 && c->bulk.cfb_dec) { - size_t nblocks = inbuflen / blocksize; + size_t nblocks = inbuflen >> blocksize_shift; c->bulk.cfb_dec (&c->context.c, c->u_iv.iv, outbuf, inbuf, nblocks); - outbuf += nblocks * blocksize; - inbuf += nblocks * blocksize; - inbuflen -= nblocks * blocksize; + outbuf += nblocks << blocksize_shift; + inbuf += nblocks << blocksize_shift; + inbuflen -= nblocks << blocksize_shift; } else { diff --git a/cipher/cipher-cmac.c b/cipher/cipher-cmac.c index 30567b7fc..321ab9eab 100644 --- a/cipher/cipher-cmac.c +++ b/cipher/cipher-cmac.c @@ -38,7 +38,8 @@ _gcry_cmac_write (gcry_cipher_hd_t c, gcry_cmac_context_t *ctx, const byte * inbuf, size_t inlen) { gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; - const unsigned int blocksize = c->spec->blocksize; + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; byte outbuf[MAX_BLOCKSIZE]; unsigned int burn = 0; unsigned int nblocks; @@ -46,11 +47,6 @@ _gcry_cmac_write (gcry_cipher_hd_t c, gcry_cmac_context_t *ctx, if (ctx->tag) return GPG_ERR_INV_STATE; - /* Tell compiler that we require a cipher with a 64bit or 128 bit block - * length, to allow better optimization of this function. */ - if (blocksize > 16 || blocksize < 8 || blocksize & (8 - 1)) - return GPG_ERR_INV_CIPHER_MODE; - if (!inbuf) return GPG_ERR_INV_ARG; @@ -78,12 +74,12 @@ _gcry_cmac_write (gcry_cipher_hd_t c, gcry_cmac_context_t *ctx, if (c->bulk.cbc_enc && inlen > blocksize) { - nblocks = inlen / blocksize; - nblocks -= (nblocks * blocksize == inlen); + nblocks = inlen >> blocksize_shift; + nblocks -= ((nblocks << blocksize_shift) == inlen); c->bulk.cbc_enc (&c->context.c, ctx->u_iv.iv, outbuf, inbuf, nblocks, 1); - inbuf += nblocks * blocksize; - inlen -= nblocks * blocksize; + inbuf += nblocks << blocksize_shift; + inlen -= nblocks << blocksize_shift; wipememory (outbuf, sizeof (outbuf)); } diff --git a/cipher/cipher-ctr.c b/cipher/cipher-ctr.c index f9cb6b577..b54fb5a73 100644 --- a/cipher/cipher-ctr.c +++ b/cipher/cipher-ctr.c @@ -38,15 +38,11 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, size_t n; int i; gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; - unsigned int blocksize = c->spec->blocksize; + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; size_t nblocks; unsigned int burn, nburn; - /* Tell compiler that we require a cipher with a 64bit or 128 bit block - * length, to allow better optimization of this function. */ - if (blocksize > 16 || blocksize < 8 || blocksize & (8 - 1)) - return GPG_ERR_INV_LENGTH; - if (outbuflen < inbuflen) return GPG_ERR_BUFFER_TOO_SHORT; @@ -66,13 +62,13 @@ _gcry_cipher_ctr_encrypt (gcry_cipher_hd_t c, } /* Use a bulk method if available. */ - nblocks = inbuflen / blocksize; + nblocks = inbuflen >> blocksize_shift; if (nblocks && c->bulk.ctr_enc) { c->bulk.ctr_enc (&c->context.c, c->u_ctr.ctr, outbuf, inbuf, nblocks); - inbuf += nblocks * blocksize; - outbuf += nblocks * blocksize; - inbuflen -= nblocks * blocksize; + inbuf += nblocks << blocksize_shift; + outbuf += nblocks << blocksize_shift; + inbuflen -= nblocks << blocksize_shift; } /* If we don't have a bulk method use the standard method. We also diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index a0ede5e03..a5bb0ad2f 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -563,4 +563,14 @@ ocb_get_l (gcry_cipher_hd_t c, u64 n) return c->u_mode.ocb.L[ntz]; } + +/* Return bit-shift of blocksize. 
*/ +static inline unsigned int _gcry_blocksize_shift(gcry_cipher_hd_t c) +{ + /* Only blocksizes 8 and 16 are used. Return value in such way + * that compiler can optimize calling functions based on this. */ + return c->spec->blocksize == 8 ? 3 : 4; +} + + #endif /*G10_CIPHER_INTERNAL_H*/ diff --git a/cipher/cipher-ofb.c b/cipher/cipher-ofb.c index f821d1bec..419a8d085 100644 --- a/cipher/cipher-ofb.c +++ b/cipher/cipher-ofb.c @@ -37,14 +37,10 @@ _gcry_cipher_ofb_encrypt (gcry_cipher_hd_t c, { unsigned char *ivp; gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; - size_t blocksize = c->spec->blocksize; + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; unsigned int burn, nburn; - /* Tell compiler that we require a cipher with a 64bit or 128 bit block - * length, to allow better optimization of this function. */ - if (blocksize > 16 || blocksize < 8 || blocksize & (8 - 1)) - return GPG_ERR_INV_LENGTH; - if (outbuflen < inbuflen) return GPG_ERR_BUFFER_TOO_SHORT; -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 17:51:16 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 18:51:16 +0300 Subject: [PATCH 4/9] Add separate handlers for CBC-CTS variant In-Reply-To: <20180619155121.7122-1-jussi.kivilinna@iki.fi> References: <20180619155121.7122-1-jussi.kivilinna@iki.fi> Message-ID: <20180619155121.7122-3-jussi.kivilinna@iki.fi> * cipher/cipher-cbc.c (cbc_encrypt_inner, cbc_decrypt_inner) (_gcry_cipher_cbc_cts_encrypt, _gcry_cipher_cbc_cts_decrypt): New. (_gcry_cipher_cbc_encrypt, _gcry_cipher_cbc_decrypt): Remove CTS handling. * cipher/cipher-internal.h (_gcry_cipher_cbc_cts_encrypt) (_gcry_cipher_cbc_cts_decrypt): New. * cipher/cipher.c (cipher_encrypt, cipher_decrypt): Call CBC-CTS handler if CBC-CTS flag is set. -- Separate CTS handling to separate function for small decrease in CBC per call overhead. Signed-off-by: Jussi Kivilinna --- cipher/cipher-cbc.c | 203 +++++++++++++++++++++++++++------------ cipher/cipher-internal.h | 8 ++ cipher/cipher.c | 12 ++- 3 files changed, 161 insertions(+), 62 deletions(-) diff --git a/cipher/cipher-cbc.c b/cipher/cipher-cbc.c index 7951f34b6..2ad39d09d 100644 --- a/cipher/cipher-cbc.c +++ b/cipher/cipher-cbc.c @@ -31,47 +31,27 @@ -gcry_err_code_t -_gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, - unsigned char *outbuf, size_t outbuflen, - const unsigned char *inbuf, size_t inbuflen) +static inline unsigned int +cbc_encrypt_inner(gcry_cipher_hd_t c, unsigned char *outbuf, + const unsigned char *inbuf, size_t nblocks, size_t blocksize, + int is_cbc_cmac) { - size_t n; - unsigned char *ivp; - int i; - size_t blocksize_shift = _gcry_blocksize_shift(c); - size_t blocksize = 1 << blocksize_shift; - size_t blocksize_mask = blocksize - 1; - gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; - size_t nblocks = inbuflen >> blocksize_shift; - unsigned int burn, nburn; - if (outbuflen < ((c->flags & GCRY_CIPHER_CBC_MAC) ? 
blocksize : inbuflen)) - return GPG_ERR_BUFFER_TOO_SHORT; - - if ((inbuflen & blocksize_mask) - && !(inbuflen > blocksize - && (c->flags & GCRY_CIPHER_CBC_CTS))) - return GPG_ERR_INV_LENGTH; + unsigned int burn, nburn; + size_t n; burn = 0; - if ((c->flags & GCRY_CIPHER_CBC_CTS) && inbuflen > blocksize) - { - if ((inbuflen & blocksize_mask) == 0) - nblocks--; - } - if (c->bulk.cbc_enc) { c->bulk.cbc_enc (&c->context.c, c->u_iv.iv, outbuf, inbuf, nblocks, - (c->flags & GCRY_CIPHER_CBC_MAC)); - inbuf += nblocks << blocksize_shift; - if (!(c->flags & GCRY_CIPHER_CBC_MAC)) - outbuf += nblocks << blocksize_shift; + is_cbc_cmac); } else { + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; + unsigned char *ivp; + ivp = c->u_iv.iv; for (n=0; n < nblocks; n++ ) @@ -81,15 +61,78 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, burn = nburn > burn ? nburn : burn; ivp = outbuf; inbuf += blocksize; - if (!(c->flags & GCRY_CIPHER_CBC_MAC)) + if (!is_cbc_cmac) outbuf += blocksize; } if (ivp != c->u_iv.iv) - buf_cpy (c->u_iv.iv, ivp, blocksize ); + buf_cpy (c->u_iv.iv, ivp, blocksize); + } + + return burn; +} + + +gcry_err_code_t +_gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen) +{ + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; + size_t blocksize_mask = blocksize - 1; + size_t nblocks = inbuflen >> blocksize_shift; + int is_cbc_cmac = !!(c->flags & GCRY_CIPHER_CBC_MAC); + unsigned int burn; + + if (outbuflen < (is_cbc_cmac ? blocksize : inbuflen)) + return GPG_ERR_BUFFER_TOO_SHORT; + + if (inbuflen & blocksize_mask) + return GPG_ERR_INV_LENGTH; + + burn = cbc_encrypt_inner(c, outbuf, inbuf, nblocks, blocksize, is_cbc_cmac); + + if (burn > 0) + _gcry_burn_stack (burn + 4 * sizeof(void *)); + + return 0; +} + + +gcry_err_code_t +_gcry_cipher_cbc_cts_encrypt (gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen) +{ + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; + size_t blocksize_mask = blocksize - 1; + gcry_cipher_encrypt_t enc_fn = c->spec->encrypt; + size_t nblocks = inbuflen >> blocksize_shift; + unsigned int burn, nburn; + unsigned char *ivp; + int i; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + + if ((inbuflen & blocksize_mask) && !(inbuflen > blocksize)) + return GPG_ERR_INV_LENGTH; + + burn = 0; + + if (inbuflen > blocksize) + { + if ((inbuflen & blocksize_mask) == 0) + nblocks--; } - if ((c->flags & GCRY_CIPHER_CBC_CTS) && inbuflen > blocksize) + burn = cbc_encrypt_inner(c, outbuf, inbuf, nblocks, blocksize, 0); + inbuf += nblocks << blocksize_shift; + outbuf += nblocks << blocksize_shift; + + if (inbuflen > blocksize) { /* We have to be careful here, since outbuf might be equal to inbuf. */ @@ -123,31 +166,88 @@ _gcry_cipher_cbc_encrypt (gcry_cipher_hd_t c, } +static inline unsigned int +cbc_decrypt_inner(gcry_cipher_hd_t c, unsigned char *outbuf, + const unsigned char *inbuf, size_t nblocks, size_t blocksize) +{ + unsigned int burn, nburn; + size_t n; + + burn = 0; + + if (c->bulk.cbc_dec) + { + c->bulk.cbc_dec (&c->context.c, c->u_iv.iv, outbuf, inbuf, nblocks); + } + else + { + gcry_cipher_decrypt_t dec_fn = c->spec->decrypt; + + for (n = 0; n < nblocks; n++) + { + /* Because outbuf and inbuf might be the same, we must not overwrite + the original ciphertext block. 
We use LASTIV as intermediate + storage here because it is not used otherwise. */ + nburn = dec_fn ( &c->context.c, c->lastiv, inbuf ); + burn = nburn > burn ? nburn : burn; + buf_xor_n_copy_2 (outbuf, c->lastiv, c->u_iv.iv, inbuf, blocksize); + inbuf += blocksize; + outbuf += blocksize; + } + } + + return burn; +} + + gcry_err_code_t _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, unsigned char *outbuf, size_t outbuflen, const unsigned char *inbuf, size_t inbuflen) { - size_t n; - int i; + size_t blocksize_shift = _gcry_blocksize_shift(c); + size_t blocksize = 1 << blocksize_shift; + size_t blocksize_mask = blocksize - 1; + size_t nblocks = inbuflen >> blocksize_shift; + unsigned int burn; + + if (outbuflen < inbuflen) + return GPG_ERR_BUFFER_TOO_SHORT; + + if (inbuflen & blocksize_mask) + return GPG_ERR_INV_LENGTH; + + burn = cbc_decrypt_inner(c, outbuf, inbuf, nblocks, blocksize); + + if (burn > 0) + _gcry_burn_stack (burn + 4 * sizeof(void *)); + + return 0; +} + + +gcry_err_code_t +_gcry_cipher_cbc_cts_decrypt (gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen) +{ size_t blocksize_shift = _gcry_blocksize_shift(c); size_t blocksize = 1 << blocksize_shift; size_t blocksize_mask = blocksize - 1; gcry_cipher_decrypt_t dec_fn = c->spec->decrypt; size_t nblocks = inbuflen >> blocksize_shift; unsigned int burn, nburn; + int i; if (outbuflen < inbuflen) return GPG_ERR_BUFFER_TOO_SHORT; - if ((inbuflen & blocksize_mask) - && !(inbuflen > blocksize - && (c->flags & GCRY_CIPHER_CBC_CTS))) + if ((inbuflen & blocksize_mask) && !(inbuflen > blocksize)) return GPG_ERR_INV_LENGTH; burn = 0; - if ((c->flags & GCRY_CIPHER_CBC_CTS) && inbuflen > blocksize) + if (inbuflen > blocksize) { nblocks--; if ((inbuflen & blocksize_mask) == 0) @@ -155,28 +255,11 @@ _gcry_cipher_cbc_decrypt (gcry_cipher_hd_t c, buf_cpy (c->lastiv, c->u_iv.iv, blocksize); } - if (c->bulk.cbc_dec) - { - c->bulk.cbc_dec (&c->context.c, c->u_iv.iv, outbuf, inbuf, nblocks); - inbuf += nblocks << blocksize_shift; - outbuf += nblocks << blocksize_shift; - } - else - { - for (n=0; n < nblocks; n++ ) - { - /* Because outbuf and inbuf might be the same, we must not overwrite - the original ciphertext block. We use LASTIV as intermediate - storage here because it is not used otherwise. */ - nburn = dec_fn ( &c->context.c, c->lastiv, inbuf ); - burn = nburn > burn ? 
nburn : burn; - buf_xor_n_copy_2(outbuf, c->lastiv, c->u_iv.iv, inbuf, blocksize); - inbuf += blocksize; - outbuf += blocksize; - } - } + burn = cbc_decrypt_inner(c, outbuf, inbuf, nblocks, blocksize); + inbuf += nblocks << blocksize_shift; + outbuf += nblocks << blocksize_shift; - if ((c->flags & GCRY_CIPHER_CBC_CTS) && inbuflen > blocksize) + if (inbuflen > blocksize) { size_t restbytes; diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index a5bb0ad2f..b12c3be7e 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -356,6 +356,14 @@ gcry_err_code_t _gcry_cipher_cbc_encrypt unsigned char *outbuf, size_t outbuflen, const unsigned char *inbuf, size_t inbuflen); gcry_err_code_t _gcry_cipher_cbc_decrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen); +gcry_err_code_t _gcry_cipher_cbc_cts_encrypt +/* */ (gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen); +gcry_err_code_t _gcry_cipher_cbc_cts_decrypt /* */ (gcry_cipher_hd_t c, unsigned char *outbuf, size_t outbuflen, const unsigned char *inbuf, size_t inbuflen); diff --git a/cipher/cipher.c b/cipher/cipher.c index 1b547a4b7..54d00b46d 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -1018,7 +1018,11 @@ cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, size_t outbuflen, break; case GCRY_CIPHER_MODE_CBC: - rc = _gcry_cipher_cbc_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + if (!(c->flags & GCRY_CIPHER_CBC_CTS)) + rc = _gcry_cipher_cbc_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + else + rc = _gcry_cipher_cbc_cts_encrypt (c, outbuf, outbuflen, inbuf, + inbuflen); break; case GCRY_CIPHER_MODE_CFB: @@ -1153,7 +1157,11 @@ cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, size_t outbuflen, break; case GCRY_CIPHER_MODE_CBC: - rc = _gcry_cipher_cbc_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); + if (!(c->flags & GCRY_CIPHER_CBC_CTS)) + rc = _gcry_cipher_cbc_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); + else + rc = _gcry_cipher_cbc_cts_decrypt (c, outbuf, outbuflen, inbuf, + inbuflen); break; case GCRY_CIPHER_MODE_CFB: -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 17:51:17 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 18:51:17 +0300 Subject: [PATCH 5/9] Access cipher mode routines through routine pointers In-Reply-To: <20180619155121.7122-1-jussi.kivilinna@iki.fi> References: <20180619155121.7122-1-jussi.kivilinna@iki.fi> Message-ID: <20180619155121.7122-4-jussi.kivilinna@iki.fi> * cipher/cipher-internal.h (gcry_cipher_handle): Add function pointers for mode operations. (_gcry_cipher_xts_crypt): Remove. (_gcry_cipher_xts_encrypt, _gcry_cipher_xts_decrypt): New. * cipher/cipher-xts.c (_gcry_cipher_xts_encrypt) (_gcry_cipher_xts_decrypt): New. * cipher/cipher.c (_gcry_cipher_setup_mode_ops): New. (_gcry_cipher_open_internal): Setup mode routines. (cipher_encrypt, cipher_decrypt): Remove. (do_stream_encrypt, do_stream_decrypt, do_encrypt_none_unknown) (do_decrypt_none_unknown): New. (_gcry_cipher_encrypt, _gcry_cipher_decrypt, _gcry_cipher_setiv) (_gcry_cipher_authenticate, _gcry_cipher_gettag) (_gcry_cipher_checktag): Adapted to use mode routines through pointers. -- Change to use mode operations through pointers to reduce per call overhead for cipher operations. 
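
In other words, after this change each public entry point performs a single
indirect call through the mode_ops table, which is filled in once by
_gcry_cipher_setup_mode_ops() at open time, instead of walking a switch over
c->mode on every invocation. A minimal sketch of the resulting dispatch (the
wrapper name below is illustrative only, not code from this patch):

  gcry_err_code_t
  encrypt_via_mode_ops (gcry_cipher_hd_t h, void *out, size_t outsize,
                        const void *in, size_t inlen)
  {
    /* The generic key check stays in the wrapper... */
    if (h->mode != GCRY_CIPHER_MODE_NONE && !h->marks.key)
      return GPG_ERR_MISSING_KEY;

    /* ...and the per-mode work is reached through the pointer that was
       assigned when the handle was opened (e.g. _gcry_cipher_cbc_encrypt
       for plain CBC, _gcry_cipher_cbc_cts_encrypt when the CTS flag is
       set), so no per-call mode switch is needed. */
    return h->mode_ops.encrypt (h, out, outsize, in, inlen);
  }
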
Signed-off-by: Jussi Kivilinna --- cipher/cipher-internal.h | 26 ++- cipher/cipher-xts.c | 18 ++ cipher/cipher.c | 492 ++++++++++++++++++--------------------- 3 files changed, 272 insertions(+), 264 deletions(-) diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index b12c3be7e..6d87561de 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -140,6 +140,25 @@ struct gcry_cipher_handle interface does not easily allow to retrieve this value. */ int algo; + /* A structure with function pointers for mode operations. */ + struct { + gcry_err_code_t (*encrypt)(gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen); + gcry_err_code_t (*decrypt)(gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen); + gcry_err_code_t (*setiv)(gcry_cipher_hd_t c, const unsigned char *iv, + size_t ivlen); + + gcry_err_code_t (*authenticate)(gcry_cipher_hd_t c, + const unsigned char *abuf, size_t abuflen); + gcry_err_code_t (*get_tag)(gcry_cipher_hd_t c, unsigned char *outtag, + size_t taglen); + gcry_err_code_t (*check_tag)(gcry_cipher_hd_t c, const unsigned char *intag, + size_t taglen); + } mode_ops; + /* A structure with function pointers for bulk operations. Due to limitations of the module system (we don't want to change the API) we need to keep these function pointers here. The cipher @@ -544,9 +563,12 @@ gcry_err_code_t _gcry_cipher_ocb_check_tag /*-- cipher-xts.c --*/ -gcry_err_code_t _gcry_cipher_xts_crypt +gcry_err_code_t _gcry_cipher_xts_encrypt +/* */ (gcry_cipher_hd_t c, unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen); +gcry_err_code_t _gcry_cipher_xts_decrypt /* */ (gcry_cipher_hd_t c, unsigned char *outbuf, size_t outbuflen, - const unsigned char *inbuf, size_t inbuflen, int encrypt); + const unsigned char *inbuf, size_t inbuflen); /* Return the L-value for block N. Note: 'cipher_ocb.c' ensures that N diff --git a/cipher/cipher-xts.c b/cipher/cipher-xts.c index 06cefbe0d..045b3539b 100644 --- a/cipher/cipher-xts.c +++ b/cipher/cipher-xts.c @@ -169,3 +169,21 @@ _gcry_cipher_xts_crypt (gcry_cipher_hd_t c, return 0; } + + +gcry_err_code_t +_gcry_cipher_xts_encrypt (gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen) +{ + return _gcry_cipher_xts_crypt (c, outbuf, outbuflen, inbuf, inbuflen, 1); +} + + +gcry_err_code_t +_gcry_cipher_xts_decrypt (gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen) +{ + return _gcry_cipher_xts_crypt (c, outbuf, outbuflen, inbuf, inbuflen, 0); +} diff --git a/cipher/cipher.c b/cipher/cipher.c index 54d00b46d..a4dfc4ddc 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -200,6 +200,8 @@ static gcry_cipher_spec_t * const cipher_list_algo301[] = }; +static void _gcry_cipher_setup_mode_ops(gcry_cipher_hd_t c, int mode); + static int map_algo (int algo) @@ -706,6 +708,9 @@ _gcry_cipher_open_internal (gcry_cipher_hd_t *handle, break; } + /* Setup mode routines. */ + _gcry_cipher_setup_mode_ops(h, mode); + /* Setup defaults depending on the mode. */ switch (mode) { @@ -723,8 +728,7 @@ _gcry_cipher_open_internal (gcry_cipher_hd_t *handle, default: break; } - - } + } } /* Done. */ @@ -994,93 +998,78 @@ do_ecb_decrypt (gcry_cipher_hd_t c, } -/**************** - * Encrypt INBUF to OUTBUF with the mode selected at open. - * inbuf and outbuf may overlap or be the same. 
- * Depending on the mode some constraints apply to INBUFLEN. - */ static gcry_err_code_t -cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, size_t outbuflen, - const byte *inbuf, size_t inbuflen) +do_stream_encrypt (gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen) +{ + (void)outbuflen; + c->spec->stencrypt (&c->context.c, outbuf, (void *)inbuf, inbuflen); + return 0; +} + +static gcry_err_code_t +do_stream_decrypt (gcry_cipher_hd_t c, + unsigned char *outbuf, size_t outbuflen, + const unsigned char *inbuf, size_t inbuflen) +{ + (void)outbuflen; + c->spec->stdecrypt (&c->context.c, outbuf, (void *)inbuf, inbuflen); + return 0; +} + + +static gcry_err_code_t +do_encrypt_none_unknown (gcry_cipher_hd_t c, byte *outbuf, size_t outbuflen, + const byte *inbuf, size_t inbuflen) { gcry_err_code_t rc; - if (c->mode != GCRY_CIPHER_MODE_NONE && !c->marks.key) - { - log_error ("cipher_encrypt: key not set\n"); - return GPG_ERR_MISSING_KEY; - } + (void)outbuflen; switch (c->mode) { - case GCRY_CIPHER_MODE_ECB: - rc = do_ecb_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + case GCRY_CIPHER_MODE_CMAC: + rc = GPG_ERR_INV_CIPHER_MODE; break; - case GCRY_CIPHER_MODE_CBC: - if (!(c->flags & GCRY_CIPHER_CBC_CTS)) - rc = _gcry_cipher_cbc_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + case GCRY_CIPHER_MODE_NONE: + if (fips_mode () || !_gcry_get_debug_flag (0)) + { + fips_signal_error ("cipher mode NONE used"); + rc = GPG_ERR_INV_CIPHER_MODE; + } else - rc = _gcry_cipher_cbc_cts_encrypt (c, outbuf, outbuflen, inbuf, - inbuflen); - break; - - case GCRY_CIPHER_MODE_CFB: - rc = _gcry_cipher_cfb_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_CFB8: - rc = _gcry_cipher_cfb8_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + { + if (inbuf != outbuf) + memmove (outbuf, inbuf, inbuflen); + rc = 0; + } break; - case GCRY_CIPHER_MODE_OFB: - rc = _gcry_cipher_ofb_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); + default: + log_fatal ("cipher_encrypt: invalid mode %d\n", c->mode ); + rc = GPG_ERR_INV_CIPHER_MODE; break; + } - case GCRY_CIPHER_MODE_CTR: - rc = _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; + return rc; +} - case GCRY_CIPHER_MODE_AESWRAP: - rc = _gcry_cipher_aeswrap_encrypt (c, outbuf, outbuflen, - inbuf, inbuflen); - break; +static gcry_err_code_t +do_decrypt_none_unknown (gcry_cipher_hd_t c, byte *outbuf, size_t outbuflen, + const byte *inbuf, size_t inbuflen) +{ + gcry_err_code_t rc; - case GCRY_CIPHER_MODE_CCM: - rc = _gcry_cipher_ccm_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; + (void)outbuflen; + switch (c->mode) + { case GCRY_CIPHER_MODE_CMAC: rc = GPG_ERR_INV_CIPHER_MODE; break; - case GCRY_CIPHER_MODE_EAX: - rc = _gcry_cipher_eax_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_GCM: - rc = _gcry_cipher_gcm_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_POLY1305: - rc = _gcry_cipher_poly1305_encrypt (c, outbuf, outbuflen, - inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_OCB: - rc = _gcry_cipher_ocb_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_XTS: - rc = _gcry_cipher_xts_crypt (c, outbuf, outbuflen, inbuf, inbuflen, 1); - break; - - case GCRY_CIPHER_MODE_STREAM: - c->spec->stencrypt (&c->context.c, - outbuf, (byte*)/*arggg*/inbuf, inbuflen); - rc = 0; - break; - case GCRY_CIPHER_MODE_NONE: if (fips_mode () || !_gcry_get_debug_flag (0)) { @@ 
-1096,7 +1085,7 @@ cipher_encrypt (gcry_cipher_hd_t c, byte *outbuf, size_t outbuflen, break; default: - log_fatal ("cipher_encrypt: invalid mode %d\n", c->mode ); + log_fatal ("cipher_decrypt: invalid mode %d\n", c->mode ); rc = GPG_ERR_INV_CIPHER_MODE; break; } @@ -1121,7 +1110,13 @@ _gcry_cipher_encrypt (gcry_cipher_hd_t h, void *out, size_t outsize, inlen = outsize; } - rc = cipher_encrypt (h, out, outsize, in, inlen); + if (h->mode != GCRY_CIPHER_MODE_NONE && !h->marks.key) + { + log_error ("cipher_decrypt: key not set\n"); + return GPG_ERR_MISSING_KEY; + } + + rc = h->mode_ops.encrypt (h, out, outsize, in, inlen); /* Failsafe: Make sure that the plaintext will never make it into OUT if the encryption returned an error. */ @@ -1132,118 +1127,10 @@ _gcry_cipher_encrypt (gcry_cipher_hd_t h, void *out, size_t outsize, } - /**************** - * Decrypt INBUF to OUTBUF with the mode selected at open. - * inbuf and outbuf may overlap or be the same. - * Depending on the mode some some constraints apply to INBUFLEN. + * Decrypt IN and write it to OUT. If IN is NULL, in-place encryption has + * been requested. */ -static gcry_err_code_t -cipher_decrypt (gcry_cipher_hd_t c, byte *outbuf, size_t outbuflen, - const byte *inbuf, size_t inbuflen) -{ - gcry_err_code_t rc; - - if (c->mode != GCRY_CIPHER_MODE_NONE && !c->marks.key) - { - log_error ("cipher_decrypt: key not set\n"); - return GPG_ERR_MISSING_KEY; - } - - switch (c->mode) - { - case GCRY_CIPHER_MODE_ECB: - rc = do_ecb_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_CBC: - if (!(c->flags & GCRY_CIPHER_CBC_CTS)) - rc = _gcry_cipher_cbc_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); - else - rc = _gcry_cipher_cbc_cts_decrypt (c, outbuf, outbuflen, inbuf, - inbuflen); - break; - - case GCRY_CIPHER_MODE_CFB: - rc = _gcry_cipher_cfb_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_CFB8: - rc = _gcry_cipher_cfb8_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_OFB: - rc = _gcry_cipher_ofb_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_CTR: - rc = _gcry_cipher_ctr_encrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_AESWRAP: - rc = _gcry_cipher_aeswrap_decrypt (c, outbuf, outbuflen, - inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_CCM: - rc = _gcry_cipher_ccm_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_CMAC: - rc = GPG_ERR_INV_CIPHER_MODE; - break; - - case GCRY_CIPHER_MODE_EAX: - rc = _gcry_cipher_eax_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_GCM: - rc = _gcry_cipher_gcm_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_POLY1305: - rc = _gcry_cipher_poly1305_decrypt (c, outbuf, outbuflen, - inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_OCB: - rc = _gcry_cipher_ocb_decrypt (c, outbuf, outbuflen, inbuf, inbuflen); - break; - - case GCRY_CIPHER_MODE_XTS: - rc = _gcry_cipher_xts_crypt (c, outbuf, outbuflen, inbuf, inbuflen, 0); - break; - - case GCRY_CIPHER_MODE_STREAM: - c->spec->stdecrypt (&c->context.c, - outbuf, (byte*)/*arggg*/inbuf, inbuflen); - rc = 0; - break; - - case GCRY_CIPHER_MODE_NONE: - if (fips_mode () || !_gcry_get_debug_flag (0)) - { - fips_signal_error ("cipher mode NONE used"); - rc = GPG_ERR_INV_CIPHER_MODE; - } - else - { - if (inbuf != outbuf) - memmove (outbuf, inbuf, inbuflen); - rc = 0; - } - break; - - default: - 
log_fatal ("cipher_decrypt: invalid mode %d\n", c->mode ); - rc = GPG_ERR_INV_CIPHER_MODE; - break; - } - - return rc; -} - - gcry_err_code_t _gcry_cipher_decrypt (gcry_cipher_hd_t h, void *out, size_t outsize, const void *in, size_t inlen) @@ -1254,9 +1141,14 @@ _gcry_cipher_decrypt (gcry_cipher_hd_t h, void *out, size_t outsize, inlen = outsize; } - return cipher_decrypt (h, out, outsize, in, inlen); -} + if (h->mode != GCRY_CIPHER_MODE_NONE && !h->marks.key) + { + log_error ("cipher_decrypt: key not set\n"); + return GPG_ERR_MISSING_KEY; + } + return h->mode_ops.decrypt (h, out, outsize, in, inlen); +} /**************** @@ -1287,37 +1179,10 @@ _gcry_cipher_setkey (gcry_cipher_hd_t hd, const void *key, size_t keylen) gcry_err_code_t _gcry_cipher_setiv (gcry_cipher_hd_t hd, const void *iv, size_t ivlen) { - gcry_err_code_t rc = 0; - - switch (hd->mode) - { - case GCRY_CIPHER_MODE_CCM: - rc = _gcry_cipher_ccm_set_nonce (hd, iv, ivlen); - break; - - case GCRY_CIPHER_MODE_EAX: - rc = _gcry_cipher_eax_set_nonce (hd, iv, ivlen); - break; - - case GCRY_CIPHER_MODE_GCM: - rc = _gcry_cipher_gcm_setiv (hd, iv, ivlen); - break; - - case GCRY_CIPHER_MODE_POLY1305: - rc = _gcry_cipher_poly1305_setiv (hd, iv, ivlen); - break; - - case GCRY_CIPHER_MODE_OCB: - rc = _gcry_cipher_ocb_set_nonce (hd, iv, ivlen); - break; - - default: - rc = cipher_setiv (hd, iv, ivlen); - break; - } - return rc; + return hd->mode_ops.setiv (hd, iv, ivlen); } + /* Set counter for CTR mode. (CTR,CTRLEN) must denote a buffer of block size length, or (NULL,0) to set the CTR to the all-zero block. */ @@ -1351,127 +1216,230 @@ _gcry_cipher_getctr (gcry_cipher_hd_t hd, void *ctr, size_t ctrlen) return 0; } + gcry_err_code_t _gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, size_t abuflen) { gcry_err_code_t rc; - switch (hd->mode) + if (hd->mode_ops.authenticate) { - case GCRY_CIPHER_MODE_CCM: - rc = _gcry_cipher_ccm_authenticate (hd, abuf, abuflen); + rc = hd->mode_ops.authenticate (hd, abuf, abuflen); + } + else + { + log_error ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + } + + return rc; +} + + +gcry_err_code_t +_gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) +{ + gcry_err_code_t rc; + + if (hd->mode_ops.get_tag) + { + rc = hd->mode_ops.get_tag (hd, outtag, taglen); + } + else + { + log_error ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + } + + return rc; +} + + +gcry_err_code_t +_gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) +{ + gcry_err_code_t rc; + + if (hd->mode_ops.check_tag) + { + rc = hd->mode_ops.check_tag (hd, intag, taglen); + } + else + { + log_error ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); + rc = GPG_ERR_INV_CIPHER_MODE; + } + + return rc; +} + + + +static void +_gcry_cipher_setup_mode_ops(gcry_cipher_hd_t c, int mode) +{ + /* Setup encryption and decryption routines. 
*/ + switch (mode) + { + case GCRY_CIPHER_MODE_STREAM: + c->mode_ops.encrypt = do_stream_encrypt; + c->mode_ops.decrypt = do_stream_decrypt; break; - case GCRY_CIPHER_MODE_CMAC: - rc = _gcry_cipher_cmac_authenticate (hd, abuf, abuflen); + case GCRY_CIPHER_MODE_ECB: + c->mode_ops.encrypt = do_ecb_encrypt; + c->mode_ops.decrypt = do_ecb_decrypt; + break; + + case GCRY_CIPHER_MODE_CBC: + if (!(c->flags & GCRY_CIPHER_CBC_CTS)) + { + c->mode_ops.encrypt = _gcry_cipher_cbc_encrypt; + c->mode_ops.decrypt = _gcry_cipher_cbc_decrypt; + } + else + { + c->mode_ops.encrypt = _gcry_cipher_cbc_cts_encrypt; + c->mode_ops.decrypt = _gcry_cipher_cbc_cts_decrypt; + } + break; + + case GCRY_CIPHER_MODE_CFB: + c->mode_ops.encrypt = _gcry_cipher_cfb_encrypt; + c->mode_ops.decrypt = _gcry_cipher_cfb_decrypt; + break; + + case GCRY_CIPHER_MODE_CFB8: + c->mode_ops.encrypt = _gcry_cipher_cfb8_encrypt; + c->mode_ops.decrypt = _gcry_cipher_cfb8_decrypt; + break; + + case GCRY_CIPHER_MODE_OFB: + c->mode_ops.encrypt = _gcry_cipher_ofb_encrypt; + c->mode_ops.decrypt = _gcry_cipher_ofb_encrypt; + break; + + case GCRY_CIPHER_MODE_CTR: + c->mode_ops.encrypt = _gcry_cipher_ctr_encrypt; + c->mode_ops.decrypt = _gcry_cipher_ctr_encrypt; + break; + + case GCRY_CIPHER_MODE_AESWRAP: + c->mode_ops.encrypt = _gcry_cipher_aeswrap_encrypt; + c->mode_ops.decrypt = _gcry_cipher_aeswrap_decrypt; + break; + + case GCRY_CIPHER_MODE_CCM: + c->mode_ops.encrypt = _gcry_cipher_ccm_encrypt; + c->mode_ops.decrypt = _gcry_cipher_ccm_decrypt; break; case GCRY_CIPHER_MODE_EAX: - rc = _gcry_cipher_eax_authenticate (hd, abuf, abuflen); + c->mode_ops.encrypt = _gcry_cipher_eax_encrypt; + c->mode_ops.decrypt = _gcry_cipher_eax_decrypt; break; case GCRY_CIPHER_MODE_GCM: - rc = _gcry_cipher_gcm_authenticate (hd, abuf, abuflen); + c->mode_ops.encrypt = _gcry_cipher_gcm_encrypt; + c->mode_ops.decrypt = _gcry_cipher_gcm_decrypt; break; case GCRY_CIPHER_MODE_POLY1305: - rc = _gcry_cipher_poly1305_authenticate (hd, abuf, abuflen); + c->mode_ops.encrypt = _gcry_cipher_poly1305_encrypt; + c->mode_ops.decrypt = _gcry_cipher_poly1305_decrypt; break; case GCRY_CIPHER_MODE_OCB: - rc = _gcry_cipher_ocb_authenticate (hd, abuf, abuflen); + c->mode_ops.encrypt = _gcry_cipher_ocb_encrypt; + c->mode_ops.decrypt = _gcry_cipher_ocb_decrypt; + break; + + case GCRY_CIPHER_MODE_XTS: + c->mode_ops.encrypt = _gcry_cipher_xts_encrypt; + c->mode_ops.decrypt = _gcry_cipher_xts_decrypt; break; default: - log_error ("gcry_cipher_authenticate: invalid mode %d\n", hd->mode); - rc = GPG_ERR_INV_CIPHER_MODE; + c->mode_ops.encrypt = do_encrypt_none_unknown; + c->mode_ops.decrypt = do_decrypt_none_unknown; break; } - return rc; -} - - -gcry_err_code_t -_gcry_cipher_gettag (gcry_cipher_hd_t hd, void *outtag, size_t taglen) -{ - gcry_err_code_t rc; - - switch (hd->mode) + /* Setup IV setting routine. 
*/ + switch (mode) { case GCRY_CIPHER_MODE_CCM: - rc = _gcry_cipher_ccm_get_tag (hd, outtag, taglen); - break; - - case GCRY_CIPHER_MODE_CMAC: - rc = _gcry_cipher_cmac_get_tag (hd, outtag, taglen); + c->mode_ops.setiv = _gcry_cipher_ccm_set_nonce; break; case GCRY_CIPHER_MODE_EAX: - rc = _gcry_cipher_eax_get_tag (hd, outtag, taglen); + c->mode_ops.setiv = _gcry_cipher_eax_set_nonce; break; case GCRY_CIPHER_MODE_GCM: - rc = _gcry_cipher_gcm_get_tag (hd, outtag, taglen); + c->mode_ops.setiv = _gcry_cipher_gcm_setiv; break; case GCRY_CIPHER_MODE_POLY1305: - rc = _gcry_cipher_poly1305_get_tag (hd, outtag, taglen); + c->mode_ops.setiv = _gcry_cipher_poly1305_setiv; break; case GCRY_CIPHER_MODE_OCB: - rc = _gcry_cipher_ocb_get_tag (hd, outtag, taglen); + c->mode_ops.setiv = _gcry_cipher_ocb_set_nonce; break; default: - log_error ("gcry_cipher_gettag: invalid mode %d\n", hd->mode); - rc = GPG_ERR_INV_CIPHER_MODE; + c->mode_ops.setiv = cipher_setiv; break; } - return rc; -} - -gcry_err_code_t -_gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen) -{ - gcry_err_code_t rc; - - switch (hd->mode) + /* Setup authentication routines for AEAD modes. */ + switch (mode) { case GCRY_CIPHER_MODE_CCM: - rc = _gcry_cipher_ccm_check_tag (hd, intag, taglen); + c->mode_ops.authenticate = _gcry_cipher_ccm_authenticate; + c->mode_ops.get_tag = _gcry_cipher_ccm_get_tag; + c->mode_ops.check_tag = _gcry_cipher_ccm_check_tag; break; case GCRY_CIPHER_MODE_CMAC: - rc = _gcry_cipher_cmac_check_tag (hd, intag, taglen); + c->mode_ops.authenticate = _gcry_cipher_cmac_authenticate; + c->mode_ops.get_tag = _gcry_cipher_cmac_get_tag; + c->mode_ops.check_tag = _gcry_cipher_cmac_check_tag; break; case GCRY_CIPHER_MODE_EAX: - rc = _gcry_cipher_eax_check_tag (hd, intag, taglen); + c->mode_ops.authenticate = _gcry_cipher_eax_authenticate; + c->mode_ops.get_tag = _gcry_cipher_eax_get_tag; + c->mode_ops.check_tag = _gcry_cipher_eax_check_tag; break; case GCRY_CIPHER_MODE_GCM: - rc = _gcry_cipher_gcm_check_tag (hd, intag, taglen); + c->mode_ops.authenticate = _gcry_cipher_gcm_authenticate; + c->mode_ops.get_tag = _gcry_cipher_gcm_get_tag; + c->mode_ops.check_tag = _gcry_cipher_gcm_check_tag; break; case GCRY_CIPHER_MODE_POLY1305: - rc = _gcry_cipher_poly1305_check_tag (hd, intag, taglen); + c->mode_ops.authenticate = _gcry_cipher_poly1305_authenticate; + c->mode_ops.get_tag = _gcry_cipher_poly1305_get_tag; + c->mode_ops.check_tag = _gcry_cipher_poly1305_check_tag; break; case GCRY_CIPHER_MODE_OCB: - rc = _gcry_cipher_ocb_check_tag (hd, intag, taglen); + c->mode_ops.authenticate = _gcry_cipher_ocb_authenticate; + c->mode_ops.get_tag = _gcry_cipher_ocb_get_tag; + c->mode_ops.check_tag = _gcry_cipher_ocb_check_tag; break; default: - log_error ("gcry_cipher_checktag: invalid mode %d\n", hd->mode); - rc = GPG_ERR_INV_CIPHER_MODE; + c->mode_ops.authenticate = NULL; + c->mode_ops.get_tag = NULL; + c->mode_ops.check_tag = NULL; break; } - - return rc; } -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 17:51:18 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 18:51:18 +0300 Subject: [PATCH 6/9] Add fast path for _gcry_fips_is_operational In-Reply-To: <20180619155121.7122-1-jussi.kivilinna@iki.fi> References: <20180619155121.7122-1-jussi.kivilinna@iki.fi> Message-ID: <20180619155121.7122-5-jussi.kivilinna@iki.fi> * src/fips.c (no_fips_mode_required): Rename to... (_gcry_no_fips_mode_required): ...this and make externally available. * src/g10lib.h (_gcry_no_fips_mode_required): New extern. 
(fips_mode): Inline _gcry_fips_mode to macro, use _gcry_no_fips_mode_required directly. (fips_is_operational): Inline fips_mode check from _gcry_fips_in_operational. -- Add fast path to reduce call overhead in src/visibility.c where fips_is_operational is called before cipher/md/etc operations. Signed-off-by: Jussi Kivilinna --- src/fips.c | 14 +++++++------- src/g10lib.h | 18 ++++++++++++++++-- 2 files changed, 23 insertions(+), 9 deletions(-) diff --git a/src/fips.c b/src/fips.c index af3fe2c6d..2b3a0af4b 100644 --- a/src/fips.c +++ b/src/fips.c @@ -57,7 +57,7 @@ enum module_states that fips mode is the default unless changed by the initialization code. To check whether fips mode is enabled, use the function fips_mode()! */ -static int no_fips_mode_required; +int _gcry_no_fips_mode_required; /* Flag to indicate that we are in the enforced FIPS mode. */ static int enforced_fips_mode; @@ -118,7 +118,7 @@ _gcry_initialize_fips_mode (int force) /* If the calling application explicitly requested fipsmode, do so. */ if (force) { - gcry_assert (!no_fips_mode_required); + gcry_assert (!_gcry_no_fips_mode_required); goto leave; } @@ -129,7 +129,7 @@ _gcry_initialize_fips_mode (int force) actually used. The file itself may be empty. */ if ( !access (FIPS_FORCE_FILE, F_OK) ) { - gcry_assert (!no_fips_mode_required); + gcry_assert (!_gcry_no_fips_mode_required); goto leave; } @@ -148,7 +148,7 @@ _gcry_initialize_fips_mode (int force) { /* System is in fips mode. */ fclose (fp); - gcry_assert (!no_fips_mode_required); + gcry_assert (!_gcry_no_fips_mode_required); goto leave; } fclose (fp); @@ -171,10 +171,10 @@ _gcry_initialize_fips_mode (int force) } /* Fips not not requested, set flag. */ - no_fips_mode_required = 1; + _gcry_no_fips_mode_required = 1; leave: - if (!no_fips_mode_required) + if (!_gcry_no_fips_mode_required) { /* Yes, we are in FIPS mode. */ FILE *fp; @@ -265,7 +265,7 @@ _gcry_fips_mode (void) /* No locking is required because we have the requirement that this variable is only initialized once with no other threads existing. */ - return !no_fips_mode_required; + return !_gcry_no_fips_mode_required; } diff --git a/src/g10lib.h b/src/g10lib.h index d41fa0cf7..d52eef324 100644 --- a/src/g10lib.h +++ b/src/g10lib.h @@ -422,10 +422,20 @@ gpg_err_code_t _gcry_sexp_vextract_param (gcry_sexp_t sexp, const char *path, /*-- fips.c --*/ +extern int _gcry_no_fips_mode_required; + void _gcry_initialize_fips_mode (int force); int _gcry_fips_mode (void); -#define fips_mode() _gcry_fips_mode () + +/* This macro returns true if fips mode is enabled. This is + independent of the fips required finite state machine and only used + to enable fips specific code. + + No locking is required because we have the requirement that this + variable is only initialized once with no other threads + existing. */ +#define fips_mode() (!_gcry_no_fips_mode_required) int _gcry_enforced_fips_mode (void); @@ -453,7 +463,11 @@ void _gcry_fips_signal_error (const char *srcfile, #endif int _gcry_fips_is_operational (void); -#define fips_is_operational() (_gcry_global_is_operational ()) + +/* Return true if the library is in the operational state. 
*/ +#define fips_is_operational() \ + (!fips_mode () || _gcry_fips_is_operational ()) + #define fips_not_operational() (GPG_ERR_NOT_OPERATIONAL) int _gcry_fips_test_operational (void); -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 17:51:19 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 18:51:19 +0300 Subject: [PATCH 7/9] Pass cipher object pointer to setkey functions In-Reply-To: <20180619155121.7122-1-jussi.kivilinna@iki.fi> References: <20180619155121.7122-1-jussi.kivilinna@iki.fi> Message-ID: <20180619155121.7122-6-jussi.kivilinna@iki.fi> * cipher/cipher.c (cipher_setkey): Pass cipher object pointer to cipher's setkey function. * cipher/arcfour.c: Add gcry_cipher_hd_t parameter for setkey functions and update selftests to pass NULL pointer. * cipher/blowfish.c: Ditto. * cipher/camellia-glue.c: Ditto. * cipher/cast5.c: Ditto. * cipher/chacha20.c: Ditto. * cipher/cipher-selftest.c: Ditto. * cipher/des.c: Ditto. * cipher/gost28147.c: Ditto. * cipher/idea.c: Ditto. * cipher/rfc2268.c: Ditto. * cipher/rijndael.c: Ditto. * cipher/salsa20.c: Ditto. * cipher/seed.c: Ditto. * cipher/serpent.c: Ditto. * cipher/twofish.c: Ditto. * src/cipher-proto.h: Ditto. -- This allows setkey function to replace bulk cipher operations with faster alternative. Signed-off-by: Jussi Kivilinna --- cipher/arcfour.c | 8 +++++--- cipher/blowfish.c | 11 +++++++---- cipher/camellia-glue.c | 11 +++++++---- cipher/cast5.c | 13 ++++++++----- cipher/chacha20.c | 16 +++++++++------- cipher/cipher-selftest.c | 6 +++--- cipher/cipher.c | 4 ++-- cipher/des.c | 19 ++++++++++++++----- cipher/gost28147.c | 5 ++++- cipher/idea.c | 4 +++- cipher/rfc2268.c | 4 +++- cipher/rijndael.c | 10 ++++++---- cipher/salsa20.c | 12 +++++++----- cipher/seed.c | 7 ++++--- cipher/serpent.c | 5 ++++- cipher/twofish.c | 21 ++++++++++++--------- src/cipher-proto.h | 3 ++- 17 files changed, 100 insertions(+), 59 deletions(-) diff --git a/cipher/arcfour.c b/cipher/arcfour.c index 085df9bbd..72decf08b 100644 --- a/cipher/arcfour.c +++ b/cipher/arcfour.c @@ -170,10 +170,12 @@ do_arcfour_setkey (void *context, const byte *key, unsigned int keylen) } static gcry_err_code_t -arcfour_setkey ( void *context, const byte *key, unsigned int keylen ) +arcfour_setkey ( void *context, const byte *key, unsigned int keylen, + gcry_cipher_hd_t hd ) { ARCFOUR_context *ctx = (ARCFOUR_context *) context; gcry_err_code_t rc = do_arcfour_setkey (ctx, key, keylen ); + (void)hd; return rc; } @@ -193,11 +195,11 @@ selftest(void) static const byte ciphertext_1[] = { 0xF1, 0x38, 0x29, 0xC9, 0xDE }; - arcfour_setkey( &ctx, key_1, sizeof(key_1)); + arcfour_setkey( &ctx, key_1, sizeof(key_1), NULL); encrypt_stream( &ctx, scratch, plaintext_1, sizeof(plaintext_1)); if ( memcmp (scratch, ciphertext_1, sizeof (ciphertext_1))) return "Arcfour encryption test 1 failed."; - arcfour_setkey( &ctx, key_1, sizeof(key_1)); + arcfour_setkey( &ctx, key_1, sizeof(key_1), NULL); encrypt_stream(&ctx, scratch, scratch, sizeof(plaintext_1)); /* decrypt */ if ( memcmp (scratch, plaintext_1, sizeof (plaintext_1))) return "Arcfour decryption test 1 failed."; diff --git a/cipher/blowfish.c b/cipher/blowfish.c index 724d64e98..2d9182009 100644 --- a/cipher/blowfish.c +++ b/cipher/blowfish.c @@ -67,7 +67,8 @@ typedef struct { u32 p[BLOWFISH_ROUNDS+2]; } BLOWFISH_context; -static gcry_err_code_t bf_setkey (void *c, const byte *key, unsigned keylen); +static gcry_err_code_t bf_setkey (void *c, const byte *key, unsigned keylen, + gcry_cipher_hd_t hd); static unsigned int 
encrypt_block (void *bc, byte *outbuf, const byte *inbuf); static unsigned int decrypt_block (void *bc, byte *outbuf, const byte *inbuf); @@ -853,7 +854,7 @@ selftest(void) const char *r; bf_setkey( (void *) &c, - (const unsigned char*)"abcdefghijklmnopqrstuvwxyz", 26 ); + (const unsigned char*)"abcdefghijklmnopqrstuvwxyz", 26, NULL ); encrypt_block( (void *) &c, buffer, plain ); if( memcmp( buffer, "\x32\x4E\xD0\xFE\xF4\x13\xA2\x03", 8 ) ) return "Blowfish selftest failed (1)."; @@ -861,7 +862,7 @@ selftest(void) if( memcmp( buffer, plain, 8 ) ) return "Blowfish selftest failed (2)."; - bf_setkey( (void *) &c, key3, 8 ); + bf_setkey( (void *) &c, key3, 8, NULL ); encrypt_block( (void *) &c, buffer, plain3 ); if( memcmp( buffer, cipher3, 8 ) ) return "Blowfish selftest failed (3)."; @@ -1051,10 +1052,12 @@ do_bf_setkey (BLOWFISH_context *c, const byte *key, unsigned keylen) static gcry_err_code_t -bf_setkey (void *context, const byte *key, unsigned keylen) +bf_setkey (void *context, const byte *key, unsigned keylen, + gcry_cipher_hd_t hd) { BLOWFISH_context *c = (BLOWFISH_context *) context; gcry_err_code_t rc = do_bf_setkey (c, key, keylen); + (void)hd; return rc; } diff --git a/cipher/camellia-glue.c b/cipher/camellia-glue.c index 76870944d..22df21469 100644 --- a/cipher/camellia-glue.c +++ b/cipher/camellia-glue.c @@ -204,7 +204,8 @@ extern void _gcry_camellia_aesni_avx2_ocb_auth(CAMELLIA_context *ctx, static const char *selftest(void); static gcry_err_code_t -camellia_setkey(void *c, const byte *key, unsigned keylen) +camellia_setkey(void *c, const byte *key, unsigned keylen, + gcry_cipher_hd_t hd) { CAMELLIA_context *ctx=c; static int initialized=0; @@ -213,6 +214,8 @@ camellia_setkey(void *c, const byte *key, unsigned keylen) unsigned int hwf = _gcry_get_hw_features (); #endif + (void)hd; + if(keylen!=16 && keylen!=24 && keylen!=32) return GPG_ERR_INV_KEYLEN; @@ -991,7 +994,7 @@ selftest(void) 0x20,0xef,0x7c,0x91,0x9e,0x3a,0x75,0x09 }; - camellia_setkey(&ctx,key_128,sizeof(key_128)); + camellia_setkey(&ctx,key_128,sizeof(key_128),NULL); camellia_encrypt(&ctx,scratch,plaintext); if(memcmp(scratch,ciphertext_128,sizeof(ciphertext_128))!=0) return "CAMELLIA-128 test encryption failed."; @@ -999,7 +1002,7 @@ selftest(void) if(memcmp(scratch,plaintext,sizeof(plaintext))!=0) return "CAMELLIA-128 test decryption failed."; - camellia_setkey(&ctx,key_192,sizeof(key_192)); + camellia_setkey(&ctx,key_192,sizeof(key_192),NULL); camellia_encrypt(&ctx,scratch,plaintext); if(memcmp(scratch,ciphertext_192,sizeof(ciphertext_192))!=0) return "CAMELLIA-192 test encryption failed."; @@ -1007,7 +1010,7 @@ selftest(void) if(memcmp(scratch,plaintext,sizeof(plaintext))!=0) return "CAMELLIA-192 test decryption failed."; - camellia_setkey(&ctx,key_256,sizeof(key_256)); + camellia_setkey(&ctx,key_256,sizeof(key_256),NULL); camellia_encrypt(&ctx,scratch,plaintext); if(memcmp(scratch,ciphertext_256,sizeof(ciphertext_256))!=0) return "CAMELLIA-256 test encryption failed."; diff --git a/cipher/cast5.c b/cipher/cast5.c index d23882b9a..e7d324b25 100644 --- a/cipher/cast5.c +++ b/cipher/cast5.c @@ -72,7 +72,8 @@ typedef struct { #endif } CAST5_context; -static gcry_err_code_t cast_setkey (void *c, const byte *key, unsigned keylen); +static gcry_err_code_t cast_setkey (void *c, const byte *key, unsigned keylen, + gcry_cipher_hd_t hd); static unsigned int encrypt_block (void *c, byte *outbuf, const byte *inbuf); static unsigned int decrypt_block (void *c, byte *outbuf, const byte *inbuf); @@ -825,7 +826,7 @@ 
selftest(void) byte buffer[8]; const char *r; - cast_setkey( &c, key, 16 ); + cast_setkey( &c, key, 16, NULL ); encrypt_block( &c, buffer, plain ); if( memcmp( buffer, cipher, 8 ) ) return "1"; @@ -846,10 +847,10 @@ selftest(void) 0x80,0xAC,0x05,0xB8,0xE8,0x3D,0x69,0x6E }; for(i=0; i < 1000000; i++ ) { - cast_setkey( &c, b0, 16 ); + cast_setkey( &c, b0, 16, NULL ); encrypt_block( &c, a0, a0 ); encrypt_block( &c, a0+8, a0+8 ); - cast_setkey( &c, a0, 16 ); + cast_setkey( &c, a0, 16, NULL ); encrypt_block( &c, b0, b0 ); encrypt_block( &c, b0+8, b0+8 ); } @@ -991,10 +992,12 @@ do_cast_setkey( CAST5_context *c, const byte *key, unsigned keylen ) } static gcry_err_code_t -cast_setkey (void *context, const byte *key, unsigned keylen ) +cast_setkey (void *context, const byte *key, unsigned keylen, + gcry_cipher_hd_t hd ) { CAST5_context *c = (CAST5_context *) context; gcry_err_code_t rc = do_cast_setkey (c, key, keylen); + (void)hd; return rc; } diff --git a/cipher/chacha20.c b/cipher/chacha20.c index e89ad2e47..84a9b2b80 100644 --- a/cipher/chacha20.c +++ b/cipher/chacha20.c @@ -372,10 +372,12 @@ chacha20_do_setkey (CHACHA20_context_t *ctx, static gcry_err_code_t -chacha20_setkey (void *context, const byte *key, unsigned int keylen) +chacha20_setkey (void *context, const byte *key, unsigned int keylen, + gcry_cipher_hd_t hd) { CHACHA20_context_t *ctx = (CHACHA20_context_t *) context; gcry_err_code_t rc = chacha20_do_setkey (ctx, key, keylen); + (void)hd; _gcry_burn_stack (4 + sizeof (void *) + 4 * sizeof (void *)); return rc; } @@ -551,7 +553,7 @@ selftest (void) /* 16-byte alignment required for amd64 implementation. */ ctx = (CHACHA20_context_t *)((uintptr_t)(ctxbuf + 15) & ~(uintptr_t)15); - chacha20_setkey (ctx, key_1, sizeof key_1); + chacha20_setkey (ctx, key_1, sizeof key_1, NULL); chacha20_setiv (ctx, nonce_1, sizeof nonce_1); scratch[sizeof (scratch) - 1] = 0; chacha20_encrypt_stream (ctx, scratch, plaintext_1, sizeof plaintext_1); @@ -559,7 +561,7 @@ selftest (void) return "ChaCha20 encryption test 1 failed."; if (scratch[sizeof (scratch) - 1]) return "ChaCha20 wrote too much."; - chacha20_setkey (ctx, key_1, sizeof (key_1)); + chacha20_setkey (ctx, key_1, sizeof (key_1), NULL); chacha20_setiv (ctx, nonce_1, sizeof nonce_1); chacha20_encrypt_stream (ctx, scratch, scratch, sizeof plaintext_1); if (memcmp (scratch, plaintext_1, sizeof plaintext_1)) @@ -567,12 +569,12 @@ selftest (void) for (i = 0; i < sizeof buf; i++) buf[i] = i; - chacha20_setkey (ctx, key_1, sizeof key_1); + chacha20_setkey (ctx, key_1, sizeof key_1, NULL); chacha20_setiv (ctx, nonce_1, sizeof nonce_1); /*encrypt */ chacha20_encrypt_stream (ctx, buf, buf, sizeof buf); /*decrypt */ - chacha20_setkey (ctx, key_1, sizeof key_1); + chacha20_setkey (ctx, key_1, sizeof key_1, NULL); chacha20_setiv (ctx, nonce_1, sizeof nonce_1); chacha20_encrypt_stream (ctx, buf, buf, 1); chacha20_encrypt_stream (ctx, buf + 1, buf + 1, (sizeof buf) - 1 - 1); @@ -582,13 +584,13 @@ selftest (void) if (buf[i] != (byte) i) return "ChaCha20 encryption test 2 failed."; - chacha20_setkey (ctx, key_1, sizeof key_1); + chacha20_setkey (ctx, key_1, sizeof key_1, NULL); chacha20_setiv (ctx, nonce_1, sizeof nonce_1); /* encrypt */ for (i = 0; i < sizeof buf; i++) chacha20_encrypt_stream (ctx, &buf[i], &buf[i], 1); /* decrypt */ - chacha20_setkey (ctx, key_1, sizeof key_1); + chacha20_setkey (ctx, key_1, sizeof key_1, NULL); chacha20_setiv (ctx, nonce_1, sizeof nonce_1); chacha20_encrypt_stream (ctx, buf, buf, sizeof buf); for (i = 0; i < sizeof buf; i++) 
diff --git a/cipher/cipher-selftest.c b/cipher/cipher-selftest.c index cecbab75c..eb3614ad6 100644 --- a/cipher/cipher-selftest.c +++ b/cipher/cipher-selftest.c @@ -105,7 +105,7 @@ _gcry_selftest_helper_cbc (const char *cipher, gcry_cipher_setkey_t setkey_func, ciphertext = plaintext2 + nblocks * blocksize; /* Initialize ctx */ - if (setkey_func (ctx, key, sizeof(key)) != GPG_ERR_NO_ERROR) + if (setkey_func (ctx, key, sizeof(key), NULL) != GPG_ERR_NO_ERROR) { xfree(mem); return "setkey failed"; @@ -228,7 +228,7 @@ _gcry_selftest_helper_cfb (const char *cipher, gcry_cipher_setkey_t setkey_func, ciphertext = plaintext2 + nblocks * blocksize; /* Initialize ctx */ - if (setkey_func (ctx, key, sizeof(key)) != GPG_ERR_NO_ERROR) + if (setkey_func (ctx, key, sizeof(key), NULL) != GPG_ERR_NO_ERROR) { xfree(mem); return "setkey failed"; @@ -351,7 +351,7 @@ _gcry_selftest_helper_ctr (const char *cipher, gcry_cipher_setkey_t setkey_func, ciphertext2 = ciphertext + nblocks * blocksize; /* Initialize ctx */ - if (setkey_func (ctx, key, sizeof(key)) != GPG_ERR_NO_ERROR) + if (setkey_func (ctx, key, sizeof(key), NULL) != GPG_ERR_NO_ERROR) { xfree(mem); return "setkey failed"; diff --git a/cipher/cipher.c b/cipher/cipher.c index a4dfc4ddc..55b991c35 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -793,7 +793,7 @@ cipher_setkey (gcry_cipher_hd_t c, byte *key, size_t keylen) } } - rc = c->spec->setkey (&c->context.c, key, keylen); + rc = c->spec->setkey (&c->context.c, key, keylen, c); if (!rc) { /* Duplicate initial context. */ @@ -823,7 +823,7 @@ cipher_setkey (gcry_cipher_hd_t c, byte *key, size_t keylen) case GCRY_CIPHER_MODE_XTS: /* Setup tweak cipher with second part of XTS key. */ rc = c->spec->setkey (c->u_mode.xts.tweak_context, key + keylen, - keylen); + keylen, c); if (!rc) { /* Duplicate initial tweak context. */ diff --git a/cipher/des.c b/cipher/des.c index 7801b08fc..05092277e 100644 --- a/cipher/des.c +++ b/cipher/des.c @@ -197,7 +197,8 @@ static unsigned int do_tripledes_encrypt(void *context, byte *outbuf, static unsigned int do_tripledes_decrypt(void *context, byte *outbuf, const byte *inbuf ); static gcry_err_code_t do_tripledes_setkey(void *context, const byte *key, - unsigned keylen); + unsigned keylen, + gcry_cipher_hd_t hd); static int initialized; @@ -1053,7 +1054,8 @@ is_weak_key ( const byte *key ) /* Alternative setkey for selftests; need larger key than default. 
*/ static gcry_err_code_t -bulk_selftest_setkey (void *context, const byte *__key, unsigned __keylen) +bulk_selftest_setkey (void *context, const byte *__key, unsigned __keylen, + gcry_cipher_hd_t hd) { static const unsigned char key[24] ATTR_ALIGNED_16 = { 0x66,0x9A,0x00,0x7F,0xC7,0x6A,0x45,0x9F, @@ -1061,10 +1063,11 @@ bulk_selftest_setkey (void *context, const byte *__key, unsigned __keylen) 0x18,0x2A,0x39,0x47,0x5E,0x6F,0x75,0x82 }; + (void)hd; (void)__key; (void)__keylen; - return do_tripledes_setkey(context, key, sizeof(key)); + return do_tripledes_setkey(context, key, sizeof(key), NULL); } @@ -1316,10 +1319,13 @@ selftest (void) static gcry_err_code_t -do_tripledes_setkey ( void *context, const byte *key, unsigned keylen ) +do_tripledes_setkey ( void *context, const byte *key, unsigned keylen, + gcry_cipher_hd_t hd ) { struct _tripledes_ctx *ctx = (struct _tripledes_ctx *) context; + (void)hd; + if( keylen != 24 ) return GPG_ERR_INV_KEYLEN; @@ -1380,10 +1386,13 @@ do_tripledes_decrypt( void *context, byte *outbuf, const byte *inbuf ) } static gcry_err_code_t -do_des_setkey (void *context, const byte *key, unsigned keylen) +do_des_setkey (void *context, const byte *key, unsigned keylen, + gcry_cipher_hd_t hd) { struct _des_ctx *ctx = (struct _des_ctx *) context; + (void)hd; + if (keylen != 8) return GPG_ERR_INV_KEYLEN; diff --git a/cipher/gost28147.c b/cipher/gost28147.c index 4ff80b469..1b8ab7aeb 100644 --- a/cipher/gost28147.c +++ b/cipher/gost28147.c @@ -39,11 +39,14 @@ #include "gost-sb.h" static gcry_err_code_t -gost_setkey (void *c, const byte *key, unsigned keylen) +gost_setkey (void *c, const byte *key, unsigned keylen, + gcry_cipher_hd_t hd) { int i; GOST28147_context *ctx = c; + (void)hd; + if (keylen != 256 / 8) return GPG_ERR_INV_KEYLEN; diff --git a/cipher/idea.c b/cipher/idea.c index ffe821d32..abfe67558 100644 --- a/cipher/idea.c +++ b/cipher/idea.c @@ -258,10 +258,12 @@ do_setkey( IDEA_context *c, const byte *key, unsigned int keylen ) } static gcry_err_code_t -idea_setkey (void *context, const byte *key, unsigned int keylen) +idea_setkey (void *context, const byte *key, unsigned int keylen, + gcry_cipher_hd_t hd) { IDEA_context *ctx = context; int rc = do_setkey (ctx, key, keylen); + (void)hd; _gcry_burn_stack (23+6*sizeof(void*)); return rc; } diff --git a/cipher/rfc2268.c b/cipher/rfc2268.c index aed8cadba..091494629 100644 --- a/cipher/rfc2268.c +++ b/cipher/rfc2268.c @@ -262,8 +262,10 @@ setkey_core (void *context, const unsigned char *key, unsigned int keylen, int w } static gpg_err_code_t -do_setkey (void *context, const unsigned char *key, unsigned int keylen) +do_setkey (void *context, const unsigned char *key, unsigned int keylen, + gcry_cipher_hd_t hd) { + (void)hd; return setkey_core (context, key, keylen, 1); } diff --git a/cipher/rijndael.c b/cipher/rijndael.c index 0f676fe14..f9666d0cf 100644 --- a/cipher/rijndael.c +++ b/cipher/rijndael.c @@ -513,9 +513,11 @@ do_setkey (RIJNDAEL_context *ctx, const byte *key, const unsigned keylen) static gcry_err_code_t -rijndael_setkey (void *context, const byte *key, const unsigned keylen) +rijndael_setkey (void *context, const byte *key, const unsigned keylen, + gcry_cipher_hd_t hd) { RIJNDAEL_context *ctx = context; + (void)hd; return do_setkey (ctx, key, keylen); } @@ -1580,7 +1582,7 @@ selftest_basic_128 (void) if (!ctx) return "failed to allocate memory"; - rijndael_setkey (ctx, key_128, sizeof (key_128)); + rijndael_setkey (ctx, key_128, sizeof (key_128), NULL); rijndael_encrypt (ctx, scratch, plaintext_128); 
if (memcmp (scratch, ciphertext_128, sizeof (ciphertext_128))) { @@ -1623,7 +1625,7 @@ selftest_basic_192 (void) ctx = _gcry_cipher_selftest_alloc_ctx (sizeof *ctx, &ctxmem); if (!ctx) return "failed to allocate memory"; - rijndael_setkey (ctx, key_192, sizeof(key_192)); + rijndael_setkey (ctx, key_192, sizeof(key_192), NULL); rijndael_encrypt (ctx, scratch, plaintext_192); if (memcmp (scratch, ciphertext_192, sizeof (ciphertext_192))) { @@ -1668,7 +1670,7 @@ selftest_basic_256 (void) ctx = _gcry_cipher_selftest_alloc_ctx (sizeof *ctx, &ctxmem); if (!ctx) return "failed to allocate memory"; - rijndael_setkey (ctx, key_256, sizeof(key_256)); + rijndael_setkey (ctx, key_256, sizeof(key_256), NULL); rijndael_encrypt (ctx, scratch, plaintext_256); if (memcmp (scratch, ciphertext_256, sizeof (ciphertext_256))) { diff --git a/cipher/salsa20.c b/cipher/salsa20.c index 976819856..5c5e2b547 100644 --- a/cipher/salsa20.c +++ b/cipher/salsa20.c @@ -366,10 +366,12 @@ salsa20_do_setkey (SALSA20_context_t *ctx, static gcry_err_code_t -salsa20_setkey (void *context, const byte *key, unsigned int keylen) +salsa20_setkey (void *context, const byte *key, unsigned int keylen, + gcry_cipher_hd_t hd) { SALSA20_context_t *ctx = (SALSA20_context_t *)context; gcry_err_code_t rc = salsa20_do_setkey (ctx, key, keylen); + (void)hd; _gcry_burn_stack (4 + sizeof (void *) + 4 * sizeof (void *)); return rc; } @@ -522,7 +524,7 @@ selftest (void) /* 16-byte alignment required for amd64 implementation. */ ctx = (SALSA20_context_t *)((uintptr_t)(ctxbuf + 15) & ~(uintptr_t)15); - salsa20_setkey (ctx, key_1, sizeof key_1); + salsa20_setkey (ctx, key_1, sizeof key_1, NULL); salsa20_setiv (ctx, nonce_1, sizeof nonce_1); scratch[8] = 0; salsa20_encrypt_stream (ctx, scratch, plaintext_1, sizeof plaintext_1); @@ -530,7 +532,7 @@ selftest (void) return "Salsa20 encryption test 1 failed."; if (scratch[8]) return "Salsa20 wrote too much."; - salsa20_setkey( ctx, key_1, sizeof(key_1)); + salsa20_setkey( ctx, key_1, sizeof(key_1), NULL); salsa20_setiv (ctx, nonce_1, sizeof nonce_1); salsa20_encrypt_stream (ctx, scratch, scratch, sizeof plaintext_1); if (memcmp (scratch, plaintext_1, sizeof plaintext_1)) @@ -538,12 +540,12 @@ selftest (void) for (i = 0; i < sizeof buf; i++) buf[i] = i; - salsa20_setkey (ctx, key_1, sizeof key_1); + salsa20_setkey (ctx, key_1, sizeof key_1, NULL); salsa20_setiv (ctx, nonce_1, sizeof nonce_1); /*encrypt*/ salsa20_encrypt_stream (ctx, buf, buf, sizeof buf); /*decrypt*/ - salsa20_setkey (ctx, key_1, sizeof key_1); + salsa20_setkey (ctx, key_1, sizeof key_1, NULL); salsa20_setiv (ctx, nonce_1, sizeof nonce_1); salsa20_encrypt_stream (ctx, buf, buf, 1); salsa20_encrypt_stream (ctx, buf+1, buf+1, (sizeof buf)-1-1); diff --git a/cipher/seed.c b/cipher/seed.c index 9f87c0558..e36d3cf91 100644 --- a/cipher/seed.c +++ b/cipher/seed.c @@ -309,11 +309,12 @@ do_setkey (SEED_context *ctx, const byte *key, const unsigned keylen) } static gcry_err_code_t -seed_setkey (void *context, const byte *key, const unsigned keylen) +seed_setkey (void *context, const byte *key, const unsigned keylen, + gcry_cipher_hd_t hd) { SEED_context *ctx = context; - int rc = do_setkey (ctx, key, keylen); + (void)hd; _gcry_burn_stack (4*6 + sizeof(void*)*2 + sizeof(int)*2); return rc; } @@ -446,7 +447,7 @@ selftest (void) 0x22, 0x6B, 0xC3, 0x14, 0x2C, 0xD4, 0x0D, 0x4A, }; - seed_setkey (&ctx, key, sizeof(key)); + seed_setkey (&ctx, key, sizeof(key), NULL); seed_encrypt (&ctx, scratch, plaintext); if (memcmp (scratch, ciphertext, sizeof 
(ciphertext))) return "SEED test encryption failed."; diff --git a/cipher/serpent.c b/cipher/serpent.c index ea4b8edc8..0736ad195 100644 --- a/cipher/serpent.c +++ b/cipher/serpent.c @@ -748,13 +748,16 @@ serpent_setkey_internal (serpent_context_t *context, /* Initialize CTX with the key KEY of KEY_LENGTH bytes. */ static gcry_err_code_t serpent_setkey (void *ctx, - const byte *key, unsigned int key_length) + const byte *key, unsigned int key_length, + gcry_cipher_hd_t hd) { serpent_context_t *context = ctx; static const char *serpent_test_ret; static int serpent_init_done; gcry_err_code_t ret = GPG_ERR_NO_ERROR; + (void)hd; + if (! serpent_init_done) { /* Execute a self-test the first time, Serpent is used. */ diff --git a/cipher/twofish.c b/cipher/twofish.c index 48feaae9f..0d187bda4 100644 --- a/cipher/twofish.c +++ b/cipher/twofish.c @@ -734,12 +734,15 @@ do_twofish_setkey (TWOFISH_context *ctx, const byte *key, const unsigned keylen) } static gcry_err_code_t -twofish_setkey (void *context, const byte *key, unsigned int keylen) +twofish_setkey (void *context, const byte *key, unsigned int keylen, + gcry_cipher_hd_t hd) { TWOFISH_context *ctx = context; unsigned int hwfeatures = _gcry_get_hw_features (); int rc; + (void)hd; + rc = do_twofish_setkey (ctx, key, keylen); #ifdef USE_AVX2 @@ -1623,7 +1626,7 @@ selftest (void) 0x05, 0x93, 0x1C, 0xB6, 0xD4, 0x08, 0xE7, 0xFA }; - twofish_setkey (&ctx, key, sizeof(key)); + twofish_setkey (&ctx, key, sizeof(key), NULL); twofish_encrypt (&ctx, scratch, plaintext); if (memcmp (scratch, ciphertext, sizeof (ciphertext))) return "Twofish-128 test encryption failed."; @@ -1631,7 +1634,7 @@ selftest (void) if (memcmp (scratch, plaintext, sizeof (plaintext))) return "Twofish-128 test decryption failed."; - twofish_setkey (&ctx, key_256, sizeof(key_256)); + twofish_setkey (&ctx, key_256, sizeof(key_256), NULL); twofish_encrypt (&ctx, scratch, plaintext_256); if (memcmp (scratch, ciphertext_256, sizeof (ciphertext_256))) return "Twofish-256 test encryption failed."; @@ -1713,13 +1716,13 @@ main() /* Encryption test. */ for (i = 0; i < 125; i++) { - twofish_setkey (&ctx, buffer[0], sizeof (buffer[0])); + twofish_setkey (&ctx, buffer[0], sizeof (buffer[0]), NULL); for (j = 0; j < 1000; j++) twofish_encrypt (&ctx, buffer[2], buffer[2]); - twofish_setkey (&ctx, buffer[1], sizeof (buffer[1])); + twofish_setkey (&ctx, buffer[1], sizeof (buffer[1]), NULL); for (j = 0; j < 1000; j++) twofish_encrypt (&ctx, buffer[3], buffer[3]); - twofish_setkey (&ctx, buffer[2], sizeof (buffer[2])*2); + twofish_setkey (&ctx, buffer[2], sizeof (buffer[2])*2, NULL); for (j = 0; j < 1000; j++) { twofish_encrypt (&ctx, buffer[0], buffer[0]); twofish_encrypt (&ctx, buffer[1], buffer[1]); @@ -1731,15 +1734,15 @@ main() /* Decryption test. 
*/ for (i = 0; i < 125; i++) { - twofish_setkey (&ctx, buffer[2], sizeof (buffer[2])*2); + twofish_setkey (&ctx, buffer[2], sizeof (buffer[2])*2, NULL); for (j = 0; j < 1000; j++) { twofish_decrypt (&ctx, buffer[0], buffer[0]); twofish_decrypt (&ctx, buffer[1], buffer[1]); } - twofish_setkey (&ctx, buffer[1], sizeof (buffer[1])); + twofish_setkey (&ctx, buffer[1], sizeof (buffer[1]), NULL); for (j = 0; j < 1000; j++) twofish_decrypt (&ctx, buffer[3], buffer[3]); - twofish_setkey (&ctx, buffer[0], sizeof (buffer[0])); + twofish_setkey (&ctx, buffer[0], sizeof (buffer[0]), NULL); for (j = 0; j < 1000; j++) twofish_decrypt (&ctx, buffer[2], buffer[2]); } diff --git a/src/cipher-proto.h b/src/cipher-proto.h index d1ddc5dd2..daa917c23 100644 --- a/src/cipher-proto.h +++ b/src/cipher-proto.h @@ -132,7 +132,8 @@ typedef struct gcry_pk_spec /* Type for the cipher_setkey function. */ typedef gcry_err_code_t (*gcry_cipher_setkey_t) (void *c, const unsigned char *key, - unsigned keylen); + unsigned keylen, + gcry_cipher_hd_t hd); /* Type for the cipher_encrypt function. */ typedef unsigned int (*gcry_cipher_encrypt_t) (void *c, -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 17:51:21 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 18:51:21 +0300 Subject: [PATCH 9/9] Add hash_buffer and hash_buffers pointers to message digest spec In-Reply-To: <20180619155121.7122-1-jussi.kivilinna@iki.fi> References: <20180619155121.7122-1-jussi.kivilinna@iki.fi> Message-ID: <20180619155121.7122-8-jussi.kivilinna@iki.fi> * src/cipher-proto.h (gcry_md_hash_buffer_t) (gcry_md_hash_buffers_t): New. (gcry_md_spec): Add hash_buffer and hash_buffers. * cipher/md.c (_gcry_md_hash_buffer, _gcry_md_hash_buffers): Use hash_buffer/hash_buffers from MD spec instead of hard-coding supported algorithms. * cipher/blake2.c: Add NULL to MD spec hash_buffer and hash_buffers pointers. * cipher/crc.c: Ditto. * cipher/gostr3411-94.c: Ditto. * cipher/keccak.c: Ditto. * cipher/md2.c: Ditto. * cipher/md4.c: Ditto. * cipher/md5.c: Ditto. * cipher/stribog.c: Ditto. * cipher/tiger.c: Ditto. * cipher/whirlpool.c: Ditto. * cipher/rmd160.c (_gcry_rmd160_hash_buffers): New. (_gcry_digest_spec_rmd160): Add hash_buffer and hash_buffers functions. * cipher/sha1.c (_gcry_digest_spec_sha1): Add hash_buffer and hash_buffers functions. * cipher/sha256.c (_gcry_digest_spec_sha256): Add hash_buffer and hash_buffers functions. (_gcry_digest_spec_sha224): Add NULL pointers for hash_buffer and hash_buffers. * cipher/sha512.c (_gcry_digest_spec_sha1): Add hash_buffer and hash_buffers functions. (_gcry_digest_spec_sha384): Add NULL pointers for hash_buffer and hash_buffers. * cipher/sm3.c (_gcry_digest_spec_sha1): Add hash_buffer and hash_buffers functions. 
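As a rough caller-side sketch of what these new spec hooks end up serving (an illustration only, not part of the patch; the algorithm choice, the example data and the usual gcry_check_version initialisation idiom are assumptions of the sketch, while gcry_md_hash_buffer, gcry_md_hash_buffers and gcry_buffer_t are the documented public libgcrypt API):

#include <stdio.h>
#include <string.h>
#include <gcrypt.h>

int
main (void)
{
  const char part1[] = "hello, ";
  const char part2[] = "world";
  unsigned char digest[32];        /* SHA-256 digest is 32 bytes.  */
  gcry_buffer_t iov[2];
  int i;

  if (!gcry_check_version (GCRYPT_VERSION))
    return 1;
  gcry_control (GCRYCTL_INITIALIZATION_FINISHED, 0);

  /* One contiguous buffer; served by spec->hash_buffer when the
     algorithm provides it.  */
  gcry_md_hash_buffer (GCRY_MD_SHA256, digest, "hello, world",
                       strlen ("hello, world"));

  /* Scatter-gather input; served by spec->hash_buffers when the
     algorithm provides it.  */
  memset (iov, 0, sizeof iov);
  iov[0].data = (void *) part1;
  iov[0].len  = strlen (part1);
  iov[1].data = (void *) part2;
  iov[1].len  = strlen (part2);
  if (gcry_md_hash_buffers (GCRY_MD_SHA256, 0, digest, iov, 2))
    return 1;

  for (i = 0; i < 32; i++)
    printf ("%02x", digest[i]);
  putchar ('\n');
  return 0;
}

Under this change, both calls take the per-algorithm fast path whenever the corresponding module fills in hash_buffer or hash_buffers in its gcry_md_spec_t, and fall back to the generic md_open/md_write/md_final path otherwise.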
-- Signed-off-by: Jussi Kivilinna --- cipher/blake2.c | 1 + cipher/crc.c | 3 ++ cipher/gostr3411-94.c | 2 + cipher/keccak.c | 6 +++ cipher/md.c | 120 ++++++++++++++++++++++-------------------- cipher/md2.c | 1 + cipher/md4.c | 1 + cipher/md5.c | 1 + cipher/rmd160.c | 16 ++++++ cipher/sha1.c | 1 + cipher/sha256.c | 2 + cipher/sha512.c | 2 + cipher/sm3.c | 1 + cipher/stribog.c | 4 +- cipher/tiger.c | 3 ++ cipher/whirlpool.c | 1 + src/cipher-proto.h | 10 ++++ src/cipher.h | 1 + 18 files changed, 116 insertions(+), 60 deletions(-) diff --git a/cipher/blake2.c b/cipher/blake2.c index 0f7494f21..bfd24b9f0 100644 --- a/cipher/blake2.c +++ b/cipher/blake2.c @@ -958,6 +958,7 @@ gcry_err_code_t _gcry_blake2_init_with_key(void *ctx, unsigned int flags, DIM (blake2##bs##_##dbits##_asn), oid_spec_blake2##bs##_##dbits, \ dbits / 8, blake2##bs##_##dbits##_init, blake2##bs##_write, \ blake2##bs##_final, blake2##bs##_read, NULL, \ + NULL, NULL, \ sizeof (BLAKE2##BS##_CONTEXT), selftests_blake2##bs \ }; diff --git a/cipher/crc.c b/cipher/crc.c index a1ce50b65..4457ff62f 100644 --- a/cipher/crc.c +++ b/cipher/crc.c @@ -841,6 +841,7 @@ gcry_md_spec_t _gcry_digest_spec_crc32 = GCRY_MD_CRC32, {0, 1}, "CRC32", NULL, 0, NULL, 4, crc32_init, crc32_write, crc32_final, crc32_read, NULL, + NULL, NULL, sizeof (CRC_CONTEXT) }; @@ -849,6 +850,7 @@ gcry_md_spec_t _gcry_digest_spec_crc32_rfc1510 = GCRY_MD_CRC32_RFC1510, {0, 1}, "CRC32RFC1510", NULL, 0, NULL, 4, crc32rfc1510_init, crc32_write, crc32rfc1510_final, crc32_read, NULL, + NULL, NULL, sizeof (CRC_CONTEXT) }; @@ -857,5 +859,6 @@ gcry_md_spec_t _gcry_digest_spec_crc24_rfc2440 = GCRY_MD_CRC24_RFC2440, {0, 1}, "CRC24RFC2440", NULL, 0, NULL, 3, crc24rfc2440_init, crc24rfc2440_write, crc24rfc2440_final, crc32_read, NULL, + NULL, NULL, sizeof (CRC_CONTEXT) }; diff --git a/cipher/gostr3411-94.c b/cipher/gostr3411-94.c index a782427f0..d9746275e 100644 --- a/cipher/gostr3411-94.c +++ b/cipher/gostr3411-94.c @@ -344,6 +344,7 @@ gcry_md_spec_t _gcry_digest_spec_gost3411_94 = GCRY_MD_GOSTR3411_94, {0, 0}, "GOSTR3411_94", NULL, 0, NULL, 32, gost3411_init, _gcry_md_block_write, gost3411_final, gost3411_read, NULL, + NULL, NULL, sizeof (GOSTR3411_CONTEXT) }; gcry_md_spec_t _gcry_digest_spec_gost3411_cp = @@ -351,5 +352,6 @@ gcry_md_spec_t _gcry_digest_spec_gost3411_cp = GCRY_MD_GOSTR3411_CP, {0, 0}, "GOSTR3411_CP", asn, DIM (asn), oid_spec_gostr3411, 32, gost3411_cp_init, _gcry_md_block_write, gost3411_final, gost3411_read, NULL, + NULL, NULL, sizeof (GOSTR3411_CONTEXT) }; diff --git a/cipher/keccak.c b/cipher/keccak.c index 0bb315520..db67d0714 100644 --- a/cipher/keccak.c +++ b/cipher/keccak.c @@ -1221,6 +1221,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_224 = GCRY_MD_SHA3_224, {0, 1}, "SHA3-224", sha3_224_asn, DIM (sha3_224_asn), oid_spec_sha3_224, 28, sha3_224_init, keccak_write, keccak_final, keccak_read, NULL, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1229,6 +1230,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_256 = GCRY_MD_SHA3_256, {0, 1}, "SHA3-256", sha3_256_asn, DIM (sha3_256_asn), oid_spec_sha3_256, 32, sha3_256_init, keccak_write, keccak_final, keccak_read, NULL, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1237,6 +1239,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_384 = GCRY_MD_SHA3_384, {0, 1}, "SHA3-384", sha3_384_asn, DIM (sha3_384_asn), oid_spec_sha3_384, 48, sha3_384_init, keccak_write, keccak_final, keccak_read, NULL, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1245,6 +1248,7 @@ gcry_md_spec_t 
_gcry_digest_spec_sha3_512 = GCRY_MD_SHA3_512, {0, 1}, "SHA3-512", sha3_512_asn, DIM (sha3_512_asn), oid_spec_sha3_512, 64, sha3_512_init, keccak_write, keccak_final, keccak_read, NULL, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1253,6 +1257,7 @@ gcry_md_spec_t _gcry_digest_spec_shake128 = GCRY_MD_SHAKE128, {0, 1}, "SHAKE128", shake128_asn, DIM (shake128_asn), oid_spec_shake128, 0, shake128_init, keccak_write, keccak_final, NULL, keccak_extract, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1261,6 +1266,7 @@ gcry_md_spec_t _gcry_digest_spec_shake256 = GCRY_MD_SHAKE256, {0, 1}, "SHAKE256", shake256_asn, DIM (shake256_asn), oid_spec_shake256, 0, shake256_init, keccak_write, keccak_final, NULL, keccak_extract, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; diff --git a/cipher/md.c b/cipher/md.c index 47c8cecdd..15e19a95f 100644 --- a/cipher/md.c +++ b/cipher/md.c @@ -1174,46 +1174,52 @@ void _gcry_md_hash_buffer (int algo, void *digest, const void *buffer, size_t length) { - if (0) - ; -#if USE_SHA256 - else if (algo == GCRY_MD_SHA256) - _gcry_sha256_hash_buffer (digest, buffer, length); -#endif -#if USE_SHA512 - else if (algo == GCRY_MD_SHA512) - _gcry_sha512_hash_buffer (digest, buffer, length); -#endif -#if USE_SHA1 - else if (algo == GCRY_MD_SHA1) - _gcry_sha1_hash_buffer (digest, buffer, length); -#endif -#if USE_RMD160 - else if (algo == GCRY_MD_RMD160 && !fips_mode () ) - _gcry_rmd160_hash_buffer (digest, buffer, length); -#endif + gcry_md_spec_t *spec; + + spec = spec_from_algo (algo); + if (!spec) + { + log_debug ("md_hash_buffer: algorithm %d not available\n", algo); + return; + } + + if (algo == GCRY_MD_MD5 && fips_mode ()) + { + _gcry_inactivate_fips_mode ("MD5 used"); + if (_gcry_enforced_fips_mode () ) + { + /* We should never get to here because we do not register + MD5 in enforced fips mode. */ + _gcry_fips_noreturn (); + } + } + + if (spec->hash_buffer != NULL) + { + spec->hash_buffer (digest, buffer, length); + } + else if (spec->hash_buffers != NULL) + { + gcry_buffer_t iov; + + iov.size = 0; + iov.data = (void *)buffer; + iov.off = 0; + iov.len = length; + + spec->hash_buffers (digest, &iov, 1); + } else { /* For the others we do not have a fast function, so we use the - normal functions. */ + normal functions. */ gcry_md_hd_t h; gpg_err_code_t err; - if (algo == GCRY_MD_MD5 && fips_mode ()) - { - _gcry_inactivate_fips_mode ("MD5 used"); - if (_gcry_enforced_fips_mode () ) - { - /* We should never get to here because we do not register - MD5 in enforced fips mode. 
*/ - _gcry_fips_noreturn (); - } - } - err = md_open (&h, algo, 0); if (err) - log_bug ("gcry_md_open failed for algo %d: %s", - algo, gpg_strerror (gcry_error(err))); + log_bug ("gcry_md_open failed for algo %d: %s", + algo, gpg_strerror (gcry_error(err))); md_write (h, (byte *) buffer, length); md_final (h); memcpy (digest, md_read (h, algo), md_digest_length (algo)); @@ -1240,6 +1246,7 @@ gpg_err_code_t _gcry_md_hash_buffers (int algo, unsigned int flags, void *digest, const gcry_buffer_t *iov, int iovcnt) { + gcry_md_spec_t *spec; int hmac; if (!iov || iovcnt < 0) @@ -1251,39 +1258,36 @@ _gcry_md_hash_buffers (int algo, unsigned int flags, void *digest, if (hmac && iovcnt < 1) return GPG_ERR_INV_ARG; - if (0) - ; -#if USE_SHA256 - else if (algo == GCRY_MD_SHA256 && !hmac) - _gcry_sha256_hash_buffers (digest, iov, iovcnt); -#endif -#if USE_SHA512 - else if (algo == GCRY_MD_SHA512 && !hmac) - _gcry_sha512_hash_buffers (digest, iov, iovcnt); -#endif -#if USE_SHA1 - else if (algo == GCRY_MD_SHA1 && !hmac) - _gcry_sha1_hash_buffers (digest, iov, iovcnt); -#endif + spec = spec_from_algo (algo); + if (!spec) + { + log_debug ("md_hash_buffers: algorithm %d not available\n", algo); + return GPG_ERR_DIGEST_ALGO; + } + + if (algo == GCRY_MD_MD5 && fips_mode ()) + { + _gcry_inactivate_fips_mode ("MD5 used"); + if (_gcry_enforced_fips_mode () ) + { + /* We should never get to here because we do not register + MD5 in enforced fips mode. */ + _gcry_fips_noreturn (); + } + } + + if (!hmac && spec->hash_buffers) + { + spec->hash_buffers (digest, iov, iovcnt); + } else { /* For the others we do not have a fast function, so we use the - normal functions. */ + normal functions. */ gcry_md_hd_t h; gpg_err_code_t rc; int dlen; - if (algo == GCRY_MD_MD5 && fips_mode ()) - { - _gcry_inactivate_fips_mode ("MD5 used"); - if (_gcry_enforced_fips_mode () ) - { - /* We should never get to here because we do not register - MD5 in enforced fips mode. */ - _gcry_fips_noreturn (); - } - } - /* Detect SHAKE128 like algorithms which we can't use because * our API does not allow for a variable length digest. */ dlen = md_digest_length (algo); diff --git a/cipher/md2.c b/cipher/md2.c index e339b28d0..b6f7e94f4 100644 --- a/cipher/md2.c +++ b/cipher/md2.c @@ -178,5 +178,6 @@ gcry_md_spec_t _gcry_digest_spec_md2 = GCRY_MD_MD2, {0, 0}, "MD2", asn, DIM (asn), oid_spec_md2, 16, md2_init, _gcry_md_block_write, md2_final, md2_read, NULL, + NULL, NULL, sizeof (MD2_CONTEXT) }; diff --git a/cipher/md4.c b/cipher/md4.c index afa638232..098380801 100644 --- a/cipher/md4.c +++ b/cipher/md4.c @@ -287,5 +287,6 @@ gcry_md_spec_t _gcry_digest_spec_md4 = GCRY_MD_MD4, {0, 0}, "MD4", asn, DIM (asn), oid_spec_md4,16, md4_init, _gcry_md_block_write, md4_final, md4_read, NULL, + NULL, NULL, sizeof (MD4_CONTEXT) }; diff --git a/cipher/md5.c b/cipher/md5.c index ed942cf40..e35a500c4 100644 --- a/cipher/md5.c +++ b/cipher/md5.c @@ -313,5 +313,6 @@ gcry_md_spec_t _gcry_digest_spec_md5 = GCRY_MD_MD5, {0, 0}, "MD5", asn, DIM (asn), oid_spec_md5, 16, md5_init, _gcry_md_block_write, md5_final, md5_read, NULL, + NULL, NULL, sizeof (MD5_CONTEXT) }; diff --git a/cipher/rmd160.c b/cipher/rmd160.c index 0a019b9c6..2d2fae916 100644 --- a/cipher/rmd160.c +++ b/cipher/rmd160.c @@ -486,6 +486,21 @@ _gcry_rmd160_hash_buffer (void *outbuf, const void *buffer, size_t length ) memcpy ( outbuf, hd.bctx.buf, 20 ); } +/* Variant of the above shortcut function using a multiple buffers. 
*/ +static void +_gcry_rmd160_hash_buffers (void *outbuf, const gcry_buffer_t *iov, int iovcnt) +{ + RMD160_CONTEXT hd; + + rmd160_init (&hd, 0); + for (;iovcnt > 0; iov++, iovcnt--) + _gcry_md_block_write (&hd, + (const char*)iov[0].data + iov[0].off, iov[0].len); + rmd160_final ( &hd ); + memcpy ( outbuf, hd.bctx.buf, 20 ); +} + + static byte asn[15] = /* Object ID is 1.3.36.3.2.1 */ { 0x30, 0x21, 0x30, 0x09, 0x06, 0x05, 0x2b, 0x24, 0x03, 0x02, 0x01, 0x05, 0x00, 0x04, 0x14 }; @@ -504,5 +519,6 @@ gcry_md_spec_t _gcry_digest_spec_rmd160 = GCRY_MD_RMD160, {0, 0}, "RIPEMD160", asn, DIM (asn), oid_spec_rmd160, 20, rmd160_init, _gcry_md_block_write, rmd160_final, rmd160_read, NULL, + _gcry_rmd160_hash_buffer, _gcry_rmd160_hash_buffers, sizeof (RMD160_CONTEXT) }; diff --git a/cipher/sha1.c b/cipher/sha1.c index 09868aa3f..e50262ff4 100644 --- a/cipher/sha1.c +++ b/cipher/sha1.c @@ -665,6 +665,7 @@ gcry_md_spec_t _gcry_digest_spec_sha1 = GCRY_MD_SHA1, {0, 1}, "SHA1", asn, DIM (asn), oid_spec_sha1, 20, sha1_init, _gcry_md_block_write, sha1_final, sha1_read, NULL, + _gcry_sha1_hash_buffer, _gcry_sha1_hash_buffers, sizeof (SHA1_CONTEXT), run_selftests }; diff --git a/cipher/sha256.c b/cipher/sha256.c index cb6a860ac..5c1c13f84 100644 --- a/cipher/sha256.c +++ b/cipher/sha256.c @@ -743,6 +743,7 @@ gcry_md_spec_t _gcry_digest_spec_sha224 = GCRY_MD_SHA224, {0, 1}, "SHA224", asn224, DIM (asn224), oid_spec_sha224, 28, sha224_init, _gcry_md_block_write, sha256_final, sha256_read, NULL, + NULL, NULL, sizeof (SHA256_CONTEXT), run_selftests }; @@ -752,6 +753,7 @@ gcry_md_spec_t _gcry_digest_spec_sha256 = GCRY_MD_SHA256, {0, 1}, "SHA256", asn256, DIM (asn256), oid_spec_sha256, 32, sha256_init, _gcry_md_block_write, sha256_final, sha256_read, NULL, + _gcry_sha256_hash_buffer, _gcry_sha256_hash_buffers, sizeof (SHA256_CONTEXT), run_selftests }; diff --git a/cipher/sha512.c b/cipher/sha512.c index 06e8a2b91..e83e84b83 100644 --- a/cipher/sha512.c +++ b/cipher/sha512.c @@ -925,6 +925,7 @@ gcry_md_spec_t _gcry_digest_spec_sha512 = GCRY_MD_SHA512, {0, 1}, "SHA512", sha512_asn, DIM (sha512_asn), oid_spec_sha512, 64, sha512_init, _gcry_md_block_write, sha512_final, sha512_read, NULL, + _gcry_sha512_hash_buffer, _gcry_sha512_hash_buffers, sizeof (SHA512_CONTEXT), run_selftests }; @@ -954,6 +955,7 @@ gcry_md_spec_t _gcry_digest_spec_sha384 = GCRY_MD_SHA384, {0, 1}, "SHA384", sha384_asn, DIM (sha384_asn), oid_spec_sha384, 48, sha384_init, _gcry_md_block_write, sha512_final, sha512_read, NULL, + NULL, NULL, sizeof (SHA512_CONTEXT), run_selftests }; diff --git a/cipher/sm3.c b/cipher/sm3.c index ee5daf227..c6f1a091d 100644 --- a/cipher/sm3.c +++ b/cipher/sm3.c @@ -462,6 +462,7 @@ gcry_md_spec_t _gcry_digest_spec_sm3 = GCRY_MD_SM3, {0, 1}, "SM3", asn_sm3, DIM (asn_sm3), oid_spec_sm3, 32, sm3_init, _gcry_md_block_write, sm3_final, sm3_read, NULL, + _gcry_sm3_hash_buffer, _gcry_sm3_hash_buffers, sizeof (SM3_CONTEXT), run_selftests }; diff --git a/cipher/stribog.c b/cipher/stribog.c index 7b6e330d0..459e4db99 100644 --- a/cipher/stribog.c +++ b/cipher/stribog.c @@ -1344,7 +1344,7 @@ gcry_md_spec_t _gcry_digest_spec_stribog_256 = GCRY_MD_STRIBOG256, {0, 0}, "STRIBOG256", NULL, 0, oid_spec_stribog256, 32, stribog_init_256, _gcry_md_block_write, stribog_final, stribog_read_256, - NULL, + NULL, NULL, NULL, sizeof (STRIBOG_CONTEXT) }; @@ -1353,6 +1353,6 @@ gcry_md_spec_t _gcry_digest_spec_stribog_512 = GCRY_MD_STRIBOG512, {0, 0}, "STRIBOG512", NULL, 0, oid_spec_stribog512, 64, stribog_init_512, _gcry_md_block_write, 
stribog_final, stribog_read_512, - NULL, + NULL, NULL, NULL, sizeof (STRIBOG_CONTEXT) }; diff --git a/cipher/tiger.c b/cipher/tiger.c index b60ec162f..d24d1603b 100644 --- a/cipher/tiger.c +++ b/cipher/tiger.c @@ -814,6 +814,7 @@ gcry_md_spec_t _gcry_digest_spec_tiger = GCRY_MD_TIGER, {0, 0}, "TIGER192", NULL, 0, NULL, 24, tiger_init, _gcry_md_block_write, tiger_final, tiger_read, NULL, + NULL, NULL, sizeof (TIGER_CONTEXT) }; @@ -837,6 +838,7 @@ gcry_md_spec_t _gcry_digest_spec_tiger1 = GCRY_MD_TIGER1, {0, 0}, "TIGER", asn1, DIM (asn1), oid_spec_tiger1, 24, tiger1_init, _gcry_md_block_write, tiger_final, tiger_read, NULL, + NULL, NULL, sizeof (TIGER_CONTEXT) }; @@ -848,5 +850,6 @@ gcry_md_spec_t _gcry_digest_spec_tiger2 = GCRY_MD_TIGER2, {0, 0}, "TIGER2", NULL, 0, NULL, 24, tiger2_init, _gcry_md_block_write, tiger_final, tiger_read, NULL, + NULL, NULL, sizeof (TIGER_CONTEXT) }; diff --git a/cipher/whirlpool.c b/cipher/whirlpool.c index 8a069392e..d52375ada 100644 --- a/cipher/whirlpool.c +++ b/cipher/whirlpool.c @@ -1526,5 +1526,6 @@ gcry_md_spec_t _gcry_digest_spec_whirlpool = GCRY_MD_WHIRLPOOL, {0, 0}, "WHIRLPOOL", NULL, 0, NULL, 64, whirlpool_init, whirlpool_write, whirlpool_final, whirlpool_read, NULL, + NULL, NULL, sizeof (whirlpool_context_t) }; diff --git a/src/cipher-proto.h b/src/cipher-proto.h index daa917c23..97eb0d9a6 100644 --- a/src/cipher-proto.h +++ b/src/cipher-proto.h @@ -219,6 +219,14 @@ typedef unsigned char *(*gcry_md_read_t) (void *c); /* Type for the md_extract function. */ typedef void (*gcry_md_extract_t) (void *c, void *outbuf, size_t nbytes); +/* Type for the md_hash_buffer function. */ +typedef void (*gcry_md_hash_buffer_t) (void *outbuf, const void *buffer, + size_t length); + +/* Type for the md_hash_buffers function. */ +typedef void (*gcry_md_hash_buffers_t) (void *outbuf, const gcry_buffer_t *iov, + int iovcnt); + typedef struct gcry_md_oid_spec { const char *oidstring; @@ -242,6 +250,8 @@ typedef struct gcry_md_spec gcry_md_final_t final; gcry_md_read_t read; gcry_md_extract_t extract; + gcry_md_hash_buffer_t hash_buffer; + gcry_md_hash_buffers_t hash_buffers; size_t contextsize; /* allocate this amount of context */ selftest_func_t selftest; } gcry_md_spec_t; diff --git a/src/cipher.h b/src/cipher.h index 7c2e5d9e7..6e89be3da 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -115,6 +115,7 @@ gcry_err_code_t _gcry_cipher_cmac_set_subkeys /*-- rmd160.c --*/ void _gcry_rmd160_hash_buffer (void *outbuf, const void *buffer, size_t length); + /*-- sha1.c --*/ void _gcry_sha1_hash_buffer (void *outbuf, const void *buffer, size_t length); -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 18:03:41 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 19:03:41 +0300 Subject: [PATCH 2/2] Add hash_buffer and hash_buffers for SHA-224, SHA-385, SHA3 and BLAKE2 In-Reply-To: <20180619160341.8946-1-jussi.kivilinna@iki.fi> References: <20180619160341.8946-1-jussi.kivilinna@iki.fi> Message-ID: <20180619160341.8946-2-jussi.kivilinna@iki.fi> * cipher/blake2.c (DEFINE_BLAKE2_VARIANT): Add hash_buffer and hash_buffers functions for BLAKE2 variants. * cipher/keccak.c (_gcry_sha3_hash_buffer, _gcry_sha3_hash_buffers) (_gcry_sha3_224_hash_buffer, _gcry_sha3_224_hash_buffers) (_gcry_sha3_256_hash_buffer, _gcry_sha3_256_hash_buffers) (_gcry_sha3_384_hash_buffer, _gcry_sha3_384_hash_buffers) (_gcry_sha3_512_hash_buffer, _gcry_sha3_512_hash_buffers): New. * cipher/sha256.c (_gcry_sha224_hash_buffer) (_gcry_sha224_hash_buffers): New. 
* cipher/sha512.c (_gcry_sha384_hash_buffer) (_gcry_sha384_hash_buffers): New. -- Signed-off-by: Jussi Kivilinna --- cipher/blake2.c | 25 +++++++++++++- cipher/keccak.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++--- cipher/sha256.c | 31 ++++++++++++++++- cipher/sha512.c | 32 +++++++++++++++++- 4 files changed, 171 insertions(+), 7 deletions(-) diff --git a/cipher/blake2.c b/cipher/blake2.c index bfd24b9f0..f2bf49e52 100644 --- a/cipher/blake2.c +++ b/cipher/blake2.c @@ -945,6 +945,28 @@ gcry_err_code_t _gcry_blake2_init_with_key(void *ctx, unsigned int flags, int err = blake2##bs##_init_ctx (ctx, flags, NULL, 0, dbits); \ gcry_assert (err == 0); \ } \ + static void \ + _gcry_blake2##bs##_##dbits##_hash_buffer(void *outbuf, \ + const void *buffer, size_t length) \ + { \ + BLAKE2##BS##_CONTEXT hd; \ + blake2##bs##_##dbits##_init (&hd, 0); \ + blake2##bs##_write (&hd, buffer, length); \ + blake2##bs##_final (&hd); \ + memcpy (outbuf, blake2##bs##_read (&hd), dbits / 8); \ + } \ + static void \ + _gcry_blake2##bs##_##dbits##_hash_buffers(void *outbuf, \ + const gcry_buffer_t *iov, int iovcnt) \ + { \ + BLAKE2##BS##_CONTEXT hd; \ + blake2##bs##_##dbits##_init (&hd, 0); \ + for (;iovcnt > 0; iov++, iovcnt--) \ + blake2##bs##_write (&hd, (const char*)iov[0].data + iov[0].off, \ + iov[0].len); \ + blake2##bs##_final (&hd); \ + memcpy (outbuf, blake2##bs##_read (&hd), dbits / 8); \ + } \ static byte blake2##bs##_##dbits##_asn[] = { 0x30 }; \ static gcry_md_oid_spec_t oid_spec_blake2##bs##_##dbits[] = \ { \ @@ -958,7 +980,8 @@ gcry_err_code_t _gcry_blake2_init_with_key(void *ctx, unsigned int flags, DIM (blake2##bs##_##dbits##_asn), oid_spec_blake2##bs##_##dbits, \ dbits / 8, blake2##bs##_##dbits##_init, blake2##bs##_write, \ blake2##bs##_final, blake2##bs##_read, NULL, \ - NULL, NULL, \ + _gcry_blake2##bs##_##dbits##_hash_buffer, \ + _gcry_blake2##bs##_##dbits##_hash_buffers, \ sizeof (BLAKE2##BS##_CONTEXT), selftests_blake2##bs \ }; diff --git a/cipher/keccak.c b/cipher/keccak.c index db67d0714..24963f120 100644 --- a/cipher/keccak.c +++ b/cipher/keccak.c @@ -998,6 +998,88 @@ keccak_extract (void *context, void *out, size_t outlen) } +/* Shortcut functions which puts the hash value of the supplied buffer + * into outbuf which must have a size of 'spec->mdlen' bytes. */ +static void +_gcry_sha3_hash_buffer (void *outbuf, const void *buffer, size_t length, + const gcry_md_spec_t *spec) +{ + KECCAK_CONTEXT hd; + + spec->init (&hd, 0); + keccak_write (&hd, buffer, length); + keccak_final (&hd); + memcpy (outbuf, keccak_read (&hd), spec->mdlen); +} + + +/* Variant of the above shortcut function using multiple buffers. 
*/ +static void +_gcry_sha3_hash_buffers (void *outbuf, const gcry_buffer_t *iov, int iovcnt, + const gcry_md_spec_t *spec) +{ + KECCAK_CONTEXT hd; + + spec->init (&hd, 0); + for (;iovcnt > 0; iov++, iovcnt--) + keccak_write (&hd, (const char*)iov[0].data + iov[0].off, iov[0].len); + keccak_final (&hd); + memcpy (outbuf, keccak_read (&hd), spec->mdlen); +} + + +static void +_gcry_sha3_224_hash_buffer (void *outbuf, const void *buffer, size_t length) +{ + _gcry_sha3_hash_buffer (outbuf, buffer, length, &_gcry_digest_spec_sha3_224); +} + +static void +_gcry_sha3_256_hash_buffer (void *outbuf, const void *buffer, size_t length) +{ + _gcry_sha3_hash_buffer (outbuf, buffer, length, &_gcry_digest_spec_sha3_256); +} + +static void +_gcry_sha3_384_hash_buffer (void *outbuf, const void *buffer, size_t length) +{ + _gcry_sha3_hash_buffer (outbuf, buffer, length, &_gcry_digest_spec_sha3_384); +} + +static void +_gcry_sha3_512_hash_buffer (void *outbuf, const void *buffer, size_t length) +{ + _gcry_sha3_hash_buffer (outbuf, buffer, length, &_gcry_digest_spec_sha3_512); +} + +static void +_gcry_sha3_224_hash_buffers (void *outbuf, const gcry_buffer_t *iov, + int iovcnt) +{ + _gcry_sha3_hash_buffers (outbuf, iov, iovcnt, &_gcry_digest_spec_sha3_224); +} + +static void +_gcry_sha3_256_hash_buffers (void *outbuf, const gcry_buffer_t *iov, + int iovcnt) +{ + _gcry_sha3_hash_buffers (outbuf, iov, iovcnt, &_gcry_digest_spec_sha3_256); +} + +static void +_gcry_sha3_384_hash_buffers (void *outbuf, const gcry_buffer_t *iov, + int iovcnt) +{ + _gcry_sha3_hash_buffers (outbuf, iov, iovcnt, &_gcry_digest_spec_sha3_384); +} + +static void +_gcry_sha3_512_hash_buffers (void *outbuf, const gcry_buffer_t *iov, + int iovcnt) +{ + _gcry_sha3_hash_buffers (outbuf, iov, iovcnt, &_gcry_digest_spec_sha3_512); +} + /* Self-test section. @@ -1221,7 +1303,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_224 = GCRY_MD_SHA3_224, {0, 1}, "SHA3-224", sha3_224_asn, DIM (sha3_224_asn), oid_spec_sha3_224, 28, sha3_224_init, keccak_write, keccak_final, keccak_read, NULL, - NULL, NULL, + _gcry_sha3_224_hash_buffer, _gcry_sha3_224_hash_buffers, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1230,7 +1312,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_256 = GCRY_MD_SHA3_256, {0, 1}, "SHA3-256", sha3_256_asn, DIM (sha3_256_asn), oid_spec_sha3_256, 32, sha3_256_init, keccak_write, keccak_final, keccak_read, NULL, - NULL, NULL, + _gcry_sha3_256_hash_buffer, _gcry_sha3_256_hash_buffers, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1239,7 +1321,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_384 = GCRY_MD_SHA3_384, {0, 1}, "SHA3-384", sha3_384_asn, DIM (sha3_384_asn), oid_spec_sha3_384, 48, sha3_384_init, keccak_write, keccak_final, keccak_read, NULL, - NULL, NULL, + _gcry_sha3_384_hash_buffer, _gcry_sha3_384_hash_buffers, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1248,7 +1330,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_512 = GCRY_MD_SHA3_512, {0, 1}, "SHA3-512", sha3_512_asn, DIM (sha3_512_asn), oid_spec_sha3_512, 64, sha3_512_init, keccak_write, keccak_final, keccak_read, NULL, - NULL, NULL, + _gcry_sha3_512_hash_buffer, _gcry_sha3_512_hash_buffers, sizeof (KECCAK_CONTEXT), run_selftests }; diff --git a/cipher/sha256.c b/cipher/sha256.c index 5c1c13f84..069597074 100644 --- a/cipher/sha256.c +++ b/cipher/sha256.c @@ -588,6 +588,35 @@ _gcry_sha256_hash_buffers (void *outbuf, const gcry_buffer_t *iov, int iovcnt) } +/* Shortcut functions which puts the hash value of the supplied buffer + * into outbuf which must have a size of 28 bytes. 
*/ +static void +_gcry_sha224_hash_buffer (void *outbuf, const void *buffer, size_t length) +{ + SHA256_CONTEXT hd; + + sha224_init (&hd, 0); + _gcry_md_block_write (&hd, buffer, length); + sha256_final (&hd); + memcpy (outbuf, hd.bctx.buf, 28); +} + + +/* Variant of the above shortcut function using multiple buffers. */ +static void +_gcry_sha224_hash_buffers (void *outbuf, const gcry_buffer_t *iov, int iovcnt) +{ + SHA256_CONTEXT hd; + + sha224_init (&hd, 0); + for (;iovcnt > 0; iov++, iovcnt--) + _gcry_md_block_write (&hd, + (const char*)iov[0].data + iov[0].off, iov[0].len); + sha256_final (&hd); + memcpy (outbuf, hd.bctx.buf, 28); +} + + /* Self-test section. @@ -743,7 +772,7 @@ gcry_md_spec_t _gcry_digest_spec_sha224 = GCRY_MD_SHA224, {0, 1}, "SHA224", asn224, DIM (asn224), oid_spec_sha224, 28, sha224_init, _gcry_md_block_write, sha256_final, sha256_read, NULL, - NULL, NULL, + _gcry_sha224_hash_buffer, _gcry_sha224_hash_buffers, sizeof (SHA256_CONTEXT), run_selftests }; diff --git a/cipher/sha512.c b/cipher/sha512.c index e83e84b83..9405de80b 100644 --- a/cipher/sha512.c +++ b/cipher/sha512.c @@ -768,6 +768,36 @@ _gcry_sha512_hash_buffers (void *outbuf, const gcry_buffer_t *iov, int iovcnt) } + +/* Shortcut functions which puts the hash value of the supplied buffer + * into outbuf which must have a size of 48 bytes. */ +static void +_gcry_sha384_hash_buffer (void *outbuf, const void *buffer, size_t length) +{ + SHA512_CONTEXT hd; + + sha384_init (&hd, 0); + _gcry_md_block_write (&hd, buffer, length); + sha512_final (&hd); + memcpy (outbuf, hd.bctx.buf, 48); +} + + +/* Variant of the above shortcut function using multiple buffers. */ +static void +_gcry_sha384_hash_buffers (void *outbuf, const gcry_buffer_t *iov, int iovcnt) +{ + SHA512_CONTEXT hd; + + sha384_init (&hd, 0); + for (;iovcnt > 0; iov++, iovcnt--) + _gcry_md_block_write (&hd, + (const char*)iov[0].data + iov[0].off, iov[0].len); + sha512_final (&hd); + memcpy (outbuf, hd.bctx.buf, 48); +} + + /* Self-test section. @@ -955,7 +985,7 @@ gcry_md_spec_t _gcry_digest_spec_sha384 = GCRY_MD_SHA384, {0, 1}, "SHA384", sha384_asn, DIM (sha384_asn), oid_spec_sha384, 48, sha384_init, _gcry_md_block_write, sha512_final, sha512_read, NULL, - NULL, NULL, + _gcry_sha384_hash_buffer, _gcry_sha384_hash_buffers, sizeof (SHA512_CONTEXT), run_selftests }; -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 18:03:40 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 19:03:40 +0300 Subject: [PATCH 1/2] Add hash_buffer and hash_buffers pointers to message digest spec Message-ID: <20180619160341.8946-1-jussi.kivilinna@iki.fi> * src/cipher-proto.h (gcry_md_hash_buffer_t) (gcry_md_hash_buffers_t): New. (gcry_md_spec): Add hash_buffer and hash_buffers. * cipher/md.c (_gcry_md_hash_buffer, _gcry_md_hash_buffers): Use hash_buffer/hash_buffers from MD spec instead of hard-coding supported algorithms. * cipher/blake2.c: Add NULL to MD spec hash_buffer and hash_buffers pointers. * cipher/crc.c: Ditto. * cipher/gostr3411-94.c: Ditto. * cipher/keccak.c: Ditto. * cipher/md2.c: Ditto. * cipher/md4.c: Ditto. * cipher/md5.c: Ditto. * cipher/stribog.c: Ditto. * cipher/tiger.c: Ditto. * cipher/whirlpool.c: Ditto. * cipher/rmd160.c (_gcry_rmd160_hash_buffers): New. (_gcry_digest_spec_rmd160): Add hash_buffer and hash_buffers functions. * cipher/sha1.c (_gcry_digest_spec_sha1): Add hash_buffer and hash_buffers functions. * cipher/sha256.c (_gcry_digest_spec_sha256): Add hash_buffer and hash_buffers functions. 
(_gcry_digest_spec_sha224): Add NULL pointers for hash_buffer and hash_buffers. * cipher/sha512.c (_gcry_digest_spec_sha1): Add hash_buffer and hash_buffers functions. (_gcry_digest_spec_sha384): Add NULL pointers for hash_buffer and hash_buffers. * cipher/sm3.c (_gcry_digest_spec_sha1): Add hash_buffer and hash_buffers functions. -- Signed-off-by: Jussi Kivilinna --- cipher/blake2.c | 1 + cipher/crc.c | 3 ++ cipher/gostr3411-94.c | 2 + cipher/keccak.c | 6 +++ cipher/md.c | 120 ++++++++++++++++++++++-------------------- cipher/md2.c | 1 + cipher/md4.c | 1 + cipher/md5.c | 1 + cipher/rmd160.c | 16 ++++++ cipher/sha1.c | 1 + cipher/sha256.c | 2 + cipher/sha512.c | 2 + cipher/sm3.c | 1 + cipher/stribog.c | 4 +- cipher/tiger.c | 3 ++ cipher/whirlpool.c | 1 + src/cipher-proto.h | 10 ++++ src/cipher.h | 1 + 18 files changed, 116 insertions(+), 60 deletions(-) diff --git a/cipher/blake2.c b/cipher/blake2.c index 0f7494f21..bfd24b9f0 100644 --- a/cipher/blake2.c +++ b/cipher/blake2.c @@ -958,6 +958,7 @@ gcry_err_code_t _gcry_blake2_init_with_key(void *ctx, unsigned int flags, DIM (blake2##bs##_##dbits##_asn), oid_spec_blake2##bs##_##dbits, \ dbits / 8, blake2##bs##_##dbits##_init, blake2##bs##_write, \ blake2##bs##_final, blake2##bs##_read, NULL, \ + NULL, NULL, \ sizeof (BLAKE2##BS##_CONTEXT), selftests_blake2##bs \ }; diff --git a/cipher/crc.c b/cipher/crc.c index a1ce50b65..4457ff62f 100644 --- a/cipher/crc.c +++ b/cipher/crc.c @@ -841,6 +841,7 @@ gcry_md_spec_t _gcry_digest_spec_crc32 = GCRY_MD_CRC32, {0, 1}, "CRC32", NULL, 0, NULL, 4, crc32_init, crc32_write, crc32_final, crc32_read, NULL, + NULL, NULL, sizeof (CRC_CONTEXT) }; @@ -849,6 +850,7 @@ gcry_md_spec_t _gcry_digest_spec_crc32_rfc1510 = GCRY_MD_CRC32_RFC1510, {0, 1}, "CRC32RFC1510", NULL, 0, NULL, 4, crc32rfc1510_init, crc32_write, crc32rfc1510_final, crc32_read, NULL, + NULL, NULL, sizeof (CRC_CONTEXT) }; @@ -857,5 +859,6 @@ gcry_md_spec_t _gcry_digest_spec_crc24_rfc2440 = GCRY_MD_CRC24_RFC2440, {0, 1}, "CRC24RFC2440", NULL, 0, NULL, 3, crc24rfc2440_init, crc24rfc2440_write, crc24rfc2440_final, crc32_read, NULL, + NULL, NULL, sizeof (CRC_CONTEXT) }; diff --git a/cipher/gostr3411-94.c b/cipher/gostr3411-94.c index a782427f0..d9746275e 100644 --- a/cipher/gostr3411-94.c +++ b/cipher/gostr3411-94.c @@ -344,6 +344,7 @@ gcry_md_spec_t _gcry_digest_spec_gost3411_94 = GCRY_MD_GOSTR3411_94, {0, 0}, "GOSTR3411_94", NULL, 0, NULL, 32, gost3411_init, _gcry_md_block_write, gost3411_final, gost3411_read, NULL, + NULL, NULL, sizeof (GOSTR3411_CONTEXT) }; gcry_md_spec_t _gcry_digest_spec_gost3411_cp = @@ -351,5 +352,6 @@ gcry_md_spec_t _gcry_digest_spec_gost3411_cp = GCRY_MD_GOSTR3411_CP, {0, 0}, "GOSTR3411_CP", asn, DIM (asn), oid_spec_gostr3411, 32, gost3411_cp_init, _gcry_md_block_write, gost3411_final, gost3411_read, NULL, + NULL, NULL, sizeof (GOSTR3411_CONTEXT) }; diff --git a/cipher/keccak.c b/cipher/keccak.c index 0bb315520..db67d0714 100644 --- a/cipher/keccak.c +++ b/cipher/keccak.c @@ -1221,6 +1221,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_224 = GCRY_MD_SHA3_224, {0, 1}, "SHA3-224", sha3_224_asn, DIM (sha3_224_asn), oid_spec_sha3_224, 28, sha3_224_init, keccak_write, keccak_final, keccak_read, NULL, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1229,6 +1230,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_256 = GCRY_MD_SHA3_256, {0, 1}, "SHA3-256", sha3_256_asn, DIM (sha3_256_asn), oid_spec_sha3_256, 32, sha3_256_init, keccak_write, keccak_final, keccak_read, NULL, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ 
-1237,6 +1239,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_384 = GCRY_MD_SHA3_384, {0, 1}, "SHA3-384", sha3_384_asn, DIM (sha3_384_asn), oid_spec_sha3_384, 48, sha3_384_init, keccak_write, keccak_final, keccak_read, NULL, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1245,6 +1248,7 @@ gcry_md_spec_t _gcry_digest_spec_sha3_512 = GCRY_MD_SHA3_512, {0, 1}, "SHA3-512", sha3_512_asn, DIM (sha3_512_asn), oid_spec_sha3_512, 64, sha3_512_init, keccak_write, keccak_final, keccak_read, NULL, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1253,6 +1257,7 @@ gcry_md_spec_t _gcry_digest_spec_shake128 = GCRY_MD_SHAKE128, {0, 1}, "SHAKE128", shake128_asn, DIM (shake128_asn), oid_spec_shake128, 0, shake128_init, keccak_write, keccak_final, NULL, keccak_extract, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; @@ -1261,6 +1266,7 @@ gcry_md_spec_t _gcry_digest_spec_shake256 = GCRY_MD_SHAKE256, {0, 1}, "SHAKE256", shake256_asn, DIM (shake256_asn), oid_spec_shake256, 0, shake256_init, keccak_write, keccak_final, NULL, keccak_extract, + NULL, NULL, sizeof (KECCAK_CONTEXT), run_selftests }; diff --git a/cipher/md.c b/cipher/md.c index 47c8cecdd..15e19a95f 100644 --- a/cipher/md.c +++ b/cipher/md.c @@ -1174,46 +1174,52 @@ void _gcry_md_hash_buffer (int algo, void *digest, const void *buffer, size_t length) { - if (0) - ; -#if USE_SHA256 - else if (algo == GCRY_MD_SHA256) - _gcry_sha256_hash_buffer (digest, buffer, length); -#endif -#if USE_SHA512 - else if (algo == GCRY_MD_SHA512) - _gcry_sha512_hash_buffer (digest, buffer, length); -#endif -#if USE_SHA1 - else if (algo == GCRY_MD_SHA1) - _gcry_sha1_hash_buffer (digest, buffer, length); -#endif -#if USE_RMD160 - else if (algo == GCRY_MD_RMD160 && !fips_mode () ) - _gcry_rmd160_hash_buffer (digest, buffer, length); -#endif + gcry_md_spec_t *spec; + + spec = spec_from_algo (algo); + if (!spec) + { + log_debug ("md_hash_buffer: algorithm %d not available\n", algo); + return; + } + + if (algo == GCRY_MD_MD5 && fips_mode ()) + { + _gcry_inactivate_fips_mode ("MD5 used"); + if (_gcry_enforced_fips_mode () ) + { + /* We should never get to here because we do not register + MD5 in enforced fips mode. */ + _gcry_fips_noreturn (); + } + } + + if (spec->hash_buffer != NULL) + { + spec->hash_buffer (digest, buffer, length); + } + else if (spec->hash_buffers != NULL) + { + gcry_buffer_t iov; + + iov.size = 0; + iov.data = (void *)buffer; + iov.off = 0; + iov.len = length; + + spec->hash_buffers (digest, &iov, 1); + } else { /* For the others we do not have a fast function, so we use the - normal functions. */ + normal functions. */ gcry_md_hd_t h; gpg_err_code_t err; - if (algo == GCRY_MD_MD5 && fips_mode ()) - { - _gcry_inactivate_fips_mode ("MD5 used"); - if (_gcry_enforced_fips_mode () ) - { - /* We should never get to here because we do not register - MD5 in enforced fips mode. 
*/ - _gcry_fips_noreturn (); - } - } - err = md_open (&h, algo, 0); if (err) - log_bug ("gcry_md_open failed for algo %d: %s", - algo, gpg_strerror (gcry_error(err))); + log_bug ("gcry_md_open failed for algo %d: %s", + algo, gpg_strerror (gcry_error(err))); md_write (h, (byte *) buffer, length); md_final (h); memcpy (digest, md_read (h, algo), md_digest_length (algo)); @@ -1240,6 +1246,7 @@ gpg_err_code_t _gcry_md_hash_buffers (int algo, unsigned int flags, void *digest, const gcry_buffer_t *iov, int iovcnt) { + gcry_md_spec_t *spec; int hmac; if (!iov || iovcnt < 0) @@ -1251,39 +1258,36 @@ _gcry_md_hash_buffers (int algo, unsigned int flags, void *digest, if (hmac && iovcnt < 1) return GPG_ERR_INV_ARG; - if (0) - ; -#if USE_SHA256 - else if (algo == GCRY_MD_SHA256 && !hmac) - _gcry_sha256_hash_buffers (digest, iov, iovcnt); -#endif -#if USE_SHA512 - else if (algo == GCRY_MD_SHA512 && !hmac) - _gcry_sha512_hash_buffers (digest, iov, iovcnt); -#endif -#if USE_SHA1 - else if (algo == GCRY_MD_SHA1 && !hmac) - _gcry_sha1_hash_buffers (digest, iov, iovcnt); -#endif + spec = spec_from_algo (algo); + if (!spec) + { + log_debug ("md_hash_buffers: algorithm %d not available\n", algo); + return GPG_ERR_DIGEST_ALGO; + } + + if (algo == GCRY_MD_MD5 && fips_mode ()) + { + _gcry_inactivate_fips_mode ("MD5 used"); + if (_gcry_enforced_fips_mode () ) + { + /* We should never get to here because we do not register + MD5 in enforced fips mode. */ + _gcry_fips_noreturn (); + } + } + + if (!hmac && spec->hash_buffers) + { + spec->hash_buffers (digest, iov, iovcnt); + } else { /* For the others we do not have a fast function, so we use the - normal functions. */ + normal functions. */ gcry_md_hd_t h; gpg_err_code_t rc; int dlen; - if (algo == GCRY_MD_MD5 && fips_mode ()) - { - _gcry_inactivate_fips_mode ("MD5 used"); - if (_gcry_enforced_fips_mode () ) - { - /* We should never get to here because we do not register - MD5 in enforced fips mode. */ - _gcry_fips_noreturn (); - } - } - /* Detect SHAKE128 like algorithms which we can't use because * our API does not allow for a variable length digest. */ dlen = md_digest_length (algo); diff --git a/cipher/md2.c b/cipher/md2.c index e339b28d0..b6f7e94f4 100644 --- a/cipher/md2.c +++ b/cipher/md2.c @@ -178,5 +178,6 @@ gcry_md_spec_t _gcry_digest_spec_md2 = GCRY_MD_MD2, {0, 0}, "MD2", asn, DIM (asn), oid_spec_md2, 16, md2_init, _gcry_md_block_write, md2_final, md2_read, NULL, + NULL, NULL, sizeof (MD2_CONTEXT) }; diff --git a/cipher/md4.c b/cipher/md4.c index afa638232..098380801 100644 --- a/cipher/md4.c +++ b/cipher/md4.c @@ -287,5 +287,6 @@ gcry_md_spec_t _gcry_digest_spec_md4 = GCRY_MD_MD4, {0, 0}, "MD4", asn, DIM (asn), oid_spec_md4,16, md4_init, _gcry_md_block_write, md4_final, md4_read, NULL, + NULL, NULL, sizeof (MD4_CONTEXT) }; diff --git a/cipher/md5.c b/cipher/md5.c index ed942cf40..e35a500c4 100644 --- a/cipher/md5.c +++ b/cipher/md5.c @@ -313,5 +313,6 @@ gcry_md_spec_t _gcry_digest_spec_md5 = GCRY_MD_MD5, {0, 0}, "MD5", asn, DIM (asn), oid_spec_md5, 16, md5_init, _gcry_md_block_write, md5_final, md5_read, NULL, + NULL, NULL, sizeof (MD5_CONTEXT) }; diff --git a/cipher/rmd160.c b/cipher/rmd160.c index 0a019b9c6..2d2fae916 100644 --- a/cipher/rmd160.c +++ b/cipher/rmd160.c @@ -486,6 +486,21 @@ _gcry_rmd160_hash_buffer (void *outbuf, const void *buffer, size_t length ) memcpy ( outbuf, hd.bctx.buf, 20 ); } +/* Variant of the above shortcut function using a multiple buffers. 
*/ +static void +_gcry_rmd160_hash_buffers (void *outbuf, const gcry_buffer_t *iov, int iovcnt) +{ + RMD160_CONTEXT hd; + + rmd160_init (&hd, 0); + for (;iovcnt > 0; iov++, iovcnt--) + _gcry_md_block_write (&hd, + (const char*)iov[0].data + iov[0].off, iov[0].len); + rmd160_final ( &hd ); + memcpy ( outbuf, hd.bctx.buf, 20 ); +} + + static byte asn[15] = /* Object ID is 1.3.36.3.2.1 */ { 0x30, 0x21, 0x30, 0x09, 0x06, 0x05, 0x2b, 0x24, 0x03, 0x02, 0x01, 0x05, 0x00, 0x04, 0x14 }; @@ -504,5 +519,6 @@ gcry_md_spec_t _gcry_digest_spec_rmd160 = GCRY_MD_RMD160, {0, 0}, "RIPEMD160", asn, DIM (asn), oid_spec_rmd160, 20, rmd160_init, _gcry_md_block_write, rmd160_final, rmd160_read, NULL, + _gcry_rmd160_hash_buffer, _gcry_rmd160_hash_buffers, sizeof (RMD160_CONTEXT) }; diff --git a/cipher/sha1.c b/cipher/sha1.c index 09868aa3f..e50262ff4 100644 --- a/cipher/sha1.c +++ b/cipher/sha1.c @@ -665,6 +665,7 @@ gcry_md_spec_t _gcry_digest_spec_sha1 = GCRY_MD_SHA1, {0, 1}, "SHA1", asn, DIM (asn), oid_spec_sha1, 20, sha1_init, _gcry_md_block_write, sha1_final, sha1_read, NULL, + _gcry_sha1_hash_buffer, _gcry_sha1_hash_buffers, sizeof (SHA1_CONTEXT), run_selftests }; diff --git a/cipher/sha256.c b/cipher/sha256.c index cb6a860ac..5c1c13f84 100644 --- a/cipher/sha256.c +++ b/cipher/sha256.c @@ -743,6 +743,7 @@ gcry_md_spec_t _gcry_digest_spec_sha224 = GCRY_MD_SHA224, {0, 1}, "SHA224", asn224, DIM (asn224), oid_spec_sha224, 28, sha224_init, _gcry_md_block_write, sha256_final, sha256_read, NULL, + NULL, NULL, sizeof (SHA256_CONTEXT), run_selftests }; @@ -752,6 +753,7 @@ gcry_md_spec_t _gcry_digest_spec_sha256 = GCRY_MD_SHA256, {0, 1}, "SHA256", asn256, DIM (asn256), oid_spec_sha256, 32, sha256_init, _gcry_md_block_write, sha256_final, sha256_read, NULL, + _gcry_sha256_hash_buffer, _gcry_sha256_hash_buffers, sizeof (SHA256_CONTEXT), run_selftests }; diff --git a/cipher/sha512.c b/cipher/sha512.c index 06e8a2b91..e83e84b83 100644 --- a/cipher/sha512.c +++ b/cipher/sha512.c @@ -925,6 +925,7 @@ gcry_md_spec_t _gcry_digest_spec_sha512 = GCRY_MD_SHA512, {0, 1}, "SHA512", sha512_asn, DIM (sha512_asn), oid_spec_sha512, 64, sha512_init, _gcry_md_block_write, sha512_final, sha512_read, NULL, + _gcry_sha512_hash_buffer, _gcry_sha512_hash_buffers, sizeof (SHA512_CONTEXT), run_selftests }; @@ -954,6 +955,7 @@ gcry_md_spec_t _gcry_digest_spec_sha384 = GCRY_MD_SHA384, {0, 1}, "SHA384", sha384_asn, DIM (sha384_asn), oid_spec_sha384, 48, sha384_init, _gcry_md_block_write, sha512_final, sha512_read, NULL, + NULL, NULL, sizeof (SHA512_CONTEXT), run_selftests }; diff --git a/cipher/sm3.c b/cipher/sm3.c index ee5daf227..c6f1a091d 100644 --- a/cipher/sm3.c +++ b/cipher/sm3.c @@ -462,6 +462,7 @@ gcry_md_spec_t _gcry_digest_spec_sm3 = GCRY_MD_SM3, {0, 1}, "SM3", asn_sm3, DIM (asn_sm3), oid_spec_sm3, 32, sm3_init, _gcry_md_block_write, sm3_final, sm3_read, NULL, + _gcry_sm3_hash_buffer, _gcry_sm3_hash_buffers, sizeof (SM3_CONTEXT), run_selftests }; diff --git a/cipher/stribog.c b/cipher/stribog.c index 7b6e330d0..459e4db99 100644 --- a/cipher/stribog.c +++ b/cipher/stribog.c @@ -1344,7 +1344,7 @@ gcry_md_spec_t _gcry_digest_spec_stribog_256 = GCRY_MD_STRIBOG256, {0, 0}, "STRIBOG256", NULL, 0, oid_spec_stribog256, 32, stribog_init_256, _gcry_md_block_write, stribog_final, stribog_read_256, - NULL, + NULL, NULL, NULL, sizeof (STRIBOG_CONTEXT) }; @@ -1353,6 +1353,6 @@ gcry_md_spec_t _gcry_digest_spec_stribog_512 = GCRY_MD_STRIBOG512, {0, 0}, "STRIBOG512", NULL, 0, oid_spec_stribog512, 64, stribog_init_512, _gcry_md_block_write, 
stribog_final, stribog_read_512, - NULL, + NULL, NULL, NULL, sizeof (STRIBOG_CONTEXT) }; diff --git a/cipher/tiger.c b/cipher/tiger.c index b60ec162f..d24d1603b 100644 --- a/cipher/tiger.c +++ b/cipher/tiger.c @@ -814,6 +814,7 @@ gcry_md_spec_t _gcry_digest_spec_tiger = GCRY_MD_TIGER, {0, 0}, "TIGER192", NULL, 0, NULL, 24, tiger_init, _gcry_md_block_write, tiger_final, tiger_read, NULL, + NULL, NULL, sizeof (TIGER_CONTEXT) }; @@ -837,6 +838,7 @@ gcry_md_spec_t _gcry_digest_spec_tiger1 = GCRY_MD_TIGER1, {0, 0}, "TIGER", asn1, DIM (asn1), oid_spec_tiger1, 24, tiger1_init, _gcry_md_block_write, tiger_final, tiger_read, NULL, + NULL, NULL, sizeof (TIGER_CONTEXT) }; @@ -848,5 +850,6 @@ gcry_md_spec_t _gcry_digest_spec_tiger2 = GCRY_MD_TIGER2, {0, 0}, "TIGER2", NULL, 0, NULL, 24, tiger2_init, _gcry_md_block_write, tiger_final, tiger_read, NULL, + NULL, NULL, sizeof (TIGER_CONTEXT) }; diff --git a/cipher/whirlpool.c b/cipher/whirlpool.c index 8a069392e..d52375ada 100644 --- a/cipher/whirlpool.c +++ b/cipher/whirlpool.c @@ -1526,5 +1526,6 @@ gcry_md_spec_t _gcry_digest_spec_whirlpool = GCRY_MD_WHIRLPOOL, {0, 0}, "WHIRLPOOL", NULL, 0, NULL, 64, whirlpool_init, whirlpool_write, whirlpool_final, whirlpool_read, NULL, + NULL, NULL, sizeof (whirlpool_context_t) }; diff --git a/src/cipher-proto.h b/src/cipher-proto.h index daa917c23..97eb0d9a6 100644 --- a/src/cipher-proto.h +++ b/src/cipher-proto.h @@ -219,6 +219,14 @@ typedef unsigned char *(*gcry_md_read_t) (void *c); /* Type for the md_extract function. */ typedef void (*gcry_md_extract_t) (void *c, void *outbuf, size_t nbytes); +/* Type for the md_hash_buffer function. */ +typedef void (*gcry_md_hash_buffer_t) (void *outbuf, const void *buffer, + size_t length); + +/* Type for the md_hash_buffers function. */ +typedef void (*gcry_md_hash_buffers_t) (void *outbuf, const gcry_buffer_t *iov, + int iovcnt); + typedef struct gcry_md_oid_spec { const char *oidstring; @@ -242,6 +250,8 @@ typedef struct gcry_md_spec gcry_md_final_t final; gcry_md_read_t read; gcry_md_extract_t extract; + gcry_md_hash_buffer_t hash_buffer; + gcry_md_hash_buffers_t hash_buffers; size_t contextsize; /* allocate this amount of context */ selftest_func_t selftest; } gcry_md_spec_t; diff --git a/src/cipher.h b/src/cipher.h index 7c2e5d9e7..6e89be3da 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -115,6 +115,7 @@ gcry_err_code_t _gcry_cipher_cmac_set_subkeys /*-- rmd160.c --*/ void _gcry_rmd160_hash_buffer (void *outbuf, const void *buffer, size_t length); + /*-- sha1.c --*/ void _gcry_sha1_hash_buffer (void *outbuf, const void *buffer, size_t length); -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 18:28:37 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 19:28:37 +0300 Subject: Low level ops? In-Reply-To: References: <534aeb69-8bf6-b8a9-6918-422b39cf956a@iki.fi> Message-ID: <82d38695-ff7e-ef03-9215-cf2dcdbfcae4@iki.fi> Hello, On 19.06.2018 08:27, Stef Bon wrote: > Op wo 13 jun. 2018 om 21:21 schreef Stef Bon : >> >> >> So as I see it it would be worth to try to bring back the overhead for >> AES-CBC//DEC since they vary from 99% to 12,5%, since the size most >> ssh messages is between 128 and 1024 bytes. >> >> You mention parallel mode for AES-CBC/DEC. Is it possible to use this >> from the api? >> And do you know what counts for chacha20-poly1305 at openssh.com? >> > > Hi, > > can you please take a look at my remarks. I think that it's usefull to > reduce the overhead > for the mentioned ciphers. 
I made changes over the weekend to reduce the overhead for cipher operations. When I tried to get those patches to the mailing list they just would not get through. I've spent the past two nights trying to figure out what the ____ is wrong with my mail setup. Anyway, overhead for AESNI/CBC decryption, for example, has been reduced from ~80 cycles per call to ~30 cycles. The remaining 30 cycles seem to be caused mainly by the optimized AESNI/CBC decryption function itself. The AESNI/CBC encryption function is less complex, and its overhead is now 9 cycles per call (was 40 cycles).

> And what about chacha20-poly1305 at openssh.com?

If you check chacha20-poly1305 in OpenSSH, you will see that for each packet you need to perform one extra chacha20 block encryption, which alone is going to cost over 400 cycles. If you want to see how to implement chacha20-poly1305 at openssh.com with libgcrypt, check the following commit, where I've changed OpenSSH to use libgcrypt: https://github.com/jkivilin/openssh-portable/commit/dd4d06bb47cbbbe3607b9be30f17f1495adbeb12

> An about controlling the parallel handling through the api?

Parallel handling is automatic for cipher modes that are parallelizable (it depends on your CPU's feature set and on which implementations are available). These are CTR mode, CBC decryption, CFB decryption, XTS, and OCB. The EAX, GCM and CCM modes use CTR for encryption/decryption and benefit from the CTR-mode optimizations too. The Chacha20 and Salsa20 stream ciphers also have parallel block optimizations. To utilize this, you need to provide input buffers larger than the block size to libgcrypt. For the AESNI implementations, you get the best performance starting with a buffer size of 8 blocks, or 8*16 = 128 bytes. For Chacha20, you need 4 blocks, or 4*64 = 256 bytes.

-Jussi

>
> Thanks,
>
> Stef
>
> _______________________________________________
> Gcrypt-devel mailing list
> Gcrypt-devel at gnupg.org
> http://lists.gnupg.org/mailman/listinfo/gcrypt-devel
>

From jussi.kivilinna at iki.fi Tue Jun 19 17:51:20 2018
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Tue, 19 Jun 2018 18:51:20 +0300
Subject: [PATCH 8/9] AES: setup cipher object bulk routines with optimized versions
In-Reply-To: <20180619155121.7122-1-jussi.kivilinna@iki.fi>
References: <20180619155121.7122-1-jussi.kivilinna@iki.fi>
Message-ID: <20180619155121.7122-7-jussi.kivilinna@iki.fi>

* cipher/rijndael-aesni.c (_gcry_aes_aesni_prepare_decryption): Rename... (do_aesni_prepare_decryption): .. to this. (_gcry_aes_aesni_prepare_decryption): New. (_gcry_aes_aesni_cfb_enc, _gcry_aes_aesni_cbc_enc) (_gcry_aes_aesni_ctr_enc, _gcry_aes_aesni_cfb_dec) (_gcry_aes_aesni_cbc_dec): Reorder parameters to match bulk operations. (_gcry_aes_aesni_cbc_dec, aesni_ocb_dec) (_gcry_aes_aesni_xts_dec): Check and prepare decryption. (_gcry_aes_aesni_ocb_crypt, _gcry_aes_aesni_ocb_auth): Change return type to size_t.
* cipher/rijndael-armv8-ce.c (_gcry_aes_armv8_ce_cfb_enc, _gcry_aes_armv8_ce_cbc_enc) (_gcry_aes_armv8_ce_ctr_enc, _gcry_aes_armv8_ce_cfb_dec) (_gcry_aes_armv8_ce_cbc_dec): Reorder parameters to match bulk operations. (_gcry_aes_armv8_ce_cbc_dec, _gcry_aes_armv8_ce_ocb_crypt) (_gcry_aes_armv8_ce_xts_dec): Check and prepare decryption. (_gcry_aes_armv8_ce_ocb_crypt, _gcry_aes_armv8_ce_ocb_auth): Change return type to size_t.
* cipher/rijndael-ssse3-amd64.c (_gcry_ssse3_prepare_decryption): Rename... (do_ssse3_prepare_decryption): .. to this. (_gcry_ssse3_prepare_decryption): New.
(_gcry_aes_ssse3_cfb_enc, _gcry_aes_ssse3_cbc_enc) (_gcry_aes_ssse3_ctr_enc, _gcry_aes_ssse3_cfb_dec) (_gcry_aes_ssse3_cbc_dec): Reorder parameters to match bulk operations. (_gcry_aes_ssse3_cbc_dec, ssse3_ocb_dec): Check and prepare decryption. (_gcry_aes_ssse3_ocb_crypt, _gcry_aes_ssse3_ocb_auth): Change return type to size_t. * cipher/rijndael.c (_gcry_aes_aesni_cfb_enc, _gcry_aes_aesni_cbc_enc) (_gcry_aes_aesni_ctr_enc, _gcry_aes_aesni_cfb_dec) (_gcry_aes_aesni_cbc_dec, _gcry_aes_aesni_ocb_crypt) (_gcry_aes_aesni_ocb_auth, _gcry_aes_aesni_xts_crypt) (_gcry_aes_ssse3_cfb_enc, _gcry_aes_ssse3_cbc_enc) (_gcry_aes_ssse3_ctr_enc, _gcry_aes_ssse3_cfb_dec) (_gcry_aes_ssse3_cbc_dec, _gcry_aes_ssse3_ocb_crypt) (_gcry_aes_ssse3_ocb_auth, _gcry_aes_ssse3_xts_crypt) (_gcry_aes_armv8_ce_cfb_enc, _gcry_aes_armv8_ce_cbc_enc) (_gcry_aes_armv8_ce_ctr_enc, _gcry_aes_armv8_ce_cfb_dec) (_gcry_aes_armv8_ce_cbc_dec, _gcry_aes_armv8_ce_ocb_crypt) (_gcry_aes_armv8_ce_ocb_auth, _gcry_aes_armv8_ce_xts_crypt): Change prototypes to match bulk operations. (do_setkey): Setup bulk operations with optimized implementations. (_gcry_aes_cfb_enc, _gcry_aes_cbc_enc, _gcry_aes_ctr_enc) (_gcry_aes_cfb_dec, _gcry_aes_cbc_dec, _gcry_aes_ocb_crypt) (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth, _gcry_aes_xts_crypt): Update usage to match new prototypes, avoid prefetch and decryption preparation on optimized code paths. -- Replace bulk operation functions of cipher object with faster version for reduced per call overhead. Signed-off-by: Jussi Kivilinna --- cipher/rijndael-aesni.c | 60 ++++-- cipher/rijndael-armv8-ce.c | 46 +++-- cipher/rijndael-ssse3-amd64.c | 66 ++++-- cipher/rijndael.c | 366 +++++++++++++++++----------------- 4 files changed, 306 insertions(+), 232 deletions(-) diff --git a/cipher/rijndael-aesni.c b/cipher/rijndael-aesni.c index 50a0745b2..e7e61ca8a 100644 --- a/cipher/rijndael-aesni.c +++ b/cipher/rijndael-aesni.c @@ -371,8 +371,8 @@ _gcry_aes_aesni_do_setkey (RIJNDAEL_context *ctx, const byte *key) /* Make a decryption key from an encryption key. 
*/ -void -_gcry_aes_aesni_prepare_decryption (RIJNDAEL_context *ctx) +static inline void +do_aesni_prepare_decryption (RIJNDAEL_context *ctx) { /* The AES-NI decrypt instructions use the Equivalent Inverse Cipher, thus we can't use the the standard decrypt key @@ -382,8 +382,6 @@ _gcry_aes_aesni_prepare_decryption (RIJNDAEL_context *ctx) int rr; int r; - aesni_prepare(); - #define DO_AESNI_AESIMC() \ asm volatile ("movdqa %[ekey], %%xmm1\n\t" \ /*"aesimc %%xmm1, %%xmm1\n\t"*/ \ @@ -419,7 +417,13 @@ _gcry_aes_aesni_prepare_decryption (RIJNDAEL_context *ctx) dkey[r] = ekey[0]; #undef DO_AESNI_AESIMC +} +void +_gcry_aes_aesni_prepare_decryption (RIJNDAEL_context *ctx) +{ + aesni_prepare(); + do_aesni_prepare_decryption (ctx); aesni_cleanup(); } @@ -1696,8 +1700,8 @@ _gcry_aes_aesni_encrypt (const RIJNDAEL_context *ctx, unsigned char *dst, void -_gcry_aes_aesni_cfb_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, +_gcry_aes_aesni_cfb_enc (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, size_t nblocks) { aesni_prepare (); @@ -1732,8 +1736,8 @@ _gcry_aes_aesni_cfb_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, void -_gcry_aes_aesni_cbc_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, +_gcry_aes_aesni_cbc_enc (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, size_t nblocks, int cbc_mac) { aesni_prepare_2_6_variable; @@ -1778,8 +1782,8 @@ _gcry_aes_aesni_cbc_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, void -_gcry_aes_aesni_ctr_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *ctr, +_gcry_aes_aesni_ctr_enc (RIJNDAEL_context *ctx, unsigned char *ctr, + unsigned char *outbuf, const unsigned char *inbuf, size_t nblocks) { static const unsigned char be_mask[16] __attribute__ ((aligned (16))) = @@ -1851,8 +1855,8 @@ _gcry_aes_aesni_decrypt (const RIJNDAEL_context *ctx, unsigned char *dst, void -_gcry_aes_aesni_cfb_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, +_gcry_aes_aesni_cfb_dec (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, size_t nblocks) { aesni_prepare_2_6_variable; @@ -2006,15 +2010,21 @@ _gcry_aes_aesni_cfb_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, void -_gcry_aes_aesni_cbc_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, - size_t nblocks) +_gcry_aes_aesni_cbc_dec (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, + size_t nblocks) { aesni_prepare_2_6_variable; aesni_prepare (); aesni_prepare_2_6(); + if ( !ctx->decryption_prepared ) + { + do_aesni_prepare_decryption ( ctx ); + ctx->decryption_prepared = 1; + } + asm volatile ("movdqu %[iv], %%xmm5\n\t" /* use xmm5 as fast IV storage */ : /* No output */ @@ -2477,6 +2487,12 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, aesni_prepare (); aesni_prepare_2_6 (); + if ( !ctx->decryption_prepared ) + { + do_aesni_prepare_decryption ( ctx ); + ctx->decryption_prepared = 1; + } + /* Preload Offset and Checksum */ asm volatile ("movdqu %[iv], %%xmm5\n\t" "movdqu %[ctr], %%xmm6\n\t" @@ -2761,7 +2777,7 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, } -void +size_t _gcry_aes_aesni_ocb_crypt(gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks, int encrypt) { @@ 
-2769,10 +2785,12 @@ _gcry_aes_aesni_ocb_crypt(gcry_cipher_hd_t c, void *outbuf_arg, aesni_ocb_enc(c, outbuf_arg, inbuf_arg, nblocks); else aesni_ocb_dec(c, outbuf_arg, inbuf_arg, nblocks); + + return 0; } -void +size_t _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) { @@ -3004,6 +3022,8 @@ _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, aesni_cleanup (); aesni_cleanup_2_6 (); + + return 0; } @@ -3159,6 +3179,12 @@ _gcry_aes_aesni_xts_dec (RIJNDAEL_context *ctx, unsigned char *tweak, aesni_prepare (); aesni_prepare_2_6 (); + if ( !ctx->decryption_prepared ) + { + do_aesni_prepare_decryption ( ctx ); + ctx->decryption_prepared = 1; + } + /* Preload Tweak */ asm volatile ("movdqu %[tweak], %%xmm5\n\t" "movdqa %[gfmul], %%xmm6\n\t" diff --git a/cipher/rijndael-armv8-ce.c b/cipher/rijndael-armv8-ce.c index 6af7108f8..6e46830ee 100644 --- a/cipher/rijndael-armv8-ce.c +++ b/cipher/rijndael-armv8-ce.c @@ -284,8 +284,8 @@ _gcry_aes_armv8_ce_decrypt (const RIJNDAEL_context *ctx, unsigned char *dst, } void -_gcry_aes_armv8_ce_cbc_enc (const RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, +_gcry_aes_armv8_ce_cbc_enc (const RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, size_t nblocks, int cbc_mac) { const void *keysched = ctx->keyschenc32; @@ -296,19 +296,25 @@ _gcry_aes_armv8_ce_cbc_enc (const RIJNDAEL_context *ctx, unsigned char *outbuf, } void -_gcry_aes_armv8_ce_cbc_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, +_gcry_aes_armv8_ce_cbc_dec (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, size_t nblocks) { const void *keysched = ctx->keyschdec32; unsigned int nrounds = ctx->rounds; + if ( !ctx->decryption_prepared ) + { + _gcry_aes_armv8_ce_prepare_decryption ( ctx ); + ctx->decryption_prepared = 1; + } + _gcry_aes_cbc_dec_armv8_ce(keysched, outbuf, inbuf, iv, nblocks, nrounds); } void -_gcry_aes_armv8_ce_cfb_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, +_gcry_aes_armv8_ce_cfb_enc (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, size_t nblocks) { const void *keysched = ctx->keyschenc32; @@ -318,8 +324,8 @@ _gcry_aes_armv8_ce_cfb_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, } void -_gcry_aes_armv8_ce_cfb_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, +_gcry_aes_armv8_ce_cfb_dec (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, size_t nblocks) { const void *keysched = ctx->keyschenc32; @@ -329,8 +335,8 @@ _gcry_aes_armv8_ce_cfb_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, } void -_gcry_aes_armv8_ce_ctr_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, +_gcry_aes_armv8_ce_ctr_enc (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, size_t nblocks) { const void *keysched = ctx->keyschenc32; @@ -339,7 +345,7 @@ _gcry_aes_armv8_ce_ctr_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, _gcry_aes_ctr_enc_armv8_ce(keysched, outbuf, inbuf, iv, nblocks, nrounds); } -void +size_t _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks, int encrypt) @@ -353,13 +359,21 @@ _gcry_aes_armv8_ce_ocb_crypt 
(gcry_cipher_hd_t c, void *outbuf_arg, unsigned int nrounds = ctx->rounds; u64 blkn = c->u_mode.ocb.data_nblocks; + if ( !encrypt && !ctx->decryption_prepared ) + { + _gcry_aes_armv8_ce_prepare_decryption ( ctx ); + ctx->decryption_prepared = 1; + } + c->u_mode.ocb.data_nblocks = blkn + nblocks; crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, c->u_mode.ocb.L[0], nblocks, nrounds, (unsigned int)blkn); + + return 0; } -void +size_t _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, size_t nblocks) { @@ -374,6 +388,8 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, c->u_mode.ocb.L[0], nblocks, nrounds, (unsigned int)blkn); + + return 0; } void @@ -386,6 +402,12 @@ _gcry_aes_armv8_ce_xts_crypt (RIJNDAEL_context *ctx, unsigned char *tweak, : _gcry_aes_xts_dec_armv8_ce; unsigned int nrounds = ctx->rounds; + if ( !encrypt && !ctx->decryption_prepared ) + { + _gcry_aes_armv8_ce_prepare_decryption ( ctx ); + ctx->decryption_prepared = 1; + } + crypt_fn(keysched, outbuf, inbuf, tweak, nblocks, nrounds); } diff --git a/cipher/rijndael-ssse3-amd64.c b/cipher/rijndael-ssse3-amd64.c index 98660ecc8..07a64a4c1 100644 --- a/cipher/rijndael-ssse3-amd64.c +++ b/cipher/rijndael-ssse3-amd64.c @@ -175,11 +175,11 @@ _gcry_aes_ssse3_do_setkey (RIJNDAEL_context *ctx, const byte *key) /* Make a decryption key from an encryption key. */ -void -_gcry_aes_ssse3_prepare_decryption (RIJNDAEL_context *ctx) +static inline void +do_ssse3_prepare_decryption (RIJNDAEL_context *ctx, + byte ssse3_state[SSSE3_STATE_SIZE]) { unsigned int keybits = (ctx->rounds - 10) * 32 + 128; - byte ssse3_state[SSSE3_STATE_SIZE]; vpaes_ssse3_prepare(); @@ -190,6 +190,14 @@ _gcry_aes_ssse3_prepare_decryption (RIJNDAEL_context *ctx) vpaes_ssse3_cleanup(); } +void +_gcry_aes_ssse3_prepare_decryption (RIJNDAEL_context *ctx) +{ + byte ssse3_state[SSSE3_STATE_SIZE]; + + do_ssse3_prepare_decryption(ctx, ssse3_state); +} + /* Encrypt one block using the Intel SSSE3 instructions. Block is input * and output through SSE register xmm0. 
*/ @@ -232,9 +240,9 @@ _gcry_aes_ssse3_encrypt (const RIJNDAEL_context *ctx, unsigned char *dst, void -_gcry_aes_ssse3_cfb_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, - size_t nblocks) +_gcry_aes_ssse3_cfb_enc (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, + size_t nblocks) { unsigned int nrounds = ctx->rounds; byte ssse3_state[SSSE3_STATE_SIZE]; @@ -271,9 +279,9 @@ _gcry_aes_ssse3_cfb_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, void -_gcry_aes_ssse3_cbc_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, - size_t nblocks, int cbc_mac) +_gcry_aes_ssse3_cbc_enc (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, + size_t nblocks, int cbc_mac) { unsigned int nrounds = ctx->rounds; byte ssse3_state[SSSE3_STATE_SIZE]; @@ -316,9 +324,9 @@ _gcry_aes_ssse3_cbc_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, void -_gcry_aes_ssse3_ctr_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *ctr, - size_t nblocks) +_gcry_aes_ssse3_ctr_enc (RIJNDAEL_context *ctx, unsigned char *ctr, + unsigned char *outbuf, const unsigned char *inbuf, + size_t nblocks) { static const unsigned char be_mask[16] __attribute__ ((aligned (16))) = { 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 }; @@ -384,7 +392,7 @@ _gcry_aes_ssse3_ctr_enc (RIJNDAEL_context *ctx, unsigned char *outbuf, unsigned int _gcry_aes_ssse3_decrypt (const RIJNDAEL_context *ctx, unsigned char *dst, - const unsigned char *src) + const unsigned char *src) { unsigned int nrounds = ctx->rounds; byte ssse3_state[SSSE3_STATE_SIZE]; @@ -405,9 +413,9 @@ _gcry_aes_ssse3_decrypt (const RIJNDAEL_context *ctx, unsigned char *dst, void -_gcry_aes_ssse3_cfb_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, - size_t nblocks) +_gcry_aes_ssse3_cfb_dec (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, + size_t nblocks) { unsigned int nrounds = ctx->rounds; byte ssse3_state[SSSE3_STATE_SIZE]; @@ -445,13 +453,19 @@ _gcry_aes_ssse3_cfb_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, void -_gcry_aes_ssse3_cbc_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, - const unsigned char *inbuf, unsigned char *iv, - size_t nblocks) +_gcry_aes_ssse3_cbc_dec (RIJNDAEL_context *ctx, unsigned char *iv, + unsigned char *outbuf, const unsigned char *inbuf, + size_t nblocks) { unsigned int nrounds = ctx->rounds; byte ssse3_state[SSSE3_STATE_SIZE]; + if ( !ctx->decryption_prepared ) + { + do_ssse3_prepare_decryption ( ctx, ssse3_state ); + ctx->decryption_prepared = 1; + } + vpaes_ssse3_prepare_dec (); asm volatile ("movdqu %[iv], %%xmm7\n\t" /* use xmm7 as fast IV storage */ @@ -563,6 +577,12 @@ ssse3_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, unsigned int nrounds = ctx->rounds; byte ssse3_state[SSSE3_STATE_SIZE]; + if ( !ctx->decryption_prepared ) + { + do_ssse3_prepare_decryption ( ctx, ssse3_state ); + ctx->decryption_prepared = 1; + } + vpaes_ssse3_prepare_dec (); /* Preload Offset and Checksum */ @@ -616,7 +636,7 @@ ssse3_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, } -void +size_t _gcry_aes_ssse3_ocb_crypt(gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks, int encrypt) { @@ -624,10 +644,12 @@ _gcry_aes_ssse3_ocb_crypt(gcry_cipher_hd_t c, void *outbuf_arg, ssse3_ocb_enc(c, outbuf_arg, inbuf_arg, 
nblocks); else ssse3_ocb_dec(c, outbuf_arg, inbuf_arg, nblocks); + + return 0; } -void +size_t _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) { @@ -683,6 +705,8 @@ _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, : "memory" ); vpaes_ssse3_cleanup (); + + return 0; } #endif /* USE_SSSE3 */ diff --git a/cipher/rijndael.c b/cipher/rijndael.c index f9666d0cf..d3fcb76f3 100644 --- a/cipher/rijndael.c +++ b/cipher/rijndael.c @@ -77,37 +77,29 @@ extern unsigned int _gcry_aes_aesni_encrypt (const RIJNDAEL_context *ctx, extern unsigned int _gcry_aes_aesni_decrypt (const RIJNDAEL_context *ctx, unsigned char *dst, const unsigned char *src); -extern void _gcry_aes_aesni_cfb_enc (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks); -extern void _gcry_aes_aesni_cbc_enc (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks, - int cbc_mac); -extern void _gcry_aes_aesni_ctr_enc (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *ctr, size_t nblocks); -extern void _gcry_aes_aesni_cfb_dec (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks); -extern void _gcry_aes_aesni_cbc_dec (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks); -extern void _gcry_aes_aesni_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, - const void *inbuf_arg, size_t nblocks, - int encrypt); -extern void _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, - size_t nblocks); -extern void _gcry_aes_aesni_xts_crypt (RIJNDAEL_context *ctx, - unsigned char *tweak, - unsigned char *outbuf, - const unsigned char *inbuf, - size_t nblocks, int encrypt); +extern void _gcry_aes_aesni_cfb_enc (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern void _gcry_aes_aesni_cbc_enc (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks, int cbc_mac); +extern void _gcry_aes_aesni_ctr_enc (void *context, unsigned char *ctr, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern void _gcry_aes_aesni_cfb_dec (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern void _gcry_aes_aesni_cbc_dec (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern size_t _gcry_aes_aesni_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, + const void *inbuf_arg, size_t nblocks, + int encrypt); +extern size_t _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, + size_t nblocks); +extern void _gcry_aes_aesni_xts_crypt (void *context, unsigned char *tweak, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks, int encrypt); #endif #ifdef USE_SSSE3 @@ -121,32 +113,27 @@ extern unsigned int _gcry_aes_ssse3_encrypt (const RIJNDAEL_context *ctx, extern unsigned int _gcry_aes_ssse3_decrypt (const RIJNDAEL_context *ctx, unsigned char *dst, const unsigned char *src); -extern void _gcry_aes_ssse3_cfb_enc (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks); -extern void _gcry_aes_ssse3_cbc_enc (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks, +extern void _gcry_aes_ssse3_cfb_enc (void 
*context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern void _gcry_aes_ssse3_cbc_enc (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks, int cbc_mac); -extern void _gcry_aes_ssse3_ctr_enc (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *ctr, size_t nblocks); -extern void _gcry_aes_ssse3_cfb_dec (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks); -extern void _gcry_aes_ssse3_cbc_dec (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks); -extern void _gcry_aes_ssse3_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, - const void *inbuf_arg, size_t nblocks, - int encrypt); -extern void _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, - size_t nblocks); +extern void _gcry_aes_ssse3_ctr_enc (void *context, unsigned char *ctr, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern void _gcry_aes_ssse3_cfb_dec (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern void _gcry_aes_ssse3_cbc_dec (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern size_t _gcry_aes_ssse3_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, + const void *inbuf_arg, size_t nblocks, + int encrypt); +extern size_t _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, + size_t nblocks); #endif #ifdef USE_PADLOCK @@ -185,36 +172,30 @@ extern unsigned int _gcry_aes_armv8_ce_decrypt(const RIJNDAEL_context *ctx, unsigned char *dst, const unsigned char *src); -extern void _gcry_aes_armv8_ce_cfb_enc (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks); -extern void _gcry_aes_armv8_ce_cbc_enc (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks, +extern void _gcry_aes_armv8_ce_cfb_enc (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern void _gcry_aes_armv8_ce_cbc_enc (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks, int cbc_mac); -extern void _gcry_aes_armv8_ce_ctr_enc (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *ctr, size_t nblocks); -extern void _gcry_aes_armv8_ce_cfb_dec (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks); -extern void _gcry_aes_armv8_ce_cbc_dec (RIJNDAEL_context *ctx, - unsigned char *outbuf, - const unsigned char *inbuf, - unsigned char *iv, size_t nblocks); -extern void _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, - const void *inbuf_arg, size_t nblocks, - int encrypt); -extern void _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, - const void *abuf_arg, size_t nblocks); -extern void _gcry_aes_armv8_ce_xts_crypt (RIJNDAEL_context *ctx, - unsigned char *tweak, - unsigned char *outbuf, - const unsigned char *inbuf, +extern void _gcry_aes_armv8_ce_ctr_enc (void *context, unsigned char *ctr, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern void _gcry_aes_armv8_ce_cfb_dec (void *context, unsigned char *iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern void _gcry_aes_armv8_ce_cbc_dec (void *context, unsigned char 
*iv, + void *outbuf_arg, const void *inbuf_arg, + size_t nblocks); +extern size_t _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, + const void *inbuf_arg, size_t nblocks, + int encrypt); +extern size_t _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, + const void *abuf_arg, size_t nblocks); +extern void _gcry_aes_armv8_ce_xts_crypt (void *context, unsigned char *tweak, + void *outbuf_arg, + const void *inbuf_arg, size_t nblocks, int encrypt); #endif /*USE_ARM_ASM*/ @@ -270,7 +251,8 @@ static void prefetch_dec(void) /* Perform the key setup. */ static gcry_err_code_t -do_setkey (RIJNDAEL_context *ctx, const byte *key, const unsigned keylen) +do_setkey (RIJNDAEL_context *ctx, const byte *key, const unsigned keylen, + gcry_cipher_hd_t hd) { static int initialized = 0; static const char *selftest_failed = 0; @@ -350,6 +332,17 @@ do_setkey (RIJNDAEL_context *ctx, const byte *key, const unsigned keylen) ctx->prefetch_enc_fn = NULL; ctx->prefetch_dec_fn = NULL; ctx->use_aesni = 1; + if (hd) + { + hd->bulk.cfb_enc = _gcry_aes_aesni_cfb_enc; + hd->bulk.cfb_dec = _gcry_aes_aesni_cfb_dec; + hd->bulk.cbc_enc = _gcry_aes_aesni_cbc_enc; + hd->bulk.cbc_dec = _gcry_aes_aesni_cbc_dec; + hd->bulk.ctr_enc = _gcry_aes_aesni_ctr_enc; + hd->bulk.ocb_crypt = _gcry_aes_aesni_ocb_crypt; + hd->bulk.ocb_auth = _gcry_aes_aesni_ocb_auth; + hd->bulk.xts_crypt = _gcry_aes_aesni_xts_crypt; + } } #endif #ifdef USE_PADLOCK @@ -371,6 +364,16 @@ do_setkey (RIJNDAEL_context *ctx, const byte *key, const unsigned keylen) ctx->prefetch_enc_fn = NULL; ctx->prefetch_dec_fn = NULL; ctx->use_ssse3 = 1; + if (hd) + { + hd->bulk.cfb_enc = _gcry_aes_ssse3_cfb_enc; + hd->bulk.cfb_dec = _gcry_aes_ssse3_cfb_dec; + hd->bulk.cbc_enc = _gcry_aes_ssse3_cbc_enc; + hd->bulk.cbc_dec = _gcry_aes_ssse3_cbc_dec; + hd->bulk.ctr_enc = _gcry_aes_ssse3_ctr_enc; + hd->bulk.ocb_crypt = _gcry_aes_ssse3_ocb_crypt; + hd->bulk.ocb_auth = _gcry_aes_ssse3_ocb_auth; + } } #endif #ifdef USE_ARM_CE @@ -381,6 +384,17 @@ do_setkey (RIJNDAEL_context *ctx, const byte *key, const unsigned keylen) ctx->prefetch_enc_fn = NULL; ctx->prefetch_dec_fn = NULL; ctx->use_arm_ce = 1; + if (hd) + { + hd->bulk.cfb_enc = _gcry_aes_armv8_ce_cfb_enc; + hd->bulk.cfb_dec = _gcry_aes_armv8_ce_cfb_dec; + hd->bulk.cbc_enc = _gcry_aes_armv8_ce_cbc_enc; + hd->bulk.cbc_dec = _gcry_aes_armv8_ce_cbc_dec; + hd->bulk.ctr_enc = _gcry_aes_armv8_ce_ctr_enc; + hd->bulk.ocb_crypt = _gcry_aes_armv8_ce_ocb_crypt; + hd->bulk.ocb_auth = _gcry_aes_armv8_ce_ocb_auth; + hd->bulk.xts_crypt = _gcry_aes_armv8_ce_xts_crypt; + } } #endif else @@ -517,8 +531,7 @@ rijndael_setkey (void *context, const byte *key, const unsigned keylen, gcry_cipher_hd_t hd) { RIJNDAEL_context *ctx = context; - (void)hd; - return do_setkey (ctx, key, keylen); + return do_setkey (ctx, key, keylen, hd); } @@ -783,36 +796,36 @@ _gcry_aes_cfb_enc (void *context, unsigned char *iv, const unsigned char *inbuf = inbuf_arg; unsigned int burn_depth = 0; - if (ctx->prefetch_enc_fn) - ctx->prefetch_enc_fn(); - if (0) ; #ifdef USE_AESNI else if (ctx->use_aesni) { - _gcry_aes_aesni_cfb_enc (ctx, outbuf, inbuf, iv, nblocks); - burn_depth = 0; + _gcry_aes_aesni_cfb_enc (ctx, iv, outbuf, inbuf, nblocks); + return; } #endif /*USE_AESNI*/ #ifdef USE_SSSE3 else if (ctx->use_ssse3) { - _gcry_aes_ssse3_cfb_enc (ctx, outbuf, inbuf, iv, nblocks); - burn_depth = 0; + _gcry_aes_ssse3_cfb_enc (ctx, iv, outbuf, inbuf, nblocks); + return; } #endif /*USE_SSSE3*/ #ifdef USE_ARM_CE else if (ctx->use_arm_ce) { - _gcry_aes_armv8_ce_cfb_enc (ctx, 
outbuf, inbuf, iv, nblocks); - burn_depth = 0; + _gcry_aes_armv8_ce_cfb_enc (ctx, iv, outbuf, inbuf, nblocks); + return; } #endif /*USE_ARM_CE*/ else { rijndael_cryptfn_t encrypt_fn = ctx->encrypt_fn; + if (ctx->prefetch_enc_fn) + ctx->prefetch_enc_fn(); + for ( ;nblocks; nblocks-- ) { /* Encrypt the IV. */ @@ -844,36 +857,36 @@ _gcry_aes_cbc_enc (void *context, unsigned char *iv, unsigned char *last_iv; unsigned int burn_depth = 0; - if (ctx->prefetch_enc_fn) - ctx->prefetch_enc_fn(); - if (0) ; #ifdef USE_AESNI else if (ctx->use_aesni) { - _gcry_aes_aesni_cbc_enc (ctx, outbuf, inbuf, iv, nblocks, cbc_mac); - burn_depth = 0; + _gcry_aes_aesni_cbc_enc (ctx, iv, outbuf, inbuf, nblocks, cbc_mac); + return; } #endif /*USE_AESNI*/ #ifdef USE_SSSE3 else if (ctx->use_ssse3) { - _gcry_aes_ssse3_cbc_enc (ctx, outbuf, inbuf, iv, nblocks, cbc_mac); - burn_depth = 0; + _gcry_aes_ssse3_cbc_enc (ctx, iv, outbuf, inbuf, nblocks, cbc_mac); + return; } #endif /*USE_SSSE3*/ #ifdef USE_ARM_CE else if (ctx->use_arm_ce) { - _gcry_aes_armv8_ce_cbc_enc (ctx, outbuf, inbuf, iv, nblocks, cbc_mac); - burn_depth = 0; + _gcry_aes_armv8_ce_cbc_enc (ctx, iv, outbuf, inbuf, nblocks, cbc_mac); + return; } #endif /*USE_ARM_CE*/ else { rijndael_cryptfn_t encrypt_fn = ctx->encrypt_fn; + if (ctx->prefetch_enc_fn) + ctx->prefetch_enc_fn(); + last_iv = iv; for ( ;nblocks; nblocks-- ) @@ -913,30 +926,27 @@ _gcry_aes_ctr_enc (void *context, unsigned char *ctr, unsigned int burn_depth = 0; int i; - if (ctx->prefetch_enc_fn) - ctx->prefetch_enc_fn(); - if (0) ; #ifdef USE_AESNI else if (ctx->use_aesni) { - _gcry_aes_aesni_ctr_enc (ctx, outbuf, inbuf, ctr, nblocks); - burn_depth = 0; + _gcry_aes_aesni_ctr_enc (ctx, ctr, outbuf, inbuf, nblocks); + return; } #endif /*USE_AESNI*/ #ifdef USE_SSSE3 else if (ctx->use_ssse3) { - _gcry_aes_ssse3_ctr_enc (ctx, outbuf, inbuf, ctr, nblocks); - burn_depth = 0; + _gcry_aes_ssse3_ctr_enc (ctx, ctr, outbuf, inbuf, nblocks); + return; } #endif /*USE_SSSE3*/ #ifdef USE_ARM_CE else if (ctx->use_arm_ce) { - _gcry_aes_armv8_ce_ctr_enc (ctx, outbuf, inbuf, ctr, nblocks); - burn_depth = 0; + _gcry_aes_armv8_ce_ctr_enc (ctx, ctr, outbuf, inbuf, nblocks); + return; } #endif /*USE_ARM_CE*/ else @@ -944,6 +954,9 @@ _gcry_aes_ctr_enc (void *context, unsigned char *ctr, union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } tmp; rijndael_cryptfn_t encrypt_fn = ctx->encrypt_fn; + if (ctx->prefetch_enc_fn) + ctx->prefetch_enc_fn(); + for ( ;nblocks; nblocks-- ) { /* Encrypt the counter. 
*/ @@ -1161,36 +1174,36 @@ _gcry_aes_cfb_dec (void *context, unsigned char *iv, const unsigned char *inbuf = inbuf_arg; unsigned int burn_depth = 0; - if (ctx->prefetch_enc_fn) - ctx->prefetch_enc_fn(); - if (0) ; #ifdef USE_AESNI else if (ctx->use_aesni) { - _gcry_aes_aesni_cfb_dec (ctx, outbuf, inbuf, iv, nblocks); - burn_depth = 0; + _gcry_aes_aesni_cfb_dec (ctx, iv, outbuf, inbuf, nblocks); + return; } #endif /*USE_AESNI*/ #ifdef USE_SSSE3 else if (ctx->use_ssse3) { - _gcry_aes_ssse3_cfb_dec (ctx, outbuf, inbuf, iv, nblocks); - burn_depth = 0; + _gcry_aes_ssse3_cfb_dec (ctx, iv, outbuf, inbuf, nblocks); + return; } #endif /*USE_SSSE3*/ #ifdef USE_ARM_CE else if (ctx->use_arm_ce) { - _gcry_aes_armv8_ce_cfb_dec (ctx, outbuf, inbuf, iv, nblocks); - burn_depth = 0; + _gcry_aes_armv8_ce_cfb_dec (ctx, iv, outbuf, inbuf, nblocks); + return; } #endif /*USE_ARM_CE*/ else { rijndael_cryptfn_t encrypt_fn = ctx->encrypt_fn; + if (ctx->prefetch_enc_fn) + ctx->prefetch_enc_fn(); + for ( ;nblocks; nblocks-- ) { burn_depth = encrypt_fn (ctx, iv, iv); @@ -1219,32 +1232,27 @@ _gcry_aes_cbc_dec (void *context, unsigned char *iv, const unsigned char *inbuf = inbuf_arg; unsigned int burn_depth = 0; - check_decryption_preparation (ctx); - - if (ctx->prefetch_dec_fn) - ctx->prefetch_dec_fn(); - if (0) ; #ifdef USE_AESNI else if (ctx->use_aesni) { - _gcry_aes_aesni_cbc_dec (ctx, outbuf, inbuf, iv, nblocks); - burn_depth = 0; + _gcry_aes_aesni_cbc_dec (ctx, iv, outbuf, inbuf, nblocks); + return; } #endif /*USE_AESNI*/ #ifdef USE_SSSE3 else if (ctx->use_ssse3) { - _gcry_aes_ssse3_cbc_dec (ctx, outbuf, inbuf, iv, nblocks); - burn_depth = 0; + _gcry_aes_ssse3_cbc_dec (ctx, iv, outbuf, inbuf, nblocks); + return; } #endif /*USE_SSSE3*/ #ifdef USE_ARM_CE else if (ctx->use_arm_ce) { - _gcry_aes_armv8_ce_cbc_dec (ctx, outbuf, inbuf, iv, nblocks); - burn_depth = 0; + _gcry_aes_armv8_ce_cbc_dec (ctx, iv, outbuf, inbuf, nblocks); + return; } #endif /*USE_ARM_CE*/ else @@ -1252,6 +1260,11 @@ _gcry_aes_cbc_dec (void *context, unsigned char *iv, unsigned char savebuf[BLOCKSIZE] ATTR_ALIGNED_16; rijndael_cryptfn_t decrypt_fn = ctx->decrypt_fn; + check_decryption_preparation (ctx); + + if (ctx->prefetch_dec_fn) + ctx->prefetch_dec_fn(); + for ( ;nblocks; nblocks-- ) { /* INBUF is needed later and it may be identical to OUTBUF, so store @@ -1283,40 +1296,24 @@ _gcry_aes_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, const unsigned char *inbuf = inbuf_arg; unsigned int burn_depth = 0; - if (encrypt) - { - if (ctx->prefetch_enc_fn) - ctx->prefetch_enc_fn(); - } - else - { - check_decryption_preparation (ctx); - - if (ctx->prefetch_dec_fn) - ctx->prefetch_dec_fn(); - } - if (0) ; #ifdef USE_AESNI else if (ctx->use_aesni) { - _gcry_aes_aesni_ocb_crypt (c, outbuf, inbuf, nblocks, encrypt); - burn_depth = 0; + return _gcry_aes_aesni_ocb_crypt (c, outbuf, inbuf, nblocks, encrypt); } #endif /*USE_AESNI*/ #ifdef USE_SSSE3 else if (ctx->use_ssse3) { - _gcry_aes_ssse3_ocb_crypt (c, outbuf, inbuf, nblocks, encrypt); - burn_depth = 0; + return _gcry_aes_ssse3_ocb_crypt (c, outbuf, inbuf, nblocks, encrypt); } #endif /*USE_SSSE3*/ #ifdef USE_ARM_CE else if (ctx->use_arm_ce) { - _gcry_aes_armv8_ce_ocb_crypt (c, outbuf, inbuf, nblocks, encrypt); - burn_depth = 0; + return _gcry_aes_armv8_ce_ocb_crypt (c, outbuf, inbuf, nblocks, encrypt); } #endif /*USE_ARM_CE*/ else if (encrypt) @@ -1324,6 +1321,9 @@ _gcry_aes_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; 
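/* Illustrative outline, not part of the patch: after this change every bulk
 * mode function in rijndael.c follows the same shape -- the accelerated
 * branches call their bulk helper and return immediately, while only the
 * generic C fallback prepares the decryption key schedule on first use,
 * prefetches its tables and burns the stack afterwards.  Identifier names
 * are the ones used in rijndael.c; the per-block work is elided.  */
static void
example_bulk_dec_shape (RIJNDAEL_context *ctx, unsigned char *iv,
                        unsigned char *outbuf, const unsigned char *inbuf,
                        size_t nblocks)
{
  unsigned int burn_depth = 0;

#ifdef USE_AESNI
  if (ctx->use_aesni)
    {
      /* The helper now does the lazy key-schedule setup itself and clears
       * its own registers, so nothing is left to burn here.  */
      _gcry_aes_aesni_cbc_dec (ctx, iv, outbuf, inbuf, nblocks);
      return;
    }
#endif

  check_decryption_preparation (ctx);   /* lazy key-schedule setup */
  if (ctx->prefetch_dec_fn)
    ctx->prefetch_dec_fn ();

  for ( ; nblocks; nblocks--)
    {
      burn_depth = ctx->decrypt_fn (ctx, outbuf, inbuf);
      /* ... IV chaining as in _gcry_aes_cbc_dec ... */
      outbuf += BLOCKSIZE;
      inbuf  += BLOCKSIZE;
    }

  if (burn_depth)
    _gcry_burn_stack (burn_depth + 4 * sizeof(void *));
}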
rijndael_cryptfn_t encrypt_fn = ctx->encrypt_fn; + if (ctx->prefetch_enc_fn) + ctx->prefetch_enc_fn(); + for ( ;nblocks; nblocks-- ) { u64 i = ++c->u_mode.ocb.data_nblocks; @@ -1349,6 +1349,11 @@ _gcry_aes_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; rijndael_cryptfn_t decrypt_fn = ctx->decrypt_fn; + check_decryption_preparation (ctx); + + if (ctx->prefetch_dec_fn) + ctx->prefetch_dec_fn(); + for ( ;nblocks; nblocks-- ) { u64 i = ++c->u_mode.ocb.data_nblocks; @@ -1385,30 +1390,24 @@ _gcry_aes_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) const unsigned char *abuf = abuf_arg; unsigned int burn_depth = 0; - if (ctx->prefetch_enc_fn) - ctx->prefetch_enc_fn(); - if (0) ; #ifdef USE_AESNI else if (ctx->use_aesni) { - _gcry_aes_aesni_ocb_auth (c, abuf, nblocks); - burn_depth = 0; + return _gcry_aes_aesni_ocb_auth (c, abuf, nblocks); } #endif /*USE_AESNI*/ #ifdef USE_SSSE3 else if (ctx->use_ssse3) { - _gcry_aes_ssse3_ocb_auth (c, abuf, nblocks); - burn_depth = 0; + return _gcry_aes_ssse3_ocb_auth (c, abuf, nblocks); } #endif /*USE_SSSE3*/ #ifdef USE_ARM_CE else if (ctx->use_arm_ce) { - _gcry_aes_armv8_ce_ocb_auth (c, abuf, nblocks); - burn_depth = 0; + return _gcry_aes_armv8_ce_ocb_auth (c, abuf, nblocks); } #endif /*USE_ARM_CE*/ else @@ -1416,6 +1415,9 @@ _gcry_aes_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; rijndael_cryptfn_t encrypt_fn = ctx->encrypt_fn; + if (ctx->prefetch_enc_fn) + ctx->prefetch_enc_fn(); + for ( ;nblocks; nblocks-- ) { u64 i = ++c->u_mode.ocb.aad_nblocks; @@ -1454,41 +1456,41 @@ _gcry_aes_xts_crypt (void *context, unsigned char *tweak, rijndael_cryptfn_t crypt_fn; u64 tweak_lo, tweak_hi, tweak_next_lo, tweak_next_hi, tmp_lo, tmp_hi, carry; - if (encrypt) - { - if (ctx->prefetch_enc_fn) - ctx->prefetch_enc_fn(); - - crypt_fn = ctx->encrypt_fn; - } - else - { - check_decryption_preparation (ctx); - - if (ctx->prefetch_dec_fn) - ctx->prefetch_dec_fn(); - - crypt_fn = ctx->decrypt_fn; - } - if (0) ; #ifdef USE_AESNI else if (ctx->use_aesni) { _gcry_aes_aesni_xts_crypt (ctx, tweak, outbuf, inbuf, nblocks, encrypt); - burn_depth = 0; + return; } #endif /*USE_AESNI*/ #ifdef USE_ARM_CE else if (ctx->use_arm_ce) { _gcry_aes_armv8_ce_xts_crypt (ctx, tweak, outbuf, inbuf, nblocks, encrypt); - burn_depth = 0; + return; } #endif /*USE_ARM_CE*/ else { + if (encrypt) + { + if (ctx->prefetch_enc_fn) + ctx->prefetch_enc_fn(); + + crypt_fn = ctx->encrypt_fn; + } + else + { + check_decryption_preparation (ctx); + + if (ctx->prefetch_dec_fn) + ctx->prefetch_dec_fn(); + + crypt_fn = ctx->decrypt_fn; + } + tweak_next_lo = buf_get_le64 (tweak + 0); tweak_next_hi = buf_get_le64 (tweak + 8); -- 2.17.1 From jussi.kivilinna at iki.fi Tue Jun 19 21:31:16 2018 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Tue, 19 Jun 2018 22:31:16 +0300 Subject: [PATCH] Clean-up implementation selection for SHA1 and SHA2 Message-ID: <20180619193116.24394-1-jussi.kivilinna@iki.fi> * cipher/sha1.c (ASM_EXTRA_STACK): Increase by sizeof(void*)*4. (do_sha1_transform_amd64_ssse3, do_sha1_transform_amd64_avx) (do_sha1_transform_amd64_avx_bmi2, do_sha1_transform_intel_shaext) (do_sha1_transform_armv7_neon, do_sha1_transform_armv8_ce): New. (transform_blk, transform): Merge to ... (do_transform_generic): ... this and remove calls to assembly implementations. (sha1_init): Select hd->bctx.bwrite based on HW features. 
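Put differently: the per-context use_ssse3/use_avx/use_shaext/... flags go away
and each init routine selects the block transform once, through the existing
bctx.bwrite hook.  A minimal sketch of that selection, reduced to two back-ends
(sha1_pick_bwrite is a hypothetical helper used only for illustration; the real
sha1_init in the patch below also covers AVX, BMI2, SHA-EXT and NEON):

static void
sha1_pick_bwrite (SHA1_CONTEXT *hd)
{
  unsigned int features = _gcry_get_hw_features ();

  /* Default to the portable C transform, then let faster implementations
   * overwrite the pointer; the last matching feature check wins.  */
  hd->bctx.bwrite = do_transform_generic;
#ifdef USE_SSSE3
  if ((features & HWF_INTEL_SSSE3) != 0)
    hd->bctx.bwrite = do_sha1_transform_amd64_ssse3;
#endif
#ifdef USE_ARM_CE
  if ((features & HWF_ARM_SHA1) != 0)
    hd->bctx.bwrite = do_sha1_transform_armv8_ce;
#endif
  (void)features;
}

sha1_final() and _gcry_sha1_mixblock() then simply invoke the selected hook,
e.g. burn = (*hd->bctx.bwrite) (hd, hd->bctx.buf, 1); _gcry_burn_stack (burn);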
(_gcry_sha1_mixblock, sha1_final): Call hd->bctx.bwrite instead of transform. * cipher/sha1.h (SHA1_CONTEXT): Remove implementation selection bits. * cipher/sha256.h (SHA256_CONTEXT): Remove implementation selection bits. (ASM_EXTRA_STACK): Increase by sizeof(void*)*4. (do_sha256_transform_amd64_ssse3, do_sha256_transform_amd64_avx) (do_sha256_transform_amd64_avx2, do_sha256_transform_intel_shaext) (do_sha256_transform_armv8_ce): New. (transform_blk, transform): Merge to ... (do_transform_generic): ... this and remove calls to assembly implementations. (sha256_init, sha224_init): Select hd->bctx.bwrite based on HW features. (sha256_final): Call hd->bctx.bwrite instead of transform. * cipher/sha512-armv7-neon.S (_gcry_sha512_transform_armv7_neon): Return zero. * cipher/sha512.h (SHA512_CONTEXT): Remove implementation selection bits. (ASM_EXTRA_STACK): Increase by sizeof(void*)*4. (do_sha512_transform_armv7_neon, do_sha512_transform_amd64_ssse3) (do_sha512_transform_amd64_avx, do_sha512_transform_amd64_avx2): New. [USE_ARM_ASM] (do_transform_generic): New. (transform_blk, transform): Merge to ... [!USE_ARM_ASM] (do_transform_generic): ... this and remove calls to assembly implementations. (sha512_init, sha384_init): Select hd->bctx.bwrite based on HW features. (sha512_final): Call hd->bctx.bwrite instead of transform. -- Signed-off-by: Jussi Kivilinna --- cipher/sha1.c | 277 +++++++------ cipher/sha1.h | 6 - cipher/sha256.c | 435 ++++++++++---------- cipher/sha512-armv7-neon.S | 1 + cipher/sha512.c | 796 ++++++++++++++++++------------------- 5 files changed, 722 insertions(+), 793 deletions(-) diff --git a/cipher/sha1.c b/cipher/sha1.c index e50262ff4..76c486c7e 100644 --- a/cipher/sha1.c +++ b/cipher/sha1.c @@ -111,8 +111,114 @@ /* #endif */ + +/* Assembly implementations use SystemV ABI, ABI conversion and additional + * stack to store XMM6-XMM15 needed on Win64. 
*/ +#undef ASM_FUNC_ABI +#undef ASM_EXTRA_STACK +#if defined(USE_SSSE3) || defined(USE_AVX) || defined(USE_BMI2) || \ + defined(USE_SHAEXT) +# ifdef HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS +# define ASM_FUNC_ABI __attribute__((sysv_abi)) +# define ASM_EXTRA_STACK (10 * 16 + sizeof(void *) * 4) +# else +# define ASM_FUNC_ABI +# define ASM_EXTRA_STACK 0 +# endif +#endif + + +#ifdef USE_SSSE3 +unsigned int +_gcry_sha1_transform_amd64_ssse3 (void *state, const unsigned char *data, + size_t nblks) ASM_FUNC_ABI; + +static unsigned int +do_sha1_transform_amd64_ssse3 (void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA1_CONTEXT *hd = ctx; + return _gcry_sha1_transform_amd64_ssse3 (&hd->h0, data, nblks) + + ASM_EXTRA_STACK; +} +#endif + +#ifdef USE_AVX +unsigned int +_gcry_sha1_transform_amd64_avx (void *state, const unsigned char *data, + size_t nblks) ASM_FUNC_ABI; + +static unsigned int +do_sha1_transform_amd64_avx (void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA1_CONTEXT *hd = ctx; + return _gcry_sha1_transform_amd64_avx (&hd->h0, data, nblks) + + ASM_EXTRA_STACK; +} +#endif + +#ifdef USE_BMI2 +unsigned int +_gcry_sha1_transform_amd64_avx_bmi2 (void *state, const unsigned char *data, + size_t nblks) ASM_FUNC_ABI; + +static unsigned int +do_sha1_transform_amd64_avx_bmi2 (void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA1_CONTEXT *hd = ctx; + return _gcry_sha1_transform_amd64_avx_bmi2 (&hd->h0, data, nblks) + + ASM_EXTRA_STACK; +} +#endif + +#ifdef USE_SHAEXT +/* Does not need ASM_FUNC_ABI */ +unsigned int +_gcry_sha1_transform_intel_shaext (void *state, const unsigned char *data, + size_t nblks); + static unsigned int -transform (void *c, const unsigned char *data, size_t nblks); +do_sha1_transform_intel_shaext (void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA1_CONTEXT *hd = ctx; + return _gcry_sha1_transform_intel_shaext (&hd->h0, data, nblks); +} +#endif + +#ifdef USE_NEON +unsigned int +_gcry_sha1_transform_armv7_neon (void *state, const unsigned char *data, + size_t nblks); + +static unsigned int +do_sha1_transform_armv7_neon (void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA1_CONTEXT *hd = ctx; + return _gcry_sha1_transform_armv7_neon (&hd->h0, data, nblks); +} +#endif + +#ifdef USE_ARM_CE +unsigned int +_gcry_sha1_transform_armv8_ce (void *state, const unsigned char *data, + size_t nblks); + +static unsigned int +do_sha1_transform_armv8_ce (void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA1_CONTEXT *hd = ctx; + return _gcry_sha1_transform_armv8_ce (&hd->h0, data, nblks); +} +#endif + + +static unsigned int +do_transform_generic (void *c, const unsigned char *data, size_t nblks); static void @@ -133,29 +239,38 @@ sha1_init (void *context, unsigned int flags) hd->bctx.nblocks_high = 0; hd->bctx.count = 0; hd->bctx.blocksize = 64; - hd->bctx.bwrite = transform; + /* Order of feature checks is important here; last match will be + * selected. Keep slower implementations at the top and faster at + * the bottom. */ + hd->bctx.bwrite = do_transform_generic; #ifdef USE_SSSE3 - hd->use_ssse3 = (features & HWF_INTEL_SSSE3) != 0; + if ((features & HWF_INTEL_SSSE3) != 0) + hd->bctx.bwrite = do_sha1_transform_amd64_ssse3; #endif #ifdef USE_AVX /* AVX implementation uses SHLD which is known to be slow on non-Intel CPUs. * Therefore use this implementation on Intel CPUs only. 
*/ - hd->use_avx = (features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD); + if ((features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD)) + hd->bctx.bwrite = do_sha1_transform_amd64_avx; #endif #ifdef USE_BMI2 - hd->use_bmi2 = (features & HWF_INTEL_AVX) && (features & HWF_INTEL_BMI2); + if ((features & HWF_INTEL_AVX) && (features & HWF_INTEL_BMI2)) + hd->bctx.bwrite = do_sha1_transform_amd64_avx_bmi2; #endif #ifdef USE_SHAEXT - hd->use_shaext = (features & HWF_INTEL_SHAEXT) - && (features & HWF_INTEL_SSE4_1); + if ((features & HWF_INTEL_SHAEXT) && (features & HWF_INTEL_SSE4_1)) + hd->bctx.bwrite = do_sha1_transform_intel_shaext; #endif #ifdef USE_NEON - hd->use_neon = (features & HWF_ARM_NEON) != 0; + if ((features & HWF_ARM_NEON) != 0) + hd->bctx.bwrite = do_sha1_transform_armv7_neon; #endif #ifdef USE_ARM_CE - hd->use_arm_ce = (features & HWF_ARM_SHA1) != 0; + if ((features & HWF_ARM_SHA1) != 0) + hd->bctx.bwrite = do_sha1_transform_armv8_ce; #endif + (void)features; } @@ -192,30 +307,20 @@ _gcry_sha1_mixblock_init (SHA1_CONTEXT *hd) b = rol( b, 30 ); \ } while(0) - -#ifdef USE_NEON -unsigned int -_gcry_sha1_transform_armv7_neon (void *state, const unsigned char *data, - size_t nblks); -#endif - -#ifdef USE_ARM_CE -unsigned int -_gcry_sha1_transform_armv8_ce (void *state, const unsigned char *data, - size_t nblks); -#endif - /* * Transform NBLOCKS of each 64 bytes (16 32-bit words) at DATA. */ static unsigned int -transform_blk (void *ctx, const unsigned char *data) +do_transform_generic (void *ctx, const unsigned char *data, size_t nblks) { SHA1_CONTEXT *hd = ctx; - const u32 *idata = (const void *)data; - register u32 a, b, c, d, e; /* Local copies of the chaining variables. */ - register u32 tm; /* Helper. */ - u32 x[16]; /* The array we work on. */ + + do + { + const u32 *idata = (const void *)data; + u32 a, b, c, d, e; /* Local copies of the chaining variables. */ + u32 tm; /* Helper. */ + u32 x[16]; /* The array we work on. */ #define I(i) (x[i] = buf_get_be32(idata + i)) @@ -315,123 +420,11 @@ transform_blk (void *ctx, const unsigned char *data) hd->h3 += d; hd->h4 += e; - return /* burn_stack */ 88+4*sizeof(void*); -} - - -/* Assembly implementations use SystemV ABI, ABI conversion and additional - * stack to store XMM6-XMM15 needed on Win64. */ -#undef ASM_FUNC_ABI -#undef ASM_EXTRA_STACK -#if defined(USE_SSSE3) || defined(USE_AVX) || defined(USE_BMI2) || \ - defined(USE_SHAEXT) -# ifdef HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS -# define ASM_FUNC_ABI __attribute__((sysv_abi)) -# define ASM_EXTRA_STACK (10 * 16) -# else -# define ASM_FUNC_ABI -# define ASM_EXTRA_STACK 0 -# endif -#endif - - -#ifdef USE_SSSE3 -unsigned int -_gcry_sha1_transform_amd64_ssse3 (void *state, const unsigned char *data, - size_t nblks) ASM_FUNC_ABI; -#endif - -#ifdef USE_AVX -unsigned int -_gcry_sha1_transform_amd64_avx (void *state, const unsigned char *data, - size_t nblks) ASM_FUNC_ABI; -#endif - -#ifdef USE_BMI2 -unsigned int -_gcry_sha1_transform_amd64_avx_bmi2 (void *state, const unsigned char *data, - size_t nblks) ASM_FUNC_ABI; -#endif - -#ifdef USE_SHAEXT -/* Does not need ASM_FUNC_ABI */ -unsigned int -_gcry_sha1_transform_intel_shaext (void *state, const unsigned char *data, - size_t nblks); -#endif - - -static unsigned int -transform (void *ctx, const unsigned char *data, size_t nblks) -{ - SHA1_CONTEXT *hd = ctx; - unsigned int burn; - -#ifdef USE_SHAEXT - if (hd->use_shaext) - { - burn = _gcry_sha1_transform_intel_shaext (&hd->h0, data, nblks); - burn += burn ? 
4 * sizeof(void*) + ASM_EXTRA_STACK : 0; - return burn; - } -#endif -#ifdef USE_BMI2 - if (hd->use_bmi2) - { - burn = _gcry_sha1_transform_amd64_avx_bmi2 (&hd->h0, data, nblks); - burn += burn ? 4 * sizeof(void*) + ASM_EXTRA_STACK : 0; - return burn; - } -#endif -#ifdef USE_AVX - if (hd->use_avx) - { - burn = _gcry_sha1_transform_amd64_avx (&hd->h0, data, nblks); - burn += burn ? 4 * sizeof(void*) + ASM_EXTRA_STACK : 0; - return burn; - } -#endif -#ifdef USE_SSSE3 - if (hd->use_ssse3) - { - burn = _gcry_sha1_transform_amd64_ssse3 (&hd->h0, data, nblks); - burn += burn ? 4 * sizeof(void*) + ASM_EXTRA_STACK : 0; - return burn; - } -#endif -#ifdef USE_ARM_CE - if (hd->use_arm_ce) - { - burn = _gcry_sha1_transform_armv8_ce (&hd->h0, data, nblks); - burn += burn ? 4 * sizeof(void*) : 0; - return burn; - } -#endif -#ifdef USE_NEON - if (hd->use_neon) - { - burn = _gcry_sha1_transform_armv7_neon (&hd->h0, data, nblks); - burn += burn ? 4 * sizeof(void*) : 0; - return burn; - } -#endif - - do - { - burn = transform_blk (hd, data); data += 64; } while (--nblks); -#ifdef ASM_EXTRA_STACK - /* 'transform_blk' is typically inlined and XMM6-XMM15 are stored at - * the prologue of this function. Therefore need to add ASM_EXTRA_STACK to - * here too. - */ - burn += ASM_EXTRA_STACK; -#endif - - return burn; + return 88+4*sizeof(void*); } @@ -451,7 +444,7 @@ _gcry_sha1_mixblock (SHA1_CONTEXT *hd, void *blockof64byte) u32 *p = blockof64byte; unsigned int nburn; - nburn = transform (hd, blockof64byte, 1); + nburn = (*hd->bctx.bwrite) (hd, blockof64byte, 1); p[0] = hd->h0; p[1] = hd->h1; p[2] = hd->h2; @@ -515,7 +508,7 @@ sha1_final(void *context) /* append the 64 bit count */ buf_put_be32(hd->bctx.buf + 56, msb); buf_put_be32(hd->bctx.buf + 60, lsb); - burn = transform( hd, hd->bctx.buf, 1 ); + burn = (*hd->bctx.bwrite) ( hd, hd->bctx.buf, 1 ); _gcry_burn_stack (burn); p = hd->bctx.buf; diff --git a/cipher/sha1.h b/cipher/sha1.h index 93ce79b5c..acf764baa 100644 --- a/cipher/sha1.h +++ b/cipher/sha1.h @@ -26,12 +26,6 @@ typedef struct { gcry_md_block_ctx_t bctx; u32 h0,h1,h2,h3,h4; - unsigned int use_ssse3:1; - unsigned int use_avx:1; - unsigned int use_bmi2:1; - unsigned int use_shaext:1; - unsigned int use_neon:1; - unsigned int use_arm_ce:1; } SHA1_CONTEXT; diff --git a/cipher/sha256.c b/cipher/sha256.c index 069597074..e82a9d902 100644 --- a/cipher/sha256.c +++ b/cipher/sha256.c @@ -102,26 +102,103 @@ typedef struct { gcry_md_block_ctx_t bctx; u32 h0,h1,h2,h3,h4,h5,h6,h7; +} SHA256_CONTEXT; + + +/* Assembly implementations use SystemV ABI, ABI conversion and additional + * stack to store XMM6-XMM15 needed on Win64. 
*/ +#undef ASM_FUNC_ABI +#undef ASM_EXTRA_STACK +#if defined(USE_SSSE3) || defined(USE_AVX) || defined(USE_AVX2) || \ + defined(USE_SHAEXT) +# ifdef HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS +# define ASM_FUNC_ABI __attribute__((sysv_abi)) +# define ASM_EXTRA_STACK (10 * 16 + sizeof(void *) * 4) +# else +# define ASM_FUNC_ABI +# define ASM_EXTRA_STACK 0 +# endif +#endif + + #ifdef USE_SSSE3 - unsigned int use_ssse3:1; +unsigned int _gcry_sha256_transform_amd64_ssse3(const void *input_data, + u32 state[8], + size_t num_blks) ASM_FUNC_ABI; + +static unsigned int +do_sha256_transform_amd64_ssse3(void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA256_CONTEXT *hd = ctx; + return _gcry_sha256_transform_amd64_ssse3 (data, &hd->h0, nblks) + + ASM_EXTRA_STACK; +} #endif + #ifdef USE_AVX - unsigned int use_avx:1; +unsigned int _gcry_sha256_transform_amd64_avx(const void *input_data, + u32 state[8], + size_t num_blks) ASM_FUNC_ABI; + +static unsigned int +do_sha256_transform_amd64_avx(void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA256_CONTEXT *hd = ctx; + return _gcry_sha256_transform_amd64_avx (data, &hd->h0, nblks) + + ASM_EXTRA_STACK; +} #endif + #ifdef USE_AVX2 - unsigned int use_avx2:1; +unsigned int _gcry_sha256_transform_amd64_avx2(const void *input_data, + u32 state[8], + size_t num_blks) ASM_FUNC_ABI; + +static unsigned int +do_sha256_transform_amd64_avx2(void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA256_CONTEXT *hd = ctx; + return _gcry_sha256_transform_amd64_avx2 (data, &hd->h0, nblks) + + ASM_EXTRA_STACK; +} #endif + #ifdef USE_SHAEXT - unsigned int use_shaext:1; +/* Does not need ASM_FUNC_ABI */ +unsigned int +_gcry_sha256_transform_intel_shaext(u32 state[8], + const unsigned char *input_data, + size_t num_blks); + +static unsigned int +do_sha256_transform_intel_shaext(void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA256_CONTEXT *hd = ctx; + return _gcry_sha256_transform_intel_shaext (&hd->h0, data, nblks); +} #endif + #ifdef USE_ARM_CE - unsigned int use_arm_ce:1; +unsigned int _gcry_sha256_transform_armv8_ce(u32 state[8], + const void *input_data, + size_t num_blks); + +static unsigned int +do_sha256_transform_armv8_ce(void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA256_CONTEXT *hd = ctx; + return _gcry_sha256_transform_armv8_ce (&hd->h0, data, nblks); +} #endif -} SHA256_CONTEXT; static unsigned int -transform (void *c, const unsigned char *data, size_t nblks); +do_transform_generic (void *ctx, const unsigned char *data, size_t nblks); static void @@ -145,25 +222,32 @@ sha256_init (void *context, unsigned int flags) hd->bctx.nblocks_high = 0; hd->bctx.count = 0; hd->bctx.blocksize = 64; - hd->bctx.bwrite = transform; + /* Order of feature checks is important here; last match will be + * selected. Keep slower implementations at the top and faster at + * the bottom. */ + hd->bctx.bwrite = do_transform_generic; #ifdef USE_SSSE3 - hd->use_ssse3 = (features & HWF_INTEL_SSSE3) != 0; + if ((features & HWF_INTEL_SSSE3) != 0) + hd->bctx.bwrite = do_sha256_transform_amd64_ssse3; #endif #ifdef USE_AVX /* AVX implementation uses SHLD which is known to be slow on non-Intel CPUs. * Therefore use this implementation on Intel CPUs only. 
*/ - hd->use_avx = (features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD); + if ((features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD)) + hd->bctx.bwrite = do_sha256_transform_amd64_avx; #endif #ifdef USE_AVX2 - hd->use_avx2 = (features & HWF_INTEL_AVX2) && (features & HWF_INTEL_BMI2); + if ((features & HWF_INTEL_AVX2) && (features & HWF_INTEL_BMI2)) + hd->bctx.bwrite = do_sha256_transform_amd64_avx2; #endif #ifdef USE_SHAEXT - hd->use_shaext = (features & HWF_INTEL_SHAEXT) - && (features & HWF_INTEL_SSE4_1); + if ((features & HWF_INTEL_SHAEXT) && (features & HWF_INTEL_SSE4_1)) + hd->bctx.bwrite = do_sha256_transform_intel_shaext; #endif #ifdef USE_ARM_CE - hd->use_arm_ce = (features & HWF_ARM_SHA2) != 0; + if ((features & HWF_ARM_SHA2) != 0) + hd->bctx.bwrite = do_sha256_transform_armv8_ce; #endif (void)features; } @@ -190,25 +274,32 @@ sha224_init (void *context, unsigned int flags) hd->bctx.nblocks_high = 0; hd->bctx.count = 0; hd->bctx.blocksize = 64; - hd->bctx.bwrite = transform; + /* Order of feature checks is important here; last match will be + * selected. Keep slower implementations at the top and faster at + * the bottom. */ + hd->bctx.bwrite = do_transform_generic; #ifdef USE_SSSE3 - hd->use_ssse3 = (features & HWF_INTEL_SSSE3) != 0; + if ((features & HWF_INTEL_SSSE3) != 0) + hd->bctx.bwrite = do_sha256_transform_amd64_ssse3; #endif #ifdef USE_AVX /* AVX implementation uses SHLD which is known to be slow on non-Intel CPUs. * Therefore use this implementation on Intel CPUs only. */ - hd->use_avx = (features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD); + if ((features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD)) + hd->bctx.bwrite = do_sha256_transform_amd64_avx; #endif #ifdef USE_AVX2 - hd->use_avx2 = (features & HWF_INTEL_AVX2) && (features & HWF_INTEL_BMI2); + if ((features & HWF_INTEL_AVX2) && (features & HWF_INTEL_BMI2)) + hd->bctx.bwrite = do_sha256_transform_amd64_avx2; #endif #ifdef USE_SHAEXT - hd->use_shaext = (features & HWF_INTEL_SHAEXT) - && (features & HWF_INTEL_SSE4_1); + if ((features & HWF_INTEL_SHAEXT) && (features & HWF_INTEL_SSE4_1)) + hd->bctx.bwrite = do_sha256_transform_intel_shaext; #endif #ifdef USE_ARM_CE - hd->use_arm_ce = (features & HWF_ARM_SHA2) != 0; + if ((features & HWF_ARM_SHA2) != 0) + hd->bctx.bwrite = do_sha256_transform_armv8_ce; #endif (void)features; } @@ -247,7 +338,7 @@ sha224_init (void *context, unsigned int flags) + w[(i-16)&0x0f] ) static unsigned int -transform_blk (void *ctx, const unsigned char *data) +do_transform_generic (void *ctx, const unsigned char *data, size_t nblks) { SHA256_CONTEXT *hd = ctx; static const u32 K[64] = { @@ -269,219 +360,109 @@ transform_blk (void *ctx, const unsigned char *data) 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 }; - u32 a,b,c,d,e,f,g,h,t1,t2; - u32 w[16]; - - a = hd->h0; - b = hd->h1; - c = hd->h2; - d = hd->h3; - e = hd->h4; - f = hd->h5; - g = hd->h6; - h = hd->h7; - - R(a, b, c, d, e, f, g, h, K[0], I(0)); - R(h, a, b, c, d, e, f, g, K[1], I(1)); - R(g, h, a, b, c, d, e, f, K[2], I(2)); - R(f, g, h, a, b, c, d, e, K[3], I(3)); - R(e, f, g, h, a, b, c, d, K[4], I(4)); - R(d, e, f, g, h, a, b, c, K[5], I(5)); - R(c, d, e, f, g, h, a, b, K[6], I(6)); - R(b, c, d, e, f, g, h, a, K[7], I(7)); - R(a, b, c, d, e, f, g, h, K[8], I(8)); - R(h, a, b, c, d, e, f, g, K[9], I(9)); - R(g, h, a, b, c, d, e, f, K[10], I(10)); - R(f, g, h, a, b, c, d, e, K[11], I(11)); - R(e, f, g, h, a, b, c, d, K[12], I(12)); - R(d, e, f, g, h, a, b, c, K[13], I(13)); - R(c, d, e, f, g, 
h, a, b, K[14], I(14)); - R(b, c, d, e, f, g, h, a, K[15], I(15)); - - R(a, b, c, d, e, f, g, h, K[16], W(16)); - R(h, a, b, c, d, e, f, g, K[17], W(17)); - R(g, h, a, b, c, d, e, f, K[18], W(18)); - R(f, g, h, a, b, c, d, e, K[19], W(19)); - R(e, f, g, h, a, b, c, d, K[20], W(20)); - R(d, e, f, g, h, a, b, c, K[21], W(21)); - R(c, d, e, f, g, h, a, b, K[22], W(22)); - R(b, c, d, e, f, g, h, a, K[23], W(23)); - R(a, b, c, d, e, f, g, h, K[24], W(24)); - R(h, a, b, c, d, e, f, g, K[25], W(25)); - R(g, h, a, b, c, d, e, f, K[26], W(26)); - R(f, g, h, a, b, c, d, e, K[27], W(27)); - R(e, f, g, h, a, b, c, d, K[28], W(28)); - R(d, e, f, g, h, a, b, c, K[29], W(29)); - R(c, d, e, f, g, h, a, b, K[30], W(30)); - R(b, c, d, e, f, g, h, a, K[31], W(31)); - - R(a, b, c, d, e, f, g, h, K[32], W(32)); - R(h, a, b, c, d, e, f, g, K[33], W(33)); - R(g, h, a, b, c, d, e, f, K[34], W(34)); - R(f, g, h, a, b, c, d, e, K[35], W(35)); - R(e, f, g, h, a, b, c, d, K[36], W(36)); - R(d, e, f, g, h, a, b, c, K[37], W(37)); - R(c, d, e, f, g, h, a, b, K[38], W(38)); - R(b, c, d, e, f, g, h, a, K[39], W(39)); - R(a, b, c, d, e, f, g, h, K[40], W(40)); - R(h, a, b, c, d, e, f, g, K[41], W(41)); - R(g, h, a, b, c, d, e, f, K[42], W(42)); - R(f, g, h, a, b, c, d, e, K[43], W(43)); - R(e, f, g, h, a, b, c, d, K[44], W(44)); - R(d, e, f, g, h, a, b, c, K[45], W(45)); - R(c, d, e, f, g, h, a, b, K[46], W(46)); - R(b, c, d, e, f, g, h, a, K[47], W(47)); - - R(a, b, c, d, e, f, g, h, K[48], W(48)); - R(h, a, b, c, d, e, f, g, K[49], W(49)); - R(g, h, a, b, c, d, e, f, K[50], W(50)); - R(f, g, h, a, b, c, d, e, K[51], W(51)); - R(e, f, g, h, a, b, c, d, K[52], W(52)); - R(d, e, f, g, h, a, b, c, K[53], W(53)); - R(c, d, e, f, g, h, a, b, K[54], W(54)); - R(b, c, d, e, f, g, h, a, K[55], W(55)); - R(a, b, c, d, e, f, g, h, K[56], W(56)); - R(h, a, b, c, d, e, f, g, K[57], W(57)); - R(g, h, a, b, c, d, e, f, K[58], W(58)); - R(f, g, h, a, b, c, d, e, K[59], W(59)); - R(e, f, g, h, a, b, c, d, K[60], W(60)); - R(d, e, f, g, h, a, b, c, K[61], W(61)); - R(c, d, e, f, g, h, a, b, K[62], W(62)); - R(b, c, d, e, f, g, h, a, K[63], W(63)); - - hd->h0 += a; - hd->h1 += b; - hd->h2 += c; - hd->h3 += d; - hd->h4 += e; - hd->h5 += f; - hd->h6 += g; - hd->h7 += h; - - return /*burn_stack*/ 26*4+32; -} -#undef S0 -#undef S1 -#undef R - - -/* Assembly implementations use SystemV ABI, ABI conversion and additional - * stack to store XMM6-XMM15 needed on Win64. 
*/ -#undef ASM_FUNC_ABI -#undef ASM_EXTRA_STACK -#if defined(USE_SSSE3) || defined(USE_AVX) || defined(USE_AVX2) || \ - defined(USE_SHAEXT) -# ifdef HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS -# define ASM_FUNC_ABI __attribute__((sysv_abi)) -# define ASM_EXTRA_STACK (10 * 16) -# else -# define ASM_FUNC_ABI -# define ASM_EXTRA_STACK 0 -# endif -#endif - - -#ifdef USE_SSSE3 -unsigned int _gcry_sha256_transform_amd64_ssse3(const void *input_data, - u32 state[8], - size_t num_blks) ASM_FUNC_ABI; -#endif - -#ifdef USE_AVX -unsigned int _gcry_sha256_transform_amd64_avx(const void *input_data, - u32 state[8], - size_t num_blks) ASM_FUNC_ABI; -#endif - -#ifdef USE_AVX2 -unsigned int _gcry_sha256_transform_amd64_avx2(const void *input_data, - u32 state[8], - size_t num_blks) ASM_FUNC_ABI; -#endif - -#ifdef USE_SHAEXT -/* Does not need ASM_FUNC_ABI */ -unsigned int -_gcry_sha256_transform_intel_shaext(u32 state[8], - const unsigned char *input_data, - size_t num_blks); -#endif - -#ifdef USE_ARM_CE -unsigned int _gcry_sha256_transform_armv8_ce(u32 state[8], - const void *input_data, - size_t num_blks); -#endif - -static unsigned int -transform (void *ctx, const unsigned char *data, size_t nblks) -{ - SHA256_CONTEXT *hd = ctx; - unsigned int burn; - -#ifdef USE_SHAEXT - if (hd->use_shaext) - { - burn = _gcry_sha256_transform_intel_shaext (&hd->h0, data, nblks); - burn += burn ? 4 * sizeof(void*) + ASM_EXTRA_STACK : 0; - return burn; - } -#endif - -#ifdef USE_AVX2 - if (hd->use_avx2) - { - burn = _gcry_sha256_transform_amd64_avx2 (data, &hd->h0, nblks); - burn += burn ? 4 * sizeof(void*) + ASM_EXTRA_STACK : 0; - return burn; - } -#endif - -#ifdef USE_AVX - if (hd->use_avx) - { - burn = _gcry_sha256_transform_amd64_avx (data, &hd->h0, nblks); - burn += burn ? 4 * sizeof(void*) + ASM_EXTRA_STACK : 0; - return burn; - } -#endif - -#ifdef USE_SSSE3 - if (hd->use_ssse3) + do { - burn = _gcry_sha256_transform_amd64_ssse3 (data, &hd->h0, nblks); - burn += burn ? 4 * sizeof(void*) + ASM_EXTRA_STACK : 0; - return burn; - } -#endif -#ifdef USE_ARM_CE - if (hd->use_arm_ce) - { - burn = _gcry_sha256_transform_armv8_ce (&hd->h0, data, nblks); - burn += burn ? 
4 * sizeof(void*) : 0; - return burn; - } -#endif + u32 a,b,c,d,e,f,g,h,t1,t2; + u32 w[16]; + + a = hd->h0; + b = hd->h1; + c = hd->h2; + d = hd->h3; + e = hd->h4; + f = hd->h5; + g = hd->h6; + h = hd->h7; + + R(a, b, c, d, e, f, g, h, K[0], I(0)); + R(h, a, b, c, d, e, f, g, K[1], I(1)); + R(g, h, a, b, c, d, e, f, K[2], I(2)); + R(f, g, h, a, b, c, d, e, K[3], I(3)); + R(e, f, g, h, a, b, c, d, K[4], I(4)); + R(d, e, f, g, h, a, b, c, K[5], I(5)); + R(c, d, e, f, g, h, a, b, K[6], I(6)); + R(b, c, d, e, f, g, h, a, K[7], I(7)); + R(a, b, c, d, e, f, g, h, K[8], I(8)); + R(h, a, b, c, d, e, f, g, K[9], I(9)); + R(g, h, a, b, c, d, e, f, K[10], I(10)); + R(f, g, h, a, b, c, d, e, K[11], I(11)); + R(e, f, g, h, a, b, c, d, K[12], I(12)); + R(d, e, f, g, h, a, b, c, K[13], I(13)); + R(c, d, e, f, g, h, a, b, K[14], I(14)); + R(b, c, d, e, f, g, h, a, K[15], I(15)); + + R(a, b, c, d, e, f, g, h, K[16], W(16)); + R(h, a, b, c, d, e, f, g, K[17], W(17)); + R(g, h, a, b, c, d, e, f, K[18], W(18)); + R(f, g, h, a, b, c, d, e, K[19], W(19)); + R(e, f, g, h, a, b, c, d, K[20], W(20)); + R(d, e, f, g, h, a, b, c, K[21], W(21)); + R(c, d, e, f, g, h, a, b, K[22], W(22)); + R(b, c, d, e, f, g, h, a, K[23], W(23)); + R(a, b, c, d, e, f, g, h, K[24], W(24)); + R(h, a, b, c, d, e, f, g, K[25], W(25)); + R(g, h, a, b, c, d, e, f, K[26], W(26)); + R(f, g, h, a, b, c, d, e, K[27], W(27)); + R(e, f, g, h, a, b, c, d, K[28], W(28)); + R(d, e, f, g, h, a, b, c, K[29], W(29)); + R(c, d, e, f, g, h, a, b, K[30], W(30)); + R(b, c, d, e, f, g, h, a, K[31], W(31)); + + R(a, b, c, d, e, f, g, h, K[32], W(32)); + R(h, a, b, c, d, e, f, g, K[33], W(33)); + R(g, h, a, b, c, d, e, f, K[34], W(34)); + R(f, g, h, a, b, c, d, e, K[35], W(35)); + R(e, f, g, h, a, b, c, d, K[36], W(36)); + R(d, e, f, g, h, a, b, c, K[37], W(37)); + R(c, d, e, f, g, h, a, b, K[38], W(38)); + R(b, c, d, e, f, g, h, a, K[39], W(39)); + R(a, b, c, d, e, f, g, h, K[40], W(40)); + R(h, a, b, c, d, e, f, g, K[41], W(41)); + R(g, h, a, b, c, d, e, f, K[42], W(42)); + R(f, g, h, a, b, c, d, e, K[43], W(43)); + R(e, f, g, h, a, b, c, d, K[44], W(44)); + R(d, e, f, g, h, a, b, c, K[45], W(45)); + R(c, d, e, f, g, h, a, b, K[46], W(46)); + R(b, c, d, e, f, g, h, a, K[47], W(47)); + + R(a, b, c, d, e, f, g, h, K[48], W(48)); + R(h, a, b, c, d, e, f, g, K[49], W(49)); + R(g, h, a, b, c, d, e, f, K[50], W(50)); + R(f, g, h, a, b, c, d, e, K[51], W(51)); + R(e, f, g, h, a, b, c, d, K[52], W(52)); + R(d, e, f, g, h, a, b, c, K[53], W(53)); + R(c, d, e, f, g, h, a, b, K[54], W(54)); + R(b, c, d, e, f, g, h, a, K[55], W(55)); + R(a, b, c, d, e, f, g, h, K[56], W(56)); + R(h, a, b, c, d, e, f, g, K[57], W(57)); + R(g, h, a, b, c, d, e, f, K[58], W(58)); + R(f, g, h, a, b, c, d, e, K[59], W(59)); + R(e, f, g, h, a, b, c, d, K[60], W(60)); + R(d, e, f, g, h, a, b, c, K[61], W(61)); + R(c, d, e, f, g, h, a, b, K[62], W(62)); + R(b, c, d, e, f, g, h, a, K[63], W(63)); + + hd->h0 += a; + hd->h1 += b; + hd->h2 += c; + hd->h3 += d; + hd->h4 += e; + hd->h5 += f; + hd->h6 += g; + hd->h7 += h; - do - { - burn = transform_blk (hd, data); data += 64; } while (--nblks); -#ifdef ASM_EXTRA_STACK - /* 'transform_blk' is typically inlined and XMM6-XMM15 are stored at - * the prologue of this function. Therefore need to add ASM_EXTRA_STACK to - * here too. 
- */ - burn += ASM_EXTRA_STACK; -#endif - - return burn; + return 26*4 + 32 + 3 * sizeof(void*); } +#undef S0 +#undef S1 +#undef R + /* The routine finally terminates the computation and returns the @@ -534,7 +515,7 @@ sha256_final(void *context) /* append the 64 bit count */ buf_put_be32(hd->bctx.buf + 56, msb); buf_put_be32(hd->bctx.buf + 60, lsb); - burn = transform (hd, hd->bctx.buf, 1); + burn = (*hd->bctx.bwrite) (hd, hd->bctx.buf, 1); _gcry_burn_stack (burn); p = hd->bctx.buf; diff --git a/cipher/sha512-armv7-neon.S b/cipher/sha512-armv7-neon.S index a9d127245..6596f2cdb 100644 --- a/cipher/sha512-armv7-neon.S +++ b/cipher/sha512-armv7-neon.S @@ -443,6 +443,7 @@ _gcry_sha512_transform_armv7_neon: veor.u64 %q2, %q2; veor.u64 %q3, %q3; + eor %r0, %r0; pop {%pc}; .size _gcry_sha512_transform_armv7_neon,.-_gcry_sha512_transform_armv7_neon; diff --git a/cipher/sha512.c b/cipher/sha512.c index 9405de80b..721f34054 100644 --- a/cipher/sha512.c +++ b/cipher/sha512.c @@ -113,22 +113,145 @@ typedef struct { gcry_md_block_ctx_t bctx; SHA512_STATE state; +} SHA512_CONTEXT; + + +static const u64 k[] = + { + U64_C(0x428a2f98d728ae22), U64_C(0x7137449123ef65cd), + U64_C(0xb5c0fbcfec4d3b2f), U64_C(0xe9b5dba58189dbbc), + U64_C(0x3956c25bf348b538), U64_C(0x59f111f1b605d019), + U64_C(0x923f82a4af194f9b), U64_C(0xab1c5ed5da6d8118), + U64_C(0xd807aa98a3030242), U64_C(0x12835b0145706fbe), + U64_C(0x243185be4ee4b28c), U64_C(0x550c7dc3d5ffb4e2), + U64_C(0x72be5d74f27b896f), U64_C(0x80deb1fe3b1696b1), + U64_C(0x9bdc06a725c71235), U64_C(0xc19bf174cf692694), + U64_C(0xe49b69c19ef14ad2), U64_C(0xefbe4786384f25e3), + U64_C(0x0fc19dc68b8cd5b5), U64_C(0x240ca1cc77ac9c65), + U64_C(0x2de92c6f592b0275), U64_C(0x4a7484aa6ea6e483), + U64_C(0x5cb0a9dcbd41fbd4), U64_C(0x76f988da831153b5), + U64_C(0x983e5152ee66dfab), U64_C(0xa831c66d2db43210), + U64_C(0xb00327c898fb213f), U64_C(0xbf597fc7beef0ee4), + U64_C(0xc6e00bf33da88fc2), U64_C(0xd5a79147930aa725), + U64_C(0x06ca6351e003826f), U64_C(0x142929670a0e6e70), + U64_C(0x27b70a8546d22ffc), U64_C(0x2e1b21385c26c926), + U64_C(0x4d2c6dfc5ac42aed), U64_C(0x53380d139d95b3df), + U64_C(0x650a73548baf63de), U64_C(0x766a0abb3c77b2a8), + U64_C(0x81c2c92e47edaee6), U64_C(0x92722c851482353b), + U64_C(0xa2bfe8a14cf10364), U64_C(0xa81a664bbc423001), + U64_C(0xc24b8b70d0f89791), U64_C(0xc76c51a30654be30), + U64_C(0xd192e819d6ef5218), U64_C(0xd69906245565a910), + U64_C(0xf40e35855771202a), U64_C(0x106aa07032bbd1b8), + U64_C(0x19a4c116b8d2d0c8), U64_C(0x1e376c085141ab53), + U64_C(0x2748774cdf8eeb99), U64_C(0x34b0bcb5e19b48a8), + U64_C(0x391c0cb3c5c95a63), U64_C(0x4ed8aa4ae3418acb), + U64_C(0x5b9cca4f7763e373), U64_C(0x682e6ff3d6b2b8a3), + U64_C(0x748f82ee5defb2fc), U64_C(0x78a5636f43172f60), + U64_C(0x84c87814a1f0ab72), U64_C(0x8cc702081a6439ec), + U64_C(0x90befffa23631e28), U64_C(0xa4506cebde82bde9), + U64_C(0xbef9a3f7b2c67915), U64_C(0xc67178f2e372532b), + U64_C(0xca273eceea26619c), U64_C(0xd186b8c721c0c207), + U64_C(0xeada7dd6cde0eb1e), U64_C(0xf57d4f7fee6ed178), + U64_C(0x06f067aa72176fba), U64_C(0x0a637dc5a2c898a6), + U64_C(0x113f9804bef90dae), U64_C(0x1b710b35131c471b), + U64_C(0x28db77f523047d84), U64_C(0x32caab7b40c72493), + U64_C(0x3c9ebe0a15c9bebc), U64_C(0x431d67c49c100d4c), + U64_C(0x4cc5d4becb3e42b6), U64_C(0x597f299cfc657e2a), + U64_C(0x5fcb6fab3ad6faec), U64_C(0x6c44198c4a475817) + }; + + +/* AMD64 assembly implementations use SystemV ABI, ABI conversion and additional + * stack to store XMM6-XMM15 needed on Win64. 
*/ +#undef ASM_FUNC_ABI +#undef ASM_EXTRA_STACK +#if defined(USE_SSSE3) || defined(USE_AVX) || defined(USE_AVX2) +# ifdef HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS +# define ASM_FUNC_ABI __attribute__((sysv_abi)) +# define ASM_EXTRA_STACK (10 * 16 + 4 * sizeof(void *)) +# else +# define ASM_FUNC_ABI +# define ASM_EXTRA_STACK 0 +# endif +#endif + + #ifdef USE_ARM_NEON_ASM - unsigned int use_neon:1; +unsigned int _gcry_sha512_transform_armv7_neon (SHA512_STATE *hd, + const unsigned char *data, + const u64 k[], size_t num_blks); + +static unsigned int +do_sha512_transform_armv7_neon(void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA512_CONTEXT *hd = ctx; + return _gcry_sha512_transform_armv7_neon (&hd->state, data, k, nblks); +} #endif + #ifdef USE_SSSE3 - unsigned int use_ssse3:1; +unsigned int _gcry_sha512_transform_amd64_ssse3(const void *input_data, + void *state, + size_t num_blks) ASM_FUNC_ABI; + +static unsigned int +do_sha512_transform_amd64_ssse3(void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA512_CONTEXT *hd = ctx; + return _gcry_sha512_transform_amd64_ssse3 (data, &hd->state, nblks) + + ASM_EXTRA_STACK; +} #endif + #ifdef USE_AVX - unsigned int use_avx:1; +unsigned int _gcry_sha512_transform_amd64_avx(const void *input_data, + void *state, + size_t num_blks) ASM_FUNC_ABI; + +static unsigned int +do_sha512_transform_amd64_avx(void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA512_CONTEXT *hd = ctx; + return _gcry_sha512_transform_amd64_avx (data, &hd->state, nblks) + + ASM_EXTRA_STACK; +} #endif + #ifdef USE_AVX2 - unsigned int use_avx2:1; +unsigned int _gcry_sha512_transform_amd64_avx2(const void *input_data, + void *state, + size_t num_blks) ASM_FUNC_ABI; + +static unsigned int +do_sha512_transform_amd64_avx2(void *ctx, const unsigned char *data, + size_t nblks) +{ + SHA512_CONTEXT *hd = ctx; + return _gcry_sha512_transform_amd64_avx2 (data, &hd->state, nblks) + + ASM_EXTRA_STACK; +} #endif -} SHA512_CONTEXT; + + +#ifdef USE_ARM_ASM +unsigned int _gcry_sha512_transform_arm (SHA512_STATE *hd, + const unsigned char *data, + const u64 k[], size_t num_blks); static unsigned int -transform (void *context, const unsigned char *data, size_t nblks); +do_transform_generic (void *context, const unsigned char *data, size_t nblks) +{ + SHA512_CONTEXT *hd = context; + return _gcry_sha512_transform_armv7_neon (&hd->state, data, k, nblks); +} +#else +static unsigned int +do_transform_generic (void *context, const unsigned char *data, size_t nblks); +#endif + static void sha512_init (void *context, unsigned int flags) @@ -138,6 +261,7 @@ sha512_init (void *context, unsigned int flags) unsigned int features = _gcry_get_hw_features (); (void)flags; + (void)k; hd->h0 = U64_C(0x6a09e667f3bcc908); hd->h1 = U64_C(0xbb67ae8584caa73b); @@ -152,21 +276,27 @@ sha512_init (void *context, unsigned int flags) ctx->bctx.nblocks_high = 0; ctx->bctx.count = 0; ctx->bctx.blocksize = 128; - ctx->bctx.bwrite = transform; + /* Order of feature checks is important here; last match will be + * selected. Keep slower implementations at the top and faster at + * the bottom. 
*/ + ctx->bctx.bwrite = do_transform_generic; #ifdef USE_ARM_NEON_ASM - ctx->use_neon = (features & HWF_ARM_NEON) != 0; + if ((features & HWF_ARM_NEON) != 0) + ctx->bctx.bwrite = do_sha512_transform_armv7_neon; #endif #ifdef USE_SSSE3 - ctx->use_ssse3 = (features & HWF_INTEL_SSSE3) != 0; + if ((features & HWF_INTEL_SSSE3) != 0) + ctx->bctx.bwrite = do_sha512_transform_amd64_ssse3; #endif #ifdef USE_AVX - ctx->use_avx = (features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD); + if ((features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD)) + ctx->bctx.bwrite = do_sha512_transform_amd64_avx; #endif #ifdef USE_AVX2 - ctx->use_avx2 = (features & HWF_INTEL_AVX2) && (features & HWF_INTEL_BMI2); + if ((features & HWF_INTEL_AVX2) && (features & HWF_INTEL_BMI2)) + ctx->bctx.bwrite = do_sha512_transform_amd64_avx2; #endif - (void)features; } @@ -192,69 +322,31 @@ sha384_init (void *context, unsigned int flags) ctx->bctx.nblocks_high = 0; ctx->bctx.count = 0; ctx->bctx.blocksize = 128; - ctx->bctx.bwrite = transform; + /* Order of feature checks is important here; last match will be + * selected. Keep slower implementations at the top and faster at + * the bottom. */ + ctx->bctx.bwrite = do_transform_generic; #ifdef USE_ARM_NEON_ASM - ctx->use_neon = (features & HWF_ARM_NEON) != 0; + if ((features & HWF_ARM_NEON) != 0) + ctx->bctx.bwrite = do_sha512_transform_armv7_neon; #endif #ifdef USE_SSSE3 - ctx->use_ssse3 = (features & HWF_INTEL_SSSE3) != 0; + if ((features & HWF_INTEL_SSSE3) != 0) + ctx->bctx.bwrite = do_sha512_transform_amd64_ssse3; #endif #ifdef USE_AVX - ctx->use_avx = (features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD); + if ((features & HWF_INTEL_AVX) && (features & HWF_INTEL_FAST_SHLD)) + ctx->bctx.bwrite = do_sha512_transform_amd64_avx; #endif #ifdef USE_AVX2 - ctx->use_avx2 = (features & HWF_INTEL_AVX2) && (features & HWF_INTEL_BMI2); + if ((features & HWF_INTEL_AVX2) && (features & HWF_INTEL_BMI2)) + ctx->bctx.bwrite = do_sha512_transform_amd64_avx2; #endif - (void)features; } -static const u64 k[] = - { - U64_C(0x428a2f98d728ae22), U64_C(0x7137449123ef65cd), - U64_C(0xb5c0fbcfec4d3b2f), U64_C(0xe9b5dba58189dbbc), - U64_C(0x3956c25bf348b538), U64_C(0x59f111f1b605d019), - U64_C(0x923f82a4af194f9b), U64_C(0xab1c5ed5da6d8118), - U64_C(0xd807aa98a3030242), U64_C(0x12835b0145706fbe), - U64_C(0x243185be4ee4b28c), U64_C(0x550c7dc3d5ffb4e2), - U64_C(0x72be5d74f27b896f), U64_C(0x80deb1fe3b1696b1), - U64_C(0x9bdc06a725c71235), U64_C(0xc19bf174cf692694), - U64_C(0xe49b69c19ef14ad2), U64_C(0xefbe4786384f25e3), - U64_C(0x0fc19dc68b8cd5b5), U64_C(0x240ca1cc77ac9c65), - U64_C(0x2de92c6f592b0275), U64_C(0x4a7484aa6ea6e483), - U64_C(0x5cb0a9dcbd41fbd4), U64_C(0x76f988da831153b5), - U64_C(0x983e5152ee66dfab), U64_C(0xa831c66d2db43210), - U64_C(0xb00327c898fb213f), U64_C(0xbf597fc7beef0ee4), - U64_C(0xc6e00bf33da88fc2), U64_C(0xd5a79147930aa725), - U64_C(0x06ca6351e003826f), U64_C(0x142929670a0e6e70), - U64_C(0x27b70a8546d22ffc), U64_C(0x2e1b21385c26c926), - U64_C(0x4d2c6dfc5ac42aed), U64_C(0x53380d139d95b3df), - U64_C(0x650a73548baf63de), U64_C(0x766a0abb3c77b2a8), - U64_C(0x81c2c92e47edaee6), U64_C(0x92722c851482353b), - U64_C(0xa2bfe8a14cf10364), U64_C(0xa81a664bbc423001), - U64_C(0xc24b8b70d0f89791), U64_C(0xc76c51a30654be30), - U64_C(0xd192e819d6ef5218), U64_C(0xd69906245565a910), - U64_C(0xf40e35855771202a), U64_C(0x106aa07032bbd1b8), - U64_C(0x19a4c116b8d2d0c8), U64_C(0x1e376c085141ab53), - U64_C(0x2748774cdf8eeb99), U64_C(0x34b0bcb5e19b48a8), - U64_C(0x391c0cb3c5c95a63), 
U64_C(0x4ed8aa4ae3418acb), - U64_C(0x5b9cca4f7763e373), U64_C(0x682e6ff3d6b2b8a3), - U64_C(0x748f82ee5defb2fc), U64_C(0x78a5636f43172f60), - U64_C(0x84c87814a1f0ab72), U64_C(0x8cc702081a6439ec), - U64_C(0x90befffa23631e28), U64_C(0xa4506cebde82bde9), - U64_C(0xbef9a3f7b2c67915), U64_C(0xc67178f2e372532b), - U64_C(0xca273eceea26619c), U64_C(0xd186b8c721c0c207), - U64_C(0xeada7dd6cde0eb1e), U64_C(0xf57d4f7fee6ed178), - U64_C(0x06f067aa72176fba), U64_C(0x0a637dc5a2c898a6), - U64_C(0x113f9804bef90dae), U64_C(0x1b710b35131c471b), - U64_C(0x28db77f523047d84), U64_C(0x32caab7b40c72493), - U64_C(0x3c9ebe0a15c9bebc), U64_C(0x431d67c49c100d4c), - U64_C(0x4cc5d4becb3e42b6), U64_C(0x597f299cfc657e2a), - U64_C(0x5fcb6fab3ad6faec), U64_C(0x6c44198c4a475817) - }; - #ifndef USE_ARM_ASM static inline u64 @@ -291,372 +383,240 @@ Sum1 (u64 x) * Transform the message W which consists of 16 64-bit-words */ static unsigned int -transform_blk (SHA512_STATE *hd, const unsigned char *data) -{ - u64 a, b, c, d, e, f, g, h; - u64 w[16]; - int t; - - /* get values from the chaining vars */ - a = hd->h0; - b = hd->h1; - c = hd->h2; - d = hd->h3; - e = hd->h4; - f = hd->h5; - g = hd->h6; - h = hd->h7; - - for ( t = 0; t < 16; t++ ) - w[t] = buf_get_be64(data + t * 8); - -#define S0(x) (ROTR((x),1) ^ ROTR((x),8) ^ ((x)>>7)) -#define S1(x) (ROTR((x),19) ^ ROTR((x),61) ^ ((x)>>6)) - - for (t = 0; t < 80 - 16; ) - { - u64 t1, t2; - - /* Performance on a AMD Athlon(tm) Dual Core Processor 4050e - with gcc 4.3.3 using gcry_md_hash_buffer of each 10000 bytes - initialized to 0,1,2,3...255,0,... and 1000 iterations: - - Not unrolled with macros: 440ms - Unrolled with macros: 350ms - Unrolled with inline: 330ms - */ -#if 0 /* Not unrolled. */ - t1 = h + Sum1 (e) + Ch (e, f, g) + k[t] + w[t%16]; - w[t%16] += S1 (w[(t - 2)%16]) + w[(t - 7)%16] + S0 (w[(t - 15)%16]); - t2 = Sum0 (a) + Maj (a, b, c); - h = g; - g = f; - f = e; - e = d + t1; - d = c; - c = b; - b = a; - a = t1 + t2; - t++; -#else /* Unrolled to interweave the chain variables. 
*/ - t1 = h + Sum1 (e) + Ch (e, f, g) + k[t] + w[0]; - w[0] += S1 (w[14]) + w[9] + S0 (w[1]); - t2 = Sum0 (a) + Maj (a, b, c); - d += t1; - h = t1 + t2; - - t1 = g + Sum1 (d) + Ch (d, e, f) + k[t+1] + w[1]; - w[1] += S1 (w[15]) + w[10] + S0 (w[2]); - t2 = Sum0 (h) + Maj (h, a, b); - c += t1; - g = t1 + t2; - - t1 = f + Sum1 (c) + Ch (c, d, e) + k[t+2] + w[2]; - w[2] += S1 (w[0]) + w[11] + S0 (w[3]); - t2 = Sum0 (g) + Maj (g, h, a); - b += t1; - f = t1 + t2; - - t1 = e + Sum1 (b) + Ch (b, c, d) + k[t+3] + w[3]; - w[3] += S1 (w[1]) + w[12] + S0 (w[4]); - t2 = Sum0 (f) + Maj (f, g, h); - a += t1; - e = t1 + t2; - - t1 = d + Sum1 (a) + Ch (a, b, c) + k[t+4] + w[4]; - w[4] += S1 (w[2]) + w[13] + S0 (w[5]); - t2 = Sum0 (e) + Maj (e, f, g); - h += t1; - d = t1 + t2; - - t1 = c + Sum1 (h) + Ch (h, a, b) + k[t+5] + w[5]; - w[5] += S1 (w[3]) + w[14] + S0 (w[6]); - t2 = Sum0 (d) + Maj (d, e, f); - g += t1; - c = t1 + t2; - - t1 = b + Sum1 (g) + Ch (g, h, a) + k[t+6] + w[6]; - w[6] += S1 (w[4]) + w[15] + S0 (w[7]); - t2 = Sum0 (c) + Maj (c, d, e); - f += t1; - b = t1 + t2; - - t1 = a + Sum1 (f) + Ch (f, g, h) + k[t+7] + w[7]; - w[7] += S1 (w[5]) + w[0] + S0 (w[8]); - t2 = Sum0 (b) + Maj (b, c, d); - e += t1; - a = t1 + t2; - - t1 = h + Sum1 (e) + Ch (e, f, g) + k[t+8] + w[8]; - w[8] += S1 (w[6]) + w[1] + S0 (w[9]); - t2 = Sum0 (a) + Maj (a, b, c); - d += t1; - h = t1 + t2; - - t1 = g + Sum1 (d) + Ch (d, e, f) + k[t+9] + w[9]; - w[9] += S1 (w[7]) + w[2] + S0 (w[10]); - t2 = Sum0 (h) + Maj (h, a, b); - c += t1; - g = t1 + t2; - - t1 = f + Sum1 (c) + Ch (c, d, e) + k[t+10] + w[10]; - w[10] += S1 (w[8]) + w[3] + S0 (w[11]); - t2 = Sum0 (g) + Maj (g, h, a); - b += t1; - f = t1 + t2; - - t1 = e + Sum1 (b) + Ch (b, c, d) + k[t+11] + w[11]; - w[11] += S1 (w[9]) + w[4] + S0 (w[12]); - t2 = Sum0 (f) + Maj (f, g, h); - a += t1; - e = t1 + t2; - - t1 = d + Sum1 (a) + Ch (a, b, c) + k[t+12] + w[12]; - w[12] += S1 (w[10]) + w[5] + S0 (w[13]); - t2 = Sum0 (e) + Maj (e, f, g); - h += t1; - d = t1 + t2; - - t1 = c + Sum1 (h) + Ch (h, a, b) + k[t+13] + w[13]; - w[13] += S1 (w[11]) + w[6] + S0 (w[14]); - t2 = Sum0 (d) + Maj (d, e, f); - g += t1; - c = t1 + t2; - - t1 = b + Sum1 (g) + Ch (g, h, a) + k[t+14] + w[14]; - w[14] += S1 (w[12]) + w[7] + S0 (w[15]); - t2 = Sum0 (c) + Maj (c, d, e); - f += t1; - b = t1 + t2; - - t1 = a + Sum1 (f) + Ch (f, g, h) + k[t+15] + w[15]; - w[15] += S1 (w[13]) + w[8] + S0 (w[0]); - t2 = Sum0 (b) + Maj (b, c, d); - e += t1; - a = t1 + t2; - - t += 16; -#endif - } - - for (; t < 80; ) - { - u64 t1, t2; - -#if 0 /* Not unrolled. */ - t1 = h + Sum1 (e) + Ch (e, f, g) + k[t] + w[t%16]; - t2 = Sum0 (a) + Maj (a, b, c); - h = g; - g = f; - f = e; - e = d + t1; - d = c; - c = b; - b = a; - a = t1 + t2; - t++; -#else /* Unrolled to interweave the chain variables. 
*/ - t1 = h + Sum1 (e) + Ch (e, f, g) + k[t] + w[0]; - t2 = Sum0 (a) + Maj (a, b, c); - d += t1; - h = t1 + t2; - - t1 = g + Sum1 (d) + Ch (d, e, f) + k[t+1] + w[1]; - t2 = Sum0 (h) + Maj (h, a, b); - c += t1; - g = t1 + t2; - - t1 = f + Sum1 (c) + Ch (c, d, e) + k[t+2] + w[2]; - t2 = Sum0 (g) + Maj (g, h, a); - b += t1; - f = t1 + t2; - - t1 = e + Sum1 (b) + Ch (b, c, d) + k[t+3] + w[3]; - t2 = Sum0 (f) + Maj (f, g, h); - a += t1; - e = t1 + t2; - - t1 = d + Sum1 (a) + Ch (a, b, c) + k[t+4] + w[4]; - t2 = Sum0 (e) + Maj (e, f, g); - h += t1; - d = t1 + t2; - - t1 = c + Sum1 (h) + Ch (h, a, b) + k[t+5] + w[5]; - t2 = Sum0 (d) + Maj (d, e, f); - g += t1; - c = t1 + t2; - - t1 = b + Sum1 (g) + Ch (g, h, a) + k[t+6] + w[6]; - t2 = Sum0 (c) + Maj (c, d, e); - f += t1; - b = t1 + t2; - - t1 = a + Sum1 (f) + Ch (f, g, h) + k[t+7] + w[7]; - t2 = Sum0 (b) + Maj (b, c, d); - e += t1; - a = t1 + t2; - - t1 = h + Sum1 (e) + Ch (e, f, g) + k[t+8] + w[8]; - t2 = Sum0 (a) + Maj (a, b, c); - d += t1; - h = t1 + t2; - - t1 = g + Sum1 (d) + Ch (d, e, f) + k[t+9] + w[9]; - t2 = Sum0 (h) + Maj (h, a, b); - c += t1; - g = t1 + t2; - - t1 = f + Sum1 (c) + Ch (c, d, e) + k[t+10] + w[10]; - t2 = Sum0 (g) + Maj (g, h, a); - b += t1; - f = t1 + t2; - - t1 = e + Sum1 (b) + Ch (b, c, d) + k[t+11] + w[11]; - t2 = Sum0 (f) + Maj (f, g, h); - a += t1; - e = t1 + t2; - - t1 = d + Sum1 (a) + Ch (a, b, c) + k[t+12] + w[12]; - t2 = Sum0 (e) + Maj (e, f, g); - h += t1; - d = t1 + t2; - - t1 = c + Sum1 (h) + Ch (h, a, b) + k[t+13] + w[13]; - t2 = Sum0 (d) + Maj (d, e, f); - g += t1; - c = t1 + t2; - - t1 = b + Sum1 (g) + Ch (g, h, a) + k[t+14] + w[14]; - t2 = Sum0 (c) + Maj (c, d, e); - f += t1; - b = t1 + t2; - - t1 = a + Sum1 (f) + Ch (f, g, h) + k[t+15] + w[15]; - t2 = Sum0 (b) + Maj (b, c, d); - e += t1; - a = t1 + t2; - - t += 16; -#endif - } - - /* Update chaining vars. */ - hd->h0 += a; - hd->h1 += b; - hd->h2 += c; - hd->h3 += d; - hd->h4 += e; - hd->h5 += f; - hd->h6 += g; - hd->h7 += h; - - return /* burn_stack */ (8 + 16) * sizeof(u64) + sizeof(u32) + - 3 * sizeof(void*); -} -#endif /*!USE_ARM_ASM*/ - -/* AMD64 assembly implementations use SystemV ABI, ABI conversion and additional - * stack to store XMM6-XMM15 needed on Win64. 
*/ -#undef ASM_FUNC_ABI -#undef ASM_EXTRA_STACK -#if defined(USE_SSSE3) || defined(USE_AVX) || defined(USE_AVX2) -# ifdef HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS -# define ASM_FUNC_ABI __attribute__((sysv_abi)) -# define ASM_EXTRA_STACK (10 * 16) -# else -# define ASM_FUNC_ABI -# define ASM_EXTRA_STACK 0 -# endif -#endif - - -#ifdef USE_ARM_NEON_ASM -void _gcry_sha512_transform_armv7_neon (SHA512_STATE *hd, - const unsigned char *data, - const u64 k[], size_t num_blks); -#endif - -#ifdef USE_ARM_ASM -unsigned int _gcry_sha512_transform_arm (SHA512_STATE *hd, - const unsigned char *data, - const u64 k[], size_t num_blks); -#endif - -#ifdef USE_SSSE3 -unsigned int _gcry_sha512_transform_amd64_ssse3(const void *input_data, - void *state, - size_t num_blks) ASM_FUNC_ABI; -#endif - -#ifdef USE_AVX -unsigned int _gcry_sha512_transform_amd64_avx(const void *input_data, - void *state, - size_t num_blks) ASM_FUNC_ABI; -#endif - -#ifdef USE_AVX2 -unsigned int _gcry_sha512_transform_amd64_avx2(const void *input_data, - void *state, - size_t num_blks) ASM_FUNC_ABI; -#endif - - -static unsigned int -transform (void *context, const unsigned char *data, size_t nblks) +do_transform_generic (void *context, const unsigned char *data, size_t nblks) { SHA512_CONTEXT *ctx = context; - unsigned int burn; - -#ifdef USE_AVX2 - if (ctx->use_avx2) - return _gcry_sha512_transform_amd64_avx2 (data, &ctx->state, nblks) - + 4 * sizeof(void*) + ASM_EXTRA_STACK; -#endif - -#ifdef USE_AVX - if (ctx->use_avx) - return _gcry_sha512_transform_amd64_avx (data, &ctx->state, nblks) - + 4 * sizeof(void*) + ASM_EXTRA_STACK; -#endif - -#ifdef USE_SSSE3 - if (ctx->use_ssse3) - return _gcry_sha512_transform_amd64_ssse3 (data, &ctx->state, nblks) - + 4 * sizeof(void*) + ASM_EXTRA_STACK; -#endif + SHA512_STATE *hd = &ctx->state; -#ifdef USE_ARM_NEON_ASM - if (ctx->use_neon) + do { - _gcry_sha512_transform_armv7_neon (&ctx->state, data, k, nblks); + u64 a, b, c, d, e, f, g, h; + u64 w[16]; + int t; + + /* get values from the chaining vars */ + a = hd->h0; + b = hd->h1; + c = hd->h2; + d = hd->h3; + e = hd->h4; + f = hd->h5; + g = hd->h6; + h = hd->h7; + + for ( t = 0; t < 16; t++ ) + w[t] = buf_get_be64(data + t * 8); - /* _gcry_sha512_transform_armv7_neon does not store sensitive data - * to stack. 
*/ - return /* no burn_stack */ 0; - } -#endif +#define S0(x) (ROTR((x),1) ^ ROTR((x),8) ^ ((x)>>7)) +#define S1(x) (ROTR((x),19) ^ ROTR((x),61) ^ ((x)>>6)) + + for (t = 0; t < 80 - 16; ) + { + u64 t1, t2; + + t1 = h + Sum1 (e) + Ch (e, f, g) + k[t] + w[0]; + w[0] += S1 (w[14]) + w[9] + S0 (w[1]); + t2 = Sum0 (a) + Maj (a, b, c); + d += t1; + h = t1 + t2; + + t1 = g + Sum1 (d) + Ch (d, e, f) + k[t+1] + w[1]; + w[1] += S1 (w[15]) + w[10] + S0 (w[2]); + t2 = Sum0 (h) + Maj (h, a, b); + c += t1; + g = t1 + t2; + + t1 = f + Sum1 (c) + Ch (c, d, e) + k[t+2] + w[2]; + w[2] += S1 (w[0]) + w[11] + S0 (w[3]); + t2 = Sum0 (g) + Maj (g, h, a); + b += t1; + f = t1 + t2; + + t1 = e + Sum1 (b) + Ch (b, c, d) + k[t+3] + w[3]; + w[3] += S1 (w[1]) + w[12] + S0 (w[4]); + t2 = Sum0 (f) + Maj (f, g, h); + a += t1; + e = t1 + t2; + + t1 = d + Sum1 (a) + Ch (a, b, c) + k[t+4] + w[4]; + w[4] += S1 (w[2]) + w[13] + S0 (w[5]); + t2 = Sum0 (e) + Maj (e, f, g); + h += t1; + d = t1 + t2; + + t1 = c + Sum1 (h) + Ch (h, a, b) + k[t+5] + w[5]; + w[5] += S1 (w[3]) + w[14] + S0 (w[6]); + t2 = Sum0 (d) + Maj (d, e, f); + g += t1; + c = t1 + t2; + + t1 = b + Sum1 (g) + Ch (g, h, a) + k[t+6] + w[6]; + w[6] += S1 (w[4]) + w[15] + S0 (w[7]); + t2 = Sum0 (c) + Maj (c, d, e); + f += t1; + b = t1 + t2; + + t1 = a + Sum1 (f) + Ch (f, g, h) + k[t+7] + w[7]; + w[7] += S1 (w[5]) + w[0] + S0 (w[8]); + t2 = Sum0 (b) + Maj (b, c, d); + e += t1; + a = t1 + t2; + + t1 = h + Sum1 (e) + Ch (e, f, g) + k[t+8] + w[8]; + w[8] += S1 (w[6]) + w[1] + S0 (w[9]); + t2 = Sum0 (a) + Maj (a, b, c); + d += t1; + h = t1 + t2; + + t1 = g + Sum1 (d) + Ch (d, e, f) + k[t+9] + w[9]; + w[9] += S1 (w[7]) + w[2] + S0 (w[10]); + t2 = Sum0 (h) + Maj (h, a, b); + c += t1; + g = t1 + t2; + + t1 = f + Sum1 (c) + Ch (c, d, e) + k[t+10] + w[10]; + w[10] += S1 (w[8]) + w[3] + S0 (w[11]); + t2 = Sum0 (g) + Maj (g, h, a); + b += t1; + f = t1 + t2; + + t1 = e + Sum1 (b) + Ch (b, c, d) + k[t+11] + w[11]; + w[11] += S1 (w[9]) + w[4] + S0 (w[12]); + t2 = Sum0 (f) + Maj (f, g, h); + a += t1; + e = t1 + t2; + + t1 = d + Sum1 (a) + Ch (a, b, c) + k[t+12] + w[12]; + w[12] += S1 (w[10]) + w[5] + S0 (w[13]); + t2 = Sum0 (e) + Maj (e, f, g); + h += t1; + d = t1 + t2; + + t1 = c + Sum1 (h) + Ch (h, a, b) + k[t+13] + w[13]; + w[13] += S1 (w[11]) + w[6] + S0 (w[14]); + t2 = Sum0 (d) + Maj (d, e, f); + g += t1; + c = t1 + t2; + + t1 = b + Sum1 (g) + Ch (g, h, a) + k[t+14] + w[14]; + w[14] += S1 (w[12]) + w[7] + S0 (w[15]); + t2 = Sum0 (c) + Maj (c, d, e); + f += t1; + b = t1 + t2; + + t1 = a + Sum1 (f) + Ch (f, g, h) + k[t+15] + w[15]; + w[15] += S1 (w[13]) + w[8] + S0 (w[0]); + t2 = Sum0 (b) + Maj (b, c, d); + e += t1; + a = t1 + t2; + + t += 16; + } + + for (; t < 80; ) + { + u64 t1, t2; + + t1 = h + Sum1 (e) + Ch (e, f, g) + k[t] + w[0]; + t2 = Sum0 (a) + Maj (a, b, c); + d += t1; + h = t1 + t2; + + t1 = g + Sum1 (d) + Ch (d, e, f) + k[t+1] + w[1]; + t2 = Sum0 (h) + Maj (h, a, b); + c += t1; + g = t1 + t2; + + t1 = f + Sum1 (c) + Ch (c, d, e) + k[t+2] + w[2]; + t2 = Sum0 (g) + Maj (g, h, a); + b += t1; + f = t1 + t2; + + t1 = e + Sum1 (b) + Ch (b, c, d) + k[t+3] + w[3]; + t2 = Sum0 (f) + Maj (f, g, h); + a += t1; + e = t1 + t2; + + t1 = d + Sum1 (a) + Ch (a, b, c) + k[t+4] + w[4]; + t2 = Sum0 (e) + Maj (e, f, g); + h += t1; + d = t1 + t2; + + t1 = c + Sum1 (h) + Ch (h, a, b) + k[t+5] + w[5]; + t2 = Sum0 (d) + Maj (d, e, f); + g += t1; + c = t1 + t2; + + t1 = b + Sum1 (g) + Ch (g, h, a) + k[t+6] + w[6]; + t2 = Sum0 (c) + Maj (c, d, e); + f += t1; + b = t1 + t2; + + t1 = a + Sum1 (f) 
+ Ch (f, g, h) + k[t+7] + w[7]; + t2 = Sum0 (b) + Maj (b, c, d); + e += t1; + a = t1 + t2; + + t1 = h + Sum1 (e) + Ch (e, f, g) + k[t+8] + w[8]; + t2 = Sum0 (a) + Maj (a, b, c); + d += t1; + h = t1 + t2; + + t1 = g + Sum1 (d) + Ch (d, e, f) + k[t+9] + w[9]; + t2 = Sum0 (h) + Maj (h, a, b); + c += t1; + g = t1 + t2; + + t1 = f + Sum1 (c) + Ch (c, d, e) + k[t+10] + w[10]; + t2 = Sum0 (g) + Maj (g, h, a); + b += t1; + f = t1 + t2; + + t1 = e + Sum1 (b) + Ch (b, c, d) + k[t+11] + w[11]; + t2 = Sum0 (f) + Maj (f, g, h); + a += t1; + e = t1 + t2; + + t1 = d + Sum1 (a) + Ch (a, b, c) + k[t+12] + w[12]; + t2 = Sum0 (e) + Maj (e, f, g); + h += t1; + d = t1 + t2; + + t1 = c + Sum1 (h) + Ch (h, a, b) + k[t+13] + w[13]; + t2 = Sum0 (d) + Maj (d, e, f); + g += t1; + c = t1 + t2; + + t1 = b + Sum1 (g) + Ch (g, h, a) + k[t+14] + w[14]; + t2 = Sum0 (c) + Maj (c, d, e); + f += t1; + b = t1 + t2; + + t1 = a + Sum1 (f) + Ch (f, g, h) + k[t+15] + w[15]; + t2 = Sum0 (b) + Maj (b, c, d); + e += t1; + a = t1 + t2; + + t += 16; + } + + /* Update chaining vars. */ + hd->h0 += a; + hd->h1 += b; + hd->h2 += c; + hd->h3 += d; + hd->h4 += e; + hd->h5 += f; + hd->h6 += g; + hd->h7 += h; -#ifdef USE_ARM_ASM - burn = _gcry_sha512_transform_arm (&ctx->state, data, k, nblks); -#else - do - { - burn = transform_blk (&ctx->state, data) + 3 * sizeof(void*); data += 128; } while (--nblks); -#ifdef ASM_EXTRA_STACK - /* 'transform_blk' is typically inlined and XMM6-XMM15 are stored at - * the prologue of this function. Therefore need to add ASM_EXTRA_STACK to - * here too. - */ - burn += ASM_EXTRA_STACK; -#endif -#endif - - return burn; + return (8 + 16) * sizeof(u64) + sizeof(u32) + 3 * sizeof(void*); } +#endif /*!USE_ARM_ASM*/ /* The routine final terminates the computation and @@ -713,7 +673,7 @@ sha512_final (void *context) /* append the 128 bit count */ buf_put_be64(hd->bctx.buf + 112, msb); buf_put_be64(hd->bctx.buf + 120, lsb); - stack_burn_depth = transform (hd, hd->bctx.buf, 1); + stack_burn_depth = (*hd->bctx.bwrite) (hd, hd->bctx.buf, 1); _gcry_burn_stack (stack_burn_depth); p = hd->bctx.buf; -- 2.17.1 From stefbon at gmail.com Wed Jun 20 12:11:02 2018 From: stefbon at gmail.com (Stef Bon) Date: Wed, 20 Jun 2018 12:11:02 +0200 Subject: Fwd: Low level ops? In-Reply-To: References: <534aeb69-8bf6-b8a9-6918-422b39cf956a@iki.fi> <82d38695-ff7e-ef03-9215-cf2dcdbfcae4@iki.fi> Message-ID: Sorry for double posting.. My emailer did not show enough info. ---------- Forwarded message --------- From: Stef Bon Date: di 19 jun. 2018 om 23:10 Subject: Re: Low level ops? To: Jussi Kivilinna Op di 19 jun. 2018 om 18:28 schreef Jussi Kivilinna : > > I made changes on weekend to reduce the overhead for cipher operations. > Ok! I don't want to look too impatient, but I did not know you have been working on the issue, so I posted again. > When I tried to get those patches to the mailing-list they just would > not get through. I've spend past two nights trying to figure out what > the ____ is wrong with my mail setup. > Ugh. > Anyway, overhead for example for AESNI/CBC decryption has reduced > from ~80 cycles per call to ~30 cycles. The remaining 30 cycles, seems > to be mainly caused by the optimized AESNI/CBC decryption function > itself. AESNI/CBC encryption function is less complex and overhead > for it is now 9 cycles per call (was 40 cycles). > >From 90 to 30 is already impressive. I'm very interested to test this. > > And what about chacha20-poly1305 at openssh.com? 
> > If you check the chacha20-poly1305 in OpenSSH, you see that for each > packet you need to perform one extra chacha20 block encryption, which > alone is going to cost over 400 cycles. >

I've already implemented this. I've used your example code. It's working perfectly. I know I've mailed earlier about this and reported that it's working.

> > To utilize this, you need to provide input buffers larger than > blocksize to libgcrypt. For AESNI implementations, you get best > performance starting with buffer size of 8 blocks or 8*16=128 > bytes. For Chacha20, you need 4 blocks or 4*64=256 bytes. >

Uhm, larger input buffers? I cannot play with that. My code calls gcry_cipher_encrypt(c_handle, packet->buffer, packet->len, NULL, 0) to encrypt and gcry_cipher_decrypt(c_handle, packet->buffer, packet->len, NULL, 0) to decrypt (leaving out some details like decrypting the first block first to get the length). So the whole message is processed in one call. As mentioned before, the messages vary from 128 to 1024 bytes, with some bigger (readdir and read/write).

I think it's good to distinguish two different uses of parallel processing:

1. parallel processing of one message by splitting it in different blocks (with size blocksize of course). As I understand it, libgcrypt handles these automatically when the messages are large enough and the cipher allows that.

2. parallel processing of two or more messages at the same time. Some ciphers like chacha20-poly1305 at openssh.com and aes256-ctr allow this. The starting state of the cipher does not depend on the previous message. I have not tested this; in some "heavy load situations" it may give some performance improvements. With fuse/sftp these situations occur when, after a readdir, a lot of lookup calls are done, one for every entry in the directory.

Stef Bon

From stefbon at gmail.com Wed Jun 20 12:20:35 2018 From: stefbon at gmail.com (Stef Bon) Date: Wed, 20 Jun 2018 12:20:35 +0200 Subject: p and q value genkey for ecc. Message-ID:

Hi, I'm using gcry_pk_genkey to create public and private keys (for ecdh/curve ed25519, among others). Now I read in the documentation that in the result of this, an s-expr with ecc and curve ed25519, the q and p are stored as values (see 6.4 General public-key related Functions). This is somewhat confusing. d is an mpi, q is an mpoint for ecc keys. For other keydata s-expressions (like for rsa) the parameters like d and p are stored as mpi. Now I expect q is stored in the keydata s-expr as value (to be read as an opaque mpi), which is ok, and the d as mpi, which is not the case. Is this intentional? Thanks in advance, Stef

From arlyon at me.com Sat Jun 23 02:39:47 2018 From: arlyon at me.com (Alexander Lyon) Date: Sat, 23 Jun 2018 01:39:47 +0100 Subject: Correct method to generate a Curve25519 keypair Message-ID:

Hello To preface, apologies if I am unconventional or naive; I am a little new to this. I am having issues with generating a Curve25519 key pair using gcry_pk_genkey. Specifically, the private key doesn't match the expected bitmask (as defined here https://cr.yp.to/ecdh.html) nor does the generated public key match the expected value (in this case derived by manually applying the bit mask to the private key and calculating it with a different library).
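(For reference, the bit mask from the cr.yp.to page amounts to the following clamping of the 32-byte little-endian secret scalar; this is a minimal sketch in plain C, independent of libgcrypt, matching the two checks shown further below:)

    /* Clamp a Curve25519 secret scalar (32 bytes, little-endian byte
     * order) as described at https://cr.yp.to/ecdh.html.             */
    static void
    clamp_curve25519_scalar (unsigned char key[32])
    {
      key[0]  &= 248;   /* clear the three lowest bits    */
      key[31] &= 127;   /* clear the highest bit          */
      key[31] |= 64;    /* set the second-highest bit     */
    }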
This is the S-expression used to generate the key: (genkey (ecc (curve "Curve25519") (flags djb-tweak comp) ) ) And an example snippet of code to extract the public and private keys, generating the sexp, extracting the mpis and then converting the compressed public key mpi into a point before extracting the X coordinate (the Y and Z were 0x01 and 0x00 respectively).

gcry_sexp_build( &sexp_params, NULL, "(genkey" " (ecc" " (curve \"Curve25519\")" " (flags djb-tweak comp)" " )" ")" ); gcry_sexp_t sexp_curve25519_keypair; gcry_pk_genkey( &sexp_curve25519_keypair, sexp_params ); gcry_ctx_t ctx_curve; gcry_mpi_ec_new( &ctx_curve, NULL, "Curve25519" ); gcry_mpi_t mpi_curve_priv_key; gcry_mpi_t mpi_curve_pub_compressed; gcry_mpi_point_t point_curve_pub_key = gcry_mpi_point_new( 0 ); gcry_sexp_extract_param( sexp_curve25519_keypair, NULL, "qd", &mpi_curve_pub_compressed, &mpi_curve_priv_key, NULL ); gcry_mpi_ec_decode_point( point_curve_pub_key, mpi_curve_pub_compressed, ctx_curve );

At this point, when checking the results in the debugger it is clear that the generated keys are incorrect: > gcry_mpi_dump(mpi_curve_priv_key) 6ef90e0c0201256c301484580a59756529285a80537389235d98cb9d0b036e10 > gcry_mpi_dump(point_curve_pub_key->x) 30795e2d73beede300464f26f589e6d171f61a65fc2ab62719941f0b230dc8d9 > bytes_curve_priv_key[0] == (bytes_curve_priv_key[0] & 248) false > bytes_curve_priv_key[31] == ((bytes_curve_priv_key[31] & 127) | 64) false

What could be causing this? What is the correct way to generate a key pair? Regards, Alex

-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: Message signed with OpenPGP URL:

From stefbon at gmail.com Sat Jun 23 08:06:58 2018 From: stefbon at gmail.com (Stef Bon) Date: Sat, 23 Jun 2018 08:06:58 +0200 Subject: Correct method to generate a Curve25519 keypair In-Reply-To: References: Message-ID:

Op za 23 jun. 2018 om 02:40 schreef Alexander Lyon : > > Hello > > To preface, apologies if I am unconventional or naive; I am a little new to this. >

No problem. I'm working on the same issue (to make curve25519-sha256 work). I've done this differently. As far as I can see there is no such curve as Curve25519. This is the name of the key exchange method, but it's based on ed25519. This is the correct name. Use: "(genkey (ecc (curve ed25519)))" instead. The generated s-expr sexp_curve25519_keypair has the public key and the private key. As you can see on ECC-key-parameters.html and General-public_002dkey-related-Functions.html the generated s-expr has two sub-s-expressions, one for the public and one for the private key:

(key-data (public-key (ecc (curve Ed25519) (flags eddsa) (q q-value))) (private-key (ecc (curve Ed25519) (flags eddsa) (q q-value) (d d-value))))

If you want to read the public key, find it first by: s_key=gcry_sexp_find_token(sexp_curve25519_keypair, "public-key", 0); now get the cdr of that result by: s_param=gcry_sexp_cdr(s_key); (you might want to check the type/car which should be "ecc") Now get the right param (q or d) by: gcry_sexp_t s_q = gcry_sexp_find_token(s_param, "q", 0); and the value of this by: const char *value = gcry_sexp_nth_data(s_q, 1, &len); This is a pointer to the data of the q parameter. You can read the d parameter similarly. I've posted about this issue before since the fields are stored as mpi for other types (rsa, dss) and here as q-value and d-value. Is this the raw format? I do not know.
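Put together, a condensed sketch of the steps above (an illustration only: error checking is omitted, the variable names are the ones used above, and the returned data pointer stays valid only while s_q exists):

    gcry_sexp_t s_key   = gcry_sexp_find_token (sexp_curve25519_keypair, "public-key", 0);
    gcry_sexp_t s_param = gcry_sexp_cdr (s_key);
    gcry_sexp_t s_q     = gcry_sexp_find_token (s_param, "q", 0);
    size_t len = 0;
    const char *value   = gcry_sexp_nth_data (s_q, 1, &len);   /* len bytes of q */
    /* ... use value/len, then release the sub-expressions ... */
    gcry_sexp_release (s_q);
    gcry_sexp_release (s_param);
    gcry_sexp_release (s_key);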
I hope this helps, Stef

From stefbon at gmail.com Sun Jun 24 10:42:25 2018 From: stefbon at gmail.com (Stef Bon) Date: Sun, 24 Jun 2018 10:42:25 +0200 Subject: Correct method to generate a Curve25519 keypair In-Reply-To: References: Message-ID:

Op za 23 jun. 2018 om 08:06 schreef Stef Bon : > > > (key-data > (public-key > (ecc > (curve Ed25519) > (flags eddsa) > (q q-value))) > (private-key > (ecc > (curve Ed25519) > (flags eddsa) > (q q-value) > (d d-value)))) >

Hi, I found that the curve name ed25519 has to start with a capital, thus: Ed25519. Stef

From stefbon at gmail.com Sun Jun 24 10:53:13 2018 From: stefbon at gmail.com (Stef Bon) Date: Sun, 24 Jun 2018 10:53:13 +0200 Subject: Error with gcry_mpi_release with opaque value. Message-ID:

Hi, I'm using gcry_mpi_set_opaque to store an mpoint in an mpi. When freeing this mpi with gcry_mpi_release my program crashes without any log message in syslog (normally I see a segfault with an error), so this is a serious error. I use gcry_mpi_set_opaque like: buffer=malloc(len); if (buffer) { memcpy(buffer, from somewhere, size); mp->lib.mpi=gcry_mpi_set_opaque(NULL, (void *) buffer, (8 * len)); } And releasing this with gcry_mpi_release the program crashes. What's happening? Stef

From sbi at pi4.de Tue Jun 26 16:06:24 2018 From: sbi at pi4.de (Steffen Bingel, pi4) Date: Tue, 26 Jun 2018 16:06:24 +0200 Subject: RSA - relation between message size and key size Message-ID:

Hi, at first, this is the first time for me using a mailing list and I apologize in advance for any violation of rules I may not know yet. I'm playing around with the private/public key functions of libgcrypt and ran into a behavior I couldn't find an explanation for. If the message that I try to encrypt is larger than the key I use for encryption, pk_encrypt seems to generate random data without throwing an error. The following code is a condensed copy from https://github.com/vedantk/gcrypt-example/blob/master/main.cc. If my message contains 32 characters (256 bit) this works fine, but if I pass 33 or more characters the decrypted message makes no sense at all. I was also playing around with bigger keys where I could observe the same behavior (msg bigger than key not working). So if the function is not intended to take data larger than the key, why is it not returning an error? What is the correct way to encrypt large, at least larger than the key, binary data I have in memory? Thanks a lot

gcry_error_t err; #define _assert(cmd) {\ err = cmd;\ if (err != GPG_ERR_NO_ERROR) {\ L("ERR: command returned: %s",gcry_strerror(err));\ }} /* generate key pair */ gcry_sexp_t rsa_keypair; gcry_sexp_t parms; _assert(gcry_sexp_build( &parms, NULL, "(genkey(rsa(nbits %d)))",256)); _assert(gcry_pk_genkey( &rsa_keypair,parms )); gcry_sexp_t pubk = gcry_sexp_find_token(rsa_keypair, "public-key", 0); gcry_sexp_t privk = gcry_sexp_find_token(rsa_keypair, "private-key", 0); /* Create a message. */ gcry_mpi_t msg; gcry_sexp_t data; const unsigned char* s = (const unsigned char*) "uweoirdnd1iejfkslrm2kdleirjfm3xss"; _assert(gcry_mpi_scan(&msg, GCRYMPI_FMT_USG, s, strlen((const char*) s), NULL)); gcry_mpi_dump(msg); _assert(gcry_sexp_build(&data, NULL,"(data (flags raw) (value %m))", msg)); gcry_sexp_dump(data); /* Encrypt the message. */ gcry_sexp_t ciph; _assert(gcry_pk_encrypt(&ciph, data, pubk)); gcry_sexp_dump(ciph); /* Decrypt the message. */ gcry_sexp_t plain;
_assert(gcry_pk_decrypt(&plain, ciph, privk)); /* Pretty-print the results. */ gcry_mpi_t out_msg = gcry_sexp_nth_mpi(plain, 0, GCRYMPI_FMT_USG); L("Original:"); gcry_mpi_dump(msg); L("\n" "Decrypted:"); gcry_mpi_dump(out_msg); if (gcry_mpi_cmp(msg, out_msg)) { L("data corruption!"); } else { L("Messages match.\n"); }

From kmagnum at gmail.com Wed Jun 27 03:43:18 2018 From: kmagnum at gmail.com (Karl Magdsick) Date: Wed, 27 Jun 2018 09:43:18 +0800 Subject: RSA - relation between message size and key size In-Reply-To: References: Message-ID:

There are a variety of attacks against RSA when used in this manner. You really should use OAEP ( https://en.m.wikipedia.org/wiki/Optimal_asymmetric_encryption_padding ) and you almost certainly should use RSA to exchange keys for a symmetric authenticated encryption algorithm (such as ChaCha20-Poly1305 or AES-GCM). It goes without saying that playing around with encryption is fun, but for anything serious, use a high-level well-reviewed library implementing well-studied protocols. libgnutls, libgpgme, and libsodium are good choices, depending on your use case. libgcrypt is a low-level library meant as a building block for high-level end-user libraries. Cheers, Karl

On Tue, Jun 26, 2018, 23:33 Steffen Bingel, pi4 wrote: > Hi, > > at first, this is the first time for me using a mailing list and I > apologize in advance for any violation of rules I may not know yet. > > I'm playing around with the private/public key functions of libgcrypt > and ran into an behavior I couldn't find an explanation for. If my > message that I try to encrypt is larger than the key I use for > encryption the pk_encrypt seems to generate random data without throwing > an error. The following code is a condensed copy from > https://github.com/vedantk/gcrypt-example/blob/master/main.cc. If my > message contains 32 characters (256 bit) this works fine but if I pass > 33 or more characters the decrypted messages makes no sense at all. I > was also playing around with bigger keys where I could observe the same > behavior (msg bigger than key not working). > > So if the function is not intended to take data larger than the key, why > is it not returning an error? > > What is the correct way to encrypt large, at least larger than the key, > binary data I have in memory? > > Thanks a lot > > gcry_error_t err; > > #define _assert(cmd) {\ > err = cmd;\ > if (err != GPG_ERR_NO_ERROR) {\ > L("ERR: command returned: %s",gcry_strerror(err));\ > }} > > /* generate key pair */ > gcry_sexp_t rsa_keypair; > gcry_sexp_t parms; > _assert(gcry_sexp_build( &parms, NULL, "(genkey(rsa(nbits > %d)))",256)); > > _assert(gcry_pk_genkey( &rsa_keypair,parms )); > > gcry_sexp_t pubk = gcry_sexp_find_token(rsa_keypair, "public-key", 0); > gcry_sexp_t privk = gcry_sexp_find_token(rsa_keypair, > "private-key", 0); > > /* Create a message. */ > gcry_mpi_t msg; > gcry_sexp_t data; > const unsigned char* s = (const unsigned char*) > "uweoirdnd1iejfkslrm2kdleirjfm3xss"; > _assert(gcry_mpi_scan(&msg, GCRYMPI_FMT_USG, s, strlen((const > char*) s), NULL)); > > gcry_mpi_dump(msg); > > _assert(gcry_sexp_build(&data, NULL,"(data (flags raw) (value > %m))", msg)); > > gcry_sexp_dump(data); > > /* Encrypt the message. */ > gcry_sexp_t ciph; > _assert(gcry_pk_encrypt(&ciph, data, pubk)); > > gcry_sexp_dump(ciph); > > /* Decrypt the message. */ > gcry_sexp_t plain; > _assert(gcry_pk_decrypt(&plain, ciph, privk)); > > /* Pretty-print the results.
*/ > gcry_mpi_t out_msg = gcry_sexp_nth_mpi(plain, 0, GCRYMPI_FMT_USG); > L("Original:"); > gcry_mpi_dump(msg); > L("\n" "Decrypted:"); > gcry_mpi_dump(out_msg); > > if (gcry_mpi_cmp(msg, out_msg)) { > L("data corruption!"); > } else { > L("Messages match.\n"); > } > > > > > > _______________________________________________ > Gcrypt-devel mailing list > Gcrypt-devel at gnupg.org > http://lists.gnupg.org/mailman/listinfo/gcrypt-devel > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From wk at gnupg.org Thu Jun 28 09:08:55 2018 From: wk at gnupg.org (Werner Koch) Date: Thu, 28 Jun 2018 09:08:55 +0200 Subject: Error with gcry_mpi_release with opaque value. In-Reply-To: (Stef Bon's message of "Sun, 24 Jun 2018 10:53:13 +0200") References: Message-ID: <87a7rf1j7c.fsf@wheatstone.g10code.de>

Hello! On Sun, 24 Jun 2018 10:53, stefbon at gmail.com said: > buffer=malloc(len); > if (buffer) { > memcpy(buffer, from somewhere, size); SIZE <= LEN ? > mp->lib.mpi=gcry_mpi_set_opaque(NULL, (void *) buffer, (8 * len)); > } > > And releasing this with gcry_mpi_release program crashes. Either you free BUFFER, which you may not do because gcry_mpi_set_opaque takes ownership of BUFFER, or, more likely, you have not set up memory allocation functions which map to standard malloc. Libgcrypt uses its own allocation functions and uses gcry_free to release them. The rationale for its own allocation functions is that they allow to distinguish between standard and secure memory. Either use buffer = gcry_malloc (len); or mp->lib.mpi = gcry_mpi_set_opaque_copy (NULL, from_somewhere, 8*len); Salam-Shalom, Werner -- # Please read: Daniel Ellsberg - The Doomsday Machine # Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 227 bytes Desc: not available URL:

From stefbon at gmail.com Thu Jun 28 13:17:34 2018 From: stefbon at gmail.com (Stef Bon) Date: Thu, 28 Jun 2018 13:17:34 +0200 Subject: Error with gcry_mpi_release with opaque value. In-Reply-To: <87a7rf1j7c.fsf@wheatstone.g10code.de> References: <87a7rf1j7c.fsf@wheatstone.g10code.de> Message-ID:

Op do 28 jun. 2018 om 09:18 schreef Werner Koch : > > > SIZE <= LEN ? > Sorry for the misunderstanding. size=len. I should have typed len instead. > Either you free BUFFER which you may not do because gcry_mpi_set_opaque > takes ownership of BUFFER or, more likely, you have not set up memory > allocation functions which map to standard malloc. Libgcrypt uses its > own allocation functions and uses gcry_free to release them. The > rationale for the own allocation functions is that they allow to > distinguish between standard and secure memory. > I know. I've been using the standard malloc, thinking that gcry_free falls back to the standard free if not using secure memory. This is not the case, so I have to write alloc functions for libgcrypt using the API. I will report whether that solves the issue. Stef

From wk at gnupg.org Thu Jun 28 13:33:17 2018 From: wk at gnupg.org (Werner Koch) Date: Thu, 28 Jun 2018 13:33:17 +0200 Subject: Error with gcry_mpi_release with opaque value. In-Reply-To: (Stef Bon's message of "Thu, 28 Jun 2018 13:17:34 +0200") References: <87a7rf1j7c.fsf@wheatstone.g10code.de> Message-ID: <87woujywle.fsf@wheatstone.g10code.de>

On Thu, 28 Jun 2018 13:17, stefbon at gmail.com said: > I know.
I've been using the standard malloc, thinking that gcry_free > falls back to the standard free Actually this is currently the case. If you are on Windows, take care: each DLL may have its own version of malloc and free, and mixing them can lead to crashes (due to different runtime libraries). Shalom-Salam, Werner -- # Please read: Daniel Ellsberg - The Doomsday Machine # Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 227 bytes Desc: not available URL:

From stefbon at gmail.com Thu Jun 28 13:54:37 2018 From: stefbon at gmail.com (Stef Bon) Date: Thu, 28 Jun 2018 13:54:37 +0200 Subject: Error with gcry_mpi_release with opaque value. In-Reply-To: <87woujywle.fsf@wheatstone.g10code.de> References: <87a7rf1j7c.fsf@wheatstone.g10code.de> <87woujywle.fsf@wheatstone.g10code.de> Message-ID:

Op do 28 jun. 2018 om 13:43 schreef Werner Koch : > Replacing malloc by gcry_malloc "solved" the issue. The crash with gcry_mpi_release with an opaque mpi is not there anymore. So somehow, for some reason I do not understand, gcry_malloc and gcry_free do something more than fall back to the standard malloc and free. Maybe initialize? gcry_free does not work without first calling gcry_malloc? Stef

From stefbon at gmail.com Thu Jun 28 17:11:49 2018 From: stefbon at gmail.com (Stef Bon) Date: Thu, 28 Jun 2018 17:11:49 +0200 Subject: Correct method to generate a Curve25519 keypair In-Reply-To: <7E3FC5D1-7172-45D1-ADE9-102EE73AA850@me.com> References: <7E3FC5D1-7172-45D1-ADE9-102EE73AA850@me.com> Message-ID:

Op do 28 jun. 2018 om 15:57 schreef Alexander Lyon : > > A little late, but as a follow up to this, I managed to find the solution. The generated keys are valid but Curve25519 is little-endian and printing it to a buffer is big-endian so the hex dumped by gcry_mpi_dump is in reverse. It was as easy as reversing the buffer when appropriate. This allowed me to extract the public key binary.

Ok, that explains the remark in "curve25519-sha256 at libssh.org.txt 4.3 Shared secret generation": "This conversion follows the network byte order." Thanks for sharing this. You're helping me with implementing it in my application. Some questions: the public key is shared with the other side. This is for ecc/ed25519 available in the s-exp created by genkey as "q-value". How did you extract this "q-value"? You're using gcry_mpi_dump. Is this the right tool for this? Comments in the documentation say that "Dump the value of a in a format suitable for debugging to Libgcrypt's logging stream." Stef

From arlyon at me.com Thu Jun 28 17:25:23 2018 From: arlyon at me.com (Alexander Lyon) Date: Thu, 28 Jun 2018 16:25:23 +0100 Subject: Correct method to generate a Curve25519 keypair In-Reply-To: References: <7E3FC5D1-7172-45D1-ADE9-102EE73AA850@me.com> Message-ID: <68AB9180-0D76-4B76-99A5-C4E8AFA23CF0@me.com>

I use gcry_mpi_dump during debugging so I can check the hex values in a break point. It just prints the hex to stdout (or stderr, not sure). As for extracting the q value, I use gcry_sexp_extract_param.
the whole process looks something like this: ------------------------------------ gcry_mpi_t mpi_Curve_pub = gcry_mpi_new( 0 ); gcry_mpi_t mpi_Curve_priv; gcry_sexp_build( &sexp_genkey_params, NULL, "(genkey" " (ecc" " (curve \"Curve25519\")" " (flags djb-tweak comp)" " )" ")" ); gcry_sexp_t sexp_Curve25519_pair; gcry_pk_genkey( &sexp_Curve25519_pair, sexp_genkey_params ); // the public key is a point stored compressed (determined by the 0x40 prefix) // in an mpi and it will need to be decompressed gcry_mpi_t mpi_Curve_pub_compressed; gcry_sexp_extract_param( sexp_Curve25519_pair, NULL, "qd", &mpi_Curve_pub_compressed, &mpi_Curve_priv, NULL ); // to decompress, we decode it into a point // then extract the X and discard the rest gcry_mpi_point_t point_Curve_pub = gcry_mpi_point_new( 0 ); gcry_ctx_t ctx_curve; gcry_mpi_ec_new( &ctx_curve, NULL, "Curve25519" ); gcry_mpi_ec_decode_point( point_Curve_pub, mpi_Curve_pub_compressed, ctx_curve ); // we extract x, y and z but only need x because // curve only uses the x coordinate. y and z are discarded. gcry_mpi_t mpi_Curve_pub_y = gcry_mpi_new( 0 ); gcry_mpi_t mpi_Curve_pub_z = gcry_mpi_new( 0 ); gcry_mpi_point_snatch_get( mpi_Curve_pub, mpi_Curve_pub_y, mpi_Curve_pub_z, point_Curve_pub ); gcry_sexp_release( sexp_genkey_params ); gcry_sexp_release( sexp_Curve25519_pair ); gcry_mpi_release( mpi_Curve_pub_y ); gcry_mpi_release( mpi_Curve_pub_z ); gcry_mpi_release( mpi_Curve_pub_compressed ); uint8_t p_bytes_Curve[32]; error = gcry_mpi_print( GCRYMPI_FMT_USG, p_bytes_Curve, 32, NULL, mpi_Curve_pub ); // Curve25519 is little-endian reverse_buffer( p_bytes_Curve, 32 ); ------------------------------------ At that point you'll have the X coordinate of the generated public key stored in binary in the p_bytes_Curve buffer. Hope that helps Alex

> On 28 Jun 2018, at 16:11, Stef Bon wrote: > > Op do 28 jun. 2018 om 15:57 schreef Alexander Lyon : >> >> A little late, but as a follow up to this, I managed to find the solution. The generated keys are valid but Curve25519 is little-endian and printing it to a buffer is big-endian so the hex dumped by gcry_mpi_dump is in reverse. It was as easy as reversing the buffer when appropriate. This allowed me to extract the public key binary. > > Ok, that explains the remark in "curve25519-sha256 at libssh.org.txt 4.3 > Shared secret generation ": > "This conversion follows the network byte order." > > Thanks for sharing this. You're helping me with implementing it in my > application. > Some questions: the public key is shared with the other side. This is > for ecc/ed25519 available in the s-exp created by genkey as "q-value". > How did you extract > this "q-value"? > > You're using gcry_mpi_dump. Does this the right tool for this? > Comments in the documentation say that > "Dump the value of a in a format suitable for debugging to Libgcrypt's > logging stream." > > Stef > > _______________________________________________ > Gcrypt-devel mailing list > Gcrypt-devel at gnupg.org > http://lists.gnupg.org/mailman/listinfo/gcrypt-devel

From arlyon at me.com Thu Jun 28 15:56:52 2018 From: arlyon at me.com (Alexander Lyon) Date: Thu, 28 Jun 2018 14:56:52 +0100 Subject: Correct method to generate a Curve25519 keypair In-Reply-To: References: Message-ID: <7E3FC5D1-7172-45D1-ADE9-102EE73AA850@me.com>

A little late, but as a follow up to this, I managed to find the solution. The generated keys are valid but Curve25519 is little-endian and printing it to a buffer is big-endian so the hex dumped by gcry_mpi_dump is in reverse.
It was as easy as reversing the buffer when appropriate. This allowed me to extract the public key binary. Alex

> On 24 Jun 2018, at 09:42, Stef Bon wrote: > > Op za 23 jun. 2018 om 08:06 schreef Stef Bon : >> >> >> (key-data >> (public-key >> (ecc >> (curve Ed25519) >> (flags eddsa) >> (q q-value))) >> (private-key >> (ecc >> (curve Ed25519) >> (flags eddsa) >> (q q-value) >> (d d-value)))) >> > > Hi, > I found that the curve ed25519 has to start with a capital: Ed25519 thus. > Stef > > _______________________________________________ > Gcrypt-devel mailing list > Gcrypt-devel at gnupg.org > http://lists.gnupg.org/mailman/listinfo/gcrypt-devel

From stefbon at gmail.com Thu Jun 28 19:30:57 2018 From: stefbon at gmail.com (Stef Bon) Date: Thu, 28 Jun 2018 19:30:57 +0200 Subject: Correct method to generate a Curve25519 keypair In-Reply-To: <68AB9180-0D76-4B76-99A5-C4E8AFA23CF0@me.com> References: <7E3FC5D1-7172-45D1-ADE9-102EE73AA850@me.com> <68AB9180-0D76-4B76-99A5-C4E8AFA23CF0@me.com> Message-ID:

Op do 28 jun. 2018 om 17:25 schreef Alexander Lyon : > > gcry_sexp_build( &sexp_genkey_params, NULL, > "(genkey" > " (ecc" > " (curve \"Curve25519\")" > " (flags djb-tweak comp)" > " )" > ")" ); >

I did not know that the curve "Curve25519" is a valid curve, I've posted about this earlier.

> gcry_sexp_extract_param( sexp_Curve25519_pair, NULL, "qd", > &mpi_Curve_pub_compressed, &mpi_Curve_priv, NULL ); >

Ah, gcry_sexp_extract_param does the trick. Good to know. The documentation would be better with more examples like this.

> // to decompress, we decode it into a point > // then extract the X and discard the rest > gcry_mpi_point_t point_Curve_pub = gcry_mpi_point_new( 0 ); > gcry_ctx_t ctx_curve; > gcry_mpi_ec_new( &ctx_curve, NULL, "Curve25519" ); > gcry_mpi_ec_decode_point( point_Curve_pub, mpi_Curve_pub_compressed, ctx_curve ); > > // we extract x, y and z but only need x because > // curve only uses the x coordinate. y and z are discarded. > gcry_mpi_t mpi_Curve_pub_y = gcry_mpi_new( 0 ); > gcry_mpi_t mpi_Curve_pub_z = gcry_mpi_new( 0 ); > > gcry_mpi_point_snatch_get( mpi_Curve_pub, mpi_Curve_pub_y, mpi_Curve_pub_z, point_Curve_pub ); > > gcry_sexp_release( sexp_genkey_params ); > gcry_sexp_release( sexp_Curve25519_pair ); > gcry_mpi_release( mpi_Curve_pub_y ); > gcry_mpi_release( mpi_Curve_pub_z ); > gcry_mpi_release( mpi_Curve_pub_compressed );

If it's working, that's good, but it looks a bit too much to me: first you compress it using the djb-tweak flag, and later you have to decompress it again. Any other benefit of using djb-tweak I do not see. And if not using compression you have the public key already available in "mpi_Curve_pub_compressed", which should then be renamed to mpi_Curve_pub_notcompressed. And then using gcry_mpi_print and reversing the result should be enough. Am I overlooking something? Stef

From arlyon at me.com Fri Jun 29 05:09:21 2018 From: arlyon at me.com (Alexander Lyon) Date: Fri, 29 Jun 2018 04:09:21 +0100 Subject: Correct method to generate a Curve25519 keypair In-Reply-To: References: <7E3FC5D1-7172-45D1-ADE9-102EE73AA850@me.com> <68AB9180-0D76-4B76-99A5-C4E8AFA23CF0@me.com> Message-ID:

djb-tweak and comp are necessary to generate the key. I have not found out how to make it work without those flags. In fact, changing comp (compressed) to nocomp causes the program to crash. On Thu, Jun 28, 2018, 18:34 Stef Bon wrote: > Op do 28 jun.
2018 om 17:25 schreef Alexander Lyon : > > gcry_sexp_build( &sexp_genkey_params, NULL, > > "(genkey" > > " (ecc" > > " (curve \"Curve25519\")" > > " (flags djb-tweak comp)" > > " )" > > ")" ); > > > > I did not know that the curve "Curve25519" is a valid curve, > I've posted about this earlier. > > > gcry_sexp_extract_param( sexp_Curve25519_pair, NULL, "qd", > > &mpi_Curve_pub_compressed, &mpi_Curve_priv, > NULL ); > > > Ah gcry_sexp_extract_param does the trick. Good to know. > The documentation is better with more examples like this. > > > // to decompress, we decode it into a point > > // then extract the X and discard the rest > > gcry_mpi_point_t point_Curve_pub = gcry_mpi_point_new( 0 ); > > gcry_ctx_t ctx_curve; > > gcry_mpi_ec_new( &ctx_curve, NULL, "Curve25519" ); > > gcry_mpi_ec_decode_point( point_Curve_pub, mpi_Curve_pub_compressed, > ctx_curve ); > > > > // we extract x, y and z but only need x because > > // curve only uses the x coordinate. y and z are discarded. > > gcry_mpi_t mpi_Curve_pub_y = gcry_mpi_new( 0 ); > > gcry_mpi_t mpi_Curve_pub_z = gcry_mpi_new( 0 ); > > > > gcry_mpi_point_snatch_get( mpi_Curve_pub, mpi_Curve_pub_y, > mpi_Curve_pub_z, point_Curve_pub ); > > > > gcry_sexp_release( sexp_genkey_params ); > > gcry_sexp_release( sexp_Curve25519_pair ); > > gcry_mpi_release( mpi_Curve_pub_y ); > > gcry_mpi_release( mpi_Curve_pub_z ); > > gcry_mpi_release( mpi_Curve_pub_compressed ); > > If it's working, that's good but it looks a bit too much to me. first > you compress it using the djb-twaek flag, and later > you have to decompress it later. Any other benfit using the djb-tweak > I do not see. And if not using compression > you have the public key already available in > "mpi_Curve_pub_compressed" which should be renamed to > mpi_Curve_pub_notcompressed. > And then using gcry_mpi_print and reversing the result shoudl be enough. > > Am I overseeing something? > > Stef > > _______________________________________________ > Gcrypt-devel mailing list > Gcrypt-devel at gnupg.org > http://lists.gnupg.org/mailman/listinfo/gcrypt-devel > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From wk at gnupg.org Fri Jun 29 07:17:18 2018 From: wk at gnupg.org (Werner Koch) Date: Fri, 29 Jun 2018 07:17:18 +0200 Subject: Error with gcry_mpi_release with opaque value. In-Reply-To: (Stef Bon's message of "Thu, 28 Jun 2018 13:54:37 +0200") References: <87a7rf1j7c.fsf@wheatstone.g10code.de> <87woujywle.fsf@wheatstone.g10code.de> Message-ID: <87po0ayxwh.fsf@wheatstone.g10code.de>

On Thu, 28 Jun 2018 13:54, stefbon at gmail.com said: > malloc and free. Maybe initialize? gcry_free does not work without > first calling gcry_malloc? Libgcrypt needs to be initialized; if you forgot that you should get a notice in the syslog. There is some fallback initialization but that does not work in all cases. Why don't you use the gcry_mpi_set_opaque_copy function? Salam-Shalom, Werner -- # Please read: Daniel Ellsberg - The Doomsday Machine # Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 227 bytes Desc: not available URL:
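(For completeness, a minimal sketch of the two patterns suggested in this thread for building an opaque MPI; it is a code fragment only, assumes libgcrypt has already been initialized, and "src"/"len" stand for the caller's own data:)

    /* Pattern 1: allocate the buffer with gcry_malloc so the allocator
     * matches the one gcry_mpi_release will later use to free it.     */
    void *buffer = gcry_malloc (len);
    memcpy (buffer, src, len);                          /* needs <string.h> */
    gcry_mpi_t a = gcry_mpi_set_opaque (NULL, buffer, 8 * len);  /* takes ownership */

    /* Pattern 2: let libgcrypt copy the data itself. */
    gcry_mpi_t b = gcry_mpi_set_opaque_copy (NULL, src, 8 * len);

    gcry_mpi_release (a);
    gcry_mpi_release (b);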