Proposal: New keydb format
Guilhem Moulin
guilhem at fripost.org
Sun Nov 1 19:49:06 CET 2015
Hi again,
On Sat, 31 Oct 2015 at 01:58:00 +0100, Neal H. Walfield wrote:
> At Thu, 29 Oct 2015 15:28:28 +0100, Guilhem Moulin wrote:
>> Care to look into that? I'd be happy to resume the benchmark later.
>
> I spent a little bit of time on this this evening and found the bug.
> I think --with-colons --list-sigs should now work.
Thanks a lot! Yes indeed it does, and for my use case the performance is just
amazing, especially with --list-sigs \o/ See the details below.
However note that in order to create a keyring in keydb format I had to
to apply to apply the following patch. (But I guess the proper fix
would be to remove everything from and including the ‘;’ at the end of
each SQL statement.)
-8<-------------------------------------------------------------------------->8-
diff --git a/g10/sqlite.c b/g10/sqlite.c
index ee7514c..82577b4 100644
--- a/g10/sqlite.c
+++ b/g10/sqlite.c
@@ -105,7 +105,7 @@ sqlite3_stepx (sqlite3 *db,
/* We can only process a single statement. */
if (tail)
{
- while (*tail == ' ' || *tail == ';')
+ while (*tail == '\n' || *tail == ' ' || *tail == ';')
tail ++;
if (*tail)
-8<-------------------------------------------------------------------------->8-
> Please feel free to run some benchmarks.
* Setup environment *
/run/shm is mounted in memory (tmpfs). Benchmark is done on a x86_64,
4x Core i7-4770S @ 3.1GHz, 16GiB DDR3 PC3-12800. (As shown in issue1938
the keybox is mostly problematic when using --list-sigs on a slower
32-bits CPU's such as my X60s' L2400.)
‘gpg’ is GnuPG 1.4.19
‘gpg2’ is GnuPG 2.1.9
‘./g10/gpg2’ is GnuPG neal/next branch (3ad5c81)
for x in ring kbx kdb; do
mkdir -m0700 /run/shm/"gnupg-$x"
echo trusted-key 0x39278DA8109E6244 >/run/shm/"gnupg-$x"/gpg.conf
echo trust-model always >>/run/shm/"gnupg-$x"/gpg.conf
echo no-auto-check-trustdb >>/run/shm/"gnupg-$x"/gpg.conf
done
echo no-autostart | tee -a /run/shm/gnupg-kbx/gpg.conf >>/run/shm/gnupg-kdb/gpg.conf
{ echo no-default-keyring; echo primary-keyring gnupg-kdb:pubring.kdb; } >>/run/shm/gnupg-kdb/gpg.conf
alias time="/usr/bin/time -f '%E (%U user, %S sys) %Mk maxres'"
* Importing the keyring *
(From the same `gpg2 --export` command so that all three gnupg-*
keyrings contain the same keys, and in the same order.)
~$ gpg2 --export | time gpg --homedir /run/shm/gnupg-ring --import
1:29.91 (61.68 user, 27.82 sys) 19148k maxres
~$ gpg2 --export | time gpg2 --homedir /run/shm/gnupg-kbx --import
0:55.38 (12.74 user, 42.48 sys) 189444k maxres
~$ gpg2 --export | time ./g10/gpg2 --homedir /run/shm/gnupg-kdb --import
0:09.76 (9.46 user, 0.18 sys) 31724k maxres
VERDICT: keybox format is faster that the legacy one, at the expense of
memory usage (20M vs almost 200M of max res set size); also the fact
that it spends most of its time in kernel space is suspicious. Anyway,
the keydb format yields an impressive performance boost in (both kernel-
and userspace) CPU time and blow them both.
* Listing keys *
1293 public keys: `gpg2 --homedir /run/shm/gnupg-kbx --with-colons --list-keys | grep -c '^pub:'`
~$ time gpg --homedir /run/shm/gnupg-ring --list-keys >/dev/null
0:01.94 (1.92 user, 0.01 sys) 11192k maxres
~$ strace -c gpg --homedir /run/shm/gnupg-ring --list-keys >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
76.47 0.000013 0 21958 read
23.53 0.000004 0 2586 lseek
0.00 0.000000 0 91 write
0.00 0.000000 0 22 6 open
0.00 0.000000 0 14 close
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.000017 34466 23 total
~$ time gpg2 --homedir /run/shm/gnupg-kbx --list-keys >/dev/null
0:00.60 (0.60 user, 0.00 sys) 16464k maxres
~$ strace -c gpg2 --homedir /run/shm/gnupg-kbx --list-keys >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
0.00 0.000000 0 2051 read
0.00 0.000000 0 96 write
0.00 0.000000 0 26 9 open
0.00 0.000000 0 17 close
0.00 0.000000 0 6 1 stat
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 2388 28 total
~$ time ./g10/gpg2 --homedir /run/shm/gnupg-kdb --list-keys >/dev/null
0:00.73 (0.66 user, 0.07 sys) 108716k maxres
~$ strace -c ./g10/gpg2 --homedir /run/shm/gnupg-kdb --list-keys >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
54.23 0.000231 0 171112 read
40.38 0.000172 0 171094 lseek
5.40 0.000023 1 38 munmap
0.00 0.000000 0 102 write
0.00 0.000000 0 28 8 open
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.000426 354634 2628 total
VERDICT: keybox format is the fastest, both in terms of CPU time and
memory usage. keydb comes close second in execution time, but needs
much more memory (16M vs over 100M of max res set size). It also makes
significantly more read(2) and lseek(2), but it doesn't seem to have a
significant impact on the execution time.
* Listing sigs without nested lookups (--fast-list-mode) *
242993 sigs: `gpg2 --homedir /run/shm/gnupg-kbx --with-colons --list-sigs | grep -Ec '^(sig|rev):'`
~$ time gpg --homedir /run/shm/gnupg-ring --fast-list-mode --list-sigs >/dev/null
0:02.06 (2.05 user, 0.00 sys) 11008k maxres
$ strace -c gpg --homedir /run/shm/gnupg-ring --fast-list-mode --list-sigs >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
0.00 0.000000 0 21958 read
0.00 0.000000 0 2098 write
0.00 0.000000 0 22 6 open
0.00 0.000000 0 14 close
0.00 0.000000 0 2 stat
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 36473 23 total
~$ time gpg2 --homedir /run/shm/gnupg-kbx --fast-list-mode --list-sigs >/dev/null
0:00.80 (0.80 user, 0.00 sys) 16420k maxres
~$ strace -c gpg2 --homedir /run/shm/gnupg-kbx --fast-list-mode --list-sigs >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
0.00 0.000000 0 2051 read
0.00 0.000000 0 1964 write
0.00 0.000000 0 26 9 open
0.00 0.000000 0 17 close
0.00 0.000000 0 6 1 stat
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 4256 28 total
~$ time ./g10/gpg2 --homedir /run/shm/gnupg-kdb --fast-list-mode --list-sigs >/dev/null
0:00.96 (0.88 user, 0.08 sys) 108692k maxres
~$ strace -c ./g10/gpg2 --homedir /run/shm/gnupg-kdb --fast-list-mode --list-sigs >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
46.36 0.000223 0 171112 read
34.30 0.000165 0 171094 lseek
19.33 0.000093 2 38 munmap
0.00 0.000000 0 1970 write
0.00 0.000000 0 28 8 open
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.000481 356502 2628 total
VERDICT: pretty much the same as --list-keys, not surprisingly.
* Listing sigs with nested lookups *
166214 sigs from an unknown signer: `gpg2 --homedir /run/shm/gnupg-kbx --with-colons --list-sigs | grep -Ec '^(sig|rev):([^:]*:){8}\[User ID not found\]:'`
14340 unknown signers: `gpg2 --homedir /run/shm/gnupg-kbx --with-colons --list-sigs | sed -nr 's/(sig|rev):([^:]*:){3}([0-9A-F]{16}):([^:]*:){4}\[User ID not found\]:.*/\3/p' | sort -u | wc -l`
~$ time gpg --homedir /run/shm/gnupg-ring --list-sigs >/dev/null
0:32.07 (25.26 user, 6.80 sys) 11612k maxres
~$ strace -c gpg --homedir /run/shm/gnupg-ring --list-sigs >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
74.45 0.010518 0 7820193 read
14.18 0.002004 0 2281813 getrusage
11.13 0.001572 0 2281813 clock_gettime
0.24 0.000034 0 154135 lseek
0.00 0.000000 0 3307 write
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.014128 12541477 23 total
~$ time gpg2 --homedir /run/shm/gnupg-kbx --list-sigs >/dev/null
2:31.89 (14.51 user, 137.29 sys) 18724k maxres
~$ strace -c gpg2 --homedir /run/shm/gnupg-kbx --list-sigs >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.95 1.729099 0 30097047 read
0.02 0.000385 0 15461 9 open
0.02 0.000356 0 15445 munmap
0.00 0.000063 0 15473 mmap
0.00 0.000044 0 15452 close
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 1.729969 30193211 28 total
~$ time ./g10/gpg2 --homedir /run/shm/gnupg-kdb --list-sigs >/dev/null
0:05.93 (5.75 user, 0.18 sys) 107972k maxres
~$ strace -c ./g10/gpg2 --homedir /run/shm/gnupg-kdb --list-sigs >/dev/null
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
47.44 0.000269 0 270111 read
32.45 0.000184 0 270093 lseek
9.88 0.000056 0 34850 34848 access
7.76 0.000044 0 69652 fcntl
2.47 0.000014 0 52262 fstat
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.000567 700901 34858 total
VERDICT: Nested lookups yield a quadratic complexity on the legacy
keyring and the keybox format. Again, it's weird that gpg takes most its
time in kernel mode (30000000 reads(2), really?); furthermore a read(2)
takes 57.45ns on the keybox vs 1.34ns on the legacy format. Thanks to
the caching 2.1.9 brings the execution time from 20min to 2.5min. But
*YAY* for the keydb format, which delivers its promises, and brings a
fantastic performance boost (at the expense of memory usage)!
* Building ownertrust *
~$ time gpg --homedir /run/shm/gnupg-ring --trust-model=pgp --check-trustdb
0:08.01 (7.78 user, 0.22 sys) 104384k maxres
~$ rm -f /run/shm/gnupg-ring/trustdb.gpg
~$ strace -c gpg --homedir /run/shm/gnupg-ring --trust-model=pgp --check-trustdb
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
94.71 0.006102 3051 2 rename
1.51 0.000097 0 189215 clock_gettime
1.41 0.000091 0 188974 lseek
1.16 0.000075 0 84746 read
1.15 0.000074 0 189215 getrusage
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.006443 671914 22 total
~$ time gpg2 --homedir /run/shm/gnupg-kbx --trust-model=pgp --check-trustdb
0:06.81 (5.84 user, 0.97 sys) 102568k maxres
~$ rm -f /run/shm/gnupg-kbx/trustdb.gpg
~$ strace -c gpg2 --homedir /run/shm/gnupg-kbx --trust-model=pgp --check-trustdb
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
99.71 0.009584 0 180151 read
0.12 0.000012 0 815 9 open
0.10 0.000010 0 218 munmap
0.03 0.000003 0 5283 write
0.03 0.000003 0 15587 lseek
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.009612 205584 28 total
~$ time ./g10/gpg2 --homedir /run/shm/gnupg-kdb --trust-model=pgp --check-trustdb
0:11.35 (11.18 user, 0.16 sys) 248604k maxres
~$ rm -f /run/shm/gnupg-kdb/trustdb.gpg
~$ strace -c ./g10/gpg2 --homedir /run/shm/gnupg-kdb --trust-model=pgp --check-trustdb
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
51.40 0.000367 0 387421 read
43.56 0.000311 0 394650 lseek
1.26 0.000009 0 1088 brk
1.12 0.000008 0 5760 write
0.98 0.000007 0 9091 fstat
[…]
------ ----------- ----------- --------- --------- ----------------
100.00 0.000714 820003 6078 total
VERDICT: keydb format is slower than the other two, but overall either format
has comparable performance.
> FWIW, I probably won't be working on this much in the near future as
> it is not currently a priority for us.
Thanks for being honest about it and saying so; as for myself I'll
follow updates in your branch and eagerly wait for the new format :-)
--
Guilhem.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: </pipermail/attachments/20151101/69ace1ca/attachment.sig>
More information about the Gnupg-devel
mailing list