Ben McGinnes ben at adversary.org
Fri Mar 25 18:17:37 CET 2016

On Fri, Mar 25, 2016 at 04:37:59AM -0400, Robert J. Hansen wrote:
> > And that doesn't even get into the issues involved with selecting a
> > format for producing the documentation in.  Consider the following:
> Preach it, Brother Ben.


> And it's not just about formats, it's also about targets,

Right, possibly more about targets really since the purpose of
documentation is to help the end users use the software, not to help
developers tick a box on the feature list and say, "yeah, we've got
(extensive) documentation, sure."  [I do realise that saying that is
sacrilege in some circles.]

> because each of these formats works best with different targets.  Do
> we want to optimize for reading in a browser on a desktop?  Read in
> a mobile browser on a smartphone?  What about reading on a tablet or
> e-reader?  What about dead-tree editions?  How will we make it
> accessible to the blind?  How...

All excellent examples, especially that last one (which is often
overlooked by software projects until someone affected by it raises a
bug).  All of which, just in the asking, actually move things more in
the pro-DITA direction since there are existing DITA transformations
for XHTML 4 (strict & transitional), HTML 5, XHTML 5 (i.e. HTML 5 with
XML), EPUB 2.0, EPUB 3.0 (which covers the DAISY Consortium's primary
concerns with Accessibility), Docbook 4.2 & 5.0 (plus anything they
can be transformed into, including a proven track record with dead
tree versions), CHM output, ODF output, RTF output, Eclipse help
output, PDF output (usually via a FOP or [X]HTML+CSS rendering) and
with my setup there are three different types of WebHelp output
(standard, mobile and one with builtin feedback/comments system, but
the last needs to be built with SQL db support).

For examples of dead tree versions of things made with either DITA or
DocBook that actually sell, there's XML Press (not a huge surprise)
and also O'Reilly Media (which uses the same editor I do, see below).

For the EPUB files, results vary depending on which transformation and
source is used, but all the output is better than anything produced
with Pandoc, Calibre or Sphinx.  Sphinx can't produce a validating
EPUB 2, let alone EPUB 3; Calibre dumps everything in a single
directory regardless of what it is, but still manages to produce
validation errors (mainly in the XHTML) and can't produce EPUB 3;
Pandoc can produce both EPUB 2 & 3, but cheats outrageously (say
goodbye to all chapters, say hello to everything in a single XHTML
file and hope the end user device recognises the markup defining what
a chapter or section is).

Whereas the D4P EPUB transformation will retain your project's
directory structure inside the .epub zipfile and will produce a fully
validating EPUB 3.0.1 file in almost all circumstances.  There are two
exceptions.  If a smaller cover image is included for display in
Apple iBooks, it will appear twice in the manifest, which is easily
fixed with a manual edit.  The other is that the transformation
currently uses XSLT 1.0, which can't convert the build system's time
to UTC; this may cause issues depending on timezone, but is also
easily fixed with a manual update, a script, or by running the build
on a system set to UTC.
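
Both of those fixes could plausibly be scripted.  What follows is only
a minimal sketch, not part of D4P or the DITA-OT: it assumes the
duplicate shows up as two manifest items sharing the same href, and
that the timestamp in question is the dcterms:modified value, which
EPUB 3 wants in UTC.  The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

OPF_NS = "http://www.idpf.org/2007/opf"
# Keep the OPF default namespace unprefixed when serialising.
ET.register_namespace("", OPF_NS)
ET.register_namespace("dc", "http://purl.org/dc/elements/1.1/")


def dedupe_manifest(opf_xml: str) -> str:
    """Drop manifest items whose href repeats an earlier item.

    Assumption: the first occurrence is the one worth keeping.  If the
    spine or a cover reference points at the removed item's id, that
    reference needs fixing by hand.
    """
    root = ET.fromstring(opf_xml)
    manifest = root.find(f"{{{OPF_NS}}}manifest")
    if manifest is None:
        return opf_xml
    seen = set()
    for item in list(manifest):
        href = item.get("href")
        if href in seen:
            manifest.remove(item)
        else:
            seen.add(href)
    return ET.tostring(root, encoding="unicode")


def utc_modified(local_time: datetime) -> str:
    """Format a timezone-aware datetime as a UTC stamp ending in Z."""
    return local_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # A build run at 04:17:37 local time on a UTC+11 system.
    local = datetime(2016, 3, 26, 4, 17, 37,
                     tzinfo=timezone(timedelta(hours=11)))
    print(utc_modified(local))
```

The result of dedupe_manifest() would still need to be written back
into the .epub zipfile, which is left out here.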

For examples (other than me) of groups or orgs using the D4P EPUB
transformation, the biggest one I can think of is HarperCollins.  They
did, after all, pay to have it done and didn't mind having it licensed
the same way as the rest of the D4P project (Apache 2.0) and as freely
available as both that project and the DITA Open Toolkit (they're both
on GitHub).

> And no matter which you choose there's always a sea of people eager to
> tell you that you're doing it wrong.

Yet ever so rarely are able to provide suggestions which meet all the

No doubt there would also be objections to learning an entirely new
XML syntax, but then much of the most common mark-up is very similar
to HTML (e.g. b for bold, i for italics, u for underline, p for each
paragraph), linking is a little different, but not hugely (i.e. <xref
href="https://www.gnupg.org/" scope="external" format="html">GNU
Privacy Guard</xref>, links within the same project don't need the
scope and format bits) and images also differ slightly (<image
href="kitten.png"/> instead of <img src="kitten.png"/>).
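
To make the resemblance to HTML concrete, here is a small hypothetical
topic using only the elements named above, wrapped in a few lines of
Python that parse it and list the link and image targets.  The topic
id, title and text are invented, and a real file would normally also
carry a DITA DOCTYPE declaration, which is left out of this sketch.

```python
import xml.etree.ElementTree as ET

# A hypothetical minimal DITA topic; id, title and wording are made up.
TOPIC = """\
<topic id="sample">
  <title>Sample topic</title>
  <body>
    <p>Text can be <b>bold</b>, <i>italic</i> or <u>underlined</u>.</p>
    <p>See <xref href="https://www.gnupg.org/" scope="external"
       format="html">GNU Privacy Guard</xref> for details.</p>
    <p><image href="kitten.png"/></p>
  </body>
</topic>
"""


def targets(topic_xml: str, element: str) -> list[str]:
    """Return the href of every element with the given name."""
    root = ET.fromstring(topic_xml)
    return [node.get("href") for node in root.iter(element)]


if __name__ == "__main__":
    print(targets(TOPIC, "xref"))
    print(targets(TOPIC, "image"))
```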

On the other hand, once the realisation hits home about how much
content re-use is supported and being able to make document-wide
changes with a little careful design (including conditional output
generation based on things like, say, software versions or supported
feature sets), some of that grumpiness might dissipate.  Maybe.  Well,
OK, probably not.
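
As a rough illustration of the conditional output idea, the sketch
below imitates what an "exclude" filter does to elements tagged with
the DITA product attribute.  It is a toy, not how the DITA-OT is
implemented; the values gnupg1 and gnupg2 are invented, and the rule
used (drop an element only when every value listed in its attribute is
excluded) is my reading of how DITA filtering behaves.

```python
import xml.etree.ElementTree as ET

# Hypothetical source fragment; the product values are made up.
SOURCE = """\
<body>
  <p>Common introduction.</p>
  <p product="gnupg1">Applies to the 1.4 series only.</p>
  <p product="gnupg2">Applies to the 2.x series only.</p>
  <p product="gnupg1 gnupg2">Applies to both.</p>
</body>
"""


def filter_out(xml_text: str, attribute: str, excluded: set[str]) -> ET.Element:
    """Remove elements whose attribute values are all in `excluded`.

    Elements without the attribute are always kept.  Note that
    ElementTree drops a removed element's trailing text with it, so
    this is not safe for mixed inline content as written.
    """
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            values = set((child.get(attribute) or "").split())
            if values and values <= excluded:
                parent.remove(child)
    return root


if __name__ == "__main__":
    result = filter_out(SOURCE, "product", {"gnupg1"})
    for p in result.iter("p"):
        print(p.text)
```

Running the same source with a different excluded set gives a
different document from one set of files, which is the point of the
exercise.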

> It's very frustrating.


In some respects it becomes even harder with an official GNU Project
like GPG and its sub-projects as there's that additional requirement
(usually) that all components must be licensed under the GPL or LGPL
(of whichever versions are relevant).  DITA itself is an open enough
standard to qualify, it's an OASIS standard, but the defining
implementation (see dita-ot.org) and most public specialisations, like
DITA for Publishers (see dita4publishers.org), are deliberately
licensed more permissively (both use Apache 2.0).

My setup is even "worse" since it mixes some more proprietary stuff in
the implementation (but that's primarily the editor and some
components which usually have Free/Libre or Open alternatives, albeit
with greater potential for frustration).  As the DITA-OT relies on
something I loathe (Java) and I was quite prepared to shell out a
modest sum to get what I wanted (generating valid EPUB 3 files and
editing them without breaking them, since Sigil can't), I ended up
getting oXygenXML Editor (their idea of support is in my experience
unparalleled: it extended to full support for a freebie trial version
and included adding a feature within 24 hours of asking about it
during said trial; first as a plugin and later as a built-in).  Oh,
yes, I can load any file I have open in oXygenXML in Emacs from within
it and doing so automatically loads nxml-mode, it's even bound to the
same key sequence as It's All Text! in Firefox.

Even without a GNU implementation of DITA, it's still possible to
produce everything I've mentioned so far with the DITA-OT, except the
fully searchable WebHelp output (the XSLTs are part of oXygenXML
Editor) and possibly not the rest2dita thing (depending on which
version of Saxon is which, I keep getting them mixed up, but I've got
it working with Saxon-HE).  I know some people might argue
against the DITA-OT itself with the Apache 2.0 license, but then turn
a blind eye to Markdown (BSD license) and since both are GPL
compatible I'm likely to ignore that argument.  I will, however, pay
more attention to arguments in favour of accessibility issues
(e.g. generating screen reader friendly material) and translation of
material (that's actual translations, not merely piping said material
through translate.google.com).


P.S.  The schemas I mentioned a few days ago are now in the ben/xml
      branch of GPGME on the git server.  Licensing is the same as
      GPGME plus Apache 2.0 because it can't hurt.
