Encoding: a proposal

George Pauliuc pauliuc@gmx.net
22 Nov 2002 18:05:06 +0200


On Fri, 2002-11-22 at 10:47, Lorenzo Cappelletti wrote:
> 1. Translator of lang LL (en, de, ...) chooses the encoding ENC (UTF-8,
> Latin10, ...) that best suits their needs.

Sounds excellent.  After all, each translator should use whatever encoding
they like best.  UTF-8 would be the best choice, but most editors nowadays
still don't have UTF-8 support (or it is implemented badly or only
experimentally).  Because it is a multibyte representation, it is really
easy to corrupt if somebody makes a mistake and tries to interpret it as
a single-byte encoding.  I know, it happened to me ;-)
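
To make the failure mode concrete, here is a small Python sketch of what
happens when UTF-8 bytes are read as a single-byte encoding and then saved
again.  The sample word and the choice of Latin-1 as the wrong encoding are
only assumptions for illustration; the function name is made up.

```python
def misread_as_single_byte(text, wrong_encoding="latin-1"):
    """Encode text as UTF-8, then decode those bytes with a single-byte encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


original = "România"
garbled = misread_as_single_byte(original)
print(garbled)  # RomÃ¢nia

# Saving the garbled text as UTF-8 again double-encodes the accented letter.
resaved = garbled.encode("utf-8")
print(len(original.encode("utf-8")), len(resaved))  # 8 10

# As long as nothing else touched the text, the damage can be undone.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)  # True
```

The point is that one two-byte character becomes two separate characters,
and every further save in the wrong encoding makes it worse.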

> 2. Since no information is stored in text (thus .wml) file about
> encoding, they can translate a .wml file using their favorite text
> editor and encoding ENC.

Right.

> Translator's editor *must not* corrupt others' translations by simply
> opening and saving the .wml file!

Wouldn't it be easier to make the first step something like 'cp xx.wml
xx.LL.wml'?  And, to be sure, block updates to xx.wml from anybody other
than you or whoever takes care of the original version (which will
probably always be in English, so no special encoding is needed).
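
A rough Python sketch of that first step, assuming the xx.wml / xx.LL.wml
naming above.  The helper name start_translation is hypothetical, and the
sketch does not cover blocking updates to the original, which would have to
be handled wherever the files are stored.

```python
import shutil
import tempfile
from pathlib import Path


def start_translation(master: Path, lang: str) -> Path:
    """Copy xx.wml to xx.LL.wml unless that translation already exists."""
    target = master.with_name(f"{master.stem}.{lang}{master.suffix}")
    if not target.exists():
        # copyfile copies bytes verbatim, so nothing is re-encoded.
        shutil.copyfile(master, target)
    return target


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "index.wml"
    master.write_bytes(b"<title>Hello</title>\n")
    copy = start_translation(master, "ro")
    copy_name = copy.name
    master_unchanged = master.read_bytes() == b"<title>Hello</title>\n"

print(copy_name)  # index.ro.wml
```

The translator then only ever opens the per-language copy, so the editor
cannot damage anybody else's text.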

> For those of you who needs images for proper language symbol rendering
> (Romanian), I can provide some WML custom tags to make life easier.

Hmm... what do you have in mind, Lorenzo?

> 3. Final .html.LL file will be made of text in plain ASCII (mainly
> tags) and translated text in encoding ENC.  No re-encoding is
> performed.

Plain ASCII?  The text will be included?  Sounds like XML -> XSLT ->
XHTML.  Where will the encoding be?

> HTML header will be filled out with these attributes:
>   encoding="ENC"
>   lang="LL"

Could you describe in more detail the mechanism you plan to use?  I'm
not sure I understand what will happen with the text.
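
Here is how I currently read step 3, as a Python sketch; please correct me
if this is not what you mean.  It assumes the encoding ends up in a meta
tag as charset=ENC, which is my guess and not something stated in the
proposal.  The function name build_page and the sample text are made up.

```python
def build_page(body: bytes, enc: str, lang: str) -> bytes:
    """Wrap already-encoded translated text in plain ASCII tags, without re-encoding it."""
    head = (
        f'<html lang="{lang}">\n<head>\n'
        f'<meta http-equiv="Content-Type" content="text/html; charset={enc}">\n'
        "</head>\n<body>\n"
    ).encode("ascii")  # the tags themselves are pure ASCII
    return head + body + b"</body>\n</html>\n"


# The translator saved the text in ENC (here ISO-8859-2, as an example).
translated = "România".encode("iso-8859-2")
page = build_page(translated, "iso-8859-2", "ro")

# The translated bytes pass through untouched.
print(translated in page)  # True
```

Because the tags are ASCII and ASCII is a subset of the usual encodings, the
whole file is still valid text in ENC.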

I know it is silly, but, to begin with, what is this .wml file?  I
understand it is some kind of formatted text.  Formatted how?  Where can
I get more details?

> In the future we can think to adopt po4a mechanism
> (http://savannah.nongnu.org/projects/po4a/).

The project looks like it is still at an alpha stage.
