The Guide to PGTEI Version 0.3 Marcello Perathoner Edition 3 Project Gutenberg April 12, 2005 20000

This eBook is for the use of anyone anywhere at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License online at www.gutenberg.org/license

Created electronically.
Library of Congress Classification ZA TEI XML October 2002 Marcello Perathoner Started writing it
Disclaimer

This document does not represent any official PG standard. It describes the dialect of TEI used by the Gnutenberg Press, which is part of the online text conversion system being installed on www.gutenberg.org.

The Gnutenberg Press will convert any TEI-conformant text, but texts marked up according to this guide will look better and contain all the necessary headers and footers for posting on PG.

How to Mark Up a Text

To mark up a text means to identify its components according to a set of rules.

Basically you say: this (point in the text) is the start of a paragraph; this is the end of a paragraph; this is the start of a chapter; this is the end, etc.

In TEI-speak a text component is referred to as an element. Paragraphs, chapters, highlighted words, quotes, footnotes, etc. are such elements.

Elements

To mark a text region as an element you have to insert an opening tag at the start and a closing tag at the end of the text region. In TEI a paragraph is represented by an element of type p. You type an opening tag by enclosing the element type in angle brackets: <p>. A closing tag has a slash after the first bracket: </p>.

Here's an example of how you would mark up a paragraph:

<p>'Oh, bless you, it doesn't matter in the least.
If the man is caught, it will be _on account_ of their exertions;
if he escapes, it will be _in spite_ of their exertions.
It's heads I win and tails you lose.'</p>

Don't worry about the line breaks; the text will get reformatted anyway. The formatter knows where a paragraph ends by the </p> tag and does not care about empty lines and such.

Let's do some more markup. In TEI the emph element stands for emphasized text:

<p>'Oh, bless you, it doesn't matter in the least.
If the man is caught, it will be <emph>on account</emph> of their exertions;
if he escapes, it will be <emph>in spite</emph> of their exertions.
It's heads I win and tails you lose.'</p>

In TEI the q element stands for quoted text:

<p><q>Oh, bless you, it doesn't matter in the least.
If the man is caught, it will be <emph>on account</emph> of their exertions;
if he escapes, it will be <emph>in spite</emph> of their exertions.
It's heads I win and tails you lose.</q></p>

Every opening tag needs a corresponding closing tag. Opening and closing tags must always nest like parentheses in a mathematical equation.

This is right:

...

and this is wrong:

...
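A minimal sketch of the nesting rule, assuming the p and q elements introduced above (other elements follow the same rule):

```xml
<!-- right: q opens inside p and closes before p closes -->
<p><q>...</q></p>

<!-- wrong: p closes while q is still open -->
<p><q>...</p></q>
```

A validator will reject the second form because the tags overlap instead of nesting.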
Attributes

Most elements can take attributes. Attributes are used to add specifications to elements in exactly the same way in which adjectives add specifications to nouns.

<p>... Above all, why should the second man write up the German word <foreign lang="de">Rache</foreign> before decamping? ...</p>

In TEI the foreign element is used to mark a passage in a foreign language. The lang attribute specifies which language it is.

The attribute name must be followed by an = and the attribute value must be put in quotes. An element can have zero or more attributes but every attribute must have a different name.

... Above all, why should the second man write up the German word Rache before decamping? ...

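A minimal sketch of the attribute syntax, assuming the foreign element from above; the value de is taken here to be the ISO 639 code for German:

```xml
<!-- zero attributes -->
<foreign>Rache</foreign>

<!-- one attribute: name, then =, then the value in quotes -->
<foreign lang="de">Rache</foreign>

<!-- wrong: the same attribute name appears twice -->
<foreign lang="de" lang="fr">Rache</foreign>
```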
Entities

In TEI you can specify characters you don't have on your keyboard with entities. Let's see how to insert the em-dash character, that is the long dash you see in printed books. (In PG etexts that character is mostly represented by two dashes -‍- because ASCII lacks that character.)

<p>... Among other things I bought these brown boots &mdash; gave six dollars for them &mdash; and had one stolen before ever I had them on my feet.</p>

In TEI the entity &mdash; represents an em-dash. Substituting &mdash; for -- makes the text look more professional.

Entities start with an ampersand (&) and end with a semicolon (;). You can find a list of supported TEI entities in Chapter 18.

Working Strategy

You can and should mark up a text incrementally. That is: make more than one pass over the whole text and in each pass mark up a subset of elements.

You may start marking only the most prominent text features like chapters and paragraphs. Later you make a second pass marking all italicized text. If you still want to do more, make another pass replacing all quotation marks with the q element.

TODO: a PG working group needs to codify different levels of PGTEI markup.

Most probably you will start with a TEI text automatically generated by some program from the plain vanilla etext. Your task will then be to proof the tags inserted by the program.

Start with generic tags

If you cannot state with confidence the reason why a text passage is highlighted, use the generic hi tag. A person more knowledgeable than you can easily make another pass over the text, searching for all generic tags and replacing them with more appropriate specific tags, e.g. the emph or title tags.

If you encounter a passage in a foreign language unknown to you, just use the bare foreign tag. Another person who knows the language may add the lang attribute.
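As a sketch of this strategy, the first pass below uses only generic tags, and a later pass replaces them with specific ones. The sample sentence and the choice of emph and lang="de" are illustrative assumptions, not taken from a PG text:

```xml
<!-- first pass: generic tags only -->
<p>He said it was <hi>very</hi> important and signed off
with <foreign>Auf Wiedersehen</foreign>.</p>

<!-- later pass: specific tag chosen and language added -->
<p>He said it was <emph>very</emph> important and signed off
with <foreign lang="de">Auf Wiedersehen</foreign>.</p>
```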

Insert comments

You can insert comments anywhere you want. These will stay in the TEI text but not show up in the formatted output. By using the word FIXME you can mark positions that require further inspection.

A comment starts with <!-- and ends with -->.

Oh, bless you, it doesn't matter in the least. If the man is caught, it will be on account of their exertions; if he escapes, it will be in spite of their exertions. It's heads I win and tails you lose. Whatever they do, they will have followers. Un sot trouve toujours un plus sot qui l'admire.


Later it will be easy to search for all FIXME in the text and fix them.
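A sketch of such a comment, applied to the end of the passage above. The wording of the comment and the exact placement of the tags are assumptions:

```xml
<p><q>... Whatever they do, they will have followers.
<!-- FIXME: which language is this? add the lang attribute -->
<foreign>Un sot trouve toujours un plus sot qui l'admire.</foreign></q></p>
```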

Validating your text

One of the advantages of XML is that a program can check the markup for you. To do this you need a validator and the DTD (Document Type Definition).

You can get XML validators from here:

xmllint

And here is the PGTEI DTD.

For all of you who don't want to install a validator on your own PC there is an online validation service for PGTEI. It can also convert your text to different output formats.

Further Reading

As primary source of information refer to <xref url="http://www.tei-c.org/Lite/">TEI Lite: An Introduction to Text Encoding for Interchange</xref> by Lou Burnard and C. M. Sperberg-McQueen, June 1995, revised May 2002.

A still smaller subset of TEI is described in: <xref url="http://www.tei-c.org/Vault/Bare">Bare Bones TEI A Very Very Small Subset of the TEI Encoding Scheme</xref> by C. M. Sperberg-McQueen, Document No. TEI U6, 30 Aug 1994, revised June 1995.

The complete TEI markup language (caveat emptor) is described in: <xref url="http://www.tei-c.org/P4X/">TEI P4: Guidelines for Electronic Text Encoding and Interchange</xref> by C. M. Sperberg-McQueen, and L. Burnard editors, 2002.

The homepage of the Text Encoding Initiative Consortium has much other interesting material and many links.

More Links

Language Codes: <xref url="http://xml.coverpages.org/iso639a.html">Code for the Representation of the Names of Languages.</xref> From ISO 639, revised 1989

Implementation Details

The rest of this guide explains the implementation details and limitations of the pg-press system and shows more examples. Numbered headers refer to the corresponding section in the TEI Lite Introduction.

Standard Header for PG submission

These are examples for the official header and footer in a PGTEI text. The publicationStmt section is mandatory.

Alice's Adventures in Wonderland Illustrated by John Tenniel Lewis Carroll John Tenniel Edition 30 Project Gutenberg January, 1991 11

This eBook is for the use of anyone anywhere at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License online at www.gutenberg.org/license

#1 in our series by Lewis Carroll 1 Unknown
Library of Congress Classification PR Alice January 1991 Anonymous Project Gutenberg edition 10 March 1994 Anonymous Project Gutenberg edition 30 March 2003 Marcello Perathoner TEI Markup

And this is the footer:

3. The Structure of a TEI Text

See the TEI-Lite introduction.

Composite texts are not supported.

teiCorpus

Unsupported.

group

Unsupported.

4. Encoding the Body

See the TEI-Lite introduction.

4.1. Text Division Elements
The rend attribute

On a block element the attribute rend may take one or more of the following values:

This block is left-adjusted.

Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem.

This block is centered.

Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem.

This block is right-adjusted.

Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem.

This block is left- and right-justified.

Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem.

This entity is rendered as a block and has wider margins.

Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem.

Use for examples of code. This block is rendered in a monospaced font. Line breaks are preserved.

This block gets indented by n em-spaces. n may be negative.

Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem.

This element starts a new page.

Chapter 2

...

This element starts a new right-hand page.

Part 2

...

Floats the division to the left or right margin (HTML mode) or to the top or bottom of the page or to a special page (PDF mode). Valid value is a string composed of one or more of the option letters:

The division is floated to the left margin. HTML mode only.

The division is floated to the right margin. HTML mode only.

The floated division may stay here if there is enough room left on this page. PDF mode only.

The division may float to the top of the current page if there is enough room for both it and the previous text. If this is not the case, it is added at the top of the next page. The subsequent text continues on the current page. PDF mode only.

The division may float to the bottom of the current page. The subsequent text continues until the room left on the current page is just enough for the float. If there is already insufficient room, the float will be put at the bottom of the next page. PDF mode only.

The division may float to a special page containing only floats. PDF mode only.

The picture in the next example will float to the left margin in HTML mode. In PDF mode it will appear at this point in the text if there is enough room left on the page, else it will float to the top of the next page.

White Rabbit checking watch.


You may also use one of the following shortcuts:

Shortcut for text-align(left).

Shortcut for text-align(center).

Shortcut for text-align(right).

Shortcut for text-align(justify).

Shortcut for indent(2).
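A minimal sketch of the rend attribute on block elements, using only the long forms named in the shortcut list above (text-align(...) and indent(n)). The paragraph text is placeholder:

```xml
<p rend="text-align(center)">This paragraph is centered.</p>
<p rend="text-align(justify)">This paragraph is left- and right-justified.</p>
<p rend="indent(2)">This paragraph is indented by 2 em-spaces.</p>
```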

p

Attribute rend takes the following values in addition to those listed under All Block Elements:

This paragraph will not have its first line indented.

Embedded letters

If you try to mark up an embedded letter (a piece of correspondence), you'll be surprised to find that the simple approach doesn't validate. Use this approach instead:

Chapter 1. The Spread of Evolution

His book on animals and plants ... C. Darwin to T. H. Huxley

...

...

Chapter 1. The Spread of Evolution

His book on animals and plants ... C. Darwin to T. H. Huxley

...

...

4.2. Headings and Closings
head

Use head for the main header and head type="sub" for any subtitle. It is up to you to decide which title is main and which are sub.

Part I Being a Reprint from the Reminiscences of John H. Watson, M.D. late of the Army Medical Department
Mr. Sherlock Holmes

In the year 1878 I took my degree of Doctor ...

The campaign brought honours and promotion to many, ...

...
The Science of Deduction

We met next day as he had arranged, and inspected the rooms at No. 221b, Baker Street, of which he had spoken at our meeting. ...

...
...
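A sketch of how the headings in this example might be marked up. The head and head type="sub" elements are as described above; the nesting of div elements is an assumption:

```xml
<div>
  <head>Part I</head>
  <head type="sub">Being a Reprint from the Reminiscences of
  John H. Watson, M.D. late of the Army Medical Department</head>
  <div>
    <head>Mr. Sherlock Holmes</head>
    <p>In the year 1878 I took my degree of Doctor ...</p>
  </div>
</div>
```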
4.3. Prose, Verse and Drama
l

Attribute part: only the values I, M and F are supported.

Example:

<l>There was a young lady of Riga,</l>
<l>Who smiled as she rode on a tiger;</l>
<l>They came back from the ride</l>
<l>With the lady inside,</l>
<l>And the smile on the face of the tiger.</l>

Will be rendered as:

There was a young lady of Riga,
Who smiled as she rode on a tiger;
They came back from the ride
With the lady inside,
And the smile on the face of the tiger.

sp

Margarete
Versprich mir, Heinrich!
Faust
Was ich kann!
Margarete
Nun sag, wie hast du's mit der Religion?
Du bist ein herzlich guter Mann,
Allein ich glaub, du hältst nicht viel davon.

Will be rendered as:

Margarete
Versprich mir, Heinrich!
Faust
Was ich kann!
Margarete
Nun sag, wie hast du's mit der Religion?
Du bist ein herzlich guter Mann,
Allein ich glaub, du hältst nicht viel davon.
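A sketch of markup that could produce this rendering, assuming the TEI Lite sp and speaker elements, and assuming that the first verse line is shared between the two speakers using part="I" and part="F":

```xml
<sp><speaker>Margarete</speaker>
  <l part="I">Versprich mir, Heinrich!</l></sp>
<sp><speaker>Faust</speaker>
  <l part="F">Was ich kann!</l></sp>
<sp><speaker>Margarete</speaker>
  <l>Nun sag, wie hast du's mit der Religion?</l>
  <l>Du bist ein herzlich guter Mann,</l>
  <l>Allein ich glaub, du hältst nicht viel davon.</l></sp>
```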
5. Page and Line Numbers

See the TEI-Lite introduction.

lb/

This tag has different semantics than in TEI: without an ed attribute it produces a line break in the output at this point. (In standard TEI it should just record a line break in a certain edition.) There ain't a tag in TEI for a forced line break that is not a poetry line, so I collared this one.
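A minimal sketch of forced line breaks inside a paragraph; the address lines are illustrative:

```xml
<p>Mr. Sherlock Holmes<lb/>
221b, Baker Street<lb/>
London</p>
```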

milestone unit="tb"/

You can generate thought-breaks if you set the unit attribute to tb. The default is to generate a small vertical gap.

These are the supported values for the rend attribute:

generates a thought-break consisting of n stars (asterisks).

generates a horizontal rule that is n % the width of the text.
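A minimal sketch of a thought-break between two paragraphs. It shows only the default form without a rend attribute, and the paragraph text is placeholder:

```xml
<p>... the end of one scene.</p>
<milestone unit="tb"/>
<p>The start of the next scene ...</p>
```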

6. Marking Highlighted Phrases

See the TEI-Lite introduction.

6.1. Changes of Typeface, etc.

These are the supported values for the rend attribute when applied to inline elements such as hi:

for text in italics

for bold text

for underlined text

for text in Small Capitals

for superscript text

for subscript text

for expanded text

for strikeout text

for smaller text

for small text

for large text

for larger text

for tty-type text

where x is a font family name: Times New Roman, Courier or Zapf Chancery. Note that display depends also on the fonts actually available on the user's machine.

where x is a percentage value: 50% 75% 100% 150% 200%

where x is a value between 100 and 900: 400 700 900. Note that display depends also on the fonts actually available on the user's machine.

I have left everything in statu quo until I hear from you.

hi

The default rendering for hi without any rend attribute is italic.

6.2. Quotations and Related Features
q

Attribute rend takes the following values:

This quote is rendered as a displayed paragraph.

This quote has the opening mark only.

This quote has only the closing mark.

This quote has no quotation marks.

The first thing that put us out was that advertisement. Spaulding, he came down into the office just this day eight weeks, with this very paper in his hand, and he says:

I wish to the Lord, Mr. Wilson, that I was a red-headed man.


Will be rendered as:

The first thing that put us out was that advertisement. Spaulding, he came down into the office just this day eight weeks, with this very paper in his hand, and he says:

I wish to the Lord, Mr. Wilson, that I was a red-headed man.

6.3. Foreign Words or Expressions

7. Notes

This note ref should not display in the toc.

See the TEI-Lite introduction.

note

The place attribute supports only the values of foot, end and margin.

The n attribute is not supported.

The note text should always be enclosed in paragraphs.

note place="foot" inserts a footnote marker at the exact point in the text. There should be no space between the commented text and the opening <note> tag; any space should be moved after the closing </note> tag.

<p>When I was a boy, there was but one permanent ambition among my comrades in our village<note place="foot"><p>Hannibal, Missouri.</p></note> on the west bank of the Mississippi River. That was, to be a steamboatman. ...</p>

Will be rendered as:

When I was a boy, there was but one permanent ambition among my comrades in our village

Hannibal, Missouri.

on the west bank of the Mississippi River. That was, to be a steamboatman. ...

The handling of footnotes depends on the output format: if the format has facilities for pagination (PDF) the footnote appears at the bottom of the current page. If the format has no such facilities (HTML, TXT, PDB) the footnote appears at the end of the text. In HTML the footnote marker will be linked to the footnote text.

The endnote is less intrusive than the footnote and you should use it for any notes you add to the text yourself.

note place="end" does the same as note place="foot" for the HTML, TXT and PDB formats.

In the PDF format the endnotes get listed in the back matter with the page number. Because the user can only see the page number and not the exact position to which the note is attached, you should insert a short reminder in the note text.

<p>Today about three o'clock the proofs of this paper arrived from the printers. The exercise consists of half a chapter of Thucydides<note place="end"><p>Thucydides: Greek historian, remembered for his History of the Peloponnesian War.</p></note>. I had to read it over carefully, as the text must be absolutely correct. ...</p>

Will be rendered as:

Today about three o'clock the proofs of this paper arrived from the printers. The exercise consists of half a chapter of Thucydides

Thucydides: Greek historian, remembered for his History of the Peloponnesian War.

. I had to read it over carefully, as the text must be absolutely correct. ...

8. Cross References and Links

See the TEI-Lite introduction.

Note: links work only in the HTML and PDF formats.

8.1. Simple Cross References
ptr, ref

Use these for internal links.

Attribute target supports only one destination.

See: Chapter 42.

...
Chapter 42 The Answer

Wouldn't you like to know? ...

...
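A sketch of an internal link, assuming that ref points with its target attribute at the id attribute of the destination division. The identifier ch42 is made up for illustration:

```xml
<p>See: <ref target="ch42">Chapter 42</ref>.</p>

<!-- ... -->

<div id="ch42">
  <head>Chapter 42 The Answer</head>
  <p>Wouldn't you like to know? ...</p>
</div>
```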
8.2. Extended Pointers
xptr, xref

Use these for external links.

Attribute doc is not supported.

New attribute url. Enter the link destination here.

The homepage of TEI.

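A minimal sketch of an external link using the url attribute. The address is assumed from the TEI links given under Further Reading:

```xml
<p><xref url="http://www.tei-c.org/">The homepage of TEI.</xref></p>
```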
8.3. Linking Attributes

9. Editorial Interventions

See the TEI-Lite introduction.

10. Omissions, Deletions, and Additions

See the TEI-Lite introduction.

11. Names, Dates, Numbers and Abbreviations

See the TEI-Lite introduction.

11.1. Names and Referring Strings

name type="ship" and rs type="ship" are output in italics.
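A minimal sketch of both forms; the sentence and the ship name are illustrative:

```xml
<p>They sailed on the <name type="ship">Hispaniola</name>.
The <rs type="ship">old schooner</rs> leaked badly.</p>
```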

11.2. Dates and Times

11.3. Numbers

11.4. Abbreviations and their Expansion

11.5. Addresses

12. Lists

See the TEI-Lite introduction.

list

Attribute rend value run-on is not supported.

Always put one or more paragraphs (p) into an item.
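A minimal sketch of a list in which every item wraps its text in p elements, as required above; the item text is placeholder:

```xml
<list>
  <item><p>First item.</p></item>
  <item>
    <p>Second item, first paragraph.</p>
    <p>Second item, second paragraph.</p>
  </item>
</list>
```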

13. Bibliographic Citations

See the TEI-Lite introduction.

14. Tables

See the TEI-Lite introduction.

table

Attribute rend takes the following values:

Use this to give the table rules around every cell.

PDF output only. Use to give TeX hints about the table columns. The table is implemented using the LaTeX longtable environment.

TXT output only. Use to give nroff hints about the table columns. The table is implemented using the tbl preprocessor.

row

Attribute role not supported.

15. Figures and Graphics

See the TEI-Lite introduction.

figure

Only PNG and JPEG formats are supported at present.

Attribute entity not supported. Use url instead.

New attribute url: the url of the image file.

Attribute rend texwidth is the width the figure is scaled to in PDF (through TeX) output. 100% represents the current linewidth.

White Rabbit checking watch.

16. Interpretation and Analysis

See the TEI-Lite introduction.

16.1. Orthographic Sentences

16.2. General-Purpose Interpretation Elements

17. Technical Documentation

See the TEI-Lite introduction.

17.1. Additional Elements for Technical Documents
eg

Use for examples. This is a block element, with line breaks preserved. In HTML it is also rendered as a shaded box.

Example Example Example

Will be rendered:

Example Example Example
formula

Attribute notation can take the following values:

In PDF output mode this will pipe the contents of the formula element directly through to the TeX processor.

In HTML output mode the contents of the formula element will be passed to an instance of TeX and converted to an image. The resulting image file is inserted into the HTML file.

In all other output modes it will be ignored.

In HTML output mode the contents of the formula element will be inserted literally into the HTML file.

In all other output modes it will be ignored.

In HTML and PDF output modes the SVG contents of the formula element will be converted to an image and inserted into the file.

In all other output modes it will be ignored.

Example:

This is an inlined formula: $\int_0^\infty f(x)\,dx$. And this is some more text after the inlined formula.


Will display as:

This is an inlined formula: $\int_0^\infty f(x)\,dx$. And this is some more text after the inlined formula.

Example:


Note the use of a CDATA section to avoid having to replace every & with &amp;. A CDATA section starts with <![CDATA[ and ends with ]]>.

Will display as:

An embedded SVG image.


17.2. Generated Divisions
divGen

Attribute n sets the title of the generated section. If it is missing, a default title is used; see below.

Attribute type supports the following values:

Generates a standard title page from the teiHeader element.

Generates a colophon from the teiHeader element. This includes all notes in the header and all revisions to the text. Default title is Credits.

Generates a table of contents from index index="toc" elements. Default title is Contents.

Generates a standard PG header appropriate for the output format.

Generates a standard PG footer appropriate for the output format.

Generates a footnotes section. This section is automatically populated with the contents of the note place="foot" tags found in the text. Default title is Notes.

Here is an example for the front matter:

And this is an example for the back matter:

17.3. Index Generation
index

Attributes level2 through level4 not supported.

Attribute index supports the following values:

The table of contents.

The bookmarks section of a PDF file. Note that no special characters (like mdash) should be used for a PDF bookmark.

The bookmarks section of a PDB file. Note that PDB can accommodate a maximum of 15 characters per bookmark. Strings exceeding this length will be truncated.

The level1 attribute of the index element defaults to the contents of the next head element.

18. Character Sets, Diacritics, etc.

See the TEI-Lite introduction.

XML entities

You should use a Unicode-capable editor to edit your files and save them in UTF-8 encoding. If you cannot do that, you'll have to choose a different encoding and enter all characters your encoding cannot handle as XML entities. To do that, you'll first have to find out the Unicode code point of the character.

Ways to use XML entities

Type         Example   yields
Decimal      &#162;    ¢
Hexadecimal  &#xa2;    ¢
Named        &cent;    ¢

XML special characters

You must replace these characters if they occur in your original text. This is because TEI recognizes them as special characters, eg. < and > are the start and the end of a markup tag, & is the start of an entity.

XML special characters

replace  with
&        &amp;
<        &lt;
>        &gt;

Characters you may miss

If you use the ISO-8859-1 encoding to save your TEI file, you will not be able to enter these characters directly. You can still get them if you write:

Useful characters not in ISO-8859-1

to get  type       comment
Œ       &OElig;    The OE ligature used in French texts.
œ       &oelig;    The oe ligature used in French texts.
Š       &Scaron;   An S with an inverted hat.
š       &scaron;   An s with an inverted hat.
Ÿ       &Yuml;     A Y with a diaeresis.
ƒ       &fnof;     Mathematical function of.
>‍<      &zwj;      Zero-width joiner. Use: shelf&zwj;ful to get rid of the ff ligature. Sometimes TeX may need a little hint. See shelf‍ful vs. shelfful. Use also to make sure a word does not get broken here.
> <     &nbsp;     A non-breaking space. Use: Mr.&nbsp;Sherlock Holmes
> <     &ensp;     A space the size of an n.
> <     &emsp;     A space the size of an m. A typographical quad.
> <     &thinsp;   A thin space. May be appropriate between quotes. Note: the program inserts a thin space between quotes automatically if you mark up quotes using the q tag.
–       &ndash;    Use between numbers. E.g. 1914&ndash;18: The great war of 1914–18.
—       &mdash;    Use instead of --
        &qdash;    Use instead of ----. A quote dash or two-em dash is used to indicate missing letters: Mr. P&qdash; woke up.
‘       &lsquo;
’       &rsquo;
‚       &sbquo;
“       &ldquo;
”       &rdquo;
„       &bdquo;
†       &dagger;
‡       &Dagger;
•       &bull;     Bullet.
…       &hellip;   Horizontal ellipsis. Use instead of three dots. Note how the ellipsis dots are spaced farther apart than if you enter three dots: ...
‰       &permil;
′       &prime;    Use for coordinates.
″       &Prime;    Use for coordinates.
‹       &lsaquo;   French guillemet.
›       &rsaquo;   French guillemet.
€       &euro;
™       &trade;
♠       &spades;   The card suits.
♥       &hearts;
♦       &diams;
♣       &clubs;

If you are using a Unicode-capable editor, you can just enter the characters directly.

19. Front and Back Matter

See the TEI-Lite introduction.

19.1. Front Matter
19.1.1. Title Page
epigraph

I pity the man who can travel from Dan to Beersheba, and say 'Tis all barren; and so is all the world to him who will not cultivate the fruits it offers. &qdash; Sterne: Sentimental Journey.

An epigraph contains a quotation, anonymous or attributed, appearing at the start of a section or chapter, or on a title page. An epigraph is rendered in smaller type and right adjusted.

argument

Monte Video — Maldonado — Excursion to R Polanco — Lazo and Bolas — Partridges — Absence of Trees — Deer — Capybara, or River Hog — Tucutuco — Molothrus, cuckoo-like habits — Tyrant Flycatcher — Mocking-bird — Carrion Hawks — Tubes formed by Lightning — House struck

A formal list or prose description of the topics addressed by a subdivision of a text.

19.1.2. Prefatory Matter

19.2. Back Matter
19.2.1. Structural Divisions of Back Matter

20. The Electronic Title Page

See the TEI-Lite introduction.

Experimental

Partially supported through the divGen type="colophon" element.

You should not try to build a conformant header by yourself (unless you are smarter than I am) but just copy the provided header template and modify the appropriate entries.

20.1. The File Description
20.1.1. The Title Statement

20.1.2. The Edition Statement

20.1.3. The Extent Statement

20.1.4. The Publication Statement

20.1.5. Series and Notes Statements

20.1.6. The Source Description

20.2. The Encoding Description
20.2.1. Project and Sampling Descriptions

20.2.2. Editorial Declarations

20.2.3. Tagging, Reference, and Classification Declarations

20.3. The Profile Description

20.4. The Revision Description

Extensions to TEI-Lite
pgIf

Used to insert conditional text.

Attribute has takes the following values:

Test if the text requires a footnote section.

Only paginated output formats like PDF can place the footnotes at the foot of the page. Other formats like HTML don't know pages at all, so we have to place the footnotes at the end of the whole text. (PDF too can have endnotes — notes that appear at the end of the book instead of at the foot of the page.)

This example creates a back only if there are footnotes. In a text with footnotes it will create a back in the HTML output format but not in the PDF output format.


Attribute output takes the following values:

Test if the output format is HTML.

Test if the output format is TeX. TeX is presently used for PDF generation.

Test if the output format is NROFF. NROFF is presently used for TXT and PDB generation.

If you use this feature your text will need revision to accommodate any change in the TEI processing system. For instance, it is not guaranteed that PDF output will always be generated by TeX nor that TXT will always go through NROFF.

\reflectbox{Jabberwocky}\medbreak \reflectbox{'Twas brillig, and the slithy toves}\break \reflectbox{\quad Did gyre and gimble in the wabe;}\break \reflectbox{All mimsy were the borogoves,}\break \reflectbox{\quad And the mome raths outgrabe.}\par

ykcowrebbaJ sevot yhtils eht dna ,gillirb sawT' ebaw eht ni elbmig dna eryg diD  ,sevogorob eht erew ysmim llA .ebargtuo shtar emom eht dnA 

Will be rendered as follows. If you are viewing the PDF file you will see true mirrored text. (Technical information: You may wonder why we don't use the convert formula to image feature here to generate the reflected text in HTML. Actually \reflectbox is a command of the pdflatex driver. To convert formulas into images we use the dvips driver because of its higher output quality.)

\reflectbox{Jabberwocky}\medbreak \reflectbox{'Twas brillig, and the slithy toves}\break \reflectbox{\quad Did gyre and gimble in the wabe;}\break \reflectbox{All mimsy were the borogoves,}\break \reflectbox{\quad And the mome raths outgrabe.}\par

ykcowrebbaJ sevot yhtils eht dna ,gillirb sawT' ebaw eht ni elbmig dna eryg diD  ,sevogorob eht erew ysmim llA .ebargtuo shtar emom eht dnA 
The Gnutenberg Press

The Gnutenberg Press is the software to convert from TEI to HTML, TeX, TXT and PDB. It is a collection of XSLT stylesheets driven by a Perl script.

This is a diagram showing how the conversion is done.

The Gnutenberg Press The conversion process. From TEI thru XSLT to HTML. From TEI thru XSLT and pdfTeX to PDF. From TEI thru XSLT and nroff to TXT. From TEI thru XSLT, nroff and txt2pdbdoc to PDB. All driven by a Perl script.

The XSLT stylesheets do the bulk of the work. The Perl script calls XSLT at the right moments and fixes up things that are just too difficult to get right with XSLT, like the correct placement of newlines, which is crucial to TeX and nroff.

nroff is called twice with slightly differing parameters: with the latin1 device and line breaking on for TXT, with a custom PDB device and line breaking off for PDB. The PDB device is customized towards the special Palm-OS character set.

The Gnutenberg Press is released under the GNU General Public License (GPL).

You may download the Gnutenberg Press.

To use the Gnutenberg Press you need these tools:

Get libxml2 and libxslt from the XML C parser and toolkit of Gnome.

The Pathologically Eclectic Rubbish Lister by Larry Wall in a version >= 5.8.0. Get Perl from the Comprehensive Perl Archive Network. Install the XML::LibXML and XML::LibXSLT modules from CPAN.

The typesetting system invented by Donald Knuth. Get TeX from the TeX Users Group Home Page.

Get GNU groff.

You need a patched version of txt2pdbdoc. The patchfile is contained in the Gnutenberg Press archive.

If you are running a fairly recent Linux distribution you should already have got most of them. If you are on Windows you'll have to sweat some to get them all, but, if you run Windows, you like to suffer, right?

Caveats
The PDF conversion

If you have non-ISO-8859-1 characters in the headings, the PDF conversion will choke. You'll have to use the index index="pdf" tag to provide an alternate heading without those special characters.

In this example the PDF converter would choke on the mdash character in the heading. Thus you have to provide an alternate heading for the PDF bookmark section.

Chapter 1 — First Day
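A sketch of such an alternate heading, assuming that the index element is placed before the head it stands in for and carries the plain-text bookmark in its level1 attribute:

```xml
<div>
  <!-- plain hyphen instead of &mdash; for the PDF bookmark -->
  <index index="pdf" level1="Chapter 1 - First Day"/>
  <head>Chapter 1 &mdash; First Day</head>
  <p>...</p>
</div>
```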