Sunday, March 16, 2008

First word of a paragraph

Today, with all the troubles around tables, I learned that the first word of a paragraph never gets hyphenated. (Thanks, Ulrike.) So in case you wonder why your table header does not get hyphenated automatically, now you know …
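A common workaround is to put an empty horizontal space in front of the word, so that the word is no longer the first item of the paragraph. The sketch below is only an illustration: the column width and the sample word are made up, and whether a break actually appears depends on the font and the hyphenation patterns in use.

```latex
\documentclass{article}
\begin{document}
\begin{tabular}{|p{1.5cm}|p{1.5cm}|}
\hline
% left cell: the long word is the first word of the cell's paragraph,
% so TeX will not hyphenate it and it sticks out of the column
Characterization &
% right cell: \hspace{0pt} comes first, so the word may be hyphenated
\hspace{0pt}Characterization \\
\hline
\end{tabular}
\end{document}
```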


Everything I learned about tables in an introductory LaTeX course is ok but not sufficient.

A very nice and up-to-date guide to tables has been written by Lapo Filippo Mori, and I warmly recommend it to everybody.

Friday, March 7, 2008

Block comment

If you need to comment out a block of text, use the verbatim package, which provides the comment environment:
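A minimal sketch of how this can look in a document (the sample sentences are placeholders). It is safest to keep \begin{comment} and \end{comment} on lines of their own:

```latex
\documentclass{article}
\usepackage{verbatim} % provides the comment environment
\begin{document}
This sentence is typeset.
\begin{comment}
Everything in here is skipped, even \undefinedcommands
and text that spans several paragraphs.
\end{comment}
This sentence is typeset as well.
\end{document}
```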



Line spacing in tables

If you need more space between the rows of a table, load the array package and put the following line somewhere between \begin{table} and \begin{tabular}:


Instead of “length” you write the desired extra space, e.g. “3pt”.
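The line in question is presumably a \setlength on \extrarowheight, the length that the array package defines for this purpose. That is an assumption based on what the package offers, not a quote from the original listing. A small sketch with a made-up two-column table:

```latex
\documentclass{article}
\usepackage{array} % defines the length \extrarowheight
\begin{document}
\begin{table}
  % extra space added above each row; local to this table environment
  \setlength{\extrarowheight}{3pt}
  \begin{tabular}{ll}
    \hline
    Word  & Gloss  \\
    ainki & father \\
    \hline
  \end{tabular}
\end{table}
\end{document}
```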

Wednesday, March 5, 2008

Assign another catcode

The actual and more elegant solution for my problem with verbatim text in argument position, described in the previous post, is much simpler. The underscore “_” is the only nasty character that keeps me from writing things like “NA071112-01_E.008” as an argument of my \ru command, so the best solution for me is to assign the underscore another catcode (thanks to Enrico Gregorio). This is simply achieved by …


… in the preamble. Thereby the underscore loses its subscript function (category code 8) and is treated like any other punctuation character (category code 12).

As a linguist I do not need the underscore as a shortcut for subscript in math mode. If I want to, I can still get a subscript with “\sb” instead of “_”. (\sb is actually short enough.)
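A sketch of what such a preamble line can look like. The \catcode assignment is the standard TeX way to do this; the fontenc line is an assumption added for pdflatex users, because with the default OT1 encoding a catcode-12 underscore does not print as “_” (with XeTeX and fontspec it should not be needed). Some packages that expect the usual meaning of “_” may not like this change.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % assumption: needed under pdflatex to get a real "_" glyph
\catcode`\_=12 % underscore becomes an ordinary "other" character
\begin{document}
NA071112-01_E.008 can now be typed as is.

Subscripts are still available in math mode: $x\sb{i}$.
\end{document}
```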

I then can define my \ru just like this:


(I may want to fill it with additional commands later.)
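With the underscore at category code 12, a plain one-argument definition is enough. The body below is a hypothetical minimal version that just prints its argument; it is not necessarily the definition from the original post.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % assumption: only needed under pdflatex
\catcode`\_=12 % underscore is now an ordinary character
% hypothetical minimal definition: \ru simply typesets its argument;
% further commands (formatting, indexing) could be added around #1
\newcommand{\ru}[1]{#1}
\begin{document}
See reference unit \ru{NA071112-01_E.008}.
\end{document}
```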

I get the feeling that changing the category code of certain special characters may be the most accessible (if not the only) way of accepting verbatim text as an argument of a command.

This whole thing around my \ru command motivates me to go through some introduction to TeX. I found a very nice explanation of catcodes and their purpose in Eijkhout’s TeX by Topic, p. 29.

D. E. Knuth certainly was a smart guy.

Verbatim text as command argument

In my dissertation I will refer to the N|uu text corpus. The corpus has reference units with names like NA071112-01_E.008. I thought all references to the text corpus are markup-worthy entities, so I needed a command like \ru (short for reference unit) to mark them up like this: \ru{NA071112-01_E.008}.

As you know, the underscore “_” is a special character used for subscripts in math mode, so it is not allowed to occur in ordinary text: outside math mode LaTeX stops with a “Missing $ inserted” error. I did not want to write \ru{NA071112-01\_E.008} instead, so I looked for a way to feed verbatim text to commands as an argument.

I stumbled upon the fancyvrb package, which looks really promising for this, especially in the documentation on p. 17, where some magic aftersave is used. But the aftersave parameter is heavily underdocumented there. That’s just my luck. I could not find out what aftersave actually does and whether it would enable me to write something like \ru{NA071112-01_E.008}.

I then sought help at comp.text.tex (topic: fancyvrb problems) and Enrico Gregorio was so kind as to write me this TeX code:

\def\ruspeciallist{\do\_} % add the special characters you need with "\do\X" (X is the character)

\def\rucatcodes{\def\do##1{\catcode`##1=12 }\ruspeciallist}
\def\ru{\afterassignment\dorusetup\let\next= }
\rucatcodes \aftergroup\dorufinish}

An extensively commented version is also there.

In the last line, between “\dorufinish{” and “\box\rubox}”, come all the commands you want to apply to the argument of \ru. BUT:

Enrico: “A limit of this approach is that the string is never read as an
argument, so that it is not available for, say, writing an index
entry: we have it only in typeset form, inside the box.”

Ok, this certainly is a nice piece of code, but it does not really allow verbatim text to be an argument of a command. It is just a work-around to get that verbatim text typeset somehow (and that's what I had originally asked for in the newsgroup). So, in that sense, verbatim text as an argument is still an open issue.

After I told him that I may actually want to index that stuff, Enrico suggested a much simpler and more elegant solution for my particular needs.

Anyway, if any of you readers finds out how to use the aftersave of the fancyvrb package, do let me know in the comments to this post. I’d really like to know. I think it has something to do with TeX’s \expandafter.
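The listing above seems to have lost a few lines. The following is a hedged reconstruction of how the pieces could fit together, not Enrico's verbatim code: the \newbox\rubox line and the definitions of \dorusetup and \dorufinish are assumptions pieced together from the fragments and from the “\dorufinish{ … \box\rubox}” mentioned in the text. The fontenc line is likewise an assumption for pdflatex users.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % assumption: lets a catcode-12 "_" print correctly under pdflatex

\newbox\rubox % assumed: box register that receives the typeset string

% characters to be made harmless inside \ru
\def\ruspeciallist{\do\_}
% give every listed character category code 12 ("other")
\def\rucatcodes{\def\do##1{\catcode`##1=12 }\ruspeciallist}
% \let swallows the opening brace, then \dorusetup takes over
\def\ru{\afterassignment\dorusetup\let\next= }
% assumed: open an hbox, switch the catcodes inside it, and schedule
% \dorufinish to run once the closing brace has ended the box
\def\dorusetup{\setbox\rubox\hbox\bgroup
  \rucatcodes \aftergroup\dorufinish}
% assumed: formatting commands would go in front of \box\rubox
\def\dorufinish{\box\rubox}

\begin{document}
Reference unit \ru{NA071112-01_E.008} in running text.
\end{document}
```

Because the catcode change happens before TeX reads the characters between the braces, the string is never grabbed as a macro argument, which is exactly the limitation Enrico points out.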

Newcommand with an optional argument

LaTeX lets you define your own commands very easily, with up to 9 arguments, and \newcommand can even make the first of them optional. What I could not find built in is a simple way to test whether that optional argument was left empty. Bad luck, because the first command I needed to write for my dissertation had to have an optional argument:

Linguists traditionally write words and sounds from a language of interest in italics, e.g. the N|uu word ainki. Often it is followed by a translation in single quotes, e.g. ainki ‘father’, xainki ‘mother’, n|ai ‘see’, and kx’ain ‘laugh’ all contain the diphthong ai.

So I wanted to write a command \nuu with which I could typeset either just the N|uu word in italics or the word in italics followed by the quoted translation. I would use it either as \nuu{ainki} for ainki or as \nuu[father]{ainki} for ainki ‘father’.

Later, when I learn how to make an index of all N|uu words in my document, I will probably just add an indexing command to the \nuu command definition. For now the optional argument was trouble enough. I did not expect that it would involve hardcore TeX programming.

Fortunately, good people on the mailing list suggested several solutions for this, and the most bullet-proof and versatile was this (thanks to Morten Høgholm and Ross Moore):


\long\def\tlist@if@empty@nTF #1{%

\tlist@if@empty@nTF{#1}{}{ `#1'}% only the false code executed

\nuu{ainki} \nuu[mother]{xainki} \nuu[goat]{mudi}

I use \providecommand instead of \newcommand. I feel safer. “It works like \newcommand, but if the command is already defined, LaTeX2e will silently ignore it.” (Oetiker et al., lshort.pdf, p. 109.)
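Since only fragments of the listing survive above, here is a sketch of how a complete version might look. The body of \tlist@if@empty@nTF (an emptiness test built on \detokenize, \@firstoftwo and \@secondoftwo) and the exact shape of the \nuu definition are assumptions; only the macro name, the line with the quoted translation and the usage line come from the post. \detokenize requires e-TeX, which pdflatex and xelatex provide.

```latex
\documentclass{article}

\makeatletter
% assumed implementation: run the first following argument if #1 is empty,
% the second one otherwise
\long\def\tlist@if@empty@nTF #1{%
  \expandafter\ifx\expandafter\\\detokenize{#1}\\%
    \expandafter\@firstoftwo
  \else
    \expandafter\@secondoftwo
  \fi}
% #1: optional translation (empty by default), #2: the N|uu word
\providecommand{\nuu}[2][]{%
  \textit{#2}%
  \tlist@if@empty@nTF{#1}{}{ `#1'}% only the false code executed
}
\makeatother

\begin{document}
\nuu{ainki} \nuu[mother]{xainki} \nuu[goat]{mudi}
\end{document}
```

The optional argument itself comes from the [2][] part of the definition; the test is only there so that no stray space and empty quotes are printed when the translation is left out.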

LaTeX’s new era started: XeTeX

My serious interest in LaTeX started in February 2008, when I realized that there is at last a Unicode-capable version of it, called XeTeX. Since I primarily use Windows, I immediately downloaded MiKTeX 2.7.2960 and did a full installation with all available packages to make sure I won’t need to install any additional stuff. The installation files took 625.5 MiB, the MiKTeX installation 1.1 GiB.

I started to learn with XeTeX right away, before knowing LaTeX. A XeTeX source file differs from a LaTeX one practically only by a few commands in the preamble for more comfortable font selection (through the fontspec package) and by the respective font selection commands in the document body. Another difference is that the XeTeX source is encoded in Unicode (UTF-8 by default; UTF-16 is also accepted), and when typesetting the document you run xelatex instead of pdflatex. Hence the URL of this blog: (Actually it is because was already taken by some inactive dude.) ;-)
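A minimal sketch of such a source file, saved as UTF-8 and compiled with xelatex. The font name is an assumption; any OpenType or TrueType font installed on the system that covers the characters you need can be used instead.

```latex
% save as UTF-8 and compile with: xelatex example.tex
\documentclass{article}
\usepackage{fontspec}     % font selection for XeTeX
\setmainfont{Charis SIL}  % assumed font; replace with any installed OTF/TTF font
\begin{document}
Unicode characters can be typed directly: ŋ ʃ ǀ ǃ ǂ.
\end{document}
```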

After a week of reading and experimenting with XeTeX I took a LaTeX beginner’s course at the University of Leipzig and learned some useful startup stuff about LaTeX, and also about the troubles with encoding and fonts that LaTeXers have to face and that are a piece of cake with XeTeX.

More and more I feel that the native Unicode support of XeTeX (at least for the Basic Multilingual Plane) and its comfortable access to the system’s OTF/TTF fonts is a real breakthrough. Many more scientists in the world can use it and no longer need to stick with MS Word, which until now has been pretty much the only widespread word-processing software to support Unicode.

The actual XeTeX homepage is at SIL, but a very nice presentation of its features (which I hereby recommend to everybody) is found at CSTUG. (There is also a video recording of the whole XeTeX presentation by its developer Jonathan Kew, which imo is not worth watching.)

I plan to write my dissertation in XeTeX, so there will be a lot of practical stuff to deal with. My field of study is linguistics, so linguists, who will certainly switch to XeTeX en masse, will find valuable information here.