Groff


Unix (and Multics) systems originally printed things with roff: they would "run off" a copy on the printer. This was replaced by nroff ("new roff") and troff (pronounced "t-roff", from "typesetter roff"). The new variants were written by Joseph Ossanna around 1973, first in assembly and then in C, and were extended and made more device-independent by Brian Kernighan in 1979. GNU roff (called 'gtroff' before coming to be known as 'groff') was the result of a herculean effort by James Clark, around 1990, to recreate the Unix typesetting system in a compatible but extended form, licensed under the GPL and written in C++.

Groff is both a frontend (in application terms) and an extremely complex language (in user terms): a frontend for a host of pre- and post-processors, and a language for producing formatted documents. On Linux systems, most people's main contact with groff is through the man pages displayed on their terminals or terminal emulators. Man pages are plain text documents in which the text is interspersed with formatting codes. As an example:

$ zcat /usr/man/man1/man.1.gz | sed -n '20,30p'
.\"
.TH man 1 "September 2, 1995"
.LO 1
.SH NAME
man \- format and display the on-line manual pages
.SH SYNOPSIS
.B man 
.RB [ \-acdfFhkKtwW ]
.RB [ --path ] 
.RB [ \-m
.IR system ]

Groff is so low-level and complicated that most work with it is done through macro packages. These simplify the typesetter's task by providing shorthand markup codes that conceal the complexity of precise specification. The codes generally sit on a line of their own, beginning with a period followed by a two-letter name, or they may be written as inline escapes. In the sample above, the first line is the end of a comment. '.TH' might be read as 'title heading'. '.LO' might be read as 'let option', apparently a form of variable assignment, though its exact purpose is unclear. '.SH' defines a 'section heading' and '.B' sets the following text in 'bold'. '.RB' sets its arguments (separated by whitespace in the source text, but joined in the output) in alternating 'r'oman ('r'egular) and 'b'old type, '.IR' alternates 'i'talic and 'r'oman, and so on.

An example of an inline escape sequence is '\fIitalicized text\fRregular text', where the backslash is the escape character, 'f' indicates a font change, 'I' selects italic, and 'R' switches back to roman. This is useful when the alternating macros do not fit and you do not want to set off a separate '.I' line.
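To make the relation between the alternating-font macros and the inline escapes concrete, here is a minimal sketch in Python. It is only a simplified model, not how groff or the man macros are actually implemented: the function name expand_alternating is made up for this illustration, and quoted arguments, italic correction and hyphenation are all ignored.

```python
# Simplified model of the man macros' alternating-font requests.
# Assumption: only the fonts R (roman), B (bold) and I (italic) are handled,
# and arguments are split on plain whitespace (no quoting support).
FONT_ESCAPES = {"R": r"\fR", "B": r"\fB", "I": r"\fI"}


def expand_alternating(line):
    """Rewrite a two-font macro line (.RB, .IR, ...) using inline escapes."""
    name, *args = line.split()
    fonts = name.lstrip(".")
    if len(fonts) != 2 or any(f not in FONT_ESCAPES for f in fonts):
        raise ValueError(f"not an alternating-font macro: {name}")
    pieces = []
    for index, arg in enumerate(args):
        # Arguments 0, 2, 4... take the first font; 1, 3, 5... the second.
        pieces.append(FONT_ESCAPES[fonts[index % 2]] + arg)
    # The arguments are joined without spaces, and the text ends in roman.
    return "".join(pieces) + r"\fR"


print(expand_alternating(r".RB [ \-m"))     # \fR[\fB\-m\fR
print(expand_alternating(".IR system ]"))   # \fIsystem\fR]\fR
```

The two calls use the '.RB' and '.IR' lines from the man page excerpt above. A single-font line such as '.B man' is rejected on purpose, since it does not alternate.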

If this seems related to TeX, HTML, SGML, XML, and a host of other markup and typesetting languages, that's because it is. Groff is one member of this family, but it is relatively small (compared to TeX), long-proven (compared to XML), and specially designed for and suited to "the Unix way" of doing things. That being said, it seems to be out of favor compared to the latest frenzy over x[ht]ml.

(As a sidenote on TeX, LaTeX is a popular macro package for TeX which bears much the same relation to TeX as the 'man', 'mdoc', 'mm', and 'me' macro packages do to groff.)

Groff works as a pipeline, sometimes a frighteningly long one. It is possible to invoke every stage in excruciating detail, or simply to run zcat manpage | groff -Tlatin1 -man | less, but there are also more specialized frontends that let you type man manpage and be done with it. One may also plug tbl, eqn or pic into the pipeline (or have groff invoke them) to format tables, mathematical equations or pictures. An interesting element of groff is its 'device' options, which can output documents (even man pages) to PostScript (with further processing to PDF possible), to HTML (with remarkably good markup for an automatic converter), or to various other formats, for example to get a quick and dirty graphical preview with gxditview.
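As a rough illustration of how the device, macro package and preprocessor choices end up on one groff command line, the following Python sketch only builds the argument list and does not run anything. The helper name groff_command and the file names are hypothetical; the switches used (-T for the device, -m for the macro package, and -t, -e, -p to have groff run tbl, eqn and pic itself) are standard groff options.

```python
import shlex

# Preprocessors mentioned in the text, mapped to the groff switches
# that make groff invoke them itself.
PREPROCESSOR_FLAGS = {"tbl": "-t", "eqn": "-e", "pic": "-p"}


def groff_command(source, device="ps", macros="an", preprocessors=()):
    """Return an argv list for groff; nothing is executed here."""
    # "-m" + "an" gives the familiar "-man"; "-m" + "s" gives "-ms".
    argv = ["groff", f"-T{device}", f"-m{macros}"]
    for name in preprocessors:
        argv.append(PREPROCESSOR_FLAGS[name])
    argv.append(source)
    return argv


# Hypothetical input files, used only for illustration.
print(shlex.join(groff_command("man.1", device="latin1")))
# groff -Tlatin1 -man man.1
print(shlex.join(groff_command("report.ms", macros="s",
                               preprocessors=("tbl", "eqn"))))
# groff -Tps -ms -t -e report.ms
```

The resulting list could be handed to subprocess.run on a system where groff is installed. Switching the device argument to "html" or "ps" is all it takes to target a different output format.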

With the power of macros, groff can be turned into a quite readable and easier-to-use system. For instance, the 'mom' macros take advantage of groff lifting the two-character limit that earlier *roffs placed on command names, and produce almost plain-English markup.

For further information, the documentation and data in the groff package are invaluable, especially the section 7 manual pages. There is also quite a bit of material on the web.

External links