Default serialization

Documents are generally written to disk using the standard TeXmacs syntax (which corresponds to the .tm and .ts file extensions). This syntax is designed to be unobtrusive and easy to read, so that the content of a document can be understood in a plain text editor. For instance, the formula x + y + 1/2 + √(y+z) is represented by

<with|mode|math|x+y+<frac|1|2>+<sqrt|y+z>>

On the other hand, the TeXmacs syntax makes style files difficult to read and is not designed to be hand-edited: whitespace has complex semantics and some internal structures are not presented in an obvious way. Do not edit documents (and especially style files) in the TeXmacs syntax unless you know what you are doing.

Main serialization principle

The TeXmacs format uses the special characters <, |, >, \ and / in order to serialize trees. By default, a tree like

    f
  / | \
x1  …  xn        (1)

is serialized as

<f|x1|...|xn>

If one of the arguments x1,…,xn is a multi-paragraph tree (which means in this context that it contains a document tag or a collection tag), then an alternative long form is used for the serialization. If f takes only multi-paragraph arguments, then the tree would be serialized as

<\f>
  x1
<|f>
  ...
<|f>
  xn
</f>

In general, arguments which are not multi-paragraph are serialized using the short form. For instance, if n=5 and x3 and x5 are multi-paragraph, but not x1, x2 and x4, then (1) is serialized as

<\f|x1|x2>
  x3
<|f|x4>
  x5
</f>
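
The choice between the short and the long form can be sketched in Python. This is only an illustration, not the TeXmacs implementation: the function name serialize_tag and the representation of arguments as (text, is_multi) pairs are assumptions made for this example, and each argument text is assumed to be serialized already. The case where short arguments follow the last multi-paragraph argument is not described above, so the sketch rejects it rather than guessing.

```python
def serialize_tag(name, args):
    """Serialize tag `name` from a list of (text, is_multi) pairs.

    `text` is an already serialized argument; `is_multi` marks a
    multi-paragraph argument. Both are conventions of this sketch.
    """
    if not any(multi for _, multi in args):
        # Short form: <f|x1|...|xn>
        return "<" + "|".join([name] + [text for text, _ in args]) + ">"
    if not args[-1][1]:
        # Not covered by the rules stated in the text above.
        raise ValueError("short arguments after the last multi-paragraph one")

    lines = []
    inline = []      # short arguments waiting for the next opening line
    opener = "<\\"   # the first block opens with <\f, later ones with <|f
    for text, multi in args:
        if not multi:
            inline.append(text)
            continue
        lines.append(opener + "|".join([name] + inline) + ">")
        # Multi-paragraph arguments are indented by two spaces.
        lines.extend("  " + line if line else "" for line in text.split("\n"))
        inline = []
        opener = "<|"
    lines.append("</" + name + ">")
    return "\n".join(lines)


example = [("x1", False), ("x2", False), ("x3", True), ("x4", False), ("x5", True)]
print(serialize_tag("f", example))
```

With the five arguments above, the printed result has the same shape as the n=5 example.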

The escape sequences \<, \|, \> and \\ may be used to represent the characters <, |, > and \. For instance, α + β is serialized as \<alpha\>+\<beta\>.
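
As a rough illustration of these escape sequences, the following Python sketch escapes and unescapes the four special characters. The helper names escape and unescape are hypothetical, and the example assumes that α is held internally as the string <alpha>, which is what the serialized form above suggests.

```python
# Map each special character to its escape sequence.
_ESCAPES = {"\\": "\\\\", "<": "\\<", "|": "\\|", ">": "\\>"}


def escape(text):
    """Replace <, |, > and \\ by their escape sequences."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def unescape(text):
    """Inverse of escape(); other backslash sequences are left untouched."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in "<|>\\":
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(escape("<alpha>+<beta>"))
```

The slash, although listed among the special characters, has no escape sequence in the list above, so the sketch leaves it unchanged.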

Formatting and whitespace

The document and concat primitives are serialized in a special way. The concat primitive is serialized as usual concatenation. For instance, the text “an important note”, in which the word “important” is emphasized, is serialized as

an <em|important> note

The document tag is serialized by separating successive paragraphs by double newline characters. For instance, the quotation

Ik ben de blauwbilgorgel.

Als ik niet wok of worgel,

is serialized as

<\quote-env>
  Ik ben de blauwbilgorgel.

  Als ik niet wok of worgel,
</quote-env>

Notice that whitespace at the beginning and end of paragraphs is ignored. Inside paragraphs, any amount of whitespace is considered as a single space. Similarly, more than two newline characters are equivalent to two newline characters. For instance, the quotation might have been stored on disk as

<\quote-env>
  Ik ben de           blauwbilgorgel.


  Als ik niet wok of          worgel,
</quote-env>
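
The whitespace rules can be made concrete with a small Python sketch that reads the body of a document tag into a list of paragraphs. It is a simplification under stated assumptions: the name read_paragraphs is hypothetical, escape sequences and nested tags are ignored, and a lone newline inside a paragraph is treated like any other whitespace, which the text above does not confirm.

```python
import re


def read_paragraphs(body):
    """Split a document body into whitespace-normalized paragraphs."""
    # Two or more newlines (possibly with blank space between them)
    # separate paragraphs.
    chunks = re.split(r"\n\s*\n", body)
    # Strip each paragraph and collapse inner runs of whitespace
    # into a single space.
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


compact = "  Ik ben de blauwbilgorgel.\n\n  Als ik niet wok of worgel,"
spread = ("  Ik ben de           blauwbilgorgel.\n\n\n"
          "  Als ik niet wok of          worgel,")
print(read_paragraphs(compact) == read_paragraphs(spread))
```

Both bodies yield the same two paragraphs, which is the sense in which the two stored forms of the quotation are equivalent.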

The space character may be explicitly represented through the escape sequence “\ ”. Empty paragraphs are represented using the escape sequence “\;”.

Raw data

The raw-data primitive is used inside TeXmacs for the representation of binary data, like image files included in the document. Such binary data is serialized as

<#binary-data>

where binary-data is a string of hexadecimal digits which represents a sequence of bytes.
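
A minimal Python sketch of this encoding follows. The function names are hypothetical, and the use of uppercase hexadecimal digits is an arbitrary choice of this example rather than something stated above; the parser accepts either case.

```python
def serialize_raw_data(data):
    """Encode bytes as <#...> with two hexadecimal digits per byte."""
    # Uppercase digits are an assumption of this sketch.
    return "<#" + data.hex().upper() + ">"


def parse_raw_data(text):
    """Decode a <#...> string back into bytes."""
    if not (text.startswith("<#") and text.endswith(">")):
        raise ValueError("not a raw-data serialization")
    return bytes.fromhex(text[2:-1])


print(serialize_raw_data(b"\x89PNG"))
```

For the four bytes above, which happen to be the start of a PNG file signature, the result is <#89504E47>.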

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".