General architecture of TeXmacs
The TeXmacs program is written in C++. You need g++ and the make utility in order to compile TeXmacs. Currently, the source (in the src directory) of the TeXmacs implementation is divided into the following parts:
All parts use the data structures from Basic. The graphical toolkit depends on Resource for the TeX fonts. The extension language is independent of Resource and Window. The typesetting part depends on all other parts except Prg. The main editor and the TeXmacs server use all previous parts.
The TeXmacs data are contained in the directory edit, which corresponds to the TeXmacs distribution without the source code. Roughly speaking, we have the following kinds of data:
The directory misc contains some miscellaneous data like the edit icon (misc/pixmaps/edit.xpm).
TeXmacs represents all texts by trees (for a fixed text, the corresponding tree is called the edit tree). The nodes of such a tree are labeled by standard operators which are listed in Basic/Data/tree.hpp and Basic/Data/tree.cpp. The labels of the leaves of the tree are strings, which are either invisible (such as lengths or macro definitions), or visible (the real text).
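To make the tree representation concrete, here is a minimal sketch. It is not the actual class from Basic/Data/tree.hpp: the operator names in Op, the Tree struct, and the helpers leaf, node and count_leaves are all hypothetical, chosen only to illustrate inner nodes labeled by operators and leaves labeled by strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical operator labels; the real list is in Basic/Data/tree.hpp.
enum class Op { Leaf, Concat, Document, With };

// Simplified edit tree: a leaf carries a string, an inner node carries
// an operator label and an ordered list of children.
struct Tree {
  Op op;
  std::string label;           // meaningful only when op == Op::Leaf
  std::vector<Tree> children;  // meaningful only when op != Op::Leaf
};

// Build a leaf holding the given string.
Tree leaf(std::string s) { return Tree{Op::Leaf, std::move(s), {}}; }

// Build an inner node with the given operator and children.
Tree node(Op op, std::vector<Tree> kids) {
  assert(op != Op::Leaf);  // inner nodes must carry a real operator
  return Tree{op, "", std::move(kids)};
}

// Count the string leaves below (and including) t.
std::size_t count_leaves(const Tree& t) {
  if (t.op == Op::Leaf) return 1;
  std::size_t n = 0;
  for (const Tree& c : t.children) n += count_leaves(c);
  return n;
}
```

In this sketch, a node labeled With whose children are the leaves "color", "red" and "world" would mix invisible leaves (the variable name and its value) with a visible one (the real text), under the assumption that such an operator takes its arguments in that order.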
The meaning of the text and the way it is typeset essentially depend on the current environment. The environment mainly consists of a relative hash table of type rel_hashmap<string,tree>, i.e. a mapping from the environment variables to their tree values. The current language and the current font are examples of system environment variables; new variables can be defined by the user.
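One plausible reading of "relative" is that an environment stores only the variables changed with respect to a parent environment and falls back to the parent for everything else. The sketch below illustrates that idea only. The class RelEnv, its methods, and the use of plain strings as values are assumptions and do not reproduce the real rel_hashmap<string,tree> interface.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for tree values (assumption: a plain string is enough here).
using Value = std::string;

// Hypothetical relative environment: local bindings override the parent.
class RelEnv {
 public:
  explicit RelEnv(std::shared_ptr<const RelEnv> parent = nullptr)
      : parent_(std::move(parent)) {}

  // Bind a variable in this environment only; the parent is untouched.
  void set(const std::string& var, Value v) { local_[var] = std::move(v); }

  // Look up a variable locally first, then in the chain of ancestors.
  // Returns nullptr when the variable is not defined anywhere.
  const Value* find(const std::string& var) const {
    for (const RelEnv* e = this; e != nullptr; e = e->parent_.get()) {
      auto it = e->local_.find(var);
      if (it != e->local_.end()) return &it->second;
    }
    return nullptr;
  }

  // Like find, but the caller promises that the variable exists.
  const Value& get(const std::string& var) const {
    const Value* v = find(var);
    assert(v != nullptr);
    return *v;
  }

 private:
  std::shared_ptr<const RelEnv> parent_;
  std::map<std::string, Value> local_;
};
```

With this design, a local change of the font can be made in a child environment and discarded afterwards without copying or restoring the parent's table.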
All text strings in TeXmacs consist of sequences of either specific or universal symbols. A specific symbol is any character other than '\0', '<' and '>'. Its meaning may depend on the particular font which is being used. A universal symbol is a string starting with '<', followed by an arbitrary sequence of characters different from '\0', '<' and '>', and ending with '>'. The meaning of a universal symbol does not depend on the particular font which is used, but different fonts may render it in a different way.
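The symbol rules above translate directly into a small tokenizer. The following is a sketch, not a TeXmacs routine: the function name split_symbols is hypothetical, a "character" is taken to be a single byte, and rejecting malformed input with an exception is an assumption about error handling.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Split a string into symbols. A specific symbol is one character other
// than '\0', '<' and '>'; a universal symbol is a "<...>" group whose
// interior contains none of those three characters.
std::vector<std::string> split_symbols(const std::string& s) {
  const std::string reserved("<>\0", 3);  // includes the NUL character
  std::vector<std::string> symbols;
  std::size_t i = 0;
  while (i < s.size()) {
    const char c = s[i];
    if (c == '\0' || c == '>')
      throw std::invalid_argument("reserved character outside a symbol");
    if (c == '<') {
      // The next reserved character must be the closing '>'.
      const std::size_t end = s.find_first_of(reserved, i + 1);
      if (end == std::string::npos || s[end] != '>')
        throw std::invalid_argument("malformed universal symbol");
      symbols.push_back(s.substr(i, end - i + 1));
      i = end + 1;
    } else {
      symbols.push_back(std::string(1, c));
      i += 1;
    }
  }
  assert(i == s.size());  // every character was consumed exactly once
  return symbols;
}
```

For example, the input "a<alpha>b" splits into the three symbols "a", "<alpha>" and "b", whereas "a<b" is rejected because the universal symbol is never closed.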
The language of the text is capable of performing a further semantic analysis of a text phrase. At the very least, it is capable of splitting a phrase up into words (which are smaller phrases) and informing the typesetter about the desired spaces between words and about hyphenation. In the future, additional semantics may be added to languages. For instance, spell checkers might be implemented for natural languages, and parsers for mathematical formulas or programming languages.
Roughly speaking, the typesetter of TeXmacs takes a tree as input and produces a box, while accessing and modifying the typesetting environment. The box class is multifunctional. Its principal method is used for displaying the box on a PostScript device (either the screen or a printer). But it also contains a lot of typesetting information, such as logical and ink bounding boxes, the positions of scripts, etc.
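The difference between logical and ink bounding boxes can be illustrated with a small sketch. The types Rect and Box and the function hconcat are hypothetical and far simpler than the real box class. The sketch assumes integer coordinates and horizontal placement along a common baseline.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Axis-aligned rectangle with corners (x1, y1) and (x2, y2).
struct Rect { int x1, y1, x2, y2; };

// Smallest rectangle containing both a and b.
Rect unite(const Rect& a, const Rect& b) {
  return Rect{std::min(a.x1, b.x1), std::min(a.y1, b.y1),
              std::max(a.x2, b.x2), std::max(a.y2, b.y2)};
}

// Translate a rectangle horizontally by dx.
Rect shift(const Rect& r, int dx) {
  return Rect{r.x1 + dx, r.y1, r.x2 + dx, r.y2};
}

// Hypothetical box: the logical extents drive the layout, the ink
// extents describe where something is actually drawn.
struct Box {
  Rect logical;
  Rect ink;
};

// Put boxes side by side: each box is shifted so that its logical left
// edge touches the logical right edge of what has been placed so far.
Box hconcat(const std::vector<Box>& items) {
  assert(!items.empty());
  Box result = items.front();
  int pen = result.logical.x2;  // current logical right edge
  for (std::size_t i = 1; i < items.size(); ++i) {
    const int dx = pen - items[i].logical.x1;
    result.logical = unite(result.logical, shift(items[i].logical, dx));
    result.ink = unite(result.ink, shift(items[i].ink, dx));
    pen += items[i].logical.x2 - items[i].logical.x1;
  }
  return result;
}
```

Only the logical extents decide where the next box goes. The ink extents are carried along separately, so ink that overhangs the logical box (as a slanted glyph might) does not change the spacing.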
Another functionality of boxes is to convert between physical cursors (positions on the screen) and logical cursors (paths in the edit tree). Actually, boxes are also organized into a tree, which often simplifies the conversion. However, because of macro expansions and line and page breaking, the conversion routines may become quite intricate. Notice also that, besides a horizontal and vertical position, the physical cursor also contains an infinitesimal horizontal position. Roughly speaking, this infinitesimal coordinate is used to give certain boxes (such as color changes) an extra infinitesimal width.
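As an illustration of the logical side only, a path in the edit tree can be modeled as a list of child indices. The sketch below additionally assumes that the last entry of the path is a character offset inside a string leaf. The names Node, Path and resolve are hypothetical, and the conversion to and from physical cursors is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified tree: a node without children is a leaf holding a string.
struct Node {
  std::string text;
  std::vector<Node> children;
};

// Hypothetical logical cursor: child indices leading to a leaf,
// followed by an offset into that leaf's string.
using Path = std::vector<std::size_t>;

// Follow a path from the root. On success, return the leaf and store
// the offset; return nullptr if the path does not denote a valid
// position in the tree.
const Node* resolve(const Node& root, const Path& p, std::size_t& offset) {
  if (p.empty()) return nullptr;
  const Node* cur = &root;
  for (std::size_t k = 0; k + 1 < p.size(); ++k) {
    if (p[k] >= cur->children.size()) return nullptr;
    cur = &cur->children[p[k]];
  }
  // The walk must end on a leaf, and the offset must lie inside it
  // (an offset equal to the length denotes the end of the string).
  if (!cur->children.empty() || p.back() > cur->text.size()) return nullptr;
  offset = p.back();
  return cur;
}
```

Under these assumptions, the path (1, 0, 3) means: take the second child of the root, then its first child, and place the cursor after the third character of that leaf.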
In Edit/Modify you find different routines for modifying the edit tree. Modifications go in several steps: