Mathematical typesetting
In this chapter we describe the algorithms used by TeXmacs to typeset mathematical formulas. This is a difficult subject, because esthetics and effectiveness do not always go hand in hand. Until now, TeX has been widely accepted as having achieved an optimal compromise in this respect. Nevertheless, we thought that several improvements could still be made, and these have now been implemented in TeXmacs. We briefly describe the motivations behind them.
In order to obtain esthetic formulas, what criteria should we use? It is often stressed that good typesetting allows readers to concentrate on what they read, without being distracted by ugly typesetting details. Such distracting details arise when distinct but similar parts of the text are typeset in a non-uniform way.
Additional difficulties may arise when considering automatically generated formulas, in which case line breaking has to be dealt with in a satisfactory way.
Unfortunately, the different esthetic criteria may conflict with each other. For instance, consider the formula $x_p + x_p^2$. On the one hand, the baselines of the scripts should be the same; on the other hand, the first subscript should not be “disproportionally low” with respect to the $x$. This dilemma cannot be solved in a completely satisfactory way without the help of a human, for the simple reason that the computer has no way of knowing whether the $x_p$ and $x_p^2$ are “related”. Indeed, if the $x_p$ and $x_p^i$ are close (as in $x_p + x_p^i$), then it is natural to opt for a common base line. However, if they are further away from each other (as in $x_p + \sum_{i=0}^{\infty} c_i x_p^i$), then we might want to opt for different base lines and locally optimize the rendering of the first $x_p$.
Consequently, TeXmacs should offer a reasonable compromise for the
most frequent cases, while offering methods for the user to make finer
adjustments in the remaining ones. Currently, we just provided the
have different sizes, then one may resize the bottom of the subscript j of the second sum to 0fn. Alternatively, one may resize the bottoms of both the i and j subscripts to (say) -0.3fn.
Notice that adjustments should preferably be structural rather than visual. For instance, one should prefer -0.3fn to -2mm in the above example, because the second option prevents you from switching to another font size for your document. Similarly, you should try not to change the semantics of the formula. For instance, in the above example, you might have added a “dummy subscript” to the i subscript of the sum. However, this would alter the meaning of the formula (and hence make it unsuitable as input to a computer algebra system). In the future, we plan to provide additional constructs in order to facilitate structural adjustments. For instance, in the case of a formula like
[formula with several subscripts and superscripts]
one might think of a construct that encloses the entire formula in an area where all scripts are forced to be double (using dummy superscripts wherever necessary).
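The preference for -0.3fn over -2mm discussed above can be illustrated with a small sketch. It assumes, for illustration only, that 1fn equals the nominal font size of the document; the helper `to_points` is hypothetical and not part of TeXmacs.

```python
def to_points(length: str, font_size_pt: float) -> float:
    """Convert a length such as '-0.3fn' or '-2mm' to points.

    Assumption for this sketch: 1fn equals the nominal font size.
    """
    if length.endswith("fn"):
        # Font-relative unit: scales with the document font size.
        return float(length[:-2]) * font_size_pt
    if length.endswith("mm"):
        # Absolute unit: 72.27 TeX points per inch, 25.4 mm per inch.
        return float(length[:-2]) * 72.27 / 25.4
    raise ValueError(f"unsupported unit in {length!r}")


# Compare the two adjustments at two document font sizes.
for size in (10.0, 20.0):
    print(size, to_points("-0.3fn", size), to_points("-2mm", size))
```

The fn adjustment doubles when the font size doubles, whereas the mm adjustment stays fixed and therefore no longer matches the size of the scripts.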
Several font parameters are crucial for the correct positioning of the different components. The following are often needed:
The following parameters are mainly needed in order to deal with scripts:
The individual strings in a font also have several important positioning properties. First of all, they always admit left and right slopes. Furthermore, they admit left and right italic corrections, which are needed for the positioning of scripts or when passing from text in upright to text in italics (or vice versa).
The following heuristics are used:
The italic corrections are not taken into account during the positioning algorithms, because this may create the impression that the numerator and denominator are not correctly centered with respect to each other. Nevertheless, the italic corrections are taken into account in order to compute the logical bounding box of the fraction (whose italic slopes vanish at both sides).
The following heuristics are used:
We take the logical right border plus the italic correction of the main argument in order to determine the right hand limit of the upper bar. The left italic correction is not needed.
The following heuristics are used:
The following heuristics are used:
The positioning of subscripts and superscripts is a complicated affair, due to the conflict between locally and globally optimal esthetics mentioned above. The base line for a subscript is determined as follows:
The base line for a superscript is determined as follows:
If both a subscript and a superscript are present, then we still have to adjust the base lines: if the top of the subscript and the bottom of the superscript are not physically separated by sep, then we move both the subscript and the superscript away from each other by the same amount. Because of step 1 in the positioning of the subscript, the base lines of double scripts will usually be the same in formulas with several of them.
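The separation rule for double scripts can be sketched as follows. The function name and the coordinate convention (vertical coordinates increasing upwards) are assumptions made for this illustration, not the TeXmacs implementation.

```python
def separate_scripts(sub_top: float, sup_bottom: float, sep: float):
    """Return the vertical shifts (subscript, superscript) needed so that
    the top of the subscript and the bottom of the superscript are at
    least `sep` apart. Coordinates increase upwards (assumption).
    """
    gap = sup_bottom - sub_top
    if gap >= sep:
        return 0.0, 0.0           # already separated: nothing to do
    shift = (sep - gap) / 2       # both scripts move by the same amount
    return -shift, shift          # subscript goes down, superscript goes up


sub_shift, sup_shift = separate_scripts(sub_top=2.0, sup_bottom=2.5, sep=1.5)
print(sub_shift, sup_shift)
```

Splitting the missing distance evenly keeps the pair of scripts centered around its original position.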
The right slope and italic correction of a script box may be nontrivial. In order to compute them, we first determine the script (or main argument) whose right limit (taking into account its italic correction) is furthest to the right (this may be the main box, in the case of a big integral with a tiny subscript). The right slope of the entire box is then inherited from this script (or main argument). The italic correction is the right offset of this script plus its italic correction, minus the logical right coordinate of the entire box; it is never taken to be less than zero. The left slope and italic correction are computed in a similar way.
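The computation of the right slope and italic correction described above might look as follows. The `Part` record and its field names are hypothetical, introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """A main argument or script inside a script box (hypothetical type)."""
    right_offset: float   # logical right edge, in the script box's coordinates
    italic_corr: float    # right italic correction of this part
    right_slope: float    # right slope of this part


def right_slope_and_correction(parts, logical_right: float):
    """Return (right slope, right italic correction) of the whole box."""
    # The part whose right limit, italic correction included, is furthest right.
    rightmost = max(parts, key=lambda p: p.right_offset + p.italic_corr)
    overshoot = rightmost.right_offset + rightmost.italic_corr - logical_right
    # The italic correction is never negative.
    return rightmost.right_slope, max(0.0, overshoot)


main = Part(right_offset=5.0, italic_corr=0.5, right_slope=0.25)
sup = Part(right_offset=7.0, italic_corr=0.25, right_slope=0.125)
print(right_slope_and_correction([main, sup], logical_right=7.0))
```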
The automatic positioning and computation of sizes of big delimiters is again complicated because of potential conflicts between locally and globally optimal esthetics.
First of all, TeX fonts come only with a discrete set of possible sizes for large delimiters. This is an advantage in that it encourages delimiters around slightly different expressions to have the same baselines. However, it has the disadvantage that delimiters are easily made “one size too large”. For this reason, we actually diminish the height and the depth of the delimited expression by the small amount sep, before computing the sizes of the delimiters.
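The effect of diminishing the height and depth by sep can be sketched as follows. The list of available sizes and the rule "pick the smallest size that covers the expression" are assumptions made for this illustration; the text only states that the set of sizes is discrete.

```python
def pick_delimiter_size(height: float, depth: float, sep: float, available):
    """Pick a delimiter extent from a discrete, ascending list of sizes.

    The height and depth of the delimited expression are each diminished
    by `sep` before the smallest sufficient size is chosen (assumed rule).
    """
    needed = (height - sep) + (depth - sep)
    for size in available:
        if size >= needed:
            return size
    return available[-1]          # fall back to the largest size


sizes = [12.0, 18.0, 24.0, 30.0]  # hypothetical delimiter extents
print(pick_delimiter_size(9.2, 9.2, 0.0, sizes))  # without the reduction
print(pick_delimiter_size(9.2, 9.2, 0.5, sizes))  # with the reduction
```

In this example the expression is only slightly taller than the 18.0 delimiter, and the small reduction by sep avoids jumping to the next size.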
Secondly, it is best when the vertical middles of big delimiters occur at the height of fraction bars. However, in a formula like
[formula with a nested fraction inside big delimiters]
it may be worthwhile to lower the delimiters a bit. On the other hand, slight vertical shifts in the middles of the delimiters potentially have a bad effect on base lines, as in
[formula with big delimiters, involving the scripts $b$, $i = 1$, $a$ and $j = 1$]
In TeXmacs, we use the following compromise: we start with the middle of the delimited expression as a first approximation to the middle of the delimiters. The real middle is obtained by shifting this middle towards the height of fraction bars by an amount which cannot exceed sep.
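This compromise amounts to a clamped shift, which can be sketched as follows. The function and parameter names are hypothetical; vertical coordinates are assumed to increase upwards, with `bottom` negative when the expression extends below the base line.

```python
def delimiter_middle(top: float, bottom: float, bar_height: float, sep: float) -> float:
    """Return the vertical middle to use for big delimiters.

    Start from the middle of the delimited expression and shift it towards
    the height of fraction bars by at most `sep`.
    """
    middle = (top + bottom) / 2
    shift = bar_height - middle            # shift needed to reach the bar height
    shift = max(-sep, min(sep, shift))     # but never more than sep either way
    return middle + shift


print(delimiter_middle(top=10.0, bottom=-4.0, bar_height=2.5, sep=1.0))
print(delimiter_middle(top=10.0, bottom=-4.0, bar_height=0.0, sep=1.0))
```

When the fraction bar height is within sep of the middle of the expression, the delimiters are centered exactly on it; otherwise they only move part of the way.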
From a horizontal point of view, we finally note that we adapted the metrics of the big delimiters so that potential scripts are positioned in a better way. For instance, according to the TeX tfm file, in a formula like
[formula with nested closing brackets and a square, involving the scripts $10$ and $i = 1$]
the square seems to be a left superscript of the second closing bracket rather than a right superscript of the first one. This is particularly annoying in the case of automatically generated formulas, where this situation occurs quite often.