Matching regular expressions |
Regular expressions naturally generalize from strings to trees and allow to test whether a given tree matches a given pattern. TeXmacs implements the primitives match? and match for this purpose, which also provide support for wildcards, user-defined grammars and more.
This function determines whether a scheme expression expr
satisfies a given pattern. It will be detailed below how to form
valid patterns. The matching routines recursively understand that
native trees match their scheme counterparts. For instance,
(match? (tree "x") "x
Given a list l of scheme expressions, a pattern with free variables and an association list of bindings, this routine determines all substitutions of free variables by values (extending the given bindings), for which l matches the pattern.
Given a list of rules of the form (:var pattern-1 ... pattern-n), this instruction defines a new terminal symbol :var for each such rule, which matches the disjunction of the patterns pattern-1 until pattern-n. This terminal symbol can then be used as an abbreviation in matching patterns. Grammar rules may be interdependent.
Valid patterns are formed in the following ways:
A leaf is only matched against itself.
In the case when lists l-1 until l-n match pattern-1 until pattern-n, their concatenation matches the pattern (pattern-1 ... pattern-n).
The wildcard :#n, where n is a number matches any list of length n. The wildcard :* matches any list, including the empty list.
This pattern attempts to bind the variable var against the expression. If var is used only once, then it essentially behaves as a wildcard. More generally, it can be used to form patterns with identical subexpressions. For instance, the pattern (frac 'x 'x) will match all fractions
x |
x |
In the case when :var is a user-provided terminal symbol (see define-grammar above), this pattern matches the corresponding grammar.
Given a
(:not pattern)
(:or pattern-1 ... pattern-n)
Negation, disjunction and conjunction of patterns.
Given lists l-1 until l-n which match pattern, their concatenation matches the repetition (:repeat pattern). In particular, the empty list is matched.
Groups a concatenation of patterns into a new list patterns. For instance, all lists of the form (a b a b ... a b) are matched by (:repeat (:group a b)), whereas (:repeat (a b)) rather matches all lists of the form ((a b) (a b) ... (a b)).
Only matches a given expression expr.
(define t '(foo (bar "x") (bar
"y") (option "z")))
matches the pattern (foo (:repeat (bar :#1)) :*), but not (foo (:repeat (bar 'x)) :*). The call (match t '(foo 'x 'y :*)) will return (((x . (bar "x")) (y . (bar "y")))).
(define-grammar
(:a a b c)
(:b (:repeat :a)))
Then the list (a b x y c a a) matches the pattern (:b :#2 :b).