Nowadays, asymptotically fast algorithms are widely used in computer algebra for computations in towers of algebraic field extensions of small height. Yet it is still unknown how to reach softly linear time for products and inversions in towers of arbitrary height. In this paper we design the first algorithm for general ground fields with a complexity exponent that can be made arbitrarily close to one from the asymptotic point of view. We deduce new faster algorithms for changes of tower representations, including the computation of primitive element representations in subquadratic time.
Let be an effective commutative ring with unity.
Here effective means that elements of
can be represented by concrete data structures and that we have
algorithms for the ring operations, including the zero test. Effective
fields are defined in a similar way.
Given a monic polynomial of degree
, the quotient ring
is
an effective ring and fast algorithms exist for the arithmetic
operations in
. More
precisely, counting in terms of operations in
, additions in
require
additions in
,
whereas multiplications can be done [15] in almost linear
time
. Here
is a standard notation for the cost of multiplying two univariate
polynomials of degree
: see
section 2.2.
Given another monic polynomial ,
we may build the effective ring
and fast
arithmetic operations in
are available for the
same reason. Doing so inductively, we obtain a tower
of extensions of
,
with
for
.
Such a tower is written
and we call
its height. Throughout this paper, we write
for the class of
in
, we let
, and
.
The
are called the defining polynomials
of the tower and
its degree. Elements
of
are naturally represented by univariate
polynomials in
over
of
degrees
. If all
are fields, then we write
instead
of
and
instead of
, for convenience.
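To make this nested representation concrete, here is a minimal illustrative sketch in Python (our own toy code, not an algorithm from this paper): an element of the i-th extension is stored as a list of elements of the previous level, and schoolbook products are reduced modulo the defining polynomial of that level. The toy tower over the rationals, the table MU and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only: naive nested representation of the toy tower
# Q ⊂ Q[x1]/(x1^2 - 2) ⊂ K_1[x2]/(x2^2 - x1).  An element of K_i is a list of
# elements of K_{i-1} (coefficients of increasing powers); K_0 = Q uses Fraction.
from fractions import Fraction

MU = [
    [Fraction(-2), Fraction(0), Fraction(1)],                     # mu_1 = x1^2 - 2 over Q
    [[Fraction(0), Fraction(-1)], [Fraction(0)], [Fraction(1)]],  # mu_2 = x2^2 - x1 over K_1
]

def zero(level):
    return Fraction(0) if level == 0 else [zero(level - 1)]

def add(a, b, level):
    if level == 0:
        return a + b
    n = max(len(a), len(b))
    a = a + [zero(level - 1)] * (n - len(a))
    b = b + [zero(level - 1)] * (n - len(b))
    return [add(u, v, level - 1) for u, v in zip(a, b)]

def neg(a, level):
    return -a if level == 0 else [neg(u, level - 1) for u in a]

def mul(a, b, level):
    """Schoolbook product in K_level, reduced modulo the monic MU[level - 1]."""
    if level == 0:
        return a * b
    prod = [zero(level - 1) for _ in range(len(a) + len(b) - 1)]
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            prod[i + j] = add(prod[i + j], mul(u, v, level - 1), level - 1)
    m, d = MU[level - 1], len(MU[level - 1]) - 1
    while len(prod) > d:                       # eliminate the leading coefficient
        lead = prod.pop()
        for k in range(d):
            c = mul(lead, neg(m[k], level - 1), level - 1)
            prod[len(prod) - d + k] = add(prod[len(prod) - d + k], c, level - 1)
    return prod

# (x1 + x2)^2 reduces to (2 + x1) + 2*x1*x2 in K_2
u = [[Fraction(0), Fraction(1)], [Fraction(1)]]    # represents x1 + x2
print(mul(u, u, 2))   # constant coefficient 2 + x1, coefficient of x2 equal to 2*x1
```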
Towers of this type naturally arise in several contexts of applied algebra, including cryptography (for instance in [2, 3, 18]), error correcting codes (for instance in [26]), and the resolution of differentially algebraic systems in the more general setting of triangular sets (for instance in [1, 12]).
By induction over , basic
arithmetic operations in
can be naively
performed in time
, for some
constant
. But it is well
known that, for a sufficiently large constant
, these operations can actually be carried out in
time
by means of Kronecker substitution (see for
instance [27, Chapter 8]). However, if many of the
individual degrees
are small, then
can become as large as
,
which leads to an overall complexity bound of the form
. The main aim of this paper is to prove the
sharper bound
in the case when
is a field and the
are irreducible and
separable. The top-level complexity result is stated in Corollary 4.11.
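As an illustration of the Kronecker substitution mentioned above, the following toy Python sketch (ours; the packing exponent and function names are illustrative, and the packed product is computed naively where a fast dense univariate multiplication would be used in practice) multiplies two bivariate polynomials by packing them into univariate ones.

```python
# Toy sketch of Kronecker substitution (illustrative only): multiply two
# bivariate polynomials by packing them into univariate ones via x2 -> x1^s,
# with s = 2*d1 - 1 where d1 bounds the degrees in x1.
def kronecker_mul(a, b, d1):
    """a, b: dicts {(i, j): coeff} with i < d1; returns the product in the same format."""
    s = 2 * d1 - 1                      # the product has degree <= 2*d1 - 2 < s in x1
    def pack(p):
        u = {}
        for (i, j), c in p.items():
            u[i + s * j] = u.get(i + s * j, 0) + c
        return u
    ua, ub = pack(a), pack(b)
    uc = {}                             # naive product of the packed univariate polynomials
    for e1, c1 in ua.items():
        for e2, c2 in ub.items():
            uc[e1 + e2] = uc.get(e1 + e2, 0) + c1 * c2
    return {(e % s, e // s): c for e, c in uc.items() if c != 0}

# (1 + x1*x2) * (x1 + x2) = x1 + x2 + x1^2*x2 + x1*x2^2
p = {(0, 0): 1, (1, 1): 1}
q = {(1, 0): 1, (0, 1): 1}
print(kronecker_mul(p, q, d1=2))        # exponent pairs (1,0), (0,1), (2,1), (1,2), coefficient 1 each
```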
In order to simplify the presentation of complexity bounds, we often use
the soft-Oh notation: means that
; see [27, Chapter 25,
section 7] for technical details. The least integer larger than or equal to
is written
;
the largest one smaller than or equal to
is written
. The
-module of polynomials of degree
is written
.
One essential tool for our new algorithms is modular composition. Before returning to our main problem, let us recall several useful existing results on this problem and various related topics.
Let be as above and assume that
, whence
.
Given
the computation of the remainder
of the Euclidean division of
by
is called the problem of modular
composition. It is equivalent to the problem of evaluating
at a point
.
Currently, the best known solution to this problem is due to Brent and
Kung [13, 56] and requires
operations in
. Here the
constant
, with
, denotes an exponent such that a rectangular
matrix can be multiplied with a square
matrix with
operations in
. Huang and Pan showed in [40]
that one may take
. Brent and
Kung's algorithm is based on the baby-step giant-step technique [50]; we recall and generalize it in section 3.
For a fixed point , the
evaluation map
is linear. The dual or transposed map is given by
and sends an -linear
functional
to the vector
. The computation of this transposed map is called
the problem of modular power projection. Using the
transposition principle (see [10] and historical references
therein), the cost of power projection and modular composition are
essentially the same. In particular, the computation of the traces
corresponds to one modular power projection.
Given an element , the
characteristic polynomial of
is the
characteristic polynomial of the multiplication endomorphism by
in
. If
is a field, then this polynomial can be computed
in softly quadratic time by means of a resultant (see for instance [45, Corollary 29]). Shoup has also designed a practical
randomized algorithm to compute minimal polynomials for the case when
has sufficiently many elements, with an expected
complexity of
.
The fastest known method to compute the characteristic polynomial $\chi$ of an element $a$ over the base ring proceeds as follows: first compute the traces $\operatorname{Tr}(a^i)$ of the powers of $a$ using modular power projection, and then recover $\chi$ by integrating the differential equation
$$\frac{\widetilde{\chi}'(x)}{\widetilde{\chi}(x)} = -\sum_{i \geqslant 1} \operatorname{Tr}(a^i)\, x^{i-1}, \qquad (1.1)$$
where $\widetilde{\chi}$ denotes the reverse polynomial of $\chi$. This integration requires the inverses of the positive integers up to $\deg \chi$ to be
at our disposal. Historically, this method goes back to Le
Verrier [43], and formula (1.1), often called
the Newton–Girard formula, expresses the relationship between
symmetric and power sum polynomials. The integration step takes softly
linear time: the seminal contributions are due to Brent and Kung [13], generalizations can be found in [8], the
first efficient extension to finite fields was designed in [9], and we refer the reader to [29, section 2] for a concise proof.
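As a concrete illustration of this trace-based reconstruction, here is a small Python sketch (ours; it applies the classical Newton identities coefficient by coefficient, which amounts to the same relation as (1.1) but runs in quadratic rather than softly linear time): the monic polynomial is recovered from the power sums of its roots using divisions by 1, ..., n only.

```python
# Illustrative sketch: recover a monic degree-n polynomial from the power sums
# p_1, ..., p_n of its roots via the Newton(-Girard) identities.
from fractions import Fraction

def charpoly_from_power_sums(p):
    """p[k-1] = sum of k-th powers of the roots; returns the coefficients of the
    monic characteristic polynomial, from the leading one down to the constant term."""
    n = len(p)
    e = [Fraction(1)] + [Fraction(0)] * n             # elementary symmetric functions
    for k in range(1, n + 1):
        s = Fraction(0)
        for i in range(1, k + 1):
            s += (-1) ** (i - 1) * e[k - i] * p[i - 1]
        e[k] = s / k                                   # needs the inverse of k
    # monic polynomial: x^n - e_1 x^(n-1) + e_2 x^(n-2) - ... + (-1)^n e_n
    return [(-1) ** k * e[k] for k in range(n + 1)]

# roots 1, 2, 3: power sums 6, 14, 36  ->  x^3 - 6x^2 + 11x - 6
print(charpoly_from_power_sums([Fraction(6), Fraction(14), Fraction(36)]))   # [1, -6, 11, -6]
```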
Recall that an element is said to be
primitive over
if
. The primitive element theorem states that
such an element
always exists: if
contains sufficiently many elements, then it suffices to
take a sufficiently generic
-linear
combination of the
. In this
light, it may seem odd at first sight to work with towers
of height
,
since an isomorphic tower of height one always exists. The problem is
that both the computation of primitive elements and conversions between
representations are usually expensive.
Concerning primitive elements, it is currently not known how to
efficiently find one, together with its minimal polynomial and polynomials
such that
for
. In
fact, naive algorithms mostly boil down to linear algebra in dimension
, and thus feature at least
quadratic costs. One may for instance appeal to the FGLM algorithm to
change orderings of Gröbner bases [21, 22].
If a modular composition algorithm (technically, for the multivariate version of section 3.2) of softly linear complexity were known over any field, then primitive element representations of towers would be computable in softly linear expected time by a randomized algorithm; conversions between towers and their primitive element representations would also become possible in softly linear time. These aspects have been studied by Poteaux and Schost in [51, 52], where subquadratic costs are achieved for multivariate modular composition and its transpose, as well as for computing primitive representations of towers.
Let represent a fixed positive rational value.
If
is the finite field with
elements, then a major result by Kedlaya and Umans [42]
states that modular composition and power projection are possible in
time
. Whenever
, then this in particular implies that
characteristic polynomials can be obtained with expected bit complexity
: see [29,
sections 2.3–2.5] for details. Based on these results, Poteaux and
Schost have derived [51] fast algorithms for
multiplication, division, traces and primitive element representations
for separable towers over finite fields. Unfortunately, very large
constant factors are hidden in the complexities of all these algorithms,
which currently prevent them from being of any practical relevance.
For some other particular cases of interest, fast algorithms also exist
for modular composition and related problems. For fixed irreducible
moduli of sufficiently smooth degree
, practically faster algorithms for
modular composition have been proposed in [39]. When
, we have also shown in [38]
that modular composition can be achieved in quasi-linear time when
computing with a sufficiently high numeric precision.
More direct approaches for algebraic tower arithmetic are usually based
on computations modulo so-called “triangular sets”. We may
see the preimage of over
as a multivariate polynomial
in
such that
is monic in
,
has degree
in
, degrees
in
for all
, and
.
The sequence
forms a special case of a
triangular set. Triangular sets and decompositions have a long
history that goes back to Ritt's contributions in differential algebra
[53].
In this context of triangular sets, it was shown in [46]
that two elements in can be multiplied in time
. In [44,
Proposition 2], Lebreton has proved the stronger bound
. This result even applies if the ideal (
) is not radical, under the
condition that
form a regular chain (in
the sense of Kalkbrener [41], see also [1] for
generalities about triangular sets). For triangular sets of certain
quite special types, we notice that even faster multiplication
algorithms have been designed in [7]. When working over
finite fields, practically efficient algorithms have also been proposed
in [17], based on fast embeddings into towers of a specific
type.
Another major occurrence of algebraic towers in the literature concerns
“dynamic evaluation”. This technique was introduced by Della
Dora, Dicrescenzo and Duval [19, 20] as a way
to compute with algebraic numbers without requiring algorithms for
polynomial factorization. Basically, the idea is to compute in as if it were a field, and cases are distinguished
whenever a zero test of an element
is requested:
the “current tower” is then explicitly decomposed into a
“direct product of two towers” such that the projections of
in these two towers are respectively zero and
invertible. The computations finally resume non-deterministically with
each of the two new towers in the role of the current tower. For recent
complexity results on this technique we refer the reader to [16].
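The following toy Python sketch (ours, assuming the sympy library; it does not reproduce the algorithms of [16, 19, 20]) illustrates the splitting mechanism in the simplest univariate situation: an attempted inversion modulo a reducible modulus triggers a gcd computation that splits the computation into branches.

```python
# Toy illustration of dynamic evaluation (D5 principle), assuming sympy.
# Inverting a modulo mu in Q[x]/(mu) either succeeds, or splits mu into factors.
from sympy import symbols, gcd, rem, div, invert, Poly

x = symbols('x')

def dynamic_invert(a, mu):
    """Return a list of pairs (modulus, inverse of a modulo that modulus)."""
    g = gcd(a, mu)
    if Poly(g, x).degree() == 0:               # a is invertible in this branch
        return [(mu, invert(a, mu, x))]
    q = div(mu, g, x)[0]                       # split: mu = g * q
    branches = []
    for factor in (g, q):
        if rem(a, factor, x) != 0:             # branches where a projects to zero are dropped
            branches += dynamic_invert(rem(a, factor, x), factor)
    return branches

print(dynamic_invert(x + 1, x**2 - 1))         # [(x - 1, 1/2)]
print(dynamic_invert(x + 2, x**2 - 1))         # [(x**2 - 1, inverse of x + 2 modulo x**2 - 1)]
```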
The efficient computation with polynomials modulo zero-dimensional ideals defined by triangular sets is an important topic in applied algebra. Obtaining softly linear complexity bounds when working over a general field is an open problem. The previous best known bound due to Lebreton [44, section 3.2] is recalled in section 2.4.
A first technical contribution of ours concerns multivariate
modular composition: in section 3.2 we design a new
baby-step giant-step algorithm to evaluate with
the same kind of complexity as the Brent–Kung algorithm for the
univariate case. It slightly improves upon the incremental method
described in [52], that actually relies on the bivariate
case. In fact, our method applies the baby-step giant-step paradigm over
the whole representation of
.
Compared to [52, Lemma 4] used with
, the complexity bound given in Proposition 3.3
does not have restrictions on the characteristic, and is asymptotically
smaller by a factor of
.
The main contribution of this paper is section 4, which is
devoted to a new deterministic algorithm to multiply in towers of
separable field extensions (all the are thus
irreducible and separable). For the first time we achieve an asymptotic
complexity
with an exponent that becomes
arbitrarily close to one. From the asymptotic complexity point of
view, and in the specific case of towers of separable field extensions,
this improves upon [44, 52], and also upon the
randomized algorithms for finite fields from [51]. For
specific towers over finite fields, some of the algorithms from [17] remain more efficient both in practice and in theory.
The main idea behind our algorithms is as follows: we use the primitive
element theorem to replace a general tower of height
by a new “accelerated” tower for which we can control the
height
. There is a tradeoff
between the cost of tower arithmetic (more expensive for larger
) and primitive element
computations (more expensive for small
).
By making a suitable compromise, we obtain our main complexity bound.
One shortcoming of our main result is that it only applies to towers of
fields. Now if an algorithm is available for factoring
polynomials in into irreducibles, then our
results generalize to towers of separable integral extensions over
, by “factoring” them
into towers of separable field extensions. This reduction and the
corresponding conversion algorithms (that generalize Chinese
remaindering) are detailed in section 5. Some particular
cases when the cost of factorization is often affordable are discussed
in section 5.5.
Our final section 6 contains a general subquadratic Las
Vegas complexity bound for computing primitive element representations
of towers of separable field extensions, along with the related data for
conversions: our algorithms directly benefit from the fast product, and
they rely on power projections as in [11, 42,
51]. Compared to [52, Lemma 4] used with , section 6 gains a factor of
asymptotically, without restrictions on the
characteristic. When working over a finite field, section 6 does not
improve upon the asymptotic complexity bounds from [51].
Nevertheless, [51] relies on the Kedlaya–Umans
algorithm for modular composition, so we expect our approach to behave
better in practice.
Let us finally stress once more that we regard our contribution as a
uniform way to prove almost optimal complexity bounds for tower
arithmetic in the algebraic complexity model. It is well known
that this model is reasonably close to bit complexity models when
working over finite fields. Over other fields such as , coefficient growth needs to be taken into
account, which leads us beyond the scope of this paper. From a practical
perspective, we also recall that efficient tower arithmetic has been
developed over various specific kinds of base fields such as finite
fields and subfields of
: see
[17, 38, 39] and section 5.5.
It is true however that actual implementations of our own algorithms
have fallen somewhat behind. We intend to address such implementation
issues in the near future.
This section gathers basic prerequisites on complexity models and polynomial arithmetic. In particular we recall Lebreton's results on multiplication in towers [44]. Then we examine the costs of divisions.
Let be an effective ring. Our complexity
analyses concern the algebraic complexity model [14,
Chapter 4] over
, and more
precisely straight-line programs and computation trees. In other words,
running times of algorithms are measured in terms of the required number
of ring operations in
, and
constants are assumed to be freely at our disposal.
It is convenient to introduce the notations and
as abstractions for the respective costs of
multiplication and division in
(whenever
defined). For simplicity we always assume that multiplication is at
least as expensive as addition, subtraction, and zero testing.
For randomized algorithms over a finite effective ring , we assume a special instruction that
uniformly generates random elements in
,
with constant cost. For a given input, the cost of a randomized
algorithm is thus a random variable. The expected cost for
input size
is defined as the maximum of the
averages of these random variables over all possible inputs of size
.
Remark or
, it is
common to use Turing or RAM machines to count the total number of
“bit operations”. The translation of our results in terms of
bit complexity is mostly straightforward, although some care is needed
for the manipulation of multivariate polynomials on Turing machines. We
refer to [36, 37] for more technical details.
Let be an effective commutative ring with unity.
We denote by
a cost function for multiplying two
polynomials
. For general
, it has been shown in [15] that one may take
$\mathsf{M}_{\mathbb{A}}(n) = O(n \log n \, \log \log n).$ \qquad (2.1)
For rings of positive characteristic, one has
by [30]. We make the following assumptions:
is a non-decreasing function in
.
is sufficiently close to linear, in the
sense that
(2.2)
holds whenever .
These assumptions hold for as in (2.1).
Notice that (2.2) is also equivalent to
. Applying equation (2.2) a finite
number of times, it follows that
for any fixed
positive integer
.
If is an effective
-algebra
that is also a finite dimensional free
-module,
then we denote by
a cost function for
multiplying two polynomials in
,
in terms of the number of operations in
,
and we make the same assumptions as for
.
Notice that we may always take
.
In particular, if is a monic polynomial in
of degree
then
is such an
-algebra
of dimension
; elements in
are represented by polynomials in
. In this case, we have
and
by means of Kronecker substitution [27,
Chapter 8].
If is a finitely generated
-algebra, an element
is
said to be primitive if
.
In this case, we write
the monic minimal
polynomial of
, so the
following isomorphism holds:
Notice that such a primitive element representation does not necessarily
exist: consider and
.
A primitive tower representation of the tower consists of the following data:
A primitive element of
over
, for
.
The minimal polynomial of
over
,
for
.
such that
,
for
and
.
These data induce the following isomorphisms for :
The polynomials are called
parametrizations of the
in terms of the
.
With the triangular set as defined in the
introduction, an element
is represented by a
polynomial
with
defined
modulo
. So the conversion of
into an element of
boils
down to evaluating
modulo
. The backward conversion consists in evaluating a
univariate polynomial
at
. Such conversions are the topic of section 3
below.
Remark and
. In the present paper, we only rely on
conversions between
and
, the natural identity isomorphism
, and combinations of these conversions.
Under the assumptions of section 2.2, recall that there
exists a constant such that
for
, and where
represents a cost function for multiplying two polynomials
in
, in terms of the number
of operations in
. When using
this bound in an iterated fashion, we obtain
By taking care of the cost of polynomial divisions in terms of multiplications, one may prove the following sharper bounds:
Proof. Taking ,
Lebreton proved that
; see
[44, section 3.2]. His proof was done for the particular
cost function
. For
completeness, we briefly repeat this proof, for a general cost function
such that
is
non-decreasing. We recall that two polynomials
of partial degrees
in the
can be multiplied in time
using Kronecker
substitution recursively; see [27, Chapter 8] and [44,
section 3.2].
We let represent the triangular set associated
to the tower, as defined in section 1.3. We identify
elements in the tower with their canonical representatives in
. Let
represent the reverse polynomial of
in
. Assume for now that we
precomputed polynomials
in
, of partial degrees
in
the
, such that
Let represent a cost function for reducing
polynomials in
of degrees
in
for
modulo
. The reduction of such a
polynomial
modulo
is
done recursively, as follows:
Let . If
, then we reduce
modulo
and we are done.
Otherwise, let be the reverse of
.
Reduce modulo
, in time
.
Compute , in time
.
Reduce modulo
, in time
.
Let , compute
, and then reduce
modulo
. It
turns out that
equals
modulo
; see for instance
[27, Chapter 9, section 1].
Altogether, the cost of this algorithm is
therefore bounded by
for some universal constant .
We deduce that
.
Now let and
be the
respective representatives of two elements
and
to be multiplied in the tower. The product
takes
,
and
is then reduced modulo
, whence
For the second bound of the proposition, we may multiply two polynomials
in as in
with
operations in
,
and then reduce their product coefficientwise modulo
with cost
, whence
.
It remains to observe that once are known, the
precomputation of
using Newton's method takes
operations in
;
see [27, section 9.1], for instance. Therefore, the
complexity bounds obtained so far remain valid up to a constant factor
when taking the cost for obtaining the
into
account. Finally, our assumption
for
implies
, so
(2.2) leads to the claimed bounds.
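The univariate prototype of the reductions used in this proof is division with remainder by a monic polynomial through a power series inverse of its reverse, which can be precomputed. The following self-contained Python sketch (ours; naive convolutions stand in for fast multiplication, and all names are illustrative) spells out this classical routine.

```python
# Sketch of division with remainder by a monic polynomial via the reversal
# trick; coefficient lists are ordered by increasing degree.
from fractions import Fraction

def mul_trunc(a, b, k):
    """Product of two coefficient lists, truncated modulo x^k."""
    c = [Fraction(0)] * k
    for i, u in enumerate(a[:k]):
        for j, v in enumerate(b[:k - i]):
            c[i + j] += u * v
    return c

def series_inverse(a, k):
    """Inverse of a (with invertible constant term) modulo x^k, by Newton iteration."""
    inv, prec = [1 / a[0]], 1
    while prec < k:
        prec = min(2 * prec, k)
        e = mul_trunc(a, inv, prec)
        corr = mul_trunc(inv, e, prec)
        inv = [2 * u - v for u, v in zip(inv + [Fraction(0)] * prec, corr)]
    return inv

def divmod_monic(f, mu):
    """Quotient and remainder of f by the monic polynomial mu."""
    n, d = len(f) - 1, len(mu) - 1
    if n < d:
        return [Fraction(0)], f
    rf, rmu = f[::-1], mu[::-1]                 # reverse polynomials
    inv = series_inverse(rmu, n - d + 1)        # reusable precomputation
    q = mul_trunc(rf, inv, n - d + 1)[::-1]     # quotient, via the reversed product
    qmu = mul_trunc(q, mu, n + 1)
    r = [u - v for u, v in zip(f, qmu)][:d]
    return q, r

f  = [Fraction(c) for c in (1, 2, 3, 4, 5)]     # 1 + 2x + 3x^2 + 4x^3 + 5x^4
mu = [Fraction(c) for c in (1, 0, 1)]           # x^2 + 1 (monic)
print(divmod_monic(f, mu))                      # quotient 5x^2 + 4x - 2, remainder 3 - 2x
```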
Remark does not play a critical role in the design of
the forthcoming algorithms. Of course any improvement of
remains relevant; the dependence on
will be made apparent in Corollary 4.11.
Remark. The assumption that all extension degrees are at least two is convenient for proving several complexity bounds. For
with
, we
notice that
does not occur in the representation
of elements in
, so
extensions of degree one can be suppressed from the tower without loss
of generality. The cost of the corresponding rewritings does not
intervene in the algebraic complexity model. On a Turing or RAM machine
this cost would actually depend on how multivariate polynomials
are represented, but it is typically at most linear in the size of the
representation.
In the remainder of this section, we assume that we work in a tower of
fields over
.
Given
such that
is
monic, it is well known [27, Chapter 9] that the quotient
and the remainder
of the
Euclidean division of
by
can be computed in time
.
It is important for us that the gcd algorithm with input returns the monic polynomial
along
with
such that
.
In this way, if
then
is
the inverse of
modulo
and
is the inverse of
modulo
. It is well known [27, Chapter 11, Algorithm 11.6] that this extended gcd can be
performed in time
.
In fact, when replacing all polynomial divisions by pseudo-divisions in
[27, Chapter 11], we avoid all divisions in , but the computed gcd
is not necessarily monic. Multiplying
,
and
with the inverse of
the leading coefficient of
,
we do obtain a monic gcd, at the expense of a single division in
. Summarizing, this shows that the
extended monic gcd can actually be computed in time
.
Now consider an extension of
, where
is a monic
irreducible polynomial of degree
.
Then the above bounds for division and gcd computations yield
When using the univariate algorithm for inverting non-zero elements in
in a recursive manner, we obtain the following
complexity bound:
and assume that
for
.
With the above assumptions and notations, and with
as in Proposition 2.4, we have
Proof. Let and
be universal constants with
for
all
and
for all
extensions
of
as
previously considered. Then we have
and we conclude by induction.
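For the univariate step of this recursion, the following small sketch (ours, assuming sympy) inverts a nonzero element modulo an irreducible monic polynomial via the extended gcd; in a tower, the coefficient divisions performed by the extended gcd over the previous level would themselves be handled by recursive calls of the same kind, which is exactly the recursion analyzed above.

```python
# Sketch of inversion in K[x]/(mu) via the extended gcd, assuming sympy.
from sympy import symbols, gcdex, rem

x = symbols('x')
mu = x**2 - 2                          # irreducible over Q, so Q[x]/(mu) is a field

def invert_mod(a, mu):
    """Inverse of a nonzero class a modulo mu, from u*a + v*mu = g with g constant."""
    u, v, g = gcdex(a, mu, x)          # extended gcd in Q[x]
    assert g.is_number and g != 0, "a is not invertible modulo mu"
    return rem((u / g).expand(), mu, x)

a = x + 1                              # represents 1 + sqrt(2)
inv = invert_mod(a, mu)
print(inv)                             # x - 1, since (1 + sqrt(2)) * (sqrt(2) - 1) = 1
print(rem((a * inv).expand(), mu, x))  # 1
```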
Let us mention that inversion modulo a general triangular set (that does not determine a prime ideal) is more subtle; see various approaches in [16, 49].
Throughout this section represents an effective
ring and
is an effective
-algebra with a given basis
. Given a polynomial
and a
point
, we study how to
compute
efficiently. We first recall the well
known baby-step giant-step algorithm in the case when
. In section 4 below, these
evaluation algorithms will be used for conversions between triangular
and primitive representations of the same algebra.
Given a polynomial and
, how can we evaluate
efficiently at
? Horner's
method uses time
. For
convenience of the reader, we now recall a well known faster algorithm
from [50] that relies on the baby-step giant-step
technique.
Algorithm
Output: .
Proof. By construction, we have
for
, whence
. This proves the correctness of the algorithm.
Step 2a requires
multiplications in
, when
computing the powers using
.
Step 2b takes
assignments in
, which come
for free in our complexity model. The matrix multiplication in step 3 can be done in time
.
Step 5 involves
multiplications and
additions in
, when using
Horner's method. Altogether, this leads to the claimed running
time.
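The following Python sketch (ours, assuming sympy; the inner products of step 3 are written as plain loops instead of a rectangular matrix product, so it does not reflect the matrix multiplication cost of the proposition) instantiates the baby-step giant-step evaluation for a polynomial with rational coefficients evaluated at an element of Q[x]/(mu).

```python
# Baby-step giant-step (Paterson-Stockmeyer) evaluation sketch, assuming sympy:
# evaluate P = sum_k p_k X^k at an element alpha of B = Q[x]/(mu) with about
# 2*sqrt(deg P) products in B, plus scalar multiplications and additions.
from math import isqrt
from sympy import symbols, rem, expand, Rational

x = symbols('x')
mu = x**3 - x - 1                          # B = Q[x]/(mu)

def bsgs_eval(p, alpha, mu):
    """p: list of rational coefficients of P; returns P(alpha) reduced modulo mu."""
    n = len(p)
    m = isqrt(n - 1) + 1                   # baby-step length, about sqrt(n)
    pows = [Rational(1)]                   # baby steps alpha^0, ..., alpha^(m-1)
    for _ in range(m - 1):
        pows.append(rem(expand(pows[-1] * alpha), mu, x))
    giant = rem(expand(pows[-1] * alpha), mu, x)     # alpha^m
    result = Rational(0)
    for j in reversed(range((n + m - 1) // m)):      # Horner scheme in alpha^m
        digit = sum(p[j * m + k] * pows[k] for k in range(m) if j * m + k < n)
        result = rem(expand(result * giant + digit), mu, x)
    return result

# P(X) = X^5 + 2X + 3 at the class of x: x^5 + 2x + 3 = x^2 + 3x + 4 modulo mu
p = [Rational(c) for c in (3, 2, 0, 0, 0, 1)]
print(bsgs_eval(p, x, mu))                 # x**2 + 3*x + 4
```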
Remark and if a monic annihilator
with
is known, then it is possible to compute
with
operations in
. Indeed, we have
, and
is computed in
time
.
Let us now study the evaluation of a multivariate polynomial of partial degree
in each
at a point
.
Setting
and writing
for
the coefficient of the monomial
in
, we have the following generalization of
Algorithm 3.1 to multivariate polynomials.
Algorithm
Output: .
Proof. This time, for all , we have
Plugging this expression into the formula of step 6 shows that the algorithm indeed returns the desired evaluation.
From we always have
. If
,
then we obtain
and then
The lower bound on also implies
whence
Otherwise straightforwardly implies
and
. In all
cases
and
are in
. Consequently the complexity
analysis is the same as for Algorithm 3.1. Of course, one
has to carefully evaluate the power products in step 3b and the sum in step 6, in order to use only
multiplications in
.
Remark in step 2 of Algorithm 3.2
may be replaced by any other positive value strictly less than
.
Let be a tower of separable algebraic extensions
of an effective base field
.
Throughout this section, we assume that the extension degrees
satisfy
for all
(which is usually not restrictive as explained in Remark 2.6),
and that
is as in Proposition 2.4.
Our main objective is to design a fast algorithm for multiplying
elements in the tower.
The basic idea is as follows. Let be an
arbitrarily small positive constant. If at least half of the
are “sufficiently large”, then
automatically becomes “small enough” to ensure
that
. Otherwise, there exist
consecutive small degrees
, and we replace the corresponding subtower of
extensions
by a primitive one
. We keep repeating these replacements until
the height of the tower becomes “small enough”. Some careful
balancing is required here, since the computation of a primitive element
for
over
can become
expensive if
gets “too large”. The
precise tradeoff will be made explicit at the end of the section.
In this section, given two effective rings and
along with a natural way to convert elements in
to elements in
,
we denote by
the cost of such conversions. If we
also have an algorithm for conversions in the opposite direction, then
we write
.
Assuming that contains sufficiently many
elements, the aim of this subsection is to construct a primitive element
representation in the sense of Definition 2.2. We thus have
to compute primitive elements
over
such that
together with the minimal polynomials of
over
for
. We also show how to convert efficiently
between each of the two representations of
.
such that
and
for some
,
for
. Then we have
Proof. The bound on the cost of conversions from
to
is a direct
consequence of Proposition 3.3 by taking the assumption
into account. For
,
the minimal polynomial of
over
may thus be converted into a minimal polynomial
over
in time
.
The computation of
requires
extra operations since
for all
.
For conversions from to
, we consider a polynomial
. We evaluate
at
in
seen as
with cost
by Proposition 3.1. We
thus obtain a polynomial
such that
.
We next need to convert the coefficients of
from
to
. Using a straightforward induction this leads
to the following cost:
Here we again use the assumptions that and
for all
.
Remark
is “sufficiently small”. More precisely, modulo suitable
precomputations, we have shown that conversions between elements of
and
can be done in time
. In various special cases, the
factor
can be further reduced.
The next proposition provides us with complexity bounds for computing
primitive tower representations by induction. We denote the cardinality
of by
.
and that a
primitive tower representation of
has already
been computed. Then
We can compute of the form
with
,
and
in time
Given and
,
we can compute
with
for
in time
In addition, if then the computations in
part a can be achieved with expected cost
by
means of a randomized Las Vegas algorithm.
Proof. Thanks to Proposition 4.1,
modulo conversions of cost between
and
, we
rewrite
into
such that
,
,
. We
thus have
Now let be a parameter that will be specified
later, and consider
. The
characteristic polynomial of
over
equals
.
Consequently the characteristic polynomial
of
over
is
-proportional to
The polynomial can be obtained as the preimage
of
whose computation costs
by a standard “divide and conquer” approach; see [5,
Lemma 7], for instance. Then the resultant may be computed in time
by [45, Corollary 26].
Regarding as an indeterminate, the polynomial
has degree
.
Therefore
may be interpolated from
values of
. The
total cost to obtain
is thus
Geometrically speaking, the system admits the
pairwise distinct solutions
in
, where
denotes the algebraic closure of
.
The polynomial
is separable if, and only if, its
roots
are pairwise distinct. There are at most
values of
for which this
is not the case, so we simply need to try
distinct values of
until
becomes separable. Testing the separability of
for
values
of
is achieved as follows.
We first evaluate for
. Using fast multi-point evaluation, this takes time
. We next test whether the
discriminant of
in
vanishes, for
; this can be
done in time
. Doing this for
packets of
values, the
overall computation can be done in time
.
This completes the proof of part a.
As to part b, for any root of
the gcd of
and
has degree
, so the
specialization property of subresultants ensures that the subresultant
in
of degree
of
and
has the form
with
coprime to
. This subresultant can be computed in time
again by [45, Corollary 26]. Since
and
have degrees
, we obtain the polynomial
modulo
in
with additional cost
. Then
from
we deduce that
. For
,
we then take
. Proposition 3.1 implies that the cost of these modular compositions is
bounded by
. This completes
the proof of part b.
If , then we use a Las Vegas
algorithm for part a: we pick
at random
in a set of size
until
is separable. Each pick is successful with probability at least
, so the expected cost is
.
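For a toy tower of height two, the resultant-based construction of this proof can be reproduced directly with a computer algebra system. The following sketch (ours, assuming sympy; for simplicity the second defining polynomial does not involve the first variable) computes the characteristic polynomial of the candidate combination x2 + t*x1 for t = 1 and checks its separability.

```python
# Illustrative sketch (assumes sympy): characteristic polynomial of the
# candidate primitive element x2 + t*x1 for Q ⊂ Q(sqrt(2)) ⊂ Q(sqrt(2), sqrt(3)),
# obtained by eliminating x2 and then x1 with resultants, for t = 1.
from sympy import symbols, resultant, discriminant, Poly

x1, x2, T = symbols('x1 x2 T')
mu1 = x1**2 - 2
mu2 = x2**2 - 3                               # mu2 does not involve x1 here, for simplicity

def charpoly_of_combination(t):
    """Characteristic polynomial in T of x2 + t*x1 over the rationals."""
    inner = mu2.subs(x2, T - t * x1)          # substitute x2 = theta - t*x1
    return Poly(resultant(inner, mu1, x1), T) # then eliminate x1

chi = charpoly_of_combination(1)
print(chi.as_expr())                          # T**4 - 10*T**2 + 1
print(discriminant(chi.as_expr(), T) != 0)    # True: chi is separable, so x1 + x2 is primitive
```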
.
Then a primitive tower representation of
can
be computed in time
. Using
a randomized Las Vegas algorithm, the computation can even be done
with expected cost
.
Proof. This directly follows from the latter
proposition, using that for all
.
Remark
The main problem with the complexity bounds of Proposition 2.4
is that the height of the tower may get as large
as
. Now if
indeed gets large, then many of the
are
necessarily small. Furthermore, as long as a product of the form
is reasonably small, the results from the previous
subsection allow us to change the given representation of
into a more efficient primitive element representation
. Repeating this operation as
many times as necessary, we reduce the height of the tower and guarantee
that all defining polynomials have sufficiently large degrees. This
process is detailed in the next paragraphs.
be a positive
integer. We define a
-accelerated
tower representation of
to consist of
the following data:
A sequence of integers .
A tower such that
for
, represented by
a sequence of defining polynomials
,
where
and
.
For , a primitive
tower representation of
seen as a tower
over
and ending with
.
Denote . If
and
,
then
. If
, then
.
. There exists
a
-accelerated tower
representation of height
.
Proof. The construction of the sequence is done by induction. For
with
, assume that we have
completed the construction up to height
.
We distinguish the following cases:
If then we set
.
Otherwise, we take maximal such that
. So either
or
.
Whenever we must have either
or
. Consequently the number
of such indices
is
constrained by
. It follows
that
. On the other hand the
number of indices
with
is necessarily
. It follows
that
. Once the indices
are known, the existence of a
-accelerated tower representation follows from
Corollary 4.4.
Algorithm
Output. A -accelerated
tower representation of
.
In order to bound the execution time of Algorithm 4.1, we
need to carefully analyze the cost of conversions between and
for
.
such that for
, we have
Proof. By Proposition 2.4, there
exists a universal constant with
By Proposition 4.1, there also exists a universal constant
such that conversions between
and
can be performed in time
for , and where we used the
assumption that
is non-decreasing in
. If
then
such conversions are actually for free. Otherwise we have
, whence
Now take and let us prove the lemma by induction
over
. For
we have nothing to do, so we assume that
.
Using the induction hypothesis, conversions between
and
are performed in time
Consequently,
The lemma follows by induction.
.
Then Algorithm 4.1 is correct and runs in time
In addition, if , then a
randomized Las Vegas version of Algorithm 4.1 has
expected cost
.
Proof. The correctness of Algorithm 4.1
is ensured by Lemma 4.7. Let us analyze the cost of step 2a for a given .
The necessary conversions for rewriting the defining polynomials of
over
can be done in time
by Lemma 4.8. If ,
then there is nothing to be done since
.
So assume that
, whence
. Then Proposition 2.7
gives
Consequently, in view of Corollary 4.4 and using assumption (2.2), the remainder of step 2a takes time
which dominates the cost of conversions since . The total cost for all
is
thus
.
Finally, if , then the
randomized variant from Corollary 4.4 leads to the claimed
expected cost.
In addition, if a -accelerated
tower representation of
is known, then we
have
Proof. The cost to obtain a -accelerated tower representation of
is given in Proposition 4.9, namely
.
Then, in order to multiply two elements in ,
we convert them into the accelerated representation, and multiply them
with cost
by Proposition 2.4. We finally convert the product back
into . By Lemma 4.8,
each of these conversions can be done in time
.
Proof. It is important to notice that constants
hidden in the “”
of Theorem 4.10 are independent of the value for
, so we may freely set
in terms of
.
Now taking
balances the contributions of
and
.
Indeed, since
by Lemma 4.7, we have
. We conclude by the latter
theorem.
Remark can actually be
avoided.
Up to now we have focused on towers of field extensions. In this
section, we turn our attention to more general towers of separable
integral ring extensions. Each extension ring is still assumed to be a
product of separable field extensions over a common ground field , which will allow us to apply the
theory from the previous sections.
We start with a description of the precise types of towers of rings that
we will be able to handle. We next explain how such towers of rings can
be “factored” into towers of fields and we analyze the cost
of the corresponding conversions. Finally, we give a complexity bound
for the computation of “tower factorizations”. We do not
claim that our approach is efficient in all cases. Nevertheless it is
expected to be so whenever the factorization process behaves as a
pretreatment with negligible cost (or when factoring polynomials over
is reasonably affordable; see section 5.5).
See the conclusion of this paper for a discussion of alternative
approaches.
Throughout this section, we still assume that the ground ring is a field, written
,
and we consider a tower
of integral ring
extensions of
. For
, we still let
denote the defining polynomial of
over
, define
,
,
etc. as before, and assume that
. We say that
is a tower of
separable integral ring extensions, or just separable for short,
if
is the field
.
There exist polynomials and
in
such that
,
for all
, where
denotes the derivative of
.
By induction over , one may
check that each
is isomorphic to a direct
product
of separable algebraic extensions of
. The projection
of
into
is
separable in the usual sense, which means that
and
are coprime for all
and
.
In terms of the triangular set defined in the
introduction, a separable tower corresponds to the situation where the
ideal
is radical over the algebraic closure
of
. In
particular, this means that
and therefore
are uniquely determined by the variety
For each , we notice that
is the projection of
onto the first
coordinates.
From a geometric point of view, one may regard computations with
elements in as computations with all zeros
in
in parallel.
Alternatively, we may regard them as computations with parameters
subject to the polynomial constraints
.
Consider two separable towers and
of the same height
and over the
same base field
. Let
and
denote the respective
defining polynomials for
and
. We say that
is a
factor of
if
. This is the case if and only if there exist
natural projections
(that naturally extend to
projections
in a coefficientwise manner) such
that:
divides
for
.
sends an element
represented by
to
, for
.
We say that is irreducible if the
are all fields or, equivalently, if the
are all irreducible.
The notation “()” stands for the empty tuple. Let be index sets of
-tuples
of integers with
and
for suitable integers and
. Consider a family
of
factor towers of
with the property that
only depends on
for all
, and write
. We say that such a family of factors forms a
tree factorization of
if the variety
is partitioned into
. We say that the factorization is
irreducible, if all its factor towers are irreducible. If
and
, then
we also write
for the defining polynomial of
over
.
It is convenient to represent such a factorization by a labeled tree
(see Figure
5.1
): the nodes are identified with the index set
,
and each node
is labeled with the algebraic extension
.
The parent of the node
with
is simply the node
.
Each individual factor
corresponds to a path from the root to a leaf.
Given and
,
let
Projecting the equality
on the first coordinates, we observe that the
projection of
, for given
, is the same for all
, and equals
, whence
.
Consequently,
forms a tree factorization of the
subtower
. From an algebraic
point of view, this means that we have a natural isomorphism
For any , we denote by
the natural projection of
onto
. Dually, the family
forms a tree factorization of the tower
, where
for . In particular, if
, then this relation becomes
This corresponds to the factorization
where we recall that stands for the defining
polynomial of
for
,
whereas
is the defining polynomial of
.
If , then (5.1)
reduces to the well-known isomorphism
from the Chinese remainder theorem. We may thus view the isomorphism (5.1) as a generalized Chinese remainder theorem. The
multi-modular representation of an element in
is simply its image under this isomorphism. The aim of this section is
to present efficient algorithms for conversions between the usual and
the multi-modular representations.
For the usual Chinese remainder theorem, efficient algorithms are well
known for carrying out the corresponding conversions [6, 23, 48]. These algorithms are based on the
technique of so-called remainder trees, for which recent
improvements can be found in [4, 10, 35].
In particular, if then the conversions from (5.4) can be carried out in time
;
see for instance [27, Chapter 10]. These fast algorithms
actually work in our context. This means that the isomorphism
from (5.2) and (5.3) can be computed with
complexity , whenever
modulo
are precomputed for
all
. In fact, if
are elements of
with natural
preimages
for
,
then the natural preimage of
can be computed as
(5.5)
using fast “linear combination for linear moduli” [27, Algorithm 10.9].
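At a single node with two coprime monic moduli, the two conversions amount to ordinary multi-modular reduction and Chinese remaindering. The following self-contained sketch (ours, assuming sympy; two moduli only, with no remainder tree, the inverted cofactors playing the role of the precomputations mentioned above) shows the round trip.

```python
# Sketch of the two conversions at one node with two coprime monic moduli:
# reduction K[x]/(mu1*mu2) -> K[x]/(mu1) x K[x]/(mu2), and Chinese remaindering
# back using precomputed cofactor inverses.  Assumes sympy.
from sympy import symbols, rem, invert, expand

x = symbols('x')
mu1, mu2 = x**2 + 1, x**2 + x + 1
mu = expand(mu1 * mu2)

c1 = invert(mu2, mu1, x)        # precomputation: (mu/mu1)^(-1) modulo mu1
c2 = invert(mu1, mu2, x)        # precomputation: (mu/mu2)^(-1) modulo mu2

def reduce_multi(a):
    """Multi-modular reduction of a modulo mu1 and mu2."""
    return rem(a, mu1, x), rem(a, mu2, x)

def recombine(a1, a2):
    """Chinese remaindering: the unique preimage modulo mu of the pair (a1, a2)."""
    lift = a1 * c1 * mu2 + a2 * c2 * mu1
    return rem(expand(lift), mu, x)

a = x**3 + 5*x + 7
a1, a2 = reduce_multi(a)
print(expand(recombine(a1, a2) - rem(a, mu, x)) == 0)   # True: the round trip recovers a
```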
The idea is to combine these “isomorphisms at nodes ” in order to compute the global
isomorphism from (5.1). For the complexity analysis, it is
convenient to introduce the following maximal normalized cost of
multiplication in any of the factors of the tree factorization:
For the time being, we may use Proposition 2.4 as a
complexity bound for . The
conversion of elements in
into elements in
is called multi-modular reduction and we may
use the following algorithm for this operation:
Algorithm
Output: .
.
Proof. The correctness is straightforward from
the definitions. We perform step 4b using fast
univariate multi-modular reduction. Let be a
universal constant such that multi-modular reduction of a polynomial of
degree
over an arbitrary effective ring
can be performed in time
.
Let us show by induction over that the algorithm
runs in time
. For
, the result is obvious, so assume
that
. By the induction
hypothesis, step 3 runs in time
. By the definition of
,
the contribution of step 4b to the complexity
is bounded by
Adding up, it follows that the algorithm runs in time .
The opposite conversion of elements in into
elements in
is called Chinese
remaindering; we may use the following algorithm for this
operation:
Algorithm
Output: such that
for all
.
Assumption: modulo
are precomputed for all
,
, and
.
.
Proof. The proof is very similar to the one of
Proposition 5.1. The polynomials
modulo
are actually used in the Chinese
remaindering of step 2 via formula (5.5).
Factoring univariate polynomials over an arbitrary effective field is generally expensive or even impossible [24,
25]. Still, if an algorithm for this task is given, then it
is interesting to study how to exploit it for the computation of an
irreducible tree factorization
of a given
separable tower
. Once such
an irreducible tree factorization is known, arithmetic and other
operations can potentially be sped up by switching to the multi-modular
representation thanks to Algorithms 5.1 and 5.2.
For the complexity analysis, the function
represents the time necessary to factor a polynomial in
, and it is again convenient to introduce the
following maximal normalized cost of factoring univariate polynomials
over any of the fields in the factor towers:
Here we notice that each of the irreducible factors
is a tower of separable fields, so we may directly use the accelerated
arithmetic from section 4 for computations over any of the
.
Algorithm
Output: the irreducible tree factorization of
.
If , then return
.
Recursively compute the irreducible tree factorization of
.
Compute using Algorithm 5.1.
Return , where
.
whenever
for
.
Proof. The cost of all factorizations in step 4a is bounded by
Together with recursive calls, the total cost of all factorizations is bounded by
Step 3 takes time ,
by Proposition 5.1, and the same step for the recursive
calls amounts to a similar complexity. We conclude by adding up these
contributions.
We are now ready to present a major corollary of this result; in combination with Algorithms 5.1 and 5.2, it reduces arithmetic in general separable towers to arithmetic in accelerated towers of fields.
can be computed in time
.
Proof. This follows from Proposition 5.3
combined with Corollary 4.11, which yields .
For certain specific fields of coefficients , factorization of univariate polynomials over
can be reasonably cheap. Let us briefly study two
particular such cases.
Even if or if
is an
algebraic number field, it may be useful
coefficients in
into
bit
complex floating point numbers and convert the results of computations
back using rational reconstruction techniques [27, Chapter
5].
A convenient framework for analyzing the “ultimate
complexity” of numeric algorithms was introduced in [38].
Concerning the factorization of complex polynomials of degree , it was shown therein that all
roots could be computed at precision
in time
for “sufficiently large precisions”
. The actual precision from
which this bound becomes relevant highly depends on the location of the
roots. As a rule of thumb, a precision
is often
required and sufficient, but we refer the reader to the seminal works of
Schönhage, Pan and others for details [47, 54].
Since is algebraically closed, towers of
separable algebraic extensions can be factored into towers of degree
one. From the “ultimate bit complexity” point of view from
[38], it follows that the ultimate bit complexity of the
conversions in Propositions 5.1 and 5.2
becomes
and so does the cost of factoring itself
in Proposition 5.3. Using our rule of thumb, we expect that
a precision
of the order of
should be sufficient in practice in order to observe these complexity
bounds. Small values of
are thus expected to be
favourable for this type of coefficients (notice that this was the least
favourable case for general coefficients!). A refined comparison between
the bit complexities of this approach and the building of accelerated
towers would require efforts that are beyond the scope of the present
paper.
This section revisits the computation of traces and characteristic
polynomials in towers of field extensions over a general field by using
fast products. Our algorithms are rather different from those of [52]: we compute traces faster, but for characteristic
polynomials and primitive elements the bottleneck lies in the baby-step
giant-step method, so the complexity exponents in terms of turn out to be the same as in [52] (notice
that we gain a logarithmic factor
).
Until the end of this section,
is a tower of
separable field extensions of degrees
and
is as in Proposition 2.4.
Let us recall how to compute traces in a tower
of separable field extensions. We let
represent
the reverse polynomial of
.
Its constant coefficient is
and the trace
function
of
over
may be computed as the first
terms of the power series
In other words, the trace map can be regarded as
a vector in
that can be computed in time
: indeed this vector represents the
matrix of
with respect
to the basis
. Given such a
vector representation, the individual evaluation of
at an element in
takes time
.
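The following small sketch (ours; naive power series arithmetic over the rationals, for a single extension defined by a monic polynomial) computes this vector of traces directly from the logarithmic derivative of the reversed defining polynomial.

```python
# Illustrative sketch: the traces (Tr(1), Tr(x), ..., Tr(x^(d-1))) in K[x]/(mu),
# read off the power series -rev(mu)'/rev(mu) = sum_{i>=1} Tr(x^i) z^(i-1),
# with naive series division over Fractions.
from fractions import Fraction

def traces(mu):
    """mu: monic coefficient list (increasing degrees); returns the d trace values."""
    d = len(mu) - 1
    rev = mu[::-1]                                    # reverse polynomial, constant term 1
    drev = [(i + 1) * rev[i + 1] for i in range(d)]   # derivative of rev(mu)
    t = []                                            # t[i-1] = Tr(x^i)
    for k in range(d - 1):
        s = -drev[k] - sum(rev[j] * t[k - j] for j in range(1, k + 1))
        t.append(s / rev[0])
    return [Fraction(d)] + t                          # Tr(1) = d

# mu = x^2 - 3x + 2 = (x - 1)(x - 2): Tr(1) = 2 and Tr(x) = 1 + 2 = 3
print(traces([Fraction(2), Fraction(-3), Fraction(1)]))   # [2, 3]
```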
More generally, is a
-algebra for all
,
and the relative trace function satisfies
are performed
by linear algorithms over
in the sense of [10] (and as is always the case in this paper). Then the
vector representation of the trace map
can be
computed in time
Proof. First we compute , as a vector in
,
for
. This computation takes
time
. We regard each
as a
-linear
map
from
to
, whose evaluation takes time
. Regarding
as a vector
, we apply the transpose of
to
in order to obtain a
new vector
that represents
. This computation takes time
thanks to the transposition principle (as explained in [10]).
We next apply the transpose of
to
and obtain a vector
that
represents
. Continuing this
way, we obtain
as a vector
in time
.
We may take advantage of accelerated towers to compute a primitive
element over
,
together with its minimal polynomial
and the
parametrizations
with
for
. We first consider the
computation of characteristic polynomials.
-accelerated
tower
for
is given,
with
, and that the
inverses of
are available in
. Then the characteristic polynomial of
over
can be computed in
time
.
Proof. Lemma 6.1 and Proposition 2.4 allow us to compute the representation of
as a vector in
in time
Using Lemma 4.8, we convert into an
element
of
in time
. By using the transposed product
in
the vector representation of the linear form
can be computed in time
. Then, as
in section 1.2, the computation of
is achieved by transposing Algorithm 3.1, so Proposition 3.1 yields the cost
We finally use the Newton–Girard identity (1.1) to
recover the characteristic polynomial of with
further cost
; see for
instance [29, section 2.4]. The conclusion follows.
By taking as in Corollary 4.11, the
expected cost of Theorem 6.2 simplifies to
, thanks to the assumption
.
We are now in a position to design a randomized algorithm with
subquadratic expected cost for computing a primitive element
representation of over
.
, that
the inverses of
are available in
, and that a
-accelerated
tower
for
is given,
with
. Then a primitive
element representation
of
over
can be computed by a randomized Las Vegas
algorithm with expected cost
Taking as in Corollary 4.11,
this cost further simplifies into
.
Proof. The characteristic polynomial of
has total degree
in
. So its
discriminant is a polynomial in
of total degree
. By the
Schwartz–Zippel lemma [55, 57], picking
at random with the
in a
set of size
leads to a primitive element with
probability
. Computing the
characteristic polynomial of
takes time
by Theorem 6.2. We can verify that
is a primitive element by checking that its
characteristic polynomial is separable with further cost
. Consequently the expected time to find a
primitive element is
.
Now assume that a primitive element has been
found, and let us consider the computation of polynomials
such that
for
. We follow Kronecker's classical deformation
technique of the primitive element; see for instance [28,
section 3.3]. For this purpose, we first introduce a new formal
indeterminate
with
,
so that
is the ring of tangent numbers over
. We aim at computing the
characteristic polynomial of
over
. Then
Notice that is already available at this point.
The computation of the vector representation of the linear form
can be done in time
by the
transposition principle. Therefore
is obtained
by transposing Algorithm 3.1. In view of Proposition 3.1, this computation takes time
At this point, we have computed the traces for
all
. We deduce the
characteristic polynomial
of
using Newton–Girard's formula (1.1), so that
It follows that . The total
cost to obtain all the
for
is
Now let be an element of
. It can be converted into an element
of
in time O(
) by Lemma 4.8, where
has partial degree
in
for
. The
expression of
in terms of
is obtained as
in time
by Proposition 3.3. By successively taking
for
, the polynomials
can all be obtained in time
Finally, thanks to , the
assumption
implies the simplified complexity
bound.
We have proved nearly optimal complexity bounds for basic arithmetic operations in towers of fields of arbitrary height. Our results are based on the newly introduced concept of “accelerated” towers. Two major problems remain open:
Do there exist algorithms of quasi-linear complexity instead of
?
It is well known that this is indeed the case for towers of bounded
height : see Propositions
2.4 and 2.7. In section 5.5,
we have singled out a few other situations when quasi-linear
algorithms do exist. We refer to [7] for yet a few
other examples.
Can one efficiently construct accelerated towers for general separable towers, without relying on univariate polynomial factorization as in section 5?
One obvious strategy is to use the dynamic evaluation paradigm [19, 20] that we already discussed in the introduction, while controlling complexity issues using techniques from [16]. We do not see any serious obstacle to a positive answer, but the technical details are beyond the scope of this paper. We intend to come back to this problem in a future work.
Another major challenge concerns efficient implementations.
Asymptotically fast algorithms based on Proposition 2.4
obviously only become interesting when is in the
range where evaluation-interpolation style algorithms are used for
univariate arithmetic. For instance, FFT-based algorithms are typically
used for
.
For such sufficiently large ,
we expect our acceleration techniques to be useful. For instance, assume
that
and
are both small,
say
. Then conversions
between
and an equivalent primitive
representation
can be done very fast.
Conversions between
and
can be done coefficientwise, so they are also fast. Without much cost,
this allows us to diminish the height
by one and
gain a constant factor
when applying Proposition
2.4.
This discussion indicates that our techniques should be of practical
interest for sufficiently large and generic base
fields
. For specific base
fields
and specific types of towers, various
alternative approaches have been proposed [17, 38,
39], and it would require a more significant implementation
effort to figure out which approaches are best; we intend to address
this issue in the near future.
Ph. Aubry, D. Lazard, and M. Moreno Maza. On the theories of triangular sets. J. Symbolic Comput., 28(1–2):105–124, 1999.
S. Baktir, J. Pelzl, T. Wollinger, B. Sunar, and C. Paar. Optimal tower fields for hyperelliptic curve cryptosystems. In Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004, volume 1, pages 522–526. IEEE, 2004.
N. Benger and M. Scott. Constructing tower extensions of finite fields for implementation of pairing-based cryptography. In M. Anwar Hasan and T. Helleseth, editors, Arithmetic of Finite Fields. Third International Workshop, WAIFI 2010. Istanbul, Turkey, June 27-30, 2010. Proceedings, volume 6087 of Lecture Notes in Comput. Sci., pages 180–195. Springer-Verlag Berlin Heidelberg, 2010.
D. Bernstein. Scaled remainder trees. Available from https://cr.yp.to/arith/scaledmod-20040820.pdf, 2004.
J. Berthomieu, G. Lecerf, and G. Quintin. Polynomial root finding over local rings and application to error correcting codes. Appl. Algebra Engrg. Comm. Comput., 24(6):413–443, 2013.
A. Borodin and R. T. Moenck. Fast modular transforms. J. Comput. System Sci., 8:366–386, 1974.
A. Bostan, M. F. I. Chowdhury, J. van der Hoeven, and É. Schost. Homotopy methods for multiplication modulo triangular sets. J. Symbolic Comput., 46(12):1378–1402, 2011.
A. Bostan, F. Chyzak, F. Ollivier, B. Salvy, É. Schost, and A. Sedoglavic. Fast computation of power series solutions of systems of differential equations. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '07, pages 1012–1021, Philadelphia, PA, USA, 2007. Society for Industrial and Applied Mathematics.
A. Bostan, L. Gonzales-Vega, H. Perdry, and É. Schost. Complexity issues on Newton sums of polynomials. Distributed in the digital proceedings of MEGA'05, 2005.
A. Bostan, G. Lecerf, and É. Schost. Tellegen's principle into practice. In Hoon Hong, editor, Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation, ISSAC '03, pages 37–44, New York, NY, USA, 2003. ACM.
A. Bostan, B. Salvy, and É. Schost. Fast algorithms for zero-dimensional polynomial systems using duality. Appl. Algebra Engrg. Comm. Comput., 14(4):239–272, 2003.
F. Boulier, D. Lazard, F. Ollivier, and M. Petitot. Computing representations for radicals of finitely generated differential ideals. Appl. Algebra Engrg. Comm. Comput., 20(1):73–121, 2009.
R. P. Brent and H. T. Kung. Fast algorithms for manipulating formal power series. J. ACM, 25(4):581–595, 1978.
P. Bürgisser, M. Clausen, and M. A. Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, 1997.
D. G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Inform., 28:693–701, 1991.
X. Dahan, M. Moreno Maza, É. Schost, and Yuzhen Xie. On the complexity of the D5 principle. In J.-G. Dumas, editor, Proceedings of Transgressive Computing 2006: a conference in honor of Jean Della Dora, pages 149–168. U. J. Fourier, Grenoble, France, 2006.
L. De Feo, J. Doliskani, and É. Schost. Fast algorithms for ℓ-adic towers over finite fields. In Proceedings of the 38th International Symposium on Symbolic and Algebraic Computation, ISSAC '13, pages 165–172, New York, NY, USA, 2013. ACM.
L. De Feo and É. Schost. Fast arithmetics in Artin–Schreier towers over finite fields. J. Symbolic Comput., 47(7):771–792, 2012.
J. Della Dora, C. Dicrescenzo, and D. Duval. About a new method for computing in algebraic number fields. In B. F. Caviness, editor, EUROCAL '85: European Conference on Computer Algebra Linz, Austria, April 1–3 1985 Proceedings Vol. 2: Research Contributions, pages 289–290. Springer Berlin Heidelberg, 1985.
D. Duval. Algebraic numbers: An example of dynamic evaluation. J. Symbolic Comput., 18(5):429–445, 1994.
J.-C. Faugère, P. Gaudry, L. Huot, and G. Renault. Sub-cubic change of ordering for Gröbner basis: a probabilistic approach. In K. Nabeshima, editor, Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, ISSAC '14, pages 170–177, New York, NY, USA, 2014. ACM.
J.-C. Faugère, P. Gianni, D. Lazard, and T. Mora. Efficient computation of zero-dimensional Gröbner bases by change of ordering. J. Symbolic Comput., 16(4):329–344, 1993.
C. M. Fiduccia. Polynomial evaluation via the division algorithm: the fast Fourier transform revisited. In A. L. Rosenberg, editor, Fourth annual ACM symposium on theory of computing, pages 88–93, 1972.
A. Fröhlich and J. C. Shepherdson. On the factorisation of polynomials in a finite number of steps. Math. Z., 62:331–334, 1955.
A. Fröhlich and J. C. Shepherdson. Effective procedures in field theory. Philos. Trans. Roy. Soc. London. Ser. A., 248:407–432, 1956.
A. Garcia and H. Stichtenoth. A tower of Artin–Schreier extensions of function fields attaining the Drinfeld–Vladut bound. Invent. Math., 121(1):211–222, 1995.
J. von zur Gathen and J. Gerhard. Modern computer algebra. Cambridge University Press, New York, 3rd edition, 2013.
M. Giusti, G. Lecerf, and B. Salvy. A Gröbner free alternative for polynomial system solving. J. Complexity, 17(1):154–211, 2001.
B. Grenet, J. van der Hoeven, and G. Lecerf. Deterministic root finding over finite fields using Graeffe transforms. Appl. Algebra Engrg. Comm. Comput., 27(3):237–257, 2016.
D. Harvey and J. van der Hoeven. Faster integer and polynomial multiplication using cyclotomic coefficient rings. Technical report, ArXiv, 2017. http://arxiv.org/abs/1712.03693.
D. Harvey and J. van der Hoeven. Faster integer multiplication using short lattice vectors. Technical report, ArXiv, 2018. http://arxiv.org/abs/1802.07932, to appear in Proc. ANTS 2018.
D. Harvey and J. van der Hoeven. Faster integer multiplication using plain vanilla FFT primes. Math. Comp., 88(315):501–514, 2019.
D. Harvey, J. van der Hoeven, and G. Lecerf. Even faster integer multiplication. J. Complexity, 36:1–30, 2016.
D. Harvey, J. van der Hoeven, and G. Lecerf. Faster polynomial multiplication over finite fields. J. ACM, 63(6), 2017. Article 52.
J. van der Hoeven. Faster Chinese remaindering. Technical report, CNRS & École polytechnique, 2016. http://hal.archives-ouvertes.fr/hal-01403810.
J. van der Hoeven and G. Lecerf. On the complexity of multivariate blockwise polynomial multiplication. In J. van der Hoeven and M. van Hoeij, editors, Proceedings of the 37th International Symposium on Symbolic and Algebraic Computation, ISSAC '12, pages 211–218. ACM, 2012.
J. van der Hoeven and G. Lecerf. On the bit-complexity of sparse polynomial and series multiplication. J. Symbolic Comput., 50:227–254, 2013.
J. van der Hoeven and G. Lecerf. Modular composition via complex roots. Technical report, CNRS & École polytechnique, 2017. http://hal.archives-ouvertes.fr/hal-01455731.
J. van der Hoeven and G. Lecerf. Modular composition via factorization. J. Complexity, 48:36–68, 2018.
Xiaohan Huang and V. Y. Pan. Fast rectangular matrix multiplication and applications. J. Complexity, 14(2):257–299, 1998.
M. Kalkbrener. A generalized Euclidean algorithm for computing triangular representations of algebraic varieties. J. Symbolic Comput., 15(2):143–167, 1993.
K. S. Kedlaya and C. Umans. Fast polynomial factorization and modular composition. SIAM J. Comput., 40(6):1767–1802, 2011.
U. J. J. Le Verrier. Sur les variations séculaires des éléments elliptiques des sept planètes principales : Mercure, Venus, La Terre, Mars, Jupiter, Saturne et Uranus. J. Math. Pures Appli., 1:220–254, 1840.
R. Lebreton. Relaxed Hensel lifting of triangular sets. J. Symbolic Comput., 68(2):230–258, 2015.
G. Lecerf. On the complexity of the Lickteig–Roy subresultant algorithm. J. Symbolic Comput., 92:243–268, 2019.
X. Li, M. Moreno Maza, and É. Schost. Fast arithmetic for triangular sets: from theory to practice. J. Symbolic Comput., 44(7):891–907, 2009.
J. M. McNamee and V. Y. Pan. Numerical Methods for Roots of Polynomials, Part II, volume 16 of Studies in Computational Mathematics. Elsevier, 2013.
R. T. Moenck and A. Borodin. Fast modular transforms via division. In Thirteenth annual IEEE symposium on switching and automata theory, pages 90–96, Univ. Maryland, College Park, Md., 1972.
M. Moreno Maza, É. Schost, and P. Vrbik. Inversion modulo zero-dimensional regular chains. In V. P. Gerdt, W. Koepf, E. W. Mayr, and E. V. Vorozhtsov, editors, Computer Algebra in Scientific Computing. 14th International Workshop, CASC 2012. Maribor, Slovenia, September 2012. Proceedings, volume 7442 of Lect. Notes Comput. Sci., pages 224–235. Springer Berlin Heidelberg, 2012.
M. S. Paterson and L. J. Stockmeyer. On the number of nonscalar multiplications necessary to evaluate polynomials. SIAM J. Comput., 2(1):60–66, 1973.
A. Poteaux and É. Schost. Modular composition modulo triangular sets and applications. Comput. Complex., 22(3):463–516, 2013.
A. Poteaux and É. Schost. On the complexity of computing with zero-dimensional triangular sets. J. Symbolic Comput., 50:110–138, 2013.
J. F. Ritt. Differential algebra. Amer. Math. Soc., New York, 1950.
A. Schönhage. The fundamental theorem of algebra in terms of computational complexity. Technical report, Math. Inst. Univ. of Tübingen, 1982.
J. T. Schwartz. Fast probabilistic algorithms for verification of polynomial identities. J. ACM, 27(4):701–717, 1980.
V. Shoup. Fast construction of irreducible polynomials over finite fields. J. Symbolic Comput., 17(5):371–391, 1994.
R. Zippel. Probabilistic algorithms for sparse polynomials. In Edward W. Ng, editor, Symbolic and Algebraic Computation. EUROSAM '79, An International Symposium on Symbolic and Algebraic Manipulation, Marseille, France, June 1979, number 72 in Lect. Notes Comput. Sci., pages 216–226. Springer Berlin Heidelberg, 1979.