|
Modular composition is the problem to compute the composition of two univariate polynomials modulo a third one. For polynomials with coefficients in a finite field, Kedlaya and Umans proved in 2008 that the theoretical bit complexity for performing this task could be made arbitrarily close to linear. Unfortunately, beyond its major theoretical impact, this result has not led to practically faster implementations yet. In this article, we explore how polynomial factorization may help modular composition. Most of our algorithms assume that the modulus is fixed, which allows us to precompute factorizations of the modulus over suitable extension fields. This significant restriction with respect to Kedlaya and Umans' algorithm is acceptable whenever we need to perform many compositions modulo the same modulus. For polynomials over a finite field, we show that compositions modulo a fixed modulus of composite degree are cheaper than general modular compositions. In some very favourable cases of fixed moduli of very smooth degree, we show that the complexity even becomes quasi-linear. |
Let be an effective field. This means that
elements of
can be represented on a computer and
that we have algorithms for performing the field operations. Given
polynomials
, the problem of
modular composition is to compute
modulo
. Modular composition
is an important problem in complexity theory because of its applications
to polynomial factorization [29-31]. It also
occurs very naturally whenever one wishes to perform polynomial
computations over
inside an algebraic extension
of
. Given two different
representations
of an algebraic extension of
, the implementation of an
explicit isomorphism boils down to modular composition as well.
In this paper, we study the problem of composition modulo a fixed
polynomial mostly in the case when
is a finite field. We assume that
is separable; the case of moduli of the form
is
studied in a separate paper [24]. Our results are based on
the following simple observation: if a factorization
is known, then composition modulo
reduces to
composition modulo the
with
. Curiously, this
observation does not seem to be exploited in the standard literature on
modular composition. In the case when
is
irreducible over
, but
admits a non trivial divisor
, then the second crucial observation is that
factors over
.
We may then apply the first observation in order to obtain an efficient
algorithm for composition modulo
.
Finding the factorizations of
over
can be quite expensive in general, but such computations
can be regarded as pre-computations if the modulus
is fixed.
Besides modular composition, we also study the related problem of
computing the characteristic polynomial
of
modulo
.
More precisely, we understand
to be the
characteristic polynomial of the multiplication endomorphism by
in
. In
particular, we have
modulo
. Theoretically speaking, the fastest current
method relies on the computation of so-called power projections
(see for instance [16, section 2.5]); this task turns out
to be of the same complexity as modular composition in all known cases.
Denote by the number of operations in
required to multiply two polynomials of
. Let
and
be polynomials in
of degrees
,
and
. The naive modular composition
algorithm takes
operations in
. In 1978, Brent and Kung gave an algorithm
with cost
: their algorithm
from [5, section 2.1] actually concerns the case when
, but it naturally extends to
general
. It uses the
baby-step giant-step technique due to Paterson and Stockmeyer
[40], and even yields a sub-quadratic cost
when using fast linear algebra (see [28, p. 185]). The
constant
is such that a
matrix over
may be multiplied with another
rectangular matrix in time
. The best current bound
is
due to Huang and Pan [26, Theorem 10.1].
When linear algebra benefits from very fast implementations, its
contribution is expected to be smaller than the other polynomial
operations, on a certain bounded range for .
In fact, for fixed values of
and
, the sizes of the “baby” and
“giant” steps may be optimized in order to balance cost
contributions of matrix and polynomial operations (this was studied for
finite fields in an unpublished preprint of Shoup and Smolensky in
1992). Further improvements have been proposed in [27],
based on the Lagrange inversion formula for the reversion of
formal power series.
A major breakthrough has been achieved by Kedlaya and Umans [30,
31] in the case when is the finite
field
. For any positive
, they have shown that the
composition
modulo
can
be computed using
bit operations. Unfortunately,
it remains a major open problem to turn this theoretical bit complexity
bound into practically useful implementations.
In the special case of power series composition (i.e. when
), the best known complexity
bound is again due to Brent and Kung: in [5], they showed
that this requires
operations in
, under the condition that
is invertible and that the characteristic is at least
, where
.
The variant proposed by van der Hoeven [20, section 3.4.3]
removes the condition on
.
For fields of small characteristic, Bernstein [1] proposed
an algorithm that is softly linear in the precision
but linear in the characteristic. These algorithms are generalized to
moduli
of the form
in
[24]; we show there that the composition reduces to one
power series composition at order
in
, plus
compositions modulo
, and one
characteristic polynomial computation modulo
. Let us finally mention that series with integer,
rational or floating point coefficients can often be composed in
quasi-linear time as well in suitable bit complexity models, as shown by
Ritzmann [43]; see also [21].
The expression is linear in
. It is well known that the transposition of
the application
corresponds to the power
projection task (see section 2.7), which is an
important ingredient for computing minimal and characteristic
polynomials. In [46], Shoup studied the computation of
minimal polynomials in algebraic extensions of the form
or
,
explicitly given by defining polynomials. He designed fast practical
algorithms with low memory consumption, based on a smart combination of
“baby-step giant-step” and transposed algorithms. However
his method does not improve upon Brent and Kung's one from the
asymptotic complexity point of view.
The characteristic polynomial of modulo
may be obtained from suitable power projections of
modulo
thanks to the
well-known Newton–Girard identities, which involve
solving a first order differential equation to precision
. This is rather elementary when the
characteristic is zero or sufficiently large. Otherwise,
-adic arithmetic is needed. A complete solution
is described in [16]. More generally, a framework for using
-adic arithmetic to solve
ordinary differential equations in positive characteristic may be found
in [34].
The aim of this article is to show how suitable factorizations of the
modulus can be exploited to speed up
compositions modulo
, as well
as the computation of characteristic polynomials modulo
. Most of the new results are derived from the
following simple observation: if
splits into
linear factors in
, and if
its roots are given, then modular composition basically reduces to
evaluating
at the roots of
, evaluating
at these
values of
, and interpolate
. It is well-known that each
of these steps can be done in softly linear time using multiple point
evaluation and interpolation. More generally, whenever
can be factored into
, the
computation of
reduces to the computations of
for
.
Of course, the existence of factorizations of
heavily depends on
itself and on fields over
which we allow ourselves to factor
.
For instance, if
, then we
might consider computing the roots of
in
using a sufficient precision, or factoring
over the
-adic
numbers
for some well chosen prime number
. If
is a
finite field, and
is an irreducible polynomial
of composite degree
, then we
may consider factorizations over the intermediate field
. In a separate paper, we study the case when
is the field of computable complex numbers [25]. In this paper, we mainly focus on the finite field case.
Concerning the organization of this paper, we first revisit several
known techniques in section 2. This includes the
introduction of cost functions for modular composition, power
projection, and the computation of characteristic polynomials. In
section 3, we examine how modular composition can be
accelerated when a factorization of in
is given. More precisely, if the irreducible factorization
of the modulus is available, then our method
reduces the composition modulo
to several
compositions modulo
in softly linear time. A key
ingredient, reused several times in the article, is the simultaneous
computation of characteristic polynomials and modular compositions.
In section 4, we turn our attention to the specific
situation of an irreducible modulus of composite
degree
over a finite field
. We show how to exploit the existence of
factorizations of
over the intermediate fields
into factors of degree
.
The natural generalization to degrees with
will be the subject of sections 5, 6 and 7. One important question is how to
represent the elements of the intermediate fields
and it is convenient to introduce the special concept of an
effective algebraic tower for this purpose. We also introduce
the notion of a composition tower for
, which formalizes the requirement that we are given
factorizations of
over each of the intermediate
fields
. In section 5,
we generalize the algorithm from section 4 to arbitrary
composition towers. In section 6 we also give a detailed
complexity analysis in the case of triangular towers when each
intermediate field
admits the form
for suitable
.
Section 7 is dedicated to primitive towers, in which case
each is generated by a single primitive element
over
.
If the
are pairwise coprime (see section 7.4), then the field
can be taken to
be the composed product of
,
and computations in the tower become particularly efficient: in the case
when
is “super-smooth” (in the sense
that the largest prime-power divisor
of
satisfies
),
we will show that composition modulo
can be done
in quasi-linear time (modulo precomputations). In this very particular
situation, our method is asymptotically faster than Kedlaya–Umans'
algorithm. If
admits a small characteristic
and
, then
a similar result holds when using so called Artin–Schreier towers
(see section 7.5). For general smooth
, one may also consider nested towers (see
section 7.3) for which the primitive elements
are compositions of polynomials of degrees
over
. The
existence of such towers for given
and
is an interesting open problem, with a generally positive
answer in practice.
Our main complexity bounds for modular composition are summarized in
Table 1.1. In this table, is a
fixed irreducible polynomial of degree
and
. The entries correspond to the
various types of towers that are considered in sections 6
and 7, while assuming that all necessary precomputations
that depend on
have been done.
This leaves us with the issue of how to conduct the precomputations.
This is the subject of section 8, where we analyze the cost
of building composition towers of various types. We will see that the
construction of composition towers for prescribed composite extension
degrees can usually be done fast. On the other
hand, building composition towers for a prescribed modulus
is of the same order of difficulty as factoring
over the intermediate fields
,
or finding a root of
in
for a given representation of the elements of
. Practical algorithms for this task are well known
but their cost is quadratic in
;
see [38] for recent theoretical complexity bounds.
|
||||||||||||||||||
Whether our approach leads to competitive algorithms for modular
composition thus depends on the question whether we require the
factorization of to be part of the complexity or
not. Indeed, if we are doing a large polynomial computation over
in the algebraic extension
, then we typically need to perform many modular
compositions
for different
and
, but for a fixed modulus
. In such cases, it is
reasonable to regard the factorization of
as a
precomputation. Furthermore, if we want to perform computations in a
large finite field extension
and if we are free
to select a suitable representation for elements of this finite field,
then we may build a modulus
with
using dedicated algorithms; from the asymptotic complexity
point of view, these algorithms are usually faster than finding an
irreducible modulus at random.
In fact, we recall that testing the irreducibility of
in
reduces to
modular
compositions in degree
over
, plus
bit operations
(see [31, section 8.2], based on Rabin's algorithm [41]). In practice, for the fast construction of an irreducible
polynomial of degree
over
, it is recommended to use Shoup's algorithm [45], which uses an expected number of
arithmetic operations in
. If
is much smaller than
, then one may also resort to Couveignes and
Lercier's algorithm [9]. Theoretically speaking, this
algorithm admits an expected bit complexity
, but it relies on Kedlaya and Umans' algorithm for
modular composition.
Recall that an effective ring is a ring
with unity whose elements can be represented on a computer and such that
we have algorithms for performing the ring operations. Effective fields
and effective algebras over an effective ring
are defined similarly.
Given an effective ring ,
algebraic complexity models express running times in terms of
the number of operations in
.
Unless otherwise stated, we will analyze the costs of the algorithms in
this paper in this way. More precisely, our results both apply for the
straight-line program and computation tree models [6,
chapter 4].
For randomized algorithms over a finite effective ring , we assume a special instruction that
uniformly generates random elements in
.
For simplicity we assume that this instruction has constant cost. For a
given input, the cost of an algorithm is thus a random variable. The
expected cost of an algorithm for input size
is defined as the maximum of the averages of these random variables over
all possible inputs of size
.
When working over the finite field with
elements, we may also analyze the costs of algorithms
in the bit complexity model, which relies on Turing machines
with a sufficient number of tapes [39]. We will not
explicitly consider this model in our paper, but most of our complexity
bounds can easily be converted to this setting.
Let be an effective ring with unity, let
, and denote
Given a polynomial or power series and
, it is convenient to write
We write for a cost function such that two
polynomials in
can be multiplied using
operations in
.
The schoolbook algorithm allows us to take
.
The fastest currently known algorithm [7] yields
. Here, the soft-Oh
notation
means that
; see [13, section 25.7] for technical
details. If
is a field of finite characteristic,
then it has been shown [18, 19] that
, where
denotes the iterated logarithm function. In what follows, we
will always assume that
is an increasing
function in
. This customary
assumption implies the super-additivity of
, namely
for all
and
.
More generally, if is an effective
-algebra, then it is sometimes convient to
denote by
a cost function such that two
polynomials in
can be multiplied using
operations in
.
Let be an effective field. The
remainder (resp. quotient) of the euclidean division
of
by
in
is denoted by
(resp. by
). For a fixed modulus of degree
, euclidean divisions by
are usually performed by computing a
pre-inverse
of
. More precisely,
is the
inverse of
in
,
computed at precision
. Given
, one obtains the quotient
by multiplying
with
and the remainder as
.
Given
we may thus compute the modular product
using
operations in
.
We recall that the greatest common divisor of two polynomials of degrees
at most over
can be
computed using
operations in
[13, Algorithm 11.4].
Let and consider
points
. Then the evaluations
can be computed using
operations in
[13, chapter 10].
This operation is also called multipoint evaluation. The
inverse operation is the interpolation, which consists of
recovering
from
;
it can be performed with a similar cost. If the
are fixed, then it is often possible to gain a factor
using FFT trading [22].
More generally, if are polynomials with
, then all the remainders
can be computed using
operations
in
. The inverse problem,
called Chinese remaindering, again admits the same complexity
.
Let be an effective ring. Given a bivariate
polynomial
, we define its
bidegree to be the pair
with
and
. Using
Kronecker substitution [13, section 8.4], two polynomials
of bidegree
may be multiplied using
ring operations in
.
If is a monic polynomial of degree
in
, and if
and
are two polynomials
in
then their canonical preimages in
may be multiplied with
operations
in
before being projected into
with
additional operations. Consequently each
ring operation in
in degree
reduces to
operations in
. If
is monic in
then
also takes
operations in
.
The computation of bivariate subresultants usually relies on fast evaluation/interpolation, as in the following well known proposition.
be an effective field with
elements. Any polynomial subresultant in
of
two polynomials
and
in
of bidegrees
can be
computed using
operations in
.
Proof. The subresultant polynomial of degree
of
and
in
has degree
in
. It
can be computed by evaluating
and
at
values for
in
, computing
subresultants in
of degree
, and interpolating the
coefficients of
. In total
this costs
.
However for the sake of generality we will rely on the following result.
be an effective
field. Any polynomial subresultant in
of two
polynomials
and
in
of bidegrees
,
with the corresponding Bézout relation, can be computed using
operations in
that
comprise at most
inversions in
.
Proof. This result corresponds to [36,
Corollary 26]. The number of inversions comes from the fact that the
underlying algorithm only needs to perform exact divisions by
subresultant coefficients in .
Each division requires to invert the leading coefficient of the divisor.
There exist at most
such leading
coefficients.
Let be the finite field with
elements. One way to represent elements of a finite field extension
is as remainder classes of polynomials in
modulo a monic reducible polynomial
of degree
. We write
in order to emphasize that we use this
representation. Multiplication in
can be done
using
operations in
. Given an element
of degree
over
,
we write
for the subfield of
generated by
over
,
where we understand that elements in
are
represented as evaluations of polynomials in
at
.
For the bulk of the algorithms in this paper, we will work over the
finite field . In that case,
it can be shown that two polynomials in
can be
multiplied in time
on a Turing machine with a
sufficient number of tapes [19, section 8.1]. The algebraic
complexity bounds in this paper are easy to adapt to this model: it
mainly suffices to replace
by
in all bounds.[update references and bounds]
The constant represents a feasible
exponent for the multiplication cost of matrices: two square
matrices of size
can be multiplied using
operations in their coefficient ring. The constant
is defined similarly but for multiplying a
matrix by a
rectangular one.
At present time the best known bound
is due to
Le Gall [35]. This naturally yields
. However the latter bound does not improve upon the
earlier bound
due to Huang and Pan [26,
Theorem 10.1].
Let be an effective ring. Let
an effective
-algebra of
dimension
whose elements are represented by
vectors of size
in a given basis. We introduce
the following cost functions:
: the cost of computing
the modular composition
,
where
is a monic polynomial of degree
, and
.
: the cost of computing
the modular composition
in terms of
operations in
, where
is a monic polynomial of degree
, and
.
: the cost of computing
the characteristic polynomial
of
modulo a monic polynomial
of degree
. This is the
characteristic polynomial
of the
multiplication endomorphism by
in
.
: the cost of modular
power projections, i.e. the cost to compute
, where
is monic of degree
and
is an
-linear form on
.
Let us recall a few known results about these functions.
be a monic polynomial of degree
over a ring
,
and let
. The composed
polynomial
may be computed with
operations in
, or
or
operations in
.
Proof. The first bound is immediate. The proof
of the second bound is detailed in [13, section 12.2].
For a fixed monic polynomial in
of degree
, the modular
composition
is a linear operation in
. For
and
of degrees
,
the corresponding transposed application is precisely the operation of
modular power projections. If a modular composition algorithm with cost
can be transposed in the sense of [3],
then this leads to a power projection algorithm with cost
.
be a monic polynomial of degree
over a ring
,
and let
. The
characteristic polynomial
of
modulo
can be computed using
operations in
, including divisions in
(the partial division in
is supposed to
be implemented), or
operations in
, if
is a field, or
operations in
, if
is a field with
elements, or
operations in
, if there exist given inverses of
in
.
Proof. See [24, section 2.2].
Let be an effective field. A monic polynomial
is said to be separable if
. We may use the following algorithm for
composition modulo
whenever
is separable and its
pairwise distinct roots
in
are given:
Algorithm
Compute using fast multi-point
evaluation.
Compute using fast multi-point
evaluation.
Retrieve with
using fast interpolation.
Return .
operations in
.
Proof. By construction,
for
. Since
and the
are pairwise distinct, it follows that
. This proves the correctness
of the algorithm. The complexity bound follows from the fact that each
of the steps 1, 2 and 3 can be performed in time
.
Let again be a general effective field. The
algorithm from the previous section may be generalized to composition
modulo a polynomial
that can be factored
partially as
in
,
where the polynomials
are pairwise coprime
(although not necessarily irreducible).
Algorithm
Use a multi-remainder algorithm to compute , for all
.
For all , compute the
characteristic polynomial
of
modulo
.
Use a multi-remainder algorithm to compute , for all
.
For all , perform the
modular composition
.
Use Chinese remaindering to compute in
of degree
such
that
for all
.
Return and
.
operations in
,
where
.
Proof. For all ,
the Cayley–Hamilton theorem gives us
,
which implies
, whence the
correctness of
. The
correctness of the characteristic polynomial of
follows from the usual isomorphism of
-algebras
.
The costs of steps 1, 3, 5, and 6 are .
Step 2 costs
, and step 4
takes
operations in
.
Example
Let be a finite field of characteristic
such that
with
and
odd. Let
be an element of order
.
Let
be a multiple of
having all its prime factors dividing
but not
. Then the polynomial
factors into
irreducible
polynomials of degrees
.
The irreducible factors may be described explicitly.
For example, with ,
,
,
,
,
, the
polynomial
factors into irreducible polynomials
of degree
. Taking
instead leads to
and
.
Let still be an effective field and assume that
we wish to compute a modular composition
,
where
and
is monic. Let
us study what happens if the polynomials
and
to be composed have degrees larger than
. We clearly have
and we may compute using
operations in
[13, Exercise 9.16].
Without loss of generality we may therefore assume that
.
If exceeds
,
then it suffices to perform
modular
compositions:
This requires additional compositions modulo
in size
,
plus
operations in
.
Alternatively, given a polynomial with
, the following formula provides us
with a more efficient way to reduce the degree of
:
Taking to be the characteristic polynomial
of
modulo
, its computation usually admits a similar cost
as modular composition. Therefore it is worth using this method unless
. This is actually one key
ingredient for the upcoming algorithms for modular composition: in order
to reduce a composition modulo
to compositions
modulo a factor
of
,
we in particular need to compute the characteristic polynomial of
modulo
.
At the end of the recursive calls, one should nevertheless keep in mind
that we only need annihilating polynomials, so that we may also use
minimal polynomials. Shoup has given a probabilistic
algorithm for computing minimal polynomials [46], which is
useful for actual implementations.
In the case when we wish to compute a composition modulo an irreducible
polynomial , we cannot apply
the algorithms from sections 3.1 and 3.2.
Nevertheless, it might happen that
admits a non
trivial factorization over an algebraic extension of
. This generically happens when
is a finite field and
is
composite. Indeed, we recall the following well known result.
be a monic
irreducible polynomial in
of degree
, and let
be an integer dividing
.
Then there exist an irreducible polynomial
of
degree
, and a polynomial
of bidegree
,
monic in
, such that the
irreducible factorization of
over
is exactly
.
Proof. Since is
irreducible,
is isomorphic to
, which contains
.
We may thus take a generator
of the image of
in
,
and write
its minimal polynomial over
. Let
be
the class of
in
and let
be its monic minimal polynomial over
. Then
divides
and so do its conjugates
for
. On the
other hand, since
, we have
, so any root of
is a root of one of the
,
which proves the equality
.
We call factorizations as in this proposition “normal
factorizations”. This concept can actually be defined over
arbitrary fields, as follows. Let be a monic
separable polynomial in
, let
be a divisor of
,
and let
be a monic separable irreducible
polynomial of degree
. We set
, and write
for the class of
in
. We say that
admits a
normal factorization over
if there
exists a bivariate polynomial
that is monic in
, of bidegree
, and such that
factors
into
over
,
with
and
coprime
whenever
. We call
the normal factor of
over
. Notice that the polynomials
and
are not required to
be irreducible here.
Example ,
is
irreducible, and we have
.
For
we take
and present
as
.
Then
factors over
as
where are the two roots of
in
. More precisely for
we may take the class of
in
, whereas
. The normal factor of
is thus
.
Example the situation is different from the case of
finite fields. For instance
is irreducible, but
it remains irreducible over
,
, etc. Nevertheless, for a
prescribed extension degree
,
we may randomly pick an irreducible
of degree
and a polynomial
of
bidegree
that is irreducible as a polynomial in
, and build
as
. If
is separable, then it is irreducible in
,
and
is a normal factor of
over
.
Assume that admits a normal factorization as
above. Then the Chinese remainder theorem yields a natural isomorphism
and we may define as the unique polynomial in
that satisfies
and
. We may now adapt the algorithm
from section 3.2 as follows:
Algorithm
Compute the remainder in
.
Compute the characteristic polynomial of
modulo
in
.
Compute in
.
Perform the modular composition in
.
Return and
.
Proof. We first observe that
It follows that , whence
is computed correctly. As to the characteristic
polynomial of
, the argument
is the same as for Algorithm 3.2, thanks to the Poisson
formula
.
Now the multiplication of two polynomials in of
degree
using Kronecker substitution requires
operations in
.
This way, steps 1 and 3 take
operations in
. Steps 2 and 4 respectively cost
and
operations in
. The computation of
in the last step may be done naively using
operations in
.
The computation of
requires
further operations by Proposition 2.2.
with
irreducible, for
and
, the modular composition
can be computed using
operations in
.
Proof. We simply apply Algorithm 4.1.
For we use
,
which takes
operations in
by Proposition 2.2. We perform the computations in step 4
naively, which yields
.
Corollary 4.5 already illustrates the potential of our
ability to factor non trivially over an
extension field
. This idea
can be pushed even farther whenever the factors
with
can be factored recursively over a tower of
extension fields of
. In
order to carry out this generalization, we first need to decide how to
compute with elements in the successive fields of such a tower. Instead
of privileging particular representations, we rely on the abstract
concept of an effective tower.
is
a tower of fields
with the following properties:
Each field comes with a specific way to
represent its elements and algorithms for the field operations.
For , the field
is a finite separable algebraic extension of
, and we have
precomputed an element
along with its
minimal polynomial
over
, such that
and
. We set
.
For , we have
algorithms for computing the natural bijection
, given by
and its inverse . We
call
and
the
upward and downward conversions at level
. The coefficientwise
extensions of
and
yield mappings
and
that we still denote by
and
.
We denote by the cost of multiplying two
polynomials in
in terms of the number of
required operations in
.
Similarly, we write
for the cost of inverting an
element in
in terms of the number of operations
in
. We always assume that
additions and subtractions can be done in linear time. We also let
upper bound the costs of both the upward and downward
conversions at level
.
Let us now return to our particular modulus and
assume that its degree
admits the factorization
with
for
. If
,
then Algorithm 4.1 shows how to reduce modular composition
modulo
to composition modulo a polynomial over
of degree
,
provided a normal factor of
over
exists and is given. In order to generalize this idea to
the case when
, it is useful
to introduce the concept of a composition tower.
be an effective
tower. Let
be a monic separable polynomial of
degree
. We say that
is a composition tower for
over
if the following properties
are satisfied:
We let , and for each
, we have precomputed
a monic polynomial
such that
is a normal factor of
over
. We let
. For convenience,
is also called a normal factor of
.
For , we have the
isomorphism
and we assume we have precomputed a polynomial
such that
Given monic and separable along with a
composition tower
, we may
now apply Algorithm 4.1 recursively. Unrolling the
recursive calls yields the following algorithm for modular composition.
Algorithm
Proof. Let ,
and for
, let
in
. The
polynomial
in step 3 is the
characteristic polynomial of
modulo
over
and we notice that
in step 6. The correctness of the algorithm
results from these observations.
The computation of the resultant in step 3 can be performed in time
by
Proposition 2.2. It follows that the complete step 3
requires
operations in .
The computation of in step 4 can be
done in time
. The
computation of the remainder of its division by
can be done in time
.
Therefore step 4 amounts to
operations in .
In step 6, the naive evaluation of
at
modulo
using Horner's
method requires
operations in
. Consequently, step 6 requires
operations in . The
conclusion follows by adding up the above bounds for the costs of the
individual steps.
Remark is not necessarily irreducible. In that
case, we assume that the tower
is only a
“partial composition tower”, meaning that we no longer
require that
. In Algorithm
5.1, we then need to make two adjustments:
In step 2, we compute to be the
characteristic polynomial of
modulo
over
.
In step 5, we compute in
.
These computations lead to an additional term in
the complexity bound of Theorem 5.3.
Assume that we are given a tower
of finite fields such that for each
. One obvious way to represent an element of
is to write it as
,
where
is a polynomial with
. An effective tower that uses this
representation for the elements of the fields
is
called a triangular tower. For such towers, the costs of the
upward and downward conversions are zero. Throughout this section it
will be convenient to make the relatively harmless assumption that
for all effective rings
.
We always assume available the following precomputed data:
For all , the
pre-inverse of
.
Proof. Let us first show how to reduce
polynomial multiplication over to polynomial
multiplication over
. So
consider two polynomials
and
in
, represented as
polynomials in
evaluated at
. We may compute their product in
as follows: we first substitute
in
and
,
which yields two polynomials
.
We next compute their product in
.
Now
for each
.
Since each remainder can be computed using two multiplications of
elements in
using the pre-inverse of
, we obtain
for a sufficiently large constant independent of
. On the other hand, applying
Karatsuba's trick, there exists a constant
such
that
holds for all
.
If then we have
which
yields
We claim that
The proof is done by induction assuming the inequality holds for :
Remark to be optimal in the latter
lemma, but it is sufficient for our purposes.
Proof. By Proposition 2.2 the
inverse of an element in takes
operations in
. Combined with
the previous lemma, we obtain
for some sufficiently large constant .
This yields the claimed bound.
be a triangular
composition tower for
with
. Given
,
we may then compute
and the characteristic
polynomial of
modulo
using
operations in .
Proof. By Lemma 6.1 we have
Then Lemma 6.3 gives
The conclusion follows from Theorem 5.3.
Recall that an integer is said to be
-smooth whenever all its prime
factors are at most
.
. If
is
-smooth and sufficiently
large, then there exist integers
such that
,
for all
, and
, where
.
Proof. It suffices to gather prime factors,
counted with multiplicities, into products in
the range
. Then
implies
.
.
If
is
-smooth,
and given a suitable triangular composition tower for
of degree
, then one
composition or one characteristic polynomial modulo
may be computed using
operations in
.
Proof. We appeal to the previous lemma to
construct the integer sequence for which we
precompute a triangular composition tower for
. The cost of Proposition 6.4
simplifies to
Notice that minimizes the latter exponent, which
leads to the cost
provided that
is
-smooth.
Remark -smooth
integers below an integer
is asymptotically
equal to
, where
is the Dickman–de Bruijn function defined by
with initial condition
for
all
(see [42] for the original
proof). For
, we have
. Then
decreases rapidly with
,
, etc. Another important
consequence of Proposition 6.4 is the following: for more
than 30% of large values of
,
modular compositions and characteristic polynomials for irreducible
of degree
over
may be computed using
operations
in
.
An interesting application of Proposition 6.4 concerns
cyclic moduli in
,
where
is a prime number different from the
characteristic
.
is a prime number different from
, and if
divides
, then the degrees of the
irreducible factors of
in
divide
.
Proof. Assume that
divides
, and write
. It is well known that the
polynomial
is the product of the monic
irreducible polynomials of
whose degrees divide
. We obtain
, by using Fermat's little theorem, which
asserts that
.
.
If
is a prime number different from
, and if
is
-smooth, then we may
precompute suitable triangular composition towers for all irreducible
factors of
, so one
composition or characteristic polynomial modulo
may be obtained using
operations in
.
Proof. The modulus is
separable, and the precomputations first involve the irreducible
factorization of
into
, whose respective degrees
divide
.
Since is
-smooth,
each
is
-smooth,
where
. Consequently, for the
modulus
, the cost of
Corollary 6.6 simplifies to
The total cost for all is thus
. Applying Proposition 3.2, this
leads to the cost
for one composition or
characteristic polynomial modulo
.
Proposition 6.4 shows that the overhead of triangular set
arithmetic rapidly grows with the height of the
tower. In this section we consider an alternative representation for
elements in the fields
. This
representation allows for faster multiplication inside the fields
, but the upward and downward
conversions may become more expensive. One major goal of this section is
to provide a more precise analysis of the cost of these conversions and
to isolate particular situations in which they can be computed fast.
An effective tower with
is said to be primitive if
for each
. In that case, we assume
that we precomputed the minimal polynomial
of
each
over
.
It follows that
for . On the other hand, the
upward and downward conversions are more expensive than in the case of
triangular towers. The following consequence of (7.1), (7.2) and Theorem 5.3 will be of frequent use.
Proof. We have
so the conclusion follows from Theorem 5.3.
For computing with arbitrary primitive towers, we recall that we always assume available the following precomputed data:
For all , the minimal
polynomial of
over
,
For all , the minimal
polynomial of
over
,
For all , the
polynomial expression of
in terms of
over
.
Assume that for some arbitrary primitive element
. For a natural morphism
for
-algebras
and
,
let
denote the cost of applying the morphism
once in terms of the number of required operations in
.
Proof. Let with
. We may convert
into
with
using
operations in
.
Then we use the precomputed polynomial
with
, and also the minimal
polynomial
of
over
, which has degree
. We now compute
using
Horner's method. This requires
operations in
. The evaluation
is the natural image of
in
. This proves (7.3).
For the opposite direction, consider and
reinterpret
as an element of
with
and
.
We use the precomputed minimal polynomial
of
over
,
which has degree
. We next
compute
. This requires
operations in
.
We finally convert
coefficientwise into an
element
of
.
This requires
operations in
and yields the natural image of
in
. This completes the proof of (7.4).
and
,
Proof. It will be convenient to use the following abbreviations:
Let be a constant such that
in the previous lemma and let us show by induction over
that
For , we have
, so all possible conversions are trivial and
(7.7) holds. Now assume that the result holds until
. Let us first consider the case
when
. Then (7.6)
yields
Since , this also deals with
the case when
. If
, then (7.5) yields
in a similar way. This also deals with the case when . We conclude by induction.
be a primitive
composition tower for
with
. Given
,
we may then compute one composition or characteristic polynomial
modulo
using
operations in .
Proof. From Corollary 7.4 we deduce
![]() |
(7.8) |
so the conclusion follows from Lemma 7.1.
Comparing to Proposition 6.4, using primitive towers thus turns out to be more efficient than using triangular towers, although it requires more precomputations. Therefore the costs for the two particular cases studied in sections 6.2 and 6.3 may be revisited and slightly improved.
We say that a primitive tower with
is a nested tower, if there exist
with
,
,
,
and
coprime, and
Setting and
,
this also means that
so that has degree
.
For this specific type of towers we require the following
precomputations:
and
for
and
,
where
is the smallest power of two above
.
Proof. Consider an element
with
. Let
be the smallest power of two above
.
Then we may compute
so that
. This computation follows the natural
“divide and conquer” strategy: given
, we split it into
with
, we compute recursively
and
, and
deduce
We may end the recurrence when ,
which incurs a cost
that is repeated
times in total. For each depth of the recursive calls the
total cost remains
operations in
. The overall cost of the method is
. The computation of the inverse of
requires
additional
operations in
.
Conversely, let with
, and let
be the smallest
power of two above
. We
compute
and look for an expansion of the form
where the . We again use the
“divide and conquer” strategy:
Compute , and recursively
compute the expansion
Compute , and recursively
compute the expansion
At the end we have , as
required. Thanks to the precomputations this expansion requires
operations in
.
Now we observe that
, where
and
.
A final reduction by
takes
operations in
.
be a nested
composition tower for
with
. Then we may compute one composition or
characteristic polynomial modulo
using
operations in .
Proof. Lemma 7.6 implies
![]() |
(7.9) |
The result now follows from Lemma 7.1.
The above corollary makes nested composition towers extremely attractive
from a complexity point of view. The existence of such towers only
depends on and the degrees
, but not on
.
A practical way to construct nested towers is to pick random monic
of degrees
and to check that
is irreducible for each
. We repeat this process for random choices of
until we find a suitable tower. From a heuristic
point of view, we will show below that the probability that we
eventually obtain a nested tower is nonzero in most cases of interest.
From a theoretical point of view, the existence problem of nested towers
remains an interesting problem.
In order to analyze the probability that random choices of provide us with a nested tower, we rely on
the fact that a random polynomial over of
degree
is irreducible with probability
;
the heuristic assumption that is
again random for
.
In this framework, the probability that is
irreducible for each
is given by
On the other hand, if has cardinality
, then we have
possible choices for the tuple .
For a fixed value of
, we
maximize
by taking
.
Setting
, we then have
The existence of a nested tower over with
extension degrees
is likely whenever
. Taking logarithms, this happens
as soon as
The algorithm for finding needs
runs before finding a suitable tower. An obvious optimization is to make
a better use of successful guesses of
for which
is irreducible for all
: instead of starting everything over after one
unsuccessful guess of
, we
try at least
times for some fixed constant
. The expected number of guesses
then drops to
.
It is an interesting question whether there exist finite fields and sequences
for which it is
possible to construct monic
of degrees
such that
is irreducible for each
. The literature contains
specific constructions of nested towers over finite fields based on [8, Lemma 1] which relates the irreducibility of
to
where
. We refer the reader to [37, section
3.2] for a nice survey. Let us exemplify two useful constructions.
Example is a prime power,
and
, where
, and
is a nonzero
square and
is a non-square in
. Then we may build a nested tower with
,
,
. For instance we may take
,
, and
.
Example be a monic irreducible
polynomial of degree
over
, where
is even if
. Let
such
that
is a non-square in
. Then we may take
and
for all
.
In this way we have
. For
instance with
, we may take
,
,
.
Notice that computations with nested towers simplify a bit in this
situation where
for all
.
Example are of odd degree
.
A non trivial candidate example of this kind is the sequence
. We verified
to be irreducible over
for
, but does this hold for all
?
Over finite fields, the situation when the are
pairwise coprime may be exploited to construct special towers, relying
on the following well-known lemma:
be a finite field and consider two monic
irreducible polynomials
and
in
whose degrees are coprime. Then the
composed product
,
defined by
is irreducible in and of degree
.
Proof. See for instance [4, Theorem
2].
Remark and
of coprime degrees
and
over
, an alternative way to
state the lemma is that
is again a primitive
element of degree
over
.
Remark
Let be a primitive tower and let
,
and
be as usual. We say that
is a composed
tower if the
are pairwise coprime and if
there exist monic irreducible polynomials
of
degrees
such that
and
for
.
In that case, the minimal polynomial of
over
is given by
and
for
: if
were reducible over
then
would have a proper
gcd with
which is impossible (see Proposition 4.1).
By construction, we thus have and
. For each
,
let
be a root of
and
, so that
. For such composed towers, we assume that the
following precomputations have been done for
:
,
the trace map of over
(as a vector of
),
and its inverse modulo
and
.
Proof. Let with
. We wish to compute
with
. For this
purpose we first calculate
using operations in
. Then
can be computed using additional operations.
Conversely, let with
. The direct image
with
is obtained via Chinese remaindering:
In fact we just verify that .
So if we let
, then we may
calculate
which takes operations in
, when using our assumption that
and its inverse modulo
and
have been precomputed.
be a composed
composition tower for
with
. Then, given
,
we may compute
using
operations in .
Proof. Lemma 7.14 implies
![]() |
(7.10) |
We conclude by Lemma 7.1.
Given a number , we say that
an integer
is
-super-smooth
if for every prime power
that divides
, we have
. For instance, the product of the first
prime numbers grows as
(see for
instance [17, chapter 22]) and is therefore
-super-smooth for sufficiently large
. Similarly, the number
is
-super-smooth
for every
. For a fixed
modulus of super-smooth degree, the following corollary shows that
modular composition can be done in softly linear time in this specific
situation:
be a fixed
irreducible polynomial of
-super-smooth
degree
over a finite field
, and assume a composed composition tower has
been precomputed for
.
Then, given any
, we may
compute
using
operations in
.
Proof. We apply the previous corollary with and
.
Using the composed tower approach, we are left with the question how to
deal with algebraic extensions of prime power degree . In the case when
coincides with the characteristic
of the field
, one may use
Artin–Schreier towers instead, as outlined below.
An Artin–Schreier polynomial over a field of characteristic
,
is an irreducible polynomial of
of the form
. An Artin–Schreier
tower of height
over
is a tower of field extensions
where each
extension
is explicitly constructed from an
Artin–Schreier polynomial over
.
In order to simplify the presentation, we restrict ourselves to the case
when . For the minimal
polynomials of the successive extensions, we take
![]() |
![]() |
![]() |
|
![]() |
![]() |
![]() |
(i=2,p=2) |
![]() |
![]() |
![]() |
(all other cases). |
In [10, Theorem 2], it is shown that this defines a
primitive tower. The polynomials may be computed
using
operations in
, according to [10, Theorems 12]. We
assume that the following precomputations have been done for
(see [10, end of section 4]):
the trace map on over
in the canonical basis,
.
The costs of these precomputations are and
, respectively. We now have the
following complexity bound for the upward and downward conversions.
be an Artin–Schreier tower. Then, modulo precomputations, we
have
for all .
be an
Artin–Schreier composition tower for
with
. Then, given
, we may compute
using
operations in .
Proof. Lemma 7.17 implies
From Lemma 7.1, it follows that can
be computed using
operations in
. Using the fact that
, the result follows.
Let be a given finite field and let
be a given monic irreducible polynomial of degree
over
. In
this section we consider the task of constructing composition towers for
. In fact, we may distinguish
three different problems of increasing complexity:
Building an algebraic tower with the
prescribed extension degrees
;
Building a composition tower for some monic irreducible
of degree
;
Building a composition tower for the prescribed modulus .
Moreover, one may study these problems for each type of towers that we
have encountered so far. From now on, we drop the study of triangular
towers, for simplicity and because primitive towers are more efficient
anyway. Given an effective tower, we notice that the construction of a
primitive tower still requires computing the minimal polynomials over
of the
. In general, it is not known how to do this in
quasilinear time without using the power projection algorithm by Kedlaya
and Umans.
The traditional solution to the first problem involves computing
irreducible polynomials of “small”
degrees
over “large” field
extensions
. For instance,
when using Shoup's algorithm [45], the number of operations
in
grows with
(in this
case, Couveignes and Lercier's algorithm [9] is less
competitive). In [14, Proposition 4.6] von zur Gathen and
Seroussi proved the lower bound
for factoring
polynomials of degree
over
in the arithmetic circuit model. Consequently the quadratic cost in
might be difficult to decrease in general. Theorem 8.2 below shows how to achieve a cost that is quasi-linear in
in the case when
.
In this section we mainly focus on the efficient construction of
primitive composition towers for prescribed . We first describe the general algorithm and then
explain possible speed-ups for nested, composed, and Artin-Schreier
towers. Altogether, this yields an efficient answer to the second
problem. Notice that composition towers with no prescribed modulus are
sufficient if we merely need a representation for the finite field
such that polynomials in
can
be evaluated efficiently at points in
.
The third problem is more difficult. So far we have not been able to
apply the techniques of this paper to obtain more efficient solutions,
even when is very smooth or in the extremely
favourable case of Artin–Schreier towers. In practice, a
straightforward strategy for the construction of composition towers for
is to factor
over all
intermediate fields
using standard available
algorithms. This is discussed at the end of the section.
The naive construction of a primitive composition tower with prescribed
extension degrees proceeds by induction. Assume
the tower is built up to height
.
We first construct an irreducible polynomial
in
and set
.
If
is “sufficiently random”, then
the class
of
in
is a primitive element over
. Its minimal polynomial
over
may simply be obtained as
where
is the natural preimage with bidegree
that satisfies
.
The latter resultant requires roughly
operations
in
when using the best known algorithms. To
complete the composition tower of height
,
it remains to compute the sequence of normal factors of
over
. Factorization
algorithms based on modular composition [31] can achieve
this in time
, roughly
speaking.
In fact, we will show how to build primitive composition towers far more
efficiently. We still proceed by induction. The composition towers
naturally share the same underlying effective subtowers of . More precisely, the subtower of height
consists of the fields
and
forms a composition tower for
;
the successive normal factors are written
,
and the auxiliary polynomials
of Definition 5.2 are written
.
We begin with constructing an irreducible polynomial
of degree
, so the first
primitive tower is made of
,
and it is a composition tower for
with normal
factor
. The auxiliary
polynomial
satisfies
.
For the second composition tower we construct an irreducible polynomial
of degree
,
which defines
as the class of
in
. If
is “sufficiently random”, then
is a
primitive element of
over
. We obtain the minimal polynomial of
over
as
, where
and
. The normal factor of
over
is
,
and the one of
over
is
. We clearly have
and we obtain
from the
subresultant of degree
in
of
and
(it necessarily
exists because
is a primitive element of
over
,
which implies that
is separable; this will be
detailed below in the general situation). In this way we obtain a
composition tower of height
.
The third tower again requires building an irreducible polynomial . This defines
and
so we have
.
We assume that
generates
over
and we wish to obtain the normal
factorizations of its minimal polynomial
.
Over
and
the normal
factors are respectively
and
. For
,
we use the second composition tower: we compute
such that
and
,
then
. Then we obtain
, where
. The auxiliary polynomials
and
are obtained from the corresponding first
subresultants.
For general heights, we use the following algorithm for the construction of composition towers.
Algorithm
Set ,
,
.
For from
down to
do
Compute over
, where
;
Compute the subresultant of degree
in
of
and
written
, and then set
.
Set to the class of
in
, and let
.
Notice that the precomputations PRE-P1 correspond to ; concerning PRE-P2, we
have already computed
,
and
is simply
; the precomputations PRE-P3 correspond to
.
Return the composition tower for
made from
,
,
,
,
and the other precomputed auxiliary data.
Proof. By decreasing induction on we prove that
is a composition
tower for
which has degree
. This is clear for
and
. Assume the induction
hypothesis holds for
. At the
end of step 2.a the polynomial
has degree
and is the minimal polynomial of
over
. Since
is a normal factor of
over
, the degree of the ideal generated by
is
.
Since is separable it belongs to the
Gröbner basis of
for the lexicographic
order induced by
. In
particular a polynomial with leading monomial
belongs to
. This proves that
the subresultant of degree
of
and
is nonzero, and that
is well defined, which implies the requested conditions:
The induction hypothesis is thus satisfied for .
As to the complexity analysis, we first notice that . The conversions in step 2.a therefore take
operations in . The resultant
in step 2.a and the subresultant in step 2.b require
further operations, by Proposition 2.2, where . The inversion of
modulo
takes
further
operations.
. A primitive
composition tower of degrees
may be built
using
expected operations in .
Proof. The tower is constructed by natural successive applications of Algorithm 8.1. By Equation (7.8) the bound from Proposition 8.1 simplifies to
It remains to take the construction of the into
account for
. By Theorem A.4 a random polynomial
can be
computed using an expected number of
compositions modulo
plus
operations in . Thanks to
Corollary 7.5 each composition modulo
may be done using
operations in
. On the other hand we simply take
. The sum over
yields
We claim that, with a small uniformly bounded probability, the roots of
do not have maximal degree
over
. Such a bad
is easily detected since the corresponding
is not separable. In that case we build another
at random and the average number of failures is bounded.
To prove the claim, we notice that the roots of
do not have maximal degree
over
if, and only if, the
conjugates of
over
are not pairwise distinct.
Equivalently this means that the coefficients of
belong to a proper subfield of
.
For each prime factor
of
the field
is a maximal proper subfield of
. All maximal proper subfields of
are obtained is this way, so there exist at most
many of them. The number of monic irreducible
polynomials of degree
with coefficients in such
a subfield is at most
; the
number of monic irreducible polynomials of degree
over
is at least
(see
for instance [13, Lemma 14.38]). Consequently the
probability that the roots of
do not have
maximal degree
over
is
at most
which is uniformly bounded by as soon as
or
is sufficiently
large.
Let us now investigate the consequences of Proposition 8.1 in the particular cases of nested, composed and Artin–Schreier towers. In our analysis, we actually construct all intermediate subtowers along with the composition tower itself. If one just needs the highest tower, then there are cases that can be optimized by exploiting the specificities of the types of towers under consideration.
and assume that we are given a
nested tower as in section 7.3, defined by
and
. Then
the nested composition tower for
may be built
using
operations in .
Proof. The tower is again constructed by natural
successive applications of Algorithm 8.1, except that we
replace the precomputations in step 3 by those specified in PRE-N1.
These precomputations amount to .
By Equation (7.9), the cost of Proposition 8.1
simplifies to
.
and assume that we are given a
composed tower as in section 7.4, defined by
. Then the composed composition tower for
may be built using
operations in .
Proof. The tower is again constructed by natural
successive applications of Algorithm 8.1, except that we
replace the precomputations in step 3 by those specified in PRE-C1,
PRE-C2, and PRE-C3. These precomputations amount to . By Equation (7.10), the cost of
Proposition 8.1 simplifies to
.
and assume that we are given an
Artin–Schreier tower as in section 7.5, defined by
. Then the
Artin–Schreier composition tower for
may
be built using
operations in .
Proof. The tower is again constructed by natural
successive applications of Algorithm 8.1, except that we
replace the precomputations in step 3 by those specified in PRE-A1 and
PRE-A2, which amount to . By
Equation (7.11), the cost of Proposition 8.1
becomes
Let be an algebraic tower. So far we have shown
how to build composition towers for
.
Let us now explain how to deduce composition towers for a given
monic and irreducible of degree
. By standard algorithms, we first compute a
root
of
in
(for example with Rabin's algorithm; see [13,
chapter 14]). Unfortunately we do not know how to exploit a composition
tower to compute
more efficiently. We are
interested in finding a monic normal factor
of
over each field
in the
tower. In the extreme case when
,
we take
.
, given an irreducible
, together with a root
of
, we may compute the
minimal polynomials
of
over
along with the
from Definition 5.2 for
,
with cost
Proof. We have .
We compute
and
by
descending induction on
from
down to
. First we have
where . The first
subresultant
of
and
is well defined and
is
invertible modulo
. Therefore
we have
According to Proposition 2.2, the resultant and the first
subresultant can be computed using operations in
. Then
is deduced with
operations in
. Summing over
,
the result follows.
Remark ,
we notice that the polynomial
is of the form
with
.
In other words, the computation of a composition tower for
is as least as hard as finding a root of
in
.
We have shown that modular composition and characteristic polynomials
for a fixed irreducible modulus can be computed
fast if
is smooth, when allowing for
precomputations that depend solely on
.
From a practical point of view, we expect that efficient implementations
of our algorithms will lead to speed-ups with respect to state of the
art methods as soon as
is composite and
sufficiently large. From a theoretical point of view, the algorithms by
Kedlaya and Umans generally outperform our new algorithms. One notable
exception occurs when
is very smooth, in which
case we were able to prove a quasi-linear complexity bound: see
Corollary 7.16.
Our new algorithms are only more efficient for fixed moduli . Nevertheless, the cost of the required
precomputations as a function of
is of a similar
order of magnitude as one composition modulo
without our methods. In other words, if we need to compute
compositions modulo the same modulus
, then our new methods are already of interest
for small values of
. This
problem of multiple modular compositions occurs in various applications:
Given a non trivial divisor of
, yet another representation of elements of
was needed in [23]. Let
be a primitive element of
and
be an element whose minimal polynomial has degree
. Then
and
are both isomorphic to
and correspond to two different representations. If
and
can be chosen “nicely”, then
conversions between these representations can be computed fast, using
similar methods as in sections 7.3, 7.4 and 7.5. Using modular composition, we may reduce the general case
to this special case in which we are allowed to choose
and
.
Many intriguing questions remain to be answered. One major open problem
is whether our new techniques can be used to build composition towers
for a prescribed irreducible modulus of smooth
degree
in quasi-linear time. This would give us
an unconditional algorithm for composition modulo
of quasi-linear time complexity. Another natural question is whether
there exists an efficient way to reduce general modular composition to
composition modulo irreducible polynomials of smooth degree. This
question may be related to generalizations of our algorithms to modular
composition for multivariate polynomials. Besides these fundamental
issues, we finally think that there remains a lot of room for more minor
improvements of the techniques in this paper and for working out various
applications in more detail. It would also be interesting to have
efficient implementations of the multiple algorithms that we have
presented and investigate under which conditions they outperform
previous algorithms in practice.
This appendix is devoted to Kaltofen and Shoup's fast algorithm for
pseudo-traces over finite fields [28], along with its
application to building irreducible polynomials (required in our section
8). We recall the algorithm for completeness, but also in
order to make it work more generally over any finite field .
Algorithm
If then return
.
Let , and recursively
compute
.
Compute .
Compute over
.
Compute over
.
If is even, then return
.
Compute and return ,
over
,
and
over
.
and
. Algorithm A.1 is correct and
takes
compositions modulo
and
additional operations in .
Proof. The proof proceeds by induction on . The algorithm is clearly correct
if
. Assume that it is
correct for
. By linearity of
the
-th power we have
This already proves the correctness by induction when
is even. Otherwise
and similar computations
yield
,
This proves the correctness.
The cost for obtaining from
is essentially one or two compositions modulo
. If we write
with
, then we are led to compute
for all
and then
over
. Overall
requires
compositions
modulo
plus
operations
in
. The same bound holds for
. If
is odd, then step 7 takes again
compositions
modulo
plus
operations
in
. Finally, the depth of
the recursion is
.
, let
be a monic irreducible polynomial in
of degree
,
and let
. For any monic
polynomial
of degree
, and any
,
the pseudo-trace
may be computed using
compositions modulo
plus
operations in .
Proof. If suffices to compute
for Algorithm A.1 using
operations
in
, in order to apply the
preceding proposition.
, let
be a monic irreducible polynomial in
of degree
, and let
. Given
and the set
of its prime factors of cardinality
,
we may check whether a monic polynomial
of
degree
is irreducible using
compositions modulo
plus
operations in .
Proof. The polynomial is
irreducible if, and only if,
and
for all prime divisor
of
. We may thus apply the above
corollary.
, let
be a monic irreducible polynomial in
of degree
, and let
. A random irreducible polynomial
of degree
over
may be
computed using an expected number of
compositions modulo
plus an expected number
of
operations in .
Proof. We appeal to Ben-Or's algorithm [13,
Algorithm 14.40], but use modular composition instead of the usual
modular powering. Let be a monic random
polynomial of degree
over
. For
from
to
we compute
and stop
as soon as
. In this case
is reducible, since it admits an irreducible
factor of degree
(that divides
). If all the
are
then
is necessarily
irreducible.
With the notations from Algorithm A.1, the
modulo
are computed as follows. First, we
compute
and
using
operations in
.
Second, we obtain
and
with
compositions modulo
and
additional operations in , by
Proposition A.1.
Then we compute of bi-degree
such that
Notice that . In order to
compute
from
we write
with , so we are led to
compute
for all
and then
over
.
The overall computation of
requires
compositions modulo
plus
operations in
.
If the smallest degree of the irreducible factors of
is
, then testing the
irreducibility of
takes
compositions modulo
plus
In average, random trials for
are necessary to find an irreducible one, and the average value of the
smallest degree
is
.
Therefore the total expected cost is
Taking , the expected cost in
Theorem A.4 rewrites into
compositions modulo
and
field operations in
(thus, without using the
Kedlaya–Umans algorithm). In favorable cases studied in the
present paper, we may assume that one composition modulo
amounts to say
operations in
for a fixed, small number
. The expected cost then reduces to
operations in
.
As a first comparison, Shoup's algorithm [45] needs expected operations in
(still
without using the Kedlaya–Umans algorithm). For bounded values of
and large
,
the availability of a fast algorithm for composition modulo
makes it possible to improve on Shoup's algorithm, as was
already observed in [28]. In this specific case, Theorem A.4 indeed leads to a bit cost
instead
of
with Shoup's algorithm.
As a second comparison, let be a function such
that
in the neighborhood of infinity. Based on
Kedlaya and Umans' algorithm, Couveignes and Lercier designed an
algorithm [9] that builds an irreducible polynomial of
degree
over
in time
, when ignoring the cost of
conversions to put elements of
into their
standard representation in
.
To be fair, we may assume fast modular composition for
, so the expected bit cost of Theorem A.4
further reduces to
. So
roughly speaking, if
, then
the Couveignes–Lercier algorithm is asymptotically faster; if
, then Theorem A.4
is competitive. A comparison with Shoup's algorithm [45]
would not be fair here unless enhancing it with the Kedlaya–Umans
algorithm; for large
,
Shoup's algorithm [45] also remains competitive, of course.
Similar observations hold concerning the complexity of the factorization
of polynomials of degree in
. In fact, Kaltofen and Shoup, still in [28], applied their technique (namely Algorithm A.1)
to reduce the cost of the Cantor–Zassenhaus factorization
algorithm mostly to compositions modulo
and to
. For bounded
and large
,
fast composition modulo a fixed modulus
is
therefore particularly relevant to factorization. We intend to work out
the details of this application in a forthcoming paper.
D. J. Bernstein. Composing power series over a finite ring in essentially linear time. J. Symbolic Comput., 26(3):339–341, 1998.
A. Bostan, Ph. Flajolet, B. Salvy, and É. Schost. Fast computation of special resultants. J. Symbolic Comput., 41(1):1–29, 2006.
A. Bostan, G. Lecerf, and É. Schost. Tellegen's principle into practice. In Hoon Hong, editor, Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation, ISSAC '03, pages 37–44, New York, NY, USA, 2003. ACM.
J. V. Brawley and L. Carlitz. Irreducibles and the composed product for polynomials over a finite field. Discrete Math., 65(2):115–139, 1987.
R. P. Brent and H. T. Kung. Fast algorithms for manipulating formal power series. J. ACM, 25(4):581–595, 1978.
P. Bürgisser, M. Clausen, and M. A. Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, 1997.
D. G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Infor., 28(7):693–701, 1991.
S. D. Cohen. On irreducible polynomials of certain types in finite fields. Proc. Camb. Phil. Soc., 66(2):335–344, 1969.
J.-M. Couveignes and R. Lercier. Fast construction of irreducible polynomials over finite fields. Israel J. Math., 194(1):77–105, 2013.
L. De Feo and É. Schost. Fast arithmetics in Artin-Schreier towers over finite fields. J. Symbolic Comput., 47(7):771–792, 2012.
L. E. Dickson. Linear Groups: With an Exposition of the Galois Field Theory. Dover Publication Inc., 1958.
J. Doliskani and É. Schost. Taking roots over high extensions of finite fields. Math. Comp., 83(285):435–446, 2014.
J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, 3rd edition, 2013.
J. von zur Gathen and G. Seroussi. Boolean circuits versus arithmetic circuits. Inform. and Comput., 91(1):142–154, 1991.
J. von zur Gathen and V. Shoup. Computing Frobenius maps and factoring polynomials. Comput. Complexity, 2(3):187–224, 1992.
B. Grenet, J. van der Hoeven, and G. Lecerf. Deterministic root finding over finite fields using Graeffe transforms. Appl. Alg. Eng. Comm. Comp., 27(3):237–257, 2016.
G. H. Hardy and E. M. Wright. An introduction to the theory of numbers. Oxford University Press, Oxford, 6th edition, 2008.
D. Harvey and J. van der Hoeven. Faster integer and polynomial multiplication using cyclotomic coefficient rings. Technical report, ArXiv, 2017. http://arxiv.org/abs/1712.03693.
D. Harvey, J. van der Hoeven, and G. Lecerf. Faster polynomial multiplication over finite fields. J. ACM, 63(6), 2017. Article 52.
J. van der Hoeven. Relax, but don't be too lazy. J. Symbolic Comput., 34(6):479–542, 2002.
J. van der Hoeven. Fast composition of numeric power series. Technical Report 2008-09, Université Paris-Sud, Orsay, France, 2008.
J. van der Hoeven. Faster Chinese remaindering. Technical report, CNRS & École polytechnique, 2016. http://hal.archives-ouvertes.fr/hal-01403810.
J. van der Hoeven and R. Larrieu. The Frobenius FFT. In M. Burr, editor, Proceedings of the 2017 ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC '17, pages 437–444, New York, NY, USA, 2017. ACM.
J. van der Hoeven and G. Lecerf. Composition modulo powers of polynomials. In M. Burr, editor, Proceedings of the 2017 ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC '17, pages 445–452, New York, NY, USA, 2017. ACM.
J. van der Hoeven and G. Lecerf. Modular composition via complex roots. Technical report, CNRS & École polytechnique, 2017. http://hal.archives-ouvertes.fr/hal-01455731.
Xiaohan Huang and V. Y. Pan. Fast rectangular matrix multiplication and applications. J. Complexity, 14(2):257–299, 1998.
F. Johansson. A fast algorithm for reversion of power series. Math. Comp., 84:475–484, 2015.
E. Kaltofen and V. Shoup. Fast polynomial factorization over high algebraic extensions of finite fields. In Proceedings of the 1997 International Symposium on Symbolic and Algebraic Computation, ISSAC '97, pages 184–188, New York, NY, USA, 1997. ACM.
E. Kaltofen and V. Shoup. Subquadratic-time factoring of polynomials over finite fields. Math. Comp., 67(223):1179–1197, 1998.
K. S. Kedlaya and C. Umans. Fast modular composition in any characteristic. In FOCS'08: IEEE Conference on Foundations of Computer Science, pages 146–155, Washington, DC, USA, 2008. IEEE Computer Society.
K. S. Kedlaya and C. Umans. Fast polynomial factorization and modular composition. SIAM J. Comput., 40(6):1767–1802, 2011.
M. K. Kyuregyan. Recurrent methods for constructing
irreducible polynomials over of odd
characteristics. Finite Fields Appl., 9(1):39–58,
2003.
M. K. Kyuregyan. Recurrent methods for constructing
irreducible polynomials over of odd
characteristics, II. Finite Fields Appl.,
12(3):357–378, 2006.
P. Lairez and T. Vaccon. On p-adic differential equations with separation of variables. In Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC '16, pages 319–323, New York, NY, USA, 2016. ACM.
F. Le Gall. Powers of tensors and fast matrix multiplication. In Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, ISSAC '14, pages 296–303, New York, NY, USA, 2014. ACM.
G. Lecerf. On the complexity of the Lickteig–Roy subresultant algorithm. J. Symbolic Comput., 2018. https://doi.org/10.1016/j.jsc.2018.04.017.
G. L. Mullen and D. Panario. Handbook of Finite Fields. Discrete Mathematics and Its Applications. Chapman and Hall/CRC, 2013.
A. K. Narayanan. Fast computation of isomorphisms between finite fields using elliptic curves. Technical report, arXiv:1604.03072, 2016. https://arxiv.org/abs/1604.03072.
C. H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.
M. S. Paterson and L. J. Stockmeyer. On the number of nonscalar multiplications necessary to evaluate polynomials. SIAM J.Comput., 2(1):60–66, 1973.
M. O. Rabin. Probabilistic algorithms in finite fields. SIAM J. Comput., 9(2):273–280, 1980.
V. Ramaswami. On the number of positive integers less
than and free of prime divisors greater
than
. Bull. Amer.
Math. Soc., 55:1122–1127, 1949.
P. Ritzmann. A fast numerical algorithm for the composition of power series with complex coefficients. Theoret. Comput. Sci., 44:1–16, 1986.
J.-A. Serret. Cours d'algèbre
supérieure, volume 2 of Les grands classiques
Gauthier-Villars. Jacques Gabay, 4th edition, 1992.
Réimpression du tome second de la 4
édition du Cours d'algèbre
supérieure de Joseph-Alfred Serret, publiée
par Gauthier-Villars en 1879.
V. Shoup. Fast construction of irreducible polynomials over finite fields. J. Symbolic Comput., 17(5):371–391, 1994.
V. Shoup. Efficient computation of minimal polynomials in algebraic extensions of finite fields. In Proceedings of the 1999 International Symposium on Symbolic and Algebraic Computation, ISSAC '99, pages 53–58, New York, NY, USA, 1999. ACM.