Joris van der Hoeven and
Grégoire Lecerf
Laboratoire d'informatique de l'École
polytechnique, CNRS UMR 7161
École polytechnique
91128 Palaiseau Cedex, France
{vdhoeven,lecerf}@lix.polytechnique.fr
Composition modulo powers of
polynomials
Modular composition is the problem of composing two univariate polynomials modulo a third one. For polynomials with coefficients in a finite field, Kedlaya and Umans proved in 2008 that the theoretical bit complexity for performing this task can be made arbitrarily close to linear. Unfortunately, beyond its major theoretical impact, this result has not yet led to practically faster implementations. In this paper, we study the more specific case of composition modulo the power of a polynomial. First, we extend previously known algorithms for power series composition to this context. We then present a fast direct reduction of our problem to power series composition.
Modular composition; series composition; algorithm; complexity
Let be an effective field and let
be polynomials in
of degrees
. The problem of modular
composition is to compute
modulo
. This is a fundamental problem in
complexity theory because of its applications to polynomial
factorization [16-18].
From a theoretical point of view, Kedlaya and Umans achieved a major
breakthrough by showing that modular compositions over a finite field
can be computed in time
. Unfortunately, the practical impact of these
results has been limited so far. The best current implementations still
admit nearly quadratic complexities in
.
More precisely, over a field
,
denoting by
the cost to multiply two polynomials
in
of degrees
,
Brent and Kung [3] gave an algorithm with cost
. It uses the baby-step giant-step
technique due to Paterson–Stockmeyer [21] and even
yields a sub-quadratic cost when using fast linear algebra (see [15, p. 185] and our Theorem 2 below).
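To make the baby-step giant-step idea concrete, here is a minimal Python sketch of modular composition over a small prime field, with naive quadratic polynomial arithmetic; the prime and all helper names are ours, and the fast-linear-algebra refinement behind the sub-quadratic bound is deliberately omitted.

```python
# Hedged sketch of baby-step giant-step modular composition f(g) rem h over F_p,
# with naive quadratic polynomial arithmetic.  Polynomials are coefficient lists
# in increasing degree; p and all helper names are ours.

p = 7919

def trim(a):
    a = [c % p for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_add(a, b):
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(max(len(a), len(b)))])

def poly_mul(a, b):
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] = (c[i + j] + x * y) % p
    return trim(c)

def poly_rem(a, h):
    """Remainder of a modulo the monic polynomial h."""
    a, n = list(a), len(h) - 1
    for i in range(len(a) - 1, n - 1, -1):
        q = a[i] % p
        if q:
            for j, y in enumerate(h):
                a[i - n + j] = (a[i - n + j] - q * y) % p
    return trim(a[:n])

def modular_composition(f, g, h):
    """f(g) rem h with k ~ sqrt(n) baby steps g^0..g^(k-1) and giant steps in g^k."""
    n = len(h) - 1
    k = max(1, int(n ** 0.5) + 1)
    baby = [[1]]
    for _ in range(1, k):
        baby.append(poly_rem(poly_mul(baby[-1], g), h))
    gk = poly_rem(poly_mul(baby[-1], g), h)          # g^k rem h
    blocks = [f[i:i + k] for i in range(0, len(f), k)] or [[]]
    result = []
    for block in reversed(blocks):                   # Horner with respect to g^k
        result = poly_rem(poly_mul(result, gk), h)
        for c, gi in zip(block, baby):               # inner product with the baby steps
            result = poly_add(result, [c * x % p for x in gi])
    return result

# cross-check against plain Horner evaluation of f at g modulo h
f, g, h = [1, 2, 3, 4, 5, 6, 7], [0, 1, 1], [3, 0, 0, 1, 1]
naive = []
for c in reversed(f):
    naive = poly_add(poly_rem(poly_mul(naive, g), h), [c])
assert modular_composition(f, g, h) == naive
```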
In the special case when ,
composition modulo
corresponds to the
composition of truncated power series at order
. Better practical algorithms are known for this
particular situation. Brent and Kung gave an algorithm [3]
with complexity
under the conditions that
is invertible and that the characteristic is at least
. A variant proposed by van
der Hoeven [11, Section 3.4.3] removes the condition on
. For fields of small
positive characteristic, Bernstein [1] proposed an
algorithm that is softly linear in the precision
but linear in the characteristic. Series with integer, rational or
floating point coefficients can often be composed in quasi-linear time
as well, in suitable bit complexity models, as shown by Ritzmann [22]; see also [12].
In this paper, we focus on the intermediate case between power series
and general modular compositions. More precisely, we assume to be a power
of some polynomial
. Composition modulo
can also be regarded as a special “base”
case of composition modulo a polynomial
that can
be factored into
. Using
Chinese remaindering, one may reduce composition modulo such
to compositions modulo powers of irreducible polynomials.
In a separate paper [13], we exploit this observation for
the design of fast practical (although not fully general) algorithms for
modular composition over finite fields.
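As a toy illustration of this observation (and not of the algorithms of [13]), the following Python sketch composes modulo a product of two coprime monic moduli over a small prime field by composing modulo each factor and recombining the results through a Bézout relation; all helper names are ours and the arithmetic is naive.

```python
# Toy illustration over F_p: reduce composition modulo h = h1*h2 (h1, h2 coprime
# monic polynomials) to compositions modulo h1 and h2, glued back together by
# Chinese remaindering.  Naive arithmetic; all names are ours.

p = 101

def red(a):
    a = [c % p for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    return red([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                for i in range(max(len(a), len(b)))])

def mul(a, b):
    c = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return red(c)

def rem(a, h):                                   # h is monic
    a = [c % p for c in a]
    for i in range(len(a) - 1, len(h) - 2, -1):
        q = a[i]
        for j, y in enumerate(h):
            a[i - len(h) + 1 + j] = (a[i - len(h) + 1 + j] - q * y) % p
    return red(a[:len(h) - 1])

def compose_mod(f, g, h):
    r = []
    for c in reversed(f):                        # Horner: r <- (r*g + c) rem h
        r = add(rem(mul(r, g), h), [c])
    return r

def divmod_monic(a, b):
    a, q = [c % p for c in a], [0] * max(len(a) - len(b) + 1, 0)
    for i in range(len(a) - 1, len(b) - 2, -1):
        qi = a[i]
        q[i - len(b) + 1] = qi
        for j, y in enumerate(b):
            a[i - len(b) + 1 + j] = (a[i - len(b) + 1 + j] - qi * y) % p
    return red(q), red(a[:len(b) - 1])

def inverse_mod(a, h):
    """Extended Euclid: the inverse of a modulo h, assuming gcd(a, h) = 1."""
    r0, r1, s0, s1 = red(h), red(a), [], [1]
    while r1:
        lead = pow(r1[-1], p - 2, p)             # make the divisor monic
        q, r = divmod_monic(r0, red([lead * c for c in r1]))
        q = red([lead * c for c in q])
        r0, r1, s0, s1 = r1, r, s1, add(s0, red([-c % p for c in mul(q, s1)]))
    c0 = pow(r0[0], p - 2, p)                    # r0 is a nonzero constant
    return red([c0 * c for c in s0])

h1, h2 = [1, 1, 1], [2, 0, 0, 1]                 # x^2+x+1 and x^3+2 are coprime over F_101
f, g = [5, 4, 3, 2, 1, 9], [1, 7, 0, 3, 2]
h = mul(h1, h2)
r1, r2 = compose_mod(f, g, h1), compose_mod(f, g, h2)
# Chinese remaindering: r = r1 + h1 * (((r2 - r1) * h1^{-1}) rem h2)
t = rem(mul(add(r2, red([-c % p for c in r1])), inverse_mod(h1, h2)), h2)
assert add(r1, mul(h1, t)) == compose_mod(f, g, h)
```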
We develop two main approaches for fast composition modulo . In section 3, we generalize
Brent and Kung's and Bernstein's algorithms. We achieve a complexity
exponent close to
, but only
under the condition that
is much larger than
. In section 4,
we directly reduce composition modulo
to
truncated power series composition. Our main result is Theorem 11:
one composition modulo
is essentially reduced to
compositions modulo
and
one power series composition in
.
An important ingredient of this reduction is a softly optimal algorithm
for converting between
and
where
is sent to
.
Potential applications of these conversions also include algebraic
algorithms for plane curves, such as integral basis computations and
Puiseux expansions, which usually require several Taylor expansions at
roots of discriminants.
Our paper is organized as follows. In section 2, we
introduce basic notations and recall several well known complexity
results. We also specify the complexity model that we work with. Further
technical complexity results are gathered in the appendix. In sections
3 and 4, we present our main algorithms. For
testing purposes, we have implemented our algorithms.
Recall that an effective ring is a ring
with unity whose elements can be represented on a computer and such that
we have algorithms for performing the ring operations. Effective fields
and effective algebras over an effective ring
are defined similarly.
Given an effective ring ,
algebraic complexity models express running times in terms of
the number of ring operations in
.
Unless otherwise stated, we will analyze the costs of the algorithms in
this paper in this way. More precisely, our results apply to both the
straight-line program and the computation tree models [4, Chapter 4].
Let be an effective ring with unity, let
, and denote
We write for a cost function such that two
polynomials in
can be multiplied using
operations in
.
It will be convenient to assume that
is a
nondecreasing function of
.
Notice that this assumption implies the super-additivity of
, namely
for all
and
.
The schoolbook algorithm allows us to take .
The fastest currently known algorithm [5] yields
. If
is a
field of finite characteristic, then we even have
, where
[10].
The constant represents a feasible
exponent for the multiplication cost of matrices: two square
matrices of size
can be multiplied using
operations in their coefficient ring. The constant
is defined similarly but for multiplying a
matrix by a
rectangular one.
The best known bound
is due to Le Gall [19].
This naturally yields
.
However, the latter bound does not improve upon the earlier bound
due to Huang and Pan [14, Theorem 10.1].
If and
are polynomials
in
, then
represents the remainder in the division of
by
. Let
be an effective ring and
be an effective
-algebra. We introduce the
following cost functions:
: the cost of computing
the power series composition
, where
.
: the cost of computing
the power series composition
in
terms of operations in
,
where
.
: the cost of computing
the modular composition
,
where
is a monic polynomial of degree
and
.
: the cost of computing
the characteristic polynomial
of
modulo a monic polynomial
of degree
. This is the
characteristic polynomial
of the
multiplication endomorphism by
in
.
: the cost of
modular power projections, i.e. the cost to
compute
, where
is monic of degree
and
is a linear form on
.
Let us recall a few known results about these functions. For , we denote
and
.
be a field of characteristic
.
be a monic polynomial of degree
over a ring
and let
. The composed polynomial
may be computed with
operations in
, or
or
operations in
.
follows from
by
taking
.
For a fixed monic polynomial in
of degree
, the modular
composition
is a linear operation in
, for
and
in
.
The corresponding transposed map is precisely the operation of
modular power projections: it corresponds to computing
, where
is
a linear form on
. If a
modular composition algorithm with cost
can be
transposed in the sense of [2], then this leads to a power
projection algorithm with cost
.
be a monic polynomial of degree
over a ring
and let
. The characteristic polynomial
of
modulo
can be
computed using
operations in
, including divisions in
(assuming that the partial division in
is available), or
operations in
, if
is a field, or
operations in
, if
is a field with
elements, or
operations in
, if there exist given inverses of
in
.
by using [20, Corollary
29]. The third bound is obtained by computing
for
values of
in
, from which
is interpolated using
additional operations in
. We refer to Appendix A.1 for the fourth bound.
For the particular kind of moduli that interest
us here, the problem of computing characteristic polynomials easily
reduces to the case when
:
be of degree
in
, let
with
, and let
be in
. Then the characteristic
polynomial
of
modulo
can be computed using
operations in
.
is simply the
-th
power of the characteristic polynomial of
modulo
.
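This statement is easy to check experimentally. The following Python sketch (our own code, over F_7 with a separable modulus; sympy is used only to obtain characteristic polynomials of integer matrices, which are then reduced modulo p) verifies on an example that the characteristic polynomial modulo the power equals the corresponding power of the characteristic polynomial modulo the base polynomial.

```python
# Hedged numerical check of this statement over F_7, for a separable phi: the
# characteristic polynomial of g modulo phi^l equals the l-th power of the
# characteristic polynomial of g modulo phi.  sympy is used only to obtain
# characteristic polynomials of integer matrices; everything else is our own code.

from sympy import Matrix, symbols

p, l = 7, 3
y = symbols('y')
phi = [3, 1, 1]                                  # x^2 + x + 3, irreducible over F_7

def mulmod(a, b, h):
    c = [0] * max(len(a) + len(b) - 1, 1)
    for i, x in enumerate(a):
        for j, v in enumerate(b):
            c[i + j] = (c[i + j] + x * v) % p
    for i in range(len(c) - 1, len(h) - 2, -1):
        q = c[i]
        for j, v in enumerate(h):
            c[i - len(h) + 1 + j] = (c[i - len(h) + 1 + j] - q * v) % p
    return (c + [0] * len(h))[:len(h) - 1]

def polymul(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, v in enumerate(b):
            c[i + j] = (c[i + j] + x * v) % p
    return c

def charpoly_of_mult(g, h):
    """Characteristic polynomial over F_p of multiplication by g in F_p[x]/(h)."""
    n = len(h) - 1
    cols = [mulmod(g, [0] * i + [1], h) for i in range(n)]
    M = Matrix(n, n, lambda r, c: cols[c][r])
    return [int(c) % p for c in M.charpoly(y).all_coeffs()]

phil = [1]
for _ in range(l):
    phil = polymul(phil, phi)

g = [2, 5, 1, 0, 4]                              # an arbitrary element modulo phi^l
chi_small = charpoly_of_mult(g, phi)
power = [1]
for _ in range(l):
    power = polymul(power, chi_small)
assert charpoly_of_mult(g, phil) == power
```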
From now on, we assume the modulus to be the
-th power of a polynomial
of degree
.
In this section, we extend the two known fast algorithms for power series
composition from Theorem 1 to composition modulo
. We directly state and prove the
generalized algorithms. For more explanations we refer to the original
papers [1, 3] and [11, Section
3.4.3].
If has positive characteristic
, then Bernstein's algorithm [1]
exploits the Frobenius map in order to recursively reduce the degrees of
the polynomials to be composed by a factor
.
Algorithm
If , then return
, computed by a
general modular composition algorithm of cost
.
Split into polynomials
such that
. Let
and
,
be such that
and
.
Recursively compute for all
.
Return .
operations
in
.
,
the degrees of
and the
are at most
. Now
implies
which ensures the correctness of the algorithm.
In step 2 we need: operations in
to compute all the necessary
-th
powers,
for
,
and
for the division to obtain
. In step 4, by [8, Exercise
9.16], the computation of
for
may be done in time
The remaining calculations of step 4 amount to
operations in
.
Let be the time complexity for precision
and let
be such that
. By what precedes, we have
for all and a sufficiently large constant
. Unrolling this inequality
times, we obtain
Using and the super-additivity of
, we conclude that
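For readers who wish to experiment with the mechanism exploited by Algorithm 1, here is a minimal Python sketch of Bernstein's recursion in its simplest base case, namely truncated power series over F_p (the modulus being a power of x); it is our own illustrative code, not the generalized algorithm analyzed above, and it uses naive truncated products.

```python
# Hedged sketch of Bernstein's recursion in the base case of truncated power
# series over F_p (modulus x^n): split f = sum_i f_i(x^p) x^i and use g(x)^p = g(x^p).
# Our own illustrative code with naive truncated products; p = 5 is arbitrary.

import random

p = 5

def mul_trunc(a, b, n):
    c = [0] * n
    for i, x in enumerate(a[:n]):
        if x:
            for j, y in enumerate(b[:n - i]):
                c[i + j] = (c[i + j] + x * y) % p
    return c

def compose_bernstein(f, g, n):
    """f(g) rem x^n over F_p, assuming g has zero constant term; returns n coefficients."""
    f = [c % p for c in f[:n]]
    if len(f) <= 1:
        return ([f[0]] if f else [0]) + [0] * (n - 1)
    m = -(-n // p)                                   # ceil(n / p)
    parts = [f[i::p] for i in range(p)]              # f = sum_i parts_i(x^p) * x^i
    comps = [compose_bernstein(fi, g, m) for fi in parts]   # parts_i o g at precision m
    result = [0] * n
    for hi in reversed(comps):                       # Horner with respect to g, using
        result = mul_trunc(result, g, n)             # parts_i(g^p) = (parts_i o g)(x^p)
        for j, c in enumerate(hi):
            if j * p < n:
                result[j * p] = (result[j * p] + c) % p
    return result

def compose_naive(f, g, n):
    r = [0] * n
    for c in reversed(f):
        r = mul_trunc(r, g, n)
        r[0] = (r[0] + c) % p
    return r

random.seed(0)
n = 20
g = [0] + [random.randrange(p) for _ in range(n - 1)]    # valuation >= 1
f = [random.randrange(p) for _ in range(n)]
assert compose_bernstein(f, g, n) == compose_naive(f, g, n)
```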
We now extend Brent and Kung's series composition algorithm to
composition modulo . This
method uses Taylor series expansion in order to reduce the degrees of
the polynomials to be composed. For a given polynomial
, we write
for its
-th derivative. Algorithm 2 is a sub-algorithm that is used when the arguments of the
composition both have small degree.
Algorithm
If then return
.
Split with
and
.
Recursively compute for
,
Return .
Algorithm
Compute and
.
If then compute the valuation
of
in
and the modular inverse
.
Compute using Algorithm 2.
If then compute
for all
from
to
. Otherwise, for
from
to
, compute
.
Return .
operations
in
if
.
which makes use of and
. If
then
holds for all
, so the
algorithm is correct. Otherwise we may write
with
and
of degree
, which implies
. Since
and
separable, it follows that
,
whence
is well defined.
Let us show that , by
induction on
. This is clear
for
. Assume that the
identity holds for some
.
From
we see that the valuation of
in
is necessarily
and that
. The induction
hypothesis is thus satisfied for
because
. This concludes the correctness proof.
The -adic expansion of
takes
operations and
dominates the cost of steps 1 and 2. Steps 4 and 5 amount to
operations. As to step 3, we enter Algorithm 2
with
.
Let be the cost function of Algorithm 2.
The degrees of
,
and
are
, so that
for some constant . Let
be the largest integer such that
. Unrolling
times the
inequality for
and
,
we obtain
In a similar way we obtain .
Consequently,
. The
conclusion follows from
.
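The Taylor identity underlying this approach can be checked directly. The following Python sketch (our own toy code, which does not reproduce the data flow of Algorithms 2 and 3) verifies over F_101 that, writing g = g0 + g1 with g0 the remainder of g by the base polynomial phi, the composition f(g) agrees modulo phi^l with the truncated Taylor sum of the derivatives of f at g0, provided the factorials up to (l-1)! are invertible.

```python
# Hedged check of the Taylor identity behind this approach (not the algorithm's
# actual data flow): over F_p with p >= l, writing g = g0 + g1 with g0 = g rem phi,
# one has f(g) = sum_{k<l} f^(k)(g0) g1^k / k!  modulo phi^l.  All names are ours.

import random

p, l = 101, 4
phi = [1, 1, 1]                                  # x^2 + x + 1

def pmul(a, b):
    c = [0] * max(len(a) + len(b) - 1, 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] = (c[i + j] + x * y) % p
    return c

def prem(a, h):                                  # h is monic
    a = [c % p for c in a]
    for i in range(len(a) - 1, len(h) - 2, -1):
        q = a[i]
        for j, y in enumerate(h):
            a[i - len(h) + 1 + j] = (a[i - len(h) + 1 + j] - q * y) % p
    return (a + [0] * len(h))[:len(h) - 1]

def padd(a, b):
    return [((a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)) % p
            for i in range(max(len(a), len(b)))]

def deriv(a):
    return [i * a[i] % p for i in range(1, len(a))]

def eval_mod(poly, point, h):
    """Evaluate poly at the residue class 'point' modulo the monic polynomial h."""
    acc = []
    for c in reversed(poly):
        acc = padd(prem(pmul(acc if acc else [0], point), h), [c])
    return acc

phil = [1]
for _ in range(l):
    phil = pmul(phil, phi)                       # phi^l

random.seed(3)
f = [random.randrange(p) for _ in range(10)]
g = [random.randrange(p) for _ in range(len(phil) - 1)]

lhs = eval_mod(f, g, phil)                       # f(g) rem phi^l

g0 = prem(g, phi)
g1 = padd(g, [-c % p for c in g0])               # g - g0 is divisible by phi
rhs, fk, g1k, kfact = [0] * (len(phil) - 1), list(f), [1], 1
for k in range(l):
    term = eval_mod(fk, g0, phil)                # f^(k)(g0) rem phi^l
    inv_kfact = pow(kfact, p - 2, p)
    rhs = padd(rhs, prem(pmul([c * inv_kfact % p for c in term], g1k), phil))
    fk, g1k, kfact = deriv(fk), prem(pmul(g1k, g1), phil), kfact * (k + 1)

assert lhs == rhs
```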
be of degree
in
, let
be of
degree
, and let
be in
. If
, then the modular
composition
may be computed using
operations in
.
,
then we may use Algorithm 1, which requires
operations in
.
Otherwise, we use Algorithm 3.
Notice that for fixed , this
corollary gives a cost
,
independently of the characteristic.
Recall that the modulus is the
-th power of a polynomial
of degree
. From now on, we
assume that
is separable. So far, our algorithms
for composition modulo
relied on extensions of
known algorithms dedicated to power series. In this section, we
investigate the other natural approach: we design algorithms that reduce
one composition modulo
to one series composition
of order
over a suitable extension of the ground
field. We set
and write
for the residue class of
in
. The associated trace function is denoted by
. This function naturally
extends to a map
in a coefficientwise fashion.
The first method we study is rather elementary. It works well for any
characteristic, but it is only efficient when is
much smaller than
. We
compute
for all roots
of
and recover
by Chinese
remaindering. The simultaneous computation with all roots of
is emulated via standard algebraic techniques.
Since is not assumed to be irreducible,
is not necessarily a field. We will compute inverses
in
using the technique of dynamic
evaluation, which is briefly recalled in Appendix A.2.
For the reader's convenience, we use the auxiliary variable
for power series and keep the variable
for polynomials.
Algorithm
Compute in
.
Compute in
.
Perform the power series composition
in
.
Compute in
.
With the algorithm underlying Corollary 20, compute
the modular inverse .
Compute
and then .
Return .
operations in .
of
in some algebraic closure
of
, we write
for the homomorphism from
to
that sends
to
.
This homomorphism extends coefficientwise into a map
. One verifies that
equals
so equals
for all roots
of
.
This proves the correctness of the algorithm thanks to the Chinese
remainder theorem.
For the complexity analysis, step 3 amounts to
operations in
. In step 5, we
use Corollary 20 to compute
with
operations in
.
Steps 1, 2, 4, and 6 require
additional ring
operations in
. In step 7,
the row matrix of
may be computed with
operations in
by formula (20),
so each trace takes
operations in
. Therefore step 7 amounts to
operations in
, which is
negligible.
When is much smaller than
, we may directly benefit from fast power series
composition. But overall this does not improve much on Corollary 7. In the rest of this section, we aim at decreasing the
dependency on
in the complexity bounds.
In order to improve on Algorithm 4, we shall use a better
way to expand and
modulo
. For this purpose we
introduce the
-algebra
homomorphism
We regard and
as
conversions that we respectively call the untangling and
tangling mappings.
and all
, the map
is a
-algebra isomorphism.
is clearly a
homomorphism between
-algebras
of equal dimensions, so it suffices to prove its injectivity when
is separable. We consider the
-adic expansion
of
modulo
, where
all the
have degrees
, and we assume that
.
It is immediate that
. We may
thus suppose, by induction on
,
that
for all
.
From
we obtain
. Since
is
invertible, it follows that
,
whence
. This completes the
induction.
Remark. The separability assumption on is
necessary: if
and
,
then we have
.
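A brute-force version of the untangling map can serve as a reference implementation. The following Python sketch (our own code, reading the conversion as the map that sends x to alpha + t, i.e. Taylor expansion at the class alpha of x) computes it by Horner evaluation in A[t]/(t^l) with A = F_p[x]/(phi(x)), and checks on random inputs that it is a ring homomorphism.

```python
# Brute-force reference for the untangling map, read as sending x to alpha + t:
# sigma(P) = P(alpha + t) in A[t]/(t^l), where A = F_p[x]/(phi(x)) and alpha is the
# class of x.  Our own toy code; it also checks that sigma is a ring homomorphism.

import random

p, l = 13, 3
phi = [1, 1, 1]                  # x^2 + x + 1, separable over F_13
d = len(phi) - 1                 # here d = 2

def pmul(a, b, mod=None):
    c = [0] * max(len(a) + len(b) - 1, 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] = (c[i + j] + x * y) % p
    if mod is not None:                          # reduce modulo a monic polynomial
        for i in range(len(c) - 1, len(mod) - 2, -1):
            q = c[i]
            for j, y in enumerate(mod):
                c[i - len(mod) + 1 + j] = (c[i - len(mod) + 1 + j] - q * y) % p
        c = (c + [0] * len(mod))[:len(mod) - 1]
    return c

def padd(a, b):
    return [((a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)) % p
            for i in range(max(len(a), len(b)))]

def series_mul(u, v):
    """Product in A[t]/(t^l); u, v are lists of l elements of A."""
    w = [[0] * d for _ in range(l)]
    for i in range(l):
        for j in range(l - i):
            w[i + j] = padd(w[i + j], pmul(u[i], v[j], phi))
    return w

def untangle(P):
    """sigma(P) = P(alpha + t), computed by Horner in A[t]/(t^l)."""
    alpha = ([0, 1] + [0] * d)[:d]               # class of x in A (valid since d >= 2)
    one = ([1] + [0] * d)[:d]
    x_image = [alpha, one] + [[0] * d for _ in range(l - 2)]
    acc = [[0] * d for _ in range(l)]
    for c in reversed(P):
        acc = series_mul(acc, x_image)
        acc[0] = padd(acc[0], [c % p])
    return [(ai + [0] * d)[:d] for ai in acc]

phil = [1]
for _ in range(l):
    phil = pmul(phil, phi)                       # phi^l, the modulus on the polynomial side

random.seed(1)
P = [random.randrange(p) for _ in range(d * l)]
Q = [random.randrange(p) for _ in range(d * l)]
assert untangle(pmul(P, Q, phil)) == series_mul(untangle(P), untangle(Q))
```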
Algorithm
Compute .
Let and compute the characteristic
polynomial
of
modulo
.
Compute .
Write , where
is the class of
in
, and the
have degrees
.
For each
, compute
and let
.
Perform the power series composition in
.
Return .
From now on, in cost analyses, the expression
(resp.
) represents a cost
function for computing the direct image (resp. the
preimage) of
. If
good algorithms are known for
and
, for characteristic polynomials and for
compositions modulo
, then we
observe that Algorithm 5 boils down to one power series
composition at precision
over
.
operations in .
that sends
to
is well
defined, so applying it to
we obtain
. Therefore we have
, which ensures the correctness of the
algorithm. The cost directly follows from the definitions.
The rest of this section is devoted to algorithms with softly linear
costs for computing and its inverse. More
precisely, we will prove that we may take
and
to be
in the latter
proposition. This leads to our main theorem:
and
,
where
, such that
is separable of degree
.
Then, we can compute
using
operations in .
The following lemma is elementary, but we shall refer to it several times.
and a polynomial
, we have
.
by the Leibniz
formula, showing that
is divisible by
.
We begin with the easiest situation when the characteristic of
is zero or sufficiently large
and achieve softly linear time for
.
Algorithm
If then return
, otherwise let
.
Recursively apply the algorithm to ,
giving
.
Recursively apply the algorithm to ,
giving
.
Return .
operations
in
. If
or
, then, given
, one may compute
using
operations in
.
for all
and
for all
.
The cost analysis is straightforward. Finally, to deduce
, thanks to the assumption on the
characteristic, we may simply use the Taylor expansion
.
In order to handle the case ,
we need to pay attention to vanishing coefficients when computing Taylor
expansions using higher order derivatives. For this, we rely on the
following adaptation of the Taylor expansion of
:
The polynomial is called the
-th order Hasse derivative of
. The operator
is linear and we may compute it as follows: if
is a monomial and
, then
; otherwise
and
,
we have
.
.
A straightforward calculation in characteristic zero yields
Now if the characteristic divides the integer
, then it also divides
, and the lemma remains
correct.
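As a quick concrete check of this notion, the following Python sketch (our own code, over F_3) expands P(x + t) by repeated multiplication with x + t and verifies that the coefficient of each power of t is the corresponding Hasse derivative of P computed from the binomial formula above.

```python
# Quick concrete check of Hasse derivatives over F_p: expand P(x + t) by repeated
# multiplication with (x + t) and compare the coefficient of t^i with the i-th
# Hasse derivative computed from the binomial formula.  Our own code; p = 3.

from math import comb
import random

p = 3

def hasse(P, i):
    """i-th Hasse derivative of P over F_p, from H^i(x^n) = binomial(n, i) x^(n-i)."""
    return [comb(n, i) * c % p for n, c in enumerate(P)][i:]

def shift_expand(P):
    """Table T with T[i][j] = coefficient of t^i x^j in P(x + t), computed by Horner."""
    N = len(P)
    acc = [[0] * N for _ in range(N)]
    for c in reversed(P):                        # acc <- acc * (x + t) + c
        new = [[0] * N for _ in range(N)]
        for i in range(N):
            for j in range(N):
                if acc[i][j]:
                    if j + 1 < N:
                        new[i][j + 1] = (new[i][j + 1] + acc[i][j]) % p
                    if i + 1 < N:
                        new[i + 1][j] = (new[i + 1][j] + acc[i][j]) % p
        new[0][0] = (new[0][0] + c) % p
        acc = new
    return acc

random.seed(2)
P = [random.randrange(p) for _ in range(10)]
T = shift_expand(P)
for i in range(len(P)):
    assert T[i][:len(P) - i] == hasse(P, i)
```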
It is convenient to also introduce the Pochhammer symbol
The next lemma gives conditions under which Hasse derivatives in positive characteristic can be computed from Hasse derivatives of lower order.
be an integer power of
, let
be an integer such
that
, and let
be two integers in
.
Then, for any polynomial
, we
may compute
from
using
the formula
(1)
. Now in characteristic zero, we
have
for all . In positive
characteristic, we claim that the ratios
have
non-negative valuations in
.
Let us first consider the case when .
If
, with
, then
.
If
, then
. This shows that
is
well defined and may be computed in
whenever
is a multiple of
.
In a second step, we make the induction hypothesis that has been computed in
for
such that
. The
value for
may then be obtained via
since . This deals with the
case when
.
Finally, for any , the ratio
is also well defined since it rewrites into
which concludes the proof.
We are now in a position to adapt the “divide and conquer”
Algorithm 6 to the case when has
small positive characteristic
.
Algorithm
If , then return
.
Let .
Compute .
By using formula (1), compute
from
.
Recursively apply the algorithm to ,
,
,
,
, and
. This yields
for all
.
Recursively apply the algorithm to ,
,
,
,
,
, and
.
This yields
, for all
.
Return , for all
.
operations
in
.
Algorithm
If , then use
Algorithm 6 to compute
and
return
.
Let be minimal with
, let
,
and
.
Compute for all
using Algorithm 7 (called with
,
,
and
).
If , then compute
from
with
formula (1).
For all , use the
algorithm recursively with
in order to
obtain
.
If then use the algorithm recursively
with
in order to obtain
.
Return .
operations in .
. We need to show that
coincides with
for all
. From Lemma 14, we
have
. Since
, it follows that
.
Now we suppose that . We have
, so
is a power of
, we have
, and the assumption
is preserved through recursive calls. The conditions of
Algorithm 7 are satisfied, whence the correctness of step 3
by Proposition 16. Step 4 is a consequence of Lemmas 12 and 15. This concludes the correctness proof.
Step 1 costs by Proposition 13.
Step 3 costs
by Proposition 16 and
step 4 takes
operations in
. The total cost function
of the algorithm satisfies
where is a sufficiently large constant and
when
.
This leads to
.
The next algorithm is independent of the characteristic. It relies on
the untangling algorithms from the previous subsections, by using the
“divide and conquer” strategy. For a polynomial or series
in
and
, we denote
and
. For convenience, we assume that
is a nondecreasing function of
.
Algorithm
If , then let
be the preimage of
in
and return
.
Let and compute
,
.
Return .
.
In particular, for any
, we
have
. Otherwise we verify that
Kronecker substitution [8, Chapter 8, Section 4] allows us
to multiply two series in at precision
using
operations in
.
The computation of requires one inversion in
that takes
operations in
, together with
operations for the series inversion. The other
amount to
more operations in
. The cost function
satisfies
for some constant , which
leads to
. The final bound of
the lemma follows from Proposition 17.
Let be a monic polynomial in
of degree
, let
be a linear form on
,
and let
. We study the
modular power projection problem, which corresponds to
computing
.
Let represent
.
If we take
, where
denotes the usual trace function of
over
, then the latter power
projections can be written as
It is well known that these traces are related to the characteristic
polynomial of modulo
by
the Newton–Girard identities:
where is the reverse polynomial of
. If there exist given inverses of
in
, then it is
possible to integrate the latter differential equation in
with
operations in
, which directly leads to an algorithm to
compute
from the power sums of the roots of
(see for instance [9, Section 2] for
details, which also cover the case when
is a
finite field).
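As a sanity check of this identity, the following Python sketch (our own code over F_p with a large prime p) computes the power sums of the roots of a monic polynomial with prescribed roots from the logarithmic derivative of its reverse polynomial, and compares them with direct summation. Note that this is the easy direction; the algorithm referred to above goes the other way, from power sums back to the characteristic polynomial, and additionally needs the inverses of 1, ..., n.

```python
# Hedged sanity check of the Newton-Girard identity used above: over F_p, for a
# monic chi with known roots, the power sums of the roots appear (up to sign) in
# the logarithmic derivative of the reverse polynomial of chi.  Our own code.

p = 10007

def series_inv(a, m):
    """Inverse of the power series a (with a[0] != 0) modulo x^m, naive recurrence."""
    inv = [pow(a[0], p - 2, p)]
    for k in range(1, m):
        s = sum(a[j] * inv[k - j] for j in range(1, min(k, len(a) - 1) + 1)) % p
        inv.append(-s * inv[0] % p)
    return inv

def series_mul(a, b, m):
    c = [0] * m
    for i, x in enumerate(a[:m]):
        for j, y in enumerate(b[:m - i]):
            c[i + j] = (c[i + j] + x * y) % p
    return c

roots = [3, 5, 5, 11, 42]                          # roots in F_p, multiplicities allowed
chi = [1]                                           # chi(y) = prod (y - r), increasing degree
for r in roots:
    chi = [(-r * chi[0]) % p] + \
          [(chi[i - 1] - r * chi[i]) % p for i in range(1, len(chi))] + [1]

n = len(roots)
rev = chi[::-1]                                     # rev(x) = x^n * chi(1/x), rev[0] = 1
revp = [(j * rev[j]) % p for j in range(1, len(rev))]     # derivative of rev
series = series_mul(revp, series_inv(rev, n), n)    # rev'/rev modulo x^n
power_sums = [(-series[k]) % p for k in range(n)]   # k-th power sum = -[x^(k-1)] rev'/rev
assert power_sums == [sum(pow(r, k, p) for r in roots) % p for k in range(1, n + 1)]
```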
First, let us explain how to compute
efficiently. Recall that
. In
the basis
, the linear form
may be computed as
which uses operations in
. If
represents the
multiplication endomorphism by
in
, then we can write
and
therefore
. By transposing
the modular multiplication by
with the
techniques from [2, Sections 4 and 5], the computation of
requires
operations in
.
Let be a field. Often in computer algebra, we
are led to compute in monogenic algebras such as
, where
is separable of
degree
, but not necessarily
irreducible. Ring operations in
require
operations in
.
Algorithms such as the determinant are often slower over rings than
their counterparts over fields. For this reason, we often like to
conduct our computations as if
were a field.
From the programming point of view, this is easy to achieve: whenever we
need to invert an element
or test whether
, we factor
, where
and
is invertible modulo
. We
then restart the relevant part of the computations with
and/or
in the role of
. This approach is known as the dynamic
evaluation principle in computer algebra [6, 7].
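A minimal Python illustration of this splitting mechanism (our own toy code over F_7, with naive arithmetic and hypothetical helper names): attempting to invert an element modulo a separable but reducible modulus either returns the inverse or splits the modulus into two coprime factors, on which the computation can be pursued independently.

```python
# Toy illustration of dynamic evaluation over F_7: inversion modulo a separable
# but reducible modulus either succeeds or splits the modulus via a gcd.
# All helper names are ours; arithmetic is naive.

p = 7

def red(a):
    a = [c % p for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def mul(a, b):
    c = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return red(c)

def divmod_(a, b):
    """Euclidean division by a nonzero polynomial b (not necessarily monic)."""
    a = [c % p for c in a]
    binv = pow(b[-1], p - 2, p)
    q = [0] * max(len(a) - len(b) + 1, 0)
    for i in range(len(a) - 1, len(b) - 2, -1):
        f = a[i] * binv % p
        q[i - len(b) + 1] = f
        for j, y in enumerate(b):
            a[i - len(b) + 1 + j] = (a[i - len(b) + 1 + j] - f * y) % p
    return red(q), red(a[:len(b) - 1])

def gcdex(a, mu):
    """Return (g, u) with g = gcd(a, mu) monic and u*a = g modulo mu."""
    r0, r1, u0, u1 = red(mu), red(a), [], [1]
    while r1:
        qt, r = divmod_(r0, r1)
        t = mul(qt, u1)
        r0, r1 = r1, r
        u0, u1 = u1, red([(x - y) % p for x, y in zip(u0 + [0] * len(t), t + [0] * len(u0))])
    lead = pow(r0[-1], p - 2, p)
    return red([lead * c for c in r0]), red([lead * c for c in u0])

def try_invert(a, mu):
    """Dynamic evaluation step: either ('inverse', a^{-1} mod mu) or a splitting of mu."""
    g, u = gcdex(a, mu)
    if len(g) == 1:                       # the gcd is 1: a is invertible modulo mu
        return ("inverse", u)
    mu2 = divmod_(mu, g)[0]               # mu = g * mu2, and a is invertible modulo mu2
    return ("split", g, mu2)

mu = mul([1, 1], mul([2, 1], [3, 1, 1]))  # (x+1)(x+2)(x^2+x+3): separable, not irreducible
print(try_invert([5, 1], mu))             # x+5 is coprime to mu: we get its inverse
print(try_invert([2, 1], mu))             # x+2 divides mu: the modulus splits
```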
We borrow the following complexity result for extended gcds from Dahan
et al. [6]:
be a separable polynomial of degree
over
and let
and
be univariate polynomials over
of degrees
.
Using
operations in
, one can compute a factorization
of
and triples of polynomials
respectively over
, such that
,
is
monic,
generates the extension of the ideal
(
,
) to
,
and the Bézout relation
holds with
and
, for
all
.
is square-free over a perfect field” by
“
is separable”. In fact all
arguments of [6] apply in this setting mutatis
mutandis. An alternative point of view is to apply [6,
Propositions 2.4 and 4.1] over the algebraic closure of
, while noticing that all actual computations
are really done over
.
be a separable polynomial of degree
over
and let
and
be univariate polynomials over
of degrees
,
such that
is monic and
is invertible modulo
. Then
the inverse of
modulo
may be computed using
operations in
.
of the
previous proposition satisfy
over
. We recover the inverse
of
modulo
from the
using the Chinese remainder theorem.
D. J. Bernstein. Composing power series over a finite ring in essentially linear time. J. Symbolic Comput., 26(3):339–341, 1998.
A. Bostan, G. Lecerf, and É. Schost. Tellegen's principle into practice. In Hoon Hong, editor, Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation, ISSAC '03, pages 37–44, New York, NY, USA, 2003. ACM.
R. P. Brent and H. T. Kung. Fast algorithms for manipulating formal power series. J. ACM, 25(4):581–595, 1978.
P. Bürgisser, M. Clausen, and M. A. Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, 1997.
D. G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Infor., 28(7):693–701, 1991.
X. Dahan, M. Moreno Maza, É. Schost, and Yuzhen Xie. On the complexity of the D5 principle. In J.-G. Dumas, editor, Proceedings of Transgressive Computing 2006: a conference in honor of Jean Della Dora, pages 149–168. U. J. Fourier, Grenoble, France, 2006.
J. Della Dora, C. Dicrescenzo, and D. Duval. A new method for computing in algebraic number fields. In G. Goos and J. Hartmanis, editors, Eurocal'85 (2), volume 174 of Lect. Notes in Comp. Science, pages 321–326. Springer, 1985.
J. von zur Gathen and J. Gerhard. Modern computer algebra. Cambridge University Press, New York, 2nd edition, 2003.
B. Grenet, J. van der Hoeven, and G. Lecerf. Deterministic root finding over finite fields using Graeffe transforms. Appl. Alg. Eng. Comm. Comp., 27(3):237–257, 2016.
D. Harvey, J. van der Hoeven, and G. Lecerf. Faster polynomial multiplication over finite fields. J. ACM, 63(6), 2017. Article 52.
J. van der Hoeven. Relax, but don't be too lazy. J. Symbolic Comput., 34(6):479–542, 2002.
J. van der Hoeven. Fast composition of numeric power series. Technical Report 2008-09, Université Paris-Sud, Orsay, France, 2008.
J. van der Hoeven and G. Lecerf. Modular composition via factorization. Technical report, CNRS & École polytechnique, 2017. http://hal.archives-ouvertes.fr/hal-01457074.
Xiaohan Huang and V. Y. Pan. Fast rectangular matrix multiplication and applications. J. Complexity, 14(2):257–299, 1998.
E. Kaltofen and V. Shoup. Fast polynomial factorization over high algebraic extensions of finite fields. In Proceedings of the 1997 International Symposium on Symbolic and Algebraic Computation, ISSAC '97, pages 184–188, New York, NY, USA, 1997. ACM.
E. Kaltofen and V. Shoup. Subquadratic-time factoring of polynomials over finite fields. Math. Comp., 67(223):1179–1197, 1998.
K. S. Kedlaya and C. Umans. Fast modular composition in any characteristic. In FOCS'08: IEEE Conference on Foundations of Computer Science, pages 146–155, Washington, DC, USA, 2008. IEEE Computer Society.
K. S. Kedlaya and C. Umans. Fast polynomial factorization and modular composition. SIAM J. Comput., 40(6):1767–1802, 2011.
F. Le Gall. Powers of tensors and fast matrix multiplication. In Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, ISSAC '14, pages 296–303, New York, NY, USA, 2014. ACM.
G. Lecerf. On the complexity of the Lickteig-Roy subresultant algorithm. Technical report, CNRS & École polytechnique, 2017. https://hal.archives-ouvertes.fr/hal-01450869.
M. S. Paterson and L. J. Stockmeyer. On the number of nonscalar multiplications necessary to evaluate polynomials. SIAM J. Comput., 2(1):60–66, 1973.
P. Ritzmann. A fast numerical algorithm for the composition of power series with complex coefficients. Theoret. Comput. Sci., 44:1–16, 1986.