Faster relaxed multiplication

Joris van der Hoeven
CNRS, Laboratoire d'informatique
École polytechnique
91128 Palaiseau Cedex
France
Email: joris@texmacs.org

November 3, 2023

This work has been partly supported by the French ANR-09-JCJC-0098-01 project.
Abstract
In previous work, we have introduced several fast algorithms for relaxed
power series multiplication (also known under the name on-line
multiplication) up to a given order $n$.
The fastest currently known algorithm works over an effective base field
$\mathbb{K}$ with sufficiently many $2^p$-th roots of unity and has
algebraic time complexity $O(n \log n \, \mathrm{e}^{2 \sqrt{\log 2 \, \log \log n}})$.
In this paper, we will generalize this algorithm to the cases when
$\mathbb{K}$ is replaced by an effective ring of positive characteristic
or by an effective ring of characteristic zero, which is also
torsion-free as a $\mathbb{Z}$-module and comes with an additional
algorithm for partial division by integers. In particular, we may take
$\mathbb{K}$ to be any effective field. We will also present an
asymptotically faster algorithm for relaxed multiplication of
$p$-adic numbers.
Keywords: power series, multiplication, on-line algorithm, FFT, computer algebra
A.M.S. subject classification: 68W30, 30B10, 68W25, 33F05, 11Y55, 42-04
Let $\mathbb{A}$ be an effective (possibly non-commutative) ring;
i.e., we assume data structures for representing the elements of
$\mathbb{A}$ and algorithms for performing the ring operations $+$,
$-$ and $\times$. The aim of algebraic complexity theory is to study
the cost of basic or more complex algebraic operations over
$\mathbb{A}$ (such as the multiplication of polynomials or matrices)
in terms of the number of operations in $\mathbb{A}$.
The algebraic complexity usually does not coincide with the bit
complexity, which also takes into account the potential growth of the
actual coefficients in $\mathbb{A}$.
Nevertheless, understanding the algebraic complexity usually constitutes
a first useful step towards understanding the bit complexity. Of course,
in the special case when $\mathbb{A}$ is a finite field, both
complexities coincide up to a constant factor.
One of the most central operations is polynomial multiplication. We will
denote by $\mathsf{M}(n)$ the number of operations required to
multiply two polynomials of degrees $< n$ in $\mathbb{A}[z]$. If
$\mathbb{A}$ admits primitive $2^p$-th roots of unity for all $p$,
then we have $\mathsf{M}(n) = O(n \log n)$
using FFT multiplication, which is based on the fast
Fourier transform [12]. In general, it has been shown [28, 10] that
$\mathsf{M}(n) = O(n \log n \log \log n)$.
The complexities of most other operations (division, Taylor shift,
extended g.c.d., multipoint evaluation, interpolation,
etc.) can be expressed in terms of $\mathsf{M}(n)$.
Often, the cost of such other operations is simply
$\tilde{O}(n)$, where $\tilde{O}(T)$ stands for
$O(T \log^{O(1)} T)$; see [2, 6, 15] for some classical results along these
lines.
The complexity of polynomial multiplication is fundamental for studying
the cost of operations on formal power series in $\mathbb{A}[[z]]$
up to a given truncation order $n$.
Clearly, it is possible to perform the multiplication up to order $n$
in time $O(\mathsf{M}(n))$:
it suffices to multiply the truncated power series at order $n$
and truncate the result. Using Newton's method, and
assuming that the necessary divisions can be carried out in
$\mathbb{A}$, it is also possible to compute $f^{-1}$, $\exp f$,
etc. up to order $n$ in time $O(\mathsf{M}(n))$.
More generally, it has been shown in [9, 17,
29, 23] that the power series solutions of
algebraic differential equations with coefficients in $\mathbb{A}$
can be computed up to order $n$ in time $O(\mathsf{M}(n))$.
However, in this case, the "$O$" hides a non-trivial
constant factor which depends on the expression size of the equation
that one wants to solve.
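To make the Newton approach concrete, here is a minimal Python sketch (an illustration, not taken from the paper; the function names are ours and `mul_trunc` is a naive quadratic stand-in for a fast truncated product) which computes the inverse of a power series with invertible constant coefficient, doubling the precision at every step:

```python
from fractions import Fraction

def mul_trunc(f, g, n):
    """Truncated product of coefficient lists f, g modulo z^n (naive O(n^2))."""
    h = [Fraction(0)] * n
    for i, fi in enumerate(f[:n]):
        for j, gj in enumerate(g[:n - i]):
            h[i + j] += fi * gj
    return h

def inverse(f, n):
    """Compute g with f*g = 1 mod z^n by the Newton iteration g <- g*(2 - f*g),
    which doubles the number of correct coefficients at every step."""
    g = [1 / Fraction(f[0])]
    prec = 1
    while prec < n:
        prec = min(2 * prec, n)
        fg = mul_trunc(f, g, prec)
        two_minus_fg = [2 - fg[0]] + [-c for c in fg[1:]]
        g = mul_trunc(g, two_minus_fg, prec)
    return g

# 1/(1 - z) = 1 + z + z^2 + ...: all eight computed coefficients equal 1
f = [Fraction(1), Fraction(-1)] + [Fraction(0)] * 6
print(inverse(f, 8))
```

Replacing `mul_trunc` by an FFT-based product yields the $O(\mathsf{M}(n))$ cost mentioned above, since the precision doubles at every iteration.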
The relaxed approach for computations with formal power series
makes it possible to solve equations in quasi-optimal time with respect
to the sparse expression size of the equations. The idea is to consider
power series as streams of coefficients $f_0, f_1, \ldots$
and to require that all operations are performed
"without delay" on these streams. For instance, for a
multiplication $h = f g$ of two power series
$f, g \in \mathbb{A}[[z]]$, we require that $h_n$ is
computed as soon as $f_0, \ldots, f_n$ and $g_0, \ldots, g_n$ are known. Any
algorithm which has this property will be called a relaxed or
on-line algorithm for multiplication.
Given a relaxed algorithm for multiplication, it is possible to let the
later coefficients of the input depend
on the known coefficients of the output. For
instance, given a power series $f \in \mathbb{A}[[z]]$ with
$f_0 = 0$, we may compute $g = \exp f$ using the formula

$g = \int g f' \qquad (1)$

provided that $\mathbb{Q} \subseteq \mathbb{A}$. Indeed,
extraction of the coefficient of $z^n$ in $g$ and $\int g f'$
yields $g_n = \frac{1}{n} (g f')_{n-1}$, which only depends on
$g_0, \ldots, g_{n-1}$. More generally, we define an equation of the form

$f = \Phi(f) \qquad (2)$

to be recursive, if $\Phi(f)_n$ only depends on
$f_0, \ldots, f_{n-1}$. Replacing $\mathbb{A}$
by $\mathbb{A}^r$, we notice that the same
terminology applies to systems of $r$ equations. In
the case of an implicit equation, special rewriting techniques can be
employed in order to transform the input equation into a recursive
equation [24, 22, 5].
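A classical instance of a recursive equation is $g = \int g f'$ with $f_0 = 0$, which computes $g = \exp f$. The following Python sketch (an illustration, not taken from the paper) solves it coefficient by coefficient using the relation $g_k = (g f')_{k-1} / k$, which indeed only involves the already-known coefficients $g_0, \ldots, g_{k-1}$:

```python
from fractions import Fraction

def exp_series(f, n):
    """Solve the recursive equation g = int(g * f') up to order n,
    assuming f[0] == 0, via g_k = (g * f')_{k-1} / k."""
    assert f[0] == 0
    df = [Fraction((i + 1) * f[i + 1]) for i in range(len(f) - 1)]  # f'
    g = [Fraction(1)]                                               # g_0 = exp(0)
    for k in range(1, n):
        # convolution coefficient (g * f')_{k-1}, using g_0..g_{k-1} only
        conv = sum((g[i] * df[k - 1 - i]
                    for i in range(k) if k - 1 - i < len(df)),
                   Fraction(0))
        g.append(conv / k)
    return g

# exp(z) from f = z: the coefficients are 1/k!
print(exp_series([Fraction(0), Fraction(1)], 5))
```

Here the inner convolution is naive; with a relaxed product of cost $\mathsf{R}(n)$, the same feedback loop runs in time $O(\mathsf{R}(n))$.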
Let $\mathsf{R}(n)$ denote the cost of performing a relaxed
multiplication up to order $n$.
If $\Phi$ is an expression which involves $s$
multiplications and $t$ other
"linear time" operations (additions, integrations, etc.),
then it follows that (2) can be solved up to order $n$
in time $O(s \, \mathsf{R}(n) + t \, n)$.
If we had $\mathsf{R}(n) = O(\mathsf{M}(n))$, then this would
yield an optimal algorithm for solving (2) in the sense
that the computation of the solution would essentially require the same
time as its verification.
The naive algorithm for computing $h = f g$, based on the formula

$h_n = \sum_{i=0}^{n} f_i g_{n-i},$

is clearly relaxed. Unfortunately, FFT multiplication is not
relaxed, since the coefficients $h_0, \ldots, h_{2n-2}$ are computed
simultaneously as a function of $f_0, \ldots, f_{n-1}$ and
$g_0, \ldots, g_{n-1}$ in this case.
In [16, 17] it was remarked that Karatsuba's
algorithm [25] for multiplying
polynomials can be rewritten in a relaxed manner. Karatsuba
multiplication and its relaxed version thus require exactly the
same number of operations. In [16, 17], an
additional fast relaxed algorithm was presented with time complexity

$\mathsf{R}(n) = O(\mathsf{M}(n) \log n). \qquad (3)$

We were recently made aware of the fact that a similar algorithm was first published in [13]. However, this early paper was presented in a different context of on-line (relaxed) multiplication of integers (instead of power series), and without the application to the resolution of recursive equations (which is quite crucial from our perspective).
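The naive formula translates directly into an on-line algorithm, as the following Python sketch illustrates (an illustration only, with quadratic total cost; the class name is ours):

```python
class OnlineProduct:
    """Naive relaxed (on-line) multiplication: the coefficient h_n is
    returned as soon as f_0..f_n and g_0..g_n have been supplied."""
    def __init__(self):
        self.f, self.g = [], []

    def next_coefficient(self, fn, gn):
        self.f.append(fn)
        self.g.append(gn)
        n = len(self.f) - 1
        return sum(self.f[i] * self.g[n - i] for i in range(n + 1))

prod = OnlineProduct()
# (1 + z + z^2 + ...)^2 = 1 + 2z + 3z^2 + ...
print([prod.next_coefficient(1, 1) for _ in range(5)])  # [1, 2, 3, 4, 5]
```

Because each call only inspects the coefficients supplied so far, the later inputs may depend on the earlier outputs, which is exactly what the resolution of recursive equations requires.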
An interesting question remained: can the bound (3) be
lowered further, be it by a constant factor? In [18], it
was first noticed that an approximate factor of two can be gained if one
of the multiplicands is known beforehand. For instance, if we want to
compute $g = \exp f$ for a known series $f$
with $f_0 = 0$, then the coefficients of $f'$
are already known in the product $g f'$
in (1), so only one of the inputs depends on
the output. An algorithm for the computation of a product $h = f g$
is said to be semi-relaxed, if $h_n$ is
written to the output as soon as $f_0, \ldots, f_n$ are known, but
all coefficients of $g$ are known beforehand. We
will denote by $\mathsf{R}^{\ast}(n)$ the complexity of semi-relaxed
multiplication. We recall from [21] (see also Section 3) that relaxed multiplication reduces to semi-relaxed
multiplication:

$\mathsf{R}(n) = O(\mathsf{R}^{\ast}(n)).$

It has recently been pointed out [26] that the factor of two that was gained in the case of semi-relaxed
multiplication can also be gained in the general case.
The first reduction of (3) by a non-constant factor was
published in [21], and uses the technique of FFT
blocking (which has also been used for the multiplication of
multivariate polynomials and power series in [17, Section
6.3], and for speeding up Newton iterations in [3, 23]).
Under the assumption that $\mathbb{A}$ admits primitive
$2^p$-th roots of unity for all $p$
(or at least for all $p$ with $2^p \leq n$), we showed that

$\mathsf{R}(n) = O(n \log n \, \mathrm{e}^{2 \sqrt{\log 2 \, \log \log n}}). \qquad (4)$
The function $\mathrm{e}^{2 \sqrt{\log 2 \, \log \log n}}$ has slower growth than any strictly
positive power of $\log n$. It is
convenient to write
whenever
for all
. In particular, it
follows that
In Section 3, we will recall the main ideas from [21] which lead to the complexity bound (4).
In fact, in Section 4, we will see that the complexity bound from [21] can be further reduced to
Since such complexity bounds involve rather complex expressions, it will be convenient to use the following abbreviations:
Clearly, .
We recall that the characteristic of a ring $\mathbb{A}$ is
the integer $c \geq 0$ such that the canonical ring
homomorphism $\mathbb{Z} \to \mathbb{A}$ has kernel
$c \, \mathbb{Z}$. If $\mathbb{A}$ is torsion-free as a
$\mathbb{Z}$-module (i.e. $k a = 0 \Rightarrow a = 0$
for any $k \in \mathbb{Z} \setminus \{0\}$
and $a \in \mathbb{A}$), then we will say that $\mathbb{A}$
admits an effective partial division by integers if there exists an
algorithm which takes $k \in \mathbb{Z} \setminus \{0\}$ and
$a \in k \mathbb{A}$ on input and which returns the unique
$b \in \mathbb{A}$ with $a = k b$ on output. We will count a unit cost for such
divisions. The main result of this paper is:
Theorem 1. Assume that one of the following two conditions holds:
a) $\mathbb{A}$ is an effective ring of characteristic zero,
which is torsion-free as a $\mathbb{Z}$-module,
and which admits an effective partial division by integers.
b) $\mathbb{A}$ is an effective ring of positive
characteristic.
Then we have
We notice that the theorem holds in particular if $\mathbb{A}$
is an effective field. In Section 8, we will also consider
the relaxed multiplication of $p$-adic
numbers, where $p \geq 2$ is not necessarily prime.
If we denote by $\mathsf{I}(n)$ the bit complexity of
multiplying two $n$-bit
integers, then [13] essentially provided an algorithm of
bit complexity $O(\mathsf{I}(n) \log n)$
for the relaxed multiplication of
$2$-adic numbers at order $n$. Various algorithms and benchmarks
for more general $p$ were presented in [4].
It is also well known [33, 11, 28,
14] that $\mathsf{I}(n) = O(n \log n \, 2^{O(\log^{*} n)})$. Let
$\mathsf{R}_p(n)$ denote the bit complexity of the relaxed
multiplication of two $p$-adic
numbers modulo $p^n$. In Section
8, we will prove the following new result:
Theorem 2. Let $p \geq 2$ be an integer. Then we have

uniformly in $n$ and $p$.
For comparison, the best previously known bound for relaxed
multiplication in $\mathbb{Z}_p$ was
We thus improved the previous bound by a factor
at least, up to sublogarithmic terms.
The main idea which allows for the present generalizations is quite
straightforward. In our original algorithm from [21], the
presence of sufficiently many primitive $2^p$-th
roots of unity in $\mathbb{A}$ gives rise to a quasi-optimal
evaluation-interpolation strategy for the multiplication of polynomials.
More precisely, given two polynomials of degrees $< n$,
their FFT-multiplication only requires $2 n$
evaluation and interpolation points, and both the
evaluation and interpolation steps can be performed efficiently, using
only $O(n \log n)$ operations. Now it has recently been shown
[8] that quasi-optimal evaluation-interpolation strategies
still exist if we evaluate and interpolate at points in geometric
progressions instead of roots of unity. This result is the key to our
new complexity bounds, although further technical details have to be
dealt with in order to make things work for various types of effective
rings $\mathbb{A}$. We also notice that
the main novelty of [8] concerns the interpolation step.
Fast evaluation at geometric progressions was possible before using the
so-called chirp transform [27, 7].
For effective rings $\mathbb{A}$ of positive characteristic,
this would actually have been sufficient for proving the bound (5).
Our paper is structured as follows. Since the algorithms of [8]
were presented in the case when the coefficients lie in an effective
field, Section 2 starts with their generalization to more
general effective rings.
These generalizations are purely formal and contain no essentially new
ideas. In Section 3, we give a short survey of the
algorithm from [21], but we recommend reading the original
paper for full technical details. In Section 4, we sharpen
the complexity analysis for the algorithm from [21]. In
Section 5, we prove Theorem 1 in the case when $\mathbb{A}$
has characteristic zero. In Section 6,
we turn our attention to the case when $\mathbb{A}$ has prime
characteristic $p$. If the
characteristic is sufficiently large, then we may find sufficiently
large geometric progressions in $\mathbb{A}$ in order to
generalize the results from Section 5. Otherwise, we have
to work over an extension ring of the form $\mathbb{A}[x]/(\mu(x))$
for a suitable monic polynomial $\mu$ of sufficiently large degree.
In Section 7, we
complete the proof of Theorem 1; the case when the
characteristic is a prime power is a refinement of the result from
Section 6. The remaining case is done via Chinese
remaindering. In Section 8, we will prove Theorem 2.
Acknowledgments. I am grateful to the referees for their detailed comments and suggestions. A special thanks goes to the referee who made a crucial suggestion for the improved complexity analysis in Section 4.
Let $\mathbb{D}$ be a (commutative) effective integral domain
and let $\mathbb{K}$ be its quotient field. Assume that
$\mathbb{D}$ has characteristic zero. Let
$\mathbb{M}$ be an effective torsion-free $\mathbb{D}$-module
and $\mathbb{V} = \mathbb{K} \otimes_{\mathbb{D}} \mathbb{M}$.
Elements of $\mathbb{K}$ and $\mathbb{V}$ are fractions
$x / d$ with $x \in \mathbb{D}$ (resp. $x \in \mathbb{M}$) and
$d \in \mathbb{D} \setminus \{0\}$,
and the operations $+$, $-$ and $\times$ on such fractions are as
usual:

$\frac{x}{d} + \frac{y}{e} = \frac{e x + d y}{d e}, \qquad \frac{x}{d} \cdot \frac{y}{e} = \frac{x y}{d e}.$

For $x / d \in \mathbb{K} \setminus \{0\}$, we also have
$(x / d)^{-1} = d / x$. It follows that $\mathbb{K}$
is an effective field and $\mathbb{V}$ an effective
$\mathbb{K}$-vector space. Moreover, all field
operations in $\mathbb{K}$ (and all vector space operations
in $\mathbb{V}$) can be performed using
only $O(1)$ operations in $\mathbb{D}$ (resp. $\mathbb{D}$
or $\mathbb{M}$).
We will say that $\mathbb{M}$ admits an effective partial
division, if for every $d \in \mathbb{D} \setminus \{0\}$
and $x \in d \mathbb{M}$, we can compute the unique
$y \in \mathbb{M}$ with $x = d y$. In that case, we
will count any division of the above kind as one operation in
$\mathbb{M}$. Similarly, given a fixed
$d \in \mathbb{D} \setminus \{0\}$, we say that $\mathbb{M}$
admits an effective partial division by $d$, if for every
$x \in d \mathbb{M}$ we can compute the unique $y \in \mathbb{M}$
with $x = d y$.
Given
, we define
Given and
,
we will denote by
the number of operations in
and
which are needed in
order to compute the product
.
Lemma 3. Let $\mathbb{D}$ be an effective integral domain and
$\mathbb{M}$ an effective torsion-free $\mathbb{D}$-module.
There exists a constant $C$,
such that the following holds: for any $n \in \mathbb{N}$,
$P \in \mathbb{M}[z]$ of degree $< n$, and $q \in \mathbb{D}$
such that the points $1, q, \ldots, q^{n-1}$
are pairwise distinct, and such that $\mathbb{M}$
admits effective divisions by $q$
and by $q^i - 1$ for $0 < i < n$, we have:
a) We may compute $P(1), P(q), \ldots, P(q^{n-1})$ from $P$
using at most $C \, \mathsf{M}(n)$
operations in $\mathbb{D}$ and $\mathbb{M}$.
b) We may reconstruct $P$ from $P(1), P(q), \ldots, P(q^{n-1})$
using at most $C \, \mathsf{M}(n)$
operations in $\mathbb{D}$ and $\mathbb{M}$.
Proof. In the case when $\mathbb{D}$ is a
field and $\mathbb{M} = \mathbb{D}$, this result was
first proven in [8]. More precisely, the conversions can be
done using the algorithms NewtonEvalGeom,
NewtonInterpGeom, NewtonToMonomialGeom
and MonomialToNewtonGeom in that paper. Examining these
algorithms, we observe that general elements in $\mathbb{M}$
are only multiplied with elements in $\mathbb{D}$ and divided
by the elements listed in the statement of the lemma. In
particular, the algorithms can still be applied in the more general case
when $\mathbb{D}$ is a field and $\mathbb{M}$ a
vector space over $\mathbb{D}$.
If $\mathbb{D}$ is only an effective integral domain and
$\mathbb{M}$ an effective torsion-free
$\mathbb{D}$-module with an effective partial division, then we
define the effective field $\mathbb{K}$ and the effective
vector space $\mathbb{V}$ as above, and we may still apply
the generalized algorithms for multipoint evaluation and interpolation
in $\mathbb{V}$. In particular, both
multipoint evaluation and interpolation can still be done using
$O(\mathsf{M}(n))$ operations in $\mathbb{K}$ and $\mathbb{V}$, whence
$O(\mathsf{M}(n))$ operations in $\mathbb{D}$ and $\mathbb{M}$.
If we know that the end results of these algorithms really lie
in the submodule $\mathbb{M}$ of $\mathbb{V}$ (or
in the subring $\mathbb{D}$ of $\mathbb{K}$), then we use the
partial division in $\mathbb{M}$ (or $\mathbb{D}$) to replace their
representations in $\mathbb{V}$ (or $\mathbb{K}$) by representations
in $\mathbb{M}$ (or $\mathbb{D}$).
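To make the evaluation-interpolation strategy of this section concrete, here is a small Python illustration (not the quasi-linear algorithms of [8], but naive quadratic stand-ins with names of our own choosing) which multiplies two polynomials over $\mathbb{Q}$ by evaluating at a geometric progression and interpolating back; note that the only divisions performed are by differences of powers of the ratio $q$:

```python
from fractions import Fraction

def eval_geometric(P, q, n):
    """Evaluate the polynomial P (coefficient list, lowest degree first)
    at the geometric progression 1, q, q^2, ..., q^(n-1)."""
    return [sum(c * q**(i * j) for j, c in enumerate(P)) for i in range(n)]

def interp_geometric(values, q):
    """Recover the unique polynomial of degree < n from its values at
    1, q, ..., q^(n-1) by naive Lagrange interpolation; the divisions
    are by products of differences q^i - q^j."""
    n = len(values)
    pts = [q**i for i in range(n)]
    P = [Fraction(0)] * n
    for i in range(n):
        num = [Fraction(1)]           # running product prod_{j != i} (x - q^j)
        denom = Fraction(1)
        for j in range(n):
            if j != i:
                shifted = [Fraction(0)] + num   # x * num
                num = [s - pts[j] * c
                       for s, c in zip(shifted, num + [Fraction(0)])]
                denom *= pts[i] - pts[j]
        for k in range(n):
            P[k] += values[i] * num[k] / denom
    return P

# Multiply (1 + 2x) and (3 + x^2) at the geometric progression of ratio 2.
q = Fraction(2)
f = [Fraction(1), Fraction(2)]
g = [Fraction(3), Fraction(0), Fraction(1)]
n = len(f) + len(g) - 1                 # 4 points determine a degree-3 product
vals = [a * b for a, b in zip(eval_geometric(f, q, n), eval_geometric(g, q, n))]
print(interp_geometric(vals, q))        # 3 + 6x + x^2 + 2x^3
```

The point of Lemma 3 is that both conversions can be performed with $O(\mathsf{M}(n))$ operations instead of the quadratic cost above.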
Let $\mathbb{A}$ be an effective (possibly non-commutative)
ring and recall that
Given a power series $f \in \mathbb{A}[[z]]$ and
$i \leq j$, we will also use the notations
The fast relaxed algorithms from [21] are all based on two main changes of representation: “blocking” and the fast Fourier transform. Let us briefly recall these transformations and how to use them for the design of fast algorithms for relaxed multiplication.
Given , we may then compute
using
where and
and it is classical [12] that both
and
can be computed using
operations in
. The
operations
and
extend
naturally to
via
Given $f, g \in \mathbb{A}[[z]]$, this allows us to
compute $f g$ using the formula

$f g = \mathrm{FFT}_{\omega}^{-1} \big( \mathrm{FFT}_{\omega}(f) \, \mathrm{FFT}_{\omega}(g) \big), \qquad (6)$

where $\mathrm{FFT}_{\omega}(f) \, \mathrm{FFT}_{\omega}(g)$ is a pointwise product in
$\mathbb{A}^{2m}[[u]]$. The first $n$
coefficients of the transforms can be computed using at most
$O(n \log m)$ operations in $\mathbb{A}$.
This allows for the relaxed computation of $f g$ at order $n$
using at most

operations in $\mathbb{A}$. Similarly,
the formula

allows for the semi-relaxed computation of $f g$ at
order $n$ using at most

operations in $\mathbb{A}$. For a given
expansion order $n$, one may
take a block size $m$ depending on $n$, and use the above
formula in a recursive manner. This yields [21, Theorem 11]
Remark 4. If $m$ is chosen as a function of
$n$, then the above method really describes a relaxed
algorithm for computing the product up to an order $n$
which is specified in advance. In fact, such an algorithm automatically
yields a genuine relaxed algorithm with the same complexity (up to a
constant factor), by doubling the order $n$ each
time when needed.
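The blocking transformation itself is elementary, as the following Python sketch illustrates (a naive illustration with names of our own choosing: the inner block products are computed by direct convolution, where [21] would use FFTs):

```python
def to_blocks(f, m):
    """Cut the coefficient list f into blocks of m coefficients:
    f(z) = sum_k B_k(z) z^(k*m) with deg B_k < m."""
    return [f[i:i + m] for i in range(0, len(f), m)]

def mul_block_series(F, G, m, nblocks):
    """Multiply two block sequences as series in u = z^m with polynomial
    coefficients (naive convolution), then fold the degree <= 2m-2 block
    products back into a plain coefficient list of length nblocks*m."""
    prods = [[0] * (2 * m) for _ in range(nblocks)]
    for k in range(nblocks):
        for i in range(k + 1):
            if i < len(F) and k - i < len(G):
                for a, fa in enumerate(F[i]):
                    for b, gb in enumerate(G[k - i]):
                        prods[k][a + b] += fa * gb
    h = [0] * (nblocks * m)
    for k, block in enumerate(prods):
        for j, c in enumerate(block):
            if k * m + j < len(h):
                h[k * m + j] += c
    return h

f = [1, 1, 1, 1, 1, 1]                # 1 + z + ... + z^5
F = to_blocks(f, 2)
print(mul_block_series(F, F, 2, 3))   # first 6 coefficients of f^2
```

In the actual algorithm, the outer series product in $u$ is performed relaxedly while the block coefficients are kept in transformed (evaluated) form, which is where the speed-up comes from.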
This shows that a relaxed product of two polynomials
$f$ and $g$ of degrees $< 2 n$ reduces
to a relaxed product of half the size, two
semi-relaxed products, and one non-relaxed product.
Under the assumptions that $\mathsf{M}(n) / n$
and $\mathsf{R}^{\ast}(n) / n$ are increasing, a routine
calculation thus yields
One of the referees suggested to take instead of
in (8). As a matter of fact, it is
even better to take
. In this
section, we will show that this leads to the bound
which further improves on (9). We will prove a slightly more general result, which is useful for the analysis of algorithms which satisfy complexity bounds similar to (8).
Lemma 5. Let $F$ be an increasing function with
and let
where . Then there exist
constants
and
such that
for all
,
, and
,
we have
Proof. Notice that and
for sufficiently large
.
We have
For a suitable constant and all sufficiently
large
, it follows that
(10)
We also have
whence
For all sufficiently large ,
it follows that
Consequently,
Adding up (10) and (11), we obtain
for all sufficiently large .
Theorem 6. Let $F$ be an increasing function with
. Assume that
for all sufficiently large ,
where
Then
Proof. We define functions by
and let ,
and
be as in Lemma 5. Without loss
of generality, we may pick
sufficiently large
such that
and such that (12) holds
for
. Let
be such that
and denote
and
. For certain
, we have
and
. Now (12)
implies
Let and
.
Let us prove by induction that
for all
. This is clear for
. Assume now that
and
for all
.
Then, with the above notations,
implies
and
,
whence
as desired. For all , we have
thus shown that
.
Let us now consider the less favourable case when $\mathbb{A}$
is an effective ring which does not necessarily contain primitive
$2^p$-th roots of unity for arbitrarily
high $p$. In this section, we
will first consider the case when $\mathbb{A}$ is
torsion-free as a $\mathbb{Z}$-module and
also admits an algorithm for partial division by integers.
Given a block size $m$ and $q \in \mathbb{Z}$
(say $q = 2$), we will replace the
discrete Fourier transform at a
$2m$-th primitive root of unity by multipoint
evaluation at the geometric progression $1, q, \ldots, q^{2m-1}$. More
precisely, we define the evaluation transform which sends a polynomial
$P \in \mathbb{A}[z]$ of degree $< 2m$ to the vector
$(P(1), P(q), \ldots, P(q^{2m-1}))$, and the corresponding inverse transform.
By Lemma 3, these transforms can both be computed using
$O(\mathsf{M}(m))$ operations in $\mathbb{A}$.
In a similar way as for $\mathrm{FFT}_{\omega}$
and $\mathrm{FFT}_{\omega}^{-1}$, we extend these transforms
to power series in $u$.
Theorem 7. Let $\mathbb{A}$ be both an effective ring and an effective
torsion-free $\mathbb{Z}$-module with an
effective partial division by elements in $\mathbb{Z} \setminus \{0\}$.
Then
Proof. It suffices to prove the complexity bound for
semi-relaxed multiplication. Instead of (7), we now compute
$f g$ using
The bound (8) then has to be replaced by
Plugging in the bound $\mathsf{M}(n) = O(n \log n \log \log n)$ from [10] and
applying Theorem 6, the result follows.
Let now $\mathbb{A}$ be an effective ring of prime
characteristic $p$. For large
expansion orders $n$, the ring $\mathbb{A}$
does not necessarily contain $n$
distinct points in geometric progression. Therefore, in order to apply
Lemma 3, we will first replace $\mathbb{A}$ by a
suitable extension, in which we can find sufficiently large geometric
progressions.
Given $n$, let $r$ be even with $p^{r} \geq n$. Let
$\mu \in \mathbb{F}_p[x]$ be a monic irreducible polynomial of degree
$r$, such that the finite field $\mathbb{F}_{p^r}$ is
isomorphic to $\mathbb{F}_p[x] / (\mu)$. Then the ring
$\mathbb{B} = \mathbb{A}[x] / (\mu(x))$
has dimension $r$ over $\mathbb{A}$ as a
vector space, so we have a natural $\mathbb{A}$-linear
bijection $\mathbb{A}^r \to \mathbb{B}$.
The ring $\mathbb{B}$ is an effective ring and one addition
or subtraction in $\mathbb{B}$ corresponds to $r$
additions or subtractions in $\mathbb{A}$.
Similarly, one multiplication in $\mathbb{B}$ can be done
using $O(\mathsf{M}(r))$ operations in $\mathbb{A}$.
In order to multiply two series up to order
$n$, the idea is now to rewrite $f$
and $g$ as series in $\mathbb{B}[[u]]$,
where each coefficient in $\mathbb{B}$ encodes a block of $r/2$
consecutive coefficients in $\mathbb{A}$. If
we want to compute the relaxed product, then we also have to treat the
first $r/2$ coefficients apart, as we did before for
the blocking strategy. More precisely, we will compute the semi-relaxed
product $f g$ using the formula (13),
where the blocking maps are extended to series over $\mathbb{B}$ in
the natural way:
From the complexity point of view, we get
Since $\mathbb{B}$ contains a copy of $\mathbb{F}_{p^r}$,
it also contains at least $p^r - 1$
points in geometric progression. For the multiplication up to order
$n$ of two series with coefficients in
$\mathbb{B}$, we may thus use the blocking strategy
combined with multipoint evaluation and interpolation.
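The construction of $\mathbb{F}_{p^r} = \mathbb{F}_p[x]/(\mu)$ and the resulting long geometric progressions can be illustrated by a small Python sketch (a toy example of our own, with $p = 2$ and $\mu = x^4 + x + 1$, for which the class of $x$ is well known to generate the multiplicative group of $\mathbb{F}_{16}$):

```python
def gf_mul(a, b, p, mu):
    """Multiply two polynomials over F_p (lists, lowest degree first)
    modulo the monic polynomial mu (coefficient list of degree r)."""
    r = len(mu) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for d in range(len(prod) - 1, r - 1, -1):   # reduce modulo mu, top down
        c = prod[d]
        if c:
            for k in range(r + 1):
                prod[d - r + k] = (prod[d - r + k] - c * mu[k]) % p
    return [c % p for c in prod[:r]]

# F_16 = F_2[x]/(x^4 + x + 1): the class of x generates the 15-element
# multiplicative group, so 1, x, x^2, ... gives 15 pairwise distinct
# points in geometric progression, even though p = 2 is tiny.
p, mu = 2, [1, 1, 0, 0, 1]
x = [0, 1, 0, 0]
powers, cur = [], [1, 0, 0, 0]
for _ in range(15):
    powers.append(tuple(cur))
    cur = gf_mul(cur, x, p, mu)
print(len(set(powers)))  # 15 distinct points
```

When the base ring $\mathbb{A}$ merely has characteristic $p$, the same polynomial arithmetic modulo $\mu$ is carried out with coefficients in $\mathbb{A}$, at a cost of $O(\mathsf{M}(r))$ operations per product.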
Theorem 8. Let $\mathbb{A}$ be an effective ring of prime characteristic
$p$. Then
Proof. With the notations from above, we may find a
primitive $(p^r - 1)$-th root of unity $\alpha$
in $\mathbb{F}_{p^r} \subseteq \mathbb{B}$.
We may thus use formula (13) for the semi-relaxed
multiplication of two series in $\mathbb{A}[[z]]$
up to order $n$. In a similar way as in the
proof of Theorem 7, we thus get
Using classical fast relaxed multiplication [16, 17, 13], we also have
whence (15) simplifies to
Since $r = O(\log_p n)$ and $\mathsf{M}(r) = O(r \log r \log \log r)$,
the result follows.
Remark 9. If , then
in (16), so the bound further reduces to
Remark 10. The algorithm from this section requires a monic irreducible
polynomial $\mu \in \mathbb{F}_p[x]$ with $\deg \mu = r$.
Using a randomized algorithm, such a polynomial can be computed in expected polynomial time; see [15,
Corollary 14.44]. If $r = O(\log n)$, then
this is really a precomputation of negligible cost.
If we insist on computing $\mu$ in a deterministic
way, then one may use [30, Theorem 3.2], which provides us
with an algorithm whose time complexity is polynomial in $p$ and $r$.
Similarly, there exist both randomized and deterministic algorithms [32, 31] for the efficient computation of
primitive $(p^r - 1)$-th roots of unity
in $\mathbb{F}_{p^r}$. In particular, thanks to
[31, Theorem 1], such a primitive root of unity can
be computed efficiently, when
using a naive algorithm for factoring $p^r - 1$.
Let us now show that the technique from the previous section actually
extends to the case when $\mathbb{A}$ is an arbitrary
effective ring of positive characteristic. We first show that the
algorithm still applies when the characteristic of $\mathbb{A}$
is a prime power. We then conclude by showing how to apply Chinese
remaindering in our setting.
Theorem 11. Let $\mathbb{A}$ be an effective ring of prime power
characteristic $p^k$. Then
Proof. Taking $r$ as in the previous section,
let $\bar{\mu} \in \mathbb{F}_p[x]$ of degree $r$ be as in
the previous section and pick a monic polynomial $\mu$
of degree $r$ in $(\mathbb{Z}/p^k\mathbb{Z})[x]$ such that
the reduction of $\mu$
modulo $p$ yields $\bar{\mu}$.
Then we get a natural commutative diagram
where $\pi$ stands for reduction modulo
$p$. In particular, we have an epimorphism
$\pi : (\mathbb{Z}/p^k\mathbb{Z})[x]/(\mu) \to \mathbb{F}_p[x]/(\bar{\mu})$
with $\mathbb{F}_p[x]/(\bar{\mu}) \cong \mathbb{F}_{p^r}$.
Now let $\bar{\alpha}$ be an element in $\mathbb{F}_{p^r}$
of order $p^r - 1$. Then any lift $\alpha$
of $\bar{\alpha}$ with $\pi(\alpha) = \bar{\alpha}$
has order at least $p^r - 1$.
Moreover, the elements $\bar{\alpha}^i$ and $\bar{\alpha}^i - \bar{\alpha}^j$
(for $i \neq j$) are all
invertible. Consequently, $\alpha^i$ and $\alpha^i - \alpha^j$
do not lie in $\ker \pi$, whence they
are invertible as well. It follows that we may still apply multipoint
evaluation and interpolation in $\mathbb{A}[x]/(\mu(x))$
at the sequence $1, \alpha, \alpha^2, \ldots$, whence Theorem 8
generalizes to the present case.
Remark 12. We notice that
the complexity bound is uniform in the following sense: there exists a
constant $C$ such that the bound holds with this same constant for all
effective rings $\mathbb{A}$ of characteristic $p^k$.
Indeed, the choice of $\mu$
only depends on $p$ and $r$, and any
operation in $\mathbb{Z}/p^k\mathbb{Z}$ or
$(\mathbb{Z}/p^k\mathbb{Z})[x]/(\mu)$ in the
case $\mathbb{A} = \mathbb{Z}/p^k\mathbb{Z}$ corresponds to exactly one lifted operation
in $\mathbb{A}$ or $\mathbb{A}[x]/(\mu)$ in the general
case.
Remark 13. A refinement similar to Remark 9
leads to the improved bound
This bound is uniform in a similar way as in Remark 12.
Theorem 14. Let $\mathbb{A}$ be an effective ring of non-zero characteristic
$c$. Then
Proof. We will prove the theorem by induction on the
number of prime divisors of $c$.
If $c$ is a prime power, then we are done. So assume
that $c = u_1 u_2$, where $u_1$
and $u_2$ are relatively prime, and let
$\alpha, \beta \in \mathbb{Z}$ be such that

$\alpha u_1 + \beta u_2 = 1.$
Then we may consider the rings
$\mathbb{A}_1 = \mathbb{A} / u_1 \mathbb{A}$ and
$\mathbb{A}_2 = \mathbb{A} / u_2 \mathbb{A}$.
These rings are effective, when representing their elements by elements
of $\mathbb{A}$ and transporting the operations from
$\mathbb{A}$. Of course, the representation of
an element $a$ of $\mathbb{A}_1$
(or $\mathbb{A}_2$) is not unique, since we may
replace it by $a + u_1 b$ for any
$b \in \mathbb{A}$ (or by $a + u_2 b$). But this is not a
problem, since our definition of effective ring did not require unique
representability or the existence of an equality test.
Now let $f, g \in \mathbb{A}[[z]]$ and let
$f^{(i)}, g^{(i)}$ be their
projections in $\mathbb{A}_i[[z]]$, for
$i \in \{1, 2\}$. Consider the relaxed products
$h^{(i)} = f^{(i)} g^{(i)}$, for
$i \in \{1, 2\}$. These products are represented by relaxed series
$\tilde{h}^{(i)} \in \mathbb{A}[[z]]$ via the chosen representations of
$\mathbb{A}_1$ and $\mathbb{A}_2$,
for $i \in \{1, 2\}$. By the induction
hypotheses, we may compute $\tilde{h}^{(1)}$
and $\tilde{h}^{(2)}$
at order $n$ within the complexity bound of the theorem. The linear
combination $h = \beta u_2 \tilde{h}^{(1)} + \alpha u_1 \tilde{h}^{(2)}$
can still be expanded up to order $n$
with the same complexity. We claim that
$h = f g$. Indeed,

$\beta u_2 (f g - \tilde{h}^{(1)}) \in \beta u_2 u_1 \mathbb{A}[[z]] = \{0\}$
$\alpha u_1 (f g - \tilde{h}^{(2)}) \in \alpha u_1 u_2 \mathbb{A}[[z]] = \{0\}.$

Summing both relations, our claim follows.
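The Chinese remaindering step can be illustrated in Python for the basic case $\mathbb{A} = \mathbb{Z}/c\mathbb{Z}$ (a toy sketch of our own, with $u_1 = 8$, $u_2 = 9$; the paper applies the same idempotent recombination coefficientwise to series over an arbitrary ring of characteristic $c$):

```python
def ext_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t

# Characteristic c = u1 * u2 with gcd(u1, u2) = 1, as in the proof.
u1, u2 = 8, 9
c = u1 * u2
g, alpha, beta = ext_gcd(u1, u2)
assert g == 1
e1 = (beta * u2) % c   # e1 = 1 mod u1 and 0 mod u2
e2 = (alpha * u1) % c  # e2 = 0 mod u1 and 1 mod u2

def crt_mul(a, b):
    """Multiply in Z/cZ by multiplying the two residues separately and
    recombining with the idempotents e1, e2 (alpha*u1 + beta*u2 = 1)."""
    h1 = (a % u1) * (b % u1) % u1
    h2 = (a % u2) * (b % u2) % u2
    return (e1 * h1 + e2 * h2) % c

print(crt_mul(31, 57), (31 * 57) % c)  # both 39
```

In the proof, $u_1 \mathbb{A} \cdot u_2 \mathbb{A} = c \mathbb{A} = \{0\}$ plays the role of the reduction modulo $c$ above.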
Remark 15. A uniform complexity bound can be given along similar lines as
in Remark 12. This time, such a bound depends linearly on
the number of prime factors of the characteristic $c$.
Let $p \geq 2$ be an integer, not necessarily a prime
number, and denote by $\mathbb{Z}_p$ the ring of $p$-adic
integers. We will regard $p$-adic numbers
$a \in \mathbb{Z}_p$ as series
$a = a_0 + a_1 p + a_2 p^2 + \cdots$
with $a_i \in \{0, \ldots, p - 1\}$, and such that the basic ring operations
$+$, $-$ and $\times$
require an additional carry treatment.
In order to multiply two relaxed $p$-adic
numbers $a, b \in \mathbb{Z}_p$, we may rewrite them
as series $f, g \in \mathbb{Z}[[z]]$ with $f_i = a_i$ and
$g_i = b_i$, multiply these
series $h = f g$, and recover the
product $a b$ from the result. Of course, the
coefficients of $h$ may exceed
$p - 1$, so some relaxed carry handling is required in
order to recover $a b$ from
$h$. We refer to [4, Section 2.7] for
details. In particular, it is proved there that $a b$ can
be computed up to order $n$ using $O(n)$
ring operations in $\mathbb{Z}$
on integers of bit size $O(\log n + \log p)$.
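The conversion between $p$-adic numbers and integer power series can be illustrated by the following Python sketch (a naive, non-relaxed rendering of the carry handling described in [4, Section 2.7]; the function name is ours):

```python
def padic_mul_digits(a, b, p, n):
    """Multiply two p-adic numbers given by their first n digits
    (lowest order first): convolve the digit sequences as power
    series over Z, then recover base-p digits by a carry pass."""
    conv = [0] * n
    for i in range(n):
        for j in range(n - i):
            conv[i + j] += a[i] * b[j]
    digits, carry = [], 0
    for c in conv:
        carry, d = divmod(carry + c, p)
        digits.append(d)
    return digits

# 2-adic: 3 * 3 = 9, i.e. digits 1100 * 1100 = 1001 (lowest order first)
print(padic_mul_digits([1, 1, 0, 0], [1, 1, 0, 0], 2, 4))  # [1, 0, 1, 0] without carries would be wrong; with carries: [1, 0, 0, 1]
```

In the relaxed setting, the carry pass must itself be performed on-line, which is possible because each carry only depends on lower-order coefficients.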
Given $p$ and $n$, consider two power series
$f, g \in \mathbb{Z}[[z]]$ with coefficients in
$\{0, \ldots, p - 1\}$. We will denote by
$\mathsf{R}_{\mathbb{Z}}(n, p)$ (resp.
$\mathsf{R}^{\ast}_{\mathbb{Z}}(n, p)$) the bit
complexity of multiplying $f$ and $g$ up to order
$n$ using a relaxed (resp.
semi-relaxed) algorithm.
Proof. Let $q$ be a sufficiently large prime number. Such a prime
can be computed deterministically using the
polynomial time primality test from [1],
together with the Bertrand–Chebyshev theorem.
Let and consider
such
that
. Let
be the reductions of
modulo
. Then
may be
reconstructed up to order
from the product
. We thus get
By Theorem 11 and Remarks 12 and 13,
while using the fact that ,
we have
and this bound is uniform in .
Since
, the result
follows.
For the above strategy to be efficient, it is important that $\log p$
should not be too small. This can be achieved by combining
it with the technique of $p$-adic
blocking. More precisely, given a $p$-adic
block size $m$, any
$p$-adic number in $\mathbb{Z}_p$
can naturally be considered as a $p^m$-adic
number in $\mathbb{Z}_{p^m}$, and vice
versa. Assuming that numbers in $\mathbb{Z}_p$ are written
in binary, the conversion is
trivial if $p$ is a power of two. Otherwise, the
conversion involves base conversions and we refer to [4,
Section 4] for more details. In particular, the conversions in both
directions up to order $n$ can be done in
softly linear time.
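The $p$-adic blocking conversions amount to grouping digits, as the following Python sketch illustrates (an illustration of our own for general $p$ by explicit base conversion; as noted above, the conversion is trivial when $p$ is a power of two):

```python
def block_digits(digits, p, m):
    """Group base-p digits (lowest order first) m at a time
    into base-p^m digits."""
    return [sum(digits[i + j] * p**j for j in range(m) if i + j < len(digits))
            for i in range(0, len(digits), m)]

def unblock_digits(blocks, p, m):
    """Split base-p^m digits back into base-p digits."""
    out = []
    for b in blocks:
        for _ in range(m):
            b, d = b // p, b % p
            out.append(d)
    return out

digits = [2, 7, 1, 8, 2, 8]             # base-10 digits, lowest order first
blocks = block_digits(digits, 10, 3)    # base-1000 digits
print(blocks)                           # [172, 828]
print(unblock_digits(blocks, 10, 3))    # [2, 7, 1, 8, 2, 8]
```

Choosing the block size $m$ so that $p^m$ has roughly the desired bit size is what makes the reduction of Theorem 17 efficient.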
Let $\mathsf{R}_{p}(n)$ (resp.
$\mathsf{R}^{\ast}_{p}(n)$) denote the complexity of relaxed (resp.
semi-relaxed) multiplication in $\mathbb{Z}_p$
up to order $n$.
Theorem 17. With the above notations, we have
Proof. Let $m$ be chosen so that
Using the strategy of $p$-adic
blocking, a semi-relaxed product in $\mathbb{Z}_p$ may then be
reduced to one semi-relaxed product in $\mathbb{Z}_{p^m}$ and one
relaxed multiplication with an integer in $\{0, \ldots, p^m - 1\}$.
In other words,
where $\mathsf{R}^{\ast}_{p^m}(\lceil n/m \rceil)$ stands for the cost of semi-relaxed
multiplication of two $p^m$-adic
numbers in $\mathbb{Z}_{p^m}$ up to order
$\lceil n/m \rceil$. By [4, Proposition 4], we have
By Lemma 16, we also have
Notice that
which completes the proof of the theorem.
For the moment, we have not implemented any of the new algorithms in practice. Nevertheless, our old implementation of the algorithm from [21] allowed us to gain some insight on the practical usefulness of blockwise relaxed multiplication. Let us briefly discuss the potential impact of the new results for practical purposes.
In the case of integer coefficients, it is best to re-encode the
integers as polynomials over $\mathbb{F}_p$ for a prime number
$p$ which fits into a machine word and such that
$\mathbb{F}_p$ admits many $2^k$-th roots of
unity (it is also possible to take several primes $p$
and use Chinese remaindering). After that, one may again use the old
algorithm from [21]. Also, integer coefficients usually
grow in size with the order $n$, so one
really should see the power series as a bivariate power series
with a triangular support. One may then want to use
TFT-style multiplication [19, 20] in order to
gain another constant factor.
If $\mathsf{M}(n)$ denotes the cost of multiplying two
polynomials $P$ and $Q$ in
$\mathbb{A}[z]$ with $\deg P, \deg Q < n$,
then the classical fast relaxed multiplication algorithm from [17,
13] generalizes and still admits the time complexity
$O(\mathsf{M}(n) \log n)$. However, the blockwise algorithm
from this paper does not generalize to this setting, at least not in a
straightforward way.
M. Agrawal, N. Kayal, and N. Saxena. PRIMES is in P. Annals of Mathematics, 160(2):781–793, 2004.
A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, Massachusetts, 1974.
D. Bernstein. Removing redundancy in high precision Newton iteration. Available from http://cr.yp.to/fastnewton.html, 2000.
J. Berthomieu, J. van der Hoeven, and G. Lecerf. Relaxed algorithms for $p$-adic numbers. Journal de Théorie des Nombres de Bordeaux, 23(3):541–577, 2011.
J. Berthomieu and R. Lebreton. Relaxed $p$-adic Hensel lifting for algebraic systems. In J. van der Hoeven and M. van Hoeij, editors, Proc. ISSAC '12, pages 59–66, Grenoble, France, July 2012.
D. Bini and V.Y. Pan. Polynomial and matrix computations. Vol. 1. Birkhäuser Boston Inc., Boston, MA, 1994. Fundamental algorithms.
Leo I. Bluestein. A linear filtering approach to the computation of discrete Fourier transform. IEEE Transactions on Audio and Electroacoustics, 18(4):451–455, 1970.
A. Bostan and É. Schost. Polynomial evaluation and interpolation on special sets of points. Journal of Complexity, 21(4):420–446, August 2005. Festschrift for the 70th Birthday of Arnold Schönhage.
R.P. Brent and H.T. Kung. Fast algorithms for manipulating formal power series. Journal of the ACM, 25:581–595, 1978.
D.G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Informatica, 28:693–701, 1991.
S. A. Cook. On the minimum computation time of functions. PhD thesis, Harvard University, 1966.
J.W. Cooley and J.W. Tukey. An algorithm for the machine calculation of complex Fourier series. Math. Computat., 19:297–301, 1965.
M. J. Fischer and L. J. Stockmeyer. Fast on-line integer multiplication. Proc. 5th ACM Symposium on Theory of Computing, 9:67–72, 1974.
M. Fürer. Faster integer multiplication. In Proceedings of the Thirty-Ninth ACM Symposium on Theory of Computing (STOC 2007), pages 57–66, San Diego, California, 2007.
J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, first edition, 1999.
J. van der Hoeven. Lazy multiplication of formal power series. In W. W. Küchlin, editor, Proc. ISSAC '97, pages 17–20, Maui, Hawaii, July 1997.
J. van der Hoeven. Relax, but don't be too lazy. JSC, 34:479–542, 2002.
J. van der Hoeven. Relaxed multiplication using the middle product. In Manuel Bronstein, editor, Proc. ISSAC '03, pages 143–147, Philadelphia, USA, August 2003.
J. van der Hoeven. The truncated Fourier transform and applications. In J. Gutierrez, editor, Proc. ISSAC 2004, pages 290–296, Univ. of Cantabria, Santander, Spain, July 4–7 2004.
J. van der Hoeven. Notes on the Truncated Fourier Transform. Technical Report 2005-5, Université Paris-Sud, Orsay, France, 2005.
J. van der Hoeven. New algorithms for relaxed multiplication. JSC, 42(8):792–802, 2007.
J. van der Hoeven. Relaxed resolution of implicit equations. Technical report, HAL, 2009. http://hal.archives-ouvertes.fr/hal-00441977.
J. van der Hoeven. Newton's method and FFT trading. JSC, 45(8):857–878, 2010.
J. van der Hoeven. From implicit to recursive equations. Technical report, HAL, 2011. http://hal.archives-ouvertes.fr/hal-00583125.
A. Karatsuba and J. Ofman. Multiplication of multidigit numbers on automata. Soviet Physics Doklady, 7:595–596, 1963.
R. Lebreton and É. Schost. A simple and fast online power series multiplication and its analysis. Technical report, LIRMM, 2013. http://hal-lirmm.ccsd.cnrs.fr/lirmm-00867279.
L.R. Rabiner, R.W. Schafer, and C.M. Rader. The chirp z-transform algorithm and its application. Bell System Tech. J., 48:1249–1292, 1969.
A. Schönhage and V. Strassen. Schnelle Multiplikation großer Zahlen. Computing, 7:281–292, 1971.
Alexandre Sedoglavic. Méthodes seminumériques en algèbre différentielle ; applications à l'étude des propriétés structurelles de systèmes différentiels algébriques en automatique. PhD thesis, École polytechnique, 2001.
V. Shoup. New algorithms for finding irreducible polynomials over finite fields. Math. Comp., 54:435–447, 1990.
V. Shoup. Searching for primitive roots in finite fields. Math. Comp., 58:369–380, 1992.
I. Shparlinski. On primitive elements in finite fields and on elliptic curves. Mat. Sbornik, 181(9):1196–1206, 1990.
A.L. Toom. The complexity of a scheme of functional elements realizing the multiplication of integers. Soviet Mathematics, 4(2):714–716, 1963.