Nowadays, asymptotically fast algorithms are widely used in computer algebra for computations in towers of algebraic field extensions of small height. Yet it is still unknown how to reach softly linear time for products and inversions in towers of arbitrary height. In this paper we design the first algorithm for general ground fields with a complexity exponent that can be made arbitrarily close to one from the asymptotic point of view. We deduce new faster algorithms for changes of tower representations, including the computation of primitive element representations in subquadratic time.
Let $\mathbb{A}$ be an effective commutative ring with unity. Here, effective means that elements of $\mathbb{A}$ can be represented by concrete data structures and that algorithms are available for the ring operations, including the zero test. Effective fields are defined in a similar way.
Given a monic polynomial $\mu_1 \in \mathbb{A}[x_1]$ of degree $d_1$, the quotient ring $\mathbb{A}_1 := \mathbb{A}[x_1]/(\mu_1(x_1))$ is an effective ring and fast algorithms exist for the arithmetic operations in $\mathbb{A}_1$. More precisely, counting in terms of operations in $\mathbb{A}$, additions in $\mathbb{A}_1$ require $d_1$ additions in $\mathbb{A}$, whereas multiplications can be done [15] in almost linear time $O(\mathsf{M}(d_1))$. Here $\mathsf{M}(n)$ is a standard notation for the cost of multiplying two univariate polynomials of degree $< n$: see section 2.2.
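To make these quotient ring operations concrete, here is a minimal Python sketch, assuming a small prime field GF(101) as a stand-in for a general effective ring and using schoolbook products rather than the asymptotically fast algorithms cited above; all names and parameters are illustrative.

p = 101                      # a small prime field GF(p), for illustration only

def poly_mul(f, g):
    """Schoolbook product of two polynomials over GF(p), coefficients low degree first."""
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] = (h[i + j] + fi * gj) % p
    return h

def poly_rem(f, mu):
    """Remainder of f modulo the monic polynomial mu, over GF(p)."""
    f = f[:]
    d = len(mu) - 1
    for i in range(len(f) - 1, d - 1, -1):
        c = f[i]
        if c:
            for j in range(d + 1):
                f[i - d + j] = (f[i - d + j] - c * mu[j]) % p
    return f[:d]

def mulmod(a, b, mu):
    """Product of a and b in the quotient ring GF(p)[x]/(mu)."""
    return poly_rem(poly_mul(a, b), mu)

# Example: GF(101)[x]/(x^2 + 1), i.e. adjoining a square root of -1 to GF(101).
mu = [1, 0, 1]               # x^2 + 1
print(mulmod([3, 5], [7, 2], mu))   # [11, 41], i.e. (3 + 5x)(7 + 2x) = 11 + 41x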
Given another monic polynomial $\mu_2 \in \mathbb{A}_1[x_2]$, we may build the effective ring $\mathbb{A}_2 := \mathbb{A}_1[x_2]/(\mu_2(x_2))$, and fast arithmetic operations in $\mathbb{A}_2$ are available for the same reason. Doing so inductively, we obtain a tower of extensions $(\mathbb{A}_i)_{0 \leq i \leq t}$ of $\mathbb{A}_0 := \mathbb{A}$, with $\mathbb{A}_i = \mathbb{A}_{i-1}[x_i]/(\mu_i(x_i))$ for $i = 1, \ldots, t$. Such a tower is written $(\mathbb{A}_i)_{i \leq t}$ and we call $t$ its height. Throughout this paper, we write $\alpha_i$ for the class of $x_i$ in $\mathbb{A}_i$, we let $d_i := \deg \mu_i$, and $d := d_1 \cdots d_t$. The $\mu_i$ are called the defining polynomials of the tower and $d$ its degree. Elements of $\mathbb{A}_i$ are naturally represented by univariate polynomials in $x_i$ over $\mathbb{A}_{i-1}$ of degree $< d_i$. If all the $\mathbb{A}_i$ are fields, then we write $\mathbb{K}_i$ instead of $\mathbb{A}_i$ and $\mathbb{K}$ instead of $\mathbb{A}$, for convenience.
Towers of this type naturally arise in several contexts of applied algebra, including cryptography (for instance in [2, 3, 18]), error correcting codes (for instance in [26]), and the resolution of differentially algebraic systems in the more general setting of triangular sets (for instance in [1, 12]).
By induction over $t$, basic arithmetic operations in $\mathbb{A}_t$ can be performed naively in time $C^t d^{1+o(1)}$, for some constant $C > 1$. But it is well known that, for a sufficiently large constant $K$, these operations can actually be carried out in time $O(K^t \mathsf{M}(d))$ by means of Kronecker substitution (see for instance [27, Chapter 8]). However, if many of the individual degrees $d_i$ are small, then $K^t$ can become as large as a fixed power of $d$, which leads to an overall complexity bound of the form $O(d^{\kappa})$ with $\kappa > 1$. The main aim of this paper is to prove the sharper bound $O(d^{1+\epsilon})$, for any fixed $\epsilon > 0$, in the case when $\mathbb{A}$ is a field and the $\mu_i$ are irreducible and separable. The top level complexity result is stated in Corollary 4.11.
In order to simplify the presentation of complexity bounds, we often use the soft-Oh notation: $f(n) \in \tilde{O}(g(n))$ means that $f(n) = g(n) (\log g(n))^{O(1)}$; see [27, Chapter 25, section 7] for technical details. The least integer larger or equal to $x$ is written $\lceil x \rceil$; the largest one smaller or equal to $x$ is written $\lfloor x \rfloor$. The $\mathbb{A}$-module of polynomials of degree $< n$ is written $\mathbb{A}[x]_{<n}$.
One essential tool for our new algorithms is modular composition. Before returning to our main problem, let us recall several useful existing results on this problem and various related topics.
Let $\mu \in \mathbb{K}[x]$ be as above and assume that $\mathbb{K}$ is a field, whence $\mathbb{K}[x]/(\mu(x))$ is a $\mathbb{K}$-vector space of dimension $d$. Given $f, g \in \mathbb{K}[x]$ of degrees $< d$, the computation of the remainder of the Euclidean division of $f(g(x))$ by $\mu(x)$ is called the problem of modular composition. It is equivalent to the problem of evaluating $f$ at the point $\alpha = g(x) \bmod \mu(x)$ of $\mathbb{K}[x]/(\mu(x))$. Currently, the best known solution to this problem is due to Brent and Kung [13, 56] and requires $O(d^{\omega_2/2})$ operations in $\mathbb{K}$. Here the constant $\omega_2$ denotes an exponent such that an $n \times n$ square matrix can be multiplied with an $n \times n^2$ rectangular matrix using $O(n^{\omega_2})$ operations in $\mathbb{K}$. Huang and Pan showed in [40] that one may take $\omega_2 < 3.34$. Brent and Kung's algorithm is based on the baby-step giant-step technique [50]; we will recall and generalize it in section 3.
For a fixed point $\alpha \in \mathbb{A} := \mathbb{K}[x]/(\mu(x))$, the evaluation map $\operatorname{Ev}_\alpha : \mathbb{K}[y]_{<d} \to \mathbb{A},\ f \mapsto f(\alpha)$ is $\mathbb{K}$-linear. The dual or transposed map is given by $\operatorname{Ev}_\alpha^* : \mathbb{A}^* \to (\mathbb{K}[y]_{<d})^*$ and sends a $\mathbb{K}$-linear functional $\ell : \mathbb{A} \to \mathbb{K}$ to the vector $(\ell(1), \ell(\alpha), \ldots, \ell(\alpha^{d-1}))$. The computation of this transposed map is called the problem of modular power projection. Using the transposition principle (see [10] and historical references therein), the costs of power projection and modular composition are essentially the same. In particular, the computation of the traces $\operatorname{Tr}(\alpha^i)$ for $i = 0, \ldots, d-1$ corresponds to one modular power projection.
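As an illustration only, the following naive Python sketch (reusing p and mulmod from the previous sketch, under the same toy assumptions) computes a power projection by explicitly walking through the powers of the point; the transposition principle discussed above obtains the same output for essentially the cost of one modular composition.

def power_projection(ell, a, mu, n):
    """Return ell(1), ell(a), ..., ell(a^(n-1)) for a linear functional ell on
    GF(p)[x]/(mu), given by its values on the basis 1, x, ..., x^(d-1)."""
    d = len(mu) - 1
    out, power = [], [1] + [0] * (d - 1)          # power = the element 1
    for _ in range(n):
        out.append(sum(li * ci for li, ci in zip(ell, power)) % p)
        power = mulmod(power, a, mu)              # next power of a
    return out

# Example: in GF(101)[x]/(x^2 + 1), project the powers of a = 3 + 5x onto the
# functional ell(c0 + c1*x) = c1.
print(power_projection([0, 1], [3, 5], [1, 0, 1], 4))   # [0, 5, 30, 10]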
Given an element $a \in \mathbb{A} = \mathbb{K}[x]/(\mu(x))$, the characteristic polynomial $\chi_a \in \mathbb{K}[z]$ of $a$ is the characteristic polynomial of the endomorphism of multiplication by $a$ in $\mathbb{A}$. If $\mathbb{K}$ is a field, then this polynomial can be computed in softly quadratic time by means of a resultant (see for instance [45, Corollary 29]). Shoup has also designed a practical randomized algorithm [56] to compute minimal polynomials for the case when $\mathbb{K}$ has sufficiently many elements.
The fastest known method to compute the characteristic polynomial $\chi_a$ of an element $a \in \mathbb{A}$ over $\mathbb{K}$ proceeds as follows: first compute the traces $\operatorname{Tr}(a^i)$ of the powers of $a$ for all $i \leq d$ using modular power projection, and then recover $\chi_a$ by integrating the differential equation
(1.1)  $\widehat{\chi}_a'(z) = -\left(\sum_{i \geq 0} \operatorname{Tr}(a^{i+1})\, z^i\right) \widehat{\chi}_a(z)$
where $\widehat{\chi}_a$ represents the reverse polynomial of $\chi_a$. This integration requires the inverses of $1, \ldots, d$ in $\mathbb{K}$ to be at our disposal. Historically, this method goes back to Le Verrier [43], and formula (1.1), often called the Newton–Girard formula, expresses the relationship between symmetric and power sum polynomials. The integration step takes softly linear time: the seminal contributions are due to Brent and Kung [13], generalizations can be found in [8], the first efficient extension to finite fields was designed in [9], and we refer the reader to [29, section 2] for a concise proof.
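To make the Newton–Girard identities behind (1.1) concrete, here is a small Python sketch that recovers the elementary symmetric functions, and hence the characteristic polynomial, from the power sums; it assumes a prime field whose characteristic exceeds the degree, and uses a quadratic loop instead of the softly linear power series integration discussed above.

q = 101                      # base field GF(q); we need the inverses of 1, ..., d

def char_poly_from_power_sums(psums):
    """Given p_1, ..., p_d (sums of the i-th powers of the roots), return the
    elementary symmetric functions e_0, ..., e_d via the Newton-Girard identities
    k*e_k = sum_{i=1..k} (-1)^(i-1) * e_{k-i} * p_i."""
    d = len(psums)
    e = [1] + [0] * d
    for k in range(1, d + 1):
        acc = 0
        for i in range(1, k + 1):
            sign = 1 if i % 2 == 1 else -1
            acc = (acc + sign * e[k - i] * psums[i - 1]) % q
        e[k] = acc * pow(k, q - 2, q) % q         # division by k in GF(q)
    return e

# The characteristic polynomial is then z^d - e_1 z^(d-1) + e_2 z^(d-2) - ...
# Example: roots {2, 3} give p_1 = 5, p_2 = 13, hence e_1 = 5 and e_2 = 6.
print(char_poly_from_power_sums([5, 13]))         # [1, 5, 6]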
Recall that an element $a \in \mathbb{K}_t$ is said to be primitive over $\mathbb{K}$ if $\mathbb{K}[a] = \mathbb{K}_t$. The primitive element theorem states that such an element always exists in the separable case: if $\mathbb{K}$ contains sufficiently many elements, then it suffices to take a sufficiently generic $\mathbb{K}$-linear combination of the $\alpha_1, \ldots, \alpha_t$. In this light, it may seem odd at first sight to work with towers of height $t > 1$, since an isomorphic tower of height one always exists. The problem is that both the computation of primitive elements and conversions between representations are usually expensive.
Concerning primitive elements, it is currently not known how to efficiently find one, together with its minimal polynomial $\chi$ and polynomials $g_1, \ldots, g_t$ such that $\alpha_i = g_i(a)$ for $i = 1, \ldots, t$. In fact, naive algorithms mostly boil down to linear algebra in dimension $d$, and thus feature at least quadratic costs. One may for instance appeal to the FGLM algorithm to change orderings of Gröbner bases [21, 22].
If a modular composition algorithm (technically, for the multivariate version of section 3.2) of softly linear complexity were known over any field, then primitive element representations of towers would be computable in softly linear expected time by a randomized algorithm; conversions between towers and their primitive element representations would also become possible in softly linear time. These aspects have been studied by Poteaux and Schost in [51, 52], where subquadratic costs are achieved for multivariate modular composition and its transpose, as well as for computing primitive representations of towers.
Let $\epsilon$ represent a fixed positive rational value. If $\mathbb{K} = \mathbb{F}_q$ is the finite field with $q$ elements, then a major result by Kedlaya and Umans [42] states that modular composition and power projection are possible in time $O(d^{1+\epsilon} \log^{1+o(1)} q)$. Under suitable conditions on $q$ and $d$, this in particular implies that characteristic polynomials can be obtained with expected bit complexity $O(d^{1+\epsilon} \log^{1+o(1)} q)$: see [29, sections 2.3–2.5] for details. Based on these results, Poteaux and Schost have derived [51] fast algorithms for multiplication, division, traces and primitive element representations for separable towers over finite fields. Unfortunately, very large constant factors are hidden in the complexities of all these algorithms, which currently prevent them from being of any practical relevance.
For some other particular cases of interest, fast algorithms also exist for modular composition and related problems. For fixed irreducible moduli of sufficiently smooth degree, practically faster algorithms for modular composition have been proposed in [39]. When $\mathbb{K}$ is a subfield of $\mathbb{C}$, we have also shown in [38] that modular composition can be achieved in quasi-linear time when computing with a sufficiently high numeric precision.
More direct approaches for algebraic tower arithmetic are usually based on computations modulo so-called "triangular sets". We may see the preimage of $\mu_i$ over $\mathbb{A}$ as a multivariate polynomial $T_i \in \mathbb{A}[x_1, \ldots, x_i]$ such that $T_i$ is monic in $x_i$, has degree $d_i$ in $x_i$, degrees $< d_j$ in $x_j$ for all $j < i$, and $\mathbb{A}_i = \mathbb{A}[x_1, \ldots, x_i]/(T_1, \ldots, T_i)$. The sequence $(T_1, \ldots, T_t)$ forms a special case of a triangular set. Triangular sets and decompositions have a long history that goes back to Ritt's contributions in differential algebra [53].
In this context of triangular sets, it was shown in [46] that two elements in $\mathbb{A}_t$ can be multiplied in time that is softly linear in $d$, up to a factor that grows exponentially in $t$. In [44, Proposition 2], Lebreton has proved a stronger bound of the same type. This result even applies if the ideal $(T_1, \ldots, T_t)$ is not radical, under the condition that $T_1, \ldots, T_t$ form a regular chain (in the sense of Kalkbrener [41]; see also [1] for generalities about triangular sets). For triangular sets of certain quite special types, we notice that even faster multiplication algorithms have been designed in [7]. When working over finite fields, practically efficient algorithms have also been proposed in [17], based on fast embeddings into towers of a specific type.
Another major occurrence of algebraic towers in the literature concerns "dynamic evaluation". This technique was introduced by Della Dora, Dicrescenzo and Duval [19, 20] as a way to compute with algebraic numbers without requiring algorithms for polynomial factorization. Basically, the idea is to compute in $\mathbb{A}_t$ as if it were a field, and cases are distinguished whenever the zero test of some element is requested: the "current tower" is then explicitly decomposed into a "direct product of two towers" such that the projections of this element in these two towers are respectively zero and invertible. The computations finally resume non-deterministically with each of the two new towers in the role of the current tower. For recent complexity results on this technique we refer the reader to [16].
The efficient computation with polynomials modulo zero-dimensional ideals defined by triangular sets is an important topic in applied algebra. Obtaining softly linear complexity bounds when working over a general field is an open problem. The previous best known bound due to Lebreton [44, section 3.2] is recalled in section 2.4.
A first technical contribution of ours concerns multivariate modular composition: in section 3.2 we design a new baby-step giant-step algorithm to evaluate a multivariate polynomial at a point of an algebra with the same kind of complexity as the Brent–Kung algorithm for the univariate case. It slightly improves upon the incremental method described in [52], which actually relies on the bivariate case. In fact, our method applies the baby-step giant-step paradigm over the whole representation of the algebra. Compared to [52, Lemma 4], the complexity bound given in Proposition 3.3 has no restrictions on the characteristic and is asymptotically smaller.
The main contribution of this paper is section 4, which is devoted to a new deterministic algorithm to multiply elements in towers of separable field extensions (all the $\mu_i$ are thus irreducible and separable). For the first time we achieve an asymptotic complexity with an exponent that becomes arbitrarily close to one. From the asymptotic complexity point of view, and in the specific case of towers of separable field extensions, this improves upon [44, 52], and also upon the randomized algorithms for finite fields from [51]. For specific towers over finite fields, some of the algorithms from [17] remain more efficient both in practice and in theory.
The main idea behind our algorithms is as follows: we use the primitive element theorem to replace a general tower of height $t$ by a new "accelerated" tower whose height we can control. There is a tradeoff between the cost of tower arithmetic (more expensive for larger heights) and primitive element computations (more expensive for smaller heights). By making a suitable compromise, we obtain our main complexity bound.
One shortcoming of our main result is that it only applies to towers of fields. Now if an algorithm is available for factoring univariate polynomials over $\mathbb{K}$ into irreducibles, then our results generalize to towers of separable integral extensions over $\mathbb{K}$, by "factoring" them into towers of separable field extensions. This reduction and the corresponding conversion algorithms (which generalize Chinese remaindering) are detailed in section 5. Some particular cases when the cost of factorization is often affordable are discussed in section 5.5.
Our final section 6 contains a general subquadratic Las Vegas complexity bound for computing primitive element representations of towers of separable field extensions, along with the related data for conversions: our algorithms directly benefit from the fast product, and they rely on power projections as in [11, 42, 51]. Compared to [52, Lemma 4], the bounds of section 6 are asymptotically smaller and come without restrictions on the characteristic. When working over a finite field, section 6 does not improve upon the asymptotic complexity bounds from [51]. Nevertheless, [51] relies on the Kedlaya–Umans algorithm for modular composition, so we expect our approach to behave better in practice.
Let us finally stress once more that we regard our contribution as a uniform way to prove almost optimal complexity bounds for tower arithmetic in the algebraic complexity model. It is well known that this model is reasonably close to bit complexity models when working over finite fields. Over other fields such as $\mathbb{Q}$, coefficient growth needs to be taken into account, which leads us beyond the scope of this paper. From a practical perspective, we also recall that efficient tower arithmetic has been developed over various specific kinds of base fields such as finite fields and subfields of $\mathbb{C}$: see [17, 38, 39] and section 5.5. It is true however that actual implementations of our own algorithms have fallen somewhat behind. We intend to address such implementation issues in the near future.
This section gathers basic prerequisites on complexity models and polynomial arithmetic. In particular we recall Lebreton's results on multiplication in towers [44]. Then we examine the costs of divisions.
Let $\mathbb{A}$ be an effective ring. Our complexity analyses concern the algebraic complexity model [14, Chapter 4] over $\mathbb{A}$, and more precisely straight-line programs and computation trees. In other words, running times of algorithms are measured in terms of the required number of ring operations in $\mathbb{A}$, and constants in $\mathbb{A}$ are thought to be freely at our disposal.
It is convenient to introduce abstract notations for the respective costs of a multiplication and of a division in $\mathbb{A}$ (whenever division is defined). For simplicity we always assume that multiplication is at least as expensive as addition, subtraction, and zero testing.
For randomized algorithms over a finite effective ring $\mathbb{A}$, we assume a special instruction that uniformly generates random elements in $\mathbb{A}$, with constant cost. For a given input, the cost of a randomized algorithm is thus a random variable. The expected cost for a given input size is defined as the maximum of the averages of these random variables over all possible inputs of that size.
Remark
Let $\mathbb{A}$ be an effective commutative ring with unity. We denote by $\mathsf{M}_{\mathbb{A}}(n)$ a cost function for multiplying two polynomials in $\mathbb{A}[x]$ of degree $< n$. For general $\mathbb{A}$, it has been shown in [15] that one may take
(2.1)  $\mathsf{M}_{\mathbb{A}}(n) = O(n \log n \log \log n)$.
For rings of positive characteristic, one has $\mathsf{M}_{\mathbb{A}}(n) = O(n \log n\, 4^{\log^* n})$ by [30], where $\log^*$ denotes the iterated logarithm. We make the following assumptions:
$\mathsf{M}_{\mathbb{A}}(n)/n$ is a non-decreasing function in $n$.
$\mathsf{M}_{\mathbb{A}}$ is sufficiently close to linear, in the sense that
(2.2)  $\mathsf{M}_{\mathbb{A}}(2n) \leq C\, \mathsf{M}_{\mathbb{A}}(n)$
holds whenever $n \geq 1$, for some universal constant $C$.
These assumptions hold for $\mathsf{M}_{\mathbb{A}}$ as in (2.1). Notice that (2.2) is also equivalent to $\mathsf{M}_{\mathbb{A}}(O(n)) = O(\mathsf{M}_{\mathbb{A}}(n))$. Applying equation (2.2) a finite number of times, it follows that $\mathsf{M}_{\mathbb{A}}(kn) = O(\mathsf{M}_{\mathbb{A}}(n))$ for any fixed positive integer $k$.
If $\mathbb{B}$ is an effective $\mathbb{A}$-algebra that is also a finite dimensional free $\mathbb{A}$-module, then we denote by $\mathsf{M}_{\mathbb{B}}(n)$ a cost function for multiplying two polynomials in $\mathbb{B}[x]$ of degree $< n$, in terms of the number of operations in $\mathbb{A}$, and we make the same assumptions as for $\mathsf{M}_{\mathbb{A}}$. Notice that we may always take $\mathsf{M}_{\mathbb{A}} \leq \mathsf{M}_{\mathbb{B}}$.
In particular, if $\mu$ is a monic polynomial in $\mathbb{A}[x]$ of degree $m$, then $\mathbb{B} = \mathbb{A}[x]/(\mu(x))$ is such an $\mathbb{A}$-algebra of dimension $m$; elements in $\mathbb{B}$ are represented by polynomials in $\mathbb{A}[x]$ of degree $< m$. In this case, multiplications in $\mathbb{B}$ take $O(\mathsf{M}_{\mathbb{A}}(m))$ operations in $\mathbb{A}$, and we may take $\mathsf{M}_{\mathbb{B}}(n) = O(\mathsf{M}_{\mathbb{A}}(m n))$ by means of Kronecker substitution [27, Chapter 8].
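The following Python sketch (reusing p, poly_mul and poly_rem from the first sketch) illustrates Kronecker substitution for multiplying two polynomials over such an algebra $\mathbb{B} = \mathbb{A}[x]/(\mu)$: the packing stride $2d - 1$ guarantees that the coefficients of the product cannot overlap. It is a toy version over a small prime field, not an optimized implementation.

def kronecker_mul(F, G, mu):
    """Multiply F, G in B[y] with B = GF(p)[x]/(mu), via Kronecker substitution:
    pack with stride 2d - 1, perform one univariate product over GF(p), then
    slice the result back and reduce each slice modulo mu."""
    d = len(mu) - 1
    step = 2 * d - 1                              # packing stride: no overlaps
    def pack(P):
        out = [0] * (len(P) * step)
        for i, coef in enumerate(P):
            for j, c in enumerate(coef):
                out[i * step + j] = c
        return out
    prod = poly_mul(pack(F), pack(G))
    prod += [0] * ((-len(prod)) % step)           # pad to a whole number of slices
    slices = [prod[k:k + step] for k in range(0, len(prod), step)]
    return [poly_rem(s, mu) for s in slices][:len(F) + len(G) - 1]

# Example in (GF(101)[x]/(x^2 + 1))[y]:
mu = [1, 0, 1]
F = [[1, 2], [3, 0]]          # (1 + 2x) + 3y
G = [[0, 1], [1, 1]]          # x + (1 + x)y
print(kronecker_mul(F, G, mu))   # [[99, 1], [100, 6], [3, 3]]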
If $\mathbb{B}$ is a finitely generated $\mathbb{A}$-algebra, then an element $a \in \mathbb{B}$ is said to be primitive if $\mathbb{A}[a] = \mathbb{B}$. In this case, we write $\chi_a \in \mathbb{A}[z]$ for the monic minimal polynomial of $a$, so the following isomorphism holds: $\mathbb{B} \cong \mathbb{A}[z]/(\chi_a(z))$.
Notice that such a primitive element representation does not necessarily exist: consider $\mathbb{A} = \mathbb{Q}$ and $\mathbb{B} = \mathbb{Q}[x_1, x_2]/(x_1^2, x_2^2)$, for which every element admits a minimal polynomial of degree at most three.
A primitive element representation of the tower $(\mathbb{K}_i)_{i \leq t}$ consists of the following data:
A primitive element $a_i$ of $\mathbb{K}_i$ over $\mathbb{K}$, for $i = 1, \ldots, t$.
The minimal polynomial $\chi_i$ of $a_i$ over $\mathbb{K}$, for $i = 1, \ldots, t$.
Polynomials $g_{i,j} \in \mathbb{K}[z]$ such that $\alpha_j = g_{i,j}(a_i)$, for $i = 1, \ldots, t$ and $j = 1, \ldots, i$.
These data induce the following isomorphisms for $i = 1, \ldots, t$: $\mathbb{K}_i \cong \mathbb{K}[z]/(\chi_i(z))$, with $a_i$ corresponding to the class of $z$.
The polynomials $g_{i,j}$ are called parametrizations of the $\alpha_j$ in terms of the $a_i$.
With the triangular set $(T_1, \ldots, T_t)$ as defined in the introduction, an element of $\mathbb{K}_i$ is represented by a polynomial in $\mathbb{K}[x_1, \ldots, x_i]$, defined modulo $(T_1, \ldots, T_i)$. So the conversion of such an element into an element of $\mathbb{K}[z]/(\chi_i(z))$ boils down to evaluating this polynomial at $(g_{i,1}(a_i), \ldots, g_{i,i}(a_i))$ modulo $\chi_i$. The backward conversion consists in evaluating a univariate polynomial at $a_i$ in the original representation of $\mathbb{K}_i$. Such conversions are the topic of section 3 below.
Remark
Under the assumptions of section 2.2, recall that there exists a constant $K$ such that $\mathsf{M}_{\mathbb{A}_i}(n) \leq K\, \mathsf{M}_{\mathbb{A}_{i-1}}(d_i n)$ for $i = 1, \ldots, t$, where $\mathsf{M}_{\mathbb{A}_i}$ represents a cost function for multiplying two polynomials in $\mathbb{A}_i[x]$ of degree $< n$, in terms of the number of operations in $\mathbb{A}$. When using this bound in an iterated fashion, we obtain $\mathsf{M}_{\mathbb{A}_t}(n) \leq K^t\, \mathsf{M}_{\mathbb{A}}(d\, n)$.
By taking care of the cost of polynomial divisions in terms of multiplications, one may prove the following sharper bounds:
Proof. Taking , Lebreton proved that ; see [44, section 3.2]. His proof was done for the particular cost function . For completeness, we briefly repeat this proof, for a general cost function such that is non-decreasing. We recall that two polynomials of partial degrees in the can be multiplied in time using Kronecker substitution recursively; see [27, Chapter 8] and [44, section 3.2].
We let represent the triangular set associated to the tower, as defined in section 1.3. We identify elements in the tower with their canonical representatives in . Let represent the reverse polynomial of in . Assume for now that we precomputed polynomials in , of partial degrees in the , such that
Let represent a cost function for reducing polynomials in of degrees in for modulo . The reduction of such a polynomial modulo is done recursively, as follows:
Let . If , then we reduce modulo and we are done.
Otherwise, let be the reverse of .
Reduce modulo , in time .
Compute , in time .
Reduce modulo , in time .
Let , compute , and then reduce modulo . It turns out that equals modulo ; see for instance [27, Chapter 9, section 1].
Altogether, the cost of this algorithm is therefore bounded by
for some universal constant . We deduce that .
Now let and be the respective representatives of two elements and to be multiplied in the tower. The product takes , and is then reduced modulo , whence
For the second bound of the proposition, we may multiply two polynomials in as in with operations in , and then reduce their product coefficientwise modulo with cost , whence .
It remains to observe that once are known, the precomputation of using Newton's method takes operations in ; see [27, section 9.1], for instance. Therefore, the complexity bounds obtained so far remain valid up to a constant factor when taking the cost for obtaining the into account. Finally, our assumption for implies , so (2.2) leads to the claimed bounds.
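To make the nested data layout of such towers concrete, here is a self-contained Python sketch of naive recursive arithmetic over a toy prime field: an element of $\mathbb{A}_i$ is a list of $d_i$ elements of $\mathbb{A}_{i-1}$, and products are reduced level by level against the monic defining polynomials. The quasi-linear bounds above additionally rely on Kronecker substitution and on the precomputed reciprocals of the reversed defining polynomials, which this sketch deliberately omits.

p = 101

def zero_elt(level, tower):
    """The zero element of A_level, where tower = [mu_1, ..., mu_t]."""
    if level == 0:
        return 0
    return [zero_elt(level - 1, tower) for _ in range(len(tower[level - 1]) - 1)]

def add(a, b, level):
    if level == 0:
        return (a + b) % p
    return [add(x, y, level - 1) for x, y in zip(a, b)]

def neg(a, level):
    if level == 0:
        return (-a) % p
    return [neg(x, level - 1) for x in a]

def mul(a, b, tower):
    """Product of a and b in A_level, where level = len(tower)."""
    level = len(tower)
    if level == 0:
        return (a * b) % p
    sub, mu = tower[:-1], tower[-1]
    d, zero = len(mu) - 1, zero_elt(level - 1, tower)
    # schoolbook product as polynomials in x_level over A_{level-1}
    h = [zero] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            h[i + j] = add(h[i + j], mul(ai, bj, sub), level - 1)
    # reduction modulo the monic defining polynomial mu = mu_level
    for i in range(len(h) - 1, d - 1, -1):
        c = h[i]
        for j in range(d):
            h[i - d + j] = add(h[i - d + j], neg(mul(c, mu[j], sub), level - 1), level - 1)
        h[i] = zero                               # leading coefficient of mu is 1
    return h[:d]

# Example tower: GF(101) < K_1 = GF(101)[x1]/(x1^2 + 1) < K_2 = K_1[x2]/(x2^2 - x1).
mu1 = [1, 0, 1]                                   # coefficients in GF(101)
one, nil = [1, 0], [0, 0]                         # elements of K_1
mu2 = [[0, 100], nil, one]                        # x2^2 - x1, coefficients in K_1
x2 = [nil, one]                                   # the class of x2 in K_2
print(mul(x2, x2, [mu1, mu2]))                    # [[0, 1], [0, 0]], i.e. x1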
Remark
Remark
In the remainder of this section, we assume that we work in a tower of fields $(\mathbb{K}_i)_{i \leq t}$ over $\mathbb{K}$. Given $f, g \in \mathbb{K}_i[x]$ such that $g$ is monic, it is well known [27, Chapter 9] that the quotient and the remainder of the Euclidean division of $f$ by $g$ can be computed in time $O(\mathsf{M}_{\mathbb{K}_i}(\deg f))$.
It is important for us that the gcd algorithm with input $f, g \in \mathbb{K}[x]$ returns the monic polynomial $h = \gcd(f, g)$ along with cofactors $u, v$ such that $u f + v g = h$. In this way, if $h = 1$ then $u$ is the inverse of $f$ modulo $g$ and $v$ is the inverse of $g$ modulo $f$. It is well known [27, Chapter 11, Algorithm 11.6] that this extended gcd can be performed in time $O(\mathsf{M}_{\mathbb{K}}(n) \log n)$, where $n$ bounds the degrees of $f$ and $g$.
In fact, when replacing all polynomial divisions by pseudo-divisions in [27, Chapter 11], we avoid all divisions in $\mathbb{K}$, but the computed gcd is not necessarily monic. Multiplying $h$, $u$ and $v$ with the inverse of the leading coefficient of $h$, we do obtain a monic gcd, at the expense of a single division in $\mathbb{K}$. Summarizing, this shows that the extended monic gcd can actually be computed in time $O(\mathsf{M}_{\mathbb{K}}(n) \log n)$, plus one division in $\mathbb{K}$.
Now consider an extension $\mathbb{L} = \mathbb{K}[x]/(\mu(x))$ of $\mathbb{K}$, where $\mu$ is a monic irreducible polynomial of degree $m$. Then the above bounds for division and gcd computations yield a cost of $O(\mathsf{M}_{\mathbb{K}}(m) \log m)$ operations in $\mathbb{K}$, plus one division in $\mathbb{K}$, for inverting a non-zero element of $\mathbb{L}$.
When using the univariate algorithm for inverting non-zero elements in in a recursive manner, we obtain the following complexity bound:
Proof. Let and be universal constants with for all and for all extensions of as previously considered. Then we have
and we conclude by induction.
Let us mention that inversion modulo a general triangular set (that does not determine a prime ideal) is more subtle; see various approaches in [16, 49].
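The following Python sketch implements this inversion for the univariate level $\mathbb{K}[x]/(\mu)$ over a toy prime field, via the classical extended Euclidean algorithm; in a tower, the field operations it performs would themselves be carried out recursively in the lower levels, as in the induction above, and a half-gcd algorithm would bring the cost down to the softly linear bound quoted above.

p = 101

def poly_divmod(f, g):
    """Quotient and remainder of f by g over GF(p); g must be non-zero with an
    exact leading coefficient (no trailing zero entries)."""
    f = f[:]
    inv_lead = pow(g[-1], p - 2, p)
    q = [0] * max(len(f) - len(g) + 1, 1)
    for i in range(len(f) - len(g), -1, -1):
        c = f[i + len(g) - 1] * inv_lead % p
        q[i] = c
        if c:
            for j, gj in enumerate(g):
                f[i + j] = (f[i + j] - c * gj) % p
    return q, f[:len(g) - 1]

def invmod(a, mu):
    """Inverse of a modulo mu over GF(p), assuming deg a < deg mu and gcd(a, mu) = 1."""
    r0, r1, u0, u1 = mu[:], a[:], [0], [1]
    while len(r1) > 1 and r1[-1] == 0:            # normalize the input
        r1.pop()
    while any(r1):
        q, r = poly_divmod(r0, r1)
        while len(r) > 1 and r[-1] == 0:
            r.pop()
        t = [0] * max(len(u0), len(q) + len(u1) - 1)   # t = u0 - q*u1
        for i, c in enumerate(u0):
            t[i] = c
        for i, qi in enumerate(q):
            for j, uj in enumerate(u1):
                t[i + j] = (t[i + j] - qi * uj) % p
        r0, r1, u0, u1 = r1, r, u1, t
    c_inv = pow(r0[0], p - 2, p)                  # r0 is the constant gcd
    inv = [c * c_inv % p for c in u0]
    return inv + [0] * (len(mu) - 1 - len(inv))

# Example: the inverse of x modulo x^2 + 1 over GF(101) is -x.
print(invmod([0, 1], [1, 0, 1]))                  # [0, 100]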
Throughout this section, $\mathbb{A}$ represents an effective ring and $\mathbb{B}$ is an effective $\mathbb{A}$-algebra with a given basis. Given a polynomial $F \in \mathbb{A}[y]$ and a point $\alpha \in \mathbb{B}$, we study how to compute $F(\alpha)$ efficiently. We first recall the well known baby-step giant-step algorithm in the univariate case. In section 4 below, these evaluation algorithms will be used for conversions between triangular and primitive representations of the same algebra.
Given a polynomial $F \in \mathbb{A}[y]$ of degree $< n$ and $\alpha \in \mathbb{B}$, how can we evaluate $F$ efficiently at $\alpha$? Horner's method uses $O(n)$ additions and multiplications in $\mathbb{B}$. For the convenience of the reader, we now recall a well known faster algorithm from [50] that relies on the baby-step giant-step technique.
Algorithm
Output: .
Proof. By construction, we have for , whence . This proves the correctness of the algorithm. Step 2a requires multiplications in , when computing the powers using . Step 2b takes assignments in , which come for free in our complexity model. The matrix multiplication in step 3 can be done in time . Step 5 involves multiplications and additions in , when using Horner's method. Altogether, this leads to the claimed running time.
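As a concrete illustration of the baby-step giant-step paradigm, here is a Python sketch of univariate modular composition over the toy prime field of the earlier sketches (it reuses p and mulmod); the matrix product of the algorithm is unrolled into explicit inner sums, so the sketch reproduces the method but not the precise complexity bound.

from math import isqrt

def compose_mod(F, a, mu):
    """Evaluate F (a univariate polynomial over GF(p), low degree first) at the
    point a of GF(p)[x]/(mu), using baby steps 1, a, ..., a^(m-1) and a Horner
    scheme in the giant step a^m, with m close to sqrt(deg F)."""
    d, n = len(mu) - 1, len(F)
    m = isqrt(n - 1) + 1 if n > 1 else 1
    powers = [[1] + [0] * (d - 1)]                # baby steps a^0, ..., a^(m-1)
    for _ in range(m):
        powers.append(mulmod(powers[-1], a, mu))
    giant = powers[m]                             # the giant step a^m
    blocks = []                                   # inner sums F_i(a), with deg F_i < m
    for i in range((n + m - 1) // m):
        acc = [0] * d
        for j in range(m):
            if i * m + j < n:
                c = F[i * m + j]
                acc = [(x + c * y) % p for x, y in zip(acc, powers[j])]
        blocks.append(acc)
    result = blocks[-1]                           # Horner scheme in a^m
    for blk in reversed(blocks[:-1]):
        result = [(x + y) % p for x, y in zip(mulmod(result, giant, mu), blk)]
    return result

# Example: F(y) = y^2 + 1 evaluated at a = x in GF(101)[x]/(x^2 + 1) gives 0.
print(compose_mod([1, 0, 1], [0, 1], [1, 0, 1]))  # [0, 0]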
Remark
Let us now study the evaluation of a multivariate polynomial $F \in \mathbb{A}[y_1, \ldots, y_t]$ of partial degree $< d_i$ in each $y_i$ at a point $(\alpha_1, \ldots, \alpha_t) \in \mathbb{B}^t$. Setting $n := d_1 \cdots d_t$ and writing $F_{i_1, \ldots, i_t}$ for the coefficient of the monomial $y_1^{i_1} \cdots y_t^{i_t}$ in $F$, we have the following generalization of Algorithm 3.1 to multivariate polynomials.
Algorithm
Output: .
Proof. This time, for all , we have
Plugging this expression into the formula of step 6 shows that the algorithm indeed returns the desired evaluation.
From we always have . If , then we obtain
and then
The lower bound on also implies
whence
Otherwise straightforwardly implies and . In all cases and are in . Consequently the complexity analysis is the same as for Algorithm 3.1. Of course, one has to carefully evaluate the power products in step 3b and the sum in step 6, in order to use only multiplications in .
Remark
Let $(\mathbb{K}_i)_{i \leq t}$ be a tower of separable algebraic field extensions of an effective base field $\mathbb{K} = \mathbb{K}_0$. Throughout this section, we assume that the extension degrees satisfy $d_i \geq 2$ for all $i$ (which is usually not restrictive, as explained in Remark 2.6), and that the multiplication cost function is as in Proposition 2.4. Our main objective is to design a fast algorithm for multiplying elements in the tower.
The basic idea is as follows. Let $\epsilon$ be an arbitrarily small positive constant. If at least half of the degrees $d_i$ are "sufficiently large", then the height $t$ automatically becomes "small enough" to ensure that the cost of Proposition 2.4 stays within $O(d^{1+\epsilon})$. Otherwise, there exist consecutive small degrees $d_{i+1}, \ldots, d_j$, and we replace the corresponding subtower of extensions between $\mathbb{K}_i$ and $\mathbb{K}_j$ by a primitive one $\mathbb{K}_i \subseteq \mathbb{K}_i[a_j] = \mathbb{K}_j$. We keep repeating these replacements until the height of the tower becomes "small enough". Some careful balancing is required here, since the computation of a primitive element for $\mathbb{K}_j$ over $\mathbb{K}_i$ can become expensive if the relative degree $d_{i+1} \cdots d_j$ gets "too large". The precise tradeoff will be made explicit at the end of the section.
In this section, given two effective rings and along with a natural way to convert elements in to elements in , we denote by the cost of such conversions. If we also have an algorithm for conversions in the opposite direction, then we write .
Assuming that $\mathbb{K}$ contains sufficiently many elements, the aim of this subsection is to construct a primitive element representation in the sense of Definition 2.2. We thus have to compute primitive elements $a_i$ over $\mathbb{K}$ such that $\mathbb{K}_i = \mathbb{K}[a_i]$ for $i = 1, \ldots, t$, together with the minimal polynomials $\chi_i$ of $a_i$ over $\mathbb{K}$. We also show how to convert efficiently between each of the two representations of $\mathbb{K}_i$.
Proof. The bound on the cost of conversions from to is a direct consequence of Proposition 3.3 by taking the assumption into account. For , the minimal polynomial of over may thus be converted into a minimal polynomial over in time . The computation of requires extra operations since for all .
For conversions from to , we consider a polynomial . We evaluate at in seen as with cost by Proposition 3.1. We thus obtain a polynomial such that .
We next need to convert the coefficients of from to . Using a straightforward induction this leads to the following cost:
Here we again use of the assumptions that and for all .
Remark
The next proposition provides us with complexity bounds for computing primitive tower representations by induction. We denote the cardinality of $\mathbb{K}$ by $|\mathbb{K}|$.
We can compute of the form with , and in time
Given and , we can compute with for in time
In addition, if then the computations in part a can be achieved with expected cost by means of a randomized Las Vegas algorithm.
Proof. Thanks to Proposition 4.1, modulo conversions of cost between and , we rewrite into such that , , . We thus have
Now let be a parameter that will be specified later, and consider . The characteristic polynomial of over equals to . Consequently the characteristic polynomial of over is -proportional to
The polynomial can be obtained as the preimage of whose computation costs by a standard “divide and conquer” approach; see [5, Lemma 7], for instance. Then the resultant may be computed in time by [45, Corollary 26].
Regarding as an indeterminate, the polynomial has degree . Therefore may be interpolated from values of . The total cost to obtain is thus
Geometrically speaking, the system admits the pairwise distinct solutions in , where denotes the algebraic closure of . The polynomial is separable if, and only if, its roots are pairwise distinct. There are at most values of for which this is not the case, so we simply need to try distinct values of until becomes separable. Testing the separability of for values of is achieved as follows.
We first evaluate for . Using fast multi-point evaluation, this takes time . We next test whether the discriminant of in vanishes, for ; this can be done in time . Doing this for packets of values, the overall computation can be done in time . This completes the proof of part a.
As to part b, for any root of the gcd of and has degree , so the specialization property of subresultants ensures that the subresultant in of degree of and has the form with coprime to . This subresultant can be computed in time again by [45, Corollary 26]. Since and have degrees , we obtain the polynomial modulo in with additional cost . Then from we deduce that . For , we then take . Proposition 3.1 implies that the cost of these modular compositions is bounded by . This completes the proof of part b.
If , then we use a Las Vegas algorithm for part a: we pick at random in a set of size until is separable. Each pick is successful with probability at least , so the expected cost is .
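The following sketch, in Python with sympy and over $\mathbb{Q}$ for readability, makes the resultant-based construction concrete in the simplest bivariate situation: it computes the characteristic polynomial of a candidate primitive element $\gamma = x_2 + \lambda x_1$ by eliminating $x_1$, and tests separability with a gcd. The actual algorithm works over $\mathbb{K}_{i-1}$ and interpolates in $\lambda$; the polynomials and the value of $\lambda$ below are merely examples.

from sympy import symbols, resultant, gcd, diff, expand

x1, x2, z = symbols('x1 x2 z')
lam = 1                                   # candidate value of lambda

mu1 = x1**2 - 2                           # K_1 = Q(sqrt(2))
mu2 = x2**2 - 3                           # K_2 = K_1(sqrt(3))

# Characteristic polynomial of gamma = x2 + lam*x1 over Q:
# substitute x2 = z - lam*x1 and eliminate x1 by a resultant.
chi = expand(resultant(mu1, mu2.subs(x2, z - lam*x1), x1))
print(chi)                                # z**4 - 10*z**2 + 1

# gamma is primitive iff chi is separable, i.e. gcd(chi, chi') = 1.
print(gcd(chi, diff(chi, z)))             # 1, so gamma = x2 + x1 is primitive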
Proof. This directly follows from the latter proposition, using that for all .
Remark
The main problem with the complexity bounds of Proposition 2.4 is that the height $t$ of the tower may get as large as $\log_2 d$. Now if $t$ indeed gets large, then many of the $d_i$ are necessarily small. Furthermore, as long as a product of the form $d_{i+1} \cdots d_j$ is reasonably small, the results from the previous subsection allow us to change the given representation of $\mathbb{K}_j$ over $\mathbb{K}_i$ into a more efficient primitive element representation. Repeating this operation as many times as necessary, we reduce the height of the tower and guarantee that all defining polynomials have sufficiently large degrees. This process is detailed in the next paragraphs.
A sequence of integers .
A tower such that for , represented by a sequence of defining polynomials , where and .
For , a primitive tower representation of seen as a tower over and ending with .
Denote . If and , then . If , then .
Proof. The construction of the sequence is done by induction. For with , assume that we have completed the construction up to height . We distinguish the following cases:
If then we set .
Otherwise, we take maximal such that . So either or .
Whenever we must have either or . Consequently the number of such indices is constrained by . It follows that . On the other hand the number of indices with is necessarily . It follows that . Once the indices are known, the existence of a -accelerated tower representation follows from Corollary 4.4.
Algorithm
Output. A -accelerated tower representation of .
In order to bound the execution time of Algorithm 4.1, we need to carefully analyze the cost of conversions between and for .
Proof. By Proposition 2.4, there exists a universal constant with
By Proposition 4.1, there also exists a universal constant such that conversions between and can be performed in time
for , and where we used the assumption that is non-decreasing in . If then such conversions are actually for free. Otherwise we have , whence
Now take and let us prove the lemma by induction over . For we have nothing to do, so we assume that . Using the induction hypothesis, conversions between and are performed in time
Consequently,
The lemma follows by induction.
In addition, if , then a randomized Las Vegas version of Algorithm 4.1 has expected cost .
Proof. The correctness of Algorithm 4.1 is ensured by Lemma 4.7. Let us analyze the cost of step 2a for a given . The necessary conversions for rewriting the defining polynomials of over can be done in time
by Lemma 4.8. If , then there is nothing to be done since . So assume that , whence . Then Proposition 2.7 gives
Consequently, in view of Corollary 4.4 and using assumption (2.2), the remainder of step 2a takes time
which dominates the cost of conversions since . The total cost for all is thus .
Finally, if , then the randomized variant from Corollary 4.4 leads to the claimed expected cost.
In addition, if a -accelerated tower representation of is known, then we have
Proof. The cost to obtain a -accelerated tower representation of is given in Proposition 4.9, namely .
Then, in order to multiply two elements in , we convert them into the accelerated representation, and multiply them with cost
by Proposition 2.4. We finally convert the product back into . By Lemma 4.8, each of these conversions can be done in time .
Proof. It is important to notice that constants hidden in the “” of Theorem 4.10 are independent of the value for , so we may freely set in terms of . Now taking balances the contributions of and . Indeed, since by Lemma 4.7, we have . We conclude by the latter theorem.
Remark
Up to now we have focused on towers of field extensions. In this section, we turn our attention to more general towers of separable integral ring extensions. Each extension ring is still assumed to be a product of separable field extensions over a common ground field , which will allow us to apply the theory from the previous sections.
We start with a description of the precise types of towers of rings that we will be able to handle. We next explain how such towers of rings can be “factored” into towers of fields and we analyze the cost of the corresponding conversions. Finally, we give a complexity bound for the computation of “tower factorizations”. We do not claim that our approach is efficient in all cases. Nevertheless it is expected to be so whenever the factorization process behaves as a pretreatment with negligible cost (or when factoring polynomials over is reasonably affordable; see section 5.5). See the conclusion of this paper for a discussion of alternative approaches.
Throughout this section, we still assume that the ground ring is a field, written $\mathbb{K}$, and we consider a tower $(\mathbb{A}_i)_{i \leq t}$ of integral ring extensions of $\mathbb{K}$. For $i = 1, \ldots, t$, we still let $\mu_i$ denote the defining polynomial of $\mathbb{A}_i$ over $\mathbb{A}_{i-1}$, we define $d_i$, $\alpha_i$, etc. as before, and we assume that $d_i \geq 2$. We say that $(\mathbb{A}_i)_{i \leq t}$ is a tower of separable integral ring extensions, or just separable in short, if
$\mathbb{A}_0$ is the field $\mathbb{K}$.
There exist polynomials $u_i$ and $v_i$ in $\mathbb{A}_{i-1}[x_i]$ such that $u_i \mu_i + v_i \mu_i' = 1$, for all $i = 1, \ldots, t$, where $\mu_i'$ denotes the derivative of $\mu_i$.
By induction over $i$, one may check that each $\mathbb{A}_i$ is isomorphic to a direct product of separable algebraic field extensions of $\mathbb{K}$. The projection of $\mu_i$ into each of these factors is separable in the usual sense, which means that the projection and its derivative are coprime, for all $i$ and all factors.
In terms of the triangular set $(T_1, \ldots, T_t)$ defined in the introduction, a separable tower corresponds to the situation where the ideal $(T_1, \ldots, T_t)$ is radical over the algebraic closure $\bar{\mathbb{K}}$ of $\mathbb{K}$. In particular, this means that this ideal, and therefore the tower, are uniquely determined by the variety
$V = \{ (\xi_1, \ldots, \xi_t) \in \bar{\mathbb{K}}^t : T_1(\xi_1) = \cdots = T_t(\xi_1, \ldots, \xi_t) = 0 \}.$
For each $i$, we notice that the variety attached to the subtower $(\mathbb{A}_j)_{j \leq i}$ is the projection of $V$ onto the first $i$ coordinates.
From a geometric point of view, one may regard computations with elements in $\mathbb{A}_t$ as computations with all the zeros in $V$ in parallel. Alternatively, we may regard them as computations with parameters $\alpha_1, \ldots, \alpha_t$ subject to the polynomial constraints $T_1 = \cdots = T_t = 0$.
Consider two separable towers and of the same height and over the same base field . Let and denote the respective defining polynomials for and . We say that is a factor of if . This is the case if and only if there exist natural projections (that naturally extend to projections in a coefficientwise manner) such that:
divides for .
sends an element represented by to , for .
We say that is irreducible if the are all fields or, equivalently, if the are all irreducible.
The notation “()” stands for the empty tuple. Let be index sets of -tuples of integers with and
for suitable integers and . Consider a family of factor towers of with the property that only depends on for all , and write . We say that such a family of factors forms a tree factorization of if the variety is partitioned into . We say that the factorization is irreducible, if all its factor towers are irreducible. If and , then we also write for the defining polynomial of over .
It is convenient to represent such a factorization by a labeled tree (see Figure 5.1 ): the nodes are identified with the index set , and each node is labeled with the algebraic extension . The parent of the node with is simply the node . Each individual factor corresponds to a path from the root to a leaf.
[Figure 5.1. The labeled tree attached to a tree factorization of a tower.]
Given and , let
Projecting the equality
on the first coordinates, we observe that the projection of , for given , is the same for all , and equals , whence . Consequently, forms a tree factorization of the subtower . From an algebraic point of view, this means that we have a natural isomorphism
For any , we denote by the natural projection of onto . Dually, the family forms a tree factorization of the tower , where
for . In particular, if , then this relation becomes
This corresponds to the factorization
where we recall that stands for the defining polynomial of for , whereas is the defining polynomial of .
If , then (5.1) reduces into the well known isomorphism
from the Chinese remainder theorem. We may thus view the isomorphism (5.1) as a generalized Chinese remainder theorem. The multi-modular representation of an element in is simply its image under this isomorphism. The aim of this section is to present efficient algorithms for conversions between the usual and the multi-modular representations.
For the usual Chinese remainder theorem, efficient algorithms are well known for carrying out the corresponding conversions [6, 23, 48]. These algorithms are based on the technique of so-called remainder trees, for which recent improvements can be found in [4, 10, 35]. In particular, if $n$ denotes the degree of the product of the moduli, then the conversions from (5.4) can be carried out in time $O(\mathsf{M}_{\mathbb{K}}(n) \log n)$; see for instance [27, Chapter 10]. These fast algorithms actually work in our context. This means that the isomorphism
from (5.2) and (5.3) can be computed with a similar complexity, whenever the inverses of the cofactors $\prod_{j \neq i} \nu_j$ modulo $\nu_i$ are precomputed for all $i$, where $\nu_1, \ldots, \nu_s$ denote the moduli. In fact, if $b_1, \ldots, b_s$ are elements of the respective factors with natural preimages $\hat{b}_1, \ldots, \hat{b}_s$, then the natural preimage of $(b_1, \ldots, b_s)$ can be computed as
(5.5)  $\sum_{i=1}^{s} c_i\, \hat{b}_i \prod_{j \neq i} \nu_j \bmod \nu_1 \cdots \nu_s$, where $c_i$ denotes the inverse of $\prod_{j \neq i} \nu_j$ modulo $\nu_i$,
using fast "linear combination for linear moduli" [27, Algorithm 10.9].
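The following Python sketch (reusing p, poly_mul, poly_rem and invmod from the earlier sketches) spells out this generalized Chinese remaindering in the basic case of coprime monic moduli over a prime field; it uses a direct, quadratic-time recombination, whereas the text reaches softly linear complexity with remainder trees and formula (5.5).

from functools import reduce

def crt_polys(residues, moduli):
    """Chinese remaindering for pairwise coprime monic moduli over GF(p):
    return the unique polynomial congruent to residues[i] modulo moduli[i]."""
    M = reduce(poly_mul, moduli)                  # product of all the moduli
    acc = [0] * (len(M) - 1)
    for a_i, m_i in zip(residues, moduli):
        # cofactor M_i = M / m_i, computed as the product of the other moduli
        M_i = reduce(poly_mul, [m for m in moduli if m is not m_i], [1])
        e_i = poly_mul(M_i, invmod(poly_rem(M_i, m_i), m_i))   # e_i = 1 mod m_i, 0 mod m_j
        term = poly_rem(poly_mul(a_i, e_i), M)
        acc = [(x + y) % p for x, y in zip(acc, term + [0] * len(acc))]
    return acc

# Example over GF(101): residues 5 modulo (x - 1) and 7 modulo (x - 2).
print(crt_polys([[5], [7]], [[100, 1], [99, 1]]))   # [3, 2], i.e. 3 + 2x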
The idea is to combine these “isomorphisms at nodes ” in order to compute the global isomorphism from (5.1). For the complexity analysis, it is convenient to introduce the following maximal normalized cost of multiplication in any of the factors of the tree factorization:
For the time being, we may use Proposition 2.4 as a complexity bound for . The conversion of elements in into elements in is called multi-modular reduction and we may use the following algorithm for this operation:
Algorithm
Output: .
Proof. The correctness is straightforward from the definitions. We perform step 4b using fast univariate multi-modular reduction. Let be a universal constant such that multi-modular reduction of a polynomial of degree over an arbitrary effective ring can be performed in time .
Let us show by induction over that the algorithm runs in time . For , the result is obvious, so assume that . By the induction hypothesis, step 3 runs in time . By the definition of , the contribution of step 4b to the complexity is bounded by
Adding up, it follows that the algorithm runs in time .
The opposite conversion of elements in into elements in is called Chinese remaindering; we may use the following algorithm for this operation:
Algorithm
Output: such that for all .
Assumption: modulo are precomputed for all , , and .
Proof. The proof is very similar to the one of Proposition 5.1. The polynomials modulo are actually used in the Chinese remaindering of step 2 via formula (5.5).
Factoring univariate polynomials over an arbitrary effective field is generally expensive or even impossible [24, 25]. Still, if an algorithm for this task is given, then it is interesting to study how to exploit it for the computation of an irreducible tree factorization of a given separable tower . Once such an irreducible tree factorization is known, arithmetic and other operations can potentially be sped up by switching to the multi-modular representation thanks to Algorithms 5.1 and 5.2.
For the complexity analysis, the function represents the time necessary to factor a polynomial in , and it is again convenient to introduce the following maximal normalized cost of factoring univariate polynomials over any of the fields in the factor towers:
Here we notice that each of the irreducible factors is a tower of separable fields, so we may directly use the accelerated arithmetic from section 4 for computations over any of the .
Algorithm
Output: the irreducible tree factorization of .
If , then return .
Recursively compute the irreducible tree factorization of .
Compute using Algorithm 5.1.
Return , where .
Proof. The cost of all factorizations in step 4a is bounded by
Together with recursive calls, the total cost of all factorizations is bounded by
Step 3 takes time , by Proposition 5.1, and the same step for the recursive calls amounts to a similar complexity. We conclude by adding up these contributions.
We are now ready to present a major corollary of this result; in combination with Algorithms 5.1 and 5.2, it reduces arithmetic in general separable towers to arithmetic in accelerated towers of fields.
Proof. This follows from Proposition 5.3 combined to Corollary 4.11, which yields .
For certain specific fields of coefficients , factorization of univariate polynomials over can be reasonably cheap. Let us briefly study two particular such cases.
Even if $\mathbb{K} = \mathbb{Q}$ or if $\mathbb{K}$ is an algebraic number field, it may be useful to temporarily convert coefficients in $\mathbb{K}$ into complex floating point numbers at a suitable bit precision, and to convert the results of computations back using rational reconstruction techniques [27, Chapter 5].
A convenient framework for analyzing the “ultimate complexity” of numeric algorithms was introduced in [38]. Concerning the factorization of complex polynomials of degree , it was shown therein that all roots could be computed at precision in time for “sufficiently large precisions” . The actual precision from which this bound becomes relevant highly depends on the location of the roots. As a rule of thumb, a precision is often required and sufficient, but we refer the reader to the seminal works of Schönhage, Pan and others for details [47, 54].
Since is algebraically closed, towers of separable algebraic extensions can be factored into towers of degree one. From the “ultimate bit complexity” point of view from [38], it follows that the ultimate bit complexity of the conversions in Propositions 5.1 and 5.2 becomes and so does the cost of factoring itself in Proposition 5.3. Using our rule of thumb, we expect that a precision of the order of should be sufficient in practice in order to observe these complexity bounds. Small values of are thus expected to be favourable for this type of coefficients (notice that this was the least favourable case for general coefficients!). A refined comparison between the bit complexities of this approach and the building of accelerated towers would require efforts that are beyond the scope of the present paper.
This section revisits the computation of traces and characteristic polynomials in towers of field extensions over a general field by using fast products. Our algorithms are rather different from those of [52]: we compute traces faster, but for characteristic polynomials and primitive elements the bottleneck lies in the baby-step giant-step method, so the complexity exponents in terms of $d$ turn out to be the same as in [52] (notice that we nevertheless gain a logarithmic factor). Until the end of this section, $(\mathbb{K}_i)_{i \leq t}$ is a tower of separable field extensions of degrees $d_1, \ldots, d_t$ and the multiplication cost function is as in Proposition 2.4.
Let us recall how to compute traces in a tower of separable field extensions. We let $\widehat{\mu}_i$ represent the reverse polynomial of $\mu_i$. Its constant coefficient is $1$ and the trace function of $\mathbb{K}_i$ over $\mathbb{K}_{i-1}$ may be computed as the first $d_i$ terms of the power series $\widehat{\mu_i'}(z)/\widehat{\mu}_i(z) = \operatorname{Tr}(1) + \operatorname{Tr}(\alpha_i)\, z + \operatorname{Tr}(\alpha_i^2)\, z^2 + \cdots$, where $\widehat{\mu_i'}$ denotes the reverse of the derivative of $\mu_i$.
In other words, the trace map can be regarded as a vector in $\mathbb{K}_{i-1}^{d_i}$ that can be computed in time $O(\mathsf{M}_{\mathbb{K}_{i-1}}(d_i))$: indeed this vector represents the matrix of the trace map with respect to the basis $1, \alpha_i, \ldots, \alpha_i^{d_i - 1}$. Given such a vector representation, the individual evaluation of the trace map at an element in $\mathbb{K}_i$ takes $O(d_i)$ operations in $\mathbb{K}_{i-1}$.
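The following self-contained Python sketch computes this trace vector from the power series expansion of the quotient of reversed polynomials, over a toy prime field; the term-by-term division is quadratic, whereas a Newton iteration yields the softly linear bound stated above.

p = 101

def trace_vector(mu):
    """Trace vector (Tr(1), Tr(x), ..., Tr(x^(d-1))) of GF(p)[x]/(mu), obtained as
    the first d coefficients of the power series rev(mu') / rev(mu)."""
    d = len(mu) - 1
    dmu = [(i * c) % p for i, c in enumerate(mu)][1:]       # the derivative mu'
    num = list(reversed(dmu))                                # rev(mu'), degree d-1
    den = list(reversed(mu))                                 # rev(mu), constant term 1 (mu monic)
    traces, state = [], num + [0] * d
    for i in range(d):                                       # term-by-term power series division
        c = state[i]                                         # valid since den[0] = 1
        traces.append(c)
        for j, dj in enumerate(den):
            if i + j < len(state):
                state[i + j] = (state[i + j] - c * dj) % p
    return traces

# Example: mu = x^2 + 1 over GF(101); the roots are +-i, so Tr(1) = 2 and Tr(x) = 0.
print(trace_vector([1, 0, 1]))        # [2, 0]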
More generally, is a -algebra for all , and the relative trace function satisfies
Proof. First we compute , as a vector in , for . This computation takes time . We regard each as a -linear map from to , whose evaluation takes time . Regarding as a vector , we apply the transpose of to in order to obtain a new vector that represents . This computation takes time thanks to the transposition principle (as explained in [10]). We next apply the transpose of to and obtain a vector that represents . Continuing this way, we obtain as a vector in time .
We may take advantage of accelerated towers to compute a primitive element over , together with its minimal polynomial and the parametrizations with for . We first consider the computation of characteristic polynomials.
Proof. Lemma 6.1 and Proposition 2.4 allow us to compute the representation of as a vector in in time
Using Lemma 4.8, we convert into an element of in time . By using the transposed product in the vector representation of the linear form can be computed in time . Then, as in section 1.2, the computation of
is achieved by transposing Algorithm 3.1, so Proposition 3.1 yields the cost
We finally use the Newton–Girard identity (1.1) to recover the characteristic polynomial of with further cost ; see for instance [29, section 2.4]. The conclusion follows.
By taking as in Corollary 4.11, the expected cost of Theorem 6.2 simplifies to , thanks to the assumption .
We are now in a position to design a randomized algorithm with subquadratic expected cost for computing a primitive element representation of over .
Taking as in Corollary 4.11, this cost further simplifies into .
Proof. The characteristic polynomial of has total degree in . So its discriminant is a polynomial in of total degree . By the Schwartz–Zippel lemma [55, 57], picking at random with the in a set of size leads to a primitive element with probability . Computing the characteristic polynomial of takes time by Theorem 6.2. We can verify that is a primitive element by checking that its characteristic polynomial is separable with further cost . Consequently the expected time to find a primitive element is .
Now assume that a primitive element has been found, and let us consider the computation of polynomials such that for . We follow Kronecker's classical deformation technique of the primitive element; see for instance [28, section 3.3]. For this purpose, we first introduce a new formal indeterminate with , so that is the ring of tangent numbers over . We aim at computing the characteristic polynomial of over . Then
Notice that is already available at this point. The computation of the vector representation of the linear form can be done in time by the transposition principle. Therefore is obtained by transposing Algorithm 3.1. In view of Proposition 3.1, this computation takes time
At this point, we have computed the traces for all . We deduce the characteristic polynomial of using Newton–Girard's formula (1.1), so that
It follows that . The total cost to obtain all the for is
Now let be an element of . It can be converted into an element of in time O() by Lemma 4.8, where has partial degree in for . The expression of in terms of is obtained as in time by Proposition 3.3. By successively taking for , the polynomials can all be obtained in time
Finally, thanks to , the assumption implies the simplified complexity bound.
We have proved nearly optimal complexity bounds for basic arithmetic operations in towers of fields of arbitrary height. Our results are based on the newly introduced concept of “accelerated” towers. Two major problems remain open:
Do there exist algorithms of quasi-linear complexity $\tilde{O}(d)$ instead of $O(d^{1+\epsilon})$?
It is well known that this is indeed the case for towers of bounded height : see Propositions 2.4 and 2.7. In section 5.5, we have singled out a few other situations when quasi-linear algorithms do exist. We refer to [7] for yet a few other examples.
Can one efficiently construct accelerated towers for general separable towers, without relying on univariate polynomial factorization as in section 5?
One obvious strategy is to use the dynamic evaluation paradigm [19, 20] that we already discussed in the introduction, while controlling complexity issues using techniques from [16]. We do not see any serious obstacle to a positive answer, but the technical details are beyond the scope of this paper. We intend to come back to this problem in a future work.
Another major challenge concerns efficient implementations. Asymptotically fast algorithms based on Proposition 2.4 obviously only become interesting when is in the range where evaluation-interpolation style algorithms are used for univariate arithmetic. For instance, FFT-based algorithms are typically used for .
For such sufficiently large degrees, we expect our acceleration techniques to be useful. For instance, assume that the two top degrees $d_{t-1}$ and $d_t$ are both small. Then conversions between $\mathbb{K}_t$ and an equivalent primitive representation over $\mathbb{K}_{t-2}$ can be done very fast. Conversions with respect to the lower part of the tower can be done coefficientwise, so they are also fast. Without much cost, this allows us to diminish the height by one and gain a constant factor when applying Proposition 2.4.
This discussion indicates that our techniques should be of practical interest for sufficiently large degrees and generic base fields $\mathbb{K}$. For specific base fields and specific types of towers, various alternative approaches have been proposed [17, 38, 39], and it would require a more significant implementation effort to figure out which approaches are best; we intend to address this issue in the near future.
Ph. Aubry, D. Lazard, and M. Moreno Maza. On the theories of triangular sets. J. Symbolic Comput., 28(1–2):105–124, 1999.
S. Baktir, J. Pelzl, T. Wollinger, B. Sunar, and C. Paar. Optimal tower fields for hyperelliptic curve cryptosystems. In Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004, volume 1, pages 522–526. IEEE, 2004.
N. Benger and M. Scott. Constructing tower extensions of finite fields for implementation of pairing-based cryptography. In M. Anwar Hasan and T. Helleseth, editors, Arithmetic of Finite Fields. Third International Workshop, WAIFI 2010. Istanbul, Turkey, June 27-30, 2010. Proceedings, volume 6087 of Lecture Notes in Comput. Sci., pages 180–195. Springer-Verlag Berlin Heidelberg, 2010.
D. Bernstein. Scaled remainder trees. Available from https://cr.yp.to/arith/scaledmod-20040820.pdf, 2004.
J. Berthomieu, G. Lecerf, and G. Quintin. Polynomial root finding over local rings and application to error correcting codes. Appl. Algebra Engrg. Comm. Comput., 24(6):413–443, 2013.
A. Borodin and R. T. Moenck. Fast modular transforms. J. Comput. System Sci., 8:366–386, 1974.
A. Bostan, M. F. I. Chowdhury, J. van der Hoeven, and É. Schost. Homotopy methods for multiplication modulo triangular sets. J. Symbolic Comput., 46(12):1378–1402, 2011.
A. Bostan, F. Chyzak, F. Ollivier, B. Salvy, É. Schost, and A. Sedoglavic. Fast computation of power series solutions of systems of differential equations. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '07, pages 1012–1021, Philadelphia, PA, USA, 2007. Society for Industrial and Applied Mathematics.
A. Bostan, L. Gonzales-Vega, H. Perdry, and É. Schost. Complexity issues on Newton sums of polynomials. Distributed in the digital proceedings of MEGA'05, 2005.
A. Bostan, G. Lecerf, and É. Schost. Tellegen's principle into practice. In Hoon Hong, editor, Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation, ISSAC '03, pages 37–44, New York, NY, USA, 2003. ACM.
A. Bostan, B. Salvy, and É. Schost. Fast algorithms for zero-dimensional polynomial systems using duality. Appl. Algebra Engrg. Comm. Comput., 14(4):239–272, 2003.
F. Boulier, D. Lazard, F. Ollivier, and M. Petitot. Computing representations for radicals of finitely generated differential ideals. Appl. Algebra Engrg. Comm. Comput., 20(1):73–121, 2009.
R. P. Brent and H. T. Kung. Fast algorithms for manipulating formal power series. J. ACM, 25(4):581–595, 1978.
P. Bürgisser, M. Clausen, and M. A. Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, 1997.
D. G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Inform., 28:693–701, 1991.
X. Dahan, M. Moreno Maza, É. Schost, and Yuzhen Xie. On the complexity of the D5 principle. In J.-G. Dumas, editor, Proceedings of Transgressive Computing 2006: a conference in honor of Jean Della Dora, pages 149–168. U. J. Fourier, Grenoble, France, 2006.
L. De Feo, J. Doliskani, and É. Schost. Fast algorithms for ℓ-adic towers over finite fields. In Proceedings of the 38th International Symposium on Symbolic and Algebraic Computation, ISSAC '13, pages 165–172, New York, NY, USA, 2013. ACM.
L. De Feo and É. Schost. Fast arithmetics in Artin–Schreier towers over finite fields. J. Symbolic Comput., 47(7):771–792, 2012.
J. Della Dora, C. Dicrescenzo, and D. Duval. About a new method for computing in algebraic number fields. In B. F. Caviness, editor, EUROCAL '85: European Conference on Computer Algebra Linz, Austria, April 1–3 1985 Proceedings Vol. 2: Research Contributions, pages 289–290. Springer Berlin Heidelberg, 1985.
D. Duval. Algebraic numbers: An example of dynamic evaluation. J. Symbolic Comput., 18(5):429–445, 1994.
J.-C. Faugère, P. Gaudry, L. Huot, and G. Renault. Sub-cubic change of ordering for Gröbner basis: a probabilistic approach. In K. Nabeshima, editor, Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, ISSAC '14, pages 170–177, New York, NY, USA, 2014. ACM.
J.-C. Faugère, P. Gianni, D. Lazard, and T. Mora. Efficient computation of zero-dimensional Gröbner bases by change of ordering. J. Symbolic Comput., 16(4):329–344, 1993.
C. M. Fiduccia. Polynomial evaluation via the division algorithm: the fast Fourier transform revisited. In A. L. Rosenberg, editor, Fourth annual ACM symposium on theory of computing, pages 88–93, 1972.
A. Fröhlich and J. C. Shepherdson. On the factorisation of polynomials in a finite number of steps. Math. Z., 62:331–334, 1955.
A. Fröhlich and J. C. Shepherdson. Effective procedures in field theory. Philos. Trans. Roy. Soc. London. Ser. A., 248:407–432, 1956.
A. Garcia and H. Stichtenoth. A tower of Artin–Schreier extensions of function fields attaining the Drinfeld–Vladut bound. Invent. Math., 121(1):211–222, 1995.
J. von zur Gathen and J. Gerhard. Modern computer algebra. Cambridge University Press, New York, 3rd edition, 2013.
M. Giusti, G. Lecerf, and B. Salvy. A Gröbner free alternative for polynomial system solving. J. Complexity, 17(1):154–211, 2001.
B. Grenet, J. van der Hoeven, and G. Lecerf. Deterministic root finding over finite fields using Graeffe transforms. Appl. Algebra Engrg. Comm. Comput., 27(3):237–257, 2016.
D. Harvey and J. van der Hoeven. Faster integer and polynomial multiplication using cyclotomic coefficient rings. Technical report, ArXiv, 2017. http://arxiv.org/abs/1712.03693.
D. Harvey and J. van der Hoeven. Faster integer multiplication using short lattice vectors. Technical report, ArXiv, 2018. http://arxiv.org/abs/1802.07932, to appear in Proc. ANTS 2018.
D. Harvey and J. van der Hoeven. Faster integer multiplication using plain vanilla FFT primes. Math. Comp., 88(315):501–514, 2019.
D. Harvey, J. van der Hoeven, and G. Lecerf. Even faster integer multiplication. J. Complexity, 36:1–30, 2016.
D. Harvey, J. van der Hoeven, and G. Lecerf. Faster polynomial multiplication over finite fields. J. ACM, 63(6), 2017. Article 52.
J. van der Hoeven. Faster Chinese remaindering. Technical report, CNRS & École polytechnique, 2016. http://hal.archives-ouvertes.fr/hal-01403810.
J. van der Hoeven and G. Lecerf. On the complexity of multivariate blockwise polynomial multiplication. In J. van der Hoeven and M. van Hoeij, editors, Proceedings of the 37th International Symposium on Symbolic and Algebraic Computation, ISSAC '12, pages 211–218. ACM, 2012.
J. van der Hoeven and G. Lecerf. On the bit-complexity of sparse polynomial and series multiplication. J. Symbolic Comput., 50:227–254, 2013.
J. van der Hoeven and G. Lecerf. Modular composition via complex roots. Technical report, CNRS & École polytechnique, 2017. http://hal.archives-ouvertes.fr/hal-01455731.
J. van der Hoeven and G. Lecerf. Modular composition via factorization. J. Complexity, 48:36–68, 2018.
Xiaohan Huang and V. Y. Pan. Fast rectangular matrix multiplication and applications. J. Complexity, 14(2):257–299, 1998.
M. Kalkbrener. A generalized Euclidean algorithm for computing triangular representations of algebraic varieties. J. Symbolic Comput., 15(2):143–167, 1993.
K. S. Kedlaya and C. Umans. Fast polynomial factorization and modular composition. SIAM J. Comput., 40(6):1767–1802, 2011.
U. J. J. Le Verrier. Sur les variations séculaires des éléments elliptiques des sept planètes principales : Mercure, Venus, La Terre, Mars, Jupiter, Saturne et Uranus. J. Math. Pures Appli., 1:220–254, 1840.
R. Lebreton. Relaxed Hensel lifting of triangular sets. J. Symbolic Comput., 68(2):230–258, 2015.
G. Lecerf. On the complexity of the Lickteig–Roy subresultant algorithm. J. Symbolic Comput., 92:243–268, 2019.
X. Li, M. Moreno Maza, and É. Schost. Fast arithmetic for triangular sets: from theory to practice. J. Symbolic Comput., 44(7):891–907, 2009.
J. M. McNamee and V. Y. Pan. Numerical Methods for Roots of Polynomials, Part II, volume 16 of Studies in Computational Mathematics. Elsevier, 2013.
R. T. Moenck and A. Borodin. Fast modular transforms via division. In Thirteenth annual IEEE symposium on switching and automata theory, pages 90–96, Univ. Maryland, College Park, Md., 1972.
M. Moreno Maza, É. Schost, and P. Vrbik. Inversion modulo zero-dimensional regular chains. In V. P. Gerdt, W. Koepf, E. W. Mayr, and E. V. Vorozhtsov, editors, Computer Algebra in Scientific Computing. 14th International Workshop, CASC 2012. Maribor, Slovenia, September 2012. Proceedings, volume 7442 of Lect. Notes Comput. Sci., pages 224–235. Springer Berlin Heidelberg, 2012.
M. S. Paterson and L. J. Stockmeyer. On the number of nonscalar multiplications necessary to evaluate polynomials. SIAM J. Comput., 2(1):60–66, 1973.
A. Poteaux and É. Schost. Modular composition modulo triangular sets and applications. Comput. Complex., 22(3):463–516, 2013.
A. Poteaux and É. Schost. On the complexity of computing with zero-dimensional triangular sets. J. Symbolic Comput., 50:110–138, 2013.
J. F. Ritt. Differential algebra. Amer. Math. Soc., New York, 1950.
A. Schönhage. The fundamental theorem of algebra in terms of computational complexity. Technical report, Math. Inst. Univ. of Tübingen, 1982.
J. T. Schwartz. Fast probabilistic algorithms for verification of polynomial identities. J. ACM, 27(4):701–717, 1980.
V. Shoup. Fast construction of irreducible polynomials over finite fields. J. Symbolic Comput., 17(5):371–391, 1994.
R. Zippel. Probabilistic algorithms for sparse polynomials. In Edward W. Ng, editor, Symbolic and Algebraic Computation. EUROSAM '79, An International Symposium on Symbolic and Algebraic Manipulation, Marseille, France, June 1979, number 72 in Lect. Notes Comput. Sci., pages 216–226. Springer Berlin Heidelberg, 1979.