Fast interpolation of multivariate polynomialswith sparse exponents

Consider a multivariate integer polynomial

. Then

can uniquely be written as a sum

where

and

are pairwise distinct. Here we understand that

for any

. We call (1.1) the sparse representation of

In this paper, we assume that

is not explicitly given through its sparse representation and that we only have a program for evaluating

. The goal of sparse interpolation is to recover the sparse representation of

Theoretically speaking, we could simply evaluate

at a single point

with

for some sufficiently large positive integer

. Then the sparse representation of

can directly be read off from the binary digits of

. However, the bit-complexity of this method is terrible, since the bit-size of

typically becomes huge.

In order to get a better grip on the bit-complexity to evaluate and then interpolate

, we will assume that we actually have a program to evaluate

modulo

for any positive integer modulus

. We denote by

the cost to evaluate

for a modulus with

and we assume that the average cost per bit

is a non-decreasing function that grows not too fast as a function of

. Roughly speaking, when using fast integer arithmetic,

is essentially the number of arithmetic operations needed to evaluate

, up to logarithmic factors in

. More generally, this notation is useful for practice in order to distinguish the cost of the evaluation of

from the cost of the actual interpolation, independently of the underlying integer arithmetic.

As in [15], for complexity estimates we use a random memory access (RAM) machine over a finite alphabet along with the soft-Oh notation:

means that

. The machine is expected to have an instruction for generating a random bit in constant time (see section 2 for the precise computational model and hypotheses that we use). Assuming that we are given a bound

for the number of terms of

and a number

such that each term of

has bit-size

, the main result of this paper is the following:

If the

variables all appear in the sparse representation of

, then

and

, so the bound further simplifies into

. When using fast integer arithmetic to evaluate

, the latter bound simplifies to

, where

stands for the number of arithmetic operations needed to evaluate

. In addition, detecting useless variables can be done fast, e.g. using the heuristic method from [28, section 7.4].

The problem of sparse interpolation has a long history and goes back to work by Prony in the 18^th century [41]. The first modern fast algorithm is due to Ben-Or and Tiwari [6]. Their work spawned a new area of research in computer algebra together with early implementations [11, 13, 20, 31, 34, 35, 38, 45]. We refer to [42] and [40, section 3] for nice surveys.

Modern research on sparse interpolation has developed in two directions. The first theoretical line of research has focused on rigorous and general complexity bounds [2–4, 14, 17, 32]. The second direction concerns implementations, practical efficiency, and applications of sparse interpolation [5, 19, 21, 23, 26, 28, 30, 33, 36, 37].

The present paper mainly falls in the first line of research, although we will briefly discuss practical aspects in section 5. The proof of our main theorem relies on some number theoretical facts about prime numbers that will be recalled in section 2.3. There is actually a big discrepancy between empirical observations about prime numbers and hard theorems that one is able to prove. Because of this, our algorithms involve constant factors that are far more pessimistic than the ones that can be used in practice. Our algorithms also involve a few technical complications in order to cover all possible cases, including very large exponents that are unlikely to occur in practice.

Our paper borrows many techniques from [17, 40] that deal with the particular case when

is a univariate polynomial, and that achieves a quasi-optimal complexity bound for when the bit-sizes of the terms of

are well-balanced. In principle, the multivariate case can be reduced to the univariate case: setting

for a sufficiently large

, the interpolation of

reduces to the interpolation of

. However, this reduction is optimal only if the entries of the vector exponents

are all approximately of the same bit-size. One interesting case that is not well covered by this reduction is when the number of variables

is large and when the exponent vectors

are themselves sparse in the sense that only a few entries are non-zero. The main contribution of the present paper is the introduction of a quasi-optimal technique to address this problem.

Another case that is not well covered by [17, 40] is when the bit-sizes of the coefficients or exponents vary wildly. Recent progress on this issue has been made in [18] and it is plausible that these improvements can be extended to our multivariate setting (see also Remark 4.11). Theorem 4.9 also provides a quasi-optimal solution for an approximation of the sparse interpolation problem: for a fixed constant

, we only require the determination of at least

correct terms of (1.1).

In order to cover sparse exponent vectors in a more efficient way, we will introduce a new technique in section 3. The idea is to compress such exponent vectors using random projections. With high probability, it will be possible to reconstruct the actual exponent vectors from their projections. We regard this section as the central technical contribution of our paper. Let us further mention that random projections of the exponents were previously implemented in [4] in order to reduce the multivariate case to the univariate one: monomial collisions were avoided in this way but the reconstruction of the exponents needed linear algebra and could not really catch sparse or unbalanced exponents. The proof of Theorem 1.1 will be completed at the end of section 4. Section 5 will address the practical aspects of our new method. An important inspiration behind the techniques from section 3 and its practical variants is the mystery ball game from [23]; this connection will also be discussed in section 5.

Our paper focuses on the case when

has integer coefficients, but our algorithm can be easily adapted to rational coefficients as well, essentially by appealing to rational reconstruction [15, Chapter 5, section 5.10] during the proof of Lemma 4.8. However when

has rational coefficients, its blackbox might include divisions and therefore raise “division by zero” errors occasionally. This makes the probability analysis and the worst case complexity bounds more difficult to analyze, so we preferred to postpone this study to another paper.

Our algorithm should also be applicable to various other coefficient rings of characteristic zero. However it remains an open challenge to develop similar algorithms for coefficient rings of small positive characteristic.

Notation.

We may use both

and

as sets of canonical representatives modulo

. Given

and depending on the context, we write

for the unique

with

2.Preliminaries

This section presents sparse data structures, computational models, and quantitative results about prime distributions. At the end, an elementary fast algorithm is presented for testing the divisibility of several integers by several prime numbers.

2.1.Sparse polynomials

We order

lexicographically by

. Given formal indeterminates

and an exponent

, we define

. We define the bit-size of an integer

. In particular,

, etc. We define the bit-size of an exponent tuple

. We extend these definitions to the cases when

and

by setting

and

, where

where

and

are such that

. We call this the sparse representation of

. We call

the exponents of

and

the corresponding coefficients. We also say that

are the terms of

and we call

the support of

. Any non-zero

with

and

is called an individual exponent of

. We define

to be the bit-size of

Remark 2.1. For the complexity model we could have chosen a multi-tape Turing machine, but this would have led to more tedious cost analyses. In fact on a Turing tape, we would actually need to indicate the ends of numbers using adequate markers. Using a liberal notion of “bit”, which allows for the storage of such markers in a single bit, the Turing bit-size of an integer

then becomes

. For

, we also define

. Exponents

can be stored by appending the representations of

, but this is suboptimal in the case when only a few entries of

are non-zero. For such “sparse exponents”, one prefers to store the pairs

for which

, again using suitable markers. For this reason, the Turing bit-size of

becomes

2.2.Modular blackbox polynomials

Throughout this paper, we will analyze bit complexities in the RAM model as in [15]. In this model, it is known [22] that two

-bit integers can be multiplied in time

. As a consequence, given an

-bit modulus

, the ring operations in

can be done with the same complexity [9, 15]. Inverses can be computed in time

, whenever they exist [9, 15]. For randomized algorithms, we assume that we have an instruction for generating a random bit in constant time.

Consider a polynomial

. A modular blackbox representation for

is a program that takes a modulus

and

integers

as input, and that returns

. A modular blackbox polynomial is a polynomial

that is represented in this way. The cost (or, better, a cost function) of such a polynomial is a function

such that

yields an upper bound for the running time if

has bit-size

. It will be convenient to always assume that the average cost

per bit of the modulus is non-decreasing and that

for any

. Since

should at least read its

input values, we will freely assume that

Remark 2.2. A popular type of modular blackboxes are straight-line programs (SLPs) [10]. For an SLP of length

that only uses ring operations, the above average cost function usually becomes

, for some fixed constant

that does not depend on

If the SLP also allows for divisions, then we rather obtain

, but this is out of the scope of this paper, due to the “division by zero” issue. In fact, computation trees [10] are more suitable than SLPs in this context. For instance, the computation of determinants using Gaussian elimination naturally fits in this setting, since the chosen computation path may then depend on the modulus

However, although these bounds “usually” hold (i.e. for all common algebraic algorithms that we are aware of, including the RAM model), they may fail in pathological cases when the SLP randomly accesses data that are stored at very distant locations on the Turing tape. For this reason, the blackbox cost model may be preferred in order to study bit complexities. In this model, a suitable replacement for the length

of an SLP is the average cost function

, which typically involves only a logarithmic overhead in the bit-length

2.3.Number theoretic reminders

All along this paper

(resp.

) will bound the bit-size (resp. number of terms) of the polynomial to be interpolated, and

and

will denote random prime numbers that satisfy:

The construction of

and

will rely on the following number theoretic theorems, where

stands for the natural logarithm, that is

. We will also use

Proof. The function

is increasing for

and

. So it is always non-decreasing and continuous. The number of prime divisor of any

it at most

. Let

and

respectively be the number of divisors and prime divisors of

. Then clearly

. Now for all

we know from [39] that

We will need the following slightly modified version of [16, Theorem 2.1], which is itself based on a result from [44].

Proof. In [16, Theorem 2.1] the statement (b) is replaced by the simpler condition that

. But by looking at step 2 of the algorithm on page 4 of [44], we observe that

is actually uniformly distributed amongst the primes of

and that there are at least

such primes with high probability

Proof. Theorem 2.3 asserts that there are at least

primes in the interval

. The probability to fetch a prime number in

while avoiding at most

fixed numbers is at least

The probability of failure after

trials is at most

. By using the AKS algorithm [1] each primality test takes time

. The probability of success for picking

distinct prime numbers in this way is at least

In order to guarantee this probability of success to be at least

, it suffices to take

2.4.Amortized determination of prime divisors in a fixed set

Let

be prime numbers and let

be strictly positive integers. The aim of this subsection is to show that the set of pairs

can be computed in quasi-linear time using the following algorithm named

Algorithm divisors

Input: non empty subsets and .

Output: the set .

If , then return .
Let , let be a subset of of cardinality , and let .
Compute and .
Compute and .
Return .

Proof. Let

and

. Step 1 costs

by using fast multi-remaindering [15, Chapter 10]. Using fast sub-product trees [15, Chapter 10], step 3 takes

. By means of fast multi-remaindering again, step 4 takes

operations.

Let

denote the cost of the inner recursive calls occurring during the execution of the algorithm. Inside the recursive calls,

holds, so we have shown that

Unrolling this inequality and taking into account that the depth of the recursion is

, we deduce that

. Finally, the total cost of the algorithm is obtained by adding the cost of the top-level call, which is

3.Probabilistic codes for exponents

Consider a sparse polynomial

that we wish to interpolate. In the next section, we will describe a method that allows us to efficiently compute most of the exponents

in an encoded form

. The simplest codes

are of the form

When

, the most common such encoding is the Kronecker encoding, with

for all

. However, this encoding may incur large bit-size

with respect to the bit-size of

The aim of this section is to introduce more compact codes

. These codes will be “probabilistic” in the sense that we will only be able to recover

from

with high probability, under suitable assumptions on

. Moreover, the recovery algorithm is only efficient if we wish to simultaneously “bulk recover”

exponents from their codes.

3.1.The exponent encoding

Throughout this section, the number of variables

and the target number of exponents

are assumed to be given. We allow exponents to be vectors of arbitrary integers in

. Actual computations on exponents will be done modulo

for a fixed odd base

and a flexible

-adic precision

. We also fix a constant

such that

We finally assume

to be a parameter that will be specified in section 4. The exponent encoding will depend on one more parameter

We will sometimes write

and

instead of

and

in order to make the dependence on

and

explicit.

Example 3.1. Let us take

, and

. Then

and possible random choices for

and

are

3.2.Guessing individual exponents of prescribed size

Given an exponent

, our first aim is to determine the individual exponents

of small size. More precisely, assuming that

We say that

is transparent if for each

with

, there exists a

such that

. This property does only depend on the random choices of the

Example 3.3. In Example 3.1, we observe that

is transparent, by taking

for

. However,

is not transparent, since no suitable index

can be found for

. For our parameters, the probability from Lemma 3.2 becomes

. For random

with

, it turns out that this bound is a bit pessimistic.

Proof. Let

be the set of all primes strictly between

and

. Let

be the set of all

such that

are pairwise distinct. Let

be the subset of

of all choices of

for which

is not faithful.

Now consider the set

such that

is divisible by

. Any such

is a divisor of either

, or

. Since

is odd we have

From Theorem 2.3 we know that

. This yields

, as well as

, thanks to (3.3). We conclude that

whose rows are the reductions of

modulo

. By investigating the zero coefficients of this matrix, we verify that

is faithful. Again, for our choice of parameters, the probability estimate from Lemma 3.4 turns out to be somewhat pessimistic.

Given

and

, let

be the smallest index such that

and

. If no such

exists, then we let

. We define

Assume that

is transparent and faithful for some

. Given

, let

and

otherwise. Then the condition

implies that

always holds. Moreover, if

, then the definitions of “transparent” and “faithful” imply that

. In other words, these

can efficiently be recovered from

and

Example 3.6. Following up on our running Example 3.1, recall that

is both transparent and faithful. We claim that we can reconstruct

from

. Indeed, by looking at the matrix (3.5), we observe that

. Consequently,

, and

In what follows, we will need a procedure to efficiently compute

for a large number

of different encodings

, corresponding to the terms of a sparse polynomial.

Proof. Note that the hypotheses

and

imply that

. Using Lemma 2.7, we can compute all triples

with

in time

thanks to (3.2). Using

further operations we can filter out the triples

for which

. We next sort the resulting triples for the lexicographical ordering, which can again be done in time

. For each pair

, we finally only retain the first triple of the form

. This can once more be done in time

. Then the remaining triples are precisely those

with

Proof. Lemmas 3.2 and 3.4 bound the probability of failure for the transparency and the faithfulness of the random choices for each of the

terms. The complexity bound follows from Lemma 3.7.

3.3.Guessing exponents of prescribed size

Our next aim is to determine all exponents

with

, for some fixed threshold

. For this, we will apply Lemma 3.8 several times, for different choices of

. Let

We also choose

and

independently at random as explained in section 3.1, with

, and

in the roles of

, and

. So

is a set of

pairwise distinct random prime numbers in

and

is a tuple

of subsets of

of cardinality

. We finally define

Proof. We compute successive approximations

as follows. We start with

. Assuming that

is known, we compute

for all

, and then apply the algorithm underlying Lemma 3.8 to

and

. Note that for all

the equality

implies

Our choice of

and

, the inequality (3.6), and the assumption

successively ensure that

Let

be the set of indices

such that

and

. Lemma 3.8 provides us with

such that

for all

. In addition

holds whenever

and

with probability at least

At the end, we return

as our guess for

. For the analysis of this algorithm, we first assume that all applications of Lemma 3.8 succeed. Let us show by induction (and with the above definitions of

and

) that, for all

, and

, we have:

and

then (i) clearly holds. If

and

, then (3.6) and (3.7) imply

. Since

we have

. Now we clearly have

and

, so (i) holds.

, then let

be such that

, so we have

. The induction hypothesis (iii) and (3.4) yield

Hence the total bit-size of all

such that

is at least

and at most

by definition of

. This concludes the proof of (i). Now if

, then

, so Lemma 3.8 and (i) imply that

. In other words, we obtain an approximation

such that

for all

, and

holds whenever

As to (iii), assume that

. If

, then

is immediate. If

, then the induction hypothesis (ii) yields

, whence

As to the complexity bound, Lemma 3.9 shows that

, when applying Lemma 3.8. Consequently, the cost of all these applications for

is bounded by

since

. The cost of the evaluations of

and all further additions, subtractions, and modular reductions is not more expensive.

The bound for the probability of success follows by induction from Lemmas 3.8 and 3.9, while using the fact that all

and

are chosen independently.

4.Sparse interpolation

Throughout this section, we assume that

is a modular blackbox polynomial in

with at most

terms and of bit-size at most

. In order to simplify the cost analyses, it will be convenient to assume that

Our main goal is to interpolate

. From now on

will stand for the actual number of non-zero terms in the sparse representation of

Our strategy is based on the computation of increasingly good approximations of the interpolation of

, as in [3], for instance. The idea is to determine an approximation

, such that

contains roughly half the number of terms as

, and then to recursively re-run the same algorithm for

instead of

. Our first approximation will only contain terms of small bit-size. During later iterations, we will include terms of larger and larger bit-sizes.

so that at most

of the terms of

have size

. Our main technical aim will be to determine at least

terms of

of size

, with high probability.

4.1.Cyclic modular projections

Our interpolation algorithm is based on an extension of the univariate approach from [17]. One first key ingredient is to homomorphically project the polynomial

to an element of

for a suitable cycle length

and a suitable modulus

(of the form

, where

is as in the previous section and

and compute

as in Theorem 2.5. Now let

be random elements of

, and consider the map

Proof. Given

, let

and

. Note that

and

for any

. The first inequality yields

, whereas the second one implies

, then we say that a term

and the corresponding exponent

are

-faithful if there is no other term

such that

. If

, then we define

-faithfulness in the same way, while requiring in addition that

be invertible modulo

. For any

, we note that

-faithful if and only if

-faithful. We say that

-faithful if all its terms are

-faithful. In a similar way, we say that

-faithful if

is invertible for any non-zero term

The first step of our interpolation algorithm is similar to the one from [17] and consists of determining the exponents of

. Let

be a prime number. If

-faithful, then the exponents of

are precisely those of

Proof. We first precompute

in time

. We compute

by evaluating

for

. This takes

bit-operations. We next retrieve

from these values using an inverse discrete Fourier transform (DFT) of order

. This takes

further operations in

, using Bluestein's method [7].

Assuming that

-faithful and that we know

, consider the computation of

for higher precisions

. Now

is also

-faithful, so the exponents of

and

coincide. One crucial idea from [17] is to compute

using only

instead of

evaluations of

modulo

. This is the purpose of the next lemma.

Proof. We first Hensel lift the primitive

-th root of unity

to a principal

-th root of unity

in time

, as detailed in [17, section 2.2]. We next compute

in time

, using binary powering. We pursue with the evaluations

for

. This can be done in time

Now the exponents

are known, since they coincide with those of

, and we have the relation

where

denotes the coefficient of

, for

. It is well known that this linear system can be solved in quasi-linear time

: in fact this problem reduces to the usual interpolation problem thanks to the transposition principle [8, 35]; see for instance [25, section 5.1].

4.2.Probability of faithfulness

The next lemma is devoted to bounding the probability of picking up random prime moduli that yield

-faithful projections.

Proof. We let

and

as in (4.2), which allows us to apply Theorem 2.5. Let

and

. For any

, let

Given

, consider

and note that

. We say that

is admissible if

and

for all

. This is the case if and only if

is not divisible by

. Now

is divisible by at most

distinct prime numbers, by Theorem 2.4. Since there are at least

prime numbers in

, by Theorem 2.3, the probability that

is divisible by

is at most

since

from (4.1). Now consider two admissible exponents

and let

with

. For fixed values of

with

, there is a single choice of

with

. Hence the probability that this happens with random

. Consequently, for fixed

, the probability that

for some

is at most

Assuming now that

is fixed, let us examine the probability that

-faithful. Let

be the product of all non-zero coefficients of

. Then

faithful if and only if

does not divide

. Now the bit-size of the product

is bounded by

, by Lemma 4.1. Hence

is divisible by at most

prime numbers, by Theorem 2.4 and our assumption that

. With probability at least

there are at least

prime numbers amongst which

is chosen at random, by Theorem 2.5(b). Assuming this,

is not

-faithful with probability at most

Let

be the set of

such that

and

for all

. Altogether, the bounds (4.3), (4.4), (4.5), and (4.6) imply that the probability that a given

belongs to

is at least

. It follows that the expectation of

is at least

. For

this further implies that the probability that

cannot exceed

: otherwise the expectation of

would be

We finally observe that

-faithful whenever

for all

such that

. Now for every

with

, there is at most one

with

: if

for

, then

, whence

. By (4.1), there are at most

exponents

with

. We conclude that

, whenever

4.3.Computing probabilistic codes for the exponents

Lemma 4.3 allows us to compute the coefficients of

with good complexity. In the univariate case, it turns out that the exponents of

-faithful terms of

can be recovered as quotients of matching terms of

and

by taking

sufficiently large.

In the multivariate case, this idea still works, for a suitable Kronecker-style choice of

. However, we only reach a suboptimal complexity when the exponents of

are themselves sparse and/or of unequal magnitudes. The remedy is to generalize the “quotient trick” from the univariate case, by combining it with the algorithms from section 3: the quotients will now yield the exponents in encoded form.

Let us now specify the remaining parameters from section 3. First of all, we take

, where

We also take

and

. Since

is odd, the inequalities (3.2) hold. We compute

and

for

as in section 3.3.

and succeeds with probability at least

. The probability of success for all

is at least

, because

By definition of

and

, we have

. By (4.1) we also have

. Hence the cost for computing

and

simplifies to

Consider a family of numbers

, where

, and

. We say that the family

is a faithful exponent encoding for

if we have

whenever

is a

-faithful exponent of

with

. We require nothing for the remaining numbers

, which should be considered as garbage.

Proof. We compute all

and

using Lemma 4.6. Let

and

be the coefficients of

and

. Then we take

is divisible by

and

if not. For a fixed

all these divisions take

bit-operations, using a similar reasoning as in the proof of Lemma 4.6. Since

, the result follows.

4.4.Approximate sparse interpolation

Proof. We first compute the required random parameters as in Lemma 4.5. This takes time

and succeeds with probability at least

. We next compute

using Lemma 4.2, which can be done in time

We next apply Corollary 4.7 and compute a faithful exponent encoding for

, in time

. By Lemma 4.4, this computation fails with probability at most

. Moreover, in case of success, we have

, still with the notation of Lemma 4.4.

From (4.1) and

, we get

. This allows us to apply Theorem 3.10 to the faithful exponent encoding for

. Let

be the exponents of

. Consider

such that there exists a

-faithful term

with

and

. Theorem 3.10 produces a guess for

for every such

. With probability at least

Let

be the smallest integer such that

. We finally compute

using Lemma 4.3, which can be done in time

With a probability at least

, all the above steps succeed. In that case, we clearly have

. Let

, where

(resp.

) is the sum of the terms of bit-size

(resp.

), so that

. Then the definition of

implies that

and

In order to conclude, it remains to prove the claimed inequality (4.9). Using the definition of

and the inequalities

, we have

Let us analyze the right-hand side of (4.12). Without further mention, we will frequently use that

. First of all, we have

Instead of just half of the coefficients, it is possible to compute any constant portion of the terms with the same complexity. More precisely, given

, we say that

is a

-approximation of

and

Proof. When

we have

, so (4.10) becomes

. In addition, we have

, whenever

. In other words, if

is sufficiently large, then the proof of Lemma 4.8 actually yields a

-approximation of

. We finally note that

4.5.Sparse interpolation

Once an approximation

of the sparse interpolation of

is known, we wish to obtain better approximations by applying the same algorithm with

in the role of

. This requires an efficient algorithm to evaluate

. In fact, we only need to compute projections of

as in Lemma 4.2 and to evaluate

at geometric sequences as in Lemma 4.3. Building a blackbox for

turns out to be suboptimal for these tasks. Instead, it is better to use the following lemma.

Proof. Given

, the projections

can be computed in time

. The projections

can be obtained using

further operations. Adding up these projections for

takes

operations. Altogether, this yields (i).

As for part (ii) we follow the proof of Lemma 4.3. We first compute

with

by binary powering. The reductions

of exponents are obtained in time

. Then, the powers

and

can be deduced in time

By the transposition principle, this matrix-vector product can be computed with the same cost as multi-point evaluation of a univariate polynomial. We conclude that

can be computed in time

Proof of Theorem 1.1. We set

and take

as in (4.2). Thanks to Theorem 2.5, we may compute a suitable triple

in time

, with probability of success at least

, where

Let

. Let

and

for

. Starting with

, we compute a sequence

of successive approximations of

as follows. Assuming that

is known for some

, we apply Lemma 4.8 with

and

in the roles of

and

. With high probability, this yields a

-approximation

and we set

. Taking

into account in Lemma 4.8 requires the following changes in the complexity analysis:

It follows that the complexity bounds of Lemmas 4.6, 4.8, and Corollary 4.7 remain unchanged when replacing

. The total running time for computing the sequence

is therefore bounded by

Remark 4.11. As interesting open problem is whether the complexity bound in Theorem 1.1 can be replaced by

, where

is a given bound on the bit-size of

. This was erroneously asserted in a prior version of this paper [27] and such a bound would be more efficient in cases where the sizes of the coefficients and exponents of

vary widely. Here, the first difficulty appears when a first approximation

has

terms with small coefficients and

has just a few terms but huge coefficients of bit-size

: the evaluation cost of

in part (ii) of Lemma 4.10 would need to remain in

. Recent advances in this problem can be found in [18] for the univariate case. Note that the bound

is still achieved in the approximate version (Theorem 4.9) of our main theorem.

5.Practical variants

For practical purposes, the choice of

is not realistic. The high constant

is due to the fact that we rely on [44] for unconditional proofs for the existence of prime numbers with the desirable properties from Theorem 2.5. Such unconditional number theoretic proofs are typically very hard and lead to pessimistic constants. Numerical evidence shows that a much smaller constant would do in practice: see [16, section 4] and [40, section 2.2.2]. For the univariate case the complexity of the sparse interpolation algorithm in [40] is made precise in term of the logarithmic factors.

The exposition so far has also been optimized for simplicity of presentation rather than practical efficiency: some of the other constant factors might be sharpened further and some of the logarithmic factors in the complexity bounds might be removed. In practical implementations, one may simply squeeze all thresholds until the error rate becomes unacceptably high. Here one may exploit the “auto-correcting” properties of the algorithm. For instance, although the

-approximation

from the proof of Theorem 1.1 may contain incorrect terms, most of these terms will be removed at the next iteration.

Our exposition so far has also been optimized for full generality. For applications, a high number of, say

, variables may be useful, but the bit-size of individual exponents rarely exceeds the machine precision. In fact, most multivariate polynomials

of practical interest are of low or moderately large total degree. A lot of the technical difficulties from the previous sections disappear in that case. In this section we will describe some practical variants of our sparse interpolation algorithm, with a main focus on this special case.

5.1.Conducting most computations in double precision

In practice, the evaluation of our modular blackbox polynomial is typically an order of magnitude faster if the modulus fits into a double precision number (e.g.

bits if we rely on floating point arithmetic and

bits when using integer arithmetic). In this subsection, we describe some techniques that can be used to minimize the use of multiple precision arithmetic.

Chinese remaindering.

Tangent numbers.

so we may again obtain

from

and

using one division. Of course, this requires our ability to evaluate

at elements of

, which is indeed the case if

is given by an SLP. Note that arithmetic in

is about three times as expensive as arithmetic in

Small prime divisors.

Chinese remaindering, bis.

Let

be a triple as in Theorem 2.5. In order to avoid evaluations of

modulo large integers of the form

, we wish to use Chinese remaindering. Let

triples as in Theorem 2.5 with

, the

pairwise distinct, and where

is shared by all triples. Since there are many primes

for which

contains at least

primes, such triples can still be found with high probability. In practice,

already contains enough prime numbers.

Evaluations of

modulo

are now replaced by separate evaluations modulo

after which we can reconstruct evaluations modulo

using Chinese remaindering. However, one crucial additional idea that we used in Lemma 4.3 is that

is automatically

-faithful as soon as it is

-faithful. In the multi-modular setting, if

-faithful, then it is still true that the exponents of

are contained in the exponents of

for

. This is sufficient for Lemma 4.3 to generalize.

5.2.The mystery exponent game

Let

be bounds for the degree of

, let

be a bound for the maximal number of variables that can occur in a term of

, and let

the prime numbers from section 3.1.

Assume that our polynomial

has only nonnegative “moderately large exponents” in the sense that

fits into a machine number. Then we may simplify the setup from section 3.1 by taking

where

and

are small constants and where we forget about

and

. For any

, let

. In this simplified setup, one may use the following algorithm to retrieve

from

Output: with or .

While there exist and with and do:

If for some or , then let .

Otherwise, update , , and .

The probability of success is non-trivial to analyze due to the interplay of the choices of

and the updates of

. For this reason, we consider the algorithm to be heuristic. Nevertheless, returned values are always correct:

Proof. By construction, we have the loop invariant that

, so the answer is clearly correct in case of success. The set of “problematic pairs”

was introduced in order to guarantee progress and avoid infinite loops. Indeed,

strictly decreases at every update of

. For definiteness, we also ensured that

remains in

throughout the algorithm. (One may actually release this restriction, since incorrect decisions may be corrected later during the execution.)

Remark 5.2. Algorithm 5.1 has one interesting advantage with respect to the method from section 3: the correct determination of some of the

leads to a simplification of

, which diminishes the number of collisions (i.e. entries

such that the sum contains at least two non-zero terms). This allows the algorithm to discover more coefficients

that might have been missed without the updates of

. As a result, the algorithm may succeed for lower values of

and

, e.g. for a more compressed encoding of

Remark 5.3. From a complexity perspective, some adaptations are needed to make Algorithm 5.1 run in quasi-linear time. Firstly, one carefully has to represent the sets

so as to make the updates

efficient. Secondly, from a theoretical point of view, it is better to collect pairs

with

in one big pass (thereby benefiting from Lemma 2.7) and perform the updates during a second pass. However, this second “optimization” is only useful in practice when

becomes very large (i.e.

), as explained in the previous subsection.

Algorithm 5.1 is very similar to the mystery ball game algorithm from [23]. This analogy suggests to choose the random sets

in a slightly different way: let

be random maps, where

and

, and take

Assuming for simplicity that

whenever

in our algorithm, the probability of success was analyzed in [23]. It turns out that there is a phase change around

(and

). For any

and

, numeric evidence suggests that the probability of success exceeds

for sufficiently large

This should be compared with our original choice of the

, for which the mere probability that a given index

belongs to none of the

is roughly

. We clearly will not be able to determine

whenever this happens. Moreover, the probability that this does not happen for any

is roughly

. In order to ensure that

, say, this requires us to take

5.3.The Ben-Or–Tiwari encoding

Ben-Or and Tiwari's seminal algorithm for sparse interpolation [6] contained another way to encode multivariate exponents, based on the prime factorization of integers: given

pairwise distinct prime numbers

, we encode an exponent

. Ben-Or and Tiwari simply chose

to be the

-th prime number. For our purposes, it is better to pick

pairwise distinct small random primes

with

. Using (a further extension of) Lemma 2.7, we may efficiently bulk retrieve

from

for large sets of exponents

The Ben-Or–Tiwari encoding can also be used in combination with the ideas from section 4.3. The idea is to compute both

and

with

. For monomials

, we have

If the total degree of

is bounded by a small number

, then the Ben-Or–Tiwari encoding is very compact. In that case, all exponents

indeed satisfy

, whence

. However, if

is a bit larger (say

and

), then

might not fit into a machine integer and there is no obvious way to break the encoding

up in smaller parts that would fit into machine integers.

By contrast, the encoding

from the previous subsection uses a vector of numbers that do fit into machine integers, under mild conditions. Another difference is that

only grows linearly with

instead of

, and only logarithmically with

. As soon as

is large and

not very small, this encoding should therefore be more efficient for practical purposes.

Acknowledgments

Bibliography

[1]: M. Agrawal, N. Kayal, and N. Saxena. PRIMES is in P. Ann. Math., 160(2):781–793, 2004.
[2]: A. Arnold, M. Giesbrecht, and D. S. Roche. Sparse interpolation over finite fields via low-order roots of unity. In ISSAC '14: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, pages 27–34. New York, NY, USA, 2014. ACM Press.
[3]: A. Arnold, M. Giesbrecht, and D. S. Roche. Faster sparse multivariate polynomial interpolation of straight-line programs. J. Symb. Comput., 75:4–24, 2016.
[4]: A. Arnold and D. S. Roche. Multivariate sparse interpolation using randomized Kronecker substitutions. In ISSAC '14: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, pages 35–42. New York, NY, USA, 2014. ACM Press.
[5]: M. Asadi, A. Brandt, R. Moir, and M. Moreno Maza. Sparse polynomial arithmetic with the BPAS library. In V. Gerdt, W. Koepf, W. Seiler, and E. Vorozhtsov, editors, Computer Algebra in Scientific Computing. CASC 2018, volume 11077 of Lect. Notes Comput. Sci., pages 32–50. Springer, Cham, 2018.
[6]: M. Ben-Or and P. Tiwari. A deterministic algorithm for sparse multivariate polynomial interpolation. In STOC '88: Proceedings of the twentieth annual ACM symposium on Theory of computing, pages 301–309. ACM Press, 1988.
[7]: L. I. Bluestein. A linear filtering approach to the computation of discrete Fourier transform. IEEE Transactions on Audio and Electroacoustics, 18(4):451–455, 1970.
[8]: A. Bostan, G. Lecerf, and É. Schost. Tellegen's principle into practice. In J. R. Sendra, editor, Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation, ISSAC '03, pages 37–44. New York, NY, USA, 2003. ACM Press.
[9]: R. P. Brent and P. Zimmermann. Modern Computer Arithmetic. Cambridge University Press, 2010.
[10]: P. Bürgisser, M. Clausen, and M. A. Shokrollahi. Algebraic complexity theory. Springer-Verlag, Berlin, 1997.
[11]: J. Canny, E. Kaltofen, and Y. Lakshman. Solving systems of non-linear polynomial equations faster. In Proceedings of the ACM-SIGSAM 1989 International Symposium on Symbolic and Algebraic Computation, pages 121–128. New York, NY, USA, 1989. ACM Press.
[12]: J. Doliskani, P. Giorgi, R. Lebreton, and É. Schost. Simultaneous conversions with the residue number system using linear algebra. ACM Trans. Math. Softw., 44(3), 2018.
[13]: T. S. Freeman, G. M. Imirzian, E. Kaltofen, and Y. Lakshman. DAGWOOD: a system for manipulating polynomials given by straight-line programs. ACM Trans. Math. Software, 14:218–240, 1988.
[14]: S. Garg and É. Schost. Interpolation of polynomials given by straight-line programs. Theor. Comput. Sci., 410(27-29):2659–2662, 2009.
[15]: J. von zur Gathen and J. Gerhard. Modern computer algebra. Cambridge University Press, New York, 3rd edition, 2013.
[16]: P. Giorgi, B. Grenet, A. Perret Du Cray, and D. S. Roche. Random primes in arithmetic progressions. Technical Report https://arxiv.org/abs/2202.05955, Arxiv, 2022.
[17]: P. Giorgi, B. Grenet, A. Perret du Cray, and D. S. Roche. Sparse polynomial interpolation and division in soft-linear time. In Proceedings of the 2022 International Symposium on Symbolic and Algebraic Computation, ISSAC '22, pages 459–468. New York, NY, USA, 2022. ACM Press.
[18]: P. Giorgi, B. Grenet, A. Perret du Cray, and D. S. Roche. Fast interpolation and multiplication of unbalanced polynomials. In J. Hauenstein, W. Lee, and S. Chen, editors, Proceedings of the 2024 International Symposium on Symbolic and Algebraic Computation, ISSAC '24, pages 437–446. New York, NY, USA, 2024. ACM Press.
[19]: B. Grenet, J. van der Hoeven, and G. Lecerf. Randomized root finding over finite fields using tangent Graeffe transforms. In ISSAC '15: Proceedings of the 2015 ACM on International Symposium on Symbolic and Algebraic Computation, pages 197–204. New York, NY, USA, 2015. ACM Press.
[20]: D. Y. Grigoriev, M. Karpinski, and M. F. Singer. Fast parallel algorithms for sparse multivariate polynomial interpolation over finite fields. SIAM J. Comput., 19(6):1059–1063, 1990.
[21]: A. W. Groves and D. S. Roche. Sparse polynomials in FLINT. ACM Commun. Comput. Algebra., 50(3):105–108, 2016.
[22]: D. Harvey and J. van der Hoeven. Integer multiplication in time . Ann. Math., 193(2):563–617, 2021.
[23]: J. van der Hoeven. Probably faster multiplication of sparse polynomials. Technical Report, HAL, 2020. https://hal.archives-ouvertes.fr/hal-02473830.
[24]: J. van der Hoeven. The Jolly Writer. Your Guide to GNU TeXmacs. Scypress, 2020.
[25]: J. van der Hoeven and G. Lecerf. On the bit-complexity of sparse polynomial multiplication. J. Symbolic Comput., 50:227–254, 2013.
[26]: J. van der Hoeven and G. Lecerf. Sparse polynomial interpolation in practice. ACM Commun. Comput. Algebra, 48(3/4):187–191, 2015.
[27]: J. van der Hoeven and G. Lecerf. Fast interpolation of sparse multivariate polynomials. Technical Report, HAL, 2023. https://hal.science/hal-04366836v1.
[28]: J. van der Hoeven and G. Lecerf. Sparse polynomial interpolation: faster strategies over finite fields. Appl. Algebra Eng. Commun. Comput., 2024. https://doi.org/10.1007/s00200-024-00655-5.
[29]: J. van der Hoeven, G. Lecerf, and G. Quintin. Modular SIMD arithmetic in Mathemagix. ACM Trans. Math. Softw., 43(1):5–1, 2016.
[30]: J. Hu and M. B. Monagan. A fast parallel sparse polynomial GCD algorithm. In S. A. Abramov, E. V. Zima, and X.-S. Gao, editors, ISSAC '16: Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation, pages 271–278. New York, NY, USA, 2016. ACM Press.
[31]: M. A. Huang and A. J. Rao. Interpolation of sparse multivariate polynomials over large finite fields with applications. In SODA '96: Proceedings of the seventh annual ACM-SIAM symposium on Discrete algorithms, pages 508–517. Philadelphia, PA, USA, 1996. Society for Industrial and Applied Mathematics.
[32]: Q.-L. Huang. Sparse polynomial interpolation over fields with large or zero characteristic. In ISSAC '19: Proceedings of the 2019 on International Symposium on Symbolic and Algebraic Computation, pages 219–226. New York, NY, USA, 2019. ACM Press.
[33]: M. Javadi and M. Monagan. Parallel sparse polynomial interpolation over finite fields. In M. Moreno Maza and J.-L. Roch, editors, PASCO '10: Proceedings of the 4th International Workshop on Parallel and Symbolic Computation, pages 160–168. New York, NY, USA, 2010. ACM Press.
[34]: E. Kaltofen, Y. N. Lakshman, and J.-M. Wiley. Modular rational sparse multivariate polynomial interpolation. In ISSAC '90: Proceedings of the International Symposium on Symbolic and Algebraic Computation, pages 135–139. New York, NY, USA, 1990. ACM Press.
[35]: E. Kaltofen and L. Yagati. Improved sparse multivariate polynomial interpolation algorithms. In P. M. Gianni, editor, ISSAC '88: Proceedings of the International Symposium on Symbolic and Algebraic Computation, volume 358 of Lect. Notes Comput. Sci., pages 467–474. Springer-Verlag, Berlin, Heidelberg, 1988.
[36]: M. Monagan and B. Tuncer. Using sparse interpolation to solve multivariate diophantine equations. ACM Comm. Computer Algebra, 49(3):94–97, 2015.
[37]: M. Monagan and B. Tuncer. Sparse multivariate Hensel lifting: a high-performance design and implementation. In J. Davenport, M. Kauers, G. Labahn, and J. Urban, editors, Mathematical Software – ICMS 2018, volume 10931 of Lect. Notes Comput. Sci., pages 359–368. Springer, Cham, 2018.
[38]: H. Murao and T. Fujise. Modular algorithm for sparse multivariate polynomial interpolation and its parallel implementation. J. Symb. Comput., 21:377–396, 1996.
[39]: J.-L. Nicolas and G. Robin. Majorations explicites pour le nombre de diviseurs de . Canad. Math. Bull., 26:485–492, 1983.
[40]: A. Perret du Cray. Algorithmes pour les polynômes creux : interpolation, arithmétique, test d'identité. PhD thesis, Université de Montpellier (France), 2023.
[41]: R. Prony. Essai expérimental et analytique sur les lois de la dilatabilité des fluides élastiques et sur celles de la force expansive de la vapeur de l'eau et de la vapeur de l'alkool, à différentes températures. J. de l'École Polytechnique Floréal et Plairial, an III, 1(cahier 22):24–76, 1795.
[42]: D. S. Roche. What can (and can't) we do with sparse polynomials? In C. Arreche, editor, Proceedings of the 2018 ACM International Symposium on Symbolic and Algebraic Computation, ISSAC '18, pages 25–30. New York, NY, USA, 2018. ACM Press.
[43]: J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois J. Math., 6(1):64–94, 1962.
[44]: A. Sedunova. A partial Bombieri–Vinogradov theorem with explicit constants. Publications mathématiques de Besançon. Algèbre et théorie des nombres, pages 101–110, 2018.
[45]: K. Werther. The complexity of sparse polynomial interpolation over finite fields. Appl. Algebra Engrg. Comm. Comput., 5(2):91–103, 1994.


			(since )
			(by (3.2))
			(since )

			(by (3.4))
			(by (3.8))


			(by induction hypothesis (ii))

			(since )
			(by (3.9))
			(by (3.2))

			(since )
			(by (3.2))

			(by (3.4))
			(by (3.4))
			(since )

			(by (3.10))

			(by (3.6))





			(since )
			(4.3)

1.Introduction

Notation.

2.Preliminaries

2.1.Sparse polynomials

2.2.Modular blackbox polynomials

2.3.Number theoretic reminders

2.4.Amortized determination of prime divisors in a fixed set

3.Probabilistic codes for exponents

3.1.The exponent encoding

3.2.Guessing individual exponents of prescribed size

3.3.Guessing exponents of prescribed size

4.Sparse interpolation

4.1.Cyclic modular projections

4.2.Probability of faithfulness

4.3.Computing probabilistic codes for the exponents

4.4.Approximate sparse interpolation

4.5.Sparse interpolation

5.Practical variants

5.1.Conducting most computations in double precision

Chinese remaindering.

Tangent numbers.

Small prime divisors.

Chinese remaindering, bis.

5.2.The mystery exponent game

5.3.The Ben-Or–Tiwari encoding

Acknowledgments

Bibliography