Joris van der Hoevena,
Grégoire Lecerfb
Laboratoire d'informatique de l'École
polytechnique
LIX, UMR 7161 CNRS
Campus de l'École polytechnique
1, rue Honoré d'Estienne d'Orves
Bâtiment Alan Turing, CS35003
91120 Palaiseau, France
a. Email:
vdhoeven@lix.polytechnique.fr
b. Email:
lecerf@lix.polytechnique.fr
Evaluating Straight-Line Programs over
Balls
Abstract—Interval arithmetic achieves numerical reliability for a wide range of applications, at the price of a performance penalty. For applications to homotopy continuation, one key ingredient is the efficient and reliable evaluation of complex polynomials represented by straight-line programs. This is best achieved using ball arithmetic, a variant of interval arithmetic. In this article, we describe strategies for reducing the performance penalty of basic operations on balls. We also show how to bound the effect of rounding errors at the global level of evaluating a straight-line program. This allows us to introduce a new and faster “transient” variant of ball arithmetic.
Index terms—ball arithmetic, polynomial evaluation, software implementation
Interval arithmetic is a classical tool for making numerical computations reliable, by systematically computing interval enclosures for the desired results instead of numerical approximations [1, 10, 11, 13, 14, 17, 24]. Interval arithmetic has been applied with success in many areas, such as the resolution of systems of equations, homotopy continuations, reliable integration of dynamical systems, etc. There exist several variants of interval arithmetic such as ball arithmetic, interval slope arithmetic, Taylor models, etc. Depending on the context, these variants may be more efficient than standard interval arithmetic and/or provide tighter enclosures.
In this paper, we will mainly focus on ball arithmetic (also known as midpoint-radius interval arithmetic), which is particularly useful for reliable computations with complex numbers and multiple precision numbers [5]. One of our main motivations is the implementation of reliable numerical homotopy methods for polynomial system solving [2, 3, 6, 27-29]. One basic prerequisite for this project concerns the efficient and reliable evaluation of multivariate polynomials represented by so called straight-line programs (SLPs).
Two classical disadvantages of interval arithmetic and its variants are the additional computational overhead and possible overestimation of errors. There is a trade-off between these two evils: it is always possible to reduce the overestimation at the expense of a more costly variant of interval arithmetic (such as high order Taylor models). For a fixed variant, the computational overhead is usually finite, but it remains an important issue for high performance applications to reduce the involved constant factors as much as possible. In this paper, we will focus on this “overhead problem” in the case of basic ball arithmetic (and without knowledge about the derivatives of the functions being evaluated).
About ten years ago, we started the implementation of a basic C++
template library for ball arithmetic in the
In this article we investigate various strategies to reduce the overhead involved when evaluating straight-line programs over balls. Our point of view is pragmatic and directed towards the development of more efficient implementations. We will not turn around the fact that the development of efficient ball arithmetic admits a quite technical side: the optimal answer greatly depends on available hardware features. We will consider on the following situations, encountered for modern processors:
Without specific IEEE 754 compliant hardware (which is frequently the case for GPUs), we may only assume faithful rounding, specifying a bound on relative errors for each operation, and that errors can be thrown on numerical exceptions.
For recent
Recent AVX-512 processors propose floating point instructions which directly incorporate a rounding mode, thereby eliminating all need to switch rounding modes via the status register and all resulting delays caused by broken pipelines.
As a general rule, modern processors have also become highly parallel. For this reason, it is important to focus on algorithms that can easily be vectorized. The first contribution of this paper concerns various strategies for ball arithmetic as a function of the available hardware. From the conceptual point of view, this will be done in Section 2, whereas additional implementation details will be given in Section 4.
Our second main contribution is the introduction of transient ball arithmetic in Section 2.5 and its application to the evaluation of SLPs in Section 3. The idea is to not bother about rounding errors occurring when computing centers and radii of individual balls, which simplifies the implementation of the basic ring operations. Provided that the radii of the input balls are not too small and that neither overflow nor underflow occur, we will show that the cumulated effect of the ignored rounding errors can be bounded for the evaluation of a complete SLP. More precisely, we will show that the relative errors due to rounding are essentially dominated by the depth of the computation. Although it is most convenient to apply our result to SLPs, much of it can be generalized to arbitrary programs, by regarding the trace of the execution as an SLP. For such generalizations, the main requirement is to have an a priori bound for the (parallel) depth of the computation.
Most of our strategies have been implemented inside the
In the context of real midpoint-radius interval arithmetic, and for specific calculations, such as matrix products, several authors proposed dedicated solutions that mostly perform operations on centers and then rely on fast bounds on radii [18-20, 23, 25]. In this case, the real gain comes from the ability to exploit HPC (high performance computing) solutions to linear algebra, and similar tricks sometimes apply to classical interval methods [16, 21].
The use of SIMD instructions for intervals started in [30], and initially led to a rather modest speed-up, according
to the author. Changing the rounding mode for almost each arithmetic
operation seriously slows down computations, and might involve a serious
stall of a hundred of cycles by breaking FPU pipelines of
some hardware such as
Throughout this paper, we denote by
the set of machine floating point numbers. We let
be the machine precision (which corresponds to the number of fractional
bits of the mantissa plus one), and let
and
be the minimal and maximal exponents (included). For
IEEE 754 double precision numbers, this means that
,
and
. We enlarge
with symbols
,
, and
,
with their standard meaning.
For our basic arithmetic, we allow for various rounding modes written
. The first three rounding
modes correspond to IEEE arithmetic with correct rounding
(downwards, to the nearest and upwards). The
rounding mode
corresponds to faithful rounding
without any well specified direction. For
,
we write
for the approximation of
in
with the specified rounding
mode
. The quantity
stands for the corresponding rounding error, that may be
. Given a single operation
, it will be convenient to
write
for
.
For compound expressions
, we
will also write
for the full evaluation of
using the rounding mode
.
For instance,
.
Modern processors usually support fused-multiply-add (fma) and
fused-multiply-subtract (fms) instructions, both for scalar and
SIMD vector operands: and
represent the roundings of
and
according to the mode
. For generalities about these instructions, we
refer the reader to the book [15] for instance.
We will denote by any upper bound function for
that is easy to compute. In absence of
underflows/overflows, we claim that we may take
, with
for
and
. Indeed, given
and
, let
be the exponent of
,
so that
and
.
Then
for all rounding modes, and
.
If we want to allow for underflows during computations, we may safely
take , where
is the smallest positive subnormal number in
. Notice that if
,
then we may still take
since no underflow occurs
in that special case. Underflows and overflows will be further discussed
in Section 3.3.
In order to describe our algorithms in a flexible context, denotes a Banach algebra over
, endowed with a norm written
. The most common examples are
and
with
for all
.
For actual machine computations, we will denote by
the counterpart of
when
is replaced by
. For
instance, if
, then we may
take
. In other words, if
is the
, where complex represents the template type available from
the standard
For any rounding mode , the
notations from the previous section naturally extend to
. For instance,
.
Similarly, if
, then
. For complex arithmetic we
consider the following implementation:
As for we may clearly take
in absence of underflows/overflows, and
in
general.
Remark , it is sometimes
interesting to replace computations of norms
by
computations of quick and rough upper bounds
. For instance, on architectures where square roots
are expensive, one may use
Given and
,
we write
for the closed ball with
center
and radius
.
The set of such balls is denoted by
.
One may lift the ring operations
in
to balls in
,
by setting:
These formulas are simplest so as to satisfy the so called inclusion
principle: given ,
and
, we
have
. This arithmetic for
computing with balls is called exact ball arithmetic. It
extends to other operations that might be defined on
, as long as the ball lifts of operations
satisfy the inclusion principle. For instance, we may implement
fused-multiply-add and fused-multiply-subtract using
We will denote by the set of balls with centers
in
and radii in
.
In order to implement machine ball arithmetic for
, we need to adjust formulas from the previous
section so as to take rounding errors into account. Indeed, the main
constraint is the inclusion principle, which should still hold for each
operation in such an implementation. The best way to achieve this
depends heavily on rounding modes used for operations on centers and
radii.
On processors that feature IEEE 754 style rounding, it is
most natural to perform operations on centers using any rounding mode
(and preferably rounding to the nearest), and
operations on radii using upward rounding. This leads to the following
formulas:
For instance, when using of the form
in the last case of fma/fms,
this means that three additional instructions are needed with respect to
the exact arithmetic from the previous subsection:
In practical implementations, unless ,
the radius computations involve many roundings. The correctness of the
above formulas is justified by the fact that we systematically use
upward rounding for all bound computations; the rounding error for the
single operation on the centers is captured by
.
Remark
Another approach is to use the same rounding mode for computations on
centers and radii. If we take as the sole
rounding mode, then we may directly apply the above formulas, but the
quality of computations with centers degrades. If we take
, then we have to further adjust the radii so
as to take into account the additional rounding errors that might occur
during radius computations. For instance, if
, in the cases of addition and subtraction, and in
absence of underflows/overflows, we may use
since we have . This
arithmetic, sometimes called rough ball arithmetic, fits one of
the main recommendations of [22]: “Get free from the
rounding mode by bounding, roughly but robustly, errors with formulas
independent of the rounding mode if needed.”
The adjustments which where needed above in order to counter the problem of rounding errors induce a non trivial computational overhead, despite the fact that these adjustments are usually very small. It is interesting to consider an alternative transient ball arithmetic for which we simply ignore all rounding errors. In the next Section 3, we will see that it is often possible to treat these rounding errors at a more global level for a complete computation instead of a single operation.
For any rounding mode , we
will denote by
the corresponding “rounding
mode” for transient ball arithmetic; the basic operations are
defined as follows:
Of course, these formulas do not satisfy the inclusion principle. On processors that allow for efficient switching of rounding modes, it is also possible to systematically use upward rounding for radius computations, with resulting simplifications in the error analysis below.
Let us fix a rounding mode .
We assume that we are given a suitable floating point number
in
such that
and
hold for all and
,
in absence of underflows and overflows. If
,
then we may take
. If
, and assuming that complex
arithmetic is implemented using (1–3),
then it can be checked that we may take
(see
Appendix A). The following lemma provides an error estimate
for the radius computation of a transient ball product.
Lemma and
such that the
computation of
involves no underflow or overflow, we have .
Proof. We have
![]() |
![]() |
![]() |
|
![]() |
![]() |
||
![]() |
![]() |
||
![]() |
![]() |
□ |
A straight-line program over a ring
is a sequence
of
instructions of the form
where are variables in a finite ordered set
,
constants in
, and
. Variables that appear for the
first time in the sequence in the right-hand side of an instruction are
called input variables. A distinguished subset of the set of
all variables occurring at the left-hand side of instructions is called
the set of output variables. Variables which are not used for
input or output are called temporary variables and determine
the amount of auxiliary memory needed to evaluate the program. The
length
of the sequence is called the
length of
.
Let be the input variables of
and
the output variables, listed in increasing
order. Then we associate an evaluation function
to
as follows: given
, we assign
to
for
, then
evaluate the instructions of
in sequence, and
finally read off the values of
,
which determine
.
To each instruction , one may
associate the remaining path lengths
as
follows. Let
, and assume
that
have been defined for some
. Then we take
,
where
are those indices
such that
is of the form
with
and
.
If no such indices
exist, then we set
. Similarly, for each input
variable
we define
,
where
are those indices such that
is of the form
with
and
. We also
define
, where
are the input variables of
and
all indices
such that
is of the form
.
Example , of length
. The input variables are
and
, and we
distinguish
as the sole output variable. This
SLP thus computes the function
.
The associated computation graph, together with remaining path lengths
are as pictured:
Consider the “semi-exact” ball arithmetic in which all
computations on centers are done using a given rounding mode and all computations on radii are exact. More precisely,
we take
for any and
.
We wish to investigate how far the transient arithmetic from Section 2.5 can deviate from this idealized arithmetic (which
satisfies the inclusion principle). We will write
for the
-th harmonic number
and
for all
.
Theorem be a SLP of length
as above and let
be an
arbitrary parameter such that
.
Consider two evaluations of
with two different
ball arithmetics. For the first evaluation, we use the above semi-exact
arithmetic with
. For the
second evaluation, we use transient ball arithmetic with the same
arithmetic for centers, and the additional property that any input or
constant ball
is replaced by a larger ball
with
where
If no underflow or overflow occurs during the second evaluation, then
for all in the output of the first evaluation
with corresponding entry
for the second
evaluation, we have
.
Proof. It will be convenient to systematically use the star superscript for the semi-exact radius evaluation and no superscript for the transient evaluation.
Let be the ball value of the variable
just after evaluation of
using
transient ball arithmetic. Let us show by induction over
that
So assume that the hypothesis holds for all strictly smaller values of
. If
is of the form
, then we are
done by assumption since
. If
is of the form
,
then we claim that
contains a ball
with
Indeed, this holds by assumption if is an input
variable. Otherwise, let
be largest with
. Then
by
the construction of
, whence
our claim follows by the induction hypothesis. In a similar way,
contains a ball
with
Having shown our claim, let us first consider the case when . Then we get
Using the inequalities ,
, and the fact that
, we obtain:
![]() |
![]() |
![]() |
|
![]() |
![]() |
||
![]() |
![]() |
||
![]() |
![]() |
||
![]() |
![]() |
as desired. In the same way, if ,
then
whence, using Lemma 3, and the fact that ,
![]() |
![]() |
![]() |
|
![]() |
![]() |
||
![]() |
![]() |
||
![]() |
![]() |
||
![]() |
![]() |
which completes our claim by induction.
For all , we introduce
so that . Using a second
induction over
, let us next
prove that
Assume that this inequality holds up until order . If
is of the form
, then we are done by the fact that
.
If is of the form
,
then let
be largest with
and
. Then
, and
In particular, . With
and
as above, it follows that
If is of the form
,
then thanks to Lemma 3, a similar computation yields
This completes the second induction and the proof of this theorem.
For fixed and
,
we observe that
. The value
of the parameter
may be optimized for given SLPs
and inputs. Without entering details,
should be
taken large when inputs are accurate, and small when inputs are rough,
as encountered for instance within subdivision algorithms.
In Theorem 5, we have assumed the absence of underflows and overflows. There are several strategies for managing such exceptions, whose efficiencies heavily depend on the specific kind of hardware being used.
One strategy to manage exceptions is to use the status register of an IEEE 754 FPU. The idea is to simply clear the underflow and overflow flags of the status register, then perform the evaluation, and finally check both flags at the end. Whenever an overflow occurred during the evaluation, then we may set the radii of all results to plus infinity, thereby preserving the inclusion principle. If an underflow occurred (which is quite unlikely), then we simply reevaluate the entire SLP using a more expensive, fully certified ball arithmetic.
When performing all computations using the IEEE 754 rounding-to-nearest
mode , we also notice that
consulting the status register can be avoided for managing overflows.
Indeed, denoting by
the largest positive finite
number in
, we have
for all
and
for all
. Consequently,
whenever a computation on centers overflows, the corresponding radius
will be set to infinity.
From now on, we assume the absence of overflows and we focus on
underflows. In addition to the constant ,
we suppose given an other positive constant
in
such that
for all and
.
For instance, if
, then we
may take
. If
, and assuming the arithmetic from Section 2.2, then it is safe to take
(see
Appendix A). In addition to the method based on the status
register, the following strategies can be used for managing underflows.
This arithmetic makes Theorem 5 hold without the assumption on the absence of underflows, and provided that
![]() |
(13) |
Indeed, inequalities (5), (8), and (9)
immediately hold, even without extra factors . It remains to prove that (6) also
holds. From
we have
. Therefore, if
,
then
holds, whence inequality (6).
Otherwise
, and the extra
assumption (13) directly implies
because
always holds by construction.
No corrections are necessary for additions and subtractions which never
provoke underflows. As to multiplication, Lemma 3 admits
Lemma 6 below as its analogue in the case when . Using this, it may again be shown that
Theorem 5 holds without the assumption on the absence of
underflows, and provided that
![]() |
(14) |
Indeed, inequalities (5), (8), and (9)
immediately hold. It remains to prove that (6) also holds.
The case is handled as for the previous
strategy. Otherwise, we have
,
and the extra assumption (14) directly implies
because
holds by construction.
Lemma and
,
letting
and
,
we have
.
Proof. We have
It follows that .
Remark , then the above method still applies under the
condition that all norm computations are replaced by reliable upper
bounds. In other words, assuming that
satisfies
for all
,
we may take
Our new algorithms have been implemented in . In this
section, we briefly present our implementation. We first describe the
implemented strategies for evaluations over the standard numeric types
double and complex<double>.
We next consider balls over these types, and finally say a few words
about vectorized variants.
Our implementation is divided into
JIT compilation, is a classical technique, traditionally
used in scientific computing: when an expression such as a SLP needs to
be evaluated at many points, it pays off to compile the expression and
then perform the evaluations using fast executable code. When allowed by
the operating system, it suffices to compile SLPs into executable memory
regions, without temporary files. Since SLPs are very basic programs,
there is no special need to appeal to general purpose compilers. In
To develop a confortable JIT library dedicated to SLPs, we turned to the
In order to estimate the concrete overhead of ball arithmetic, we first
focus on timings for double and complex<double>.
In the rest of this article, timings are measured on a platform equipped
with an GHz and 8 GB of
MHz DDR3, which includes
AVX2 and FMA technologies. The platform runs
the
In Table 1, column “double”
displays timings for evaluating a multivariate polynomial over double, with variables, made of 100
terms, built from random monomials of partial degrees at most 10. We
build the SLP using a dedicated algorithm implemented in multimix/dag_polynomial.hpp.
Support for this specific coefficient type may be found in multimix/slp_polynomial_double.hpp.
This example, with the corresponding functions to reproduce these
timings are available in multimix/bench/slp_polynomial_bench.cpp.
The evaluation of this SLP takes 1169 products and 100 sums.
The first row corresponds to using the naive interpreted evaluation
available by default from naive
evaluations.
The third row concerns JIT compilation from
The fourth row is for our JIT implementation in the
The second column of Table 1 shows similar timings for the
complex<double> type from the
|
|||||||||||||||
We are now interested in measuring the speed-up of our evaluation
strategies over balls. The first column of Table 2 shows
timings for balls over double (i.e. ). Early versions of the
Notice that we carefully tuned the assembly code generated by our SLP
compiler. For instance, if ymm0 contains and if ymm1 contains a center
, then
is obtained as
vxorpd ymm0 ymm1 ymm2, and
as
vandnpd ymm0 ymm1 ymm2. The latency and throughput of
both these instructions are a single cycle and no branchings are
required to compute
. In this
way, each transient addition/subtraction takes 2 cycles, and each
product 6 cycles. For rough arithmetic this increases to 5 and 9 cycles
respectively. The gain of the transient arithmetic is thus well
reflected in practice, since our example essentially performs products.
Comparing to Table 1, we observe that transient arithmetic
is just about 4 times slower than numeric arithmetic. This turns out to
be competitive with interval arithmetic, where each interval product
usually requires 8 machine multiplications and 6 min/max operations.
The second column of timings in Table 2 is similar to the first one, but with balls over complex<double>. The computation of norms is expensive in this case because the scalar square root instruction takes 14 CPU cycles. In order to reduce this cost, we rewrite SPLs so that norms of products are computed as products of norms. Taking care of using or simulating upwards rounding, this involves a slight loss in precision but does not invalidate the results. At the end, comparing to Table 1, we are glad to observe that our new transient arithmetic strategy is only about twice as expensive as standard numeric evaluations.
|
|||||||||||||||||||||
In order to profit from SIMD technologies, we also
implemented vectorization in our or
is easily observed for double, complex<double>,
and balls over double. For balls over complex<double>, a penalty occurs, because the
vectorial square root instruction is about twice slower than its scalar
counterpart. For instance, the evaluation of a univariate polynomial in
degree
with Hörner's method takes about
5000 CPU cycles over double (each fma instruction takes
the expected latency), 13000 cycles over complex<double>,
and 16000 cycles for transient balls of double, with
both scalar and vectorial instructions. As for transient balls over complex<double>, scalar
instructions amount to about 17000 cycles, while vectorial ones take
about 28000 cycles.
In this paper, we have shown how to implement highly efficient ball
arithmetic dedicated to polynomial evaluation. In the near future, we
expect significant speed-ups in our numerical system solvers implemented
within
It is interesting to notice in Table 2 that the code generated by our rather straightforward JIT compiler for SLPs is more efficient than the code produced by GCC. This suggests that it might be worthwhile to put more efforts into the development of specific JIT compilers for SLPs dedicated to high performance computing.
We also plan to adapt the present results to standard intervals, that
are useful for real solving. In that case, we replace input and constant
intervals by intervals of the form
, where
for suitable
. We next proceed as usual,
but without any assumption on the rounding mode.
We finally notice that interval arithmetic benefits from hardware
accelerations on many current processors, as soon as IEEE 754 style
rounding is integrated in an efficient manner. An interesting question
is whether ball arithmetic might benefit from similar hardware
accelerations. In particular, is it possible to integrate rounding
errors more efficiently into error bounds (the radii of balls)? This
might for instance be achieved using an
accumulate-rounding-error instruction for computing a
guaranteed upper bound for as a function of
.
G. Alefeld and J. Herzberger. Introduction to interval analysis. Academic Press, New York, 1983.
D. Bates, J. Hauenstein, A. Sommese, and C. Wampler. Bertini: software for numerical algebraic geometry. http://www.nd.edu/~sommese/bertini/, 2006.
C. Beltrán and A. Leykin. Certified numerical homotopy tracking. Experiment. Math., 21(1):69–83, 2012.
F. Goualard. Fast and correct SIMD algorithms for interval arithmetic. Technical Report, Université de Nantes, 2008. https://hal.archives-ouvertes.fr/hal-00288456.
J. van der Hoeven. Ball arithmetic. In A. Beckmann, Ch. Gaßner, and B. Löwe, editors, Logical approaches to Barriers in Computing and Complexity, number 6 in Preprint-Reihe Mathematik, pages 179–208. Ernst-Moritz-Arndt-Universität Greifswald, 2010. International Workshop.
J. van der Hoeven. Reliable homotopy continuation. Technical Report, CNRS & École polytechnique, 2011. http://hal.archives-ouvertes.fr/hal-00589948.
J. van der Hoeven and G. Lecerf. Interfacing Mathemagix with C++. In M. Monagan, G. Cooperman, and M. Giesbrecht, editors, Proc. ISSAC '13, pages 363–370. ACM, 2013.
J. van der Hoeven and G. Lecerf. Mathemagix User Guide. CNRS & École polytechnique, France, 2013. http://hal.archives-ouvertes.fr/hal-00785549.
J. van der Hoeven, G. Lecerf, B. Mourrain et al. Mathemagix. 2002. http://www.mathemagix.org.
L. Jaulin, M. Kieffer, O. Didrit, and E. Walter. Applied interval analysis. Springer, London, 2001.
U. W. Kulisch. Computer Arithmetic and Validity. Theory, Implementation, and Applications. Number 33 in Studies in Mathematics. De Gruyter, 2008.
B. Lambov. Interval arithmetic using SSE-2. In P. Hertling, C. M. Hoffmann, W. Luther, and N. Revol, editors, Reliable Implementation of Real Number Algorithms: Theory and Practice, volume 5045 of Lect. Notes Comput. Sci., pages 102–113. Springer Berlin Heidelberg, 2008.
R. E. Moore. Interval Analysis. Prentice Hall, Englewood Cliffs, N.J., 1966.
R. E. Moore, R. B. Kearfott, and M. J. Cloud. Introduction to Interval Analysis. SIAM Press, 2009.
J.-M. Muller, N. Brisebarre, F. de Dinechin, C.-P. Jeannerod, V. Lefèvre, G. Melquiond, N. Revol, D. Stehlé, and S. Torres. Handbook of Floating-Point Arithmetic. Birkhäuser Boston, 2010.
Hong Diep Nguyen. Efficient algorithms for verified scientific computing: numerical linear algebra using interval arithmetic. PhD thesis, École Normale Supérieure de Lyon, Université de Lyon, 2011.
A. Neumaier. Interval methods for systems of equations. Cambridge university press, Cambridge, 1990.
T. Ogita and S. Oishi. Fast inclusion of interval matrix multiplication. Reliab. Comput., 11(3):191–205, 2005.
K. Ozaki, T. Ogita, and S. Oishi. Tight and efficient enclosure of matrix multiplication by using optimized BLAS. Numer. Linear Algebra Appl., 18:237–248, 2011.
K. Ozaki, T. Ogita, S. M. Rump, and S. Oishi. Fast algorithms for floating-point interval matrix multiplication. J. Comput. Appl. Math., 236(7):1795–1814, 2012.
N. Revol and Ph. Théveny. Parallel implementation of interval matrix multiplication. Reliab. Comput., 19(1):91–106, 2013.
N. Revol and Ph. Théveny. Numerical reproducibility and parallel computations: issues for interval algorithms. IEEE Trans. Comput., 63(8):1915–1924, 2014.
S. M. Rump. Fast and parallel interval arithmetic. BIT, 39(3):534–554, 1999.
S. M. Rump. Verification methods: rigorous results using floating-point arithmetic. Acta Numer., 19:287–449, 2010.
S. M. Rump. Fast interval matrix multiplication. Numer. Algor., 61(1):1–34, 2012.
N. Stolte. Arbitrary 3D resolution discrete ray tracing of implicit surfaces. In E. Andres, G. Damiand, and P. Lienhardt, editors, Discrete Geometry for Computer Imagery, volume 3429 of Lect. Notes Comput. Sci., pages 414–426. Springer Berlin Heidelberg, 2005.
A. J. Sommese and C. W. Wampler. The Numerical Solution of Systems of Polynomials Arising in Engineering and Science. World Scientific, 2005.
J. Verschelde. Homotopy continuation methods for solving polynomial systems. PhD thesis, Katholieke Universiteit Leuven, 1996.
J. Verschelde. PHCpack: a general-purpose solver for polynomial systems by homotopy continuation. ACM Trans. Math. Software, 25(2):251–276, 1999. Algorithm 795.
J. W. Von Gudenberg. Interval arithmetic on multimedia architectures. Reliab. Comput., 8(4):307–312, 2002.
GCC, the GNU Compiler Collection. Software available at http://gcc.gnu.org, from 1987.
Let and
be
two complex numbers in
. We
will show that
![]() |
(15) |
![]() |
(16) |
![]() |
(17) |
In addition, in absence of underflows, the terms in
may be discarded.
For short we set , and we
shall freely use the assumption
.
The first bound (15) is immediate. As for the second one,
it will be convenient to use the norm
,
that satisfies
. We begin
with combining
and
into
We deduce that
whence
which simplifies into
It follows that
which simplifies into (16).
As to (17), we begin with and
, that give
It follows that
from which we extract
We thus obtain that
We distinguish the two following cases:
If , then
. Using
since the square root does not provoke underflows, we obtain
Otherwise , which means
that both products
and
involve underflows. We restart the analysis from
which implies . If
, then we are done, since
. Otherwise
, and
,
and we have