In this paper, we present an efficient algorithm for the certification of numeric singular value decompositions (SVDs) in the regular case, i.e., in the case when all the singular values are pairwise distinct. Our algorithm is based on a Newton-like iteration that can also be used for doubling the precision of an approximate numerical solution. |
Let $\mathbb{F}$ be the set of floating point numbers for a fixed precision $p$ and a fixed exponent range. We denote $\mathbb{F}[\mathrm{i}] = \{x + y \mathrm{i} : x, y \in \mathbb{F}\}$. Consider an $m \times n$ matrix $A \in \mathbb{F}[\mathrm{i}]^{m \times n}$ with complex floating point entries, where $m \geq n$. The problem of computing the numeric singular value decomposition of $A$ is to compute unitary transformation matrices $U \in \mathbb{F}[\mathrm{i}]^{m \times m}$ and $V \in \mathbb{F}[\mathrm{i}]^{n \times n}$, and a diagonal matrix $\Sigma \in \mathbb{F}^{m \times n}$, such that
$$A \approx U \Sigma V^*. \tag{1}$$
If $m > n$, then $\Sigma$ is understood to be “diagonal” if it is of the form
$$\Sigma = \begin{pmatrix} \Sigma' \\ 0 \end{pmatrix}, \qquad \Sigma' \in \mathbb{F}^{n \times n} \text{ diagonal}.$$
The diagonal entries $\sigma_1, \ldots, \sigma_n$ of $\Sigma$ are the approximate singular values of the matrix $A$, and throughout this paper we will assume them to be pairwise distinct and ordered:
$$\sigma_1 > \sigma_2 > \cdots > \sigma_n.$$
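The following minimal sketch (in Python with NumPy, which is not the implementation used in this paper) illustrates the problem statement: it computes a numeric SVD of a random complex matrix and measures the residual of the approximate equality (1) in the max-norm used below.

```python
# A minimal sketch of the numeric SVD problem (1), assuming NumPy.
import numpy as np

m, n = 6, 4
rng = np.random.default_rng(0)
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

U, s, Vh = np.linalg.svd(A)          # U: m x m, Vh = V*: n x n (full SVD)
Sigma = np.zeros((m, n))
np.fill_diagonal(Sigma, s)           # "diagonal" m x n matrix of singular values

residual = np.linalg.norm(A - U @ Sigma @ Vh, np.inf)   # max row sum norm
print(residual)                      # small but nonzero: (1) is only approximate
```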
There are several well-known algorithms for the computation of numeric singular value decompositions [8, 4].
Now (1) is only an approximate equality. It is sometimes important to have a rigorous bound for the distance between an approximate solution and some exact solution. More precisely, we may ask for a diagonal matrix $\Sigma^{\mathrm{r}} \in \mathbb{F}^{m \times n}$ and matrices $U^{\mathrm{r}} \in \mathbb{F}^{m \times m}$, $V^{\mathrm{r}} \in \mathbb{F}^{n \times n}$, such that there exist unitary matrices $\hat{U} \in \mathbb{C}^{m \times m}$, $\hat{V} \in \mathbb{C}^{n \times n}$ and a diagonal matrix $\hat{\Sigma} \in \mathbb{R}^{m \times n}$ for which
$$A = \hat{U} \hat{\Sigma} \hat{V}^*$$
and
$$|\hat{U}_{ij} - U_{ij}| \leq U^{\mathrm{r}}_{ij}, \qquad |\hat{\Sigma}_{ij} - \Sigma_{ij}| \leq \Sigma^{\mathrm{r}}_{ij}, \qquad |\hat{V}_{ij} - V_{ij}| \leq V^{\mathrm{r}}_{ij}$$
for all $i, j$. This task will be called the certification problem for the given numeric singular value decomposition (1). The matrices $U^{\mathrm{r}}$, $\Sigma^{\mathrm{r}}$ and $V^{\mathrm{r}}$ can be thought of as reliable error bounds for the matrices $U$, $\Sigma$ and $V$ of the numerical solution.
It will be convenient to rely on ball arithmetic [13, 19], which is a systematic technique for this kind of bound computations. When computing with complex numbers, ball arithmetic is more accurate than the more classical interval arithmetic [22, 1, 23, 18, 21, 24], especially in multiple precision contexts. We will write $\mathcal{B}(\mathbb{F}[\mathrm{i}], \mathbb{F})$ for the set of balls $\mathcal{B}(c, r)$ with centers $c \in \mathbb{F}[\mathrm{i}]$ and radii $r \in \mathbb{F}$. In a similar way, we may consider matricial balls $\mathcal{B}(M, R) \in \mathcal{B}(\mathbb{F}[\mathrm{i}]^{m \times n}, \mathbb{F}^{m \times n})$: given a center matrix $M \in \mathbb{F}[\mathrm{i}]^{m \times n}$ and a radius matrix $R \in \mathbb{F}^{m \times n}$, we have
$$\mathcal{B}(M, R) = \{ \hat{M} \in \mathbb{C}^{m \times n} : |\hat{M}_{ij} - M_{ij}| \leq R_{ij} \text{ for all } i, j \}.$$
Alternatively, we may regard $\mathcal{B}(M, R)$ as the set of matrices in $\mathcal{B}(\mathbb{F}[\mathrm{i}], \mathbb{F})^{m \times n}$ with ball coefficients:
$$\mathcal{B}(M, R)_{ij} = \mathcal{B}(M_{ij}, R_{ij}).$$
Standard arithmetic operations on balls are carried out in a reliable way. For instance, if $M$ and $N$ are matricial balls for which the product $M N$ makes sense, then the computation of this product using ball arithmetic has the property that $\hat{M} \hat{N} \in M N$ for any $\hat{M} \in M$ and $\hat{N} \in N$.
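As an illustration, here is a hedged sketch of a midpoint-radius product for matricial balls, assuming NumPy. A genuine ball arithmetic would additionally round the center product outward to absorb floating point errors, which this sketch ignores.

```python
# For balls M = B(Mc, Mr) and N = B(Nc, Nr), every product Mh Nh with
# Mh in M and Nh in N lies in B(Mc Nc, |Mc| Nr + Mr |Nc| + Mr Nr).
import numpy as np

def ball_matmul(Mc, Mr, Nc, Nr):
    Cc = Mc @ Nc
    Cr = np.abs(Mc) @ Nr + Mr @ np.abs(Nc) + Mr @ Nr
    return Cc, Cr

rng = np.random.default_rng(1)
Mc, Nc = rng.standard_normal((2, 3)), rng.standard_normal((3, 2))
Mr, Nr = np.full((2, 3), 1e-10), np.full((3, 2), 1e-10)
Cc, Cr = ball_matmul(Mc, Mr, Nc, Nr)
# Representatives stay inside the product ball (up to rounding noise):
Mh, Nh = Mc + 0.5 * Mr, Nc - 0.5 * Nr
assert np.all(np.abs(Mh @ Nh - Cc) <= Cr + 1e-15)
```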
In the language of ball arithmetic, it is natural to allow for small errors in the input and replace the numeric input $A \in \mathbb{F}[\mathrm{i}]^{m \times n}$ by a ball input $\mathcal{B}(A, A^{\mathrm{r}}) \in \mathcal{B}(\mathbb{F}[\mathrm{i}]^{m \times n}, \mathbb{F}^{m \times n})$. Then we may still compute a numeric singular value decomposition of the center matrix $A$:
$$A \approx U \Sigma V^*.$$
The generalized certification problem now consists of the computation of matrices $U^{\mathrm{r}}$, $V^{\mathrm{r}}$ and a diagonal matrix $\Sigma^{\mathrm{r}}$ such that, for every $\hat{A} \in \mathcal{B}(A, A^{\mathrm{r}})$, there exist unitary matrices $\hat{U} \in \mathcal{B}(U, U^{\mathrm{r}})$, $\hat{V} \in \mathcal{B}(V, V^{\mathrm{r}})$ and a diagonal matrix $\hat{\Sigma} \in \mathcal{B}(\Sigma, \Sigma^{\mathrm{r}})$ with
$$\hat{A} = \hat{U} \hat{\Sigma} \hat{V}^*.$$
In this paper, we propose an efficient solution for this problem in the case when all singular values are simple. Our algorithm relies on an efficient Newton iteration that is also useful for doubling the precision of a given numeric singular value decomposition. The iteration converges quadratically and only requires matrix sums and products of size at most $m \times m$. In [13, 15], a similar approach was used for the certification of eigenvalues and eigenvectors.
We are not aware of similarly efficient and robust algorithms in the literature. Jacobi-like methods derived from Kogbetliantz's SVD algorithm admit quadratic convergence in the presence of clusters in the Hermitian case [3], but only linear convergence is achieved in general [5]. Gauss–Newton type methods have also been proposed for the approximation of the regular real SVD in [17, 16].
From the theoretical bit complexity point of view, our algorithm essentially reduces the certification problem to a constant number of numeric matrix multiplications. When using a precision of $p$ bits for numerical computations, it has recently been shown [10] that two $n \times n$ matrices can be multiplied in time
$$\mathsf{MM}(n, p) = O(n^2 \, \mathsf{I}(p) + n^{\omega} p).$$
Here $\mathsf{I}(p)$ is the cost of $p$-bit integer multiplication [11, 9] and $\omega < 2.3729$ is the exponent of matrix multiplication [7]. If $p$ is large enough with respect to the logarithm of the condition number of the problem, then $\mathsf{MM}(n, p)$ yields an asymptotic bound for the bit complexity of our certification problem.
We have implemented unoptimized versions of the new algorithms in
Throughout this paper, we will use the max-norm for vectors and the corresponding matrix norm. More precisely, given positive integers $m, n$, a vector $v \in \mathbb{C}^n$, and an $m \times n$ matrix $M \in \mathbb{C}^{m \times n}$, we set
$$\|v\| = \max(|v_1|, \ldots, |v_n|), \qquad \|M\| = \max_{\|v\| \leq 1} \|M v\|.$$
For a second matrix $N \in \mathbb{C}^{n \times p}$, we clearly have
$$\|M N\| \leq \|M\| \, \|N\|.$$
We also define
$$\|M\|_{\max} = \max_{i, j} |M_{ij}|,$$
and note that
$$\|M\|_{\max} \leq \|M\|. \tag{4}$$
Explicit machine computation of the matrix norm is easy using the formula
$$\|M\| = \max_{1 \leq i \leq m} \big( |M_{i,1}| + \cdots + |M_{i,n}| \big).$$
Given a second matrix $N \in \mathbb{C}^{m \times n}$, it follows that the coefficientwise product
$$(M \circ N)_{ij} = M_{ij} N_{ij}$$
satisfies
$$\|M \circ N\| \leq \|M\|_{\max} \, \|N\| \leq \|M\| \, \|N\|.$$
In particular, when changing certain entries of a matrix to zero, its matrix norm can only decrease. We will write $\mathbf{1}$ for the ball matrix whose entries are all unit balls $\mathcal{B}(0, 1)$. This matrix has the property that $M \in \|M\| \, \mathbf{1}$ for all matrices $M$.
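For illustration, the matrix max-norm and the above inequalities are easy to test numerically; the following short sketch assumes NumPy.

```python
# The matrix norm induced by the max-norm is the maximal row sum of
# absolute values, so it is cheap to evaluate.
import numpy as np

def matnorm(M):
    return np.max(np.sum(np.abs(M), axis=1))

rng = np.random.default_rng(2)
M, N = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
assert matnorm(M @ N) <= matnorm(M) * matnorm(N) + 1e-12
assert matnorm(M * N) <= matnorm(M) * matnorm(N) + 1e-12  # coefficientwise product
```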
In the sequel we consider two integers $m \geq n \geq 1$ and we introduce the sets of matrices
$$\mathcal{U}_k = \{ M \in \mathbb{C}^{k \times k} : M^* M = I_k \} \qquad (k \geq 1)$$
and
$$\mathcal{D}_{m,n} = \{ M \in \mathbb{R}^{m \times n} : M_{ij} = 0 \text{ whenever } i \neq j \}.$$
We also write $\pi$ for the natural projection that replaces all non-diagonal entries of a matrix by zeros. For any integer $k$ we finally define the map $\Phi_k : \mathbb{C}^{k \times k} \to \mathbb{C}^{k \times k}$ by
$$\Phi_k(M) = M + \tfrac{1}{2} M (I_k - M^* M),$$
where $I_k$ is the identity matrix of size $k$.
Given $A \in \mathbb{C}^{m \times n}$, a triple $(U, \Sigma, V) \in \mathbb{C}^{m \times m} \times \mathcal{D}_{m,n} \times \mathbb{C}^{n \times n}$ forms an SVD for $A$ if and only if it satisfies the following system of equations:
$$U^* U = I_m, \qquad V^* V = I_n, \qquad U \Sigma V^* = A. \tag{7}$$
This is a system of polynomial equations in the entries of $U$, $\Sigma$ and $V$ as unknowns. Our efficient numerical method for solving this system will rely on the following principles:
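Before stating the principles, the defect of an approximate triple with respect to the system (7) is easy to evaluate; the following sketch assumes NumPy, and the function name svd_defect is ours.

```python
# Defect of an approximate triple (U, Sigma, V) with respect to (7),
# using the max-norm throughout.
import numpy as np

def matnorm(M):
    return np.max(np.sum(np.abs(M), axis=1))

def svd_defect(A, U, Sigma, Vh):
    m, n = A.shape
    return max(matnorm(U.conj().T @ U - np.eye(m)),
               matnorm(Vh @ Vh.conj().T - np.eye(n)),
               matnorm(A - U @ Sigma @ Vh))

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
U, s, Vh = np.linalg.svd(A)
Sigma = np.zeros((5, 3)); np.fill_diagonal(Sigma, s)
print(svd_defect(A, U, Sigma, Vh))   # tiny, but nonzero
```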
For a well-chosen ansatz $U$ close to the unitary group $\mathcal{U}_m$, we prove that
$$\Phi_m(U)$$
is even closer to the unitary group than $U$: see section 4. Similarly, for an ansatz $V$ close to $\mathcal{U}_n$, we take
$$\Phi_n(V).$$
From $U$, $\Sigma$ and $V$, we prove that it is possible to explicitly compute a diagonal matrix $\Sigma'$ and two skew Hermitian matrices $P$ and $Q$ such that
$$U^* A V = \Sigma + \Sigma' + P \Sigma - \Sigma Q,$$
after which $(U (I_m + P), \ \Sigma + \Sigma', \ V (I_n + Q))$ is a first-order approximation of an exact SVD of $A$: see section 5.
Let $\hat{A} \in \mathcal{B}(A, A^{\mathrm{r}})$, and let $U$, $\Sigma$ and $V$ be as above. If $(U, \Sigma, V)$ is sufficiently close to an exact SVD of $\hat{A}$, then we will prove that the triple obtained by combining the above two steps is a better approximation of the matrix $\hat{A}$ than $(U, \Sigma, V)$.
More precisely, given $U_0 \in \mathbb{C}^{m \times m}$, $\Sigma_0 \in \mathcal{D}_{m,n}$ and $V_0 \in \mathbb{C}^{n \times n}$, we define the following sequence of matrices:
$$U_{k+1} = \Phi_m(U_k (I_m + P_k)), \qquad \Sigma_{k+1} = \Sigma_k + \Sigma'_k, \qquad V_{k+1} = \Phi_n(V_k (I_n + Q_k)), \tag{12}$$
where $\Sigma'_k$ is a diagonal matrix and $P_k$, $Q_k$ are two skew Hermitian matrices such that
$$U_k^* \hat{A} V_k - \Sigma_k = \Sigma'_k + P_k \Sigma_k - \Sigma_k Q_k. \tag{13}$$
In order to measure the quality of the ansatz $(U_k, \Sigma_k, V_k)$, we define
$$\epsilon_k = \max \big( \|U_k^* \hat{A} V_k - \Sigma_k\|, \ \|I_m - U_k^* U_k\|, \ \|I_n - V_k^* V_k\| \big).$$
The main result of the paper is the following theorem that gives explicit conditions for the quadratic convergence of the sequence $(U_k, \Sigma_k, V_k)$, together with explicit error bounds.

Theorem 1. Let $(U_k, \Sigma_k, V_k)$ be as in (12) and (13), where $\sigma_1, \ldots, \sigma_n$ stand for the diagonal entries of $\Sigma_0$. If the initial error $\epsilon_0$ is sufficiently small with respect to the smallest singular value $\sigma_n$ and the minimal separation $\min_{i \neq j} |\sigma_i - \sigma_j|$, then the sequence defined by (12) converges quadratically to an exact SVD $(\hat{U}, \hat{\Sigma}, \hat{V})$ of $\hat{A}$, with explicit bounds for $\|\hat{U} - U_0\|$, $\|\hat{\Sigma} - \Sigma_0\|$ and $\|\hat{V} - V_0\|$ in terms of $\epsilon_0$.
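To make the iteration concrete, here is an illustrative NumPy sketch of (12) and (13). The helper names phi and solve_pq are ours: phi is the map $\Phi$ defined above, and solve_pq implements the explicit first-order solver derived in section 5. Starting from a slightly perturbed SVD, the residual decreases quadratically until it stagnates at machine precision.

```python
# One Newton step: solve the linearized equation (13), then re-unitarize.
import numpy as np

def phi(M):
    # Inverse-free Newton step towards the unitary group.
    k = M.shape[0]
    return M + 0.5 * M @ (np.eye(k) - M.conj().T @ M)

def solve_pq(s, E):
    # Solve E = Sigma' + P Sigma - Sigma Q with P, Q skew-Hermitian,
    # following the case distinctions of section 5.
    m, n = E.shape
    P = np.zeros((m, m), dtype=complex)
    Q = np.zeros((n, n), dtype=complex)
    ds = np.zeros(n)
    for i in range(m):
        for j in range(n):
            if i == j:
                ds[i] = E[i, i].real
                P[i, i] = 1j * E[i, i].imag / (2 * s[i])
                Q[i, i] = -P[i, i]
            elif i < n:
                d = s[j] ** 2 - s[i] ** 2        # nonzero for distinct sigmas
                P[i, j] = (s[j] * E[i, j] + s[i] * np.conj(E[j, i])) / d
                Q[i, j] = (s[i] * E[i, j] + s[j] * np.conj(E[j, i])) / d
            else:                                # rows of Sigma below n vanish
                P[i, j] = E[i, j] / s[j]
                P[j, i] = -np.conj(P[i, j])
    return ds, P, Q

rng = np.random.default_rng(3)
m, n = 5, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
U, s, Vh = np.linalg.svd(A)
V = Vh.conj().T
U = U + 1e-4 * rng.standard_normal((m, m))   # low-precision ansatz
V = V + 1e-4 * rng.standard_normal((n, n))
s = s + 1e-4 * rng.standard_normal(n)
for k in range(4):
    Sigma = np.zeros((m, n)); np.fill_diagonal(Sigma, s)
    E = U.conj().T @ A @ V - Sigma
    print(k, np.max(np.abs(E)))              # roughly squares at each step
    ds, P, Q = solve_pq(s, E)
    U = phi(U @ (np.eye(m) + P))
    V = phi(V @ (np.eye(n) + Q))
    s = s + ds
```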
The proof of this theorem will be postponed to section 6. Assuming that the theorem holds, it naturally gives rise to the following algorithm for certifying an approximate SVD:
Algorithm 1
Input: a ball matrix $\mathcal{B}(A, A^{\mathrm{r}})$ together with a numeric SVD $U \Sigma V^* \approx A$ of its center
Output: ball enclosures $\mathcal{B}(U, U^{\mathrm{r}})$, $\mathcal{B}(\Sigma, \Sigma^{\mathrm{r}})$ and $\mathcal{B}(V, V^{\mathrm{r}})$ of $U$, $\Sigma$ and $V$ such that for any $\hat{A} \in \mathcal{B}(A, A^{\mathrm{r}})$, there exist $\hat{U} \in \mathcal{B}(U, U^{\mathrm{r}})$, $\hat{\Sigma} \in \mathcal{B}(\Sigma, \Sigma^{\mathrm{r}})$ and $\hat{V} \in \mathcal{B}(V, V^{\mathrm{r}})$ such that $\hat{A} = \hat{U} \hat{\Sigma} \hat{V}^*$ is an exact singular value decomposition of $\hat{A}$
Compute $E := \mathcal{B}(U, 0)^* \, \mathcal{B}(A, A^{\mathrm{r}}) \, \mathcal{B}(V, 0) - \mathcal{B}(\Sigma, 0)$ using ball arithmetic
Let $\epsilon$ be an upper bound for $\|E\|$
Let $\eta$ and $\theta$ be upper bounds for $\|I_m - U^* U\|$ and $\|I_n - V^* V\|$ (with $\epsilon$, $\eta$ and $\theta$ in the roles of the error bounds as in Theorem 1)
If the smallness condition of Theorem 1 fails for $\epsilon$, $\eta$ and $\theta$, then set $r_U := r_\Sigma := r_V := \infty$
Else set $r_U$, $r_\Sigma$, $r_V$ to the bounds for $\|\hat{U} - U\|$, $\|\hat{\Sigma} - \Sigma\|$ and $\|\hat{V} - V\|$ provided by Theorem 1 (using upward rounding)
Set $U^{\mathrm{r}}$ to the $m \times m$ matrix whose entries are all equal to $r_U$
Set $\Sigma^{\mathrm{r}}$ to the $m \times n$ diagonal matrix whose diagonal entries are all equal to $r_\Sigma$
Set $V^{\mathrm{r}}$ to the $n \times n$ matrix whose entries are all equal to $r_V$
Return $\big( \mathcal{B}(U, U^{\mathrm{r}}), \mathcal{B}(\Sigma, \Sigma^{\mathrm{r}}), \mathcal{B}(V, V^{\mathrm{r}}) \big)$
Proof. If the condition of Theorem 1 cannot be verified, then we return matrix balls with infinite radii, for which the result is trivially correct. Otherwise, for any $\hat{A} \in \mathcal{B}(A, A^{\mathrm{r}})$, the actual values of $\|U^* \hat{A} V - \Sigma\|$, $\|I_m - U^* U\|$ and $\|I_n - V^* V\|$ are bounded by $\epsilon$, $\eta$ and $\theta$, so Theorem 1 applies for the ansatz $(U, \Sigma, V)$. As a consequence, we obtain an SVD $\hat{A} = \hat{U} \hat{\Sigma} \hat{V}^*$ with the property that $\|\hat{U} - U\| \leq r_U$, $\|\hat{\Sigma} - \Sigma\| \leq r_\Sigma$ and $\|\hat{V} - V\| \leq r_V$. We conclude that $\hat{U} \in \mathcal{B}(U, U^{\mathrm{r}})$, $\hat{\Sigma} \in \mathcal{B}(\Sigma, \Sigma^{\mathrm{r}})$ and $\hat{V} \in \mathcal{B}(V, V^{\mathrm{r}})$, as desired.
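The first two steps of the algorithm can be sketched as follows, assuming NumPy. The sketch is illustrative rather than rigorous: a real implementation would round outward everywhere, and the threshold test together with the returned radii are omitted since they depend on the explicit constants of Theorem 1.

```python
# Enclose E = U* B(A, Ar) V - Sigma with the midpoint-radius product rule
# from section 1 and return an upper bound for eps = ||E||.
import numpy as np

def matnorm(M):
    return np.max(np.sum(np.abs(M), axis=1))

def eps_bound(A, Ar, U, Sigma, Vh):
    Ec = U.conj().T @ A @ Vh.conj().T - Sigma            # center of E
    Er = np.abs(U.conj().T) @ Ar @ np.abs(Vh.conj().T)   # radius of E
    return matnorm(np.abs(Ec) + Er)                      # upper bound for ||E||

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 3))
U, s, Vh = np.linalg.svd(A)
Sigma = np.zeros((4, 3)); np.fill_diagonal(Sigma, s)
print(eps_bound(A, np.full(A.shape, 1e-12), U, Sigma, Vh))
```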
Remark
Since we are doing approximate computations, the unitary matrices in an SVD are not given exactly, so we may wish to estimate the distance between an approximately unitary matrix and the closest actual unitary matrix. This is related to the following problem: given an approximately unitary matrix $U \in \mathbb{C}^{n \times n}$, find a good approximation for its projection $\Pi(U)$ on the group $\mathcal{U}_n$ of unitary matrices. We recall a Newton iteration for this problem [20, 2, 12] and provide a detailed analysis of its (quadratic) convergence.
The tangent space to $\mathcal{U}_n$ at $U \in \mathcal{U}_n$ is
$$T_U \mathcal{U}_n = \{ U A : A \in \mathbb{C}^{n \times n}, \ A^* = -A \}. \tag{14}$$
Consider the Riemannian metric inherited from the embedding space $\mathbb{C}^{n \times n}$:
$$\langle M, N \rangle = \operatorname{Re} \operatorname{tr}(M^* N).$$
Then the normal space at $U$ is
$$N_U \mathcal{U}_n = \{ U H : H \in \mathbb{C}^{n \times n}, \ H^* = H \}.$$
We wish to compute $\Pi(U)$ using an appropriate Newton iteration. From the characterization of the normal space, it turns out that it is more convenient to write the next iterate as $U (I_n + H)$, where $H$ is Hermitian. With $E = I_n - U^* U$ and $U' = U (I_n + H)$, we have
$$(U')^* U' = (I_n + H)(I_n - E)(I_n + H) = I_n - E + 2 H + H^2 - H E - E H - H E H.$$
Taking
$$H = \tfrac{1}{2} E = \tfrac{1}{2} (I_n - U^* U),$$
it follows that
$$U' = \Phi_n(U), \qquad (U')^* U' = I_n - \tfrac{3}{4} E^2 - \tfrac{1}{4} E^3. \tag{16}$$
We are thus led to the following Newton iteration that we will further study below:
$$U_{k+1} = \Phi_n(U_k) = U_k + \tfrac{1}{2} U_k (I_n - U_k^* U_k). \tag{17}$$
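The following NumPy sketch illustrates the iteration (17): the unitarity defect roughly squares at every step, and the limit agrees with the polar projection $\Pi(U)$ computed from an SVD of $U$.

```python
# Newton-Schulz type iteration (17) towards the unitary group.
import numpy as np

rng = np.random.default_rng(4)
n = 4
W, _, Zh = np.linalg.svd(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
U = W @ Zh + 1e-2 * rng.standard_normal((n, n))   # approximately unitary

Uk = U.copy()
for k in range(5):
    defect = np.linalg.norm(np.eye(n) - Uk.conj().T @ Uk, np.inf)
    print(k, defect)                              # quadratic decay
    Uk = Uk + 0.5 * Uk @ (np.eye(n) - Uk.conj().T @ Uk)

W, _, Zh = np.linalg.svd(U)
print(np.linalg.norm(Uk - W @ Zh, np.inf))        # agrees with the polar factor
```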
Remark
Consequently, $(U^*)^{-1} = U (U^* U)^{-1} = U (I_n - E)^{-1} \approx U (I_n + E)$. In this context the classical Newton operator $N(U) = \frac{1}{2} (U + (U^*)^{-1})$ for the polar decomposition thus becomes, after replacing the inverse by its first-order approximation,
$$N(U) \approx U + \tfrac{1}{2} U (I_n - U^* U) = \Phi_n(U).$$
Proof. The conclusion follows from (16), since $\|E^2\| \leq \|E\|^2$ and $\|E^3\| \leq \|E\|^3$.
Proof. Modulo taking instead of , it suffices to consider the case when . Now
is an increasing function in and , since its power series expansion in and admits only positive coefficients. Consequently, .
We recall that any invertible matrix $M \in \mathbb{C}^{n \times n}$ admits a unique polar decomposition
$$M = \Pi(M) \, S,$$
where $\Pi(M) \in \mathcal{U}_n$ and $S$ is a positive-definite Hermitian matrix. We call $\Pi(M)$ the polar projection of $M$ on $\mathcal{U}_n$. The matrix $S$ can uniquely be written as the exponential of another Hermitian matrix. It is also well known that $\Pi(M)$ is indeed the closest element in $\mathcal{U}_n$ to $M$ for the Riemannian metric [6, Theorem 1].
Proof. The Newton sequence (17) defined from gives
with . An obvious induction using Proposition 5 yields and . Therefore this sequence converges to a limit that is given by
Lemma 6 implies
More generally, we have
Since is unitary, we have . Neumann's lemma also implies that is invertible with
By induction on $k$, it can also be checked that each $E_k = I_n - U_k^* U_k$ is a polynomial in $E_0$ with real coefficients. This means that the $E_k$ all commute, whence the factors $I_n + \frac{1}{2} E_k$ and their product $S$ are actually Hermitian matrices. Since $\|S - I_n\| < 1$, the logarithm of $S$ is well defined. We conclude that $S$ is the exponential of a Hermitian matrix, whence it is positive-definite.
Let $\Sigma \in \mathcal{D}_{m,n}$ be a matrix with diagonal entries $\sigma_1 > \cdots > \sigma_n > 0$. Consider a perturbation
$$B = \Sigma + E,$$
where $E \in \mathbb{C}^{m \times n}$ is small. We wish to compute an approximate SVD
$$B \approx (I_m + P) (\Sigma + \Sigma') (I_n + Q)^*,$$
where $\Sigma' \in \mathcal{D}_{m,n}$, and where $I_m + P$ and $I_n + Q$ are approximately unitary. Discarding higher order terms, this leads to the linear equation
$$E = \Sigma' + P \Sigma - \Sigma Q \tag{21}$$
with $P \in \mathbb{C}^{m \times m}$ and $Q \in \mathbb{C}^{n \times n}$. In view of (14), this means that $P$ and $Q$ are skew Hermitian. The following proposition shows how to solve the linear equation explicitly under these constraints.
Proposition 8. Define $\Sigma' \in \mathcal{D}_{m,n}$, $P \in \mathbb{C}^{m \times m}$ and $Q \in \mathbb{C}^{n \times n}$ as follows.
For $i = j \leq n$, we take
$$\sigma'_i = \operatorname{Re} E_{ii}, \qquad P_{ii} = \frac{\mathrm{i} \operatorname{Im} E_{ii}}{2 \sigma_i}, \qquad Q_{ii} = -P_{ii}. \tag{22}$$
For $i \neq j$ with $i, j \leq n$, we take
$$P_{ij} = \frac{\sigma_j E_{ij} + \sigma_i \overline{E_{ji}}}{\sigma_j^2 - \sigma_i^2}, \qquad Q_{ij} = \frac{\sigma_i E_{ij} + \sigma_j \overline{E_{ji}}}{\sigma_j^2 - \sigma_i^2}. \tag{23–26}$$
For $i > n$ and $j \leq n$, we take
$$P_{ij} = \frac{E_{ij}}{\sigma_j}, \qquad P_{ji} = -\overline{P_{ij}}. \tag{27}$$
For $i > n$ and $j > n$, we take
$$P_{ij} = 0. \tag{28}$$
Then $\Sigma'$ is diagonal, $P$ and $Q$ are skew Hermitian, and we have
$$E = \Sigma' + P \Sigma - \Sigma Q. \tag{29}$$
Proof. Since $P$ and $Q$ are skew Hermitian, we have $P_{ji} = -\overline{P_{ij}}$ and $Q_{ji} = -\overline{Q_{ij}}$ for all $i, j$. In view of (21), we thus get the entrywise equations
$$E_{ii} = \sigma'_i + (P_{ii} - Q_{ii}) \, \sigma_i \qquad (i \leq n), \tag{30}$$
$$E_{ij} = P_{ij} \, \sigma_j - \sigma_i \, Q_{ij} \qquad (i \neq j, \ i, j \leq n), \tag{31}$$
$$E_{ij} = P_{ij} \, \sigma_j \qquad (i > n, \ j \leq n). \tag{32}$$
By skew symmetry, for the equation
$$E_{ji} = P_{ji} \, \sigma_i - \sigma_j \, Q_{ji}$$
to hold, it is sufficient to have
$$\overline{E_{ji}} = -P_{ij} \, \sigma_i + \sigma_j \, Q_{ij}.$$
The formulas (22) clearly imply (30). The $P_{ij}$ from (27) clearly satisfy (32) as well. For $i \neq j$ with $i, j \leq n$, the formulas (31) can be rewritten as
$$\begin{pmatrix} \sigma_j & -\sigma_i \\ -\sigma_i & \sigma_j \end{pmatrix} \begin{pmatrix} P_{ij} \\ Q_{ij} \end{pmatrix} = \begin{pmatrix} E_{ij} \\ \overline{E_{ji}} \end{pmatrix}.$$
Since $\sigma_i \neq \sigma_j$, the formulas (23–26) indeed provide us with a solution. The entries $P_{ij}$ with $i, j > n$ do not affect the product $P \Sigma$, so they can be chosen as in (28). In view of the skew symmetry constraints $P_{ji} = -\overline{P_{ij}}$ and $Q_{ji} = -\overline{Q_{ij}}$, we notice that the matrices $P$ and $Q$ are completely defined.
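The explicit formulas (22–28) are easy to check numerically. The following sketch, assuming NumPy, mirrors the helper used in the iteration sketch of section 3 and verifies both (29) and the skew symmetry of $P$ and $Q$.

```python
# Direct check of the explicit solution of (21) for a random perturbation.
import numpy as np

m, n = 5, 3
s = np.array([3.0, 2.0, 1.0])                 # sigma_1 > sigma_2 > sigma_3 > 0
rng = np.random.default_rng(5)
E = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

P = np.zeros((m, m), dtype=complex)
Q = np.zeros((n, n), dtype=complex)
Sp = np.zeros((m, n))
for i in range(m):
    for j in range(n):
        if i == j:
            Sp[i, i] = E[i, i].real
            P[i, i] = 1j * E[i, i].imag / (2 * s[i])
            Q[i, i] = -P[i, i]
        elif i < n:                           # 2 x 2 system with determinant
            d = s[j] ** 2 - s[i] ** 2         # sigma_j^2 - sigma_i^2 != 0
            P[i, j] = (s[j] * E[i, j] + s[i] * np.conj(E[j, i])) / d
            Q[i, j] = (s[i] * E[i, j] + s[j] * np.conj(E[j, i])) / d
        else:                                 # rows of Sigma below n vanish
            P[i, j] = E[i, j] / s[j]
            P[j, i] = -np.conj(P[i, j])

Sigma = np.zeros((m, n)); np.fill_diagonal(Sigma, s)
assert np.allclose(Sp + P @ Sigma - Sigma @ Q, E)           # equation (29)
assert np.allclose(P, -P.conj().T) and np.allclose(Q, -Q.conj().T)
```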
Proposition 9. Given $E \in \mathbb{C}^{m \times n}$ with $\|E\| \leq \epsilon$, we have
$$\|\Sigma'\| \leq \epsilon.$$
Setting
$$\delta = \min \Big( \min_{i \neq j} |\sigma_i - \sigma_j|, \ \sigma_n \Big)$$
and $\kappa = \epsilon / \delta$, we also have
$$\|P\| \leq m \kappa, \qquad \|Q\| \leq n \kappa.$$
Proof. From the formula (22) we clearly have $|\sigma'_i| \leq |E_{ii}|$ for all $i \leq n$; using (4), this yields $\|\Sigma'\| \leq \epsilon$. For $i \neq j$ with $i, j \leq n$, the formulas (23–26) imply
$$|P_{ij}| \leq \frac{\sigma_j |E_{ij}| + \sigma_i |E_{ji}|}{|\sigma_j^2 - \sigma_i^2|} \leq \frac{(\sigma_i + \sigma_j) \, \epsilon}{|\sigma_j^2 - \sigma_i^2|},$$
whence
$$|P_{ij}| \leq \frac{\epsilon}{|\sigma_j - \sigma_i|} \leq \kappa.$$
Similarly,
$$|Q_{ij}| \leq \frac{\epsilon}{|\sigma_j - \sigma_i|} \leq \kappa.$$
It follows that all entries of $P$ and $Q$ with $i, j \leq n$ are bounded by $\kappa$, the diagonal entries being even bounded by $\kappa / 2$ thanks to (22). From (27), and using (4), we also deduce that $|P_{ij}| \leq \epsilon / \sigma_j \leq \kappa$, for $i > n$ and $j \leq n$. Combined with the fact that the remaining entries of $P$ vanish by (28), we get $\|P\|_{\max} \leq \kappa$. Since each row of $P$ contains $m$ entries and each row of $Q$ contains $n$ entries, we now observe that $\|P\| \leq m \|P\|_{\max}$ and $\|Q\| \leq n \|Q\|_{\max}$. Plugging in the above norm bounds, we deduce that $\|P\| \leq m \kappa$. In a similar way, one proves that $\|Q\| \leq n \kappa$.
Let us denote $E_k = U_k^* \hat{A} V_k - \Sigma_k$ and, for each $k$,
$$\delta_k = \min \Big( \min_{i \neq j} |\sigma_{k,i} - \sigma_{k,j}|, \ \sigma_{k,n} \Big),$$
where $\sigma_{k,1}, \ldots, \sigma_{k,n}$ denote the diagonal entries of $\Sigma_k$. Let us show by induction on $k$ that the four bounds (36–39) hold for all $k$.
These inequalities clearly hold for $k = 0$. Assuming that the induction hypothesis holds for a given $k$, let us prove it for $k + 1$.
By the definition of , we have and . Setting
we have
It follows that
Let . Applying Proposition 9 to , we get
Since , we have
Using (18), we obtain
Similarly,
Using and (13), we next have
It follows that
This completes the proof that . We also have
We deduce that and . Let us finally prove that . From , we get
so that
Similarly, using
we get
This completes the proof of the four induction hypotheses (36–39) at order $k + 1$.
From the continuity of the maps involved in the iteration, we deduce that the sequence $(U_k, \Sigma_k, V_k)$ converges. Let $(\hat{U}, \hat{\Sigma}, \hat{V})$ be its limit. By continuity, we have $\hat{A} = \hat{U} \hat{\Sigma} \hat{V}^*$. The unitary matrix $\hat{U}$ is of the form
From above we know that
whence
Lemma 6 now implies
This shows that is invertible, with
Hence
From the definition of we also have
Using Lemma 6, this yields
Similar bounds can be computed for , , and . Altogether, this leads to
We finally have
since . This completes the proof.