Introduction

We consider the matrix equation:

$$X = Q - A^{*}X^{-1}A + B^{*}X^{-1}B, \tag{1}$$

where $Q$ is an $n \times n$ Hermitian positive definite matrix, and $A$ and $B$ are arbitrary $n \times n$ matrices. Equation (1) is a special case of the stochastic rational Riccati equation arising in stochastic control theory, as described below. Some stochastic control problems lead to computing the positive definite solution of the following stochastic rational Riccati equation [1]:

$$C^{*}XC - X + S + \pi_1(X) - \bigl(L + C^{*}XP + \pi_{12}(X)\bigr)\bigl(R + P^{*}XP + \pi_2(X)\bigr)^{+}\bigl(L + C^{*}XP + \pi_{12}(X)\bigr)^{*} = 0, \tag{2}$$

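where $Z^{+}$ stands for the Moore-Penrose inverse of a matrix $Z$, and $C$, $P$, $S$, $R$, and $L$ are given matrices of sizes $n \times n$, $n \times m$, $n \times n$, $m \times m$, and $n \times m$, respectively, such that

$$T = \begin{pmatrix} S & L \\ L^{*} & R \end{pmatrix}$$

is a Hermitian matrix, and the operator

$$\pi(X) = \begin{pmatrix} \pi_1(X) & \pi_{12}(X) \\ \pi_{12}(X)^{*} & \pi_2(X) \end{pmatrix}$$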

is positive, i.e., $X \ge 0$ implies $\pi(X) \ge 0$. Consider the following case: $C$ is the identity matrix, $P$ is an $n \times n$ nonsingular matrix, $S$ is an $n \times n$ positive definite matrix, $L$ is the zero matrix, and $\pi_{12}(X) = \pi_2(X) = 0$, $\pi_1(X) = (R + P^{*}XP)^{-1}$, where $R + P^{*}XP$ is positive definite for all positive semidefinite matrices $X$. In this case, the stochastic rational Riccati equation (2) takes the form

$$S + (R + P^{*}XP)^{-1} - XP(R + P^{*}XP)^{-1}P^{*}X = 0. \tag{3}$$

Set

$$Y = R + P^{*}XP, \tag{4}$$

then

$$P^{-*}(Y - R) = XP. \tag{5}$$

By Equations (3) to (5), we have

$$S + Y^{-1} - P^{-*}(Y - R)Y^{-1}(Y - R)P^{-1} = 0,$$

which implies that

$$Y + R^{*}Y^{-1}R - P^{*}Y^{-1}P = 2R + P^{*}SP.$$
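The passage from the previous identity to this one can be checked directly. A short derivation, assuming $R = R^{*}$ (which holds because $T$ is Hermitian), multiplies the identity by $P^{*}$ on the left and by $P$ on the right:

```latex
\begin{align*}
0 &= P^{*}\Bigl[S + Y^{-1} - P^{-*}(Y-R)\,Y^{-1}(Y-R)\,P^{-1}\Bigr]P \\
  &= P^{*}SP + P^{*}Y^{-1}P - (Y-R)\,Y^{-1}(Y-R) \\
  &= P^{*}SP + P^{*}Y^{-1}P - Y + 2R - R\,Y^{-1}R .
\end{align*}
```

Rearranging gives $Y + RY^{-1}R - P^{*}Y^{-1}P = 2R + P^{*}SP$, and $RY^{-1}R = R^{*}Y^{-1}R$ because $R$ is Hermitian.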

Set

$$Q = 2R + P^{*}SP, \quad A = R, \quad B = P,$$

then Equation (3) can be equivalently written as Equation (1). Therefore, Equation (1) is a special case of the stochastic rational Riccati equation (2). Moreover, some special cases of Equation (1) are also of practical importance, such as the matrix equation $X + M^{*}X^{-1}M = Q$, which arises in control theory, ladder networks, dynamic programming, stochastic filtering, statistics, and so on [2–4]. The matrix equation $X - M^{*}X^{-1}M = Q$ arises in the analysis of stationary Gaussian reciprocal processes over a finite interval [5, 6].

Since 1993, the matrix equations $X + M^{*}X^{-1}M = Q$ and $X - M^{*}X^{-1}M = Q$ have been extensively studied, and the research results have mainly concentrated on the following:

sufficient conditions and necessary conditions for the existence of a (unique) positive definite solution [2, 6–8];

numerical methods for computing the (unique) positive definite solution [4–6, 9–13];

properties of the positive definite solution [2, 4]; and

perturbation bounds for the positive definite solution [3, 14].

In addition, other nonlinear matrix equations such as $AX^{2} + BX + C = 0$ [15], $X^{s} \pm A^{*}X^{-t}A = Q$ [16, 17], $X + \sum_{i=1}^{m} A_i^{*}X^{-1}A_i = I$ [18, 19], $X \pm A^{*}X^{-q}A = Q$ [3, 20–27], $X - \sum_{i=1}^{m} A_i^{*}X^{\delta_i}A_i = Q$ [28], and $X + A^{*}F(X)A = Q$ [29, 30] have been investigated by many authors. However, to the best of our knowledge, there are few results on the general nonlinear matrix equation (1).

In this paper, we first use the Bhaskar and Lakshmikantham fixed point theorem to study the positive definite solution of the nonlinear matrix equation (1). A new sufficient condition for the existence of a unique positive definite solution to Equation (1) is derived. An iterative method is constructed to compute the unique Hermitian positive definite solution, and an error estimate is also given. Finally, we use some numerical examples to illustrate that the iterative method is feasible for computing the unique positive definite solution of Equation (1).

Methods

Throughout this paper, we denote by $\mathcal{M}(N)$ and $\mathcal{H}(N)$ the sets of $N \times N$ complex and $N \times N$ Hermitian matrices, respectively. For $A, B \in \mathcal{H}(N)$, $A \ge 0$ ($A > 0$) means that $A$ is positive semidefinite (positive definite). Moreover, $A \ge B$ ($A > B$) means that $A - B \ge 0$ ($A - B > 0$), and $X \in [A, B]$ means $A \le X \le B$. $A^{*}$ and $r(A)$ denote the conjugate transpose and the spectral radius of $A$, respectively. We denote by $\|\cdot\|$ the spectral norm, i.e., $\|A\| = \sqrt{\lambda^{+}(A^{*}A)}$, where $\lambda^{+}(A^{*}A)$ is the largest eigenvalue of $A^{*}A$. The $N \times N$ identity matrix will be written as $I$. We denote by $\|\cdot\|_{\mathrm{tr}}$ the trace norm. Recall that this norm is given by

$$\|A\|_{\mathrm{tr}} = \sum_{j=1}^{N} \sigma_j(A),$$

where $\sigma_j(A)$, $j = 1, \ldots, N$, are the singular values of $A$.

The following lemmas will be useful later.

Lemma 2.1 (See[31])

Let $A \ge 0$ and $B \ge 0$ be $N \times N$ matrices; then $0 \le \operatorname{tr}(AB) \le \|A\| \operatorname{tr}(B)$.
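As a quick numerical illustration of the two norms and of Lemma 2.1, the following sketch evaluates both sides of the inequality on randomly generated positive semidefinite matrices. The helper `random_psd`, the seed, and the matrix size are assumptions made for this example only; NumPy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; the data are illustrative only

def random_psd(n):
    """Return a random n x n Hermitian positive semidefinite matrix M* M."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return M.conj().T @ M

A = random_psd(4)
B = random_psd(4)

# Spectral norm: square root of the largest eigenvalue of A* A.
spectral_norm = np.sqrt(np.linalg.eigvalsh(A.conj().T @ A).max())
# Trace norm: sum of the singular values.
trace_norm = np.linalg.svd(A, compute_uv=False).sum()

# tr(AB) is real for Hermitian PSD A and B; discard the rounding-level imaginary part.
tr_AB = np.trace(A @ B).real
bound = spectral_norm * np.trace(B).real

print(0.0 <= tr_AB <= bound)
```

For a positive semidefinite matrix the singular values coincide with the eigenvalues, so here the trace norm of `A` equals its trace.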

Lemma 2.2 (See[32])

If $0 < \theta \le 1$, and $P$ and $Q$ are positive definite matrices of the same order with $P, Q \ge bI > 0$, then for every unitarily invariant norm, $|||P^{\theta} - Q^{\theta}||| \le \theta b^{\theta - 1}\,|||P - Q|||$ and $|||P^{-\theta} - Q^{-\theta}||| \le \theta b^{-(\theta + 1)}\,|||P - Q|||$.

Lemma 2.3 (See[32])

Let $A \in \mathcal{H}(N)$ satisfy $-I < A < I$; then $\|A\| < 1$.

Let ( X ,≼) be a partially ordered set and F : X × X → X be a given mapping. We say that F has the mixed monotone property if for any x , y ∈ X ,

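$$x_1, x_2 \in X,\ x_1 \preceq x_2 \ \Rightarrow\ F(x_1, y) \preceq F(x_2, y), \qquad y_1, y_2 \in X,\ y_1 \preceq y_2 \ \Rightarrow\ F(x, y_1) \succeq F(x, y_2).$$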

We say that ( x , y ) is a coupled fixed point of F if x = F ( x , y ) and y = F ( y , x ).

The proof of our main result is based on the following two fixed point theorems.

Theorem 2.1 ([33])

Let ( X ,≼) be a partially ordered set endowed with a metric d such that ( X , d ) is complete. Let F : X × X → X be a continuous mapping having the mixed monotone property on X . Assume that there exists a δ ∈[0,1), such that

$$d(F(x, y), F(u, v)) \le \frac{\delta}{2}\bigl[d(x, u) + d(y, v)\bigr],$$

for all ( x , y ),( u , v ) ∈ X × X with x ≽ u and y ≼ v . We suppose that there exist x ₀ , y ₀ ∈ X , such that x ₀ ≼ F ( x ₀ , y ₀ ) and y ₀ ≽ F ( y ₀ , x ₀ ). Then,

$F$ has a coupled fixed point $(\bar{x}, \bar{y}) \in X \times X$; and
the sequences $\{x_n\}$ and $\{y_n\}$ defined by $x_{n+1} = F(x_n, y_n)$ and $y_{n+1} = F(y_n, x_n)$ converge respectively to $\bar{x}$ and $\bar{y}$.

In addition, suppose that every pair of elements has a lower bound and an upper bound. Then

$F$ has a unique coupled fixed point $(\bar{x}, \bar{y}) \in X \times X$;
$\bar{x} = \bar{y}$; and
we have the following estimate:
$$\max\{d(x_n, \bar{x}), d(y_n, \bar{x})\} \le \frac{\delta^{n}}{2(1 - \delta)}\bigl[d(F(x_0, y_0), x_0) + d(F(y_0, x_0), y_0)\bigr].$$
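To make the statement of Theorem 2.1 concrete, here is a minimal scalar sketch on the ordered set $[a, \infty)$ with the metric $d(x, y) = |x - y|$. The map `F` is the $1 \times 1$ analogue of the right-hand side of Equation (1); the numbers `q`, `s`, `t`, `a`, and `b` are hypothetical and chosen only so that the hypotheses hold.

```python
import math

# Hypothetical scalar data (1 x 1 "matrices"); the numbers are illustrative.
q, s, t = 1.0, 0.09, 0.1156   # s plays the role of |A|^2, t of |B|^2
a, b = 0.5, 5.0               # starting points x0 = a, y0 = b

def F(x, y):
    """Mixed monotone map: increasing in x and decreasing in y for x, y > 0."""
    return q - s / x + t / y

delta = 2.0 * max(s, t) / a**2          # contraction constant, here below 1
x, y = a, b
assert x <= F(x, y) and y >= F(y, x)    # starting conditions of Theorem 2.1
c0 = abs(F(x, y) - x) + abs(F(y, x) - y)

# Closed-form fixed point of x = q + (t - s) / x, used only as a reference.
x_bar = (q + math.sqrt(q * q + 4.0 * (t - s))) / 2.0

history = []
for n in range(60):
    bound = delta**n / (2.0 * (1.0 - delta)) * c0   # a priori estimate
    history.append((n, max(abs(x - x_bar), abs(y - x_bar)), bound))
    x, y = F(x, y), F(y, x)             # simultaneous update of both sequences

print(x, y, x_bar)
```

Each entry of `history` pairs the actual error at step `n` with the bound from the theorem, so the two can be compared directly.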

For other results concerning fixed point theorems on ordered sets, we refer to [34–37].

Theorem 2.2 (Schauder Fixed point theorem)

Let S be a nonempty, compact, convex subset of a normed vector space. Every continuous function f : S → S mapping S into itself has a fixed point.

Results and discussion

We assume that there exist real numbers $a > 0$ and $b > 0$ such that the following conditions hold:

(1) $a^{-1}A^{*}A + aI \le Q \le bI$;
(2) $bA^{*}A - aB^{*}B \le ab(Q - aI)$;
(3) $bB^{*}B - aA^{*}A \le ab(bI - Q)$;
(4) $A^{*}A < \frac{a^{2}}{2}I$ and $B^{*}B < \frac{a^{2}}{2}I$.
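The four assumptions are Loewner-order inequalities, so they can be tested numerically through eigenvalues. The sketch below is one possible check, not part of the original method; the helper names `is_psd` and `check_assumptions` and the tolerance are assumptions, and the data are those of Example 2.

```python
import numpy as np

def is_psd(M, tol=1e-12):
    """True if the Hermitian part of M has no eigenvalue below -tol."""
    H = (M + M.conj().T) / 2.0
    return bool(np.linalg.eigvalsh(H).min() >= -tol)

def check_assumptions(Q, A, B, a, b):
    """Return the truth value of each of assumptions 1 to 4 for given a, b."""
    I = np.eye(Q.shape[0])
    AA = A.conj().T @ A   # A* A
    BB = B.conj().T @ B   # B* B
    half = a**2 / 2.0
    return {
        1: is_psd(Q - AA / a - a * I) and is_psd(b * I - Q),
        2: is_psd(a * b * (Q - a * I) - (b * AA - a * BB)),
        3: is_psd(a * b * (b * I - Q) - (b * BB - a * AA)),
        # strict inequalities: compare the largest eigenvalue with a^2 / 2
        4: bool(np.linalg.eigvalsh(AA).max() < half
                and np.linalg.eigvalsh(BB).max() < half),
    }

# Data of Example 2 (Q = I) with a = 0.5 and b = 5.
Q = np.eye(3)
A = np.array([[0.30, 0.01, 0.01], [0.00, 0.28, -0.02], [0.02, 0.03, 0.34]])
B = np.array([[-0.34, 0.0, 0.0], [0.0, -0.34, 0.0], [0.01, 0.01, -0.32]])
result = check_assumptions(Q, A, B, a=0.5, b=5.0)
print(result)
```

The weak inequalities are tested with a small tolerance, while the strict ones in assumption 4 are tested without one; a different tolerance policy may be preferable for borderline data.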

We denote by Ω the set of matrices defined by

$$\Omega = \{X \in \mathcal{H}(N) : X \ge aI\}.$$

Our main result is discussed below:

Theorem 3.1

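Under assumptions (1) to (4), the following hold:

(I) Equation (1) has a unique solution $\bar{X} \in \Omega$;
(II) $\bar{X} \in [\,Q + b^{-1}B^{*}B - a^{-1}A^{*}A,\ Q + a^{-1}B^{*}B - b^{-1}A^{*}A\,]$;
(III) the sequences $\{X_n\}$ and $\{Y_n\}$ defined by

$$X_0 = aI,\quad X_{n+1} = Q - A^{*}X_n^{-1}A + B^{*}Y_n^{-1}B; \qquad Y_0 = bI,\quad Y_{n+1} = Q - A^{*}Y_n^{-1}A + B^{*}X_n^{-1}B$$

converge to $\bar{X}$, that is,

$$\lim_{n \to \infty}\|X_n - \bar{X}\|_{\mathrm{tr}} = \lim_{n \to \infty}\|Y_n - \bar{X}\|_{\mathrm{tr}} = 0,$$

and the error estimate is given by

$$\max\{\|X_n - \bar{X}\|_{\mathrm{tr}}, \|Y_n - \bar{X}\|_{\mathrm{tr}}\} \le \frac{\delta^{n}}{1 - \delta}\max\{\|X_1 - X_0\|_{\mathrm{tr}}, \|Y_1 - Y_0\|_{\mathrm{tr}}\},$$

where $0 < \delta < 1$.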
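The coupled iteration of item (III) is short to implement. The following is a minimal sketch, assuming NumPy; the function name `coupled_iteration`, the stopping rule based on $\|X_n - Y_n\|_{\mathrm{tr}}$, and the default tolerance are choices made for this illustration and are not taken from the paper. It is run on the data of Example 4, and the enclosure of item (II) is checked afterwards.

```python
import numpy as np

def coupled_iteration(Q, A, B, a, b, tol=1e-12, max_iter=500):
    """Run the coupled iteration of item (III) from X_0 = aI, Y_0 = bI.

    Stops when ||X_n - Y_n||_tr <= tol * ||X_n||_tr (a hypothetical stopping
    rule, not taken from the paper) or after max_iter steps.
    """
    n = Q.shape[0]
    X, Y = a * np.eye(n), b * np.eye(n)
    As, Bs = A.conj().T, B.conj().T
    steps = 0
    while steps < max_iter:
        # np.linalg.solve(X, A) evaluates X^{-1} A without forming X^{-1}.
        X_next = Q - As @ np.linalg.solve(X, A) + Bs @ np.linalg.solve(Y, B)
        Y_next = Q - As @ np.linalg.solve(Y, A) + Bs @ np.linalg.solve(X, B)
        X, Y = X_next, Y_next
        steps += 1
        if np.linalg.norm(X - Y, "nuc") <= tol * np.linalg.norm(X, "nuc"):
            break
    return X, Y, steps

# Data of Example 4 (A = 0) with a = 3.5 and b = 300.
Q = np.array([[100.0, 50.0, 34.0], [50.0, 100.0, 67.0], [34.0, 67.0, 100.0]])
A = np.zeros((3, 3))
B = np.array([[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.5, 0.5, 1.5]])
a, b = 3.5, 300.0

X, Y, steps = coupled_iteration(Q, A, B, a, b)

# Interval of item (II): [F(aI, bI), F(bI, aI)].
AA, BB = A.conj().T @ A, B.conj().T @ B
lower = Q + BB / b - AA / a
upper = Q + BB / a - AA / b
inside = bool(np.linalg.eigvalsh(X - lower).min() >= -1e-10
              and np.linalg.eigvalsh(upper - X).min() >= -1e-10)
print(steps, inside)
```

Because the two sequences are updated from the same pair $(X_n, Y_n)$, the new iterates are computed first and assigned together; updating `X` in place before computing `Y_next` would give a different scheme.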

Proof

For all X,Y∈ℋ(N) , let

$$F(X, Y) = Q - A^{*}X^{-1}A + B^{*}Y^{-1}B.$$

We claim that F ( Ω × Ω ) ⊂ Ω . Indeed, let X , Y ∈ Ω , that is, X ≥ aI and Y ≥ aI . This implies that

$$Q - A^{*}X^{-1}A + B^{*}Y^{-1}B \ge Q - A^{*}X^{-1}A \ge Q - a^{-1}A^{*}A.$$

On the other hand, from assumption (1), we have

$$Q - A^{*}X^{-1}A \ge aI.$$

Thus, we have

$$F(X, Y) = Q - A^{*}X^{-1}A + B^{*}Y^{-1}B \ge aI,$$

which implies that F ( X , Y ) ∈ Ω . Then, our claim holds.

Now, the mapping F : Ω × Ω → Ω is well defined. Let X , Y , U , V ∈ Ω , such that X ≥ U and Y ≤ V . We have

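$$\begin{aligned}
\|F(X, Y) - F(U, V)\|_{\mathrm{tr}} &= \|A^{*}(U^{-1} - X^{-1})A + B^{*}(Y^{-1} - V^{-1})B\|_{\mathrm{tr}} \\
&\le \|A^{*}(U^{-1} - X^{-1})A\|_{\mathrm{tr}} + \|B^{*}(Y^{-1} - V^{-1})B\|_{\mathrm{tr}} \\
&= \operatorname{tr}\bigl(A^{*}(U^{-1} - X^{-1})A\bigr) + \operatorname{tr}\bigl(B^{*}(Y^{-1} - V^{-1})B\bigr) \\
&= \operatorname{tr}\bigl(AA^{*}(U^{-1} - X^{-1})\bigr) + \operatorname{tr}\bigl(BB^{*}(Y^{-1} - V^{-1})\bigr).
\end{aligned}$$

Since $U^{-1} - X^{-1} \ge 0$ and $Y^{-1} - V^{-1} \ge 0$, using Lemma 2.1, we get

$$\|F(X, Y) - F(U, V)\|_{\mathrm{tr}} \le \|AA^{*}\| \operatorname{tr}(U^{-1} - X^{-1}) + \|BB^{*}\| \operatorname{tr}(Y^{-1} - V^{-1}).$$

On the other hand, since $X, Y, U, V \ge aI$, using Lemma 2.2, we have

$$\operatorname{tr}(U^{-1} - X^{-1}) \le a^{-2}\operatorname{tr}(X - U)$$

and

$$\operatorname{tr}(Y^{-1} - V^{-1}) \le a^{-2}\operatorname{tr}(V - Y).$$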
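For completeness, here is how Lemma 2.2 yields the first of these bounds. This is a sketch that takes $\theta = 1$, $b = a$, and the trace norm as the unitarily invariant norm. Because $U^{-1} - X^{-1} \ge 0$ and $X - U \ge 0$, their traces coincide with their trace norms:

```latex
\begin{align*}
\operatorname{tr}\bigl(U^{-1} - X^{-1}\bigr)
  &= \bigl\|U^{-1} - X^{-1}\bigr\|_{\mathrm{tr}} \\
  &\le 1 \cdot a^{-(1+1)}\,\|U - X\|_{\mathrm{tr}} \\
  &= a^{-2}\operatorname{tr}(X - U).
\end{align*}
```

The bound for $Y^{-1} - V^{-1}$ follows in the same way with $(U, X)$ replaced by $(Y, V)$.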

Thus, we get

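$$\|F(X, Y) - F(U, V)\|_{\mathrm{tr}} \le \frac{\|AA^{*}\|}{a^{2}}\|X - U\|_{\mathrm{tr}} + \frac{\|BB^{*}\|}{a^{2}}\|V - Y\|_{\mathrm{tr}}.$$

This implies that

$$\|F(X, Y) - F(U, V)\|_{\mathrm{tr}} \le \frac{\delta}{2}\bigl(\|X - U\|_{\mathrm{tr}} + \|V - Y\|_{\mathrm{tr}}\bigr),$$

where

$$\delta = \frac{2}{a^{2}}\max\{\|AA^{*}\|, \|BB^{*}\|\}.$$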

From condition (4) and Lemma 2.3, we can easily show that $0 \le \delta < 1$. Now, taking $X_0 = aI$ and $Y_0 = bI$, conditions (2) and (3) give $X_0 \le F(X_0, Y_0)$ and $Y_0 \ge F(Y_0, X_0)$. On the other hand, every pair $X, Y \in \Omega$ has a lower bound and an upper bound in $\Omega$, for instance $aI$ and $(\|X\| + \|Y\|)I$. Note also that $F$ is a continuous mapping. Now, (I) and (III) follow immediately from Theorem 2.1. Let $\bar{X}$ be the unique solution to Equation (1) in $\Omega$.
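The contraction constant $\delta$ and the a priori error estimate of item (III) can be evaluated before running the iteration. The sketch below does this for the data of Example 4; the function names `contraction_constant` and `iterations_needed` and the target accuracy `eps` are assumptions made for illustration.

```python
import numpy as np

def contraction_constant(A, B, a):
    """delta = (2 / a^2) * max(||A A*||, ||B B*||), with spectral norms."""
    norm_AA = np.linalg.norm(A @ A.conj().T, 2)
    norm_BB = np.linalg.norm(B @ B.conj().T, 2)
    return 2.0 * max(norm_AA, norm_BB) / a**2

def iterations_needed(delta, c0, eps, n_max=100_000):
    """Smallest n with delta**n / (1 - delta) * c0 <= eps (needs delta < 1)."""
    n = 0
    while delta**n / (1.0 - delta) * c0 > eps and n < n_max:
        n += 1
    return n

# Data of Example 4 (A = 0) with a = 3.5 and b = 300.
Q = np.array([[100.0, 50.0, 34.0], [50.0, 100.0, 67.0], [34.0, 67.0, 100.0]])
A = np.zeros((3, 3))
B = np.array([[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.5, 0.5, 1.5]])
a, b = 3.5, 300.0

I = np.eye(3)
AA, BB = A.conj().T @ A, B.conj().T @ B
X1 = Q - AA / a + BB / b     # F(aI, bI)
Y1 = Q - AA / b + BB / a     # F(bI, aI)
# c0 = max(||X_1 - X_0||_tr, ||Y_1 - Y_0||_tr)
c0 = max(np.linalg.norm(X1 - a * I, "nuc"), np.linalg.norm(Y1 - b * I, "nuc"))

delta = contraction_constant(A, B, a)
n_eps = iterations_needed(delta, c0, eps=1e-10)
print(delta, n_eps)
```

The count returned by `iterations_needed` is a worst-case guarantee derived from the estimate, so it may be considerably larger than the number of iterations actually needed in practice.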

To prove (II), we shall use the Schauder fixed point theorem. We define the mapping G :[ F ( aI , bI ), F ( bI , aI )] → Ω by

$$G(X) = F(X, X) \quad \text{for all } X \in [F(aI, bI), F(bI, aI)].$$

We claim that G ([ F ( aI , bI ), F ( bI , aI )])⊆[ F ( aI , bI ), F ( bI , aI )]. Let X ∈[ F ( aI , bI ), F ( bI , aI )], that is,

$$F(aI, bI) \le X \le F(bI, aI).$$

Using the mixed monotone property of $F$, we get

$$F(F(aI, bI), F(bI, aI)) \le F(X, X) = G(X) \le F(F(bI, aI), F(aI, bI)). \tag{7}$$

On the other hand, from conditions (2) and (3), we have

$$aI \le F(aI, bI) \quad \text{and} \quad bI \ge F(bI, aI).$$

Again, using the mixed monotone property of $F$, we get

$$F(F(bI, aI), F(aI, bI)) \le F(bI, aI) \quad \text{and} \quad F(F(aI, bI), F(bI, aI)) \ge F(aI, bI). \tag{8}$$

From Equations (7) and (8), it follows that

$$F(aI, bI) \le G(X) \le F(bI, aI).$$

Thus, our claim that G ([ F ( aI , bI ), F ( bI , aI )])⊆[ F ( aI , bI ), F ( bI , aI )] holds.

Now, $G$ maps the compact convex set $[F(aI, bI), F(bI, aI)]$ into itself. Since $G$ is continuous, it follows from the Schauder fixed point theorem (see Theorem 2.2) that $G$ has at least one fixed point in this set. However, fixed points of $G$ are solutions of Equation (1), and we have already proved that Equation (1) has a unique solution in $\Omega$. Thus, this solution must be in the set $[F(aI, bI), F(bI, aI)]$, that is,

$$\bar{X} \in [\,Q + b^{-1}B^{*}B - a^{-1}A^{*}A,\ Q + a^{-1}B^{*}B - b^{-1}A^{*}A\,].$$

Thus, we have proved (II). This completes the proof. □

The following results are immediate consequences of our Theorem 3.1.

Theorem 3.2

Consider Equation (1) with $Q = I$. Suppose that

(1) $0 < a \le \frac{1}{2}$, $b \ge 1 + \frac{a}{2}$; and

(2) $A^{*}A < \frac{a^{2}}{2}I$, $B^{*}B < \frac{a^{2}}{2}I$.

Then, items I to III of Theorem 3.1 hold.
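To see why Theorem 3.2 is a consequence of Theorem 3.1, one can check assumptions 1 to 3 with $Q = I$; assumption 4 is hypothesis (2) itself. A sketch of that verification, reading the first hypothesis as $0 < a \le \frac{1}{2}$, $b \ge 1 + \frac{a}{2}$ and using $A^{*}A \ge 0$ and $B^{*}B \ge 0$:

```latex
\begin{align*}
a^{-1}A^{*}A + aI &< \tfrac{3a}{2}\,I \le I \le bI
  && \text{since } a \le \tfrac12 \text{ and } b \ge 1, \\
bA^{*}A - aB^{*}B &\le bA^{*}A < \tfrac{a^{2}b}{2}\,I \le ab(1-a)\,I
  && \text{since } \tfrac{a}{2} \le 1 - a, \\
bB^{*}B - aA^{*}A &\le bB^{*}B < \tfrac{a^{2}b}{2}\,I \le ab(b-1)\,I
  && \text{since } b \ge 1 + \tfrac{a}{2}.
\end{align*}
```

The three lines are assumptions 1, 2, and 3 with $Q = I$, in that order.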

Theorem 3.3

Consider Equation (1) with $A$ and $B$ which are unitary matrices. Suppose that

(1) $\sqrt{2} < a < b$; and
(2) $(a^{-1} + a)I \le Q \le (b + b^{-1} - a^{-1})I$.

Then, items I to III of Theorem 3.1 hold.
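The reduction to Theorem 3.1 can be made explicit. With $A^{*}A = B^{*}B = I$, assumptions 1 to 4 become the following (a sketch of the computation, not taken from the original text):

```latex
\begin{align*}
\text{(1)}\quad & a^{-1}I + aI \le Q \le bI, \\
\text{(2)}\quad & (b-a)\,I \le ab\,(Q - aI)
   \iff Q \ge \bigl(a + a^{-1} - b^{-1}\bigr) I, \\
\text{(3)}\quad & (b-a)\,I \le ab\,(bI - Q)
   \iff Q \le \bigl(b + b^{-1} - a^{-1}\bigr) I, \\
\text{(4)}\quad & I < \tfrac{a^{2}}{2}\,I \iff a > \sqrt{2}.
\end{align*}
```

All four follow from $a > \sqrt{2}$, $a < b$, and $(a^{-1} + a)I \le Q \le (b + b^{-1} - a^{-1})I$, because $b^{-1} > 0$ and $b^{-1} - a^{-1} < 0$.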

Theorem 3.4

Consider Equation (1) with $A = 0$. Suppose that

(1) $aI \le Q \le bI$;
(2) $B^{*}B \le a(bI - Q)$; and
(3) $B^{*}B < \frac{a^{2}}{2}I$.

Then, items I to III of Theorem 3.1 hold.

Theorem 3.5

Consider Equation (1) with $B = 0$. Suppose that

(1) $a^{-1}A^{*}A + aI \le Q \le bI$;
(2) $A^{*}A \le a(Q - aI)$; and
(3) $A^{*}A < \frac{a^{2}}{2}I$.

Then, items I to III of Theorem 3.1 hold.

Numerical experiments

All programs are written in MATLAB version 7.1.

Example 1

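In this example, we consider Equation (1) with

$$Q = \begin{pmatrix} 7 & 0 & 1 \\ 0 & 7 & 1 \\ 1 & 1 & 8 \end{pmatrix},\quad A = \begin{pmatrix} 2.11 & 0.01 & 0.01 \\ -0.05 & 1.98 & -0.18 \\ 0.1 & 0.19 & 2.38 \end{pmatrix},\quad B = \begin{pmatrix} -3.09 & 0.01 & 0.01 \\ -0.01 & -3.15 & -0.09 \\ 0.04 & 0.1 & -2.94 \end{pmatrix}.$$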

All the hypotheses of Theorem 3.1 are satisfied with a = 5 and b = 14. We consider the sequences { X _n } and { Y _n } defined in item III of Theorem 3.1 with X ₀ = aI and Y ₀ = bI . For each iteration k , we consider the errors

$$R(X_k) = \|X_k - (Q - A^{*}X_k^{-1}A + B^{*}X_k^{-1}B)\|,$$

$$R(Y_k) = \|Y_k - (Q - A^{*}Y_k^{-1}A + B^{*}Y_k^{-1}B)\|$$

and

$$R_k = \max\{R(X_k), R(Y_k)\}.$$
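The residual $R(X)$ can be evaluated with a few lines of code. The sketch below assumes NumPy and uses a manufactured test problem, with hypothetical matrices `A` and `B`, in which $Q$ is chosen so that $X = I$ solves the equation exactly; this makes the expected residual known in advance.

```python
import numpy as np

def residual(X, Q, A, B):
    """Spectral norm of X - (Q - A* X^{-1} A + B* X^{-1} B)."""
    FX = Q - A.conj().T @ np.linalg.solve(X, A) + B.conj().T @ np.linalg.solve(X, B)
    return np.linalg.norm(X - FX, 2)

# Manufactured problem (hypothetical data): with Q = I + A* A - B* B,
# the identity matrix satisfies X = Q - A* X^{-1} A + B* X^{-1} B.
A = np.array([[0.2, 0.1], [0.0, 0.3]])
B = np.array([[0.1, 0.0], [0.2, 0.1]])
I = np.eye(2)
Q = I + A.T @ A - B.T @ B

print(residual(I, Q, A, B))        # expected to be at rounding level
print(residual(2.0 * I, Q, A, B))  # not a solution, so clearly nonzero
```

Applying `residual` to both iterates and taking the larger value gives $R_k$.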

After 23 iterations, we get

$$\bar{X} \approx X_{23} = Y_{23} = \begin{pmatrix} 7.68020112227005 & 0.02950633669680 & 0.88917486612500 \\ 0.02950633669680 & 7.79693817383459 & 0.92560452577454 \\ 0.88917486612500 & 0.92560452577454 & 8.34452699090856 \end{pmatrix}$$

with

$$R_{23} = 2.42861287 \times 10^{-17}.$$

Example 2

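In this example, we consider Equation (1) with

$$Q = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\quad A = \begin{pmatrix} 0.3 & 0.01 & 0.01 \\ 0 & 0.28 & -0.02 \\ 0.02 & 0.03 & 0.34 \end{pmatrix},\quad B = \begin{pmatrix} -0.34 & 0 & 0 \\ 0 & -0.34 & 0 \\ 0.01 & 0.01 & -0.32 \end{pmatrix}.$$

All the hypotheses of Theorem 3.2 are satisfied with $a = 0.5$ and $b = 5$. After 20 iterations, we get

$$\bar{X} \approx X_{20} = Y_{20} = \begin{pmatrix} 1.02444745949421 & -0.003561623099836826 & -0.01296282338345968 \\ -0.003561623099836826 & 1.034823675282171 & -0.008218578980308637 \\ -0.01296282338345968 & -0.008218578980308639 & 0.9861513844061653 \end{pmatrix}$$

with

$$R_{20} = 2.09918957 \times 10^{-16}.$$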

Example 3

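We consider Equation (1) with

$$Q = \begin{pmatrix} 10 & 5 & 3.4 \\ 5 & 10 & 6.7 \\ 3.4 & 6.7 & 10 \end{pmatrix},\quad A = \begin{pmatrix} 0.0591 & 0.0737 & 0.0328 \\ 0.0737 & -0.0328 & -0.0591 \\ 0.0328 & -0.0591 & 0.0737 \end{pmatrix},\quad B = \begin{pmatrix} 0.591 & 0.737 & 0.328 \\ 0.737 & -0.328 & -0.591 \\ 0.328 & -0.591 & 0.737 \end{pmatrix}.$$

In this case, $A$ and $B$ are unitary matrices. All the hypotheses of Theorem 3.3 are satisfied with $a = 1.514$ and $b = 101.5$. After 7 iterations, we get

$$\bar{X} \approx X_{7} = Y_{7} = \begin{pmatrix} 10.06412689941009 & 5.013263723550349 & 3.345079324929884 \\ 5.013263723550349 & 10.13999944657551 & 6.719887939894802 \\ 3.345079324929884 & 6.719887939894802 & 10.29931432720346 \end{pmatrix}$$

with

$$R_{7} = 1.77635684 \times 10^{-15}.$$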

Example 4

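We consider Equation (1) with

$$Q = \begin{pmatrix} 100 & 50 & 34 \\ 50 & 100 & 67 \\ 34 & 67 & 100 \end{pmatrix},\quad A = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},\quad B = \begin{pmatrix} 1 & 0.5 & 0 \\ 0.5 & 1 & 0 \\ 0.5 & 0.5 & 1.5 \end{pmatrix}.$$

All the hypotheses of Theorem 3.4 are satisfied with $a = 3.5$ and $b = 300$. After 3 iterations, we get

$$\bar{X} \approx X_{3} = Y_{3} = \begin{pmatrix} 100.0104629987089 & 50.00450680062249 & 34.00435076795997 \\ 50.00450680062249 & 100.0105221759655 & 66.99538011209222 \\ 34.00435076795997 & 66.99538011209222 & 100.0407917033456 \end{pmatrix}$$

with

$$R_{3} = 3.00990733 \times 10^{-14}.$$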

Example 5

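We consider Equation (1) with

$$Q = \begin{pmatrix} 10 & 5 & 3.4 \\ 5 & 10 & 6.7 \\ 3.4 & 6.7 & 10 \end{pmatrix},\quad A = \begin{pmatrix} 0.5 & 0.25 & 0 \\ 0.25 & 0.5 & 0 \\ 0.25 & 0.25 & 0.75 \end{pmatrix},\quad B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

All the hypotheses of Theorem 3.5 are satisfied with $a = 2$ and $b = 100$. After 10 iterations, we get

$$\bar{X} \approx X_{10} = Y_{10} = \begin{pmatrix} 9.973738915336433 & 4.988761264228204 & 3.388819129012571 \\ 4.988761264228204 & 9.973542675753565 & 6.712061714363009 \\ 3.388819129012571 & 6.712061714363009 & 9.89541012219485 \end{pmatrix}$$

with

$$R_{10} = 1.32107728 \times 10^{-14}.$$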

Acknowledgements

The second author acknowledges support from the National Natural Science Foundation of China (grant no. 11101100) and the Natural Science Foundation of Guangxi Province (grant no. 2012GXNSFBA053006).

Competing interests

Authors’ contributions

All authors contributed equally and significantly in writing this paper. All authors read and approved the final manuscript.

Positive definite solution of the matrix equation X = Q − A∗X −1A + B∗X −1B via Bhaskar-Lakshmikantham fixed point theorem
