J Appl Math Comput (2011) 35: 537–550
DOI 10.1007/s12190-010-0377-8
On the Gauss–Newton method
Ioannis K. Argyros · Saïd Hilout
Received: 28 October 2009 / Published online: 16 January 2010
© Korean Society for Computational and Applied Mathematics 2010
Abstract We provide a new semilocal convergence analysis of the Gauss–Newton
method (GNM) for solving nonlinear equations in Euclidean space.
Using a combination of center-Lipschitz and Lipschitz conditions, together with our new idea
of recurrent functions, we provide under the same or weaker hypotheses than be-
fore (Ben-Israel, J. Math. Anal. Appl. 15:243–252, 1966; Chen and Nashed, Numer.
Math. 66:235–257, 1993; Deuflhard and Heindl, SIAM J. Numer. Anal. 16:1–10,
1979; Guo, J. Comput. Math. 25:231–242, 2007; Häußler, Numer. Math. 48:119–
125, 1986; Hu et al., J. Comput. Appl. Math. 219:110–122, 2008; Kantorovich and
Akilov, Functional Analysis in Normed Spaces, Pergamon, Oxford, 1982), a finer
convergence analysis. The results can be extended to the case where outer or generalized
inverses are used.
Numerical examples are also provided to show that our results apply, where others
fail (Ben-Israel, J. Math. Anal. Appl. 15:243–252, 1966; Chen and Nashed, Numer.
Math. 66:235–257, 1993; Deuflhard and Heindl, SIAM J. Numer. Anal. 16:1–10,
1979; Guo, J. Comput. Math. 25:231–242, 2007; Häußler, Numer. Math. 48:119–
125, 1986; Hu et al., J. Comput. Appl. Math. 219:110–122, 2008; Kantorovich and
Akilov, Functional Analysis in Normed Spaces, Pergamon, Oxford, 1982).
Keywords Gauss–Newton method · Semilocal convergence · Fréchet-derivative ·
Moore–Penrose pseudo-inverse
Mathematics Subject Classification (2000) 65F20 · 65G99 · 65H10 · 49M15
I.K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505, USA
e-mail: [email protected]

S. Hilout
Laboratoire de Mathématiques et Applications, Poitiers University, Bd. Pierre et Marie Curie,
Téléport 2, B.P. 30179, 86962 Futuroscope Chasseneuil Cedex, France
e-mail: [email protected]
1 Introduction
Let $F$ be a Fréchet-differentiable function defined on a convex subset $D$ of $\mathbb{R}^i$, with
values in $\mathbb{R}^j$ ($i \le j$). In this study, we are concerned with the problem of finding
$x^\star \in \mathbb{R}^i$ minimizing the objective function
$$G(x) := \frac{1}{2}\,\|F(x)\|^2 = \frac{1}{2}\,F(x)^T F(x), \qquad (1.1)$$
where $\|\cdot\|$ denotes the Euclidean norm. Many problems in applied mathematics and
engineering are solved by finding such solutions $x^\star$ [1–14].
Except in special cases, the most commonly used solution methods are iterative:
starting from one or several initial approximations, a sequence is constructed
that converges to a solution of the equation. Iterative methods are also used for
solving optimization problems like (1.1).
Iteration sequences converge to an optimal solution of the problem at hand. In
particular, for $x^\star$ to be a local minimum it is necessary that it be a zero of the gradient
$\nabla G$ of $G$:
$$\nabla G(x^\star) = J^T(x^\star)\,F(x^\star) = 0, \qquad (1.2)$$
with
$$J(x) = F'(x) \qquad (x \in D). \qquad (1.3)$$
The iterative method for computing such a zero considered here is the so-called Gauss–Newton
method (GNM), introduced by Ben-Israel [7]:
$$x_{n+1} = x_n - J^+(x_n)\,F(x_n) \qquad (n \ge 0),\ (x_0 \in D), \qquad (1.4)$$
where $J^+$ denotes the well-known Moore–Penrose pseudoinverse of $J$ [5].
In this paper, we assume that $M^+$ is the Moore–Penrose pseudoinverse of a matrix
$M$, satisfying the four axioms:
$$(M^+M)^T = M^+M, \qquad (MM^+)^T = MM^+, \qquad M^+MM^+ = M^+, \qquad MM^+M = M.$$
In the case of a full-rank $(m, n)$ matrix $M$, with $\operatorname{rank} M = n$, the pseudo-inverse
is given by:
$$M^+ = (M^TM)^{-1}M^T.$$
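As a quick numerical check (an illustration of ours, not part of the analysis), the four axioms and the full-rank formula can be verified directly in floating point; the matrix $M$ below is an arbitrary example.

```python
import numpy as np

# Arbitrary full-rank (3, 2) example matrix, rank M = n = 2.
M = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

Mp = np.linalg.pinv(M)  # Moore-Penrose pseudoinverse M^+

# The four Penrose axioms.
assert np.allclose((Mp @ M).T, Mp @ M)   # (M^+ M)^T = M^+ M
assert np.allclose((M @ Mp).T, M @ Mp)   # (M M^+)^T = M M^+
assert np.allclose(Mp @ M @ Mp, Mp)      # M^+ M M^+ = M^+
assert np.allclose(M @ Mp @ M, M)        # M M^+ M = M

# Full-rank formula M^+ = (M^T M)^{-1} M^T.
assert np.allclose(Mp, np.linalg.inv(M.T @ M) @ M.T)
```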
There is an extensive literature on convergence results for the (GNM). We refer
the reader to [1–14], and the references therein. In particular, Häußler [11] provided a
Kantorovich-type semilocal convergence analysis for (GNM).
Here, we revisit (GNM). Using center-Lipschitz conditions (instead of the Lipschitz
conditions used in [11]) to find more precise upper bounds on the inverses
of the mappings involved, together with our new idea of recurrent functions, we provide an
analysis with the following advantages (under the same computational cost, and the same
or weaker hypotheses):
(a) finer estimates on the distances $\|x_{n+1} - x_n\|$, $\|x_n - x^\star\|$ ($n \ge 0$);
(b) at least as precise information on the distances involved.
Numerical examples are provided to show that our results apply, where the corre-
sponding ones in [7–13] do not.
2 Semilocal convergence analysis of (GNM)
We need the following result on majorizing sequences for (GNM).
Lemma 2.1 Let $\beta > 0$, $\gamma_0 > 0$, $\gamma > 0$, with $\gamma_0 \le \gamma$, and $\eta > 0$ be given constants.
Assume: the quadratic polynomial
$$f_1(s) = 2\gamma_0\beta s^2 - \bigl(2 - (2\gamma_0 + \gamma)\beta\bigr)s + 2\eta \qquad (2.1)$$
has a root in $(0, 1)$, denoted by $\frac{\delta}{2}$, and, for
$$\delta_0 = \frac{\gamma\beta + 2\eta}{1 - \gamma_0\beta}, \qquad (2.2)$$
$$\alpha = \frac{-\gamma + \sqrt{\gamma^2 + 8\gamma_0\gamma}}{4\gamma_0}, \qquad (2.3)$$
the following holds:
$$\delta_0 \le \delta \le 2\alpha. \qquad (2.4)$$
Then, the scalar sequence $\{t_n\}$ ($n \ge 0$) generated by
$$t_0 = 0, \quad t_1 = \beta, \quad t_{n+2} = t_{n+1} + \frac{\gamma(t_{n+1} - t_n) + 2\eta}{2(1 - \gamma_0 t_{n+1})}\,(t_{n+1} - t_n) \qquad (2.5)$$
is increasing, bounded above by
$$t^{\star\star} = \frac{2\beta}{2 - \delta}, \qquad (2.6)$$
and converges to its unique least upper bound $t^\star$ such that
$$t^\star \in [0, t^{\star\star}]. \qquad (2.7)$$
Moreover, the following estimates hold for all $n \ge 0$:
$$0 \le t_{n+2} - t_{n+1} \le \frac{\delta}{2}(t_{n+1} - t_n) \le \left(\frac{\delta}{2}\right)^{n+1}\beta, \qquad (2.8)$$
and
$$t^\star - t_n \le \frac{2\beta}{2 - \delta}\left(\frac{\delta}{2}\right)^n. \qquad (2.9)$$
Proof We shall show using induction on the integer m:
$$0 < t_{m+2} - t_{m+1} = \frac{\gamma(t_{m+1} - t_m) + 2\eta}{2(1 - \gamma_0 t_{m+1})}\,(t_{m+1} - t_m) \le \frac{\delta}{2}(t_{m+1} - t_m), \qquad (2.10)$$
and
$$\gamma_0 t_{m+1} < 1. \qquad (2.11)$$
If (2.10) and (2.11) hold, then (2.8) holds, and
$$t_{m+2} \le t_{m+1} + \frac{\delta}{2}(t_{m+1} - t_m) \le t_m + \frac{\delta}{2}(t_m - t_{m-1}) + \frac{\delta}{2}(t_{m+1} - t_m) \le \cdots$$
$$\le \beta + \frac{\delta}{2}\beta + \cdots + \left(\frac{\delta}{2}\right)^{m+1}\beta = \frac{1 - (\frac{\delta}{2})^{m+2}}{1 - \frac{\delta}{2}}\,\beta < \frac{2\beta}{2 - \delta} = t^{\star\star} \quad \text{(by (2.6))}. \qquad (2.12)$$
Estimates (2.10) and (2.11) hold for $m = 0$, by the initial conditions, (2.4), and the
choices of $\delta$ and $\delta_0$:
$$\frac{\gamma(t_1 - t_0) + 2\eta}{1 - \gamma_0 t_1} = \frac{\gamma\beta + 2\eta}{1 - \gamma_0\beta} = \delta_0 \le \delta, \qquad \gamma_0 t_1 = \gamma_0\beta < 1.$$
Let us assume (2.8), (2.10), and (2.11) hold for all m ≤ n + 1.
Estimate (2.10) can be re-written as
$$\gamma(t_{m+1} - t_m) + 2\eta + \gamma_0\delta t_{m+1} - \delta \le 0,$$
or
$$\gamma\left(\frac{\delta}{2}\right)^m\beta + \gamma_0\delta\,\frac{1 - (\frac{\delta}{2})^{m+1}}{1 - \frac{\delta}{2}}\,\beta + 2\eta - \delta \le 0. \qquad (2.13)$$
Estimate (2.13) motivates us to introduce functions $f_m$ on $[0, +\infty)$ ($m \ge 1$), evaluated at
$s = \frac{\delta}{2}$, by:
$$f_m(s) = \gamma s^m\beta + 2\gamma_0 s(1 + s + s^2 + \cdots + s^m)\beta - 2s + 2\eta. \qquad (2.14)$$
Estimate (2.13) certainly holds, if:
$$f_m\!\left(\frac{\delta}{2}\right) \le 0 \quad \text{for all } m \ge 1. \qquad (2.15)$$
We need to find a relationship between two consecutive polynomials $f_m$:
$$f_{m+1}(s) = \gamma s^{m+1}\beta + 2\gamma_0 s(1 + s + s^2 + \cdots + s^m + s^{m+1})\beta - 2s + 2\eta$$
$$= \gamma s^m\beta - \gamma s^m\beta + \gamma s^{m+1}\beta + 2\gamma_0 s(1 + s + s^2 + \cdots + s^m)\beta + 2\gamma_0 s^{m+2}\beta - 2s + 2\eta$$
$$= f_m(s) + g(s)\beta s^m, \qquad (2.16)$$
where
$$g(s) = 2\gamma_0 s^2 + \gamma s - \gamma. \qquad (2.17)$$
Note that function g has a unique positive root α given by (2.3).
Estimate (2.15) holds for m = 1 as equality.
Using (2.16) for $m = 1$, and (2.17), we obtain:
$$f_2\!\left(\frac{\delta}{2}\right) = f_1\!\left(\frac{\delta}{2}\right) + g\!\left(\frac{\delta}{2}\right)\frac{\delta}{2}\,\beta \le 0,$$
since
$$f_1\!\left(\frac{\delta}{2}\right) = 0, \quad \text{and} \quad g\!\left(\frac{\delta}{2}\right) \le 0.$$
Assume
$$f_k\!\left(\frac{\delta}{2}\right) \le 0 \quad \text{for all } k \le m. \qquad (2.18)$$
Then, it follows from (2.16) and (2.17) that
$$f_{m+1}\!\left(\frac{\delta}{2}\right) = f_m\!\left(\frac{\delta}{2}\right) + g\!\left(\frac{\delta}{2}\right)\left(\frac{\delta}{2}\right)^m\beta \le 0. \qquad (2.19)$$
Hence, estimate (2.15) holds for all $m \ge 1$.
Moreover, we obtain:
$$f_\infty\!\left(\frac{\delta}{2}\right) = \lim_{m \to \infty} f_m\!\left(\frac{\delta}{2}\right) \le 0. \qquad (2.20)$$
That completes the induction.
Estimate (2.9) follows from (2.8) by using standard majorization techniques
[5, 13].
Finally, sequence $\{t_n\}$ is non-decreasing, bounded above by $t^{\star\star}$, and as such it
converges to its unique least upper bound $t^\star$.
That completes the proof of Lemma 2.1.
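The recurrence (2.5) is straightforward to tabulate. The following sketch (an illustration of ours) uses the constants of Example 3.2, Case 2 below; these satisfy the hypotheses of Lemma 2.3 rather than (2.4), so the steps contract by a factor of at most $\alpha$, and the bound is taken with $\alpha$ replacing $\frac{\delta}{2}$.

```python
import numpy as np

# Constants from Example 3.2, Case 2 (p = .49); Lemma 2.3 applies here.
beta, gamma0, gamma, eta = 0.17, 2.51, 3.02, 0.01
alpha = (-gamma + np.sqrt(gamma**2 + 8*gamma0*gamma)) / (4*gamma0)  # (2.3)

t = [0.0, beta]
for _ in range(12):
    step = t[-1] - t[-2]
    t.append(t[-1] + (gamma*step + 2*eta) / (2*(1 - gamma0*t[-1])) * step)  # (2.5)

t = np.array(t)
steps = np.diff(t)
assert np.all(steps > 0)                        # {t_n} is increasing
assert np.all(steps[1:] <= alpha * steps[:-1])  # contraction factor at most alpha
assert np.all(t < beta / (1 - alpha))           # bound t** with alpha replacing delta/2
print(t[-1])                                    # numerical limit t*, about 0.2812
```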
Remark 2.2
(a) The hypothesis on the existence of the root $\frac{\delta}{2}$ of polynomial $f_1$ can be replaced
by the conditions
$$(4\gamma_0 + \gamma)\beta < 2(1 - \eta), \qquad (2.21)$$
and
$$\eta < 1. \qquad (2.22)$$
Indeed, we have by the intermediate value theorem:
$$f_1(0) = 2\eta > 0, \quad \text{and} \quad f_1(1) = (4\gamma_0 + \gamma)\beta - 2(1 - \eta) < 0.$$
(b) Another set of conditions for the existence of the root $\frac{\delta}{2}$ is $\Delta_1 > 0$, where $\Delta_1$ is the
discriminant of polynomial $f_1$, and $(2\gamma_0 + \gamma)\beta < 2$.
We can also show the following alternative to Lemma 2.1.
Lemma 2.3 Let $\beta > 0$, $\gamma_0 > 0$, $\gamma > 0$, with $\gamma_0 \le \gamma$, and $\eta > 0$ be given constants.
Assume:
$$0 < h_A = a\beta \le \frac{1}{2}, \qquad (2.23)$$
and
$$\eta < \alpha, \qquad (2.24)$$
where
$$a = \frac{1}{4(\alpha - \eta)}\max\bigl\{\gamma + 2\alpha\gamma_0,\ (2\gamma_0\alpha + 2\gamma_0 + \gamma)\alpha\bigr\}.$$
Then, the following hold: polynomial $f_1$ has a positive root $\frac{\delta}{2}$,
$$\max\{\delta_0, \delta\} \le 2\alpha,$$
and the conclusions of Lemma 2.1 hold, with $\alpha$ replacing $\frac{\delta}{2}$.
Proof It follows from (2.16), (2.23), and (2.24) that
$$f_m(\alpha) = f_1(\alpha) \le 0, \quad m \ge 1, \qquad (2.25)$$
which, together with $f_1(0) = 2\eta > 0$, implies that there exists a positive root $\frac{\delta}{2}$ of
polynomial $f_1$, satisfying $\delta \le 2\alpha$. It also follows from (2.23) and (2.24) that $\delta_0 \le
2\alpha$, and (2.15) holds, with $\alpha$ replacing $\frac{\delta}{2}$. Note also that $\alpha \in (0, 1)$.
That completes the proof of Lemma 2.3.
Remark 2.4 The conclusions of Lemma 2.1 also hold (with δ0 replacing δ) under the
following conditions:
Let β > 0, γ0 > 0, γ > 0, with γ0 ≤ γ , and η > 0 be given constants.
Assume:
$$\eta < \alpha, \qquad (2.26)$$
$$\beta \le \frac{2(\alpha - \eta)}{\gamma + 2\alpha\gamma_0}, \qquad (2.27)$$
and
$$f_1\!\left(\frac{\delta_0}{2}\right) \le 0, \qquad (2.28)$$
where polynomial $f_1$ is given by (2.14) for $m = 1$. Estimate (2.15) holds for $m = 1$
by (2.28). Using (2.14)–(2.17), we get
$$f_2\!\left(\frac{\delta_0}{2}\right) = f_1\!\left(\frac{\delta_0}{2}\right) + g\!\left(\frac{\delta_0}{2}\right)\frac{\delta_0}{2}\,\beta \le 0,$$
since
$$f_1\!\left(\frac{\delta_0}{2}\right) \le 0, \quad \text{and} \quad g\!\left(\frac{\delta_0}{2}\right) \le 0.$$
Assume (2.15) holds for all $k \le m$. We shall show (2.15) for $m$ replaced by $m + 1$.
We have:
$$f_{m+1}\!\left(\frac{\delta_0}{2}\right) = f_m\!\left(\frac{\delta_0}{2}\right) + g\!\left(\frac{\delta_0}{2}\right)\left(\frac{\delta_0}{2}\right)^m\beta \le 0,$$
which shows (2.15) for all $m$.
Finally, we obtain:
$$f_\infty\!\left(\frac{\delta_0}{2}\right) = \lim_{m \to \infty} f_m\!\left(\frac{\delta_0}{2}\right) \le 0.$$
We need the following standard perturbation lemma [5, 11, 14].
Lemma 2.5 Let $A$ and $B$ be $(m \times n)$ matrices.
Assume:
$$\operatorname{rank}(A) \le \operatorname{rank}(B) = r \le i \quad (r \ge 1), \qquad (2.29)$$
and
$$\|A - B\|\,\|B^+\| < 1. \qquad (2.30)$$
Then, the following hold:
$$\operatorname{rank}(A) = r, \qquad (2.31)$$
and
$$\|A^+\| \le \frac{\|B^+\|}{1 - \|B^+\|\,\|A - B\|}. \qquad (2.32)$$
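Lemma 2.5 is easy to probe numerically. The sketch below (an independent illustration of ours, with a randomly generated full-rank $B$, a small perturbation $A$, and the spectral norm) checks hypothesis (2.30) and conclusions (2.31) and (2.32).

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 3))             # full column rank r = 3 (almost surely)
A = B + 1e-3 * rng.standard_normal((5, 3))  # small perturbation of B

Bp = np.linalg.pinv(B)
norm = lambda M: np.linalg.norm(M, 2)       # spectral norm

assert norm(A - B) * norm(Bp) < 1           # hypothesis (2.30)
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B)                # (2.31)
assert norm(np.linalg.pinv(A)) <= norm(Bp) / (1 - norm(Bp) * norm(A - B))  # (2.32)
```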
We can show the semilocal convergence result for (GNM):
Theorem 2.6 Let $F \in C^1(D_0)$, $D_0 \subseteq D \subseteq \mathbb{R}^i$, and $D_0$ be a convex set.
Assume: there exist $x_0 \in D_0$, constants $\beta > 0$, $\beta_0 > 0$, $K > 0$, $K_0 > 0$, and a function
$\eta : D_0 \longrightarrow \mathbb{R}_+$, such that for all $x, y \in D_0$:
$$\operatorname{rank}(J(x_0)) = r \le i, \quad r \ge 1, \qquad (2.33)$$
$$\operatorname{rank}(J(x)) \le r, \qquad (2.34)$$
$$\|J^+(x_0)F(x_0)\| \le \beta, \qquad (2.35)$$
$$\|J(x) - J(y)\| \le K\|x - y\|, \qquad (2.36)$$
$$\|J(x) - J(x_0)\| \le K_0\|x - x_0\|, \qquad (2.37)$$
$$\|J^+(x_0)\| \le \beta_0, \qquad (2.38)$$
$$\|J^+(y)\,r(x)\| \le \eta(x)\|x - y\|, \qquad (2.39)$$
with
$$r(x) = (I - J(x)J^+(x))F(x), \qquad (2.40)$$
$$\eta(x) \le \eta < 1, \qquad (2.41)$$
$$U(x_0, t^\star) \subseteq D_0, \qquad (2.42)$$
where $t^\star$ is given in (2.7), and the hypotheses of Lemma 2.1 or Lemma 2.3 hold, for
$$\gamma_0 = \beta_0 K_0, \quad \text{and} \quad \gamma = \beta_0 K. \qquad (2.43)$$
Then, the following hold:
$$\operatorname{rank}(J(x)) = r, \quad x \in U(x_0, t^\star). \qquad (2.44)$$
Sequence $\{x_n\}$ ($n \ge 0$) generated by (GNM) is well defined, remains in $U(x_0, t^\star)$
for all $n \ge 0$, and converges to a zero $x^\star$ of $J^+(x)F(x)$ in $\overline{U}(x_0, t^\star)$;
$$\|x_{n+1} - x_n\| \le t_{n+1} - t_n, \qquad (2.45)$$
and
$$\|x_n - x^\star\| \le t^\star - t_n, \qquad (2.46)$$
where sequence $\{t_n\}$ is given in Lemma 2.1.
Moreover, the following equality holds:
$$\operatorname{rank}(J(x^\star)) = r, \qquad (2.47)$$
and, if $\operatorname{rank}(J(x_0)) = i$ and $F(x^\star) = 0$, then $x^\star$ is unique in $U(x_0, t^\star)$; in fact, $x^\star$
is the unique zero of $J^+(x)F(x)$ in $U(x_0, t^\star)$.
Proof By hypothesis, $x_1 \in U(x_0, t^\star)$, since $\|x_1 - x_0\| \le \beta \le t^\star$. Then, (2.45) holds
for $n = 0$.
Assume $x_m \in U(x_0, t^\star)$, and that (2.45) holds for $m \le n$.
Using (2.37) and (2.11), we get:
$$\|J(x_m) - J(x_0)\| \le K_0\|x_m - x_0\| \le K_0(t_m - t_0) = K_0 t_m < \frac{1}{\beta_0}. \qquad (2.48)$$
It follows from (2.48) and Lemma 2.5 that (2.44), (2.47), and
$$\|J^+(x_m)\| \le \frac{\beta_0}{1 - \beta_0 K_0\|x_m - x_0\|} \le \frac{\beta_0}{1 - \gamma_0 t_m} \qquad (2.49)$$
hold.
Using (1.4), (2.5), (2.36), (2.39)–(2.43), (2.49), and the induction hypotheses, we
obtain in turn:
$$\|x_{m+1} - x_m\| = \left\| J^+(x_m)\int_0^1 \bigl(J(x_{m-1} + \theta(x_m - x_{m-1})) - J(x_{m-1})\bigr)(x_m - x_{m-1})\,d\theta \right.$$
$$\left. {} + J^+(x_m)(I - J(x_{m-1})J^+(x_{m-1}))F(x_{m-1}) \right\|$$
$$\le \frac{1}{1 - \gamma_0 t_m}\left(\frac{1}{2}\gamma\|x_m - x_{m-1}\| + \eta\right)\|x_m - x_{m-1}\|$$
$$\le \frac{1}{2(1 - \gamma_0 t_m)}\bigl(\gamma(t_m - t_{m-1}) + 2\eta\bigr)(t_m - t_{m-1}) = t_{m+1} - t_m, \qquad (2.50)$$
which completes the induction for (2.45).
Note also that (2.45) implies:
$$\|x_{k+1} - x_0\| \le t_{k+1} \quad \text{for } k = 1, \ldots, m+1.$$
That is, $x_{m+1} \in U(x_0, t^\star)$.
In view of Lemma 2.1, sequence $\{x_n\}$ is Cauchy in $\mathbb{R}^i$, and as such it converges to
some $x^\star \in \overline{U}(x_0, t^\star)$ (since $\overline{U}(x_0, t^\star)$ is a closed set).
We claim: $x^\star$ is a zero of $J^+(x)F(x)$. Indeed, we get:
$$\|J^+(x^\star)F(x_m)\| \le \|J^+(x^\star)(I - J(x_m)J^+(x_m))F(x_m)\| + \|J^+(x^\star)J(x_m)J^+(x_m)F(x_m)\|$$
$$\le \eta\|x_m - x^\star\| + \|J^+(x^\star)J(x_m)\|\,\|x_{m+1} - x_m\|. \qquad (2.51)$$
Using (2.51), and the continuity of the mappings $J(x)$, $F(x)$, we justify the claim.
Finally, estimate (2.46) follows from (2.45) by using standard majorization
techniques [5, 13].
The uniqueness part, being identical to that of Lemma 2.9 in [11, p. 122], is omitted.
That completes the proof of Theorem 2.6.
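For concreteness, iteration (1.4) can be implemented in a few lines. The sketch below is an illustration of ours (the stopping rule and tolerance are our choices); it is applied to the function of Example 3.2 in the next section, a zero-residual case with $i = j = 2$.

```python
import numpy as np

def gauss_newton(F, J, x0, tol=1e-12, max_iter=50):
    """GNM iteration (1.4): x_{n+1} = x_n - J^+(x_n) F(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.pinv(J(x)) @ F(x)
        x = x - step
        if np.linalg.norm(step) <= tol:  # stop once ||x_{n+1} - x_n|| is small
            break
    return x

p = 0.49
F = lambda x: np.array([x[0]**3 - p, x[1]**3 - p])
J = lambda x: np.diag([3*x[0]**2, 3*x[1]**2])
print(gauss_newton(F, J, [1.0, 1.0]), np.cbrt(p))  # converges to (p^{1/3}, p^{1/3})
```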
We can now state Häußler’s result for comparison purposes:
Theorem 2.7 [11] Under hypotheses (2.33)–(2.41) (excluding (2.37)), further assume:
$$h_H = \beta\gamma \le \frac{1}{2}(1 - \eta)^2, \qquad (2.52)$$
and
$$U(x_0, v^\star) \subseteq D_0, \qquad (2.53)$$
where
$$v^\star = \lim_{n \to \infty} v_n, \qquad (2.54)$$
$$v_0 = 0, \quad v_1 = \beta, \quad v_{n+2} = v_{n+1} + \frac{\gamma(v_{n+1} - v_n) + 2\eta}{2(1 - \gamma v_{n+1})}\,(v_{n+1} - v_n). \qquad (2.55)$$
Then, the conclusions of Theorem 2.6 hold, with $v^\star$, $\{v_n\}$ replacing $t^\star$, $\{t_n\}$ ($n \ge 0$),
respectively.
Remark 2.8 Note that
$$\gamma_0 \le \gamma \qquad (2.56)$$
holds in general, and $\frac{\gamma}{\gamma_0}$ can be arbitrarily large [3–5].
Using induction on the integer $n$, we can easily show:
Proposition 2.9 Under the hypotheses of Theorem 2.7 alone, or of Theorems 2.6 and 2.7,
the following hold for all $n \ge 0$:
$$\|x_{n+1} - x_n\| \le t_{n+1} - t_n \le v_{n+1} - v_n, \qquad (2.57)$$
$$t_n \le v_n \quad (n \ge 2), \qquad (2.58)$$
and
$$\|x_n - x^\star\| \le t^\star - t_n \le v^\star - v_n. \qquad (2.59)$$
Note also that if $\gamma_0 < \gamma$, then strict inequality holds in (2.57) and (2.58) for
$n \ge 2$.
Remark 2.10 By Proposition 2.9, the error estimates of Theorem 2.6 can certainly be
improved under the same computational cost, since in practice, the computation of γ
requires that of γ0 .
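Proposition 2.9 can be observed numerically by running (2.5) and (2.55) side by side. The constants below are illustrative choices of ours, taken so that Häußler's condition (2.52) also holds.

```python
import numpy as np

def majorizing(beta, gamma0, gamma, eta, n=15):
    """Sequence (2.5); for gamma0 = gamma it is Haeussler's sequence (2.55)."""
    t = [0.0, beta]
    for _ in range(n):
        step = t[-1] - t[-2]
        t.append(t[-1] + (gamma*step + 2*eta) / (2*(1 - gamma0*t[-1])) * step)
    return np.array(t)

beta, gamma0, gamma, eta = 0.1, 0.5, 1.0, 0.05  # h_H = 0.1 <= (1-eta)^2/2
t = majorizing(beta, gamma0, gamma, eta)        # our sequence {t_n}
v = majorizing(beta, gamma, gamma, eta)         # Haeussler's {v_n} (gamma0 = gamma)

assert np.all(t[2:] <= v[2:])                   # (2.58)
assert np.all(np.diff(t) <= np.diff(v))         # second inequality in (2.57)
print(t[-1], v[-1])                             # t* <= v*, as in (2.59)
```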
In the next section, we shall show that:
(a) the conditions of Lemma 2.1 are always weaker than (2.52) when $\gamma_0 < \gamma$ and $i = j$
(i.e., when $J^+(x) = F'(x)^{-1}$ ($x \in D_0$), in the case of Newton's method), whereas
they coincide when $\gamma_0 = \gamma$;
(b) the conditions of Lemma 2.1 or Lemma 2.3 can be weaker than (2.52), when $\gamma_0 < \gamma$.
3 Special cases and applications
Application 3.1 (Newton's method) That is, $\eta = 0$.
The hypothesis
$$h_G = \beta\gamma \le \frac{(1 - \eta)^2}{2} \quad \text{(see [10])} \qquad (3.1)$$
reduces to the Newton–Kantorovich hypothesis [4, 13], famous for its simplicity and
clarity, for solving nonlinear equations:
$$h_K = \gamma\beta \le \frac{1}{2}. \qquad (3.2)$$
Note that in this case, polynomials $f_m$ ($m \ge 1$) should be:
$$f_m(s) = \bigl(\gamma s^{m-1} + 2\gamma_0(1 + s + s^2 + \cdots + s^m)\bigr)\beta - 2, \qquad (3.3)$$
and
$$f_{m+1}(s) = f_m(s) + g(s)s^{m-1}\beta. \qquad (3.4)$$
It is then simple algebra to show that the condition of Lemma 2.1 reduces to:
$$h_A = \alpha\beta \le \frac{1}{2}, \qquad (3.5)$$
where
$$\alpha = \frac{1}{8}\left(\gamma + 4\gamma_0 + \sqrt{\gamma^2 + 8\gamma_0\gamma}\right). \qquad (3.6)$$
In view of (3.2), (3.5), and (3.6), we get:
$$h_K \le \frac{1}{2} \implies h_A \le \frac{1}{2}, \qquad (3.7)$$
but not necessarily vice versa, unless $\gamma_0 = \gamma$.
Moreover, if $\gamma_0 < \gamma$, condition (3.5) is also weaker than
$$h_{HSL} = \frac{\gamma_0 + \gamma}{2}\,\beta \le \frac{1}{2}, \qquad (3.8)$$
provided in [12] for nonsingular operators. Note that condition (3.8) was first given
by us in [2, 4] for the case when the linear operator $F'(x_0)$ is invertible.
We provide examples where $\gamma_0 < \gamma$, or where (3.5) holds but (3.2) is violated.
Example 3.2 Let $X = Y = \mathbb{R}^2$, equipped with the max-norm, and let
$$x_0 = (1, 1)^T, \qquad U_0 = \{x : \|x - x_0\| \le 1 - p\}, \quad p \in \left(0, \frac{1}{2}\right).$$
Define function $F$ on $U_0$ by
$$F(x) = (\xi_1^3 - p,\ \xi_2^3 - p), \quad x = (\xi_1, \xi_2)^T. \qquad (3.9)$$
The Fréchet-derivative of operator $F$ is given by
$$F'(x) = \begin{pmatrix} 3\xi_1^2 & 0 \\ 0 & 3\xi_2^2 \end{pmatrix}.$$
Case 1: $\eta = 0$.
Using the hypotheses of Theorem 2.6, we get:
$$\beta = \frac{1}{3}(1 - p), \qquad \gamma_0 = 3 - p, \qquad \gamma = 2(2 - p).$$
The Kantorovich condition (3.2) is violated, since
$$\frac{4}{3}(1 - p)(2 - p) > 1 \quad \text{for all } p \in \left(0, \frac{1}{2}\right).$$
Hence, there is no guarantee that Newton's method converges to $x^\star = (\sqrt[3]{p}, \sqrt[3]{p})^T$,
starting at $x_0$.
However, our condition (3.5) holds for all $p \in I = [.450339002, \frac{1}{2})$. Hence, the
conclusions of our Theorem 2.6 apply to solve (3.9) for all $p \in I$.
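These claims are easy to check numerically. A minimal sketch (our addition) scans $p \in (0, \frac{1}{2})$ and confirms that $h_K > \frac{1}{2}$ throughout, while $h_A \le \frac{1}{2}$ from $p \approx .4504$ on.

```python
import numpy as np

def h_A(p):  # condition (3.5) with (3.6)
    beta, gamma0, gamma = (1 - p)/3, 3 - p, 2*(2 - p)
    alpha = (gamma + 4*gamma0 + np.sqrt(gamma**2 + 8*gamma0*gamma)) / 8
    return alpha * beta

def h_K(p):  # Kantorovich condition (3.2)
    return 2*(2 - p) * (1 - p) / 3

p = np.linspace(0.001, 0.499, 499)
assert np.all(h_K(p) > 0.5)   # (3.2) fails on all of (0, 1/2)
print(p[h_A(p) <= 0.5][0])    # first grid point in I, about .451 on this grid
```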
Case 2: $\eta = 0.01 \ne 0$.
Choose $p = .49$; then we get
$$\gamma_0 = 2.51 < \gamma = 3.02, \qquad \beta = .17,$$
$$\frac{\delta}{2} = .033058514 < \alpha = .53112045,$$
and
$$\delta_0 = .3347085 < 2\alpha.$$
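The figures above can be reproduced by the following sketch; the final check of $h_A = a\beta \le \frac{1}{2}$ is our addition, confirming that Lemma 2.3 applies with these constants.

```python
import numpy as np

p, eta = 0.49, 0.01
beta, gamma0, gamma = (1 - p)/3, 3 - p, 2*(2 - p)  # 0.17, 2.51, 3.02
alpha = (-gamma + np.sqrt(gamma**2 + 8*gamma0*gamma)) / (4*gamma0)  # (2.3)

# Smaller root delta/2 of the quadratic f1 in (2.1).
A, B, C = 2*gamma0*beta, -(2 - (2*gamma0 + gamma)*beta), 2*eta
half_delta = (-B - np.sqrt(B*B - 4*A*C)) / (2*A)

# Constant a and condition (2.23) of Lemma 2.3.
a = max(gamma + 2*alpha*gamma0,
        (2*gamma0*alpha + 2*gamma0 + gamma)*alpha) / (4*(alpha - eta))

print(half_delta, alpha)  # 0.033058... < 0.531120...
print(a * beta)           # h_A = 0.4637... <= 1/2
```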
Note also that condition (3.1) is violated no matter how η is chosen in (0, 1).
Finally, by comparing (2.23) with (2.52), we see that our condition is weaker provided
that
$$a < \frac{\gamma}{(1 - \eta)^2}, \qquad (3.10)$$
which can certainly happen.
For example, if $\gamma_0 \approx 0$, then $\alpha \approx 1$ and $a \approx \frac{\gamma}{4(1 - \eta)}$, in which case (3.10) holds.
Application 3.3 In the case $X = Y = \mathbb{R}^j$ ($j$ fixed in $\mathbb{N}$), we can split the matrix $F'(x_n)$
into $F'(x_n) = B_n - C_n$, to obtain the inner–outer iteration:
$$x_{n+1} = x_n - (H_n^{m_n - 1} + \cdots + H_n + I)B_n^{-1}F(x_n) \quad (n \ge 0), \qquad (3.11)$$
$$H_n = B_n^{-1}C_n, \qquad (3.12)$$
where $m_n$ is the number of inner iterations. Let us assume $m_n = m$ in iteration (3.11).
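The following is a minimal sketch of one outer step of (3.11)–(3.12); the splitting (a Jacobi-type choice $B_n = \operatorname{diag}(F'(x_n))$), the coupled test function, and $m = 3$ are all illustrative choices of ours.

```python
import numpy as np

def inner_outer_step(x, F, Fprime, split, m):
    """One step of (3.11)-(3.12): with F'(x) = B - C and H = B^{-1}C,
    x_new = x - (H^{m-1} + ... + H + I) B^{-1} F(x)."""
    B, C = split(Fprime(x))
    H = np.linalg.solve(B, C)     # H_n = B_n^{-1} C_n, (3.12)
    d = np.linalg.solve(B, F(x))  # B_n^{-1} F(x_n)
    s = d.copy()
    for _ in range(m - 1):        # Horner-style: s = (I + H + ... + H^{m-1}) d
        s = d + H @ s
    return x - s

p = 0.49
F = lambda x: np.array([x[0]**3 + 0.1*x[1] - p, x[1]**3 + 0.1*x[0] - p])
Fprime = lambda x: np.array([[3*x[0]**2, 0.1], [0.1, 3*x[1]**2]])
split = lambda M: (np.diag(np.diag(M)), np.diag(np.diag(M)) - M)

x = np.array([1.0, 1.0])
for _ in range(20):
    x = inner_outer_step(x, F, Fprime, split, m=3)
print(x, np.linalg.norm(F(x)))  # residual near machine precision
```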
We can obtain a result concerning the estimation of the number of inner iterations
under the conditions of Theorem 2.6:
Theorem 3.4 Under the hypotheses of Theorem 2.6, further assume:
$$\|B_0^{-1}F(x_0)\| \le q,$$
$$a_0 h^m + m\,b\,h^{m-1} \le \eta_n, \qquad \sup_n \|H_n\| \le h < 1,$$
where
$$a_0 = \frac{3 - 2\eta + 2\beta\gamma}{\eta^2}, \qquad b = \frac{2 - \eta}{\eta} + \frac{q(q + 1)\gamma_0(1 - \eta)^2}{2\gamma\,[1 - (1 - \eta)\gamma_0 q]^2} + \frac{1 - \eta}{\gamma}\,\beta; \qquad (3.13)$$
the matrix norm has the property
$$\|F'(x_0)^{-1}R\| \le \|F'(x_0)^{-1}S\|$$
for $R$ any submatrix of $S$;
$$U(x_0, t^\star) \subseteq D;$$
and the hypotheses of Lemma 2.1 or Lemma 2.3 hold.
Then the conclusions of Theorem 2.6 hold true for the inexact iteration (3.11).
Proof It follows exactly as in Corollary 3.3 in [10], and our Theorem 3.7 in [6]. Here
are the changes (with $\gamma_0$ replacing $\gamma$ in the proof):
$$\|F'(x_0)^{-1}F'(x_n)\| \le 1 + \gamma_0\|x_n - x_0\|,$$
$$\|F'(x_n)^{-1}F'(x_0)\| \le \frac{1}{1 - \gamma_0\|x_n - x_0\|},$$
$$\|F'(x_0)^{-1}F(x_n)\| \le \frac{\gamma}{2}\|x_n - x_0\|^2 + \|x_n - x_0\| + \beta,$$
$$\|F'(x_0)^{-1}(B_n - B_{n-1})\| \le \gamma\|x_n - x_{n-1}\|,$$
and
$$\|B_n^{-1}F'(x_0)\| \le \frac{q}{1 - \gamma_0\|x_n - x_0\|\,q}.$$
The corresponding constant defined in [10] (where $\gamma_0 = \gamma$) is larger than $b$, which is an
advantage of our approach for the selection of a smaller $\eta$, when $\gamma_0 < \gamma$.
Note that the hypotheses of Theorem 3.4 are simpler than the hypotheses of our
Theorem 3.7 in [6], and weaker than those of Corollary 3.3 in [10].
Hence, all the above justify the claims made.
Note that in the case $i = j$, the results can be provided in affine-invariant form
by simply replacing $F(x)$ by $F'(x_0)^{-1}F(x)$ for $x \in D_0$, and setting $\beta_0 = 1$. The
advantages of this approach have been explained in [5, 9].
Finally, our results immediately extend to the more general case of outer or
generalized inverses, by simply replacing the perturbation Lemma 2.5 by its analog in
[8, Lemma 2.2, p. 238] (see also [1–5]), and using the same approach as in this paper.
Note that the crucial majorizing sequence (2.5) remains the same in this new
setting. We leave the details to the motivated reader.
References
1. Argyros, I.K.: Convergence domains for some iterative processes in Banach spaces using outer or
generalized inverses. J. Comput. Anal. Appl. 1, 87–104 (1999)
2. Argyros, I.K.: On the Newton–Kantorovich hypothesis for solving equations. J. Comput. Appl. Math.
169, 315–332 (2004)
3. Argyros, I.K.: A convergence analysis for Newton-like methods for singular equations using outer or
generalized inverses. Appl. Math. 32, 37–49 (2005)
4. Argyros, I.K.: On the semilocal convergence of the Gauss–Newton method. Adv. Nonlinear Var. In-
equal. 8, 93–99 (2005)
5. Argyros, I.K.: Convergence and Applications of Newton-Type Iterations. Springer, New York (2008)
6. Argyros, I.K.: On the semilocal convergence of inexact methods in Banach spaces. J. Comput. Appl.
Math. 228, 434–443 (2009)
7. Ben-Israel, A.: A Newton–Raphson method for the solution of systems of equations. J. Math. Anal.
Appl. 15, 243–252 (1966)
8. Chen, X., Nashed, M.Z.: Convergence of Newton-like methods for singular operator equations using
outer inverses. Numer. Math. 66, 235–257 (1993)
9. Deuflhard, P., Heindl, G.: Affine invariant convergence theorems for Newton’s method and extensions
to related methods. SIAM J. Numer. Anal. 16, 1–10 (1979)
10. Guo, X.: On semilocal convergence of inexact Newton methods. J. Comput. Math. 25, 231–242
(2007)
11. Häußler, W.M.: A Kantorovich-type convergence analysis for the Gauss–Newton-method. Numer.
Math. 48, 119–125 (1986)
12. Hu, N., Shen, W., Li, C.: Kantorovich’s type theorems for systems of equations with constant rank
derivatives. J. Comput. Appl. Math. 219, 110–122 (2008)
13. Kantorovich, L.V., Akilov, G.P.: Functional Analysis in Normed Spaces. Pergamon, Oxford (1982)
14. Penrose, R.: A generalized inverse for matrices. Proc. Camb. Philos. Soc. 51, 406–413 (1955)