INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIII, Issue XI, November 2024
On Iterative Methods in Optimization
Henrietta Nkansah¹* & Bismark Kwao Nkansah²
¹Department of Mathematics, University of Cape Coast, Ghana
²Department of Statistics, University of Cape Coast, Ghana
*Corresponding Author
DOI : https://doi.org/10.51583/IJLTEMAS.2024.1311014
Received: 18 November 2024; Accepted: 30 November 2024; Published: 16 December 2024
Abstract: The paper highlights a failure in the implementation of a recommendation for the modified Newton's method, using Rosenbrock-type functions that exhibit slow convergence and have two minimum points as test functions. The study finds that a recommended procedure for the case in which the Hessian H(x_k) at a point is not positive definite may not lead to the desired optimal solution, particularly when the initial point is not close enough to the expected solution. It is demonstrated how to get around this problem. The results show that more than one technique may be required to determine all critical points of a given function.
Keywords: Optimization, Rosenbrock’s function, Modified Newton’s Method, Descent direction
I. Introduction
In some gradient-based unconstrained optimization techniques, a non-positive definite Hessian matrix of the problem function poses challenges for convergence of the solution. For Newton's method in particular, various modifications have been suggested (e.g., Fiacco & McCormick, 1967; Marquardt, 1963) that incorporate the Steepest Descent (SD) method to obtain a new direction that is more likely to lead to the minimum.
The convergence rate of an optimization problem may be influenced by a number of factors, including the role of the underlying optimization method; others involve the global or local nature of the convergence (Lewis & Nash, 2006). For studies on convergence, a typical test function of the type of the Rosenbrock function (Emiola & Adem, 2021) comes in handy for studying the robustness of gradient-based optimization algorithms.
In Section 2, we provide a description of the illustrative problem functions and review how the upper bound on the convergence rate, expressed as a function of the level of ill-conditioning, poses a challenge for the optimization process. It is known that obtaining the desired critical point depends on the initial point. We demonstrate in this paper that, with the appropriate search direction, it is possible to reach the desired optimal point even with a (reasonably) distant starting point and for a highly ill-conditioned problem function under Newton's method. In Section 3, Newton's method as well as the modified Newton's method is presented. In the process, the problem of interest of the study is highlighted, namely the failure in convergence in spite of known recommendations in the literature, and a remedy is prescribed. Throughout the implementations, we set a tolerance of 10^{-5} for which ∇f(x_k) ≈ [0; 0] for a function f: ℝ^p → ℝ. In Section 4, a summary of the proposed procedure is presented, followed by the conclusion.
II. Illustrative Functions and Effect of Ill-Conditioning on Convergence
Two main types of functions are used in this paper. The functions are selected to illustrate the effect of ill-conditioning on the
determination of optimal points.
Test Function I
We consider the problem of minimizing the function

f_1(x) = x_1^4 + x_1 x_2 + (1 + x_2)^2    (2.1)

The graph of f_1(x), which has an almost flat base, is given in Figure 1. The graph shows that locating a minimum of the function would pose a challenge.
Figure 1: Graph of f_1(x) = x_1^4 + x_1 x_2 + (1 + x_2)^2
Using the same starting point x_0 = [0; 0], Newton's method and the Conjugate Gradient method lead to two different critical points, given respectively as [-0.204128; 0.084945] and [0.69588; -1.34794]. The results show that f_1(x) has multiple critical points, which may be difficult to detect using a single method. The paper highlights the challenge of using the modified Newton's method to arrive at the same point as that obtained by the Conjugate Gradient method.
Test Function II
We consider the problem of minimizing the Rosenbrock function, given as

f_2(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2    (2.2)

using the steepest descent (SD) method. Table 1 gives some iterates in the application of SD to the function, using the starting point x_0 = [0; 0].
Table 1: Number of iterations and condition number of the Hessian of f_2(x)

Iterate (k)    x_k                      κ(H)
0              [0; 0]                   100.00
1              [0.15541; 0]             20.829
10             [0.31095; 0.08560]       60.578
100            [0.58593; 0.33975]       331.81
1000           [0.92741; 0.85909]       1653.2
From the table, there is a slow rate of convergence for f_2(x), which can be attributed to the large condition number κ(H). Even after 1000 iterations, the algorithm shows only slow convergence towards the exact minimum [1; 1]. The graph of f_2(x) over the grid [X, Y] = (-2:0.1:2, -2:0.1:2) is given in Figure 2.
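To make this behaviour easy to reproduce, the following Python sketch applies SD with an exact line search to f_2(x) from x_0 = [0; 0] and reports the condition number of the Hessian at selected iterates. The implementation details (the use of scipy's minimize_scalar for the line search and the iteration budget) are assumptions of this sketch rather than the authors' code, so the iterates may differ slightly from Table 1.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def f2(x):
    # Rosenbrock function, Equation (2.2)
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

def grad_f2(x):
    return np.array([-400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
                     200.0 * (x[1] - x[0]**2)])

def hess_f2(x):
    return np.array([[1200.0 * x[0]**2 - 400.0 * x[1] + 2.0, -400.0 * x[0]],
                     [-400.0 * x[0], 200.0]])

x = np.zeros(2)                       # starting point x0 = [0; 0]
report = {0, 1, 10, 100, 1000}        # iterates shown in Table 1
for k in range(1001):
    if k in report:
        print(k, x, np.linalg.cond(hess_f2(x)))
    d = -grad_f2(x)                   # steepest descent direction
    alpha = minimize_scalar(lambda a: f2(x + a * d)).x   # exact line search along d
    x = x + alpha * d
```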
Figure 2: Graph of the Rosenbrock’s Function
The slow rate of convergence of the function is explained by the long valley at the base of its surface, as shown in Figure 2. In spite of this, the function has a single minimum point.
The observations illustrate a major weakness of the SD method, namely that it is not appropriate when the Hessian matrix is ill-conditioned. They also show that the determination of a minimum point of f_1(x) could be more daunting than that of f_2(x). It will be seen in the next section that the implementation of SD for f_1(x) converges after a few iterations to a point in the neighbourhood of the initial point rather than to the desired optimum point.
Linear Convergence for the Case of a Quadratic Function
An algorithm exhibits linear convergence in the objective function values if there is a constant δ < 1 such that, for all k sufficiently large, the iterates x_k satisfy

\frac{f(x_{k+1}) - f(x^*)}{f(x_k) - f(x^*)} \le \delta    (2.3)

where x^* is some optimal point of the problem. By this statement, the optimality gap shrinks by at least the factor δ at each iteration, which gives a fast rate of convergence if δ is not close to 1. Consider the case in which the objective function f(x) is a simple quadratic function of the form

f(x) = \frac{1}{2} x^T A x - b^T x,

where A is a positive definite symmetric matrix, and suppose that the eigenvalues of A are λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_p > 0. The optimal solution of the problem is computed as x^* = A^{-1} b, and by direct substitution the optimal objective function value is

f(x^*) = -\frac{1}{2} b^T A^{-1} b.
Let x_k denote the current point in the SD algorithm and let d_k denote the current direction, given by

d_k = -\nabla f(x_k) = b - A x_k    (2.4)

To obtain the next iterate of the SD algorithm, we compute the step size α by considering

f(x_k + \alpha d_k) = \frac{1}{2}(x_k + \alpha d_k)^T A (x_k + \alpha d_k) - b^T(x_k + \alpha d_k) = f(x_k) - \alpha\, d_k^T d_k + \frac{1}{2}\alpha^2\, d_k^T A d_k.

Optimizing over the value of α therefore yields

\alpha = \frac{d_k^T d_k}{d_k^T A d_k},

and the next iterate is given as

x_{k+1} = x_k + \frac{d_k^T d_k}{d_k^T A d_k}\, d_k

and
f(x_{k+1}) = f(x_k) - \frac{1}{2} \frac{(d_k^T d_k)^2}{d_k^T A d_k}.
From the results above, we obtain

f(x_k) - f(x^*) = \frac{1}{2} x_k^T A x_k - b^T x_k + \frac{1}{2} b^T A^{-1} b = \frac{1}{2}(A x_k - b)^T A^{-1} (A x_k - b) = \frac{1}{2} d_k^T A^{-1} d_k.
Thus,

\frac{f(x_{k+1}) - f(x^*)}{f(x_k) - f(x^*)} = 1 - \frac{\frac{1}{2}(d_k^T d_k)^2 / (d_k^T A d_k)}{\frac{1}{2}\, d_k^T A^{-1} d_k} = 1 - \frac{(d_k^T d_k)^2}{(d_k^T A d_k)(d_k^T A^{-1} d_k)} = 1 - \frac{1}{\beta}    (2.5)

where

\beta = \frac{(d_k^T A d_k)(d_k^T A^{-1} d_k)}{(d_k^T d_k)^2}    (2.6)
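As a quick numerical check of Equations (2.5) and (2.6), the following Python snippet (an illustration added here, with an arbitrarily chosen positive definite A and vector b) performs one exact-line-search SD step on a quadratic and confirms that the observed reduction ratio equals 1 - 1/β.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
A = M @ M.T + 3.0 * np.eye(3)          # symmetric positive definite (illustrative choice)
b = rng.standard_normal(3)

f = lambda x: 0.5 * x @ A @ x - b @ x
x_star = np.linalg.solve(A, b)         # optimal point x* = A^{-1} b

x = np.array([1.0, -2.0, 0.5])         # arbitrary current iterate x_k
d = b - A @ x                          # d_k = -grad f(x_k), Equation (2.4)
alpha = (d @ d) / (d @ A @ d)          # exact step size
x_next = x + alpha * d

ratio = (f(x_next) - f(x_star)) / (f(x) - f(x_star))
beta = (d @ A @ d) * (d @ np.linalg.solve(A, d)) / (d @ d)**2   # Equation (2.6)
print(ratio, 1.0 - 1.0 / beta)         # the two values agree, confirming Equation (2.5)
```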
By the Kantorovich inequality, an upper bound on the value of β is given (Huang & Zhou, 2005) as

\beta \le \frac{(\lambda_1 + \lambda_p)^2}{4 \lambda_1 \lambda_p}.

Applying this inequality in Equation (2.5) then gives the convergence constant δ of Equation (2.3) as (Freund, 2004):

\frac{f(x_{k+1}) - f(x^*)}{f(x_k) - f(x^*)} \le 1 - \frac{4 \lambda_1 \lambda_p}{(\lambda_1 + \lambda_p)^2} = \left( \frac{\lambda_1/\lambda_p - 1}{\lambda_1/\lambda_p + 1} \right)^2 := \delta.
Since by definition κ(H) = λ_1/λ_p ≥ 1, if this ratio is large then the convergence constant δ will be close to (though slightly smaller than) 1, which slows down the convergence rate. Thus, a value of δ much smaller than 1 is desired.
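The convergence constant can be computed directly from the extreme eigenvalues of the Hessian. The short sketch below (added for illustration; the choice of evaluation point is taken from Table 1) evaluates λ_1, λ_p, the upper bound δ = ((λ_1/λ_p - 1)/(λ_1/λ_p + 1))^2 and κ(H) for the Rosenbrock Hessian.

```python
import numpy as np

def hess_f2(x):
    # Hessian of the Rosenbrock function f2 in Equation (2.2)
    return np.array([[1200.0 * x[0]**2 - 400.0 * x[1] + 2.0, -400.0 * x[0]],
                     [-400.0 * x[0], 200.0]])

def sd_rate_bound(H):
    """Upper bound on the SD convergence constant delta from the eigenvalues of H."""
    lam = np.linalg.eigvalsh(H)             # eigenvalues in ascending order
    lam_p, lam_1 = lam[0], lam[-1]          # smallest and largest eigenvalues
    kappa = lam_1 / lam_p                   # condition number of a symmetric PD matrix
    delta = ((kappa - 1.0) / (kappa + 1.0))**2
    return lam_1, lam_p, delta, kappa

# evaluate at the iterate reported for k = 1000 in Table 1
print(sd_rate_bound(hess_f2(np.array([0.92741, 0.85909]))))
```

For this point, the computed values are close to the last row of Table 2 below.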
Example
Consider f_2(x) in Equation (2.2). With a starting point of x_0 = [0; 0], the descent direction is d_0 = -∇f(x_0) = [2; 0]. Table 2 gives the sensitivity of the SD convergence rate to the eigenvalue ratio.
Table 2: Sensitivity of SD Convergence Rate to the Eigenvalue Ratio

λ_1           λ_p          Upper Bound on δ    k       κ(H)
237.21622     7.97875      0.87407             2       29.731
279.17358     4.60848      0.93609             10      60.578
476.63669     1.43646      0.98800             100     331.81
744.95842     0.66666      0.99634             500     1117.4
889.92511     0.53830      0.99757             1000    1653.2
From Table 2, κ(H) keeps increasing as the number of iterations k increases. The table also shows the effect of the relationship between κ(H) and the upper bound on δ on the convergence rate. We see that the convergence constant ranges from 0.87407 to 0.99757, implying that for functions of the type f_2(x), the convergence rate of SD could be extremely slow.
III. Methods and Illustrative Problem
Newton’s Method
The multi-dimensional Newton's method finds a stationary point of f: ℝ^p → ℝ by solving the non-linear system of equations ∇f(x) = 0. Alternatively, using the Taylor series for a function of several variables gives

f(x) \approx f(x_0) + \nabla f(x_0)^T (x - x_0) + \frac{1}{2}(x - x_0)^T H(x_0)(x - x_0)    (3.1)
where H(x) (also denoted ∇²f(x)) is the Hessian matrix. If we take x_0 to be x^*, then, since ∇f(x^*) is zero, we have

f(x) \approx f(x^*) + \frac{1}{2}(x - x^*)^T H(x^*)(x - x^*).

Since f(x^*) is the local minimum value of f, it must follow that

(x - x^*)^T H(x^*)(x - x^*) \ge 0,

at least for x near x^*. Using Newton's method to find the roots of ∇f(x) = 0, we arrive at the minimization method (assuming that the Hessian is positive definite)

x_{k+1} = x_k - H(x_k)^{-1} \nabla f(x_k).

Hence,

H(x_k)(x_{k+1} - x_k) = -\nabla f(x_k)    (3.2)
We seek now to:
1. solve Equation (3.2) for the step Δ_k = x_{k+1} - x_k;
2. compute x_{k+1} from x_{k+1} = x_k + Δ_k.
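These two steps translate directly into code. The following Python sketch (an illustration assumed here; the paper does not publish an implementation) carries out the Newton iteration of Equation (3.2) for f_1(x), solving H(x_k) Δ_k = -∇f(x_k) rather than forming the inverse explicitly, with the 10^{-5} tolerance on the gradient stated in Section 1.

```python
import numpy as np

def grad_f1(x):
    # gradient of f1(x) = x1^4 + x1*x2 + (1 + x2)^2, Equation (2.1)
    return np.array([4.0 * x[0]**3 + x[1], x[0] + 2.0 * (1.0 + x[1])])

def hess_f1(x):
    return np.array([[12.0 * x[0]**2, 1.0], [1.0, 2.0]])

def newton(x0, tol=1e-5, max_iter=50):
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad_f1(x)
        if np.linalg.norm(g, ord=np.inf) < tol:   # gradient approximately [0; 0]
            break
        delta = np.linalg.solve(hess_f1(x), -g)   # solve H(x_k) * delta_k = -grad f(x_k)
        x = x + delta                             # x_{k+1} = x_k + delta_k
    return x, k

print(newton([0.75, -1.25]))   # approaches [0.69588, -1.34794], as in Table 3
```

From the starting point [0; 0], the linear system is still solvable, but, as discussed in what follows, the Hessian there is indefinite and the resulting step is not a descent direction.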
The Illustrative Problem
We consider the minimization problem

\min f_1(x) = x_1^4 + x_1 x_2 + (1 + x_2)^2.

Using an initial guess of x_0 = [0.75; -1.25], we obtain ∇f(x_0) = [0.43750; 0.25000] and H(x_0) = [6.74999 1; 1 2]. By repeated use of Equation (3.2), we obtain the results in Table 3.
Table 3: Iterations for minimizing f_1(x) using Newton's method

k    x_k                      ∇f(x_k)
0    [0.75; -1.25]            [0.43750; 0.25000]
1    [0.70; -1.35]            [2.2 × 10^{-2}; 5.5511 × 10^{-12}]
2    [0.69591; -1.34796]      [1.3104 × 10^{-4}; 1.0 × 10^{-5}]
3    [0.69588; -1.34794]      [-2.3294 × 10^{-5}; 5.5511 × 10^{-12}]
Thus, from Table 3, ∇f(x_3) ≈ [0; 0] and H(x_3) = [5.81097 1; 1 1.99999] is positive definite. Thus, x_3 = [0.69588; -1.34794] is a minimum point of f_1(x). As noted in Section 2, the minimum point of this function obtained under the Conjugate Gradient method is the same as x_3 and is attained in just one iteration with the starting point x_0 = [0; 0], which is farther away from x_3 than the starting point x_0 = [0.75; -1.25] used in this case. This presents the problem of interest in this paper.
Now, suppose that x_0 = [0; 0] in this example. Then ∇f(x_0) = [0; 2] and H(x_0) = [0 1; 1 2], which is indefinite, so that the quadratic model at x_0 = [0; 0] has a saddle point rather than a minimum. This means that the expression -H(x_k)^{-1}∇f(x_k) in Newton's method fails to be a descent direction.
To ensure a descent direction, we make use of an alternative direction ν_k, the eigenvector corresponding to the negative eigenvalue of the Hessian matrix at x_k, in the SD method. That is,

x_{k+1} = x_k + \alpha_k \nu_k    (3.3)

Using this method, we compute the various iterates for given values of the step size α_k and direction ν_k. These iterations are given in Table 4.
Table 4: Optimization of f_1(x) for various step sizes under the SD method

k    α_k            ν_k                      x_k
0    0.21571        [-0.92388; 0.38268]      [0; 0]
1    0.0051142      [-0.89608; 0.44388]      [-0.199292; 0.082549]
2    0.00024236     [-0.89451; 0.44704]      [-0.203875; 0.084819]
3    0.000040420    [-0.89444; 0.44719]      [-0.204092; 0.084927]
4    0.000040420    [-0.89443; 0.44722]      [-0.204128; 0.084945]
5    0.000040420    [-0.89443; 0.44722]      [-0.204128; 0.084945]
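For concreteness, the direction used in Equation (3.3) at x_0 = [0; 0] can be computed as follows. This Python sketch (an illustration, not the authors' code) extracts a unit eigenvector of H(x_0) associated with its negative eigenvalue and forms one update of Equation (3.3); the orientation of the eigenvector and the step size are taken to match the first row of Table 4.

```python
import numpy as np

def hess_f1(x):
    # Hessian of f1(x) = x1^4 + x1*x2 + (1 + x2)^2
    return np.array([[12.0 * x[0]**2, 1.0], [1.0, 2.0]])

x0 = np.zeros(2)
H0 = hess_f1(x0)                       # [[0, 1], [1, 2]], an indefinite matrix
lam, V = np.linalg.eigh(H0)            # eigenvalues in ascending order
print(lam[0])                          # negative eigenvalue, 1 - sqrt(2)
nu = V[:, 0]                           # associated unit eigenvector, +/-[0.92388, -0.38268]

nu = -nu if nu[0] > 0 else nu          # orient as in Table 4: [-0.92388, 0.38268]
alpha0 = 0.21571                       # step size reported in Table 4 for k = 0
x1 = x0 + alpha0 * nu                  # Equation (3.3)
print(x1)                              # approximately [-0.19929, 0.08255], the k = 1 iterate
```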
We see from Table 4 that the iterations converge after four iterations to the critical point [-0.204128; 0.084945]. It is interesting to note that this minimum point is different from [0.69588; -1.34794], obtained under the Conjugate Gradient method after only one iteration and under Newton's method after three iterations using the initial point [0.75; -1.25].
The situation encountered so far is that the Hessian at a point is indefinite, and therefore the Newton iteration cannot continue. At this point, the steps taken to obtain a positive definite Hessian essentially combine Newton's method with the SD method. It should be noted that even though this combined approach (McCormick & Fiacco, 1967) produces a solution, this solution is not the one intended (see Table 3). To achieve the desired result, we first review the modified Newton's method.
Modified Newton’s Method
In Newton's method,

x_{k+1} = x_k - H(x_k)^{-1} \nabla f(x_k),

the expression

s_k = -H(x_k)^{-1} \nabla f(x_k)    (3.4)

is a descent direction if H(x_k)^{-1} is positive definite. The algorithm for the damped Newton's method is

x_{k+1} = x_k - \alpha_k H(x_k)^{-1} \nabla f(x_k)    (3.5)

for some damping sequence \{\alpha_k\}_{k \ge 0}, 0 < \alpha_k \le 1, with \alpha_k \to 1 as k \to \infty.
When -H(x_k)^{-1}∇f(x_k) is not a descent direction, a modified Newton's method substitutes a descent direction for -H(x_k)^{-1}∇f(x_k). From Equation (3.5), the modified Newton's method becomes the SD method

x_{k+1} = x_k - \alpha_k \nabla f(x_k),    (3.6)

obtained by taking H(x_k)^{-1} = I, the identity matrix. In most cases, it is still possible to compute s_k from Equation (3.4) and to search along ±s_k, the sign being chosen to ensure a descent direction.
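The sign rule just described can be written compactly: compute s_k from Equation (3.4) and flip its sign whenever ∇f(x_k)^T s_k > 0. A minimal Python sketch of one damped, sign-corrected step of Equation (3.5) is given below; the default damping value and the generic grad and hess callables are assumptions of this sketch.

```python
import numpy as np

def damped_newton_step(x, grad, hess, alpha=1.0):
    """One step of Equation (3.5) with the sign of s_k chosen to ensure descent."""
    g = grad(x)
    H = hess(x)
    s = np.linalg.solve(H, -g)        # s_k = -H(x_k)^{-1} grad f(x_k), Equation (3.4)
    if g @ s > 0:                     # s_k points uphill: search along -s_k instead
        s = -s
    return x + alpha * s              # x_{k+1} = x_k + alpha_k * s_k
```

Here grad and hess are callables returning ∇f(x) and H(x). When H(x_k) is singular the linear solve fails, which is one of the objections discussed next.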
A theoretical argument against the generalized Newton's method is that if H(x_k) is not a positive definite matrix, a move in the direction given by Equation (3.5), where α_k > 0 is chosen to minimize f along s_k in Equation (3.4) starting from x_k, may result in an increase rather than a decrease in f(x), yielding x_{k+1} = x_k, which terminates the process at x_k. Another objection is that H(x_k) may not have an inverse, even if f(x) is convex. The modified second-order method takes these two objections into account. The direction vector s_k is generated according to two rules (McCormick & Fiacco, 1967). In both cases, x_{k+1} = x_k + α_k s_k, where α_k is chosen to be the smallest value of α for which x_k + α s_k gives a local minimum of f along s_k. The rules are as follows:
1. If H(x_k) has a negative eigenvalue, let s_k be a vector such that s_k^T H(x_k) s_k < 0 and s_k^T ∇f(x_k) ≤ 0.
2. If H(x_k) has all eigenvalues greater than or equal to zero, choose s_k such that either s_k^T H(x_k) s_k = 0 and s_k^T ∇f(x_k) < 0, or

H(x_k) s_k = -\nabla f(x_k)    (3.7)
The rationale for Rule 1 is that if the second partial derivative matrix has a negative eigenvalue, there are certain directions along which the function f(x) decreases and along which the rate of decrease itself also decreases. Now, to obtain s_k, we use the factorization

H(x_k) = L D L^T    (3.8)

where L is a non-singular lower triangular matrix and D = diag(d_1, d_2, ..., d_p). The conditions for the factorization are as follows:
1. If D has all positive diagonal elements, solve s_k = -H(x_k)^{-1} ∇f(x_k).
2. If D has all non-negative diagonal elements, and at least one is zero, the vector s_k is generated according to the second rule above.
3. If D has some diagonal elements that are negative, solve L^T v = a, where a_k = 1 if d_k < 0 and a_k = 0 otherwise, and let s_k = v if v^T ∇f(x_k) ≤ 0 and s_k = -v otherwise.
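The three factorization-based conditions can be sketched in Python as follows. This is an illustrative reconstruction, not the authors' code: it uses scipy.linalg.ldl, treats the returned block-diagonal factor as if its pivots were all 1×1, and substitutes an eigenvector computation for the zero-curvature case in the second condition.

```python
import numpy as np
from scipy.linalg import ldl

def modified_direction(H, g, tol=1e-12):
    """Choose a search direction s_k from H = H(x_k) and g = grad f(x_k)
    following the three factorization conditions (simplified sketch)."""
    lu, d, perm = ldl(H, lower=True)        # H = lu @ d @ lu.T
    pivots = np.diag(d)
    if np.all(pivots > tol):                # condition 1: all pivots positive
        return np.linalg.solve(H, -g)       # s_k = -H(x_k)^{-1} grad f(x_k)
    if np.all(pivots > -tol):               # condition 2: non-negative with some zero
        lam, V = np.linalg.eigh(H)
        v = V[:, 0]                         # direction of (near-)zero curvature
        return -v if g @ v > 0 else v
    a = (pivots < -tol).astype(float)       # condition 3: a_k = 1 where d_k < 0
    v = np.linalg.solve(lu.T, a)            # solve L^T v = a
    return -v if g @ v > 0 else v           # s_k = v if v^T grad f <= 0, else -v
```

In the illustrative problem below, H(x_0) = [0 1; 1 2] has one negative eigenvalue, so a factorization with 1×1 pivots exposes a negative diagonal entry and the third condition is the relevant one.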
Other forms of modification are to:
1. Modify the Newton search direction by giving it a bias towards the steepest descent vector -∇f(x_k) (Levenberg, 1944; Marquardt, 1963). This is achieved by adding a scalar multiple γ of the unit matrix to H(x_k) and solving the system

(H(x_k) + \gamma I) s_k = -\nabla f(x_k)

to obtain s_k, where γ is chosen so that the matrix H(x_k) + γI is positive definite (see the sketch after this list).
2. Modify the Hessian matrix (Murray, 1972; Hebden, 1973) in the form H(x_k) + D, where D is diagonal, and use it to determine the search direction. The modification occurs as the matrix is being factorized.
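A minimal sketch of the first of these modifications (the Levenberg–Marquardt-type shift) is given below; the starting value and growth rule for γ are assumptions made here for illustration, as the text does not prescribe them.

```python
import numpy as np

def lm_direction(H, g, gamma0=1e-3, growth=10.0, max_tries=25):
    """Return s_k solving (H + gamma*I) s_k = -g, with gamma increased
    until H + gamma*I is positive definite (checked via Cholesky)."""
    gamma = 0.0
    for _ in range(max_tries):
        try:
            L = np.linalg.cholesky(H + gamma * np.eye(H.shape[0]))
            y = np.linalg.solve(L, -g)        # forward substitution
            return np.linalg.solve(L.T, y)    # back substitution
        except np.linalg.LinAlgError:
            gamma = gamma0 if gamma == 0.0 else gamma * growth
    raise RuntimeError("could not make H + gamma*I positive definite")
```

For the indefinite Hessian H(x_0) = [0 1; 1 2] of the illustrative problem, any γ larger than √2 − 1 ≈ 0.414 makes H(x_0) + γI positive definite.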
The Illustrative Problem Continued
For f_1(x) with a starting point of x_0 = [0; 0], it has already been found that H(x_0) is indefinite and that s_k = -H(x_k)^{-1}∇f(x_k) is not a descent direction. Therefore, we look for a new direction v. The Newton's method then becomes

x_{k+1} = x_k + \alpha_k v_k.

To obtain v, we factorize the Hessian matrix H(x_0) as in Equation (3.8) to obtain D = [2 0; 0 -0.5].
Now, because a diagonal entry of D is negative, we use the third condition to obtain v, and hence the required descent direction s_0. Using the SD algorithm to compute the first iterate, we obtain x_1 = [-0.20363; 0.10181] with the Hessian H(x_1) = [0.49758 1; 1 1.99999], which is not positive definite. The process therefore cannot continue. Thus, the recommendation of Fiacco and McCormick (1967) ceases to work.
In order to overcome this problem, we need to search along a different direction. We observe that if we choose x_2 = x_1 - λ∇f(x_1) = [-0.23656; -0.86608], for a suitable step length λ, the Hessian H(x_2) = [0.67150 1; 1 2] is positive definite. We then switch back to Newton's method to continue with the process. Table 5 shows the iterations of the process.
Table 5: Optimization process for f_1(x) using the Modified Newton's method

Method      k     x_k                      ∇f(x_k)                                    f(x_k)
            0     [0; 0]                   [0; 2]                                     1
Steepest    1     [-0.20363; 0.10181]      [0.068041; 2]                              1.1950
Newton      2     [-0.23656; -0.86608]     [-0.919032; 0.031278]                      0.22594
Newton      3     [5.2134; -3.6067]        [563.17; 8.3901 × 10^{-6}]                 726.70
Newton      4     [3.4840; -2.7415]        [166.41; 9.8662 × 10^{-4}]                 140.81
Newton      5     [2.3375; -2.1688]        [48.921; 1.9252 × 10^{-6}]                 26.153
Newton      6     [1.5857; -1.7928]        [14.156; 7.9298 × 10^{-6}]                 4.1081
Newton      7     [1.1086; -1.5543]        [3.8962; -2.6482 × 10^{-7}]                0.09475
Newton      8     [0.83521; -1.4176]       [0.91289; 2.2768 × 10^{-7}]                -0.52299
Newton      9     [0.71923; -1.35961]      [0.12858; 3.8630 × 10^{-7}]                -0.58096
Newton      10    [0.69670; -1.34835]      [0.0043342; 6.8773 × 10^{-8}]              -0.58244
Newton      11    [0.69589; -1.34794]      [5.5268 × 10^{-6}; 1.8097 × 10^{-9}]       -0.58245
Newton      12    [0.69588; -1.34794]      [-5.5511 × 10^{-12}; 5.5511 × 10^{-12}]    -0.58245
It can be observed from the table that after twelve iterations, the solution converges to x_12 = [0.69588; -1.34794]. This solution is the same as the one obtained with the closer initial point x_0 = [0.75; -1.25] using Newton's method.
IV. Summary and Conclusion
In this section, we present the summary and conclusion of the study. The summary provides the conceptualization of the procedure for correcting the failure in the convergence of Newton's method.
Summary
Suppose that the Hessian H(x_k) is not positive definite. Then s_k = -H(x_k)^{-1}∇f(x_k) is not a descent direction. Consider the factorization H(x_k) = LDL^T, where L is a non-singular lower triangular matrix and D = diag(d_1, d_2, ..., d_p). Suppose some elements of D are negative, and let v solve L^T v = a, where a_k = 1 if d_k < 0 and a_k = 0 otherwise. If the step x_{k+1} = x_k + α_k v_k does not provide a positive definite Hessian H(x_{k+1}), then -α_k ∇f(x_k) could provide a descent direction for the Newton's method.
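The procedure summarised above can be sketched end to end as follows. This is an illustrative reconstruction under the stated description rather than the authors' code: it goes directly to the gradient fallback -α_k∇f(x_k) whenever the Hessian is not positive definite, and the line search, tolerance and iteration limit are assumptions of the sketch.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def f1(x):
    return x[0]**4 + x[0] * x[1] + (1.0 + x[1])**2

def grad_f1(x):
    return np.array([4.0 * x[0]**3 + x[1], x[0] + 2.0 * (1.0 + x[1])])

def hess_f1(x):
    return np.array([[12.0 * x[0]**2, 1.0], [1.0, 2.0]])

def is_positive_definite(H):
    return bool(np.all(np.linalg.eigvalsh(H) > 0))

def modified_newton(x0, tol=1e-5, max_iter=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f1(x)
        if np.linalg.norm(g, ord=np.inf) < tol:      # gradient approximately [0; 0]
            break
        H = hess_f1(x)
        if is_positive_definite(H):
            x = x + np.linalg.solve(H, -g)           # ordinary Newton step, Equation (3.2)
        else:
            d = -g                                   # fall back to a steepest descent step
            alpha = minimize_scalar(lambda a: f1(x + a * d)).x
            x = x + alpha * d
    return x

print(modified_newton([0.0, 0.0]))   # approaches [0.69588, -1.34794] in this sketch
```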
Conclusion
The study has examined a failure in the implementation of some of the procedures and algorithms that are used in unconstrained
optimization techniques. These methods are examined in the light of known highly ill-conditioned functions.
The study finds that a recommended modification to Newton's method for the case in which the Hessian H(x_k) is not positive definite may not lead to the desired optimal solution, particularly when the initial point is not close enough to the expected solution. It has therefore been demonstrated how to get around the problem. The results show that an optimal solution of a function should be confirmed by using multiple initial points that are not too close to one another.
References
1. Emiola, I., & Adem, R. (2021). Comparison of Minimization Methods for Rosenbrock Functions. arXiv:2101.10546
[math.OC].
2. Freund, R. M. (2004). The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line Search
Method. Massachusetts Institute of Technology.
3. Hebden, M. D. (1973). An Algorithm for Minimization using Exact Second Derivatives. AERE Harwell Report, TP515.
4. Huang, J., & Zhou, J. (2005). A direct proof and a generalization for a Kantorovich type inequality. Linear Algebra and its Applications, 397, 185-192.
5. Levenberg, K. (1944). A Method for the Solution of Certain Nonlinear Problems in Least Squares. Quart. Appl. Math., 2, 164-168.
6. Lewis, R. M., & Nash, S. G. (2006). Factors Affecting the Performance of Optimization-based Multigrid Methods. In: Hager, W. W., Huang, S. J., Pardalos, P. M., & Prokopyev, O. A. (eds), Multiscale Optimization Methods and Applications. Nonconvex Optimization and Its Applications, 82. Springer, Boston, MA.
7. Marquardt, D. W. (1963). An Algorithm for Least Squares Estimation of Non-linear Parameters. SIAM J., 11, 431-441.
8. McCormick, G. P., & Fiacco, A. V. (1967). Nonlinear programming. John Wiley, New York.
9. Murray, W. (1972). Second Derivative Methods in Numerical Methods for Unconstrained Optimization. Academic
Press, London.