TY - JOUR
T1 - Parameter conjugate gradient with secant equation based Elman neural network and its convergence analysis
AU - Fan, Qinwei
AU - Zhang, Zhiwen
AU - Huang, Xiaodi
N1 - Publisher Copyright:
© 2022 Wiley-VCH GmbH.
PY - 2022/9
Y1 - 2022/9
N2 - The Elman neural network (ENN) is a local recurrent network with a feedback mechanism. The parameter conjugate gradient method is a promising alternative to the gradient descent method because of its faster convergence, which results from searching along a conjugate descent direction with an adaptive step size obtained via the Wolfe conditions. However, challenges remain, such as avoiding the sawtooth phenomenon in gradient algorithms so as to better capture the second-order curvature of the objective function. Accordingly, this paper presents a novel parametric conjugate gradient method based on the secant equation for effectively training ENNs. A rigorous proof of the theoretical convergence of the proposed algorithm is provided in detail; in particular, the weak convergence and strong convergence of the algorithm, as well as the monotonicity of the error function, are proved. In addition to the theoretical analysis, three numerical experiments were conducted by applying the algorithm to classification, regression, and function approximation problems on nine real-world datasets. The experimental results demonstrate the feasibility of the proposed algorithm and the correctness of the theoretical analysis.
AB - The Elman neural network (ENN) is a local recurrent network with a feedback mechanism. The parameter conjugate gradient method is a promising alternative to the gradient descent method because of its faster convergence, which results from searching along a conjugate descent direction with an adaptive step size obtained via the Wolfe conditions. However, challenges remain, such as avoiding the sawtooth phenomenon in gradient algorithms so as to better capture the second-order curvature of the objective function. Accordingly, this paper presents a novel parametric conjugate gradient method based on the secant equation for effectively training ENNs. A rigorous proof of the theoretical convergence of the proposed algorithm is provided in detail; in particular, the weak convergence and strong convergence of the algorithm, as well as the monotonicity of the error function, are proved. In addition to the theoretical analysis, three numerical experiments were conducted by applying the algorithm to classification, regression, and function approximation problems on nine real-world datasets. The experimental results demonstrate the feasibility of the proposed algorithm and the correctness of the theoretical analysis.
KW - conjugate gradient
KW - Elman
KW - secant equation
KW - Wolfe condition
UR - http://www.scopus.com/inward/record.url?scp=85134053512&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85134053512&partnerID=8YFLogxK
U2 - 10.1002/adts.202200047
DO - 10.1002/adts.202200047
M3 - Article
SN - 2513-0390
VL - 5
JO - Advanced Theory and Simulations
JF - Advanced Theory and Simulations
IS - 9
M1 - 2200047
ER -