Performance Comparison of Neural Networks Training Algorithms in Time Series Forecasting

Authors

Diyar M. Khalil

DOI:

https://doi.org/10.24086/cuesj.v10n1y2026.pp6-11

Keywords:

Artificial Neural Network, Conjugate Gradient Descent, Deep Learning, Forecasting, Training Algorithm

Abstract

A research gap exists in forecasting: the algorithms available for training the Feed Forward Neural Network (FFNN) model have not been compared, and one of the main reasons is that it is hard to decide which training algorithm to choose. This study aims to fill that gap by identifying the best method for training the FFNN (both shallow and deep learning) to forecast a univariate monthly time series. Seven widely known training algorithms were compared for efficiency: the Quick Propagation Algorithm, the Conjugate Gradient Descent Algorithm, the Quasi-Newton Algorithm, the Limited Memory Quasi-Newton Algorithm, the Levenberg-Marquardt Algorithm, the Online Back Propagation Algorithm (Stochastic Gradient Descent), and the Batch Back Propagation Algorithm. The study was carried out using Alyuda NeuroIntelligence 2.2, and four statistical measures (MAPE, RMSE, MAE, and R²) were used for the comparison. The results indicate that Conjugate Gradient Descent outperforms the other algorithms, yielding the highest R² and the lowest RMSE, MAPE, and MAE, which points to its superior effectiveness in training the FFNN for forecasting monthly time series. In contrast, Batch Back Propagation was the least effective algorithm, with the lowest R² and the highest RMSE, MAPE, and MAE.
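The four measures named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions (MAPE as a percentage, R² as the coefficient of determination 1 - SS_res/SS_tot); the abstract does not state the exact formulas used in the study or by Alyuda NeuroIntelligence, and the function name `forecast_metrics` is hypothetical.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAPE (%), RMSE, MAE and R2 for two equal-length sequences.

    Assumed textbook definitions; not taken from the paper itself.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # Mean absolute error and root mean squared error
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Mean absolute percentage error, expressed in percent
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # Coefficient of determination: 1 - SS_res / SS_tot
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined for a constant actual series")
    r2 = 1.0 - ss_res / ss_tot

    return {"MAPE": mape, "RMSE": rmse, "MAE": mae, "R2": r2}


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study
    print(forecast_metrics([100, 200, 300, 400], [110, 190, 310, 390]))
```

Under these definitions a better-fitting model has a higher R² and lower MAPE, RMSE, and MAE, which is the direction of comparison the abstract reports.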

Author Biography

Diyar M. Khalil, Department of Mathematics, Faculty of Science, Soran University, Kurdistan Region, Iraq

Diyar M. Khalil is a lecturer in the Mathematics Department at Soran University, Kurdistan Region, Iraq. His scholarly interests and expertise span artificial neural networks, deep learning, applied statistics, time series forecasting, regression, and factor analysis.

Published

2026-01-01

How to Cite

Khalil DM. Performance Comparison of Neural Networks Training Algorithms in Time Series Forecasting. Cihan U Erbil SCI J [Internet]. 2026 Jan. 1 [cited 2026 Jun. 12];10(1):6-11. Available from: https://journals.cihanuniversity.edu.iq/index.php/cuesj/article/view/1679

Section

Research Article
