Performance Comparison of Neural Networks Training Algorithms in Time Series Forecasting
DOI:
https://doi.org/10.24086/cuesj.v10n1y2026.pp6-11
Keywords:
Artificial Neural Network, Conjugate Gradient Descent, Deep Learning, Forecasting, Training Algorithm
Abstract
A research gap exists in forecasting: no comparison has been made between the different algorithms for training the Feed Forward Neural Network (FFNN) model, and it remains hard to decide which training algorithm to choose. This study aims to fill that gap by identifying the best method for training the FFNN (both shallow and deep learning) for univariate monthly time series forecasting. Seven widely known training algorithms were compared for efficiency: the Quick Propagation Algorithm, the Conjugate Gradient Descent Algorithm, the Quasi-Newton Algorithm, the Limited Memory Quasi-Newton Algorithm, the Levenberg-Marquardt Algorithm, the Online Back Propagation Algorithm (Stochastic Gradient Descent), and the Batch Back Propagation Algorithm. The study was carried out using Alyuda NeuroIntelligence 2.2. Four statistical measures (MAPE, RMSE, MAE, and R²) were used for the comparison. The results indicate that Conjugate Gradient Descent outperforms the other algorithms, yielding the highest R² and the lowest RMSE, MAPE, and MAE, which indicates that it was the most effective algorithm for training the FFNN to forecast monthly time series. Batch Back Propagation was the least effective algorithm, with the lowest R² and the highest RMSE, MAPE, and MAE.
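The four comparison measures named in the abstract can be illustrated with a minimal Python sketch. The function name `forecast_metrics` and the sample values are hypothetical, and the formulas are the common textbook definitions (R² taken as the coefficient of determination, 1 - SS_res / SS_tot); the abstract does not say how Alyuda NeuroIntelligence computes them, so treat this as an assumption and not as the study's exact procedure.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAPE (in percent), RMSE, MAE and R² for two equal-length sequences.

    Assumed textbook definitions; not taken from the study itself.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # Mean absolute error and root mean squared error.
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # MAPE divides by each actual value, so it is undefined
    # (ZeroDivisionError here) if any actual value is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # Coefficient of determination: 1 - SS_res / SS_tot.
    # Undefined if every actual value is identical (SS_tot == 0).
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAPE": mape, "RMSE": rmse, "MAE": mae, "R2": r2}


if __name__ == "__main__":
    # Hypothetical monthly values, for illustration only.
    observed = [100.0, 200.0, 400.0]
    forecast = [110.0, 190.0, 400.0]
    for name, value in forecast_metrics(observed, forecast).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions a better training algorithm shows a higher R² and lower MAPE, RMSE, and MAE on the same held-out data, which is the ranking criterion the abstract describes.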
License
Copyright (c) 2026 Diyar M. Khalil

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License [CC BY-NC-ND 4.0] that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).



