Journal article · Open Access · Jul 05, 2025

NadamClip: A Novel Optimization Algorithm for Improving Prediction Accuracy and Training Stability

Processes Vol. 13 No. 7 pp. 2145 · MDPI AG
DOI: 10.3390/pr13072145
Abstract
Accurate prediction of key environmental parameters is crucial for intelligent control and optimization, yet it remains challenging because deep learning models such as Long Short-Term Memory (LSTM) networks suffer from gradient instability during time series forecasting. This study introduces NadamClip, a novel adaptive optimization algorithm that integrates gradient clipping directly into the Nadam framework to address the trade-off between convergence efficiency and gradient explosion. NadamClip exposes an adjustable gradient clipping threshold that can be tuned manually. Through systematic experiments, we identified an optimal threshold range that balances model performance and training stability while accommodating the evolving convergence characteristics of the network across training phases. Treating aquaculture systems as analogous to modern biomanufacturing systems, we evaluated NadamClip on an aquaculture dataset, predicting ammonia concentration for environmental control. NadamClip achieved a Root Mean Square Error (RMSE) of 0.2644, a Mean Absolute Error (MAE) of 0.6595, and a Coefficient of Determination (R²) of 0.9743. Compared to existing optimizer enhancements, NadamClip pioneers the integration of gradient clipping with adaptive momentum estimation, departing from the traditional paradigm in which clipping serves primarily as an external training control rather than an intrinsic algorithmic component. The study provides a practical and reproducible optimization framework for intelligent modeling of dynamic process systems, contributing to the broader advancement of machine learning methods for predictive modeling and optimization in data-driven manufacturing and environmental processes.
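The idea of folding gradient clipping into the Nadam update can be illustrated with a minimal NumPy sketch. The abstract does not say whether clipping is by norm or by value, nor where in the update it is applied, so this sketch assumes L2-norm clipping of the raw gradient before the moment estimates are updated, and uses a simplified Nadam form without a momentum schedule. The names `clip_by_norm`, `nadam_clip_step`, and `clip_threshold`, as well as the default hyperparameters, are hypothetical and not taken from the paper.

```python
import numpy as np


def clip_by_norm(grad, threshold):
    """Rescale grad so that its L2 norm does not exceed threshold."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        return grad * (threshold / norm)
    return grad


def nadam_clip_step(theta, grad, m, v, t, lr=0.01, beta1=0.9,
                    beta2=0.999, eps=1e-8, clip_threshold=1.0):
    """One Nadam-style update with clipping applied inside the step.

    Assumption: the clipped gradient feeds both moment estimates.
    t is the 1-based step counter.
    """
    g = clip_by_norm(grad, clip_threshold)

    # Exponential moving averages of the (clipped) gradient and its square.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g

    # Nesterov-style look-ahead on the bias-corrected first moment.
    m_hat = (beta1 * m / (1 - beta1 ** (t + 1))
             + (1 - beta1) * g / (1 - beta1 ** t))
    v_hat = v / (1 - beta2 ** t)

    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v


# Toy usage: minimise f(x) = sum(x^2) from a start with large gradients.
theta = np.array([50.0, -30.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2.0 * theta  # gradient of sum(x^2)
    theta, m, v = nadam_clip_step(theta, grad, m, v, t, lr=0.1)

print(np.linalg.norm(theta))
```

In this arrangement the threshold bounds what enters the moment estimates, so a single exploding gradient cannot inflate the second-moment average for many later steps. This is one plausible reading of clipping as an intrinsic component rather than an external control.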