Journal article · Open Access · Apr 21, 2025

Enhancing Software Quality with AI: A Transformer-Based Approach for Code Smell Detection

Applied Sciences Vol. 15 No. 8 pp. 4559 · MDPI AG
DOI: 10.3390/app15084559
Abstract
Software quality assurance is a critical aspect of software engineering, directly impacting maintainability, extensibility, and overall system performance. Traditional machine-learning techniques, such as gradient boosting and support vector machines (SVM), have demonstrated effectiveness in code smell detection but require extensive feature engineering and struggle to capture intricate semantic dependencies in software structures. In this study, we introduce Relation-Aware BERT (RABERT), a novel transformer-based model that integrates relational embeddings to enhance automated code smell detection. By modeling interdependencies among software complexity metrics, RABERT surpasses classical machine-learning methods, achieving an accuracy of 90.0% and a precision of 91.0%. However, challenges such as low recall (53.0%) and computational overhead indicate the need for further optimization. We present a comprehensive comparative analysis between classical machine-learning models and transformer-based architectures, evaluating their computational efficiency and predictive capabilities. Our findings contribute to the advancement of AI-driven software quality assurance, offering insights into optimizing transformer-based models for practical deployment in software development workflows. Future research will focus on lightweight transformer variants, cost-sensitive learning techniques, and cross-language generalizability to enhance real-world applicability.
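The gap between the reported precision (91.0%) and recall (53.0%) can be summarized with the F1 score, the harmonic mean of the two. The abstract does not report F1, so the value computed below is a derived figure, not a result from the paper. The function name `f1_score` is a hypothetical helper written for this illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract (as fractions rather than percentages).
precision = 0.91
recall = 0.53

# Derived here for illustration; the abstract itself does not state F1.
f1 = f1_score(precision, recall)
miss_rate = 1 - recall  # share of true code smells that go undetected

print(f"F1 = {f1:.3f}")
print(f"Miss rate = {miss_rate:.2f}")
```

Under these numbers, roughly 47 of every 100 true code smells would go undetected, which is consistent with the abstract's mention of cost-sensitive learning as future work.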
Metrics
Citations: 3
References: 45
Details
Published: Apr 21, 2025
Vol/Issue: 15(8)
Pages: 4559
Cite This Article
Israr Ali, Syed Sajjad Hussain Rizvi, Syed Hasan Adil (2025). Enhancing Software Quality with AI: A Transformer-Based Approach for Code Smell Detection. Applied Sciences, 15(8), 4559. https://doi.org/10.3390/app15084559