journal article Open Access Jan 01, 2025

DRSE‐YOLO: Efficient and Lightweight Architecture for Accurate Waste Detection

DOI: 10.1049/ipr2.70022
Abstract
This paper introduces DRSE‐YOLO, an efficient waste detection model designed to balance detection accuracy against a lightweight design. The RCCA module in the model's neck enhances multi‐scale feature representation, improving detection performance. The DySample module performs upsampling through adaptive point sampling, reducing computational demands. The Slim‐Neck design is applied to selected convolutional layers and C2f modules to streamline the model. The ECC‐Head integrates asymmetric depth convolution, point convolution, and an attention mechanism, balancing accuracy with reduced parameters and computational load. Evaluated on a custom dataset comprising 46 waste classes and approximately 25,000 images, DRSE‐YOLO improves on YOLOv8n with a higher mAP@0.5 (+1.59%) and mAP@0.5:0.95 (+2.08%), alongside a reduced parameter count (2.43 M vs. 3.2 M) and fewer GFLOPs (5.8 vs. 8.2, a reduction of about 29.3%). These results underscore DRSE‐YOLO's efficiency and accuracy.
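The relative savings implied by the parameter and GFLOPs figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `relative_reduction` and the dictionary layout are assumptions made here, and the only inputs are the four numbers the abstract states.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the percentage by which `improved` is smaller than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100.0


# Figures quoted in the abstract (YOLOv8n baseline vs. DRSE-YOLO).
params_m = {"yolov8n": 3.2, "drse_yolo": 2.43}  # parameters, in millions
gflops = {"yolov8n": 8.2, "drse_yolo": 5.8}     # GFLOPs

param_cut = relative_reduction(params_m["yolov8n"], params_m["drse_yolo"])
gflops_cut = relative_reduction(gflops["yolov8n"], gflops["drse_yolo"])

print(f"parameters: {param_cut:.1f}% fewer than YOLOv8n")
print(f"GFLOPs:     {gflops_cut:.1f}% fewer than YOLOv8n")
```

Computing both ratios from the same baseline makes it easy to see which of the two savings a quoted percentage refers to.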
Details
Published: Jan 01, 2025
Vol/Issue: 19(1)
Funding: National Natural Science Foundation of China, Award 62001004
Cite This Article
Guangling Sun, Fenqi Zhang (2025). DRSE‐YOLO: Efficient and Lightweight Architecture for Accurate Waste Detection. IET Image Processing, 19(1). https://doi.org/10.1049/ipr2.70022