Open Access | Just Accepted

MFF-YOLO: An improved YOLO algorithm based on multi-scale semantic feature fusion 

1 Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China.

2 School of Information Engineering, Huzhou University, Huzhou 313000, China. 

3 Hebei Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding 071002, China.

4 Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China, and also with National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China. 

 


Abstract

The YOLOv5 algorithm is widely used in edge computing systems for object detection. However, the limited computing resources of embedded devices and the large model size of existing deep learning-based methods increase the difficulty of real-time object detection on edge devices. To address this issue, we propose a smaller, less computationally intensive, and more accurate algorithm for object detection. Multi-scale feature fusion-YOLO (MFF-YOLO) is built on the YOLOv5s framework but improves on it substantially. First, we design the MFF module to improve the feature propagation path in the feature pyramid, which further integrates the semantic information from different paths of feature layers. Then, a large convolution kernel module is used in the bottleneck. This structure enlarges the receptive field and preserves shallow semantic information, which overcomes the performance limitation arising from uneven propagation in feature pyramid networks (FPN). In addition, a multi-branch downsampling method based on depthwise separable convolutions and a bottleneck structure with deformable convolutions are designed to reduce the complexity of the backbone network and to minimize the loss of real-time performance caused by the increased model complexity. Experimental results on the PASCAL VOC and MS COCO datasets show that, compared with YOLOv5s, MFF-YOLO reduces the number of parameters by 7% and the number of FLOPs by 11.8%. The mAP@0.5 improves by 3.7% and 5.5%, and the mAP@0.5:0.95 improves by 6.5% and 6.2%, on the two datasets respectively. Furthermore, compared with YOLOv7-tiny, PP-YOLO-tiny, and other mainstream methods, MFF-YOLO achieves better results on multiple indicators.
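To illustrate why depthwise separable convolutions can reduce backbone complexity, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise separable one. It is not the authors' downsampling module: the function names and the layer sizes (128 input channels, 256 output channels, a 3x3 kernel) are assumptions chosen only for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 128, 256, 3
std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)
ratio = dws / std  # equals 1 / c_out + 1 / k**2

print(f"standard: {std}, depthwise separable: {dws}, ratio: {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly 11.5% of the weights of the standard layer. The overall reductions reported in the abstract are smaller because they apply to the whole network and are offset by the added modules.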

Tsinghua Science and Technology
Cite this article:
Zhang J, Xu C, Shen S, et al. MFF-YOLO: An improved YOLO algorithm based on multi-scale semantic feature fusion. Tsinghua Science and Technology, 2024, https://doi.org/10.26599/TST.2024.9010097