Abstract:
Detecting and tracking the basketball in game video helps coaches review gameplay. In such video streams, however, the You Only Look Once v5 (YOLOv5) algorithm discriminates poorly between the basketball and other small circular targets because the ball occupies only a small region of each frame. To address this issue, we propose several improvements to YOLOv5. First, we replace the original C3 module with a VoVNet C3 (V-C3) module to compensate for the limited features available for the basketball, and we validate the effectiveness of this change using Kullback-Leibler divergence. Second, we replace the Path Aggregation Network (PANet) with a Bridge Path Aggregation Network (BPANet) to better detect small basketball targets in the scene. Third, we construct a classification penalty mechanism to reduce false alarms between the basketball and similar targets. Finally, we examine how various parameters affect the performance of the basketball detection algorithm in order to determine the optimal parameter values and model structure. Experimental results show that the improved algorithm raises recognition accuracy by approximately 3% over the original YOLOv5 algorithm, increases average precision by about 2.4% on the COCO dataset, and reduces the algorithm's parameter size by about 5.3%. Together, these four strategies improve the detection accuracy of basketball targets in videos while reducing model complexity, and they offer a new approach for similar object detection tasks.
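
For reference, the Kullback-Leibler divergence used to validate the V-C3 module can be computed for two discrete distributions as in the minimal sketch below. The abstract does not state which distributions are compared, so the inputs here are only an assumption: two histograms over the same bins, for example binned feature activations from two model variants. The function name `kl_divergence` and the smoothing constant `eps` are illustrative and not taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(P || Q) in nats for two discrete distributions.

    p and q are sequences of non-negative weights over the same bins
    (hypothetical inputs, e.g. histograms of feature activations).
    Both are normalized to sum to 1 before the divergence is computed.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    p_sum, q_sum = sum(p), sum(q)
    if p_sum <= 0 or q_sum <= 0:
        raise ValueError("p and q must each have positive total weight")

    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i = p_i / p_sum
        # Clamp q_i away from zero so the logarithm stays finite (an assumption).
        q_i = max(q_i / q_sum, eps)
        if p_i > 0:  # bins with p_i == 0 contribute nothing
            total += p_i * math.log(p_i / q_i)
    return total


if __name__ == "__main__":
    # Identical histograms give zero divergence; differing ones give a positive value.
    print(kl_divergence([1, 1], [1, 1]))
    print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

The divergence is zero when the two distributions match and grows as they diverge. It is not symmetric, so KL(P || Q) and KL(Q || P) generally differ, and the choice of reference distribution should be stated alongside any reported value.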