Livestock detection in pastoral areas using improved YOLOv5s

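    • Abstract: A key challenge for automated management in animal husbandry is accurately detecting livestock populations under large-scale grazing, so that their numbers can be determined and herd information updated in real time. Large-scale, automated livestock detection is affected by factors such as the environment and terrain, and current object detection algorithms often produce missed and false detections. This study designs a livestock detection algorithm, LDHorNet (livestock detect hor net), based on the YOLOv5s object detection network. Drawing on the recursive gated convolution of HorNet, a HorNB module is designed to improve the network model, enhancing the spatial interaction ability and detection accuracy of the algorithm. The CBAM (convolutional block attention module) attention mechanism is then embedded in the network structure to improve the detection accuracy and attention weights for small targets, and the Repulsion loss function is used to improve the recall and prediction accuracy of the detection network. Experimental results show that the proposed LDHorNet algorithm achieves a precision of 95.24% and a recall of 88.87%, with mean average precision mAP_0.5 and mAP_0.5:0.95 of 94.11% and 77.01%, respectively. Compared with YOLOv5s, YOLOv8s, and YOLOv7-Tiny, precision is higher by 2.83, 2.93, and 9.79 percentage points, recall by 6.66, 4.95, and 13.42 percentage points, and mAP_0.5:0.95 by 12.46, 5.26, and 20.97 percentage points, respectively. The algorithm detects livestock in small-target and occlusion scenarios better than the original and comparison algorithms, shows good robustness, and has broad application prospects.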
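
    The gains above are stated in percentage points, which are absolute differences, not relative improvements. The sketch below is only a worked arithmetic check: it subtracts each reported gain from the reported LDHorNet value to obtain the baseline score that the abstract implies, then derives the relative gain. The baseline values it prints are implied by this subtraction and are not figures reported in the abstract; the function names are hypothetical.

```python
# Reported LDHorNet metrics in percent, copied from the abstract.
LDHORNET = {"precision": 95.24, "recall": 88.87, "mAP_0.5:0.95": 77.01}

# Reported gains in percentage points over each baseline, copied from the abstract.
GAINS = {
    "YOLOv5s": {"precision": 2.83, "recall": 6.66, "mAP_0.5:0.95": 12.46},
    "YOLOv8s": {"precision": 2.93, "recall": 4.95, "mAP_0.5:0.95": 5.26},
    "YOLOv7-Tiny": {"precision": 9.79, "recall": 13.42, "mAP_0.5:0.95": 20.97},
}


def implied_baseline(model):
    """Baseline scores implied by the reported gains (derived, not reported)."""
    return {
        metric: round(LDHORNET[metric] - gain, 2)
        for metric, gain in GAINS[model].items()
    }


def relative_gain(model, metric):
    """Relative improvement in percent, as opposed to percentage points."""
    gain = GAINS[model][metric]
    baseline = LDHORNET[metric] - gain
    return round(100.0 * gain / baseline, 2)


if __name__ == "__main__":
    for name in GAINS:
        print(name, implied_baseline(name), relative_gain(name, "mAP_0.5:0.95"))
```

    Under this reading, a gain of 12.46 percentage points in mAP_0.5:0.95 over YOLOv5s corresponds to a noticeably larger relative improvement, because the implied baseline is well below 100%.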

       

      Abstract: Animal husbandry is increasingly adopting artificial intelligence (AI) as it grows in scale, informatization, and refinement. Collective cattle farms are expected to gradually replace small-scale modes such as individual farming, which poses an important challenge for automated management. Livestock populations under large-scale grazing and breeding must be identified accurately in order to determine their numbers and update population information in real time. However, large-scale automated livestock detection is difficult because of the complex environment of pastoral areas, where the view is often obstructed by trees, sunlight, and sandstorms. Current object detection algorithms often produce missed and false detections, which prevents accurate monitoring. In this article, a livestock detection algorithm called LDHorNet (livestock detection HorNet) was proposed on the basis of the YOLOv5s object detection network. Firstly, a HorNB module was designed to detect small targets at long distances and in complex environments. The recursive gated convolution in HorNB extends the second-order interactions of self-attention to arbitrary order. The HorNB module replaced the C3 module in the Backbone and Neck sections in order to improve the performance of the network model, realizing high-order spatial information interaction without introducing significant additional computation. The improved model better understands the relationships between different targets in an image and better captures the spatial structure and contextual information among them, which also improves its accuracy in locating and detecting small targets. Secondly, gradient accumulation was used to decrease the variance of the gradient, in order to enhance both the stability of the model and the efficiency of feature extraction. The CBAM (convolutional block attention module) attention mechanism was also embedded in the Neck section of the network structure to better capture the key information in the image. This improved the detection accuracy and attention weights for small targets and reduced the impact of lighting, wind-blown sand, noise, and livestock motion blur on detection in complex pastoral environments. In addition, the repulsion loss function was used to handle the overlapping occlusion of livestock in pastoral areas, which reduced missed detections and improved the recall and accuracy of the model. As a result, accurate detection was achieved in complex scenes with difficult lighting and environmental interference. The experimental results show that the LDHorNet model achieved a Precision and Recall of 95.24% and 88.87%, respectively, and an mAP_0.5 and mAP_0.5:0.95 of 94.11% and 77.01%, respectively. Compared with YOLOv5s, YOLOv8s, and YOLOv7-Tiny, Precision increased by 2.83, 2.93, and 9.79 percentage points, Recall by 6.66, 4.95, and 13.42 percentage points, mAP_0.5 by 3.98, 3.4, and 10.78 percentage points, and mAP_0.5:0.95 by 12.46, 5.26, and 20.97 percentage points, respectively. LDHorNet performed best among the compared models at detecting livestock in small-target and occlusion scenes. Therefore, this model can provide a strong reference for livestock detection under large-scale grazing conditions.
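
      To illustrate how a repulsion loss discourages missed detections under overlapping occlusion, the following is a minimal sketch of one repulsion term (often called RepGT in the published Repulsion Loss formulation): a predicted box is penalized for overlapping the non-target ground-truth box it intersects most. The abstract does not state which terms, weights, or smoothing parameter LDHorNet uses, so the function names and the default `sigma=0.5` are assumptions made for illustration only.

```python
import math


def area(box):
    """Area of an (x1, y1, x2, y2) box; zero if the box is degenerate."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection(a, b):
    """Area of the overlap between two boxes."""
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def iou(a, b):
    """Intersection over union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def iog(pred, gt):
    """Intersection over ground-truth area, bounded by 1."""
    gt_area = area(gt)
    return intersection(pred, gt) / gt_area if gt_area > 0 else 0.0


def smooth_ln(x, sigma=0.5):
    """Smoothed -ln(1 - x): logarithmic up to sigma, linear beyond it.

    sigma must lie in [0, 1); 0.5 is an illustrative assumption.
    """
    if x <= sigma:
        return -math.log(1.0 - x)
    return (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma)


def rep_gt_term(pred, gts, target_idx, sigma=0.5):
    """Penalty for overlapping the most-overlapped non-target ground truth."""
    others = [g for i, g in enumerate(gts) if i != target_idx]
    if not others:
        return 0.0
    repulsion_gt = max(others, key=lambda g: iou(pred, g))
    return smooth_ln(iog(pred, repulsion_gt), sigma)
```

      For example, `rep_gt_term((0, 0, 10, 10), [(0, 0, 10, 10), (5, 0, 15, 10)], target_idx=0)` penalizes a prediction that covers half of a neighbouring animal's box, while a prediction that touches no other ground truth incurs no penalty. In a full detector this term would be averaged over positive predictions and added to the usual box regression loss.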

       
