Pig behavior recognition method based on feature point detection

    • Abstract: Pig behavior is closely related to pig health, and the body feature information generated during pigs' daily activities reflects their behavioral state. Because pig posture varies widely, existing pig feature extraction methods are complex and inefficient, which degrades recognition performance. To address this, this study built a pig feature point detection model, YOLO-ASF-P2, to extract the time-series information of feature points at key body parts of pigs, and then used this time-series information to build a pig behavior recognition model, CNN-BiGRU, that recognizes three pig behaviors: sitting, standing, and lying. YOLO-ASF-P2 takes YOLOv8s-Pose as its baseline, exploits the high-resolution feature map of the P2 layer in the backbone network to mine more target features, and improves the feature fusion stage with the ASF architecture. First, the scale sequence feature fusion module (SSFF) aligns feature maps of different scales, improving the model's ability to fuse multi-scale feature information; second, the triple feature encoding module (TFE) balances the details of multi-scale feature information so that target features are not lost; finally, the channel and position attention mechanism (CPAM) captures the spatial information of the feature points for accurate detection of pig feature points. CNN-BiGRU uses bidirectional gated recurrent units and an attention mechanism to flexibly capture and weight the time-series information of the pig feature points, effectively combining their temporal features to recognize pig behavior efficiently. Experiments showed that YOLO-ASF-P2 achieved a detection precision of 92.5%, a recall of 90%, and an average precision (AP50-95) of 68.2%, with 39.6 GFLOPs of computation and 18.4M parameters. The CNN-BiGRU model achieved an average recognition accuracy of 96% for the sitting, standing, and lying behaviors, with 27.1 GFLOPs and a parameter volume of 155 kB. In summary, the proposed pig feature point detection model is accurate and lightweight, effectively coping with the challenge that variable pig postures pose to accurate feature point detection; the behavior recognition model, by combining the time-domain information of pig feature points, effectively recognizes the sitting, standing, and lying behaviors of pigs, providing a new approach to pig behavior recognition.
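    To make the recognition stage concrete, the following is a minimal PyTorch sketch of a CNN-BiGRU classifier over keypoint time series, reflecting the description above (1-D convolutions for local temporal features, a bidirectional GRU for forward and backward context, and an attention layer that weights the GRU outputs before classification). The keypoint count, window length, and layer sizes are illustrative assumptions, not the configuration reported in the paper.

    import torch
    import torch.nn as nn

    class CNNBiGRU(nn.Module):
        # Hypothetical sketch: 8 keypoints with (x, y) coordinates, 30-frame
        # windows, 3 classes (sit / stand / lie); all sizes are assumptions.
        def __init__(self, num_keypoints=8, hidden=64, num_classes=3):
            super().__init__()
            in_ch = num_keypoints * 2                     # (x, y) per keypoint
            self.cnn = nn.Sequential(                     # local temporal features
                nn.Conv1d(in_ch, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv1d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            )
            self.bigru = nn.GRU(64, hidden, batch_first=True, bidirectional=True)
            self.attn = nn.Linear(2 * hidden, 1)          # scalar score per frame
            self.fc = nn.Linear(2 * hidden, num_classes)

        def forward(self, x):                 # x: (batch, frames, keypoints*2)
            h = self.cnn(x.transpose(1, 2))   # (batch, 64, frames)
            h, _ = self.bigru(h.transpose(1, 2))          # (batch, frames, 2*hidden)
            w = torch.softmax(self.attn(h), dim=1)        # attention weights over time
            return self.fc((w * h).sum(dim=1))            # weighted summary -> logits

    logits = CNNBiGRU()(torch.randn(4, 30, 16))           # 4 clips, 30 frames each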

       

      Abstract: With advances in artificial intelligence, automation, deep learning, and related technologies, modern pig farming has moved toward intensive and intelligent production. The integration of machine vision and deep learning has enabled non-invasive individual identification and behavior monitoring, providing an effective tool for refined breeding. The characteristic information generated by pigs during daily activities is crucial for recognizing and analyzing pig behavior. However, because pig posture changes frequently, existing pig feature extraction methods are complex and inefficient. To address these issues, this study proposed a pig keypoint detection model, YOLO-ASF-P2, which focuses on key areas of the pig's body to extract feature point information, and a pig behavior recognition model, CNN-BiGRU, which combines temporal information from the keypoints to identify pig behavior. First, video and image data of pigs were collected by multi-angle cameras deployed in the pig house, forming a pig feature point detection dataset and a pig behavior recognition dataset. To address the complex calculations, redundant feature information, and poor robustness of traditional pig feature extraction methods, the YOLOv8s-Pose model was improved into the YOLO-ASF-P2 model. This model utilized the small-target feature information of the P2 detection layer and combined the attention scale sequence fusion (ASF) architecture to focus on the key feature points of live pigs. The scale sequence feature fusion (SSFF) module used a Gaussian kernel and nearest-neighbor interpolation to align multi-scale feature maps with different downsampling rates (the P2, P3, P4, and P5 detection layers) to the resolution of the high-resolution feature map, ensuring comprehensive information extraction. The triple feature encoding (TFE) module captured local fine details of small targets and fused local and global feature information. The channel and position attention mechanism (CPAM) module captured and refined the spatial positioning information of small targets, effectively extracted the important feature information contained in the different channels of the feature map, and improved the positioning accuracy of the model. The CNN-BiGRU pig behavior recognition model used bidirectional gated recurrent units (BiGRU) to capture the forward and backward information of sequence data, with the output weighted by an attention mechanism module (AttentionBlock). On the self-built dataset, the average recognition accuracy of the model for the three behaviors of sitting, standing, and lying reached 96%, demonstrating good and stable performance. The detection precision of YOLO-ASF-P2 reached 92.5%, the recall was 90%, and the average precision (AP50-95) was 68.2%, with only 18.4M parameters and 39.6 GFLOPs of computation. Precision, recall, AP50-95, and computation were 1.1%, 2.3%, 1.5%, and 32.9% higher than those of the original model, respectively, while the parameter volume was reduced by 17.5%. Compared with MMPose, YOLO-ASF-P2 improved AP50-95 and precision by 17.4% and 2.9%, respectively, while maintaining almost the same recall, thereby enhancing detection performance. Compared with RTMPose, YOLO-ASF-P2 achieved improvements in precision, recall, AP50-95, and parameter count. Compared with YOLOv5s-Pose, it matched the precision and improved the recall and AP50-95, despite having more parameters. Compared with YOLOv7s-Pose, it showed slightly lower precision but higher recall and AP50-95. Overall, the proposed model was lighter and detected pig feature points more accurately. The CNN-BiGRU behavior model achieved high and stable average recognition accuracy with a parameter volume of only 155 kB and 27.1 GFLOPs. In summary, the pig behavior recognition method proposed in this paper is feasible and provides a new approach to pig behavior recognition. The integration of the YOLO-ASF-P2 and CNN-BiGRU models significantly improved the accuracy and robustness of pig feature point detection and behavior recognition, offering a valuable tool for the intensive and intelligent development of pig farming.
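      As an illustration of the SSFF alignment step described above, here is a minimal PyTorch sketch that smooths the lower-resolution pyramid maps with a Gaussian kernel, upsamples them to the P2 resolution by nearest-neighbor interpolation, and stacks the result as a scale sequence. It assumes the pyramid levels were already projected to a shared channel width; the kernel size, sigma, and feature-map shapes are illustrative assumptions, and the learnable fusion that follows in the real module is omitted.

      import torch
      import torch.nn.functional as F

      def gaussian_kernel(size=5, sigma=1.0):
          # Normalized 2-D Gaussian kernel for depthwise smoothing
          ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
          g = torch.exp(-ax**2 / (2 * sigma**2))
          k = torch.outer(g, g)
          return k / k.sum()

      def align_to_p2(feats, size=5, sigma=1.0):
          # feats: list of (B, C, Hi, Wi) maps ordered P2, P3, P4, P5
          target = feats[0].shape[-2:]                        # P2 resolution
          c = feats[0].shape[1]
          k = gaussian_kernel(size, sigma).repeat(c, 1, 1, 1)  # (C, 1, size, size)
          out = [feats[0]]
          for f in feats[1:]:
              f = F.conv2d(f, k, padding=size // 2, groups=c)    # Gaussian smoothing
              f = F.interpolate(f, size=target, mode="nearest")  # upsample to P2
              out.append(f)
          return torch.stack(out, dim=2)                      # (B, C, scales, H, W)

      p = [torch.randn(1, 64, s, s) for s in (160, 80, 40, 20)]  # P2..P5, 640x640 input
      print(align_to_p2(p).shape)                             # torch.Size([1, 64, 4, 160, 160])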

       
