Cow recognition method in the milking scene based on back features

李泽昊; 王月明; 蒲朝燚

doi:10.11975/j.issn.1002-6819.2024081987

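Abstract: To meet the need for real-time, contactless identification of dairy cows in the milking scene on dairy farms, this study proposes a real-time cow identification method based on back features. First, videos of cow backs were collected in the milking scene to build an object detection dataset of 793 images and an identity recognition dataset covering 3145 cows and 52834 images, and a YOLOv8-DW object detection model was constructed to detect cow backs. Next, five backbone networks (HRNet, EfficientNet, ConvNeXt, Swin Transformer, and Swin Transformer V2) were trained and compared, and the best network was selected to extract back features; a single-frame recognition result was obtained by computing the similarity between these features and the identities in the database and comparing it with the enrollment verification threshold. Finally, a multi-frame recognition method was designed that fuses the single-frame results to determine the cow's identity. The YOLOv8-DW detection model reached a mean average precision (mAP₅₀) of 99.3% at a processing speed of 169.49 frames/s; the best network, HRNet, achieved a ranking mean average precision (mAP) of 99.76% and a processing speed of 120.12 frames/s on the test set; when recognizing cows across different milking dates, multi-frame recognition reached a recognition rate of 97.6% for enrolled cows and 94% for unenrolled cows. The results can serve as a reference for cow recognition in the milking scene.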
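The single-frame step described above (compare a back-feature vector with the enrolled identities, then check the best similarity against a threshold) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `cosine_similarity` and `identify_frame`, the three-dimensional toy features, and the threshold value 0.8 are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def identify_frame(feature, gallery, threshold):
    """Return (cow_id, similarity) for the best gallery match.

    If the best similarity is below the threshold, the cow is treated as
    not enrolled and the returned id is None.
    """
    best_id, best_sim = None, -1.0
    for cow_id, reference in gallery.items():
        sim = cosine_similarity(feature, reference)
        if sim > best_sim:
            best_id, best_sim = cow_id, sim
    if best_sim < threshold:
        return None, best_sim
    return best_id, best_sim

# Toy gallery with hypothetical ids and features (real features would
# come from the backbone network).
gallery = {
    "cow_001": [1.0, 0.0, 0.0],
    "cow_002": [0.0, 1.0, 0.0],
}
match = identify_frame([0.9, 0.1, 0.0], gallery, threshold=0.8)
unknown = identify_frame([0.5, 0.5, 0.7], gallery, threshold=0.8)
```

Here `match` resolves to the enrolled id `cow_001`, while `unknown` falls below the threshold and is reported as not enrolled.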

Abstract: Accurate identification of individual dairy cows is a basic requirement for intelligent farming, particularly in the milking scene, where fast and precise recognition can improve management efficiency and protect animal welfare. Traditional identification techniques such as branding and ear tagging can injure the animals and harm their well-being, and they have gradually been phased out as technology has advanced. Radio Frequency Identification (RFID) is now widely used, but it can cause discomfort to the animals, and the tags are fragile and easily lost. In contrast, contactless recognition based on computer vision offers low cost and high efficiency, making it better suited to the milking scene. To improve management efficiency and reduce operating costs, a real-time recognition method based on the back features of dairy cows is proposed. The method identifies cows from camera images captured in the milking scene, which reduces missed and false detections and avoids harm to the animals. Videos of milking operations were collected and decomposed into frames, yielding an object detection dataset of 793 annotated images. We made lightweight improvements to the YOLOv8 architecture to construct the YOLOv8-DW network. After training YOLOv8-DW and comparing it with different versions of YOLOv8, we used the YOLOv8-DW model to detect cows and extract their images. The YOLOv8-DW model achieved a precision of 98.9%, a recall of 96.4%, and a mean average precision of 99.3%, outperforming the different versions of YOLOv8.
On an NVIDIA GeForce RTX 3090 24GB GPU, the YOLOv8-DW model reached a detection rate of 169.49 frames per second. This inference speed can be attributed to the lightweight design of the YOLOv8-DW network, which maintains high accuracy while substantially reducing computational complexity relative to the original YOLOv8 model. Using the YOLOv8-DW model, we analyzed milking videos and created an identity recognition dataset containing 3145 cow identities and 52834 images. This dataset was used to train and compare five backbone networks: HRNet, EfficientNet, ConvNeXt, Swin Transformer, and Swin Transformer V2. We evaluated the recognition performance of the five networks on the test set, on milking videos from different dates, and on a dataset of 79 cows with solid black backs, and selected HRNet as the most suitable network. HRNet extracts the back features of the target cow, and the cosine similarity between these features and those of the cows in the database is then calculated. On the test set, HRNet achieved a mean average precision of 99.76% and a rank@1 accuracy of 100% at a processing speed of 120.12 frames per second. Using kernel density estimation, we analyzed the similarity distributions of correct and incorrect recognitions and determined the recognition threshold. Based on this threshold, we performed multi-frame detection and weighted voting, achieving a recognition rate of 97.7% for registered cows and 94% for unregistered cows.
This method achieves real-time identity recognition of dairy cows in the milking scene, improving management efficiency and reducing farming costs while protecting animal welfare. It provides technical support for the intelligent farming of dairy cows.
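The threshold selection and multi-frame fusion mentioned in the abstract can be illustrated with a small sketch. The abstract does not state how the threshold is read off the kernel density estimates or how the votes are weighted, so both choices here are assumptions: the threshold is taken where the density of correct matches first exceeds that of incorrect matches, accepted frames vote with weight equal to their similarity, and rejected frames vote "not enrolled" with weight one minus their similarity. The function names, the bandwidth, and the sample similarities are hypothetical.

```python
import math
from collections import defaultdict

def gaussian_kde(samples, bandwidth):
    """Return a Gaussian kernel density estimate of `samples` as a function."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def pick_threshold(correct_sims, wrong_sims, bandwidth=0.05, steps=1000):
    """Scan from the mean of the wrong-match similarities to the mean of the
    correct-match similarities and return the first point where the
    correct-match density is at least the wrong-match density (assumed rule)."""
    correct_pdf = gaussian_kde(correct_sims, bandwidth)
    wrong_pdf = gaussian_kde(wrong_sims, bandwidth)
    lo = sum(wrong_sims) / len(wrong_sims)
    hi = sum(correct_sims) / len(correct_sims)
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        if correct_pdf(x) >= wrong_pdf(x):
            return x
    return hi

def fuse_frames(frame_results, threshold):
    """Fuse per-frame (cow_id, similarity) results by weighted voting.

    Assumed weighting: a frame at or above the threshold votes for its id
    with weight `similarity`; a frame below it votes for None (not
    enrolled) with weight `1 - similarity`.
    """
    votes = defaultdict(float)
    for cow_id, sim in frame_results:
        if sim >= threshold:
            votes[cow_id] += sim
        else:
            votes[None] += 1.0 - sim
    return max(votes, key=votes.get) if votes else None

# Hypothetical similarity samples for correct and incorrect recognitions.
correct = [0.85, 0.90, 0.92, 0.95, 0.97]
wrong = [0.30, 0.35, 0.40, 0.45, 0.50]
threshold = pick_threshold(correct, wrong)

frames = [("cow_001", 0.95), ("cow_001", 0.91), ("cow_002", 0.72), ("cow_001", 0.60)]
fused = fuse_frames(frames, threshold)

stranger_frames = [("cow_002", 0.41), ("cow_001", 0.38), ("cow_002", 0.44)]
stranger_result = fuse_frames(stranger_frames, threshold)
```

With these toy samples the threshold lands between the two similarity clusters, the first sequence is fused to `cow_001`, and the low-similarity sequence is reported as not enrolled. Fusing several frames lets one poor frame (for example, a partly occluded back) be outvoted by the others.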

Cow recognition method in milking scene based on back features