Cow recognition method in milking scene based on back features
-
Graphical Abstract
-
Abstract
Accurate identification of individual dairy cows is a cornerstone of intelligent farming, particularly in milking scenes, where fast and precise recognition can substantially improve management efficiency and safeguard animal welfare. Traditional identification techniques, such as branding and ear tagging, can harm the animals and impair their well-being, and have gradually been phased out as technology has advanced. Radio Frequency Identification (RFID) is currently in wide use, but it causes discomfort to the animals, and the tags are fragile and easily lost. In contrast, contactless recognition based on computer vision offers low cost and high efficiency, making it better suited to milking scenes. To improve management efficiency and reduce operating costs, we propose a real-time recognition method based on the back features of dairy cows. The method identifies cows from camera images captured in the milking scene, reducing the probability of missed and false detections while avoiding harm to the animals. Videos of milking operations were collected and decomposed into frames, yielding an object detection dataset of 793 annotated images. We made lightweight modifications to the YOLOv8 architecture to construct the YOLOv8-DW network. After training YOLOv8-DW and comparing it with different versions of YOLOv8, we used the YOLOv8-DW model to detect cow targets and extract their images. YOLOv8-DW achieved a precision of 98.9%, a recall of 96.4%, and a mean average precision of 99.3%, outperforming the different versions of YOLOv8.
On an NVIDIA GeForce RTX 3090 (24 GB) GPU, YOLOv8-DW reached a detection speed of 169.49 frames per second. This gain in inference speed can be attributed to the lightweight design of the network, which maintains high accuracy while substantially reducing computational complexity relative to the original YOLOv8 model. Using YOLOv8-DW, we processed milking videos to build an identity recognition dataset containing 3145 cow identities and 52834 images in total. This dataset was used to train and compare five backbone networks: HRNet, EfficientNet, ConvNeXt, Swin Transformer, and Swin Transformer V2. We evaluated the recognition performance of the five networks on the test set, on milking videos recorded on different dates, and on a dataset of 79 cows with solid black backs. Based on this evaluation, we selected HRNet as the most suitable model for our application. HRNet extracts the distinctive back features of a target cow, and identity is determined from the cosine similarity between these features and those of the cows in the database. On the test set, HRNet achieved a mean average precision of 99.76% and a rank@1 accuracy of 100% at a processing speed of 120.12 frames per second. Using kernel density estimation, we analyzed the similarity distributions of correct and incorrect recognitions and determined a recognition threshold. Based on this threshold, we applied multi-frame detection and weighted voting, ultimately achieving a recognition rate of 97.7% for registered cows and 94% for unregistered cows.
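The matching stage described above (cosine similarity against a database, a recognition threshold, and voting over several frames) can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the gallery layout (one feature vector per registered cow), and the choice to weight each frame's vote by its similarity score are all assumptions made for illustration, and the threshold is a free parameter rather than the value obtained from kernel density estimation.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def best_match(query, gallery):
    """Return (cow_id, similarity) of the gallery entry closest to query.

    gallery is a hypothetical dict mapping cow_id -> feature vector.
    """
    best_id, best_sim = None, -1.0
    for cow_id, feature in gallery.items():
        sim = cosine_similarity(query, feature)
        if sim > best_sim:
            best_id, best_sim = cow_id, sim
    return best_id, best_sim


def identify(frame_features, gallery, threshold):
    """Vote over several frames of the same cow.

    Frames whose best similarity falls below the threshold cast no vote.
    Each remaining frame votes for its best match with a weight equal to
    its similarity (an assumed weighting scheme). Returns the winning
    cow_id, or None when no frame passes the threshold, which is treated
    here as an unregistered cow.
    """
    votes = defaultdict(float)
    for feature in frame_features:
        cow_id, sim = best_match(feature, gallery)
        if cow_id is not None and sim >= threshold:
            votes[cow_id] += sim
    if not votes:
        return None
    return max(votes, key=votes.get)
```

In this sketch a cow is reported as unregistered only when every frame falls below the threshold; a stricter rule, such as requiring a minimum total vote weight, would be an equally plausible design.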
The method achieves real-time identity recognition of dairy cows in milking scenes, improving management efficiency and reducing farming costs while protecting animal welfare, and thus provides important technical support for the intelligent farming of dairy cows.
-
-