Abstract:
The face of a pig contains rich biometric information, and detecting facial gestures can provide a basis for individual identification and behavior analysis of pigs. However, in group-housing scenes, many factors, such as pig-house lighting and pig adhesion, pose great challenges for pig face detection. In this paper, we take group-housed pigs in a real breeding environment as the research object, with video frames as the data source, and propose a new detection algorithm named DAT-YOLO, which is based on the attention mechanism and the Tiny-YOLO model and introduces channel attention and spatial attention information into the feature extraction process. High-order features guide low-order features in acquiring channel attention information, and low-order features in turn guide high-order features in spatial attention screening; the feature extraction ability and detection accuracy of the model are thereby improved without a significant increase in model parameters. We extracted 504 images containing a total of 3,712 face regions from videos of five groups of healthy group-housed pigs (35 pigs in total) aged from 20 days to three and a half months. To obtain the model input dataset, we performed a two-step preprocessing operation on the captured frames: pixel-value padding and scaling. The model outputs are divided into six classes: horizontal face, horizontal side-face, bow face, bow side-face, rise face, and rise side-face. The results show that, on the test set, the average precision (AP) reaches 85.54%, 79.3%, 89.61%, 76.12%, 79.37%, and 84.35% for the horizontal face, horizontal side-face, bow face, bow side-face, rise face, and rise side-face classes respectively, and the mean average precision (mAP) is 8.39%, 4.66%, and 2.95% higher than that of the generic Tiny-YOLO model, the CAT-YOLO model (which introduces only channel attention), and the SAT-YOLO model (which introduces only spatial attention), respectively. To further verify the transferability of the attention mechanism to other models, the two types of attention information were introduced into YOLOv3 under the same experimental conditions to construct the corresponding attention sub-models. Experiments show that the sub-models based on Tiny-YOLO improve mAP by 0.46% to 1.92% compared with the corresponding YOLOv3 sub-models. Both the Tiny-YOLO and YOLOv3 series models achieve performance improvements of different degrees after attention information is added, indicating that the attention mechanism is beneficial for accurate and effective facial-gesture detection of different groups of pigs. In this study, the data are pseudo-equalized from the perspective of the loss function to counteract the imbalance caused by the differing numbers of samples across facial-pose categories, and the reasons for the differences in detection accuracy among facial gestures are actively explored. This study can provide a reference for subsequent individual identification and behavior analysis of pigs.
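
The cross-guided attention described above (high-order features driving channel attention on low-order features, and low-order features driving spatial attention on high-order features) could be realized along the lines of the following minimal PyTorch sketch. The module name, tensor shapes, reduction ratio, and fusion details are illustrative assumptions, not the authors' exact DAT-YOLO design:

```python
import torch
import torch.nn as nn

class CrossGuidedAttention(nn.Module):
    """Sketch: the high-order (low-resolution, semantically rich) map guides
    channel attention on the low-order map, while the low-order
    (high-resolution) map guides spatial attention on the high-order map."""

    def __init__(self, low_ch, high_ch, reduction=16):
        super().__init__()
        # Channel attention: squeeze the high-order map globally,
        # then excite the channels of the low-order map.
        self.channel_fc = nn.Sequential(
            nn.Linear(high_ch, low_ch // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(low_ch // reduction, low_ch),
            nn.Sigmoid(),
        )
        # Spatial attention: compress the low-order map to a 1-channel mask.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(low_ch, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, low, high):
        b, c_low, _, _ = low.shape
        # High-order features -> channel weights for the low-order map.
        ch_w = self.channel_fc(high.mean(dim=(2, 3))).view(b, c_low, 1, 1)
        low = low * ch_w
        # Low-order features -> spatial mask, resized to the high-order map.
        sp_mask = self.spatial_conv(low)
        sp_mask = nn.functional.interpolate(sp_mask, size=high.shape[2:])
        high = high * sp_mask
        return low, high
```

Because the attention weights are produced by a small bottleneck MLP and a single 7×7 convolution, a module of this kind adds few parameters relative to the backbone, which is consistent with the abstract's claim of improved accuracy without a significant parameter increase.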
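
The two-step preprocessing (padding pixel values, then scaling to the network input size) might look like the sketch below, using OpenCV. The 416×416 target size matches the standard Tiny-YOLO input and the gray fill value of 128 are both assumptions:

```python
import cv2
import numpy as np

def preprocess(frame, target=416, fill=128):
    """Pad the frame to a square with a constant pixel value, then scale
    it to the network input size, preserving the aspect ratio."""
    h, w = frame.shape[:2]
    side = max(h, w)
    canvas = np.full((side, side, 3), fill, dtype=frame.dtype)
    # Center the original frame on the padded canvas.
    top, left = (side - h) // 2, (side - w) // 2
    canvas[top:top + h, left:left + w] = frame
    return cv2.resize(canvas, (target, target))
```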
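
The abstract does not specify how the loss-based pseudo-equalization is implemented; one common realization is inverse-frequency class weighting of the classification loss, sketched below. The per-class counts and the weighting scheme are hypothetical:

```python
import torch
import torch.nn as nn

# Hypothetical per-class sample counts for the six facial-pose classes.
counts = torch.tensor([900., 450., 800., 300., 500., 762.])

# Inverse-frequency weights, normalized so they average to 1: rare
# classes contribute more to the loss, mimicking a balanced dataset.
weights = counts.sum() / (len(counts) * counts)

criterion = nn.CrossEntropyLoss(weight=weights)
logits = torch.randn(8, 6)            # batch of 8 class predictions
labels = torch.randint(0, 6, (8,))    # ground-truth pose classes
loss = criterion(logits, labels)
```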