Abstract
Pig posture recognition can contribute substantially to early warning of pig health problems in the complex scenes of swine farms, and thereby help reduce economic losses. However, posture recognition remains challenging because group-housed pigs occlude and adhere to one another. In this study, a two-stage posture recognition method for group-housed pigs was proposed that combines instance segmentation with posture classification. Firstly, the instance segmentation data set was used as the input, Cascade Mask R-CNN was used as the reference network, and HRNetV2 was used as the feature extraction network of Cascade Mask R-CNN, so that pig images in different scenes could be segmented with high precision. The multi-resolution processing in HRNetV2 reduced the loss of spatial detail in the images and improved the model's representation of the segmentation targets. The FPN module was also introduced to construct an HRNetV2+FPN composite structure, in which pig bodies of different sizes were mapped to feature maps at different levels, achieving effective fusion of features at different scales. This reduced the interference of overlap and adhesion between pig bodies in the subsequent posture recognition. Secondly, the coordinate attention (CA) mechanism was introduced to design CA-MobileNetV3, a lightweight pig posture recognition network. After segmentation, the individual pig images were extracted to build the posture data set. The interference of the background region was reduced, which enhanced feature learning in the key parts of the pig body. CAM visualization was used to display the regions from which the posture features were extracted, supporting accurate and rapid recognition of individual pig posture. Finally, the improved model effectively detected and segmented the pig bodies and produced more accurate segmentation edges.
The conventional MS R-CNN and Mask R-CNN produced comparatively rough segmentation. By contrast, the improved Cascade Mask R-CNN model achieved 96.9%, 96.3%, 85.2%, and 89.4% for AP0.50, AP0.75, AP0.50:0.95, and AP0.50:0.95-large, respectively. The improved model therefore outperformed the Mask R-CNN and MS R-CNN networks in both detection and segmentation. For pig posture recognition, the recall and F1 score of the CA-MobileNetV3 model increased by 10.8% and 13.1%, respectively, compared with the other models evaluated. Furthermore, the performance of the CA-MobileNetV3 network improved significantly on the sitting and lying posture classes. The model achieved accuracy rates of 96.5%, 99.3%, 98.5%, and 98.7% for the kneeling, standing, lying, and sitting posture classes, respectively. This performance was much better than that of the MobileNetV3, ResNet50, DenseNet121, and VGG16 networks. In conclusion, the improved model was superior to the current popular networks in both individual pig segmentation and pig posture recognition. It is expected to enable non-contact, low-cost recognition of pig posture under different scenarios, such as deep separation, mutual adhesion, and debris shielding. The findings can also provide model support for the practical management of intensive pig farming.
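As an illustration of how per-class recall and F1 scores such as those reported above are typically derived, the following minimal sketch computes precision, recall, and F1 from a confusion matrix over the four posture classes. The function name `per_class_metrics` and the confusion-matrix counts are assumptions introduced here for illustration only; they are not the data or the evaluation code of this study.

```python
# Illustrative sketch: per-class precision, recall, and F1 from a confusion matrix.
# The counts in toy_confusion are made-up toy numbers, NOT results from the study.

CLASSES = ["kneeling", "standing", "lying", "sitting"]


def per_class_metrics(confusion, classes):
    """Return precision, recall, and F1 for each class.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(classes)
    metrics = {}
    for k, name in enumerate(classes):
        tp = confusion[k][k]                                  # correctly predicted as class k
        fn = sum(confusion[k]) - tp                           # class k predicted as something else
        fp = sum(confusion[i][k] for i in range(n)) - tp      # other classes predicted as k
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
toy_confusion = [
    [8, 1, 0, 1],   # kneeling
    [0, 10, 0, 0],  # standing
    [1, 0, 9, 0],   # lying
    [1, 0, 1, 8],   # sitting
]

if __name__ == "__main__":
    for posture, scores in per_class_metrics(toy_confusion, CLASSES).items():
        print(posture, {key: round(value, 3) for key, value in scores.items()})
```

Reporting the metrics per class, rather than only as an average, makes it visible when easily confused postures such as sitting and lying are the ones driving an overall improvement.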