QI Yongsheng, JIAO Jie, BAO Tengfei, et al. Cattle face detection algorithm in complex scenes using adaptive attention mechanism[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2023, 39(14): 173-183. DOI: 10.11975/j.issn.1002-6819.202304218

    Cattle face detection algorithm in complex scenes using adaptive attention mechanism

    • With the development of smart livestock farming and large-scale operations, precision breeding has become a research hotspot in the cattle industry in recent years, and cattle face detection and recognition are critical to intelligent ranch management. However, because the farming environment is complex, detection accuracy is severely degraded by three common environmental factors: blur, backlighting, and occlusion. This study proposes a cattle face detection algorithm for complex scenes based on an adaptive attention mechanism. Firstly, three evaluation indicators were designed, one for each of the three interference factors (blur, backlighting, and occlusion), and were normalized using fuzzy membership functions. Together, the three indicators were used to comprehensively evaluate the scene information of the input image, and the weighting coefficients were adjusted according to changes in the indicators to reflect the complexity of the target scene. Secondly, a new attention mechanism called CDAA (Composite Dual-Branch Adaptive Attention) was introduced into the backbone feature extraction network of YOLOv7-tiny. CDAA places channel attention and spatial attention in parallel and assigns each branch an adaptive weighting coefficient, so that the relative importance of the two branches is adjusted dynamically and automatically for different scenes. This improved the network's ability to extract features in complex scenes and, in turn, its detection accuracy. The channel attention branch fused global average pooling and global max pooling to selectively emphasize informative features and suppress redundant ones. As a result, the edge features of the detection target were highlighted, which effectively addressed image blur and occlusion. The spatial attention branch likewise used a parallel structure of channel max pooling and channel average pooling to assign different importance to feature information at different positions. Important spatial positions were highlighted and redundant spatial information was suppressed, which effectively enhanced regional features under strong backlighting interference. Finally, the image scene evaluation indicators were introduced into the loss function to adaptively adjust the weight of the large-scale grid loss, so that the network focused more on detecting a large number of small targets during training, thereby improving overall detection accuracy. A series of ablation experiments were conducted on a specific dataset, and the algorithm was compared with several classical detection algorithms to verify its effectiveness and real-time performance. The model was also ported to the Jetson Xavier NX platform, where it met the deployment requirements while retaining high detection accuracy. The test results show that the improved model achieved a detection accuracy of 89.58%, which is 7.34 percentage points higher than that of the original YOLOv7-tiny network. The detection speed was 62 frames per second. With almost no loss in detection speed, the improved model outperformed both the original network and the comparison networks, particularly on images captured in complex real-world scenes. These results demonstrate strong robustness and significant practical value, with broad application prospects.
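
The abstract does not give the formulas behind the scene indicators or CDAA, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a linear ramp as the fuzzy membership function, a softmax to turn the three normalized scores into two branch weights, and gates computed directly from pooled values (a trained network would place learned layers between the pooling and the sigmoid). All function names and thresholds here are hypothetical.

```python
import math

def ramp_membership(value, low, high):
    """Linear fuzzy membership: 0 at or below `low`, 1 at or above `high`."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (value - low) / (high - low)))

def branch_weights(blur, backlight, occlusion):
    """Map three normalized scene scores in [0, 1] to (channel, spatial) weights.

    Assumption: blur and occlusion drive the channel branch and backlighting
    drives the spatial branch, mirroring the roles described in the abstract.
    """
    channel_need = max(blur, occlusion)
    spatial_need = backlight
    # Softmax keeps both weights positive and makes them sum to 1.
    e_c, e_s = math.exp(channel_need), math.exp(spatial_need)
    return e_c / (e_c + e_s), e_s / (e_c + e_s)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_gate(feature):
    """One gate per channel from fused global average and global max pooling."""
    gates = []
    for channel in feature:
        flat = [v for row in channel for v in row]
        gates.append(sigmoid(sum(flat) / len(flat) + max(flat)))
    return gates

def spatial_gate(feature):
    """One gate per position from channel-wise average and max pooling."""
    height, width = len(feature[0]), len(feature[0][0])
    gate = []
    for i in range(height):
        row = []
        for j in range(width):
            values = [channel[i][j] for channel in feature]
            row.append(sigmoid(sum(values) / len(values) + max(values)))
        gate.append(row)
    return gate

def dual_branch_attention(feature, w_channel, w_spatial):
    """Rescale a C x H x W feature map by a weighted mix of both gates."""
    c_gate = channel_gate(feature)
    s_gate = spatial_gate(feature)
    return [
        [
            [v * (w_channel * c_gate[c] + w_spatial * s_gate[i][j])
             for j, v in enumerate(row)]
            for i, row in enumerate(channel)
        ]
        for c, channel in enumerate(feature)
    ]

# Hypothetical raw indicator values and thresholds for one blurry image.
blur = ramp_membership(80.0, 0.0, 100.0)
backlight = ramp_membership(10.0, 0.0, 100.0)
occlusion = ramp_membership(20.0, 0.0, 100.0)
w_channel, w_spatial = branch_weights(blur, backlight, occlusion)

feature = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, -1.0], [0.5, 2.0]]]
attended = dual_branch_attention(feature, w_channel, w_spatial)
```

In this sketch a blurry or occluded scene shifts weight toward the channel branch, while a strongly backlit scene shifts it toward the spatial branch. Because the two weights sum to 1 and each gate lies in (0, 1), the combined scaling factor also stays in (0, 1).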