Tang Chen, Xu Lihong, Liu Shijing. Fine-grained classification algorithm of fish feeding state based on optical flow method[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(9): 238-244. DOI: 10.11975/j.issn.1002-6819.2021.09.027

    Fine-grained classification algorithm of fish feeding state based on optical flow method

    • Fine-grained classification of fish feeding state in the factory production environment helps describe fish feeding behavior in more detail. However, most current studies are based on an ideal laboratory environment that ignores adverse external factors such as light conditions and image quality, so they cannot be applied in the factory environment. Moreover, these studies focus on binary classification of fish feeding state (eating or non-eating), which is imprecise. This study addressed the fine-grained classification of fish feeding state and collected a small-scale, fine-grained fish feeding state dataset. All videos in the dataset were captured in the factory production environment. The dataset contained 752 videos in total; each video lasted 3 s (90 frames) and was labeled as non-eating, weak-eating, or strong-eating. Based on this dataset, a fine-grained classification algorithm of fish feeding state was proposed for the factory production environment. Firstly, the algorithm computed the optical flow field for every pair of consecutive frames in a video and calculated the motion magnitude and angle of each pixel from that field. The magnitude and the angle were then each divided into eight intervals, and histograms of the pixels' magnitudes and angles were counted over these intervals. The concatenated magnitude and angle histogram served as a frame-level inter-frame motion feature for further classification. In this way, the algorithm turned one video sample into many inter-frame motion feature samples.
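The histogram step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' implementation: the function name `motion_feature`, the `max_magnitude` cap used to define the magnitude intervals, and the choice to leave the counts unnormalized are all assumptions, and the optical flow field is taken as already computed by some dense optical flow routine.

```python
import math


def motion_feature(flow, num_bins=8, max_magnitude=8.0):
    """Build a magnitude + angle histogram from per-pixel flow vectors.

    flow: iterable of (dx, dy) displacement pairs, one per pixel.
    num_bins: intervals per histogram (eight in the abstract).
    max_magnitude: assumed upper bound of the magnitude range; larger
        magnitudes fall into the last interval (hypothetical choice).
    Returns a list of 2 * num_bins counts: magnitude bins, then angle bins.
    """
    mag_hist = [0] * num_bins
    ang_hist = [0] * num_bins
    full_circle = 2 * math.pi
    for dx, dy in flow:
        magnitude = math.hypot(dx, dy)
        angle = math.atan2(dy, dx) % full_circle  # map angle to [0, 2*pi)
        # Clamp indices so boundary values stay inside the last interval.
        mag_bin = min(int(magnitude / max_magnitude * num_bins), num_bins - 1)
        ang_bin = min(int(angle / full_circle * num_bins), num_bins - 1)
        mag_hist[mag_bin] += 1
        ang_hist[ang_bin] += 1
    # Concatenate the two histograms into one frame-level feature.
    return mag_hist + ang_hist
```

Applied to the flow field of each consecutive frame pair, a function like this would yield one 16-value feature per pair, which is how a single video becomes many frame-level samples.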
Then a five-layer classification neural network (one input layer, three hidden layers, and one output layer) was built to classify the extracted inter-frame motion features. The network had three output categories corresponding to the three feeding states (non-eating, weak-eating, and strong-eating). It was optimized with a cross-entropy loss function, and the output category probabilities were calculated by the Softmax function. The final video classification took all frame-level predictions into account through a voting strategy. The category predicted most frequently across all frames was taken as the video's probable category, and a voting threshold was additionally set on the frequency of that prediction. When the predicted frequency of the probable category was greater than the voting threshold, the video was assigned to that category; otherwise, it was assigned to the uncertain category. A higher voting threshold required a higher prediction frequency, so setting a high threshold made the classification results more reliable. The experimental results showed that the video accuracy of the algorithm was 98.7% under the 50% voting threshold. When the voting threshold increased to 80%, the video accuracy was still 91.4%, which demonstrated the robustness of the algorithm. The video accuracy decreased as the voting threshold increased because a higher threshold required more agreeing frame-level predictions, so more videos were predicted as the uncertain category owing to insufficient prediction frequency. Comparative experiments were conducted to verify the effectiveness of the proposed algorithm.
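The voting strategy described above can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the published code: the function name `vote_video_label`, the string labels, and the tie-breaking behavior (whichever label `Counter` lists first) are hypothetical choices. The strict "greater than" comparison follows the description in the abstract.

```python
from collections import Counter


def vote_video_label(frame_predictions, threshold=0.5):
    """Aggregate frame-level predictions into one video-level label.

    frame_predictions: non-empty list of predicted labels, one per
        inter-frame motion feature.
    threshold: voting threshold as a fraction in [0, 1].
    Returns the most frequent label if its frequency exceeds the
    threshold, otherwise the string "uncertain".
    """
    if not frame_predictions:
        raise ValueError("frame_predictions must not be empty")
    label, count = Counter(frame_predictions).most_common(1)[0]
    frequency = count / len(frame_predictions)
    return label if frequency > threshold else "uncertain"
```

For example, a video whose frame-level predictions are 70% strong-eating would be labeled strong-eating under a 50% threshold but uncertain under an 80% threshold, which is the mechanism behind the accuracy drop reported at higher thresholds.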
Experiments with a texture-based algorithm and a single-frame convolutional neural network showed that single-frame features could not solve the fine-grained feeding state classification problem, which in turn supported the effectiveness of the inter-frame motion features used in the proposed algorithm. In addition, the proposed algorithm performed well on the small-scale dataset because the inter-frame motion features extracted by the optical flow method moved the training data from video level to frame level, which implicitly increased the number of training samples. This study targeted a commercial recirculating aquaculture system, so the algorithm could be better applied in the factory production environment. Moreover, it realized fine-grained classification of fish feeding state, which could help describe fish feeding behavior in more detail.