Abstract:
Monitoring fish phenotypes during aquaculture is essential for ensuring the quality of fishery resources. However, low-level visual information alone cannot accurately match specific targets in complex real-world scenarios. Semantic segmentation, which classifies images at the pixel level, can be expected to delineate object outlines, yet research on segmenting fish morphological features is still lacking. Without an optimized or improved model, phenotype segmentation tends to be coarse across multiple phenotypic scales, resulting in low accuracy of fish phenotype proportions. In this study, a VED-SegNet model was developed for the segmentation and measurement of fish phenotypes. First, the model integrated the Cross Stage Partial Network and GSConv into the encoder (VoV-GSCSP) to extract more comprehensive semantic information on fish phenotypes. Second, an EMA (Efficient Multi-Scale Attention Module with Cross-Spatial Learning) was employed as an enhancement structure on the information path between the encoder and decoder, enabling efficient learning of feature maps without channel dimensionality reduction during convolution, which preserved pixel information and improved model accuracy. Finally, the decoder was configured with four upsampling layers, enabling classification into eight phenotypic categories. Experiments were conducted on a custom-built fish phenotype segmentation dataset. The model performed optimally with an EMA channel dimension of 16, a batch size of 4, a learning rate of 1×10⁻⁴, and an image size of 512×512 pixels.
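As a concrete illustration of the encoder's building block, the following is a minimal PyTorch sketch of a GSConv layer under its commonly published formulation (half of the output channels from a standard convolution, half from a depthwise convolution on that result, mixed by a channel shuffle); the class name, kernel sizes, and activation are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class GSConv(nn.Module):
    """Sketch of a GSConv block (assumed formulation): a dense conv produces
    half of the output channels, a depthwise conv produces the other half,
    and a channel shuffle mixes the two groups."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        c_half = c_out // 2
        self.dense = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU(),
        )
        self.depthwise = nn.Sequential(
            nn.Conv2d(c_half, c_half, k, 1, k // 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU(),
        )

    def forward(self, x):
        a = self.dense(x)          # dense half of the channels
        b = self.depthwise(a)      # cheap depthwise half
        y = torch.cat([a, b], dim=1)
        # channel shuffle: interleave the dense and depthwise halves
        n, c, h, w = y.shape
        return y.view(n, 2, c // 2, h, w).transpose(1, 2).reshape(n, c, h, w)

# e.g. GSConv(64, 128)(torch.randn(1, 64, 128, 128)) -> shape (1, 128, 128, 128)
```

Under this design, if each of the decoder's four upsampling stages doubles the spatial resolution, a 1/16-resolution encoder feature map is restored to the full 512×512 input size before the eight-way per-pixel classification.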
Under these settings, the model achieved a mean intersection over union (mIoU) of 87.92% and a mean pixel accuracy (mPA) of 92.83%, with 30.9M parameters and 440.95 GFLOPs (giga floating-point operations). The inference speed reached real-time levels at 30.44 frames per second (FPS). In extracting fish phenotypic proportions, the model's results closely matched the actual measurements, with a minimum mean absolute percentage error of 0.61% and a maximum of only 11.28%. These findings demonstrated that the VED-SegNet model can rapidly and accurately segment and extract fish phenotypes, thereby enabling contactless extraction of fish phenotype proportions in aquaculture and contributing substantially to the intelligent evolution of the industry. Compared with advanced semantic segmentation models such as PSPNet, DeepLabv3+, U-Net, and SegFormer, VED-SegNet exhibited the highest accuracy and superior overall capability in segmenting fish phenotypes, achieving over 77% IoU and over 86% PA. This performance demonstrated VED-SegNet's balanced proficiency across the various phenotypic indicators, supporting precise monitoring of fish growth curves and scientific management in aquaculture. Segmentation experiments showed superior performance in fish phenotypic segmentation, particularly enhanced robustness in segmenting fish fins of similar shape and color. The approach effectively reduced false negatives caused by the limited number of effective pixels, producing more precise and complete edge details; this stronger segmentation capability in turn indicated more accurate determination of fish phenotypic proportions. In addition, a series of tests on fish phenotypic segmentation was conducted under various aquaculture environments, including scenarios with complex water quality and multiple overlapping fish; the VED-SegNet model maintained effective segmentation performance in these complex, multi-fish overlapping scenarios. The VED-SegNet intelligent phenotypic segmentation model can therefore be expected to facilitate the phenotypic segmentation of specific fish species even in complex environments and multi-fish overlapping scenarios, and the approach can be extended to the phenotypic segmentation and measurement of other fish species across a wide application scope.
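For reference, the accuracy figures above follow the standard definitions of these metrics. The following is a minimal NumPy sketch of those standard formulas (an illustration only, not the authors' evaluation code); it assumes predictions and labels are integer class maps over the eight phenotypic categories:

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes=8):
    """Per-pixel confusion matrix; pred and gt are integer label maps,
    with pred assumed to lie in [0, num_classes)."""
    mask = (gt >= 0) & (gt < num_classes)
    idx = num_classes * gt[mask].astype(int) + pred[mask].astype(int)
    return np.bincount(idx, minlength=num_classes**2).reshape(num_classes, num_classes)

def miou_mpa(cm):
    """mIoU = mean over classes of TP / (TP + FP + FN);
    mPA  = mean over classes of TP / (TP + FN)."""
    tp = np.diag(cm)
    iou = tp / (cm.sum(0) + cm.sum(1) - tp)
    pa = tp / cm.sum(1)
    return np.nanmean(iou), np.nanmean(pa)

def mape(measured, predicted):
    """Mean absolute percentage error between manually measured and
    model-derived phenotype proportions."""
    measured = np.asarray(measured, float)
    predicted = np.asarray(predicted, float)
    return 100.0 * np.mean(np.abs(predicted - measured) / measured)
```

Here mIoU averages the per-class intersection over union, mPA averages the per-class pixel accuracy over the eight phenotypic categories, and MAPE quantifies the deviation of model-derived phenotype proportions from the actual measurements (0.61% to 11.28% in this study).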