    Constructing a VED-SegNet segmentation model to extract fish phenotype proportions

    • Abstract: To improve the precision and accuracy of fish phenotype segmentation and enable intelligent monitoring of fish phenotypes, this study built a VED-SegNet model based on deep learning for fish phenotype segmentation and measurement. The model combines a cross stage partial network with GSConv as the encoder (VoV-GSCSP), reducing the complexity of the network structure while maintaining sufficient accuracy. In addition, the model employs EMA (efficient multi-scale attention module with cross-spatial learning) to build an enhancement structure that strengthens information transfer between the encoder and decoder, improves model accuracy, and outputs eight phenotype categories. The VED-SegNet model was tested on a custom-built fish phenotype segmentation dataset; the measured phenotype proportions closely matched the actual measurements, with a maximum mean absolute error of 0.39% and a maximum mean relative error of 11.28%, enabling contactless extraction of fish phenotype proportions in aquaculture. Compared with other common semantic segmentation models, it achieved the highest mean intersection over union (mIoU) and mean pixel accuracy (mPA), reaching 87.92% and 92.83%, respectively. The VED-SegNet model accurately segments fish morphological features in aquaculture monitoring scenes with complex environments and multiple overlapping fish, providing technical support for intelligent fish phenotype measurement.

       

      Abstract: Monitoring fish phenotypes during aquaculture is essential to improving the quality of fishery resources. However, low-level visual information cannot accurately match specific targets in complex real-world scenarios. Semantic segmentation, which classifies images at the pixel level, can be expected to delineate object outlines, yet it is still rarely applied to the segmentation of fish morphological features. Phenotype segmentation without optimized or improved models can produce coarse segmentation across multiple phenotypic scales, leading to low accuracy in fish phenotype proportions. In this study, a VED-SegNet model was developed for the segmentation and measurement of fish phenotypes. The model integrated the Cross Stage Partial Network and GSConv as the encoder (VoV-GSCSP) to extract more comprehensive semantic information on fish phenotypes. Secondly, the EMA (Efficient Multi-Scale Attention Module with Cross-Spatial Learning) was employed to create an enhanced structure for information transfer between the encoder and decoder, enabling efficient learning of feature maps without channel dimensionality reduction during convolution and yielding richer pixel information and higher model accuracy. Finally, the decoder was configured with four upsampling layers, enabling the classification of eight phenotypic categories. Experiments were conducted on a custom-built fish phenotype segmentation dataset. The model performed best with an EMA channel dimension of 16, a batch size of 4, a learning rate of 1×10⁻⁴, and an image size of 512×512 pixels. It achieved an mIoU (mean intersection over union) of 87.92% and an mPA (mean pixel accuracy) of 92.83%, with 30.9M parameters and 440.95 GFLOPs (Giga Floating Point Operations). The inference speed reached real-time levels with an FPS (Frames Per Second) of 30.44.
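The mIoU and mPA metrics reported above can be computed from a per-class confusion matrix. The sketch below is a minimal illustration of those two standard definitions; the function name and the toy matrices are illustrative, not taken from the paper.

```python
# Minimal sketch of mIoU and mPA from a confusion matrix, where
# cm[i][j] = number of pixels of true class i predicted as class j.
# The matrices used here are toy examples, not the paper's data.

def miou_mpa(cm):
    """Return (mean IoU, mean pixel accuracy) over all classes."""
    n = len(cm)
    ious, pas = [], []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                        # missed pixels of class i
        fp = sum(cm[r][i] for r in range(n)) - tp   # pixels wrongly assigned to i
        denom = tp + fp + fn
        ious.append(tp / denom if denom else 0.0)
        row_total = sum(cm[i])
        pas.append(tp / row_total if row_total else 0.0)
    return sum(ious) / n, sum(pas) / n

# A perfect prediction gives mIoU = mPA = 1.0:
miou, mpa = miou_mpa([[5, 0], [0, 5]])
```

With eight phenotype categories as in VED-SegNet, `cm` would simply be an 8×8 matrix accumulated over the test set.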
In the extraction of fish phenotypic proportions, the model's outputs closely matched the actual measurements, with a minimum mean absolute percentage error of 0.61% and a maximum of only 11.28%. These findings demonstrate that the VED-SegNet model can rapidly and accurately segment and extract fish phenotypes, thereby facilitating the contactless extraction of fish phenotype proportions in aquaculture and contributing to the intelligent evolution of the industry. Compared with advanced semantic segmentation models such as PSPNet, Deeplabv3+, Unet, and Segformer, the VED-SegNet model exhibited the highest accuracy and superior overall capability in segmenting fish phenotypes, achieving over 77% IoU and over 86% PA. This performance demonstrates VED-SegNet's balanced proficiency across the various phenotypic indicators, supporting precise monitoring of fish growth curves and scientific management in aquaculture. The segmentation experiments showed superior performance in fish phenotypic segmentation, particularly enhanced robustness in segmenting fish fins with similar shapes and colors. The approach effectively reduced false negatives caused by a limited number of effective pixels, producing more precise and complete edge details. This advanced capability to segment fish phenotypes enables a more accurate determination of fish phenotypic proportions. In addition, a series of tests was conducted on fish phenotypic segmentation under various aquaculture environments, including scenarios with complex water quality and multiple overlapping fish. The VED-SegNet model maintained effective segmentation performance in these complex, multi-fish overlapping scenarios.
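Once a per-pixel class mask is predicted, phenotype proportions follow from counting pixels per category. The sketch below shows this pixel-counting step under stated assumptions: class 0 is treated as background, the class IDs and the tiny mask are made up for illustration, and the function name is hypothetical, not from the paper.

```python
# Hedged sketch: phenotype proportions from a predicted class-ID mask.
# Assumption (illustrative, not from the paper): class 0 is background,
# classes 1..num_classes-1 are phenotype regions (VED-SegNet uses 8).

def phenotype_proportions(mask, num_classes):
    """Return each non-background class's share of all fish pixels."""
    counts = [0] * num_classes
    for row in mask:
        for c in row:
            counts[c] += 1
    fish_pixels = sum(counts[1:])  # total labelled fish area
    return [counts[k] / fish_pixels if fish_pixels else 0.0
            for k in range(1, num_classes)]

# Toy 3x3 mask: left column background, two phenotype classes.
props = phenotype_proportions([[0, 1, 1], [0, 2, 2], [0, 1, 2]], 3)
```

The mean absolute percentage errors quoted above would then be computed by comparing such proportions against manually measured reference values.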
The VED-SegNet phenotypic intelligent segmentation model can be expected to facilitate the phenotypic segmentation of specific fish species, even in complex environments and multi-fish overlapping scenarios. Moreover, this approach can be extended to the phenotypic segmentation and measurement of other fish species, giving it a wide application scope.

       
