Classification of Atlantic salmon feeding behavior based on underwater machine vision

    • Abstract: Precision feeding control in aquaculture based on the feeding behavior state of fish schools is a key technology for effectively improving feed utilization and reducing water pollution. At present, most machine-vision studies of fish feeding behavior simulate real farming environments in the laboratory and acquire data with above-water cameras; because of lighting conditions and the farming environment, such data cannot reflect the feeding behavior of Atlantic salmon under actual production conditions, so their applicability is limited. To address this problem, this study proposes a fish feeding behavior classification algorithm for a real industrial farming environment. The algorithm uses underwater observation and takes video sequences as samples. First, a variational auto-encoder encodes each frame of a video sequence to produce the Gaussian mean and variance vectors of all frames; the mean vectors and variance vectors are then stacked separately to form a mean feature matrix and a variance feature matrix. These feature matrices are fed into a convolutional neural network to classify the feeding behavior of the fish school. Experimental results show that, in a real industrial farming environment, the proposed method achieves an overall accuracy of 89%; compared with an existing single-image fish feeding behavior classification method, the overall accuracy improves by 14 percentage points and the recall by 15 percentage points. The results can serve as a reference for precision feed delivery control based on fish feeding behavior.

       

      Abstract: Fish feeding behavior can provide effective decision-making information for precise feeding in aquaculture. Most previous studies of fish feeding behavior were conducted in laboratory environments; owing to the influence of lighting conditions and the farming environment in practice, their limited applicability means they cannot reveal the actual production status of fish. In particular, cameras placed above the water surface do not work well in most methods because of severe light reflection resulting from complex illumination conditions; the reflection can be so strong that many fish are obscured. In this study, an underwater video dataset of Atlantic salmon feeding behavior was introduced. The video clips were captured from an industrial recirculating aquaculture system. Each sample, labeled as eating or non-eating, was a 5-second clip at a frame rate of 30 frames/s. A total of 3 791 samples were annotated, of which 3 132 were labeled non-eating and 659 eating. A novel video classification method based on a Variational Auto-Encoder and a Convolutional Neural Network (VAE-CNN) was proposed to identify fish feeding behavior from the video clips. The method consisted of two steps. In the first step, a Variational Auto-Encoder (VAE) model was trained to extract the spatial features of video frames. Each frame was encoded as a multivariate Gaussian probability distribution in a latent space, that is, represented by a Gaussian mean vector and a Gaussian variance vector. Specifically, the frames of a video clip were fed into the trained VAE encoder to produce the Gaussian mean and variance vectors, which were then stacked separately in column order to obtain the Gaussian mean feature matrix and the Gaussian variance feature matrix of the video.
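The first step described above, encoding each frame with the VAE and stacking the per-frame Gaussian vectors column-wise into two feature matrices, can be sketched as follows. This is a minimal illustration, not the study's implementation: the linear `encode` function is only a stand-in for the trained VAE encoder, and the sizes (150 frames, 32×32 pixels, a 64-dimensional latent space) are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, W, D = 150, 32, 32, 64  # frames, frame height/width, latent dim (assumed)

# Stand-in for a trained VAE encoder: maps a flattened frame to a
# Gaussian mean vector and a log-variance vector of length D.
W_mu = rng.standard_normal((H * W, D)) * 0.01
W_logvar = rng.standard_normal((H * W, D)) * 0.01

def encode(frame):
    x = frame.reshape(-1)
    return x @ W_mu, x @ W_logvar  # (mu, log sigma^2)

frames = rng.random((T, H, W))                 # one 5 s video clip
mus, logvars = zip(*(encode(f) for f in frames))

# Stack per-frame vectors in column order: each column is one frame's code.
mu_matrix = np.stack(mus, axis=1)               # shape (D, T)
var_matrix = np.exp(np.stack(logvars, axis=1))  # shape (D, T), variances > 0

# Two-channel feature map passed on to the CNN classifier.
feature_map = np.stack([mu_matrix, var_matrix], axis=0)  # shape (2, D, T)
print(feature_map.shape)
```

Stacking column-wise preserves the temporal ordering of the frames along one axis of the matrix, which is what lets the subsequent CNN treat time as a spatial dimension of the feature map.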
In this step, a video clip of fish feeding behavior was coded as a two-channel feature map for the subsequent classification. In the second step, the feeding behavior was classified by feeding this feature map into a CNN. The VAE output features were used to train the CNN, which extracted the spatio-temporal features of the fish feeding behavior videos for the final classification. For comparison, the VAE output features were also fed into a backpropagation neural network (VAE-BP) and a support vector machine (VAE-SVM) to classify the feeding behavior. The results showed that VAE-CNN performed best, mainly because the local receptive fields of the CNN allow it to learn the spatio-temporal features in the fish feeding behavior videos, whereas the other two methods treat the VAE output features only as an ordinary feature vector. In a real factory-farming environment, the accuracy of the proposed method reached 89%, the recall 90%, and the specificity 87%. Compared with a single-image classification method, the recall of VAE-CNN increased by 15 percentage points, and the other performance indexes of the video classification method also improved significantly. In terms of running time, the proposed algorithm needed only 4.15 s to process a 5 s (150-frame) video of fish feeding behavior. This method can lay a solid foundation for future feeding systems with feedback control based on fish feeding behavior.
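The second step, classifying the two-channel feature map with a CNN, can be illustrated with a minimal numpy sketch. The single 3×3 convolution layer, random weights, and logistic output below are assumptions for illustration only; the abstract does not specify the actual trained network architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
C, D, T = 2, 64, 150           # channels (mean/variance), latent dim, frames
feature_map = rng.random((C, D, T))  # stand-in for the VAE feature map

# One 3x3 filter per input channel (illustrative, untrained weights).
K = rng.standard_normal((C, 3, 3)) * 0.1

def conv2d_valid(x, k):
    """Naive valid-mode 2-D convolution of one channel with a 3x3 kernel."""
    h, w = x.shape[0] - 2, x.shape[1] - 2
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i + 3, j:j + 3] * k)
    return out

# Sum channel responses, apply ReLU, global-average-pool, logistic output.
resp = sum(conv2d_valid(feature_map[c], K[c]) for c in range(C))
pooled = np.maximum(resp, 0.0).mean()
p_eating = 1.0 / (1.0 + np.exp(-pooled))  # probability of the "eating" class
```

The local 3×3 receptive field is the point of the comparison in the abstract: the convolution responds to local patterns across both the latent dimension and the frame (time) axis, which a plain BP network or SVM on the flattened features cannot exploit.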

       
