Optimization of fully convolutional network semantic segmentation of beef cattle images based on RGB-D

Deng Hanbing (邓寒冰); Zhou Yuncheng (周云成); Xu Tongyu (许童羽); Miao Teng (苗腾); Xu Jing (徐静)

doi:10.11975/j.issn.1002-6819.2019.18.019

Optimization of cattle's image semantics segmentation with fully convolutional networks based on RGB-D

Abstract

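Abstract: Deep learning models based on convolutional neural networks are increasingly used to detect beef cattle behavior. Pixel-level segmentation of cattle images by convolution helps make behavior detection long-distance, contactless and automatic, and provides a necessary means for the early discovery of abnormal cattle behavior. To improve the semantic segmentation accuracy of cattle images against complex backgrounds and to reduce the segmentation error introduced during up-sampling, this paper proposes an RGB-D-based optimization method for semantic segmentation of cattle images with fully convolutional networks (FCN). A depth density value quantifies the probability that different pixels in the depth image belong to the same class, and the complementary relationship between the content of the depth image and the color image is used to improve the accuracy of FCN semantic segmentation (dense pixel prediction) of cattle images. In experiments, compared with the best segmentation result of the fully convolutional network, the method raised pixel accuracy by 2.5% on average, mean class accuracy by 2.3%, mean region intersection over union by 3.4%, and frequency-weighted intersection over union by 2.7%. The experiments show that the method improves the semantic segmentation accuracy of FCN models for cattle images against complex backgrounds.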

Abstract: As image sensors have become cheaper, continuous monitoring has gradually become feasible in cattle breeding. Monitoring and analyzing cattle behavior across the animal's whole life has become a research hotspot in the breeding field. With large amounts of cattle image and video data now being acquired, attention has shifted to how to process, analyze, understand and apply these data. Segmenting dynamic objects from a complex environmental background is the precondition for cattle behavior analysis, and it is also the key to long-distance, contactless and automatic detection of cattle behavior. Traditional machine vision segmentation methods cluster and extract pixels using manually designed image features. However, when the image background is complex, feature extraction becomes troublesome and sometimes infeasible. Deep convolutional neural networks (DCNN) provide another solution: they let computers automatically learn the most descriptive and prominent features of each object category, and let deep networks discover latent patterns in various types of images. Given massive labeled data, the accuracy of classification, segmentation, recognition and detection with convolutional neural networks can be improved through continued training, and the labor cost shifts from algorithm design to data acquisition, which lowers the barrier to applying the technology. For cattle image segmentation, however, the complex breeding environment remains a problem, because the color and texture of the environment in the image affect the segmentation of cattle details. In particular, when a fully convolutional network (FCN) uses deconvolution for up-sampling, it is insensitive to image details and does not take the class relationship between pixels into account, so the segmentation result lacks spatial regularity and spatial consistency and is very coarse. To improve the accuracy of FCN semantic segmentation and the segmentation of cattle image details, this paper proposes an FCN semantic segmentation method based on RGB-D cattle images. We introduce a concept named "depth density", whose value quantifies the probability that different pixels belong to the same category. Using the pixel-level mapping between the content of the RGB image and the depth image, we optimize the FCN semantic segmentation results for cattle images. The experimental results showed that, compared with FCN-8s, the proposed method improved pixel accuracy, mean accuracy, mean intersection over union and frequency weighted intersection over union by 2.5%, 2.3%, 3.4% and 2.7%, respectively.
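
The abstract names four evaluation metrics without defining them. The following is a minimal sketch of how they are commonly computed for FCN-style segmentation, assuming the usual confusion-matrix definitions; the abstract does not confirm that the paper uses exactly these formulas. The function names `confusion_matrix` and `segmentation_scores`, and the toy two-class label and prediction maps, are hypothetical and serve only as illustration.

```python
import numpy as np


def confusion_matrix(label, pred, num_classes):
    """Return n, where n[i, j] counts pixels of true class i predicted as class j."""
    label = np.asarray(label).ravel()
    pred = np.asarray(pred).ravel()
    # Encode each (true, predicted) pair as a single index, then count occurrences.
    idx = num_classes * label + pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_scores(n):
    """Compute the four metrics from a confusion matrix (assumed definitions)."""
    n = n.astype(float)
    t = n.sum(axis=1)        # pixels per true class
    p = n.sum(axis=0)        # pixels per predicted class
    diag = np.diag(n)        # correctly classified pixels per class
    present = t > 0          # skip classes absent from the ground truth
    iou = diag[present] / (t[present] + p[present] - diag[present])
    return {
        "pixel_acc": diag.sum() / t.sum(),
        "mean_acc": np.mean(diag[present] / t[present]),
        "mean_iou": np.mean(iou),
        "fw_iou": (t[present] * iou).sum() / t.sum(),
    }


# Toy example: class 0 = background, class 1 = cattle (made-up data).
label = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
pred = [[0, 0, 0, 1],
        [0, 1, 1, 1]]
scores = segmentation_scores(confusion_matrix(label, pred, num_classes=2))
print(scores)
```

In this toy case each class has 4 true pixels, 3 of which are predicted correctly, so pixel accuracy and mean accuracy are both 0.75, while mean and frequency weighted intersection over union are both 3/5 = 0.6.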
