Potato detection in complex environments based on an improved YoloV4 model

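    • Summary: To address grading and cleaning during the operation of potato combine harvesters, and to monitor and evaluate harvest status in real time, this study proposes a machine learning model that detects potatoes and quickly and accurately obtains their number and damage status under large variations in light brightness, occlusion by soil and tubers, machine vibration, and dust interference. A lightweight attention mechanism was introduced into the convolutional residual network to improve the YoloV4 detection network, and the CSP-Darknet53 network in the YoloV4 structure was replaced with the MobilenetV3 network for feature extraction. The experimental results show that the deep learning method based on convolutional neural networks improved potato recognition accuracy compared with traditional OpenCV recognition, and that MobilenetV3-YoloV4 was faster than the other traditional machine learning models. The mean average precision (mAP) of potato recognition reached 91.4%, and the transmission speed on an embedded device was 23.01 frames/s. The model is robust and can detect both normal and mechanically damaged potatoes in a variety of environments, providing technical support for intelligent cleaning and intelligent harvesting in potato combine harvesters.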

       

      Abstract: As the fourth largest food crop in China, the potato contributes substantially to national food security. However, relatively low harvesting efficiency and a low level of intelligent operation are currently serious bottlenecks in the potato industry. The state of the potatoes needs to be detected and evaluated in real time during harvesting, particularly for grading and cleaning in a combine harvester. In this study, a machine learning model was proposed to quickly and accurately identify the number and damage of potatoes under varied working conditions, including large changes in light brightness, occlusion by soil and tubers, machine vibration, and dust interference. A lightweight attention mechanism was introduced into the convolutional residual network. The attention mechanism, which acts on the fully connected layer, was added to YoloV4 and assigns a different weight to each channel. The original K-means clustering was dropped because the potatoes were relatively consistent in size. The three output layers of YoloV4 were merged into one large output layer, and CSPDarknet53 was replaced by the MobilenetV3 network structure for feature extraction. MobilenetV3 uses an inverted residual structure with depthwise separable convolution blocks and linear bottlenecks. With the H-swish activation function used in place of swish, the amount of computation and the number of parameters were reduced to 1/4 of the original, which significantly improved detection speed without reducing the potato recognition rate. To improve the generalization ability of the trained model, the collected images were augmented by horizontal flipping, vertical flipping, mirroring, and adding noise. The data set contained 1 296 high-quality images, 322 images of mechanically damaged potatoes, and 231 images with interference for comparison.
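The substitution of H-swish for swish mentioned above can be illustrated with a minimal sketch. It assumes the standard definitions swish(x) = x·sigmoid(x) and h-swish(x) = x·ReLU6(x + 3)/6; the function names are illustrative and are not taken from the authors' implementation. The point is that H-swish replaces the exponential in the sigmoid with a clamp, which is cheaper to evaluate on embedded hardware while staying close to swish.

```python
import math


def relu6(x: float) -> float:
    """Clamp x to the interval [0, 6]."""
    return min(max(x, 0.0), 6.0)


def swish(x: float) -> float:
    """Swish activation: x * sigmoid(x). Requires an exponential."""
    return x / (1.0 + math.exp(-x))


def h_swish(x: float) -> float:
    """Hard swish: x * ReLU6(x + 3) / 6. Piecewise, no exponential."""
    return x * relu6(x + 3.0) / 6.0


if __name__ == "__main__":
    # Compare the two activations on a few illustrative sample points.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  swish={swish(x):+.4f}  h_swish={h_swish(x):+.4f}")
```

For inputs at or below -3 the hard version is exactly zero, and for inputs at or above 3 it is the identity, so only the middle region needs a multiplication and a clamp.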
The collected image data set was used to train the model on a workstation, and the loss values on the training and test sets were recorded. Comparative and field tests were then carried out with the trained network deployed on embedded equipment. The evaluation indexes were the precision-recall curve, AP (average precision of a single class), mAP (the mean of the AP values over all classes), and detection speed. The deep learning approach improved potato recognition accuracy compared with the traditional OpenCV method. MobilenetV3-YoloV4 also showed a higher recognition speed and better target extraction than YoloV4, YoloV3, VGG16, and the traditional OpenCV method. The mAP of potato recognition was 91.4%, indicating strong robustness in detecting normal and mechanically damaged potatoes in various environments. Performance was good at illumination angles of 30°, 45°, 60°, and 90°, and the transmission speed was 23.01 frames per second when the network model was run on an embedded device. A field experiment showed that MobilenetV3-YoloV4 could detect the potato flow in real time during actual harvesting. When too many potatoes were fed in, the separation speed of the vertical annular separation device was adjusted according to the detected flow to avoid excessive accumulation of potatoes; otherwise, linear scratching between potatoes and soil would increase the skin breaking rate. When the feeding amount decreased, the rotating speed of the vertical annular separation device was adjusted to reduce the damage caused by vibration of the device, which also lowered energy consumption and reduced linear scratching between the potatoes and the grid. These findings can provide sound technical support for the intelligent cleaning and grading of potatoes in a combine harvester.
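The AP and mAP indexes named above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes that each detection has already been matched against the ground truth and flagged as a true or false positive, and it uses all-point interpolation of the precision-recall curve, which the abstract does not specify. The function names and the sample detections are hypothetical.

```python
def average_precision(detections, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    detections: list of (confidence, is_true_positive) pairs (assumed to be
    pre-matched to ground-truth boxes, e.g. by an IoU threshold).
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Make precision monotonically non-increasing (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas between consecutive recall values.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """mAP: arithmetic mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


if __name__ == "__main__":
    # Made-up detections for two illustrative classes.
    normal = [(0.95, True), (0.90, True), (0.60, False), (0.55, True)]
    damaged = [(0.90, True), (0.80, False), (0.70, True)]
    aps = {
        "normal": average_precision(normal, num_ground_truth=3),
        "damaged": average_precision(damaged, num_ground_truth=2),
    }
    print(aps, mean_average_precision(aps))
```

Other interpolation schemes (for example 11-point sampling) give slightly different AP values, so any comparison between models should use the same scheme throughout.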

       
