Abstract: China is the world's leading fruit producer and grows a very wide variety of fruits. However, fruit harvesting still depends heavily on manual picking, which is time-consuming, inefficient and labor-intensive. Fruit-picking robots can automate the picking operation and alleviate labor shortages and high labor costs. Fruit identification by machine vision is the primary task of such robots. In the field environment, however, fruit images are easily affected by external factors such as changes in illumination, differences in fruit size and complicated background noise, which reduce the identification accuracy of traditional fruit recognition algorithms. Moreover, lacking a general feature extraction model, a traditional fruit recognition algorithm can only target one specific fruit. Deep learning algorithms offer strong non-linear feature expression and good generalization, and they avoid the subjectivity and limitations of manual feature selection. To address the low recognition rate and weak generalization of fruit recognition in the field environment, this study takes apple, litchi, navel orange and Huangdi gan as the research objects and proposes an improved single shot detector (SSD) deep learning model for fruit detection. Specifically, a ResNet-101 model replaces the VGG16 network in the classic SSD detection framework. After the replacement, the framework still uses 6 feature extraction layers and predicts the class and location of fruit objects at each layer; the weight model trained on a large data set is then transferred to the multi-class fruit detection task by transfer learning. The SSD deep learning model is optimized with the stochastic gradient descent (SGD) algorithm.
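To make the multi-layer prediction scheme concrete, the following minimal sketch counts how many default boxes, class scores and box offsets an SSD-style detector produces when 6 feature layers each predict class and location. The feature-map sizes and boxes-per-cell values are those of the classic SSD300 configuration and are assumptions for illustration only; the abstract does not state the sizes used with the ResNet-101 backbone. The class count (4 fruit classes plus background) and the function names are likewise illustrative.

```python
# Assumed values from the classic SSD300 configuration, NOT from this study.
FEATURE_MAP_SIZES = [38, 19, 10, 5, 3, 1]   # side length of each of the 6 feature maps
BOXES_PER_CELL = [4, 6, 6, 6, 4, 4]         # default boxes per feature-map cell
NUM_CLASSES = 5                             # assumed: 4 fruit classes + background


def default_boxes_per_layer(sizes, boxes_per_cell):
    """Return the number of default boxes contributed by each feature layer."""
    return [s * s * b for s, b in zip(sizes, boxes_per_cell)]


def prediction_counts(sizes, boxes_per_cell, num_classes):
    """Summarize how many values the detector predicts over all layers."""
    per_layer = default_boxes_per_layer(sizes, boxes_per_cell)
    total = sum(per_layer)
    return {
        "boxes_per_layer": per_layer,
        "total_boxes": total,
        "class_scores": total * num_classes,  # one score per class per box
        "box_offsets": total * 4,             # 4 offsets (cx, cy, w, h) per box
    }


if __name__ == "__main__":
    for key, value in prediction_counts(
        FEATURE_MAP_SIZES, BOXES_PER_CELL, NUM_CLASSES
    ).items():
        print(key, value)
```

Under these assumed settings the shallow, high-resolution layers contribute most of the boxes, which is why they matter most for small fruit objects.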
The weight model pre-trained on the ImageNet data set is used as the initial weight model of the SSD detection framework, and transferring the learned features further reduces training time and resource consumption. In addition, data augmentation is used to improve the robustness of the algorithm without reducing detection accuracy. Based on the Caffe deep learning framework, detection results on multi-class fruit images collected in the field environment are compared across different network models, data set sizes and occlusion ratios. Experimental results show that, after one day of training, the residual network (ResNet-101) model takes about 0.14 s to detect an image with a resolution of 500×500 pixels, only about 0.09 s slower than the VGG16 network model. Across the various environments, the average detection accuracy for the 4 kinds of fruit with the improved SSD model reaches 88.4%, higher than the 86.38% of the classic SSD model. With data augmentation, the average detection accuracy improves by 1.13 percentage points to 89.53%, and the F1-score reaches 96.12% when the occluded area is below 50%. Therefore, compared with traditional recognition algorithms, the method based on the improved SSD model can detect multiple classes of fruit simultaneously without manual feature selection for each kind of fruit image, and it has better generalization and robustness. It achieves accurate detection of multiple kinds of fruit in the field environment and provides a new solution to the problem of fruit detection and recognition in agricultural automation.
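For readers who wish to check the reported metrics, the sketch below shows how an F1-score is conventionally computed from detection counts, and how the percentage-point gain quoted above follows from the two accuracies. It assumes the standard definition of F1 as the harmonic mean of precision and recall; the abstract does not spell out its own definition. The counts in the usage example are hypothetical and are not data from this study.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from detection counts.

    tp: correctly detected fruits, fp: false detections, fn: missed fruits.
    Returns 0.0 for any metric whose denominator would be zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def percentage_point_gain(before, after):
    """Difference between two percentages, in percentage points."""
    return round(after - before, 2)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p, r, f1 = precision_recall_f1(tp=90, fp=5, fn=10)
    print(f"precision={p:.4f} recall={r:.4f} F1={f1:.4f}")

    # Accuracies quoted in the abstract: 88.4% without and 89.53% with augmentation.
    print(percentage_point_gain(88.4, 89.53))
```

Because F1 balances false detections against missed fruits, it is a natural metric for the occlusion experiments, where missed detections are the dominant error.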