Abstract:
Automatic recognition, segmentation, and localization of picking points on grape stems are essential to the picking operation of grape-picking robots. In a real orchard, it is extremely difficult to accurately identify and segment grape stems and then locate the picking point, because stems closely resemble their surroundings and because of conditions such as weather, lighting, and occlusion. These conditions pose major challenges for grape-picking robots. This study proposed a deep-learning-based method for recognizing grape stems and locating the optimal picking point. Considering that grape stems are small and that their shape and color change gradually, a Mask Region-based Convolutional Neural Network (Mask R-CNN) instance segmentation model was optimized. The model consisted of three modules: a backbone network, a region proposal network, and a three-branch network. The backbone network extracted feature maps at different levels. The region proposal network found regions containing grape stems. The three-branch network performed classification, bounding-box regression, and mask calculation for grape stems. The trained model produced pixel-level recognition and segmentation of grape stems and returned the category and position of each stem. To improve the segmentation of grape stems, the study adopted color threshold segmentation: each grape stem in the segmentation result was divided into segments, and its HSV (Hue, Saturation, Value) color space was analyzed segment by segment. The average of the HSV color components of each segment was taken as the benchmark color threshold of the stem in that segment. Based on this threshold, an improved region growing algorithm was introduced to automatically adjust and optimize the shape of the segmented grape stem.
From this optimized shape, the centroid of the grape stem was calculated; the picking area was determined by the two horizontal sides of the grape stem closest to the centroid, and the midpoint of this area was taken as the picking point. The approach was as follows. Grape stem regions in the training samples were manually labeled; 600 images were selected as the training set and 100 images as the validation set. In addition, the training set was augmented by rotation, mirroring, blurring, and exposure adjustment to improve the generalization ability of the model, giving a total of 3 000 training images. These measures were used to optimize the Mask R-CNN-based grape stem recognition and segmentation network. The improved region growing algorithm then fine-tuned the segmentation results segment by segment, and the picking point was obtained from the relationship between the centroid and the contour of the grape stem. The performance of the method was verified on images taken under different weather and illumination conditions. Detection accuracy and the success rate of optimal picking point location were used to evaluate the models, and the detection results before and after model optimization were compared. Experimental results showed that the detection accuracy of the optimized model reached up to 88%. Compared with the model before optimization, the detection time was reduced by 24%. The success rate of picking point location with the proposed method was 81.58%. The calculated picking points reached up to 99.43% within the manually set optimal picking range. Results showed that the proposed method recognized multiple types of grape stems under different weather and light conditions, and that the segmentation of grape stem regions was also satisfactory.
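The segment-wise HSV thresholding and region growing step described above could be sketched roughly as follows. This is only an illustrative sketch, not the algorithm used in the study: the helper names (`to_hsv`, `band_of`, `band_hsv_means`, `grow_mask`), the number of segments `n_bands`, and the per-channel tolerance `tol` are all assumptions introduced here. It also averages hue linearly (ignoring hue wrap-around), only adds pixels to the mask, and assumes a non-empty mask.

```python
import colorsys
from collections import deque


def to_hsv(rgb):
    """Convert an (R, G, B) pixel in 0..255 to (H, S, V) in 0..1."""
    r, g, b = rgb
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


def band_of(row, top, bottom, n_bands):
    """Index of the horizontal band (segment) that a row falls into."""
    height = bottom - top + 1
    return min((row - top) * n_bands // height, n_bands - 1)


def band_hsv_means(image, mask, n_bands):
    """Mean HSV of the masked pixels in each band (None for an empty band)."""
    rows = [y for y, line in enumerate(mask) if any(line)]
    top, bottom = rows[0], rows[-1]  # assumes the mask is not empty
    sums = [[0.0, 0.0, 0.0, 0] for _ in range(n_bands)]
    for y in range(top, bottom + 1):
        acc = sums[band_of(y, top, bottom, n_bands)]
        for x, inside in enumerate(mask[y]):
            if inside:
                h, s, v = to_hsv(image[y][x])
                acc[0] += h
                acc[1] += s
                acc[2] += v
                acc[3] += 1
    means = [tuple(c / a[3] for c in a[:3]) if a[3] else None for a in sums]
    return means, top, bottom


def grow_mask(image, mask, n_bands=4, tol=(0.05, 0.2, 0.2)):
    """Grow the mask into 4-neighbours whose HSV is close to the band mean.

    The tolerance values are hypothetical; growth is limited to the rows
    already covered by the mask, which is a simplification made here.
    """
    means, top, bottom = band_hsv_means(image, mask, n_bands)
    height, width = len(mask), len(mask[0])
    grown = [list(line) for line in mask]
    queue = deque((y, x) for y in range(height)
                  for x in range(width) if mask[y][x])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if not (top <= ny <= bottom and 0 <= nx < width):
                continue
            if grown[ny][nx]:
                continue
            ref = means[band_of(ny, top, bottom, n_bands)]
            if ref is None:
                continue
            hsv = to_hsv(image[ny][nx])
            if all(abs(p - m) <= t for p, m, t in zip(hsv, ref, tol)):
                grown[ny][nx] = True
                queue.append((ny, nx))
    return grown
```

Here `image` is a list of rows of `(R, G, B)` tuples and `mask` is a list of rows of booleans with the same shape, for example a thresholded instance mask from the segmentation network.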
The method could locate the picking point quickly and was suitable for grape-picking robots operating in orchards with complex environments. This study applied a deep learning model to stem identification and segmentation for the first time, offering an approach for grape-picking robots to pick grapes efficiently and intelligently in natural environments.
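The centroid-based picking point rule summarized in this abstract could be sketched as below. This is one possible reading, not the study's implementation: it assumes that "the two horizontal sides closest to the centroid" are the left and right stem edges on the image row nearest the centroid, and the function names `mask_centroid` and `picking_point` are hypothetical.

```python
def mask_centroid(mask):
    """Centroid (row, col) of the True pixels in a binary mask."""
    pts = [(y, x) for y, line in enumerate(mask)
           for x, v in enumerate(line) if v]
    n = len(pts)  # assumes the mask is not empty
    return sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n


def picking_point(mask):
    """Return (row, col) midway between the stem edges nearest the centroid."""
    cy, cx = mask_centroid(mask)
    # Row closest to the centroid that actually contains stem pixels.
    rows = [y for y, line in enumerate(mask) if any(line)]
    row = min(rows, key=lambda y: abs(y - cy))
    # Collect contiguous runs of stem pixels on that row as (left, right).
    runs, start = [], None
    for x, v in enumerate(list(mask[row]) + [False]):
        if v and start is None:
            start = x
        elif not v and start is not None:
            runs.append((start, x - 1))
            start = None

    def distance(run):
        # Zero if the centroid column lies inside the run.
        if run[0] <= cx <= run[1]:
            return 0.0
        return min(abs(run[0] - cx), abs(run[1] - cx))

    left, right = min(runs, key=distance)
    return row, (left + right) / 2.0
```

For a slanted stem mask whose centroid lies at row 2, column 3, with stem pixels spanning columns 2 to 4 on that row, this returns `(2, 3.0)`. A real system would still need to map this pixel coordinate to a position in the robot's workspace, which is outside the scope of this sketch.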