Abstract:
Surface defects are among the most significant factors affecting the quality of pears. Fruit with surface defects is prone to bacterial growth, so defect recognition can serve as a basis for fruit quality grading. However, surface defects visually resemble the numerous spots that occur on the surface of pears, which makes defect recognition quite challenging. Deep semantic segmentation networks have been widely applied in non-destructive testing in recent years, but supervised semantic segmentation typically requires a large number of fine pixel-level labels as training samples, so defect recognition in pears remains a considerable challenge. Taking Dangshan pears as the research object, this study aims to recognize surface defects using a weakly supervised semantic segmentation network. Pixel-level pseudo-labels were generated from weak labels by combining global information with local statistical insights derived from human experience, yielding fine pixel-level labels while reducing the high cost of annotation. Defect samples of pears were first captured using mobile phones and industrial cameras, and image enhancement techniques were applied to increase the diversity of the samples. For the bounding-box weak labels using global experience, the images were converted to the HSV color space, histogram statistics were computed on each channel, thresholds were determined for a preliminary segmentation, and morphological processing was then used to refine the pixel-level labels. For the point-level weak labels using global experience, a seed region growing algorithm was employed to segment the defect areas. The pixel-level pseudo-labels generated from the weak labels were used to create a weakly supervised semantic segmentation dataset.
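As an illustration of the point-level step, the following is a minimal sketch of seed region growing, not the procedure used in this study. It assumes a single-channel intensity image stored as nested lists, 4-connectivity, and a fixed intensity tolerance around the seed value; the function name `grow_region`, the parameter `tol`, and the toy image are hypothetical.

```python
from collections import deque


def grow_region(gray, seed, tol=20):
    """Return a binary mask grown from `seed` over a 2D intensity grid.

    A pixel joins the region when it is 4-connected to a pixel already in
    the region and its intensity differs from the seed intensity by at
    most `tol` (an assumed similarity criterion).
    """
    rows, cols = len(gray), len(gray[0])
    seed_row, seed_col = seed
    seed_val = gray[seed_row][seed_col]
    mask = [[0] * cols for _ in range(rows)]
    mask[seed_row][seed_col] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and not mask[nr][nc]:
                if abs(gray[nr][nc] - seed_val) <= tol:
                    mask[nr][nc] = 1
                    queue.append((nr, nc))
    return mask


# Toy 4x5 image: a dark defect (values near 40) on bright skin (near 200).
image = [
    [200, 198, 201, 199, 200],
    [199, 42, 38, 200, 201],
    [200, 40, 45, 41, 199],
    [201, 200, 199, 200, 198],
]
# The seed plays the role of a point-level weak label placed on the defect.
pseudo_label = grow_region(image, seed=(1, 1), tol=20)
```

In this sketch the returned mask plays the role of a pixel-level pseudo-label. A practical pipeline would likely need a tolerance tuned per image and some post-processing of the grown region.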
Subsequently, rapid and accurate identification of surface defects in Dangshan pears was achieved. A lightweight semantic segmentation network based on U-Net, referred to as MCF-UNet, was constructed. A Feature Pyramid Network (FPN) was integrated at the bottom layer of the backbone feature network to enhance edge perception. Additionally, a Convolutional Block Attention Module (CBAM) was incorporated at the skip connections to emphasize relevant target information. Finally, the segmentation performance of the network was validated on a self-built weakly supervised dataset and compared with current models such as DeepLabv3+, PSPNet, ResNet-U-Net, and VGG-U-Net. Experimental results indicated that MCF-UNet achieved high segmentation accuracy and speed after training on the two self-generated weakly supervised datasets. The mean Intersection over Union (IoU) of the predicted segmentation reached 70.80% and 72.94%, respectively. Compared with the more accurate VGG-U-Net network, the training time of MCF-UNet was significantly reduced, and its prediction time was 0.055 s per frame. Visualization results demonstrated that MCF-UNet rapidly and accurately identified surface defects in Dangshan pears after low-cost weakly supervised training. Additionally, the two types of weak labels were better suited to the defect segmentation of Dangshan pears than graffiti (scribble) annotation. By combining pixel-level pseudo-label generation with deep semantic learning, this study applied weakly supervised deep learning to the recognition of surface defects in Dangshan pears. The findings can also provide valuable insights for the non-destructive detection of fruits using weakly supervised learning.
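For reference, the mean IoU metric reported above can be sketched as follows. This is a minimal illustration that assumes two classes (0 for background, 1 for defect) and averages the per-class IoU over a single pair of label grids; the abstract does not state how the study aggregates the metric across classes or images, and the function name `mean_iou` and the toy masks are hypothetical.

```python
def mean_iou(pred, truth, num_classes=2):
    """Mean IoU over classes for two equally sized 2D label grids."""
    ious = []
    for cls in range(num_classes):
        inter = union = 0
        for pred_row, truth_row in zip(pred, truth):
            for p, t in zip(pred_row, truth_row):
                in_pred, in_truth = p == cls, t == cls
                inter += in_pred and in_truth  # pixel in both masks
                union += in_pred or in_truth   # pixel in either mask
        if union:  # skip classes absent from both grids
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy example: a 2x2 defect in the reference, partly missed by the prediction.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
score = mean_iou(pred, truth)
```

Here the defect class has an IoU of 3/5 and the background class 11/13, so `score` is their average, about 0.72.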