Abstract:
Recognizing the main organs of tomato plants in images is important for detecting plant diseases and insect pests, applying pesticides in a targeted way, and developing intelligent agricultural machinery. Traditional object recognition methods are limited by hand-engineered feature extraction and by the choice of filter operators, so they generalize poorly and have low recognition accuracy. The convolutional layers of a convolutional neural network (CNN), by contrast, extract image features automatically. We therefore propose a method for recognizing the main organs of tomato based on a deep convolutional neural network (DCNN). To cover different illumination conditions and the morphological diversity of tomato organs, we collected images of 5 classes (flower, fruit, stem, leaf, and environment background) together with whole-plant images, and used them as training and testing samples. Keeping the basic structure of VGGNet, we constructed 10 tomato organ classification networks by adjusting the number of convolutional kernels and the number of convolutional layers. To reduce overfitting on the tomato organ image dataset, we enlarged it before training each network with label-preserving data augmentation: rotating images by random angles, extracting random patches from an image, flipping images horizontally, and altering the intensities of the RGB (red, green, blue) channels of training images. A median filter was also used to denoise the image samples. We trained the models end to end using stochastic gradient descent with an input image size of 64×64 or 128×128 pixels, a mini-batch size of 128 examples, and a momentum of 0.9. The classification tests show that all 10 networks perform well in terms of top-1 error rate.
Among the 10 networks, the 8-layer network has the best classification performance, with a top-1 error rate of 0.297%. This result indicates that an 8-layer network can meet the feature extraction and classification requirements of a tomato image dataset with few categories. Balancing classification performance against computing speed, and following Fast R-CNN, we designed a tomato main organ detector named TD-Net: it uses the 8-layer network as its basic architecture and replaces the last max-pooling (max-pool) layer with a region of interest pooling (RoI pooling) layer. The detector outputs the category probabilities for each object proposal through a softmax layer. The selective search algorithm generates a large number of proposal regions for each tomato plant image, and non-maximum suppression is applied to the outputs of TD-Net to remove redundant detections. The detector is trained iteratively with plant images and proposal regions as input. The detection experiments show that the average precision (AP) of the DCNN-based tomato organ detector is 81.64% for fruit, 84.48% for flower, and 53.94% for stem, and that the detector can effectively identify fruits at different stages of maturity and flowers of different ages. On the same tomato plant image dataset, TD-Net outperforms R-CNN and Fast R-CNN in both detection performance and detection speed, which demonstrates the effectiveness of the proposed method.
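
As an illustration of the non-maximum suppression step mentioned above, the following is a minimal sketch of greedy NMS in plain Python. It is not the paper's implementation: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the IoU threshold of 0.3 are assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.3) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    The default threshold is an illustrative assumption, not a value
    reported in the abstract.
    """
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Hypothetical detections for one class (e.g. fruit) in one image.
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # prints [0, 2]
```

In a detector of this kind, NMS would typically be run separately per class on the scored proposals, so that overlapping boxes around the same fruit, flower, or stem collapse to the highest-scoring one.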