Abstract: An improved Convolutional Neural Network (CNN) was proposed to replace the current manual observation of the rice development period for higher efficiency and accuracy. In this study, a 50-layer CNN image recognition model was established using the Rectified Adam (RAdam) optimizer. Five developmental stages of rice were selected for automatic detection: regreening, tillering, jointing, heading, and milk stage. Two cameras were installed in each of 12 test fields for two consecutive years, with two pre-set points in each test field. Images and videos of rice were captured continuously at 8:00 and 16:00 each day. Geometric transformations of the images were also applied to increase the amount of input data. Finally, 35,422 graded images of rice development stages were obtained. Training and test datasets were divided at a ratio of 7:3, and the original 1920×1080 pixel images were resized to 224×224 pixels. Each image was then classified and labelled manually. The Excess Green (ExG) index combined with Otsu thresholding was used to segment the rice images, so as to avoid interference from water, soil, and debris in the rice field with the characteristics of the rice development period. The segmentation remained strongly robust to changes in lighting and color, meeting the demanding requirement of extracting the "green" characteristics of rice plant images. The parallel operation of the CNN was realized with TensorFlow on GPU. Four pre-trained CNN models were selected for comparative experiments: VGG16, VGG19, ResNet50, and Inception v3. The initial learning rate was set to 0.001. The training accuracies of the VGG16, VGG19, and Inception v3 network models were 99.46%, 94.36%, and 98.70%, respectively, whereas the validation accuracies were 94.76%, 89.43%, and 93.59%, respectively.
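As a sketch (not the authors' code), the ExG-plus-Otsu segmentation described above could be implemented as follows. The function name and the pure-NumPy Otsu implementation are illustrative assumptions; in practice a library routine such as OpenCV's Otsu thresholding could be used instead:

```python
import numpy as np

def exg_otsu_mask(rgb):
    """Binary plant mask from an RGB image via Excess Green (ExG) + Otsu.

    ExG = 2G - R - B emphasizes green vegetation; Otsu then picks the
    threshold that best separates plant pixels from soil/water background.
    """
    img = rgb.astype(np.float64) / 255.0
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    exg = 2.0 * g - r - b

    # Rescale ExG to integer levels 0..255 for histogram-based Otsu.
    lo, hi = exg.min(), exg.max()
    levels = np.round((exg - lo) / (hi - lo + 1e-12) * 255).astype(np.int64)

    hist = np.bincount(levels.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    omega = np.cumsum(p)                    # cumulative class probability
    mu = np.cumsum(p * np.arange(256))      # cumulative class mean
    mu_t = mu[-1]

    # Between-class variance; undefined where a class is empty.
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan
    sigma_b = (mu_t * omega - mu) ** 2 / denom
    t = int(np.nanargmax(sigma_b))
    return levels > t                       # True = plant pixel
```

The mask can then be used to zero out background pixels before the images are fed to the CNN, which is one plausible reading of the segmentation step reported above.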
The training accuracy of the ResNet50 network model was about 5% higher than that of the VGG19 model, and also higher than those of the VGG16 and Inception v3 models. Its loss value was also about 90% lower than those of the other models. It was therefore inferred that the ResNet50 model was better suited to identifying the key developmental stages of rice. Nevertheless, the accuracy and loss of the ResNet50 model varied greatly between the Adam and RAdam optimizers. RAdam converged faster than Adam and showed higher stability, even though the per-step times were similar (11 s per step for Adam versus 12 s for RAdam). Multiple experiments on batch size and learning rate were performed to further evaluate the performance of the ResNet50 model. Training time was reduced by 737 s when the learning rate was set to 0.001 and the batch size to 32. Subsequently, five experiments were performed with the ResNet50 network model to train on the datasets of rice images covering the different developmental stages. The accuracies of the training and validation sets reached 99.53% and 97.66%, respectively, at the 18th training iteration, and remained stable as iterative training continued. The constructed CNN model can thus be expected to recognize rice images at different developmental stages with an average recognition accuracy of 97.33%, high network stability, and fast convergence. These findings provide an effective way to automatically monitor the development stages of rice in intelligent agriculture.