Abstract: Rapid and accurate identification of maize growth stages is of great significance for the precise management of the maize planting cycle. However, complex backgrounds and outdoor lighting pose a great challenge to the classification and identification of maize growth stages in the field. In this study, an unmanned aerial vehicle (UAV) was used to capture images of maize at four growth stages: seedling, jointing, small-trumpet, and big-trumpet. A Swin Transformer model with transfer learning was used for the rapid identification of maize at the different growth stages. Compared with previous approaches, UAV imaging allowed a much larger area to be monitored at a high recognition rate. Because the UAV acquired images from a single angle, the generalization ability of the model needed to be improved. Maize in the same period grows along consistently oriented ridges, whereas canopy shape and color differ across the four growth periods, so a deep learning model can easily pick up ridge orientation as a classification feature. In addition, the data sets used for model training are usually collected under stable, high-quality conditions. Besides sharpening images algorithmically, the anti-interference ability of the model itself can be improved to reduce the performance degradation caused by image blur in recognition tasks. First, taking the ridge orientation into account, the training set was rotated 8 times to expand the data set, which simulated different UAV shooting angles and improved the generalization ability of the model. Second, Gaussian blur was used to transform the test set 6 times. The aHash, pHash, and dHash histogram algorithms were selected to evaluate the similarity between the original and Gaussian-blurred images, in order to explore the performance of each model on the blurred data sets.
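The data preparation described above (8 rotations of the training set, 6 Gaussian-blurred versions of the test set, hash-based similarity) can be sketched roughly as follows. This is only an illustration in pure Python on a small grayscale grid, not the pipeline used in the study: the rotation angles in `ANGLES`, the blur strengths in `SIGMAS`, and all function names are assumptions, since the abstract gives only the counts. Only simplified aHash and dHash are shown; pHash and the histogram comparison are omitted.

```python
import math

# Hypothetical settings: the abstract states the counts (8 rotations, 6 blur
# levels) but not the actual angles or blur strengths.
ANGLES = [i * 45 for i in range(8)]       # assumed: 0, 45, ..., 315 degrees
SIGMAS = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]   # assumed Gaussian sigmas


def rotate(image, degrees):
    """Rotate a 2-D grayscale grid about its centre (nearest neighbour).

    Output pixels whose source falls outside the image are set to 0.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    cos_t = math.cos(math.radians(degrees))
    sin_t = math.sin(math.radians(degrees))
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: locate the source pixel of each output pixel.
            sx = cos_t * (x - cx) + sin_t * (y - cy) + cx
            sy = -sin_t * (x - cx) + cos_t * (y - cy) + cy
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < width and 0 <= iy < height:
                out[y][x] = image[iy][ix]
    return out


def gaussian_blur(image, sigma):
    """Separable Gaussian blur with border clamping."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    kernel = [w / total for w in weights]

    def blur_rows(grid):
        last = len(grid[0]) - 1
        return [[sum(w * row[min(max(x + k - radius, 0), last)]
                     for k, w in enumerate(kernel))
                 for x in range(last + 1)]
                for row in grid]

    def transpose(grid):
        return [list(col) for col in zip(*grid)]

    # Blur along rows, then along columns (via transposition).
    return transpose(blur_rows(transpose(blur_rows(image))))


def shrink(image, rows, cols):
    """Downsample by block averaging (image must be at least rows x cols)."""
    height, width = len(image), len(image[0])
    small = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        new_row = []
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            new_row.append(sum(block) / len(block))
        small.append(new_row)
    return small


def ahash(image, size=8):
    """Average hash: 1 where a downsampled pixel exceeds the mean."""
    flat = [v for row in shrink(image, size, size) for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]


def dhash(image, size=8):
    """Difference hash: 1 where a pixel is brighter than its right neighbour."""
    small = shrink(image, size, size + 1)
    return [1 if row[c] > row[c + 1] else 0 for row in small for c in range(size)]


def similarity(hash_a, hash_b):
    """Fraction of matching bits (1.0 means identical hashes)."""
    return sum(a == b for a, b in zip(hash_a, hash_b)) / len(hash_a)


# Synthetic striped grid standing in for a UAV image of maize ridges.
field = [[255.0 if (x // 6) % 2 == 0 else 40.0 for x in range(36)]
         for _ in range(36)]
augmented = [rotate(field, angle) for angle in ANGLES]   # 8 rotated copies
for sigma in SIGMAS:                                     # 6 blur levels
    blurred = gaussian_blur(field, sigma)
    print(sigma, similarity(ahash(field), ahash(blurred)),
          similarity(dhash(field), dhash(blurred)))
```

In practice an image library would handle rotation, blurring, and hashing; the pure-Python version is used here only to keep the sketch self-contained.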
A comprehensive evaluation index based on accuracy was also constructed to measure the degree of performance degradation of each model at different levels of image sharpness. Finally, the performance of the Swin-T model was compared with that of the classical convolutional neural networks (CNNs) AlexNet, VGG16, and GoogLeNet. The experiments showed that the overall accuracy of the Swin-T model on the original test set was 98.7%, which was 6.9, 2.7, and 2.0 percentage points higher than that of the AlexNet, VGG16, and GoogLeNet models, respectively. Among the misclassified images, the complex field background affected the recognition accuracy of the seedling and jointing stages. Because weeds resemble maize in appearance, some seedling-stage images were misidentified as the jointing stage. Conversely, some jointing-stage images contained large areas of weeds and the maize canopy features were not prominent, so they were misidentified as the seedling stage. On the blurred data sets, the overall degradation indexes of the AlexNet, VGG16, and GoogLeNet models were 12.38%, 10.38%, and 15.03%, respectively. The overall degradation index of the Swin-T model was only 8.31%, and the model performed best in degradation balance, average degradation index, and maximum accuracy degradation. This indicates that the Swin-T model degraded the least as image quality declined. In terms of both classification accuracy and robustness to blurred input, the Swin-T model can be expected to meet the demanding requirements of classifying and identifying maize growth stages in actual production. The findings can also provide technical support for the intelligent identification and monitoring of maize growth stages.
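The abstract does not give the formula of the degradation index, so the following is only a minimal sketch of how accuracy-based degradation measures of this kind might be computed. The three definitions (average relative accuracy drop, maximum drop, and spread of the drops as a stand-in for "balance"), the function name `degradation_summary`, and the toy accuracies are all assumptions for illustration, not the definitions or results of the study.

```python
from statistics import mean, pstdev


def degradation_summary(clear_accuracy, blurred_accuracies):
    """Summarise accuracy loss across blur levels, in percent.

    Assumed definitions (not taken from the abstract):
      average: mean relative accuracy drop over all blur levels
      maximum: largest relative accuracy drop
      balance: population standard deviation of the drops
               (lower means more even degradation)
    """
    drops = [(clear_accuracy - acc) / clear_accuracy * 100.0
             for acc in blurred_accuracies]
    return {
        "average": mean(drops),
        "maximum": max(drops),
        "balance": pstdev(drops),
    }


# Toy accuracies for six blur levels; made-up numbers, not study results.
toy = degradation_summary(0.95, [0.94, 0.92, 0.90, 0.87, 0.83, 0.78])
for name, value in toy.items():
    print(f"{name}: {value:.2f}%")
```

Under these assumed definitions, a model that is robust to blur would show a small average, a small maximum, and a small spread at the same time.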