Lodging resistance prediction of maize varieties based on support vector machine and ReliefF algorithm

Zhang Tianliang; Zhang Dongxing; Cui Tao; Yang Li; Ding Youqiang; Xie Chunji; Du Zhaohui; Zhong Xiangjun

doi:10.11975/j.issn.1002-6819.2021.20.026

Zhang Tianliang, Zhang Dongxing, Cui Tao, Yang Li, Ding Youqiang, Xie Chunji, Du Zhaohui, Zhong Xiangjun. Lodging resistance prediction of maize varieties based on support vector machine and ReliefF algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(20): 226-233. DOI: 10.11975/j.issn.1002-6819.2021.20.026

Abstract

Maize is one of the main food crops in the world. Lodging poses a serious challenge to maize yield and mechanized harvesting in modern agriculture. Current methods for identifying lodging resistance are time-consuming and laborious, and therefore cannot fully meet the needs of maize breeding, which already has a long cycle. In this study, hyperspectral imaging was combined with statistical learning to predict the lodging resistance of maize varieties during the vegetative growth period. Field trials were carried out in 2018 and 2019. Hyperspectral images were collected of the top leaves of 8 maize varieties, both lodging-resistant and lodging-prone, at the 9-leaf stage. The procedure was as follows. Threshold segmentation was first used to identify the leaf area. K-means clustering was then used to divide the leaf into three areas: normal reflection, dark reflection, and leaf vein. The average spectral curve of the normal reflection area was finally extracted to analyze the spectral characteristics of lodging-resistant and lodging-prone samples. The Kennard-Stone algorithm was used to rank the samples of each variety, which were then divided into training and test sets at a ratio of 3:1. The per-variety splits were merged into the final training and test sets so that every variety was evenly represented. This gave 378 training and 120 test samples for the 2018 trial, and 383 training and 120 test samples for the 2019 trial. The filter-type feature selection algorithm ReliefF and Principal Component Analysis (PCA) were used to mine the spectral features that separate lodging-resistant from lodging-prone varieties. Specifically, ReliefF was run with different numbers of nearest neighbors, and features were determined according to the stability of the feature variables across these settings.
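The per-variety Kennard-Stone split into training and test sets can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `kennard_stone_order` and `split_per_variety`, the use of Euclidean distance, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def kennard_stone_order(X):
    """Return sample indices in Kennard-Stone selection order."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n < 2:
        return list(range(n))
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    order = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in (i, j)]
    while remaining:
        # Distance from each remaining sample to its nearest selected sample.
        min_d = dist[np.ix_(remaining, order)].min(axis=1)
        # Pick the sample that is farthest from the selected set.
        nxt = remaining[int(np.argmax(min_d))]
        order.append(nxt)
        remaining.remove(nxt)
    return order


def split_per_variety(X, variety, train_fraction=0.75):
    """Rank samples within each variety, split 3:1, then pool the splits."""
    X = np.asarray(X, dtype=float)
    variety = np.asarray(variety)
    train_idx, test_idx = [], []
    for v in np.unique(variety):
        idx = np.flatnonzero(variety == v)
        order = kennard_stone_order(X[idx])
        n_train = int(round(train_fraction * len(idx)))
        train_idx.extend(idx[order[:n_train]])
        test_idx.extend(idx[order[n_train:]])
    return np.array(train_idx), np.array(test_idx)


# Synthetic stand-in for mean leaf spectra: 2 varieties x 8 samples x 5 bands.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(16, 5))
variety_demo = np.repeat(["A", "B"], 8)
train_idx, test_idx = split_per_variety(X_demo, variety_demo)
```

Splitting inside each variety before pooling keeps every variety represented in both sets at the same ratio, which is the property the study relies on.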
Because adjacent bands are highly correlated, the features selected directly from the spectra were often redundant. PCA was therefore first performed on the spectral data, and ReliefF was then used to select principal components, which avoids redundant features. Two classification models were established: ReliefF-Support Vector Machine (ReliefF-SVM), in which ReliefF selected features from the original spectral data, and PCAReliefF-SVM, in which ReliefF selected principal component features. Grid search was used to optimize the penalty and kernel parameters of the SVM for better prediction. First, cross-validation on the training set was used to optimize the number of selected features. Forty and 50 features were selected to build the models for the 2018 and 2019 trials, respectively, to balance model accuracy against computational complexity. All training-set samples were then used to train the model with the final parameters. The prediction accuracy of the PCAReliefF-SVM model was 85.00% in 2018 and 85.83% in 2019, whereas that of the ReliefF-SVM model was 84.17% in both years, indicating that the PCAReliefF-SVM model predicted better. The ROC curve was also used to evaluate model performance. The ROC curve of the PCAReliefF-SVM model almost completely "enclosed" that of the ReliefF-SVM model, which again indicates better performance of the PCAReliefF-SVM model. Hyperspectral imaging was thus shown to be usable for the early classification of maize varieties by lodging resistance. The findings provide a workable approach to assessing maize lodging resistance through spectral extraction, feature analysis, and predictive modeling.
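The PCA, ReliefF, and grid-searched SVM chain described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: `relieff_weights` is a hand-written, simplified ReliefF (Manhattan distance, every sample used as a query), the data are synthetic, and the component count, number of selected features, neighbor count, and parameter grid are placeholder values. PCA, SVC, GridSearchCV, and train_test_split are from scikit-learn.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC


def relieff_weights(X, y, n_neighbors=10):
    """Simplified ReliefF feature weights (higher means more relevant)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Scale each feature to [0, 1] so differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    classes, y_idx, counts = np.unique(y, return_inverse=True, return_counts=True)
    prior = counts / n
    w = np.zeros(p)
    for i in range(n):
        diff = np.abs(Xs - Xs[i])      # per-feature differences to sample i
        d = diff.sum(axis=1)           # Manhattan distance to sample i
        for c in range(len(classes)):
            idx = np.flatnonzero(y_idx == c)
            idx = idx[idx != i]
            k = min(n_neighbors, len(idx))
            if k == 0:
                continue
            nearest = idx[np.argsort(d[idx])[:k]]
            mean_diff = diff[nearest].mean(axis=0)
            if c == y_idx[i]:
                w -= mean_diff         # nearest hits: penalize differences
            else:
                # Nearest misses: reward differences, weighted by class prior.
                w += prior[c] / (1.0 - prior[y_idx[i]]) * mean_diff
    return w / n


# Synthetic stand-in for spectra: 200 samples x 50 bands, 5 informative bands.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = rng.normal(size=(200, 50))
X[:, :5] += 2.0 * y[:, None]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Step 1: PCA removes the correlation between input features.
pca = PCA(n_components=20).fit(X_train)
Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)

# Step 2: ReliefF ranks the principal components; keep the top 10.
weights = relieff_weights(Z_train, y_train, n_neighbors=10)
selected = np.argsort(weights)[::-1][:10]

# Step 3: grid search over the SVM penalty (C) and RBF kernel width (gamma).
grid = GridSearchCV(
    SVC(kernel="rbf"),
    {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]},
    cv=5)
grid.fit(Z_train[:, selected], y_train)
accuracy = grid.score(Z_test[:, selected], y_test)
```

For brevity the sketch fixes the number of selected components and ranks them once on the whole training set; the study instead tuned the number of features by cross-validation on the training set.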
