Abstract:
The increasing severity of global climate change poses significant challenges to agricultural production and food security. In recent years, the Chinese government has placed considerable emphasis on remote sensing technologies and has made remarkable advances, particularly in satellite remote sensing, where China has established itself as a global leader. As a major agricultural country, China needs to strengthen agricultural remote sensing monitoring to obtain timely and accurate information on crop types and planting structures. This information is vital for agricultural surveys, crop condition monitoring, yield estimation, and disaster impact assessment. Making full use of existing remote sensing data and ground sample information, together with feature engineering and machine learning classification, is an essential route to improving crop classification accuracy. As remote sensing technology advances, quantitative monitoring based on satellite-derived spatiotemporal big data has become a key direction in precision agriculture. Although individual remote sensing images and multi-feature selection are important for crop recognition and classification, feature selection algorithms and machine learning methods differ substantially in how well they adapt to different scenarios, which causes significant fluctuations in recognition effectiveness and classification accuracy. This study focuses on Ying Shang County in Anhui Province and uses Sentinel-2 and GF-3 satellite imagery to extract 58 feature indicators covering spectral, index, texture, and polarization characteristics.
Subsequently, three feature selection algorithms and three machine learning methods were combined into three experimental schemes to explore the effects of feature selection and machine learning techniques on crop classification. The schemes were evaluated by comparing feature dimensions and classification accuracy. The main findings are as follows. (1) Feature selection effectively reduces dimensionality and avoids the data redundancy associated with excessively high dimensions. Among the algorithms assessed, Relief F reduced the feature dimension substantially while maintaining effective crop classification performance: it selected 28 features, 4 fewer than RF_RFE and 22 fewer than CST. (2) The comparison of the three feature selection algorithms indicated that the red-edge features B5, B6, and B7 were all integral to classification, underscoring their critical role in crop recognition. Bands B11 and B12, along with texture features, further improved recognition accuracy. The algorithms differed in which features they preferred; Relief F tended to favor spectral and index features. (3) A comparison of the three machine learning methods showed that Relief F combined with Random Forest (RF) yielded the most accurate crop classification for the study area, while XGBoost and Support Vector Machine (SVM) performed worse. With Relief F feature selection, RF achieved an overall accuracy of 93.39%, a Kappa coefficient of 0.8933, and an F1 score of 94.31%. On these three metrics, RF outperformed XGBoost by 1.36%, 0.021, and 1.31%, and outperformed SVM by 8.81%, 0.1312, and 8.78%, respectively. These results support the effectiveness of combining Relief F feature selection with Random Forest classification. They show that optimizing feature selection and choosing a suitable machine learning method can substantially improve crop classification performance. The findings contribute to the effective use of remote sensing technologies and to better agricultural monitoring and decision-making. By clarifying the relationship between feature selection methods and classification accuracy, this study provides a technical approach for precise crop recognition and classification.
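
For readers less familiar with the three accuracy measures reported above, the following minimal sketch shows how overall accuracy, the Kappa coefficient, and an F1 score can be derived from a confusion matrix. It is an illustration only, not the authors' code: the function name `classification_metrics` and the 3-class matrix are hypothetical, and macro averaging of the per-class F1 scores is an assumption, since the abstract does not state which averaging was used.

```python
from typing import List, Tuple


def classification_metrics(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (overall accuracy, Cohen's kappa, macro F1) for a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    row_totals = [sum(row) for row in cm]  # samples per true class
    col_totals = [sum(cm[i][j] for i in range(n_classes)) for j in range(n_classes)]
    correct = sum(cm[k][k] for k in range(n_classes))

    # Overall accuracy: share of samples on the diagonal.
    overall_accuracy = correct / total

    # Kappa: agreement beyond what the class marginals would give by chance.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (overall_accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    # Per-class F1, then an unweighted (macro) mean -- an assumed averaging choice.
    f1_scores = []
    for k in range(n_classes):
        tp = cm[k][k]
        precision = tp / col_totals[k] if col_totals[k] else 0.0
        recall = tp / row_totals[k] if row_totals[k] else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / n_classes

    return overall_accuracy, kappa, macro_f1


# Hypothetical confusion matrix for three crop classes (illustrative values only).
example_cm = [
    [50, 2, 3],
    [4, 40, 1],
    [1, 2, 47],
]
oa, kappa, f1 = classification_metrics(example_cm)
print(f"OA={oa:.4f}  Kappa={kappa:.4f}  macro-F1={f1:.4f}")
```

Because Kappa discounts chance agreement, it is always at most the overall accuracy for the same matrix, which is consistent with the reported pair of 93.39% and 0.8933.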