Cross-variety identification of maize seed storage year using hyperspectral imaging combined with physicochemical parameters

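    • Summary: Aging and deterioration are key factors affecting the vigor of maize seeds, so identifying seed storage time is important for germplasm conservation and seed quality assessment. To classify the storage year of maize seeds accurately across varieties, this study combined the correlation between physicochemical parameters and hyperspectral bands for seeds of 103 maize varieties with machine learning algorithms. The trends of the internal physicochemical parameters with increasing storage year were analyzed, and the parameters that changed consistently with storage year were selected. Classification models were built on full-band spectral data and on feature-wavelength-selected spectral data from the embryo side and the endosperm side, and the best classification model was identified. Pearson correlation analysis was performed between the consistently changing physicochemical parameters and the hyperspectral bands, and the correlations were used to select feature bands for modeling. The modeling accuracies of feature wavelength selection, correlation coefficient selection, and the combination of the two were compared, and the cross-variety detection capability of the model was verified. The back propagation neural network (BPNN) classification model built on the combination of correlation coefficient selection and feature wavelength selection achieved the highest accuracy: 92.3% for single-kernel prediction and 94.4% for group prediction. Validation on multiple maize varieties showed that the model generalizes well. The results provide a theoretical basis for the precise management of maize seed storage and have practical significance for revitalizing the seed industry.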
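The correlation-based band selection described above can be illustrated with a minimal sketch. It assumes that each seed is represented by one mean spectrum and one measured value of a physicochemical parameter, and that a band is kept when the absolute Pearson coefficient reaches a cutoff. The function names, the toy data, and the cutoff of 0.6 are hypothetical; the abstract does not state the threshold used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def select_bands_by_correlation(spectra, parameter, threshold=0.6):
    """Return indices of bands whose |r| with the parameter reaches the threshold.

    spectra:   list of samples, each a list of reflectance values per band
    parameter: one physicochemical value per sample
    threshold: hypothetical cutoff, not taken from the study
    """
    n_bands = len(spectra[0])
    selected = []
    for band in range(n_bands):
        column = [sample[band] for sample in spectra]
        if abs(pearson_r(column, parameter)) >= threshold:
            selected.append(band)
    return selected


# Toy data: 4 seeds x 3 bands, plus one parameter value per seed.
spectra = [
    [0.10, 0.50, 0.25],
    [0.20, 0.40, 0.25],
    [0.30, 0.45, 0.25],
    [0.40, 0.55, 0.25],
]
parameter = [1.0, 2.0, 3.0, 4.0]

selected = select_bands_by_correlation(spectra, parameter, threshold=0.6)
```

In this toy example only the first band tracks the parameter closely enough to be kept. The retained bands could then be passed on to a wavelength-selection step or directly to a classifier.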

       

      Abstract: Aging and deterioration are key factors affecting the vigor of maize seeds. Identifying the storage time of seeds is therefore of great significance for the preservation of germplasm resources and for seed quality assessment. This study aimed to identify maize seeds stored for different numbers of years by combining machine learning with the correlation between physicochemical parameters and hyperspectral bands, using seeds of 103 maize varieties. Hyperspectral imaging was used to acquire spectral data from both the embryo side and the endosperm side of each seed. A near-infrared component analyzer (IM9500) and a portable ultraviolet-visible fluorescence spectrometer (Multi-plex) were used to measure the physicochemical parameters of the seeds. Analysis of variance and linear discriminant analysis were conducted to determine how the physicochemical parameters changed with increasing storage time and how strongly each parameter contributed to the discrimination of storage time. The parameters that influenced this discrimination were then selected. Black-and-white correction was applied to the hyperspectral images, and threshold segmentation was used to separate the background and obtain the region of interest (ROI). The mean spectrum of all pixels in the ROI served as the spectral data of the seed. Five preprocessing methods, namely Savitzky-Golay (SG) smoothing, standard normal variate (SNV) transformation, multiplicative scatter correction (MSC), first derivative (1-Der), and second derivative (2-Der), were applied to the spectral data to reduce interference introduced during spectral acquisition, such as background noise, baseline drift, and stray light. Because hyperspectral images contain many bands and much redundant information, feature wavelengths were selected from the full spectrum using competitive adaptive reweighted sampling (CARS) and uninformative variable elimination (UVE). Support vector machine (SVM), back propagation neural network (BPNN), and convolutional neural network (CNN) classification models were built with the preprocessed spectral data from the embryo and endosperm sides as input, and the classification results of the different models were compared. Spectral data from the embryo side discriminated storage time better than data from the endosperm side, and models built on preprocessed, feature-selected spectral data significantly outperformed those built on raw data. Pearson correlation analysis was conducted between the hyperspectral bands and the physicochemical parameters that significantly affected the discrimination of storage time, and the highly correlated bands were selected as feature bands for modeling. The modeling accuracies of feature wavelength selection, correlation coefficient selection, and the combination of the two were compared. The model was also validated across multiple maize varieties, which indicated high generalization ability. The BPNN classification model built on the combination of correlation coefficient selection and feature wavelength selection achieved the highest accuracy, with a single-kernel prediction accuracy of 92.3% and a group prediction accuracy of 94.4%. These findings provide a theoretical basis for the precise management of maize seed storage and have practical implications for the seed industry.
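The first steps of the spectral pipeline (black-and-white correction, threshold segmentation of the ROI, ROI mean spectrum, and SNV) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the image is a flat list of pixel spectra, the correction uses the common formula (raw - dark) / (white - dark), and the segmentation band and threshold value are hypothetical.

```python
import math


def reflectance_correction(raw, white, dark):
    """Black-and-white correction of one pixel: (raw - dark) / (white - dark) per band."""
    return [(r - d) / (w - d) for r, w, d in zip(raw, white, dark)]


def roi_mean_spectrum(pixels, band_index, threshold):
    """Average the spectra of pixels whose reflectance at band_index exceeds threshold.

    band_index and threshold are hypothetical segmentation settings.
    """
    roi = [p for p in pixels if p[band_index] > threshold]
    if not roi:
        raise ValueError("threshold removed every pixel")
    n_bands = len(roi[0])
    return [sum(p[b] for p in roi) / len(roi) for b in range(n_bands)]


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]


# Toy data: white and dark reference spectra and three raw pixels with 3 bands each.
white = [900.0, 900.0, 900.0]
dark = [100.0, 100.0, 100.0]
raw_pixels = [
    [500.0, 700.0, 300.0],  # seed pixel
    [300.0, 500.0, 100.0],  # seed pixel
    [120.0, 110.0, 130.0],  # background pixel
]

corrected = [reflectance_correction(p, white, dark) for p in raw_pixels]
mean_spectrum = roi_mean_spectrum(corrected, band_index=1, threshold=0.2)
snv_spectrum = snv(mean_spectrum)
```

SNV here uses the sample standard deviation (n - 1 in the denominator). SG smoothing, MSC, and the derivative transforms would operate on the same per-seed mean spectrum.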

       
