Fast identification of watermelon seed variety using near infrared hyperspectral imaging technology
Abstract
Seed variety selection plays a vital role in watermelon planting, because the variety of the seeds directly affects the yield and quality of the watermelons. In this study, we aimed to identify the cultivars of watermelon seeds using hyperspectral imaging, a novel, rapid, non-invasive, and low-cost technique. A total of 121 samples from four cultivars of watermelon seeds were investigated, and a near-infrared hyperspectral imaging system (874-1734 nm, 256 bands) was set up to acquire hyperspectral images of the samples. A region of interest (ROI) of 15×15 pixels was defined in the hyperspectral image of each sample, and the average reflectance spectrum of the ROI was extracted. To exclude the obviously noisy bands at the two ends of the spectra, only the spectral range 1042-1646 nm was used for analysis. To reduce the noise remaining within this range, the 121 extracted reflectance spectra were preprocessed by Savitzky-Golay smoothing (SG), Empirical Mode Decomposition (EMD), and Wavelet Transform (WT). Sensitive wavelengths were then selected from the preprocessed spectra by the Successive Projections Algorithm (SPA) and by Genetic Algorithm-partial least squares (GA-PLS). The number of sensitive wavelengths selected depended on both the variable selection method and the preprocessing method: SPA selected 24, 16, and 15 sensitive wavelengths from spectra preprocessed by SG, EMD, and WT, respectively, whereas GA-PLS selected 38, 33, and 32 sensitive wavelengths from spectra preprocessed by SG, EMD, and WT, respectively. Partial least squares-discriminant analysis (PLS-DA) was used to build discriminant models on the full spectra, and back-propagation neural network (BPNN) and extreme learning machine (ELM) were used to build discriminant models on the selected wavelength variables.
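The spectral extraction and smoothing steps described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation. It assumes that the 256 bands are evenly spaced between 874 and 1734 nm and that the hyperspectral image is a NumPy array of shape (rows, cols, bands). The SG window length (7) and polynomial order (2) are placeholder values, since the abstract does not report them. All function names are hypothetical.

```python
import numpy as np

def band_wavelengths(start_nm=874.0, stop_nm=1734.0, n_bands=256):
    # Assumption: band centres are evenly spaced over the sensor range.
    return np.linspace(start_nm, stop_nm, n_bands)

def roi_mean_spectrum(cube, row, col, size=15):
    # Average the spectra of a size x size ROI whose top-left pixel is (row, col).
    roi = cube[row:row + size, col:col + size, :]
    if roi.shape[:2] != (size, size):
        raise ValueError("ROI falls outside the image")
    return roi.reshape(-1, cube.shape[2]).mean(axis=0)

def crop_range(wavelengths, spectrum, lo_nm=1042.0, hi_nm=1646.0):
    # Keep only the bands inside the range used for analysis.
    mask = (wavelengths >= lo_nm) & (wavelengths <= hi_nm)
    return wavelengths[mask], spectrum[mask]

def savgol_smooth(spectrum, window=7, polyorder=2):
    # Savitzky-Golay smoothing: fit a polynomial by least squares in each
    # sliding window and keep the fitted value at the window centre.
    # window and polyorder are assumed values, not taken from the paper.
    half = window // 2
    x = np.arange(-half, half + 1)
    design = np.vander(x, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the weights of the fitted value at x = 0.
    weights = np.linalg.pinv(design)[0]
    padded = np.pad(spectrum, half, mode="edge")  # simple edge handling
    return np.convolve(padded, weights[::-1], mode="valid")
```

A typical call sequence would be `roi_mean_spectrum`, then `crop_range`, then `savgol_smooth`, giving one smoothed spectrum per seed sample. The edge padding used here is only one of several ways to treat the first and last few bands.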
The PLS-DA model built on spectra preprocessed by EMD obtained the best identification rate among all PLS-DA models, with 91.57% in the calibration set and 78.95% in the prediction set. With the same spectral preprocessing method, SPA-BPNN models performed worse than GA-PLS-BPNN models. The SG-GA-PLS-BPNN model performed best among all BPNN models, with an identification rate of 92.77% in the calibration set and 86.84% in the prediction set. ELM models obtained better results than both the PLS-DA models and the BPNN models. All ELM models obtained identification rates over 90% in both the calibration set and the prediction set, and the SG-SPA-ELM, SG-GA-PLS-ELM, and WT-SPA-ELM models reached an identification rate of 100% in both sets. Overall, BPNN and ELM models performed better than PLS-DA models, and the ELM models built on wavelengths selected from SG preprocessed spectra obtained the best results, with 100% classification accuracy for both the calibration set and the prediction set. The SG preprocessing method showed the best performance in all PLS-DA, BPNN, and ELM models. The results indicated that it was feasible to identify watermelon seed varieties with near-infrared hyperspectral imaging, which provides an alternative approach for rapid identification of watermelon seed variety. ELM, a single hidden layer feed-forward network, was an effective classification method for watermelon seed cultivar identification. Moreover, the results showed the great potential of hyperspectral imaging in the seed industry for on-line identification of seed cultivars and detection of seed quality parameters.
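To illustrate what a single hidden layer feed-forward ELM classifier involves, the sketch below follows the standard ELM formulation: input weights and biases are drawn at random and left fixed, and only the output weights are solved in closed form with a pseudo-inverse. It is not the authors' model. The number of hidden nodes, the sigmoid activation, the weight range, and the class name `ELMClassifier` are all assumptions, because the abstract does not specify them.

```python
import numpy as np

class ELMClassifier:
    """Minimal extreme learning machine for multi-class classification (sketch)."""

    def __init__(self, n_hidden=50, seed=0):
        # n_hidden and seed are illustrative choices, not values from the paper.
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random hidden layer (assumed activation).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot encode the cultivar labels as regression targets.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Random input weights and biases stay fixed; they are never trained.
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        # Output weights: least-squares solution via the Moore-Penrose pseudo-inverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        scores = self._hidden(np.asarray(X, dtype=float)) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]
```

In use, `X` would hold one row per seed sample with the reflectance values at the selected sensitive wavelengths, and `y` the cultivar labels; `ELMClassifier().fit(X_cal, y_cal).predict(X_pred)` would then return predicted cultivars for the prediction set. Because training reduces to a single linear solve, fitting is fast compared with iterative back-propagation.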