Abstract:
As one of the most popular nuts produced in China, the pecan contains large amounts of protein and a variety of unsaturated fatty acids required by the human body. However, pecans are prone to rancidity under the influence of environmental factors such as light, oxygen, and moisture, so detecting pecan quality is of practical significance. As a bionic electronic system, the electronic nose (E-nose) assesses pecan quality qualitatively and quantitatively by analyzing the fingerprint information of the volatile gases released by a sample, which makes it well suited to pecan quality detection. However, pecan odor has a complex composition, and the differences among pecans of different quality are small, which makes detection difficult. To improve detection accuracy, it is essential to optimize the sensor array of the E-nose for the application. In this research, an embedded E-nose based on a digital signal processor (DSP) was designed for pecan detection, and 4 batches of pecans with different aging times were used in the experiment. Based on existing GC-MS (gas chromatography-mass spectrometry) analysis of pecan volatiles, 13 gas sensors were selected; the response curve of each sensor was then analyzed, and sensors with a small response were discarded. Next, 3 feature extraction methods were applied to each remaining sensor's response to generate the initial feature matrix: the mean differential coefficient value, the stable value, and the response area value. After that, a series of data analysis methods was applied to select well-performing features and thereby optimize the array. First, features showing little difference between samples were rejected by mean analysis. Then, the coefficient of variation was used to remove features with poor stability.
Afterwards, the retained features were grouped by correlation-based cluster analysis, and the feature with the minimum redundancy in each cluster was selected according to the correlation coefficient analysis. Finally, multicollinearity in the feature matrix was reduced by removing features with a high variance inflation factor (VIF), and the optimized sensor array was chosen according to the final feature matrix. To verify the optimization, principal component analysis (PCA) and partial least squares regression (PLSR) were used to compare the discrimination and prediction ability of the data before and after optimization. The results indicated that pecans with different aging times were well classified using the optimized array. Each group of samples clustered closely in the PCA score plot, and the contribution rates of the first 2 principal components for the optimized array (76.01% and 14.60%, respectively) were clearly higher than those for the pre-optimized array (66.36% and 13.45%, respectively). Meanwhile, the PLSR results showed that the regression model based on the optimized array (R²=0.9334, RMSE=1.4529 d) outperformed the model based on the pre-optimized array (R²=0.8887, RMSE=2.5092 d) in both the coefficient of determination and the root mean square error (RMSE). The prediction parameters differed little between the training set and the validation set, which indicated that no overfitting occurred and that the optimized array had better predictive ability. Thus, through optimization of the sensor array, the E-nose can perform better in detecting pecan quality while reducing the dimension of the data, and the research provides an efficient method for applying the E-nose in various fields.
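
The final pruning step described above, removing features with a high variance inflation factor, can be illustrated with a minimal Python sketch. It is not the implementation used in this research. It assumes that the VIF of each feature is read from the diagonal of the inverse correlation matrix (a standard identity), that features are dropped one at a time until every VIF falls below a threshold, and that the threshold is 10, a common rule of thumb that is not taken from this work. The feature names and values are hypothetical.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular: features are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(features):
    """Return {name: VIF}; each VIF is a diagonal entry of the inverse correlation matrix."""
    names = list(features)
    corr = [[pearson(features[a], features[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}


def prune_by_vif(features, vif_max=10.0):
    """Repeatedly drop the feature with the largest VIF until all are <= vif_max."""
    kept = dict(features)
    while len(kept) > 1:
        scores = vif(kept)
        worst = max(scores, key=scores.get)
        if scores[worst] <= vif_max:
            break
        del kept[worst]
    return kept


# Hypothetical feature columns (one value per sample); s2 is nearly 2 * s1.
example = {
    "s1_area": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "s2_area": [2.1, 3.9, 6.2, 7.8, 10.1, 12.0],
    "s3_stable": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
}
kept = prune_by_vif(example, vif_max=10.0)
print(sorted(kept))
```

Dropping one feature per iteration, rather than every feature above the threshold at once, avoids discarding both members of a collinear pair when removing either one would be enough.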