Abstract
To improve the discrimination ability of a fish meal quality detection device based on bionic olfaction, we used the device to extract response features from fish meal samples and optimized its sensor array through multi-feature data fusion. First, features were extracted from each sensor's response curve (10 sensors × 6 features) to form the original feature matrix, and the features were then normalized. Compactness was taken as the criterion for evaluating the feature selection methods. Three single-feature ranking methods (MIC, χ², F-test), three multi-feature ranking methods (RF, LR, SVM) and four recursive feature elimination methods (RFRFE, SVMRFE, DTRFE, LRRFE) were tested for classification accuracy on fish meal of different quality. Among the single-feature ranking methods, MIC reached a best classification accuracy of 98.3% with 55 features, the chi-square method 98.9% with 40 features, and the F-test 98.3% with 50 features; the chi-square method was therefore more compact than the other two single-feature methods. Among the multi-feature ranking methods, RF reached a best classification accuracy of 98.3% with 38 features, LR 83.3% with 24 features, and SVM 92.2% with 33 features; the RF method was therefore more compact than the other two multi-feature methods.
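The normalization step can be illustrated with a minimal sketch. The text does not say which normalization was used, so column-wise min-max scaling is an assumption here, and the function name `min_max_normalize` and the toy values are hypothetical rather than taken from the study.

```python
def min_max_normalize(matrix):
    """Scale each column of a row-major matrix to the range [0, 1]."""
    columns = list(zip(*matrix))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to 0.0.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    # Transpose back so that each row is one sample again.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical data: 3 samples x 2 features (not values from the study).
raw = [[1.0, 200.0], [2.0, 400.0], [3.0, 300.0]]
normalized = min_max_normalize(raw)
```

In the setting described above, each row would hold the 60 features of one sample, so that features with different response magnitudes become comparable before ranking.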
Among the recursive feature elimination methods, RFRFE reached a best classification accuracy of 98.3% with 33 features, SVMRFE 92.2% with 34 features, DTRFE 95.6% with 22 features, and LRRFE 83.9% with 37 features. The DTRFE and LR methods selected the fewest features, but their classification accuracy was low, whereas the RFRFE method was comparatively more compact. The random forest-based recursive feature elimination algorithm (RFRFE) was therefore adopted to select from the original features, giving a best classification accuracy of 98.3% with 33 features. This method repeatedly builds a model, identifies the worst features, sets them aside, and repeats the process on the remaining features until all features have been traversed; the order in which features are eliminated gives the feature ranking. It is thus a greedy algorithm for finding the optimal feature subset. RFRFE uses a random forest (RF) as the base model and takes the subset with the best classification accuracy as optimal. RFRFE reduced the number of features from 60 to 33, a reduction of 45%, greatly reducing the irrelevant and redundant information for fish meal quality classification. The optimized sensor array changed markedly: the number of sensors fell from 10 to 8, with sensor 4 (TGS2620) and sensor 6 (TGS2600) removed, indicating that these two sensors contributed little to fish meal quality classification with the RF classifier.
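The greedy elimination loop described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the importance function `class_separation` is a dependency-free stand-in for random forest importance, it assumes binary labels, and all names and toy data are hypothetical.

```python
def recursive_feature_elimination(X, y, importance_fn):
    """Greedy RFE: score the remaining features, drop the worst, repeat.

    Returns feature indices in elimination order (first = removed first).
    """
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        subset = [[row[j] for j in remaining] for row in X]
        scores = importance_fn(subset, y)  # one score per remaining feature
        worst = min(range(len(remaining)), key=lambda k: scores[k])
        eliminated.append(remaining.pop(worst))
    return eliminated


def class_separation(X, y):
    """Stand-in importance: absolute gap between the two class means.

    This is NOT random forest importance; it only keeps the sketch
    free of third-party dependencies. Assumes binary labels 0 and 1.
    """
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return scores


# Hypothetical toy data: feature 0 separates the classes strongly,
# feature 2 weakly, and feature 1 not at all.
X = [[0.0, 0.5, 0.4], [0.1, 0.5, 0.5], [0.9, 0.5, 0.6], [1.0, 0.5, 0.7]]
y = [0, 0, 1, 1]
order = recursive_feature_elimination(X, y, class_separation)
ranking = order[::-1]  # most important feature first

# Reduction reported in the text: 60 original features, 33 retained.
reduction = (60 - 33) / 60
```

With a model-based importance such as a random forest, the scores can change each time the model is rebuilt on the reduced feature set, which is what distinguishes recursive elimination from a single ranking pass; the stand-in above scores each feature independently and so does not show that effect.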
Only sensor 1 (TGS822), sensor 3 (TGS813) and sensor 5 (MQ136) retained all six of their feature values, indicating that these sensors played an important role in fish meal quality classification with the RF classifier. Ten-fold cross-validation again confirmed that the RFRFE algorithm was more compact. This feature selection approach provides a new method and reference for feature optimization when identifying other animal-derived raw material samples with bionic olfaction technology.
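The cross-validation check can be sketched in the same spirit. The code below is a minimal illustration, assuming contiguous unshuffled folds and a nearest-centroid classifier as a stand-in for the RF classifier; the function names and toy data are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(X, y, k, fit_predict):
    """Mean accuracy over k folds; fit_predict(train_X, train_y, test_X) -> labels."""
    accuracies = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        predicted = fit_predict([X[i] for i in train],
                                [y[i] for i in train],
                                [X[i] for i in fold])
        correct = sum(p == y[i] for p, i in zip(predicted, fold))
        accuracies.append(correct / len(fold))
    return sum(accuracies) / len(accuracies)


def nearest_centroid(train_X, train_y, test_X):
    """Stand-in classifier (not random forest): assign the closest class mean."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, t in zip(train_X, train_y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda c: dist(x, centroids[c])) for x in test_X]


# Hypothetical data: 20 samples, 1 feature, two cleanly separated classes
# that alternate, so every contiguous fold of 2 holds one sample per class.
X = [[(i % 2) + 0.01 * i] for i in range(20)]
y = [i % 2 for i in range(20)]
accuracy = cross_val_accuracy(X, y, 10, nearest_centroid)
```

Real data would normally be shuffled or stratified before splitting, since contiguous folds perform badly when samples are ordered by class.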