Abstract:
Following the raw cotton grading standard issued by the Chinese government, 14 texture features were extracted in HSI color space to describe the color and impurity content of raw cotton, and 16 shape features were extracted to describe its size and geometric structure. The resulting 30-dimensional feature space suffers from the curse of dimensionality and therefore requires dimensionality reduction. Feature selection for raw cotton quality grading is an NP-hard problem. An algorithm is proposed that combines cross-validation, a hybrid Filter-Wrapper scheme, and heuristic search. First, for each subset size l (l = 1, 2, 3, …, 30), the optimal l-feature subset was selected on each of the 10 training sets of a 10-fold cross-validation. This step used a heuristic search strategy, comprising optimal scalar feature combination and floating search, together with a filter whose evaluation function is a class-separability criterion. Second, the size of the optimal feature subset was chosen among l = 1, 2, 3, …, 30 by a wrapper whose evaluation function is the error rate of a Bayes classifier: the chosen size is the one at which the average error rate of the 10 optimal feature subsets on their 10 corresponding validation sets reaches its minimum. Finally, the average error rate of the 10 optimal feature subsets of that size was verified on the prediction set. Experimental results showed that the average classification rate of the 10 optimal feature subsets on the prediction set was 88.39%, indicating that the feature selection algorithm based on hybrid Filter-Wrapper and floating search is both efficient and effective.
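The first two stages of the procedure (filter-based subset selection per fold, then wrapper-based choice of the subset size) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the class-separability criterion is replaced by a per-feature Fisher score, the search is plain scalar ranking without the floating search step, the Bayes classifier is taken to be Gaussian naive Bayes, and all function names and the synthetic data in `demo_data` are hypothetical.

```python
import math
import random
from statistics import mean, pvariance

def fisher_score(X, y, j):
    """Assumed filter criterion: between-class over within-class variance of feature j."""
    col = [row[j] for row in X]
    overall = mean(col)
    between = within = 0.0
    for c in set(y):
        vals = [v for v, lab in zip(col, y) if lab == c]
        between += len(vals) * (mean(vals) - overall) ** 2
        within += len(vals) * pvariance(vals)
    return between / (within + 1e-12)

def rank_features(X, y):
    """Scalar feature ranking only; the floating search of the paper is omitted here."""
    return sorted(range(len(X[0])), key=lambda j: fisher_score(X, y, j), reverse=True)

def fit_gnb(X, y, feats):
    """Gaussian naive Bayes (an assumed form of the Bayes classifier)."""
    model = {}
    for c in set(y):
        rows = [r for r, lab in zip(X, y) if lab == c]
        stats = [(mean([r[j] for r in rows]), pvariance([r[j] for r in rows]) + 1e-6)
                 for j in feats]
        model[c] = (math.log(len(rows) / len(y)), stats)
    return model

def predict_gnb(model, row, feats):
    def log_post(c):
        prior, stats = model[c]
        return prior + sum(-0.5 * math.log(2 * math.pi * var) - (row[j] - mu) ** 2 / (2 * var)
                           for j, (mu, var) in zip(feats, stats))
    return max(model, key=log_post)

def error_rate(model, X, y, feats):
    wrong = sum(predict_gnb(model, r, feats) != lab for r, lab in zip(X, y))
    return wrong / len(y)

def select_subset_size(X, y, k=10, seed=0):
    """Filter step per fold, wrapper step across folds; returns (best size, avg errors, rankings)."""
    n, d = len(X), len(X[0])
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = [[] for _ in range(d)]  # errors[l - 1] holds the k validation errors for size l
    rankings = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
        Xva, yva = [X[i] for i in fold], [y[i] for i in fold]
        order = rank_features(Xtr, ytr)          # filter: best l-subset = top-l ranked features
        rankings.append(order)
        for l in range(1, d + 1):                # wrapper: classifier error for each size l
            feats = order[:l]
            errors[l - 1].append(error_rate(fit_gnb(Xtr, ytr, feats), Xva, yva, feats))
    avg = [mean(e) for e in errors]
    best_l = min(range(1, d + 1), key=lambda l: avg[l - 1])  # ties go to the smallest size
    return best_l, avg, rankings

def demo_data(n=200, d=6, seed=1):
    """Hypothetical synthetic data: two classes, features 0 and 1 informative, the rest noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(3.0 * lab if j < 2 else 0.0, 1.0) for j in range(d)] for lab in y]
    return X, y
```

Calling `select_subset_size(*demo_data())` returns the subset size with the lowest average validation error, the per-size average errors, and the feature ranking found on each fold. The third stage described in the abstract would then evaluate the per-fold subsets of that size on a separate prediction set that was never used for selection.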