Abstract:
Coptis chinensis (Huanglian) is a significant traditional Chinese medicinal herb whose clinical efficacy is closely associated with its geographical origin, which directly influences its chemical composition, pharmacological activity, and market value. However, conventional analytical methods for verifying the authenticity and origin of Huanglian, such as high-performance liquid chromatography (HPLC) and mass spectrometry, are often constrained by high instrument costs, complex sample preparation, stringent operational requirements, and the need for skilled personnel, which limits their applicability in rapid, on-site quality assessment. To overcome these challenges, this study proposes a cost-effective approach for discriminating the geographical origins of
Coptis chinensis using an electronic nose (e-nose) system, which captures volatile organic compound (VOC) profiles as fingerprint signatures reflective of regional growing conditions. To enhance the discrimination capability and efficiency of the e-nose, a novel feature optimization algorithm for gas sensor arrays is developed, based on a hybrid weighting index that integrates mutual information (MI) and the Hilbert-Schmidt independence criterion (HSIC), termed the MI-HSIC method. This hybrid index leverages the complementary strengths of MI in capturing nonlinear dependencies and HSIC in measuring statistical independence in reproducing kernel Hilbert spaces, thereby providing a more comprehensive evaluation of feature relevance and redundancy. To further improve the adaptability of the feature selection process, an improved Bayesian optimization algorithm is employed, in which a Bayesian neural network (BNN) replaces the conventional Gaussian process as the surrogate model. The BNN-based optimizer dynamically adjusts the relative weights of MI and HSIC during the iterative optimization process, enabling adaptive screening of critical gas-sensing features while accounting for uncertainty in model predictions through probabilistic inference. Subsequently, three well-established machine learning classifiers, namely support vector machine (SVM), K-nearest neighbors (KNN), and random forest (RF), are applied to perform geographical traceability analysis on both sliced and powdered
Coptis chinensis samples collected from six major authentic production regions across China. Experimental results demonstrate that the proposed MI-HSIC hybrid weighting method improves classification accuracy while substantially reducing feature dimensionality. Specifically, the SVM, KNN, and RF classifiers achieve test accuracies of 96.25%, 93.33%, and 94.58%, respectively, using only 6, 13, and 20 optimized features. These results represent accuracy improvements of 2.91% to 4.58% over baseline models that use all 70 original sensor features, while reducing the number of features by 71.4% to 91.4%, thus enhancing computational efficiency and model interpretability. Furthermore, compared with feature optimization methods relying solely on either MI or HSIC, the proposed MI-HSIC approach demonstrates superior overall performance, achieving higher classification accuracy with fewer features. In addition, the proposed BNN-based Bayesian optimization algorithm is more computationally efficient: it reduces the single-iteration time by 19.8% to 36.0% compared with traditional Gaussian process-based Bayesian optimization, which accelerates hyperparameter tuning and improves scalability. The robustness of the method is further validated on independent test sets, confirming its generalization ability across different physical forms of
Coptis chinensis. The study confirms that the integration of the MI-HSIC hybrid weighting index with the BNN-enhanced Bayesian optimization enables effective dimensionality reduction and performance enhancement in e-nose systems. Overall, the proposed methodology provides a robust, accurate, and cost-effective solution for the geographical authentication of genuine
Coptis chinensis, offering a promising alternative to conventional analytical techniques. This work contributes to the advancement of intelligent sensing technologies in the quality control of traditional Chinese medicines, paving the way for rapid, non-destructive, and on-site herb authentication in practical applications.
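
To make the idea of a hybrid MI-HSIC weighting index concrete, the following is a minimal sketch rather than the method used in the study. Every detail is an assumption made for illustration: the histogram estimator of MI, the biased HSIC estimator with an RBF kernel on the feature and a delta kernel on the class labels, the min-max normalisation, the linear combination with a fixed weight w (which the study instead tunes with the BNN-based Bayesian optimizer), and all function names.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=4):
    """Histogram estimate of MI (in nats) between a feature x and class labels y."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature
    # Equal-width binning; the maximum value is clamped into the last bin.
    xb = [min(int((v - lo) / width), bins - 1) for v in x]
    n = len(x)
    pxy, px, py = Counter(zip(xb, y)), Counter(xb), Counter(y)
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def hsic(x, y, sigma=1.0):
    """Biased HSIC estimate: RBF kernel on x, delta kernel on the labels y."""
    n = len(x)
    K = [[math.exp(-(a - b) ** 2 / (2 * sigma ** 2)) for b in x] for a in x]
    L = [[1.0 if a == b else 0.0 for b in y] for a in y]
    row = [sum(r) / n for r in K]  # K is symmetric, so row means = column means
    grand = sum(row) / n
    # trace(K H L H) equals the elementwise product of the centred K with L.
    total = sum((K[i][j] - row[i] - row[j] + grand) * L[i][j]
                for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


def _minmax(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def hybrid_scores(features, labels, w=0.5):
    """Assumed hybrid index: w * MI + (1 - w) * HSIC after min-max scaling."""
    mi = _minmax([mutual_information(f, labels) for f in features])
    hs = _minmax([hsic(f, labels) for f in features])
    return [w * m + (1 - w) * h for m, h in zip(mi, hs)]


# Toy data (hypothetical): two classes, one informative and one uninformative feature.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
informative = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4]
noise = [0.1, 2.2, 0.3, 2.4, 2.1, 0.2, 2.3, 0.4]
scores = hybrid_scores([informative, noise], labels, w=0.5)
```

In this toy setting the informative feature receives the higher hybrid score, and features would be ranked by that score before the top-ranked subset is passed to a classifier. In practice w would be searched rather than fixed, which is the role the abstract assigns to the Bayesian optimizer.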