When near-infrared (NIR) spectroscopy and machine learning are used to rapidly identify the quality of Pu-er tea, the spectra collected with low- and mid-range NIR acquisition equipment are high-dimensional, overlapping and noisy, which seriously degrades modeling accuracy. This paper proposes a noise-robust feature extraction method and combines it with a support vector machine (SVM) classifier to establish a quality identification method for Pu-er tea. First, the noise-robust feature extraction method, principal component analysis (PCA) and the successive projections algorithm (SPA) are each used to extract features from the acquired NIR spectral data. Then, an SVM is trained on the extracted features to obtain the identification model. A comparison of the identification results shows that, for NIR spectral data with residual noise, the proposed noise-robust feature extraction method effectively resists the influence of noise and extracts feature variables from the high-dimensional spectrum, improving the accuracy of the identification model. The accuracy, recall, specificity and F-score of the resulting identification model were significantly higher than those obtained with the other two methods. For distinguishing ancient Pu-er tea from non-ancient Pu-er tea, the accuracy and recall of the proposed identification model reached 92.06% and 95.38% respectively, indicating good identification ability. The results provide a theoretical reference and basis for accurately judging the quality of Pu-er tea in practical applications.
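The evaluation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming the positive class is "ancient Pu-er tea" encoded as 1; the helper names (`BinaryCounts`, `confusion_counts`, `binary_metrics`) and the label vectors are hypothetical illustrations, not the paper's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    """Confusion-matrix counts for a two-class problem."""
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative
    fn: int  # positives predicted as negative


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN and FN for the chosen positive label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return BinaryCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c):
    """Compute accuracy, recall, specificity, precision and F-score (F1)."""
    recall = _ratio(c.tp, c.tp + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "recall": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "precision": precision,
        "f_score": _ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical labels: 1 = ancient Pu-er tea, 0 = non-ancient Pu-er tea.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
scores = binary_metrics(counts)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Reporting recall and specificity alongside accuracy shows whether errors fall mainly on ancient-tree or non-ancient samples, which accuracy alone would hide.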