Machine Learning Based On Structural And Spectral Features And Applications

Posted on: 2024-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: H Q Wang
Full Text: PDF
GTID: 2568307136976509
Subject: Electronic information
Abstract/Summary:
Machine learning is the process of using data and algorithms to enable machines to learn on their own. It uses computer algorithms and statistical models to analyze and identify patterns in data, and to learn and improve based on those patterns. In recent years, machine learning has shown great vitality in many fields, and industries have been competing to adopt it. Work involving manual classification is prone to errors, omissions, and confusion, especially when sample sizes are large. Machine learning methods can identify and classify samples quickly and with high accuracy, improving classification efficiency, freeing up manual labor, and reducing misclassification.

In this thesis, seeds of 58 herbal species of similar genera, from different regions of China, were used as research materials at the Institute of Traditional Chinese Medicine and Ethnomedicine of Xinjiang Uygur Autonomous Region. Data were acquired at the Shanghai Synchrotron Radiation Facility (SSRF). Synchrotron radiation-based X-ray phase-contrast computed tomography (SR-XPCT) data were collected at the BL13HB beamline station, and synchrotron radiation Fourier transform infrared microspectroscopy (SR-FTIR) data were collected in transmission mode at the BL06B beamline station. Two separate studies were completed using these medicinal seeds.

(1) Seven seed samples with similar appearance were selected for classification experiments. After the SR-XPCT and SR-FTIR datasets were processed, the SR-XPCT and SR-FTIR data of the seven samples were combined into a new hybrid feature dataset, which was classified with a back-propagation neural network (BPNN) classifier. The distribution of data within the dataset was analyzed, and the results show that the hybrid feature dataset classifies significantly better than the traditional single datasets, with classification performance reaching up to 99.2%.

(2) All 58 samples were classified together to find and analyze the most easily confused samples among the 58 species. This unified classification identified 11 clearly confusable species, which were singled out for feature extraction and classification tests. Building on the first study, a new dataset, the Peak-shift dataset, was derived from the SR-FTIR dataset and added to the mixed-feature dataset used for machine learning. Data processing yielded five datasets, each of which was tested with four different machine learning methods. The results show that Random Forest (RF) is better suited to classifying infrared data; BPNN achieves high accuracy but takes longer to compute than the other classifiers; Naive Bayes is extremely fast but lacks accuracy; and Support Vector Machine (SVM) performs only moderately on single datasets but excellently on mixed-feature datasets, with an accuracy of 99.19% and fast computation, making it the most suitable traditional machine learning method for mixed-feature datasets.

This thesis applies traditional machine learning methods to classification tasks and makes an innovative attempt at processing feature datasets, providing techniques and references for the field of intelligent automatic classification.
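The hybrid-feature idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis pipeline: the data are synthetic, the class centers and the helper names (`make_samples`, `hybrid`, `fit_gnb`, `predict_gnb`, `compare`) are hypothetical, and a hand-written Gaussian Naive Bayes stands in for the four classifiers. The sketch assumes that some classes overlap in the structural features while others overlap in the spectral features, and shows how concatenating the two feature vectors per sample can separate classes that neither single dataset separates.

```python
import math
import random
from statistics import fmean, pvariance


def make_samples(n_per_class, seed=0):
    """Synthetic stand-in data: returns (structural, spectral, labels)."""
    rng = random.Random(seed)
    # Hypothetical centers: classes 1 and 2 overlap structurally,
    # classes 0 and 1 overlap spectrally.
    struct_centers = {0: 0.0, 1: 3.0, 2: 3.0}
    spec_centers = {0: 0.0, 1: 0.0, 2: 3.0}
    structural, spectral, labels = [], [], []
    for label in (0, 1, 2):
        for _ in range(n_per_class):
            structural.append([rng.gauss(struct_centers[label], 1.0) for _ in range(2)])
            spectral.append([rng.gauss(spec_centers[label], 1.0) for _ in range(2)])
            labels.append(label)
    return structural, spectral, labels


def hybrid(structural, spectral):
    """Concatenate the two feature vectors of each sample."""
    return [a + b for a, b in zip(structural, spectral)]


def fit_gnb(X, y):
    """Estimate a log prior and per-feature mean/variance for each class."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        means = [fmean(c) for c in cols]
        variances = [pvariance(c) + 1e-9 for c in cols]  # guard against zero variance
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model


def predict_gnb(model, x):
    """Return the class with the highest Gaussian log posterior for x."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


def accuracy(X_train, y_train, X_test, y_test):
    """Fit on the training split and score on the test split."""
    model = fit_gnb(X_train, y_train)
    hits = sum(predict_gnb(model, x) == t for x, t in zip(X_test, y_test))
    return hits / len(y_test)


def compare(n_train=200, n_test=200):
    """Accuracy of each single dataset versus the hybrid dataset."""
    s_tr, f_tr, y_tr = make_samples(n_train, seed=1)
    s_te, f_te, y_te = make_samples(n_test, seed=2)
    return {
        "structural": accuracy(s_tr, y_tr, s_te, y_te),
        "spectral": accuracy(f_tr, y_tr, f_te, y_te),
        "hybrid": accuracy(hybrid(s_tr, f_tr), y_tr, hybrid(s_te, f_te), y_te),
    }


if __name__ == "__main__":
    for name, acc in compare().items():
        print(f"{name:>10}: {acc:.3f}")
```

Running the script prints one accuracy per dataset. The gap between the single and hybrid rows depends entirely on the assumed class centers, so the numbers say nothing about the seed data in the thesis.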
Keywords/Search Tags:Automatic classification, Machine learning, SR-XPCT, SR-FTIR, Random Forest, Support Vector Machine, Naive Bayes, BP Neural Network