
Research And Implementation Of Cancer Assisted Intelligent Screening Analysis Method

Posted on: 2024-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: S B Xie
Full Text: PDF
GTID: 2544306938451904
Subject: Control Science and Engineering
Abstract/Summary:
Cancer is among the diseases with the highest mortality rates and seriously threatens the physical and mental health of Chinese residents, so early screening is very important for reducing mortality. Machine learning algorithms offer fast processing and high accuracy and are widely used in medicine, transportation, logistics and other fields. Cancer risk prediction based on machine learning can therefore help doctors choose appropriate clinical screening methods, which is of great significance for cancer assisted screening. Traditional screening methods have a low true positive rate and a high false positive rate, and they also suffer from heavy workloads and long processing times. Against this background, this thesis takes screening questionnaire data for common cancers (mainly gastrointestinal, lung, breast and liver cancer) and a UCI machine learning dataset as its research objects. It analyzes the correlation between the risk factors in the questionnaire and the various types of cancer, then reduces the data dimensions and extracts important features through principal component analysis. After abnormal data are screened out with support vector data description, three machine learning algorithms are trained and tested on the common cancer screening questionnaire dataset and the UCI machine learning cancer dataset, and their performance is compared. The most effective risk prediction algorithm is selected for cancer assisted screening. The specific research work is as follows:

(1) Analyze the correlation between different types of cancer and the risk factors in the screening questionnaire, and preprocess the dataset using principal component analysis and support vector data description. The screening questionnaire data cover 3411 subjects (3386 healthy samples and 25 cancer samples) and 30 risk factors related to common cancers, including personal health status, habits and family history. After missing values are filled in, principal component analysis is used to reduce the data dimensions and extract important features from the questionnaire data while preserving as much of the original information as possible. Support vector data description is then used to filter out abnormal data in the questionnaire, yielding the preprocessed data.

(2) Establish risk prediction models for common cancers based on three algorithms: genetic algorithm optimized support vector machine (GA-SVM), Bayesian regularization neural network (BRNN), and deep belief network-extreme learning machine-back propagation (DBN-ELM-BP). These algorithms are trained and tested on the cancer screening questionnaire dataset and the UCI machine learning cancer dataset, and their performance is compared to demonstrate their feasibility. The algorithm with the better performance is then selected to build the risk prediction model for cancer assisted screening. GA-SVM: a support vector machine is trained with a radial basis kernel function, and a genetic algorithm searches for the globally optimal parameters during model training. BRNN: building on the Levenberg-Marquardt algorithm, Bayesian regularization is applied to train the artificial neural network and improve its generalization ability. DBN-ELM-BP: the DBN is used in the unsupervised pretraining stage to reduce the need for a large amount of labelled training data; the ELM classifier receives the output matrix from the last Restricted Boltzmann Machine (RBM) of the DBN, calculates the weights between the last hidden layer and the output layer, and calculates the deviation; the BP algorithm is used to update the weight matrix and bias.

(3) Build an intelligent cancer assisted screening system. Using Visual Studio-MATLAB joint programming, the cancer prediction algorithms established in MATLAB are encapsulated as dynamic link libraries based on the collected screening questionnaire data. The library functions are called in Visual Studio 2019 to save and process the common cancer screening questionnaire data. Because the screening questionnaire contains many training samples of gastrointestinal cancer and lung cancer, risk prediction is mainly conducted for these two cancers, and the classification results are output as each subject's risk status for gastrointestinal or lung cancer.

Machine learning assisted cancer screening can effectively compensate for the limits of human resources in prediction and judgment, reduce the number of false positives, and alleviate problems such as missed diagnosis and delayed treatment, ultimately providing a non-invasive and economical screening method for detecting common cancers.
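
As a rough illustration of the true positive and false positive rates discussed in the abstract, the following Python sketch computes both from binary labels. Only the class sizes (3386 healthy, 25 cancer) come from the abstract; the function name `screening_rates`, the 0/1 label encoding and the all-healthy baseline are assumptions made for illustration and are not the method used in the thesis.

```python
from typing import Sequence, Tuple


def screening_rates(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (true positive rate, false positive rate, accuracy).

    Assumed encoding: 1 = cancer, 0 = healthy.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancers flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancers missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy cleared

    # Guard against empty classes so the ratios stay well defined.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return tpr, fpr, accuracy


# Class sizes taken from the abstract: 3386 healthy and 25 cancer samples.
y_true = [0] * 3386 + [1] * 25

# Hypothetical baseline that labels every subject as healthy.
all_healthy = [0] * len(y_true)

tpr, fpr, acc = screening_rates(y_true, all_healthy)
print(f"TPR={tpr:.3f} FPR={fpr:.3f} accuracy={acc:.4f}")
```

With class sizes this unbalanced, the hypothetical all-healthy baseline reaches an accuracy of 3386/3411 while detecting no cancers, which is one reason to compare models on the true positive and false positive rates rather than on accuracy alone.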
Keywords/Search Tags: cancer screening, machine learning, classification of common cancers, risk prediction