
Research On Data Mining Techniques For Computer-aided Colorectal Carcinoma Diagnosis Systems

Posted on: 2009-06-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z F Liao
GTID: 1118360278954057
Subject: Control Science and Engineering
Abstract/Summary:
With the development of bioinformatics and biomedical engineering, a large amount of medical information, including medical imaging resources, physiological indicators, bioinformation and patient records, is now available in many hospitals and research groups. This information needs to be analyzed, because general processing methods can conceal useful information that could otherwise serve as aided-diagnosis rules.

Data mining technology is advancing quickly in biomedical areas. It can be used to process massive stores of historical medical data and to derive useful diagnosis rules from patient information such as age, gender, habits and examination results. These rules are expressed in readily understood terms and come from large-scale data processing rather than manual inference.

This dissertation addresses the processing of the Auto-Fluorescence Spectrogram for Colorectal Carcinoma with data mining techniques, in three steps: preprocessing, forming the training samples, and building the classification model. Based on the research results, several aided-diagnosis methods for the Auto-Fluorescence Spectrogram for Colorectal Carcinoma are built, with the aim of supporting doctors in diagnosis.

The dissertation first analyzes the theory and characteristics of the Auto-Fluorescence Spectrogram for Colorectal Carcinoma and presents the modules of the corresponding aided-diagnosis system, together with the details of each part. Some methods for removing noise from the spectrogram are also provided.

To handle incomplete data, the dissertation presents an algorithm called RPCA, which performs attribute reduction using rough sets based on a tolerance relation, combined with PCA. A novel definition of entropy is introduced, under which knowledge decreases as the granularity of information becomes smaller. On this basis a new reduction algorithm for tolerance rough sets is presented, which extracts the data features together with PCA. With this algorithm, data features can be extracted, the number of attributes can be reduced, and the complexity of later testing is reduced as well.

As most biomedical data are hybrid data, the dissertation presents a lattice-based clustering algorithm for hybrid data. The algorithm uses a lattice to eliminate the difference between ordinal and nominal samples without the conversions that would otherwise affect the accuracy of the algorithm. The parameters of the algorithm are optimized as well: a genetic algorithm is used to optimize the initial number of clusters and the mean points. From the clustered samples, several approaches are used to obtain rules that distinguish normal from pathological tissue.

To meet time constraints, a novel Index algorithm for classification is designed and applied. The algorithm uses an index tree to reduce repeated calculation and achieves higher efficiency in both computation and storage, especially in applications with large amounts of repeated data.

To deal with unbalanced data, the dissertation presents three algorithms: Overall-density unbalance classification, μ-density unbalance classification and Margin-density unbalance classification. All of them are based on sampling theory: they increase the number of sparse samples and obtain higher performance, especially on unbalanced data. Some parameters of these algorithms are analyzed. A cost-sensitive method is presented to optimize μ according to the cost of the correct and error ratios, and the other two parameters of the Margin-density unbalance classification algorithm are analyzed as well.

Finally, the innovations of this thesis are summarized, and future research subjects are presented.
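The unbalanced-data methods in the abstract are described only as increasing the number of sparse samples. The abstract does not specify how the density-based variants choose or generate those samples, so the sketch below is not the dissertation's algorithm. It shows plain random oversampling as a generic stand-in for that idea. The function name `oversample_minority`, the toy feature vectors and the class labels are all hypothetical.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Generic random oversampling (an assumption, not the thesis method).

    Randomly duplicates samples of every sparse class until each class
    has as many samples as the largest class.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original samples belonging to this class
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Add (target - n) randomly chosen duplicates; zero for the largest class
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Hypothetical toy data: 6 "normal" feature vectors and 2 "pathological" ones
samples = [
    [0.10, 0.20], [0.20, 0.10], [0.15, 0.22],
    [0.12, 0.18], [0.11, 0.25], [0.20, 0.20],
    [0.90, 0.80], [0.85, 0.95],
]
labels = ["normal"] * 6 + ["pathological"] * 2

balanced_samples, balanced_labels = oversample_minority(samples, labels)
print(Counter(balanced_labels))
```

After the call, both classes have the same number of samples. A density-based scheme, as the algorithm names suggest, would presumably decide where to add samples from the local sample density instead of choosing duplicates uniformly at random.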
Keywords/Search Tags: Biomedical, information analysis, data mining, feature extraction, cluster analysis, classification