Study On Several Typical Data Mining Methods And Their Applications

Posted on:2011-02-06

Degree:Master

Type:Thesis

Country:China

Candidate:C L Dong

Full Text:PDF

GTID:2178360305951060

Subject:Computer application technology

Abstract/Summary:


With the rapid development of science and technology, and especially the widespread use of the Internet, large-scale data of many kinds floods our daily life. While these huge datasets deliver information as text, web pages, images, and other formats, they also bring the problem of "data explosion, knowledge shortage": it is very difficult to extract the information or knowledge that users need from them. With the aim of extracting useful information from large-scale structured and semi-structured datasets, data mining has attracted much attention over the last two decades and has been used in many fields, such as business decision-making, market analysis, industrial control, and medical diagnosis.

Modern clinical medicine produces large amounts of clinical data. Analyzing and evaluating such data can reveal potentially useful patterns, which help improve knowledge of diseases and, accordingly, the research on and management of their propagation. As one of the most effective approaches to extracting useful information from medical databases and supporting scientific decision-making in the diagnosis and treatment of diseases, medical data mining has become an increasingly hot topic in the last few years. Unlike traditional data mining applications, medical data mining faces many challenges, for instance the high dimensionality of large datasets, heterogeneous data with privacy issues, and strict evaluation criteria. By elaborating the challenges in the KDD Cup 2008 competition, this paper presents many of the challenges that occur in medical data mining. By describing how we constructed our final classification model, the Modified Boosted Tree, which ranked fourth among all solutions to Task 1, we present our analysis of and solutions to these problems. This case can be regarded as an epitome of medical data mining, and the issues and corresponding solutions it covers can provide some guidance for medical data mining applications.

The popularity of the World Wide Web provides people with a more convenient platform for communication and sharing, and it has also boosted the growth of many kinds of web-based communities. With the aim of identifying and characterizing the relationships among community members, community mining has become one of the hot topics in data mining. In our work, we choose the DBLP (Digital Bibliography & Library Project) dataset as the test bed for our experiments. Using techniques from bibliometrics and text mining, we construct local communities around either main topics or prominent authors with regard to a specific conference. To further analyze the evolution of local communities, we trace the yearly similarities of some related conference pairs. In addition, we divide the computer science area into 14 research communities by prominent research direction and characterize them in terms of publication growth rate, collaboration trends, and population stability. Such information may interest many audiences who need to make decisions, such as advanced students who are about to choose their specializations and the authorities in charge of funding and investment.
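
The abstract does not say how the yearly similarity between two conferences is computed. As a rough illustration only, the sketch below assumes a bag-of-words vector built from paper titles and cosine similarity between the two vectors for each shared year. The function names, the conference data, and the choice of cosine similarity are all assumptions made for this example, not the method used in the thesis.

```python
import math
from collections import Counter


def term_vector(titles):
    """Build a bag-of-words term-frequency vector from a list of paper titles."""
    counts = Counter()
    for title in titles:
        counts.update(title.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def yearly_similarity(conf_a, conf_b):
    """Map each year present in both conferences to their title similarity."""
    shared_years = sorted(conf_a.keys() & conf_b.keys())
    return {
        year: cosine_similarity(term_vector(conf_a[year]), term_vector(conf_b[year]))
        for year in shared_years
    }


# Hypothetical toy data in the form {year: [paper titles]}; not taken from DBLP.
conf_x = {
    2005: ["mining frequent patterns", "clustering large graphs"],
    2006: ["boosted trees for classification"],
}
conf_y = {
    2005: ["mining frequent patterns", "query optimization"],
    2006: ["transaction processing"],
}

similarities = yearly_similarity(conf_x, conf_y)
```

Plotting `similarities` over the years for a conference pair would show whether the two venues are converging or drifting apart in topic, which is one plausible reading of "tracing the yearly similarities".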

Keywords/Search Tags:

data mining, classification, medical data mining, breast cancer detection, community mining


Related items

1	Research On Breast Cancer Screening Medical Grid
2	Research On Classification Technology Of Breast Disease Data And Application Based On Attribute Reduction
3	The Research And Application Of Data Mining Techniques On Medical Insurance
4	Design And Realization For Online Diagnoses System Based On Medical Data Mining
5	Research On Medical Data Mining Based On Neural Networks
6	Data Mining Research And Application Of Technology In The Health Care Field
7	Study On The Infectious Regularity Of Patients With Advanced Lung Cancer Based On Data Mining
8	The Research Of Application On Medical Data Process And Mining Algorithm Of Association Rules
9	A Raman spectroscopic-based platform using advanced data mining methods for in-situ cancer cell classification and characterization
10	Medical Data Mining And Applied Research Based On Deep Learning