The Research On Classification Using Statistic And Its Application To Forecast Latent Customers

Posted on:2005-09-29

Degree:Master

Type:Thesis

Country:China

Candidate:C Li


GTID:2168360125958871

Subject:Computer application technology

Abstract/Summary:

As database technology has matured and data applications have become widespread, the volume of numerical data has been growing exponentially. People are no longer satisfied with simple transaction management and information retrieval over these data; they expect to extract knowledge from the data to support decision-making. This demand has made data mining one of the most active topics in computer science in recent years. After more than ten years of research, data mining technology has matured, so the emphasis of research is shifting toward applications. Business is becoming the leading application domain, and its requirements are increasingly prominent. Under heavy market pressure, competition between corporations takes the form of fierce contention for the most valuable customers. The identification of latent (potential) customers is therefore of considerable practical significance.

This thesis first introduces the concept of latent customer identification and classification algorithms, and on that basis analyzes the problems that exist in latent customer identification. To capture the relationship between condition attribute values and a specific customer class, the author proposes a statistics-based certainty factor ("assured factor") algorithm. Building on existing statistical algorithms, it first partitions the data set into equivalence classes according to the condition attributes, then computes the ratio of the number of members belonging to the specific class to the total number of members to obtain certainty factors, which after normalization serve as the correlation measure of the condition attributes. Experimental results indicate that the algorithm handles uncertain knowledge effectively.

To address attribute selection, the author proposes a two-layer selection algorithm based on an analysis of existing attribute selection algorithms.
The algorithm uses the correlation measure between each condition attribute and the class label attribute to estimate how strongly the two are related. It removes condition attributes that are independent of, or negatively correlated with, the class label attribute, which reduces the scale of subsequent learning and the time it takes. In addition, the author introduces feedback into attribute selection and proposes an improved attribute selection model. This model effectively limits the insufficient learning and excessive learning caused by manually set thresholds. Overall, the algorithm improves the precision of attribute selection while saving time, which in turn improves model precision; the experimental results support this.

To address contradictory data, the thesis also proposes a constructed-variable algorithm that represents the influence of the combined effect of correlated condition attributes on data classification. The algorithm expresses the association between this combined effect and a specific data class by adding independent variables; in other words, it lets the model reflect the combined effect of correlated condition attributes, which reduces model error and improves model precision.

Finally, building on existing statistical algorithms and the improvements proposed above, the thesis implements a prototype customer identification system based on rough sets and statistical knowledge.
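
The abstract describes the first step of the statistics-based algorithm only in words: partition the records into equivalence classes by their condition attribute values, then take the ratio of target-class members to all members of each class. The following is a minimal sketch of that ratio computation, not the author's implementation. The function name `certainty_factors`, the record layout, and the toy customer data are all hypothetical, and the normalization step mentioned in the abstract is omitted because the abstract does not specify it.

```python
from collections import defaultdict


def certainty_factors(records, condition_attrs, class_attr, target_class):
    """Return {equivalence class: share of its members in target_class}.

    An equivalence class is identified by the tuple of values a record
    takes on the chosen condition attributes (hypothetical layout).
    """
    totals = defaultdict(int)  # members per equivalence class
    hits = defaultdict(int)    # members that belong to target_class
    for row in records:
        key = tuple(row[attr] for attr in condition_attrs)
        totals[key] += 1
        if row[class_attr] == target_class:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical toy data, for illustration only.
customers = [
    {"age": "young", "income": "high", "buyer": "yes"},
    {"age": "young", "income": "high", "buyer": "no"},
    {"age": "young", "income": "low", "buyer": "no"},
    {"age": "old", "income": "high", "buyer": "yes"},
]

factors = certainty_factors(customers, ["age", "income"], "buyer", "yes")
for eq_class, factor in sorted(factors.items()):
    print(eq_class, factor)
```

In this toy data the two records with young age and high income disagree on the class label, so their equivalence class gets a ratio of 0.5 rather than a hard yes or no. This is one plausible reading of how such a ratio can represent uncertain or contradictory knowledge.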

Keywords/Search Tags:

data mining, classification, statistics, rough sets, data identification, feature extraction, certainty factor

Related items

1	Research On Data Stream Classification Based On Granular Computing And F-Rough Sets Extension
2	Space Data Mining Research Based On Rough Set Theory
3	Research On Dynamic Data Mining Methods And Techniques Based On Rough Set Theory
4	Research On Data Mining Algorithm Based On Rough Sets
5	Some Research On Data Mining Based On Rough Sets Theory
6	Research On Rough Sets Theory Based Data Mining
7	Research Of Medical Image Classification Approach Based On Rough Sets And Association Rule
8	Research On Classification Algorithms Of Data Mining Based On Imbalanced Data Sets
9	Research And Implementation On Larger Data Sets Mining Algorithm Based On Rough Set
10	Research On Data Mining Technique Based On Data Element Standard And Rough Sets Theory