Outlier Mining Of Book Selling Information Based On Rough Set

Posted on:2011-07-09

Degree:Master

Type:Thesis

Country:China

Candidate:X J Chen

GTID:2178360302988342

Subject:Computer system architecture

Abstract/Summary:

In this paper, the prevailing rough-set-based attribute reduction algorithm is applied to the detection and analysis of outliers in book selling data. Outlier mining is a branch of data mining that has been applied in a great many fields, where the mined outliers, instead of being regarded as noise and discarded, have value and applicability of their own. An outlier mining algorithm based on dissimilarity is designed. Its basic idea is as follows: first, the positive region reduction algorithm is used to extract a relative reduct of the book data set and eliminate redundant attributes; second, a dissimilarity formula, which accelerates detection, is used to detect the outliers.

The main research of this paper covers an introduction to rough set theory and an analysis of three major rough-set-based attribute reduction algorithms: reduction based on the discernibility matrix, reduction based on information entropy, and reduction based on the algebraic form. The positive region attribute reduction algorithm is adopted because it is closer to the essence of rough set reduction and is algorithmically simple and easy to understand.

The pros and cons of various outlier mining models are studied in depth, and an outlier mining algorithm based on dissimilarity is designed, whose basic idea is to use positive region attribute reduction to convert the high-dimensional data set into a low-dimensional one.
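
The positive region reduction step mentioned above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the backward-elimination strategy, the function names, and the small book-sales table (`genre`, `price`, `format`, `sales`) are all assumptions made for illustration. An attribute is dropped whenever removing it leaves the positive region of the decision attribute unchanged.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def positive_region(rows, cond_attrs, decision):
    """Indices of rows whose equivalence class has a single decision value."""
    pos = set()
    for block in partition(rows, cond_attrs):
        if len({rows[i][decision] for i in block}) == 1:
            pos.update(block)
    return pos

def positive_region_reduct(rows, cond_attrs, decision):
    """Backward elimination (an assumed strategy): drop every attribute
    whose removal keeps the positive region unchanged."""
    full = positive_region(rows, cond_attrs, decision)
    reduct = list(cond_attrs)
    for a in list(cond_attrs):
        trial = [b for b in reduct if b != a]
        if positive_region(rows, trial, decision) == full:
            reduct = trial
    return reduct

# Hypothetical book-sales decision table, invented for this example.
rows = [
    {"genre": "fiction", "price": "low",  "format": "paper", "sales": "high"},
    {"genre": "fiction", "price": "high", "format": "paper", "sales": "low"},
    {"genre": "science", "price": "low",  "format": "paper", "sales": "high"},
    {"genre": "science", "price": "high", "format": "paper", "sales": "low"},
]
reduct = positive_region_reduct(rows, ["genre", "price", "format"], "sales")
```

In this toy table `price` alone determines `sales`, so the other two condition attributes are redundant and are removed. The result of backward elimination depends on the order in which attributes are tried; it yields a reduct, not necessarily the smallest one.
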
Meanwhile, the advantage of the dissimilarity-based mining algorithm is demonstrated by analyzing the shortcomings of the study by Tu Lihong and Yang Liping on isolated points based on dissimilarity.

To make the system more flexible, users can customize the threshold and restrict its range of values: the smaller the threshold, the more accurate the outlier records obtained, and vice versa. The system shows a degree of flexibility and practicality when applied to the book selling data set.
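
The abstract does not give the dissimilarity formula or state how the threshold is compared, so the following Python sketch only illustrates the general idea of threshold-based detection on reduced records. Everything specific in it is an assumption: the simple-matching dissimilarity, the mean-dissimilarity score, the rule that a record is flagged when its score exceeds the threshold, and the sample records themselves.

```python
def dissimilarity(x, y, attrs):
    """Fraction of attributes on which two records differ (simple matching)."""
    return sum(x[a] != y[a] for a in attrs) / len(attrs)

def outlier_scores(rows, attrs):
    """Mean dissimilarity of each record to every other record.
    Requires at least two records."""
    n = len(rows)
    scores = []
    for i, x in enumerate(rows):
        total = sum(dissimilarity(x, y, attrs)
                    for j, y in enumerate(rows) if j != i)
        scores.append(total / (n - 1))
    return scores

def find_outliers(rows, attrs, threshold):
    """Indices of records whose score exceeds the user-chosen threshold
    (the comparison direction is an assumption of this sketch)."""
    return [i for i, s in enumerate(outlier_scores(rows, attrs)) if s > threshold]

# Hypothetical records after attribute reduction, invented for this example.
attrs = ["price", "channel"]
records = [
    {"price": "low",  "channel": "store"},
    {"price": "low",  "channel": "store"},
    {"price": "low",  "channel": "store"},
    {"price": "high", "channel": "online"},
]
outliers = find_outliers(records, attrs, threshold=0.5)
```

Here the last record differs from every other record on both attributes, so it is the only one flagged at a threshold of 0.5. Under this sketch's convention, lowering the threshold to 0.2 flags all four records, which shows how the user-chosen threshold controls the size of the reported set.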

Keywords/Search Tags:

rough set, dissimilarity, outlier mining, attribute reduction, data concerning selling

Related items

1	Data Mining Research Of Vehicle Sales Based On Hash Quick Attribute Reduction Algorithm
2	Network Intrusion Detection Research Based On Rough Set And Outliers Mining
3	Application Of Outlier Detection In The Abnormal Analysis Of Medical Prescription
4	Rough Set Data Mining Approach And Its Application Relative To Decision Problem
5	Research On The Attribute Reduction Algorithm Based On Rough Set In Data Mining
6	Based On Rough Set Attribute Reduction Algorithm Of Data Mining To Improve Research
7	Research On Heuristic Attribute Reduction Algorithm Based On Rough Set
8	Research Of Higher Vocational Student's Non-intellectual Factors Based On Rough Set
9	Association Rule Mining Algorithm Based On Rough Set
10	Based On Rough Set Theory Data Mining Technology And Its Application Of Potential Consumers Of Private Cars