Knowledge Reduction Algorithms Research Of Incomplete Information System In Massive Datasets

Posted on: 2016-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: T Wang
Full Text: PDF
GTID: 2308330470970741
Subject: Systems analysis and integration
Abstract/Summary:
With the development of information technology, data volumes are growing faster than ever before. Data analysis is a critical capability for any organization. Big data is a potentially valuable resource, but if it cannot be fully used, it becomes a burden rather than a benefit. How to handle these data resources, especially massive data with missing information, has become an active research topic.

Traditional data mining models and algorithms mainly target small amounts of structured data. For large, unstructured data, scholars have studied data mining algorithms and achieved some results, but data mining algorithms for massive data with missing information have received little attention. Researchers should rethink the original algorithms and models so that data mining can better adapt to the characteristics of today's massive data with missing information.

Among data mining approaches, rough set theory is a tool for dealing with uncertainty and vagueness in knowledge and data, and it now plays an important role in artificial intelligence, pattern recognition, decision analysis, and other fields. Traditional knowledge reduction algorithms assume that the entire dataset can be loaded into main memory, which is infeasible for large-scale datasets, especially massive datasets with missing information. To this end, building on rough set theory, this thesis analyzes the characteristics of massive datasets with missing information and allows each missing attribute value to take all possible values. Then, by combining the parallel computations used in classical knowledge reduction algorithms with the discernibility (indiscernibility) of attributes, a knowledge reduction algorithm is designed for incomplete information systems under the MapReduce framework.

Finally, the experimental results demonstrate that this algorithm can efficiently process massive datasets for knowledge reduction in incomplete information systems.
Keywords/Search Tags: Massive Data, Cloud Computing, Rough Set, Incomplete Information System, Reduction, MapReduce
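
The abstract's central idea, letting a missing attribute value stand for every possible value, can be illustrated with a small single-machine sketch. This is not the thesis's MapReduce algorithm, which the abstract does not spell out. It assumes a tolerance-style comparison in which a missing value (written `"*"` here, an assumed marker) matches anything, and it uses a simple greedy pass that drops an attribute whenever doing so creates no new conflicts between objects with different decisions. The names `tolerant`, `conflicts`, and `greedy_reduct` are hypothetical.

```python
from itertools import combinations

MISSING = "*"  # assumed marker for a missing attribute value


def tolerant(x, y, attrs):
    """True if x and y cannot be told apart on attrs, with MISSING matching any value."""
    return all(
        x[a] == y[a] or x[a] == MISSING or y[a] == MISSING
        for a in attrs
    )


def conflicts(rows, attrs, decision):
    """Count pairs that are tolerant on attrs but carry different decisions."""
    return sum(
        1
        for x, y in combinations(rows, 2)
        if tolerant(x, y, attrs) and x[decision] != y[decision]
    )


def greedy_reduct(rows, attrs, decision):
    """Drop attributes one at a time while the conflict count stays unchanged.

    The result depends on the order of attrs; it is one reduct candidate,
    not necessarily a minimal one.
    """
    target = conflicts(rows, attrs, decision)
    kept = list(attrs)
    for a in attrs:
        trial = [b for b in kept if b != a]
        if conflicts(rows, trial, decision) == target:
            kept = trial
    return kept


if __name__ == "__main__":
    # Toy incomplete decision table: condition attributes a, b, c; decision d.
    table = [
        {"a": 1, "b": 0, "c": MISSING, "d": "yes"},
        {"a": 1, "b": MISSING, "c": 1, "d": "yes"},
        {"a": 0, "b": 0, "c": 1, "d": "no"},
        {"a": 0, "b": 1, "c": 0, "d": "no"},
    ]
    print(greedy_reduct(table, ["a", "b", "c"], "d"))
```

The pairwise comparison is quadratic in the number of objects, which is exactly the cost that makes an in-memory approach infeasible for massive datasets and motivates distributing the work, for example across MapReduce tasks.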