
Research On Null Value Estimation In Database Based On Fuzzy Clustering

Posted on: 2017-07-20
Degree: Master
Type: Thesis
Country: China
Candidate: F Wu
Full Text: PDF
GTID: 2348330503495750
Subject: Computer Science and Technology
Abstract/Summary:
With the arrival of the digital era, researchers are paying increasing attention to data storage and processing. Data mining, as a data processing method, depends on effective data preprocessing, and estimating and filling the null values stored in a database during preprocessing is an active research problem. The main work of this paper is as follows.

The standard FCM (fuzzy c-means) algorithm has two basic problems: its initial cluster centers are chosen at random, and it converges slowly. To address the choice of initial cluster centers, two improved FCM algorithms are proposed, one built on a k-d tree and the other on a space partition tree proposed in this paper. The improved algorithms find an optimized set of initial cluster centers, which reduces the number of iterations and the overall running time.

To handle null values in a database, we present SNEF, an FCM-based method for estimating a single null value in a single table of a relational database. The method first identifies the attributes on which the null-valued attribute depends, using a dimension reduction technique. It then applies the improved FCM algorithm to the data set, restricted to those dependency attributes. Finally, it obtains an approximate fitting function by regression. Experimental results show that the proposed method achieves relatively high accuracy compared with other related estimation methods.

Most null value estimation methods use only the information in the table where the null values are located and ignore the relationships between tables in the relational database. These relationships, however, also carry relevant and even important information about that table. To solve this problem, this paper exploits the foreign key relationships between tables and uses them to extend the information available for the table containing null values. Three estimation modes are proposed, corresponding to the different patterns of correlation between the table containing null values and the other tables in the database. Combining the above, a multi-table method for estimating multiple null values (MNEMT) is proposed. Compared with SNEF and other common null value estimation methods, the proposed method achieves higher accuracy.
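
The general idea of FCM-based null value estimation can be illustrated with a minimal sketch. It is not the thesis's SNEF or MNEMT implementation: it uses plain FCM with random initial centers (the baseline the thesis sets out to improve), assumes the dependency attributes have already been selected, and fits one membership-weighted linear regression per cluster. The function names `memberships`, `fcm`, and `estimate_null`, and all parameter defaults, are hypothetical choices made for this illustration.

```python
import numpy as np

def memberships(X, centers, m=2.0):
    """Standard FCM membership matrix, shape (n_points, n_clusters)."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # avoid division by zero when a point sits on a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1

def fcm(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means with randomly chosen initial centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(iters):
        W = memberships(X, centers, m) ** m
        # Each center is the membership-weighted mean of all points.
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(X, centers, m)

def estimate_null(X_known, y_known, x_query, c=2, m=2.0):
    """Estimate a missing target value for x_query (assumed approach).

    X_known: dependency attributes of rows whose target is present.
    y_known: the known target values of those rows.
    x_query: dependency attributes of the row whose target is null.
    """
    x_query = np.asarray(x_query, dtype=float)
    centers, U = fcm(X_known, c, m)
    A = np.column_stack([np.ones(len(X_known)), X_known])  # intercept column
    a_q = np.concatenate([[1.0], x_query])
    u_q = memberships(x_query[None, :], centers, m)[0]
    preds = np.empty(c)
    for k in range(c):
        # Weighted least squares: scale rows by the square root of the membership.
        w = np.sqrt(U[:, k])
        coef, *_ = np.linalg.lstsq(A * w[:, None], y_known * w, rcond=None)
        preds[k] = a_q @ coef
    # Blend the per-cluster predictions by the query row's memberships.
    return float(u_q @ preds)

# Toy data: two well-separated groups and an exactly linear target.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(10.0, 1.0, (20, 2))])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]
print(estimate_null(X, y, [5.0, 4.0]))
```

Because the toy target is exactly linear in the two attributes, every per-cluster regression recovers the same coefficients, so the printed estimate should agree with 1 + 2*5 + 3*4 = 23 up to floating-point error. On real tables the clusters would generally yield different local fits, which is where the membership weighting matters.
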
Keywords/Search Tags: relational database, fuzzy clustering, null value, correlation information between tables, multiple linear regression