The Research Of Network Information Filtering System Based On Genetic Algorithm And Fuzzy Clustering

Posted on: 2009-05-14; Degree: Master; Type: Thesis
Country: China; Candidate: H J Lu; Full Text: PDF
GTID: 2178360242495122; Subject: Computer software and theory
Abstract/Summary:
With the development of the Internet, more and more commercial and everyday activities are carried out online, and the network has become closely tied to people's daily lives. Every coin has two sides, however: while users enjoy the convenience of the Internet, they are also exposed to harmful information. In addition, because the Internet is open, dynamic, and heterogeneous, it is hard to find the information one actually needs. This calls for a method that reduces irrelevant information according to a user's information needs, and information filtering has therefore become an active research field.

The key technique of network information filtering is to acquire user profiles, express the user's interests, and use them to classify documents from the Internet. Text classification techniques are commonly used for this, such as Rocchio, K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, and Genetic Algorithm (GA). In information filtering, GA is applied to learn the user profile, and its effectiveness is determined by the GA's fitness function. At present, the fitness function is usually based on computing the similarity of the GA's individuals. This evaluation method pays more attention to the similarity of individuals and less to the class attributes of individuals and features or to the typicality of features. As a result, the user profiles obtained are not very effective.

Since Zadeh proposed fuzzy set theory in 1965, fuzzy theory has been applied to clustering problems. Because fuzzy clustering (FC) captures the degree of uncertainty in class membership and expresses the intermediate nature of samples, it reflects the real world better. Therefore, if fuzzy clustering is used to evaluate the GA's individuals in a GA-based network information filtering system, the evaluation can take greater account of the non-absolute class membership of each feature, the typicality of features, and the class attributes involved, and it can also evaluate, to some degree, the class attributes of the individuals themselves. A more accurate user profile can be obtained accordingly.

This thesis uses fuzzy clustering to evaluate the GA's individuals in information filtering, proposes a genetic training algorithm based on FC, applies this algorithm to an information filtering system to form a network information filtering system based on GA and FC, and uses that system to verify the validity of the FC-based GA. The main work of this thesis is as follows:

1. Using FC to evaluate the GA's individuals. Before the fitness is computed, the training set is expressed as vectors according to an individual and then clustered with the direct clustering method based on the fuzzy similarity matrix; the fitness is finally computed by evaluating the clustering result. This method evaluates an individual by its ability to judge the categories of texts and pays more attention to the typicality of features and their class attributes.

2. Improving the noise resistance of the training algorithm. The fitness function computes the fitness by combining the correctness and the denseness of the fuzzy clustering result. It includes a parameter w that lowers the sensitivity to outliers in the training text set, thereby improving the training algorithm's resistance to noise.

3. Implementing the network information filtering system based on GA and FC. The system trains with a simulated annealing genetic algorithm, evaluates individuals by fuzzy clustering, obtains user profiles through a number of generations of iterative training, and classifies information against the profiles using the improved Sim function, which completes the filtering process. The experimental results demonstrate the validity of the FC-based GA.

This thesis presents a genetic algorithm based on fuzzy clustering, in which the GA's individuals are evaluated with a fuzzy clustering technique. Tests show that it has a clear advantage in precision and F1 measure.
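
The fitness evaluation described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the formulas, so everything below is an assumption made for illustration. An individual is assumed to be a 0/1 mask over features, the fuzzy similarity matrix is assumed to use cosine similarity, "correctness" is taken as pairwise agreement between clusters and labels, "denseness" as mean within-cluster similarity, and w is assumed to weight the two in a simple sum. The names `project`, `direct_cluster`, `fitness`, and the threshold `lam` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def project(doc, individual):
    """Express a document according to one individual (assumed 0/1 feature mask)."""
    return [x for x, keep in zip(doc, individual) if keep]

def direct_cluster(sim, lam):
    """Direct clustering on a fuzzy similarity matrix at level lam:
    link i and j when sim[i][j] >= lam, then take connected components."""
    n = len(sim)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= lam:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]

def fitness(individual, docs, labels, lam=0.5, w=0.7):
    """Assumed form: w * correctness + (1 - w) * denseness."""
    vecs = [project(d, individual) for d in docs]
    n = len(vecs)
    sim = [[cosine(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]
    clusters = direct_cluster(sim, lam)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Correctness: share of document pairs where clustering agrees with the labels.
    agree = sum((clusters[i] == clusters[j]) == (labels[i] == labels[j])
                for i, j in pairs)
    correctness = agree / len(pairs) if pairs else 0.0
    # Denseness: mean similarity of pairs that fall into the same cluster.
    same = [sim[i][j] for i, j in pairs if clusters[i] == clusters[j]]
    denseness = sum(same) / len(same) if same else 0.0
    return w * correctness + (1 - w) * denseness

# Toy training set: term-weight vectors and relevance labels (made-up data).
docs = [
    [1.0, 0.9, 0.0, 0.1],
    [0.8, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.1, 0.0, 0.9, 1.0],
]
labels = [1, 1, 0, 0]
all_features = [1, 1, 1, 1]   # individual that keeps every feature
one_feature = [0, 0, 0, 1]    # individual that keeps only the last feature
print(fitness(all_features, docs, labels), fitness(one_feature, docs, labels))
```

In this sketch, an individual whose selected features separate the relevant and irrelevant texts into clean clusters scores higher than one whose features mix them. Weighting correctness more heavily than denseness is one plausible way a parameter like w could reduce the influence of outlying training texts; the thesis may define it differently.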
Keywords/Search Tags: Information Filtering, Genetic Algorithm, Fuzzy Clustering, Fitness Function, Similarity