
The Person Name Identical Judgement Method in Web-Based Social Network Search

Posted on: 2012-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Pang
Full Text: PDF
GTID: 2218330362956515
Subject: Computer software and theory
Abstract/Summary:
With the progress of computer science and Internet technology, people can communicate online through many kinds of software, no longer restricted by time or region. As online activity increases, Internet-based social contact increasingly approaches that of the external world, so a real social network can be constructed from the Internet using search technology. Social networks have attracted the attention of many researchers, with a focus on social relationship network search. Web pages contain rich information, so a person's social network information can be acquired through Web mining. When you search the Web for information about a particular person, a search engine returns many pages, some of which may concern different people with the same name. Disambiguating these different people with the same name has become a key technology in social network search. To construct an accurate social network for a person, this paper studies person name identical judgement technology.

This paper presents an unsupervised hierarchical algorithm based on the vector space model to disambiguate different people with the same name. To reduce the dimension of the vectors, it uses C-value and IDF to calculate the weights of the features extracted from the Web pages. Similarity is computed as the cosine of the angle between two vectors. Combining hierarchical and partitional clustering, the paper presents an improved hierarchical clustering algorithm to perform person name identical judgement. To reduce the time complexity of the clustering algorithm, it also presents a new method for calculating the cluster center.

The paper also presents a system framework for person social network search. It introduces a method for downloading Web pages through a Web search engine in order to acquire social network information. The API provided by ICTCLAS is used to perform Chinese word segmentation and to calculate the weights of the feature words. The proposed method is then used to disambiguate people with the same name via document classification. The framework uses a commercial Web search engine to calculate the relationship between two people.

Finally, the paper evaluates the algorithm on a collection of documents retrieved from the Web. Experimental results show a significant improvement over existing methods proposed for this task.
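The pipeline described in the abstract (weighted term vectors, cosine similarity, and clustering of the pages returned for one name) can be illustrated with a minimal sketch. This is not the thesis's algorithm. It assumes the pages are already segmented into tokens (for Chinese text, a segmenter such as ICTCLAS would do this first), it substitutes plain TF-IDF for the C-value and IDF weighting, and it substitutes a basic centroid-based agglomerative procedure for the improved hierarchical algorithm. The function names, the sample tokens, and the `threshold` value are all hypothetical.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Smoothed inverse document frequency for every term in the collection."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def to_vector(doc, idf):
    """Sparse term vector: term frequency times IDF (stand-in for C-value/IDF)."""
    tf = Counter(doc)
    return {term: freq * idf[term] for term, freq in tf.items()}


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of sparse vectors."""
    total = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def cluster_pages(docs, threshold=0.2):
    """Merge the two most similar clusters until no pair reaches the threshold.

    Returns a list of clusters, each a list of document indices. Each cluster
    is taken to correspond to one real person (an assumption of this sketch).
    """
    idf = idf_weights(docs)
    vectors = [to_vector(doc, idf) for doc in docs]
    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > 1:
        centers = [centroid([vectors[i] for i in c]) for c in clusters]
        best_sim, best_pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = cosine(centers[a], centers[b])
                if sim > best_sim:
                    best_sim, best_pair = sim, (a, b)
        if best_sim < threshold:
            break  # remaining clusters are too dissimilar to merge
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical tokenized pages returned for one ambiguous name.
pages = [
    ["professor", "university", "database", "research"],
    ["university", "research", "database", "paper"],
    ["football", "club", "striker", "goal"],
    ["striker", "goal", "league", "club"],
]
clusters = cluster_pages(pages)
print(clusters)
```

In this toy input the first two pages share academic vocabulary and the last two share football vocabulary, so the sketch groups them into two clusters. The stopping threshold plays the role of deciding how many distinct people share the name; how the thesis makes that decision is not stated in the abstract.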
Keywords/Search Tags: social network, vector space model, identical judgement, hierarchical clustering