Research On Web People Information Search

Posted on: 2016-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: T Xin
Full Text: PDF
GTID: 2298330470457826
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology and the explosive growth of network data, the Web has accumulated vast information resources and become an important channel for obtaining information. People are the main participants in objective activities, so information about people is a very important resource, and searching for it on the Internet is very common. However, the sheer volume of information makes it harder to find what we want. As the saying goes, "Information is no longer a scarce resource, attention is." Helping people find the information they need accurately, rapidly, and comprehensively is therefore an urgent problem.

General search technology can meet the demand for people information search to some extent, but problems remain. First, social media holds a great deal of people information, but because of the special nature of social networking sites, general search technology cannot effectively reach these resources. Second, ambiguity often arises when searching for a person name. A general search engine ranks results by keyword matching without considering semantics, so it cannot solve the problem of namesakes.

This thesis studies these two key issues. The specific contents are as follows:

Study on the method of personal information search in social networking sites: Based on detailed research into social web platforms and their related technologies, a method combining web page parsing and API queries is proposed. A cross-platform personal information search system for social networking sites is implemented, which resolves name ambiguity by attribute matching and stores the extracted information in a person model.

Research on the name disambiguation problem in Web People Search: Based on a summary of previous related work, we propose a web name disambiguation approach based on combined features. The approach extracts different features from each page, constructs a combined feature vector using the vector space model, and then applies hierarchical clustering to resolve person name ambiguity by comparing the similarity between feature vectors.

Design and implementation of a web name disambiguation prototype system: Based on the study of the web person name disambiguation method, a prototype system is implemented. The system receives a person name and then uses the name disambiguation method described above to cluster the pages retrieved from the web through the Google API. Finally, it labels each page cluster by its cluster features and reorders the clusters. Experiments on the prototype system show that the combined features are more accurate, so the approach based on combined features is more effective in solving the name disambiguation problem.
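The clustering step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the feature types (`word`, `entity`), their weights, the average-link criterion, the similarity threshold of 0.3, and the toy pages are all assumptions chosen for illustration. Each page's features are merged into one weighted sparse vector, pages are compared by cosine similarity, and clusters are merged bottom-up until no pair is similar enough.

```python
import math
from collections import Counter

# Hypothetical weights per feature type; the abstract does not state
# which features are combined or how they are weighted.
FEATURE_WEIGHTS = {"word": 1.0, "entity": 2.0}


def combined_vector(features):
    """Merge per-type feature lists into one weighted sparse vector."""
    vec = Counter()
    for ftype, items in features.items():
        weight = FEATURE_WEIGHTS.get(ftype, 1.0)
        for item in items:
            # Key by (type, value) so equal strings of different types stay distinct.
            vec[(ftype, item)] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_link(c1, c2, vectors):
    """Mean pairwise similarity between the pages of two clusters."""
    total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_pages(vectors, threshold=0.3):
    """Agglomerative clustering: merge the most similar pair of clusters
    until the best similarity falls below the (assumed) threshold."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = average_link(clusters[x], clusters[y], vectors)
                if sim > best_sim:
                    best_sim, best_pair = sim, (x, y)
        if best_sim < threshold:
            break
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Toy pages returned for one ambiguous name (invented data).
pages = [
    {"word": ["professor", "database", "university"], "entity": ["ExampleUniversity"]},
    {"word": ["database", "lecture", "university"], "entity": ["ExampleUniversity"]},
    {"word": ["football", "striker", "league"], "entity": ["ExampleFC"]},
]
vectors = [combined_vector(p) for p in pages]
print(cluster_pages(vectors))  # [[0, 1], [2]]
```

In this toy run the first two pages share words and an entity and end up in one cluster, while the third page shares nothing with them and stays separate, so each cluster stands for one distinct person with the same name.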
Keywords/Search Tags: Web People Search, The Social Network, Information Search, Name Disambiguation, Hierarchical Clustering