
The Research And Implementation Of Cubic Relationship Search Engine In Taiwan Field

Posted on: 2011-01-11  Degree: Master  Type: Thesis
Country: China  Candidate: D Zhou  Full Text: PDF
GTID: 2178360308462338  Subject: Computer Science and Technology
Abstract/Summary:
Search engine technology provides information retrieval services and lets users quickly find the data they are interested in on the Internet. Over time, traditional full-text search engines and vertical search engines have become unable to meet all user needs, and a new search pattern is bound to emerge. The cubic relationship search engine in the Taiwan field is an object-level vertical search engine. It combines web information extraction, social network analysis, and traditional search technology to provide users with powerful capabilities such as search, social network analysis, and visualization.

Based on Chinese news data in the Taiwan field, this thesis studies search engines, information extraction, and other related technologies, and achieves the following results:

· Implemented a web crawler for the relationship search engine. The thesis studies crawling algorithms, page topic collection, and page traversal strategies for web crawlers, and proposes a page crawling method based on custom configuration files.
· Implemented a page parser and a page topic filter for the relationship search engine. For page parsing, the thesis proposes a feature-based method combined with HTMLParser. For page topic filtering, it adopts a traditional text classification method.
· Implemented web information extraction for the relationship search engine. Web information extraction includes named entity recognition and entity relationship extraction. Named entity recognition is based on a maximum entropy model combined with domain-specific rules, and entity relationship extraction is treated as a classification task based on the vector space model.
· Implemented analysis and visualization functions. The relationship search engine attempts to record people's social activities in the Taiwan field as they appear on the Internet, and reveals the dynamic structure of the resulting social network. The thesis provides several social network analysis methods and supports network visualization.

In addition to the work above, the thesis implements the complete cubic relationship search engine system in the Taiwan field, and validates the system's functions and applications with several case studies.
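
As an illustration of the entity relationship extraction step mentioned in the abstract (classification based on the vector space model), the following is a minimal sketch only. The abstract does not state which features, term weighting, or classifier the thesis uses, so everything here is an assumption: raw term-frequency vectors, a nearest-centroid rule with cosine similarity, the `PER`/`ORG` placeholders standing in for recognized entities, the relation labels, and the tiny training set are all hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term-frequency vector for a whitespace-tokenized context."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_centroids(labeled):
    """Sum the vectors of all training contexts that share a relation label."""
    centroids = {}
    for text, label in labeled:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids

def classify(text, centroids):
    """Return the relation label whose centroid is most similar to the context."""
    vec = vectorize(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Hypothetical training contexts; entity mentions are replaced by type placeholders.
TRAIN = [
    ("PER met with PER in Taipei", "meeting"),
    ("PER held talks with PER", "meeting"),
    ("PER was appointed chairman of ORG", "employment"),
    ("PER joined ORG as director", "employment"),
]
CENTROIDS = build_centroids(TRAIN)
```

Calling `classify("PER held a meeting with PER", CENTROIDS)` assigns the sentence context to the closest relation class. A real system would likely use weighted features (for example TF-IDF) and Chinese word segmentation instead of whitespace splitting.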
Keywords/Search Tags: Relationship search engine, Web crawler, Web page, Parsing, Information extraction, Social network analysis