The Study Of Blog Search Engine And Ranking Technology

Posted on: 2010-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: L Yan
Full Text: PDF
GTID: 2178360278975079
Subject: Computer technology
Abstract/Summary:
With the rapid development of the World Wide Web (WWW), the volume of information is growing quickly, and the Internet has become an important source from which people obtain information. However, because the Web is open and extensible, any organization, group, or individual can publish information on it easily and without restriction. This has aggravated the information overload on the Web, and it is increasingly difficult for users to find useful content. How to find the information one needs quickly and accurately has therefore become a hard problem for users.

Blogs, as a representative of Web 2.0, have changed how mass media circulates since they appeared and have continued to influence and change how the Internet is used. The rise of Blogging has created a highly dynamic and tightly interwoven subset of the World Wide Web, and Blogs are giving rise to a large body of research on both their content and their structure.

In this paper we focus on another aspect of Blogs: searching them. The exponential rise in the number of Blogs, from thousands in the late 1990s to tens of millions in 2005, has created a need for effective access and retrieval services. Today there is a broad range of search and discovery tools for Blogs, offered by a variety of players: some focus exclusively on Blog access (e.g., Blogdigger, Blogpulse, and Technorati), while web search engines such as Google, Yahoo! and AskJeeves offer specialized Blog services. The development of specialized retrieval technology aimed at the distinct features of Blogs is still in its early stages.

The main work of this paper covers the following aspects.

Foundational research: We first introduce the foundations of web mining, the system architecture of a web crawler, the relevant crawling algorithms, and several methods of Chinese word segmentation. We then propose a system framework for Blog search and ranking and the design of its three functional modules.

Search engine for Blogs: We first introduce traditional search engine technology and then propose an optimized search engine algorithm.

Ranking for Blogs: Using techniques such as implicit links, we propose a novel content-based algorithm for Blog ranking. The algorithm considers both link analysis and content analysis of Blogs and mines more of their implicit features, such as common topics, to improve user satisfaction.
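The abstract does not give the details of the ranking algorithm, so the following is only an illustrative sketch of the general idea of combining link analysis with content analysis, not the thesis's method. It assumes that an "implicit link" can be approximated by cosine similarity between the term counts of two Blogs, and it feeds explicit and implicit links together into a PageRank-style iteration. The function names (`cosine`, `rank_blogs`) and the parameters `sim_threshold`, `implicit_weight`, and `damping` are hypothetical choices made for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_blogs(texts, links, sim_threshold=0.3, implicit_weight=0.5,
               damping=0.85, iterations=50):
    """Score Blogs from explicit hyperlinks plus assumed implicit links.

    texts: dict mapping Blog id -> text content
    links: dict mapping Blog id -> list of Blog ids it links to
    All thresholds and weights are illustrative assumptions.
    """
    blogs = sorted(texts)
    vectors = {b: Counter(texts[b].lower().split()) for b in blogs}

    # Explicit hyperlinks contribute a weight of 1.0 each.
    weights = {b: {} for b in blogs}
    for src in blogs:
        for dst in links.get(src, []):
            if dst in weights and dst != src:
                weights[src][dst] = weights[src].get(dst, 0.0) + 1.0

    # Assumed implicit links: pairs whose content is similar enough
    # (a stand-in for "common topics") get a symmetric, scaled weight.
    for i, a in enumerate(blogs):
        for b in blogs[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= sim_threshold:
                weights[a][b] = weights[a].get(b, 0.0) + implicit_weight * sim
                weights[b][a] = weights[b].get(a, 0.0) + implicit_weight * sim

    # PageRank-style power iteration over the combined weighted graph.
    n = len(blogs)
    scores = {b: 1.0 / n for b in blogs}
    for _ in range(iterations):
        new_scores = {b: (1.0 - damping) / n for b in blogs}
        for src in blogs:
            total = sum(weights[src].values())
            if total == 0.0:
                # Blog with no outgoing weight: spread its score evenly.
                for b in blogs:
                    new_scores[b] += damping * scores[src] / n
            else:
                for dst, w in weights[src].items():
                    new_scores[dst] += damping * scores[src] * w / total
        scores = new_scores
    return scores
```

For example, calling `rank_blogs` on three Blogs where two share most of their vocabulary and one is about an unrelated topic gives the two related Blogs extra weight toward each other even where no hyperlink exists between them. A real system for Chinese Blogs would replace the whitespace split with a proper word segmenter, as the abstract's mention of Chinese segmentation suggests.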
Keywords/Search Tags: Blog, Blog Search Engine, Ranking, Implicit Link, Content