The Crawling, Indexing and Searching of Blog Resources

Posted on: 2009-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Xu
Full Text: PDF
GTID: 2178360245995821
Subject: Software engineering
Abstract/Summary:
In recent years, with economic and cultural development, the information resources on the Internet have grown rapidly, and the forms in which information is presented have become varied and complex. Information services of all kinds, such as BBS, blogs, video blogs and online storage, are developing. Faced with so many kinds of Internet information, users need an effective way to find the valuable information they want, and search engines have proved to be the right tool. As an object of research and development in computer and information science, web search engines are steadily maturing. Blogs, a relatively new Internet service, are attracting attention and use from a large number of people, but at this stage of their development the search engine technology applied to blog search still needs improvement. Specifically, a blog is a shared space on the Internet in which people publish articles, photos, videos and so on as a daily diary. Traditional search engines, however, cannot meet users' demands for timeliness, coverage and page analysis when they search for blog resources.
How to design a search engine that crawls blog pages accurately and covers blog resources comprehensively has therefore become a hot topic and a challenge in search engine research. At the same time, because this engine is part of a campus search system, the blog resources it collects should be as close to campus life as possible. How to make a blog search engine crawl pages on a specific topic, and how to sort the results automatically, have become active research questions. Restricting the engine to resources related to campus life is one application of a topic-specific search engine. Today's well-known search engine companies all have shortcomings, to a greater or lesser degree, when used to search for blog resources. Although many service providers have introduced blog search functions, they still fall short in topic-specific search and sorting. In this thesis I design a blog search engine that builds on a traditional search engine and adds topic-specific search and automatic sorting, so that it runs correctly within the campus search system. The thesis describes the development process and methods of this search engine, and then analyzes its software architecture, data structures and so on. It also introduces the plug-in system and the distributed computing platform, and treats them as fundamental parts of the system's design and development. Based on a study of the basic concepts, key technologies and procedures, and on an analysis of blog search requirements, I build the blog search engine system with Lucene, a full-text search engine toolkit, and complete the overall design of the blog search engine framework and its implementation.
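The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: keep crawled blog posts that are on a campus topic, index them for full-text lookup, and return results in a fixed order. Everything in it is an assumption made for illustration, including the `CAMPUS_TERMS` vocabulary, the `BlogPost` and `BlogIndex` names, the 0.1 relevance threshold and the newest-first ordering. The small in-memory inverted index stands in for the role that Lucene plays in the thesis; it does not use or imitate the Lucene API.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

# Hypothetical campus vocabulary; the thesis does not list its topic terms.
CAMPUS_TERMS = {"campus", "student", "lecture", "dormitory", "exam"}


@dataclass
class BlogPost:
    url: str
    text: str
    published: date


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def topic_score(post):
    """Fraction of a post's tokens that belong to the campus vocabulary."""
    tokens = tokenize(post.text)
    if not tokens:
        return 0.0
    return sum(t in CAMPUS_TERMS for t in tokens) / len(tokens)


class BlogIndex:
    """Toy inverted index that only accepts on-topic posts."""

    def __init__(self, threshold=0.1):
        self.threshold = threshold          # assumed relevance cut-off
        self.postings = defaultdict(set)    # term -> ids of posts containing it
        self.posts = []                     # post id -> BlogPost

    def add(self, post):
        """Index the post if it is on topic; return True if it was kept."""
        if topic_score(post) < self.threshold:
            return False
        post_id = len(self.posts)
        self.posts.append(post)
        for term in set(tokenize(post.text)):
            self.postings[term].add(post_id)
        return True

    def search(self, query):
        """Return posts containing every query term, newest first."""
        terms = tokenize(query)
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted((self.posts[i] for i in ids),
                      key=lambda p: p.published, reverse=True)
```

A crawler would call `add` for each fetched post and discard those for which it returns `False`. Sorting by publication date is only one possible reading of the "automatic sorting" the abstract mentions; a real system could instead rank by relevance or group results by category.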
Keywords/Search Tags: Search Engine, Blog, Blog Search Engine, Plug-in, Distribution, Lucene