
Study And Realization Of Template-based Web Crawler And Editing System

Posted on: 2013-03-07  Degree: Master  Type: Thesis
Country: China  Candidate: H M Zhou  Full Text: PDF
GTID: 2248330374476310  Subject: Software engineering
Abstract/Summary:
According to Ministry of Education statistics, the number of university graduates in the country will reach 6.6 million in 2012, and the employment situation is grim. To find a satisfactory job, graduates need not only relevant skills but also a good source of information that gives them timely access to job postings. Graduates currently obtain job information in three main ways: 1) college employment websites, which carry little information and which some colleges do not have at all; 2) recruitment websites, most of which target people already in the workforce and are poorly suited to graduates; 3) search engines, which pursue general-purpose goals, so they cannot support personalized search and their information is often out of date.

This thesis proposes a Template-Based Web Crawler And Editing System to address these problems. The system collects job information from college career centers and structures it, thereby integrating the information. This meets graduates' demand for a large volume of information, maintains the quality of that information, and serves their personalized information needs.

The thesis designs and implements a web crawler based on template technology. The crawler uses templates to extract information from college career center sites; the information is then structured and published on an external recruitment website through a web editing system. Experimental results show that the system can extract recruitment information and provides convenient functions for processing structured information. The system has good practical utility.

First, the thesis introduces the research background of the Template-Based Web Crawler And Editing System and surveys domestic and international research on information extraction technology. Second, it describes the open source frameworks and the information extraction strategy used, along with the design and implementation of the system. Finally, it summarizes the practical value of the research work and lists its innovations and directions for further work.
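
The template-driven extraction outlined in the abstract can be illustrated with a minimal sketch. The abstract does not specify the thesis's template format or the frameworks it uses, so everything here is an assumption made for illustration: the template is modeled as a mapping from field names to regular expressions, and the names `TEMPLATE`, `SAMPLE_PAGE`, and `extract`, the field names, and the HTML markup are all hypothetical.

```python
import html
import re
from typing import Dict, Optional

# Hypothetical template for one career-center site (an assumption, not the
# thesis's format): each field name maps to a regex with one capture group.
TEMPLATE: Dict[str, str] = {
    "title": r"<h1[^>]*>(.*?)</h1>",
    "company": r'<span class="company">(.*?)</span>',
    "deadline": r'<span class="deadline">(\d{4}-\d{2}-\d{2})</span>',
}

# Hypothetical job-posting page used only to exercise the sketch.
SAMPLE_PAGE = """
<html><body>
<h1>Software Engineer (Campus Hire)</h1>
<span class="company">Example Tech &amp; Co.</span>
<span class="deadline">2012-11-30</span>
</body></html>
"""


def extract(page: str, template: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Apply every field pattern in the template to the page.

    Returns a structured record; a field whose pattern does not match
    is set to None so an editor can fill it in later.
    """
    record: Dict[str, Optional[str]] = {}
    for field, pattern in template.items():
        match = re.search(pattern, page, flags=re.DOTALL | re.IGNORECASE)
        if match:
            # Decode HTML entities and trim surrounding whitespace.
            record[field] = html.unescape(match.group(1)).strip()
        else:
            record[field] = None
    return record


if __name__ == "__main__":
    print(extract(SAMPLE_PAGE, TEMPLATE))
```

In this sketch, supporting another career-center site means writing a new template rather than new crawler code, and fields left as `None` mark the records that would need manual attention in an editing step.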
Keywords/Search Tags:Information Integration, Template Technology, Web Crawler, Structure