
Research And Implementation Of Domain-Specific Topic Search System

Posted on: 2013-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: M Xu
GTID: 2268330398970499
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid growth of information on the Internet, spam and duplicate content have become increasingly common. General-purpose search engines have difficulty satisfying users' professional and personalized search needs, which has led to the emergence of topic search engines. Building on the current state of research in topic search, this thesis makes a further study of domain-specific topic search systems.

Governments and enterprises currently publish a massive volume of bidding information, and obtaining it in a timely manner is of great significance. This thesis therefore takes bidding as its target domain, and investigates and implements a topic search system for that domain. The main research work and contributions are as follows:

First, the thesis proposes a web page filtering method based on dual feature selection. It improves the CHI (chi-square) feature selection method, proposes a dual feature selection algorithm, and performs binary classification using an improved TF-IDF formula and an SVM classifier. The experimental results show that the method is more effective for web page filtering.

Second, the thesis proposes an incremental crawling model for bidding sites. Based on seven characteristics of how information changes on bidding sites, it describes the model from three aspects: the object of incremental crawling, the method of incremental crawling, and the timing of incremental crawling. Experiments confirm the validity of the model.

Third, the thesis designs and implements a topic search system for the bidding domain. It gives detailed designs for the topic crawler module, web filtering module, information extraction module, text classification module, and incremental crawling module. The system performs well.

A topic search system for the bidding domain can meet the demand of governments and enterprises for up-to-date bidding information, and therefore has considerable practical significance.
Keywords/Search Tags:topic search, web filtering, dual feature selection, incremental crawler
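
For readers unfamiliar with the CHI feature selection step that the abstract builds on, the following is a minimal sketch of the standard chi-square statistic for a binary (on-topic vs. off-topic) corpus. It is not the thesis's improved CHI or its dual feature selection algorithm, whose details the abstract does not give. The function names `chi_square` and `select_features` and the toy documents are assumptions made for illustration only.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Standard chi-square score of a term for a two-class problem.

    a: on-topic documents containing the term
    b: off-topic documents containing the term
    c: on-topic documents without the term
    d: off-topic documents without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, k):
    """Return the k terms with the highest chi-square score.

    docs: list of token lists; labels: list of bool (True = on-topic).
    Ties are broken alphabetically so the result is deterministic.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for tokens, is_pos in zip(docs, labels):
        # Count document frequency, so each term counts once per document.
        (pos_df if is_pos else neg_df).update(set(tokens))
    scores = {}
    for term in set(pos_df) | set(neg_df):
        a, b = pos_df[term], neg_df[term]
        scores[term] = chi_square(a, b, n_pos - a, n_neg - b)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus: two bidding pages and two unrelated pages.
docs = [
    ["bid", "tender", "notice"],
    ["bid", "tender"],
    ["news", "sports"],
    ["news", "notice"],
]
labels = [True, True, False, False]
top_terms = select_features(docs, labels, 3)
```

In this toy corpus the term "notice" appears equally often in both classes and scores zero, so it is not selected. The standard statistic also ranks "news" highly even though it only indicates the off-topic class, since it measures association in either direction.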