Application And Research Of Academic Literature Search Engine Based On Classification

Posted on: 2017-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2308330503974533
Subject: Traffic Information Engineering & Control
Abstract/Summary:
As the amount of information on the Internet grows exponentially, the number of websites carrying academic information grows with it. In day-to-day retrieval of academic information, we found that common academic search engines rely almost entirely on full-text keyword search, yet most of the keywords matched on a web page do not reflect the subject of the article. In addition, the categories used by academic websites are mostly coarse and non-professional, so they give users little guidance. These defects greatly reduce the user experience.

To solve these problems, this thesis designs a classification-based academic literature search engine that uses the Chinese Library Classification as its classification scheme. To decide whether a web page is academic, a web academic judgment algorithm based on the Bayes algorithm is proposed; it makes the judgment by analyzing the content features, format features, and structural features of the page. For classification, the outline of the Chinese Library Classification supplies the categories, and the algorithm builds a vector space from the subject keywords of each web page in order to assign the page to the correct category.

These two key algorithms are combined in the system. For web page subject extraction, HTML Parser technology is combined with regular expressions to scrape the page content. Word segmentation uses the forward maximum matching algorithm. Finally, the open-source Lucene library is used to build an efficient index over the crawled web links so that user queries can be served. Based on the algorithms above, we built a classification-based academic literature search engine.
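The segmentation and classification steps described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy dictionary, the two category profiles, the raw term-frequency weights, and the use of cosine similarity are all assumptions made for illustration, since the abstract only states that forward maximum matching is used for segmentation and that a vector space is built from subject keywords.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; a real system would load a large dictionary.
DICTIONARY = {"交通", "信息", "工程", "交通信息", "控制", "搜索", "引擎", "搜索引擎"}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def forward_max_match(text, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    """Segment text left to right, always taking the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a match is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical keyword profiles for two Chinese Library Classification classes.
CATEGORY_PROFILES = {
    "U (Transportation)": Counter({"交通": 3, "交通信息": 2, "控制": 1}),
    "TP (Automation and Computer Technology)": Counter(
        {"搜索引擎": 3, "信息": 1, "引擎": 1}
    ),
}


def classify(text):
    """Return the category whose keyword profile is closest to the text."""
    vector = Counter(forward_max_match(text))
    return max(CATEGORY_PROFILES, key=lambda c: cosine(vector, CATEGORY_PROFILES[c]))


if __name__ == "__main__":
    print(forward_max_match("交通信息工程"))
    print(classify("交通信息工程控制"))
```

Because the matcher is greedy, "交通信息" is kept as one token rather than being split into "交通" and "信息"; this is the usual behaviour of forward maximum matching, and it depends entirely on which words the dictionary contains.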
Keywords/Search Tags: Network information processing, Search engine based on classification, Web academic judgment, Web page classification algorithm