
Content Resource Evaluation Based on Web Crawler

Posted on: 2016-11-15 | Degree: Master | Type: Thesis
Country: China | Candidate: B Hu | Full Text: PDF
GTID: 2308330503458819 | Subject: Education Technology
Abstract/Summary:
With the rapid development of the Internet and computer technology, the volume of online information is exploding. Combining web data capture with text analysis to evaluate content resources has become an active research field, and the approach is significant for research on teaching evaluation, film and television evaluation, literature evaluation, and similar areas. The topic of this thesis comes from the Ministry of Science and Technology project "Contentbank Evaluation System". The thesis analyzes in detail an evaluation model based on online information, and studies and designs the web data capture and text analysis techniques that the model requires.

The first step in evaluating content resources is to obtain network data. To obtain broader and more comprehensive data, this thesis designs an improved data acquisition method. A web crawler collects data from the traditional Internet; to adapt to a variety of website structures, a directional web crawler was designed and implemented. To obtain mobile Internet information, a proxy server is used to intercept and analyze datagrams. Experimental verification shows that the web crawler can effectively perform directional data acquisition.

After a large amount of data has been obtained, it must first be preprocessed and then analyzed with text analysis techniques. This thesis focuses on keyword extraction and text orientation identification. The purpose of keyword extraction is to locate the subject of a text; two methods were compared, and keyword extraction was implemented. A Naive Bayes classifier is used to identify the orientation of each text, and public opinion on a given subject is judged by counting the numbers of positive and negative texts. Experimental verification shows that the analysis results are in line with expectations and lay a good foundation for more complex text analysis in the future.
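
The orientation step described in the abstract (classify each text with Naive Bayes, then count positive and negative texts) can be illustrated with a minimal sketch. The abstract does not give the thesis's features, smoothing, tokenization, or data, so everything below is an assumption made for illustration: a multinomial Naive Bayes with add-one (Laplace) smoothing over pre-tokenized text, hypothetical function names (`train_naive_bayes`, `classify`, `count_orientations`), and a tiny hand-made training set.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labeled_docs):
    """Collect per-class document and word counts from (tokens, label) pairs."""
    class_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def classify(tokens, model):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):  # sorted for deterministic tie-breaking
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def count_orientations(docs, model):
    """Tally predicted labels over a collection of tokenized texts."""
    return Counter(classify(tokens, model) for tokens in docs)

# Hypothetical toy data, not taken from the thesis.
TRAIN = [
    (["great", "moving", "story"], "positive"),
    (["excellent", "acting", "great"], "positive"),
    (["boring", "weak", "story"], "negative"),
    (["terrible", "acting", "boring"], "negative"),
]
model = train_naive_bayes(TRAIN)
docs = [["great", "acting"], ["boring", "story"], ["excellent", "story"]]
tally = count_orientations(docs, model)
print(tally["positive"], tally["negative"])  # prints: 2 1
```

In this sketch the ratio of positive to negative counts in `tally` stands in for the public-opinion judgment on a subject; a real system would also need word segmentation for the source language and a much larger labeled corpus.
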
Keywords/Search Tags: Web crawler, proxy server, content resource evaluation, text analysis, keyword extraction