Massive Web Of Public Opinion Mining Algorithm

Posted on: 2012-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: M Zhou
GTID: 2208330332486730
Subject: Software engineering
Abstract/Summary:
Web public opinion is defined as people's attitudes toward social events as expressed and spread through forums and blogs. The open and free nature of the Internet makes it convenient for everybody, and a growing number of Internet users are willing to express their views on forums and blogs. In a web forum, people can speak freely about any social problem, and their views can coalesce into a consensus in a very short time, which in turn affects society. However, because of limited experience, the views expressed by Internet users are often extreme and one-sided. The popularity of the network makes web public opinion more and more important, and many agencies and institutions now pay increasing attention to monitoring and researching it. The Internet has become a center of ideology and culture and an amplifier of public opinion. Research on how to discover and monitor web public opinion is therefore necessary.
The network holds massive amounts of data, so it is impossible to collect the data and discover web public opinion manually. To solve this problem, the paper studies techniques for web information collection and web public opinion discovery. The main topics of study are web crawlers, parallel computing, data partitioning, and web public opinion discovery.
First, the paper proposes a web crawler that combines the merits of a general crawler and a topic-focused crawler and can change its crawling strategy, which makes the crawler friendlier and uses network resources effectively. Second, the paper analyzes the characteristics of web public opinion and proposes a classification algorithm. At the same time, the paper uses data partitioning to make the hierarchical clustering algorithm better suited to parallel computing. Third, the paper implements the data partitioning algorithm on Hadoop, and the test results show that Hadoop is well suited to handling massive data.
Finally, based on this study, the paper designs and implements a web public opinion discovery system with a user-friendly interface. Experiments show that the system can gather web information effectively and discover web public opinion accurately.
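The abstract does not describe how the crawler switches strategies, so the following is only a minimal sketch of one way a crawl frontier could combine a general (breadth-first) order with a topic-focused (best-first) order. The class name `SwitchableFrontier`, the anchor-text scoring rule, and the example URLs are all hypothetical and are not taken from the thesis.

```python
import heapq
from collections import deque
from itertools import count


class SwitchableFrontier:
    """Hypothetical URL frontier that can switch between two crawl orders.

    "general": breadth-first, URLs leave in the order they were added.
    "focused": best-first, URLs with the highest topic score leave first.
    """

    def __init__(self, topic_terms, strategy="general"):
        self.topic_terms = {t.lower() for t in topic_terms}
        self.strategy = strategy
        self._fifo = deque()     # every pending URL, in arrival order
        self._heap = []          # (-score, arrival index, url)
        self._seen = set()       # URLs ever added, to avoid duplicates
        self._done = set()       # URLs already handed out
        self._counter = count()  # tie-breaker so equal scores stay in arrival order

    def score(self, anchor_text):
        """Assumed relevance: fraction of topic terms found in the anchor text."""
        if not self.topic_terms:
            return 0.0
        words = set(anchor_text.lower().split())
        return len(words & self.topic_terms) / len(self.topic_terms)

    def add(self, url, anchor_text=""):
        if url in self._seen:
            return
        self._seen.add(url)
        self._fifo.append(url)
        entry = (-self.score(anchor_text), next(self._counter), url)
        heapq.heappush(self._heap, entry)

    def set_strategy(self, strategy):
        if strategy not in ("general", "focused"):
            raise ValueError(f"unknown strategy: {strategy}")
        self.strategy = strategy

    def pop(self):
        """Return the next URL under the current strategy, or None if empty."""
        while True:
            if self.strategy == "general":
                if not self._fifo:
                    return None
                url = self._fifo.popleft()
            else:
                if not self._heap:
                    return None
                url = heapq.heappop(self._heap)[2]
            if url not in self._done:  # skip URLs already served by the other queue
                self._done.add(url)
                return url


frontier = SwitchableFrontier(["opinion", "forum"])
frontier.add("http://example.com/a", "sports news")
frontier.add("http://example.com/b", "forum opinion thread")
frontier.add("http://example.com/c", "opinion poll")

frontier.set_strategy("focused")
first = frontier.pop()    # highest topic score: .../b
frontier.set_strategy("general")
second = frontier.pop()   # oldest remaining URL: .../a
third = frontier.pop()    # .../c
```

Keeping every URL in both queues and skipping already-served entries on pop lets the strategy change at any time without rebuilding the frontier, at the cost of storing each URL twice.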
Keywords/Search Tags: web crawler, web public opinion discovery, classification, clustering