Design And Implementation Of Web Public Opinion Information Automatic Acquisition System

Posted on: 2015-10-18
Degree: Master
Type: Thesis
Country: China
Candidate: Z Li
Full Text: PDF
GTID: 2308330473458355
Subject: Software engineering
Abstract/Summary:
Public opinion, as a mass social phenomenon, is the aggregate of views and attitudes held by members of society. Understanding it helps the government maintain social stability, understand social problems, and improve its credibility. For companies, accurately grasping customers' views and suggestions about products and services helps improve the quality of those products and services and is of far-reaching strategic significance for overall competitiveness. The rise of Web 2.0 brings both challenges and opportunities for the automated collection of public opinion information from the web. Because the web is a main source of public opinion information, research should focus on techniques for automating its collection. Judging from existing research results, web public opinion collection must address issues such as mining massive data, analyzing data in real time, and the accuracy of the analysis.

This thesis first summarizes the current state of research on Web information extraction technology and then analyzes the existing research results in detail. Based on the actual needs of the project, it proposes its own method for collecting web public opinion information. The main contents are as follows:

1. Existing information collection models and algorithms are studied, and their functions, advantages, and disadvantages are compared and analyzed. The collection models mainly include the understanding model, the object model, and the visual model; the collection algorithms include the ontology-based algorithm, the Markov algorithm, and others. A comprehensive summary is given.
2. A visual technique for generating collection templates is studied and proposed. User operations (including clicking a next-page link or button, clicking an element on the web page, selecting from a drop-down list, etc.) are converted into a collection template, which lowers the difficulty of producing templates and improves the efficiency of producing them.
3. A web page text extraction subsystem based on a block distribution function is implemented. It applies XPath and regular expression techniques, and the integrated system combines statistical and rule-based approaches to address the problem of the system's generality.
4. Data processing such as cluster analysis is implemented on the collected web information, ultimately providing users with comprehensive public opinion browsing and hot topic discovery.
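The combination of statistics and rules mentioned for the text extraction subsystem can be illustrated with a minimal sketch. It is not the thesis's implementation: the sample page, the function names, and the scoring formula (text length minus a penalty for link text) are all assumptions chosen for illustration, and the score is a simple stand-in for whatever statistic the thesis actually uses. The sketch uses Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset and requires well-formed markup, so real-world HTML would need a more tolerant parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample page. It must be well-formed XML because
# ElementTree is an XML parser, not a tolerant HTML parser.
SAMPLE_PAGE = """
<html><body>
  <div id="nav"><a href="/">Home</a> <a href="/news">News</a> <a href="/about">About</a></div>
  <div id="content">
    <p>Posted on 2015-10-18 by staff.</p>
    <p>Residents discussed the new transit plan at length, and most comments
    focused on ticket prices and late-night service.</p>
  </div>
  <div id="footer"><a href="/contact">Contact</a></div>
</body></html>
"""

# Rule: a regular expression for dates written as YYYY-MM-DD (assumed format).
DATE_RULE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def normalized_text(element):
    """Return all text under the element with whitespace collapsed."""
    return " ".join("".join(element.itertext()).split())


def link_text_length(element):
    """Total length of the text that sits inside <a> elements."""
    return sum(len(normalized_text(a)) for a in element.iter("a"))


def block_score(element):
    """Statistic (assumed): long text is good, link-heavy text is penalized."""
    return len(normalized_text(element)) - 2 * link_text_length(element)


def extract(page):
    """Pick the highest-scoring <div>, then apply the date rule to its text."""
    root = ET.fromstring(page.strip())
    candidates = root.findall(".//div")  # XPath subset supported by ElementTree
    main_block = max(candidates, key=block_score)
    text = normalized_text(main_block)
    match = DATE_RULE.search(text)
    return {"text": text, "date": match.group(0) if match else None}


if __name__ == "__main__":
    result = extract(SAMPLE_PAGE)
    print(result["date"])
    print(result["text"])
```

The statistical step selects the main block without any site-specific template, which is what gives this kind of approach some generality, while the regular expression handles fields with a predictable shape.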
Keywords/Search Tags: public opinion search, information extraction, web content mining