
Fine-grained Opinion Mining Based On Social Media

Posted on: 2014-09-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Y Wang
Full Text: PDF
GTID: 1318330398955434
Subject: Management Science and Engineering
Abstract/Summary:
With the rapid development of Web 2.0 and mobile Internet technology, social media has gradually become the main channel through which people access and create information. The widespread use of social media generates vast amounts of user-generated content, which carries the rich views and opinions that users hold. These views and opinions are an important basis for comprehensive decision-making. However, the relevant information is massive and of varying quality. How to automatically extract the content people care about most and the most representative views from it, efficiently, accurately and comprehensively, has become an urgent problem as social media develops.

Opinion mining is a new technology that meets people's need to analyze subjective information by combining text understanding with text data mining. It uses NLP, information extraction and data mining techniques to identify and extract the most representative views from large amounts of subjective text and to summarize all of the views. Opinion mining has therefore become the main approach to solving the above problem. However, compared with general web text, social media information has obvious social and dissemination characteristics, which reflect to some degree how much attention people pay to that information. The paper therefore takes the processing demands of these social media features into account during opinion mining, and studies social media opinion mining from a fine-grained analysis perspective.

Specifically, the paper focuses on the following four key questions:

(1) Selection of the opinion mining corpus

In the social media environment, users become the creators of information. Because users publish information freely and casually, the quality of social media information is uneven and it contains a great deal of noise and false content, which reduces the efficiency of opinion mining.
The credibility and reliability of social media information are the basis for ensuring the efficiency of opinion mining. How to control the information quality of the corpus and select high-quality information to build it has become a challenge for advancing opinion mining research.

(2) Fine-grained extraction of users' opinions

Fine-grained extraction of users' opinions means extracting the evaluation object and the corresponding opinion from the corpus to obtain feature-opinion pairs. Extracting feature-opinion pairs is both the focus and the main difficulty of fine-grained opinion mining research. Because user-generated content is casual, users do not deliberately follow the SVO (subject-verb-object) sentence structure when publishing; at the same time, they like to use popular web expressions, abbreviations, nonstandard spellings, mixed Pinyin and English, and so on. The paper studies methods for extracting fine-grained user opinions on the basis of an analysis of these features of social media content.

(3) Judging the polarity of opinion words in social media information

Social media pools the wisdom of ordinary web users, so new buzzwords appear on social media platforms in an endless stream. How to identify these emerging opinion words and judge their polarity poses challenges for sentiment analysis of users' opinions.

(4) Statistical analysis and visualization of users' opinions

The social and dissemination characteristics of social media information reflect, to a certain degree, how much attention people pay to that information. Their impact must therefore be considered when designing the statistical analysis indicators used to summarize users' opinions, so that the opinion mining results are more scientific and reasonable.
At the same time, to better express the mining results, it is necessary and worthwhile to study how to analyze and present users' opinions along multiple dimensions and to look for new ways of classifying those dimensions in user opinion mining.

To address the above four key issues, the paper first summarizes existing research results and, on this basis, presents its work in six chapters.

The first chapter describes the theoretical basis of the paper. It summarizes and introduces the main related theories, algorithms and tools used in the paper.

The second chapter studies methods for filtering social media content. It first analyzes the characteristics and manifestations of social media spam. On this basis, it proposes a rule-based approach to filter explicit spam, a web-dictionary-based approach to filter ultra-short spam messages, and a semi-supervised machine learning method, which considers features of both the spam publisher and the spam content, to identify false information.

The third chapter studies feature-opinion pair extraction. It begins with an analysis of the characteristics of social media information and, on this basis, identifies how topic features and opinion words are constructed and puts forward chunking rules. It then proposes a feature-opinion extraction algorithm that applies dependency syntax to the extracted topic feature and opinion word chunks.

The fourth chapter studies sentiment analysis of opinion words, covering both their classification and their quantification. To analyze the sentiment polarity of opinion words, the paper first builds a sentiment dictionary and uses a matching method to determine the initial polarity of words registered in the dictionary. A method is then needed to judge the polarity of opinion words that are not registered in the dictionary.
The paper selects the improved SO_PMI_IR method to judge the polarity of opinion words not registered in the dictionary, on the basis of an analysis of sentiment classification methods and comparison experiments. Finally, the paper uses dependency syntactic analysis to identify the qualifiers of opinion words, and modifies the polarity value using an algorithm and a qualifier table.

The fifth chapter presents the fine-grained social media opinion mining model. It brings together the key content of the previous three chapters to build the model, and then describes the key task of each step. To obtain more reasonable summary results, the paper takes into account the social and communication features of social media information and designs statistical analysis indicators for users' opinions. Finally, through experiments, the paper produces visualizations of the fine-grained opinion mining results.

The sixth chapter is the summary and outlook. It summarizes the work done in the paper and points out the shortcomings of the current work. Finally, it analyzes prospects and directions for future research.
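
The abstract names SO_PMI_IR as the starting point for judging the polarity of opinion words missing from the sentiment dictionary, but does not say how the method was improved. As a rough illustration of the underlying idea only, the sketch below scores a word by how much more strongly it co-occurs with positive seed words than with negative ones, using pointwise mutual information (PMI). It is not the dissertation's algorithm: the toy corpus, the seed lists, the smoothing constant `eps`, and the function names `hits`, `pmi` and `so_pmi` are all assumptions made for this example, and document co-occurrence counts stand in for the search-engine hit counts that the "IR" variant relies on.

```python
import math
from typing import List, Sequence, Set


def hits(docs: List[Set[str]], *words: str) -> int:
    """Count documents that contain every given word.

    Assumption: this stands in for search-engine hit counts.
    """
    return sum(1 for doc in docs if all(w in doc for w in words))


def pmi(docs: List[Set[str]], a: str, b: str, eps: float = 0.01) -> float:
    """Smoothed pointwise mutual information between two words.

    eps is a hypothetical smoothing constant that avoids log(0)
    when two words never co-occur.
    """
    n = len(docs)
    joint = hits(docs, a, b) + eps
    return math.log2((joint * n) / ((hits(docs, a) + eps) * (hits(docs, b) + eps)))


def so_pmi(word: str, docs: List[Set[str]],
           pos_seeds: Sequence[str], neg_seeds: Sequence[str]) -> float:
    """Semantic orientation: association with positive seeds minus
    association with negative seeds. A value above zero suggests
    positive polarity, a value below zero suggests negative polarity."""
    positive = sum(pmi(docs, word, seed) for seed in pos_seeds)
    negative = sum(pmi(docs, word, seed) for seed in neg_seeds)
    return positive - negative


# Toy corpus (invented for illustration): each post is a set of tokens.
DOCS = [
    {"screen", "awesome", "good"},
    {"battery", "awesome", "good"},
    {"camera", "lousy", "bad"},
    {"price", "lousy", "bad"},
    {"screen", "good"},
]
POS_SEEDS = ["good"]
NEG_SEEDS = ["bad"]

for candidate in ("awesome", "lousy"):
    score = so_pmi(candidate, DOCS, POS_SEEDS, NEG_SEEDS)
    label = "positive" if score > 0 else "negative"
    print(f"{candidate}: {score:.2f} ({label})")
```

In this toy setting the sign of the score gives the polarity class and its magnitude gives a crude strength value, which loosely mirrors the abstract's split between classification and quantification of opinion words. A real system would need a much larger corpus and carefully chosen seed words.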
Keywords/Search Tags:social media, opinion mining, spam filtering, sentiment classification, fine-grained