Research On Key Techniques Of Consumption Intention Analysis For Social Media

Posted on: 2016-09-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Fu
Full Text: PDF
GTID: 1108330503469598
Subject: Computer Science and Technology
Abstract/Summary:
With the development of web technology, social media services such as Twitter and Sina Weibo have become an important platform for users to acquire information and communicate with each other. More and more users express their needs and desires on social media. Such data offer great opportunities to analyze users' consumption intention. The goal of this thesis is to automatically infer users' consumption intention from user-generated content or user preference on the Internet.

Previous work on consumption intention analysis can be classified into two main directions. The first is consumption intention detection based on user-generated content. We define a consumption intention in a microblog as containing two essential elements: the trigger words and the consumption intention targets. These two elements, which directly lead to the user's purchase intention, are important features for consumption intention detection. The second direction is consumption intention recognition based on user preference. User preference refers to information through which users implicitly indicate their needs and desires. Our research covers both directions. We explore the text information and crowd knowledge of users, using machine learning, statistical machine translation and information retrieval techniques, to investigate the consumption intention problem for social media. The main content of our research work can be summarized as follows.

1. Generating consumption intention data based on transfer feature learning. There is a lack of training data for consumption intention detection, and annotating such data manually is a challenging problem. To address this issue, we extract a large volume of training data from data that users annotate naturally (queries and clicked URLs). Specifically, the proposed method is formulated as transfer feature learning.
This method not only transfers common features across different domains but also effectively leverages the specific features of heterogeneous data, which gives it much better adaptability in learning.

2. Detecting consumption intention based on graph ranking. Prior work mainly focuses on supervised approaches for consumption intention detection. However, collecting a large amount of labeled data is time-consuming and labor-intensive. Given the lack of training data, we propose a weakly supervised graph ranking algorithm to detect consumption intention. This method can be applied when the volume of unlabeled data is large relative to the amount of labeled data, so that all data, labeled and unlabeled, are involved in the learning process of the graph ranking algorithm. First, an undirected graph is constructed in which each node is an instance from the labeled or unlabeled data. Then a numerical weight is assigned to the edge between two nodes according to their similarity. Finally, the nodes belonging to the given category are extracted according to the weights. Experimental results show that the graph ranking method is effective for consumption intention detection.

3. Extracting consumption intention targets based on a monolingual word alignment model. A consumption intention target is the product or service mentioned in a message in which a user expresses consumption intention. Consumption intention targets are usually represented as text fragments, that is, word sequences. Extracting consumption intention targets means extracting these word sequences from messages with consumption intention. The proposed method includes two stages: (1) consumption intention target candidate extraction; (2) consumption intention target modification. This thesis presents a novel method that uses a monolingual word alignment model in the first stage and a web mining method in the second.
Experimental results show that our approach significantly improves performance compared with the baseline methods.

4. Recognizing consumption intention based on user preference. Unlike previous work, such as consumption intention recognition based on user-generated content, this thesis is the first to present a method that uses preference information to automatically recognize consumption intention. Specifically, the proposed method treats consumption intention recognition as a multi-label classification problem and combines multiple features: follower tag features, domain tag features, retweet or reply behavior features, and user gender features. This thesis also proposes a method for automatically extracting a large set of linked users across social media platforms. With the proposed method, more than 120,000 pairs of linked users are extracted. Experimental results show that the multi-label classification-based method is effective for intention recognition. In particular, all of the exploited features help improve recognition performance.

In conclusion, this thesis not only focuses on recognizing messages with consumption intention in microblogs, but also applies cross-community user information to the consumption intention recognition task. This research has achieved some preliminary results, which we hope will be helpful to other researchers in this area. We believe that research on consumption intention analysis can make a great breakthrough as foundational NLP techniques and large-scale data processing capability improve. Moreover, progress in consumption intention analysis techniques can also advance other related research.
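
The three graph ranking steps summarized in item 2 (build an undirected graph over labeled and unlabeled messages, weight edges by similarity, rank nodes by weight) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the ranking update, so everything concrete here is an assumption made for illustration: cosine similarity over bag-of-words counts, a random walk with restart from the labeled nodes, the function name `rank_by_intention`, the damping value, and the example messages. It is not the thesis's actual algorithm.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_intention(texts, seed_indices, damping=0.85, iterations=50):
    """Score each message by a random walk with restart from labeled seeds.

    Hypothetical helper: the similarity measure and update rule are
    assumptions, not taken from the thesis.
    """
    bags = [Counter(t.lower().split()) for t in texts]
    n = len(bags)
    # Step 1 and 2: undirected graph with symmetric similarity weights.
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in weights]
    # Restart distribution concentrated on the labeled (seed) nodes.
    seeds = set(seed_indices)
    restart = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    scores = restart[:]
    # Step 3: propagate scores along weighted edges until roughly stable.
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(
                scores[j] * weights[j][i] / degree[j]
                for j in range(n)
                if degree[j] > 0
            )
            updated.append((1 - damping) * restart[i] + damping * incoming)
        scores = updated
    return scores


# Hypothetical messages; only the first one is treated as labeled.
texts = [
    "i want to buy a new phone",
    "want to buy a laptop soon",
    "the weather is nice today",
    "nice weather for a walk today",
]
scores = rank_by_intention(texts, seed_indices=[0])
ranking = sorted(range(len(texts)), key=lambda i: scores[i], reverse=True)
```

In this sketch, unlabeled messages that share vocabulary with the labeled seed receive more of the propagated score, so a threshold or top-k cut on `ranking` would play the role of extracting the nodes that belong to the consumption intention category.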
Keywords/Search Tags: consumption intention analysis, social media, graph ranking, consumption intention targets, user preference