Consumption Intent Mining And Classification In Micro-blogging

Posted on: 2013-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: H D Gao
Full Text: PDF
GTID: 2268330392469327
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, microblogging (micro blog) has steadily risen in popularity. In recent years it has worked its way into every aspect of people's lives: people not only receive a variety of information from microblogs but also actively publish a wide variety of information, and sometimes even express a desire to buy a product, that is, they have consumption intent for certain products. Microblogging is both social media and a social network, and it also holds considerable commercial value.

We carried out a series of studies on consumption intent, focusing on three aspects: mining external consumption intent resources based on Bootstrapping; mining microblogs with consumption intent based on a graph model; and classifying microblog consumption intent based on SVM. Each study is briefly described below.

(1) Mining the external consumption intent resources based on Bootstrapping. In etao qiugou, many items of information contain consumption intent. Because they come from outside microblogging, these items can be called external consumption intent resources. A large amount of labeled data is a very important resource in natural language processing research. However, microblogging consumption intent is a new research direction, and the lack of labeled data and of a standard evaluation data set hinders research on it. This paper presents a semi-supervised approach for labeling external consumption intent resources based on Bootstrapping. The advantage of this approach is that it uses a small amount of manually labeled data to label the category of a large amount of unlabeled data.

(2) Mining consumption intent microblogging based on the graph model. It is very difficult to mine microblogs with consumption intent directly from the mass of microblogging text, because only about 3% of all microblogs contain consumption intent. Therefore, we manually extract a number of feature phrases that signal consumption intent and filter the microblogs with these feature phrases, which greatly increases the proportion of the filtered microblogs that contain consumption intent. We then use a graph model-based approach to mine a large number of microblogs with consumption intent. The graph model algorithm used here, called CIRank, is derived from the traditional TextRank and RelevanceRank algorithms. Finally, we use three methods to calculate the similarity of nodes in the graph: the Jaccard coefficient, cosine similarity, and consumption intent similarity.

(3) Microblogging consumption intent classification based on SVM. This paper presents a method for classifying the consumption intent of a given microblog with SVM classifiers, using external consumption intent resources and real microblogging text to train the classification model. Note that the classification is binary: for a given microblog, the classifier outputs whether or not it contains consumption intent.
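Two of the node similarity measures named in part (2), the Jaccard coefficient and cosine similarity, have standard definitions and can be sketched as below. This is only an illustration under stated assumptions: the abstract does not say how microblogs are tokenized or weighted, so the whitespace tokens, the raw term counts, and the function names `jaccard` and `cosine` are hypothetical choices, and the thesis-specific consumption intent similarity is not defined in the abstract and is therefore not shown.

```python
import math
from collections import Counter


def jaccard(tokens_a, tokens_b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B| over the two token sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty posts: define similarity as 0
    return len(set_a & set_b) / len(union)


def cosine(tokens_a, tokens_b):
    """Cosine similarity between raw term-count vectors of the two posts."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty post has no direction to compare
    return dot / (norm_a * norm_b)


# Made-up example posts (assumption: whitespace tokenization).
post_a = "i want to buy a new phone".split()
post_b = "want to buy a cheap phone".split()

print(jaccard(post_a, post_b))  # 5 shared tokens out of 8 distinct tokens
print(cosine(post_a, post_b))
```

In a graph model of the kind described, a score like either of these would serve as the edge weight between two microblog nodes; how CIRank combines such weights is not specified in the abstract.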
Keywords/Search Tags: Microblogging consumption intent, mining, classification, graph model, Bootstrapping