Extracting Opinion Targets Based On Conditional Random Field

Posted on: 2018-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xia
Full Text: PDF
GTID: 2428330569485390
Subject: Control Engineering

Abstract/Summary:
Opinion target extraction is defined as obtaining, from a comment text, the evaluation target that corresponds to the opinion content; the purpose is usually to identify the subject matter or the commodity attribute discussed in the context. With the popularity of the Internet, we face massive amounts of comment text when shopping online, browsing forums, and scanning the news, and we urgently need an efficient way to get the information we want. This paper studies the extraction of opinion targets based on the theory of conditional random fields (CRF). On the one hand, it analyzes how different contexts and features influence the results on single-field comment text data sets. On the other hand, it draws conclusions about how field correlation influences opinion target extraction and, for the multi-field case, looks for regular patterns in system performance when the training set is built by incremental mixing.

This paper takes comment texts from the Taobao website as the research object. The texts come from three different fields, and each field is divided into two sub-datasets, each about one kind of product. First, using two key techniques, the co-training algorithm and the naive Bayesian classifier, we preprocess the original dataset. Preprocessing consists of word segmentation, position labeling, syntactic analysis, and classification of the comment text, and it aims to transform the comment text from the sentence level to the vector level. Second, with reference to the baseline experiment, two additional features are proposed after analyzing the characteristics of Chinese text: a dependency feature and a word feature of the father (head) word's position. Then, based on the conditional random field, we train a model on the training set of text in different fields and label each feature vector in the test set. Lastly, we draw conclusions from the experimental results.

The experimental results show that, in the single-field case, when the context is limited to one word, using both the word before and the word after the current word gives the best extraction result. For feature combinations, the token feature is the most important. Combining the word feature with the part-of-speech feature gives the highest F value when only two features are used, and the best-performing combination is a different one when three features are used. Among the additional features, the combination of the basic features, the dependency feature, and the word feature of the father word's position gives the highest F value. In the multi-field case, the interaction coefficient between different sub-categories in the same field differs, and in incremental mixing, when the source field is the same as the target field, the higher the mixing ratio, the larger the F value of opinion target extraction.
Keywords/Search Tags:Opinion target, Syntactic analysis, Conditional random field, Tri-training, Naive bayesian classifier
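
As an illustration of the feature design described in the abstract, the following is a minimal sketch of how per-token feature vectors for a CRF could be assembled from the word, part-of-speech, dependency, and father-word features with a context window of one word on each side. It is not the thesis's implementation: the `Token` structure, the feature names, the `<PAD>` and `<ROOT>` placeholders, the example tags, the `T`/`O` label scheme, and the token-level F value are all assumptions made for this example.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    word: str  # segmented word
    pos: str   # part-of-speech tag (hypothetical tag set)
    dep: str   # dependency relation to the father word (hypothetical labels)
    head: int  # index of the father (head) word, -1 for the root


def token_features(sent: List[Token], i: int, window: int = 1) -> Dict[str, str]:
    """Build the feature dict for token i using a symmetric context window."""
    tok = sent[i]
    feats = {"word": tok.word, "pos": tok.pos, "dep": tok.dep}
    # Additional feature: the word located at the father word's position.
    feats["head_word"] = sent[tok.head].word if tok.head >= 0 else "<ROOT>"
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        inside = 0 <= j < len(sent)
        # Positions outside the sentence are filled with a padding symbol.
        feats[f"word[{off:+d}]"] = sent[j].word if inside else "<PAD>"
        feats[f"pos[{off:+d}]"] = sent[j].pos if inside else "<PAD>"
    return feats


def sentence_features(sent: List[Token], window: int = 1) -> List[Dict[str, str]]:
    """Turn one sentence into a sequence of feature dicts, one per token."""
    return [token_features(sent, i, window) for i in range(len(sent))]


def f_value(gold: List[str], pred: List[str], target: str = "T") -> float:
    """Token-level F value for the target label (an assumed evaluation unit)."""
    tp = sum(g == target and p == target for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / sum(p == target for p in pred)
    recall = tp / sum(g == target for g in gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: "屏幕 很 清晰" ("the screen is very clear").
example = [
    Token("屏幕", "n", "SBV", 2),
    Token("很", "d", "ADV", 2),
    Token("清晰", "a", "HED", -1),
]
example_features = sentence_features(example, window=1)
```

Feature dicts of this shape can be passed to a CRF toolkit that accepts per-token feature mappings; widening `window` or dropping keys from the dict is one way to reproduce the kind of context and feature-combination comparisons the abstract reports.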