A Novel Chinese Subjective Sentences Recognition Method Based On Word Co-occurrence Relationship Graphic Model

Posted on:2016-10-17

Degree:Master

Type:Thesis

Country:China

Candidate:C Q Fu

GTID:2308330470464019

Subject:Computer Science and Technology

Abstract/Summary:

With the spread of Web 2.0, the idea of moving "from user-centered design to participatory design" has become a widely advocated concept on the Internet. Nowadays, new media such as forums, post bars, blogs, and microblogs provide users with a freer platform for communication. More and more users express, spread, or exchange their personal views and ideas via the Internet. This kind of user-generated content carries considerable commercial and social value. Therefore, accurately extracting and recognizing subjective sentences from large volumes of text is of both theoretical and practical importance.

Currently, the main methods for subjective sentence recognition adopt the vector space model (VSM) to represent documents. That is, each document is represented as a term vector or feature vector. However, this representation rests on the strong assumption of term independence and does not consider the order of, or the dependency between, any two terms. Based on this observation, in this paper we propose a novel graph model-based method, driven by term co-occurrence relationships, to recognize Chinese subjective sentences. The method describes the difference in term distribution between the subjective and non-subjective sentence sets via a term co-occurrence relationship graph model and semantic information, and can thereby effectively capture the semantic information within Chinese subjective sentences.
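The term-independence limitation described above can be illustrated with a toy example. This is only a sketch: the two pre-segmented sentences and the helper names `bag_of_words` and `ordered_pairs` are hypothetical and do not come from the thesis. It shows that a bag-of-words representation cannot tell apart two sentences with the same words in a different order, while directed co-occurrence pairs can.

```python
from collections import Counter


def bag_of_words(tokens):
    """Vector space view: only term counts survive, word order is discarded."""
    return Counter(tokens)


def ordered_pairs(tokens, window=2):
    """Directed co-occurrence view: (earlier term, later term) pairs
    for terms that fall inside the same sliding window."""
    pairs = set()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            pairs.add((left, right))
    return pairs


# Hypothetical pre-segmented sentences; real input would come from a
# Chinese word segmenter.
a = ["我", "喜欢", "你"]
b = ["你", "喜欢", "我"]

same_bag = bag_of_words(a) == bag_of_words(b)      # identical term counts
same_pairs = ordered_pairs(a) == ordered_pairs(b)  # edge directions differ
print(same_bag, same_pairs)  # prints: True False
```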
Meanwhile, unlike the traditional VSM-based feature value calculation, the method combines the in-degree-based term weighting of the graph model with a composite feature value calculation method from information retrieval to calculate the emotional value of the terms in the graph model. Experimental results on the corpus show that the method significantly improves the performance of Chinese subjective sentence recognition and outperforms state-of-the-art methods.

The main work of this paper consists of the following three parts:

1) First, we build a directed term co-occurrence relationship graph for the subjective and the non-subjective sentence sets, respectively. Specifically, we describe the co-occurrence, the syntactic relationships, and the distribution difference of terms.
2) Second, we combine the in-degree-based term weighting of the graph model with the composite feature value calculation method from information retrieval to calculate the emotional value of the terms in the graph model. Meanwhile, we train an SVM classifier to identify Chinese subjective sentences based on the above method. To verify the effectiveness of our method, we also set up comparison experiments against current representative models.
3) Finally, we tune parameters such as the sliding window size and the edge direction of the directed graph in order to further improve the performance of Chinese subjective sentence identification.
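The three parts above can be sketched roughly as follows. The abstract does not give the exact weighting formula, so the normalised weighted in-degree used here is an assumption, as are the function names, the toy sentences, and the two-feature sentence representation. The sketch only illustrates the general idea: one directed co-occurrence graph per class, built with a sliding window and a configurable edge direction, with term weights derived from in-degree.

```python
from collections import defaultdict


def build_graph(sentences, window=3, forward=True):
    """Directed co-occurrence graph as {(source, target): count}.
    Two terms are linked when they fall inside the same sliding window.
    forward=True points each edge from the earlier term to the later one."""
    edges = defaultdict(int)
    for tokens in sentences:
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + window]:
                if left == right:
                    continue  # skip self-loops
                edge = (left, right) if forward else (right, left)
                edges[edge] += 1
    return dict(edges)


def indegree_weights(edges):
    """Assumed weighting: weighted in-degree of each term divided by the
    total edge weight of the graph."""
    indeg = defaultdict(int)
    for (_, target), count in edges.items():
        indeg[target] += count
    total = sum(edges.values())
    return {term: w / total for term, w in indeg.items()} if total else {}


def sentence_features(tokens, subj_w, nonsubj_w):
    """Two features per sentence: summed term weight under each class graph."""
    return [sum(subj_w.get(t, 0.0) for t in tokens),
            sum(nonsubj_w.get(t, 0.0) for t in tokens)]


# Hypothetical pre-segmented training sentences for each class.
subjective = [["我", "觉得", "很", "好"], ["我", "觉得", "不错"]]
non_subjective = [["会议", "在", "北京", "举行"]]

subj_edges = build_graph(subjective, window=3)
subj_w = indegree_weights(subj_edges)
nonsubj_w = indegree_weights(build_graph(non_subjective, window=3))

features = sentence_features(["我", "觉得", "很", "好"], subj_w, nonsubj_w)
print(features)  # prints: [0.75, 0.0]
```

Feature vectors of this kind could then be passed to a classifier such as an SVM. The `window` and `forward` arguments correspond to the sliding window size and edge direction that the third part describes tuning.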

Keywords/Search Tags:

word co-occurrence relationship, graphic model, subjective sentence, recognition, machine learning

Related items

1	Research And Realization Of Chinese Subjective Sentence Recognition Method Based On Fuzzy Set
2	Automatic Recognition Of Causal Complex Sentences Based On DPCNN Model And Fusion Of Sentence Features
3	Research And Implementation On Judgment And Relationship Recognition Of Chinese Complex Sentence
4	The Research Of Subjective Sentence Extraction Method For Texts On Network
5	Research And Implementation Of Subjective Question Scoring System Based On Chinese Word Segmentation And Text Similarity
6	Hot Topics Detected From Micro-bloggings Based On Word Co-occurrence Model
7	Pedestrian Detection Algorithm With Co-occurrence Relationship And Adaptive HCS-LBP Features
8	Research About Micro-blog Hot Topics Discovery Based On Optimized TF-TDF And Word Co-occurrence Model
9	Negation Word And Sentence Detection Towards Specific Prefix And Suffix
10	Research Of The Alignment Between Features Of Space Relationships In 2D Images And Describing Words