Automatic Selecting The Syntactic Features Of The Compound Sentence Based On The Dependency Tree

Posted on:2015-01-22

Degree:Master

Type:Thesis

Country:China

Candidate:L Y Ye

Full Text:PDF

GTID:2268330428467680

Subject:Computer application technology

Abstract/Summary:

As society develops and technology advances, people can obtain information in more ways and more conveniently, and large amounts of data are generated in the course of communication. Automated, intelligent information processing is therefore an inevitable part of social development, and in this context natural language processing has developed rapidly. In Chinese information processing, word segmentation and POS tagging have been largely solved, and some software tools are already used in practical applications. However, sentences must be understood before discourse can be comprehended, and the study of Chinese compound sentences is a bridge between the study of sentences and the study of discourse.
A compound sentence is made up of clauses and carries much more information than a single clause. It is often used to express logical relationships between people, between people and things, and between people and events, and it has many syntactic, semantic and even pragmatic attributes. Dividing a compound sentence into its hierarchical relationships is a fundamental research problem, and the tagging of relation tags and of the collocations of relation tags must be solved before that division can be carried out. For these reasons, it is necessary to study the compound sentence at the levels of syntax, semantics and even pragmatics.
This paper attempts to reach some understanding of the hierarchical relationships of the compound sentence on the basis of relation tag tagging. Studying the features of the compound sentence is the most fundamental part of that work. This paper discusses how to automatically select the syntactic features of the compound sentence from the dependency tree, so as to obtain the syntactic features that characterize the relation tags and the collocations between them.
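
As an illustration of what syntactic features read off a dependency tree might look like, here is a minimal sketch. It is not the thesis's feature set, which the abstract does not list: the `Token` type, the feature names, the dependency labels and the hand-written parse of an English gloss ("because rain, so he not come") are all assumptions made for the example.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    form: str    # surface word
    pos: str     # part-of-speech tag (hypothetical tag set)
    head: int    # 1-based index of the head token, 0 for the root
    deprel: str  # dependency label on the arc to the head (hypothetical labels)


def path_to_root(tokens: List[Token], i: int) -> List[str]:
    """Return the dependency labels on the path from token i (0-based) up to the root."""
    labels: List[str] = []
    seen = set()
    while True:
        if i in seen:
            raise ValueError("cycle in dependency tree")
        seen.add(i)
        tok = tokens[i]
        labels.append(tok.deprel)
        if tok.head == 0:
            break
        i = tok.head - 1
    return labels


def token_features(tokens: List[Token], i: int) -> Dict[str, str]:
    """Build a flat feature dictionary for token i from its place in the tree."""
    tok = tokens[i]
    head = tokens[tok.head - 1] if tok.head > 0 else None
    path = path_to_root(tokens, i)
    return {
        "word": tok.form,
        "pos": tok.pos,
        "deprel": tok.deprel,
        "head_word": head.form if head else "ROOT",
        "head_pos": head.pos if head else "ROOT",
        "depth": str(len(path) - 1),  # number of arcs between the token and the root
        "path": "/".join(path),       # label path from the token up to the root
    }


# Hand-written, hypothetical parse of the gloss "because rain , so he not come".
sentence = [
    Token("because", "C", 2, "mark"),
    Token("rain", "V", 7, "advcl"),
    Token(",", "PU", 2, "punct"),
    Token("so", "C", 7, "mark"),
    Token("he", "PN", 7, "nsubj"),
    Token("not", "AD", 7, "advmod"),
    Token("come", "V", 0, "root"),
]

features = [token_features(sentence, i) for i in range(len(sentence))]
```

Each dictionary in `features` could serve as the observation for one token in a sequence labeller. The head word, dependency label and path to the root are the kind of tree-derived cues that may help distinguish a connective acting as a relation tag from the same word in another role.
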
The features selected from a sentence include both lexical and syntactic features. Conditional Random Fields (CRFs) are undirected graphical models that can incorporate a wide variety of sentence features, which need not be independent of one another, and they have been widely used in NLP problems. In this paper, we train CRFs on a corpus of compound sentences and embed the feature selection algorithm into the model so that syntactic features are selected automatically. The experiments are divided into two parts: tagging relation tags, and tagging the collocations of relation tags. Relation tag tagging gives the better results, because the task is simpler and has been studied more; its precision and recall are about 98%. This paper treats the collocations of relation tags only briefly; precision and recall on that task are about 77%, and further study is needed. The model files obtained from the experiments can be reused in related tasks.
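
For readers unfamiliar with the two reported measures, the following minimal sketch shows one common way to compute precision and recall for a tagging task. The abstract does not say how the thesis scored its output, so the token-level scoring, the function name `precision_recall`, and the labels `REL` and `O` are assumptions made for illustration. The toy sequences are not data from the thesis.

```python
from typing import List, Tuple


def precision_recall(gold: List[str], pred: List[str], outside: str = "O") -> Tuple[float, float]:
    """Token-level precision and recall over all labels except the `outside` label."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must have the same length")
    # A prediction counts as correct when it is not `outside` and matches the gold label.
    correct = sum(1 for g, p in zip(gold, pred) if p != outside and g == p)
    predicted = sum(1 for p in pred if p != outside)  # tokens the tagger marked
    actual = sum(1 for g in gold if g != outside)     # tokens marked in the gold data
    precision = correct / predicted if predicted else 0.0
    recall = correct / actual if actual else 0.0
    return precision, recall


# Toy sequences (hypothetical): the tagger finds both gold relation tags
# and adds one spurious one.
gold = ["REL", "O", "O", "REL", "O", "O", "O"]
pred = ["REL", "O", "REL", "REL", "O", "O", "O"]

precision, recall = precision_recall(gold, pred)
```

In this toy case the tagger recovers every gold relation tag, so recall is 1.0, while the spurious prediction lowers precision to 2/3.
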

Keywords/Search Tags:

Dependency tree, features of the compound sentence, relation tags, Conditional Random Fields

Related items

1	Research On Fast Exact Structured Learning
2	Recognition Of Named Entity In Electronic Medical Records Based On Cascaded Conditional Random Fields
3	Automatic Labeling Of Chinese Frame Semantic Roles Based On The Dependency Features
4	Research On Dependency-based Chinese Semantic Role Labeling
5	SAR Image Change Detection Based On Conditional Random Fields
6	Study And Application Of Chinese Sentence Structure Clustering
7	Research On Online Detection Method Of Reputation Fraud Campaign Based On Conditional Random Fields
8	Research On Japanese Dependency Parsing Technology
9	A Study On Chinese Personal Name Recognition Based On Conditional Random Fields
10	An Self-adaptive BLP Optimal Model Employing Conditional Random Fields