The Research Of Semantic Construction Method Based On Massive Text

Posted on: 2013-10-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Yuan
GTID: 1268330401974098
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and the rapid growth in the amount of information, how to access information effectively is receiving more and more attention. Traditional natural language processing methods can no longer meet people's requirements, and how to process information by intelligent means has become a very important issue.
The basis of automatic processing of text information is understanding the semantics of the text, that is, representing the meaning of the text with a formal semantic structure that computers can understand and process. At present there are two main approaches: knowledge-based methods and statistics-based methods. Because of the huge gap between text and formal semantic structure, neither approach has achieved the desired results.
To avoid transforming text into semantic structure directly, researchers proposed shallow semantic analysis, a theory that takes the Predicate-Argument structure as its core. This theory focuses on the lexical level, and its purpose is to find the semantic relations between the syntactic components of a sentence, such as words and phrases. Because shallow semantic analysis can be treated as a general semantic extraction technology and used as the basis of deep semantic analysis, it has developed rapidly and has been applied in many fields of natural language processing.
Building on this research, this dissertation proposes a new semantic construction framework for massive text based on shallow semantic analysis. The work mainly includes the following aspects:
1. A semantic construction framework for massive text. The framework takes the Predicate-Argument structure as its core and performs semantic role labeling of massive text through semantic role induction. It then transforms the Predicate-Argument structure into a deep semantic structure according to the mapping between them.
2. A semantic role induction algorithm based on multiple features. The method treats semantic role induction as a clustering problem. For a given predicate, it first recognizes all arguments of the predicate in the massive text, then divides the arguments into two groups according to the complexity of their syntactic structure and clusters each group using different syntactic features. An optimized hierarchical clustering algorithm then merges the clusters from the two groups into the final result. Each cluster in the result represents a specific semantic role. The method needs no manually annotated data and can obtain the Predicate-Argument structure for all predicates in massive text automatically.
3. A mapping algorithm between Predicate-Argument structures and an ontology based on semantic similarity. This dissertation uses an ontology as the representation of text semantics, and ontologies intended for semantic construction are usually organized around events. The proposed algorithm links syntactic content to semantic content by mapping each Predicate-Argument structure to its corresponding event in the ontology, chosen by computing their semantic similarity. For a sentence, after syntactic analysis and shallow semantic analysis, the semantic role labeling result can be transformed into a deep semantic structure through the relationship between the Predicate-Argument structure and its corresponding event.
4. A self-evaluation mechanism for the semantic construction result. Different algorithms suit different texts to different degrees, so this dissertation proposes a self-evaluation mechanism by which the algorithm reports a confidence for each result. This confidence can be used to sort the results and to select the relatively more reliable part of them.
The semantic construction method for massive text proposed in this dissertation takes advantage of scale to achieve unsupervised semantic role labeling, which needs no manually annotated training data. In addition, through the automatic mapping between Predicate-Argument structures and the ontology, the semantic role labeling result can be transformed into a deep semantic structure. Together, these two parts constitute a complete semantic construction process.
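The clustering view of semantic role induction described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the features (`relation`, `position`, `path_len`), the complexity cutoff `max_simple_path`, the head-word cosine similarity, and the merge `threshold` are all assumptions chosen for illustration. Arguments of one predicate are split into a syntactically simple and a syntactically complex group, each group is clustered on its own features, and the resulting clusters are merged greedily in a hierarchical fashion.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Argument:
    head: str       # head word of the argument
    relation: str   # dependency relation to the predicate (assumed feature)
    position: str   # "before" or "after" the predicate (assumed feature)
    path_len: int   # length of the syntactic path to the predicate (assumed feature)


def initial_clusters(args, max_simple_path=1):
    """Split arguments by syntactic complexity, then group each part by its own key."""
    clusters = {}
    for a in args:
        if a.path_len <= max_simple_path:
            key = ("simple", a.relation, a.position)
        else:
            key = ("complex", a.position)
        clusters.setdefault(key, []).append(a)
    return list(clusters.values())


def head_profile(cluster):
    """Count how often each head word occurs in a cluster."""
    return Counter(a.head for a in cluster)


def cosine(p, q):
    """Cosine similarity between two head-word count profiles."""
    dot = sum(p[w] * q[w] for w in p if w in q)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def merge_clusters(clusters, threshold=0.5):
    """Repeatedly merge the most similar pair until no pair reaches the threshold."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        best, pair = threshold, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(head_profile(clusters[i]), head_profile(clusters[j]))
                if s >= best:
                    best, pair = s, (i, j)
        if pair is None:
            break
        i, j = pair
        clusters[i].extend(clusters.pop(j))
    return clusters


def induce_roles(args, threshold=0.5):
    """Each returned cluster stands for one induced semantic role."""
    return merge_clusters(initial_clusters(args), threshold)
```

Calling `induce_roles` on the arguments collected for one predicate returns a list of clusters, and each cluster is read as one induced role. No labeled data is involved, which is the property the abstract emphasizes.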
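The mapping and self-evaluation steps can be sketched in the same spirit. The abstract does not say which semantic similarity measure or confidence formula is used, so this example swaps in Jaccard overlap between word sets as the similarity and reuses the best similarity score as the confidence. The ontology layout (event name mapped to a set of descriptor words) and the names `map_to_event`, `rank_by_confidence`, and `min_confidence` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two word collections (stand-in for semantic similarity)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def map_to_event(pas_words, ontology):
    """Return (event_name, score) for the ontology event most similar to the structure."""
    return max(
        ((name, jaccard(pas_words, words)) for name, words in ontology.items()),
        key=lambda pair: pair[1],
    )


def rank_by_confidence(structures, ontology, min_confidence=0.0):
    """Map every structure to an event, drop low-confidence results, sort the rest."""
    results = []
    for pas_id, words in structures.items():
        event, score = map_to_event(words, ontology)
        if score >= min_confidence:
            results.append((pas_id, event, score))
    return sorted(results, key=lambda r: r[2], reverse=True)
```

Raising `min_confidence` keeps only the mappings the method is most sure about, which mirrors the idea of picking out the more reliable part of the results by confidence.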
Keywords/Search Tags: Text Semantic Construction, Massive Text, Semantic Role Induction, Ontology, Natural Language Understanding